The Codegen Blog
https://codegen.com/blog/
What we’re building, how we’re building it, and what we’re learning along the way.

Case Study: How Warmly’s CSMs Ship Production Features with Codegen
https://codegen.com/blog/warmly-case-study/ | Tue, 07 Oct 2025

About Warmly

Warmly’s person-level intent platform makes marketing more precise by identifying ideal customers, monitoring their buying intent in real time, and engaging through the right channel at the right moment. The product depends on constant small improvements that add up to a great customer experience.

The Challenge

Customers often share small but important requests: UX tweaks, bug fixes, and feature refinements that improve day-to-day usability. Like many growth-stage teams, Warmly struggled to get those issues prioritized.

Maximus Greenwald, co-founder and CEO, said:

“I’m not able to code fast enough to appease my customers, and many many tickets are not worked on purely because of engineering bandwidth.”

Product and engineering were focused on major roadmap items, so low-friction fixes sat in the backlog even when the aggregate impact was big.

Bringing Codegen Into the Workflow

Warmly invited Codegen into their workspace to change that equation. Codegen CEO Jay Hack showed the team how to work efficiently with AI agents and encouraged them to ship ten features in a single day to demonstrate what’s possible.

The goal was to leave Warmly with a teammate that costs a fraction of what a typical engineer does and delivers 10× the productivity.

For the first time, customer success managers (CSMs) acted as product managers and junior developers. They identified problems, described solutions in natural language, and saw those solutions run live in production — all within hours.

One big customer request was a new feature to drag, drop, and reorder the chat buttons on the front end. The CSM wrote a product requirement in Linear, asked Codegen to draft the technical spec, and an engineer only needed to confirm class names and function signatures before merge.

Greenwald said:

“Codegen is our Slack teammate that allows our CS team to interact with engineering and our codebase in a way that saves the engineering team tons of time and actually moves the needle on smaller features and bug requests that we never would have gotten to otherwise.”

Results

CSMs as PMs

Customer-facing teammates can now move directly from a user request to a live solution without blocking on engineering bandwidth.

Carina Boo, co-founder and Head of CS, noted:

“A CSM basically can act as a product manager and then pair with an engineer and actually get stuff done. We have a ton of PRs already out to staging and we’re going to be deploying later today.”

Faster Feature Delivery

Warmly built a customer health app in about four hours, replacing a $20,000 third-party tool and weeks of expected engineering time. The team also shipped 30 features into production in a single day.

Stephanie Merlis, CSM, said:

“This feature probably would have taken a full day to get shipped. But thanks to Codegen, we were able to do it in under an hour.”

10× Engineering Leverage

Codegen effectively gave every engineer a fleet of junior developers, enabling more PRs, quicker bug fixes, and continuous background improvements without hiring more staff.

Looking Ahead

Warmly proved that AI agents can expand who gets to build software. CSMs now function as an extension of engineering, delivering customer-driven fixes and entirely new applications in a fraction of the usual time.

Greenwald concluded:

“Before today, I thought the best way to solve engineering bottlenecks was to hire more engineers. Now I realize that I can use Codegen to save money on additional engineers, and empower my existing engineers to be 10x more effective.”

Ready to see what Codegen can do for your company? Try for free or reach out to our team for a demo.

Codegen Weekly Diff
https://codegen.com/blog/codegen-weekly-diff/ | Tue, 07 Oct 2025

The past week’s releases focus on refinement — polishing UI consistency, strengthening integrations, and streamlining daily workflows. From unified scrollbars to expanded Jira linking, Codegen is faster, cleaner, and more intuitive to build with.

Consistent scrollbars across components

We introduced a unified scrollbar system (cg-scrollbar) now used across 20+ components — from Kanban boards to tables and dialogs. The result: a smoother, cohesive feel across the entire app.

Clearer calendar design

The Calendar component now highlights today’s date with improved CSS selectors for quicker visual scanning. Subtle, but meaningful for time-sensitive work.

Kanban improvements

Kanban boards are getting smarter and more manageable:

  • Added tri-state checkboxes for “All columns,” so you can easily toggle column visibility.
  • Added a placeholder state when no columns are shown.
  • Refined visibility controls for better project navigation.

These small touches make large boards more intuitive — especially when managing multiple workflows.

Streamlined dates & cleaner code

The DateRangePicker now uses the compact MMM d, yy format (e.g., Oct 6, 25), improving readability across analytics dashboards.
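As a loose illustration (shown here in Python rather than the component’s own code), the MMM d, yy token pattern corresponds to output like this:

```python
from datetime import date

def mmm_d_yy(d: date) -> str:
    # "MMM d, yy" -> abbreviated month, day without zero-padding, two-digit year
    return f"{d:%b} {d.day}, {d:%y}"

print(mmm_d_yy(date(2025, 10, 6)))  # -> Oct 6, 25
```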

Behind the scenes, unused props (popoverId, onOpenChatModal) and imports were removed across several components — another step toward a leaner, faster codebase.

Bigger uploads, better UX

We’ve removed the 100MB GitHub upload limit, letting developers handle larger assets and repositories directly through Codegen — no workarounds needed.

We also resolved card interaction bugs in Kanban, improved accessibility with focus outlines, and fixed pointer behavior in dialogs. These tweaks make the UI more predictable and a11y-friendly.

Jira issue linking

Developers can now create and manage relationships between Jira issues — like “blocks” or “relates to” — directly within Codegen. This strengthens end-to-end workflow visibility.

Voice and visual enhancements

Agents can now process Slack audio messages — MP3, WAV, AAC, FLAC, OGG — through the chat_with_video tool, enabling voice-based interactions with AI agents.

We also upgraded PR status chips in the command palette, making “Open,” “Merged,” “Closed,” and “Draft” states instantly recognizable.

Wrapping up

These releases may look small on paper but add up to a noticeably smoother, more consistent developer experience.

From unified visuals to integration depth, Codegen continues evolving as a platform that values both design polish and engineering precision.

Ready to get started? Try Codegen for free or reach out to our team for a demo.

How AI Development Is Moving from Specialized Agents to Orchestration
https://codegen.com/blog/specialized-agents-to-orchestration/ | Thu, 02 Oct 2025

Specialized coding agents got AI development off the ground. But as general-purpose models like Claude Code and Gemini CLI have matured, the real leverage is shifting up a layer — to orchestration. This is where engineering teams coordinate many agents at once with parallelism, clean workspaces, and human oversight built in.

Louis Knight-Webb, co-founder of Bloop, has seen this evolution first-hand. In a recent AI Hot Takes conversation with Codegen CEO Jay Hack, he traced the path from enterprise code search to COBOL modernization and now to Vibe Kanban, an orchestration platform for running multiple coding agents in parallel. 

Let’s dive into why the future of agentic development depends less on specialized bots and more on the systems that direct them.

From vertical tools to general-purpose agents

When AI coding first caught on, most solutions were vertical: “coding agents for X,” like code search or a one-off COBOL modernization pipeline. These tools proved that autonomous development was possible and gave teams a safe place to experiment.

Louis described the progression inside Bloop: 

“We started with code search that became enterprise code search… One of our customers loaded in a COBOL codebase… we realized that a lot of organizations wanted to modernize COBOL and spent 18 months working on that problem and building a fully automated end-to-end pipeline coding agent experience.”

But general-purpose models quickly overtook those narrow solutions. Reinforcement learning loops, bigger context windows, and richer APIs meant that a single agent could now handle what previously required bespoke design. As Louis put it, they realized “coding agents were moving in a way that really benefited more general purpose approaches rather than more specialized coding agents.”

Why orchestration emerged as the next layer

Once general agents became capable, the next bottleneck quickly became coordination. Running one agent is straightforward; running dozens efficiently is not. Without the right system, teams waste time watching logs and waiting for sequential runs to finish.

Vibe Kanban was built to solve this. “It’s just basically a way to orchestrate Claude Code, Gemini CLI, AMP and other coding agents at scale,” Louis explained. Instead of queuing tasks one by one, Vibe Kanban manages parallel execution with proper sandboxing and a workflow designed for fast-moving projects.

This shift is bigger than a single product. As agents complete tasks in minutes instead of days, orchestration concerns such as task management, isolation, and reproducibility become the new foundation for serious software development.

The orchestration playbook

An effective orchestration layer focuses on engineering fundamentals:

  • Parallelism with clean state. Each task runs in its own git worktree, ensuring deterministic builds and eliminating side effects.
  • Automated setup and cleanup. Environments are built and torn down predictably, so dependencies don’t leak between runs.
  • Integrated boards. When tasks complete rapidly, project management has to fuse code, logs, and live previews into a single view.

This helps teams scale AI development to hundreds of concurrent tasks without sacrificing traceability or quality.
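The fundamentals above can be sketched in a few lines of Python — a toy stand-in for per-task git worktrees, not Codegen’s or Vibe Kanban’s actual implementation. Each task gets a disposable workspace, runs in parallel, and is torn down deterministically:

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def run_task(task_id: int) -> str:
    # Each task gets a throwaway workspace (mimicking an isolated git
    # worktree): nothing it writes can leak into another task's run.
    workdir = Path(tempfile.mkdtemp(prefix=f"task-{task_id}-"))
    try:
        (workdir / "output.txt").write_text(f"result of task {task_id}")
        return (workdir / "output.txt").read_text()
    finally:
        # Automated cleanup: tear the environment down predictably.
        shutil.rmtree(workdir)

# Parallel execution instead of queuing tasks one by one.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_task, range(8)))

print(results[0])  # -> result of task 0
```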

Keeping humans in the loop

Even with orchestration, some decisions can’t be automated. Louis was direct about this: 

“The human element of review is very difficult to replace… the bits around the edges like choosing what work to even get done… I don’t think it is going to go away anytime soon.”

Scoping work, making architectural calls, and deciding when a feature is production-ready remain human responsibilities. Good orchestration respects that reality by surfacing the right context and making review and approval fast and reliable.

Designing the right layer

The key architectural question is what belongs in the orchestration layer versus inside the model itself. Models handle code generation and refactoring. Orchestration governs processes: breaking projects into tasks, preparing environments, managing dependencies, and structuring review workflows.

The line isn’t always obvious, but here are a few guidelines to help:

  • Keep process in orchestration. Breaking down projects into tasks, preparing clean environments, enforcing dependency checks, and coordinating reviews all belong at this layer.
  • Let the model focus on code. Generating, refactoring, or testing code is where large language models excel; avoid embedding these directly in orchestration logic.
  • Use models as subroutines, not supervisors. Treat agents as workers that execute well-defined steps, while orchestration handles scheduling and governance.

By separating responsibilities, teams can scale safely and adjust quickly as model capabilities evolve.

How Codegen helps

Codegen was built for this new layer.

By focusing on orchestration instead of one-off agents, Codegen gives engineering teams a durable foundation, even as the underlying models continue to evolve.

The bottom line

The story of AI development is changing from specialized coding agents to orchestration. Vertical tools proved what was possible. General-purpose agents made those tools obsolete. Now the opportunity, and the hard engineering work, is in coordinating agents effectively and keeping humans in control of the process.

For engineering leaders, platform teams, and founders building on Claude Code, Gemini CLI, or the next generation of agents, investing in orchestration is no longer optional. It’s how you scale AI-driven development with the reliability and transparency modern software demands.

Ready to get started? Try Codegen for free or reach out to our team for a demo.

Customer Success Story: Lambda Curry
https://codegen.com/blog/lambda-curry/ | Wed, 01 Oct 2025

For Lambda Curry, a modern software company building fast-moving products, traditional development workflows were hitting a wall. Boilerplate code, repetitive tasks, and context switching between Slack, GitHub, and project management tools slowed down the pace of delivery.

The team needed a way to reduce overhead, automate routine changes, and free engineers to focus on solving real problems — all without bolting on another tool that disrupted their workflow.

Lambda Curry co-founder Jake Ruesink creates new issues in Linear, then drives progress with updates, questions, and threaded discussions.

Why Codegen

From the start, Codegen provided more than code generation. Acting as a 24/7 engineer and project manager inside Slack, GitHub, and Linear, it supported every step of the process:

  • Developing new components and API endpoints
  • Applying small changes instantly without disrupting active work
  • Explaining complex functions for easier debugging and refactoring
  • Looking up dependencies and internal references in seconds
  • Managing tasks with structured tracking inside Linear

Context-Rich Development

Lambda Curry embedded AI-powered engineering directly into its existing stack, using Codegen in three distinct ways:

  1. In Slack: Developers ask questions, debug tricky functions, and request code changes directly in chat.
  2. In GitHub: Codegen automatically reviews code, generates suggestions, and opens pull requests.
  3. In Linear: Tasks stay aligned with engineering work, and Codegen helps to create structured issues and track progress.

This meant Lambda Curry could trigger PRs, assign tasks, and resolve questions instantly — without jumping between systems.

Jake Ruesink links GitHub PRs and commits to Linear issues, posts updates, and spins up follow-up tasks fast.

Real-World Impact

With Codegen, Lambda Curry transformed how its developers work day-to-day:

  • Thanks to Linear + Codegen, routine tasks are handled automatically, while higher-impact work gets engineers’ full attention.
  • Codegen generates the first draft of new components or endpoints, letting engineers focus on refining functionality instead of starting from scratch.
  • Configuration updates and style tweaks happen instantly, without derailing deep work.
  • Complex functions get explained on demand, simplifying debugging and refactoring.

Jake Ruesink, Lambda Curry co-founder, said:

“It helps us go from task planning to implementation with astonishing speed. Sometimes what we thought would take hours now takes minutes. It’s become hard to estimate timelines — in the best way.”

Moving Forward

Lambda Curry proves what’s possible when you bring Codegen into your workflow: AI that reshapes how teams review, build, and ship.

Want results like this? Try Codegen for free or reach out to our team for a demo.

Developer Productivity Tools: What Works, What Metrics Matter, and How Codegen Helps
https://codegen.com/blog/developer-productivity-tools/ | Fri, 26 Sep 2025

Productivity tools are everywhere. Code editors with thousands of plugins, CI/CD automations, AI assistants, dashboards, linters, etc. But having tools isn’t the same as having effective tools. The best tools shift the burden, reduce friction, and align with both developer experience and velocity. 

Productivity tools can be a double-edged sword. They promise speed, but poorly chosen or badly configured tools lead to tool sprawl, unnecessary overhead, false positives, and misaligned metrics that encourage “busy work” over sustainable work.

So the goal isn’t more tools — it’s better tools + better integration + signals that matter.

Productivity tools that move the needle

Based on research and field evidence, five categories of tools consistently deliver measurable improvements. Below, each section expands on why it matters, best practices, and key metrics to watch.

1. AI / Code assistant-backed tools

Repetitive work such as boilerplate generation, common refactors, and API wiring drains cognitive energy. AI-powered assistants such as GitHub Copilot or Codegen can automatically suggest code completions, generate scaffolding, or even refactor large blocks of code.

Best practices

  • Integrate directly into IDEs or code review flows to minimize context switching.
  • Use them as pair-programming partners rather than one-shot generators.
  • Configure them to respect repository-specific style and security guidelines.

Metrics to track

  • Time saved on common tasks (compare commit timestamps or sprint velocity pre- and post-adoption).
  • Reduction in trivial review comments (e.g., style or formatting issues).
  • Developer satisfaction and perceived focus time (survey-based, SPACE framework).

2. Review & merge flow enhancers

Slow code reviews are a bottleneck. Productivity drops when PRs wait days for feedback or when reviewers must handle low-value comments manually.

Examples

  • Automated reviewers that highlight high-risk sections and suggest fixes.
  • Intelligent routing to assign the right reviewers based on code ownership.
  • Dashboards that visualize review queues and lead times.

Best practices

  • Keep PRs small and reviewable; integrate tools that nudge contributors toward better PR hygiene.
  • Set SLAs for first review and monitor them with dashboards.
  • Combine automated checks with human oversight to maintain quality.

Metrics to track

  • Time to first review and total time to merge (p50/p90).
  • Number of review comments per PR and proportion resolved automatically.
  • Change failure rate and post-deploy defects to confirm quality isn’t sacrificed.
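Time-to-first-review percentiles are straightforward to compute once you have PR timestamps. A minimal Python sketch over made-up sample data:

```python
from statistics import quantiles

# Hours from PR opened to first review, one value per PR (sample data).
first_review_hours = [0.5, 1.2, 2.0, 3.5, 4.0, 6.0, 8.5, 12.0, 24.0, 48.0]

# quantiles(n=100) yields the 1st..99th percentiles;
# index 49 is the 50th percentile (p50), index 89 is the 90th (p90).
pct = quantiles(first_review_hours, n=100, method="inclusive")
p50, p90 = pct[49], pct[89]
print(f"time to first review: p50={p50:.1f}h p90={p90:.1f}h")
```

Tracking p90 alongside p50 matters: the median can look healthy while a long tail of PRs still waits days for a first pass.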

3. CI / Pipeline Automation

Builds, tests, and deployments often dominate cycle time. Waiting on flaky tests or slow builds can consume more hours than actual coding.

Examples

  • Parallelized and cached CI builds.
  • Automatic retries for transient test failures.
  • Agents that proactively fix common build or configuration errors.

Best practices

  • Instrument pipelines to detect bottlenecks and measure improvements.
  • Use predictive builds to run only the tests affected by a change.
  • Keep human oversight for critical paths (e.g., production deploys).
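Predictive test selection is essentially a lookup from changed files to the tests that cover them. A minimal sketch, with a hypothetical coverage map (real systems derive this from coverage data or module dependencies):

```python
# Hypothetical mapping from source files to the test files that cover them.
COVERAGE_MAP = {
    "src/billing.py":  ["tests/test_billing.py", "tests/test_invoices.py"],
    "src/auth.py":     ["tests/test_auth.py"],
    "src/ui/theme.py": ["tests/test_theme.py"],
}

def affected_tests(changed_files):
    """Select only the tests touched by a change, instead of the full suite."""
    selected = set()
    for path in changed_files:
        selected.update(COVERAGE_MAP.get(path, []))
    return sorted(selected)

print(affected_tests(["src/billing.py"]))
```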

Metrics to track

  • Average and p90 build/test times.
  • Number of flaky failures and reruns.
  • Mean time to recover from failed checks.

4. DevEx & communication tools

A great developer environment minimizes friction and maximizes flow. Poor documentation search, unclear ownership, and constant context switching are top drivers of burnout.

Examples

  • Powerful code search and documentation search tools.
  • In-context information delivery (e.g., linking relevant wiki pages directly in the IDE).
  • Lightweight, integrated communication tools for quick clarifications.

Best practices

  • Standardize documentation and enforce easy discovery.
  • Integrate chat, docs, and ticketing into a single workflow to cut tool hopping.
  • Collect developer sentiment regularly to spot friction early.

Metrics to track

  • Context switches per day and average focus-session length.
  • Developer satisfaction with documentation and tooling.
  • Average time spent finding code references or internal APIs.

5. Monitoring & measurement tools

You can’t improve what you can’t measure. Monitoring tools aggregate signals from VCS, CI/CD, and issue trackers to give a unified picture of throughput, quality, and efficiency.

Examples

  • Analytics platforms like Swarmia or Jellyfish that calculate DORA metrics.
  • Custom dashboards showing lead time, deployment frequency, and review load.

Best practices

  • Focus on outcome metrics (lead time, CFR, MTTR) rather than vanity metrics like lines of code.
  • Make dashboards transparent and shared to drive team-wide improvements.
  • Feed measurements back into planning to guide where to automate next.

Metrics to track

  • DORA metrics (lead time, deployment frequency, change failure rate, MTTR).
  • Time spent per type of task (feature work vs. rework).
  • Tool adoption rates and developer-reported usefulness.
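As an illustration, three of the four DORA metrics fall directly out of deploy records; the figures below are made up:

```python
from datetime import datetime

# Sample deploy records: (commit time, deploy time, caused a failure?).
deploys = [
    (datetime(2025, 9, 1, 9),  datetime(2025, 9, 1, 17), False),
    (datetime(2025, 9, 2, 10), datetime(2025, 9, 3, 10), True),
    (datetime(2025, 9, 4, 8),  datetime(2025, 9, 4, 12), False),
    (datetime(2025, 9, 5, 9),  datetime(2025, 9, 5, 11), False),
]

# Lead time for changes: commit -> running in production.
lead_times = [deployed - committed for committed, deployed, _ in deploys]
avg_lead_h = sum(lt.total_seconds() for lt in lead_times) / len(deploys) / 3600

# Deployment frequency over the observed window.
window_days = (deploys[-1][1] - deploys[0][1]).days or 1
freq_per_day = len(deploys) / window_days

# Change failure rate: share of deploys that caused a failure.
cfr = sum(failed for _, _, failed in deploys) / len(deploys)

print(f"lead time={avg_lead_h:.1f}h, {freq_per_day:.2f} deploys/day, CFR={cfr:.0%}")
```

MTTR, the fourth DORA metric, needs incident open/close timestamps and follows the same pattern.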

How Codegen helps

Codegen is built to maximize the impact of each of these categories by combining AI-powered automation with deep GitHub and CI/CD integration.

AI / code assistant-backed automation

Codegen agents create and modify code autonomously, fix failing checks, and generate boilerplate while you keep working. They integrate into existing IDEs and repositories, so developers stay in flow.

Review & merge flow

The PR Review Agent flags security issues, code-quality concerns, and architectural improvements with precise inline comments. Check Suite Auto-fixer automatically diagnoses and resolves CI failures, retrying intelligently before escalating.

CI / Pipeline optimization

Codegen’s auto-fixers and background agents cut down on waiting time, resolving errors and stabilizing builds without manual intervention.

Developer experience & communication

Trigger agents from existing tools like Slack, GitHub, Jira, or other common tools with a simple @codegen mention. Developers don’t have to leave their preferred workflow to get automated help.

Monitoring & measurement

Codegen Analytics offers live dashboards on agent performance, PR velocity, and cost savings, aligning perfectly with DORA and SPACE metrics. Teams can track ROI in real time, from time-to-merge improvements to cost-per-PR.

By automating low-value tasks, reducing context switching, and providing transparent analytics, Codegen gives engineering teams the speed of automation with the trust and insight of robust measurement — the combination that actually drives sustainable productivity.

Ready to get started? Try Codegen for free or reach out to our team for a demo.

Codegen On-Prem Deployment: Bring the OS for Code Agents In House
https://codegen.com/blog/codegen-on-prem-deployment/ | Wed, 24 Sep 2025

If your organization can’t move code or logs outside its network, you shouldn’t have to sit out the agent era.

Today we’re introducing Codegen on-prem — the same operating system for code agents that powers our cloud, packaged for your Kubernetes. Install with Helm, keep all code and telemetry inside your environment, use your model API keys, and enforce your policies.

So how does on-prem deployment work, who benefits most from it, and what makes Codegen’s approach the right fit for modern engineering teams?

What is on-premises deployment? 

On-premises deployment means the stack runs inside your own facilities or data centers, not in a vendor’s cloud. You procure the hardware and network, install and operate the software, and keep code and data within your physical and legal boundary. 

The upside is full control. You can customize the environment end-to-end, enforce your security policies, and meet strict regulatory requirements with direct access to the systems that hold your IP. The trade-off is ownership of the entire lifecycle — capacity planning, purchasing, installation, patching, upgrades, monitoring, and security all sit with your team.

Who benefits from on-prem

On-prem is a fit when code and telemetry must stay in-region or on site; when audits, industry rules, or internal policies prohibit external processing; or when the network itself is constrained (strict egress, private services, even fully air-gapped). 

In short: if “keep it in house” is non-negotiable, on-prem is the straightforward path. 

  • Data residency / sovereignty: code and telemetry must remain in-region or in-house.
  • Regulatory and audit pressure: finance, healthcare, public sector, or any org with rigorous approvals.
  • IP sensitivity: proprietary models, unreleased features, or high-value codebases.
  • Network constraints: private services, strict egress, or air-gapped environments.
  • Operational integration: reuse of existing IAM, KMS/HSM, SIEM, proxies, and deployment processes.

How Codegen on-prem delivers

Codegen is an OS for code agents: it gives agents a safe runtime, orchestrates concurrent work, connects them to the tools engineers use daily, and records what happened with enough detail to trust the outcome. 

On-prem is a Kubernetes-native platform. You install with Helm charts, manage configuration in values.yaml, and use the same GitOps and CI/CD workflows you already rely on. 

Data stays put. Repositories, artifacts, logs, prompts, and agent trajectories live in your environment. If you need to route traffic through proxies or pin egress to specific destinations, you do that with your network policy and admission controls, not ours. And because model choice is yours, you bring your own API keys for the LLMs you use. 

Keys are managed locally and rotated on your schedule, with request routing that respects your security boundaries. If you prefer customer-managed keys (BYOK/CMEK) backed by your HSM or cloud KMS, that’s supported too — along with clear docs on what the keys protect and where they live.

Security posture is opinionated but transparent. Pods run under restricted policies with minimal capabilities and node isolation where practical. Policies are enforced at admission and at runtime using mechanisms you can audit (e.g., OPA/Gatekeeper for egress allowlists, trusted registries, and image provenance; RBAC for least-privilege). 

The point is simple: you control the guardrails, and the platform fits into them cleanly.
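To make the egress-allowlist idea concrete, here is a toy Python check of the kind a Gatekeeper policy enforces at the cluster boundary (the hostnames are illustrative, not a recommended list):

```python
from urllib.parse import urlparse

# Hypothetical egress allowlist: only these hosts may be reached from inside.
EGRESS_ALLOWLIST = {"api.anthropic.com", "registry.internal.example.com"}

def egress_allowed(url: str) -> bool:
    """Admission-style check: is this destination on the allowlist?"""
    return urlparse(url).hostname in EGRESS_ALLOWLIST

print(egress_allowed("https://api.anthropic.com/v1/messages"))  # -> True
print(egress_allowed("https://example.org/exfil"))              # -> False
```

In a real cluster this decision lives in policy (e.g., a Gatekeeper constraint), not application code; the sketch only shows the shape of the rule.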

Observability is first-class. Codegen ships OpenTelemetry traces, metrics, and logs across agents, sandboxes, integrations, and check suites. We include ready-to-import “golden signal” dashboards and practical alert suggestions so SREs can see load, latency, and error profiles without reverse-engineering the system. 

Networking is explicit. We document ingress and egress patterns, DNS and proxy requirements, and the steps to run with zero outbound in air-gapped environments. If you mirror images to a private registry and provide pull secrets and pinned digests, the platform runs fully disconnected.

For more information, check out our official on-prem documentation.

What you should expect out of the box

Kubernetes-native deployment with Helm

Install, upgrade, and roll back with Helm charts. Manage values.yaml, pin images, verify signatures, and plug into GitOps and your CI/CD without special tooling.

Complete data sovereignty

Your repositories, artifacts, logs, prompts, and agent trajectories never leave your infrastructure. Enforce residency and org policies at the network and workload layers.

Your own API keys for AI models

Bring your providers and manage model API keys locally. Route traffic through your proxies, rotate on your schedule, and scope access by policy. 

Enterprise-grade support and SLAs

On-prem is an enterprise-only offering with SLAs and hands-on help for hardening, sizing, and performance. Runbooks and escalation paths are included.

Flexible infrastructure support

Self-managed Kubernetes, OpenShift, Rancher, EKS-Anywhere — supported. Air-gapped and restricted networks are first-class: private registry mirroring, pull secrets, and offline licensing.

Getting started

Ready to see how Codegen can fit into your engineering workflow?

Book a demo to watch it in action or contact our team to discuss deployment plans and pricing. We’ll help you explore the best path, cloud or on-prem, to bring AI agents safely into production.

The post Codegen On-Prem Deployment: Bring the OS for Code Agents In House appeared first on The Codegen Blog.

Introducing Codegen 3.0: The Operating System for Code Agents
https://codegen.com/blog/introducing-codegen-3-the-operating-system-for-code-agents/
Published Wed, 17 Sep 2025

Software development is undergoing its biggest transformation since the compiler. Code generation has unlocked fundamentally new types of software we can build. But deploying AI agents in real engineering teams requires more than just powerful models – it requires infrastructure.

What Does It Mean to Be an OS for Code Agents?

Think about what an operating system does: it manages resources, provides isolation between processes, handles I/O, and gives applications a consistent interface to hardware. Codegen does the same for AI agents.

When you run code agents at scale, you need:

  1. Process isolation: Sandboxes where agents can safely execute code without affecting production
  2. Resource management: Orchestration that routes requests and manages concurrent agents
  3. I/O handling: Deep integrations with your existing tools (Slack, GitHub, Linear, etc.)
  4. Monitoring: Telemetry and analytics to understand what agents are doing
  5. Access control: Permissions and rules to keep agents within boundaries

Codegen provides all of this as a unified platform. Your agents run on our infrastructure, but with your rules, your integrations, and complete transparency.
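
As a toy illustration of how those five responsibilities compose, here is a minimal sketch in Python. Every class, method, and name below is invented for the example; none of it is Codegen's actual API.

```python
# Toy model of the five responsibilities above. All class and method
# names are invented for this sketch, not Codegen's actual interfaces.
from dataclasses import dataclass, field

@dataclass
class Sandbox:
    """Process isolation: each run gets its own workspace and log."""
    agent: str
    logs: list = field(default_factory=list)

    def run(self, task: str) -> str:
        self.logs.append(f"{self.agent}: {task}")  # monitoring hook
        return f"done: {task}"

class Orchestrator:
    """Resource management plus access control for concurrent agents."""
    def __init__(self, permissions: dict):
        self.permissions = permissions  # agent -> set of allowed tools

    def dispatch(self, agent: str, tool: str, task: str) -> str:
        # Access control: reject tools outside the agent's boundary.
        if tool not in self.permissions.get(agent, set()):
            raise PermissionError(f"{agent} may not use {tool}")
        return Sandbox(agent).run(task)  # fresh sandbox per request

orch = Orchestrator({"pr-bot": {"github"}})
print(orch.dispatch("pr-bot", "github", "open PR"))
```

The real system adds persistence, concurrency, and deep integrations, but the shape is the same: every request passes through policy, lands in an isolated environment, and leaves a telemetry trail.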

What’s New in Codegen 3.0

Claude Code Integration

CLI agents are the building blocks of modern AI development, and Claude Code is no exception.

We’ve built tight integrations with Claude Code that bring enterprise-grade infrastructure to your terminal:

  1. Cloud telemetry for every local session: Every Claude Code interaction is logged to the cloud, creating searchable history and audit trails
  2. Background agents on command: Start long-running tasks without blocking your terminal
  3. Full MCP access out of the box: All your Codegen integrations (Slack, Linear, databases) are automatically available through MCP

This means you can run Claude locally while getting the benefits of cloud infrastructure – monitoring, analytics, and team visibility.

State-of-the-Art AI Code Reviews

Code review has become more of a bottleneck than code production. That’s why we’re launching AI code review as a first-class citizen.

Our review agents:

  1. Provide line-by-line analysis with actionable suggestions
  2. Catch security vulnerabilities and unsafe patterns
  3. Suggest architectural improvements
  4. Maintain consistent code quality across human and AI contributions

Configure review rules at the organization level, then customize per repository. It’s built on the same infrastructure that powers our code generation agents, ensuring reliability and scale.

Enterprise-Grade Sandboxes

Real development means running tests, installing dependencies, and keeping massive codebases in sync.

Our sandbox infrastructure has been rebuilt from the ground up to handle this reality:

  1. Instant boots: Pre-warmed environments ready in seconds
  2. Persistent state: Maintain installed dependencies across agent runs
  3. Production parity: Configure environments that match your stack exactly
  4. Full transparency: See exactly what agents are doing in real-time

These aren’t toy environments – they’re built for codebases like Notion, ClickUp, and Linear.

Comprehensive Analytics

Any successful deployment of code agents requires measurement.

Codegen Analytics provides granular insights into:

  1. Cost breakdowns: Track spending by model, agent, and team
  2. Impact metrics: PRs merged, lines changed, velocity improvements
  3. Adoption patterns: See who’s using what, where, and how often
  4. ROI analysis: Understand the real value agents deliver to your organization

Live dashboards update in real-time, giving you the data needed to optimize your AI investment.

The Platform in Action

Here’s how it all comes together: A developer triggers an agent from Slack to implement a new feature. The agent spins up in a sandbox, pulls the latest code, implements the feature with full access to your tool stack via MCP, runs tests, and creates a PR.

When CI fails, the Check Suite Auto-fixer automatically analyzes logs and pushes fixes. The PR Review agent provides intelligent feedback. Throughout this process, everything is logged, measured, and visible in your analytics dashboard.

This is what we mean by an operating system for code agents – complete infrastructure that makes AI development work in production.

Built on Real Experience

These aren’t theoretical features. They’re built from years of experience with thousands of engineering teams shipping production code with AI. We’ve learned what breaks, what scales, and what actually matters when you’re trying to ship software, not demos.

The gap between vibe coding and professional software development is real. Codegen 3.0 bridges that gap with infrastructure built for the messy reality of software engineering.

Get Started

Codegen 3.0 is available today. Start with our GitHub integration, connect your tools, and see what it means to have real AI infrastructure. Explore all of Codegen’s capabilities in our docs.

The future of software development is here. And it’s running on Codegen.

The post Introducing Codegen 3.0: The Operating System for Code Agents appeared first on The Codegen Blog.

Why Code Review Will Determine Who Wins in the AI Era
https://codegen.com/blog/code-review-bottleneck/
Published Tue, 09 Sep 2025

For decades, software engineering was defined by writing code. But that balance has shifted. With AI code agents producing high-quality output in seconds, the bottleneck isn’t generation anymore, it’s everything around it: reviewing, merging, testing, and governing the changes.

We’ve entered a new frontier where engineers spend less time typing and more time orchestrating. Writing code has become the easy part; making sure that code is correct, compliant, and production-ready is where the real challenge now lies.

Codegen CEO Jay Hack joined Merrill Lutsky, co-founder and CEO of Graphite, on AI Hot Takes to dig into why code review is the new bottleneck, and why the way teams rethink their outer loop will decide who ships and who stalls.

Reviews are where teams are getting stuck

The numbers show developers generating code at a blistering pace. GitHub reported in 2023 that developers using Copilot completed tasks 55.8% faster than control groups, and Amazon notes that “developers report they spend an average of just one hour per day coding.”

But the outer loop hasn’t kept up. An analysis of ~1,000,000 PRs by LinearB shows that cycle time is dominated by review latency, with PRs sitting idle for an average of 4+ days before a reviewer even looks at them. In other words: we can now generate 10x the code, but we can’t yet review or ship it 10x faster.

Lutsky noted:

“If we have these 10x engineering agents, that just makes the problem of code review 10x more important, and more painful for companies that are using them.”

Is stacking how we keep the pace?

One of the most powerful responses to this bottleneck is stacked pull requests. Instead of submitting one massive PR, stacking breaks features into small, independently reviewable increments.

This isn’t new. Facebook built Phabricator to support stacked diffs across thousands of engineers, and Google’s Critique adopted similar practices. The reason was simple: smaller diffs are easier to review, unblock dependent work, and reduce the risk of merge conflicts.

Lutsky stated:

“Stacking was invented for orgs with thousands of engineers, but it’s suddenly relevant to every team now that agents can generate code at the scale of those orgs.”

Now, in the agents era, stacking feels less optional and more like a requirement. Agents generate code in bursts, and humans can’t keep up if the output lands as giant, monolithic PRs. Stacking makes agent output digestible, verifiable, and mergeable.

Solving AI problems with AI solutions

There’s a temptation to see AI review as a replacement for rule-based automation, but the reality is that both are necessary.

Deterministic systems such as branch protection, CI pipelines, and merge queues enforce non-negotiables. They ensure that every change passes tests, follows style guides, and respects permission boundaries. But they’re limited. They can’t reason about whether a design decision makes sense.

Agentic review fills that gap. Context-aware agents can scan a PR in seconds, check for subtle logic errors, and recommend fixes that a human might miss, especially in unfamiliar parts of the codebase. Studies suggest AI already outperforms humans at spotting certain categories of bugs.

Graphite is already combining merge queues with their review agent, Diamond. Lutsky noted:

“Combining those kinds of deterministic and more traditional methods with agentic review, and having a code review companion…our unique take is that you need both of those combined all into one platform in order to properly handle the volume of code that we’re seeing generated today.”

Deterministic controls guarantee baseline standards, while agentic reviewers accelerate semantic checks. The result is faster throughput without sacrificing safety.

Optimizing the Outer Loop

We’re in the middle of an exciting shift. Code generation is fast and plentiful. The bottlenecks are now review, orchestration, and governance — the outer loop of development.

Optimizing this outer loop requires:

  • Making stacking the default, so changes are digestible.
  • Blending deterministic rules with agentic review for speed and safety.
  • Building review interfaces that tell a story and scale to agent-level throughput.
  • Treating AI metadata as compliance-critical data, not an afterthought.
  • Meeting developers where they work, whether in Slack, GitHub, or natural language interfaces.

The message for teams of all sizes is clear. Code is no longer the bottleneck. Review is. The winners in this new era will be the teams that redesign their workflows around that fact.

Want to check out the full conversation? Watch Jay Hack and Merrill Lutsky discuss how AI code generation is breaking traditional development workflows, and why code review has become the real bottleneck on AI Hot Takes.

If you’re still stuck in PR purgatory, it’s time to try Codegen. Free to start, or schedule a demo if you want receipts.

The post Why Code Review Will Determine Who Wins in the AI Era appeared first on The Codegen Blog.

From Voice Notes to Production Code: Yes, You Can Ship from WhatsApp
https://codegen.com/blog/ship-code-with-whatsapp/
Published Wed, 03 Sep 2025

Some of the best ideas don’t happen at a desk. They come mid-conversation, on the go, or while you’re talking something through. One of our users recently showed us what it looks like to capture that moment of inspiration and turn it directly into working code — using a WhatsApp voice note.

From voice to Code(gen)

Andy Bromberg, co-founder of Eco Inc., recorded a short note describing the need to clean up scattered prompts across their app. Within a minute, Codegen launched a run to:

  • Scan WhatsApp routes for hardcoded prompts
  • Check other surfaces like email, chat, and tools for stray prompts
  • Move everything into a centralized LLM config
  • Update the code so prompts pull from config instead of being hardcoded

All of it was triggered from a single voice message — no IDE, no manual setup.

Codegen is where you are 

Whether it starts in Slack, Linear, GitHub comments — or even a WhatsApp voice note — Codegen takes the input and turns it into production-ready code. And while Codegen doesn’t yet have a direct WhatsApp voice integration, here’s how such a workflow could work today:

  1. Record a note describing requirements (e.g., “Create a React component for user profiles”).
  2. A speech-to-text service (like OpenAI Whisper) transcribes the audio.
  3. Middleware (n8n, Zapier, or custom API) formats and routes the transcription to Codegen.
  4. Codegen analyzes context, generates code, and creates PRs or implementation plans.
  5. Codegen’s output is sent back into WhatsApp with code snippets, PR links, or explanations.
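
The glue code for step 3 can be as small as a payload builder. Here is a minimal sketch, assuming a hypothetical payload shape: the field names and webhook flow below are illustrative, not a documented Codegen API.

```python
# Sketch of middleware glue for the workflow above. The payload
# shape, field names, and webhook flow are illustrative assumptions.
import json

def build_codegen_request(transcript: str, repo: str) -> dict:
    """Turn a voice-note transcript into a task payload for an agent run."""
    return {
        "prompt": transcript.strip(),
        "repository": repo,            # e.g. a GitHub "org/repo" slug
        "source": "whatsapp-voice",    # provenance for the audit trail
        "create_pr": True,
    }

# A speech-to-text service produces the transcript; the middleware
# then POSTs this JSON to the agent platform's API endpoint.
payload = build_codegen_request(
    "Create a React component for user profiles", "eco-inc/app"
)
print(json.dumps(payload, indent=2))
```

The interesting work happens on the other side of that POST; the point here is how little plumbing stands between a voice note and a structured agent task.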

Type less, code more

Voice-driven development may sound futuristic, but it’s part of a larger shift already underway: programming is becoming multimodal. Text, chat, voice — they’re all just inputs. What matters is that the AI system can understand your intent, process it in the context of your codebase, and deliver production-ready outputs.

That’s the future we’re building toward at Codegen. Fewer barriers, fewer delays, and workflows that adapt to how developers actually think and work.

See it for yourself — try Codegen for free or schedule a demo to get started.

The post From Voice Notes to Production Code: Yes, You Can Ship from WhatsApp appeared first on The Codegen Blog.

Codegen + Linear: Where Tickets Become Code
https://codegen.com/blog/codegen-linear-integration/
Published Thu, 14 Aug 2025

Backlogs don’t clear themselves. But now, they can feel like they do. With our new Codegen + Linear integration, the distance between “this needs to get done” and “it’s live in production” just got a whole lot shorter. Codegen now works inside Linear to turn issues into production-ready PRs — automatically linked, status-tracked, and fully documented — without adding more meetings, pings, or manual tracking to your life.

Tag @codegen in a Linear issue, and watch it shift from Todo to PR merged while you focus on the work that actually needs human judgment.

How it works

Codegen isn’t another bot throwing one-off code suggestions at you. It’s an agentic developer, built on Anthropic and OpenAI APIs, that understands your tickets, your codebase, and your team’s workflow.

Once connected to Linear (and optionally Slack and GitHub), you can:

  • Tag @codegen in a Linear issue and watch it pick it up, plan the work, and ship the PR.
  • Skip the “did this get done?” loop. Codegen updates issue statuses from Todo → In Progress → Done in real time.
  • Automatically open follow-up tickets when it finds bugs, dependencies, or optimizations.
  • See GitHub PRs linked directly to the Linear issues they solve.
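
The status syncing above maps onto Linear’s public GraphQL API. As a minimal sketch of building that request body: the `issueUpdate` mutation and `stateId` input follow Linear’s published schema, but treat the details as assumptions and check Linear’s API docs before depending on them.

```python
# Sketch: build the GraphQL request body an agent could send to
# https://api.linear.app/graphql to move an issue between workflow
# states. Mutation and field names follow Linear's public schema;
# verify against Linear's API docs before relying on them.
import json

ISSUE_UPDATE = """
mutation ($id: String!, $stateId: String!) {
  issueUpdate(id: $id, input: { stateId: $stateId }) {
    success
  }
}
"""

def build_status_update(issue_id: str, state_id: str) -> dict:
    """Request body that moves `issue_id` into workflow state `state_id`."""
    return {"query": ISSUE_UPDATE,
            "variables": {"id": issue_id, "stateId": state_id}}

# IDs below are placeholders; real values come from Linear's API.
body = build_status_update("issue-uuid", "in-progress-state-uuid")
print(json.dumps(body["variables"]))
```

One such call per transition is all it takes to keep the board accurate without a human touching it.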

It’s like adding a full-stack engineer to your team — one who already knows your architecture, follows your rules, and works in parallel with your human developers.

Codegen × Linear in action

Watch how Nan Yu, Head of Product at Linear, walks through how AI agents streamline task management and development inside Linear using Codegen.

The benefits you’ll notice immediately

Streamlined project management

Forget babysitting tickets. Codegen updates statuses, creates follow-up tasks, and keeps the board accurate without a PM spending half their day in “update mode.”

Code-to-task without the glue work

Every PR Codegen ships is automatically linked to its Linear issue, complete with documentation of what changed and why. No more detective work to connect code changes to business intent.

Progress without more meetings

Status updates land directly in the issue. Design decisions stay attached to the ticket. You can skip the “round-the-room” syncs that just restate what everyone could’ve read.

From ticket to PR in minutes

Whether it’s a small bug fix or a routine feature, Codegen can take it from description to merged PR, fast. It clears your backlog and frees your team for high-value engineering.

Full context awareness

Codegen understands your codebase. It implements changes that fit your patterns, follow your dependencies, and avoid regressions.

Adopting it across your team

Start small. Pilot with one team, pair early adopters with new users, and document best practices. Track your KPIs — cycle time, PR quality, backlog velocity — so you can measure the impact.

Where Codegen shines

  • Routine feature tickets with clear specs
  • Bugs with reproducible steps
  • Documentation updates
  • Backlog grooming and estimation

What to keep in mind

  • Complex architectural overhauls still need human oversight.
  • UX-heavy decisions benefit from human design review.
  • Vague tickets = vague output. Clear scope is king.

Best practices

  • Write crystal-clear issue descriptions.
  • Link related tickets for context.
  • Give regular feedback on Codegen’s output.
  • Use a hybrid workflow: Codegen implements, humans review.
  • Treat Codegen’s explanations as a training library for your team.

How to get started

  1. Create your Codegen account.
  2. Link it to your Linear workspace.
  3. Set permissions for PR creation, rule detection, and self-assigning issues.
  4. Tag @codegen in a Linear ticket and let it run.

The result: a backlog that doesn’t collect dust, engineers freed from repetitive tasks, and a workflow where planning and delivery happen in the same breath.

The next time you open Linear and see a pile of tickets, imagine half of them already shipped before your next standup. That’s the Codegen effect.

See it for yourself — schedule a demo and watch your backlog disappear.

The post Codegen + Linear: Where Tickets Become Code appeared first on The Codegen Blog.
