PFF had ten engineers shipping code once every five days. Then they tried something different: two engineers, augmented with AI agents, working on the same codebase. The new team shipped five times per day.
That's not a marginal improvement. That's a different development model entirely. And it meant dismantling most of what we think of as "engineering process" - standups, sprint planning, code review rituals - because none of it made sense anymore.
What Actually Changed
The shift wasn't just "developers using AI tools". It was developers building with AI agents that handled the grunt work of software development - the kind of work that fills up a sprint but doesn't require human judgement.
Tickets updated themselves. When code merged, agents checked style and convention automatically. QA agents ran tests on every merge, flagged issues, suggested fixes. The humans focused on architecture decisions, product direction, and the kind of engineering problems that still require a person to think through trade-offs.
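The division of labour above can be sketched in a few lines. This is an illustrative toy, not PFF's actual tooling: the agent functions, the snake_case rule, and the MergeEvent structure are all assumptions standing in for whatever checks a real style or QA agent would run.

```python
from dataclasses import dataclass, field

@dataclass
class MergeEvent:
    pr_number: int
    files: list[str]
    flags: list[str] = field(default_factory=list)  # issues surfaced for humans

def style_agent(event: MergeEvent) -> None:
    # Placeholder convention check: flag Python files not named in snake_case.
    for path in event.files:
        name = path.rsplit("/", 1)[-1].removesuffix(".py")
        if not name.islower():
            event.flags.append(f"style: {path} is not snake_case")

def qa_agent(event: MergeEvent, test_results: dict[str, bool]) -> None:
    # Flag every failing test so a fix can be proposed before a human looks.
    for test, passed in test_results.items():
        if not passed:
            event.flags.append(f"qa: {test} failed")

event = MergeEvent(42, ["src/OrderParser.py", "src/utils.py"])
style_agent(event)
qa_agent(event, {"test_parse": True, "test_edge_case": False})
print(event.flags)
```

The point of the sketch is the shape, not the rules: each agent appends findings to the merge event instead of blocking it, so the human only sees a short list of flags rather than sitting in the review loop.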
The case study from AI Engineer shows what this looks like in practice. The two-person team wasn't writing more code per day; it was shipping faster because the coordination overhead disappeared. No waiting for standup to communicate progress. No multi-day code review cycles. No manual QA bottleneck before merging.
The Rituals That Dissolved
Standups exist because humans need to synchronise. When agents handle synchronisation automatically - tickets update as code changes, progress is visible in real-time, blockers surface immediately - the meeting becomes redundant.
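The "tickets update as code changes" part is the easiest to picture. A minimal sketch, assuming a convention where commit messages reference tickets as "Fixes PFF-NNN" (the ticket IDs and the in-memory dict are hypothetical stand-ins for a real tracker's API):

```python
import re

# In-memory stand-in for a real issue tracker.
tickets = {"PFF-101": "in progress", "PFF-102": "in progress"}

def sync_tickets(commit_message: str) -> list[str]:
    """Move any ticket referenced as 'Fixes PFF-NNN' to done; return what changed."""
    updated = []
    for ticket_id in re.findall(r"Fixes (PFF-\d+)", commit_message):
        if ticket_id in tickets:
            tickets[ticket_id] = "done"
            updated.append(ticket_id)
    return updated

sync_tickets("Add retry logic to ingest worker. Fixes PFF-101")
print(tickets)
```

Wire something like this into a post-commit or merge hook and the status a standup exists to communicate is already in the tracker by the time the meeting would have started.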
Sprint planning exists because estimating work and allocating capacity is hard. When agents handle the predictable parts of development and humans focus on the uncertain parts, the rhythm changes. You're not planning two-week sprints anymore. You're making decisions about what to build next and letting the agents figure out how long it takes.
Code review exists because humans make mistakes and codebases need consistency. Agents can enforce style, catch common bugs, and verify tests pass before a human ever looks at the pull request. What's left for human review is the stuff that actually matters: does this solve the right problem? Are the trade-offs sensible? Is the architecture still coherent?
The Uncomfortable Implication
If two engineers with agents can match the output of ten engineers without agents, what does that mean for the other eight? That's the question most engineering managers don't want to ask yet. But the maths is unavoidable.
The PFF experiment suggests that the bottleneck in most development teams isn't coding speed - it's coordination overhead. Meetings, alignment, handoffs, waiting for reviews. Agents collapse that overhead by making coordination automatic. The result isn't that developers become ten times faster at writing code. It's that the eight hours between deciding to ship something and actually shipping it compress into one hour.
For small teams, that's transformative. A two-person startup can now ship at the velocity of a ten-person team, without the hiring cost, management overhead, or office space. For large organisations, it's more complicated. You can't just hand everyone agents and expect five-fold productivity gains - most enterprise engineering time is spent navigating organisational complexity, not writing code.
What This Means for Builders Right Now
If you're running a small team - or you're a solo developer - the PFF model is immediately relevant. You don't need to hire more people to increase output. You need to invest time in setting up agents that handle the repetitive parts of your development process.
That means auto-updating task tracking. Agentic style and convention checking on every commit. Automated QA that runs on merge and opens issues when tests fail. These aren't futuristic tools - they exist now, and they work.
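The "opens issues when tests fail" piece is the key inversion: failures become tracked work items rather than blockers on the merge. A hypothetical glue sketch, where `open_issue` stands in for a real tracker's API call:

```python
# Collected issues; a real version would call e.g. a tracker's create-issue API.
issues: list[str] = []

def open_issue(title: str) -> None:
    issues.append(title)  # stand-in for the actual API call

def qa_on_merge(test_results: dict[str, bool]) -> None:
    # Don't block the merge: file one issue per failing test instead.
    for test, passed in test_results.items():
        if not passed:
            open_issue(f"Failing test after merge: {test}")

qa_on_merge({"test_checkout": True, "test_refund": False})
print(issues)
```

The design choice worth noting is asynchronous by default: the merge lands, the failure is recorded, and a human (or an agent) picks it up from the tracker rather than holding the pipeline hostage.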
The hard part isn't the tooling. It's letting go of the rituals. If your two-person team is still doing daily standups and two-week sprint planning, you're optimising for a process that doesn't match your new capabilities. The PFF team didn't just adopt agents - they redesigned their entire workflow around what agents could handle autonomously.
The Bigger Pattern
This is the third example in as many months of small teams with AI augmentation matching or exceeding the output of much larger teams. It's starting to look like a pattern, not an outlier.
The implication isn't that all engineering teams will shrink to two people. It's that the shape of engineering work is changing. The coordination-heavy, process-laden model that worked for scaling teams from five to fifty people might not be the right model anymore. When agents handle coordination automatically, the constraint shifts back to the actual thinking work - the architecture decisions, the product trade-offs, the judgement calls that still require a human.
PFF proved it's possible. Five deployments a day with a fifth of the headcount. The question for every engineering leader is: what are you going to do with that information?