2/7/2026

Scaling the Productivity of a Spherical Developer in a Vacuum #0

I originally wanted to write a single post about productivity - mostly because developer tooling has changed dramatically over the last couple of years, and my own workflow has quietly evolved with it.

Along the way, I collected a bunch of small tricks - and a few painful lessons. Some I learned the hard way. Some I borrowed from friends and people in the dev community. Some only appeared after I broke my setup for the third time and had to rebuild it from scratch. Apparently, that is part of the process.

I have also interviewed a few people about their setups and how they work. I hope this shared experience will be useful.

Part 0: Introduction and Handles

Your project setup is often not something you fully control. There are company policies, prescribed tools, and codebase structures you have to follow.

Let's exclude that from the discussion and assume you have full control over your project, tech stack, and workflow.

Effectiveness is about juggling maintainability, requirements, and speed to reach acceptable results. Most of the time, acceptance criteria come from someone else. Or you follow your own vision of how things should be done. The latter only works if you truly know what you want to achieve - and are lucky enough to get there without drifting too far.

There will always be something to optimize. The real question is where to start.

Pick something that matters for your specific project. And make sure you are actually optimizing, not just following rules.

Pick your metric first. For example:

  • You finish all sprint tasks in two days. Are you productive? From your manager's perspective, maybe you are just underloaded.
  • You have three issues: two small and one critical. You fix the two small ones. The board looks better. The release is still blocked. Were you productive - or just busy?

Think of yourself as a web server

Your goal is to handle tasks and ensure that every output is correct and secure.

You have subprocesses - your AI assistants. Models like Claude Opus or Gemini Pro can handle real tasks. Not perfectly. Not independently. But meaningfully.

Let's assume token usage is not your main concern.

Now let's apply a bit of parallel computing theory.

Tasks constantly arrive. You delegate some of them to AI assistants. After they finish, you verify the result and sometimes iterate on the prompt.

Why should you always verify the results?

Automated tools are limited. The less domain expertise you have, the easier it is to trust them blindly.

As an expert, you will notice flaws and subtle issues. That is why you need to iterate with your AI assistant.

A quick real story. I had finished a refactor and needed to update tests. I knew the tests should fail after the change and asked my assistant to update them. It did. But its internal reasoning said: “I see there is an issue in the code, but the user asked to update the test, so I will implement a mock.”

It was the wrong mock :)

In this system:

  1. Every task blocks on you. When a result is ready, you approve it, reject it, or ask for another iteration.
  2. Some tasks are non-delegatable. For example, reviewing architecture or, occasionally, checking cat memes.
  3. Sometimes you write code faster than your assistant - especially when you know the codebase well.

A simplified capacity model

This is a simplified capacity model inspired by bottleneck analysis in multi-stage systems (related to classical queueing theory). Real workflows are messier, but the bottleneck intuition still applies.

Let's define:

  • a - time to write a prompt
  • v - time to verify and iterate
  • c - context switching cost (paid by you)
  • x - time an agent needs to generate a result
  • p - number of agents

You also have non-delegatable work. Let's say it consumes a fraction s of your time. For example, 5 out of 10 minutes - 50 percent.

So your remaining budget for delegated work is:

B = 1 - s

Every delegated task costs you:

a + v + c

So your personal throughput is:

(1 - s) / (a + v + c)

Agents generate results in parallel. If one result takes x time, then p agents produce:

p / x

Your real throughput is the bottleneck of the two:

min( (1 - s) / (a + v + c), p / x )
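In code, the whole model is just a min of two rates. A minimal Python sketch, with parameter names matching the definitions above:

```python
def throughput(s, a, v, c, p, x):
    """Delegated-task throughput of the simplified capacity model.

    s - fraction of time spent on non-delegatable work
    a - minutes to write a prompt
    v - minutes to verify and iterate
    c - minutes of context switching per task
    p - number of agents
    x - minutes an agent needs to generate one result
    """
    your_side = (1 - s) / (a + v + c)  # results you can process per minute
    agent_side = p / x                 # results the agents produce per minute
    return min(your_side, agent_side)  # the slower side is your real throughput
```

Whichever side produces the minimum is your bottleneck.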

Now the interesting question: when does adding more agents actually help?

Adding agents increases p. It helps until the agent side stops being the bottleneck.

The crossover point is where the two sides are equal: p / x = (1 - s) / (a + v + c). Solving for p gives a simple rule of thumb:

p* ≈ x (1 - s) / (a + v + c)

Below that number, agents are limiting you. Above that number, you are limiting them.

Example

In one of my pet projects, an agent typically needs about 15 minutes to draft a plan and implement changes. Writing a prompt takes me about a minute, verifying a result takes a couple of minutes, and I switch context almost instantly. I also spend about 50 percent of my time on non-delegatable work.

So roughly:

(1 - 0.5) / (1 + 2 + 0) ≈ 0.166 tasks per minute

Which means I can process one delegated task every ~6 minutes of my attention.

If one agent produces one result every 15 minutes, then two agents are still slower than me. With three, they start to outrun my processing capacity.

That is why, in my case, two or three agents feel optimal. More than that becomes stressful - and the stress is not in the formula.

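If you'd like to run the model with your own parameters, a few lines of Python will do. The numbers below are the ones from my example; swap in your own:

```python
# Parameters from the example above - tweak these for your own workflow.
s = 0.5   # fraction of time on non-delegatable work
a = 1.0   # minutes to write a prompt
v = 2.0   # minutes to verify and iterate
c = 0.0   # minutes of context switching per task
x = 15.0  # minutes an agent needs per result
p = 3     # number of agents

your_side = (1 - s) / (a + v + c)      # tasks/min you can process
agent_side = p / x                     # tasks/min the agents produce
crossover = x * (1 - s) / (a + v + c)  # rule-of-thumb p*

print(f"your side:  {your_side:.3f} tasks/min")
print(f"agent side: {agent_side:.3f} tasks/min")
print(f"throughput: {min(your_side, agent_side):.3f} tasks/min")
print(f"crossover:  p* = {crossover:.1f} agents")
```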

A practical takeaway

See which side is slower - you or the agents.

If agents produce results more slowly than you can process them, adding another one will increase throughput.

If you process results more slowly than they generate them, adding more agents will not help. The only real gains come from reducing non-delegatable work, writing clearer prompts, verifying faster, or switching context more efficiently.

In the next post, I will dive into practical ways to improve those levers.

What's next?

I hope this gives you a simple mental model for tweaking your workflow.

To be fair, the model is too simple

There are several limitations to this mindset:

  • We assumed context switching is constant. That might hold for one agent. With two, maybe. Beyond that, switching cost tends to grow nonlinearly.

  • We assumed tasks are evenly distributed and equally prioritized. In reality, tasks vary in size, complexity, and urgency. The simple formulas do not capture that.

  • We assumed agents work independently, and that verification does not introduce coordination or merge overhead. This only holds if tasks are cleanly separated - for example, different parts of the codebase.

  • We ignored token cost. With multiple agents running in parallel, tokens disappear surprisingly fast.

  • We assumed negligible error and rework cost. In practice, tasks often require several loops. The model still works if you treat those loops as part of the total effort - just include all prompt, generation, verification, and correction time in the same parameters.

  • We assumed an infinite supply of tasks. In practice, you might just finish everything and run out of work. That is a different optimization problem 😊

  • And finally - you are not a web server 😉
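On the rework point: the loops can be folded into the same formulas by scaling the per-task cost by an average iteration count. A sketch under that assumption - the parameter n is hypothetical, something you would estimate from your own history:

```python
def throughput_with_rework(s, a, v, c, p, x, n=1.5):
    """Simplified model where each task takes n prompt/verify loops on average.

    n = 1 means every result is accepted on the first pass;
    all other parameters are as in the basic model.
    """
    per_task_cost = n * (a + v) + c    # your attention per finished task
    your_side = (1 - s) / per_task_cost
    agent_side = p / (n * x)           # each loop costs another generation
    return min(your_side, agent_side)
```

With n = 1 this reduces to the basic model; anything above 1 shrinks both sides at once.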

Next, I will share practical ways to adjust these parameters, look at how other people structure their setups, and then step away from raw productivity to talk about output quality.

As I've mentioned above, the quality of your results matters more than the quantity. At least in my ideal world 😊