Background Agents Explained
February 20, 2026
The concept is simple. Give an AI agent a task. Switch to something else. Come back and review the output.
That’s it. That’s background agents. The value isn’t in the concept — it’s in the execution details that make it actually work.
Why this matters
Your time costs something. The agent’s time doesn’t.
If you’re sitting in a terminal watching Claude Code write a component for 8 minutes, those are 8 minutes you could have spent scoping another task, reviewing a different project’s output, or making a product decision. The agent doesn’t need you watching. It needs a good spec and a clear target directory.
Background agents turn the economics of solo development upside down. Instead of “I can work on one thing at a time,” it becomes “I can have 6 things being worked on while I think about the 7th.”
This is how I run 8 sessions at once. Not by doing 8 things — by reviewing 8 things that were done while I was doing something else.
The prerequisites
Background agents don’t work without three things.
A tight spec. When you’re sitting with the agent, you can correct course mid-task. “No, not that file — this one.” “Actually, make it a dropdown instead.” Background agents don’t get that. They get the spec and nothing else until you come back. The spec needs to be complete enough that the agent can make every decision without you.
I cover spec writing in depth in the spec that makes AI agents work independently, but the short version: include the what, the where (specific files), the behavior, and the constraints. Skip any of those and you’ll come back to creative interpretation.
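To make the shape concrete, here's a minimal spec in that what/where/behavior/constraints structure. The file names and feature are invented for illustration, not from the article:

```markdown
### Task
Add a status filter dropdown to the projects table.

### Where
- Modify: `src/components/Projects/ProjectsTable.tsx`
- Create: `src/components/Projects/StatusFilter.tsx`

### Behavior
- Dropdown with options: All, Active, Archived (default: All)
- Filtering happens client-side on the already-loaded rows

### Constraints
- Do NOT touch the data-fetching layer
- Match the existing dropdown styling used in the table header
```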
Clear boundaries. The agent should know what it’s allowed to touch and what it isn’t. If you’re asking it to add a feature to Scouter’s dashboard, and the spec doesn’t mention the API layer, the agent might modify API routes anyway if it thinks it needs to. Explicit scope prevents scope creep — in agent work just like in human work.
```markdown
### Scope

- Modify: `src/components/Dashboard/FilterBar.tsx`
- Create: `src/components/Dashboard/DateRangePicker.tsx`
- Do NOT modify any API routes or database queries
- Do NOT change the existing filter logic, only add the date range option
```
Testable output. When you come back to review, you need to be able to verify the work quickly. That means either the diff is readable enough to review (most tasks), the feature is visible in the UI (front-end work), or tests pass (logic-heavy work). If the only way to verify the output is a 30-minute manual QA session, the task is too big for background execution.
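One rough way to operationalize "reviewable quickly" is to gate on diff size before you even start reading. This is a sketch with my own threshold numbers, not rules from the article; `review_gate` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical gate: decide how to review an agent's output based on
# how many lines the diff touches. Thresholds are illustrative heuristics.
review_gate() {
  changed="$1"   # total lines added + deleted in the agent's diff
  if [ "$changed" -le 150 ]; then
    echo "small: read the diff directly"
  elif [ "$changed" -le 400 ]; then
    echo "medium: run the tests, then review file by file"
  else
    echo "large: too big for background review; split the task"
  fi
}

# Example usage inside the repo the agent worked in:
#   review_gate "$(git diff main --numstat | awk '{s+=$1+$2} END {print s}')"
```

The point of the gate is the last branch: if a finished task lands in "large," that's a signal the scoping was wrong, not that you should grind through the review.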
The workflow
Here’s the actual flow, step by step.
1. Scope. Write the spec. Be specific. 2 minutes.
2. Launch. Paste the spec into a Claude Code session. Let it start running. 10 seconds.
3. Context switch. Move to another tab. Start scoping the next task, or start reviewing output from a previous one. This is the key moment — you walk away. The agent is working. You’re not needed.
4. Review. Come back when the agent signals it’s done (or check periodically). Read the diff. Check it against the spec. Does the output match? Are there unexpected changes? Did it stay within scope?
5. Accept or correct. If the work is good, accept it. Commit, move on. If it’s close but needs adjustment, type a correction and let the agent iterate. If it’s fundamentally wrong, check your spec — the problem is almost always there.
One cycle takes 20–40 minutes from scope to accepted commit, with maybe 10 minutes of your actual attention. The rest is agent execution time that you spend on other work.
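The launch step (step 2) can be scripted so a whole batch starts with one command. This is a sketch, assuming one spec per markdown file and Claude Code's non-interactive `claude -p` mode as the default runner; substitute whatever agent command you actually use via `AGENT_RUNNER`:

```shell
#!/bin/sh
# Sketch: launch one background agent per spec file, logging each
# agent's output to out/<task>.log for later review.
launch_batch() {
  runner="${AGENT_RUNNER:-claude -p}"   # assumed default; override as needed
  mkdir -p out
  for spec in "$@"; do
    name=$(basename "$spec" .md)
    # Each agent runs unattended; you come back and read out/<task>.log
    # alongside the diff it produced.
    $runner "$(cat "$spec")" > "out/$name.log" 2>&1 &
    echo "launched $name (pid $!)"
  done
  wait  # or drop `wait` and poll the logs while you scope the next batch
}

# Example usage:
#   launch_batch specs/date-filter.md specs/fix-empty-state.md
```

Keeping one log per task matters for step 4: when you return, the log plus the diff is everything you need to check the output against the spec.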
How many to run simultaneously
This depends on you, not the AI.
The limiting factor isn’t compute. It’s your review bandwidth. Every background agent produces output that needs your eyes. If you’re running 8 agents and can only thoughtfully review 4 before lunch, the other 4 are sitting in a queue — and by the time you get to them, you’ve lost the context of what you asked for.
Starting out: 3 agents. This is manageable for anyone. Scope 3 tasks, launch them, review them as they finish. You’ll have natural gaps between reviews where you can scope the next batch.
Comfortable: 5–6 agents. After a few weeks, you’ll get faster at both scoping and reviewing. You’ll develop a feel for which tasks finish quickly (copy changes, simple bug fixes) versus which take longer (new features, refactors). You’ll stagger them so reviews don’t all land at once.
Full capacity: 6–8 agents. This is where I operate most days. It requires fast scoping (I can write a solid spec in 90 seconds for a familiar codebase), efficient reviewing (I know what to look for and what to skip), and good task selection (a mix of quick wins and longer tasks so reviews are staggered).
Don’t jump to 8. You’ll produce sloppy reviews and merge bad code. Scale up as your review speed improves.
The failure modes
I’ve hit all of these. Here’s what to watch for.
Too many at once. You launch 8 agents Monday morning. By 10am, all 8 have output ready. You rush through reviews to clear the queue. You miss a bug in the Scouter commit because you were skimming. That bug hits production Wednesday. The problem wasn’t the agent — it was your review quality degrading under volume. Fix: launch in batches. 4 first, review those, then 4 more.
Specs too vague. You launch a background agent with “add filtering to the table.” You come back to a fully-featured filter system with text search, date ranges, multi-select dropdowns, and a saved filters feature. You wanted a single dropdown to filter by status. The agent wasn’t wrong — your spec was. Fix: be specific about what filtering means. Every ambiguous word in a spec is a decision the agent makes for you.
Not reviewing promptly. You scope 6 tasks Monday morning. You get pulled into a client call. By the time you review Tuesday, you’ve forgotten the context of half the specs. Reviews take twice as long because you’re re-reading your own specs to remember what you asked for. Fix: review within 2 hours of scoping. The context is fresh. The review is faster.
Wrong task granularity. You hand a background agent a task that’s really 4 tasks. “Build the settings page — add the routes, create the form components, connect to the API, add validation.” The agent does all of it, but the diff is 400 lines across 12 files. Your review is now a 30-minute slog. Fix: break it into 4 separate tasks. 4 small diffs are easier to review than 1 large one. Scoping tasks covers this in detail.
The compound effect
Here’s the math that changed my workflow.
Before background agents: I could implement maybe 3 features per day if I was focused and undistracted. That’s a good day.
With background agents at full capacity: 8 agents running 3 cycles per day (morning batch, midday batch, afternoon batch) is 24 launched tasks; after rejects and rework cycles, that produces roughly 20 reviewed commits per day. Not all of those are features — some are fixes, refactors, tests. But the throughput is 5–6x what I could do alone.
Across a week, that’s the difference between moving one project forward and moving all of them forward. For a solo builder running multiple projects, that’s the difference between a portfolio that stalls and one that ships.
The key insight: background agents don’t make you faster at coding. They make coding happen while you’re doing something else. That’s a different kind of leverage. It’s the kind that scales with how many things need building, which — if you’re a solo dev with ideas — is always more than you have time for.
Start with 3. Build the muscle. Then read the parallel development playbook for the full multi-project setup.