How to Write Specs That AI Can Execute

March 13, 2026

I’ve written hundreds of specs for Claude Code at this point. Some of them produced clean diffs in under 5 minutes. Some of them produced 400 lines of code I threw away.

The difference is always the spec.

This is the tactical version. Not the philosophy — the mechanics. What goes into a spec, what to leave out, and the mistakes I keep seeing.

The five elements

Every working spec I’ve written contains these five things. I covered them briefly in the spec that makes AI agents work independently, but here’s the expanded version with more examples.

1. File paths

Tell the agent exactly where to work.

Bad: “Add a new API route for search.”

Good: “Add the route in src/api/routes/search.ts. Register it in src/api/index.ts.”

File paths do two things. They prevent the agent from creating files in unexpected locations. And they force you to think about where this code belongs before the agent starts writing it. Half the architectural decisions happen at this step.

If you’re creating a new file, include the path for that too. If you’re modifying existing files, list every one. When I was building the workout tracking flow in Triumfit, I listed 4 files in the spec — the new screen, the navigation config, the API call, and the type definitions. Claude Code touched exactly those 4 files and nothing else.

2. The interface

What goes in. What comes out. Be specific about types.

Bad: “The endpoint should accept search parameters and return results.”

Good:

Parameters:
- q (string, required) – search query
- platform (enum: instagram | tiktok | youtube, optional)
- page (number, optional, default 1)
- per_page (number, optional, default 20, max 100)

Returns: PaginatedResponse<Creator> using the type from src/api/types.ts
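A spec this precise maps almost mechanically onto code. As a sketch of just the parameter rules above (nothing here is the real Scouter code, and `parseSearchParams` is a hypothetical helper):

```typescript
// Hypothetical sketch of the parameter rules above -- not real project code.
type Platform = "instagram" | "tiktok" | "youtube";

interface SearchParams {
  q: string;
  platform?: Platform;
  page: number;     // default 1
  per_page: number; // default 20, max 100
}

// Apply the defaults and bounds exactly as the spec states them.
function parseSearchParams(raw: {
  q?: string;
  platform?: string;
  page?: number;
  per_page?: number;
}): SearchParams {
  if (!raw.q) throw new Error("q is required");
  const platforms: Platform[] = ["instagram", "tiktok", "youtube"];
  return {
    q: raw.q,
    platform: platforms.includes(raw.platform as Platform)
      ? (raw.platform as Platform)
      : undefined,
    page: raw.page ?? 1,
    per_page: Math.min(raw.per_page ?? 20, 100),
  };
}
```

Notice that every line of this corresponds to a line of the spec. That is the point: when the interface is fully specified, the agent has nothing to invent.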

For UI components, the interface is props and rendered output:

Props:
- entries: Array<JournalEntry>
- onExport: (entryId: string) => void

Renders: A list of journal entries with an export icon button on each row.

The more precisely you define the interface, the less the agent improvises. Improvisation is the enemy.
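In TypeScript terms, that UI interface is small enough to state verbatim in the spec. A sketch (`JournalEntry` here is a stand-in for whatever type the codebase already defines, and `renderRows` is only an illustration of the rendered output):

```typescript
// Stand-in type; a real spec would reference the existing JournalEntry instead.
interface JournalEntry {
  id: string;
  title: string;
}

interface JournalListProps {
  entries: Array<JournalEntry>;
  onExport: (entryId: string) => void;
}

// Minimal sketch of the rendered output: one row per entry, each with an export action.
function renderRows(props: JournalListProps): string[] {
  return props.entries.map((e) => `${e.title} [export:${e.id}]`);
}
```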

3. Existing code references

This is the one most people skip. It’s the one that matters most.

Every codebase has conventions. Patterns. Utility functions. Shared types. If you don’t point the agent at them, it writes new ones. Then your codebase has two date formatting functions, three different API response wrappers, and four approaches to error handling.

Always include the shared types, utility functions, and conventions the agent should reuse, and point it at one existing file that already does things the way you want them done.

When I added the Octopus Coder blog layout, my spec referenced the existing BaseLayout.astro and PostCard.astro components. Claude Code used them instead of inventing new layout primitives. That saved me a cleanup pass.
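The failure mode is easy to demonstrate. Suppose the codebase already has a date formatter (a hypothetical `formatDate` in `src/utils/dates.ts`): a spec that references it gets one implementation, and a spec that doesn’t tends to get a second, subtly different one.

```typescript
// Hypothetical existing utility the spec should point the agent at,
// e.g. "use formatDate from src/utils/dates.ts".
function formatDate(d: Date): string {
  return d.toISOString().slice(0, 10); // YYYY-MM-DD
}

// What an un-pointed agent tends to produce: a near-duplicate formatter
// that behaves the same today and drifts apart later.
function formatDateDuplicate(d: Date): string {
  const year = d.getUTCFullYear();
  const month = String(d.getUTCMonth() + 1).padStart(2, "0");
  const day = String(d.getUTCDate()).padStart(2, "0");
  return `${year}-${month}-${day}`;
}
```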

4. Constraints

What the agent should NOT do. This is the most underrated element.

AI agents are helpful by default. They see an opportunity to add a feature, they add it. They notice an adjacent improvement, they make it. Without constraints, you get comprehensive solutions to problems you didn’t have.

Real constraints I’ve used:

- “No interactivity beyond hover tooltips.”
- “No review history. No status enum. No modal. Just a boolean.”

When I skipped constraints on a Lucid export spec, Claude Code built multi-format export with batch processing and export history. 400 lines across 12 files. The constrained spec produced 47 lines across 3 files. Same feature. One-tenth the code.

5. A test scenario

One concrete example that defines “done.”

Test: GET /api/creators/search?q=fitness&platform=instagram
returns a paginated list of creators matching the query.
Test: At step 3 of 5, the progress bar fills 60% of the container width.
Test: Clicking the export button downloads a PDF of the current journal entry.

The test isn’t necessarily for a test suite. It’s for the agent to know what success looks like. Without it, “done” is whatever the agent decides it is.
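Concrete scenarios like these translate almost directly into assertions. Taking the progress bar example (`progressWidth` is a hypothetical helper, not from any of these codebases):

```typescript
// Hypothetical helper matching the scenario "at step 3 of 5, fill 60%".
function progressWidth(step: number, totalSteps: number): number {
  if (totalSteps <= 0) throw new Error("totalSteps must be positive");
  return (step / totalSteps) * 100;
}
```

If the scenario is this mechanical to check, the agent can verify its own work before handing the diff back to you.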

The specificity spectrum

There’s a range between too vague and too prescriptive. Both ends cause problems.

Too vague: “Add analytics to the dashboard.”

The agent has to make every decision. Which metrics? What visualization? Where on the page? What data source? Every decision it makes is a decision you have to review and potentially undo.

Too prescriptive:

Create a div with className="chart-container" and style={{ width: '100%', height: 300 }}.
Inside, render a LineChart from recharts with the following props:
data={dauData}, margin={{ top: 5, right: 30, left: 20, bottom: 5 }}.
Add an XAxis with dataKey="date" and tick={{ fontSize: 12 }}.
Add a YAxis with tick={{ fontSize: 12 }}.
Add a Line with type="monotone", dataKey="count", stroke="#8884d8"...

At this point, you’re writing the code yourself in English. You’ve eliminated the agent’s judgment entirely. You might as well just write the code.

The sweet spot: Describe the what and where precisely. Describe the how loosely. Let the agent make implementation decisions within your constraints.

Task: Add DAU line chart to Scouter dashboard
File: src/components/Dashboard/DauChart.tsx
Render a 30-day line chart of daily active users.
Use the existing ChartWrapper from src/components/Charts/.
Data comes from analyticsApi.getDailyActiveUsers().
Match existing chart styles.
No interactivity beyond hover tooltips.

The agent decides which charting library, how to structure the component, how to handle loading states. Those are implementation details. You decided what to build, where it goes, and what it shouldn’t do. Those are design decisions. The split is right.
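One implementation detail the spec above deliberately leaves open is how to normalize the API data into a fixed 30-day window. A hedged sketch of that decision (the `DauPoint` shape is an assumption about what `analyticsApi.getDailyActiveUsers()` returns):

```typescript
interface DauPoint {
  date: string;  // YYYY-MM-DD -- assumed API shape
  count: number;
}

// Fill gaps so the chart always renders exactly `days` consecutive points
// ending at `end`; days with no data become zero.
function toDailySeries(points: DauPoint[], end: Date, days = 30): DauPoint[] {
  const byDate = new Map<string, number>(
    points.map((p): [string, number] => [p.date, p.count])
  );
  const series: DauPoint[] = [];
  for (let i = days - 1; i >= 0; i--) {
    const d = new Date(end.getTime() - i * 86_400_000);
    const key = d.toISOString().slice(0, 10);
    series.push({ date: key, count: byDate.get(key) ?? 0 });
  }
  return series;
}
```

Whether the agent writes something like this, or lets the charting library handle gaps, is exactly the kind of call the sweet-spot spec trusts it to make.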

Before and after

Here’s a real example from Scouter. I needed a “mark as reviewed” feature for campaign reports.

The vague prompt I started with:

“Add a way to mark campaign reports as reviewed.”

What Claude Code built: A full review system. Review status enum (pending, in_review, reviewed, approved, rejected). Review history with timestamps and reviewer notes. A review modal with a comment field. Status badges on every report card. A filtered view showing only unreviewed reports. Database migration for review_status and review_history columns. About 600 lines.

The spec I wrote after throwing that away:

Task: Add "Mark Reviewed" button to campaign report detail view

File: src/components/Reports/ReportDetail.tsx
Add a button labeled "Mark Reviewed" in the report header, right side.
On click, call reportsApi.markReviewed(reportId) from src/api/reports.ts.
After success, show a green checkmark icon replacing the button.

File: src/api/routes/reports.ts
Add POST /api/reports/:id/reviewed
Sets reviewed_at = now() on the report record.
Returns the updated report.

Constraint: No review history. No status enum. No modal. Just a boolean — reviewed or not.
Test: Clicking "Mark Reviewed" on report abc-123 sets reviewed_at and shows the checkmark.

What Claude Code built: Exactly that. 62 lines across 2 files. Correct on the first pass.

Same feature. The difference is I made the design decisions instead of delegating them to the agent.
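The constrained endpoint is correspondingly tiny. As a sketch against an in-memory store (nothing here is the real Scouter code; the actual route writes to the database):

```typescript
interface Report {
  id: string;
  reviewed_at: string | null; // the spec's single boolean-ish field
}

// In-memory stand-in for the reports table.
const reports = new Map<string, Report>([
  ["abc-123", { id: "abc-123", reviewed_at: null }],
]);

// POST /api/reports/:id/reviewed -- sets reviewed_at = now() and returns the report.
function markReviewed(
  id: string,
  now: () => string = () => new Date().toISOString()
): Report {
  const report = reports.get(id);
  if (!report) throw new Error(`report ${id} not found`);
  report.reviewed_at = now();
  return report;
}
```

No enum, no history table, no modal state. The constraint line in the spec is what keeps the implementation this small.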

Common spec mistakes

Referencing code that doesn’t exist. “Use the existing ReportReviewService” — but there is no ReportReviewService. The agent will either create one (not what you wanted) or hallucinate an import that breaks. Always verify that the code you reference actually exists. I’ve covered more patterns like this in common Claude Code mistakes.

Mixing multiple tasks into one spec. “Add the chart AND refactor the data fetching AND update the types.” That’s three specs. Split them. The agent handles focused tasks well and compound tasks poorly.

Forgetting to mention existing conventions. Your codebase uses Zod for validation everywhere. You don’t mention it. The agent uses manual type checks. Now you have two validation approaches. Always reference the convention.

Over-constraining implementation details. “Use a useEffect with a cleanup function that calls AbortController.abort().” If you know the implementation that precisely, write the code. Specs should constrain what, not how.

No test scenario. The agent doesn’t know when it’s done. It keeps going. It adds features. It “improves” things. A test scenario is a stop sign.

When to include code snippets vs describe behavior

Include code snippets when you’re referencing specific existing code the agent needs to match or use:

// Use this exact response format from src/api/types.ts:
interface PaginatedResponse<T> {
  data: T[];
  page: number;
  per_page: number;
  total: number;
}

Describe behavior when you’re specifying what to build:

“The button should be disabled while the export is in progress, and show a checkmark for 2 seconds after completion.”

The rule: snippets for existing code. Descriptions for new behavior.

The compound effect

One good spec saves you 10 minutes of review and rework. Eight good specs save you a morning. Across a week of running parallel Claude Code sessions, the specs compound into days of saved time.

But it’s not just time. Good specs produce consistent code. Every file follows the same patterns, uses the same utilities, matches the same conventions — because every spec explicitly references them.

For the full methodology that ties this all together — the workflow, the philosophy, the different spec types — read the spec coding complete guide.