Building a SaaS with Claude Code
February 28, 2026
Scouter is a B2B lead enrichment tool. It scrapes public data, enriches it, scores leads, and surfaces the best prospects for sales teams. It has a Next.js frontend, an Express API, PostgreSQL, a job queue for scraping, and it’s deployed on Railway.
I built it with Claude Code. Not as an experiment — as the primary development method. Here’s how that went.
The spec phase
Before I opened a terminal, I spent two days on specs.
That sounds like a lot. It is. But those two days saved me weeks of rework — the kind I've hit on every project where I skipped this step.
Data models first. I mapped out every entity: users, organizations, scouts (saved searches), prospects, enrichment results, scraping jobs. Each model got a Drizzle schema definition with types, relationships, and indexes.
// From the spec — this is what I handed Claude
import { pgTable, uuid, text, integer, jsonb, timestamp } from 'drizzle-orm/pg-core';
import { scouts } from './scouts'; // the scouts table, defined in its own module

export const prospects = pgTable('prospects', {
  id: uuid('id').primaryKey().defaultRandom(),
  scoutId: uuid('scout_id').references(() => scouts.id),
  name: text('name').notNull(),
  email: text('email'),
  company: text('company'),
  title: text('title'),
  score: integer('score').default(0),
  enrichmentData: jsonb('enrichment_data'),
  foundAt: timestamp('found_at').defaultNow(),
  createdAt: timestamp('created_at').defaultNow(),
});
API design second. Every endpoint — path, method, request body, response shape, error cases. I use a simple format:
POST /api/scouts
Body: { name, query, filters }
Response: { scout: Scout }
Errors: 400 (invalid query), 401 (not authenticated)
GET /api/scouts/:id/prospects
Query: { page, limit, minScore }
Response: { prospects: Prospect[], total: number }
Errors: 404 (scout not found), 401
Scraper architecture third. This was the most complex piece. The scraper needed to run as background jobs, handle rate limiting, retry on failures, and store results incrementally. I specced out the job queue (BullMQ), the scraper pipeline (fetch → parse → enrich → score → store), and the error handling strategy.
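To make the pipeline spec concrete, here is a minimal sketch of the fetch → parse → enrich → score → store stages as plain functions. All names, types, and the scoring rule are illustrative assumptions, not Scouter's actual code, and the BullMQ wiring around these stages is omitted so the stages themselves stay visible.

```typescript
// Hypothetical sketch of the scraper pipeline stages.
type RawPage = { url: string; html: string };
type ParsedLead = { name: string; company: string };
type EnrichedLead = ParsedLead & { email?: string; title?: string };
type ScoredLead = EnrichedLead & { score: number };

// Incremental storage stand-in; real code writes to PostgreSQL.
const store: ScoredLead[] = [];

function parse(page: RawPage): ParsedLead[] {
  // Placeholder parser: real code extracts leads from page.html.
  return [{ name: 'Ada Example', company: 'Example Co' }];
}

function enrich(lead: ParsedLead): EnrichedLead {
  // Placeholder enrichment: real code looks up email/title from public data.
  return { ...lead, email: undefined, title: undefined };
}

function score(lead: EnrichedLead): ScoredLead {
  // Illustrative scoring rule: +50 if enrichment found an email.
  return { ...lead, score: lead.email ? 50 : 0 };
}

// The fetch stage (HTTP with rate limiting) is elided; we hand in a page.
function runPipeline(page: RawPage): void {
  for (const lead of parse(page)) {
    store.push(score(enrich(lead)));
  }
}

runPipeline({ url: 'https://example.com', html: '<html></html>' });
```

Keeping each stage a pure function is also what makes the later per-stage specs (and tests) possible.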
The full spec was about 15 pages. Not prose — structured definitions. Models, endpoints, system behaviors, constraints.
This is the spec coding methodology at project scale. The spec is the source of truth. Claude executes against it.
The build phase
With specs in hand, the build was mechanical. I broke it into task groups and ran Claude Code sessions for each.
Week 1: Foundation. Database models, migrations, auth system, basic API skeleton. Each piece was a separate Claude Code session with a specific spec section as the prompt.
Task: Implement the database models in src/models/ using Drizzle ORM.
Reference: spec/data-models.md
Constraints: Follow the exact schema definitions.
Use the project's existing Drizzle config in drizzle.config.ts.
Claude produced the models in one session. I reviewed the diff, caught one missing index, had Claude fix it, committed.
Week 2: API endpoints. CRUD operations for all entities. This is where the scoping discipline paid off. Each endpoint was its own task. I’d run 4-5 in parallel — scouts CRUD, prospects CRUD, user settings, search filters, export functionality.
Most sessions produced clean, mergeable code on the first pass. The spec told Claude exactly what each endpoint should accept and return. There wasn’t room for creative interpretation.
Week 3: Scraper system. The hardest part. The scraper involved more moving pieces — job queues, rate limiting, retry logic, incremental result storage. I broke it into smaller specs:
- Job queue setup and configuration
- Base scraper class with retry and rate limiting
- Individual scraper implementations
- Result processing pipeline
- Score calculation
Each one got its own session. The sessions were sequential here, not parallel, because each piece depended on the last. I’d review, commit, then spec the next piece referencing the committed code.
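The "base scraper class with retry and rate limiting" spec roughly corresponds to a shape like this — a sketch under the assumption that rate limiting means a minimum interval between requests. Class and method names are illustrative, not Scouter's actual code.

```typescript
// Illustrative base scraper: subclasses implement scrape(); the base class
// enforces a minimum delay between outgoing requests.
abstract class BaseScraper {
  private lastRequestAt = 0;

  constructor(private minIntervalMs: number) {}

  // Wait until at least minIntervalMs has passed since the last request.
  protected async throttle(): Promise<void> {
    const wait = this.lastRequestAt + this.minIntervalMs - Date.now();
    if (wait > 0) await new Promise((r) => setTimeout(r, wait));
    this.lastRequestAt = Date.now();
  }

  abstract scrape(url: string): Promise<string[]>;
}

class DirectoryScraper extends BaseScraper {
  async scrape(url: string): Promise<string[]> {
    await this.throttle();
    // Real code would fetch and parse `url`; return a marker instead.
    return [`lead-from-${url}`];
  }
}
```

Putting throttling in the base class is what lets each "individual scraper implementation" session stay small: subclasses only describe what to fetch and parse.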
Week 4: Frontend. Next.js pages, components, data fetching. This went fastest because the API was already built and tested. Claude could see the API types and build the frontend against them.
The iteration phase
The first pass is never shippable. It’s close, but there are always gaps.
What needed fixing after the first pass:
- Edge cases in the scraper’s retry logic. The initial implementation retried on all errors, including 404s. Fixed with a one-line spec: “Only retry on 429 and 5xx status codes.”
- The prospect scoring algorithm weighted recent data too heavily. I adjusted the weights in the spec and had Claude regenerate the scoring function.
- The frontend’s pagination component had an off-by-one error on the last page. Standard stuff.
Each fix was a small, scoped task. Describe the problem, reference the file, let Claude fix it, review the diff. The trust-but-verify approach — I trusted Claude to write the fix, but I verified every diff before committing.
What took longer than expected
The scraper rate limiting. My initial spec was too abstract. “Handle rate limiting appropriately” isn’t a spec — it’s a wish. When I rewrote it with specific behavior (“wait 60 seconds after a 429, exponential backoff starting at 2 seconds for 5xx, maximum 3 retries per request”), Claude nailed it on the first try.
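A spec that concrete translates almost line-for-line into code. A sketch (durations in milliseconds; the function name is illustrative):

```typescript
// Returns how long to wait before retrying, or null for "don't retry".
function retryDelayMs(status: number, attempt: number): number | null {
  if (attempt >= 3) return null;                   // maximum 3 retries per request
  if (status === 429) return 60_000;               // wait 60 seconds after a 429
  if (status >= 500) return 2_000 * 2 ** attempt;  // 2s, 4s, 8s backoff for 5xx
  return null;                                     // anything else (e.g. 404): give up
}
```

Note that the 404 case from the earlier retry bug falls out of the spec for free: anything that isn't a 429 or 5xx returns null.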
Lesson: vague specs produce vague code. Every hour I saved by writing a lazy spec cost two hours of rework.
Testing the job queue. Testing async job processing is inherently harder than testing synchronous code. Claude wrote the tests, but the test setup — mocking BullMQ, simulating job completion, handling race conditions — required more back-and-forth than the application code itself.
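One pattern that reduced the mocking burden — a general technique, not necessarily how Scouter's tests ended up — is to keep the job processor a plain exported function and test it with a fake job object, leaving BullMQ itself out of the unit tests. All names here are hypothetical.

```typescript
// The processor a BullMQ Worker would wrap; tested directly with a fake job.
type ScrapeJobData = { scoutId: string; url: string };
type FakeJob = { data: ScrapeJobData };

// Stand-in for incremental result storage.
const results: string[] = [];

async function processScrapeJob(job: FakeJob): Promise<void> {
  // Real code would scrape job.data.url; record a marker instead.
  results.push(`scraped:${job.data.url}`);
}
```

The queue integration (job completion events, race conditions) still needs its own tests, but the business logic no longer depends on simulating Redis.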
Auth edge cases. Session expiry, token refresh, concurrent requests during refresh. These are the places where reviewing AI-generated code really matters. The happy path was fine. The edge cases needed manual attention.
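The concurrent-requests-during-refresh case is worth spelling out, because the happy-path code looks correct and still double-refreshes. A common fix is single-flight refresh: concurrent callers share one in-flight promise. This is an illustrative sketch, not Scouter's actual auth module.

```typescript
// Single-flight token refresh: only one refresh runs at a time.
let refreshCount = 0;
let inflight: Promise<string> | null = null;

async function doRefresh(): Promise<string> {
  refreshCount++;
  // Real code would call the auth server; simulate latency instead.
  await new Promise((r) => setTimeout(r, 10));
  return `token-${refreshCount}`;
}

function getFreshToken(): Promise<string> {
  // Reuse the pending refresh if one is already running.
  if (!inflight) {
    inflight = doRefresh().finally(() => { inflight = null; });
  }
  return inflight;
}
```

Without the `inflight` check, two requests arriving during expiry each trigger a refresh, and whichever token lands second can invalidate the first.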
What was surprisingly fast
CRUD operations. With a clear spec and existing patterns to reference, Claude produced all the basic CRUD endpoints in a single afternoon. 15 endpoints, each with validation, error handling, and TypeScript types. This would have been two days of manual work.
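The per-endpoint shape — validate, return typed errors, create — looked roughly like this. A hand-rolled sketch of the `POST /api/scouts` contract from the spec; the real endpoints ran inside Express with the project's validation layer, and every name here beyond the spec's fields is illustrative.

```typescript
import { randomUUID } from 'node:crypto';

type Scout = { id: string; name: string; query: string };
type CreateResult =
  | { status: 201; scout: Scout }
  | { status: 400; error: string }; // 400 (invalid query), per the spec

function createScout(body: unknown): CreateResult {
  const b = body as Partial<Scout>;
  if (typeof b?.name !== 'string' || b.name.length === 0) {
    return { status: 400, error: 'name is required' };
  }
  if (typeof b?.query !== 'string' || b.query.length === 0) {
    return { status: 400, error: 'query is required' };
  }
  return {
    status: 201,
    scout: { id: randomUUID(), name: b.name, query: b.query },
  };
}
```

With 15 endpoints sharing this shape, each Claude session mostly filled in fields and error cases from the spec.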
Database migrations. Hand Claude a schema definition, get back a working Drizzle migration. Every time.
Frontend components. Once the first few components established the pattern (data fetching with SWR, error states, loading states), Claude replicated the pattern perfectly for every subsequent component. I’d scope a new page, point Claude at an existing page as a reference, and get back consistent output.
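The replicated pattern boils down to mapping SWR's `{ data, error, isLoading }` into one of three render states. A pure helper like this captures the skeleton (illustrative, not the actual component code — the real components render JSX around it):

```typescript
// The three states every data-fetching component handled.
type FetchState<T> = { data?: T; error?: Error; isLoading: boolean };
type View<T> =
  | { kind: 'loading' }
  | { kind: 'error'; message: string }
  | { kind: 'ready'; data: T };

function toView<T>(s: FetchState<T>): View<T> {
  if (s.isLoading) return { kind: 'loading' };
  if (s.error) return { kind: 'error', message: s.error.message };
  return { kind: 'ready', data: s.data as T };
}
```

Because the mapping is identical everywhere, pointing Claude at any one existing page was enough to reproduce it on the next.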
Test coverage. Claude wrote the initial test suite for the API — about 80 test cases across all endpoints. Would have taken me two full days. Claude did it across 4 parallel sessions in an afternoon.
The role of specs across a multi-week build
Here’s what I didn’t expect: the specs became more valuable over time, not less.
In week 1, the specs guided the initial build. Standard.
In week 3, the specs were reference documents. When Claude was building the scraper, it could read the API spec to understand how results should be stored. The specs connected the system together.
In week 4, the specs were the contract between frontend and backend. Claude could build frontend components that matched the API exactly because both were built against the same spec.
After launch, the specs are documentation. When I come back to add a feature, the spec tells me (and Claude) how the system works without reading all the code.
Specs aren’t just a build tool. They’re a living document that keeps the AI on track across sessions, across weeks, across the entire lifecycle of the project.
The numbers
- Spec writing: ~16 hours across 2 days
- Build time: ~4 weeks, working ~5 hours/day
- Lines of code: ~12,000 (application) + ~4,000 (tests)
- Lines I wrote by hand: Maybe 200. Mostly config files and environment setup.
- Claude Code sessions total: ~60
- Sessions that produced usable code on first pass: ~45 (75%)
- Sessions that needed re-scoping: ~15
What I’d do differently
Write more granular specs for complex systems. The scraper spec should have been 3 separate documents, not one monolithic one.
Start with integration tests, not unit tests. Claude writes excellent unit tests, but the bugs that slipped through were integration-level — things that only broke when components talked to each other.
Commit more aggressively. There were a few sessions where I let Claude make 3-4 changes before committing. When the fourth change introduced a bug, rolling back meant losing changes 1-3 too.
The takeaway
Building a SaaS with Claude Code isn’t magic. It’s project management. You scope the work, write the specs, run the sessions, and review the output. The AI writes the code. You make the decisions.
The total build time for Scouter was about 100 hours of my time. A similar project, coded manually, would have been 300-400 hours. The specs account for maybe 20% of the time savings. The parallel execution accounts for the rest.
If you’re considering building something real with Claude Code, start with the specs. Everything else follows.