How to write a rescue spec for a broken AI project

The rescue spec is the contract between you and whoever (or whatever) is going to fix the broken project. It's not optional documentation. Without it, every repair conversation starts from scratch, every fix is improvised, and the repair quality is bounded by whatever the repairer happened to notice in the moment.

With a rescue spec, the repair has a target. The repairer knows what's broken, what they're not allowed to change, what success looks like, what evidence proves it, what timeline they have, and what's outside their scope. The conversation becomes "execute against this spec" instead of "figure out what we're trying to do here."

I want to walk through the six required sections of a rescue spec: what each one should contain, and how to write it so that it produces usable guidance rather than vague aspiration.

This is the spec discipline I run on my own projects when something goes wrong. The same discipline applies whether the repairer is a human consultant, a different AI session, or you returning to the project after some time away.

Section one: current broken state

What's wrong, in specific terms.

Bad version: "The app is broken." Doesn't tell the repairer anything.

Good version: "When a user submits the signup form, the form appears to submit (loading spinner shows, then disappears) but no record is created in the database. This started after the agent's session yesterday that was supposed to add email verification. The agent reported success. The signup flow worked correctly before that session."

The good version includes: specifically what behavior is wrong, what happens visibly, what happens invisibly, when it started, which change most likely caused it, and what was working before.

This section should be as specific as you can make it. Vagueness here cascades into wrong fixes everywhere downstream. If you don't know exactly what's wrong, the first task is investigation rather than fixing.

Section two: protected scope

What must not change during the repair.

Bad version: "Don't break anything else." Doesn't define what's at risk.

Good version: "The following must not change: (1) the existing authentication flow including OAuth integration, (2) the user table schema, (3) the analytics integration that fires on signup. Do not modify files in /lib/auth/, /db/migrations/, or /lib/analytics/. Do not change the routing for /signup or /verify pages."

The good version specifies: what behaviors must be preserved, what files are off-limits, what URLs must remain stable, what integrations must continue working.

This section is critical because AI agents (and occasionally human repairers) tend to "improve" things outside the requested scope while doing the requested work. The protected scope prohibits that explicitly.

If you can't enumerate the protected scope, you don't yet understand what your app does well enough to safely repair it. The first task is mapping the protected scope; the repair starts after.
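
One way to make the protected scope checkable rather than aspirational is to compare the files a repair touched against the off-limits directories. The following is a minimal sketch, assuming the changed-file list comes from something like `git diff --name-only`; the function name and the sample file names are hypothetical, and only the three directories come from the example spec above.

```python
from pathlib import PurePosixPath

# Off-limits directories taken from the example spec above.
PROTECTED_DIRS = ["lib/auth", "db/migrations", "lib/analytics"]


def protected_violations(changed_files, protected_dirs=PROTECTED_DIRS):
    """Return the changed files that live inside a protected directory."""
    violations = []
    for name in changed_files:
        parts = PurePosixPath(name.lstrip("/")).parts
        for protected in protected_dirs:
            prefix = PurePosixPath(protected).parts
            # Compare whole path components, so "lib/authz/..." does not
            # count as being inside "lib/auth".
            if parts[:len(prefix)] == prefix:
                violations.append(name)
                break
    return violations


# Hypothetical changed-file list, e.g. `git diff --name-only` split into lines.
changed = [
    "app/signup/form.tsx",
    "lib/auth/session.ts",
    "lib/authz/roles.ts",
    "db/migrations/0042_add_verified.sql",
]
print(protected_violations(changed))
```

A non-empty result means the repair crossed the protected scope and should be reviewed before anything is merged. This only covers the file-level part of the section; preserved behaviors and integrations still need their own checks.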

Section three: success criteria

What "fixed" actually means.

Bad version: "Make it work again." Doesn't define what "work" means.

Good version: "Success criteria, all must be met: (1) submitting a valid signup form creates a user record in the users table, (2) the user receives a verification email at the address they submitted, (3) clicking the verification link in the email marks the user as verified in the database, (4) the existing tests for the signup flow still pass, (5) the signup-completion analytics event fires correctly. Each criterion must be verifiable via the surface check (does the visible behavior happen) and the semantic check (does the underlying state match)."

The good version specifies: conditions that are specific and verifiable, a test for every condition, both surface and semantic verification, and an unambiguous definition of success.

When success criteria are specific and verifiable, the repair has a clear endpoint. When they're vague, the repair drifts because there's no defined "done."
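
The surface-plus-semantic requirement can be sketched as a pair of checks per criterion, where a criterion counts as met only when both pass. This is an illustration under stated assumptions: the `Criterion` class, `evaluate` function, and the in-memory `fake_ui` and `fake_db` stand-ins are all hypothetical. Real checks would drive the UI and query the actual database.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    surface_check: Callable[[], bool]   # does the visible behavior happen?
    semantic_check: Callable[[], bool]  # does the underlying state match?


def evaluate(criteria):
    """Return (all_met, failures). A criterion passes only if both checks pass."""
    failures = []
    for c in criteria:
        if not c.surface_check():
            failures.append(f"{c.name}: surface check failed")
        if not c.semantic_check():
            failures.append(f"{c.name}: semantic check failed")
    return (not failures, failures)


# Stand-in state reproducing the bug described earlier: the form looks
# like it submitted, but no user row exists.
fake_ui = {"signup_form_submitted": True}
fake_db = {"users": []}

criteria = [
    Criterion(
        name="signup creates user record",
        surface_check=lambda: fake_ui["signup_form_submitted"],
        semantic_check=lambda: len(fake_db["users"]) == 1,
    ),
]

done, failures = evaluate(criteria)
print(done, failures)
```

With the stand-in state above, the surface check passes and the semantic check fails, which is exactly the kind of divergence a surface-only criterion would miss.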

Section four: evidence requirements

What proves success was achieved.

Bad version: "Show me it works." Doesn't define the evidence.

Good version: "Evidence to provide on completion: (1) screenshots of the signup flow showing each step's UI, (2) database query output showing a test user was created during the verification, (3) the verification email content (text and headers), (4) test suite output showing all signup tests pass, (5) analytics dashboard screenshot showing the signup event fired for the test user, (6) git diff showing the changes made."

The good version specifies: what artifacts to provide, in what form, covering each success criterion, and the work-product that documents what was done.

Without evidence requirements, the repairer's report of success is unverifiable. With them, success is documented and auditable. This matters especially with AI-generated fixes where the agent's report of completion has been known to diverge from actual state.
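
A quick way to confirm that the evidence covers every success criterion is to map each artifact to the criteria it supports and look for gaps. The sketch below assumes a hand-written manifest; the file names and the artifact-to-criterion mapping are hypothetical, and the criterion numbers refer to the five success criteria in the example spec.

```python
# Criterion numbers refer to the five success criteria in the example spec.
REQUIRED_CRITERIA = {1, 2, 3, 4, 5}

# Hypothetical evidence manifest: artifact name -> criteria it supports.
evidence = {
    "signup_flow_screenshots.png": {1},
    "users_table_query.txt": {1, 3},
    "verification_email.eml": {2},
    "test_suite_output.txt": {4},
    # The diff documents the work but proves no criterion on its own.
    "changes.diff": set(),
}


def uncovered_criteria(required, manifest):
    """Return the criteria that no submitted artifact supports."""
    covered = set().union(*manifest.values())
    return sorted(required - covered)


print(uncovered_criteria(REQUIRED_CRITERIA, evidence))
```

In this hypothetical manifest the analytics screenshot was never submitted, so criterion 5 shows up as uncovered and the repair is not yet verifiably complete.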

Section five: deadline and priority

What's the timeline, what's negotiable.

Bad version: "ASAP." Doesn't help the repairer prioritize.

Good version: "Hard deadline: this needs to ship by Friday EOD because of a customer launch dependency. Acceptable trade-offs: minor UI inconsistencies can be deferred, the verification email design can be plain HTML instead of branded, the success page can be basic instead of polished. Non-negotiable: the core signup-to-verified flow must work reliably for real users."

The good version specifies: the deadline, what the deadline is tied to, what can be traded off to meet it, and what cannot be traded off regardless.

When the deadline structure is clear, the repairer can make local trade-offs intelligently. When it's not, the repairer either over-engineers (missing the deadline) or under-engineers (missing the actual requirements).

Section six: repair boundary

What's in scope for the repair, what's escalation.

Bad version: "Fix whatever needs fixing." No boundary, infinite scope.

Good version: "Repair boundary: in-scope is restoring the signup-to-verified flow including the database operations and email sending. Out-of-scope and requires escalation: changes to authentication, payment integration, third-party identity providers. If the repair surfaces issues outside the in-scope area, stop and escalate before proceeding."

The good version specifies: what's in scope, what's out of scope, and what to do when the repair surfaces out-of-scope issues.

This section prevents scope creep during repair. Without it, the repairer might fix things outside the original request, producing changes you weren't expecting and didn't authorize. With it, scope is bounded and surprises become escalations rather than silent changes.
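
The boundary rule amounts to a default-escalate decision: proceed only when an issue falls in an explicitly in-scope area, and escalate everything else, including areas the spec never mentioned. A minimal sketch, assuming issues are labeled by area; the area labels and the `next_action` function are hypothetical, loosely paraphrasing the example spec.

```python
# Area labels paraphrased from the example spec above (hypothetical naming).
IN_SCOPE = {"signup flow", "database operations", "email sending"}
OUT_OF_SCOPE = {"authentication", "payment integration", "third-party identity providers"}


def next_action(area):
    """Decide what the repairer does when an issue surfaces in `area`."""
    if area in IN_SCOPE:
        return "proceed"
    if area in OUT_OF_SCOPE:
        return "escalate: listed as out of scope"
    # Default-escalate: silence in the spec is not permission.
    return "escalate: not covered by the spec"


for area in ["email sending", "payment integration", "rate limiting"]:
    print(area, "->", next_action(area))
```

The design choice worth noting is the last branch. Treating unlisted areas as escalations is what turns surprises into conversations instead of silent changes.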

Putting it together: the spec template

A complete rescue spec, formatted for a repair handoff:

RESCUE SPEC: [project name]

CURRENT BROKEN STATE
[Specific description of what's wrong, what should happen, what does happen, when it broke, what caused it]

PROTECTED SCOPE
[Behaviors that must be preserved]
[Files that are off-limits]
[URLs that must remain stable]
[Integrations that must continue working]

SUCCESS CRITERIA
[Numbered list of specific, verifiable conditions]
[Verification approach for each: surface + semantic]

EVIDENCE REQUIREMENTS
[Artifacts to provide on completion]
[Format and coverage]

DEADLINE AND PRIORITY
[Hard deadline and what it's tied to]
[Acceptable trade-offs]
[Non-negotiable requirements]

REPAIR BOUNDARY
[In-scope work]
[Out-of-scope work requiring escalation]
[Escalation process]

A good rescue spec runs one to three pages, depending on the project's complexity. A spec shorter than one page is usually missing important sections. One longer than three pages is usually over-specified for a rescue (as opposed to a new build).
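
If you want the template to be checkable before a handoff, the six sections can be held in a small data structure that reports which ones are still empty. This is a sketch under stated assumptions, not a prescribed format: the `RescueSpec` class, its method names, and the sample project name are hypothetical; only the section names come from the template above.

```python
from dataclasses import dataclass, fields


@dataclass
class RescueSpec:
    project: str
    current_broken_state: str = ""
    protected_scope: str = ""
    success_criteria: str = ""
    evidence_requirements: str = ""
    deadline_and_priority: str = ""
    repair_boundary: str = ""

    def missing_sections(self):
        """Return the names of sections that are still empty."""
        return [
            f.name for f in fields(self)
            if f.name != "project" and not getattr(self, f.name).strip()
        ]

    def render(self):
        """Render the spec as plain text using the template's headings."""
        lines = [f"RESCUE SPEC: {self.project}"]
        for f in fields(self):
            if f.name == "project":
                continue
            heading = f.name.replace("_", " ").upper()
            lines += ["", heading, getattr(self, f.name)]
        return "\n".join(lines)


draft = RescueSpec(
    project="signup-fix",
    current_broken_state="Signup form submits but no user record is created.",
    success_criteria="(1) Valid signup creates a row in the users table.",
)
print(draft.missing_sections())
```

A non-empty list from `missing_sections()` is a signal that the spec is not ready to hand off, whatever its page count.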

Why this matters for AI-driven repair

AI agents work better against specifications than against vague requests. The same pattern that makes AI useful for new builds (specifications + iterative generation against them) makes AI useful for repair (rescue specs + iterative work against them).

Without the spec, the AI is guessing at what you want. The guess is sometimes right, often partially right, sometimes badly wrong. The bad cases produce the scope creep and protected-scope violations that this cluster's posts cover repeatedly.

With the spec, the AI has clear targets. The work focuses on hitting them. The verification has clear criteria. The repair has a defined endpoint. The outcome is much more reliable.

The pattern of capturing intent before generating applies identically to repairs; for new builds, it is documented in how to vibe code a production landing page. The spec is the artifact that turns AI from "make educated guesses" to "execute against this contract."

The codification angle

Each rescue spec you write becomes raw material for the next one. Patterns repeat: certain protected scopes appear in most rescues, certain success-criteria shapes work better than others, certain evidence formats are easier to verify.

Capturing the patterns turns rescue-spec writing from improvised work into template work. Over time, you develop a personal template that covers most rescues, with project-specific details filled in for each.

This is the same Move 1 → Move 2 codification loop covered in why AI keeps changing your code, applied here to rescue specifications. Each spec is a Move 1; the pattern extracted from it becomes part of the Move 2 template.

The discipline that turns the spec into a framework is documented in the four-layer enforcement framework, where the spec is the layer-one constraint that everything downstream operates against.

What to do when the project's so broken you can't write the spec

Sometimes the project is in a state where you can't even write the spec accurately because you don't yet know enough about what's broken. This is itself diagnostic.

The first move is investigation, not repair. Run the ai-coding-agent-broke-my-app recovery sequence to find out what state the system is actually in. Capture the baseline. Run the surface-vs-semantic check. Identify what's specifically broken.

Once you know what's broken, the rescue spec becomes writable. The investigation produces the input the spec needs.

Skipping the investigation and trying to spec the repair against a state you don't understand produces a spec that's based on assumptions, which produces repairs against the wrong baseline.

What this gives you

A well-written rescue spec turns a panic into a triageable engagement. The repairer (human, AI, or future-you) can execute against it. The verification is structured. The boundary is clear. The outcome is documented.

Without it, every rescue is improvised, and improvised rescues produce inconsistent outcomes. With it, rescues become repeatable, and the repeatability is what lets a team get good at rescue work over time rather than treating each one as a fresh crisis.

The spec is the most underrated artifact in AI-assisted work generally and in repair work specifically. Operators who treat it as essential produce reliable outcomes. Operators who treat it as overhead produce the firefighting cycles this cluster repeatedly addresses.

