AI-assisted development breaks down once context resets become routine. Production work is interruption-heavy and constraint-heavy. If a workflow cannot survive that reality, it is not production-ready.
Spec-driven development is the opposite of "vibe coding" - the approach where you prompt an AI, accept whatever it generates, and hope for the best. Vibe coding optimizes for the first iteration. Spec-driven development optimizes for the next hundred.
You're reading Part 2 of 5 in the AI-assisted development series. Previous: Part 1: Understanding AI Code Assistants Next: Part 3: Security Boundaries for AI-Assisted Development in ASP.NET Core
This series moves from workflow -> safety -> performance -> publishing, using DAP iQ as the working system.
Common questions this answers
- How do you keep AI output reliable when context resets happen?
- What is the minimum spec that produces reviewable diffs?
- What do you record so the next session starts from facts?
- Why does vibe coding break down in production?
Why this exists
I want AI-assisted development to be repeatable for real systems. That means a workflow that produces diffs you can review, commands you can run, and decisions you can audit.
Default rule
Assume context will reset. Externalize decisions and constraints into small, durable artifacts.
This pattern enforces one critical discipline: read all memory bank files at the start of every task. This is not optional. Skipping it is how context drift starts.
Definition (what this means in practice)
A spec-driven AI workflow is a loop where:
- you write a short spec with constraints and acceptance criteria
- the assistant produces a small diff
- you validate it and checkpoint the result into durable project memory
In practice, this means keeping a small memory-bank/ folder, writing one spec at a time, and updating project memory after every meaningful change.
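A minimal layout might look like this. The specs/ subfolders follow the active/backlog/done states described later in this article, and the active spec filename is only an example:
memory-bank/
  projectbrief.md
  productContext.md
  techContext.md
  systemPatterns.md
  activeContext.md
  progress.md
  specs/
    exempt-cached-feeds.md
    backlog/
    done/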
Terms used
- Memory bank: a small set of files that hold project facts and current intent. The pattern originates from Cline's Memory Bank. Related patterns include CLAUDE.md files and Cursor's .cursorrules.
- Hierarchical flow: files build on each other. projectbrief anchors everything; productContext, systemPatterns, and techContext extend it; activeContext captures current state; progress tracks what works.
- Active spec: the one spec you are implementing right now.
- Checkpoint: a short update that records what changed and what is next.
- Validation ladder: commands and HTTP checks that prove the work.
- Vibe coding: prompting an AI and accepting what it generates without a spec. Fast for prototypes, fragile for production.
Reader contract
This article is for:
- Engineers shipping production systems.
- Anyone who wants AI speed without losing review discipline.
You will leave with:
- A spec template that forces reviewable diffs.
- A minimal "memory bank" structure that survives context resets.
- A validation ladder that keeps AI output grounded.
This is not for:
- Prompt-only workflows.
- Teams that will not run real commands.
Quick start (10 minutes)
If you do nothing else, do this:
Verified on: ASP.NET Core (.NET 10), EF Core 10.
- Create a folder called memory-bank/.
- Add one file called activeContext.md.
- Write down three bullets: current goal, constraints, and what "done" means.
- Before every AI session, paste those bullets into the prompt.
- After every AI session, update that file with what changed and what is next.
This is deliberately small. It is also the difference between "fast today" and "reliable next month".
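If you want a concrete starting point, here is what a first activeContext.md might contain. The goal and constraints are illustrative, borrowed from the worked spec later in this article:
# Active Context
Current goal: exempt cached feed endpoints from rate limiting.
Constraints: ASP.NET Core MVC (.NET 10); no new packages; keep OutputCache on feeds.
Done means: /insights/feed and /sitemap.xml stay 200 under repeated requests.
Recent changes: none yet.
Next step: write the spec.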
The core loop: Plan -> Implement -> Checkpoint
Treat the assistant as a compiler for intent, not a source of intent. The intent lives in a spec. The result is a diff.
- Plan: write a small spec with acceptance criteria.
- Implement: change code or content to match the spec.
- Checkpoint: update project memory, then move the spec to done.
In DAP iQ, the loop is enforced with a small command vocabulary in AGENTS.md.
The important part is not the file name.
It is the invariant: there is one active spec and it moves when finished.
Minimal AGENTS.md example (commands)
This is an intentionally small command vocabulary. It keeps sessions consistent and makes "what happens next" obvious.
# Commands
INIT
STATUS
ASK <question>
PLAN <feature>
IMPLEMENT
BACKLOG <item>
CHECKPOINT
UPDATE MEMORY
# Rules
- Memory first: read memory-bank before work.
- Exactly one active spec in memory-bank/specs/.
- No edits during PLAN.
- Implement only the active spec.
Start-of-day routine (5 minutes)
- Run INIT.
- Read memory-bank/activeContext.md and memory-bank/progress.md.
- Pick one spec to plan or implement.
End-of-session routine (5 minutes)
- Update memory-bank/activeContext.md with what changed.
- Update memory-bank/progress.md with what works / what's left.
- Run CHECKPOINT to archive the spec.
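Concretely, the CHECKPOINT step can be a handful of git commands. A sketch, assuming specs live in memory-bank/specs/ with a done/ subfolder (the spec filename is hypothetical):
# Move the finished spec to done/ and commit the memory updates with it.
git mv memory-bank/specs/exempt-cached-feeds.md memory-bank/specs/done/exempt-cached-feeds.md
git add memory-bank/activeContext.md memory-bank/progress.md
git commit -m "checkpoint: exempt cached feeds from rate limiting; validated with curl"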
Reference implementation
This is the minimum set of artifacts that makes the workflow real.
A spec template you can copy/paste
Use this as the input to the assistant. It forces scope, touched files, and a rollback plan.
# Spec: <short name>
## Summary
<1-2 sentences>
## Constraints
- Stack: <framework/runtime>
- Security: <guardrails>
- SEO: <canonical/meta/structured data rules>
- Performance: <cache/query rules>
## Scope
In scope:
- <bullet>
Out of scope:
- <bullet>
## Files to touch
- <path>
## Acceptance criteria
- <observable check>
## Validation
- Commands:
- <command>
- HTTP checks:
- <url>
## Rollback
<how to revert safely>
Worked example spec (real, not templated)
This is what a small, production-ready spec looks like.
# Spec: Exempt cached feeds from rate limiting
## Summary
Prevent 429s for crawlers by exempting cached, read-only feed/sitemap endpoints from rate limiting.
## Constraints
- Stack: ASP.NET Core MVC (.NET 10)
- Security: do not trust spoofable IP headers
- SEO: preserve canonical URLs and feed/sitemap routes
- Performance: keep OutputCache enabled on feeds and sitemap
## Files to touch
- src/DapIq.Website/Program.cs
- memory-bank/activeContext.md
## Acceptance criteria
- /insights/feed stays 200 under repeated requests (no 429)
- /sitemap.xml stays 200 under repeated requests (no 429)
- Rate limiting still applies to normal MVC routes
## Validation
- Commands:
- dotnet build -c Release
- cd src/DapIq.Website && dotnet run --launch-profile http
- HTTP checks:
- curl -I http://localhost:5000/insights/feed
- curl -I http://localhost:5000/sitemap.xml
## Rollback
Revert the Program.cs change and rebuild.
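For orientation, here is a minimal sketch of what that Program.cs change could look like using ASP.NET Core's built-in rate limiting. The partition keys, limits, and paths are illustrative, not the production DAP iQ configuration:
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
    {
        // Cached, read-only endpoints get no limiter partition at all.
        if (context.Request.Path.StartsWithSegments("/insights/feed") ||
            context.Request.Path.StartsWithSegments("/sitemap.xml"))
        {
            return RateLimitPartition.GetNoLimiter("cached-read-only");
        }

        // Everything else shares a fixed-window limit (values illustrative).
        return RateLimitPartition.GetFixedWindowLimiter("default", _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromMinutes(1)
        });
    });
});

var app = builder.Build();
app.UseRateLimiter();
Note the partition key is a constant, not a client address: the spec's security constraint rules out trusting spoofable IP headers.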
Memory bank file contracts
Keep it boring. Boring is what survives.
Keep the memory bank short on purpose:
- activeContext.md target: <= 25 lines
- progress.md target: <= 50 lines
- systemPatterns.md target: <= 100 lines
If a file grows past the cap, split it or delete stale content.
| File | Purpose | Must contain |
|---|---|---|
| projectbrief.md | Mission and non-negotiables | Stack, constraints, audience |
| productContext.md | Why the product exists | Problems solved, UX goals |
| techContext.md | How to run/build/deploy | Commands, infra assumptions |
| systemPatterns.md | Architecture decisions | Patterns you do not re-litigate |
| activeContext.md | What we are doing now | Current phase, recent changes, next steps |
| progress.md | What works vs what is left | Checklists, known issues |
When to update memory
The Cline pattern defines four update triggers:
- When you discover a new project pattern worth preserving.
- After implementing significant changes.
- When explicitly asked to update memory.
- When context needs clarification before continuing.
If none of these apply, do not update. Frequent small updates beat occasional large ones.
Checkpoint examples
If you cannot summarize what changed, you did not finish.
Good checkpoint (one line):
- "Added OutputCache to series feeds and exempted from rate limiting; validated with curl."
Bad checkpoint:
- "Did a bunch of caching stuff."
Copy/paste artifact: one-page spec template
Use this as a PR description or as the top of a spec file.
Goal:
Constraints (non-negotiable):
- <stack constraints, invariants, "no new packages", etc>
Files allowed to change:
- <explicit paths, keep this tight>
Acceptance criteria:
- <observable outcomes>
Validation:
- <commands to run, expected outputs>
Notes / decision log:
- <decision: ..., rationale: ...>
Guardrails that keep AI-assisted development grounded
Guardrails are not policies. They are things the repo, the compiler, and your review process can enforce.
Examples from DAP iQ:
- One active spec at a time.
- No repository pattern. AppDbContext is the repository.
- Dark mode only. No theme toggle.
- Validate with real commands, not "looks good".
If you do not constrain the solution space, the assistant will widen it. That creates long diffs and unclear ownership.
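The AppDbContext guardrail, for example, shows up directly in diffs: services depend on the context, not on a repository interface. A minimal sketch, assuming a Post entity with a PublishedUtc property (both names are hypothetical):
using Microsoft.EntityFrameworkCore;

public sealed class InsightsService
{
    private readonly AppDbContext _db;

    public InsightsService(AppDbContext db) => _db = db;

    // Query the context directly; no repository layer to re-litigate.
    public Task<List<Post>> GetRecentPostsAsync(int count) =>
        _db.Posts
            .AsNoTracking()
            .OrderByDescending(p => p.PublishedUtc)
            .Take(count)
            .ToListAsync();
}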
Turn prompts into diffs, not discussions
Ask for outputs that are easy to verify.
For code:
- require a patch
- require a file list
- require "why" in terms of acceptance criteria
For content:
- require consistent headings
- require a nav block
- require ASCII-only output if your pipeline is strict
For DAP iQ, "verify" usually means:
git status -sb
dotnet build
curl -I http://localhost:5000/series/ai-assisted-development
curl -I http://localhost:5000/sitemap.xml
curl -I http://localhost:5000/insights/feed
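Each check should come back 200; a 429 on the feed or sitemap is a regression against the worked spec above. A healthy curl -I response starts like this (remaining headers vary by endpoint):
HTTP/1.1 200 OK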
Surviving context window limits
When your conversation fills the context window, the assistant starts forgetting early context. The memory bank pattern handles this explicitly:
- Request "update memory bank" before the window fills.
- Start a fresh conversation.
- Tell the assistant to read memory bank files first.
The memory bank becomes a handoff document. Everything important survives in files, not chat history.
For teams building deeper integrations, the Model Context Protocol (MCP) provides a standardized way for AI assistants to access external tools and data sources. MCP reduces the "N x M" integration problem to "N + M" by providing a universal interface. Major AI providers (Anthropic, OpenAI, Google) have adopted it. If your workflow needs persistent access to databases, APIs, or file systems across sessions, MCP is the infrastructure layer that enables it.
Common failure modes
- Treating the chat as the spec, then losing it.
- Asking for "best practices" and getting generic advice.
- Allowing scope creep because the assistant can type faster than you can reason.
- Skipping validation because the diff "looks right".
- Letting the context window fill without checkpointing.
Checklist
- Write a spec with acceptance criteria.
- Copy in the constraints that matter (stack, routing, SEO, security).
- Keep the diff small enough to review in one sitting.
- Run dotnet build (or the closest equivalent) before calling it done.
- Update memory so the next AI-assisted development pass starts from facts.
FAQ
What is wrong with vibe coding?
Nothing, if you are prototyping or exploring. Vibe coding - prompting an AI and accepting what it generates - is fast for throwaway work. The problem is production. Vibe-coded features break when requirements change because there is no spec to trace back to. You end up rewriting from scratch. Spec-driven development adds friction upfront but pays off when the codebase lives longer than a week.
Do I need a memory bank, or is a README enough?
If the README stays current, it can work. Most READMEs do not. The memory bank is a forcing function: it is short, explicit, and updated on purpose.
Why "one active spec"?
Because parallel specs create parallel context. The common failure mode: the assistant mixes constraints across specs, and humans forget which constraints applied to which decision.
What if the assistant is "mostly right"?
"Mostly right" is how regressions land. If it matters, require a diff you can review and a command you can run.
What belongs in a spec vs in the memory bank?
Put intent in the spec (goal, constraints, acceptance, validation). Put facts and long-lived decisions in the memory bank (stack, routes, invariants, known tradeoffs).
How do I keep the memory bank from turning into a second wiki?
Cap file sizes and enforce deletion. If something is stale or redundant, remove it.
What if I need to work on multiple things in parallel?
Use a backlog, but keep only one active spec. Parallel work is how context mixes and regressions land.
Should I run validation on every session, even for "docs only"?
Yes. For DAP iQ, publishing Markdown is a real build step.
How strict should the validation ladder be?
As strict as the blast radius. For a content site: build + publish + HTTP checks. For payments: add tests and staged verification.
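When the blast radius justifies tests, the HTTP rungs of the ladder can be automated. A minimal sketch using Microsoft.AspNetCore.Mvc.Testing and xUnit; the endpoint paths come from the worked spec, while the class name and iteration count are illustrative:
using System.Net;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class CachedEndpointTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly WebApplicationFactory<Program> _factory;

    public CachedEndpointTests(WebApplicationFactory<Program> factory) => _factory = factory;

    [Theory]
    [InlineData("/insights/feed")]
    [InlineData("/sitemap.xml")]
    public async Task CachedEndpoint_Stays200_UnderRepeatedRequests(string url)
    {
        var client = _factory.CreateClient();

        // Repeated requests must never trip the rate limiter on exempt routes.
        for (var i = 0; i < 25; i++)
        {
            var response = await client.GetAsync(url);
            Assert.Equal(HttpStatusCode.OK, response.StatusCode);
        }
    }
}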
How does this differ from the original memory bank pattern?
Cline defines the memory bank structure and read/update discipline. This workflow adds:
- A spec system with active/backlog/done states.
- A command vocabulary (INIT, PLAN, IMPLEMENT, CHECKPOINT).
- Explicit validation ladder requirements.
- The "one active spec" constraint to prevent parallel drift.
The Cline pattern is the foundation. This workflow adds production discipline on top.
How does this compare to GitHub Spec Kit?
GitHub Spec Kit is GitHub's open source toolkit for spec-driven development. It uses a similar philosophy: specifications become executable artifacts that drive implementation.
Spec Kit workflow: Specify -> Plan -> Tasks -> Implement. This workflow: Plan -> Implement -> Checkpoint.
Key differences:
- Spec Kit uses a CLI (specify init, /specify, /plan, /tasks) and generates structured directories per feature. This workflow uses a command vocabulary and a single active spec with backlog/done states.
- Spec Kit has a "constitution" file for immutable architectural principles. This workflow uses systemPatterns.md in the memory bank.
- Spec Kit is agent-agnostic (Copilot, Claude Code, Gemini). This workflow is also agent-agnostic but optimized for session-based work.
Both approaches solve the same problem: making AI output predictable by constraining it with structured specifications. Choose based on your team's tooling preferences.
What to do next
Read Part 3: Security Boundaries for AI-Assisted Development in ASP.NET Core. Browse the AI-assisted development series for the full sequence. If you want a workflow you can sustain, start by making AI-assisted development produce verifiable artifacts. If you want to talk about applying this workflow to your system, reach out via Contact.
References
- Cline Memory Bank - The original memory bank pattern this workflow extends.
- GitHub Spec Kit - GitHub's open source toolkit for spec-driven development with AI.
- Spec-Driven Development with AI (GitHub Blog) - Introduction to spec-driven development philosophy.
- Model Context Protocol - Open standard for AI assistant integrations with external tools and data.
- Vibe Coding is Not AI-Assisted Engineering - Addy Osmani on why structured approaches beat unstructured prompting.
- GitHub Copilot Documentation
- Git Documentation (Reference Manual)
- .NET CLI overview
- dotnet build
- dotnet test
- Configuration providers in .NET (environment variables)
- Safe storage of app secrets in development in ASP.NET Core
- The Twelve-Factor App
Author notes
Decisions:
- Keep exactly one active spec at a time. Rationale: prevents parallel drift and conflicting context.
- Treat commands as part of the workflow. Rationale: "run this" is a better contract than "this should work".
- Prefer constraints over clever prompts. Rationale: constraints reduce hallucinated architecture.
- Reference related approaches (Cline, GitHub Spec Kit, MCP). Rationale: readers benefit from knowing the ecosystem; this workflow is one option, not the only option.
Observations:
- Before: context resets caused repeated debates about patterns and constraints.
- After: memory bank + spec discipline reduced rework and made changes reviewable.
- Observed: validation steps became consistent, which kept regressions visible.
- The Cline memory bank pattern provided a solid foundation; the spec system and command vocabulary were the missing pieces for production use.