Chapter 1

What Is Plan Forge?

This chapter answers one question in five minutes: should I keep reading?

The Problem in One Sentence

AI coding agents are powerful but directionless.

They generate code fast. But fast isn't the same as good. Without structure, AI-generated code tends to be untestable, insecure, architecturally inconsistent, and impossible to maintain at scale. That's fine for prototypes — it's not fine for production systems.

The 80/20 Wall

You've probably lived this pattern:

You fire up an AI agent — Copilot, Cursor, Claude, whatever — and describe the app you want. The first 80% is magic. Files appear, components wire up, the database schema materializes. You're shipping faster than you ever thought possible.

Then complexity creeps in. Auth flows interact with database queries. Middleware chains get long. The agent still works, but you notice it's making assumptions without asking: it picked a caching strategy you wouldn't have chosen, and it refactored code from three sessions ago that was working fine.

Then the wall. Every change breaks something else. Fix the auth bug, break the dashboard. Fix the dashboard, break the API response format. The agent is confidently producing code that compiles but doesn't work. You're debugging AI-generated code you don't fully understand, in an architecture you didn't fully choose.

The pattern everyone hits: prompt → hope → fix → re-prompt → hope harder.

This isn't a model problem. It's a planning problem. When agents work from loose intent rather than hardened specs, they do fine on greenfield builds but start thrashing once the codebase gets complex enough that every change has downstream consequences.

The fix is spec-driven development: specify what you want before the agent writes a line of code, lock the scope so it can't drift, checkpoint at every boundary, and audit with a fresh session that never saw the original work.
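As a rough illustration of what "lock the scope" can mean in practice, the sketch below checks a list of changed files against allowed and forbidden path patterns. The `ScopeContract` class, the `find_drift` function, and the example paths are hypothetical names invented for this illustration; they are not Plan Forge's actual file format or API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopeContract:
    """Hypothetical locked scope: glob patterns the agent may or may not touch."""
    allowed: tuple[str, ...]
    forbidden: tuple[str, ...] = ()


def find_drift(contract: ScopeContract, changed_files: list[str]) -> list[str]:
    """Return the changed files that fall outside the locked scope."""
    drift = []
    for path in changed_files:
        in_scope = any(fnmatchcase(path, pat) for pat in contract.allowed)
        blocked = any(fnmatchcase(path, pat) for pat in contract.forbidden)
        if blocked or not in_scope:
            drift.append(path)
    return drift


# Assumed example: the plan only covers the auth module and its tests.
contract = ScopeContract(
    allowed=("src/auth/*", "tests/auth/*"),
    forbidden=("src/auth/legacy_*",),
)
changed = [
    "src/auth/login.py",          # in scope
    "src/dashboard/view.py",      # not in any allowed pattern
    "src/auth/legacy_tokens.py",  # allowed directory, but explicitly forbidden
]
print(find_drift(contract, changed))
# prints ['src/dashboard/view.py', 'src/auth/legacy_tokens.py']
```

A check like this could run at each checkpoint, so an out-of-scope edit is flagged before the next piece of work begins rather than discovered during debugging.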

Vibe coding gets you a prototype. Plan Forge gets you a product.

What Happens Without Guardrails

Vibe Coding
  • ✗ Prompt → hope → fix → re-prompt
  • ✗ Agent picks architecture mid-stream
  • ✗ Every session starts from zero
  • ✗ Agent reviews its own work
  • ✗ "It compiles" = "it's done"

Spec-Driven (Plan Forge)
  • ✓ Specify → harden → execute → verify
  • ✓ Architecture locked before coding starts
  • ✓ Memory carries decisions across sessions
  • ✓ Fresh session audits independently
  • ✓ Build + test pass at every boundary

If you've managed human dev teams, you know guardrails aren't about distrust; they're about consistency. The same principle applies when your team members are AI models.

Without guardrails, AI coding agents:

  • Silently expand scope: "I'll also add..." (you didn't ask for that)
  • Make undiscussed decisions: they pick a database pattern without telling you
  • Skip validation: they ship code that doesn't build or pass tests
  • Lose context: they forget requirements halfway through long sessions
  • Never self-audit: the executor grades its own exam

These problems get worse the less technical your team is — you may not even notice the drift until it's too late.

| Without guardrails | With Plan Forge |
| --- | --- |
| Agent writes code that passes once, breaks in production | Code follows your architecture from the first line |
| 30–50% of AI-generated code needs rework after review | Independent review catches drift before merge |
| Agent re-discovers solved problems every session | Persistent memory loads prior decisions in seconds |
| Context window wasted on exploration and backtracking | Hardened plan tells the agent exactly what to build |
| "It works on my machine" shipped to staging | Validation gates pass at every slice boundary |

What Plan Forge Does

Plan Forge is a specification-driven framework that converts your rough ideas into hardened execution contracts — structured plans that AI coding agents follow without scope creep, skipped tests, or silent rewrites. It installs guardrail files into your project that automatically load when AI agents edit code, enforcing your architecture, security, testing, and coding standards at every step.

The Blacksmith Analogy

A blacksmith doesn't hand raw iron to a customer. They heat it, hammer it, and temper it until it holds its edge.

Plan Forge does the same for your development plans:

| Forge Stage | What Happens | Pipeline Step |
| --- | --- | --- |
| 🔥 Heat: raw iron | You describe what you want | Step 0: Specify |
| 🔨 Hammer: shape it | The plan gets structured into slices with validation gates | Step 2: Harden |
| 💧 Quench: cool it | AI builds it slice by slice, checked at every boundary | Step 3: Execute |
| 🔍 Inspect: check the edge | A fresh session audits for drift, completeness, and quality | Step 5: Review |
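To make "slice by slice, checked at every boundary" concrete, here is a minimal sketch of gated execution: each slice does its work, then a validation gate decides whether the next slice may start. The `Slice` class, the `execute_plan` function, and the toy `state` dictionary are assumptions made up for this example; Plan Forge's real plan format and gate commands are not shown here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Slice:
    """Hypothetical unit of work with a validation gate at its boundary."""
    name: str
    build: Callable[[], None]  # the work performed for this slice
    gate: Callable[[], bool]   # stands in for "build and tests pass"


def execute_plan(slices: list[Slice]) -> list[str]:
    """Run slices in order and stop at the first gate that fails."""
    completed = []
    for s in slices:
        s.build()
        if not s.gate():
            break  # do not start later slices on top of a failed one
        completed.append(s.name)
    return completed


# Toy state standing in for a real codebase.
state = {"schema": False, "api": False}

plan = [
    Slice("schema", lambda: state.update(schema=True), lambda: state["schema"]),
    Slice("api", lambda: None, lambda: state["api"]),  # work is missing, so the gate fails
    Slice("ui", lambda: None, lambda: True),           # never reached
]
print(execute_plan(plan))
# prints ['schema']
```

The design point is the early stop: a failed gate halts the run at the slice that caused it, so later work never builds on a broken foundation.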

Who This Is For

Solo Developers

You're using Copilot or Claude to build features, but you've noticed the AI drifts when sessions get long. You spend time re-explaining your patterns. Plan Forge gives you a repeatable pipeline that remembers your standards, validates at every step, and catches the mistakes you'd normally catch in code review — except there's no reviewer. You are the team.

Development Teams

Your team uses AI tools but everyone gets different quality results. Junior devs get code that works but violates your architecture. Senior devs spend review cycles catching AI-generated antipatterns. Plan Forge makes the architecture the default — instruction files load automatically, validation gates enforce build+test, and the reviewer-gate agent catches drift before anyone opens a PR.

Enterprise & Regulated Environments

You need audit trails, consistent architecture, and code that meets compliance standards. Plan Forge gives you phase-level tracking (DEPLOYMENT-ROADMAP.md), per-slice cost accounting, OTLP telemetry, and 19 independent reviewer agents — including compliance, security, and multi-tenancy auditors that run automatically. Every execution has a trace.

What This Is Not

  • Not a code generator. Plan Forge doesn't write your code — it tells the AI how to write it, then verifies the result.
  • Not a CI/CD system. It doesn't deploy your app. It validates that what's built matches what was planned. Your CI pipeline is a separate concern.
  • Not a project manager. It doesn't assign tasks to humans or track sprints. It structures work for AI agents — slices, gates, scope contracts.
  • Not an AI model. Plan Forge works with whatever AI you already use — Copilot, Claude, Cursor, Codex, Gemini, Windsurf, or any tool that accepts text prompts.
  • Not opinionated about your stack. Nine presets cover .NET, TypeScript, Python, Java, Go, Swift, Rust, PHP, and Azure IaC. Each installs stack-appropriate guardrails.

How to Read This Manual

This manual is structured in three acts:

  • Act I — Learn (Chapters 1–3): What Plan Forge is, how it works, how to install it. Start here if you're new.
  • Act II — Build (Chapters 4–6): Hands-on. Build a feature with the pipeline, learn to write plans, tour the dashboard. Start here if you've already installed and want to use it.
  • Act III — Master (Chapters 7–14): Reference and advanced topics. CLI commands, customization, MCP tools, multi-agent setup, troubleshooting. Start here if you're looking up a specific feature.

Already installed and want to build something? Skip to Chapter 4: Your First Plan.

📄 Full reference: README.md