Blog

From the Forge

Lessons, patterns, and A/B test results from building with AI coding agents.

Case Study NEW April 23, 2026

The Loop That Never Ends: How TheProject Auto-Smelts Its Own Website Bugs

Most pipelines end. This one doesn't. A 4-pass discovery harness on a production Next.js site feeds the Crucible, tempering catches regressions, and the bug registry auto-smelts failures back into Discovery. The loop only pauses when there's nothing left to find.

Read the case study →
Problem April 7, 2026

The 80/20 Wall: Why AI Agents Break What They Build

AI coding agents get you to 80% fast, then start breaking things trying to close the last 20%. You've spent $1,500 in tokens to watch an agent destroy its own work. The fix isn't a better model. It's a better methodology.

Read article →
Origin Story April 7, 2026

Spec Kit + Plan Forge: Write the Spec, Enforce the Build

I started with Spec Kit and loved it. Specification-based planning and agentic coding go together like peanut butter and jelly. As my ideas grew, Plan Forge was born. Here's how they work together, plus an honest comparison of when to use which.

Read article →
Lessons April 7, 2026

I Built Guardrails for AI Coding Agents: Here's What I Learned

After auto-loading guardrails, 21 reviewer agents, and 9 tech stack presets, here are the seven hard-won lessons from building Plan Forge, and the mistakes that taught them.

Read article →
A/B Test April 7, 2026

Quorum Mode: What Happens When 3 AI Models Review Each Other's Code

We built a feature twice: once with a single model, once with three models in parallel consensus. Both passed all gates. But the code quality difference was measurable. Here are the A/B results.

Read article →
Architecture April 7, 2026

From WhatsApp to Shipped PR: How I Automated My Entire Dev Workflow

Three open-source tools wired into a closed-loop system. Describe a feature from your phone. The system hardens the plan, builds it, captures every decision, sends progress updates, reviews independently, and ships, all while you're at dinner.

Read article →
Multi-Agent April 7, 2026

One Framework, Seven AI Agents: Why We Stopped Picking Favorites

Your team uses Copilot, Claude, Cursor, and Gemini. Plan Forge generates native guardrail files for all 7 agents: same quality gates, same pipeline, same 21 reviewer agents. One setup command.

Read article →
A/B Test NEW April 11, 2026

The A/B Test: 99 vs 44. Same App, Same Model, Same Time

We built the same .NET app twice: once with Plan Forge guardrails, once with pure vibe coding. Same model (Claude Opus 4.6). Same time (~7 min). The quality score: 99/100 vs 44/100. 60 tests vs 13. 6 interfaces vs 0. The data speaks for itself.

Read the results →
Origin Story NEW April 11, 2026

From Impossible to 7 Minutes: A Year of Building AI Coding Guardrails

A year ago, getting enterprise-grade code from an AI agent was nearly impossible. Today we build a 99/100 app in 7 minutes. This is the story of that journey: the failures, the breakthroughs, and what I learned along the way.

Read the full story →