Circular feedback flow with amber arrows curving between hammer, mirror, scroll, and brain totems converging on a central glowing core, the self-deterministic agent loop

The Self-Deterministic Agent Loop

The canonical overview. How Plan Forge's deterministic slice executor, the Phase-25 reflective layer, and the Phase-26 competitive layer compose into a single self-deterministic agent loop.

New here? Plain-English version. “Self-deterministic” is a mouthful. Here's what it really means: Plan Forge runs the same way every time (same plan + same config = same outcome, no surprises), but it also learns from every run and uses that knowledge to make the next run smarter. The execution stays predictable; the context gets richer.
  • Deterministic part: the slice executor. No random model picking, no hidden retries that change the result. Re-run a plan and you get the same answer.
  • Self-learning part: the “inner loop” (reflection on what worked) and the “competitive loop” (multiple models racing) feed lessons back into the next slice or plan.
  • Safety: every learning signal is opt-in or advisory. Nothing silently changes a run you've already started.
This chapter is the master narrative tying it all together. If you want the focused deep dives, jump to Inner Loop (reflection) or Competitive Loop (racing).
Canonical reference. Start here if you want the whole picture. The companion chapters, The Inner Loop (Phase-25 reflective layer) and The Competitive Loop (Phase-26 worktree race, auto-fix, cost anomalies), drill into the individual subsystems.

What "self-deterministic" means

Plan Forge's slice executor is deterministic: same plan, same config, same model routing, same outcome. On top of that spine, the Phase-25 and Phase-26 subsystems let the loop observe itself and feed what it learns back into the next slice, the next plan, or a sibling project. The execution contract stays deterministic; the loop's context gets progressively better-informed. That combination is what we mean by self-deterministic:

  • Determinism at the execution boundary: no randomized control flow, no hidden model selection.
  • Reflective feedback at the learning boundary: trajectories, postmortems, auto-skills, and advisory signals.
  • Safety by default: every signal is opt-in or advisory; nothing silently changes a deterministic plan run.
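The split between the two boundaries can be made concrete with a minimal sketch. Everything in it is hypothetical: `run_slice`, `Context`, and `learn` are illustrative names, not Plan Forge APIs, and the hash merely stands in for "the outcome". The point is only that the executor behaves like a pure function of its inputs, while learning changes one of those inputs (the context) between runs.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical learned context: lessons carried in from earlier runs."""
    lessons: tuple = ()


def run_slice(plan: str, config: dict, context: Context) -> str:
    """Stand-in for a deterministic executor: a pure function of its inputs.

    No randomness, clock reads, or hidden state, so equal inputs give equal output.
    """
    payload = json.dumps(
        {"plan": plan, "config": config, "lessons": list(context.lessons)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def learn(context: Context, lesson: str) -> Context:
    """Learning returns a new context; it never mutates one already in use."""
    return Context(lessons=context.lessons + (lesson,))


plan, config = "add-login-endpoint", {"model": "fixed-model"}
ctx_run1 = Context()
outcome_a = run_slice(plan, config, ctx_run1)
outcome_b = run_slice(plan, config, ctx_run1)  # re-run with identical inputs
ctx_run2 = learn(ctx_run1, "run the migration before the gate")
outcome_c = run_slice(plan, config, ctx_run2)  # next run, richer context

print(outcome_a == outcome_b)  # True: same inputs, same outcome
print(outcome_a == outcome_c)  # False: the context changed, not the executor
```

Making `Context` immutable is a design choice for the sketch: it mirrors the rule that a run already in progress never sees its inputs change underneath it.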

Diagram A — System-wide state flow

The outer pipeline is the same one Plan Forge has always had. The inner loop adds callback arrows that let later stages feed earlier stages without breaking the forward progression.

System-wide state diagram. runPlan moves through Plan, Preflight (environment validation), Harden (Step-2 hardener), Execute (slice loop), Sweep, Review, then Ship. Execute self-loops on reflexion retry when a gate fails. Sweep loops back to Execute when a completeness gap is found, and forward to Review when artifacts are consistent. Review loops back to Execute on an advisory signal (blocking if opted-in) and forward to Ship on a clean verdict. A long-range arrow runs from Execute back to Harden, indicating that postmortems written this run feed the next plan's hardener. An operator halt drops Review into a Stopped terminal state.
System-wide state flow: the deterministic outer pipeline with callback arrows that let later stages feed earlier stages.

Two things to notice. First, every backward arrow from Execute, Sweep, and Review is opt-in or advisory by default, so the forward pipeline stays honest. Second, the arrow from Execute back to Harden crosses a plan boundary: a postmortem written at the end of this run is read by the hardener at the start of the next one.
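The state flow described above can be written down as a small transition table. This is a sketch, not Plan Forge's implementation: the stage names come from the diagram, but the event labels (`"ok"`, `"gate_fail"`, `"completeness_gap"`, `"advisory_signal"`, `"operator_halt"`) and the `advance` and `replay` helpers are assumptions made for illustration. The Execute-to-Harden arrow is deliberately absent because it crosses a plan boundary and is therefore not a transition within a single run.

```python
from enum import Enum


class Stage(Enum):
    PLAN = "Plan"
    PREFLIGHT = "Preflight"
    HARDEN = "Harden"
    EXECUTE = "Execute"
    SWEEP = "Sweep"
    REVIEW = "Review"
    SHIP = "Ship"
    STOPPED = "Stopped"


# (current stage, event) -> next stage. Event names are hypothetical labels.
TRANSITIONS = {
    (Stage.PLAN, "ok"): Stage.PREFLIGHT,
    (Stage.PREFLIGHT, "ok"): Stage.HARDEN,
    (Stage.HARDEN, "ok"): Stage.EXECUTE,
    (Stage.EXECUTE, "gate_fail"): Stage.EXECUTE,      # reflexion retry
    (Stage.EXECUTE, "ok"): Stage.SWEEP,
    (Stage.SWEEP, "completeness_gap"): Stage.EXECUTE,
    (Stage.SWEEP, "ok"): Stage.REVIEW,
    (Stage.REVIEW, "advisory_signal"): Stage.EXECUTE,
    (Stage.REVIEW, "ok"): Stage.SHIP,
    (Stage.REVIEW, "operator_halt"): Stage.STOPPED,
}


def advance(stage: Stage, event: str) -> Stage:
    """Return the next stage, rejecting any transition not in the table."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(f"no transition from {stage.value} on {event!r}") from None


def replay(events, start: Stage = Stage.PLAN):
    """Replay a sequence of events and return every stage visited, in order."""
    path = [start]
    for event in events:
        path.append(advance(path[-1], event))
    return path


events = ["ok", "ok", "ok", "gate_fail", "ok", "completeness_gap", "ok", "ok", "ok"]
path = replay(events)
print(" -> ".join(stage.value for stage in path))
```

Because the table is plain data and `advance` has no side effects, replaying the same event sequence always visits the same stages, which is the property the outer pipeline is meant to preserve.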

Diagram B — Inner-loop callback graph

Zooming into a single slice, here is what happens at the slice boundary and how each Phase-25 and Phase-26 subsystem feeds something downstream: the next slice, the next plan's hardener, or a Dashboard promotion surface.

Inner-loop callback graph. The slice-execution subgraph runs BuildPrompt → AutoSkill lookup (L2) → WorkerInvoke → GateRun, with a fail-edge to a Reflexion block (L7) that loops back to WorkerInvoke and a pass-edge to Trajectory write (L8). Trajectory fans out into Postmortem (L5), AutoSkill capture (L2), GateSuggestion accrual (L6), and Cost-anomaly check (C3). GateRun also emits an advisory Reviewer call (L4) and a gate-fail-with-small-diff Auto-fix proposal (C2). BuildPrompt reads from Federation (L4-lite). Postmortems feed the next plan's hardener; auto-skill capture and federation reads feed the next slice; GateSuggestion, Reviewer, Cost-anomaly, and Auto-fix all surface on the Dashboard. A separate competitive-layer subgraph (C1) spawns Strategy A and Strategy B worktrees, runs winner election, promotes the winner to the working tree, and that winner feeds Trajectory write.
Inner-loop callback graph: slice-boundary signals (L2, L4, L5, L6, L8, C1, C2, C3) feeding the next slice, the next hardener, and Dashboard surfaces.

The Phase-25 subsystems are labeled L1–L8 in the capabilities surface (forge_capabilities → innerLoop); the Phase-26 subsystems (C1 competitive, C2 auto-fix, C3 cost-anomaly) extend the same surface. Every node in the diagram corresponds to one entry in INNER_LOOP_SURFACE.subsystems.
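The slice-execution path in the graph (BuildPrompt, auto-skill lookup, WorkerInvoke, GateRun, reflexion on failure, trajectory write on pass) can be sketched as a short loop. All names here are hypothetical, including `execute_slice`, `build_prompt`, and the `max_attempts` cap; the source does not state a retry limit, and the worker and gate below are deterministic fakes so the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str], str]
Gate = Callable[[str], Tuple[bool, str]]


def build_prompt(task: str, skills: List[str], reflections: List[str]) -> str:
    """Assemble the prompt from the task, looked-up skills, and prior reflections."""
    lines = [f"task: {task}"]
    lines += [f"skill: {s}" for s in skills]
    lines += [f"reflection: {r}" for r in reflections]
    return "\n".join(lines)


def execute_slice(task: str, worker: Worker, gate: Gate,
                  skills: List[str], max_attempts: int = 3) -> Dict:
    """Run one slice: retry with a reflection on gate failure, stop on pass."""
    reflections: List[str] = []
    attempts: List[Dict] = []
    for attempt in range(1, max_attempts + 1):
        prompt = build_prompt(task, skills, reflections)
        output = worker(prompt)
        passed, message = gate(output)
        attempts.append({"attempt": attempt, "passed": passed, "gate": message})
        if passed:
            # Pass-edge: the attempt history becomes the trajectory record.
            return {"status": "pass", "output": output, "trajectory": attempts}
        # Fail-edge: feed what went wrong into the next attempt's prompt.
        reflections.append(f"attempt {attempt} failed gate: {message}")
    return {"status": "fail", "output": None, "trajectory": None}


def fake_worker(prompt: str) -> str:
    """Deterministic stand-in: only succeeds once it has seen a reflection."""
    return "patched" if "reflection:" in prompt else "broken"


def fake_gate(output: str) -> Tuple[bool, str]:
    """Deterministic stand-in for a gate command."""
    ok = output == "patched"
    return ok, "ok" if ok else "tests failed"


result = execute_slice("add login endpoint", fake_worker, fake_gate,
                       skills=["reuse the existing auth helper"])
print(result["status"], len(result["trajectory"]))  # pass 2
```

The retry changes only the prompt, never the control flow, which is how a reflexion loop can coexist with a deterministic executor in this sketch.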

Subsystem roll-call

Every subsystem, the stage at which it fires, and where its output shows up. See the companion chapters for mechanics and configuration.

Subsystem | Fires at | Output lands in | Default posture
--- | --- | --- | ---
Reflexion (L7) | Gate fail → retry | Next attempt's prompt | Always on
Trajectory (L8) | Slice pass | .forge/trajectories/ | Always on
Auto-skill library (L2) | Slice pass → next slice | .forge/auto-skills/ | Always on
Adaptive gate synthesis (L6) | Pre-flight | Stdout + Dashboard promotion surface | Suggest (never mutates plans)
Postmortem (L5) | Run end | .forge/plans/<basename>/postmortem-*.json | Always on (retention 10)
Federation (L4-lite) | Brain miss → cross-repo read | In-memory recall | Off (opt-in, absolute local paths)
Reviewer (L4) | Gate-check | Gate-check response, Dashboard | Off, advisory-only
Competitive (C1) | Slice start (marked competitive) | Winner's worktree → tree | Off (opt-in)
Auto-fix (C2) | Gate fail + small diff | .forge/proposed-fixes/ | Advisory (never auto-apply)
Cost-anomaly (C3) | Every slice | .forge/cost-anomalies.jsonl, Dashboard | Advisory (detection only)
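For readers who prefer to query the roll-call rather than scan it, the same rows can be held as plain data. This is an illustrative sketch only: the `Subsystem` record and `by_posture` helper are hypothetical, and collapsing each default posture into one of four keywords (`"on"`, `"off"`, `"advisory"`, `"suggest"`) is a simplification made here, not the shape of INNER_LOOP_SURFACE.subsystems.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subsystem:
    label: str     # e.g. "L7" or "C1", as used in the roll-call
    name: str
    fires_at: str
    posture: str   # simplified keyword: "on", "off", "advisory", or "suggest"


# Rows transcribed from the roll-call; posture keywords are a simplification.
SUBSYSTEMS = (
    Subsystem("L7", "Reflexion", "Gate fail -> retry", "on"),
    Subsystem("L8", "Trajectory", "Slice pass", "on"),
    Subsystem("L2", "Auto-skill library", "Slice pass -> next slice", "on"),
    Subsystem("L6", "Adaptive gate synthesis", "Pre-flight", "suggest"),
    Subsystem("L5", "Postmortem", "Run end", "on"),
    Subsystem("L4-lite", "Federation", "Brain miss -> cross-repo read", "off"),
    Subsystem("L4", "Reviewer", "Gate-check", "off"),
    Subsystem("C1", "Competitive", "Slice start (marked competitive)", "off"),
    Subsystem("C2", "Auto-fix", "Gate fail + small diff", "advisory"),
    Subsystem("C3", "Cost-anomaly", "Every slice", "advisory"),
)


def by_posture(posture: str) -> List[str]:
    """Return the labels of all subsystems with the given default posture."""
    return [s.label for s in SUBSYSTEMS if s.posture == posture]


for posture in ("on", "off", "advisory", "suggest"):
    print(f"{posture:9s} {', '.join(by_posture(posture))}")
```

Grouped this way, the pattern in the table is easy to see: the always-on subsystems only record or recall information, while anything that could alter a plan or the working tree is off, advisory, or suggest-only until an operator opts in.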

Why this matters

The individual subsystems are useful on their own. The mesh is what turns a slice runner into a self-deterministic loop: a trajectory written today becomes part of tomorrow's planning context; a cost anomaly noticed this run becomes the reason next run's hardener picks a cheaper model for that slice; a gate command accepted three times graduates into the validation template for that domain. None of this changes the deterministic execution contract; it only changes the information the deterministic executor runs with.
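The "accepted three times" example can be sketched as a small ledger. Only the threshold of three comes from the text; the `GateSuggestionLedger` class, its method names, and the idea of keying acceptances by a (domain, command) pair are assumptions made for illustration, not a description of how Plan Forge stores gate suggestions.

```python
from collections import Counter
from typing import Dict, List

PROMOTION_THRESHOLD = 3  # "accepted three times"; the exact rule is an assumption


class GateSuggestionLedger:
    """Hypothetical ledger that promotes a gate command after repeated acceptance."""

    def __init__(self, threshold: int = PROMOTION_THRESHOLD) -> None:
        self.threshold = threshold
        self._accepted: Counter = Counter()
        self.templates: Dict[str, List[str]] = {}  # domain -> promoted commands

    def accept(self, domain: str, command: str) -> bool:
        """Record one operator acceptance; return True if this one promotes it."""
        key = (domain, command)
        self._accepted[key] += 1
        if self._accepted[key] == self.threshold:
            self.templates.setdefault(domain, []).append(command)
            return True
        return False


ledger = GateSuggestionLedger()
results = [ledger.accept("api", "npm test") for _ in range(4)]
print(results)           # [False, False, True, False]
print(ledger.templates)  # {'api': ['npm test']}
```

Promotion fires exactly once, on the acceptance that reaches the threshold, and every step requires an explicit operator action, which matches the suggest-only posture of adaptive gate synthesis.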

Companion chapters. The Inner Loop covers L1–L8 (Phase-25) mechanics and configuration. The Competitive Loop covers C1–C3 (Phase-26). Dashboard → Inner Loop tab shows live state for all ten subsystems.