
Chapter 11 · Full Reference

MCP Server — Full Reference

Complete tool tables for all 102 MCP tools across 8 categories, REST API endpoints, WebSocket hub events, OTLP telemetry, cost tracking, SDK, and API key configuration.

Just getting started? See MCP Server — Quick Start for the essential tools and a typical workflow. Return here when you need the full catalog or REST API details.

MCP Tools (102, in 8 Categories)

Every tool is callable from Copilot Chat, Claude Code, Cursor, or any MCP-compatible client. Tools are grouped by station / subsystem. The four "station" categories (Crucible, LiveGuard, Tempering, Bug Registry / Testbed) map directly to the four shop stations; the rest are cross-cutting infrastructure.

Discovery first: Call forge_capabilities before anything else; it returns the full live API surface, including tool schemas, config options, available extensions, and per-tool error codes. Its output is always authoritative.

Core — Execution, Diagnosis, Skills, Cost, Memory (37 tools)

Everything that powers the Smelt and Forge stations plus the cross-cutting surfaces (skills, memory, cost, search, review queue, notifications, image generation, meta-bug filing).

Tool	Description
Diagnostics & setup
`forge_smith`	Diagnose environment, VS Code config, setup health, version currency. The "shop inspector."
`forge_validate`	Validate setup files: check that counts match the preset and that no placeholders remain
`forge_sweep`	Scan for TODO/FIXME/HACK/stub/placeholder markers
`forge_capabilities`	Machine-readable API surface: tools, intents, config, extensions, error codes
`forge_status`	Show phases from `DEPLOYMENT-ROADMAP.md` with status
Plan execution (Forge station)
`forge_run_plan`	Execute a hardened plan: spawn workers, validate gates, track tokens. Supports `--quorum=auto|power|speed|false`
`forge_abort`	Abort the currently running plan execution
`forge_plan_status`	Latest execution status, per-slice results, tokens, duration
`forge_diff`	Compare changes against the plan's Scope Contract to detect drift
`forge_new_phase`	Create a new phase plan file + roadmap entry
Analysis & estimation
`forge_analyze`	Cross-artifact consistency scoring (0–100, 4 dimensions)
`forge_diagnose`	Multi-model bug investigation: root cause + fix recommendations
`forge_estimate_quorum`	Projected cost of a plan under all four quorum modes (auto/power/speed/false). Always call this before showing cost estimates; never hand-compute them.
`forge_estimate_slice`	Per-slice cost estimate with confidence (heuristic vs historical)
`forge_doctor_quorum`	Diagnose quorum-mode availability and routing issues
`forge_graph_query`	Query the Plan Forge knowledge graph (built post-Slice via `postSlice` hook)
`forge_search`	Cross-artifact search across plans, runs, bugs, memory
Cost & performance
`forge_cost_report`	Cost tracking: total spend, per-model breakdown, monthly trend. Authoritative source for actual spend.
`forge_timeline`	Unified chronological view of runs, incidents, bugs, deploys, fm-turns, crucible events. 9 sources.
`forge_home_snapshot`	Snapshot of the "home" dashboard tile state: the aggregate health surface
Skills & review
`forge_run_skill`	Execute a skill programmatically with step-level tracking
`forge_skill_status`	Recent skill execution events from the hub
`forge_review_add`	Queue a review item (used by Step 5 reviewer agents)
`forge_review_list`	List open / resolved review items
`forge_review_resolve`	Resolve a review item with verdict + notes
`forge_patterns_list`	List captured architectural patterns for a project
Memory (Learn station bridge)
`forge_memory_capture`	Normalise and broadcast a `memory-captured` hub event for OpenBrain
`forge_memory_report`	Aggregate report of recent captures, patterns, decisions
Notifications & bridge
`forge_notify_send`	Send a notification via the configured Remote Bridge (Slack / Teams / PagerDuty / OpenClaw / Telegram / Discord)
`forge_notify_test`	Test the Remote Bridge configuration end-to-end
`forge_delegate_to_agent`	Hand a sub-task to a specific reviewer agent in multi-agent mode
Extensions & meta
`forge_ext_search`	Search the community extension catalog
`forge_ext_info`	Detailed info about a specific extension
`forge_org_rules`	Export org custom instructions: consolidates instruction files for GitHub org-level Copilot config
`forge_meta_bug_file`	File a self-repair bug against Plan Forge itself (plan-defect / orchestrator-defect / prompt-defect)
`forge_triage_route`	Route a finding to the appropriate lane (bug / spec / classifier); powers the audit-loop drain
`forge_generate_image`	Generate images via Grok Aurora or DALL-E, save with format conversion

LiveGuard — Post-Ship Defense (14 tools)

The Guard station. Detect drift, capture incidents, watch dependencies, scan for secrets, propose fixes, all running against shipped code. Chapter 17 — LiveGuard Tools Reference covers each one in depth (flags, thresholds, output shapes, severity matrix). Listed here for completeness.

Tool	Description
`forge_liveguard_run`	Composite scan: drift + sweep + secrets + regression + deps + alerts + health. The "everything" command.
`forge_drift_report`	Score codebase against architecture guardrail rules; track drift over time
`forge_secret_scan`	High-entropy secret detection; values are always redacted
`forge_dep_watch`	Scan dependencies for CVEs; alert on new vulnerabilities
`forge_regression_guard`	Extract validation gates from plans, execute against codebase
`forge_incident_capture`	Record incidents with severity, affected files, MTTR tracking
`forge_alert_triage`	Read incidents and drift violations, rank by priority
`forge_env_diff`	Environment variable key divergence across `.env` files
`forge_fix_proposal`	Generate scoped 1–2 slice fix plan from a regression / drift / incident finding
`forge_health_trend`	Aggregate drift, cost, incidents, model performance into health score 0–100
`forge_hotspot`	Identify git-churn hotspots: the files that change most frequently
`forge_runbook`	Generate an operational runbook from a hardened plan file
`forge_deploy_journal`	Record deployments with version, deployer, notes
`forge_quorum_analyze`	Assemble a structured quorum prompt from LiveGuard data; makes no LLM calls

Watcher — Cross-Project Read-Only Tail (2 tools)

Read-only observation of another project's forge run from a second VS Code session. See Chapter 19 — The Watcher.

Tool	Description
`forge_watch`	Snapshot or analyze (claude-opus-4.7) mode. Returns counts, anomalies, recommendations, diff cursor.
`forge_watch_live`	Live tail: streams events for a fixed duration via the target's WebSocket hub or events.log polling.

Crucible — Idea Smelting (8 tools)

The Smelt station. Interview-driven plan intake with a critical-fields gate that refuses to finalize until build-command, test-command, scope, gates, and forbidden-actions are all satisfied. Includes a deterministic Spec Kit importer. See Chapter 5 — Crucible.
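The critical-fields gate described above can be pictured as a simple completeness check. The sketch below is illustrative only: the five field names come from the paragraph above, but the flat draft object and the `missingCriticalFields` helper are hypothetical, not Crucible's actual data model.

```javascript
// The five fields that must be satisfied before a smelt can be finalized.
const CRITICAL_FIELDS = ['build-command', 'test-command', 'scope', 'gates', 'forbidden-actions'];

// Hypothetical draft shape: a flat object keyed by field name.
// A field counts as unresolved when it is absent, blank, or an empty array.
function missingCriticalFields(draft) {
  return CRITICAL_FIELDS.filter((field) => {
    const value = draft[field];
    if (value === undefined || value === null) return true;
    if (typeof value === 'string') return value.trim() === '';
    if (Array.isArray(value)) return value.length === 0;
    return false;
  });
}

const draft = {
  'build-command': 'npm run build',
  'test-command': '',
  scope: ['src/api/**'],
  gates: [],
};

console.log(missingCriticalFields(draft));
// → [ 'test-command', 'gates', 'forbidden-actions' ]
```

An empty result would correspond to a draft that is ready to finalize; anything else maps to the fields a preview would flag.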

Tool	Description
`forge_crucible_submit`	Submit a raw idea or feature request to start an interview
`forge_crucible_ask`	Answer the next interview question. Accepts an optional `questionId`; out-of-sync clients are refused with `ASK_QUESTION_MISMATCH`.
`forge_crucible_preview`	Preview the draft plan + flag any unresolved CRITICAL_FIELDS
`forge_crucible_finalize`	Finalize into `docs/plans/Phase-NN.md`. Refuses with `PLAN_ALREADY_EXISTS` if the plan already exists; pass `overwrite: true` to bypass. Refuses with `CRITICAL_FIELDS_MISSING` when any CRITICAL_FIELDS are unresolved.
`forge_crucible_list`	List all in-flight and finalized smelts
`forge_crucible_abandon`	Abandon an in-flight smelt
`forge_crucible_import`	Deterministic Spec Kit importer. Maps a Spec Kit checkout (`spec.md` + `plan.md` + `tasks.md` + optional `constitution.md`) into a Plan Forge smelt under `.forge/crucible/`. No LLM calls. Supports `--dry-run` and `--json`.
`forge_crucible_status`	Inspect imported smelts. Lists all smelts when called without an id, or returns the full smelt record (metadata + draft plan) when given a smelt id.
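A client driving the Crucible tools will likely want to branch on the refusal codes listed in the table. The following is a minimal sketch, assuming the code arrives as a plain string; the `nextStepForCrucibleError` helper and the suggested follow-up actions are illustrative assumptions, not documented behavior.

```javascript
// Error codes are quoted from the table above; the follow-up advice is an assumption.
function nextStepForCrucibleError(code) {
  switch (code) {
    case 'ASK_QUESTION_MISMATCH':
      // The client answered a question the server was not waiting on.
      return 'Re-sync with the current question, then call forge_crucible_ask again.';
    case 'PLAN_ALREADY_EXISTS':
      return 'Confirm with the user, then retry forge_crucible_finalize with overwrite: true.';
    case 'CRITICAL_FIELDS_MISSING':
      return 'Call forge_crucible_preview to see unresolved fields, then keep answering.';
    default:
      return `Unhandled code ${code}: check forge_capabilities for per-tool error codes.`;
  }
}

console.log(nextStepForCrucibleError('PLAN_ALREADY_EXISTS'));
// → Confirm with the user, then retry forge_crucible_finalize with overwrite: true.
```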

Tempering — Quality Drains & Audit Loop (5 tools)

Closed-loop self-tempering: scan, triage, fix, and repeat until convergence. The audit-loop drain is opt-in via .forge.json → audit.mode = "off" | "auto" | "always". See Audit Loop Deep Dive.

Tool	Description
`forge_tempering_scan`	Run a single tempering scanner (mutation, content-audit, etc.)
`forge_tempering_run`	Run the full standard scanner sequence (10 scanners)
`forge_tempering_drain`	Iterate scan → triage → fix until convergence or `maxRounds`
`forge_tempering_status`	Latest tempering run status, scanners, findings
`forge_tempering_approve_baseline`	Approve current findings as the new baseline for visual-diff scanners
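The drain's scan → triage → fix cycle can be sketched as a bounded loop. This is a simplified illustration, assuming that "convergence" means a scan returning zero findings; the `drain` function and its injected `scan`, `triage`, and `fix` callbacks are hypothetical stand-ins for the real tools.

```javascript
// scan, triage, and fix are injected callbacks standing in for the real tools.
// Assumption: the loop has converged when a scan reports no findings.
function drain({ scan, triage, fix, maxRounds }) {
  let rounds = 0;
  let findings = scan();
  while (findings.length > 0 && rounds < maxRounds) {
    fix(triage(findings)); // one round: triage what was found, then fix that batch
    rounds += 1;
    findings = scan();
  }
  return { converged: findings.length === 0, rounds, remaining: findings.length };
}

// Toy in-memory findings; each round fixes the two most severe.
let openFindings = [
  { id: 'a', severity: 2 },
  { id: 'b', severity: 5 },
  { id: 'c', severity: 1 },
];

const result = drain({
  scan: () => openFindings.slice(),
  triage: (found) => found.slice().sort((x, y) => y.severity - x.severity).slice(0, 2),
  fix: (batch) => {
    openFindings = openFindings.filter((f) => !batch.some((b) => b.id === f.id));
  },
  maxRounds: 5,
});

console.log(result);
// → { converged: true, rounds: 2, remaining: 0 }
```

Capping the loop with `maxRounds` keeps a scanner that never goes quiet from running indefinitely.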

Bug Registry — Closed-Loop Bug Lifecycle (4 tools)

The Learn station. Fingerprint-deduped bug registry: register, fix, validate, remember. See Chapter 23 — The Bug Registry.

Tool	Description
`forge_bug_register`	Register a new bug with title, severity, fingerprint inputs, file paths
`forge_bug_list`	List bugs by status, severity, or fingerprint match
`forge_bug_update_status`	Update status (open / in-progress / fixed / verified / closed). Accepts both `newStatus` and `status`.
`forge_bug_validate_fix`	Run the bug's validation gate against the current codebase to confirm a fix landed
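Because `forge_bug_update_status` accepts both `newStatus` and `status`, a caller-side helper can normalize the two before sending. The sketch below is an assumption-laden illustration: the status list is taken from the table, but the `resolveBugStatus` helper and the rule that `newStatus` wins when both are present are hypothetical.

```javascript
// Status values as listed in the table above.
const BUG_STATUSES = ['open', 'in-progress', 'fixed', 'verified', 'closed'];

// Assumption: when both keys are present, newStatus takes precedence.
function resolveBugStatus(args) {
  const requested = args.newStatus ?? args.status;
  if (!BUG_STATUSES.includes(requested)) {
    throw new Error(`Unknown bug status: ${String(requested)}`);
  }
  return requested;
}

console.log(resolveBugStatus({ status: 'fixed' }));       // → fixed
console.log(resolveBugStatus({ newStatus: 'verified' })); // → verified
```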

Testbed — Scenario Replay (3 tools)

Replay scenarios against a dedicated fixture repo (typically plan-forge-testbed/) to prove fixes don't regress. See Chapter 24 — The Testbed.

Tool	Description
`forge_testbed_run`	Execute a scenario against the testbed fixture
`forge_testbed_happypath`	Run the happy-path scenario set as a smoke test
`forge_testbed_findings`	Aggregate findings from the latest testbed run

Forge-Master — Read-Only Reasoning Orchestrator (1 MCP tool + REST surface)

Intent classifier with embedding cache and quorum advisory mode. Classifies open-ended prompts, fetches OpenBrain memory, and chains read-only forge tools on your behalf. The bulk of the Forge-Master surface is exposed via /api/forge-master/* REST routes (see below) plus the dashboard's Studio tab; only the one-shot reasoning entry-point is an MCP tool.

Tool	Description
`forge_master_ask`	One-shot reasoning entry point. Accepts a free-form message; returns lane classification, tool-call trace, and synthesized reply. Use for open-ended questions instead of chaining tools yourself.

Forge-Master chapter: The Forge-Master chapter covers the three-stage intent classifier (keyword → embedding cache → router LLM), quorum advisory mode for high-stakes decisions, and the /api/forge-master/cache-stats liveness endpoint.
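The keyword → embedding cache → router LLM ordering is a fall-through chain: each stage either decides or defers to the next. The sketch below shows only that control flow. The lane names, the exact-match `Map` standing in for an embedding cache, and the synchronous stub standing in for a model call are all hypothetical simplifications.

```javascript
// Each stage returns a lane string, or null to defer to the next stage.
function classifyIntent(prompt, stages) {
  for (const { name, run } of stages) {
    const lane = run(prompt);
    if (lane !== null) return { lane, decidedBy: name };
  }
  return { lane: 'unknown', decidedBy: 'none' };
}

// Stand-in for an embedding cache: exact match on the lowercased prompt.
const cache = new Map([['why did slice 3 fail', 'diagnose']]);

const stages = [
  { name: 'keyword', run: (p) => (/\bcost\b/i.test(p) ? 'cost' : null) },
  { name: 'embedding-cache', run: (p) => cache.get(p.toLowerCase()) ?? null },
  { name: 'router-llm', run: () => 'general' }, // stand-in for a model call
];

console.log(classifyIntent('What did last month cost?', stages));
// → { lane: 'cost', decidedBy: 'keyword' }
console.log(classifyIntent('Why did slice 3 fail', stages));
// → { lane: 'diagnose', decidedBy: 'embedding-cache' }
```

Ordering the stages from cheapest to most expensive means the model is consulted only when the earlier stages cannot decide.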

REST API

The REST surface is documented in full in Appendix W — REST API Reference: every endpoint, request/response shape, status codes, authentication model, and worked examples. The summary below points at the most-used subsystems; see Appendix W for per-endpoint detail.

Subsystem	What it covers
Discovery	Liveness, version, capability manifest, well-known endpoint.
Plan execution & runs	Trigger/abort runs, traces, replay, plans, workers.
Search, timeline, hub	Cross-surface search, unified timeline, WebSocket upgrade.
Memory	Capture, drain, search, OpenBrain stats.
Crucible	Idea-smelt lifecycle: `submit → ask → preview → finalize`.
LiveGuard	Drift, incidents, deploy journal, regression guard, runbooks, secret scan, dep watch.
Bridge & approvals	The only cross-boundary auth surface (HMAC via `PFORGE_BRIDGE_SECRET`).
Forge-Master	Conversational entry point: chat, prefs, cache stats.
Generic MCP dispatcher	`POST /api/tool/:name`: invoke any of the 102 MCP tools over REST.

Trust model: the server binds to 127.0.0.1 only and has no authentication layer of its own; the OS user account is the access boundary. The only exception is the bridge approval surface, which is HMAC-protected. See Appendix W — Authentication, binding, and CORS for the full discussion.
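To make the generic dispatcher concrete, here is a sketch that builds a `POST /api/tool/:name` request. The base URL is borrowed from the SDK example on this page; sending the tool arguments as a bare JSON body is an assumption, and `buildToolRequest` is a hypothetical helper. Appendix W remains the authority on the real request shape.

```javascript
// Builds a request for the generic dispatcher without sending it.
// Assumption: tool arguments travel as a JSON object in the request body.
function buildToolRequest(baseUrl, toolName, args = {}) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/api/tool/${encodeURIComponent(toolName)}`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(args),
    },
  };
}

const { url, init } = buildToolRequest('http://localhost:3100', 'forge_capabilities');
console.log(init.method, url);
// → POST http://localhost:3100/api/tool/forge_capabilities
```

The resulting pair can be handed to `fetch(url, init)` from a process on the same machine, since the server binds to 127.0.0.1 only.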

WebSocket Hub

Connect to ws://localhost:3101 for real-time events. The dashboard uses this for live progress updates.

Event	When
`connected`	Client connects; includes event history replay
`run-started`	Plan execution begins
`slice-started`	Slice begins execution
`slice-completed`	Slice passes all validation gates
`slice-failed`	Slice or gate fails
`slice-escalated`	Slice escalated to quorum for multi-model consensus
`run-completed`	All slices finish
`run-aborted`	Execution aborted via `forge_abort`
`skill-started`	Skill execution begins
`skill-completed`	Skill finishes all steps
`approval-requested`	Bridge pauses for external approval
`bridge-notification-sent`	Webhook dispatched (Telegram, Slack, Discord)
`watch-snapshot-completed`	Watcher built a snapshot of a target project
`watch-anomaly-detected`	Watcher detected one or more anomalies (stalled, slice-failed, quorum-dissent, etc.)
`watch-advice-generated`	Watcher analyze-mode produced narrative advice from a frontier model
`fm-turn`	Forge-Master turn (intent classification + tool-call trace + reply). Surfaces in the unified Timeline.
`quorum-estimate`	Forge-Master quorum advisory cost estimate, emitted before model dispatch so clients can cancel
`memory-captured`	Decision / pattern / postmortem captured to OpenBrain
`crucible-started` / `crucible-question` / `crucible-finalized`	Crucible interview lifecycle events
`tempering-round-completed`	One round of audit-loop drain finished (scan → triage → fix)
`slice-orphan-warning`	Failed slice's worker deliverables were staged but not committed; recovery commands available
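A client consuming the hub typically folds these events into a small piece of state. The sketch below assumes each message is a JSON object whose `type` field carries the event name from the table; that wire format and the `createRunTracker` helper are assumptions, so check Appendix V or EVENTS.md for the real payloads.

```javascript
// Assumption: each hub message is JSON with a `type` field naming the event.
function createRunTracker() {
  const state = { running: false, completed: 0, failed: 0, escalated: 0 };
  return {
    state,
    handle(rawMessage) {
      const event = JSON.parse(rawMessage);
      switch (event.type) {
        case 'run-started': state.running = true; break;
        case 'slice-completed': state.completed += 1; break;
        case 'slice-failed': state.failed += 1; break;
        case 'slice-escalated': state.escalated += 1; break;
        case 'run-completed':
        case 'run-aborted': state.running = false; break;
        default: break; // ignore events this tracker does not care about
      }
      return state;
    },
  };
}

const tracker = createRunTracker();
['run-started', 'slice-started', 'slice-completed', 'slice-failed', 'slice-completed', 'run-completed']
  .forEach((type) => tracker.handle(JSON.stringify({ type })));

console.log(tracker.state);
// → { running: false, completed: 2, failed: 1, escalated: 0 }
```

In a runtime that provides a global `WebSocket`, the tracker could be wired up by opening `ws://localhost:3101` and passing each message's `data` to `tracker.handle`.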

Telemetry

Every plan execution emits OpenTelemetry (OTLP) traces stored in .forge/runs/<timestamp>/traces.json:

Resource context: project name, version, preset, model
Span hierarchy: run → slice → gate → escalation
Severity levels: INFO for passes, WARN for retries, ERROR for failures
Export: traces are OTLP-compatible; send to Jaeger, Grafana Tempo, or any collector
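The run → slice → gate → escalation hierarchy can be recovered from a flat span list by following parent links. The sketch below uses the `spanId`, `parentSpanId`, and `name` field names from OTLP's JSON span encoding; whether `traces.json` stores spans in exactly this flat shape is an assumption, and the sample span names are invented for illustration.

```javascript
// Groups spans by parent, then walks the tree depth-first from the roots.
// Assumption: root spans have an empty or missing parentSpanId.
function renderSpanTree(spans) {
  const children = new Map();
  for (const span of spans) {
    const parent = span.parentSpanId || '';
    if (!children.has(parent)) children.set(parent, []);
    children.get(parent).push(span);
  }
  const lines = [];
  const walk = (parentId, depth) => {
    for (const span of children.get(parentId) || []) {
      lines.push(`${'  '.repeat(depth)}${span.name}`);
      walk(span.spanId, depth + 1);
    }
  };
  walk('', 0);
  return lines;
}

// Hypothetical sample data.
const sample = [
  { spanId: '1', parentSpanId: '', name: 'run' },
  { spanId: '2', parentSpanId: '1', name: 'slice-1' },
  { spanId: '3', parentSpanId: '2', name: 'gate: tests' },
  { spanId: '4', parentSpanId: '3', name: 'escalation' },
  { spanId: '5', parentSpanId: '1', name: 'slice-2' },
];

console.log(renderSpanTree(sample).join('\n'));
// run
//   slice-1
//     gate: tests
//       escalation
//   slice-2
```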

Cost Tracking

The orchestrator tracks tokens and computes cost per slice using a 23-model pricing table:

Per-slice: tokens in/out, model, duration, USD cost
Per-run: total cost, model breakdown
Monthly: aggregated in .forge/cost-history.json
Model performance: .forge/model-performance.json tracks success rate, avg cost, avg duration per model

The orchestrator auto-selects the cheapest model with >80% historical pass rate. Use --estimate to preview costs before executing.
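The two calculations described in this section, per-slice cost from token counts and cheapest-model selection above a pass-rate threshold, can be sketched as follows. Everything concrete here is hypothetical: the model names, the prices (USD per million tokens), and the `successRate` / `avgCost` field names are illustrative and do not reflect the real pricing table or the `.forge/model-performance.json` schema.

```javascript
// Hypothetical price table in USD per million tokens (not real pricing).
const PRICING = {
  'model-a': { in: 3, out: 15 },
  'model-b': { in: 0.5, out: 1.5 },
};

// Cost of one slice: input and output tokens are priced separately.
function sliceCostUsd(model, tokensIn, tokensOut) {
  const price = PRICING[model];
  return (tokensIn * price.in + tokensOut * price.out) / 1e6;
}

// Cheapest model whose historical pass rate is strictly above the threshold.
function pickModel(stats, minPassRate = 0.8) {
  const eligible = stats.filter((m) => m.successRate > minPassRate);
  if (eligible.length === 0) return null;
  return eligible.reduce((best, m) => (m.avgCost < best.avgCost ? m : best)).model;
}

console.log(sliceCostUsd('model-a', 12000, 2000)); // → 0.066

const stats = [
  { model: 'model-a', successRate: 0.95, avgCost: 0.07 },
  { model: 'model-b', successRate: 0.78, avgCost: 0.01 }, // cheapest, but below the bar
  { model: 'model-c', successRate: 0.85, avgCost: 0.03 },
];
console.log(pickModel(stats)); // → model-c
```

Note that the cheapest model overall is skipped because its pass rate does not clear the 80% bar, which is the trade-off the selection rule encodes.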

SDK for Integrators

The pforge-sdk/ package provides a JavaScript/TypeScript API for building integrations:

JavaScript

import { createForgeClient } from 'pforge-sdk';

const forge = createForgeClient({ baseUrl: 'http://localhost:3100' });

// Run smith diagnostics
const health = await forge.smith();

// Get cost report
const cost = await forge.costReport();

// Execute a plan
const run = await forge.runPlan('docs/plans/Phase-1.md', {
  mode: 'estimate'
});

The SDK is currently at scaffold stage (v0.1.0): the API surface is defined, and the implementation is in progress.

API Key Configuration

API keys for external providers (xAI Grok, OpenAI) are resolved in order: environment variable → .forge/secrets.json → null.

.forge/secrets.json

{
  "XAI_API_KEY": "xai-...",
  "OPENAI_API_KEY": "sk-..."
}

The .forge/ directory is gitignored by default, so secrets stay out of version control.
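The resolution order (environment variable → .forge/secrets.json → null) amounts to a short fallback chain. In the sketch below, `resolveApiKey` is a hypothetical helper, the secrets reader is injected so the example stays free of file-system code, and treating an empty environment variable as unset is an assumption.

```javascript
// readSecrets is injected; in practice it would parse .forge/secrets.json
// and return null or {} when the file is absent.
function resolveApiKey(name, env, readSecrets) {
  if (env[name]) return env[name]; // 1. environment variable wins
  const secrets = readSecrets() || {};
  return secrets[name] || null; // 2. secrets file, else 3. null
}

const fakeSecrets = () => ({ XAI_API_KEY: 'xai-from-file' });

console.log(resolveApiKey('XAI_API_KEY', { XAI_API_KEY: 'xai-from-env' }, fakeSecrets)); // → xai-from-env
console.log(resolveApiKey('XAI_API_KEY', {}, fakeSecrets)); // → xai-from-file
console.log(resolveApiKey('OPENAI_API_KEY', {}, fakeSecrets)); // → null
```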

📄 Full reference: capabilities, Appendix V — Event Catalog (every WebSocket event grouped by family), EVENTS.md on GitHub, tools.json on GitHub