MCP Server — Full Reference
Complete tool tables for all 102 MCP tools across 8 categories, REST API endpoints, WebSocket hub events, OTLP telemetry, cost tracking, SDK, and API key configuration.
MCP Tools (102, in 8 Categories)
Every tool is callable from Copilot Chat, Claude Code, Cursor, or any MCP-compatible client. Tools are grouped by station / subsystem. The four "station" categories (Crucible, LiveGuard, Tempering, Bug Registry / Testbed) map directly to the four shop stations; the rest are cross-cutting infrastructure.
Call forge_capabilities before anything else. It returns the full live API surface, including tool schemas, config options, available extensions, and per-tool error codes, and is always authoritative.
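As a rough illustration of what "calling a tool" means on the wire, the sketch below builds the JSON-RPC 2.0 `tools/call` request that an MCP client sends to invoke a tool by name. The helper `buildToolCall` and the variable `discoveryCall` are hypothetical names introduced here, and the assumption that forge_capabilities needs no arguments is not confirmed by this reference.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build the request a client would send to invoke a tool.
function buildToolCall(
  id: number,
  tool: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// Assumption: forge_capabilities can be called with an empty argument object.
const discoveryCall = buildToolCall(1, "forge_capabilities");
```

In practice an MCP-compatible client builds and sends this envelope for you; the sketch only shows which tool name to ask for first.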
Core — Execution, Diagnosis, Skills, Cost, Memory (37 tools)
Everything that powers the Smelt and Forge stations plus the cross-cutting surfaces (skills, memory, cost, search, review queue, notifications, image generation, meta-bug filing).
| Tool | Description |
|---|---|
| Diagnostics & setup | |
| forge_smith | Diagnose environment, VS Code config, setup health, version currency. The "shop inspector." |
| forge_validate | Validate setup files, check counts match preset, no placeholders |
| forge_sweep | Scan for TODO/FIXME/HACK/stub/placeholder markers |
| forge_capabilities | Machine-readable API surface, tools, intents, config, extensions, error codes |
| forge_status | Show phases from DEPLOYMENT-ROADMAP.md with status |
| Plan execution (Forge station) | |
| forge_run_plan | Execute a hardened plan: spawn workers, validate gates, track tokens. Supports --quorum=auto\|power\|speed\|false |
| forge_abort | Abort the currently running plan execution |
| forge_plan_status | Latest execution status, per-slice results, tokens, duration |
| forge_diff | Compare changes against the plan's Scope Contract, detect drift |
| forge_new_phase | Create a new phase plan file + roadmap entry |
| Analysis & estimation | |
| forge_analyze | Cross-artifact consistency scoring (0–100, 4 dimensions) |
| forge_diagnose | Multi-model bug investigation, root cause + fix recommendations |
| forge_estimate_quorum | Projected cost of a plan under all four quorum modes (auto/power/speed/false). Always call this before showing cost estimates, never hand-compute. |
| forge_estimate_slice | Per-slice cost estimate with confidence (heuristic vs historical) |
| forge_doctor_quorum | Diagnose quorum-mode availability and routing issues |
| forge_graph_query | Query the Plan Forge knowledge graph (built post-Slice via postSlice hook) |
| forge_search | Cross-artifact search across plans, runs, bugs, memory |
| Cost & performance | |
| forge_cost_report | Cost tracking: total spend, per-model breakdown, monthly trend. Authoritative source for actual spend. |
| forge_timeline | Unified chronological view of runs, incidents, bugs, deploys, fm-turns, crucible events. 9 sources. |
| forge_home_snapshot | Snapshot of the "home" dashboard tile state, aggregate health surface |
| Skills & review | |
| forge_run_skill | Execute a skill programmatically with step-level tracking |
| forge_skill_status | Recent skill execution events from the hub |
| forge_review_add | Queue a review item (used by Step 5 reviewer agents) |
| forge_review_list | List open / resolved review items |
| forge_review_resolve | Resolve a review item with verdict + notes |
| forge_patterns_list | List captured architectural patterns for a project |
| Memory (Learn station bridge) | |
| forge_memory_capture | Normalise and broadcast a memory-captured hub event for OpenBrain |
| forge_memory_report | Aggregate report of recent captures, patterns, decisions |
| Notifications & bridge | |
| forge_notify_send | Send a notification via the configured Remote Bridge (Slack / Teams / PagerDuty / OpenClaw / Telegram / Discord) |
| forge_notify_test | Test the Remote Bridge configuration end-to-end |
| forge_delegate_to_agent | Hand a sub-task to a specific reviewer agent in multi-agent mode |
| Extensions & meta | |
| forge_ext_search | Search the community extension catalog |
| forge_ext_info | Detailed info about a specific extension |
| forge_org_rules | Export org custom instructions, consolidate instruction files for GitHub org-level Copilot config |
| forge_meta_bug_file | File a self-repair bug against Plan Forge itself (plan-defect / orchestrator-defect / prompt-defect) |
| forge_triage_route | Route a finding to the appropriate lane (bug / spec / classifier), powers the audit-loop drain |
| forge_generate_image | Generate images via Grok Aurora or DALL-E, save with format conversion |
LiveGuard — Post-Ship Defense (14 tools)
The Guard station. Detect drift, capture incidents, watch dependencies, scan for secrets, propose fixes, all running against shipped code. Chapter 17 — LiveGuard Tools Reference covers each one in depth (flags, thresholds, output shapes, severity matrix). Listed here for completeness.
| Tool | Description |
|---|---|
| forge_liveguard_run | Composite scan: drift + sweep + secrets + regression + deps + alerts + health. The "everything" command. |
| forge_drift_report | Score codebase against architecture guardrail rules; track drift over time |
| forge_secret_scan | High-entropy secret detection, values always redacted |
| forge_dep_watch | Scan dependencies for CVEs; alert on new vulnerabilities |
| forge_regression_guard | Extract validation gates from plans, execute against codebase |
| forge_incident_capture | Record incidents with severity, affected files, MTTR tracking |
| forge_alert_triage | Read incidents and drift violations, rank by priority |
| forge_env_diff | Environment variable key divergence across .env files |
| forge_fix_proposal | Generate scoped 1–2 slice fix plan from a regression / drift / incident finding |
| forge_health_trend | Aggregate drift, cost, incidents, model performance into health score 0–100 |
| forge_hotspot | Identify git-churn hotspots, files that change most frequently |
| forge_runbook | Generate an operational runbook from a hardened plan file |
| forge_deploy_journal | Record deployments with version, deployer, notes |
| forge_quorum_analyze | Assemble structured quorum prompt from LiveGuard data, no LLM calls |
Watcher — Cross-Project Read-Only Tail (2 tools)
Read-only observation of another project's forge run from a second VS Code session. See Chapter 19 — The Watcher.
| Tool | Description |
|---|---|
| forge_watch | Snapshot or analyze (claude-opus-4.7) mode. Returns counts, anomalies, recommendations, diff cursor. |
| forge_watch_live | Live tail, streams events for fixed duration via target's WebSocket hub or events.log polling. |
Crucible — Idea Smelting (8 tools)
The Smelt station. Interview-driven plan intake with a critical-fields gate that refuses to finalize until build-command, test-command, scope, gates, and forbidden-actions are all satisfied. Includes a deterministic Spec Kit importer. See Chapter 5 — Crucible.
| Tool | Description |
|---|---|
| forge_crucible_submit | Submit a raw idea or feature request to start an interview |
| forge_crucible_ask | Answer the next interview question. Accepts an optional questionId; out-of-sync clients are refused with ASK_QUESTION_MISMATCH. |
| forge_crucible_preview | Preview the draft plan + flag any unresolved CRITICAL_FIELDS |
| forge_crucible_finalize | Finalize into docs/plans/Phase-NN.md. Refuses with PLAN_ALREADY_EXISTS if the plan exists; pass overwrite: true to bypass. Refuses with CRITICAL_FIELDS_MISSING if any critical field is missing. |
| forge_crucible_list | List all in-flight and finalized smelts |
| forge_crucible_abandon | Abandon an in-flight smelt |
| forge_crucible_import | Deterministic Spec Kit importer. Maps a Spec Kit checkout (spec.md + plan.md + tasks.md + optional constitution.md) into a Plan Forge smelt under .forge/crucible/. No LLM calls. Supports --dry-run and --json. |
| forge_crucible_status | Inspect imported smelts. Lists all smelts when called without an id, or returns the full smelt record (metadata + draft plan) when given a smelt id. |
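The critical-fields gate described above can be pictured as a simple completeness check. The sketch below is an illustration only: the `Draft` shape, the `checkCriticalFields` helper, and the rule that blank strings count as missing are assumptions made here, while the five field names and the CRITICAL_FIELDS_MISSING code come from this reference.

```typescript
// The five fields the gate requires, as named in this reference.
const CRITICAL_FIELDS = [
  "build-command",
  "test-command",
  "scope",
  "gates",
  "forbidden-actions",
] as const;

type CriticalField = (typeof CRITICAL_FIELDS)[number];

// Hypothetical draft shape: one free-text answer per critical field.
type Draft = Partial<Record<CriticalField, string>>;

type GateResult =
  | { ok: true }
  | { ok: false; code: "CRITICAL_FIELDS_MISSING"; missing: CriticalField[] };

// Assumption: a field counts as satisfied when it has a non-blank answer.
function checkCriticalFields(draft: Draft): GateResult {
  const missing = CRITICAL_FIELDS.filter((field) => !(draft[field] ?? "").trim());
  return missing.length === 0
    ? { ok: true }
    : { ok: false, code: "CRITICAL_FIELDS_MISSING", missing };
}

// A draft with only two answers is refused and lists the three gaps.
const partialDraft: Draft = { scope: "billing module only", gates: "npm test passes" };
const gateResult = checkCriticalFields(partialDraft);
```

The real gate may apply richer validation per field; the point of the sketch is that finalization is blocked by a deterministic check rather than by model judgment.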
Tempering — Quality Drains & Audit Loop (5 tools)
Closed-loop self-tempering: scan, triage, fix, and repeat until convergence. The audit-loop drain is opt-in via .forge.json → audit.mode = "off" | "auto" | "always". See Audit Loop Deep Dive.
| Tool | Description |
|---|---|
| forge_tempering_scan | Run a single tempering scanner (mutation, content-audit, etc.) |
| forge_tempering_run | Run the full standard scanner sequence (10 scanners) |
| forge_tempering_drain | Iterate scan → triage → fix until convergence or maxRounds |
| forge_tempering_status | Latest tempering run status, scanners, findings |
| forge_tempering_approve_baseline | Approve current findings as the new baseline for visual-diff scanners |
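The drain's control flow (scan → triage → fix, stopping at convergence or maxRounds) can be sketched as a bounded loop. Everything named below (`DrainHooks`, `drainLoop`, `makeToyHooks`) is hypothetical, and treating "no findings left" as convergence is an assumption; the actual drain may define convergence differently.

```typescript
// Hypothetical hooks standing in for the real scanner, triage, and fix steps.
interface DrainHooks {
  scan: () => string[];                      // current finding ids
  triage: (findings: string[]) => string[];  // findings selected for this round
  fix: (selected: string[]) => void;         // attempt to fix the selection
}

interface DrainResult {
  rounds: number;
  converged: boolean;
  remaining: string[];
}

// Repeat scan -> triage -> fix until nothing is left or maxRounds is reached.
function drainLoop(hooks: DrainHooks, maxRounds: number): DrainResult {
  let remaining = hooks.scan();
  let rounds = 0;
  while (remaining.length > 0 && rounds < maxRounds) {
    hooks.fix(hooks.triage(remaining));
    rounds += 1;
    remaining = hooks.scan();
  }
  return { rounds, converged: remaining.length === 0, remaining };
}

// Toy backlog for illustration: each round fixes the first finding only.
function makeToyHooks(initial: string[]): DrainHooks {
  let backlog = [...initial];
  return {
    scan: () => [...backlog],
    triage: (findings) => findings.slice(0, 1),
    fix: (selected) => {
      backlog = backlog.filter((f) => selected.indexOf(f) === -1);
    },
  };
}
```

With three findings and one fix per round, `drainLoop(makeToyHooks(["a", "b", "c"]), 5)` converges after three rounds, while a maxRounds of 2 stops early with one finding remaining.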
Bug Registry — Closed-Loop Bug Lifecycle (4 tools)
The Learn station. Fingerprint-deduped bug registry: register, fix, validate, remember. See Chapter 23 — The Bug Registry.
| Tool | Description |
|---|---|
| forge_bug_register | Register a new bug with title, severity, fingerprint inputs, file paths |
| forge_bug_list | List bugs by status, severity, or fingerprint match |
| forge_bug_update_status | Update status (open / in-progress / fixed / verified / closed). Accepts both newStatus and status. |
| forge_bug_validate_fix | Run the bug's validation gate against the current codebase to confirm a fix landed |
Testbed — Scenario Replay (3 tools)
Replay scenarios against a dedicated fixture repo (typically plan-forge-testbed/) to prove fixes don't regress. See Chapter 24 — The Testbed.
| Tool | Description |
|---|---|
| forge_testbed_run | Execute a scenario against the testbed fixture |
| forge_testbed_happypath | Run the happy-path scenario set as a smoke test |
| forge_testbed_findings | Aggregate findings from the latest testbed run |
Forge-Master — Read-Only Reasoning Orchestrator (1 MCP tool + REST surface)
Intent classifier with embedding cache and quorum advisory mode. Classifies open-ended prompts, fetches OpenBrain memory, and chains read-only forge tools on your behalf. The bulk of the Forge-Master surface is exposed via /api/forge-master/* REST routes (see below) plus the dashboard's Studio tab; only the one-shot reasoning entry-point is an MCP tool.
| Tool | Description |
|---|---|
| forge_master_ask | One-shot reasoning entry point. Accepts a free-form message; returns lane classification, tool-call trace, and synthesized reply. Use for open-ended questions instead of chaining tools yourself. |
/api/forge-master/cache-stats is the liveness endpoint.
REST API
The REST surface is documented in full in Appendix W — REST API Reference: every endpoint, request/response shape, status codes, authentication model, and worked examples. The summary below points at the most-used subsystems; click through to Appendix W for per-endpoint detail.
| Subsystem | What it covers |
|---|---|
| Discovery | Liveness, version, capability manifest, well-known endpoint. |
| Plan execution & runs | Trigger/abort runs, traces, replay, plans, workers. |
| Search, timeline, hub | Cross-surface search, unified timeline, WebSocket upgrade. |
| Memory | Capture, drain, search, OpenBrain stats. |
| Crucible | Idea-smelt lifecycle: submit → ask → preview → finalize. |
| LiveGuard | Drift, incidents, deploy journal, regression guard, runbooks, secret scan, dep watch. |
| Bridge & approvals | The only cross-boundary auth surface (HMAC via PFORGE_BRIDGE_SECRET). |
| Forge-Master | Conversational entrypoint, chat, prefs, cache stats. |
| Generic MCP dispatcher | POST /api/tool/:name invokes any of the 102 MCP tools over REST. |
The REST server binds to 127.0.0.1 only and has no authentication layer of its own; the OS user account is the access boundary. The only exception is the bridge approval surface, which is HMAC-protected. See Appendix W — Authentication, binding, and CORS for the full discussion.
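To make the generic dispatcher concrete, the sketch below assembles a request for POST /api/tool/:name. Only the route itself comes from this reference. The port (3100, borrowed from the SDK example's baseUrl), the JSON body carrying the tool arguments, and the names `ToolHttpRequest`, `buildToolRequest`, and `statusRequest` are assumptions; Appendix W is the authority on the real request shape.

```typescript
// Plain description of an HTTP request, independent of any HTTP client.
interface ToolHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: target the generic dispatcher route for one tool.
function buildToolRequest(
  baseUrl: string,
  tool: string,
  args: Record<string, unknown> = {},
): ToolHttpRequest {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${root}/api/tool/${encodeURIComponent(tool)}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Assumption: the body is the tool's argument object serialized as JSON.
    body: JSON.stringify(args),
  };
}

// Assumption: the REST server listens on port 3100, as in the SDK example.
const statusRequest = buildToolRequest("http://127.0.0.1:3100/", "forge_status");
```

The resulting `url`, `method`, `headers`, and `body` can be handed to any HTTP client. Because the server is bound to loopback, the request has to originate on the same machine.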
WebSocket Hub
Connect to ws://localhost:3101 for real-time events. The dashboard uses this for live progress updates.
| Event | When |
|---|---|
| connected | Client connects, includes event history replay |
| run-started | Plan execution begins |
| slice-started | Slice begins execution |
| slice-completed | Slice passes all validation gates |
| slice-failed | Slice or gate fails |
| slice-escalated | Slice escalated to quorum for multi-model consensus |
| run-completed | All slices finish |
| run-aborted | Execution aborted via forge_abort |
| skill-started | Skill execution begins |
| skill-completed | Skill finishes all steps |
| approval-requested | Bridge pauses for external approval |
| bridge-notification-sent | Webhook dispatched (Telegram, Slack, Discord) |
| watch-snapshot-completed | Watcher built a snapshot of a target project |
| watch-anomaly-detected | Watcher detected one or more anomalies (stalled, slice-failed, quorum-dissent, etc.) |
| watch-advice-generated | Watcher analyze-mode produced narrative advice from frontier model |
| fm-turn | Forge-Master turn (intent classification + tool-call trace + reply). Surfaces in the unified Timeline. |
| quorum-estimate | Forge-Master quorum advisory cost estimate, emitted before model dispatch so clients can cancel |
| memory-captured | Decision / pattern / postmortem captured to OpenBrain |
| crucible-started / crucible-question / crucible-finalized | Crucible interview lifecycle events |
| tempering-round-completed | One round of audit-loop drain finished (scan → triage → fix) |
| slice-orphan-warning | Failed slice's worker deliverables were staged but not committed; recovery commands available |
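A client consuming the hub typically folds the event stream into a progress summary. The sketch below assumes each hub message is a JSON object whose `type` field holds one of the event names from the table; that envelope, along with `HubEvent`, `parseHubMessage`, and `summarizeRun`, is an assumption made for illustration, so check Appendix V for the real payloads.

```typescript
// Assumed envelope: a JSON object with a string `type` naming the event.
interface HubEvent {
  type: string;
}

interface RunProgress {
  started: number;
  completed: number;
  failed: number;
  finished: boolean;
}

// Parse one raw message; returns null for anything that is not a typed event.
function parseHubMessage(raw: string): HubEvent | null {
  try {
    const data: unknown = JSON.parse(raw);
    if (
      typeof data === "object" &&
      data !== null &&
      typeof (data as { type?: unknown }).type === "string"
    ) {
      return data as HubEvent;
    }
  } catch {
    // Not valid JSON: treat as an unrecognized message.
  }
  return null;
}

// Fold slice and run events into a progress summary; other events are ignored.
function summarizeRun(events: HubEvent[]): RunProgress {
  const progress: RunProgress = { started: 0, completed: 0, failed: 0, finished: false };
  for (const ev of events) {
    switch (ev.type) {
      case "slice-started": progress.started += 1; break;
      case "slice-completed": progress.completed += 1; break;
      case "slice-failed": progress.failed += 1; break;
      case "run-completed":
      case "run-aborted": progress.finished = true; break;
      default: break;
    }
  }
  return progress;
}
```

Wiring this to a live socket means passing each message received from ws://localhost:3101 through `parseHubMessage` and re-running `summarizeRun` over the accumulated events.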
Telemetry
Every plan execution emits OpenTelemetry (OTLP) traces stored in .forge/runs/<timestamp>/traces.json:
- Resource context: project name, version, preset, model
- Span hierarchy: run → slice → gate → escalation
- Severity levels: INFO for passes, WARN for retries, ERROR for failures
- Export: traces are OTLP-compatible and can be sent to Jaeger, Grafana Tempo, or any collector
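The span hierarchy can be reconstructed from parent links. The sketch below works on a flat list of spans carrying `spanId`, `parentSpanId`, and `name`, which are standard OTLP span fields. How traces.json nests those spans is not specified here, so flattening the file into such a list is left as an assumption, and `renderSpanTree` is a hypothetical helper.

```typescript
// Minimal subset of OTLP span fields needed to rebuild the hierarchy.
interface Span {
  spanId: string;
  parentSpanId?: string;
  name: string;
}

// Return one indented line per span, with children nested under their parent.
function renderSpanTree(spans: Span[]): string[] {
  const byId: Record<string, Span> = {};
  const children: Record<string, Span[]> = {};
  for (const span of spans) {
    byId[span.spanId] = span;
    children[span.spanId] = [];
  }

  const roots: Span[] = [];
  for (const span of spans) {
    const parent = span.parentSpanId;
    if (parent !== undefined && byId[parent] !== undefined) {
      children[parent].push(span);
    } else {
      roots.push(span); // no parent, or parent not present in this list
    }
  }

  const lines: string[] = [];
  const walk = (span: Span, indent: string): void => {
    lines.push(indent + span.name);
    for (const child of children[span.spanId]) walk(child, indent + "  ");
  };
  for (const root of roots) walk(root, "");
  return lines;
}
```

Fed a run span, a slice span parented to it, and a gate span parented to the slice, the function returns three lines indented by zero, two, and four spaces.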
Cost Tracking
The orchestrator tracks tokens and computes cost per slice using a 23-model pricing table:
- Per-slice: tokens in/out, model, duration, USD cost
- Per-run: total cost, model breakdown
- Monthly: aggregated in .forge/cost-history.json
- Model performance: .forge/model-performance.json tracks success rate, avg cost, avg duration per model
The orchestrator auto-selects the cheapest model with >80% historical pass rate. Use --estimate to preview costs before executing.
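The selection rule (cheapest model with a historical pass rate above 80%) can be sketched as a filter followed by a minimum. The `ModelStats` field names, the placeholder model names, and returning null when no model qualifies are all assumptions; the actual schema of .forge/model-performance.json and the orchestrator's fallback behavior are not described in this reference.

```typescript
// Hypothetical per-model record; the real file's field names may differ.
interface ModelStats {
  model: string;
  passRate: number;   // fraction of slices that passed, 0..1
  avgCostUsd: number; // average cost per slice in USD
}

// Pick the cheapest model whose pass rate is strictly above the threshold.
// Assumption: returns null when no model qualifies.
function pickModel(stats: ModelStats[], minPassRate = 0.8): ModelStats | null {
  const eligible = stats.filter((s) => s.passRate > minPassRate);
  if (eligible.length === 0) return null;
  return eligible.reduce((best, s) => (s.avgCostUsd < best.avgCostUsd ? s : best));
}

// Placeholder names; not real entries from the pricing table.
const candidates: ModelStats[] = [
  { model: "model-a", passRate: 0.95, avgCostUsd: 0.4 },
  { model: "model-b", passRate: 0.85, avgCostUsd: 0.1 },
  { model: "model-c", passRate: 0.6, avgCostUsd: 0.01 },
];
const chosen = pickModel(candidates);
```

Here `chosen` is model-b: model-c is cheaper but falls below the pass-rate bar, and model-a clears the bar at a higher cost.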
SDK for Integrators
The pforge-sdk/ package provides a JavaScript/TypeScript API for building integrations:
import { createForgeClient } from 'pforge-sdk';
const forge = createForgeClient({ baseUrl: 'http://localhost:3100' });
// Run smith diagnostics
const health = await forge.smith();
// Get cost report
const cost = await forge.costReport();
// Execute a plan
const run = await forge.runPlan('docs/plans/Phase-1.md', {
  mode: 'estimate'
});
The SDK is currently at the scaffold stage (v0.1.0): the API surface is defined and the implementation is in progress.
API Key Configuration
API keys for external providers (xAI Grok, OpenAI) are resolved in order: environment variable → .forge/secrets.json → null.
{
  "XAI_API_KEY": "xai-...",
  "OPENAI_API_KEY": "sk-..."
}
The .forge/ directory is gitignored by default, so secrets stay out of version control.
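The resolution order (environment variable → .forge/secrets.json → null) amounts to a two-step lookup. In the sketch below the environment and the parsed secrets file are passed in as plain objects so the logic stays testable; `KeyMap` and `resolveApiKey` are hypothetical names, and treating an empty string as "unset" is an assumption.

```typescript
// A bag of optional string values, e.g. process.env or a parsed secrets file.
type KeyMap = Record<string, string | undefined>;

// Resolve a provider key: environment first, then the secrets file, else null.
// Assumption: empty strings are treated the same as missing values.
function resolveApiKey(keyName: string, env: KeyMap, secrets: KeyMap | null): string | null {
  const fromEnv = env[keyName];
  if (fromEnv) return fromEnv;

  const fromFile = secrets?.[keyName];
  if (fromFile) return fromFile;

  return null;
}
```

A caller would pass `process.env` as `env` and the parsed contents of .forge/secrets.json (or null if the file is absent) as `secrets`. A key set in both places resolves to the environment value.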
📄 Full reference: capabilities, Appendix V — Event Catalog (every WebSocket event grouped by family), EVENTS.md on GitHub, tools.json on GitHub