Capability Reference

Everything Plan Forge can do — tools, commands, agents, skills, telemetry, and integrations. One page, complete coverage.

106 MCP Tools · ~14 Agents · ~12 Skills · 37 Dashboard Tabs · Quorum Mode · 🛡️ LiveGuard GA
  • 16s: 3 slices executed
  • 24/24 pipeline gates pass
  • 103 self-tests passing
  • 23 AI models supported
🛡️

LiveGuard — Post-Coding Intelligence

Shipped — GA since v2.30

The forge builds your code. LiveGuard watches after it ships. 14 MCP tools, 22 REST endpoints, 3 lifecycle hooks, and an optional OpenClaw analytics bridge — all surfaced in a LIVEGUARD section of the unified dashboard. Secret scanning and env-diff landed in v2.28; self-healing fix proposals and composite forge_liveguard_run landed in v2.29–v2.30; Watcher bridge in v2.34–v2.35.

14 MCP tools · 22 REST endpoints · 5 dashboard tabs · 3 lifecycle hooks · OpenClaw analytics bridge

The Four Stations

Plan Forge is an AI-Native SDLC Forge Shop. Every capability on this page lives in one of four stations. See the full Shop Tour for deep-dive walkthroughs.

🪨

Smelt

Raw idea → Scope Contract

  • Specifier agent · /specify
  • Hardener · /harden-plan
  • Project Principles
  • Crucible (idea intake)
  • Tempering gates
🔨

Forge

Contract → Shipped code

  • pforge run-plan
  • Slice gates + quorum mode
  • Agent-per-slice routing
  • Auto-escalation
  • Fresh-session review
🛡️

Guard

Post-deploy defense (LiveGuard)

  • Secret scan · Env diff
  • Drift report · Regression guard
  • Incident capture · Triage
  • Watcher + Watcher-live
  • Remote bridge (Telegram/Slack)
🧠

Learn

Memory & retrospectives

  • OpenBrain (L3 memory)
  • Bug registry (closed-loop)
  • Testbed scenarios
  • Health DNA fingerprint
  • Forge Intelligence

MCP Tools

All tools callable via Copilot Chat, Claude, Cursor, or any MCP client. Start with forge_capabilities for full discovery.

forge_capabilities

Full API surface — tools, workflows, config, memory, glossary

forge_run_plan

Execute plan — DAG scheduling, gates, token tracking, retry

forge_abort

Abort active execution between slices

forge_plan_status

Latest run status from .forge/runs/

forge_cost_report

Spend by model, monthly aggregation

forge_smith

Environment diagnostics + actionable fixes

forge_validate

Setup file validation

forge_sweep

TODO/FIXME/stub marker scanner

forge_status

Phase status from roadmap

forge_diff

Scope drift detection

forge_analyze

Consistency scoring — single or quorum (multi-model consensus)

forge_diagnose

Multi-model bug investigation with quorum synthesis

forge_ext_search

Browse extension catalog

forge_ext_info

Extension details

forge_new_phase

Create plan + roadmap entry

forge_skill_status

Query recent skill execution events

forge_run_skill

Execute skills programmatically with dry-run

forge_generate_image

Generate images via xAI Aurora or OpenAI DALL-E

forge_memory_capture

Normalise & broadcast a memory-captured event; returns capture_thought payload for OpenBrain

forge_github_status

Check GitHub API connectivity, Copilot subscription status, and GitHub Models API availability — returns auth state, rate limits, per-service health

forge_github_metrics

Live GitHub repo metrics via gh CLI — stars, forks, PRs, commit activity

forge_team_dashboard

Multi-developer plan coordination — per-operator stats + conflict-risk assessment

forge_team_activity

Recent run summaries from .forge/team-activity.jsonl

forge_delegate_review

Delegate the current branch's PR to the Copilot Coding Agent for review

forge_export_plan

Convert a loose Copilot cloud-agent plan into a hardened Phase-X-PLAN.md

forge_estimate_quorum

Projected plan cost under all four quorum modes — required before showing any dollar amount

forge_estimate_slice

Projected cost for a single slice — cheaper than full-plan estimate

forge_graph_query

Query the Plan Forge knowledge graph — phase, file, neighbor, recent-changes

forge_patterns_list

Recurring patterns across runs — gate-failure recurrences, model failure rates, cost anomalies

forge_meta_bug_file

File a self-repair meta-bug against Plan Forge itself (plan/orchestrator/prompt defects)

forge_classifier_issue

File a classifier rule update issue when a tempering finding routes to the 'classifier' lane

🛡️ LiveGuard Tools (14 shipped v2.27–v2.30 + 2 Watcher v2.34/v2.35) composite run · forge_liveguard_run

forge_drift_report

Architecture drift vs. plan baseline

forge_incident_capture

Incident log, MTTR, on-call tracking

forge_dep_watch

Dependency vulnerability change detection

forge_regression_guard

Validation gate pass/fail history

forge_runbook

Operational runbook store and retrieval

forge_hotspot

High-churn / high-failure file detection

forge_health_trend

Long-term health trend + MTTBF scoring

forge_alert_triage

Cross-signal ranked alert list with severity

forge_deploy_journal

Deploy log with pre/post health delta

forge_secret_scan

High-entropy secret detection in staged diffs — values always redacted

forge_env_diff

Env variable key divergence across .env files — keys only, values never read

forge_fix_proposal (v2.29)

Generates scoped 1-2 slice fix plan from regression/drift/incident/secret failure — capped, human-approved only

forge_quorum_analyze (v2.29)

Assembles structured quorum prompt from LiveGuard data for multi-model analysis — no LLM calls in server

forge_liveguard_run (v2.30)

Composite scan: drift + sweep + secrets + regression + deps + alerts + health in one call

forge_watch (v2.34/v2.35)

Read-only watcher — tail another project's pforge run from a second VS Code session. Snapshot or analyze mode (claude-opus-4.7). Returns counts, anomalies, recommendations, diff cursor.

forge_watch_live (v2.35)

Live tail — streams events for a fixed duration via target's WebSocket hub or events.log polling fallback. Read-only subscriber.

14 LiveGuard tools (v2.27–v2.30) plus 2 Watcher tools (v2.34/v2.35). All available as MCP tools and REST endpoints. See Chapter 16 — LiveGuard Tools Reference for full documentation.
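
forge_secret_scan is described above as high-entropy detection over staged diffs with values always redacted. The general technique can be sketched as follows. This is an illustration only, not Plan Forge's scanner: the token pattern, the 4.0 bits-per-character threshold, the 20-character minimum, and every function name are assumptions.

```python
import math
import re
from collections import Counter

# Assumed heuristics, chosen for illustration only.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/_\-]{20,}")
ENTROPY_THRESHOLD = 4.0  # bits per character


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_added_lines(diff_text: str) -> list[dict]:
    """Flag high-entropy tokens on added diff lines; the raw value is never returned."""
    findings = []
    for line_no, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect added lines, skip the file header
        for match in TOKEN_RE.finditer(line[1:]):
            token = match.group(0)
            entropy = shannon_entropy(token)
            if entropy >= ENTROPY_THRESHOLD:
                findings.append({
                    "line": line_no,
                    "entropy": round(entropy, 2),
                    "redacted": token[:2] + "*" * (len(token) - 2),
                })
    return findings
```

Only a two-character prefix survives in each finding, so a report can be shared or logged without exposing the secret itself.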

🔥 Crucible Tools (6 tools — v2.37 in development) raw idea → hardened spec funnel

The pre-forge funnel. Converts rough ideas into scoped plan files through a lane-aware interview (tweak / feature / full), atomic phase-number claims, and Plan Hardener handoff at finalize. Enforces that every plan has a crucibleId: frontmatter or was grandfathered via --manual-import.

forge_crucible_submit

Start a smelt — infers lane, creates record, emits crucible-smelt-started

forge_crucible_ask

Next interview question with recommended default sourced from L3 memory / principles / prior phases (or null if none)

forge_crucible_preview

Render current draft + list unresolved {{TBD:}} fields

forge_crucible_finalize

Atomically claim next phase number, write docs/plans/<phase>.md, hand off to Plan Hardener

forge_crucible_list

List smelts by status (in-progress / finalized / abandoned)

forge_crucible_abandon

Mark smelt abandoned and release any claimed phase number

Crucible is v2.37 (in development — shipping across 6 slices). Documentation chapter lands in the user manual at v2.37.0 release.

🔨 Tempering Tools (6 tools — v2.40+) temper-quality scoring

Post-hardening quality pipeline. Scores a plan's Scope Contract clarity, validation gates, slice sizing, and forbidden actions. Maintains an approved-baseline threshold so regressions block future commits.

forge_tempering_run

Run full pipeline (scan + score) against a Crucible-finalized plan; writes temper-score snapshot

forge_tempering_scan

Scan for temper-quality signals (contract clarity, gates, slice sizing, forbidden actions)

forge_tempering_status

Read latest tempering results per plan (score, findings, baseline delta)

forge_tempering_approve_baseline

Approve current tempering score as the new baseline threshold

forge_tempering_drain

Run the audit drain loop — iterates content-audit scan → triage → fix until convergence (v2.80+)

forge_triage_route

Route a finding through the triage classifier — returns lane (bug/spec/classifier) + payload (v2.80+)

🐛 Bug Registry Tools (4 tools — v2.45+) native bug tracking

First-class bug tracking inside Plan Forge — register, filter, transition, and validate fixes. Surfaces in the dashboard timeline + Bug Registry tab, and LiveGuard incidents can auto-link to registered bugs.

forge_bug_register

Register a bug with severity, title, description, affected files, linked plan/slice

forge_bug_list

List bugs with status/severity/plan filters

forge_bug_update_status

Transition state (open → investigating → in-progress → resolved → closed)

forge_bug_validate_fix

Verify proposed fix against bug description + linked slice gates

🧪 Testbed Tools (3 tools — v2.50+) happy-path scenarios

End-to-end scenario runner against an isolated testbed repository. Guards every release with Chapter 8 happy-path regression validation; failures produce findings linked to the causing change.

forge_testbed_run

Execute a single scenario by ID against the configured testbed project

forge_testbed_happypath

Run all happy-path scenarios sequentially, aggregate pass/fail summary

forge_testbed_findings

Read cumulative testbed findings (failures, flaky scenarios, runtime trends)

🕸️ Lattice, Hallmark & Anvil — Code Intelligence (5 MCP tools + CLI — v2.95+) code-graph · provenance · cache

The Lattice code-graph engine builds a semantic chunk index and BFS call-graph over any git repository (5 MCP tools). Hallmark attaches a lightweight hallmark/v1 provenance envelope to any artifact so drift detection can verify source integrity across sessions (CLI + SDK). Anvil is the content-hash-keyed memoization cache that prevents re-indexing unchanged files and owns the L2→L3 dead-letter queue (CLI). See Chapter 25 — How the Shop Remembers for the plain-English tour.

forge_lattice_index

Build or update the Lattice chunk index; --since enables incremental re-indexing from a git SHA

forge_lattice_stat

Index statistics: chunk count, edge count, language breakdown, Anvil hit rate, index size

forge_lattice_query

Full-text search over the chunk index; returns bounded 80-char snippets ranked by camelCase-aware token-overlap score (v3.5.1+)

forge_lattice_callers

Find all callers of a named symbol using the edge graph

forge_lattice_blast

BFS call-graph traversal up to depth 5; returns truncated: true when frontier is capped

pforge hallmark show · verify

CLI — read or drift-check a hallmark/v1 provenance record (schema version, tool name, captured timestamp, content hash). SDK at pforge-sdk/hallmark.

pforge anvil stat · clear · rebuild · dlq

CLI — memoization cache stats, selective invalidation by tool or git SHA, dead-letter queue list/drain. Lives under .forge/anvil/.

Lattice, Hallmark, and Anvil ship in v2.95.0. Hallmark and Anvil are CLI-only because they are local-file utilities that don't benefit from MCP overhead — capability metadata is still exposed via forge_capabilities so agents can discover them. See pforge lattice --help, pforge hallmark --help, pforge anvil --help.
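
forge_lattice_blast is described above as a BFS call-graph traversal capped at depth 5 that reports truncated: true when the frontier is capped. A minimal sketch of that idea follows. The graph representation (a dict mapping each symbol to its callers), the max_frontier value, and the result shape are assumptions made for illustration.

```python
from collections import deque

MAX_DEPTH = 5  # depth limit stated in the tool description


def blast_radius(callers: dict[str, list[str]], symbol: str,
                 max_depth: int = MAX_DEPTH, max_frontier: int = 100) -> dict:
    """Breadth-first walk over reverse call edges, starting from `symbol`."""
    seen = {symbol}
    queue = deque([(symbol, 0)])
    reached = []
    truncated = False
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for caller in callers.get(node, []):
            if caller in seen:
                continue
            if len(queue) >= max_frontier:
                truncated = True  # frontier cap hit: the result is incomplete
                continue
            seen.add(caller)
            reached.append({"symbol": caller, "depth": depth + 1})
            queue.append((caller, depth + 1))
    return {"root": symbol, "reached": reached, "truncated": truncated}
```

The `seen` set keeps recursive or mutually recursive functions from looping, and the `truncated` flag tells the caller that a wider frontier cap would return more results.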

🧠 Copilot Memory Sync (2 MCP tools — v2.99+) memory bridge · cheaper models

Bridges forge memory upward into GitHub Copilot's own Memory store — the next IDE session auto-discovers project decisions, lessons, and patterns without requiring OpenBrain configuration. Soft-sync is additive and hash-deduped, so safe to run repeatedly. Together with Hallmark provenance, Anvil DLQ, and the Lattice code-graph, this completes the v3.x memory upgrades that let cheaper, faster models produce flagship-grade results. Full plain-English tour: Chapter 25 — How the Shop Remembers.

forge_sync_memories

Generate .github/copilot-memory-hints.md from forge decisions — trajectory notes, auto-skills, brain L2 entries. CLI: pforge sync-memories.

forge_sync_instructions

Generate .github/copilot-instructions.md from project profile + principles + .forge.json. Completes the Copilot integration trilogy. CLI: pforge sync-instructions.
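
The section above describes soft-sync as additive and hash-deduped, which is what makes repeated runs safe. The sketch below shows one way such an idempotent merge could work. The entry format, the HTML-comment hash marker, and the function names are assumptions; the real layout of copilot-memory-hints.md is not specified on this page.

```python
import hashlib
import re

# Hypothetical marker format used to remember which entries were already written.
HASH_MARK = re.compile(r"<!-- hash:([0-9a-f]{12}) -->")


def entry_hash(text: str) -> str:
    """Short content hash; whitespace is normalised so reflowed text still dedupes."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:12]


def soft_sync(existing: str, entries: list[str]) -> str:
    """Append only entries whose hash is not already present (additive merge)."""
    known = set(HASH_MARK.findall(existing))
    out = existing
    for entry in entries:
        digest = entry_hash(entry)
        if digest in known:
            continue  # already synced: skip, never rewrite or delete
        known.add(digest)
        out += f"- {' '.join(entry.split())} <!-- hash:{digest} -->\n"
    return out
```

Because existing lines are never rewritten or removed, running the sync twice with the same input returns the file unchanged.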

🧭 Forge-Master Studio (1 tool + dashboard — v2.63+) open-ended reasoning · read-only

A read-only reasoning orchestrator. Classifies user intent, retrieves OpenBrain memory context, and orchestrates other forge tools on the agent's behalf. Purpose-built for multi-step troubleshooting, plan status queries, and funneling ideas into Crucible smelts — without chaining tools by hand. Phase-29 added the Forge-Master Studio dashboard tab with a curated prompt gallery, streaming chat, and a live tool-call trace pane.

forge_master_ask

Accepts a freeform message. Returns a structured reasoning response built from intent classification, memory retrieval, and allowlist-gated read-only tool calls.

Studio tab · prompt gallery · chat stream · tool-call trace

Dashboard UI at localhost:3100/dashboard. Also available as CLI via pforge forge-master status|logs.

🧭 Collaboration, Notifications & Dashboard (13 tools) reviews, alerts, search, memory, release

forge_review_add

Capture a review thread (audit, gate failure, drift finding) linked to plan/slice

forge_review_list

List open/resolved review threads

forge_review_resolve

Mark a review thread resolved with outcome + rationale

forge_notify_send

Emit notification through configured channels (Telegram, Slack, webhook, email)

forge_notify_test

Smoke-test every notification channel; returns success/failure per channel

forge_home_snapshot

Build the dashboard Home tab payload (run state, drift, incidents, cost, health DNA)

forge_timeline

Unified cursor-paged timeline across runs, incidents, deploys, bugs, Crucible, Tempering

forge_search

Cross-surface search over plans, events, bugs, incidents, memory (filters by type/date/severity)

forge_memory_report

OpenBrain memory usage — captures per day, hit rate on searches, top-recalled thoughts

forge_org_rules

Export aggregated .github/instructions/*.md as a single org-rules document

forge_doctor_quorum

Health-check every quorum participant — auth, latency, rate-limit headers, availability

forge_delegate_to_agent

Delegate a prompt/slice to a specialized reviewer agent (database, security, performance, …)

forge_self_update

Check for the latest Plan Forge release, fetch release notes, and optionally install

Total: 106 MCP tools across all subsystems. Call forge_capabilities or open pforge-mcp/tools.json for the machine-readable surface.

Autonomous Execution

Full Auto

One command. pforge run-plan spawns gh copilot CLI for each slice. Gates validate at every boundary. Supports Claude, GPT, and Gemini via --model.

Assisted

You code in VS Code Copilot. Orchestrator prompts you per slice and validates gates automatically. Best of both: human creativity + automated quality.

Cloud Agent

Copilot cloud agent provisions the environment via copilot-setup-steps.yml. Guardrails auto-load, all 106 MCP tools are available, and forge_run_plan executes slices autonomously on GitHub Issues. Use --worker copilot-coding-agent to route each slice to a Copilot cloud agent session via GitHub Issue dispatch.

Parallel

[P]-tagged slices run concurrently. DAG-aware scheduling with scope conflict detection. Up to maxParallelism: 3 workers.
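
As a rough illustration of DAG-aware scheduling with a parallelism cap and scope conflict detection, the sketch below groups slices into waves. It is not the orchestrator's algorithm: the slice record shape (deps, files, parallel), treating a scope conflict as overlapping file sets, and wave-based rather than continuous scheduling are all assumptions.

```python
def schedule_waves(slices: dict[str, dict], max_parallelism: int = 3) -> list[list[str]]:
    """Group slices into waves; slices in the same wave may run concurrently."""
    done: set[str] = set()
    waves: list[list[str]] = []
    while len(done) < len(slices):
        # A slice is ready once every dependency has completed.
        ready = [sid for sid, s in slices.items()
                 if sid not in done and set(s["deps"]) <= done]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        wave: list[str] = []
        claimed: set[str] = set()
        for sid in ready:
            s = slices[sid]
            if wave and not s["parallel"]:
                continue  # slices without the [P] tag never share a wave
            if claimed & set(s["files"]):
                continue  # scope conflict: defer to a later wave
            wave.append(sid)
            claimed |= set(s["files"])
            if not s["parallel"] or len(wave) == max_parallelism:
                break
        waves.append(wave)
        done.update(wave)
    return waves
```

With one sequential setup slice followed by five [P]-tagged slices, two of which touch the same file, this yields three waves: the setup slice alone, then three non-conflicting slices, then the remaining two.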

Agent-Per-Slice Routing

Assign a different AI model to each execution role. The orchestrator auto-selects based on the current operation — tune cost vs. quality at every stage without changing your plan files.

default
claude-opus-4.6

Spec, harden, review operations

execute
gpt-5.2-codex

Writing code, generating tests

review
claude-sonnet-4.6

Gate checks, drift detection

// .forge.json
{ "modelRouting": { "default": "claude-opus-4.6", "execute": "gpt-5.2-codex", "review": "claude-sonnet-4.6" } }

Auto-Escalation

When a slice fails on one model, the orchestrator automatically walks the escalationChain and retries on the next model — no manual intervention. Emits a slice-escalated event on each re-route.

Attempt 0

Configured model (or modelRouting.execute)

Attempt 1+

Walks chain in order — "auto" defers to execute routing

Event

slice-escalated · sliceId, reason, models
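
The attempt order described above can be sketched as follows. This is a hedged illustration, not the orchestrator's code: the run_slice callback, skipping models that have already failed, and the exact shape of the models field in the event are assumptions. The config values are the ones shown on this page.

```python
from typing import Callable, Optional

CONFIG = {
    "modelRouting": {"default": "claude-opus-4.6", "execute": "gpt-5.2-codex",
                     "review": "claude-sonnet-4.6"},
    "escalationChain": ["auto", "claude-sonnet-4.6", "claude-opus-4.6"],
}


def resolve(model: str, config: dict) -> str:
    """'auto' defers to the execute route; any other name is used as-is."""
    return config["modelRouting"]["execute"] if model == "auto" else model


def run_with_escalation(slice_id: str, run_slice: Callable[[str], bool],
                        config: dict = CONFIG, configured: Optional[str] = None):
    """Attempt 0 uses the configured model, then the chain is walked in order."""
    first = configured or config["modelRouting"]["execute"]
    tried = [first]
    events = []
    if run_slice(first):
        return first, events
    for entry in config["escalationChain"]:
        model = resolve(entry, config)
        if model in tried:
            continue  # assumption: do not retry a model that already failed
        events.append({"type": "slice-escalated", "sliceId": slice_id,
                       "reason": f"{tried[-1]} failed",
                       "models": {"tried": list(tried), "next": model}})
        tried.append(model)
        if run_slice(model):
            return model, events
    return None, events  # chain exhausted
```

One event is recorded per re-route, so a slice that succeeds on the third model produces two slice-escalated events.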

// .forge.json
{ "escalationChain": ["auto", "claude-sonnet-4.6", "claude-opus-4.6"] }

Model Performance Tracking

Per-slice performance data is appended to .forge/model-performance.json after every run. The orchestrator reads this on startup and auto-selects the cheapest model with >80% historical success rate for each slice type.
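
The selection rule above (cheapest model with a historical success rate over 80% for the slice type) could look roughly like this. The record fields (model, sliceType, passed, costUsd) are hypothetical; the actual schema of .forge/model-performance.json is not documented on this page.

```python
def pick_model(records: list[dict], slice_type: str, min_success: float = 0.80):
    """Cheapest model whose pass rate for this slice type exceeds the bar, else None."""
    stats: dict[str, dict] = {}
    for r in records:
        if r["sliceType"] != slice_type:
            continue
        s = stats.setdefault(r["model"], {"runs": 0, "passed": 0, "cost": 0.0})
        s["runs"] += 1
        s["passed"] += 1 if r["passed"] else 0
        s["cost"] += r["costUsd"]
    # (average cost, model) tuples, so min() picks the cheapest eligible model.
    eligible = [(s["cost"] / s["runs"], model) for model, s in stats.items()
                if s["passed"] / s["runs"] > min_success]
    return min(eligible)[1] if eligible else None
```

Returning None when no model clears the bar leaves the caller free to fall back to the configured route rather than guessing.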

Auto-Selection

--estimate shows recommended model per slice with historical success rate. Agent-per-slice routing uses this data to tune cost vs. quality automatically.

Dashboard Cost Tab

Model Comparison table shows: run count, pass rate (color-coded), average duration, cost per run, total tokens — aggregated from model-performance.json.

Quorum Mode

Multi-model consensus: dispatch complex slices to 3 AI models for independent analysis, synthesize the best approach, then execute with higher confidence. In the Invoice Engine A/B test, quorum generated 20% more tests, with better code structure and fewer brittle patterns than single-model execution. Read the full A/B test results →

// Quorum workflow per slice
executeSlice(slice)
├─ scoreComplexity() → 1-10 score (7 weighted signals)
├─ score < threshold → normal execution
└─ score ≥ threshold → quorumDispatch()
   ├─ Claude Opus 4.6 → dry-run plan ─┐
   ├─ GPT-5.3-Codex   → dry-run plan ─┼─ Promise.all()
   └─ Grok 4.20       → dry-run plan ─┘
   quorumReview()              ← synthesize best approach per file
   spawnWorker(enhancedPrompt) ← execute with consensus
   gate ✓

Complexity Scoring

7 weighted signals: file scope (20%), cross-module deps (20%), security keywords (15%), database keywords (15%), gate count (10%), task count (10%), historical failure rate (10%).
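
The seven weights above sum to 100%, so the score can be read as a weighted average. The sketch below uses the stated weights, but how each raw signal is normalised and how the sum is mapped onto the 1-10 scale are assumptions made for illustration.

```python
# Weights as listed above; they sum to 1.0.
WEIGHTS = {
    "file_scope": 0.20,
    "cross_module_deps": 0.20,
    "security_keywords": 0.15,
    "database_keywords": 0.15,
    "gate_count": 0.10,
    "task_count": 0.10,
    "historical_failure_rate": 0.10,
}


def score_complexity(signals: dict[str, float]) -> int:
    """Weighted sum of signals (each clamped to 0..1), mapped onto 1..10."""
    weighted = sum(w * min(max(signals.get(name, 0.0), 0.0), 1.0)
                   for name, w in WEIGHTS.items())
    return 1 + round(weighted * 9)


def needs_quorum(signals: dict[str, float], threshold: int = 6) -> bool:
    """Quorum only when the score is at or above the threshold."""
    return score_complexity(signals) >= threshold
```

For example, a slice that maxes out file scope, cross-module dependencies, and security keywords scores 6 under this mapping, which meets the score ≥ 6 cut-off quoted for --quorum=auto.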

Auto Mode

--quorum=auto triggers quorum only on high-complexity slices (score ≥ 6). Simple CRUD runs normally. Best of both: quality where it matters, speed where it doesn't.

Graceful Degradation

If fewer than 2 models respond, the slice falls back to normal execution. If the reviewer fails, the best single dry-run is used. An unavailable model never blocks your pipeline.
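
The fallback rules above amount to a small decision function. In this sketch the review and rank callbacks are hypothetical stand-ins: the page does not say how the reviewer is invoked or how the "best" single dry-run is chosen.

```python
from typing import Callable, Optional


def choose_plan(dry_runs: dict[str, Optional[str]],
                review: Callable[[dict[str, str]], str],
                rank: Callable[[str], float]) -> tuple[str, Optional[str]]:
    """Return (mode, plan) where mode is 'normal', 'quorum', or 'best-single'."""
    # A value of None means the model did not respond.
    responded = {m: p for m, p in dry_runs.items() if p is not None}
    if len(responded) < 2:
        return "normal", None  # too few responses for a consensus
    try:
        return "quorum", review(responded)
    except Exception:
        # Reviewer failed: fall back to the highest-ranked single dry-run.
        best = max(responded.values(), key=rank)
        return "best-single", best
```

Every path returns a usable mode, which is the property the section describes: the slice always proceeds, with or without consensus.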

A/B Tested

Invoice Engine (rate tiers, discounts, tax, rounding): quorum produced 20% more tests, extracted DRY helpers, used idiomatic .NET patterns, and caught edge cases the single model missed.

A/B Test: Invoice Engine (4 slices, rate tiers + discounts + tax + banker's rounding)

Metric | Standard | Quorum (3 models) | Delta
Pass rate | 4/4 | 4/4 | Tie
Duration | 12 min | 32 min | +168%
Tests generated | 15 | 18 | +20%
DRY helpers | Inline | Extracted | Better
Test dates | Hardcoded (fragile) | Relative (robust) | Better
Edge case coverage | Standard | +voided regen, +sequence | Better

Quorum Presets

Preset | Models | Reviewer | Threshold | Timeout
--quorum=power | Claude Opus 4.6 + GPT-5.3-Codex + Grok 4.20 Reasoning | Opus | 5 | 5 min
--quorum=speed | Claude Sonnet 4.6 + GPT-5.4-mini + Grok 4.1 Fast Reasoning | Sonnet | 7 | 2 min

Available via CLI (--quorum=power), MCP (quorum: "power"), and config (.forge.json quorum.preset: "power").

Web UI — Live Dashboard

localhost:3100/dashboard — 8 real-time tabs via WebSocket. No build step. Also runs standalone: node pforge-mcp/server.mjs --dashboard-only

📊

Progress

Live slice cards

📋

Runs

History table

💰

Cost

Model breakdown

Actions

One-click tools

🔄

Replay

Session logs

🧩

Extensions

Catalog browser

⚙️

Config

Visual editor

🔍

Traces

OTLP waterfall

Agents & Skills

~12 Reviewer Agents

Stack (6-7 per preset): architecture, database, deploy, performance, security, test-runner (+ stack-specific extras)

Pipeline (5): specifier → plan-hardener → executor → reviewer-gate → shipper

Audit (1): classifier-reviewer (audit-loop triage)

AI Tool Adapters

pforge init -Agent <tool> generates adapter files for each platform:

  • copilot: .github/copilot-instructions.md (default)
  • claude: CLAUDE.md + .claude/commands/
  • cursor: .cursorrules + .cursor/rules/
  • windsurf: .windsurfrules + .windsurf/workflows/
  • gemini: GEMINI.md + .gemini/commands/ + MCP config
  • generic: .ai/instructions.md (configurable dir)
  • all: all adapters at once

13 Slash Command Skills

/database-migration · /staging-deploy · /test-sweep

/dependency-audit · /security-audit · /code-review

/release-notes · /api-doc-gen · /onboarding

/health-check · /forge-execute · /forge-troubleshoot

/forge-quench

/forge-quench: reduce code complexity while preserving behavior (Chesterton's Fence)

Every skill follows the Skill Blueprint format and includes Temper Guards, Warning Signs, and Exit Proof sections.

Temper Guards & Warning Signs — Every instruction file includes tables of common shortcuts agents use (with rebuttals) and observable anti-patterns that indicate the file's guidance is being violated.

Observability & Memory

Memory Layers

Plan Forge uses three distinct memory systems — each with a specific role in the 3-session pipeline. They're complementary, not competing.

Layer | What It Is | Scope | Best For
Copilot Memory | /memories/ built-in note storage (user / session / repo scopes) | User / Session / Repo | Free-form notes, personal patterns, ad-hoc insights
Plan Forge Session Bridge | Structured /memories/repo/current-phase.md + lessons-learned.md | Repository | Carrying Session 1 → 2 → 3 state through the hardening pipeline
OpenBrain | Semantic vector memory via MCP search_thoughts / capture_thought | Global | Auto-injecting prior decisions before each slice, with no manual prompting

OTLP Telemetry

Every run produces trace.json with resource context, span kinds (SERVER/INTERNAL/CLIENT), severity levels, and log summaries.

  • Per-run manifest + global index (append-only, corruption-tolerant)
  • Dashboard Traces tab with waterfall timeline
  • Optional OTLP collector forwarding (Jaeger, Aspire, Grafana)

OpenBrain Context Injection

Plan Forge's L3 memory layer (built in as of v3.6, no extension needed). Prior decisions and conventions are searched and injected as context before each slice begins, bridging the 3-session model with long-term memory.

  • Context injected before each slice (search_thoughts)
  • Decisions captured after each slice (capture_thought)
  • Cost anomaly detection (>2x average triggers insight)
  • Run summary captured for future phase planning

Stack Presets

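Preset | Instructions | Agents | Prompts | Skills
.NET | 17 | 19 | 15 | 9
TypeScript | 18 | 19 | 15 | 9
Python | 17 | 19 | 15 | 9
Java | 17 | 19 | 15 | 9
Go | 17 | 19 | 15 | 9
PHP | 17 | 19 | 15 | 9
Rust | 17 | 19 | 15 | 9
Swift | 16 | 19 | 13 | 9
Azure IaC | 12 | 18 | 6 | 3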

REST API — External Integration

The MCP server exposes a REST API for external agents, CI systems, and tools like OpenClaw. Discover the full surface via GET /api/capabilities or GET /.well-known/plan-forge.json on first connect.

Run Control
  • POST /api/runs/trigger — start a plan run remotely
  • POST /api/runs/abort — abort the active run
  • GET /api/runs/status — current run state
Memory
  • POST /api/memory/search — semantic search (OpenBrain)
  • POST /api/memory/capture — normalise + emit memory event
Discovery
  • GET /api/capabilities — full machine-readable surface
  • GET /.well-known/plan-forge.json — RFC 8615 discovery
  • GET /llms.txt — LLM-readable endpoint reference
Auth

Write endpoints accept Authorization: Bearer <secret> or ?token=<secret>. Set bridge.approvalSecret in .forge.json. Without a secret, endpoints are open (local-only use).
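
A client for the endpoints above might build its requests like this. The sketch only constructs requests and does not send them. The base URL (reusing the dashboard's localhost:3100), the "plan" body field, and the example plan path are assumptions; the real request schema should be taken from GET /api/capabilities.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:3100"  # assumption: same host and port as the dashboard


def build_request(method: str, path: str, payload: Optional[dict] = None,
                  secret: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the listed endpoints."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if secret:
        # Expected to match bridge.approvalSecret in .forge.json.
        req.add_header("Authorization", f"Bearer {secret}")
    return req


status_req = build_request("GET", "/api/runs/status")
# The "plan" field and the plan path are hypothetical placeholders.
trigger_req = build_request("POST", "/api/runs/trigger",
                            payload={"plan": "docs/plans/Phase-1-PLAN.md"},
                            secret="<secret>")
# To send: urllib.request.urlopen(trigger_req, timeout=10)
```

Using the Authorization header rather than the ?token= query parameter keeps the secret out of URLs, which tend to end up in logs.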

Full curl examples and config template: AGENT-SETUP.md Section 6.

Bridge — External Notifications

The Plan Forge Bridge subscribes to the WebSocket hub and dispatches run events to external platforms. Rate-limited (1/5s per channel), with automatic reconnect.

📨

Telegram

Bot API

💬

Slack

Incoming webhook

🎮

Discord

Webhook

🔗

Generic

Any HTTP endpoint

// .forge.json bridge config
{
  "bridge": {
    "enabled": true,
    "channels": [
      { "type": "telegram", "url": "https://api.telegram.org/bot<TOKEN>/sendMessage", "chatId": "<ID>", "level": "important" },
      { "type": "slack",    "url": "https://hooks.slack.com/services/...", "level": "all" },
      { "type": "discord",  "url": "https://discord.com/api/webhooks/...", "level": "critical" },
      { "type": "webhook",  "url": "https://your-endpoint.example.com/hook", "level": "all" }
    ]
  }
}

Levels: all (every event) · important (run start/end + failures) · critical (failures only)
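
The level filter and the one-message-per-5-seconds limit can be combined as in the sketch below. The event kind names (run-started, run-completed, failure), keying the limiter by channel type, and dropping rate-limited events instead of queueing them are assumptions, not documented Bridge behavior.

```python
# Level names come from the config above; the event kinds are hypothetical.
LEVELS = {
    "all": lambda kind: True,
    "important": lambda kind: kind in {"run-started", "run-completed", "failure"},
    "critical": lambda kind: kind == "failure",
}


class ChannelLimiter:
    """Allow at most one dispatch per `interval` seconds for each channel."""

    def __init__(self, interval: float = 5.0):
        self.interval = interval
        self.last_sent: dict[str, float] = {}

    def allow(self, channel: str, now: float) -> bool:
        last = self.last_sent.get(channel)
        if last is not None and now - last < self.interval:
            return False
        self.last_sent[channel] = now
        return True


def should_dispatch(channel: dict, kind: str, limiter: ChannelLimiter, now: float) -> bool:
    """Apply the level filter first, then the per-channel rate limit."""
    return LEVELS[channel["level"]](kind) and limiter.allow(channel["type"], now)
```

Checking the level first means an event that is filtered out does not use up the channel's rate-limit window.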

CI/CD Hook Event

The ci-triggered event is emitted when a CI workflow is dispatched from a plan run. Observable via the WebSocket hub or captured in the run's events.log. The slice-escalated event is emitted when a slice is re-routed to a new model via the escalation chain.

ci-triggered

Dispatched when a CI workflow is triggered from a plan run.

  • workflow — workflow file or ID
  • ref — git ref (branch or SHA)
  • inputs — dispatch input parameters

slice-escalated

Emitted when auto-escalation re-routes a slice to the next model in the chain.

  • sliceId — which slice was escalated
  • reason — why escalation triggered
  • models — models tried / next model

Updating an Existing Install

pforge smith automatically checks GitHub for a newer Plan Forge release — 5 s timeout, 24 h cache in .forge/version-check.json, silent when offline.

New version available: vX.Y.Z → run pforge self-update

✓ Preferred: upgrade in place

pforge self-update --force  # latest GitHub release
pforge update              # auto-mode (v2.56.0+)
pforge update --from-github # force GitHub tag

Preserves .forge.json, copilot-instructions.md, project principles, and plan files.

✗ Do not clone to update

git clone https://github.com/srnichols/plan-forge.git

Re-cloning is the first-time install path. On an existing install it can pull unreleased -dev code over a clean release and clobber local config.

Control the update source with pforge config set update-source <auto|github-tags|local-sibling> (v2.56.0+). See Manual Appendix G.

Dual-Publish Extensions

pforge ext publish <path> validates the extension and outputs two catalog entries in one command.

Plan Forge Catalog

catalog.json format — installable with pforge ext install and browseable via pforge ext search.

Spec Kit Compatible

extensions.json format for the Spec Kit registry. Extensions marked speckit_compatible: true work in both tools.

GitHub Stack Integration

First-class integration with GitHub Copilot, GitHub Models, and GitHub Actions for cloud-based execution and security-driven plan generation.

forge_github_status

Check GitHub API connectivity, Copilot subscription status, and GitHub Models API availability. Returns auth state, rate limits, and per-service health. CLI: pforge github-status

  • githubAuth: authenticated / unauthenticated
  • copilotPlan: individual / business / enterprise / none
  • modelsApiAvailable: true when models.github.ai/inference is reachable
  • rateLimitRemaining: remaining GitHub API requests for the hour

GitHub Models

models.github.ai/inference is the recommended API provider for Plan Forge — the default inference endpoint when GITHUB_TOKEN (or gh auth login) is configured.

Supported models: gpt-4o-mini (default), gpt-4o, claude-sonnet-4, claude-opus-4. Set GITHUB_TOKEN to enable; no separate API key required beyond GitHub auth.

Copilot Coding Agent Worker

Dispatch slice execution to the Copilot coding agent instead of the local CLI. Each slice becomes a GitHub Issue; the agent picks it up, opens a PR, and the orchestrator polls for completion.

pforge run-plan <plan> --worker copilot-coding-agent

Requires copilot-setup-steps.yml in .github/ and Copilot for Business or Enterprise. Pre-flight calls forge_github_status; a warn on the assignability check is promoted to a hard fail to prevent silent dispatch drops.

plan-from-sarif

Generate a remediation plan from a GitHub Code Scanning SARIF report. Groups findings by CWE / rule ID and emits a hardened Plan Forge plan where each slice targets a specific vulnerability class.

pforge plan-from-sarif <sarif-file> [--severity high,critical] [--output docs/plans/]

High-severity findings are auto-registered via forge_bug_register. Integrates with forge_secret_scan. Gate: pforge run-plan docs/plans/<sarif-plan>.md.

github-metrics

Pull GitHub repository metrics (PR velocity, code frequency, contributor cadence) into the LiveGuard health context.

pforge github-metrics [--repo <owner/repo>] [--window 30d]

Metrics written to .forge/github-metrics.json and surfaced on the Dashboard GitHub tab. forge_health_trend incorporates PR cycle time as a signal when the file is present. Requires GITHUB_TOKEN with repo scope.

Ready to forge?

Machine-readable: forge_capabilities MCP tool · .well-known/plan-forge.json