Appendix N

Compliance and Data Residency

Audience: Security architects, compliance officers, and platform leads conducting a security review of Plan Forge.

Scope: Where data lives, what's logged, how to export for audit, identity model (today and roadmap), and the air-gapped / Azure Government deployment paths.

TL;DR for security review

Plan Forge is local-first. The orchestrator runs on the developer's machine or a CI runner inside the customer's network. There is no Plan Forge SaaS service. Source code does not leave the customer's network unless the customer chooses to call a hosted LLM (and even then, all logging stays local). The audit trail is structured, complete, and exportable. Identity supports bearer tokens, Entra ID OIDC (v3.14.0+), Okta OIDC (v3.16.0+), and SCIM 2.0 provisioning (v3.17.0+) with SCIM-group-to-RBAC-role mapping (v3.18.0+); see the Identity and authentication section below.

Concern | Status
--- | ---
Source code leaves network | Only when customer-configured LLM provider is hosted; all logging stays local
Audit log of agent actions | Structured, complete, production-grade today (telemetry.mjs, EVENTS.md)
Audit log export | OTel exporter on roadmap (Week 2 of enterprise hardening); manual export available today
Identity / SSO | Bearer, Entra OIDC, Okta OIDC, SCIM 2.0 shipped (v3.14.0–v3.17.0); SAML on roadmap
RBAC | Role-based authz with SCIM group mapping shipped (v3.18.0)
Data residency controls | Customer chooses LLM provider region; Plan Forge respects
Air-gapped deployment | Architecturally supported; documentation gap (this doc)
Encryption at rest | Customer's filesystem encryption (Plan Forge respects)
Secret redaction | Built-in for testbed findings; configurable scope on roadmap
FedRAMP / IL5 / IL6 / HIPAA / PCI / SOC2 | Plan Forge is OSS; compliance posture is the customer's deployment, not a Plan Forge certification

Data flow

Five concrete data movements. For each, who handles the data and where it goes.

1. Source code

Stays in the customer's network, except for:

  • The bytes of files you choose to send to a hosted LLM as part of a prompt (Anthropic API, OpenAI API, GitHub Copilot, etc.)
  • The bytes of code Copilot Cloud Agent reads on its GitHub-hosted ephemeral runner (subject to GitHub's data handling)

If you use only on-prem inference (Foundry Local, Ollama, vLLM, llama.cpp, etc.), source code never leaves your network for any reason.

2. Plan files

Stay in the customer's repo. Plan files (docs/plans/*.md) are committed to git. They live wherever the repo lives.

3. .forge/ artifacts

Stay on the local filesystem (developer machine or CI runner). Includes:

  • .forge/runs/<id>/: per-run trajectory, events, slice artifacts, summary, traces, cost history
  • .forge/cost-history.json: aggregate cost
  • .forge/telemetry/tool-calls.jsonl: MCP tool invocations
  • .forge/liveguard-events.jsonl: LiveGuard scan events
  • .forge/trajectories/<plan-slug>.jsonl: Copilot Coding Agent trajectories (when CCA is the worker)
  • .forge/fm-sessions/*.jsonl: Forge-Master conversation sessions

.forge/ is gitignored by default. It can be committed for audit purposes if your security policy requires.

4. Memory

Three tiers, three different residency stories:

Tier | Location | Lifetime | Notes
--- | --- | --- | ---
L1 (volatile hub) | In-process RAM | Per-process | Bounded ring buffer, evicted on restart
L2 (structured) | Local filesystem (.forge/, .github/, docs/plans/) | Persistent | Survives restart; lives where the repo lives
L3 (semantic via OpenBrain) | External Postgres + pgvector (optional) | Forever | Cross-project by design. If used, deploy the Postgres in your network

If L3/OpenBrain is not configured, Plan Forge runs single-project, single-session memory only. No external service required.

5. Telemetry / observability

By default, telemetry stays local in .forge/telemetry/. With the OTel exporter (Week 2 of enterprise hardening), traces and metrics are emitted in the OpenTelemetry gen_ai.* semantic-convention format to a customer-chosen OTLP endpoint. Common targets:

  • Splunk Observability Cloud
  • Datadog
  • Grafana Tempo / Mimir / Loki
  • Microsoft Application Insights (especially relevant for Foundry-attached deployments)
  • Honeycomb
  • Customer-hosted OTel Collector forwarding anywhere

The OTel exporter is off by default. Enable by setting OTEL_EXPORTER_OTLP_ENDPOINT.

Audit logging

What's logged

Plan Forge emits structured events for 38 event types across eight families. The full ebook reference (envelope, enums, payloads, retention) is Appendix V, Event Catalog; the canonical JSON schema lives in pforge-mcp/EVENTS.md. Categories include:

  • Plan execution lifecycle (run-started, slice-started, slice-completed, run-completed)
  • Worker LLM calls (model, provider, token counts, latency, cost)
  • MCP tool invocations (tool name, args [optional], result [optional])
  • Validation gate execution (gate name, result, duration)
  • Quorum dispatch (quorum-started, quorum-model-replied, quorum-synthesized)
  • LiveGuard scans (drift-, incident-, secret-scan-, dep-watch-)
  • Crucible smelts (idea funnel)
  • Tempering runs (plan hardening)
  • Bug registry (open, status changes)
  • Skill execution (start, step, complete)
  • Watcher events (when one project tails another)

Each event carries:

  • ISO8601 timestamp
  • Event type
  • Correlation ID (groups events from the same run)
  • Source (which subsystem emitted it)
  • Severity (TRACE / DEBUG / INFO / WARN / ERROR / FATAL)
  • Type-specific data payload
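
As an illustration of the envelope described above, the sketch below checks one event for the six listed fields. The property names (ts, type, correlationId, source, severity, data) and the sample values are assumptions made for this example; the canonical schema remains pforge-mcp/EVENTS.md.

```javascript
// Hypothetical envelope check; property names are illustrative, not the EVENTS.md schema.
const SEVERITIES = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"];

function validateEnvelope(event) {
  const problems = [];
  // The timestamp must be a string that Date.parse can read (ISO8601 qualifies).
  if (typeof event.ts !== "string" || Number.isNaN(Date.parse(event.ts))) {
    problems.push("ts must be an ISO8601 timestamp");
  }
  for (const key of ["type", "correlationId", "source"]) {
    if (typeof event[key] !== "string" || event[key] === "") {
      problems.push(`${key} must be a non-empty string`);
    }
  }
  if (!SEVERITIES.includes(event.severity)) {
    problems.push(`severity must be one of ${SEVERITIES.join("/")}`);
  }
  if (typeof event.data !== "object" || event.data === null) {
    problems.push("data must be an object");
  }
  return problems; // empty array means the envelope looks complete
}

const sample = {
  ts: "2026-04-01T12:00:00.000Z",
  type: "slice-completed",
  correlationId: "run-0001",
  source: "orchestrator",
  severity: "INFO",
  data: { slice: 1 },
};
console.log(validateEnvelope(sample));
```

A check like this can be useful when importing exported events into a SIEM that rejects malformed records.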

Where it's logged

Sink | Format | Retention
--- | --- | ---
.forge/runs/<id>/events.log | NDJSON | Per-run, kept until manual cleanup
.forge/runs/<id>/trace.json | OTLP-compatible | Per-run
.forge/telemetry/tool-calls.jsonl | NDJSON, append-only | Persistent
.forge/liveguard-events.jsonl | NDJSON, append-only | Persistent
Hub event stream | In-memory + WebSocket | Volatile (last N events)

How to export for audit

Today (manual):

# Aggregate all events across runs, sorted by timestamp (this command does not filter by date)
jq -s 'sort_by(.ts)' .forge/runs/*/events.log > audit-export.json

# Or use forge_search for filtered export
pforge search --since 2026-04-01 --sources run,liveguard,bug --output audit.json

Roadmap (Week 2 of enterprise hardening): pforge audit export --since <date> --format <json|csv> as a first-class CLI.
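
Until a first-class export command exists, a date-bounded export can be approximated in a few lines. The sketch below merges NDJSON log contents and keeps events inside a half-open time window. It assumes each event carries an ISO8601 ts field (the same field the jq example sorts on); the function name and sample events are hypothetical.

```javascript
// Sketch: merge NDJSON event logs and keep events where since <= ts < until.
// Assumes every event has an ISO8601 `ts` field.
function mergeEvents(ndjsonTexts, since, until) {
  const from = Date.parse(since);
  const to = Date.parse(until);
  return ndjsonTexts
    .flatMap((text) => text.split("\n"))
    .filter((line) => line.trim() !== "") // skip blank lines between records
    .map((line) => JSON.parse(line))
    .filter((event) => {
      const t = Date.parse(event.ts);
      return t >= from && t < to;
    })
    .sort((a, b) => Date.parse(a.ts) - Date.parse(b.ts));
}

// Inline sample data standing in for the contents of two events.log files.
const logs = [
  '{"ts":"2026-04-02T00:00:00Z","type":"run-completed"}\n{"ts":"2026-03-01T00:00:00Z","type":"run-started"}\n',
  '{"ts":"2026-04-01T00:00:00Z","type":"slice-started"}\n',
];
const exported = mergeEvents(logs, "2026-04-01", "2026-05-01");
console.log(JSON.stringify(exported, null, 2));
```

In practice the input strings would come from reading each .forge/runs/<id>/events.log file, and the result would be written to the audit export file.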

Secret redaction

Built-in for testbed findings (defect-log.mjs). High-entropy secret detection in diffs (forge_secret_scan) always redacts values; findings are masked before caching or display. Making the redaction scope configurable is planned for Week 3 (auth/RBAC scaffolding).
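
To make "high-entropy detection" concrete, the sketch below scores candidate tokens by Shannon entropy and masks those above a threshold. This is a generic illustration of the technique, not the forge_secret_scan implementation; the token pattern, the 20-character minimum, the 4.0 bits-per-character threshold, and the [REDACTED] marker are all assumptions.

```javascript
// Sketch only: thresholds and token pattern are assumptions, not forge_secret_scan's rules.
function shannonEntropy(text) {
  const chars = [...text];
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / chars.length;
    bits -= p * Math.log2(p);
  }
  return bits; // average bits per character
}

function redactHighEntropy(text, minLength = 20, minBits = 4.0) {
  // Candidate tokens: long runs of characters commonly seen in keys and tokens.
  const candidate = new RegExp(`[A-Za-z0-9+/_-]{${minLength},}`, "g");
  return text.replace(candidate, (token) =>
    shannonEntropy(token) >= minBits ? "[REDACTED]" : token
  );
}

console.log(redactHighEntropy("token=q8Zr2LkX9vB4mT7pWc1NdY6s"));
console.log(redactHighEntropy("padding=aaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
```

Entropy thresholds trade false positives (long hashes, minified identifiers) against missed secrets, which is one reason a configurable scope is useful.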

Identity and authentication

Today

Plan Forge ships four authentication providers (selected via auth.provider in .forge.json) plus role-based authorization with optional SCIM-driven group mapping. All providers feed the same authenticate() + resolveRoles() / hasScope() pipeline in pforge-mcp/auth/.

Provider | Since | Use case
--- | --- | ---
none | v2.x | Default. Local dev, single-user CI runs. No identity enforced.
bearer | v2.x | Dashboard / hub write operations. Token configured as bridge.approvalSecret in .forge.json.
entra-oidc | v3.14.0 | Microsoft Entra ID / Azure AD. RS256-signed JWTs validated against https://login.microsoftonline.com/{tenantId}/discovery/v2.0/keys. Both v2.0 and v1.0 issuer formats accepted. JWKS cached 5 min per tenant. Returns claims.sub ?? claims.oid as the RBAC principal key.
okta-oidc | v3.16.0 | Okta. Supports custom authorization servers (https://{domain}/oauth2/{authServerId}/v1/keys) and the Org authorization server (https://{domain}/oauth2/v1/keys). Org tokens may use cid instead of aud; both are validated. JWKS cached 5 min per domain::authServerId.
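
The JWKS endpoints and the five-minute cache described in the table can be sketched as follows. The URL shapes and the sub/oid fallback come from the table; the helper names (jwksUrl, makeJwksCache, principalKey), the config field names, and the injected fetchKeys function are hypothetical and are not the pforge-mcp/auth code.

```javascript
// Sketch only: helper names are hypothetical, not the pforge-mcp/auth implementation.
const JWKS_TTL_MS = 5 * 60 * 1000; // 5-minute cache lifetime

function jwksUrl(provider, cfg) {
  if (provider === "entra-oidc") {
    return `https://login.microsoftonline.com/${cfg.tenantId}/discovery/v2.0/keys`;
  }
  if (provider === "okta-oidc") {
    // Custom authorization server when authServerId is set, Org server otherwise.
    return cfg.authServerId
      ? `https://${cfg.domain}/oauth2/${cfg.authServerId}/v1/keys`
      : `https://${cfg.domain}/oauth2/v1/keys`;
  }
  throw new Error(`no JWKS endpoint for provider: ${provider}`);
}

// fetchKeys and now are injected so the sketch runs without network access.
function makeJwksCache(fetchKeys, now = Date.now) {
  const entries = new Map();
  return (cacheKey, url) => {
    const hit = entries.get(cacheKey);
    if (hit && now() - hit.fetchedAt < JWKS_TTL_MS) return hit.keys;
    const keys = fetchKeys(url);
    entries.set(cacheKey, { keys, fetchedAt: now() });
    return keys;
  };
}

// Prefer sub, fall back to oid, as the entra-oidc row describes.
const principalKey = (claims) => claims.sub ?? claims.oid;

console.log(jwksUrl("okta-oidc", { domain: "example.okta.com", authServerId: "default" }));
```

A real implementation would fetch keys asynchronously and verify the RS256 signature with a JOSE library; the synchronous cache here only shows the expiry logic.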

SCIM 2.0 user / group provisioning (v3.17.0+)

Fourteen SCIM 2.0 endpoints are exposed under /scim/v2/ when a bearer token is configured via the PFORGE_SCIM_TOKEN env var or .forge/secrets.json#scimBearerToken. SCIM is disabled by default: with no token configured, requests return HTTP 503.

  • Discovery: GET /scim/v2/ServiceProviderConfig, GET /scim/v2/Schemas (both unauthenticated per spec; IdPs probe before presenting credentials)
  • Users: GET|POST /scim/v2/Users, GET|PUT|PATCH|DELETE /scim/v2/Users/:id
  • Groups: GET|POST /scim/v2/Groups, GET|PUT|PATCH|DELETE /scim/v2/Groups/:id

Users and groups persist to .forge/scim-users.json / .forge/scim-groups.json. SCIM PatchOp add, replace, and remove are supported. Compatible with Okta SCIM 2.0, Microsoft Entra provisioning, and any spec-compliant IdP.

RBAC + SCIM group mapping (v3.18.0+)

Role definitions live in .forge.json#auth.rbac as { roles, assignments }. The SCIM-RBAC bridge in pforge-mcp/auth/scim-rbac-bridge.mjs maps SCIM group displayName values to RBAC role names via auth.scimGroupRoles:

{
  "auth": {
    "provider": "entra-oidc",
    "entraOidc": { "tenantId": "00000000-0000-0000-0000-000000000000" },
    "rbac": {
      "roles": {
        "admin":     { "scopes": ["*"] },
        "developer": { "scopes": ["run-plan", "read"] },
        "reader":    { "scopes": ["read"] }
      }
    },
    "scimGroupRoles": {
      "Platform Admins":  ["admin"],
      "Plan Forge Devs":  ["developer"],
      "Plan Forge Users": ["reader"]
    }
  }
}

At authentication time, buildScimAssignments(scimStore, mappings) resolves the user's group memberships into role assignments, which then flow into resolveRoles + hasScope. The bridge is purely read-only — it never mutates the SCIM store or the RBAC config.
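
The mapping step can be pictured with a small, self-contained sketch. It is not the code in scim-rbac-bridge.mjs: the function names rolesForUser and scopeAllowed are hypothetical, the group records are sample data, and the only SCIM detail relied on is that group members are listed under members with the member id in value. The role and mapping objects mirror the config example above.

```javascript
// Sketch only: hypothetical helpers that mirror the idea of the SCIM-RBAC bridge.
const roles = {
  admin: { scopes: ["*"] },
  developer: { scopes: ["run-plan", "read"] },
  reader: { scopes: ["read"] },
};
const scimGroupRoles = {
  "Platform Admins": ["admin"],
  "Plan Forge Devs": ["developer"],
  "Plan Forge Users": ["reader"],
};
// Sample SCIM group records (illustrative data).
const scimGroups = [
  { displayName: "Plan Forge Devs", members: [{ value: "user-1" }] },
  { displayName: "Plan Forge Users", members: [{ value: "user-2" }] },
];

// Read-only: collects role names for every mapped group the user belongs to.
function rolesForUser(userId, groups, groupRoles) {
  const found = new Set();
  for (const group of groups) {
    const isMember = (group.members || []).some((m) => m.value === userId);
    if (!isMember) continue;
    for (const role of groupRoles[group.displayName] || []) found.add(role);
  }
  return [...found];
}

// A scope is allowed if any assigned role lists it or carries the "*" wildcard.
function scopeAllowed(roleNames, roleDefs, scope) {
  return roleNames.some((name) => {
    const scopes = (roleDefs[name] && roleDefs[name].scopes) || [];
    return scopes.includes("*") || scopes.includes(scope);
  });
}

const userRoles = rolesForUser("user-1", scimGroups, scimGroupRoles);
console.log(userRoles, scopeAllowed(userRoles, roles, "run-plan"));
```

Keeping the mapping read-only, as the bridge does, means an IdP-side group change takes effect on the next authentication without any write to the RBAC config.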

Recognized secret env vars

Plan Forge looks for the following secrets in process.env first, falling back to .forge/secrets.json:

  • GITHUB_TOKEN
  • XAI_API_KEY
  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • OPENCLAW_API_KEY
  • PFORGE_SCIM_TOKEN (SCIM provisioning bearer)
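
The env-first, file-second lookup order can be sketched as below. The function name resolveSecret is hypothetical. The scimBearerToken file key is the one named in the SCIM section; for the other secrets this sketch assumes the file key equals the env var name, which this appendix does not confirm.

```javascript
// Env var name -> key in .forge/secrets.json. Only the SCIM pair is stated in
// this appendix; other keys are ASSUMED to match the env var name.
const FILE_KEYS = { PFORGE_SCIM_TOKEN: "scimBearerToken" };

function resolveSecret(name, env, secretsFile) {
  // 1. The process environment wins when the variable is set and non-empty.
  if (typeof env[name] === "string" && env[name] !== "") {
    return { value: env[name], source: "env" };
  }
  // 2. Otherwise fall back to the parsed secrets file.
  const fileKey = FILE_KEYS[name] ?? name;
  const fromFile = secretsFile[fileKey];
  if (typeof fromFile === "string" && fromFile !== "") {
    return { value: fromFile, source: "secrets.json" };
  }
  return null; // not configured anywhere
}

// Inline stand-ins for process.env and the parsed .forge/secrets.json.
const env = { GITHUB_TOKEN: "from-env" };
const secretsFile = { GITHUB_TOKEN: "from-file", scimBearerToken: "scim-from-file" };
console.log(resolveSecret("GITHUB_TOKEN", env, secretsFile).source);
console.log(resolveSecret("PFORGE_SCIM_TOKEN", env, secretsFile).source);
```

For security review, the practical consequence is that a CI-injected environment variable overrides whatever is stored on disk.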

Identity roadmap

What's still on the roadmap, in priority order:

  1. First-class Azure OpenAI BYO: AZURE_OPENAI_API_KEY + endpoint as recognized secrets; deployment-name vs model-name handled in config; Entra ID auth via azure-identity SDK
  2. SAML 2.0 provider — for IdPs that don't speak OIDC
  3. RBAC enforcement on the MCP tool surface — today RBAC is wired through authenticate() + hasScope(); per-tool scope declarations on every forge_* handler are still being rolled out

Compliance posture

Plan Forge is open-source software (MIT license). Compliance certifications (FedRAMP, IL5/IL6, HIPAA, PCI-DSS, SOC2) attach to the customer's deployment of Plan Forge, not to Plan Forge itself. There is no Plan Forge SaaS to certify.

Even so, several Plan Forge architectural choices are friendly to compliance audits:

Posture | What helps
--- | ---
No SaaS data plane | Nothing to subpoena from a vendor; data lives where you put it
Structured audit trail | Every action logged with timestamps, correlation IDs, severity
Open source | Auditable end-to-end; no proprietary closed binaries
Local-first by default | Air-gapped deployment is structurally possible (see below)
Open standards | AGENTS.md, MCP, OTel gen_ai.*, no proprietary lock-in to challenge
Compliance reviewer agent | .github/agents/compliance-reviewer.agent.md ships out of the box for GDPR/CCPA/SOC2/HIPAA-aware code review
Project profile compliance frameworks | .github/prompts/project-profile.prompt.md collects SOC2, HIPAA, PCI-DSS, GDPR, FedRAMP early in setup

For specific frameworks:

SOC2 Type II

  • Audit trail completeness: yes (events, traces, run artifacts)
  • Access controls: yes (bearer token, Entra ID / Okta OIDC, and RBAC with SCIM group mapping shipped; SAML and per-tool scope enforcement on roadmap)
  • Change management: yes (git-based plan files, scope contracts, gates)
  • Encryption in transit: yes for LLM API calls; yes for OTel export when configured with TLS
  • Encryption at rest: customer's filesystem encryption

HIPAA

  • BAA: not applicable (there is no Plan Forge SaaS to sign a BAA with)
  • Customer's BAA with their LLM provider applies to inference data
  • Audit log: structured and complete
  • PHI handling: customer's responsibility; Plan Forge does not pre-process content

PCI-DSS

  • Scope reduction: Plan Forge does not handle payment data unless customer-configured to read it. Recommend isolating any PCI-relevant code review to dedicated Plan Forge instances with strict secret scanning enabled.
  • Secret handling: built-in detection + redaction for high-entropy strings in diffs

FedRAMP / IL5 / IL6

  • Plan Forge is deployable in Azure Government and on-prem environments that match FedRAMP / IL boundaries
  • Use only FedRAMP-authorized LLM providers (Azure OpenAI in Azure Government offers FedRAMP-authorized models, including gpt-5.1, gpt-4.1, o3-mini, and gpt-4o)
  • Plan Forge itself does not require FedRAMP authorization (it's software you run, not a service you consume)

GDPR / CCPA

  • Data minimization: Plan Forge does not collect personal data unless customer-configured
  • Right to access / delete: applies to data the customer chooses to capture; .forge/ artifacts are deletable

Air-gapped deployment

Plan Forge is architecturally compatible with fully air-gapped deployment. The complete pattern:

What works air-gapped

  • Plan Forge orchestrator (Node.js process; no inbound network calls required)
  • Dashboard (localhost:3100)
  • Plan execution against local repos
  • All .forge/ artifact storage
  • L1 (in-memory) and L2 (filesystem) memory tiers
  • OTel export to in-network OTel collector
  • Validation gates (run locally as shell commands)

What requires special handling air-gapped

Component | Air-gapped solution
--- | ---
LLM inference | Use Foundry Local powered by Azure Local (preview May 2026), Ollama, vLLM, llama.cpp, or similar on-prem inference. Configure as the OpenAI-compatible endpoint Plan Forge talks to.
GitHub Enterprise | Use GitHub Enterprise Server (GHES) instead of GitHub.com. Plan Forge supports GHES; Cloud Agent local-MCP-server pattern works
Update checks | Set PFORGE_NO_UPDATE_CHECK=1 to disable. Manual updates via pforge self-update --from-local <path> or repo sync from internal mirror
OpenBrain L3 memory | Optional; if used, deploy the Postgres+pgvector inside the boundary
MCP servers | Self-host any MCP server you want available; point .vscode/mcp.json at internal endpoints only

What does NOT work air-gapped

  • Plan Forge Hub WebSocket connections to external observability (configure local OTel collector instead)
  • Any LLM provider that requires public internet (configure on-prem inference instead)
  • The community extensions catalog (use pforge ext add --from-local <path> for vetted extensions)

Deployment checklist for air-gap

  • On-prem LLM inference deployed (Foundry Local / Ollama / vLLM)
  • GHES instead of GitHub.com (or no GitHub at all if your VCS is internal)
  • Internal git mirror for srnichols/plan-forge updates
  • OTel collector inside the boundary
  • OpenBrain (if using L3 memory) deployed inside the boundary
  • All MCP server endpoints internal
  • PFORGE_NO_UPDATE_CHECK=1 set
  • Network egress audit confirms zero outbound to public internet

This is the differentiator vs. competitors. Cursor cannot offer this (control plane in AWS even with self-hosted workers). Sourcegraph Amp explicitly cannot (no self-host, no BYOK). GitHub Copilot Cloud Agent runs on GitHub-hosted infrastructure. For air-gapped requirements, Plan Forge is structurally the only viable option in the comparison set.

Azure Government

For customers deploying in Azure Government:

What works

  • Plan Forge orchestrator running on Azure Government VMs / AKS / Functions
  • Azure OpenAI in Azure Government as the LLM provider
  • Endpoint domain: openai.azure.us (not openai.azure.com)
  • Auth: Entra ID via login.microsoftonline.us (once first-class Azure OpenAI support with Entra ID auth lands; see Identity roadmap)
  • Today: API key auth works via the manually-configured Azure OpenAI path

Model availability

Azure Government has a substantially smaller catalog than commercial Azure:

  • gpt-5.1
  • gpt-4.1
  • gpt-4.1-mini
  • o3-mini
  • gpt-4o
  • Embeddings: text-embedding-3-large, text-embedding-3-small, text-embedding-ada-002

Available in usgovarizona and usgovvirginia, with Data Zone Standard and Provisioned variants.

Plan Forge implications

  • The default power quorum preset (assumes flagship models like gpt-5.5 or claude-opus-4.7) won't resolve cleanly
  • Use a power-gov preset (planned) or graceful fallback
  • The speed preset works (gpt-4.1-mini exists in gov)
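
One way to picture the "graceful fallback" idea is to filter a preset against the regional catalog and substitute a fallback list only when nothing resolves. This is a sketch of the concept, not shipped behavior: resolvePreset, the contents of the power preset, and the fallback list are hypothetical, while the catalog entries are the chat models listed above.

```javascript
// Chat models listed above for Azure Government.
const GOV_CATALOG = new Set(["gpt-5.1", "gpt-4.1", "gpt-4.1-mini", "o3-mini", "gpt-4o"]);

// Hypothetical resolver: keep the preset models the region offers; if none
// resolve, use the fallback list (also filtered against the catalog).
function resolvePreset(presetModels, available, fallbackModels) {
  const usable = presetModels.filter((m) => available.has(m));
  const missing = presetModels.filter((m) => !available.has(m));
  if (usable.length > 0) return { models: usable, missing, usedFallback: false };
  return {
    models: fallbackModels.filter((m) => available.has(m)),
    missing,
    usedFallback: true,
  };
}

const powerPreset = ["gpt-5.5", "claude-opus-4.7"]; // assumed preset contents
const resolved = resolvePreset(powerPreset, GOV_CATALOG, ["gpt-5.1", "gpt-4.1"]);
console.log(resolved);
```

Reporting the missing models alongside the result lets an operator see that a run used substitutes rather than silently degrading.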

Compliance certifications inherited

Both global Azure and Azure Government are FedRAMP High. Azure Government adds contractual commitments around US-based data storage and screened-US-persons access. HIPAA and PCI are covered under Azure's standard compliance umbrella for the underlying services; Plan Forge running on top inherits the boundary.

For Azure Government Secret and Top Secret cloud feature availability, contact your Microsoft account team; public documentation is limited.

Observability export

The Week 2 work in the enterprise hardening track adds first-class OpenTelemetry export. Spec is documented in the enterprise-fleet-readiness research §8.6. Summary:

What gets emitted

  • Spans for every LLM call (CLIENT, kind chat/embeddings/etc.) with full gen_ai.* attribute set including token counts (input, output, cache_read, cache_write, reasoning), latency, model, provider
  • Spans for every MCP tool call (INTERNAL, kind execute_tool) with tool name and call ID
  • Spans for every slice (INTERNAL, kind invoke_agent) with plan/slice correlation
  • Spans for every plan run (INTERNAL, kind invoke_workflow)
  • Spans for every validation gate (INTERNAL, plan-forge-vendor namespace)
  • Metrics: gen_ai.client.operation.duration histogram, gen_ai.client.token.usage histogram
  • Events (opt-in): gen_ai.client.inference.operation.details with input/output messages (gated by pforge.telemetry.captureContent flag, default off, PII implications)

Vendor-namespaced extensions

pforge.* attributes for plan/slice/run correlation, scope contract IDs, gate names, cost USD (since gen_ai.cost doesn't exist in the spec).
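
As a rough picture of how the two namespaces sit side by side on an LLM-call span, the sketch below builds a plain attribute map. The gen_ai.* keys follow the OpenTelemetry GenAI semantic conventions as understood at the time of writing and should be checked against the convention version actually targeted. The pforge.* key names and the shape of the call record are hypothetical, chosen only to illustrate the correlation and cost fields described above.

```javascript
// Sketch only: builds a plain attribute map; no OTel SDK involved.
function llmSpanAttributes(call) {
  return {
    // Standard GenAI semantic-convention attributes.
    "gen_ai.operation.name": call.operation,
    "gen_ai.provider.name": call.provider,
    "gen_ai.request.model": call.model,
    "gen_ai.usage.input_tokens": call.inputTokens,
    "gen_ai.usage.output_tokens": call.outputTokens,
    // Vendor-namespaced extensions (HYPOTHETICAL key names).
    "pforge.run.id": call.runId,
    "pforge.slice.id": call.sliceId,
    "pforge.cost.usd": call.costUsd,
  };
}

const attrs = llmSpanAttributes({
  operation: "chat",
  provider: "openai",
  model: "gpt-4.1",
  inputTokens: 1200,
  outputTokens: 350,
  runId: "run-0001",
  sliceId: "slice-02",
  costUsd: 0.0123,
});
console.log(attrs);
```

Note that no prompt or completion text appears in the map; content capture is a separate, opt-in path.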

Backends supported

Anything that speaks OTLP. Tested compatibility (planned for Week 2):

  • Splunk Observability Cloud
  • Datadog
  • Grafana Tempo
  • Microsoft Application Insights (especially relevant for Foundry-attached deployments: Foundry uses the same OTel gen_ai.* conventions, so Plan Forge runs land in the same dashboards as the customer's Foundry agents)
  • Honeycomb
  • Customer-hosted OTel Collector

Privacy controls

  • Content capture (prompt + completion text) is opt-in (off by default)
  • Three patterns supported: don't capture / capture as span attributes / externalize via hook to a separate store with only references on the span
  • Toggle via pforge.telemetry.captureContent config flag and standard OTel env vars

Common security review questions

Where can our source code go?

Wherever you choose to send it via your configured LLM provider. With on-prem inference, nowhere outside your network. Plan Forge itself never transmits source code.

Does Plan Forge phone home?

No telemetry is transmitted to Plan Forge maintainers. The optional update check fetches release metadata from GitHub. Disable with PFORGE_NO_UPDATE_CHECK=1.

Can we audit every action an agent took?

Yes. Per-run trajectory in .forge/runs/<id>/ includes events, slice artifacts, traces, cost history, and (for CCA-dispatched runs) the full Copilot Cloud Agent trajectory.

How do we prevent agents from editing files outside scope?

Plan Forge enforces scope contracts at the plan level (In Scope, Out of Scope, Forbidden Actions blocks). Pre-tool-use hooks block edits to forbidden paths. Post-execution pforge diff checks for drift.

Honest gap: enforcement is best-effort at the worker level; the orchestrator can't always prevent a bad edit, only detect it. Hardening this is on the roadmap.
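
A pre-tool-use path check of the kind described above might look like the following sketch. The contract shape ({ inScope, forbidden } as lists of path prefixes), the function names, and the three result labels are assumptions for illustration; the actual hook format is not specified in this appendix.

```javascript
// Collapse ".", "..", duplicate slashes, and backslashes into a clean relative path.
function normalize(p) {
  const out = [];
  for (const part of p.replace(/\\/g, "/").split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return out.join("/");
}

// Hypothetical check: forbidden prefixes win over in-scope prefixes.
function checkEdit(path, contract) {
  const target = normalize(path);
  const under = (prefix) => {
    const base = normalize(prefix);
    return target === base || target.startsWith(base + "/");
  };
  if (contract.forbidden.some(under)) return "blocked";
  if (contract.inScope.some(under)) return "allowed";
  return "out-of-scope";
}

const contract = { inScope: ["src/api"], forbidden: ["infra"] };
console.log(checkEdit("src/api/users.js", contract));
console.log(checkEdit("src/api/../../infra/prod.tf", contract));
```

Normalizing before matching matters: without it, a path such as src/api/../../infra/prod.tf would pass a naive in-scope prefix test.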

What happens if an agent malfunctions?

Per-slice workerTimeoutMs cap kills runaway workers. Reflexion retry with backoff handles recoverable failures. forge_alert_triage ranks issues by priority. In-loop stuck detector is on the roadmap (OpenHands-pattern).

Can we enforce a budget per team?

Each repo's .forge.json supports cost.dailyMax and similar caps (formalization is planned). Per-engineer attribution is on the roadmap.

What's the data retention model?

Plan Forge does not delete .forge/ artifacts automatically. Retention is the customer's policy; implement it with standard filesystem tools or post-run cleanup hooks.

Are LLM responses cached?

Plan Forge does not cache LLM responses. Some LLM providers (Anthropic, OpenAI) do prompt caching; that happens on their infrastructure and is billed at reduced rates. Plan Forge tracks cache hit/miss for cost accuracy (Phase-COST-TOKEN-COVERAGE landed the per-vendor billing math).

How do we know Plan Forge itself isn't compromised?

Open source. MIT license. Audit the code. Plan Forge is dogfooded against itself: every release ships through the same Plan Forge pipeline that customers use. Self-repair tooling (forge_meta_bug_file) gives agents a way to file defects against Plan Forge during execution.