LiveGuard Alert Runbooks
The guardian fired. Here's exactly what to do next.
Severity Matrix
Every LiveGuard alert carries one of four severity levels. The matrix below defines response SLA and escalation path. Full runbooks per alert type follow.
| Severity | Response SLA | Notify | Dashboard Badge |
|---|---|---|---|
| Critical | Immediate — within 1 hour | On-call + team lead | Red badge on Triage tab |
| High | Same business day | On-call engineer | Amber badge on Triage tab |
| Medium | Next sprint | Team chat | Yellow dot on relevant tab |
| Low | Backlog | — | No badge |
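For tooling that needs to route alerts, the matrix can be mirrored as a small lookup table. The sketch below is only an illustration: `SeverityPolicy`, `SEVERITY_MATRIX`, and `policy_for` are hypothetical names, not part of LiveGuard, and the values are transcribed from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeverityPolicy:
    response_sla: str  # wording taken from the severity matrix
    notify: str        # who is notified; empty string when nobody is
    badge: str         # dashboard badge shown for this severity


# Hypothetical structure; only the row values come from the matrix.
SEVERITY_MATRIX = {
    "critical": SeverityPolicy("Immediate, within 1 hour", "On-call + team lead", "Red badge on Triage tab"),
    "high": SeverityPolicy("Same business day", "On-call engineer", "Amber badge on Triage tab"),
    "medium": SeverityPolicy("Next sprint", "Team chat", "Yellow dot on relevant tab"),
    "low": SeverityPolicy("Backlog", "", "No badge"),
}


def policy_for(severity: str) -> SeverityPolicy:
    """Look up the policy for a severity name, ignoring case and padding."""
    return SEVERITY_MATRIX[severity.strip().lower()]
```

An unknown severity raises `KeyError`, which seems preferable to silently falling back to a lower tier.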
Per-Alert Runbooks
Drift Spike — Architecture Diverged from Plan Baseline
Source: forge_drift_report | Typical severity: Medium–High
- Assess: Run `pforge drift` to get the current score and delta. If delta > 10 points in one session, treat as High.
- Identify: Check the `violations[]` in the output. Each violation lists the file, rule, and instruction file it violates.
- Root cause: Was this an intentional architectural change? If yes, update the instruction file or plan baseline. If no, the code drifted from the plan.
- Fix: For unintentional drift, refactor to match the plan. For intentional changes, update the plan's Scope Contract to reflect the new architecture.
- Verify: Re-run `pforge drift`. The score should recover to within 5 points of the previous baseline.
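The two numeric thresholds in this runbook (a delta above 10 points escalates to High; recovery means landing within 5 points of the baseline) can be sketched as plain functions. The function names are hypothetical, and treating the delta as a magnitude regardless of sign is an assumption, since the sign convention of the `pforge drift` output is not stated here.

```python
def drift_severity(delta: float, default: str = "Medium") -> str:
    """Return "High" when the score moved more than 10 points in one session."""
    # Assumption: only the size of the change matters, not its direction.
    return "High" if abs(delta) > 10 else default


def drift_recovered(current_score: float, baseline_score: float, tolerance: float = 5.0) -> bool:
    """True when the score is back within `tolerance` points of the baseline."""
    return abs(current_score - baseline_score) <= tolerance
```

For example, a session that moved the score by 12 points would be triaged as High, and a follow-up score of 88 against a baseline of 92 would count as recovered.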
Secret Found — High-Entropy String in Committed Diff
Source: forge_secret_scan | Typical severity: Critical
- Do not push: If the commit hasn't been pushed, amend it to remove the secret: `git reset HEAD~1`, remove the credential, re-commit.
- Rotate immediately: If the commit has been pushed, the credential is compromised. Rotate it in the external provider (API dashboard, vault, etc.) before any other action.
- Remove from history: Use `git filter-repo` or BFG Repo-Cleaner to purge the secret from git history. A simple amendment is not sufficient because the old commit object still exists.
- Move to secrets manager: Store the new credential in `.forge/secrets.json` (gitignored), an environment variable, or your cloud vault. Never in source code.
- Verify: Re-run `pforge secret-scan`. The output should show `clean: true`.
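Once the credential lives outside source code, application code has to read it from somewhere. A minimal sketch follows, assuming the environment variable takes precedence and that `.forge/secrets.json` is a flat JSON object of name/value pairs; the file layout and the `load_secret` helper are assumptions, not a documented LiveGuard format.

```python
import json
import os
from pathlib import Path
from typing import Optional


def load_secret(name: str, secrets_file: Path = Path(".forge/secrets.json")) -> Optional[str]:
    """Return a credential from the environment, falling back to the secrets file."""
    value = os.environ.get(name)
    if value:
        return value
    if secrets_file.is_file():
        # Assumed layout: {"NAME": "value", ...}
        data = json.loads(secrets_file.read_text(encoding="utf-8"))
        return data.get(name)
    return None
```

Returning `None` instead of a default keeps a missing credential visible to the caller rather than masking it.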
Env Diff Gap — Required Key Missing from Environment File
Source: forge_env_diff | Typical severity: Medium–High
- Review gaps: Run `pforge env-diff` to see which keys are missing and in which files.
- Categorize: Is the key required for the target environment? Some keys (e.g., `DEBUG=true`) are intentionally absent from production.
- Add missing keys: For required keys, add them to the target `.env.*` file with the appropriate value for that environment.
- Document exceptions: If a key is intentionally absent, add a comment in the baseline `.env` file: `# NOT_IN_PROD: DEBUG`.
- Verify: Re-run `pforge env-diff`. The output should show `clean: true` or only expected gaps.
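The idea behind the gap check can be sketched as a key comparison between the baseline file and a target file, with `# NOT_IN_PROD:` comments treated as documented exceptions. This is an illustration only, not how `pforge env-diff` is implemented: the helper names are hypothetical, and both the comma-separated marker form and applying the marker to any target file are assumptions.

```python
import re

# Matches the exception comment described above, e.g. "# NOT_IN_PROD: DEBUG".
NOT_IN_PROD = re.compile(r"^#\s*NOT_IN_PROD:\s*(.+)$")


def parse_env(text: str) -> tuple[set[str], set[str]]:
    """Return (defined keys, documented exceptions) from .env-style text."""
    keys: set[str] = set()
    exceptions: set[str] = set()
    for raw in text.splitlines():
        line = raw.strip()
        marker = NOT_IN_PROD.match(line)
        if marker:
            # Assumption: several keys may follow the marker, separated by commas.
            exceptions.update(k.strip() for k in marker.group(1).split(",") if k.strip())
        elif line and not line.startswith("#") and "=" in line:
            keys.add(line.split("=", 1)[0].strip())
    return keys, exceptions


def env_gaps(baseline_text: str, target_text: str) -> set[str]:
    """Keys defined in the baseline, absent from the target, and not excepted."""
    baseline_keys, exceptions = parse_env(baseline_text)
    target_keys, _ = parse_env(target_text)
    return baseline_keys - target_keys - exceptions
```

An empty result corresponds to the "only expected gaps" outcome in the Verify step.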
Regression Gate Failure — Previously Passing Gate Now Fails
Source: forge_regression_guard | Typical severity: High
- Identify: Run `pforge regression-guard` to see which gates failed and their error output.
- Bisect: Use `git log` to find which commit broke the gate. The gate command output usually points at the exact file.
- Fix or update: If the code broke a valid gate, fix the code. If the gate is outdated (the feature was intentionally changed), update the gate command in the plan file.
- Verify: Re-run `pforge regression-guard --plan <affected-plan>`. All gates should pass.
Dependency Vulnerability — New CVE in a Watched Package
Source: forge_dep_watch | Typical severity: Medium–Critical (depends on CVE severity)
- Assess: Run `pforge dep-watch` to see new vulnerabilities with their CVE IDs and severity.
- Check exploitability: Not all CVEs are exploitable in your context. Check whether the vulnerable code path is reachable in your app.
- Update: Run `npm update <package>` or pin to a patched version. For transitive dependencies, use `npm audit fix`.
- If no patch exists: Evaluate alternatives, add a compensating control, or document the accepted risk with a timeline for re-evaluation.
- Verify: Re-run `pforge dep-watch`. The vulnerability should move from `newVulnerabilities` to `resolvedVulnerabilities`.
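The Verify step can be automated if the report is available as parsed JSON. The sketch below assumes each entry in `newVulnerabilities` and `resolvedVulnerabilities` is an object with an `id` field holding the CVE ID; that entry shape and the `vulnerability_resolved` helper are assumptions, so adjust the field name to the real output.

```python
def vulnerability_resolved(report: dict, cve_id: str) -> bool:
    """True when the CVE is listed as resolved and no longer listed as new."""
    # Assumed entry shape: {"id": "<CVE ID>", ...}
    new_ids = {v.get("id") for v in report.get("newVulnerabilities", [])}
    resolved_ids = {v.get("id") for v in report.get("resolvedVulnerabilities", [])}
    return cve_id in resolved_ids and cve_id not in new_ids
```

Requiring both conditions guards against a report in which the same CVE is resolved for one package but still new for another.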
Incident MTTR Exceeded — Time-to-Resolve Beyond Threshold
Source: forge_alert_triage (via MTTR calculation) | Typical severity: High
- Review: Run `pforge incident list` to see open incidents and their MTTR.
- Escalate: If the incident has been open beyond the SLA for its severity level (see severity matrix above), escalate to the next tier defined in `onCall.escalation`.
- Root cause: Is the incident blocked on external factors? If so, document the blocker in the incident description.
- Close: Once resolved, update the incident status. MTTR is automatically calculated from capture time to close time.
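The calculation described in the Close step, capture time to close time, and the escalation check can be sketched as follows. The function names are hypothetical, and the threshold is passed in by the caller because only the Critical row of the matrix gives a concrete duration (1 hour).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def time_to_resolve(captured_at: datetime,
                    closed_at: Optional[datetime] = None,
                    now: Optional[datetime] = None) -> timedelta:
    """Elapsed time from capture to close, or to `now` while still open."""
    end = closed_at or now or datetime.now(timezone.utc)
    return end - captured_at


def needs_escalation(captured_at: datetime,
                     threshold: timedelta,
                     closed_at: Optional[datetime] = None,
                     now: Optional[datetime] = None) -> bool:
    """True when the incident has run longer than the given threshold."""
    return time_to_resolve(captured_at, closed_at, now) > threshold
```

All timestamps should be timezone-aware; mixing naive and aware values raises `TypeError` on subtraction.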
Fix Proposal Workflow (v2.29)
When a LiveGuard tool reports a failure (regression, drift, incident, or secret found), forge_fix_proposal generates a scoped fix plan of one or two slices for human review. This is the detect → propose → approve → fix loop.
- Trigger: Run `pforge fix-proposal --source regression` (or `drift`, `incident`, or `secret`) after the alert fires.
- Review the plan: Open `docs/plans/auto/LIVEGUARD-FIX-<incidentId>.md`. The plan contains the failing command, affected files, and a template fix slice with `<!-- TODO -->` markers for you to fill in.
- Fill in the fix: Complete the TODO markers in the fix slice. For secret findings, the template directs you to remove the credential from the diff and rotate it externally before proceeding.
- Execute on a branch: Run `pforge run-plan --assisted docs/plans/auto/LIVEGUARD-FIX-<incidentId>.md`. The plan targets a dedicated branch, never master.
- Verify: The second slice re-runs the exact commands that originally failed. Green gate = fix confirmed.
- Promote or close: Merge the branch if the fix holds. Close the proposal by updating its status in `.forge/fix-proposals.json`. Auto-generated plans in `docs/plans/auto/` are gitignored, so promote a plan manually to `docs/plans/` if you want to keep it in version history.
forge_fix_proposal generates at most one proposal per incidentId. If the first proposal doesn’t resolve the issue, address it manually — the tool will return status: "needs-human-intervention" on the second call.
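The one-proposal-per-incident rule can be pictured with a small model. This is a sketch of the behavior described above, not the tool's implementation: `request_proposal` and `proposal_plan_path` are hypothetical helpers, and the `"proposed"` status for a first call is an assumption. Only the plan path pattern and the `"needs-human-intervention"` status come from this page.

```python
from pathlib import PurePosixPath


def proposal_plan_path(incident_id: str) -> str:
    """Build the plan path following the LIVEGUARD-FIX-<incidentId>.md pattern."""
    return str(PurePosixPath("docs/plans/auto") / f"LIVEGUARD-FIX-{incident_id}.md")


def request_proposal(incident_id: str, already_proposed: set) -> dict:
    """Allow one proposal per incident; later calls hand the issue back to a human."""
    if incident_id in already_proposed:
        return {"status": "needs-human-intervention"}
    already_proposed.add(incident_id)
    # "proposed" is a placeholder status chosen for this sketch.
    return {"status": "proposed", "plan": proposal_plan_path(incident_id)}
```

Calling `request_proposal` twice with the same incident ID and the same `already_proposed` set returns a plan the first time and `needs-human-intervention` the second.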