
Chapter 10 · Reference

Agents, Skills & Hooks

Complete reference for the 21 reviewer agents, the 4 reviewer skills and pipeline prompts, and the lifecycle hook system.

Looking for the instruction files catalog (auto-loading rules, universal files, and domain catalog)? See Chapter 10: Instruction Files & Agents.

Agents

21 reviewer agents, organized in three categories. Agents are read-only: they audit code but can't edit files.

Stack-Specific Agents (6)

Vary by preset; examples for dotnet:

Agent	Reviews
architecture-reviewer	Layer separation, dependency direction, SOLID
database-reviewer	Query patterns, migrations, connection management
deploy-reviewer	Dockerfile, health checks, container optimization
performance-reviewer	Hot paths, allocations, async patterns
security-reviewer	Input validation, auth, secrets, OWASP
test-runner	Test coverage, test patterns, mocking strategy

Cross-Stack Agents (8)

Shared across all presets, with the same expertise regardless of language:

Agent	Reviews
api-contract-reviewer	API versioning, backward compatibility, OpenAPI
accessibility-reviewer	WCAG 2.2, semantic HTML, ARIA, keyboard nav
multi-tenancy-reviewer	Tenant isolation, data leakage, RLS, cache separation
cicd-reviewer	Pipeline safety, secrets, rollback strategies
observability-reviewer	Structured logging, distributed tracing, metrics
dependency-reviewer	CVEs, outdated packages, license conflicts
compliance-reviewer	GDPR, CCPA, SOC2, PII handling, audit logs
error-handling-reviewer	Exception hierarchy, error boundaries, ProblemDetails

Pipeline Agents (6)

Drive the 7-step pipeline with handoff buttons between stages:

Agent	Pipeline Step	What It Does
specifier	Step 0	Interviews you, produces specification
preflight	Step 1	Verifies prerequisites, checks environment readiness
plan-hardener	Step 2	Converts spec into hardened execution contract
executor	Step 3	Executes slices, validates gates
reviewer-gate	Step 5	Independent audit for drift and compliance
shipper	Step 6	Commits, updates roadmap, captures lessons

Skills

Skills are multi-step procedures the AI runs end-to-end: they read files, write files, run terminal commands, and emit events the dashboard can watch. Unlike agents (which review) and hooks (which gate), skills do work. There are two tiers: shared skills installed across every preset, and stack-specific skills tailored to the chosen language.

The SKILL.md runtime contract

Every skill is a single Markdown file with YAML frontmatter followed by numbered ### N. Step Name sections. The skill-runner parses the file into a step DAG, executes bash blocks per step, and emits lifecycle events to the WebSocket hub. The contract:

Frontmatter field	Required	Purpose
`name`	Yes	Slash-command alias (the file's directory name). `name: database-migration` → `/database-migration`.
`description`	Yes	One-paragraph trigger guidance. The classifier matches user prompts against this field. Best practice: include USE FOR and DO NOT USE FOR phrases.
`argument-hint`	Optional	One-line example of the argument shape, surfaced in the slash-command picker.
`tools`	Optional	Allow-list of tools the skill may invoke. Inline (`tools: [run_in_terminal, read_file]`) or block list. Enforces least-privilege at runtime.
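
To make the frontmatter contract concrete, here is a minimal sketch of a checker for the four fields in the table. It is not the skill-runner's parser: `parse_frontmatter` is a hypothetical helper that only understands flat `key: value` lines and the inline form of `tools`, not the block-list form, and a real implementation would presumably use a full YAML parser.

```python
REQUIRED_FIELDS = ("name", "description")


def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` lines between the first two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    else:
        raise ValueError("missing closing --- line")
    # Inline allow-list form only: tools: [run_in_terminal, read_file]
    if "tools" in fields:
        raw = fields["tools"].strip("[]")
        fields["tools"] = [t.strip() for t in raw.split(",") if t.strip()]
    missing = [f for f in REQUIRED_FIELDS if not fields.get(f)]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    return fields
```

Run against the frontmatter of the `deploy-canary` example later in this chapter, this sketch would return its `name`, `description`, `argument-hint`, and a two-item `tools` list.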

After the frontmatter, three Markdown sections are recognized by the runner:

## Steps: the main body. Each ### N. Step Name heading defines one step; bash fences inside are executed in order. A step with no bash block is informational and auto-passes.
## Safety Rules: a bullet list of invariants. Surfaced in the dashboard and injected into the skill's context.
## Persistent Memory: optional. A block appended to the skill's L2 capture so OpenBrain remembers cross-run lessons.
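
The step-extraction rule described above can be sketched as follows. This is an illustration under assumptions, not the runner's code: `parse_steps` is a hypothetical name, it returns a flat ordered list rather than a DAG, and it treats any other heading as the end of the current step. The fence marker is built with `"`" * 3` only so the example can sit inside a fenced block itself.

```python
import re

STEP_HEADING = re.compile(r"^###\s+(\d+)\.\s+(.+?)\s*$")
FENCE = "`" * 3  # a Markdown code-fence marker


def parse_steps(markdown: str) -> list[dict]:
    """Collect numbered steps and the bash blocks that belong to each."""
    steps, current, in_bash, block = [], None, False, []
    for line in markdown.splitlines():
        stripped = line.strip()
        if in_bash:
            if stripped == FENCE:  # closing fence: store the block
                current["commands"].append("\n".join(block))
                in_bash, block = False, []
            else:
                block.append(line)
            continue
        heading = STEP_HEADING.match(line)
        if heading:
            current = {
                "number": int(heading.group(1)),
                "name": heading.group(2),
                "commands": [],
            }
            steps.append(current)
        elif line.startswith("#"):
            current = None  # any other heading ends the current step
        elif stripped == FENCE + "bash" and current is not None:
            in_bash = True
    return steps
```

A step whose `commands` list comes back empty corresponds to the informational, auto-passing case.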

Two structural patterns are recognized inside step bodies:

Conditionals: a blockquote starting with If <condition> → <action>. If the action contains the word skip, the runner aborts the current step's remaining commands. Used for early-exit paths like “If migration fails → rollback & STOP.”
Temper Guards: a Markdown table at the end of the file capturing shortcuts authors took that broke the skill. Surfaced in code review and the Plan Forge knowledge graph.
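
As a rough illustration of the conditional pattern, the sketch below recognizes a blockquote of the form If <condition> → <action> and flags whether the action mentions skip. The whole-word, case-insensitive match on "skip" is an assumption, as are the name `parse_conditional` and the shape of the returned dictionary.

```python
import re

CONDITIONAL = re.compile(r"^>\s*If\s+(?P<condition>.+?)\s*→\s*(?P<action>.+)$")


def parse_conditional(line: str):
    """Return the parsed conditional, or None if the line is not one."""
    match = CONDITIONAL.match(line.strip())
    if match is None:
        return None
    action = match.group("action").strip()
    return {
        "condition": match.group("condition"),
        "action": action,
        # Assumption: "skip" is matched as a whole word, ignoring case.
        "skips_rest_of_step": re.search(r"\bskip\b", action, re.IGNORECASE) is not None,
    }
```

Under this reading, “If migration fails → rollback & STOP.” parses as a conditional but does not set the skip flag, because its action never uses the word skip.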

Events the runner emits

Every skill execution emits four event types on the hub (cataloged in Appendix V — Skills events):

Event	When	Payload
`skill-started`	Once, at entry	`{ skillName, stepCount }`
`skill-step-started`	Before each step	`{ skillName, stepNumber, stepName }`
`skill-step-completed`	After each step	`{ skillName, stepNumber, stepName, status, duration }`
`skill-completed`	Once, at exit	`{ skillName, passed, failed, duration }`
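
The ordering of these four events can be sketched with a small generator. It is a simulation only: `run_skill` is a hypothetical stand-in for the runner, the status strings "passed" and "failed" and the use of seconds for `duration` are assumptions, and the sketch keeps going after a failed step, which the real runner may not do.

```python
import time
from typing import Callable, Iterator


def run_skill(
    skill_name: str, steps: list[tuple[str, Callable[[], bool]]]
) -> Iterator[tuple[str, dict]]:
    """Yield (event_type, payload) pairs in the order described above."""
    started = time.monotonic()
    passed = failed = 0
    yield "skill-started", {"skillName": skill_name, "stepCount": len(steps)}
    for number, (step_name, action) in enumerate(steps, start=1):
        base = {"skillName": skill_name, "stepNumber": number, "stepName": step_name}
        yield "skill-step-started", dict(base)
        step_started = time.monotonic()
        ok = action()
        if ok:
            passed += 1
        else:
            failed += 1
        yield "skill-step-completed", {
            **base,
            "status": "passed" if ok else "failed",  # assumed status values
            "duration": time.monotonic() - step_started,
        }
    yield "skill-completed", {
        "skillName": skill_name,
        "passed": passed,
        "failed": failed,
        "duration": time.monotonic() - started,
    }
```

A two-step skill therefore produces six events: one start, two started/completed pairs, and one completion.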

Three ways to invoke a skill

Slash command in chat: /database-migration add user_profiles table. The most common path; works in VS Code Copilot, Claude, Cursor, and Codex once setup.ps1 -Agent <name> has run.
MCP tool: forge_run_skill with { name, args }. Returns the same lifecycle events plus a structured result envelope.
REST: POST /api/tool/forge_run_skill through the generic dispatcher (see Appendix W). Used by the dashboard and any external integration.
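
For the REST path, a request might be assembled as in the sketch below. Only the route comes from this chapter. The JSON body reuses the { name, args } shape of the MCP tool, which is an assumption for REST, and the base URL and port are placeholders; check Appendix W for the actual contract.

```python
import json
import urllib.request


def build_run_skill_request(base_url: str, name: str, args: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the generic tool dispatcher."""
    # Assumption: the REST body mirrors the MCP tool's { name, args } input.
    body = json.dumps({"name": name, "args": args}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/tool/forge_run_skill",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and port; substitute wherever the server actually listens.
request = build_run_skill_request(
    "http://localhost:3100", "database-migration", "add user_profiles table"
)
# urllib.request.urlopen(request) would send it.
```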

The orchestrator can also defer a skill into the decision tray when it wants a human to choose; clients query GET /api/skills/pending and resolve through POST /api/skills/{accept,reject,defer} (full surface in Appendix W — Skills).

Shared skills (every preset)

Seven skills ship under presets/shared/skills/ and install regardless of language. These are the cross-cutting workflows.

Skill	Invocation	What it does	Key tools
`audit-loop`	`/audit-loop [--max=N --env=dev]`	Recursive scan → triage → fix until findings converge to zero. The orchestrator's drain loop, exposed as a one-shot.	`forge_tempering_*`, `forge_bug_register`, `forge_triage_route`
`bug-fix`	`/bug-fix <bugId>`	End-to-end tempering bug fix: load bug → `/code-review` pre-fix → write failing test (TDD red) → fix → `forge_bug_validate_fix` → `/test-sweep` → close. Composes `/code-review`, `/clean-code-review`, `/forge-quench`, `/test-sweep` around `forge_bug_*` so a fix never closes without a regression check.	`forge_bug_register`, `forge_bug_validate_fix`, `forge_bug_update_status`
`forge-execute`	`/forge-execute`	Guided plan execution: list plans → estimate cost → execute → report. The friendly path for new users.	`forge_run_plan`, `forge_estimate_quorum`, `forge_cost_report`
`forge-quench`	`/forge-quench <plan>`	Final hardening pass before committing a plan: runs validators and the completeness sweep.	`forge_validate`, `forge_sweep`
`forge-troubleshoot`	`/forge-troubleshoot`	Diagnose common Plan Forge issues: missing API keys, stale orchestrator logs, broken hub, hook conflicts.	`forge_smith`, `forge_diagnose`
`health-check`	`/health-check`	Forge diagnostic chain: `forge_smith` → `forge_validate` → `forge_sweep`. Run on a clean checkout before opening a PR.	`forge_smith`, `forge_validate`, `forge_sweep`
`security-audit` (shared variant)	`/security-audit`	Generic OWASP scan, secrets detection, severity report. Stack presets override with language-specific scanners.	`forge_secret_scan`, `forge_dep_watch`
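
The convergence behavior of `audit-loop` can be sketched as a bounded loop. This illustrates the control flow only: `audit_loop`, `scan`, and `fix` are hypothetical names, the triage stage is omitted for brevity, and `max_rounds` stands in for the `--max=N` option.

```python
from typing import Callable


def audit_loop(
    scan: Callable[[], list[str]],
    fix: Callable[[str], None],
    max_rounds: int = 5,
) -> dict:
    """Scan and fix repeatedly until a scan is clean or the budget runs out."""
    for round_number in range(1, max_rounds + 1):
        findings = scan()
        if not findings:
            return {"converged": True, "rounds": round_number, "remaining": []}
        for finding in findings:
            fix(finding)
    # Budget exhausted: one last scan reports whatever is still open.
    remaining = scan()
    return {"converged": not remaining, "rounds": max_rounds, "remaining": remaining}
```

Bounding the loop matters because a fix can introduce new findings, so a scan that never comes back clean would otherwise run forever.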

Stack-specific skills (per language preset)

Ten skills ship per language preset under presets/<stack>/.github/skills/. Skill names are the same across stacks, but each implementation calls the language's idiomatic toolchain: database-migration uses Knex / Prisma for TypeScript, EF Core for .NET, Alembic for Python, GORM for Go, and so on.

Skill	Invocation	What it does
`api-doc-gen`	`/api-doc-gen`	Generate or update OpenAPI spec, validate spec-to-code consistency.
`code-review`	`/code-review`	Comprehensive review: architecture, security, testing, patterns.
`database-migration`	`/database-migration "<change>"`	Generate, review, test locally, deploy to staging, with rollback. Five-step DAG with conditional early-exit on migration failure.
`dependency-audit`	`/dependency-audit`	Scan for vulnerabilities, outdated packages, license issues. Wraps `npm audit` / `dotnet list package --vulnerable` / `pip-audit` per stack.
`forge-quench` (stack variant)	`/forge-quench`	Same shape as shared variant, but invokes the stack's linter and test runner.
`onboarding`	`/onboarding`	Walk a new developer through project setup, architecture, and first task.
`release-notes`	`/release-notes "<tag>"`	Generate release notes from git history and CHANGELOG. Output formatted for GitHub Release, Slack, or email.
`security-audit` (stack variant)	`/security-audit`	Language-specific OWASP scan plus shared scanners. Wraps semgrep / bandit / brakeman / govulncheck per stack.
`staging-deploy`	`/staging-deploy`	Build, push, migrate, deploy, and verify on staging with health-check probe.
`test-sweep`	`/test-sweep [category]`	Run all test suites (unit, integration, API, E2E) and aggregate results into a summary report. Run before the Review Gate.
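
The "same name, different toolchain" idea can be pictured as a lookup from stack to command. The sketch below uses the three `dependency-audit` commands named in the table; the stack keys other than `dotnet` and the function name are assumptions, and the real presets ship separate SKILL.md files rather than a table like this.

```python
DEPENDENCY_AUDIT_COMMANDS = {
    "typescript": ["npm", "audit"],
    "dotnet": ["dotnet", "list", "package", "--vulnerable"],
    "python": ["pip-audit"],
}


def dependency_audit_command(stack: str) -> list[str]:
    """Return the argv for the stack's dependency scanner."""
    try:
        return DEPENDENCY_AUDIT_COMMANDS[stack]
    except KeyError:
        raise ValueError(f"no dependency-audit command known for stack {stack!r}") from None
```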

Authoring a new skill

The minimum viable skill is one frontmatter block + one numbered step. Drop it under .github/skills/<name>/SKILL.md and it's available as /<name> in the next chat session. Example:

---
name: deploy-canary
description: "Deploy current branch to canary environment and watch metrics for 10 minutes. USE FOR: gradual rollout. DO NOT USE FOR: hotfixes (use /staging-deploy)."
argument-hint: "[optional: minutes to watch, default 10]"
tools: [run_in_terminal, read_file]
---

# Deploy Canary Skill

## Steps

### 1. Build & Push
```bash
# Tag with the registry path so the push below can find the image
docker build -t myregistry/myapp:canary .
docker push myregistry/myapp:canary
```

### 2. Apply
```bash
kubectl set image deployment/myapp myapp=myregistry/myapp:canary -n canary
kubectl rollout status deployment/myapp -n canary --timeout=2m
```

### Conditional: Rollout Failure
> If rollout fails → immediately `kubectl rollout undo`, report the error, and STOP. Do not proceed to watch.

### 3. Watch
```bash
# MINUTES is a count of minutes (default 10); sleep expects seconds
sleep "$(( ${MINUTES:-10} * 60 ))"
kubectl logs -l app=myapp -n canary --tail=200
```

## Safety Rules
- NEVER deploy from a dirty working tree
- ALWAYS roll back within 60s if 5xx rate exceeds 1%

Authoring guidance:

Step granularity: one step = one logical unit the dashboard can show pass/fail for. Keep step count under 8; longer skills should split into sub-skills called from a coordinator.
Conditional placement: use blockquote conditionals between steps for early-exit paths. They render distinctly in the dashboard timeline.
Tool allow-list: list only the tools the skill actually needs. The runtime warns on unused entries and refuses unlisted tool calls.
Idempotency: assume the skill may be re-run after a partial failure. Use IF NOT EXISTS, check-then-act patterns, and explicit cleanup steps.
Persistent memory: when the skill learns something cross-cutting (a new failure mode, an env-specific quirk), append it under ## Persistent Memory. The capture is routed to L2 and, if configured, L3 OpenBrain.
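
The allow-list behavior described in the guidance above (refuse unlisted tool calls, warn on unused entries) can be sketched with a small gate object. `ToolGate` and `ToolNotAllowed` are hypothetical names; the actual runtime's enforcement mechanism is not documented in this chapter.

```python
class ToolNotAllowed(Exception):
    """Raised when a skill calls a tool missing from its allow-list."""


class ToolGate:
    def __init__(self, allowed: list[str]):
        self.allowed = set(allowed)
        self.used: set[str] = set()

    def call(self, tool: str, invoke):
        """Run `invoke` only if `tool` is on the allow-list."""
        if tool not in self.allowed:
            raise ToolNotAllowed(f"{tool} is not in the skill's tools list")
        self.used.add(tool)
        return invoke()

    def unused(self) -> list[str]:
        """Allow-list entries never called; candidates for a warning."""
        return sorted(self.allowed - self.used)


gate = ToolGate(["run_in_terminal", "read_file"])
gate.call("read_file", lambda: "contents")
# gate.unused() now lists run_in_terminal, which a runtime could warn about.
```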

Reference reading: every skill in presets/<stack>/.github/skills/ is a worked example. The richest are database-migration (5-step DAG with conditional rollback) and audit-loop (recursive convergence loop).

Lifecycle Hooks

Hooks run automatically during agent sessions, with no manual activation:

Hook	When	What It Enforces
SessionStart	Session begins	Injects Project Principles, current phase, forbidden patterns
PreToolUse	Before file edit	Blocks edits to paths listed in plan's Forbidden Actions
PostToolUse	After file edit	Auto-formats, warns on TODO/FIXME/stub markers
Stop	Session ends	Warns if code modified but no test run detected
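
A PreToolUse check of the kind listed in the table could look like the sketch below. It assumes the plan's Forbidden Actions can be reduced to a list of glob patterns, which this chapter does not confirm; `pre_tool_use` and its return shape are hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase


def pre_tool_use(path: str, forbidden_patterns: list[str]) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed file edit."""
    # Normalize separators and drop a leading "./" so patterns compare cleanly.
    normalized = posixpath.normpath(path.replace("\\", "/"))
    for pattern in forbidden_patterns:
        if fnmatchcase(normalized, pattern):
            return False, f"blocked: {normalized} matches forbidden pattern {pattern}"
    return True, "allowed"
```

Note that `fnmatchcase` lets `*` match across directory separators, so a pattern such as `migrations/*` also covers nested files.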

PostToolUse warning: If you see "⚠ Deferred-work marker detected" after an edit, the AI left a TODO or stub. Address it before moving on; the completeness sweep (Step 4) will catch it anyway.
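
The marker check behind that warning can be approximated with a regular expression. The exact marker set the hook uses is not listed here, so the pattern below (uppercase TODO, FIXME, and STUB as whole words) is an assumption, as are the function names and the detail appended after the warning text.

```python
import re

# Assumed marker set; the real hook may recognize more or different markers.
DEFERRED_MARKER = re.compile(r"\b(TODO|FIXME|STUB)\b")


def find_deferred_work(text: str) -> list[tuple[int, str]]:
    """Return (line_number, marker) for the first marker on each line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = DEFERRED_MARKER.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


def post_tool_use_warning(text: str):
    """Return a warning string, or None when the edit is clean."""
    hits = find_deferred_work(text)
    if not hits:
        return None
    where = ", ".join(f"line {number} ({marker})" for number, marker in hits)
    return f"⚠ Deferred-work marker detected: {where}"
```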

📄 Full reference: capabilities, Multi-Agent Setup — GitHub Copilot