Agents, Skills & Hooks
Complete reference for the 14 reviewer agents, 6 pipeline agents, 16 slash-command skills (6 shared + 10 per stack), and the lifecycle hook system.
Agents
Agents fall into three categories: 14 reviewer agents (6 stack-specific and 8 cross-stack) plus 6 pipeline agents. The reviewer agents are read-only: they audit code but can't edit files.
Stack-Specific Agents (6)
These vary by preset. The examples below are for the dotnet preset:
| Agent | Reviews |
|---|---|
| architecture-reviewer | Layer separation, dependency direction, SOLID |
| database-reviewer | Query patterns, migrations, connection management |
| deploy-reviewer | Dockerfile, health checks, container optimization |
| performance-reviewer | Hot paths, allocations, async patterns |
| security-reviewer | Input validation, auth, secrets, OWASP |
| test-runner | Test coverage, test patterns, mocking strategy |
Cross-Stack Agents (8)
Shared across all presets, with the same expertise regardless of language:
| Agent | Reviews |
|---|---|
| api-contract-reviewer | API versioning, backward compatibility, OpenAPI |
| accessibility-reviewer | WCAG 2.2, semantic HTML, ARIA, keyboard nav |
| multi-tenancy-reviewer | Tenant isolation, data leakage, RLS, cache separation |
| cicd-reviewer | Pipeline safety, secrets, rollback strategies |
| observability-reviewer | Structured logging, distributed tracing, metrics |
| dependency-reviewer | CVEs, outdated packages, license conflicts |
| compliance-reviewer | GDPR, CCPA, SOC2, PII handling, audit logs |
| error-handling-reviewer | Exception hierarchy, error boundaries, ProblemDetails |
Pipeline Agents (6)
Drive the 7-step pipeline with handoff buttons between stages:
| Agent | Pipeline Step | What It Does |
|---|---|---|
| specifier | Step 0 | Interviews you, produces specification |
| preflight | Step 1 | Verifies prerequisites, checks environment readiness |
| plan-hardener | Step 2 | Converts spec into hardened execution contract |
| executor | Step 3 | Executes slices, validates gates |
| reviewer-gate | Step 5 | Independent audit for drift and compliance |
| shipper | Step 6 | Commits, updates roadmap, captures lessons |
Skills
Skills are multi-step procedures the AI runs end-to-end: they read files, write files, run terminal commands, and emit events the dashboard can watch. Unlike agents (which review) and hooks (which gate), skills do work. There are two tiers: shared skills installed across every preset, and stack-specific skills tailored to the chosen language.
The SKILL.md runtime contract
Every skill is a single Markdown file with YAML frontmatter followed by numbered `### N. Step Name` sections. The skill-runner parses the file into a step DAG, executes bash blocks per step, and emits lifecycle events to the WebSocket hub. The contract:
| Frontmatter field | Required | Purpose |
|---|---|---|
| name | Yes | Slash-command alias (the file's directory name). `name: database-migration` → `/database-migration`. |
| description | Yes | One-paragraph trigger guidance. The classifier matches user prompts against this field. Best practice: include USE FOR and DO NOT USE FOR phrases. |
| argument-hint | Optional | One-line example of the argument shape, surfaced in the slash-command picker. |
| tools | Optional | Allow-list of tools the skill may invoke. Inline (`tools: [run_in_terminal, read_file]`) or block list. Enforces least-privilege at runtime. |
After the frontmatter, three Markdown sections are recognized by the runner:
- `## Steps`: the main body. Each `### N. Step Name` heading defines one step; bash fences inside are executed in order. A step with no bash block is informational and auto-passes.
- `## Safety Rules`: bullet list of invariants. Surfaced in the dashboard and injected into the skill's context.
- `## Persistent Memory`: optional. Block appended to the skill's L2 capture so OpenBrain remembers cross-run lessons.
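The parsing described above can be sketched in a few lines. This is not the skill-runner's actual parser: `parse_skill` and its return shape are hypothetical, the frontmatter handling is deliberately simplistic (values stay raw strings, block lists are not handled), and only the `## Steps` section is read.

```python
import re

FENCE = "`" * 3  # built indirectly so this sample can itself live inside a fence


def parse_skill(text):
    """Split a SKILL.md string into (frontmatter dict, list of step dicts)."""
    # Frontmatter is everything between the first two '---' lines.
    _, raw_meta, body = text.split("---\n", 2)
    meta = {}
    for line in raw_meta.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip().strip('"')

    steps, current = [], None
    in_steps = in_bash = False
    for line in body.splitlines():
        if in_bash:
            if line.startswith(FENCE):
                in_bash = False  # closing fence
            elif current is not None:
                current["commands"].append(line)
        elif line.startswith("## "):
            in_steps = line.strip() == "## Steps"
            current = None
        elif in_steps and line.startswith(FENCE + "bash"):
            in_bash = True
        elif in_steps:
            heading = re.match(r"^### (\d+)\. (.+)$", line)
            if heading:
                current = {
                    "number": int(heading.group(1)),
                    "name": heading.group(2),
                    "commands": [],
                }
                steps.append(current)
    return meta, steps


SAMPLE = "\n".join([
    "---",
    "name: deploy-canary",
    'description: "Deploy current branch to canary."',
    "tools: [run_in_terminal, read_file]",
    "---",
    "# Deploy Canary Skill",
    "## Steps",
    "### 1. Build & Push",
    FENCE + "bash",
    "docker build -t myregistry/myapp:canary .",
    "docker push myregistry/myapp:canary",
    FENCE,
    "### 2. Review",
    "Read the rollout notes before continuing.",
    "## Safety Rules",
    "- NEVER deploy from a dirty working tree",
])

meta, steps = parse_skill(SAMPLE)
for step in steps:
    # A step with no commands corresponds to an informational step.
    kind = f"{len(step['commands'])} command(s)" if step["commands"] else "informational"
    print(step["number"], step["name"], kind)
```

Step 2 has no bash block, so it ends up with an empty command list, which is the case the contract treats as informational and auto-passing.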
Two structural patterns are recognized inside step bodies:
- Conditionals: a blockquote starting with `If <condition> → <action>`. If the action contains the word skip, the runner aborts the current step's remaining commands. Used for early-exit paths like “If migration fails → rollback & STOP.”
- Temper Guards: a Markdown table at the end of the file capturing shortcuts authors took that broke the skill. Surfaced in code review and the Plan Forge knowledge graph.
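To make the conditional pattern concrete, here is a minimal sketch of how such a blockquote line could be recognized. The function name, the regular expression, and the choice of case-insensitive whole-word matching for "skip" are all assumptions; the runner's real matching rules may differ.

```python
import re

# Hypothetical pattern: "> If <condition> → <action>"
CONDITIONAL = re.compile(r"^>\s*If\s+(?P<condition>.+?)\s*→\s*(?P<action>.+)$")


def parse_conditional(line):
    """Return the parts of a conditional blockquote, or None if the line is not one."""
    match = CONDITIONAL.match(line)
    if match is None:
        return None
    action = match.group("action")
    return {
        "condition": match.group("condition"),
        "action": action,
        # Assumption: "skip" is matched as a whole word, ignoring case.
        "skips": re.search(r"\bskip\b", action, re.IGNORECASE) is not None,
    }


print(parse_conditional("> If the table already exists → skip the remaining commands"))
print(parse_conditional("> If migration fails → rollback & STOP."))
print(parse_conditional("Plain prose, not a conditional"))
```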
Events the runner emits
Every skill execution emits four event types on the hub (cataloged in Appendix V — Skills events):
| Event | When | Payload |
|---|---|---|
| skill-started | Once, at entry | `{ skillName, stepCount }` |
| skill-step-started | Before each step | `{ skillName, stepNumber, stepName }` |
| skill-step-completed | After each step | `{ skillName, stepNumber, stepName, status, duration }` |
| skill-completed | Once, at exit | `{ skillName, passed, failed, duration }` |
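The ordering in the table can be illustrated with a small generator. This is a sketch, not the runner's implementation: `run_skill` is a hypothetical name, the `status` values `"passed"` and `"failed"`, the use of seconds for `duration`, and continuing after a failed step are all assumptions. Only the event names and payload field names come from the table.

```python
import time


def run_skill(skill_name, steps):
    """Yield (event_type, payload) pairs in the order the table describes.

    `steps` is a list of (step_name, callable) pairs; each callable returns
    True on success and False on failure.
    """
    started = time.monotonic()
    yield "skill-started", {"skillName": skill_name, "stepCount": len(steps)}

    passed = failed = 0
    for number, (step_name, action) in enumerate(steps, start=1):
        base = {"skillName": skill_name, "stepNumber": number, "stepName": step_name}
        yield "skill-step-started", dict(base)

        step_started = time.monotonic()
        ok = action()
        if ok:
            passed += 1
        else:
            failed += 1
        yield "skill-step-completed", {
            **base,
            "status": "passed" if ok else "failed",  # assumed status vocabulary
            "duration": time.monotonic() - step_started,  # assumed unit: seconds
        }

    yield "skill-completed", {
        "skillName": skill_name,
        "passed": passed,
        "failed": failed,
        "duration": time.monotonic() - started,
    }


events = list(run_skill("deploy-canary", [
    ("Build & Push", lambda: True),
    ("Apply", lambda: False),
]))
for event_type, payload in events:
    print(event_type, payload)
```

A skill with N steps therefore produces 2 + 2N events, which is a handy sanity check for a dashboard subscriber.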
Three ways to invoke a skill
- Slash command in chat: `/database-migration add user_profiles table`. The most common path; works in VS Code Copilot, Claude, Cursor, and Codex once `setup.ps1 -Agent <name>` has run.
- MCP tool: `forge_run_skill` with `{ name, args }`. Returns the same lifecycle events plus a structured result envelope.
- REST: `POST /api/tool/forge_run_skill` through the generic dispatcher (see Appendix W). Used by the dashboard and any external integration.
The orchestrator can also defer a skill into the decision tray when it wants a human to choose; clients query GET /api/skills/pending and resolve through POST /api/skills/{accept,reject,defer} (full surface in Appendix W — Skills).
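For the REST path, a request could be assembled roughly as follows. The base URL is a placeholder, and the JSON body is assumed to mirror the MCP tool's `{ name, args }` shape; check Appendix W for the dispatcher's actual contract. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3100"  # hypothetical host and port


def build_run_skill_request(name, args, base_url=BASE_URL):
    """Build (but do not send) a POST to the generic tool dispatcher."""
    # Assumption: the REST body has the same { name, args } shape as the MCP tool.
    body = json.dumps({"name": name, "args": args}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/tool/forge_run_skill",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_run_skill_request("database-migration", "add user_profiles table")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it against a running instance: urllib.request.urlopen(req)
```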
Shared skills (every preset)
Six skills ship under presets/shared/skills/ and install regardless of language. These are the cross-cutting workflows.
| Skill | Invocation | What it does | Key tools |
|---|---|---|---|
| audit-loop | /audit-loop [--max=N --env=dev] | Recursive scan → triage → fix until findings converge to zero. The orchestrator's drain loop, exposed as a one-shot. | forge_tempering_*, forge_bug_register, forge_triage_route |
| forge-execute | /forge-execute | Guided plan execution: list plans → estimate cost → execute → report. The friendly path for new users. | forge_run_plan, forge_estimate_quorum, forge_cost_report |
| forge-quench | /forge-quench <plan> | Final hardening pass before committing a plan, runs validators and the completeness sweep. | forge_validate, forge_sweep |
| forge-troubleshoot | /forge-troubleshoot | Diagnose common Plan Forge issues: missing API keys, stale orchestrator logs, broken hub, hook conflicts. | forge_smith, forge_diagnose |
| health-check | /health-check | Forge diagnostic chain: forge_smith → forge_validate → forge_sweep. Run on a clean checkout before opening a PR. | forge_smith, forge_validate, forge_sweep |
| security-audit (shared variant) | /security-audit | Generic OWASP scan, secrets detection, severity report. Stack presets override with language-specific scanners. | forge_secret_scan, forge_dep_watch |
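The convergence idea behind audit-loop can be sketched as a bounded loop. This is an illustration only: `audit_loop`, `scan`, and `fix` are hypothetical names, the triage stage is omitted, and mapping `--max=N` to `max_rounds` is an assumption.

```python
def audit_loop(scan, fix, max_rounds=5):
    """Repeat scan → fix until a scan returns no findings or max_rounds is reached."""
    history = []  # number of findings seen in each round
    for round_number in range(1, max_rounds + 1):
        findings = scan()
        history.append(len(findings))
        if not findings:
            return {"converged": True, "rounds": round_number, "history": history}
        for finding in findings:
            fix(finding)
    return {"converged": False, "rounds": max_rounds, "history": history}


# Toy backlog: fixing "b" uncovers a follow-up finding "c".
backlog = ["a", "b"]


def scan():
    return list(backlog)


def fix(finding):
    backlog.remove(finding)
    if finding == "b":
        backlog.append("c")


result = audit_loop(scan, fix)
print(result)
```

The round cap matters because a fix can introduce new findings, as the toy backlog shows; without it a loop that never converges would run forever.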
Stack-specific skills (per language preset)
Ten skills ship per language preset under presets/<stack>/.github/skills/. Skill names are the same across stacks, but each implementation calls the language's idiomatic toolchain: database-migration uses Knex / Prisma for TypeScript, EF Core for .NET, Alembic for Python, GORM for Go, and so on.
| Skill | Invocation | What it does |
|---|---|---|
| api-doc-gen | /api-doc-gen | Generate or update OpenAPI spec, validate spec-to-code consistency. |
| code-review | /code-review | Comprehensive review: architecture, security, testing, patterns. |
| database-migration | /database-migration "<change>" | Generate, review, test locally, deploy to staging, with rollback. Five-step DAG with conditional early-exit on migration failure. |
| dependency-audit | /dependency-audit | Scan for vulnerabilities, outdated packages, license issues. Wraps npm audit / dotnet list package --vulnerable / pip-audit per stack. |
| forge-quench (stack variant) | /forge-quench | Same shape as shared variant, but invokes the stack's linter and test runner. |
| onboarding | /onboarding | Walk a new developer through project setup, architecture, and first task. |
| release-notes | /release-notes "<tag>" | Generate release notes from git history and CHANGELOG. Output formatted for GitHub Release, Slack, or email. |
| security-audit (stack variant) | /security-audit | Language-specific OWASP scan plus shared scanners. Wraps semgrep / bandit / brakeman / govulncheck per stack. |
| staging-deploy | /staging-deploy | Build, push, migrate, deploy, and verify on staging with health-check probe. |
| test-sweep | /test-sweep [category] | Run all test suites (unit, integration, API, E2E) and aggregate results into a summary report. Run before the Review Gate. |
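The "same skill name, different toolchain" idea can be pictured as a lookup from preset to command, using dependency-audit as the example. The three commands are the ones named in the table; the preset keys `typescript` and `python` and the helper `audit_argv` are assumptions for illustration (only `dotnet` appears as a preset name in this reference).

```python
import shlex

# Commands as listed for dependency-audit; preset keys are partly assumed.
AUDIT_COMMANDS = {
    "typescript": "npm audit",
    "dotnet": "dotnet list package --vulnerable",
    "python": "pip-audit",
}


def audit_argv(stack):
    """Return the argv list a stack's dependency-audit step would run."""
    try:
        return shlex.split(AUDIT_COMMANDS[stack])
    except KeyError:
        raise ValueError(f"no dependency-audit command known for stack {stack!r}") from None


for stack in AUDIT_COMMANDS:
    print(stack, audit_argv(stack))
```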
Authoring a new skill
The minimum viable skill is one frontmatter block + one numbered step. Drop it under .github/skills/<name>/SKILL.md and it's available as /<name> in the next chat session. Example:
---
name: deploy-canary
description: "Deploy current branch to canary environment and watch metrics for 10 minutes. USE FOR: gradual rollout. DO NOT USE FOR: hotfixes (use /staging-deploy)."
argument-hint: "[optional: minutes to watch, default 10]"
tools: [run_in_terminal, read_file]
---
# Deploy Canary Skill
## Steps
### 1. Build & Push
```bash
# Tag with the full registry path so the push below can find the image
docker build -t myregistry/myapp:canary .
docker push myregistry/myapp:canary
```
### 2. Apply
```bash
kubectl set image deployment/myapp myapp=myregistry/myapp:canary -n canary
kubectl rollout status deployment/myapp -n canary --timeout=2m
```
### Conditional: Rollout Failure
> If rollout fails → immediately `kubectl rollout undo`, report the error, and STOP. Do not proceed to watch.
### 3. Watch
```bash
# MINUTES is in minutes (default 10); sleep expects seconds
sleep $(( ${MINUTES:-10} * 60 ))
kubectl logs -l app=myapp -n canary --tail=200
```
## Safety Rules
- NEVER deploy from a dirty working tree
- ALWAYS rollback within 60s if 5xx rate exceeds 1%
Authoring guidance:
- Step granularity: one step = one logical unit the dashboard can show pass/fail for. Keep step count under 8; longer skills should split into sub-skills called from a coordinator.
- Conditional placement: use blockquote conditionals between steps for early-exit paths. They render distinctly in the dashboard timeline.
- Tool allow-list: list only the tools the skill actually needs. The runtime warns on unused entries and refuses unlisted tool calls.
- Idempotency: assume the skill may be re-run after a partial failure. Use `IF NOT EXISTS`, check-then-act patterns, and explicit cleanup steps.
- Persistent memory: when the skill learns something cross-cutting (a new failure mode, an env-specific quirk), append it under `## Persistent Memory`. The capture is routed to L2 and, if configured, L3 OpenBrain.
Every skill under presets/<stack>/.github/skills/ is a worked example. The richest are database-migration (5-step DAG with conditional rollback) and, from the shared set, audit-loop (recursive convergence loop).
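The tool allow-list guidance (refuse unlisted calls, warn on unused entries) can be pictured with a small gate object. `ToolGate` is a hypothetical class written for this illustration; it does not reflect how the runtime is actually built.

```python
class ToolGate:
    """Track tool calls against a skill's declared allow-list."""

    def __init__(self, allowed):
        self.allowed = set(allowed)
        self.used = set()

    def call(self, tool):
        # Unlisted tools are refused outright.
        if tool not in self.allowed:
            raise PermissionError(f"tool {tool!r} is not in the skill's allow-list")
        self.used.add(tool)

    def unused(self):
        # Entries that were declared but never called; candidates for a warning.
        return sorted(self.allowed - self.used)


gate = ToolGate(["run_in_terminal", "read_file"])
gate.call("run_in_terminal")
try:
    gate.call("write_file")
except PermissionError as error:
    print("refused:", error)
print("unused entries:", gate.unused())
```

Keeping the list tight means the unused-entry warning stays meaningful: an entry that is never exercised is usually a leftover from an earlier draft of the skill.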
Lifecycle Hooks
Hooks run automatically during agent sessions; no manual activation is needed:
| Hook | When | What It Enforces |
|---|---|---|
| SessionStart | Session begins | Injects Project Principles, current phase, forbidden patterns |
| PreToolUse | Before file edit | Blocks edits to paths listed in plan's Forbidden Actions |
| PostToolUse | After file edit | Auto-formats, warns on TODO/FIXME/stub markers |
| Stop | Session ends | Warns if code modified but no test run detected |
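As a rough picture of what the PreToolUse and PostToolUse rows describe, the sketch below checks a path against forbidden patterns and scans edited text for markers. The function names, the return shapes, and the use of shell-style glob patterns are assumptions; how Forbidden Actions are really expressed in a plan is not specified here, and only TODO and FIXME are scanned because the exact stub markers are not listed.

```python
import re
from fnmatch import fnmatch

MARKERS = re.compile(r"\b(TODO|FIXME)\b")


def pre_tool_use(path, forbidden_patterns):
    """Decide whether an edit to `path` may proceed (glob matching is assumed)."""
    for pattern in forbidden_patterns:
        if fnmatch(path, pattern):
            return {"allow": False, "reason": f"{path} matches forbidden pattern {pattern}"}
    return {"allow": True, "reason": ""}


def post_tool_use_warnings(text):
    """Return (line_number, line) pairs that contain a TODO or FIXME marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if MARKERS.search(line)
    ]


forbidden = ["src/legacy/*", "*.lock"]
print(pre_tool_use("src/legacy/billing.py", forbidden))
print(pre_tool_use("src/api/users.py", forbidden))
print(post_tool_use_warnings("x = 1\n# TODO: handle errors\n"))
```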
📄 Full reference: capabilities, Multi-Agent Setup — GitHub Copilot