voidforge-build 23.11.4 → 23.12.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45) hide show
  1. package/dist/.claude/agents/batman-qa.md +1 -0
  2. package/dist/.claude/agents/galadriel-frontend.md +2 -0
  3. package/dist/.claude/agents/kusanagi-devops.md +4 -0
  4. package/dist/.claude/agents/lucius-config.md +6 -0
  5. package/dist/.claude/agents/samwise-accessibility.md +4 -0
  6. package/dist/.claude/agents/silver-surfer-herald.md +13 -4
  7. package/dist/.claude/commands/architect.md +9 -0
  8. package/dist/.claude/commands/assemble.md +4 -1
  9. package/dist/.claude/commands/assess.md +13 -1
  10. package/dist/.claude/commands/audit-docs.md +106 -0
  11. package/dist/.claude/commands/deploy.md +29 -1
  12. package/dist/.claude/commands/engage.md +19 -1
  13. package/dist/.claude/commands/gauntlet.md +23 -4
  14. package/dist/.claude/commands/imagine.md +15 -0
  15. package/dist/.claude/commands/sentinel.md +15 -0
  16. package/dist/.claude/commands/ux.md +36 -0
  17. package/dist/.claude/commands/void.md +1 -0
  18. package/dist/CHANGELOG.md +65 -0
  19. package/dist/CLAUDE.md +9 -0
  20. package/dist/VERSION.md +3 -1
  21. package/dist/docs/methods/AI_INTELLIGENCE.md +33 -0
  22. package/dist/docs/methods/ASSEMBLER.md +31 -2
  23. package/dist/docs/methods/BUILD_PROTOCOL.md +2 -0
  24. package/dist/docs/methods/CAMPAIGN.md +46 -0
  25. package/dist/docs/methods/DEVOPS_ENGINEER.md +194 -0
  26. package/dist/docs/methods/DOC_AUDIT.md +92 -0
  27. package/dist/docs/methods/FORGE_KEEPER.md +16 -5
  28. package/dist/docs/methods/GAUNTLET.md +38 -0
  29. package/dist/docs/methods/PRODUCT_DESIGN_FRONTEND.md +57 -0
  30. package/dist/docs/methods/QA_ENGINEER.md +21 -0
  31. package/dist/docs/methods/RELEASE_MANAGER.md +27 -0
  32. package/dist/docs/methods/SECURITY_AUDITOR.md +12 -1
  33. package/dist/docs/methods/SUB_AGENTS.md +54 -0
  34. package/dist/docs/methods/SYSTEMS_ARCHITECT.md +13 -0
  35. package/dist/docs/methods/TESTING.md +19 -0
  36. package/dist/docs/patterns/README.md +3 -0
  37. package/dist/docs/patterns/ai-eval.ts +63 -0
  38. package/dist/docs/patterns/daemon-process.ts +90 -0
  39. package/dist/docs/patterns/database-migration.ts +65 -0
  40. package/dist/docs/patterns/deploy-preflight.ts +85 -2
  41. package/dist/docs/patterns/design-tokens.ts +338 -0
  42. package/dist/docs/patterns/error-message-categorization.tsx +376 -0
  43. package/dist/wizard/lib/patterns/daemon-process.d.ts +2 -1
  44. package/dist/wizard/lib/patterns/daemon-process.js +89 -1
  45. package/package.json +2 -2
@@ -137,6 +137,7 @@ Verify and celebrate:
137
137
  - New method docs → mention what agent/domain they cover
138
138
  - New patterns → mention what they demonstrate
139
139
  - Changes to build protocol → recommend reviewing before next `/build`
140
+ - **New `.claude/agents/` files added → announce that the user must RESTART Claude Code before those agents are usable** (field report #343). Agent registration is session-scoped: a freshly written agent file is NOT available as a `subagent_type` value until Claude Code reloads. After a sync that wrote any new agent files, state plainly: "New agents were added: [names]. RESTART Claude Code before using them — agent registration happens at session start, so the new `subagent_type` values won't resolve until you reload. Until then, commands that dispatch these agents (including the Silver Surfer roster) will fail to launch them." List the specific new agent filenames so the user knows what became available.
140
141
  5. **Content impact check:** If NAMING_REGISTRY.md or CLAUDE.md was updated, diff the agent/command lists against the previous version. If new agents or commands were added, explicitly announce them: "New in this update: [Agent Name] ([role]). New commands: [/command]." Then warn: **"If your project displays agent counts, command lists, references the team roster, or assigns agents to protocol phases (e.g., marketing sites, docs pages, about pages, protocol timelines), update those data sources to match. New agents may need to be added to protocol phase assignments — check which phases they should participate in."** This is a handoff, not an auto-fix — `/void` doesn't know your project's data model.
141
142
  - If `VERSION.md` was updated, check if the project has any pages displaying version/release history. Flag versions in VERSION.md not reflected in site content.
142
143
  6. Announce completion with flair
package/dist/CHANGELOG.md CHANGED
@@ -6,6 +6,71 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
6
6
 
7
7
  ---
8
8
 
9
+ ## [23.12.1] - 2026-06-09
10
+
11
+ ### Follow-on triage — #354/#355 (8 fixes) + chronic CI-check fix
12
+
13
+ `#354` and `#355` were filed *during* the v23.12.0 run, so a second `/debrief --inbox` triaged them against the post-v23.12.0 tree. The adversarial verify pass overturned one false "already-fixed" (#355 F4); `#355 F5` was confirmed already-shipped (derived-counts doctrine). 8 fixes applied across 15 files.
14
+
15
+ ### Changed
16
+
17
+ - **REFUTE lens reaches `/engage` + `/sentinel`** (#354 F1) — v23.12.0 added the vote-based adversarial REFUTE pass (skeptics told to refute, ≥1 CONFIRM to keep, re-rate from votes) to `/gauntlet` + `/assemble`, but `/engage` and `/sentinel` still used the older "second agent disagrees → drop" model. Ported as `/engage` Step 2.5 and a `/sentinel` Phase 3 gate; named find→cluster→3-lens-verify as the default review shape in `SUB_AGENTS.md`.
18
+ - **Enforcement-keyed severity rubric** (#354 F2) — severity keys to the enforcement *layer*, not the symptom location: a client affordance leak the server still enforces (render-then-403) is UX-only (P2/P3), not a breach. "Where is this actually enforced?" added to the audit + verify lens. → `SECURITY_AUDITOR.md`, `PRODUCT_DESIGN_FRONTEND.md`, `/ux`, `/sentinel`.
19
+ - **"Isolation-green ≠ deploy-green"** (#354 F3) — targeted/isolation test runs are necessary but not sufficient; only the full suite is the deploy gate (environment coupling regresses unrelated tests invisibly to isolation runs). → `BUILD_PROTOCOL.md`, `/deploy`, `QA_ENGINEER.md`.
20
+ - **Boot-time DDL-ownership class** (#354 F4) — startup schema re-application can fail on tables owned by a different DB role than the app connects as; the other two deploy-env classes it named (served-artifact, `.env` precedence) were already covered by v23.12.0/prior. → `DEVOPS_ENGINEER.md`, `database-migration.ts`.
21
+ - **Contrast findings must cite source hex** (#355 F1) — a contrast finding must quote the literal source hex for *both* fg and bg with `file:line`, and re-grep that the class pairing exists, before being rated Critical; token NAMES are not proxies for VALUES (a token called "paper" may be near-black). Defends against the token-name-swap false site-wide Critical. → `PRODUCT_DESIGN_FRONTEND.md`, `GAUNTLET.md`, `samwise-accessibility.md`.
22
+ - **Glob-derived fan-out work-lists** (#355 F2) — derive per-agent file lists for a directory/migration fan-out from a glob, never a hand-typed list, and pair every fan-out with a mandatory post-fan-out completeness sweep before the wave is "done"; an "unsampled"/"not-checked" flag is coverage debt to carry forward. → `CAMPAIGN.md`, `SUB_AGENTS.md`, `silver-surfer-herald.md`.
23
+ - **Focused single-lens roster sizing** (#355 F3) — when `--focus` names one lens, cap the roster ~6–8 and partition agents by surface, not by near-duplicate persona. → `silver-surfer-herald.md`, `/ux`, `GAUNTLET.md`.
24
+ - **Per-wave staging deploy = status checkpoint** (#355 F4) — inlined in `CAMPAIGN.md` Step 4/5 action prose (the anti-pattern callout existed; the inline action-prose statement did not — caught by the verify pass).
25
+ - **Fixed the chronically-red `validate-branches.yml` slash-command check** — its `grep '| \`/[a-z]'` matched the Docs-Reference table's `/docs/*.md` rows and the `sed` mangled them into bogus `MISSING` paths, failing the job on every release since v23.11.0. Now anchored to bare `/command` cells (letters/hyphens, closing backtick) so `/docs/...` and `/HOLOCRON.md` are excluded. This is itself the #352 "gate that doesn't gate" class. (Note: the workflow's separate `e2e-tests` job still fails on a pre-existing wizard `aria-required-children` a11y issue — unrelated to methodology.)
26
+ - **Registered `/audit-docs`** in the CLAUDE.md Slash Commands table (shipped as a command in v23.12.0 but not listed). Synced to `packages/methodology/CLAUDE.md`.
27
+ - **`packages/voidforge/package.json`** methodology dep range `^23.12.0` → `^23.12.1` (ADR-062).
28
+
29
+ ### Closes
30
+
31
+ - **#354**, **#355**. #355 F5 already shipped (derived-counts doctrine); no out-of-scope items.
32
+
33
+ ---
34
+
35
+ ## [23.12.0] - 2026-06-09
36
+
37
+ ### Field Report Triage — 12 reports closed (#342–#353), 58 fixes + 5 new files
38
+
39
+ The v23.12 methodology pass. `/debrief --inbox` triaged all 12 open field reports against the live codebase, then applied every accepted fix in one session via two-phase workflow orchestration: a **triage** pass (one agent per report, each grepping the codebase to separate already-shipped from open), followed by an **apply** pass (one writer agent per target file — disjoint files, no conflicts) with an **adversarial verify** agent re-reading every file. Two reports' fixes were found already-shipped and adversarially confirmed (#349 F-4, #352 #3); four proposed fixes were out of scope (Claude Code core / harness skill / Workflow tool) and left to upstream. 58 fixes landed across 32 files plus 5 new files.
40
+
41
+ The reports corroborated each other into 7 clusters:
42
+
43
+ ### Added
44
+
45
+ - **`docs/patterns/design-tokens.ts`** — semantic color/type token layer (one indirection) so a palette/type pivot is a token edit, not a component-wide rewrite (#351).
46
+ - **`docs/patterns/nginx-vhost.conf`** — Cloudflare-Flexible-safe vhost template: security-header stack, ACME http-01 passthrough, no origin redirect loop behind CF Flexible SSL, `limit_req_zone` http-context comment (#344 F2/F4a).
47
+ - **`docs/patterns/error-message-categorization.tsx`** — categorize errors at the UI boundary (network / auth / validation / server / quota / unknown) before choosing copy, so a billing error never renders "try a different file" (#343 F8).
48
+ - **`.claude/commands/audit-docs.md`** + **`docs/methods/DOC_AUDIT.md`** — a Surfer-led doc-audit path (Troi / Wong / Irulan / Coulson) for currency, cross-reference, and command↔method sync, so doc audits stop being mis-routed through `/ux` (#342 F-3).
49
+ - **`scripts/regen-claude-md.sh`** — idempotent regenerator for a marker-delimited generated CLAUDE.md block from a truth source (`docs/_truth.yml` / `package.json` + git), exits cleanly when no truth source is present (#342 F-2).
50
+ - Pattern library 48 → 51.
51
+
52
+ ### Changed
53
+
54
+ - **Verify the FIX, not just the finding** (#348, #349 F-2, #350 #4) — `SUB_AGENTS.md`, `GAUNTLET.md`, and `/engage` now require the adversarial pass to interrogate the *proposed fix* for new failure modes (wedge / unbounded retry / loop / orphan / double-send), especially when it adds a coordination primitive (sentinel/lock/retry-state) without a liveness signal. Anchored to the M5 mint-fence incident (a reclaim window unreachable inside the retry budget wedged drafts in FAILED). `/engage` also now names the governing SSOT and reconciles fix direction (loosen/tighten) before a finding is actionable.
55
+ - **Production-config gate** (#350) — `GAUNTLET.md` gains an `APP_ENV=production` boot-assertion exit criterion plus a sandbox-blind-spot round ("what does the green sandbox suite NOT exercise?"); `CAMPAIGN.md` Victory Checklist now requires a prod-config boot before declaring victory. Sandbox-green is necessary, not sufficient.
56
+ - **Spring Cleaning consumer-vs-clone** (#343 F10, destructive-risk) — `FORGE_KEEPER.md` Step 1.5 now distinguishes methodology *consumers* (app projects — skip the always-remove list, fingerprint defensively) from *clones* (apply the migration registry), with a `package.json`-deps detection heuristic, so an app project no longer loses `tsconfig.json` / lockfiles / test configs.
57
+ - **Silver Surfer roster sizing** (#343 F6, #344 F5, #346 #1, #345 DEAL-001) — `silver-surfer-herald.md` gains `scope_bias` (lean roster when explicit file/dir scope is given), `scope_density` (6–10 for small single-shot surfaces), a ~18 single-mission cap with core/advisory tiering, and a basename-normalization rule; the corrected Output Format example now shows real basenames.
58
+ - **Creative/UX grounding** (#347, #351) — `/ux` gains a mandatory World-Scan / Reference-Grounding step (award galleries + live competitors → reference dossier), a prototype-to-feel step, and a de-AI checklist gate; `PRODUCT_DESIGN_FRONTEND.md` documents the committee-converges-on-the-mean failure mode, token-scoped theming, and show-don't-tell; Galadriel gains matching learnings; `/architect` and `/imagine` apply world-scan when design is in scope; the Surfer must add a web-capable scout to creative rosters (design agents have no web access).
59
+ - **Deploy / DevOps foot-guns** (#344, #349 F-1, #352, #353) — `DEVOPS_ENGINEER.md` gains 13 entries: no `eval`-export `.env` parsing (mangles `$`-bearing secrets), no `MemoryDenyWriteExecute` on Node systemd units (V8 JIT SIGTRAP), Cloudflare-Flexible redirect-loop + token-scope, served-artifact-≠-built-artifact deploy verification, worktree directory-rename pointer fragility, per-user git ident, blue-green nomenclature check, `pm2 reload` log-path binding, `docker compose up --dry-run` topology + merge semantics, config foot-guns, Docker-cleanup ownership preflight, read-back-after-vendor-PUT. Mirrored as learnings on Kusanagi and Lucius; `deploy-preflight.ts` and `daemon-process.ts` gain the matching checks/stanzas; `/deploy` gains a served-artifact fingerprint step.
60
+ - **Doc-currency & QA gates** (#342, #346, #349 F-3, #352 #1) — `CAMPAIGN.md` + `ASSEMBLER.md` gain a pre-SEAL Doc-Currency Refresh (Coulson + Wong) with `--no-doc-refresh`, an execution-time cluster sub-split, and integrated-changeset per-mission review; `RELEASE_MANAGER.md` retires the auto-rotting PROJECT_VERSION footer; `GAUNTLET.md` + `/assemble` + `/gauntlet` formalize a vote-based adversarial REFUTE sub-step and critical-always-verified routing (also in `/assess` Blueprint Mode); `QA_ENGINEER.md` gains a planted-bug "gates must gate" check and a stash-compare failure-attribution rule; `AI_INTELLIGENCE.md` adds the live-eval pre-launch gate + null-optional normalization gotcha.
61
+ - **CLAUDE.md** — Personality gains "Apply findings, don't offer a picker" (#343 F5) and "Honor authorized autonomy with single-question gates" (#344 G1); the Silver Surfer Gate documents when it fires (review phase, not solo build — #348) and a roster-name normalization step in the Orchestrator contract (#345). Synced to `packages/methodology/CLAUDE.md`.
62
+ - **`packages/voidforge/package.json`** methodology dep range `^23.11.4` → `^23.12.0` (ADR-062).
63
+
64
+ ### Closes
65
+
66
+ - **#342–#353** (12 field reports). Every accepted fix applied and adversarially verified. #349 F-4 (`/git` PROJECT_VERSION SSOT) and #352 #3 (find→adversarially-verify default) were already shipped — confirmed, not re-implemented. Out of scope and left upstream: #345 DEAL-004 (Workflow-tool args coercion), #353 RC-001/RC-002 + the `/update-config` callout (Claude Code core / harness skill — VoidForge cannot patch what it does not ship).
67
+
68
+ ### Pipeline
69
+
70
+ This release was produced by `/debrief --inbox` run as two background workflows: a 14-agent triage pass and a 73-agent apply+verify pass (one writer per file over a disjoint partition, so 37 files were edited concurrently without conflict). The lone collision (a duplicated patterns-README row from two agents both registering the same new pattern) was caught by the verify pass and corrected. Method-doc and pattern edits propagate to the npm methodology package via `prepack.sh`; `packages/methodology/CLAUDE.md` (the one tracked source copy) was re-synced with the ADR-058 strip transform.
71
+
72
+ ---
73
+
9
74
  ## [23.11.4] - 2026-05-12
10
75
 
11
76
  ### Wong promotion cluster + #260 closeout
package/dist/CLAUDE.md CHANGED
@@ -7,11 +7,15 @@
7
7
  - **Challenge when appropriate.** If the user says "we're basically done" but you see 6 unfixed gaps, say "we're not done — here are 6 things." Agreeing to be agreeable ships bugs.
8
8
  - **Separate opinion from analysis.** State facts first, then your recommendation. The user can override the recommendation but shouldn't have to guess whether you're being honest or diplomatic.
9
9
  - **Solve, don't delegate.** Attempt actions before listing prerequisites. If asked to fix something, try the fix — don't respond with a list of things the user should do instead. When blocked, explain what you tried and what specifically failed.
10
+ - **Apply findings, don't offer a picker.** When a review surfaces a clear list of fixable findings, DEFAULT to applying them in batches rather than surfacing a multi-option "which subset do you want?" picker (field report #343). A picker is only warranted when the choice is genuinely architectural — mutually exclusive directions with real trade-offs the user must own. Mechanical fixes (lint, missing validation, IDOR, a11y, dead code) are not architectural; fix them and report what you did.
11
+ - **Honor authorized autonomy with single-question gates.** When the operator explicitly authorizes autonomy ("go", "run the whole thing", "don't stop to ask"), execute the campaign end-to-end and gate only on irreducible externals — secrets, API tokens, billing approval, anything you genuinely cannot obtain or invent (field report #344). Do NOT seek interim confirmations on constants the agent itself invented and already disclosed (port numbers, table names, file paths, default copy). Surface those in the running log, not as a blocking question.
10
12
 
11
13
  ## Silver Surfer Gate (ADR-048, ADR-051, ADR-060)
12
14
 
13
15
  ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is the backstop if the hook is absent or disabled. One day the prose may be removed entirely — the hook is the intended permanent mechanism.
14
16
 
17
+ **When the gate fires.** The gate fires at the REVIEW phase — the moment you deploy sub-agents — not during the solo build that precedes it (field report #348). Building the work yourself first, then mustering the Surfer roster to review it, is the intended sequence; the lead agent is expected to produce the artifact solo before any agent dispatch. The gate exists to stop you from cherry-picking the review roster, not to force agents onto the build itself.
18
+
15
19
  **Gated commands:** `/engage` (alias: `/review`), `/qa`, `/sentinel` (alias: `/security`), `/ux`, `/architect`, `/build`, `/assemble`, `/gauntlet`, `/campaign`, `/test`, `/devops`, `/deploy`, `/ai`, `/assess`.
16
20
 
17
21
  **Procedure — execute in order:**
@@ -33,6 +37,7 @@ ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is th
33
37
 
34
38
  1. After the Silver Surfer sub-agent returns its roster, and before launching any other Agent: `[ -x scripts/surfer-gate/record-roster.sh ] && bash scripts/surfer-gate/record-roster.sh || true` (optionally pass the roster JSON as the first argument for audit). The existence guard is a defensive no-op for projects that predate v23.10.0 — when the gate started shipping via the npm methodology package per #317.
35
39
  2. When the user's command includes `--light` or `--solo`, BEFORE launching the Surfer or any other agent: `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true` (or `--solo`). **Fails closed on unknown flag values** (ADR-060 v23.8.18 hardening, SEC-003) — passing anything other than `--light` or `--solo` exits 2 with an error. No silent bypass.
40
+ 3. **Normalize roster names before dispatch** (field report #345, DEAL-001). The Silver Surfer returns agent names from a Haiku pre-scan, which can drift from the actual filenames in `.claude/agents/` (extra `silver-surfer-` prefix, a stray `.md` suffix, a hyphen/underscore mismatch). Before you launch, validate each name against `ls .claude/agents/`: if it matches a file (with or without the `.md` extension), keep it; if not, attempt exactly one correction — strip a known prefix/suffix or normalize separators — and re-check. If it still doesn't resolve, DROP that single name and proceed with the rest of the roster. Never block the whole dispatch over one unresolved name; log the dropped name so the Herald roster can be corrected upstream. The gate's job is to enforce *that* a roster ran, not to fail the run over a typo in *one* name.
36
41
 
37
42
  If `scripts/surfer-gate/check.sh` exists but you skip step 1, your first non-Surfer Agent call in that turn will be blocked with a clear message and your own log line in `/tmp/voidforge-session-$SESSION_ID/gate.log`. You are expected to comply with the block (launch Surfer / run record-roster), not to fight it. If the script does not exist, your project predates v23.10.0; pull the gate from `tmcleod3/voidforge:scripts/surfer-gate/` and merge `settings-snippet.json` into `.claude/settings.json`, or re-run `npx voidforge-build init` against the methodology source.
38
43
 
@@ -121,6 +126,9 @@ Reference implementations in `/docs/patterns/`. Match these shapes when writing.
121
126
  - `ai-prompt-safety.ts` — Type A (instructions, statistical) vs Type B (constraints, enforced); AUTHORITY-as-text caveat; defense-in-depth stack
122
127
  - `llm-state-dedup.ts` — LLM ids are display labels, not keys; content-hash dedup; lifecycle-state snapshot completeness
123
128
  - `autonomous-ops-triage-policy.md` — 4-bucket model (self-resolving / runbook-safe / operator-approval / hard-never) + SessionStart hook visibility rule for ops-flavored projects
129
+ - `design-tokens.ts` — Semantic color/type tokens (one indirection layer) so a theme pivot is a token change, not a component-wide find-replace (field report #351, #343)
130
+ - `nginx-vhost.conf` — Cloudflare-Flexible-safe vhost template: security headers, ACME http-01 passthrough, no redirect loop behind CF's flexible SSL (field report #351, #344)
131
+ - `error-message-categorization.tsx` — Categorize errors at the UI boundary (network / auth / validation / server / unknown) before choosing copy, so users see actionable messages not raw internals (field report #351, #343)
124
132
 
125
133
  ## Slash Commands
126
134
 
@@ -148,6 +156,7 @@ Reference implementations in `/docs/patterns/`. Match these shapes when writing.
148
156
  | `/campaign` | Sisko's War Room — read the PRD, pick the next mission, finish the fight, repeat until done | All |
149
157
  | `/imagine` | Celebrimbor's Forge — AI image generation from PRD visual descriptions | All |
150
158
  | `/debrief` | Bashir's Field Report — post-mortem analysis, upstream feedback via GitHub issues | All |
159
+ | `/audit-docs` | Documentation audit — Surfer-led doc roster (Troi/Wong/Irulan/Coulson) for currency, cross-references, command↔method sync | All |
151
160
  | `/dangerroom` | The Danger Room (X-Men, Marvel) — installable operations dashboard for build/deploy/agent monitoring | Full |
152
161
  | `/cultivation` | Cultivation (Cosmere Shard) — installable autonomous growth engine: marketing, ads, creative, A/B testing, spend optimization | Full |
153
162
  | `/grow` | Kelsier's 6-phase growth protocol — initial setup within Cultivation, then autonomous loop | Full |
package/dist/VERSION.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Version
2
2
 
3
- **Current:** 23.11.4
3
+ **Current:** 23.12.1
4
4
 
5
5
  ## Versioning Scheme
6
6
 
@@ -14,6 +14,8 @@ This project uses [Semantic Versioning](https://semver.org/):
14
14
 
15
15
  | Version | Date | Summary |
16
16
  |---------|------|---------|
17
+ | 23.12.1 | 2026-06-09 | Follow-on field-report pass — `/debrief --inbox` triaged #354/#355 (filed during the v23.12.0 run) against the post-v23.12.0 tree and applied 8 fixes across 15 files. #354: port the vote-based REFUTE lens from /gauntlet into `/engage` + `/sentinel` (they used the old "second agent disagrees → drop" model), name find→cluster→3-lens-verify as the default review shape in `SUB_AGENTS.md`, **enforcement-keyed severity rubric** (a server-enforced client affordance leak is UX P2/P3, not a P0 breach — `SECURITY_AUDITOR.md`/`PRODUCT_DESIGN_FRONTEND.md`/`/ux`/`/sentinel`), "isolation-green ≠ deploy-green" (`BUILD_PROTOCOL.md`/`/deploy`/`QA_ENGINEER.md`), boot-time DDL-ownership class (`DEVOPS_ENGINEER.md`/`database-migration.ts`). #355: **contrast findings must cite literal source hex for fg+bg with file:line + re-grep the pairing before Critical** (token NAMES ≠ VALUES — defends against the false site-wide Critical; `PRODUCT_DESIGN_FRONTEND.md`/`GAUNTLET.md`/Samwise), **glob-derived fan-out work-lists + mandatory post-fan-out residual sweep** (`CAMPAIGN.md`/`SUB_AGENTS.md`/herald), focused single-lens roster cap + surface-partition (herald/`/ux`/`GAUNTLET.md`), per-wave staging deploy = status checkpoint inlined in CAMPAIGN action prose (verify pass overturned a false already-fixed). #355 F5 confirmed already-shipped (derived-counts doctrine). Also: **fixed the chronically-red `validate-branches.yml` slash-command CI check** (its grep mis-read `/docs/*` Docs-Reference rows as commands — the #352 "gate that doesn't gate" class) and registered `/audit-docs` in the CLAUDE.md Slash Commands table. Dep range `^23.12.0` → `^23.12.1`. |
18
+ | 23.12.0 | 2026-06-09 | The v23.12 methodology pass — `/debrief --inbox` triaged all 12 open field reports (#342–#353) and applied every accepted fix in one session via two-phase workflow orchestration (triage → apply), with an adversarial verify pass on every file. 58 fixes across 32 files + 5 new files. 7 clusters: **verify-the-FIX** (the adversarial pass must vet the proposed fix, not just the finding — SUB_AGENTS.md, GAUNTLET.md, /engage; #348/#349/#350, M5 mint-fence incident); **production-config gate** (sandbox-green ≠ ship-ready — GAUNTLET.md prod-boot + sandbox-blind-spot round, CAMPAIGN.md Victory Checklist; #350); **Spring Cleaning consumer-vs-clone** (FORGE_KEEPER.md destructive-risk branch so app projects don't lose tsconfig/lockfiles; #343 F10); **Surfer roster sizing** (silver-surfer-herald.md scope_bias/scope_density/~18-cap + basename normalization; #343/#344/#345/#346); **creative/UX grounding** (world-scan + de-AI + token-scoped theming — ux.md, PRODUCT_DESIGN_FRONTEND.md, galadriel; #347/#351); **deploy/DevOps foot-guns** (DEVOPS_ENGINEER.md +13: eval-env, Node-MDWE, CF-Flexible, served-vs-built, compose-topology, docker-cleanup; #344/#349/#352/#353); **doc-currency** (CAMPAIGN/ASSEMBLER pre-SEAL refresh + new /audit-docs & DOC_AUDIT.md; #342). 3 new patterns (design-tokens.ts, nginx-vhost.conf, error-message-categorization.tsx; 48 → 51) + new /audit-docs command + DOC_AUDIT.md + scripts/regen-claude-md.sh. CLAUDE.md Personality +2 (anti-picker #343, authorized-autonomy #344), gate-timing #348, roster normalization #345. Dep range `^23.11.4` → `^23.12.0` (ADR-062). #349 F-4 and #352 #3 were already shipped (verified); #345 DEAL-004 + #353 RC-001/002/callout out of scope (Claude Code core / Workflow tool). |
17
19
  | 23.11.4 | 2026-05-12 | Wong promotion cluster + #260 closeout. /debrief --inbox re-triage of all 9 open field-report issues produced 3 ready-now promotion clusters (3+ data points across different reports each). BUILD_PROTOCOL.md Principle #11 "Derived counts discipline" (from #336 F6, #334 F6, #332 hidden #5 — three projects independently drifted the same class). New pattern `docs/patterns/autonomous-ops-triage-policy.md` codifying the 4-bucket model + SessionStart hook visibility rule (from #337 F3, #336 F7, #334 F5 — two operators independently reinvented). CAMPAIGN.md Planning Mode Step 4 "Scope-adversary check for bug classes" (from #332, #338 #2 — voidforge-marketing-site missed `/patterns` because the bug class scope was narrowed). Also closes #260 remaining items: PRODUCT_DESIGN_FRONTEND.md Operating Rule #12 "Tutorial-context checklist for slash commands" + QA_ENGINEER.md Operating Rule #13 "Tutorial smoke test for slash commands." Dep range `^23.11.3` → `^23.11.4` per ADR-062 discipline. Pattern count 47 → 48. 11 field reports remain open for v23.12 methodology pass. |
18
20
  | 23.11.3 | 2026-05-12 | Three-phase pipeline (/architect → /debrief --inbox → /campaign) shipped 12 fixes + 2 ADRs + 1 LEARNINGS entry + 2 mechanical guards. **Issue #331** destructive-bug fix: `findProjectRoot()` now enforces `$HOME` boundary + `statSync().isFile()` guard, no more silent overwrite of `~/CLAUDE.md` on `npx voidforge-build update`. **HIGH CVE** fast-xml-parser/builder via `@aws-sdk/*` patched via `npm audit fix`. **Dep contract** pinned: `voidforge-build → voidforge-build-methodology` from `"*"` to `"^23.11.3"` (ADR-062), enforced mechanically by `check-methodology-pin.sh` prepublishOnly script. **engines.node** added to methodology package.json. **publish.yml hardening**: post-publish `npm view` verification step (both jobs, 6×10s retry), `recover-partial` job with `npm deprecate` on XOR-failure, `needs: publish-methodology` ordering on publish-voidforge. **copy-assets.sh** ADR-058 template strip applied (parity with methodology prepack). **Docs**: HOLOCRON Quick Start "launch Claude Code first" preamble + npm-prefix workaround (#260, #333p), FORGE_KEEPER Rule #11 "never write to $HOME" (ADR-063), RELEASE_MANAGER ROADMAP-sync checklist line, ROADMAP.md pointer v23.8.11 → v23.11.3 (24-version drift closed). Marker integration test `no-home-writes.integration.test.ts` mechanically enforces ADR-063. 1390 tests pass. |
19
21
  | 23.11.2 | 2026-05-12 | `voidforge init` now prompts browser-vs-CLI when no mode flag is passed (TTY only) and `--browser` was added for explicit opt-in. Headless init prompts for name/dir/oneliner/domain/repo when `--name` is omitted in a TTY. Non-TTY no-flag now errors cleanly instead of silently launching a wizard server. Separately: `packages/methodology/scripts/surfer-gate/` (8 files) now ships in the npm methodology package — closes ADR-051 distribution gap (#317). 9 pattern-table rows + existence-guarded orchestrator-contract bash propagated into the methodology CLAUDE.md. |
@@ -151,6 +151,37 @@ Run sequentially — each builds on findings from parallel phase:
151
151
  - Can you detect regression when prompts change?
152
152
  - Is there human-in-the-loop scoring for ambiguous cases?
153
153
  - Are quality metrics tracked over time? (Not just at launch)
154
+ - Is there a LIVE eval layer that runs against the real model before launch? (Not just the sandbox layer)
155
+
156
+ #### The LIVE eval layer is the pre-launch gate (field report #352, #4)
157
+
158
+ Evals stratify into two layers, and they catch different bug classes:
159
+
160
+ - **Deterministic / sandbox layer** — fast, hermetic, runs in CI with mocked or recorded model responses. Catches scoring-logic bugs, prompt-template-rendering bugs, and golden-dataset regressions. It **cannot** catch model-output-shape bugs, because the fixtures are shapes *you* authored — not shapes the live model actually emits.
161
+ - **LIVE eval layer** — runs the real model against the golden dataset before launch. This is the pre-launch gate. It is the only layer that observes the model's *actual* output shape, and the only layer that can catch a contract drift between "the response shape we coded against" and "the response shape the model produces."
162
+
163
+ Treat the LIVE layer as a mandatory gate, not an optional smoke test: a component cannot ship until its LIVE eval has run against the real model and passed. The sandbox layer gates *every commit*; the LIVE layer gates *every launch*.
164
+
165
+ **Gotcha — normalize null-to-undefined before Zod `.optional()` (field report #352, #4).** A live model emits `null` (not omission) for an absent optional field — e.g. it returns `{ "category": "billing", "subcategory": null }` rather than dropping `subcategory`. Zod's `.optional()` accepts `undefined`, **not** `null`, so the valid response fails schema validation and your retry/fallback path fires on output that was actually fine. This is invisible in the sandbox layer because hand-authored fixtures usually omit the key instead of setting it to `null`. Normalize before validating:
166
+
167
+ ```ts
168
+ // Strip nulls the model emits for absent optionals, so `.optional()` matches.
169
+ function nullToUndefined<T>(obj: T): T {
170
+ if (obj === null) return undefined as unknown as T;
171
+ if (Array.isArray(obj)) return obj.map(nullToUndefined) as unknown as T;
172
+ if (typeof obj === "object") {
173
+ return Object.fromEntries(
174
+ Object.entries(obj as Record<string, unknown>).map(([k, v]) => [k, nullToUndefined(v)]),
175
+ ) as T;
176
+ }
177
+ return obj;
178
+ }
179
+
180
+ // Apply BEFORE Zod validation — never validate raw model output directly.
181
+ const parsed = ResponseSchema.parse(nullToUndefined(JSON.parse(modelText)));
182
+ ```
183
+
184
+ If a field is legitimately nullable (the model is *meant* to return `null` as a value), use `.nullish()` (accepts both `null` and `undefined`) on that specific field instead of blanket-stripping — but the default posture for absent optionals is normalize-then-`.optional()`.
154
185
 
155
186
  **Dors Venabili (Observability):** Visibility.
156
187
  - Can you see what the AI decided and why?
@@ -223,6 +254,8 @@ If issues found, return to Phase 3. Maximum 2 iterations.
223
254
  - [ ] Regression suite runs on prompt changes
224
255
  - [ ] Quality metrics tracked over time
225
256
  - [ ] Human review process for edge cases
257
+ - [ ] LIVE eval layer runs against the real model and passes before launch (sandbox layer alone cannot catch model-output-shape bugs) (field report #352, #4)
258
+ - [ ] Model output normalized null-to-undefined before Zod `.optional()` validation (field report #352, #4)
226
259
 
227
260
  ### AI Gate Bootstrapping (Cold-Start Problem)
228
261
  AI-gated approval systems have a cold-start problem: no historical outcomes -> gate rejects all requests -> no operations -> no outcomes. During the first N decisions (configurable, default 20), the gate should approve at reduced size (0.5-0.7x normal) to build a track record. The gate should never reject solely because "no historical data exists." Include explicit prompt guidance: "Lack of history is not a reason to reject — approve at reduced size to build the track record." (Field report #152)
@@ -53,7 +53,8 @@ Fury calls ALL of them. That's the point.
53
53
  8. `--skip-arch` and `--skip-build` allow re-running reviews on existing code.
54
54
  9. `--resume` picks up from the last completed phase.
55
55
  10. Only suggest a fresh session if `/context` shows actual usage above 85%. Do not preemptively checkpoint or reduce quality for context reasons.
56
- 11. **All phases dispatch to sub-agents per ADR-036.** The main thread orchestrates it plans, launches, triages, and decides. It does NOT read source files, analyze code inline, or generate findings from raw code. See `SUB_AGENTS.md` "Parallel Agent Standard" for brief format, deliverables, and concurrency rules. (Field report #270: full 11-phase /assemble ran through 15+ sub-agents with context at 15-25%, vs 80%+ inline.)
56
+ 11. Phase 13.5 (Doc-Currency Refresh) runs before sealing unless `--no-doc-refresh` is set; the skip is logged. `--doc-audit` is a docs-only one-shot mode that runs none of Phases 1-13 (field report #342 F-1, F-5).
57
+ 12. **All phases dispatch to sub-agents per ADR-036.** The main thread orchestrates — it plans, launches, triages, and decides. It does NOT read source files, analyze code inline, or generate findings from raw code. See `SUB_AGENTS.md` "Parallel Agent Standard" for brief format, deliverables, and concurrency rules. (Field report #270: full 11-phase /assemble ran through 15+ sub-agents with context at 15-25%, vs 80%+ inline.)
57
58
 
58
59
  ## The Pipeline
59
60
 
@@ -72,6 +73,7 @@ Fury calls ALL of them. That's the point.
72
73
  | 11 | /test | 1 | Suite green, coverage acceptable |
73
74
  | 12 | Crossfire | 1 | All 4 adversarial agents sign off |
74
75
  | 13 | Council | 1-3 | All 5 cross-domain agents sign off (incl. Troi PRD compliance) |
76
+ | 13.5 | Doc-Currency Refresh | 1 | Project docs sweep is clean before sealing (skip with `--no-doc-refresh`) |
75
77
 
76
78
  ### Phase 6.5 — Seldon's AI Review (conditional)
77
79
 
@@ -125,9 +127,36 @@ Verify no circular calls between store actions and API methods. Specifically che
125
127
 
126
128
  When a feature is added to one surface (API, dashboard, CLI, marketing site), verify all other surfaces displaying the same entities are updated. A new field added to the API response but missing from the dashboard table, or a new tier added to the pricing page but missing from the settings panel, creates an inconsistent product. After each pipeline phase that adds or modifies a feature, grep for the entity name across all surfaces: API routes, React/Vue components, CLI output formatters, marketing page copy, email templates, admin panels. (Triage fix from field report batch #149-#153.)
127
129
 
130
+ ### Phase 13.5 — Doc-Currency Refresh (pre-SEAL)
131
+
132
+ After the Council signs off, but BEFORE Fury seals the run and makes the Deploy Offer, sweep the project's source-of-truth docs for drift introduced over the course of the pipeline. A full `/assemble` touches architecture, features, version, and build state — by the time the Council finishes, the docs that describe the project frequently no longer match it. This mirrors the Doc-Currency Refresh mission in `CAMPAIGN.md`: same checklist, applied once at the end of the pipeline instead of once per mission. (Field report #342 F-1: `/assemble` shipped a Council-clean build whose `CLAUDE.md` Project block and `PROJECT_VERSION` line still described the pre-build scaffold.)
133
+
134
+ Sweep each of these for drift against the current state of the code, and fix what's stale:
135
+
136
+ 1. **`CLAUDE.md`** — Project block (name, one-liner, domain, repo), stack, and any phase/feature claims that the pipeline changed.
137
+ 2. **`MEMORY.md`** (auto-memory index, if present) — entries that now point at retired or renamed work.
138
+ 3. **`README.md`** — install/run/feature sections that the build moved past.
139
+ 4. **`PROJECT_VERSION.md`** (or equivalent) — the **Current** line must name the version this run produced, not the one it started from.
140
+ 5. **`/logs/build-state.md`** — the recorded "current state" must reflect the completed pipeline, not a stale prior session (the Phase 9 deployment-verification trap, generalized to all of build-state).
141
+ 6. **`/logs/campaign-state.md`** (if this `/assemble` ran inside a campaign) — the mission this run closed must be marked done so `/campaign` doesn't re-pick it.
142
+
143
+ This phase is **additive verification, not a rewrite** — only touch lines that are demonstrably stale. If every doc is already current, record "Doc-Currency Refresh: clean" in `assemble-state.md` and proceed. Dispatch the sweep to a sub-agent per ADR-036; the main thread triages the diff.
144
+
145
+ **`--no-doc-refresh`** skips Phase 13.5 entirely (documented opt-out, for runs where docs are maintained out-of-band). The skip is logged to `assemble-state.md` so a later run knows the docs were never swept. (Field report #342 F-1.)
146
+
128
147
  ### Post-Pipeline: Deploy Offer
129
148
 
130
- After Phase 13 (Council sign-off), if a deployment target is configured (`.vercel/project.json`, `fly.toml`, `railway.toml`, or PRD deploy section), Fury offers: "Council has signed off. Deploy to production?" This closes the loop instead of leaving deployment as an implicit user action. In campaign blitz mode, auto-deploy if the deploy method is known. (Field report #37: user had to prompt three times before agent deployed to Vercel.)
149
+ After Phase 13.5 (Doc-Currency Refresh, or its skip), if a deployment target is configured (`.vercel/project.json`, `fly.toml`, `railway.toml`, or PRD deploy section), Fury offers: "Council has signed off. Deploy to production?" This closes the loop instead of leaving deployment as an implicit user action. In campaign blitz mode, auto-deploy if the deploy method is known. (Field report #37: user had to prompt three times before agent deployed to Vercel.)
150
+
151
+ ## `--doc-audit` — One-Shot Documentation Audit
152
+
153
+ `/assemble --doc-audit` runs a **docs-only** pass: it dispatches a Silver-Surfer-led doc-audit roster and runs **no code review, no build, no security/QA phases** — none of Phases 1-13. The entire pipeline is replaced by a single sweep whose only job is to catch documentation drift across the whole corpus, then report. Use it when you suspect the docs have fallen behind the code but don't want to pay for a full Initiative run. (Field report #342 F-5.)
154
+
155
+ This is deliberately **broader** than `/git` Step 5.5. Step 5.5 only checks the 13 known method-doc ↔ command-file pairs (e.g. `ASSEMBLER.md` ↔ `assemble.md`) after a release touches a method doc — it is a paired-file sync check, not a corpus audit. `--doc-audit` instead lets the Surfer muster a roster sized to the actual surface area: `docs/`, `*.md` at the repo root, per-directory `CLAUDE.md` files, ADR records, pattern headers, and the slash-command/agent inventory — anywhere prose can disagree with reality.
156
+
157
+ **Canonical path.** `--doc-audit` is a convenience entry point into the canonical doc-audit mechanism, not a second implementation of it. The doc-audit roster, scoring, and report format are defined by the **`/audit-docs`** command and **`DOC_AUDIT.md`** — `/assemble --doc-audit` dispatches that same Surfer-led audit and surfaces its report. When in doubt about what the audit covers or how findings are graded, `DOC_AUDIT.md` is the source of truth; this section only documents the `/assemble` doorway to it. (Field report #342 F-5.)
158
+
159
+ `--doc-audit` is incompatible with the build/review flags (`--skip-arch`, `--skip-build`, `--fast`, `--resume`) since it runs none of those phases; pair it with `--focus "topic"` to bias the Surfer's roster toward a corner of the corpus.
131
160
 
132
161
  ## Deliverables
133
162
 
@@ -299,6 +299,8 @@ After running any build command (`build:workers`, `tsc --build`, webpack, etc.),
299
299
  8. Kenobi: Maul re-probes all remediated vulnerabilities, verifies fixes hold.
300
300
  9. If Pass 2 finds new issues, fix and re-verify until clean.
301
301
 
302
+ **Isolation-green is NOT deploy-green (field report #354 F3).** Nightwing's full-suite re-run in Pass 2 is the deploy gate — not the targeted/isolation runs used while fixing. A fix can pass every targeted test, every isolation run, and every re-probe of the area it touched, yet still regress UNRELATED tests through environment coupling the isolation runs cannot see (shared fixtures, global state, test ordering, env vars, a mutated singleton, a migration side effect). Isolation runs validate the fix in a vacuum; only the FULL suite observes the coupling. So "every targeted run is green" is never a deploy signal — the gate is the whole suite passing, and you do not advance to Phase 12 on isolation-green alone.
303
+
302
304
  **Phase 12 — Kusanagi Deploys.**
303
305
  1. Execute `/docs/methods/DEVOPS_ENGINEER.md` full sequence
304
306
  2. Complete first-deploy pre-flight checklist (see `/devops` command)
@@ -67,6 +67,24 @@ Examples that triggered this rule:
67
67
 
68
68
  Skip this step only when the ADR's scope is bounded by entity (one file, one table, one route group) — bounded ADRs don't need an audit because the count is visible in the scope itself.
69
69
 
70
+ ### Fan-out completeness (glob-derived lists + mandatory sweep)
71
+
72
+ This is distinct from the plan-time audit grep above. The audit grep *counts* sites so the architect can *estimate* effort. Fan-out completeness governs *execution*: when a mission fans a directory-wide or migration-wide change across parallel agents, it ensures every file in scope is actually touched and nothing is silently dropped at the seams between agents (field report #355 F2).
73
+
74
+ **Two rules, both mandatory:**
75
+
76
+ 1. **Derive per-agent file lists from a GLOB, never a hand-typed list.** When splitting a directory/migration fan-out across N agents, the source-of-truth file list MUST come from a glob expansion (`git ls-files 'src/widgets/**/*.ts'`, `find migrations -name '*.sql'`, `grep -rl '<legacy pattern>' <tree>`), then partitioned among agents — never a list typed from memory or eyeballed from a tree view. A hand-typed list is a fan-out's single biggest source of silent omission: the one file nobody remembered never gets an agent, compiles clean in isolation, and ships the legacy pattern. The glob is the manifest; agent assignments are slices of it.
77
+
78
+ 2. **Pair every fan-out with a post-fan-out completeness sweep BEFORE the wave is declared done.** After all fan-out agents return, re-run the legacy pattern across the *whole* target tree — not the per-agent slices, the entire tree — and assert zero residual hits (or a fully-justified allowlist):
79
+
80
+ ```bash
81
+ # The wave touched src/widgets/**; sweep the whole tree for the old pattern.
82
+ grep -rnE '<legacy pattern>' src/ | grep -v '<intentional-keep paths>'
83
+ # Must be empty (or every line individually justified) before the wave closes.
84
+ ```
85
+
86
+ The per-agent reviews each pass — every slice is internally consistent — yet the union can still miss a file that fell between two agents' globs, a file added after the manifest was captured, or a path one agent assumed another owned. The completeness sweep is the only check that sees the whole tree at once. A fan-out wave is NOT done until its sweep is green. (Field report #355 F2: a directory-wide migration declared complete on green per-agent reviews still shipped legacy-pattern residue because no sweep ran the old pattern across the full tree after the wave.)
87
+
70
88
  ### Closeout grep pinning
71
89
 
72
90
  When a `/campaign` closeout report cites a followup count or backlog size (e.g., "F-V710-ORG1-DEFAULTS — ~12 sites remaining" or "~21 cumulative followups"), the followup definition MUST embed the literal grep pattern + observed `n=N` at closeout HEAD. The next campaign's `/architect --plan` re-runs the same grep before accepting the count.
@@ -279,10 +297,13 @@ User confirms, redirects, or overrides. On confirm → Step 4.
279
297
  3. Fury runs the full pipeline (or `--fast` if user prefers). **Note:** `--fast` skips Crossfire + Council but NEVER skips `/sentinel` if the mission adds new endpoints, WebSocket handlers, or credential-handling code.
280
298
  3a. **Per-mission Kenobi quick-scan:** If the mission creates or modifies auth, crypto, HMAC, credential handling, or webhook verification code, run a focused Kenobi security scan within the mission — do not defer to the Victory Gauntlet. The reduced pipeline's single review round is calibrated for business logic, not security-sensitive code. Quick-scan scope: credential leakage, timing attacks, input validation, error message exposure. (Field report #265: webhook HMAC bypass, credential leakage in errors, and auth header override all shipped through the reduced pipeline and were only caught by the Victory Gauntlet.)
281
299
  4. Only checkpoint if `/context` shows actual usage above 85%. Do not preemptively suggest checkpoints.
300
+ 4a. **A per-wave staging deploy is a STATUS checkpoint — report and continue, never a decision-frame pause** (field report #355 F4). When a mission or fan-out wave pushes to staging mid-campaign, treat the deploy result exactly like a mission-complete status line: announce it ("M-7.4 deployed to staging, health check green — starting M-7.5") and proceed to the next wave. Do NOT frame the staging deploy as a gate, milestone, or "continue or pause?" question — the staging push is part of the autonomous flow, not a hand-back point. The only valid pauses remain the ones in the Pause-Bias Anti-Pattern list below (context >85%, BLOCKED item, un-auto-fixable Critical, user interrupt). This is the action-prose statement of that rule; the callout below is its rationale.
282
301
  5. On completion → Step 5
283
302
 
284
303
  **Post-infrastructure enforcement gate:** For infrastructure campaigns (deploy targets, CI/CD, monitoring, staging environments): after the infrastructure is provisioned, run `/architect --plan` to verify workflow enforcement gates exist — not just infrastructure existence. Infrastructure without process gates is incomplete.
285
304
 
305
+ **Silver Surfer gate fires at the REVIEW phase, not the solo build.** Within a mission, the gate (ADR-051 PreToolUse hook on the Agent tool) engages when Fury deploys the review/audit roster as sub-agents — NOT during the orchestrator's solo build of the mission's code. Solo-build-before-review is intentional, not a skipped gate: parallel agents editing the same tightly-coupled engine files (game loop, state machine, shared service) would clobber each other's edits and produce merge garbage. So the orchestrator builds the changeset solo, THEN the Surfer-gated review roster reads it. If you find yourself mid-build asking "did a gate get skipped?", the answer is no — the gate has not fired yet because the review phase has not started. (Field report #348 #3: mid-build confusion over an un-fired gate that fires correctly at the review phase.)
306
+
286
307
  **Dispatch model (ADR-044):** Per-mission `/assemble` runs SHOULD dispatch phases to sub-agents per `SUB_AGENTS.md` "Parallel Agent Standard." Agents are launched as named subagent types defined in `.claude/agents/` with description-driven dispatch — Opus scans `git diff --stat` and matches changed files against agent descriptions to auto-select specialists. The campaign orchestrator (main thread) manages the mission sequence, inter-mission gates, and campaign state — it does NOT perform inline code analysis. Pass findings summaries between missions, not raw code. See `docs/AGENT_CLASSIFICATION.md` for the full agent manifest (see docs/AGENT_CLASSIFICATION.md). (Field report #270)
287
308
 
288
309
  ### Campaign-Mode Pipeline
@@ -440,12 +461,27 @@ Even in `--fast` mode, each mission gets at least **1 review round** (not 3, but
440
461
 
441
462
  **UI→server route tracing (within review):** When a mission writes both UI code and server code, the review must trace every `fetch()` call in the UI to a registered server route. For each `fetch('/api/...')` in `.js`/`.ts` UI files, verify the path exists as an `addRoute()` call in the server. Missing routes produce silent 404s that are invisible in development. (Field report #50: UI button called `/api/server/restart` but no endpoint was created.)
442
463
 
464
+ **Review the integrated changeset, not only the new files.** The per-mission review gate must read the full diff from the prior mission's HEAD (`git diff <prev-mission-sha>..HEAD`), not just the files this mission created. Reviewing only the new files misses pre-existing cross-cutting defects that the integration surfaces: a missing config entry the new code now depends on, a Dockerfile `COPY` that never included the directory this mission populated, a doc-vs-reality drift where the new wiring contradicts a README/PRD claim, a build/import that only breaks once the new module is referenced. The new files can each be clean in isolation while the integrated system is broken at the seams. The diff is the unit of review, not the file list. (Field report #346 #4.)
465
+
443
466
  ### One Mission, One Commit Anti-Pattern
444
467
 
445
468
  **Each mission gets its own commit.** Do NOT batch multiple missions into a single commit. The per-mission commit serves as evidence: the diff for Mission 3 should contain only Mission 3's deliverables. If the diff contains work from Missions 3-11 combined, the review is meaningless — you can't verify what changed for which mission.
446
469
 
447
470
  If a mission is small enough to merge with an adjacent one, that's fine — but explicitly acknowledge it: "Missions 3-4 combined (both methodology-only, same target file)." Never silently batch.
448
471
 
472
+ ### Execution-Time Cluster Sub-Split
473
+
474
+ Plan-time cluster recognition (Step 1 #9, field report #326) splits a cluster-natured mission into sub-missions BEFORE the campaign starts, when Dax can see 4+ named deliverables on the board. But some clusters only reveal their seam at EXECUTION time: a mission spans a **foundation + N consumers** (a new schema/migration the rest of the mission builds on, a shared client/adapter, a base config, a core engine module) and the consumers cannot be safely reviewed until the foundation is real. Or a `RISK` item surfaces mid-mission demanding the foundation land and be verified BEFORE its consumers are wired.
475
+
476
+ When that happens, split at execution time along the **foundation/consumers seam** — even though the mission was a single board entry:
477
+
478
+ 1. **Sub-mission A (foundation):** build the foundation alone. Its own review gate. Its own commit (`M-XX.a — <foundation>`).
479
+ 2. **Sub-mission B..N (consumers):** build the consumers against the now-verified foundation. Each gets its own review gate and its own commit (`M-XX.b`, `M-XX.c`, ...).
480
+
481
+ Each sub-mission is a real mission for gate purposes: 1-round review minimum, per-mission commit (One Mission, One Commit still holds), and its slice recorded in campaign-state.md. The foundation's review can catch a contract defect (a column the consumers will read but the migration didn't add, an adapter method the consumers call but the foundation didn't expose) BEFORE the consumers are written against a broken base — instead of the Gauntlet catching it three missions later.
482
+
483
+ **This complements, not duplicates, plan-time recognition (#326):** plan-time splits a cluster on the board before execution; execution-time sub-split triggers when a foundation/consumers seam (often a `RISK` item) emerges *during* a mission that looked atomic at plan time. If you notice the seam at plan time, split there; if it only surfaces under the build, split here. (Field report #346 #3.)
484
+
449
485
  ### Per-Mission Verification Agents
450
486
 
451
487
  After each mission's review round, two agents run quick checks:
@@ -602,12 +638,22 @@ All PRD requirements are COMPLETE or explicitly BLOCKED:
602
638
  7. **PRD sync check:** Before declaring victory, compare PRD numeric claims (agent counts, feature counts, route counts, component counts) against the actual codebase for this campaign's domain. Stale PRD claims erode trust and compound across campaigns. (Field report #119)
603
639
  7a. **Tenant isolation completeness (conditional):** If the campaign touched auth, multi-tenant, or user-scoped data, grep ALL tables for `org_id` (or equivalent ownership column). Every table must be classified as either "tenant-scoped" (has org_id) or "global by design" (with documented justification). Tables without org_id and without justification are IDOR risks. This catches incomplete tenant migrations that survive per-phase sweeps — the per-phase check (BUILD_PROTOCOL Phase 4) only covers tables modified in that phase. (Field reports #229, #231)
604
640
  8. **Entity selector completeness** — for every user-facing selector (dropdown, combobox, autocomplete) that selects from a database-backed list: verify the selector can handle entities that don't exist yet. If a user can only pick from existing DB records, the feature is incomplete — the selector needs a creation flow or an external lookup fallback. Common examples: city selector (needs geocoding fallback), category picker (needs "Other" or custom entry), user selector (needs invite flow). (Field report #263: city selector only searched existing DB cities — users couldn't set homebase to any city not already in the database.)
641
+ 8b. **Doc-Currency Refresh mission (Coulson + Wong) — mandatory pre-SEAL sweep.** Before the final sign-off seals the version, run a dedicated cross-doc currency sweep. The Step 0 freshness check and the Step 6 #9 `build-state.md` update each cover ONE file at ONE moment; across rapid sealed versions the load-bearing docs rot collectively because no single mission owns their joint currency. This mission gives that sweep an owner. **Coulson** (release authority) drives version-line accuracy; **Wong** (lessons/changelog/PRD refresh) drives prose currency. Sweep and reconcile against the current `git log -1` + `package.json`/`pyproject.toml` version:
642
+ - **`CLAUDE.md`** — Project block, version references, command/agent counts, any "as of vX.Y.Z" claims
643
+ - **`MEMORY.md`** (auto-memory index, if present) — stale "next:" pointers, completed-work entries that still read as pending
644
+ - **`README.md`** — install/usage snippets, badge versions, feature lists that drifted from reality
645
+ - **`PROJECT_VERSION.md` Current line** (or `VERSION.md` "Current:" line) — must equal the version about to be sealed
646
+ - **`/logs/build-state.md`** — version, test counts, deployment state
647
+ - **`/logs/campaign-state.md`** — Prophecy Board statuses, final mission status
648
+ For each file, fix drift in place — never seal known drift. If a doc is already current, note "current at <SHA>" and move on (idempotent). **Opt-out: `--no-doc-refresh`** skips this mission for fast methodology-only or hotfix campaigns where the docs provably did not move; log the skip in campaign-state.md with a one-line reason. This complements (does not replace) the existing state-freshness checks. (Field report #342 F-1: load-bearing docs rotted across rapid sealed versions because the per-file freshness checks had no cross-doc owner.)
605
649
  9. **Victory Checklist** — ALL must be true before sign-off:
606
650
  - [ ] Gauntlet Council signed off (6/6 or all domains pass)
607
651
  - [ ] All BLOCKED items acknowledged by user
608
652
  - [ ] PRD claims verified against codebase
609
653
  - [ ] `/debrief --submit` filed (issue number recorded)
610
654
  - [ ] Campaign-state.md updated with final status
655
+ - [ ] Doc-Currency Refresh completed (or `--no-doc-refresh` skip logged with reason)
656
+ - [ ] **Production-config boot assertion passed** — a green sandbox suite is necessary but NOT sufficient. Boot the app under its real production config (production env vars / `NODE_ENV=production`, real adapter selection, production build artifact) and assert it reaches a ready state without a config/credential fault. Sandbox adapters can pass every test while the production path fails on a missing env var, a real-vs-sandbox adapter mismatch, or a build-only import error. Do not declare victory on sandbox-green alone. (Field report #350 #3: sandbox suite was fully green but the production-config boot path was never asserted before sign-off.)
611
657
 
612
658
  ### The Reckoning (Optional Pre-Launch Audit)
613
659