voidforge-build 23.11.4 → 23.12.0

Files changed (40)
  1. package/dist/.claude/agents/batman-qa.md +1 -0
  2. package/dist/.claude/agents/galadriel-frontend.md +2 -0
  3. package/dist/.claude/agents/kusanagi-devops.md +4 -0
  4. package/dist/.claude/agents/lucius-config.md +6 -0
  5. package/dist/.claude/agents/silver-surfer-herald.md +11 -4
  6. package/dist/.claude/commands/architect.md +9 -0
  7. package/dist/.claude/commands/assemble.md +4 -1
  8. package/dist/.claude/commands/assess.md +13 -1
  9. package/dist/.claude/commands/audit-docs.md +106 -0
  10. package/dist/.claude/commands/deploy.md +28 -0
  11. package/dist/.claude/commands/engage.md +2 -0
  12. package/dist/.claude/commands/gauntlet.md +23 -4
  13. package/dist/.claude/commands/imagine.md +15 -0
  14. package/dist/.claude/commands/ux.md +32 -0
  15. package/dist/.claude/commands/void.md +1 -0
  16. package/dist/CHANGELOG.md +39 -0
  17. package/dist/CLAUDE.md +8 -0
  18. package/dist/VERSION.md +2 -1
  19. package/dist/docs/methods/AI_INTELLIGENCE.md +33 -0
  20. package/dist/docs/methods/ASSEMBLER.md +31 -2
  21. package/dist/docs/methods/CAMPAIGN.md +27 -0
  22. package/dist/docs/methods/DEVOPS_ENGINEER.md +158 -0
  23. package/dist/docs/methods/DOC_AUDIT.md +92 -0
  24. package/dist/docs/methods/FORGE_KEEPER.md +16 -5
  25. package/dist/docs/methods/GAUNTLET.md +33 -0
  26. package/dist/docs/methods/PRODUCT_DESIGN_FRONTEND.md +53 -0
  27. package/dist/docs/methods/QA_ENGINEER.md +19 -0
  28. package/dist/docs/methods/RELEASE_MANAGER.md +27 -0
  29. package/dist/docs/methods/SUB_AGENTS.md +31 -0
  30. package/dist/docs/methods/SYSTEMS_ARCHITECT.md +13 -0
  31. package/dist/docs/methods/TESTING.md +19 -0
  32. package/dist/docs/patterns/README.md +3 -0
  33. package/dist/docs/patterns/ai-eval.ts +63 -0
  34. package/dist/docs/patterns/daemon-process.ts +90 -0
  35. package/dist/docs/patterns/deploy-preflight.ts +85 -2
  36. package/dist/docs/patterns/design-tokens.ts +338 -0
  37. package/dist/docs/patterns/error-message-categorization.tsx +376 -0
  38. package/dist/wizard/lib/patterns/daemon-process.d.ts +2 -1
  39. package/dist/wizard/lib/patterns/daemon-process.js +89 -1
  40. package/package.json +2 -2
package/dist/CLAUDE.md CHANGED
@@ -7,11 +7,15 @@
7
7
  - **Challenge when appropriate.** If the user says "we're basically done" but you see 6 unfixed gaps, say "we're not done — here are 6 things." Agreeing to be agreeable ships bugs.
8
8
  - **Separate opinion from analysis.** State facts first, then your recommendation. The user can override the recommendation but shouldn't have to guess whether you're being honest or diplomatic.
9
9
  - **Solve, don't delegate.** Attempt actions before listing prerequisites. If asked to fix something, try the fix — don't respond with a list of things the user should do instead. When blocked, explain what you tried and what specifically failed.
10
+ - **Apply findings, don't offer a picker.** When a review surfaces a clear list of fixable findings, DEFAULT to applying them in batches rather than surfacing a multi-option "which subset do you want?" picker (field report #343). A picker is only warranted when the choice is genuinely architectural — mutually exclusive directions with real trade-offs the user must own. Mechanical fixes (lint, missing validation, IDOR, a11y, dead code) are not architectural; fix them and report what you did.
11
+ - **Honor authorized autonomy with single-question gates.** When the operator explicitly authorizes autonomy ("go", "run the whole thing", "don't stop to ask"), execute the campaign end-to-end and gate only on irreducible externals — secrets, API tokens, billing approval, anything you genuinely cannot obtain or invent (field report #344). Do NOT seek interim confirmations on constants the agent itself invented and already disclosed (port numbers, table names, file paths, default copy). Surface those in the running log, not as a blocking question.
10
12
 
11
13
  ## Silver Surfer Gate (ADR-048, ADR-051, ADR-060)
12
14
 
13
15
  ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is the backstop if the hook is absent or disabled. One day the prose may be removed entirely — the hook is the intended permanent mechanism.
14
16
 
17
+ **When the gate fires.** The gate fires at the REVIEW phase — the moment you deploy sub-agents — not during the solo build that precedes it (field report #348). Building the work yourself first, then mustering the Surfer roster to review it, is the intended sequence; the lead agent is expected to produce the artifact solo before any agent dispatch. The gate exists to stop you from cherry-picking the review roster, not to force agents onto the build itself.
18
+
15
19
  **Gated commands:** `/engage` (alias: `/review`), `/qa`, `/sentinel` (alias: `/security`), `/ux`, `/architect`, `/build`, `/assemble`, `/gauntlet`, `/campaign`, `/test`, `/devops`, `/deploy`, `/ai`, `/assess`.
16
20
 
17
21
  **Procedure — execute in order:**
@@ -33,6 +37,7 @@ ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is th
33
37
 
34
38
  1. After the Silver Surfer sub-agent returns its roster, and before launching any other Agent: `[ -x scripts/surfer-gate/record-roster.sh ] && bash scripts/surfer-gate/record-roster.sh || true` (optionally pass the roster JSON as the first argument for audit). The existence guard is a defensive no-op for projects that predate v23.10.0 — when the gate started shipping via the npm methodology package per #317.
35
39
  2. When the user's command includes `--light` or `--solo`, BEFORE launching the Surfer or any other agent: `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true` (or `--solo`). **Fails closed on unknown flag values** (ADR-060 v23.8.18 hardening, SEC-003) — passing anything other than `--light` or `--solo` exits 2 with an error. No silent bypass.
40
+ 3. **Normalize roster names before dispatch** (field report #345, DEAL-001). The Silver Surfer returns agent names from a Haiku pre-scan, which can drift from the actual filenames in `.claude/agents/` (extra `silver-surfer-` prefix, a stray `.md` suffix, a hyphen/underscore mismatch). Before you launch, validate each name against `ls .claude/agents/`: if it matches a file (with or without the `.md` extension), keep it; if not, attempt exactly one correction — strip a known prefix/suffix or normalize separators — and re-check. If it still doesn't resolve, DROP that single name and proceed with the rest of the roster. Never block the whole dispatch over one unresolved name; log the dropped name so the Herald roster can be corrected upstream. The gate's job is to enforce *that* a roster ran, not to fail the run over a typo in *one* name.
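The normalization rule in step 3 can be sketched as a pure function. This is a minimal illustration, assuming the agent filenames have already been listed (for example from `ls .claude/agents/`). The names `normalizeRoster`, `RosterResult`, and `stripMd` are hypothetical and not part of the shipped gate scripts, and the single correction pass shown here (strip the `silver-surfer-` prefix, then turn underscores into hyphens) is one possible reading of "exactly one correction".

```ts
// Hypothetical sketch: resolve Surfer roster names against real agent files.
interface RosterResult {
  resolved: string[]; // basenames (no `.md`) that match an agent file
  dropped: string[];  // names that still failed after one correction attempt
}

// Remove a trailing `.md` so names match with or without the extension.
function stripMd(name: string): string {
  return name.endsWith(".md") ? name.slice(0, -3) : name;
}

function normalizeRoster(roster: string[], agentFiles: string[]): RosterResult {
  const known = new Set(agentFiles.map(stripMd));
  const resolved: string[] = [];
  const dropped: string[] = [];

  for (const raw of roster) {
    const name = stripMd(raw.trim());
    if (known.has(name)) {
      resolved.push(name);
      continue;
    }
    // Exactly one correction attempt: strip the known prefix, normalize separators.
    const corrected = name.replace(/^silver-surfer-/, "").replace(/_/g, "-");
    if (corrected !== name && known.has(corrected)) {
      resolved.push(corrected);
    } else {
      // Drop only this name; the caller logs it and dispatches the rest.
      dropped.push(raw);
    }
  }
  return { resolved, dropped };
}
```

The caller would dispatch `resolved` and write `dropped` to the running log, so one unresolved name never blocks the whole roster.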
36
41
 
37
42
  If `scripts/surfer-gate/check.sh` exists but you skip step 1, your first non-Surfer Agent call in that turn will be blocked with a clear message and your own log line in `/tmp/voidforge-session-$SESSION_ID/gate.log`. You are expected to comply with the block (launch Surfer / run record-roster), not to fight it. If the script does not exist, your project predates v23.10.0; pull the gate from `tmcleod3/voidforge:scripts/surfer-gate/` and merge `settings-snippet.json` into `.claude/settings.json`, or re-run `npx voidforge-build init` against the methodology source.
38
43
 
@@ -121,6 +126,9 @@ Reference implementations in `/docs/patterns/`. Match these shapes when writing.
121
126
  - `ai-prompt-safety.ts` — Type A (instructions, statistical) vs Type B (constraints, enforced); AUTHORITY-as-text caveat; defense-in-depth stack
122
127
  - `llm-state-dedup.ts` — LLM ids are display labels, not keys; content-hash dedup; lifecycle-state snapshot completeness
123
128
  - `autonomous-ops-triage-policy.md` — 4-bucket model (self-resolving / runbook-safe / operator-approval / hard-never) + SessionStart hook visibility rule for ops-flavored projects
129
+ - `design-tokens.ts` — Semantic color/type tokens (one indirection layer) so a theme pivot is a token change, not a component-wide find-replace (field report #351, #343)
130
+ - `nginx-vhost.conf` — Cloudflare-Flexible-safe vhost template: security headers, ACME http-01 passthrough, no redirect loop behind CF's flexible SSL (field report #351, #344)
131
+ - `error-message-categorization.tsx` — Categorize errors at the UI boundary (network / auth / validation / server / unknown) before choosing copy, so users see actionable messages not raw internals (field report #351, #343)
124
132
 
125
133
  ## Slash Commands
126
134
 
package/dist/VERSION.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Version
2
2
 
3
- **Current:** 23.11.4
3
+ **Current:** 23.12.0
4
4
 
5
5
  ## Versioning Scheme
6
6
 
@@ -14,6 +14,7 @@ This project uses [Semantic Versioning](https://semver.org/):
14
14
 
15
15
  | Version | Date | Summary |
16
16
  |---------|------|---------|
17
+ | 23.12.0 | 2026-06-09 | The v23.12 methodology pass — `/debrief --inbox` triaged all 12 open field reports (#342–#353) and applied every accepted fix in one session via two-phase workflow orchestration (triage → apply), with an adversarial verify pass on every file. 58 fixes across 32 files + 5 new files. 7 clusters: **verify-the-FIX** (the adversarial pass must vet the proposed fix, not just the finding — SUB_AGENTS.md, GAUNTLET.md, /engage; #348/#349/#350, M5 mint-fence incident); **production-config gate** (sandbox-green ≠ ship-ready — GAUNTLET.md prod-boot + sandbox-blind-spot round, CAMPAIGN.md Victory Checklist; #350); **Spring Cleaning consumer-vs-clone** (FORGE_KEEPER.md destructive-risk branch so app projects don't lose tsconfig/lockfiles; #343 F10); **Surfer roster sizing** (silver-surfer-herald.md scope_bias/scope_density/~18-cap + basename normalization; #343/#344/#345/#346); **creative/UX grounding** (world-scan + de-AI + token-scoped theming — ux.md, PRODUCT_DESIGN_FRONTEND.md, galadriel; #347/#351); **deploy/DevOps foot-guns** (DEVOPS_ENGINEER.md +13: eval-env, Node-MDWE, CF-Flexible, served-vs-built, compose-topology, docker-cleanup; #344/#349/#352/#353); **doc-currency** (CAMPAIGN/ASSEMBLER pre-SEAL refresh + new /audit-docs & DOC_AUDIT.md; #342). 3 new patterns (design-tokens.ts, nginx-vhost.conf, error-message-categorization.tsx; 48 → 51) + new /audit-docs command + DOC_AUDIT.md + scripts/regen-claude-md.sh. CLAUDE.md Personality +2 (anti-picker #343, authorized-autonomy #344), gate-timing #348, roster normalization #345. Dep range `^23.11.4` → `^23.12.0` (ADR-062). #349 F-4 and #352 #3 were already shipped (verified); #345 DEAL-004 + #353 RC-001/002/callout out of scope (Claude Code core / Workflow tool). |
17
18
  | 23.11.4 | 2026-05-12 | Wong promotion cluster + #260 closeout. /debrief --inbox re-triage of all 9 open field-report issues produced 3 ready-now promotion clusters (3+ data points across different reports each). BUILD_PROTOCOL.md Principle #11 "Derived counts discipline" (from #336 F6, #334 F6, #332 hidden #5 — three projects independently drifted the same class). New pattern `docs/patterns/autonomous-ops-triage-policy.md` codifying the 4-bucket model + SessionStart hook visibility rule (from #337 F3, #336 F7, #334 F5 — two operators independently reinvented). CAMPAIGN.md Planning Mode Step 4 "Scope-adversary check for bug classes" (from #332, #338 #2 — voidforge-marketing-site missed `/patterns` because the bug class scope was narrowed). Also closes #260 remaining items: PRODUCT_DESIGN_FRONTEND.md Operating Rule #12 "Tutorial-context checklist for slash commands" + QA_ENGINEER.md Operating Rule #13 "Tutorial smoke test for slash commands." Dep range `^23.11.3` → `^23.11.4` per ADR-062 discipline. Pattern count 47 → 48. 11 field reports remain open for v23.12 methodology pass. |
18
19
  | 23.11.3 | 2026-05-12 | Three-phase pipeline (/architect → /debrief --inbox → /campaign) shipped 12 fixes + 2 ADRs + 1 LEARNINGS entry + 2 mechanical guards. **Issue #331** destructive-bug fix: `findProjectRoot()` now enforces `$HOME` boundary + `statSync().isFile()` guard, no more silent overwrite of `~/CLAUDE.md` on `npx voidforge-build update`. **HIGH CVE** fast-xml-parser/builder via `@aws-sdk/*` patched via `npm audit fix`. **Dep contract** pinned: `voidforge-build → voidforge-build-methodology` from `"*"` to `"^23.11.3"` (ADR-062), enforced mechanically by `check-methodology-pin.sh` prepublishOnly script. **engines.node** added to methodology package.json. **publish.yml hardening**: post-publish `npm view` verification step (both jobs, 6×10s retry), `recover-partial` job with `npm deprecate` on XOR-failure, `needs: publish-methodology` ordering on publish-voidforge. **copy-assets.sh** ADR-058 template strip applied (parity with methodology prepack). **Docs**: HOLOCRON Quick Start "launch Claude Code first" preamble + npm-prefix workaround (#260, #333p), FORGE_KEEPER Rule #11 "never write to $HOME" (ADR-063), RELEASE_MANAGER ROADMAP-sync checklist line, ROADMAP.md pointer v23.8.11 → v23.11.3 (24-version drift closed). Marker integration test `no-home-writes.integration.test.ts` mechanically enforces ADR-063. 1390 tests pass. |
19
20
  | 23.11.2 | 2026-05-12 | `voidforge init` now prompts browser-vs-CLI when no mode flag is passed (TTY only) and `--browser` was added for explicit opt-in. Headless init prompts for name/dir/oneliner/domain/repo when `--name` is omitted in a TTY. Non-TTY no-flag now errors cleanly instead of silently launching a wizard server. Separately: `packages/methodology/scripts/surfer-gate/` (8 files) now ships in the npm methodology package — closes ADR-051 distribution gap (#317). 9 pattern-table rows + existence-guarded orchestrator-contract bash propagated into the methodology CLAUDE.md. |
package/dist/docs/methods/AI_INTELLIGENCE.md CHANGED
@@ -151,6 +151,37 @@ Run sequentially — each builds on findings from parallel phase:
151
151
  - Can you detect regression when prompts change?
152
152
  - Is there human-in-the-loop scoring for ambiguous cases?
153
153
  - Are quality metrics tracked over time? (Not just at launch)
154
+ - Is there a LIVE eval layer that runs against the real model before launch? (Not just the sandbox layer)
155
+
156
+ #### The LIVE eval layer is the pre-launch gate (field report #352, #4)
157
+
158
+ Evals stratify into two layers, and they catch different bug classes:
159
+
160
+ - **Deterministic / sandbox layer** — fast, hermetic, runs in CI with mocked or recorded model responses. Catches scoring-logic bugs, prompt-template-rendering bugs, and golden-dataset regressions. It **cannot** catch model-output-shape bugs, because the fixtures are shapes *you* authored — not shapes the live model actually emits.
161
+ - **LIVE eval layer** — runs the real model against the golden dataset before launch. This is the pre-launch gate. It is the only layer that observes the model's *actual* output shape, and the only layer that can catch a contract drift between "the response shape we coded against" and "the response shape the model produces."
162
+
163
+ Treat the LIVE layer as a mandatory gate, not an optional smoke test: a component cannot ship until its LIVE eval has run against the real model and passed. The sandbox layer gates *every commit*; the LIVE layer gates *every launch*.
164
+
165
+ **Gotcha — normalize null-to-undefined before Zod `.optional()` (field report #352, #4).** A live model emits `null` (not omission) for an absent optional field — e.g. it returns `{ "category": "billing", "subcategory": null }` rather than dropping `subcategory`. Zod's `.optional()` accepts `undefined`, **not** `null`, so the valid response fails schema validation and your retry/fallback path fires on output that was actually fine. This is invisible in the sandbox layer because hand-authored fixtures usually omit the key instead of setting it to `null`. Normalize before validating:
166
+
167
+ ```ts
168
+ // Strip nulls the model emits for absent optionals, so `.optional()` matches.
169
+ function nullToUndefined<T>(obj: T): T {
170
+   if (obj === null) return undefined as unknown as T;
171
+   if (Array.isArray(obj)) return obj.map(nullToUndefined) as unknown as T;
172
+   if (typeof obj === "object") {
173
+     return Object.fromEntries(
174
+       Object.entries(obj as Record<string, unknown>).map(([k, v]) => [k, nullToUndefined(v)]),
175
+     ) as T;
176
+   }
177
+   return obj;
178
+ }
179
+
180
+ // Apply BEFORE Zod validation — never validate raw model output directly.
181
+ const parsed = ResponseSchema.parse(nullToUndefined(JSON.parse(modelText)));
182
+ ```
183
+
184
+ If a field is legitimately nullable (the model is *meant* to return `null` as a value), use `.nullish()` (accepts both `null` and `undefined`) on that specific field instead of blanket-stripping — but the default posture for absent optionals is normalize-then-`.optional()`.
154
185
 
155
186
  **Dors Venabili (Observability):** Visibility.
156
187
  - Can you see what the AI decided and why?
@@ -223,6 +254,8 @@ If issues found, return to Phase 3. Maximum 2 iterations.
223
254
  - [ ] Regression suite runs on prompt changes
224
255
  - [ ] Quality metrics tracked over time
225
256
  - [ ] Human review process for edge cases
257
+ - [ ] LIVE eval layer runs against the real model and passes before launch (sandbox layer alone cannot catch model-output-shape bugs) (field report #352, #4)
258
+ - [ ] Model output normalized null-to-undefined before Zod `.optional()` validation (field report #352, #4)
226
259
 
227
260
  ### AI Gate Bootstrapping (Cold-Start Problem)
228
261
  AI-gated approval systems have a cold-start problem: no historical outcomes -> gate rejects all requests -> no operations -> no outcomes. During the first N decisions (configurable, default 20), the gate should approve at reduced size (0.5-0.7x normal) to build a track record. The gate should never reject solely because "no historical data exists." Include explicit prompt guidance: "Lack of history is not a reason to reject — approve at reduced size to build the track record." (Field report #152)
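The cold-start rule above can be sketched as follows. This is an illustrative sketch only: `applyBootstrap`, `GateConfig`, and the `RejectReason` values are hypothetical names, and the 0.5 scale is just one point in the 0.5-0.7x range the text describes.

```ts
// Hypothetical sketch of the bootstrapping rule for an AI-gated approval system.
interface GateConfig {
  bootstrapDecisions: number; // the configurable "first N decisions" window
  bootstrapScale: number;     // reduced size multiplier, assumed within 0.5-0.7
}

const DEFAULT_GATE_CONFIG: GateConfig = { bootstrapDecisions: 20, bootstrapScale: 0.5 };

// "no-history" means the model objected only because no outcomes exist yet.
type RejectReason = "no-history" | "risk" | null;

interface GateDecision {
  approved: boolean;
  size: number;
}

function applyBootstrap(
  requestedSize: number,
  decisionsSoFar: number,
  rejectReason: RejectReason,
  config: GateConfig = DEFAULT_GATE_CONFIG,
): GateDecision {
  // A substantive rejection always stands, even during bootstrap.
  if (rejectReason === "risk") return { approved: false, size: 0 };

  // Lack of history is never a reason to reject: approve at reduced size instead.
  const bootstrapping = decisionsSoFar < config.bootstrapDecisions;
  if (bootstrapping || rejectReason === "no-history") {
    return { approved: true, size: requestedSize * config.bootstrapScale };
  }
  return { approved: true, size: requestedSize };
}
```

Keeping the reduced-size path separate from the rejection path makes the "never reject solely for missing history" rule enforceable in code rather than only in the prompt.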
package/dist/docs/methods/ASSEMBLER.md CHANGED
@@ -53,7 +53,8 @@ Fury calls ALL of them. That's the point.
53
53
  8. `--skip-arch` and `--skip-build` allow re-running reviews on existing code.
54
54
  9. `--resume` picks up from the last completed phase.
55
55
  10. Only suggest a fresh session if `/context` shows actual usage above 85%. Do not preemptively checkpoint or reduce quality for context reasons.
56
- 11. **All phases dispatch to sub-agents per ADR-036.** The main thread orchestrates it plans, launches, triages, and decides. It does NOT read source files, analyze code inline, or generate findings from raw code. See `SUB_AGENTS.md` "Parallel Agent Standard" for brief format, deliverables, and concurrency rules. (Field report #270: full 11-phase /assemble ran through 15+ sub-agents with context at 15-25%, vs 80%+ inline.)
56
+ 11. Phase 13.5 (Doc-Currency Refresh) runs before sealing unless `--no-doc-refresh` is set; the skip is logged. `--doc-audit` is a docs-only one-shot mode that runs none of Phases 1-13 (field report #342 F-1, F-5).
57
+ 12. **All phases dispatch to sub-agents per ADR-036.** The main thread orchestrates — it plans, launches, triages, and decides. It does NOT read source files, analyze code inline, or generate findings from raw code. See `SUB_AGENTS.md` "Parallel Agent Standard" for brief format, deliverables, and concurrency rules. (Field report #270: full 11-phase /assemble ran through 15+ sub-agents with context at 15-25%, vs 80%+ inline.)
57
58
 
58
59
  ## The Pipeline
59
60
 
@@ -72,6 +73,7 @@ Fury calls ALL of them. That's the point.
72
73
  | 11 | /test | 1 | Suite green, coverage acceptable |
73
74
  | 12 | Crossfire | 1 | All 4 adversarial agents sign off |
74
75
  | 13 | Council | 1-3 | All 5 cross-domain agents sign off (incl. Troi PRD compliance) |
76
+ | 13.5 | Doc-Currency Refresh | 1 | Project docs sweep is clean before sealing (skip with `--no-doc-refresh`) |
75
77
 
76
78
  ### Phase 6.5 — Seldon's AI Review (conditional)
77
79
 
@@ -125,9 +127,36 @@ Verify no circular calls between store actions and API methods. Specifically che
125
127
 
126
128
  When a feature is added to one surface (API, dashboard, CLI, marketing site), verify all other surfaces displaying the same entities are updated. A new field added to the API response but missing from the dashboard table, or a new tier added to the pricing page but missing from the settings panel, creates an inconsistent product. After each pipeline phase that adds or modifies a feature, grep for the entity name across all surfaces: API routes, React/Vue components, CLI output formatters, marketing page copy, email templates, admin panels. (Triage fix from field report batch #149-#153.)
127
129
 
130
+ ### Phase 13.5 — Doc-Currency Refresh (pre-SEAL)
131
+
132
+ After the Council signs off, but BEFORE Fury seals the run and makes the Deploy Offer, sweep the project's source-of-truth docs for drift introduced over the course of the pipeline. A full `/assemble` touches architecture, features, version, and build state — by the time the Council finishes, the docs that describe the project frequently no longer match it. This mirrors the Doc-Currency Refresh mission in `CAMPAIGN.md`: same checklist, applied once at the end of the pipeline instead of once per mission. (Field report #342 F-1: `/assemble` shipped a Council-clean build whose `CLAUDE.md` Project block and `PROJECT_VERSION` line still described the pre-build scaffold.)
133
+
134
+ Sweep each of these for drift against the current state of the code, and fix what's stale:
135
+
136
+ 1. **`CLAUDE.md`** — Project block (name, one-liner, domain, repo), stack, and any phase/feature claims that the pipeline changed.
137
+ 2. **`MEMORY.md`** (auto-memory index, if present) — entries that now point at retired or renamed work.
138
+ 3. **`README.md`** — install/run/feature sections that the build moved past.
139
+ 4. **`PROJECT_VERSION.md`** (or equivalent) — the **Current** line must name the version this run produced, not the one it started from.
140
+ 5. **`/logs/build-state.md`** — the recorded "current state" must reflect the completed pipeline, not a stale prior session (the Phase 9 deployment-verification trap, generalized to all of build-state).
141
+ 6. **`/logs/campaign-state.md`** (if this `/assemble` ran inside a campaign) — the mission this run closed must be marked done so `/campaign` doesn't re-pick it.
142
+
143
+ This phase is **additive verification, not a rewrite** — only touch lines that are demonstrably stale. If every doc is already current, record "Doc-Currency Refresh: clean" in `assemble-state.md` and proceed. Dispatch the sweep to a sub-agent per ADR-036; the main thread triages the diff.
144
+
145
+ **`--no-doc-refresh`** skips Phase 13.5 entirely (documented opt-out, for runs where docs are maintained out-of-band). The skip is logged to `assemble-state.md` so a later run knows the docs were never swept. (Field report #342 F-1.)
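Item 4 of the sweep (the **Current** line must name the version the run produced) lends itself to a mechanical check. The sketch below assumes the project's version doc uses the same `**Current:** x.y.z` line format that `VERSION.md` uses in this package. The function names are hypothetical.

```ts
// Hypothetical sketch: detect a stale **Current** line in a version doc.

// Extract the version from a `**Current:** x.y.z` line, or null if absent.
function currentVersionLine(versionDoc: string): string | null {
  const match = versionDoc.match(/^\s*\*\*Current:\*\*\s*(\S+)/m);
  return match?.[1] ?? null;
}

// True when the doc is missing the line or names a different version
// than the one this pipeline run produced.
function isVersionDocStale(versionDoc: string, producedVersion: string): boolean {
  return currentVersionLine(versionDoc) !== producedVersion;
}
```

A sweep sub-agent could run a check like this first and only open the file for editing when it reports drift, which keeps the phase additive rather than a rewrite.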
146
+
128
147
  ### Post-Pipeline: Deploy Offer
129
148
 
130
- After Phase 13 (Council sign-off), if a deployment target is configured (`.vercel/project.json`, `fly.toml`, `railway.toml`, or PRD deploy section), Fury offers: "Council has signed off. Deploy to production?" This closes the loop instead of leaving deployment as an implicit user action. In campaign blitz mode, auto-deploy if the deploy method is known. (Field report #37: user had to prompt three times before agent deployed to Vercel.)
149
+ After Phase 13.5 (Doc-Currency Refresh, or its skip), if a deployment target is configured (`.vercel/project.json`, `fly.toml`, `railway.toml`, or PRD deploy section), Fury offers: "Council has signed off. Deploy to production?" This closes the loop instead of leaving deployment as an implicit user action. In campaign blitz mode, auto-deploy if the deploy method is known. (Field report #37: user had to prompt three times before agent deployed to Vercel.)
150
+
151
+ ## `--doc-audit` — One-Shot Documentation Audit
152
+
153
+ `/assemble --doc-audit` runs a **docs-only** pass: it dispatches a Silver-Surfer-led doc-audit roster and runs **no code review, no build, no security/QA phases** — none of Phases 1-13. The entire pipeline is replaced by a single sweep whose only job is to catch documentation drift across the whole corpus, then report. Use it when you suspect the docs have fallen behind the code but don't want to pay for a full Initiative run. (Field report #342 F-5.)
154
+
155
+ This is deliberately **broader** than `/git` Step 5.5. Step 5.5 only checks the 13 known method-doc ↔ command-file pairs (e.g. `ASSEMBLER.md` ↔ `assemble.md`) after a release touches a method doc — it is a paired-file sync check, not a corpus audit. `--doc-audit` instead lets the Surfer muster a roster sized to the actual surface area: `docs/`, `*.md` at the repo root, per-directory `CLAUDE.md` files, ADR records, pattern headers, and the slash-command/agent inventory — anywhere prose can disagree with reality.
156
+
157
+ **Canonical path.** `--doc-audit` is a convenience entry point into the canonical doc-audit mechanism, not a second implementation of it. The doc-audit roster, scoring, and report format are defined by the **`/audit-docs`** command and **`DOC_AUDIT.md`** — `/assemble --doc-audit` dispatches that same Surfer-led audit and surfaces its report. When in doubt about what the audit covers or how findings are graded, `DOC_AUDIT.md` is the source of truth; this section only documents the `/assemble` doorway to it. (Field report #342 F-5.)
158
+
159
+ `--doc-audit` is incompatible with the build/review flags (`--skip-arch`, `--skip-build`, `--fast`, `--resume`) since it runs none of those phases; pair it with `--focus "topic"` to bias the Surfer's roster toward a corner of the corpus.
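The flag incompatibility described above could be validated up front. This is a minimal sketch under the assumption that the flags arrive as a plain string array; `docAuditConflicts` is a hypothetical helper, not an existing `/assemble` function.

```ts
// Hypothetical sketch: reject build/review flags when --doc-audit is set.
const DOC_AUDIT_INCOMPATIBLE = new Set(["--skip-arch", "--skip-build", "--fast", "--resume"]);

// Returns the flags that conflict with --doc-audit (empty when there is no conflict,
// or when --doc-audit is not present at all).
function docAuditConflicts(flags: string[]): string[] {
  if (flags.indexOf("--doc-audit") === -1) return [];
  return flags.filter((flag) => DOC_AUDIT_INCOMPATIBLE.has(flag));
}
```

`--focus` is deliberately absent from the set, since the text allows pairing it with `--doc-audit`.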
131
160
 
132
161
  ## Deliverables
133
162
 
package/dist/docs/methods/CAMPAIGN.md CHANGED
@@ -283,6 +283,8 @@ User confirms, redirects, or overrides. On confirm → Step 4.
283
283
 
284
284
  **Post-infrastructure enforcement gate:** For infrastructure campaigns (deploy targets, CI/CD, monitoring, staging environments): after the infrastructure is provisioned, run `/architect --plan` to verify workflow enforcement gates exist — not just infrastructure existence. Infrastructure without process gates is incomplete.
285
285
 
286
+ **Silver Surfer gate fires at the REVIEW phase, not the solo build.** Within a mission, the gate (ADR-051 PreToolUse hook on the Agent tool) engages when Fury deploys the review/audit roster as sub-agents — NOT during the orchestrator's solo build of the mission's code. Solo-build-before-review is intentional, not a skipped gate: parallel agents editing the same tightly-coupled engine files (game loop, state machine, shared service) would clobber each other's edits and produce merge garbage. So the orchestrator builds the changeset solo, THEN the Surfer-gated review roster reads it. If you find yourself mid-build asking "did a gate get skipped?", the answer is no — the gate has not fired yet because the review phase has not started. (Field report #348 #3: mid-build confusion over an un-fired gate that fires correctly at the review phase.)
287
+
286
288
  **Dispatch model (ADR-044):** Per-mission `/assemble` runs SHOULD dispatch phases to sub-agents per `SUB_AGENTS.md` "Parallel Agent Standard." Agents are launched as named subagent types defined in `.claude/agents/` with description-driven dispatch — Opus scans `git diff --stat` and matches changed files against agent descriptions to auto-select specialists. The campaign orchestrator (main thread) manages the mission sequence, inter-mission gates, and campaign state — it does NOT perform inline code analysis. Pass findings summaries between missions, not raw code. See `docs/AGENT_CLASSIFICATION.md` for the full agent manifest (see docs/AGENT_CLASSIFICATION.md). (Field report #270)
287
289
 
288
290
  ### Campaign-Mode Pipeline
@@ -440,12 +442,27 @@ Even in `--fast` mode, each mission gets at least **1 review round** (not 3, but
440
442
 
441
443
  **UI→server route tracing (within review):** When a mission writes both UI code and server code, the review must trace every `fetch()` call in the UI to a registered server route. For each `fetch('/api/...')` in `.js`/`.ts` UI files, verify the path exists as an `addRoute()` call in the server. Missing routes produce silent 404s that are invisible in development. (Field report #50: UI button called `/api/server/restart` but no endpoint was created.)
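The route-tracing check can be approximated with plain text matching. The sketch below is a rough illustration. It assumes each path is written as a string literal starting with `/api/`, both in the UI's `fetch(...)` calls and somewhere in the arguments of the server's `addRoute(...)` calls. The real `addRoute` signature is not shown in this document, and dynamically built paths would be missed. All helper names are hypothetical.

```ts
// Hypothetical sketch: find fetch('/api/...') paths with no matching addRoute() call.

// Collect the first capture group of every match. `pattern` must use the `g` flag.
function captureAll(pattern: RegExp, text: string): string[] {
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const path = match[1];
    if (path) found.push(path);
  }
  return found;
}

// Literal /api/ paths requested by the UI (query strings are cut off at `?`).
function fetchedPaths(uiSource: string): string[] {
  return captureAll(/fetch\(\s*['"`](\/api\/[^'"`?]*)/g, uiSource);
}

// Literal /api/ paths that appear inside an addRoute(...) argument list.
function registeredPaths(serverSource: string): string[] {
  return captureAll(/addRoute\([^)]*?['"`](\/api\/[^'"`]*)['"`]/g, serverSource);
}

// Paths the UI calls that the server never registers (candidate silent 404s).
function missingRoutes(uiSource: string, serverSource: string): string[] {
  const registered = new Set(registeredPaths(serverSource));
  return fetchedPaths(uiSource).filter((path) => !registered.has(path));
}
```

A reviewer would still confirm each hit by hand, since a text match cannot see routes registered through loops or path prefixes.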
442
444
 
445
+ **Review the integrated changeset, not only the new files.** The per-mission review gate must read the full diff from the prior mission's HEAD (`git diff <prev-mission-sha>..HEAD`), not just the files this mission created. Reviewing only the new files misses pre-existing cross-cutting defects that the integration surfaces: a missing config entry the new code now depends on, a Dockerfile `COPY` that never included the directory this mission populated, a doc-vs-reality drift where the new wiring contradicts a README/PRD claim, a build/import that only breaks once the new module is referenced. The new files can each be clean in isolation while the integrated system is broken at the seams. The diff is the unit of review, not the file list. (Field report #346 #4.)
446
+
443
447
  ### One Mission, One Commit Anti-Pattern
444
448
 
445
449
  **Each mission gets its own commit.** Do NOT batch multiple missions into a single commit. The per-mission commit serves as evidence: the diff for Mission 3 should contain only Mission 3's deliverables. If the diff contains work from Missions 3-11 combined, the review is meaningless — you can't verify what changed for which mission.
446
450
 
447
451
  If a mission is small enough to merge with an adjacent one, that's fine — but explicitly acknowledge it: "Missions 3-4 combined (both methodology-only, same target file)." Never silently batch.
448
452
 
453
+ ### Execution-Time Cluster Sub-Split
454
+
455
+ Plan-time cluster recognition (Step 1 #9, field report #326) splits a cluster-natured mission into sub-missions BEFORE the campaign starts, when Dax can see 4+ named deliverables on the board. But some clusters only reveal their seam at EXECUTION time: a mission spans a **foundation + N consumers** (a new schema/migration the rest of the mission builds on, a shared client/adapter, a base config, a core engine module) and the consumers cannot be safely reviewed until the foundation is real. Or a `RISK` item surfaces mid-mission demanding the foundation land and be verified BEFORE its consumers are wired.
456
+
457
+ When that happens, split at execution time along the **foundation/consumers seam** — even though the mission was a single board entry:
458
+
459
+ 1. **Sub-mission A (foundation):** build the foundation alone. Its own review gate. Its own commit (`M-XX.a — <foundation>`).
460
+ 2. **Sub-mission B..N (consumers):** build the consumers against the now-verified foundation. Each gets its own review gate and its own commit (`M-XX.b`, `M-XX.c`, ...).
461
+
462
+ Each sub-mission is a real mission for gate purposes: 1-round review minimum, per-mission commit (One Mission, One Commit still holds), and its slice recorded in campaign-state.md. The foundation's review can catch a contract defect (a column the consumers will read but the migration didn't add, an adapter method the consumers call but the foundation didn't expose) BEFORE the consumers are written against a broken base — instead of the Gauntlet catching it three missions later.
463
+
464
+ **This complements, not duplicates, plan-time recognition (#326):** plan-time splits a cluster on the board before execution; execution-time sub-split triggers when a foundation/consumers seam (often a `RISK` item) emerges *during* a mission that looked atomic at plan time. If you notice the seam at plan time, split there; if it only surfaces under the build, split here. (Field report #346 #3.)
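The foundation/consumers split can be sketched as a small planning helper. This is illustrative only: `splitMission` and `SubMission` are hypothetical names, and the sketch only produces the `M-XX.a`, `M-XX.b`, ... identifiers in foundation-first order. The review gates and commits themselves remain the orchestrator's job.

```ts
// Hypothetical sketch: derive sub-mission ids along the foundation/consumers seam.
interface SubMission {
  id: string;                         // e.g. "M-07.a"
  deliverable: string;
  role: "foundation" | "consumer";
}

function splitMission(missionId: string, foundation: string, consumers: string[]): SubMission[] {
  const deliverables = [foundation, ...consumers];
  if (deliverables.length > 26) {
    throw new Error("this sketch only supports suffixes a-z");
  }
  // The foundation always comes first so it is built, reviewed, and committed
  // before any consumer is written against it.
  return deliverables.map((deliverable, index): SubMission => ({
    id: `${missionId}.${String.fromCharCode(97 + index)}`,
    deliverable,
    role: index === 0 ? "foundation" : "consumer",
  }));
}
```

Each returned entry corresponds to one review gate and one commit, which preserves the one-mission-one-commit rule for the sub-missions.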
465
+
449
466
  ### Per-Mission Verification Agents
450
467
 
451
468
  After each mission's review round, two agents run quick checks:
@@ -602,12 +619,22 @@ All PRD requirements are COMPLETE or explicitly BLOCKED:
602
619
  7. **PRD sync check:** Before declaring victory, compare PRD numeric claims (agent counts, feature counts, route counts, component counts) against the actual codebase for this campaign's domain. Stale PRD claims erode trust and compound across campaigns. (Field report #119)
603
620
  7a. **Tenant isolation completeness (conditional):** If the campaign touched auth, multi-tenant, or user-scoped data, grep ALL tables for `org_id` (or equivalent ownership column). Every table must be classified as either "tenant-scoped" (has org_id) or "global by design" (with documented justification). Tables without org_id and without justification are IDOR risks. This catches incomplete tenant migrations that survive per-phase sweeps — the per-phase check (BUILD_PROTOCOL Phase 4) only covers tables modified in that phase. (Field reports #229, #231)
604
621
  8. **Entity selector completeness** — for every user-facing selector (dropdown, combobox, autocomplete) that selects from a database-backed list: verify the selector can handle entities that don't exist yet. If a user can only pick from existing DB records, the feature is incomplete — the selector needs a creation flow or an external lookup fallback. Common examples: city selector (needs geocoding fallback), category picker (needs "Other" or custom entry), user selector (needs invite flow). (Field report #263: city selector only searched existing DB cities — users couldn't set homebase to any city not already in the database.)
622
+ 8b. **Doc-Currency Refresh mission (Coulson + Wong) — mandatory pre-SEAL sweep.** Before the final sign-off seals the version, run a dedicated cross-doc currency sweep. The Step 0 freshness check and the Step 6 #9 `build-state.md` update each cover ONE file at ONE moment; across rapid sealed versions the load-bearing docs rot collectively because no single mission owns their joint currency. This mission gives that sweep an owner. **Coulson** (release authority) drives version-line accuracy; **Wong** (lessons/changelog/PRD refresh) drives prose currency. Sweep and reconcile against the current `git log -1` + `package.json`/`pyproject.toml` version:
623
+ - **`CLAUDE.md`** — Project block, version references, command/agent counts, any "as of vX.Y.Z" claims
624
+ - **`MEMORY.md`** (auto-memory index, if present) — stale "next:" pointers, completed-work entries that still read as pending
625
+ - **`README.md`** — install/usage snippets, badge versions, feature lists that drifted from reality
626
+ - **`PROJECT_VERSION.md` Current line** (or `VERSION.md` "Current:" line) — must equal the version about to be sealed
627
+ - **`/logs/build-state.md`** — version, test counts, deployment state
628
+ - **`/logs/campaign-state.md`** — Prophecy Board statuses, final mission status
629
+ For each file, fix drift in place — never seal known drift. If a doc is already current, note "current at <SHA>" and move on (idempotent). **Opt-out: `--no-doc-refresh`** skips this mission for fast methodology-only or hotfix campaigns where the docs provably did not move; log the skip in campaign-state.md with a one-line reason. This complements (does not replace) the existing state-freshness checks. (Field report #342 F-1: load-bearing docs rotted across rapid sealed versions because the per-file freshness checks had no cross-doc owner.)
605
630
  9. **Victory Checklist** — ALL must be true before sign-off:
606
631
  - [ ] Gauntlet Council signed off (6/6 or all domains pass)
607
632
  - [ ] All BLOCKED items acknowledged by user
608
633
  - [ ] PRD claims verified against codebase
609
634
  - [ ] `/debrief --submit` filed (issue number recorded)
610
635
  - [ ] Campaign-state.md updated with final status
636
+ - [ ] Doc-Currency Refresh completed (or `--no-doc-refresh` skip logged with reason)
637
+ - [ ] **Production-config boot assertion passed** — a green sandbox suite is necessary but NOT sufficient. Boot the app under its real production config (production env vars / `NODE_ENV=production`, real adapter selection, production build artifact) and assert it reaches a ready state without a config/credential fault. Sandbox adapters can pass every test while the production path fails on a missing env var, a real-vs-sandbox adapter mismatch, or a build-only import error. Do not declare victory on sandbox-green alone. (Field report #350 #3: sandbox suite was fully green but the production-config boot path was never asserted before sign-off.)
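One way to script the boot assertion is a bounded readiness poll. The sketch below is illustrative only: `wait_ready` is a hypothetical helper, and the start command, port, and health path in the usage note are assumptions to be replaced with the project's real values.

```bash
# wait_ready <attempts> <probe command...>
# Hypothetical helper: re-runs the probe once per second until it succeeds
# or the attempt budget is spent. Returns 0 on ready, 1 on timeout.
wait_ready() {
  attempts="$1"
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Assumed usage: start the production artifact (for example `NODE_ENV=production node dist/server.js &`), then gate sign-off on `wait_ready 30 curl -fsS http://127.0.0.1:3000/api/health || exit 1`. The probe is passed as a command so the same helper works for any readiness signal.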
611
638
 
612
639
  ### The Reckoning (Optional Pre-Launch Audit)
613
640
 
@@ -97,6 +97,11 @@ For each service in `docker-compose.yml`, verify:
97
97
  7. **Dependency health** — `depends_on` with `condition: service_healthy` (compose v2.1+). Without it, the app starts before its database is ready.
98
98
  (Field report #280)
99
99
 
100
+ **Compose validation goes deeper than syntax (field report #352 #2).** `docker compose config` only validates *syntax* — it renders the merged YAML and exits 0 even when the resulting topology is wrong. Two failure modes it will not catch:
101
+
102
+ - **Dependency closure.** A service can reference a network, volume, or `depends_on` target whose definition exists but whose *startup* chain is broken. Check the closure with `docker compose up --dry-run` — it walks the full dependency graph and reports what would actually start (and in what order) without launching containers.
103
+ - **Overlay merge, not overlay replace.** Compose **merges** list-and-map fields like `depends_on` and `environment` across overlay files (`-f base.yml -f docker-compose.prod.yml`); it does not replace them. The classic trap: `base.yml` declares `depends_on: [redis]` for development, and the prod overlay tries to drop it with `depends_on: []`. The empty list **merges into** the base list, the `redis` edge **survives**, and prod still waits on (or starts) a dev-only Redis. To *replace* rather than merge, use the override tags: `depends_on: !override []` (replace the whole list) or `depends_on: !reset null` (remove the key entirely). Verify the rendered result with `docker compose config` and confirm the unwanted edge is actually gone; never assume the overlay won.
104
+
100
105
  **L — Monitoring:** Health endpoint (/api/health checking DB, Redis, disk). External uptime monitor. Request logging (method, path, status, duration). Error tracking. Slow query logging (>1s). Worker job logging. Alerts: CPU >80%, Memory >85%, Disk >80%.
101
106
 
102
107
  **Build Staleness Detection (health endpoint):** The health endpoint MUST include a build fingerprint check. At startup, capture a build fingerprint (git commit hash, `BUILD_HASH` env var, or entry bundle mtime). Include it in `/api/health` responses. After any deploy, compare the health endpoint's fingerprint against the expected value. A mismatch means the process serves stale code — the build completed but was never reloaded. Automate: if health fingerprint != deployed commit, trigger process reload. This is the #1 cause of "I deployed but nothing changed" incidents. (Field reports #278, #279)
@@ -164,6 +169,33 @@ If a process manager (PM2, systemd, Docker, supervisord) owns the application po
164
169
 
165
170
  **Detection rule:** When writing CLAUDE.md "How to Run" sections or session restart commands, check if the project uses a process manager (`ecosystem.config.js`, `docker-compose.yml`, `*.service` files). If yes, the restart command MUST go through the PM — not through port killing.
166
171
 
172
+ ### PM2 Operational Foot-guns
173
+
174
+ **`pm2 reload <config>` does NOT re-read log paths.** `error_file` / `out_file` paths bind at process *registration* time, not at reload time (field report #343 F9). If you change a log path in `ecosystem.config.js` and run `pm2 reload`, PM2 keeps writing to the old paths — the new ones never take effect, and a log-rotation or disk-pressure fix silently does nothing. Changing log paths requires a full re-registration cycle:
175
+ ```bash
176
+ pm2 delete <app> # drop the old registration
177
+ pm2 start ecosystem.config.js --cwd /path/to/project
178
+ pm2 save # persist so the new paths survive reboot
179
+ ```
180
+ The same applies to any other property that binds at registration (`exec_mode`, `instances`, `cwd`): `pm2 reload` reloads code, not the process definition.
181
+
182
+ **Multi-user deploy setups need per-user git identity (field report #343 F3).** When each environment runs as a different OS user (e.g. `deploy-staging`, `deploy-prod`), any git operation the deploy performs as that user — a merge commit, a `git stash`, a tag, an auto-commit of generated lockfiles — fails with `fatal: empty ident name (for <user@host>) not allowed` if that user has no `user.email` / `user.name`. The fault is invisible until a fallback path that commits actually runs in production. Provision git identity per deploy user:
183
+ ```bash
184
+ sudo -u deploy-prod git config --global user.email "deploy@example.com"
185
+ sudo -u deploy-prod git config --global user.name "Prod Deploy"
186
+ ```
187
+ Add this to `provision.sh` for every Unix user that will run git as part of a deploy or fallback path.
188
+
189
+ ### Deploy-Strategy Nomenclature Check
190
+
191
+ If a deploy script's comments or docs claim **blue-green** or **zero-downtime**, verify the code actually implements an atomic-swap mechanism before believing the label (field report #343 F7). A real zero-downtime swap is one of:
192
+
193
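+ - **temp-build-then-rename** — build into `release-new/`, then switch to it with a single atomic rename, typically by repointing a `current` symlink (a plain `mv release-new release` moves the new directory *inside* `release` when `release` already exists, which is not a swap),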
194
+ - **container swap** — start the new container, health-check it, then cut traffic over and stop the old one, or
195
+ - **load-balancer cutover** — add the new instance to the pool, drain and remove the old one.
196
+
197
+ A `stop → build → start` loop mislabeled "blue-green" serves nothing during the build window and produces a 502 gap on every deploy. The label is not the mechanism. Audit check: grep the deploy script for the claim, then confirm a rename/symlink-repoint, container cutover, or LB pool change exists. If it's a stop-build-start loop, either fix it to atomic-swap or correct the comment — a mislabeled strategy hides a recurring outage.
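For reference, a symlink repoint can be made atomic by creating the new link under a temporary name and renaming it over the live one. This is a minimal sketch, not the project's deploy script: `swap_current` and the `releases/` layout in the usage note are hypothetical, and `mv -T` assumes GNU coreutils or BusyBox (it is not POSIX).

```bash
# swap_current <release_dir> <link_path>
# Hypothetical helper: builds a temporary symlink, then renames it over the
# live link. The rename replaces the old link in one step, so readers never
# observe a missing path. Assumes GNU or BusyBox mv (-T is not POSIX).
swap_current() {
  release_dir="$1"
  link_path="$2"
  tmp_link="${link_path}.tmp.$$"
  ln -s "$release_dir" "$tmp_link" && mv -T "$tmp_link" "$link_path"
}
```

Assumed usage: build into `releases/<sha>/`, health-check it, then run `swap_current "$PWD/releases/<sha>" "$PWD/current"` with the web server's root pointed at `current`.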
198
+
167
199
  ### CI runs `npm test` at repo root
168
200
 
169
201
  In monorepo CI workflows, run `npm test` at the repository root — NOT `npm run test -w <workspace-name>`. The workspace-scoped form skips the root `pretest` hook, silently bypassing any root-level validators (agent-ref checkers, gate tests, consistency checks).
@@ -191,6 +223,21 @@ fi
191
223
 
192
224
  Applies to: Vercel Git Integration, Cloudflare Pages Git Integration, Netlify Git Integration, Firebase web-hook auto-deploys.
193
225
 
226
+ ### The served artifact is not the built artifact
227
+
228
+ Every step exiting 0 — `git pull` ✓, `npm run build` ✓, `pm2 reload` / `docker compose up` ✓ — proves the build *ran*; it does NOT prove the **served** bundle is the one you just built (field report #349 F-1). The two can diverge whenever the thing that builds and the thing that serves are different processes pointed at different paths. The canonical split: a **host nginx static root** serves `/var/www/app/dist`, while the build runs *inside a Docker container* and writes to a **container-internal `dist`** that the host root never sees. Build succeeds, container restarts clean, health check is green — and prod serves the previous bundle indefinitely because nginx is reading a directory nobody rebuilt.
229
+
230
+ Rule: after deploy, confirm the SERVED bundle matches the BUILT one by **fingerprint fetched back through the public/served path** — not by exit codes. Capture a build fingerprint (git short SHA, `BUILD_HASH`, or the hashed entry-bundle filename) at build time, then fetch it back through the real serving path and assert equality:
231
+ ```bash
232
+ EXPECTED="$(git rev-parse --short HEAD)"
233
+ # Pull the fingerprint through the SERVED path — the public URL or the host static root,
234
+ # whichever end users actually hit — never the build directory.
235
+ SERVED="$(curl -s "https://$DEPLOY_URL/version.txt")" # or grep the hashed main.<hash>.js from index.html
236
+ [ "$SERVED" = "$EXPECTED" ] || { echo "SERVED ARTIFACT MISMATCH: served=$SERVED built=$EXPECTED"; exit 1; }
237
+ ```
238
+
239
+ This **generalizes to manual `/deploy`** the two automated checks already in this doc: **Build Staleness Detection** (§health endpoint — the process serves stale code) catches a build-but-no-reload within one process, and **Post-push live-URL fingerprint** (§above) catches a broken platform auto-deploy. This entry is the same fingerprint discipline for the self-hosted, multi-location, hand-run deploy: assert the served fingerprint equals the built fingerprint as the final gate of any manual deploy, not just platform pushes.
240
+
194
241
  ### Methodology-exposure check (static-host deploys)
195
242
 
196
243
  After deploying to a static CDN (Cloudflare Pages, Vercel, Netlify, Firebase, S3+CloudFront), curl a known methodology path and assert 404 / denied:
@@ -293,6 +340,23 @@ If health check fails after deploy:
293
340
 
294
341
  **PM2 discipline: never `pm2 delete` + `pm2 start` without `--cwd`.** Always specify the working directory explicitly: `pm2 start ecosystem.config.js --cwd /path/to/project`. Without `--cwd`, PM2 resolves paths relative to the current shell directory, which may differ from the project root — especially in deploy scripts that `cd` between operations. A `pm2 start` from the wrong directory silently starts the process with wrong paths, serving 404s on every route. (Triage fix from field report batch #149-#153.)
295
342
 
343
+ ### Docker Cleanup Preflight
344
+
345
+ Before any `rm -rf` against a Docker **bind-mount** path (volumes the container wrote to as root — pgdata, redis dumps, uploaded files), preflight the ownership; do not just run the delete and hope (field report #353 RC-003). Docker bind-mounts written by a container default to **root** ownership on the host, so an unprivileged agent's `rm -rf` fails partway with `Permission denied`, often after deleting the writable half of the tree — a worse state than not starting.
346
+
347
+ Preflight: `stat` the path's owner first, and branch on it:
348
+ ```bash
349
+ target=/var/lib/myapp/pgdata
350
+ owner="$(stat -c %U "$target" 2>/dev/null || stat -f %Su "$target")" # GNU || BSD/macOS
351
+ if [ "$owner" = "root" ] && [ "$(id -u)" -ne 0 ]; then
352
+ echo "MANUAL STEP REQUIRED — $target is root-owned; run as operator:"
353
+ echo " sudo rm -rf $target"
354
+ else
355
+ rm -rf "$target"
356
+ fi
357
+ ```
358
+ When the path is root-owned and the agent is unprivileged, **emit the `sudo`-prefixed step as a MANUAL operator action** rather than attempting (and half-completing) the delete. A clean handoff beats a partial destruction. (`stat -c %U` is GNU coreutils; `stat -f %Su` is BSD/macOS — the snippet tries both for portability.)
359
+
296
360
  ## Multi-Environment Isolation
297
361
 
298
362
  When staging and production coexist on the same server, enforce full isolation:
@@ -308,6 +372,29 @@ When staging and production coexist on the same server, enforce full isolation:
308
372
 
309
373
  Convention isn't enough — enforcement is. The pre-push hook is the single most effective protection. (Field report #241: 68-hour production outage from shared infrastructure.)
310
374
 
375
+ ### Renaming a Linked Worktree Directory Breaks Git Silently
376
+
377
+ A linked git worktree (staging worktree, release worktree) keeps **two** pointer files that must agree on the directory's path. Renaming the worktree directory with a plain `mv` orphans both, and git gives you no warning (field report #343 F2):
378
+
379
+ 1. The worktree's own `.git` **file** (not a directory — it contains `gitdir: /abs/path/to/main/.git/worktrees/<name>`).
380
+ 2. The main repo's `.git/worktrees/<name>/gitdir` file, which points back at the worktree's `.git` file.
381
+
382
+ After a bare `mv staging staging-old`, both paths are stale. The worst part: **`git worktree list` does NOT warn** — it happily prints the old path, so the breakage is invisible until a git command inside the moved worktree fails with `fatal: not a git repository` or a deploy that `cd`s into the worktree silently operates on the wrong tree.
383
+
384
+ Fix — never `mv` a worktree directory. Use the porcelain that updates both pointers atomically:
385
+ ```bash
386
+ git worktree move staging /new/abs/path/staging-old
387
+ ```
388
+ If a directory was already moved by hand, repair both pointers manually:
389
+ ```bash
390
+ # 1. fix the worktree's own .git file
391
+ echo "gitdir: /abs/main/.git/worktrees/staging" > /new/abs/path/staging-old/.git
392
+ # 2. fix the main repo's back-pointer
393
+ echo "/new/abs/path/staging-old/.git" > /abs/main/.git/worktrees/staging/gitdir
394
+ git worktree repair /new/abs/path/staging-old # validates both ends
395
+ ```
396
+ `git worktree repair` is the belt-and-suspenders step — run it after any manual edit to confirm both ends resolve.
397
+
311
398
  ## Deploy Safety Rules
312
399
 
313
400
  **rsync exclusion mandate:** NEVER use `rsync --delete` without excluding VPS-only directories. User-uploaded files, generated avatars, and data files only exist on the VPS — `--delete` will destroy them. Mandatory exclusions:
@@ -332,6 +419,62 @@ Add project-specific exclusions for any directory that receives runtime-generate
332
419
 
333
420
  **Post-deploy asset verification:** After deploying, verify specifically the files that *changed* in this deploy — not pre-existing assets. Check: (a) correct content-type header (text/html on a static asset means the file is missing from the deployment), (b) correct content-length (not the index.html fallback size), (c) deployment list shows the correct environment. Do NOT verify only pre-existing assets — they prove the host is up, not that the deploy succeeded. (Field report #114)
334
421
 
422
+ **Read back after a vendor PUT that doesn't echo the object.** When a deploy or config step `PUT`s to a vendor/control-plane API (DNS provider, CDN, Plex, a SaaS settings endpoint) and the response does **not** contain the mutated object, do NOT treat the `200` as confirmation — issue a follow-up `GET` and assert the field you set actually took (field report #353 RC-004). A vendor `PUT` can return `200 OK` while silently discarding body params it doesn't recognize, applies asynchronously, or rejects at a validation layer that still returns success (the Plex pattern: settings PUT returns 200 but the value is unchanged). The status code confirms the request was *received*, not that the *mutation persisted*. Rule: for any non-echoing PUT/PATCH on the deploy path, follow with a read-back and compare before declaring success.
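The read-back rule can be wrapped in a small comparison helper. The sketch below is an assumption-laden illustration: `verify_readback` is a hypothetical name, and the endpoint, field, and use of `jq` in the usage note are placeholders rather than any vendor's real API.

```bash
# verify_readback <expected> <fetch command...>
# Hypothetical helper: runs the fetch command (the follow-up GET) and
# compares its output with the value that was just written.
# Returns 0 on match, 1 on mismatch, 2 if the fetch itself failed.
verify_readback() {
  expected="$1"
  shift
  actual="$("$@")" || return 2
  [ "$actual" = "$expected" ]
}
```

Assumed usage after a non-echoing PUT, with an exported `$API` base URL and a `ttl` field that are purely illustrative: `verify_readback 3600 sh -c 'curl -fsS "$API/settings" | jq -r .ttl' || exit 1`. Passing the fetch as a command keeps the helper independent of any one vendor.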
423
+
424
+ ## Env-File Loading Safety
425
+
426
+ **NEVER load `.env` files with eval-export.** The pattern `while read line; do eval "export $line"; done < .env` re-parses every value as shell code, so parameter and command expansion run on the secret itself. Any secret containing a literal `$` (bcrypt hashes such as `$2b$12$...`, PHP-style hashes, some JWT signing keys) gets mangled: in `$2b$12$...` the shell expands `$2` and `$1` as positional parameters and silently substitutes them (usually with empty strings), corrupting the secret. The app then boots with a broken hash and rejects every login, or signs tokens with a truncated key. The failure is invisible until auth breaks in production (field report #344 F1). The related shortcut `export $(cat .env | xargs)` does not re-expand `$`, but it word-splits and glob-expands values and lets `xargs` strip quotes and backslashes, so it corrupts a different class of values and is just as unfit for secrets.
427
+
428
+ Use a `$`-safe literal parser that never re-evaluates the value:
429
+ ```bash
430
+ # Safe: read each line verbatim, split on the FIRST '=' only, no expansion.
431
+ while IFS= read -r line || [ -n "$line" ]; do # also handles a last line with no newline
432
+ case "$line" in
433
+ ''|'#'*) continue ;; # skip blanks and comments
434
+ *=*) export "${line%%=*}=${line#*=}" ;; # literal key=value; a trailing '=' in the value survives
435
+ esac
436
+ done < .env
437
+ ```
438
+ `$`-safe alternatives that bypass the shell entirely:
439
+
440
+ - **Node ≥20.6:** `node --env-file=.env app.js` — Node parses the file itself, no shell expansion (the flag was added in Node 20.6.0).
441
+ - **systemd:** `EnvironmentFile=/etc/myapp/app.env` in the unit — systemd reads values literally.
442
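+ - **Docker Compose:** `env_file: .env` — Compose reads the file directly and never shell-evals it, but it does apply its own `${VAR}` interpolation to unquoted and double-quoted values, so single-quote any value that contains a literal `$`.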
443
+
444
+ Audit existing deploy scripts: grep for `eval "export`, `eval export`, and `export $(cat`. Any hit is a latent secret-corruption bug — replace it with the literal parser or one of the runtime-native loaders above.
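The audit can be run as a single recursive grep. This is a rough sketch under stated assumptions: `audit_env_loaders` is a hypothetical helper, the pattern covers only the three spellings named above, and `grep -r` is assumed to be available (GNU and BSD grep both provide it).

```bash
# audit_env_loaders <dir>
# Hypothetical helper: lists eval-export and export-$(cat ...) loaders.
# Exit status follows grep: 0 when a risky loader is found, 1 when clean.
audit_env_loaders() {
  grep -rnE 'eval[[:space:]]+"?export|export[[:space:]]+\$\(cat' "$1"
}
```

Because the exit status is inverted relative to "pass", a CI gate would read `if audit_env_loaders scripts/; then exit 1; fi`, where `scripts/` is an assumed location for deploy scripts.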
445
+
446
+ ## systemd Unit Hardening (Node.js)
447
+
448
+ Sandboxing directives in a systemd unit are good practice, but **Node.js units must NOT set `MemoryDenyWriteExecute=true`** (field report #344 F3). V8's JIT compiler maps pages that are simultaneously writable and executable (W^X is violated by design for JIT); `MemoryDenyWriteExecute=true` (MDWE) forbids exactly that, so the Node process dies with **`SIGTRAP` at boot** before it serves a single request. The crash looks unrelated to the unit file — operators chase the app for hours.
449
+
450
+ Safe Node hardening stanza — everything useful **except** MDWE:
451
+ ```ini
452
+ [Service]
453
+ # --- hardening (Node-safe) ---
454
+ NoNewPrivileges=true
455
+ ProtectSystem=full
456
+ ProtectHome=true
457
+ PrivateTmp=true
458
+ PrivateDevices=true
459
+ ProtectKernelTunables=true
460
+ ProtectKernelModules=true
461
+ ProtectControlGroups=true
462
+ RestrictSUIDSGID=true
463
+ RestrictRealtime=true
464
+ LockPersonality=true
465
+ # MemoryDenyWriteExecute=true # <-- DO NOT: V8 JIT needs W+X pages; SIGTRAP at boot
466
+ ReadWritePaths=/var/lib/myapp /var/log/myapp
467
+ ```
468
+ Note: ahead-of-time-compiled binaries (Go, Rust, statically compiled C/C++) have no JIT and **can** keep `MemoryDenyWriteExecute=true` — the restriction is specific to JIT runtimes (Node/V8, the JVM, PyPy, .NET with JIT). When a unit template is shared across services, gate MDWE on the runtime, not on the unit boilerplate.
469
+
470
+ ## Config Foot-Guns (deploy/runtime)
471
+
472
+ Three recurring config traps that pass every syntax check yet break at runtime (field report #352 #5):
473
+
474
+ - **Empty-string env defaults are non-nullish.** A shell default of the form `${VAR:-}` (or a Compose `VAR: ""`) sets the variable to `""`, which is a *defined, non-null* value. Downstream `cfg.X = process.env.VAR ?? defaultX` then keeps `""` — nullish coalescing (`??`) only fires on `null`/`undefined`, never on empty string — so the intended default is silently poisoned and the app runs with an empty config value. Either leave the var truly unset (omit the `:-` default) or validate-and-coerce empty strings at the config boundary.
475
+ - **Dev hostnames hardcoded in worker healthchecks false-fail in prod.** A worker healthcheck that pings `http://localhost:3000` or `redis://dev-redis` passes in dev and fails in prod, marking a healthy worker unhealthy (and triggering restart loops). Healthcheck targets must come from the same env config the worker uses, never literals.
476
+ - **Awaiting best-effort side effects on the auth path blocks sign-in.** `await analytics.track(...)` / `await auditLog.write(...)` inline in the login handler means a slow or down telemetry backend stalls — or fails — the sign-in. Best-effort side effects must be fire-and-forget (queue them, `void`-them, or move them off the request path), never `await`ed on a latency-critical auth route.
477
+
335
478
  ## Subdomain Routing (Cloudflare Pages / Vercel / Netlify)
336
479
 
337
480
  Platform-hosted static sites serve the entire project from root. Subdomain-to-subdirectory routing (e.g., `labs.example.com` → `/labs/`) requires platform-specific configuration:
@@ -344,6 +487,21 @@ Platform-hosted static sites serve the entire project from root. Subdomain-to-su
344
487
 
345
488
  **Always test routing before announcing a subdomain.** Curl the subdomain and verify it serves the expected content, not the root index.html.
346
489
 
490
+ ## Cloudflare TLS Mode (Flexible vs Full/Strict)
491
+
492
+ On a **Flexible** TLS zone, Cloudflare terminates TLS at the edge and talks to the origin over **plain HTTP**. If that origin then **301-redirects HTTP → HTTPS** (the near-universal nginx/Caddy default), it bounces the edge's HTTP request back to HTTPS, which Cloudflare re-fetches over HTTP, which redirects again — an **infinite redirect loop** (`ERR_TOO_MANY_REDIRECTS`) for every visitor (field report #344 F4a). On a Flexible zone the origin must serve the app on plain HTTP and must NOT force the HTTPS upgrade — let Cloudflare own the HTTPS edge.
493
+
494
+ **A Let's Encrypt cert on a sibling host is NOT proof the zone is Full/Strict.** Operators see `https://api.example.com` with a valid LE cert and assume the apex is Full mode too — but TLS mode is per-zone (sometimes per-host with config-rule overrides), and a working cert elsewhere says nothing about the mode applied to *this* hostname. Don't infer the mode from a neighbor's cert; check it.
495
+
496
+ **Behavioral check — count redirect hops, don't read config:**
497
+ ```bash
498
+ # Healthy Full/Strict origin: 0–1 hops. A Flexible-loop origin spirals (curl caps at --max-redirs).
499
+ curl -sIL --max-redirs 10 "http://$ORIGIN_HOST/" | grep -ci '^location:'
500
+ ```
501
+ A count at or near the cap (or `curl: (47) Maximum (10) redirects followed`) is the Flexible-loop signature. Fix by either switching the zone to **Full (strict)** and keeping the origin's HTTPS redirect, OR keeping **Flexible** and removing the origin's HTTP→HTTPS 301. Pick one; don't mix.
502
+
503
+ **Minimum Cloudflare API token scope for `/deploy`.** So `/deploy` can verify the zone's SSL mode *before* it writes an nginx config that adds a redirect (and thus before it can create the loop), the deploy token must include **`Zone → SSL and Certificates → Read`** (`Zone:SSL`) and **`Zone → Certificates → Read`** (field report #344 F4b). With those scopes the deploy step queries the zone's `ssl` setting, and only emits a redirect-bearing origin config when the mode is Full/Strict. A token scoped to DNS-only cannot see the SSL mode and will happily ship a redirect into a Flexible zone.
504
+
347
505
  ## Deploy Surface Boundary
348
506
 
349
507
  **Invariant:** the repository root is NEVER the deploy surface. Physical separation between "all files tracked in the repo" and "files uploaded to the CDN / server" is enforced by tool configuration, not by `.gitignore`.
@@ -0,0 +1,92 @@
1
+ # DOC AUDIT — Documentation Currency & Cross-Reference Integrity
2
+ ## Lead: **Surfer-led roster** · Domain: Documentation correctness (NOT UX)
3
+
4
+ > *"The map is not the territory — but a map that lies about the territory is worse than no map at all."*
5
+
6
+ ## Identity
7
+
8
+ A doc audit verifies that VoidForge's prose tells the truth about VoidForge's behavior. It is a correctness discipline, not a styling one. Code drifts; commands get added, renamed, or retired; versions bump; ADRs supersede each other. Every drift leaves the docs one step behind reality, and stale docs are load-bearing lies — the next session (or the next user) acts on them. The doc audit catches that drift before it ships (field report #342 F-3).
9
+
10
+ **Doc audits are NOT a `/ux` concern.** Galadriel's UX pass evaluates the *user-facing product* — screens, flows, a11y, copy that end users read. A doc audit evaluates the *methodology and developer documentation* — method docs, command specs, the Holocron, README, ADRs, CLAUDE.md. These are different artifacts reviewed against different sources of truth. Routing doc-currency findings into a UX pass buries them under unrelated screenshots and produces neither a real UX review nor a real doc audit. Keep them separate (#342 F-3).
11
+
12
+ ## Goal
13
+
14
+ After a doc audit, every documented claim is true at audit time, every cross-reference resolves, every command listed in CLAUDE.md has a matching spec (and vice versa), and the version stated in the docs matches the single source of truth. A reader who trusts the docs is not misled.
15
+
16
+ ## The Four Checks
17
+
18
+ A doc audit is four distinct verifications. Each has its own source of truth — the audit is the act of diffing prose against that source (#342 F-3).
19
+
20
+ ### 1. Currency
21
+
22
+ Documentation describes the system **as it is now**, not as it was. For every factual claim that can go stale — counts, file paths, feature lists, "X does Y" behavioral statements — verify it against the live artifact:
23
+
24
+ | Claim type | How to verify |
25
+ |------------|--------------|
26
+ | Agent / command / pattern counts | `ls .claude/agents/*.md \| wc -l`, count the rows in the relevant table |
27
+ | File paths cited as deliverables | `[ -f <path> ] && echo present \|\| echo MISSING` |
28
+ | Behavioral claims ("the hook blocks X") | Read the code, not memory — the doc must match the implementation |
29
+ | Retired / renamed features | grep for the old name; if it still appears as current, it is stale |
30
+
31
+ Document each verified claim with its source (`from ls`, `from <file>:<line>`). A claim you cannot anchor to an artifact is a claim you cannot defend.
32
+
33
+ ### 2. Cross-Reference Integrity
34
+
35
+ Every internal reference must resolve. Broken cross-references rot silently because nothing errors at read time:
36
+
37
+ - **File links** — every `/docs/...`, `/.claude/...`, pattern, and method path cited must exist on disk.
38
+ - **ADR references** — every `ADR-NNN` mentioned must correspond to a real, current ADR; superseded ADRs must say so.
39
+ - **"See X" pointers** — the target section/doc must still exist and still cover what the pointer claims.
40
+
41
+ Verification is mechanical: extract every path-like and `ADR-NNN`-like token, then existence-check each one.
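A minimal sketch of the path half of that check follows. It assumes cited paths start with `/docs/` or `/.claude/` and contain no spaces; `check_doc_paths` is a hypothetical helper, `grep -o` is a GNU/BSD extension, and `ADR-NNN` tokens would need a separate lookup that is not shown here.

```bash
# check_doc_paths <doc> <repo_root>
# Hypothetical helper: pulls /docs/... and /.claude/... tokens out of a doc
# and reports any that do not exist under the repo root.
# Returns 0 when every path resolves, 1 otherwise.
check_doc_paths() {
  doc="$1"
  root="$2"
  missing=0
  for p in $(grep -oE '/(docs|\.claude)/[A-Za-z0-9_./-]+' "$doc" | sort -u); do
    p="${p%.}" # drop a trailing sentence period
    if [ ! -e "$root$p" ]; then
      echo "MISSING: $p"
      missing=1
    fi
  done
  return "$missing"
}
```

Assumed usage from the repository root: `check_doc_paths CLAUDE.md .`. The pattern will also pick up URL fragments that happen to contain `/docs/`, so treat hits as candidates to confirm rather than verdicts.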
42
+
43
+ ### 3. Command ↔ Method Sync
44
+
45
+ The command table in CLAUDE.md, the slash-command specs in `.claude/commands/`, and the method docs in `/docs/methods/` describe the same surface from three angles. They must agree:
46
+
47
+ - Every command in the CLAUDE.md table has a spec file in `.claude/commands/` and a method-doc entry where one is expected.
48
+ - Every command spec in `.claude/commands/` appears in the CLAUDE.md table (no orphan commands).
49
+ - Aliases (e.g. `/review` → `/engage`, `/security` → `/sentinel`) are documented as aliases in all three places, not as independent commands in one and missing from another.
50
+ - Flag taxonomy claims (which flag works on which command) match the command specs.
51
+
52
+ A command documented in one place but missing from another is a sync defect — report it.
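The orphan-command direction of this check (spec file exists, table row missing) can be sketched as below. `orphan_commands` is a hypothetical helper, and it assumes each spec is named `<command>.md` and that CLAUDE.md mentions the command as a standalone `/<command>` token; the reverse direction and alias handling are not covered.

```bash
# orphan_commands <commands_dir> <claude_md>
# Hypothetical helper: for each <name>.md spec, checks that "/<name>" appears
# in the CLAUDE.md text as a standalone token.
# Returns 0 when every spec is listed, 1 if any is missing.
orphan_commands() {
  cmd_dir="$1"
  claude_md="$2"
  orphans=0
  for spec in "$cmd_dir"/*.md; do
    [ -e "$spec" ] || continue
    name="$(basename "$spec" .md)"
    if ! grep -qE "(^|[^A-Za-z0-9_-])/${name}([^A-Za-z0-9_-]|\$)" "$claude_md"; then
      echo "ORPHAN: /$name"
      orphans=1
    fi
  done
  return "$orphans"
}
```

Assumed usage: `orphan_commands .claude/commands CLAUDE.md`. The boundary characters in the pattern keep a path such as `commands/deploy.md` from counting as a mention of `/deploy`.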
53
+
54
+ ### 4. Version-SSOT Consistency
55
+
56
+ There is one source of truth for the version. Every other mention must match it:
57
+
58
+ - Identify the SSOT (e.g. `VERSION.md`, `package.json` `version`, the latest `/git` release tag).
59
+ - Every version string in docs, changelog headers, and README badges must equal the SSOT.
60
+ - The changelog must have an entry for the current version; a version bumped without a changelog entry is a currency defect (Coulson's domain).
61
+
62
+ Mismatched version strings are the most common — and most embarrassing — doc defect. They are also the cheapest to verify: grep for version-shaped tokens, compare to SSOT.
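As an illustration of that grep-and-compare step, the sketch below lists every `X.Y.Z`-shaped token that differs from the SSOT. `stale_versions` is a hypothetical helper, the three-part version shape is an assumption, and `grep -o`/`-H` are GNU/BSD extensions.

```bash
# stale_versions <ssot_version> <file>...
# Hypothetical helper: prints file:line:token for every X.Y.Z token that
# differs from the SSOT version.
# Exit status follows the final grep: 0 when stale tokens exist, 1 when none.
stale_versions() {
  ssot_re="$(printf '%s' "$1" | sed 's/\./\\./g')" # escape dots for the regex
  shift
  grep -HnoE 'v?[0-9]+\.[0-9]+\.[0-9]+' "$@" | grep -vE ":v?${ssot_re}\$"
}
```

Assumed usage: `stale_versions 23.12.0 README.md CLAUDE.md VERSION.md`. Changelog history and dependency versions legitimately contain other version strings, so the output is a review list rather than an automatic failure.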
63
+
64
+ ## The Surfer-Led Doc Roster
65
+
66
+ Doc audits run under the Silver Surfer Gate like any other review command — announce the herald, launch the Surfer, deploy the roster it returns. The Surfer biases toward the documentation specialists below for this domain; it is not a fixed list, and the Surfer may add cross-domain agents when the diff warrants (#342 F-3).
+
+ | Agent | Universe | Focus in a doc audit |
+ |-------|----------|----------------------|
+ | **Troi** | Star Trek | PRD ↔ implementation claim traceability — does every documented claim trace to something that actually exists in code/PRD? Catches requirement and asset gaps. |
+ | **Wong** | Marvel | Doc accuracy — API docs, inline comments, README correctness; the guardian of "does the prose match the code." |
+ | **Irulan** | Dune | Documentation completeness — the historian checks for *missing* documentation: undocumented commands, agents, ADRs, and features that exist but are described nowhere. |
+ | **Coulson** | Marvel | Version / changelog currency — version-SSOT consistency, changelog completeness, release-note accuracy. |
+
+ **Division of labor:** Troi works inward (claim → does it trace?); Irulan works outward (artifact → is it documented?). Together they close both gaps: documented-but-false (Troi) and true-but-undocumented (Irulan). Wong validates the prose itself; Coulson owns everything version- and release-shaped.
+
+ ## Integration Points
+
+ | Command | How it uses the doc audit |
+ |---------|---------------------------|
+ | `/git` | Before a release, Coulson verifies version-SSOT consistency and changelog currency as part of the bump. |
+ | `/engage` | Code-review passes flag doc drift adjacent to changed code — but a full doc audit is its own pass, not a `/engage` side effect. |
+ | `/debrief` | Field reports about stale or wrong docs feed doc-audit scope; Bashir routes them here, not to `/ux`. |
+ | `/void` | After a methodology sync, run a doc audit to confirm the merged docs still cross-reference correctly. |
+
+ ## Anti-Patterns
+
+ - **Routing doc currency to `/ux`** — the most common misroute. UX reviews the product; doc audits review the methodology docs. Different source of truth, different roster (#342 F-3).
+ - **Auditing from memory** — every currency claim must be anchored to a live artifact (`ls`, file existence, code read). A claim you cannot source is a claim you cannot verify.
+ - **Spot-checking cross-references** — broken links rot silently. Extract *every* path/ADR token and existence-check all of them, not a sample.
+ - **Fixing prose without checking the other two angles** — a command renamed in CLAUDE.md but not in its spec file (or vice versa) is still broken. Sync is a three-way agreement, not a one-way edit.
+ - **Treating version mismatch as cosmetic** — a wrong version string in the docs is a correctness defect: it tells the reader they are on a release they are not on.