npm - voidforge-build - Versions diffs - 23.11.3 → 23.12.0 - Mend

voidforge-build 23.11.3 → 23.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (42) hide show

package/dist/.claude/agents/batman-qa.md +1 -0
package/dist/.claude/agents/galadriel-frontend.md +2 -0
package/dist/.claude/agents/kusanagi-devops.md +4 -0
package/dist/.claude/agents/lucius-config.md +6 -0
package/dist/.claude/agents/silver-surfer-herald.md +11 -4
package/dist/.claude/commands/architect.md +9 -0
package/dist/.claude/commands/assemble.md +4 -1
package/dist/.claude/commands/assess.md +13 -1
package/dist/.claude/commands/audit-docs.md +106 -0
package/dist/.claude/commands/deploy.md +28 -0
package/dist/.claude/commands/engage.md +2 -0
package/dist/.claude/commands/gauntlet.md +23 -4
package/dist/.claude/commands/imagine.md +15 -0
package/dist/.claude/commands/ux.md +32 -0
package/dist/.claude/commands/void.md +1 -0
package/dist/CHANGELOG.md +68 -0
package/dist/CLAUDE.md +9 -0
package/dist/VERSION.md +3 -1
package/dist/docs/methods/AI_INTELLIGENCE.md +33 -0
package/dist/docs/methods/ASSEMBLER.md +31 -2
package/dist/docs/methods/BUILD_PROTOCOL.md +1 -0
package/dist/docs/methods/CAMPAIGN.md +31 -3
package/dist/docs/methods/DEVOPS_ENGINEER.md +158 -0
package/dist/docs/methods/DOC_AUDIT.md +92 -0
package/dist/docs/methods/FORGE_KEEPER.md +16 -5
package/dist/docs/methods/GAUNTLET.md +33 -0
package/dist/docs/methods/PRODUCT_DESIGN_FRONTEND.md +54 -0
package/dist/docs/methods/QA_ENGINEER.md +20 -0
package/dist/docs/methods/RELEASE_MANAGER.md +27 -0
package/dist/docs/methods/SUB_AGENTS.md +31 -0
package/dist/docs/methods/SYSTEMS_ARCHITECT.md +13 -0
package/dist/docs/methods/TESTING.md +19 -0
package/dist/docs/patterns/README.md +3 -0
package/dist/docs/patterns/ai-eval.ts +63 -0
package/dist/docs/patterns/autonomous-ops-triage-policy.md +102 -0
package/dist/docs/patterns/daemon-process.ts +90 -0
package/dist/docs/patterns/deploy-preflight.ts +85 -2
package/dist/docs/patterns/design-tokens.ts +338 -0
package/dist/docs/patterns/error-message-categorization.tsx +376 -0
package/dist/wizard/lib/patterns/daemon-process.d.ts +2 -1
package/dist/wizard/lib/patterns/daemon-process.js +89 -1
package/package.json +2 -2

package/dist/CHANGELOG.md CHANGED Viewed

@@ -6,6 +6,74 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
 ---
+## [23.12.0] - 2026-06-09
+### Field Report Triage — 12 reports closed (#342–#353), 58 fixes + 5 new files
+The v23.12 methodology pass. `/debrief --inbox` triaged all 12 open field reports against the live codebase, then applied every accepted fix in one session via two-phase workflow orchestration: a **triage** pass (one agent per report, each grepping the codebase to separate already-shipped from open), followed by an **apply** pass (one writer agent per target file — disjoint files, no conflicts) with an **adversarial verify** agent re-reading every file. Two reports' fixes were found already-shipped and adversarially confirmed (#349 F-4, #352 #3); four proposed fixes were out of scope (Claude Code core / harness skill / Workflow tool) and left to upstream. 58 fixes landed across 32 files plus 5 new files.
+The reports corroborated each other into 7 clusters:
+### Added
+- **`docs/patterns/design-tokens.ts`** — semantic color/type token layer (one indirection) so a palette/type pivot is a token edit, not a component-wide rewrite (#351).
+- **`docs/patterns/nginx-vhost.conf`** — Cloudflare-Flexible-safe vhost template: security-header stack, ACME http-01 passthrough, no origin redirect loop behind CF Flexible SSL, `limit_req_zone` http-context comment (#344 F2/F4a).
+- **`docs/patterns/error-message-categorization.tsx`** — categorize errors at the UI boundary (network / auth / validation / server / quota / unknown) before choosing copy, so a billing error never renders "try a different file" (#343 F8).
+- **`.claude/commands/audit-docs.md`** + **`docs/methods/DOC_AUDIT.md`** — a Surfer-led doc-audit path (Troi / Wong / Irulan / Coulson) for currency, cross-reference, and command↔method sync, so doc audits stop being mis-routed through `/ux` (#342 F-3).
+- **`scripts/regen-claude-md.sh`** — idempotent regenerator for a marker-delimited generated CLAUDE.md block from a truth source (`docs/_truth.yml` / `package.json` + git), exits cleanly when no truth source is present (#342 F-2).
+- Pattern library 48 → 51.
+### Changed
+- **Verify the FIX, not just the finding** (#348, #349 F-2, #350 #4) — `SUB_AGENTS.md`, `GAUNTLET.md`, and `/engage` now require the adversarial pass to interrogate the *proposed fix* for new failure modes (wedge / unbounded retry / loop / orphan / double-send), especially when it adds a coordination primitive (sentinel/lock/retry-state) without a liveness signal. Anchored to the M5 mint-fence incident (a reclaim window unreachable inside the retry budget wedged drafts in FAILED). `/engage` also now names the governing SSOT and reconciles fix direction (loosen/tighten) before a finding is actionable.
+- **Production-config gate** (#350) — `GAUNTLET.md` gains an `APP_ENV=production` boot-assertion exit criterion plus a sandbox-blind-spot round ("what does the green sandbox suite NOT exercise?"); `CAMPAIGN.md` Victory Checklist now requires a prod-config boot before declaring victory. Sandbox-green is necessary, not sufficient.
+- **Spring Cleaning consumer-vs-clone** (#343 F10, destructive-risk) — `FORGE_KEEPER.md` Step 1.5 now distinguishes methodology *consumers* (app projects — skip the always-remove list, fingerprint defensively) from *clones* (apply the migration registry), with a `package.json`-deps detection heuristic, so an app project no longer loses `tsconfig.json` / lockfiles / test configs.
+- **Silver Surfer roster sizing** (#343 F6, #344 F5, #346 #1, #345 DEAL-001) — `silver-surfer-herald.md` gains `scope_bias` (lean roster when explicit file/dir scope is given), `scope_density` (6–10 for small single-shot surfaces), a ~18 single-mission cap with core/advisory tiering, and a basename-normalization rule; the corrected Output Format example now shows real basenames.
+- **Creative/UX grounding** (#347, #351) — `/ux` gains a mandatory World-Scan / Reference-Grounding step (award galleries + live competitors → reference dossier), a prototype-to-feel step, and a de-AI checklist gate; `PRODUCT_DESIGN_FRONTEND.md` documents the committee-converges-on-the-mean failure mode, token-scoped theming, and show-don't-tell; Galadriel gains matching learnings; `/architect` and `/imagine` apply world-scan when design is in scope; the Surfer must add a web-capable scout to creative rosters (design agents have no web access).
+- **Deploy / DevOps foot-guns** (#344, #349 F-1, #352, #353) — `DEVOPS_ENGINEER.md` gains 13 entries: no `eval`-export `.env` parsing (mangles `$`-bearing secrets), no `MemoryDenyWriteExecute` on Node systemd units (V8 JIT SIGTRAP), Cloudflare-Flexible redirect-loop + token-scope, served-artifact-≠-built-artifact deploy verification, worktree directory-rename pointer fragility, per-user git ident, blue-green nomenclature check, `pm2 reload` log-path binding, `docker compose up --dry-run` topology + merge semantics, config foot-guns, Docker-cleanup ownership preflight, read-back-after-vendor-PUT. Mirrored as learnings on Kusanagi and Lucius; `deploy-preflight.ts` and `daemon-process.ts` gain the matching checks/stanzas; `/deploy` gains a served-artifact fingerprint step.
+- **Doc-currency & QA gates** (#342, #346, #349 F-3, #352 #1) — `CAMPAIGN.md` + `ASSEMBLER.md` gain a pre-SEAL Doc-Currency Refresh (Coulson + Wong) with `--no-doc-refresh`, an execution-time cluster sub-split, and integrated-changeset per-mission review; `RELEASE_MANAGER.md` retires the auto-rotting PROJECT_VERSION footer; `GAUNTLET.md` + `/assemble` + `/gauntlet` formalize a vote-based adversarial REFUTE sub-step and critical-always-verified routing (also in `/assess` Blueprint Mode); `QA_ENGINEER.md` gains a planted-bug "gates must gate" check and a stash-compare failure-attribution rule; `AI_INTELLIGENCE.md` adds the live-eval pre-launch gate + null-optional normalization gotcha.
+- **CLAUDE.md** — Personality gains "Apply findings, don't offer a picker" (#343 F5) and "Honor authorized autonomy with single-question gates" (#344 G1); the Silver Surfer Gate documents when it fires (review phase, not solo build — #348) and a roster-name normalization step in the Orchestrator contract (#345). Synced to `packages/methodology/CLAUDE.md`.
+- **`packages/voidforge/package.json`** methodology dep range `^23.11.4` → `^23.12.0` (ADR-062).
+### Closes
+- **#342–#353** (12 field reports). Every accepted fix applied and adversarially verified. #349 F-4 (`/git` PROJECT_VERSION SSOT) and #352 #3 (find→adversarially-verify default) were already shipped — confirmed, not re-implemented. Out of scope and left upstream: #345 DEAL-004 (Workflow-tool args coercion), #353 RC-001/RC-002 + the `/update-config` callout (Claude Code core / harness skill — VoidForge cannot patch what it does not ship).
+### Pipeline
+This release was produced by `/debrief --inbox` run as two background workflows: a 14-agent triage pass and a 73-agent apply+verify pass (one writer per file over a disjoint partition, so 37 files were edited concurrently without conflict). The lone collision (a duplicated patterns-README row from two agents both registering the same new pattern) was caught by the verify pass and corrected. Method-doc and pattern edits propagate to the npm methodology package via `prepack.sh`; `packages/methodology/CLAUDE.md` (the one tracked source copy) was re-synced with the ADR-058 strip transform.
+---
+## [23.11.4] - 2026-05-12
+### Wong promotion cluster + #260 closeout
+After v23.11.3 shipped, a fresh `/debrief --inbox` re-triage on all 9 open field reports produced 3 promotion-ready clusters (each backed by 3+ data points across different reports / different projects / different operators) plus the deferred remainder of #260.
+### Added
+- **`docs/patterns/autonomous-ops-triage-policy.md`** (pattern #48) — codifies the 4-bucket model (self-resolving / runbook-safe / operator-approval-required / hard-never) for ops-flavored projects (infrastructure repos, monitoring daemons, homelab automation). Two operators independently reinvented this exact model across three projects (#337 F3, #336 F7, #334 F5). Pattern includes SessionStart hook visibility rule (the hook output is context-only — assistant must `echo` it back to the operator to confirm the policy is live), JSON Lines log format, decision tree, and adoption checklist.
+- **CLAUDE.md Code Patterns table** — new row for `autonomous-ops-triage-policy.md` in both root and `packages/methodology/CLAUDE.md`.
+### Changed
+- **`docs/methods/BUILD_PROTOCOL.md` Principles #11 — Derived counts discipline.** Any user-facing numeric claim ("141+ pages", "Gated pages: 19", "6 missions completed", "1390 tests") must be derived from source truth at build time OR explicitly marked with `<!-- last-verified: YYYY-MM-DD -->` and tracked in the RELEASE_MANAGER Verification Checklist. No unverified scalar claims ship. Three independent projects (#336 F6, #334 F6, #332 hidden #5) drifted the same class — this is the scalar equivalent of the No Stubs doctrine.
+- **`docs/methods/CAMPAIGN.md` Planning Mode — Scope-adversary check for bug classes.** New Step 4 in `--plan` mode: when a mission documents a specific bug class, dispatch a verification agent (Riker, Feyd-Rautha, or Spock) with the explicit prompt "list all other surfaces this class touches that were NOT in the mission scope." voidforge-marketing-site (#332) deployed with a known bug class on two surfaces because the plan was scoped to one. #338 #2 independently demonstrated the same need.
+- **`docs/methods/PRODUCT_DESIGN_FRONTEND.md` Operating Rule #12 — Tutorial-context checklist for slash commands.** Standalone `/<command>` references in tutorial content must establish "inside Claude Code" context (preceding `claude` block, callout box, or contextual prose). First-touch user content with missing launch context is a Critical UX defect. Galadriel's Step 1.5 Usability Review now explicitly flags this. (#260 remainder.)
+- **`docs/methods/QA_ENGINEER.md` Operating Rule #13 — Tutorial smoke test for slash commands.** Batman's QA pass on tutorial/onboarding docs runs a grep-based check: every `/<command>` mention must have launch context within 5 lines or a callout block on the same page. Sister-rule to PRODUCT_DESIGN_FRONTEND.md #12. (#260 remainder.)
+- **`packages/voidforge/package.json`** methodology dep range `^23.11.3` → `^23.11.4` per ADR-062 discipline (always pin methodology dep to current version on every release).
+### Closes
+- **#260** — HOLOCRON preamble shipped in v23.11.3; PRODUCT_DESIGN_FRONTEND.md + QA_ENGINEER.md tutorial checklist proposals now ship in v23.11.4. Fully addressed.
+### Pipeline
+This release is the first Wong promotion-cluster pass executed end-to-end in one session: `/debrief --inbox` triaged all 9 field reports, identified the 3 ready-now clusters, and promoted them directly into method docs and pattern library. 11 field reports remain open as v23.12 methodology campaign scope (7 priority clusters identified by Bashir: security & declaration discipline, database migration patterns, PRD/release sync, architect protocol, build/CI gates, Plex pattern bundle, container/infra patterns).
+---
 ## [23.11.3] - 2026-05-12
 ### Issue #331 destructive-bug fix + HIGH CVE patch + dep contract pin + CI hardening

package/dist/CLAUDE.md CHANGED Viewed

@@ -7,11 +7,15 @@
 - **Challenge when appropriate.** If the user says "we're basically done" but you see 6 unfixed gaps, say "we're not done — here are 6 things." Agreeing to be agreeable ships bugs.
 - **Separate opinion from analysis.** State facts first, then your recommendation. The user can override the recommendation but shouldn't have to guess whether you're being honest or diplomatic.
 - **Solve, don't delegate.** Attempt actions before listing prerequisites. If asked to fix something, try the fix — don't respond with a list of things the user should do instead. When blocked, explain what you tried and what specifically failed.
+- **Apply findings, don't offer a picker.** When a review surfaces a clear list of fixable findings, DEFAULT to applying them in batches rather than surfacing a multi-option "which subset do you want?" picker (field report #343). A picker is only warranted when the choice is genuinely architectural — mutually exclusive directions with real trade-offs the user must own. Mechanical fixes (lint, missing validation, IDOR, a11y, dead code) are not architectural; fix them and report what you did.
+- **Honor authorized autonomy with single-question gates.** When the operator explicitly authorizes autonomy ("go", "run the whole thing", "don't stop to ask"), execute the campaign end-to-end and gate only on irreducible externals — secrets, API tokens, billing approval, anything you genuinely cannot obtain or invent (field report #344). Do NOT seek interim confirmations on constants the agent itself invented and already disclosed (port numbers, table names, file paths, default copy). Surface those in the running log, not as a blocking question.
 ## Silver Surfer Gate (ADR-048, ADR-051, ADR-060)
 ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is the backstop if the hook is absent or disabled. One day the prose may be removed entirely — the hook is the intended permanent mechanism.
+**When the gate fires.** The gate fires at the REVIEW phase — the moment you deploy sub-agents — not during the solo build that precedes it (field report #348). Building the work yourself first, then mustering the Surfer roster to review it, is the intended sequence; the lead agent is expected to produce the artifact solo before any agent dispatch. The gate exists to stop you from cherry-picking the review roster, not to force agents onto the build itself.
 **Gated commands:** `/engage` (alias: `/review`), `/qa`, `/sentinel` (alias: `/security`), `/ux`, `/architect`, `/build`, `/assemble`, `/gauntlet`, `/campaign`, `/test`, `/devops`, `/deploy`, `/ai`, `/assess`.
 **Procedure — execute in order:**
@@ -33,6 +37,7 @@ ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is th
 1. After the Silver Surfer sub-agent returns its roster, and before launching any other Agent: `[ -x scripts/surfer-gate/record-roster.sh ] && bash scripts/surfer-gate/record-roster.sh || true` (optionally pass the roster JSON as the first argument for audit). The existence guard is a defensive no-op for projects that predate v23.10.0 — when the gate started shipping via the npm methodology package per #317.
 2. When the user's command includes `--light` or `--solo`, BEFORE launching the Surfer or any other agent: `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true` (or `--solo`). **Fails closed on unknown flag values** (ADR-060 v23.8.18 hardening, SEC-003) — passing anything other than `--light` or `--solo` exits 2 with an error. No silent bypass.
+3. **Normalize roster names before dispatch** (field report #345, DEAL-001). The Silver Surfer returns agent names from a Haiku pre-scan, which can drift from the actual filenames in `.claude/agents/` (extra `silver-surfer-` prefix, a stray `.md` suffix, a hyphen/underscore mismatch). Before you launch, validate each name against `ls .claude/agents/`: if it matches a file (with or without the `.md` extension), keep it; if not, attempt exactly one correction — strip a known prefix/suffix or normalize separators — and re-check. If it still doesn't resolve, DROP that single name and proceed with the rest of the roster. Never block the whole dispatch over one unresolved name; log the dropped name so the Herald roster can be corrected upstream. The gate's job is to enforce *that* a roster ran, not to fail the run over a typo in *one* name.
 If `scripts/surfer-gate/check.sh` exists but you skip step 1, your first non-Surfer Agent call in that turn will be blocked with a clear message and your own log line in `/tmp/voidforge-session-$SESSION_ID/gate.log`. You are expected to comply with the block (launch Surfer / run record-roster), not to fight it. If the script does not exist, your project predates v23.10.0; pull the gate from `tmcleod3/voidforge:scripts/surfer-gate/` and merge `settings-snippet.json` into `.claude/settings.json`, or re-run `npx voidforge-build init` against the methodology source.
@@ -120,6 +125,10 @@ Reference implementations in `/docs/patterns/`. Match these shapes when writing.
 - `refactor-extraction.md` — 8-commit per-entity large-refactor template with IDOR matrix discipline
 - `ai-prompt-safety.ts` — Type A (instructions, statistical) vs Type B (constraints, enforced); AUTHORITY-as-text caveat; defense-in-depth stack
 - `llm-state-dedup.ts` — LLM ids are display labels, not keys; content-hash dedup; lifecycle-state snapshot completeness
+- `autonomous-ops-triage-policy.md` — 4-bucket model (self-resolving / runbook-safe / operator-approval / hard-never) + SessionStart hook visibility rule for ops-flavored projects
+- `design-tokens.ts` — Semantic color/type tokens (one indirection layer) so a theme pivot is a token change, not a component-wide find-replace (field report #351, #343)
+- `nginx-vhost.conf` — Cloudflare-Flexible-safe vhost template: security headers, ACME http-01 passthrough, no redirect loop behind CF's flexible SSL (field report #351, #344)
+- `error-message-categorization.tsx` — Categorize errors at the UI boundary (network / auth / validation / server / unknown) before choosing copy, so users see actionable messages not raw internals (field report #351, #343)
 ## Slash Commands

package/dist/VERSION.md CHANGED Viewed

@@ -1,6 +1,6 @@
 # Version
-**Current:** 23.11.3
+**Current:** 23.12.0
 ## Versioning Scheme
@@ -14,6 +14,8 @@ This project uses [Semantic Versioning](https://semver.org/):
 | Version | Date | Summary |
 |---------|------|---------|
+| 23.12.0 | 2026-06-09 | The v23.12 methodology pass — `/debrief --inbox` triaged all 12 open field reports (#342–#353) and applied every accepted fix in one session via two-phase workflow orchestration (triage → apply), with an adversarial verify pass on every file. 58 fixes across 32 files + 5 new files. 7 clusters: **verify-the-FIX** (the adversarial pass must vet the proposed fix, not just the finding — SUB_AGENTS.md, GAUNTLET.md, /engage; #348/#349/#350, M5 mint-fence incident); **production-config gate** (sandbox-green ≠ ship-ready — GAUNTLET.md prod-boot + sandbox-blind-spot round, CAMPAIGN.md Victory Checklist; #350); **Spring Cleaning consumer-vs-clone** (FORGE_KEEPER.md destructive-risk branch so app projects don't lose tsconfig/lockfiles; #343 F10); **Surfer roster sizing** (silver-surfer-herald.md scope_bias/scope_density/~18-cap + basename normalization; #343/#344/#345/#346); **creative/UX grounding** (world-scan + de-AI + token-scoped theming — ux.md, PRODUCT_DESIGN_FRONTEND.md, galadriel; #347/#351); **deploy/DevOps foot-guns** (DEVOPS_ENGINEER.md +13: eval-env, Node-MDWE, CF-Flexible, served-vs-built, compose-topology, docker-cleanup; #344/#349/#352/#353); **doc-currency** (CAMPAIGN/ASSEMBLER pre-SEAL refresh + new /audit-docs & DOC_AUDIT.md; #342). 3 new patterns (design-tokens.ts, nginx-vhost.conf, error-message-categorization.tsx; 48 → 51) + new /audit-docs command + DOC_AUDIT.md + scripts/regen-claude-md.sh. CLAUDE.md Personality +2 (anti-picker #343, authorized-autonomy #344), gate-timing #348, roster normalization #345. Dep range `^23.11.4` → `^23.12.0` (ADR-062). #349 F-4 and #352 #3 were already shipped (verified); #345 DEAL-004 + #353 RC-001/002/callout out of scope (Claude Code core / Workflow tool). |
+| 23.11.4 | 2026-05-12 | Wong promotion cluster + #260 closeout. /debrief --inbox re-triage of all 9 open field-report issues produced 3 ready-now promotion clusters (3+ data points across different reports each). BUILD_PROTOCOL.md Principle #11 "Derived counts discipline" (from #336 F6, #334 F6, #332 hidden #5 — three projects independently drifted the same class). New pattern `docs/patterns/autonomous-ops-triage-policy.md` codifying the 4-bucket model + SessionStart hook visibility rule (from #337 F3, #336 F7, #334 F5 — two operators independently reinvented). CAMPAIGN.md Planning Mode Step 4 "Scope-adversary check for bug classes" (from #332, #338 #2 — voidforge-marketing-site missed `/patterns` because the bug class scope was narrowed). Also closes #260 remaining items: PRODUCT_DESIGN_FRONTEND.md Operating Rule #12 "Tutorial-context checklist for slash commands" + QA_ENGINEER.md Operating Rule #13 "Tutorial smoke test for slash commands." Dep range `^23.11.3` → `^23.11.4` per ADR-062 discipline. Pattern count 47 → 48. 11 field reports remain open for v23.12 methodology pass. |
 | 23.11.3 | 2026-05-12 | Three-phase pipeline (/architect → /debrief --inbox → /campaign) shipped 12 fixes + 2 ADRs + 1 LEARNINGS entry + 2 mechanical guards. **Issue #331** destructive-bug fix: `findProjectRoot()` now enforces `$HOME` boundary + `statSync().isFile()` guard, no more silent overwrite of `~/CLAUDE.md` on `npx voidforge-build update`. **HIGH CVE** fast-xml-parser/builder via `@aws-sdk/*` patched via `npm audit fix`. **Dep contract** pinned: `voidforge-build → voidforge-build-methodology` from `"*"` to `"^23.11.3"` (ADR-062), enforced mechanically by `check-methodology-pin.sh` prepublishOnly script. **engines.node** added to methodology package.json. **publish.yml hardening**: post-publish `npm view` verification step (both jobs, 6×10s retry), `recover-partial` job with `npm deprecate` on XOR-failure, `needs: publish-methodology` ordering on publish-voidforge. **copy-assets.sh** ADR-058 template strip applied (parity with methodology prepack). **Docs**: HOLOCRON Quick Start "launch Claude Code first" preamble + npm-prefix workaround (#260, #333p), FORGE_KEEPER Rule #11 "never write to $HOME" (ADR-063), RELEASE_MANAGER ROADMAP-sync checklist line, ROADMAP.md pointer v23.8.11 → v23.11.3 (24-version drift closed). Marker integration test `no-home-writes.integration.test.ts` mechanically enforces ADR-063. 1390 tests pass. |
 | 23.11.2 | 2026-05-12 | `voidforge init` now prompts browser-vs-CLI when no mode flag is passed (TTY only) and `--browser` was added for explicit opt-in. Headless init prompts for name/dir/oneliner/domain/repo when `--name` is omitted in a TTY. Non-TTY no-flag now errors cleanly instead of silently launching a wizard server. Separately: `packages/methodology/scripts/surfer-gate/` (8 files) now ships in the npm methodology package — closes ADR-051 distribution gap (#317). 9 pattern-table rows + existence-guarded orchestrator-contract bash propagated into the methodology CLAUDE.md. |
 | 23.11.1 | 2026-05-10 | `/git` release-discipline patch — Step 4.5 (auto-tag, default-on) and Step 7 (`--npm` opt-in publish). Closes the silent-release gap that stranded v23.10.0 and v23.11.0 between GitHub and npm. Tag-push triggers the existing `publish.yml` workflow; `--npm` is a same-session fallback. Documents the `latest` dist-tag race when multiple tags are pushed simultaneously. RELEASE_MANAGER.md mirrored. |

package/dist/docs/methods/AI_INTELLIGENCE.md CHANGED Viewed

@@ -151,6 +151,37 @@ Run sequentially — each builds on findings from parallel phase:
 - Can you detect regression when prompts change?
 - Is there human-in-the-loop scoring for ambiguous cases?
 - Are quality metrics tracked over time? (Not just at launch)
+- Is there a LIVE eval layer that runs against the real model before launch? (Not just the sandbox layer)
+#### The LIVE eval layer is the pre-launch gate (field report #352, #4)
+Evals stratify into two layers, and they catch different bug classes:
+- **Deterministic / sandbox layer** — fast, hermetic, runs in CI with mocked or recorded model responses. Catches scoring-logic bugs, prompt-template-rendering bugs, and golden-dataset regressions. It **cannot** catch model-output-shape bugs, because the fixtures are shapes *you* authored — not shapes the live model actually emits.
+- **LIVE eval layer** — runs the real model against the golden dataset before launch. This is the pre-launch gate. It is the only layer that observes the model's *actual* output shape, and the only layer that can catch a contract drift between "the response shape we coded against" and "the response shape the model produces."
+Treat the LIVE layer as a mandatory gate, not an optional smoke test: a component cannot ship until its LIVE eval has run against the real model and passed. The sandbox layer gates *every commit*; the LIVE layer gates *every launch*.
+**Gotcha — normalize null-to-undefined before Zod `.optional()` (field report #352, #4).** A live model emits `null` (not omission) for an absent optional field — e.g. it returns `{ "category": "billing", "subcategory": null }` rather than dropping `subcategory`. Zod's `.optional()` accepts `undefined`, **not** `null`, so the valid response fails schema validation and your retry/fallback path fires on output that was actually fine. This is invisible in the sandbox layer because hand-authored fixtures usually omit the key instead of setting it to `null`. Normalize before validating:
+```ts
+// Strip nulls the model emits for absent optionals, so `.optional()` matches.
+function nullToUndefined<T>(obj: T): T {
+  if (obj === null) return undefined as unknown as T;
+  if (Array.isArray(obj)) return obj.map(nullToUndefined) as unknown as T;
+  if (typeof obj === "object") {
+    return Object.fromEntries(
+      Object.entries(obj as Record<string, unknown>).map(([k, v]) => [k, nullToUndefined(v)]),
+    ) as T;
+  }
+  return obj;
+}
+// Apply BEFORE Zod validation — never validate raw model output directly.
+const parsed = ResponseSchema.parse(nullToUndefined(JSON.parse(modelText)));
+```
+If a field is legitimately nullable (the model is *meant* to return `null` as a value), use `.nullish()` (accepts both `null` and `undefined`) on that specific field instead of blanket-stripping — but the default posture for absent optionals is normalize-then-`.optional()`.
 **Dors Venabili (Observability):** Visibility.
 - Can you see what the AI decided and why?
@@ -223,6 +254,8 @@ If issues found, return to Phase 3. Maximum 2 iterations.
 - [ ] Regression suite runs on prompt changes
 - [ ] Quality metrics tracked over time
 - [ ] Human review process for edge cases
+- [ ] LIVE eval layer runs against the real model and passes before launch (sandbox layer alone cannot catch model-output-shape bugs) (field report #352, #4)
+- [ ] Model output normalized null-to-undefined before Zod `.optional()` validation (field report #352, #4)
 ### AI Gate Bootstrapping (Cold-Start Problem)
 AI-gated approval systems have a cold-start problem: no historical outcomes -> gate rejects all requests -> no operations -> no outcomes. During the first N decisions (configurable, default 20), the gate should approve at reduced size (0.5-0.7x normal) to build a track record. The gate should never reject solely because "no historical data exists." Include explicit prompt guidance: "Lack of history is not a reason to reject — approve at reduced size to build the track record." (Field report #152)

package/dist/docs/methods/ASSEMBLER.md CHANGED Viewed

@@ -53,7 +53,8 @@ Fury calls ALL of them. That's the point.
 8. `--skip-arch` and `--skip-build` allow re-running reviews on existing code.
 9. `--resume` picks up from the last completed phase.
 10. Only suggest a fresh session if `/context` shows actual usage above 85%. Do not preemptively checkpoint or reduce quality for context reasons.
-11. **All phases dispatch to sub-agents per ADR-036.** The main thread orchestrates — it plans, launches, triages, and decides. It does NOT read source files, analyze code inline, or generate findings from raw code. See `SUB_AGENTS.md` "Parallel Agent Standard" for brief format, deliverables, and concurrency rules. (Field report #270: full 11-phase /assemble ran through 15+ sub-agents with context at 15-25%, vs 80%+ inline.)
+11. Phase 13.5 (Doc-Currency Refresh) runs before sealing unless `--no-doc-refresh` is set; the skip is logged. `--doc-audit` is a docs-only one-shot mode that runs none of Phases 1-13 (field report #342 F-1, F-5).
+12. **All phases dispatch to sub-agents per ADR-036.** The main thread orchestrates — it plans, launches, triages, and decides. It does NOT read source files, analyze code inline, or generate findings from raw code. See `SUB_AGENTS.md` "Parallel Agent Standard" for brief format, deliverables, and concurrency rules. (Field report #270: full 11-phase /assemble ran through 15+ sub-agents with context at 15-25%, vs 80%+ inline.)
 ## The Pipeline
@@ -72,6 +73,7 @@ Fury calls ALL of them. That's the point.
 | 11 | /test | 1 | Suite green, coverage acceptable |
 | 12 | Crossfire | 1 | All 4 adversarial agents sign off |
 | 13 | Council | 1-3 | All 5 cross-domain agents sign off (incl. Troi PRD compliance) |
+| 13.5 | Doc-Currency Refresh | 1 | Project docs sweep is clean before sealing (skip with `--no-doc-refresh`) |
 ### Phase 6.5 — Seldon's AI Review (conditional)
@@ -125,9 +127,36 @@ Verify no circular calls between store actions and API methods. Specifically che
 When a feature is added to one surface (API, dashboard, CLI, marketing site), verify all other surfaces displaying the same entities are updated. A new field added to the API response but missing from the dashboard table, or a new tier added to the pricing page but missing from the settings panel, creates an inconsistent product. After each pipeline phase that adds or modifies a feature, grep for the entity name across all surfaces: API routes, React/Vue components, CLI output formatters, marketing page copy, email templates, admin panels. (Triage fix from field report batch #149-#153.)
+### Phase 13.5 — Doc-Currency Refresh (pre-SEAL)
+After the Council signs off, but BEFORE Fury seals the run and makes the Deploy Offer, sweep the project's source-of-truth docs for drift introduced over the course of the pipeline. A full `/assemble` touches architecture, features, version, and build state — by the time the Council finishes, the docs that describe the project frequently no longer match it. This mirrors the Doc-Currency Refresh mission in `CAMPAIGN.md`: same checklist, applied once at the end of the pipeline instead of once per mission. (Field report #342 F-1: `/assemble` shipped a Council-clean build whose `CLAUDE.md` Project block and `PROJECT_VERSION` line still described the pre-build scaffold.)
+Sweep each of these for drift against the current state of the code, and fix what's stale:
+1. **`CLAUDE.md`** — Project block (name, one-liner, domain, repo), stack, and any phase/feature claims that the pipeline changed.
+2. **`MEMORY.md`** (auto-memory index, if present) — entries that now point at retired or renamed work.
+3. **`README.md`** — install/run/feature sections that the build moved past.
+4. **`PROJECT_VERSION.md`** (or equivalent) — the **Current** line must name the version this run produced, not the one it started from.
+5. **`/logs/build-state.md`** — the recorded "current state" must reflect the completed pipeline, not a stale prior session (the Phase 9 deployment-verification trap, generalized to all of build-state).
+6. **`/logs/campaign-state.md`** (if this `/assemble` ran inside a campaign) — the mission this run closed must be marked done so `/campaign` doesn't re-pick it.
+This phase is **additive verification, not a rewrite** — only touch lines that are demonstrably stale. If every doc is already current, record "Doc-Currency Refresh: clean" in `assemble-state.md` and proceed. Dispatch the sweep to a sub-agent per ADR-036; the main thread triages the diff.
+**`--no-doc-refresh`** skips Phase 13.5 entirely (documented opt-out, for runs where docs are maintained out-of-band). The skip is logged to `assemble-state.md` so a later run knows the docs were never swept. (Field report #342 F-1.)
 ### Post-Pipeline: Deploy Offer
-After Phase 13 (Council sign-off), if a deployment target is configured (`.vercel/project.json`, `fly.toml`, `railway.toml`, or PRD deploy section), Fury offers: "Council has signed off. Deploy to production?" This closes the loop instead of leaving deployment as an implicit user action. In campaign blitz mode, auto-deploy if the deploy method is known. (Field report #37: user had to prompt three times before agent deployed to Vercel.)
+After Phase 13.5 (Doc-Currency Refresh, or its skip), if a deployment target is configured (`.vercel/project.json`, `fly.toml`, `railway.toml`, or PRD deploy section), Fury offers: "Council has signed off. Deploy to production?" This closes the loop instead of leaving deployment as an implicit user action. In campaign blitz mode, auto-deploy if the deploy method is known. (Field report #37: user had to prompt three times before agent deployed to Vercel.)
+## `--doc-audit` — One-Shot Documentation Audit
+`/assemble --doc-audit` runs a **docs-only** pass: it dispatches a Silver-Surfer-led doc-audit roster and runs **no code review, no build, no security/QA phases** — none of Phases 1-13. The entire pipeline is replaced by a single sweep whose only job is to catch documentation drift across the whole corpus, then report. Use it when you suspect the docs have fallen behind the code but don't want to pay for a full Initiative run. (Field report #342 F-5.)
+This is deliberately **broader** than `/git` Step 5.5. Step 5.5 only checks the 13 known method-doc ↔ command-file pairs (e.g. `ASSEMBLER.md` ↔ `assemble.md`) after a release touches a method doc — it is a paired-file sync check, not a corpus audit. `--doc-audit` instead lets the Surfer muster a roster sized to the actual surface area: `docs/`, `*.md` at the repo root, per-directory `CLAUDE.md` files, ADR records, pattern headers, and the slash-command/agent inventory — anywhere prose can disagree with reality.
+**Canonical path.** `--doc-audit` is a convenience entry point into the canonical doc-audit mechanism, not a second implementation of it. The doc-audit roster, scoring, and report format are defined by the **`/audit-docs`** command and **`DOC_AUDIT.md`** — `/assemble --doc-audit` dispatches that same Surfer-led audit and surfaces its report. When in doubt about what the audit covers or how findings are graded, `DOC_AUDIT.md` is the source of truth; this section only documents the `/assemble` doorway to it. (Field report #342 F-5.)
+`--doc-audit` is incompatible with the build/review flags (`--skip-arch`, `--skip-build`, `--fast`, `--resume`) since it runs none of those phases; pair it with `--focus "topic"` to bias the Surfer's roster toward a corner of the corpus.
 ## Deliverables

package/dist/docs/methods/BUILD_PROTOCOL.md CHANGED Viewed

@@ -453,3 +453,4 @@ Examples of batches that are too big:
 8. Test as you build. Write tests alongside features. Tests are a breaking gate.
 9. Skip what doesn't apply. Not every project needs every phase.
 10. Log everything. Decisions, test results, failures, handoffs. The journal is your memory.
+11. **Derived counts discipline.** Any user-facing numeric claim ("141+ pages", "Gated pages: 19", "6 missions completed", "1390 tests") must be derived from data the build can verify, never hardcoded. Either: (a) compute at build time from source truth (file counts, array lengths, manifest scans), or (b) explicitly mark the claim with `<!-- last-verified: YYYY-MM-DD -->` and add a maintenance task to `docs/methods/RELEASE_MANAGER.md` Verification Checklist. No unverified scalar claims ship in any release. This is the scalar equivalent of the No Stubs doctrine (CLAUDE.md Coding Standards). Three projects independently drifted the same class (#336 F6, #334 F6, #332 hidden #5) — the cost compounds because nobody knows which numbers are stale until somebody counts.

package/dist/docs/methods/CAMPAIGN.md CHANGED Viewed

@@ -101,9 +101,10 @@ When the user passes `--plan [description]`, Sisko updates the plan instead of e
 1. Read the current PRD and ROADMAP.md
 2. Dax analyzes where the new idea fits — new feature (PRD), improvement (ROADMAP), or reprioritization
 3. Odo checks dependencies — does this depend on something not yet built?
-4. Present proposed changes for user review
-5. Write updates on confirmation
-6. Do NOT start building — planning only
+4. **Scope-adversary check for bug classes.** When the new idea is a bug fix OR documents a specific bug class, dispatch at least one verification agent (Riker, Feyd-Rautha, or Spock) with an explicit prompt: *"This mission fixes a [slug-array / render-time data / schema-constraint / IDOR / etc.] class. List all other surfaces in this codebase that this class touches but were NOT in the mission scope."* The plan must explicitly account for every class-instance found, or explicitly defer with rationale. (Field reports #332 + #338: voidforge-marketing-site planning missed `/patterns` because the bug was scoped to `/commands`; same class, both surfaces, only one fixed. Class-generalization is a briefing discipline that costs zero code and prevents whole categories of regressions.)
+5. Present proposed changes for user review
+6. Write updates on confirmation
+7. Do NOT start building — planning only
 This is how ideas get into the plan without breaking the execution flow. The user describes what they want in plain language; Dax figures out where it goes.
@@ -282,6 +283,8 @@ User confirms, redirects, or overrides. On confirm → Step 4.
 **Post-infrastructure enforcement gate:** For infrastructure campaigns (deploy targets, CI/CD, monitoring, staging environments): after the infrastructure is provisioned, run `/architect --plan` to verify workflow enforcement gates exist — not just infrastructure existence. Infrastructure without process gates is incomplete.
+**Silver Surfer gate fires at the REVIEW phase, not the solo build.** Within a mission, the gate (ADR-051 PreToolUse hook on the Agent tool) engages when Fury deploys the review/audit roster as sub-agents — NOT during the orchestrator's solo build of the mission's code. Solo-build-before-review is intentional, not a skipped gate: parallel agents editing the same tightly-coupled engine files (game loop, state machine, shared service) would clobber each other's edits and produce merge garbage. So the orchestrator builds the changeset solo, THEN the Surfer-gated review roster reads it. If you find yourself mid-build asking "did a gate get skipped?", the answer is no — the gate has not fired yet because the review phase has not started. (Field report #348 #3: mid-build confusion over an un-fired gate that fires correctly at the review phase.)
 **Dispatch model (ADR-044):** Per-mission `/assemble` runs SHOULD dispatch phases to sub-agents per `SUB_AGENTS.md` "Parallel Agent Standard." Agents are launched as named subagent types defined in `.claude/agents/` with description-driven dispatch — Opus scans `git diff --stat` and matches changed files against agent descriptions to auto-select specialists. The campaign orchestrator (main thread) manages the mission sequence, inter-mission gates, and campaign state — it does NOT perform inline code analysis. Pass findings summaries between missions, not raw code. See `docs/AGENT_CLASSIFICATION.md` for the full agent manifest (see docs/AGENT_CLASSIFICATION.md). (Field report #270)
 ### Campaign-Mode Pipeline
@@ -439,12 +442,27 @@ Even in `--fast` mode, each mission gets at least **1 review round** (not 3, but
 **UI→server route tracing (within review):** When a mission writes both UI code and server code, the review must trace every `fetch()` call in the UI to a registered server route. For each `fetch('/api/...')` in `.js`/`.ts` UI files, verify the path exists as an `addRoute()` call in the server. Missing routes produce silent 404s that are invisible in development. (Field report #50: UI button called `/api/server/restart` but no endpoint was created.)
+**Review the integrated changeset, not only the new files.** The per-mission review gate must read the full diff from the prior mission's HEAD (`git diff <prev-mission-sha>..HEAD`), not just the files this mission created. Reviewing only the new files misses pre-existing cross-cutting defects that the integration surfaces: a missing config entry the new code now depends on, a Dockerfile `COPY` that never included the directory this mission populated, a doc-vs-reality drift where the new wiring contradicts a README/PRD claim, a build/import that only breaks once the new module is referenced. The new files can each be clean in isolation while the integrated system is broken at the seams. The diff is the unit of review, not the file list. (Field report #346 #4.)
 ### One Mission, One Commit Anti-Pattern
 **Each mission gets its own commit.** Do NOT batch multiple missions into a single commit. The per-mission commit serves as evidence: the diff for Mission 3 should contain only Mission 3's deliverables. If the diff contains work from Missions 3-11 combined, the review is meaningless — you can't verify what changed for which mission.
 If a mission is small enough to merge with an adjacent one, that's fine — but explicitly acknowledge it: "Missions 3-4 combined (both methodology-only, same target file)." Never silently batch.
+### Execution-Time Cluster Sub-Split
+Plan-time cluster recognition (Step 1 #9, field report #326) splits a cluster-natured mission into sub-missions BEFORE the campaign starts, when Dax can see 4+ named deliverables on the board. But some clusters only reveal their seam at EXECUTION time: a mission spans a **foundation + N consumers** (a new schema/migration the rest of the mission builds on, a shared client/adapter, a base config, a core engine module) and the consumers cannot be safely reviewed until the foundation is real. Or a `RISK` item surfaces mid-mission demanding the foundation land and be verified BEFORE its consumers are wired.
+When that happens, split at execution time along the **foundation/consumers seam** — even though the mission was a single board entry:
+1. **Sub-mission A (foundation):** build the foundation alone. Its own review gate. Its own commit (`M-XX.a — <foundation>`).
+2. **Sub-mission B..N (consumers):** build the consumers against the now-verified foundation. Each gets its own review gate and its own commit (`M-XX.b`, `M-XX.c`, ...).
+Each sub-mission is a real mission for gate purposes: 1-round review minimum, per-mission commit (One Mission, One Commit still holds), and its slice recorded in campaign-state.md. The foundation's review can catch a contract defect (a column the consumers will read but the migration didn't add, an adapter method the consumers call but the foundation didn't expose) BEFORE the consumers are written against a broken base — instead of the Gauntlet catching it three missions later.
+**This complements, not duplicates, plan-time recognition (#326):** plan-time splits a cluster on the board before execution; execution-time sub-split triggers when a foundation/consumers seam (often a `RISK` item) emerges *during* a mission that looked atomic at plan time. If you notice the seam at plan time, split there; if it only surfaces under the build, split here. (Field report #346 #3.)
 ### Per-Mission Verification Agents
 After each mission's review round, two agents run quick checks:
@@ -601,12 +619,22 @@ All PRD requirements are COMPLETE or explicitly BLOCKED:
 7. **PRD sync check:** Before declaring victory, compare PRD numeric claims (agent counts, feature counts, route counts, component counts) against the actual codebase for this campaign's domain. Stale PRD claims erode trust and compound across campaigns. (Field report #119)
 7a. **Tenant isolation completeness (conditional):** If the campaign touched auth, multi-tenant, or user-scoped data, grep ALL tables for `org_id` (or equivalent ownership column). Every table must be classified as either "tenant-scoped" (has org_id) or "global by design" (with documented justification). Tables without org_id and without justification are IDOR risks. This catches incomplete tenant migrations that survive per-phase sweeps — the per-phase check (BUILD_PROTOCOL Phase 4) only covers tables modified in that phase. (Field reports #229, #231)
 8. **Entity selector completeness** — for every user-facing selector (dropdown, combobox, autocomplete) that selects from a database-backed list: verify the selector can handle entities that don't exist yet. If a user can only pick from existing DB records, the feature is incomplete — the selector needs a creation flow or an external lookup fallback. Common examples: city selector (needs geocoding fallback), category picker (needs "Other" or custom entry), user selector (needs invite flow). (Field report #263: city selector only searched existing DB cities — users couldn't set homebase to any city not already in the database.)
+8b. **Doc-Currency Refresh mission (Coulson + Wong) — mandatory pre-SEAL sweep.** Before the final sign-off seals the version, run a dedicated cross-doc currency sweep. The Step 0 freshness check and the Step 6 #9 `build-state.md` update each cover ONE file at ONE moment; across rapid sealed versions the load-bearing docs rot collectively because no single mission owns their joint currency. This mission gives that sweep an owner. **Coulson** (release authority) drives version-line accuracy; **Wong** (lessons/changelog/PRD refresh) drives prose currency. Sweep and reconcile against the current `git log -1` + `package.json`/`pyproject.toml` version:
+   - **`CLAUDE.md`** — Project block, version references, command/agent counts, any "as of vX.Y.Z" claims
+   - **`MEMORY.md`** (auto-memory index, if present) — stale "next:" pointers, completed-work entries that still read as pending
+   - **`README.md`** — install/usage snippets, badge versions, feature lists that drifted from reality
+   - **`PROJECT_VERSION.md` Current line** (or `VERSION.md` "Current:" line) — must equal the version about to be sealed
+   - **`/logs/build-state.md`** — version, test counts, deployment state
+   - **`/logs/campaign-state.md`** — Prophecy Board statuses, final mission status
+   For each file, fix drift in place — never seal known drift. If a doc is already current, note "current at <SHA>" and move on (idempotent). **Opt-out: `--no-doc-refresh`** skips this mission for fast methodology-only or hotfix campaigns where the docs provably did not move; log the skip in campaign-state.md with a one-line reason. This complements (does not replace) the existing state-freshness checks. (Field report #342 F-1: load-bearing docs rotted across rapid sealed versions because the per-file freshness checks had no cross-doc owner.)
 9. **Victory Checklist** — ALL must be true before sign-off:
    - [ ] Gauntlet Council signed off (6/6 or all domains pass)
    - [ ] All BLOCKED items acknowledged by user
    - [ ] PRD claims verified against codebase
    - [ ] `/debrief --submit` filed (issue number recorded)
    - [ ] Campaign-state.md updated with final status
+   - [ ] Doc-Currency Refresh completed (or `--no-doc-refresh` skip logged with reason)
+   - [ ] **Production-config boot assertion passed** — a green sandbox suite is necessary but NOT sufficient. Boot the app under its real production config (production env vars / `NODE_ENV=production`, real adapter selection, production build artifact) and assert it reaches a ready state without a config/credential fault. Sandbox adapters can pass every test while the production path fails on a missing env var, a real-vs-sandbox adapter mismatch, or a build-only import error. Do not declare victory on sandbox-green alone. (Field report #350 #3: sandbox suite was fully green but the production-config boot path was never asserted before sign-off.)
 ### The Reckoning (Optional Pre-Launch Audit)

package/dist/docs/methods/DEVOPS_ENGINEER.md CHANGED Viewed

@@ -97,6 +97,11 @@ For each service in `docker-compose.yml`, verify:
 7. **Dependency health** — `depends_on` with `condition: service_healthy` (compose v2.1+). Without it, the app starts before its database is ready.
 (Field report #280)
+**Compose validation goes deeper than syntax (field report #352 #2).** `docker compose config` only validates *syntax* — it renders the merged YAML and exits 0 even when the resulting topology is wrong. Two failure modes it will not catch:
+- **Dependency closure.** A service can reference a network, volume, or `depends_on` target whose definition exists but whose *startup* chain is broken. Check the closure with `docker compose up --dry-run` — it walks the full dependency graph and reports what would actually start (and in what order) without launching containers.
+- **Overlay merge, not overlay replace.** Compose **merges** list-and-map fields like `depends_on` and `environment` across overlay files (`-f base.yml -f docker-compose.dev.yml`); it does not replace them. The classic trap: `base.yml` declares `depends_on: [redis]` for development, and an overlay tries to drop it with `depends_on: []` — the empty list **merges into** the base list, the `redis` edge **survives**, and prod still waits on (or starts) a dev-only Redis. To *replace* rather than merge, use the override tags: `depends_on: !override []` (replace the whole list) or `!reset null` (remove the key entirely). Verify the rendered result with `docker compose config` and confirm the unwanted edge is actually gone — never assume the overlay won.
 **L — Monitoring:** Health endpoint (/api/health checking DB, Redis, disk). External uptime monitor. Request logging (method, path, status, duration). Error tracking. Slow query logging (>1s). Worker job logging. Alerts: CPU >80%, Memory >85%, Disk >80%.
 **Build Staleness Detection (health endpoint):** The health endpoint MUST include a build fingerprint check. At startup, capture a build fingerprint (git commit hash, `BUILD_HASH` env var, or entry bundle mtime). Include it in `/api/health` responses. After any deploy, compare the health endpoint's fingerprint against the expected value. A mismatch means the process serves stale code — the build completed but was never reloaded. Automate: if health fingerprint != deployed commit, trigger process reload. This is the #1 cause of "I deployed but nothing changed" incidents. (Field reports #278, #279)
@@ -164,6 +169,33 @@ If a process manager (PM2, systemd, Docker, supervisord) owns the application po
 **Detection rule:** When writing CLAUDE.md "How to Run" sections or session restart commands, check if the project uses a process manager (`ecosystem.config.js`, `docker-compose.yml`, `*.service` files). If yes, the restart command MUST go through the PM — not through port killing.
+### PM2 Operational Foot-guns
+**`pm2 reload <config>` does NOT re-read log paths.** `error_file` / `out_file` paths bind at process *registration* time, not at reload time (field report #343 F9). If you change a log path in `ecosystem.config.js` and run `pm2 reload`, PM2 keeps writing to the old paths — the new ones never take effect, and a log-rotation or disk-pressure fix silently does nothing. Changing log paths requires a full re-registration cycle:
+```bash
+pm2 delete <app>           # drop the old registration
+pm2 start ecosystem.config.js --cwd /path/to/project
+pm2 save                   # persist so the new paths survive reboot
+```
+The same applies to any other property that binds at registration (`exec_mode`, `instances`, `cwd`): `pm2 reload` reloads code, not the process definition.
+**Multi-user deploy setups need per-user git identity (field report #343 F3).** When each environment runs as a different OS user (e.g. `deploy-staging`, `deploy-prod`), any git operation the deploy performs as that user — a merge commit, a `git stash`, a tag, an auto-commit of generated lockfiles — fails with `fatal: empty ident name (for <user@host>) not allowed` if that user has no `user.email` / `user.name`. The fault is invisible until a fallback path that commits actually runs in production. Provision git identity per deploy user:
+```bash
+sudo -u deploy-prod git config --global user.email "deploy@example.com"
+sudo -u deploy-prod git config --global user.name  "Prod Deploy"
+```
+Add this to `provision.sh` for every Unix user that will run git as part of a deploy or fallback path.
+### Deploy-Strategy Nomenclature Check
+If a deploy script's comments or docs claim **blue-green** or **zero-downtime**, verify the code actually implements an atomic-swap mechanism before believing the label (field report #343 F7). A real zero-downtime swap is one of:
+- **temp-build-then-rename** — build into `release-new/`, then `mv release-new release` (or repoint a `current` symlink) in a single atomic operation,
+- **container swap** — start the new container, health-check it, then cut traffic over and stop the old one, or
+- **load-balancer cutover** — add the new instance to the pool, drain and remove the old one.
+A `stop → build → start` loop mislabeled "blue-green" serves nothing during the build window and produces a 502 gap on every deploy. The label is not the mechanism. Audit check: grep the deploy script for the claim, then confirm a rename/symlink-repoint, container cutover, or LB pool change exists. If it's a stop-build-start loop, either fix it to atomic-swap or correct the comment — a mislabeled strategy hides a recurring outage.
 ### CI runs `npm test` at repo root
 In monorepo CI workflows, run `npm test` at the repository root — NOT `npm run test -w <workspace-name>`. The workspace-scoped form skips the root `pretest` hook, silently bypassing any root-level validators (agent-ref checkers, gate tests, consistency checks).
@@ -191,6 +223,21 @@ fi
 Applies to: Vercel Git Integration, Cloudflare Pages Git Integration, Netlify Git Integration, Firebase web-hook auto-deploys.
+### The served artifact is not the built artifact
+Every step exiting 0 — `git pull` ✓, `npm run build` ✓, `pm2 reload` / `docker compose up` ✓ — proves the build *ran*; it does NOT prove the **served** bundle is the one you just built (field report #349 F-1). The two can diverge whenever the thing that builds and the thing that serves are different processes pointed at different paths. The canonical split: a **host nginx static root** serves `/var/www/app/dist`, while the build runs *inside a Docker container* and writes to a **container-internal `dist`** that the host root never sees. Build succeeds, container restarts clean, health check is green — and prod serves the previous bundle indefinitely because nginx is reading a directory nobody rebuilt.
+Rule: after deploy, confirm the SERVED bundle matches the BUILT one by **fingerprint fetched back through the public/served path** — not by exit codes. Capture a build fingerprint (git short SHA, `BUILD_HASH`, or the hashed entry-bundle filename) at build time, then fetch it back through the real serving path and assert equality:
+```bash
+EXPECTED="$(git rev-parse --short HEAD)"
+# Pull the fingerprint through the SERVED path — the public URL or the host static root,
+# whichever end users actually hit — never the build directory.
+SERVED="$(curl -s "https://$DEPLOY_URL/version.txt")"   # or grep the hashed main.<hash>.js from index.html
+[ "$SERVED" = "$EXPECTED" ] || { echo "SERVED ARTIFACT MISMATCH: served=$SERVED built=$EXPECTED"; exit 1; }
+```
+This **generalizes to manual `/deploy`** the two automated checks already in this doc: **Build Staleness Detection** (§health endpoint — the process serves stale code) catches a build-but-no-reload within one process, and **Post-push live-URL fingerprint** (§above) catches a broken platform auto-deploy. This entry is the same fingerprint discipline for the self-hosted, multi-location, hand-run deploy: assert the served fingerprint equals the built fingerprint as the final gate of any manual deploy, not just platform pushes.
 ### Methodology-exposure check (static-host deploys)
 After deploying to a static CDN (Cloudflare Pages, Vercel, Netlify, Firebase, S3+CloudFront), curl a known methodology path and assert 404 / denied:
@@ -293,6 +340,23 @@ If health check fails after deploy:
 **PM2 discipline: never `pm2 delete` + `pm2 start` without `--cwd`.** Always specify the working directory explicitly: `pm2 start ecosystem.config.js --cwd /path/to/project`. Without `--cwd`, PM2 resolves paths relative to the current shell directory, which may differ from the project root — especially in deploy scripts that `cd` between operations. A `pm2 start` from the wrong directory silently starts the process with wrong paths, serving 404s on every route. (Triage fix from field report batch #149-#153.)
+### Docker Cleanup Preflight
+Before any `rm -rf` against a Docker **bind-mount** path (volumes the container wrote to as root — pgdata, redis dumps, uploaded files), preflight the ownership; do not just run the delete and hope (field report #353 RC-003). Docker bind-mounts written by a container default to **root** ownership on the host, so an unprivileged agent's `rm -rf` fails partway with `Permission denied`, often after deleting the writable half of the tree — a worse state than not starting.
+Preflight: `stat` the path's owner first, and branch on it:
+```bash
+target=/var/lib/myapp/pgdata
+owner="$(stat -c %U "$target" 2>/dev/null || stat -f %Su "$target")"   # GNU || BSD/macOS
+if [ "$owner" = "root" ] && [ "$(id -u)" -ne 0 ]; then
+  echo "MANUAL STEP REQUIRED — $target is root-owned; run as operator:"
+  echo "    sudo rm -rf $target"
+else
+  rm -rf "$target"
+fi
+```
+When the path is root-owned and the agent is unprivileged, **emit the `sudo`-prefixed step as a MANUAL operator action** rather than attempting (and half-completing) the delete. A clean handoff beats a partial destruction. (`stat -c %U` is GNU coreutils; `stat -f %Su` is BSD/macOS — the snippet tries both for portability.)
 ## Multi-Environment Isolation
 When staging and production coexist on the same server, enforce full isolation:
@@ -308,6 +372,29 @@ When staging and production coexist on the same server, enforce full isolation:
 Convention isn't enough — enforcement is. The pre-push hook is the single most effective protection. (Field report #241: 68-hour production outage from shared infrastructure.)
+### Renaming a Linked Worktree Directory Breaks Git Silently
+A linked git worktree (staging worktree, release worktree) keeps **two** pointer files that must agree on the directory's path. Renaming the worktree directory with a plain `mv` orphans both, and git gives you no warning (field report #343 F2):
+1. The worktree's own `.git` **file** (not a directory — it contains `gitdir: /abs/path/to/main/.git/worktrees/<name>`).
+2. The main repo's `.git/worktrees/<name>/gitdir` file, which points back at the worktree's `.git` file.
+After a bare `mv staging staging-old`, both paths are stale. The worst part: **`git worktree list` does NOT warn** — it happily prints the old path, so the breakage is invisible until a git command inside the moved worktree fails with `fatal: not a git repository` or a deploy that `cd`s into the worktree silently operates on the wrong tree.
+Fix — never `mv` a worktree directory. Use the porcelain that updates both pointers atomically:
+```bash
+git worktree move staging /new/abs/path/staging-old
+```
+If a directory was already moved by hand, repair both pointers manually:
+```bash
+# 1. fix the worktree's own .git file
+echo "gitdir: /abs/main/.git/worktrees/staging" > /new/abs/path/staging-old/.git
+# 2. fix the main repo's back-pointer
+echo "/new/abs/path/staging-old/.git" > /abs/main/.git/worktrees/staging/gitdir
+git worktree repair /new/abs/path/staging-old   # validates both ends
+```
+`git worktree repair` is the belt-and-suspenders step — run it after any manual edit to confirm both ends resolve.
 ## Deploy Safety Rules
 **rsync exclusion mandate:** NEVER use `rsync --delete` without excluding VPS-only directories. User-uploaded files, generated avatars, and data files only exist on the VPS — `--delete` will destroy them. Mandatory exclusions:
@@ -332,6 +419,62 @@ Add project-specific exclusions for any directory that receives runtime-generate
 **Post-deploy asset verification:** After deploying, verify specifically the files that *changed* in this deploy — not pre-existing assets. Check: (a) correct content-type header (text/html on a static asset means the file is missing from the deployment), (b) correct content-length (not the index.html fallback size), (c) deployment list shows the correct environment. Do NOT verify only pre-existing assets — they prove the host is up, not that the deploy succeeded. (Field report #114)
+**Read back after a vendor PUT that doesn't echo the object.** When a deploy or config step `PUT`s to a vendor/control-plane API (DNS provider, CDN, Plex, a SaaS settings endpoint) and the response does **not** contain the mutated object, do NOT treat the `200` as confirmation — issue a follow-up `GET` and assert the field you set actually took (field report #353 RC-004). A vendor `PUT` can return `200 OK` while silently discarding body params it doesn't recognize, applies asynchronously, or rejects at a validation layer that still returns success (the Plex pattern: settings PUT returns 200 but the value is unchanged). The status code confirms the request was *received*, not that the *mutation persisted*. Rule: for any non-echoing PUT/PATCH on the deploy path, follow with a read-back and compare before declaring success.
+## Env-File Loading Safety
+**NEVER load `.env` files with eval-export.** The pattern `while read line; do eval "export $line"; done < .env` (and `export $(cat .env | xargs)`) routes every value through the shell's positional-parameter and command expansion. Any secret containing a literal `$` — bcrypt hashes (`$2b$12$...`), PHP-style hashes, JWT signing keys, some base64 — gets mangled: `$2b` and `$12` are expanded as positional parameters and silently substituted (usually to empty), corrupting the secret. The app then boots with a broken hash and rejects every login, or signs tokens with a truncated key. The failure is invisible until auth breaks in production (field report #344 F1).
+Use a `$`-safe literal parser that never re-evaluates the value:
+```bash
+# Safe: read the line verbatim, split on the FIRST '=' only, no expansion.
+while IFS='=' read -r key val; do
+  case "$key" in
+    ''|'#'*) continue ;;          # skip blanks and comments
+  esac
+  export "$key=$val"               # value is a literal string, never eval'd
+done < .env
+```
+`$`-safe alternatives that bypass the shell entirely:
+- **Node ≥20:** `node --env-file=.env app.js` — Node parses the file itself, no shell expansion.
+- **systemd:** `EnvironmentFile=/etc/myapp/app.env` in the unit — systemd reads values literally.
+- **Docker Compose:** `env_file: .env` — Compose reads the file directly (it does NOT eval).
+Audit existing deploy scripts: grep for `eval "export`, `eval export`, and `export $(cat`. Any hit is a latent secret-corruption bug — replace it with the literal parser or one of the runtime-native loaders above.
+## systemd Unit Hardening (Node.js)
+Sandboxing directives in a systemd unit are good practice, but **Node.js units must NOT set `MemoryDenyWriteExecute=true`** (field report #344 F3). V8's JIT compiler maps pages that are simultaneously writable and executable (W^X is violated by design for JIT); `MemoryDenyWriteExecute=true` (MDWE) forbids exactly that, so the Node process dies with **`SIGTRAP` at boot** before it serves a single request. The crash looks unrelated to the unit file — operators chase the app for hours.
+Safe Node hardening stanza — everything useful **except** MDWE:
+```ini
+[Service]
+# --- hardening (Node-safe) ---
+NoNewPrivileges=true
+ProtectSystem=full
+ProtectHome=true
+PrivateTmp=true
+PrivateDevices=true
+ProtectKernelTunables=true
+ProtectKernelModules=true
+ProtectControlGroups=true
+RestrictSUIDSGID=true
+RestrictRealtime=true
+LockPersonality=true
+# MemoryDenyWriteExecute=true   # <-- DO NOT: V8 JIT needs W+X pages; SIGTRAP at boot
+ReadWritePaths=/var/lib/myapp /var/log/myapp
+```
+Note: ahead-of-time-compiled binaries (Go, Rust, statically compiled C/C++) have no JIT and **can** keep `MemoryDenyWriteExecute=true` — the restriction is specific to JIT runtimes (Node/V8, the JVM, PyPy, .NET with JIT). When a unit template is shared across services, gate MDWE on the runtime, not on the unit boilerplate.
+## Config Foot-Guns (deploy/runtime)
+Three recurring config traps that pass every syntax check yet break at runtime (field report #352 #5):
+- **Empty-string env defaults are non-nullish.** A shell default of the form `${VAR:-}` (or a Compose `VAR: ""`) sets the variable to `""`, which is a *defined, non-null* value. Downstream `cfg.X = process.env.VAR ?? defaultX` then keeps `""` — nullish coalescing (`??`) only fires on `null`/`undefined`, never on empty string — so the intended default is silently poisoned and the app runs with an empty config value. Either leave the var truly unset (omit the `:-` default) or validate-and-coerce empty strings at the config boundary.
+- **Dev hostnames hardcoded in worker healthchecks false-fail in prod.** A worker healthcheck that pings `http://localhost:3000` or `redis://dev-redis` passes in dev and fails in prod, marking a healthy worker unhealthy (and triggering restart loops). Healthcheck targets must come from the same env config the worker uses, never literals.
+- **Awaiting best-effort side effects on the auth path blocks sign-in.** `await analytics.track(...)` / `await auditLog.write(...)` inline in the login handler means a slow or down telemetry backend stalls — or fails — the sign-in. Best-effort side effects must be fire-and-forget (queue them, `void`-them, or move them off the request path), never `await`ed on a latency-critical auth route.
 ## Subdomain Routing (Cloudflare Pages / Vercel / Netlify)
 Platform-hosted static sites serve the entire project from root. Subdomain-to-subdirectory routing (e.g., `labs.example.com` → `/labs/`) requires platform-specific configuration:
@@ -344,6 +487,21 @@ Platform-hosted static sites serve the entire project from root. Subdomain-to-su
 **Always test routing before announcing a subdomain.** Curl the subdomain and verify it serves the expected content, not the root index.html.
+## Cloudflare TLS Mode (Flexible vs Full/Strict)
+On a **Flexible** TLS zone, Cloudflare terminates TLS at the edge and talks to the origin over **plain HTTP**. If that origin then **301-redirects HTTP → HTTPS** (the near-universal nginx/Caddy default), it bounces the edge's HTTP request back to HTTPS, which Cloudflare re-fetches over HTTP, which redirects again — an **infinite redirect loop** (`ERR_TOO_MANY_REDIRECTS`) for every visitor (field report #344 F4a). On a Flexible zone the origin must serve the app on plain HTTP and must NOT force the HTTPS upgrade — let Cloudflare own the HTTPS edge.
+**A Let's Encrypt cert on a sibling host is NOT proof the zone is Full/Strict.** Operators see `https://api.example.com` with a valid LE cert and assume the apex is Full mode too — but TLS mode is per-zone (sometimes per-host with config-rule overrides), and a working cert elsewhere says nothing about the mode applied to *this* hostname. Don't infer the mode from a neighbor's cert; check it.
+**Behavioral check — count redirect hops, don't read config:**
+```bash
+# Healthy Full/Strict origin: 0–1 hops. A Flexible-loop origin spirals (curl caps at --max-redirs).
+curl -sIL --max-redirs 10 "http://$ORIGIN_HOST/" | grep -ci '^location:'
+```
+A count at or near the cap (or `curl: (47) Maximum (10) redirects followed`) is the Flexible-loop signature. Fix by either switching the zone to **Full (strict)** and keeping the origin's HTTPS redirect, OR keeping **Flexible** and removing the origin's HTTP→HTTPS 301. Pick one; don't mix.
+**Minimum Cloudflare API token scope for `/deploy`.** So `/deploy` can verify the zone's SSL mode *before* it writes an nginx config that adds a redirect (and thus before it can create the loop), the deploy token must include **`Zone → SSL and Certificates → Read`** (`Zone:SSL`) and **`Zone → Certificates → Read`** (field report #344 F4b). With those scopes the deploy step queries the zone's `ssl` setting, and only emits a redirect-bearing origin config when the mode is Full/Strict. A token scoped to DNS-only cannot see the SSL mode and will happily ship a redirect into a Flexible zone.
 ## Deploy Surface Boundary
 **Invariant:** the repository root is NEVER the deploy surface. Physical separation between "all files tracked in the repo" and "files uploaded to the CDN / server" is enforced by tool configuration, not by `.gitignore`.