voidforge-build 23.9.2 → 23.11.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (75)
  1. package/dist/.claude/agents/bashir-field-medic.md +1 -0
  2. package/dist/.claude/agents/coulson-release.md +3 -0
  3. package/dist/.claude/agents/irulan-historian.md +3 -0
  4. package/dist/.claude/agents/kusanagi-devops.md +8 -0
  5. package/dist/.claude/agents/leia-secrets.md +10 -0
  6. package/dist/.claude/agents/loki-chaos.md +1 -0
  7. package/dist/.claude/agents/picard-architecture.md +11 -0
  8. package/dist/.claude/agents/silver-surfer-herald.md +17 -0
  9. package/dist/.claude/agents/sisko-campaign.md +3 -0
  10. package/dist/.claude/agents/thufir-protocol-parsing.md +10 -0
  11. package/dist/.claude/commands/architect.md +56 -0
  12. package/dist/.claude/commands/campaign.md +26 -1
  13. package/dist/.claude/commands/deploy.md +31 -0
  14. package/dist/.claude/commands/gauntlet.md +11 -0
  15. package/dist/.claude/commands/git.md +13 -3
  16. package/dist/.claude/commands/prd.md +8 -0
  17. package/dist/CHANGELOG.md +107 -0
  18. package/dist/CLAUDE.md +13 -4
  19. package/dist/VERSION.md +3 -1
  20. package/dist/docs/methods/AI_INTELLIGENCE.md +15 -0
  21. package/dist/docs/methods/BACKEND_ENGINEER.md +48 -0
  22. package/dist/docs/methods/BUILD_PROTOCOL.md +19 -0
  23. package/dist/docs/methods/CAMPAIGN.md +204 -1
  24. package/dist/docs/methods/DEVOPS_ENGINEER.md +80 -0
  25. package/dist/docs/methods/FORGE_KEEPER.md +80 -3
  26. package/dist/docs/methods/GAUNTLET.md +2 -0
  27. package/dist/docs/methods/PRD_GENERATOR.md +15 -0
  28. package/dist/docs/methods/QA_ENGINEER.md +46 -0
  29. package/dist/docs/methods/RELEASE_MANAGER.md +59 -0
  30. package/dist/docs/methods/SECURITY_AUDITOR.md +53 -0
  31. package/dist/docs/methods/SPEC_HANDOFF.md +53 -0
  32. package/dist/docs/methods/SUB_AGENTS.md +90 -0
  33. package/dist/docs/methods/SYSTEMS_ARCHITECT.md +55 -2
  34. package/dist/docs/methods/TESTING.md +17 -0
  35. package/dist/docs/methods/TIME_VAULT.md +17 -0
  36. package/dist/docs/methods/TROUBLESHOOTING.md +27 -0
  37. package/dist/docs/patterns/adr-verification-gate.md +80 -0
  38. package/dist/docs/patterns/ai-eval.ts +87 -0
  39. package/dist/docs/patterns/ai-prompt-safety.ts +242 -0
  40. package/dist/docs/patterns/audit-log.ts +132 -0
  41. package/dist/docs/patterns/deploy-preflight.ts +195 -0
  42. package/dist/docs/patterns/llm-state-dedup.ts +246 -0
  43. package/dist/docs/patterns/middleware.ts +83 -0
  44. package/dist/docs/patterns/multi-tenant-pool-bypass.ts +134 -0
  45. package/dist/docs/patterns/multi-tenant-property-test.ts +127 -0
  46. package/dist/docs/patterns/refactor-extraction.md +96 -0
  47. package/dist/scripts/voidforge.js +0 -0
  48. package/dist/wizard/lib/anomaly-detection.d.ts +59 -0
  49. package/dist/wizard/lib/anomaly-detection.js +122 -0
  50. package/dist/wizard/lib/asset-scanner.d.ts +23 -0
  51. package/dist/wizard/lib/asset-scanner.js +107 -0
  52. package/dist/wizard/lib/build-analytics.d.ts +39 -0
  53. package/dist/wizard/lib/build-analytics.js +91 -0
  54. package/dist/wizard/lib/codegen/erd-gen.d.ts +16 -0
  55. package/dist/wizard/lib/codegen/erd-gen.js +98 -0
  56. package/dist/wizard/lib/codegen/openapi-gen.d.ts +15 -0
  57. package/dist/wizard/lib/codegen/openapi-gen.js +79 -0
  58. package/dist/wizard/lib/codegen/prisma-types.d.ts +15 -0
  59. package/dist/wizard/lib/codegen/prisma-types.js +44 -0
  60. package/dist/wizard/lib/codegen/seed-gen.d.ts +16 -0
  61. package/dist/wizard/lib/codegen/seed-gen.js +128 -0
  62. package/dist/wizard/lib/correlation-engine.d.ts +59 -0
  63. package/dist/wizard/lib/correlation-engine.js +152 -0
  64. package/dist/wizard/lib/desktop-notify.d.ts +27 -0
  65. package/dist/wizard/lib/desktop-notify.js +98 -0
  66. package/dist/wizard/lib/image-gen.d.ts +56 -0
  67. package/dist/wizard/lib/image-gen.js +159 -0
  68. package/dist/wizard/lib/natural-language-deploy.d.ts +30 -0
  69. package/dist/wizard/lib/natural-language-deploy.js +186 -0
  70. package/dist/wizard/lib/project-init.js +57 -0
  71. package/dist/wizard/lib/route-optimizer.d.ts +28 -0
  72. package/dist/wizard/lib/route-optimizer.js +93 -0
  73. package/dist/wizard/lib/service-install.d.ts +18 -0
  74. package/dist/wizard/lib/service-install.js +182 -0
  75. package/package.json +1 -1
package/dist/VERSION.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Version
2
2
 
3
- **Current:** 23.9.2
3
+ **Current:** 23.11.0
4
4
 
5
5
  ## Versioning Scheme
6
6
 
@@ -14,6 +14,8 @@ This project uses [Semantic Versioning](https://semver.org/):
14
14
 
15
15
  | Version | Date | Summary |
16
16
  |---------|------|---------|
17
+ | 23.11.0 | 2026-05-10 | Field Report Triage — 18 reports closed (#313-#320, #322-#330). Two combined batches across multi-tenant retrofit campaigns (#313-#320) and autonomous-mode + AI-execution campaigns (#322-#330). 9 new patterns: adr-verification-gate.md, audit-log.ts, multi-tenant-{pool-bypass,property-test}.ts, rls-test-fixture.py, structural-sql-sentinel.py, refactor-extraction.md, ai-prompt-safety.ts, llm-state-dedup.ts. ai-eval.ts extended with Claude-prompt-eval template. middleware.ts extended with hot-path logging gate. 18 method-doc sections across CAMPAIGN, SYSTEMS_ARCHITECT, SECURITY_AUDITOR, QA_ENGINEER, SUB_AGENTS, BACKEND_ENGINEER, RELEASE_MANAGER, GAUNTLET, AI_INTELLIGENCE, DEVOPS_ENGINEER, FORGE_KEEPER, TESTING, TIME_VAULT. Surfer Gate now ships via npm methodology package + wizard project-init (closes ADR-051 distribution gap #317). Operational learnings added to Picard, Sisko, Coulson, Bashir, Loki, Irulan, Silver Surfer. |
18
+ | 23.10.0 | 2026-04-20 | Field Report Triage — 6 reports closed (#303-#308). 33 approved fixes: SPEC_HANDOFF.md (new method doc), deploy-preflight.ts + post-deploy-probe.sh (new patterns), LEARNINGS LRN-5..10, Deploy Surface Boundary + Deployment Hygiene (field report #305 remediation), Step 2.5 pre-deploy secret scan + Step 4.5 post-deploy probe, TECH_DEBT SLA, npm-name pre-flight, ADR-050 Rename Verification Checklist, Silver Surfer HARD CONSTRAINT + 4 agent operational learnings. |
17
19
  | 23.9.2 | 2026-04-20 | CI workflow idempotency + provenance baseline — `publish.yml` guards each publish with "already-published" check so re-runs skip cleanly. Tag-push re-publishes via CI to attach npm provenance attestation (absent on v23.9.1's manual publish). |
18
20
  | 23.9.1 | 2026-04-20 | ADR-061 pivot — `@voidforge` npm org unavailable (squat-adjacent). Rebranded publish target to `voidforge-build` / `voidforge-build-methodology` matching the voidforge.build domain. Migration banner for legacy `thevoidforge` / `@voidforge/cli` installs. Farewell releases + npm deprecate for smooth transition. |
19
21
  | 23.9.0 | 2026-04-20 | Campaign 42 — @voidforge scoped npm rename (ADR-061), gauntlet --fast 3-round mandate, README value-prop + first-command pointer, LEARNINGS.md 4 entries. Victory Gauntlet 3 fix batches: methodology runtime dep, registry-pin + env-stripping, BLOCK absolute paths. Publish gated on user scope claim + NPM_TOKEN rotation. |
@@ -227,6 +227,21 @@ If issues found, return to Phase 3. Maximum 2 iterations.
227
227
  ### AI Gate Bootstrapping (Cold-Start Problem)
228
228
  AI-gated approval systems have a cold-start problem: no historical outcomes -> gate rejects all requests -> no operations -> no outcomes. During the first N decisions (configurable, default 20), the gate should approve at reduced size (0.5-0.7x normal) to build a track record. The gate should never reject solely because "no historical data exists." Include explicit prompt guidance: "Lack of history is not a reason to reject — approve at reduced size to build the track record." (Field report #152)
229
229
 
230
+ ### Event-Ladder Severity Gradient
231
+
232
+ When implementing an event ladder (engagement → escalation → climactic), every rung's Sentry / observability severity must be **strictly louder** than the previous rung: `info < warning < error < fatal`. The climactic event MUST emit at `fatal` or louder and must never be silenced into prose-only logs.
233
+
234
+ Common failure mode (field report #319 §4): the immediate event is `error`, the hourly reminder is `warning`, and the deadline-trip becomes `logger.critical(...)` only — **inverting the gradient.** The most consequential event becomes the LEAST observable. This is structural, not malicious — agents implement events one at a time and lose track of relative severity across rungs.
235
+
236
+ **Implementation checklist:**
237
+
238
+ - [ ] List every rung of the ladder with its Sentry level
239
+ - [ ] Verify strictly increasing ordering: each rung > the previous
240
+ - [ ] Climactic rung is `fatal` (not `critical`-via-logger-only, not `warning` because "we already alerted earlier")
241
+ - [ ] Add a unit test that asserts the ordering: `assert all(a < b for a, b in zip(ladder.severities, ladder.severities[1:]))` or equivalent (a plain `sorted()` comparison would let two rungs share a level)
242
+
243
+ Pair with `/docs/patterns/middleware.ts` hot-path logging gate (fire-once or rate-limited emission) so the louder severity doesn't translate to log flooding.
244
+
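The checklist above can be sketched as a small Python check. This is a minimal illustration, not code from the package: `Severity`, `ladder_problems`, and the rung names are hypothetical, and the logger-only deadline event from field report #319 is modelled as `INFO` purely as a stand-in for "never reached Sentry".

```python
from enum import IntEnum


class Severity(IntEnum):
    """Sentry-style levels, ordered so that louder compares greater."""
    INFO = 1
    WARNING = 2
    ERROR = 3
    FATAL = 4


def ladder_problems(rungs):
    """Return a list of gradient violations; an empty list means the ladder is valid.

    `rungs` is an ordered list of (name, Severity) pairs, first rung first.
    """
    problems = []
    # Each rung must be strictly louder than the one before it.
    for (prev_name, prev_sev), (name, sev) in zip(rungs, rungs[1:]):
        if sev <= prev_sev:
            problems.append(
                f"{name} ({sev.name}) is not louder than {prev_name} ({prev_sev.name})"
            )
    # The last (climactic) rung must reach FATAL.
    if rungs and rungs[-1][1] < Severity.FATAL:
        last_name, last_sev = rungs[-1]
        problems.append(f"climactic rung {last_name} emits {last_sev.name}, expected FATAL")
    return problems


GOOD = [
    ("engagement", Severity.WARNING),
    ("escalation", Severity.ERROR),
    ("climactic", Severity.FATAL),
]

# Inverted ladder in the shape the field report describes.
# INFO is an assumed stand-in for an event that only reached the text logger.
INVERTED = [
    ("immediate", Severity.ERROR),
    ("hourly_reminder", Severity.WARNING),
    ("deadline_trip", Severity.INFO),
]
```

Calling `ladder_problems(GOOD)` returns an empty list, while `ladder_problems(INVERTED)` reports both inverted steps plus the missing `FATAL` at the top. Returning messages instead of a bare boolean keeps the failing unit test readable.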
230
245
  ## Anti-Patterns
231
246
 
232
247
  | Anti-Pattern | What Happens | Fix |
@@ -156,6 +156,54 @@ Services deployed in ephemeral environments (containers, serverless, spot instan
156
156
 
157
157
  Step 2 (Strange's Service Layer) already mandates "stateless composable services." This subsection makes the requirement concrete: stateless means *reconstructable from durable storage within one cycle*. (Field report #274)
158
158
 
159
+ ### AST Lints Are Cheap (8+ duplicates rule)
160
+
161
+ When a contract has 8 or more duplicates in production code (e.g., "every `complete()` call must pass `org_id`", "no router file may import `app.db.backends`", "every webhook handler must call `verify_signature()` before reading the body"), write an AST-based lint with a baseline-grandfather pattern + `--regenerate-baseline` flag.
162
+
163
+ **The pattern:**
164
+
165
+ 1. AST query — walk the AST, identify violations by structural pattern (not regex). Avoids false positives on quoted strings, comments, similarly-named functions in different namespaces.
166
+ 2. Baseline file — record existing violations at the time the lint is introduced. The CI gate fails only on NEW violations.
167
+ 3. `--regenerate-baseline` flag — operator can refresh the baseline when intentionally adding a violation (with PR comment justifying it).
168
+
169
+ **Why this works:**
170
+
171
+ A contract enforced by a checklist gets violated. A contract enforced by code review gets violated when the reviewer is tired. A contract enforced by AST lint with baseline-grandfather is structurally impossible to silently violate — the CI gate flips RED on the offending PR before merge.
172
+
173
+ **Cost-benefit:**
174
+
175
+ - AST lint authoring: ~30-60 min for the first lint in a project; subsequent lints reuse the framework.
176
+ - Baseline file maintenance: trivial; only changes when violations are intentionally added.
177
+ - Violation prevention: every future regression of the contract is caught at PR time.
178
+
179
+ Field report #324 (Union Station v7.8): F11 broadened UNS001 to catch `app.db.{backends,tenant_pool,rls_fail_open}` imports in router files — flagged 4 grandfathered usages immediately, prevented all future regressions. UNS003 (complete-org-id) shipped with empty baseline — every future M-37-style strict-mode regression is caught at PR time.
180
+
181
+ **Reference implementations from the field reports:**
182
+
183
+ - `scripts/lint_router_no_db.py` (PIC-001, Union Station) — routers must not import DB internals
184
+ - `scripts/lint_complete_org_id.py` (UNS003 / M-37 followup) — every `complete()` call passes `org_id`
185
+ - `scripts/check-org-id-defaults.sh` (ADR-137) — no `org_id: int = 1` defaults outside test fixtures (with `# system-org-allowed` waiver convention)
186
+
187
+ When NOT to AST-lint: contracts with <8 duplicates (use code review), one-off cleanups (do the cleanup; lint isn't needed), evolving contracts (the API hasn't stabilized — lint locks you in).
188
+
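A minimal sketch of the three-part pattern, assuming Python 3.9+ (for `ast.unparse`). It is not the `lint_complete_org_id.py` script named above, whose contents are not shown in this diff. The function names, the baseline key format, and the sample source are all hypothetical.

```python
import ast


def find_violations(source: str, filename: str) -> list[str]:
    """Return stable keys for every complete() call that passes no org_id keyword."""
    violations = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both `complete(...)` and `client.complete(...)`.
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "complete":
            continue
        has_org_id = any(kw.arg == "org_id" for kw in node.keywords)
        has_splat = any(kw.arg is None for kw in node.keywords)  # **kwargs may carry org_id
        if not (has_org_id or has_splat):
            # Key on the call text rather than the line number, so unrelated
            # edits that shift lines do not turn old violations into "new" ones.
            violations.append(f"{filename}::{ast.unparse(node)}")
    return sorted(violations)


def new_violations(found: list[str], baseline: list[str]) -> list[str]:
    """Violations that are not grandfathered by the baseline; CI fails if non-empty."""
    return sorted(set(found) - set(baseline))


SAMPLE = '''
client.complete(prompt, org_id=org.id)
client.complete(prompt)
note = "call complete(prompt) later"
complete(text)
'''
```

The string literal on the third sample line is not flagged, which is the false positive a regex would produce. A `--regenerate-baseline` flag would then write the output of `find_violations` back to the baseline file instead of comparing against it.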
189
+ ### Lifespan & Daemon ContextVar Coverage (Multi-Tenant FORCE-RLS)
190
+
191
+ When sweeping unscoped tenant-pool callsites for a multi-tenant boundary tightening (org_id, tenant_id, workspace_id), grep alone catches only HTTP-middleware-served paths. It misses code that runs *outside* the request lifecycle, where the ContextVar isn't set by middleware. These paths fail fast under FORCE-RLS with a non-owner role, and they are the load-bearing finding from Union Station's M-05 cutover.
192
+
193
+ **Sweep this whole list, not just `grep acquire`:**
194
+
195
+ | Path class | Why grep misses it | Fix |
196
+ |---|---|---|
197
+ | Lifespan startup (`@app.on_event("startup")`, FastAPI lifespan, Django ready) | One-shot at boot, before middleware sets ContextVar | Wrap with `set_current_org_id(org_id)` per-org loop OR `pre_org_resolution_scope()` for cross-tenant work |
198
+ | Scheduler ticks (cron, Celery beat, conductor sweeps) | Fires on day-boundaries; no HTTP request, no ContextVar | Wrap with `pre_org_resolution_scope()` (admin pool, BYPASSRLS) |
199
+ | Leader-elected daemons (queue cleanup, retention, distributed locks) | Long interval; no HTTP context | Same as scheduler ticks |
200
+ | Pool callbacks (`asyncpg` setup, SQLAlchemy event listeners) | Run outside any caller transaction | `SET LOCAL` does NOT work here — use `set_config(..., is_local=false)` + explicit `RESET` on connection release |
201
+ | Admin endpoints intended to bypass tenant scope | Need explicit admin pool acquisition | Use `_get_db_admin()` — never silently fall back to tenant pool |
202
+
203
+ **Test fixture trap:** the dev superuser (`postgres`, `us_test`) has `BYPASSRLS=t` and silently bypasses FORCE RLS at the engine level. A test that uses the dev superuser will pass even if the policy is broken. Tests that exercise tenant-isolation invariants must run under the non-owner runtime role (`{project}_app`, BYPASSRLS=f). See `/docs/patterns/rls-test-fixture.py` for the `db_as_app` SAVEPOINT pattern. (Field reports #318 + #319.)
204
+
205
+ **Forensic check before declaring sweep complete:** boot the service under the runtime DSN (not the dev DSN), exercise lifespan + one tick of every scheduled job + one admin-bypass codepath. Watch for `TenantInvariantError`. Anything that raises was missed by the sweep. (Field report #319: 4 lifespan/daemon paths surfaced this way during M-05 cutover.)
206
+
159
207
  ## Step 5 — Deliverables
160
208
 
161
209
  1. BACKEND_AUDIT.md
@@ -304,6 +304,25 @@ After running any build command (`build:workers`, `tsc --build`, webpack, etc.),
304
304
  2. Complete first-deploy pre-flight checklist (see `/devops` command)
305
305
  3. **Route registration check:** Verify all new API/route files are imported in the server entry point. Grep the entry point (e.g., `server.ts`, `app.ts`, `urls.py`) for imports of every new route file created during this build. An exported handler that isn't imported is a silent 404. (Field report #258: blueprint API routes exported but never registered via `addRoute()` in `server.ts` — wizard UI silently 404'd.)
306
306
  4. **Docker smoke test (field report #147):** If the project uses Docker/docker-compose, verify the container entrypoint runs the NEW code, not a legacy file. Run `docker compose up --build` (or equivalent) and confirm the process that starts is the architecture you just built. A 39-mission campaign once shipped with the legacy entrypoint because nobody checked what `CMD` pointed to.
307
+
308
+ ### External API integrations require live smoke tests (field report #304)
309
+
310
+ Unit tests, protocol parser reviews, and SDK verification can all pass while a real API call fails. Signing bugs, wire-format encoding mismatches, and domain-field errors only surface when a live request is made.
311
+
312
+ **Scope.** This applies to components that PRODUCE data for external consumption — custom signing, custom wire-format encoding, request serialization, or transport-layer framing. Read-only API clients that use a provider SDK with no custom signing (e.g., a Stripe balance read via `stripe-node`, a Supabase query via `@supabase/supabase-js`) are out of scope — their integration tests and the SDK's own test coverage provide adequate verification.
313
+
314
+ For any in-scope component:
315
+
316
+ 1. Build a `src/tools/smoke-<service>.ts` (or equivalent, language-appropriate) harness IMMEDIATELY AFTER the client builds — before Gauntlet review.
317
+ 2. Run it against real credentials (paper mode, testnet, or sandbox at minimum; live if reversible).
318
+ 3. Document the exit criteria: what response or observable state counts as "working."
319
+ 4. Re-run smoke after every signing or wire-format change.
320
+
321
+ **Credentials unavailable?** In regulated domains (banking APIs, government data APIs), sandbox access may take weeks to provision. Do NOT block Gauntlet indefinitely. Instead: document the exit criteria as an inline comment in the harness (`// SMOKE-TEST BLOCKED: awaiting <provider> sandbox credentials — target <date>`) and run the smoke test at the earliest CI checkpoint that has credentials. Phase log must name the blocker and the unblock target.
322
+
323
+ BarrierWatch campaign (field report #304) shipped three signing bugs past 206 unit tests, a 44-agent gauntlet, and a 3-agent contract review. All three were caught on the first live API call. Code-review agents cannot see what only a live call reveals.
324
+
325
+ Corollary: when reviewing external API clients, prefer agents that can fetch the SDK source (WebFetch, dependency audit) over agents that read docs alone. See thufir-protocol-parsing.md Operational Learnings.
307
326
  5. **Schema.sql sync gate:** After applying any migrations, regenerate `schema.sql` from the live database (e.g., `sqlite3 db.sqlite3 .schema > schema.sql`). Post-process the output: add `IF NOT EXISTS` to all `CREATE TABLE` and `CREATE INDEX` statements, remove `sqlite_sequence` (cannot be created manually). Commit the updated schema.sql. Stale schema.sql files cause false findings in `/assess` and mislead downstream consumers. (Field reports #232, #242)
308
327
  6. **Reference file freshness:** Before running `/assess` on an existing codebase, regenerate reference files (schema.sql, API docs, type exports) from the live system. Stale reference files generate false findings that waste triage time — the v7.0 assessment over-reported multi-tenant gaps because schema.sql showed 20 tables vs 52 actual. (Field report #232)
309
328
  7. Log to `/logs/phase-12-deploy.md`
@@ -51,6 +51,47 @@ Autonomous campaign execution: read the PRD, figure out what's next, build it, v
51
51
  12. **Log deviations.** When the build deviates from PRD architecture, update the PRD or log it in campaign-state.md. Never leave a silent contradiction.
52
52
  13. **Operational verification after deploy.** After deploying to a live environment, wait for 1 full operational cycle (1 trade cycle, 1 cron job, 1 polling interval) and check logs for errors, halts, and successful operations before marking the mission complete. "It deployed" ≠ "it works." (Field report #152)
53
53
 
54
+ ### Audit-first missions for callsite-counted ADRs
55
+
56
+ When `/architect` produces an ADR whose effort estimate scales with callsite count (refactor sweeps, ContextVar migrations, security boundary tightening, schema-column adds across N tables), require a paired **audit mission** BEFORE plan finalization. The audit produces the actual count; plan estimates use that count, not the architect's grep-by-eye guess.
57
+
58
+ Examples that triggered this rule:
59
+ - ADR-138 (Union Station, #316 §3): estimated "12+ unscoped tenant-pool callsites." Audit mission M-04.5 found ~900 across ~95 files — 75x off. The original "manual per-callsite refactor" plan was infeasible at that scale; audit forced the architectural rewrite (pool callback + ContextVar middleware).
60
+ - Any ADR claiming "tighten boundary at every X" — if X is a code pattern, count it before planning.
61
+
62
+ **Audit mission shape:**
63
+ 1. Picard or Spock writes the audit query (grep recipe, AST query, or schema introspection)
64
+ 2. The audit runs against the live codebase
65
+ 3. The result is committed as `docs/audits/<topic>-<date>.md` with the count + per-file breakdown
66
+ 4. THEN the architect re-estimates effort and sequences subsequent missions
67
+
68
+ Skip this step only when the ADR's scope is bounded by entity (one file, one table, one route group) — bounded ADRs don't need an audit because the count is visible in the scope itself.
69
+
70
+ ### Closeout grep pinning
71
+
72
+ When a `/campaign` closeout report cites a followup count or backlog size (e.g., "F-V710-ORG1-DEFAULTS — ~12 sites remaining" or "~21 cumulative followups"), the followup definition MUST embed the literal grep pattern + observed `n=N` at closeout HEAD. The next campaign's `/architect --plan` re-runs the same grep before accepting the count.
73
+
74
+ **Required closeout shape:**
75
+
76
+ ```
77
+ ### F-<NAME>
78
+ Scope: verified at <SHA> with `grep -rcE '<pattern>' <paths> | awk -F: '{s+=$2} END {print s}'` → n=<COUNT>
79
+ Severity: <level>
80
+ Status: open
81
+ ```
82
+
83
+ Field report #329 documents the cost of skipping this: v7.10 closeout cited "~12 sites" for `org_id=1` defaults without a verification grep. v7.11 plan-mode re-ran the grep and found 65 sites — 5× the estimate. The campaign plan had to restructure into a parallel sub-campaign (M-59-SWEEP) instead of a serial mission slate. The verification grep is one shell command. Skipping it cascades into wrong plans.
84
+
85
+ This rule reciprocates the ADR-side "Scope-confidence interval" requirement in `SYSTEMS_ARCHITECT.md` — the closeout writes the grep, the next plan re-runs it.
86
+
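The "next plan re-runs it" half of the rule could look roughly like the following. This is a sketch under assumptions: the function names are hypothetical, it counts matching lines the way `grep -c` does, and Python's `re` syntax is close to but not identical to `grep -E`, so a pinned pattern should be checked in both tools before relying on it.

```python
import re
from pathlib import Path


def count_matching_lines(pattern: str, root: Path, glob: str = "**/*.py") -> int:
    """Count lines matching `pattern` under `root`, summed across files."""
    regex = re.compile(pattern)
    total = 0
    for path in sorted(root.glob(glob)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            total += sum(1 for line in text.splitlines() if regex.search(line))
    return total


def verify_pinned_count(pattern: str, root: Path, pinned: int, glob: str = "**/*.py"):
    """Re-run the pinned pattern and return (matches_closeout, observed_count)."""
    observed = count_matching_lines(pattern, root, glob)
    return observed == pinned, observed
```

A plan step would call `verify_pinned_count` with the pattern and `n` copied from the closeout entry and refuse to accept the followup count when the first element of the result is `False`.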
87
+ ### TECH_DEBT SLA enforcement
88
+
89
+ `/campaign` and `/assemble` audit `TECH_DEBT.md` before every mission selection. Critical + Immediate + LowEffort items overdue by 48h BLOCK campaign advancement. Critical + Immediate + HighEffort items overdue by 72h without owner + deadline BLOCK. High + Immediate items overdue by 7 days WARN.
90
+
91
+ See `.claude/commands/campaign.md` Step 0.5 for the full gate.
92
+
93
+ Evidence: field report #305. A Critical + Immediate + LowEffort credential-leak entry sat for 32 days because TECH_DEBT had labels but no contract.
94
+
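The three thresholds above reduce to a small decision function. The sketch below is illustrative only: the real gate lives in `.claude/commands/campaign.md` Step 0.5, which is not shown here. `DebtItem`, its field names, and the reading of "overdue" as time since the item was opened are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DebtItem:
    severity: str                      # e.g. "Critical", "High"
    urgency: str                       # e.g. "Immediate"
    effort: str                        # e.g. "LowEffort", "HighEffort"
    opened_at: datetime                # assumption: SLA clock starts when the item is logged
    owner: Optional[str] = None
    deadline: Optional[datetime] = None


def sla_verdict(item: DebtItem, now: datetime) -> str:
    """Return "BLOCK", "WARN", or "OK" for one TECH_DEBT entry."""
    if item.urgency != "Immediate":
        return "OK"
    age = now - item.opened_at
    if item.severity == "Critical":
        if item.effort == "LowEffort" and age > timedelta(hours=48):
            return "BLOCK"
        # High-effort items get 72h, and are excused only if owned and scheduled.
        if item.effort == "HighEffort" and age > timedelta(hours=72):
            if not (item.owner and item.deadline):
                return "BLOCK"
    if item.severity == "High" and age > timedelta(days=7):
        return "WARN"
    return "OK"
```

Under these assumptions, the 32-day-old credential-leak entry from field report #305 would have returned `"BLOCK"` on day 3 instead of sitting unnoticed.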
54
95
  ## Two Modes
55
96
 
56
97
  ### Planning Mode (`--plan`)
@@ -161,7 +202,8 @@ Dax reads the Prophets' plan:
161
202
  7. Diff: PRD requirements vs. implemented features (structural AND semantic — not just "does the route exist?" but "does the component render what the PRD describes?")
162
203
  8. Produce: **The Prophecy Board** — ordered list of missions with scope, plus a separate list of BLOCKED items (assets, credentials, user decisions)
163
204
  8a. **Cross-mission data handoff check (Odo):** For any system that forms a closed loop (e.g., generate → track → analyze → feed back), identify every data handoff point between missions. Each handoff must be explicitly scoped in at least one mission: "Mission N produces X, Mission M consumes X via [mechanism]." If the loop spans 3+ missions, draw the handoff map. Unscoped handoffs become no-ops — the code on each side compiles and tests independently, but the data never flows between them. (Field report #265: seedPush extracted winning variant data but discarded it — the feedback loop was documented but not wired because the two ends were in separate missions with no explicit handoff.)
164
- 9. **Acceptance criteria gate:** Every mission on the Prophecy Board MUST have at least one acceptance criterion before Dax finalizes the board. Acceptance criteria are concrete, verifiable conditions "endpoint returns 200 with correct schema," "UI renders empty/loading/error/success states," "test covers the happy path." Missions without acceptance criteria are stubs that escape quality gates later. If a mission's scope is too vague to produce criteria, it's too vague to build split or clarify first. This applies to `--plan` mode too, not just build mode. (Field report #129: Phases 3-6 written as stubs without criteria, caught late by blitz compliance check.)
205
+ 9. **Cluster-mission recognition:** Before finalizing the board, Dax asks: "Are any of these missions cluster-natured?" A cluster-mission is a single-line entry that actually spans 4+ ADR sections, 4+ sub-components, or 4+ migration steps. Examples: M-51 cluster (per-org MCP topology) genuinely required 4 sub-missions per ADR-107 §c-§f; M-44 series required 5 sub-missions per ADR-117. Pretending a cluster is one mission produces 2-3× planning underestimates and forces mid-campaign restructuring. If a mission has 4+ named deliverables in different files/modules, split into sub-missions (M-51a/b/c/d) at plan time, not at execution time. (Field report #326: Sisko's original v7.10 slate was 9 missions; reality was 21 because cluster recognition was deferred.)
206
+ 10. **Acceptance criteria gate:** Every mission on the Prophecy Board MUST have at least one acceptance criterion before Dax finalizes the board. Acceptance criteria are concrete, verifiable conditions — "endpoint returns 200 with correct schema," "UI renders empty/loading/error/success states," "test covers the happy path." Missions without acceptance criteria are stubs that escape quality gates later. If a mission's scope is too vague to produce criteria, it's too vague to build — split or clarify first. This applies to `--plan` mode too, not just build mode. (Field report #129: Phases 3-6 written as stubs without criteria, caught late by blitz compliance check.)
165
207
 
166
208
  ### Deep Codebase Scan for PRD Diff
167
209
 
@@ -271,6 +313,120 @@ These issues are invisible to standard code review but Critical when found by th
271
313
 
272
314
  After each mission's 1-round review, check: "Did this mission modify any file that was also modified by a prior mission in this campaign?" If so, verify that the prior mission's patterns (error handling, locking, validation) are preserved in the new changes. This is a 30-second scan per shared file — run `git log --name-only` to identify cross-mission file overlap. Cross-cutting bugs that span files modified in different missions are invisible to single-mission review. (Field report #38: 2 Critical findings — chat stream timeout and optimistic locking omission — both involved files modified across multiple missions.)
273
315
 
316
+ ### Caller-Graph Audit (silent-default abstractions)
317
+
318
+ When a mission closes a class of bug rooted in a silent default value (`org_id: int = 1`, `tenant_id = None`, `user_id = SYSTEM`, `region = "us-east-1"`), the mission brief MUST enumerate every caller-graph site of every function whose default was wrong — not just the primary site that surfaced the bug.
319
+
320
+ **Detection pattern (F6-class abstractions):**
321
+
322
+ A "silent default" is a parameter whose default value short-circuits a multi-tenant / authorization / region-scoping invariant when the caller forgets to pass it. The fix is two-step:
323
+
324
+ 1. Remove the default OR change it to a sentinel that fails-loud (`None` with assertion, `NotProvided` enum value)
325
+ 2. Update every caller — the mission's grep MUST find them all
326
+
327
+ Without step 2, callers that omitted the parameter relied on the wrong default; making the function safer breaks them. Worse, callers that PASSED `org_id=1` explicitly (the wrong default) now silently leak across tenants. The cleanup density of these bugs is high — one explicit defect surfaces N adjacent latents of the same family.
328
+
329
+ **Required mission shape:**
330
+
331
+ ```
332
+ M-XX — F-V710-ORG1-DEFAULTS cleanup, batch N
333
+ Scope: <module/* glob>
334
+ Pre-mission audit:
335
+ grep -rnE 'org_id\s*:\s*int\s*=\s*1' <glob> → N callsites
336
+ For each callsite, classify:
337
+ - Defensible (cross-tenant by design, documented)
338
+ - Retrofit residue (CRITICAL, fix)
339
+ - Caller of retrofit residue (verify it passes a real org_id)
340
+ Fix shape:
341
+ 1. Remove the default (or replace with fail-loud sentinel)
342
+ 2. Update every caller-graph site enumerated above
343
+ 3. Add a property test or AST lint to prevent re-introduction
344
+ ```
345
+
346
+ Field report #326 (Union Station v7.10 M-55): the mission brief named 9 sigs in `widget_pipeline.py`. Reality required 11 sigs (9 + 2 in `registry.py`) PLUS 2 sigs in `providers.py`. The bonus F-K-M55-1 HIGH bug — cross-tenant providers leak via `SlackCacheContext.inject` and `IdeasDbContext.enrich` defaults — was caught by Kenobi during M-55 review and fixed same-commit. Without the caller-graph enumeration upfront, it would have shipped.
347
+
348
+ This is the "cleanup-density pattern" (F-V710-CLEANUP-DENSITY in the source field report). Budget for it: when a mission closes one defect in this class, expect 0.5-2 bonus defects of the same family in the same commit.
349
+
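Step 1 of the fix shape (replace the silent default with a fail-loud sentinel) might look like this in Python. The enum, the function `load_widgets`, and its return value are hypothetical and stand in for whichever signature carried the `org_id: int = 1` default.

```python
import enum


class _Missing(enum.Enum):
    """Sentinel type: distinguishes "caller passed nothing" from any real org id."""
    NOT_PROVIDED = enum.auto()


NOT_PROVIDED = _Missing.NOT_PROVIDED


def load_widgets(org_id=NOT_PROVIDED):
    """Before the fix this read `org_id: int = 1`, silently serving org 1's data."""
    if org_id is NOT_PROVIDED:
        # Fail loudly at the callsite instead of leaking across tenants.
        raise TypeError("load_widgets() requires an explicit org_id")
    return f"widgets for org {org_id}"
```

Dropping the default entirely gives the same `TypeError` for free. The sentinel form is mainly useful when the parameter must stay optional in the signature during a staged rollout. Neither form catches callers that pass `org_id=1` explicitly, which is why step 2 still needs the caller-graph enumeration.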
350
+ ### V710 Acceptance Template Inheritance Counter
351
+
352
+ For multi-mission campaigns shipping a class of fix (multi-tenant retrofit, dialect migration, auth tightening), establish a project-scoped **acceptance template** — a 4-8 item matrix that every mission in the class must satisfy. Track inheritance with a monotonic counter; relax to spot-check or retire only via explicit Picard-countersigned amendment at version-rollover gates.
353
+
354
+ **Template anatomy (per field report #326, v7.10):**
355
+
356
+ The V710 template (Batman, M-50b R0) was 6 items:
357
+
358
+ 1. NO POLICY-AS-PURE-FUNCTION (policies must consume DB state, not compile-time constants)
359
+ 2. NO LOG-OUTPUT-AS-ASSERTION (test assertions must check returned values, not log lines)
360
+ 3. NO SINGLE-HAPPY-PATH (every test covers happy + at least one negative + boundary)
361
+ 4. NO `?→$N` MONKEYPATCH SHIMS (use proper fixtures, not parameter rewrites at test time)
362
+ 5. ASYNC PATHS GET ASYNC TESTS (await paths tested under `asyncio.run`, not synchronously)
363
+ 6. SHAPE OF TRUTH PINS (test asserts the literal SQL/JSON shape returned, not a derived count)
364
+
365
+ Each mission's test files were reviewed against the matrix at acceptance. Counter advanced 1/1, 2/2, ... clean inheritance through 13/13 by Phase C close, 20/20 by v7.10 closeout. Zero waivers.
366
+
367
+ **Why this works:**
368
+
369
+ A class of bug is recurring because the test shape that would catch it is non-obvious. Document the test shape ONCE as an acceptance template; every mission in the class inherits the discipline. The counter ratchets forward without becoming bureaucratic because it's tied to a real bug class, not made-up process.
370
+
371
+ **When to relax:**
372
+
373
+ At version-rollover gates (v7.X → v7.X+1, phase boundaries), Picard countersigns ONE of:
374
+
375
+ - **(a) re-affirm** — keep mandatory for next phase (default if unrecorded)
376
+ - **(b) relax-to-spot-check** — sample 1 in N missions
377
+ - **(c) retire** — the bug class is closed; future tests don't need the matrix
378
+
379
+ Document the decision in a `logs/campaign-decisions-{version}.md` entry. Defaulting to re-affirm preserves discipline; relax/retire requires evidence (the bug class hasn't recurred for N missions).
380
+
381
+ ### Operator Decision Documents
382
+
383
+ For campaigns where the operator delegates architectural calls mid-campaign (split mission X into A+B, choose recomputation over bit-cast, accept ADR amendment scope, etc.), record each call in `logs/campaign-decisions-{version}.md` with a stable ID, the call, and the rationale.
384
+
385
+ **Why this is a first-class artifact:**
386
+
387
+ Mission briefs encode WHAT to build. Operator decisions encode WHAT THE OPERATOR CHOSE among multiple valid paths. Without a decision log, agents reading the campaign-state later cannot distinguish operator intent from happenstance — they're equally likely to rationalize the wrong choice as "load-bearing" or "incidental."
388
+
389
+ **Shape:**
390
+
391
+ ```
392
+ # Campaign Decisions — v7.10
393
+
394
+ ## D-1: M-51 split scope (2026-05-08)
395
+ Question: split per-org MCP topology into 1, 2, or 4 sub-missions?
396
+ Operator chose: 4 (M-51b/c/d/e per ADR-107 §c-§f)
397
+ Rationale: cluster-mission recognition; each sub-section was a 1-2 day mission.
398
+
399
+ ## D-2: M-52 recomputation vs bit-cast (2026-05-09)
400
+ Question: V092 migrates 13.6K embedding rows. Use bit-cast or recompute from source?
401
+ Operator chose: recompute from source (NULL → repopulate via embed_text())
402
+ Rationale: ADR-110's bit-cast scheme was wrong as written (Spock + Loki LK-2.2 caught
403
+ that "bit-identical reinterpret cast" was actually a value cast that would corrupt
404
+ every row). Recompute is slow (~3 min for 13.6K rows) but correct.
405
+ ...
406
+ ```
407
+
408
+ Field report #326 (v7.10 M-57): a literal SQL `false→true` flip referenced in D-6 would have BROKEN the contract because ADR-138 §Addendum supersedes ADR-139. Investigation surfaced the supersession; the mission honored architectural reality instead of mechanical application. The decision log captured INTENT; the mission delivered CORRECTNESS. Both are required.
409
+
410
+ ### LOC Growth Tracker (per-mission)
411
+
412
+ After each mission's 1-round review, run a LOC sweep against files modified in the mission. Any file that crossed the 300-LOC threshold (or grew >100 LOC in a single mission) is flagged for split-or-justify review.
413
+
414
+ **Detection (run post-build, before commit):**
415
+
416
+ ```bash
417
+ # Files this mission touched
418
+ git diff --name-only HEAD | while IFS= read -r f; do
419
+ if [ -f "$f" ]; then
420
+ lines=$(wc -l < "$f")
421
+ [ "$lines" -gt 300 ] && echo "LOC: $f -> $lines (over 300)"
422
+ fi
423
+ done
424
+ ```
425
+
426
+ When the tracker fires: Boromir or Stark reviews whether the growth is justified or if a split is overdue (per ADR-066 or project's equivalent decomposition boundary). The check costs <5 seconds; the alternative is the Gauntlet catching it 4 missions later, when the file has compounded with two other missions' changes and the safe split is harder.
427
+
428
+ Field report #322 (barrierwatch): `statistical-gate.ts` grew 425 → 775 LOC across M5 + M6 + Fix Batch additions. Each per-mission review was clean in isolation — only Gauntlet Round 3 caught the cumulative violation. A per-mission LOC tracker would have surfaced it at +100, not +350.
429
+
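The detection snippet above checks only the absolute 300-LOC threshold. The second trigger (more than 100 LOC of growth in a single mission) could be sketched as follows. This is a minimal sketch, not part of the shipped methodology: `flag_growth` is a hypothetical helper name, and it assumes net growth is measured as added minus deleted lines from `git diff --numstat`.

```shell
# flag_growth LIMIT
# Reads `git diff --numstat` lines (added<TAB>deleted<TAB>path) on stdin and
# prints every file whose net growth (added - deleted) exceeds LIMIT.
# Binary files, which numstat reports as "-", are skipped.
flag_growth() {
  awk -F '\t' -v limit="$1" '
    $1 != "-" && ($1 - $2) > (limit + 0) {
      printf "LOC growth: %s +%d (over %d)\n", $3, $1 - $2, limit
    }'
}

# Hypothetical usage from the repo root, before commit:
#   git diff --numstat HEAD | flag_growth 100
```

Keeping the filter as a function that reads stdin lets the same check run against any diff range, for example a mission's first commit through `HEAD`.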
274
430
  ### Pattern Replication Check
275
431
 
276
432
  When a mission duplicates or extends an existing code path (adding a version-aware path alongside a legacy path, adding a new endpoint that mirrors an existing one), verify that security patterns (locking, rate limiting, validation, sanitization) from the original path are replicated in the new path. Grep for the original pattern and confirm it exists in the new code. (Field report #38: optimistic locking in legacy chat edit was not replicated to the version-aware path.)
@@ -361,6 +517,53 @@ If you believe context justifies reducing quality:
361
517
 
362
518
  The Gauntlet is never reduced. Checkpoints are never lightweight. Debriefs are never skipped. Run `/context` or run the full protocol.
363
519
 
520
+ ### Pause-Bias Anti-Pattern (autonomous mode)
521
+
522
+ When a mission completes in autonomous mode (`--blitz`, `--autonomous`, or default ADR-043 autonomous-by-default), the orchestrator's next action is to mark the next mission `in_progress` and START. Do NOT present "milestone summaries" framed as decision points. Do NOT ask "continue with M-X or pause?" Do NOT rationalize a pause as "strategic checkpoint" or "context budget management."
523
+
524
+ **The distinction is structural, not stylistic:**
525
+
526
+ - ✅ **Status update** (valid): "M-7.4 shipped at `6d7f5b3`. Starting M-7.5."
527
+ - ❌ **Decision-frame question** (anti-pattern): "M-7.4 complete. Continue with M-7.5 or pause for review?"
528
+ - ❌ **Rationalized pause** (anti-pattern): "Given the context usage, recommend resuming M-7.5 in a fresh session."
529
+
530
+ Status updates are FINE — they tell the operator what happened. Questions are NOT — they shift control back to the operator, which defeats the autonomous-by-default contract.
531
+
532
+ **The only valid pause triggers in autonomous mode** remain:
533
+ 1. `/context` shows actual usage above 85% (cite the number)
534
+ 2. A BLOCKED item requires user input (e.g., missing credentials, design decision)
535
+ 3. A Critical finding from `/assemble` that can't be auto-fixed (per `--autonomous` git-tag rollback)
536
+ 4. The user interrupts
537
+
538
+ Field report #323 (barrierwatch Phase 2): mid-campaign, after 4 missions shipped, the orchestrator presented a "checkpoint summary" with `"Continue with M-7.5 or pause?"` The operator responded sharply that pause-bias was a recurring pattern and to stop asking questions. The rationalizations ("context budget," "strategic checkpoint") had been rejected before via project memory; the orchestrator re-introduced them anyway.
539
+
540
+ **Why this happens:** the pause-frame feels like good operational hygiene. It is not. ADR-043 made autonomous the default specifically to eliminate this friction. Status updates between missions preserve transparency without re-introducing decision points. Trust the operator to interrupt if they need to.
541
+
542
+ ### ROADMAP Path Disambiguation
543
+
544
+ If both `ROADMAP.md` (root) and `docs/ROADMAP.md` exist, the **root** file is canonical for active campaign state. `docs/ROADMAP.md` is typically historical or aspirational — do not mutate it during `/campaign` or `/architect --plan` unless explicitly scoped.
545
+
546
+ Sisko + Picard verify which file holds the active campaign section at Step 0/Step 1 entry. The verification is a single `head -20 ROADMAP.md docs/ROADMAP.md` to inspect both — disambiguate before reading, not after editing the wrong one.
547
+
548
+ Field report #323: Victory Gauntlet reported "no Active Campaign section in `docs/ROADMAP.md`" — false alarm because the canonical `ROADMAP.md` was at repo root (2761 LOC). Reading the wrong file produces wrong findings.
549
+
550
+ ### Pre-Split Blocker Phase (ADR-066-style file splits)
551
+
552
+ When a campaign includes file splits of signing-critical, replay-critical, or load-bearing modules (per ADR-066 or project equivalent), the first split commit MUST be preceded by a "Phase B" of pre-split blockers. Without them, splits ship correctness regressions that surface only in production.
553
+
554
+ **The four pre-split blockers** (all must pass tsc + lint + existing test suite BEFORE any split commit):
555
+
556
+ 1. **Signing/serialization golden-vector test** for every signing path the splits will touch (EIP-712, action hashes, HMAC, JWT). Pinned hex inputs → pinned hex output.
557
+ 2. **Byte-identical replay-equivalence harness** using a frozen test fixture. Pattern: copy live DB to `runs/fixtures/`, capture canonical stdout via the deterministic strategy/runner, commit baseline + sha256 lock, write a vitest/pytest `skipIf-fixture` test that diffs new output against baseline.
558
+ 3. **ADR amendment** declaring the canonical home for any utility being extracted (rate-limiter, types, error-envelope). Preempts ad-hoc decisions during the splits.
559
+ 4. **Load-bearing function ships first** if the campaign also implements a critical decision function (e.g., `applyReengagementGate`) — ship the function before the split that would otherwise re-route its wiring.
560
+
561
+ Each blocker is its own mission, gated on tsc + lint + tests green. THEN the split missions land.
562
+
563
+ Reference implementation from field report #323 (barrierwatch v0.5.0): `src/research/lib/rolling-r1-verdict.ts` (Zod-schema parser for cron output contracts), `src/__tests__/replay-equivalence.test.ts` + `scripts/replay-equivalence.sh` + `tests/fixtures/replay-r1-baseline.txt` (byte-identical regression harness), `state/db-checksum-baseline.txt` (frozen fixture pin). Three splits (`hl-exchange-client.ts`, `pm-clob-client.ts`, `main.ts`) all preserved replay-equivalence BYTE-IDENTICAL.
564
+
565
+ The discipline marks the difference between "splits worked" and "splits worked safely."
566
+
364
567
  ### Step 5 — Debrief and Commit
365
568
 
366
569
  1. **Security gate (before commit):** Check if this mission added new TypeScript/JavaScript files that handle network I/O (HTTP endpoints, WebSocket handlers), user input (form parsing, body parsing), or credential storage (vault writes, env file generation). If yes, flag: **"This mission added network-facing code. Run `/sentinel` before committing."** Even in `--fast` mode, security is non-negotiable for new attack surface. This prevents shipping Critical vulnerabilities that only get caught in a post-hoc hardening pass.
@@ -142,12 +142,70 @@ Known issues when deploying Tailwind v4 to Vercel or similar build platforms:
142
142
 
143
143
  Never combine methodology syncs (`/void`) with unrelated debugging in the same session. If a sync introduces a problem, the debug commits interleave with sync commits, making it impossible to identify which change broke what. Rule: sync first, verify, THEN debug separately. If needed, hard-reset to the pre-sync state and reapply incrementally. (Field report #29: 6 retcon commits interleaved with 20 CSS-fix commits.)
144
144
 
145
+ ### Production Runtime Topology Authoritative-Source
146
+
147
+ Production runtime should run under a **single supervisor** — typically systemd, sometimes PM2 or Docker — and the active topology must be discoverable from one source. Temporary workarounds drift the topology silently:
148
+
149
+ - A `nohup`/`tmux`/manual `&` launch outlives its purpose; the systemd unit drifts from reality.
150
+ - `ExecStart` paths ossify against an old binary location (`~/.local/bin/uvicorn` vs `.venv/bin/uvicorn`).
151
+ - `StartLimitBurst` exhausts; the unit shows `failed` while a manual process serves traffic.
152
+
153
+ When a temporary workaround is acceptable, document it in `OPERATIONS.md` §Runtime Topology (or equivalent) as the canonical runtime, then either fix the systemd unit OR set a calendar reminder to revisit it. Field report #319 §7: Union Station served via nohup-launched uvicorn from 2026-03-27 onward — the systemd unit was `enabled` but `failed`. M-05 cutover required killing the nohup process (brief outage), fixing `ExecStart`, `systemctl reset-failed`, `daemon-reload`, `restart`. None of that should have been in the cutover contract.
154
+
155
+ **Pre-deploy check (mandatory):**
156
+
157
+ 1. `systemctl status <unit>` (or `pm2 list`) — what does the supervisor think is running?
158
+ 2. `ps -ef | grep <binary>` — what's actually running?
159
+ 3. Reconcile. If they disagree, fix BEFORE the deploy starts.
160
+
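The reconcile step can be reduced to a small decision function. The sketch below is illustrative only: `topology_verdict` is a hypothetical name, and it assumes the supervisor state comes from `systemctl is-active <unit>` and the process count from something like `pgrep -fc <binary>`. It does not prove that a running process belongs to the supervisor, so an `OK` result still warrants a look at the unit's main PID.

```shell
# topology_verdict UNIT_STATE PROC_COUNT
#   UNIT_STATE: e.g. the output of `systemctl is-active <unit>`
#   PROC_COUNT: number of matching processes, e.g. from `pgrep -fc <binary>`
topology_verdict() {
  if [ "$1" = "active" ] && [ "$2" -gt 0 ]; then
    echo "OK: supervisor active, process present"
  elif [ "$2" -gt 0 ]; then
    echo "DRIFT: process running while unit is $1"
  elif [ "$1" = "active" ]; then
    echo "DRIFT: unit active but no matching process"
  else
    echo "DOWN: unit is $1 and no process found"
  fi
}

# Hypothetical usage (unit and binary names are placeholders):
#   topology_verdict "$(systemctl is-active myapp.service)" "$(pgrep -fc uvicorn)"
```

The `failed` unit plus a live process case is the one described in field report #319: a manual launch serving traffic while the supervisor reports failure.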
145
161
  ### Process Manager Discipline
146
162
 
147
163
  If a process manager (PM2, systemd, Docker, supervisord) owns the application port, NEVER kill the port directly (`fuser -k`, `kill`, `lsof -ti | xargs kill`). Always reload through the process manager: `pm2 reload`, `systemctl restart`, `docker compose restart`. Killing the port causes the process manager to auto-restart the old build, creating a race condition with any manual start attempt — the user sees stale code while the fix is already built. (Field report #123: 30+ minutes of stale code serving in production because `fuser -k 5005/tcp` raced with PM2's auto-restart.)
148
164
 
149
165
  **Detection rule:** When writing CLAUDE.md "How to Run" sections or session restart commands, check if the project uses a process manager (`ecosystem.config.js`, `docker-compose.yml`, `*.service` files). If yes, the restart command MUST go through the PM — not through port killing.
150
166
 
167
+ ### CI runs `npm test` at repo root
168
+
169
+ In monorepo CI workflows, run `npm test` at the repository root — NOT `npm run test -w <workspace-name>`. The workspace-scoped form skips the root `pretest` hook, silently bypassing any root-level validators (agent-ref checkers, gate tests, consistency checks).
170
+
171
+ Evidence: field report #308 RC-3 — the `stat -f %m` portability bug in surfer-gate was latent for multiple releases because CI used `npm test -w @voidforge/cli`, which bypassed the root pretest that ran gate tests. Surfaced only when v23.9.0 switched CI to root `npm test`. See LRN-8 in docs/LEARNINGS.md.
172
+
173
+ ### Post-push live-URL fingerprint (platform auto-deploy integrity)
174
+
175
+ The health-endpoint build-fingerprint (above) catches processes serving stale code. It does NOT catch the case where the platform auto-deploy integration is broken and no new deploy happened at all. To catch that:
176
+
177
+ After every `git push` to a branch that auto-deploys, wait ~60 seconds, then hit a known endpoint on the live URL. Compare a content fingerprint (a string from the just-pushed commit, or the `last-modified` header age) against expected. If the fingerprint didn't change, the auto-deploy integration is broken — run the platform-specific manual deploy (`vercel --prod`, `flyctl deploy`, `wrangler pages deploy ./dist`, `firebase deploy`, etc.) and flag the hook as needing reconnection.
178
+
179
+ Evidence: field report #307. The voidforge-marketing-site Vercel auto-deploy silently failed for 8 days after the repo was renamed (`voidforge-marketing-site` → `voidforge-site`). Pushes went unbuilt for those eight days and the live site kept serving stale April-15 content until a `/assess` caught it. A post-push fingerprint check would have caught it on day one.
180
+
181
+ Canonical check snippet (note: `Last-Modified` header is optional on some CDNs — fallback is the content-hash grep on the second line):
182
+ ```bash
183
+ EXPECTED_SHA="$(git rev-parse --short HEAD)"
184
+ sleep 60
185
+ FINGERPRINT="$(curl -sI "https://$DEPLOY_URL" | grep -i '^last-modified:')"
186
+ if [[ -z "$FINGERPRINT" ]] || ! curl -s "https://$DEPLOY_URL" | grep -q "$EXPECTED_SHA"; then
187
+ echo "AUTO-DEPLOY FAILED — running manual deploy"
188
+ # platform-specific manual deploy here
189
+ fi
190
+ ```
191
+
192
+ Applies to: Vercel Git Integration, Cloudflare Pages Git Integration, Netlify Git Integration, Firebase web-hook auto-deploys.
193
+
194
+ ### Methodology-exposure check (static-host deploys)
195
+
196
+ After deploying to a static CDN (Cloudflare Pages, Vercel, Netlify, Firebase, S3+CloudFront), curl a known methodology path and assert 404 / denied:
197
+
198
+ ```bash
199
+ for path in /.claude/agents/silver-surfer-herald.md /docs/methods/FORGE_KEEPER.md /HOLOCRON.md /CHANGELOG.md /VERSION.md; do
200
+ status=$(curl -s -o /dev/null -w "%{http_code}" "https://$DEPLOY_URL$path")
201
+ [[ "$status" == "200" ]] && echo "LEAK: $path returned $status"
202
+ done
203
+ ```
204
+
205
+ If any path returns 200, add a `.cfignore` / `.vercelignore` / `firebase.json` `hosting.ignore` entry that excludes `.claude/`, `docs/methods/`, `docs/patterns/`, `HOLOCRON.md`, `CHANGELOG.md`, `VERSION.md`, `logs/`. Methodology files must not be publicly served.
206
+
207
+ Evidence: field report #303 — saltwater.com was serving 264 agent files, 37 patterns, method docs, HOLOCRON, CHANGELOG, and VERSION publicly on its Cloudflare CDN. Affects every VoidForge-generated project deployed to a static host until an ignore file is added. Companion: FORGE_KEEPER.md §Deployment Hygiene.
208
+
151
209
  ## E2E CI Architecture
152
210
 
153
211
  E2E tests run as a separate CI job, parallel with unit tests. Browser binaries cached via `actions/cache` (GitHub Actions) or equivalent CI cache. E2E failures are informational for the first release (v18.0-v18.1), then enforced as blocking. Playwright uses Chromium only in CI to minimize binary size (~250MB cached). Configuration:
@@ -286,6 +344,28 @@ Platform-hosted static sites serve the entire project from root. Subdomain-to-su
286
344
 
287
345
  **Always test routing before announcing a subdomain.** Curl the subdomain and verify it serves the expected content, not the root index.html.
288
346
 
347
+ ## Deploy Surface Boundary
348
+
349
+ **Invariant:** the repository root is NEVER the deploy surface. Physical separation between "all files tracked in the repo" and "files uploaded to the CDN / server" is enforced by tool configuration, not by `.gitignore`.
350
+
351
+ Why this matters: most deploy tools (wrangler Direct Upload, `aws s3 sync`, Firebase `firebase deploy --only hosting`) do NOT honor `.gitignore`. Deploying from repo root uploads `.env`, `.claude/`, `docs/methods/`, `logs/`, test fixtures, and any other sensitive or non-production file.
352
+
353
+ ### Required configuration per platform
354
+
355
+ | Platform | Enforcement |
356
+ |----------|------------|
357
+ | Cloudflare Pages | `wrangler.toml` with `pages_build_output_dir = "./dist"` (or similar). Deploy command: `wrangler pages deploy ./dist` — never `wrangler pages deploy .` |
358
+ | Vercel | `vercel.json` with `outputDirectory`. Never point at repo root |
359
+ | Netlify | `netlify.toml` with `publish = "dist"` or similar |
360
+ | Firebase Hosting | `firebase.json` `hosting.public = "dist"` + `hosting.ignore` list with methodology paths |
361
+ | AWS S3 + CloudFront | `aws s3 sync ./dist s3://bucket` — never `aws s3 sync . s3://bucket` |
362
+
363
+ ### Verification
364
+
365
+ The methodology-exposure check above (curl denylist) is the runtime assertion that enforcement holds. Run it after every deploy. If any path returns 200 that should not, the deploy surface boundary is breached — stop and fix the ignore/output-dir configuration before continuing.
366
+
367
+ Evidence: field report #305 documents a 32-day credential leak caused by `wrangler pages deploy .` (dot path) uploading `.env` to production. The `.gitignore` entry was present — wrangler Direct Upload ignored it. Field report #303 documents methodology files publicly served on Cloudflare CDN for all VoidForge static-host deploys lacking `.cfignore`.
368
+
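The curl denylist runs after upload. A complementary local check could run before it, scanning the output directory for paths that must never ship. The following is a sketch under assumptions: `scan_deploy_dir` is a hypothetical helper, and the denylist is simply the paths named in this section plus `.env`.

```shell
# scan_deploy_dir DIR
# Prints one LEAK line per forbidden path found under DIR and returns
# non-zero if any were found, so it can gate a deploy script.
scan_deploy_dir() {
  dir="$1"
  leaks=0
  for p in .env .claude docs/methods docs/patterns HOLOCRON.md CHANGELOG.md VERSION.md logs; do
    if [ -e "$dir/$p" ]; then
      echo "LEAK: $p present in deploy dir"
      leaks=1
    fi
  done
  return "$leaks"
}

# Hypothetical usage in a deploy script:
#   scan_deploy_dir ./dist || { echo "refusing to deploy"; exit 1; }
```

Pointing the function at `.` instead of `./dist` shows immediately why the repository root cannot be the deploy surface.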
289
369
  ## Deliverables
290
370
 
291
371
  1. /scripts/provision.sh, deploy.sh, rollback.sh, backup-db.sh
@@ -72,6 +72,8 @@ VERSION.md ← Only sync the "Current:" line. If the pro
72
72
 
73
73
  **CLAUDE.md path detection:** Some projects use `.claude/CLAUDE.md` instead of root `CLAUDE.md`. Before syncing, check both locations. If `.claude/CLAUDE.md` exists and root `CLAUDE.md` does not, sync to `.claude/CLAUDE.md`. If both exist, warn the user — do not create a duplicate. (Field report #58)
74
74
 
75
+ **CHANGELOG.md identity check:** Before syncing `CHANGELOG.md`, read the local file's first non-title heading. If it references non-methodology versions (e.g., `Site v2.x`, `App v1.x`) or a semver that does not match `VERSION.md`'s `Current:` line, the local `CHANGELOG.md` belongs to the downstream project — not to VoidForge methodology. Treat `CHANGELOG.md` as **skipped** for this project and print the reason in the sync plan. Never clobber a project's own changelog with the methodology changelog. (Field report #307 F1)
76
+
75
77
  **Never touched by Bombadil:**
76
78
  ```
77
79
  .claude/settings.json ← User's permissions and hooks (review new permissions manually)
@@ -95,7 +97,8 @@ Orient to the current state:
95
97
  1. Read `VERSION.md` — identify the current VoidForge version
96
98
  2. Check which shared methodology files exist locally — determines if this is a VoidForge project
97
99
  3. Note any locally modified shared files via `git status` or file timestamps
98
- 4. Announce: *"Old Tom is listening... you're running VoidForge vX.Y.Z. Let's see what the river brings."*
100
+ 4. **Parallel-session detect.** Before starting the sync, run `git log --since="1 hour ago" --all`. If commits exist that weren't authored in this session, warn: *"Another session may have committed recently: [SHA] [subject]. Review before proceeding."* Resolve with the user — never race another session's writes against Bombadil's. (Field report #307 F5)
101
+ 5. Announce: *"Old Tom is listening... you're running VoidForge vX.Y.Z. Let's see what the river brings."*
99
102
 
100
103
  ### Step 1 — Listen to the River (Goldberry)
101
104
 
@@ -115,6 +118,24 @@ Fetch the latest from the source:
115
118
  - **Unchanged** — identical
116
119
  - **Locally modified** — local version differs from BOTH the old upstream and new upstream (user made custom changes)
117
120
 
121
+ ### Step 1.4 — Distribution-vs-Source Drift Check (Goldberry)
122
+
123
+ When syncing methodology, verify that every artifact mentioned in the published `CLAUDE.md` prose is actually present after sync. CLAUDE.md cites scripts and paths as if they exist; if the npm package's `files` array or `prepack.sh` doesn't ship them, downstream consumers get prose that points at nothing.
124
+
125
+ **Procedure:**
126
+ 1. Grep the synced `CLAUDE.md` for path-shaped strings: `scripts/`, `.claude/settings`, `docs/adrs/`, `bash scripts/`.
127
+ 2. For each path, run `[ -e <path> ] && echo present || echo MISSING`.
128
+ 3. If any cited path is missing, this is a **distribution gap** — flag to the user with the specific missing entries.
129
+ 4. Surface the gap as a manifest line in Step 2:
130
+ ```
131
+ Distribution gap detected:
132
+ - scripts/surfer-gate/check.sh — cited in CLAUDE.md but not shipped
133
+ - scripts/surfer-gate/record-roster.sh — cited in CLAUDE.md but not shipped
134
+ Action: pull from tmcleod3/voidforge:<paths> and re-run sync, OR upgrade to vX.Y.Z+ where the gap is closed.
135
+ ```
136
+
137
+ This catches future ADR-051-shaped drift: a permanent enforcement mechanism that the methodology documents as live but never actually ships. Field report #317 documents this exact failure for the Silver Surfer Gate scripts pre-v23.10.0 — the gap was known and recorded in CHANGELOG.md for at least one published version before being closed.
138
+
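Steps 1 and 2 of the procedure could be combined into one pass. This is a minimal sketch, not the shipped implementation: `check_cited_paths` is a hypothetical name, and the pattern only covers `scripts/` and `docs/adrs/` paths made of letters, digits, and `_ . / -`.

```shell
# check_cited_paths DOC [ROOT]
# Extracts path-shaped strings from DOC and reports whether each one
# exists under ROOT (default: the current directory).
check_cited_paths() {
  doc="$1"
  root="${2:-.}"
  grep -oE '(scripts|docs/adrs)/[A-Za-z0-9_./-]+' "$doc" | sort -u |
  while IFS= read -r p; do
    p="${p%.}"   # drop a sentence-ending period captured by the pattern
    if [ -e "$root/$p" ]; then
      echo "present $p"
    else
      echo "MISSING $p"
    fi
  done
}

# Hypothetical usage after a sync:
#   check_cited_paths CLAUDE.md . | grep '^MISSING'
```

Any `MISSING` line is a candidate entry for the distribution-gap manifest in Step 2.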
118
139
  ### Step 1.5 — Spring Cleaning (Treebeard)
119
140
 
120
141
  When upgrading across versions, check the **Migration Registry** for one-time cleanup actions that apply to the version range being crossed. Migrations only run once: they clean up artifacts from older VoidForge versions that should never have been in the npm package.
@@ -231,7 +252,11 @@ Apply the updates:
231
252
 
232
253
  ### Step 4.5 — Preview Deploy Verification
233
254
 
234
- After syncing methodology files, if the project has a deploy target, run a preview build (`npm run build`) to verify the sync didn't break anything. For Vercel projects: `vercel` (without `--prod`) to create a preview URL and verify it loads. Only promote to production after preview passes. This prevents the scenario where synced .md files trigger Tailwind v4 content scanning failures that only manifest in platform build environments.
255
+ After syncing methodology files, if the project has a deploy target, run `npm test` **first**, then a preview build (`npm run build`) to verify the sync didn't break anything. For Vercel projects: `vercel` (without `--prod`) to create a preview URL and verify it loads. Only promote to production after both pass.
256
+
257
+ Run `npm test` before `npm run build` because content drift from the sync (new commands, agents, patterns) surfaces as consistency-check failures in tests — not in build output. Build-only verification misses them. If root `pretest` is absent the test invocation is cheap; if present, it catches agent-reference drift, orphan patterns, and dead links before they ship. (Field report #307 F2)
258
+
259
+ This also prevents the scenario where synced `.md` files trigger Tailwind v4 content scanning failures that only manifest in platform build environments.
235
260
 
236
261
  ### Step 4 — The Song Continues (Bombadil)
237
262
 
@@ -250,7 +275,7 @@ Verify and celebrate:
250
275
  4. Check for handoffs — if new commands or agents were added, mention them
251
276
  5. **Content drift check:** If the sync changed methodology counts (agent counts, command counts, pattern counts) AND the project has a data layer that displays VoidForge metadata (e.g., `releases.ts`, `commands.ts`, site content), flag: "The sync changed [N] agents/commands/patterns. If your project displays these counts, update the data layer to match." This prevents stale counts on marketing sites and docs pages after version bumps. (Field report #113)
252
277
  5b. **Description accuracy check (Radagast):** For projects that display command descriptions (marketing sites, docs sites, README generators), compare each command's user-facing description against the upstream method doc's actual steps. If the upstream method doc gained new steps, flags, or capabilities in this sync that aren't reflected in the site's description, flag: "Command /X gained [capability] in this sync but the site description doesn't mention it. Update the description in [data file]." Count-based checks catch missing entries; this catches stale descriptions on existing entries. The most common void sync change is adding capabilities to existing commands, not adding new commands. (Field report #267: 9 commands had outdated descriptions after a sync that added capabilities to 12 agents — the biggest feature was invisible on the site.)
253
- 5. **Version history check:** If VERSION.md was updated, compare the version table entries against any project pages that display release history (roadmap pages, changelog displays, "shipped versions" sections). Flag versions present in VERSION.md that are missing from site content. This prevents version drift between the methodology's version history and user-facing release pages.
278
+ 5c. **Version history check:** If VERSION.md was updated, compare the version table entries against any project pages that display release history (roadmap pages, changelog displays, "shipped versions" sections). Flag versions present in VERSION.md that are missing from site content. This prevents version drift between the methodology's version history and user-facing release pages.
254
279
  6. Announce: *"Hey dol! merry dol! The forge burns bright! VoidForge vA.B.C — all tools sharp, all songs true. The world is good."*
255
280
 
256
281
  ## Deliverables
@@ -268,3 +293,55 @@ Verify and celebrate:
268
293
  - **Network failure:** Bombadil announces the failure cheerfully and stops. No retries, no partial state.
269
294
  - **Full-tier users:** Bombadil only syncs methodology files. For wizard updates, tell the user to run `npx voidforge-build update --self`.
270
295
  - **Rollback:** All updates are applied to the working tree (not committed). If anything goes wrong, `git checkout -- .` restores every file to its last committed state. Bombadil should mention this safety net before applying updates.
296
+ - **Two-pass sync from scaffold-era `/void` (pre-v20.2):** Projects upgrading from the old scaffold-era `/void` implementation require two runs. The first sync rewrites `.claude/commands/void.md` to point at `main` (npm-transport) instead of the legacy scaffold branch; the second sync, now using the new `void.md`, fetches main's full content. This is self-resolving but can look confusing — announce up front that a second run may be needed. (Field report #303)
297
+
298
+ ## Deployment Hygiene
299
+
300
+ After syncing methodology to a project that deploys to a static CDN (Cloudflare Pages, Vercel static, Netlify, Firebase Hosting, GitHub Pages), methodology files MUST be excluded from the public deploy. Add a platform-appropriate ignore file that excludes:
301
+
302
+ ```
303
+ .claude/
304
+ docs/methods/
305
+ docs/patterns/
306
+ HOLOCRON.md
307
+ CHANGELOG.md
308
+ VERSION.md
309
+ logs/
310
+ ```
311
+
312
+ Per-platform file names:
313
+ - Cloudflare Pages → `.cfignore`
314
+ - Vercel → `.vercelignore`
315
+ - Netlify → publish-ignore rules in `netlify.toml` or `_headers` / build config
316
+ - Firebase → `firebase.json` `hosting.ignore` array
317
+
318
+ **Verification (run after the next deploy):**
319
+
320
+ ```bash
321
+ curl -sI "https://$DEPLOY_URL/.claude/agents/silver-surfer-herald.md" | head -1
322
+ ```
323
+
324
+ If status is `200`, the exclusion is missing — methodology files are being publicly served. Expected status: `404`. Repeat for a representative sample: `docs/methods/FORGE_KEEPER.md`, `HOLOCRON.md`, `CHANGELOG.md`. Methodology files leaking to the public origin is an information-disclosure finding and a hygiene failure. (Field report #303)
325
+
326
+ ## Cross-Repo Scalar Sync
327
+
328
+ Methodology consumers (marketing sites, docs sites, dashboards) often hardcode scalar counts — agent count, method-doc count, pattern count, test count, ADR count — in TypeScript data constants. These drift silently when `/void` runs, because `/void` touches methodology files but not sibling-repo data layers.
329
+
330
+ **Target state (auto-sync):** Scaffold CI produces a `stats.json` artifact on every release:
331
+
332
+ ```json
333
+ {
334
+ "version": "vX.Y.Z",
335
+ "agents": 42,
336
+ "methodDocs": 37,
337
+ "patterns": 41,
338
+ "scaffoldTests": 128,
339
+ "adrs": 61
340
+ }
341
+ ```
342
+
343
+ Sibling repos import `stats.json` at build time rather than hardcoding. A failed fetch falls back to the last-known-good committed copy.
344
+
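One way the artifact might be produced is by counting files at release time. The sketch below is an assumption-heavy illustration: `count_md` and `emit_stats` are hypothetical helpers, every count is taken as the number of `*.md` files directly under the named directory (adjust the glob where a directory holds other file types), and `scaffoldTests` is omitted because it would have to come from the test runner rather than the file tree.

```shell
# count_md DIR: number of *.md files directly under DIR (0 if DIR is absent).
count_md() {
  if [ -d "$1" ]; then
    find "$1" -maxdepth 1 -type f -name '*.md' | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# emit_stats ROOT VERSION: prints a one-line stats.json body to stdout.
emit_stats() {
  root="$1"
  version="$2"
  printf '{"version": "%s", "agents": %s, "methodDocs": %s, "patterns": %s, "adrs": %s}\n' \
    "$version" \
    "$(count_md "$root/.claude/agents")" \
    "$(count_md "$root/docs/methods")" \
    "$(count_md "$root/docs/patterns")" \
    "$(count_md "$root/docs/adrs")"
}

# Hypothetical usage in a release job:
#   emit_stats . "vX.Y.Z" > stats.json
```

Deriving the numbers from the tree at release time means sibling repos never need a hand-edited constant.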
345
+ **Current state (manual sync):** Until the artifact ships, `docs/methods/RELEASE_MANAGER.md` release checklist MUST include a step: *"Sync scalar counts to sibling repos (marketing site, docs site) — update `totalADRs`, `totalMethodDocs`, `totalPatterns`, `totalScaffoldTests`, agent count."* Omission is how the marketing site landed 12 ADRs and 9 versions behind in field report #308 PF-6.
346
+
347
+ Any sibling repo that displays these counts is in-scope for this sync, not just the primary marketing site.
@@ -162,6 +162,8 @@ Fix batches happen between rounds:
162
162
 
163
163
  **Pass 2 false-positive severity:** When Pass 2 identifies a potential false-positive in a security pattern added during Pass 1, classify as **Must Fix**, not Medium. A false positive in a security scanner is functionally a regression — it degrades working features. Do not defer with "monitor in production" unless a monitoring mechanism actually exists and is configured. (Field report #121)
164
164
 
165
+ **Production-parity exit criterion:** Before any Gauntlet round can be marked PASS, verify that the test execution backend matches the project's declared production backend. If `PROJECT_VERSION.md` (or equivalent) declares PostgreSQL but `tests/conftest.py` autouse fixture pins SQLite (or vice versa), the Gauntlet **FAILS** regardless of green test counts. Tests pinned to the wrong backend silently mask the integrations that actually run in prod (RLS, asyncpg pools, advisory locks, LISTEN/NOTIFY, FOR UPDATE SKIP LOCKED, transaction semantics). Field report #315 M3: this slipped past 4 dual-backend Gauntlets on Union Station between v6.2.1 cutover and v7.6 — every Gauntlet was structurally blind to the runtime risk it was supposed to be reviewing. Concrete check at end of each round: `grep -nE "_backend\s*=\s*['\"]" tests/conftest.py` and reconcile against `cat PROJECT_VERSION.md | grep -i 'database\|backend'`. Mismatch = FAIL the round.
166
+
165
167
  ## Finding Format
166
168
 
167
169
  Every finding, from every agent, in every round, uses this format: