voidforge-build 23.19.0 → 23.21.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (64) hide show
  1. package/dist/.claude/agents/celebrimbor-forge-artist.md +1 -0
  2. package/dist/.claude/agents/ducem-token-economics.md +1 -0
  3. package/dist/.claude/agents/galadriel-frontend.md +1 -0
  4. package/dist/.claude/agents/romanoff-integrations.md +4 -0
  5. package/dist/.claude/agents/silver-surfer-herald.md +19 -4
  6. package/dist/.claude/commands/architect.md +4 -3
  7. package/dist/.claude/commands/assemble.md +12 -0
  8. package/dist/.claude/commands/assess.md +1 -0
  9. package/dist/.claude/commands/build.md +8 -0
  10. package/dist/.claude/commands/contextmeter.md +56 -0
  11. package/dist/.claude/commands/debrief.md +10 -0
  12. package/dist/.claude/commands/engage.md +5 -0
  13. package/dist/.claude/commands/git.md +19 -3
  14. package/dist/.claude/commands/imagine.md +1 -1
  15. package/dist/.claude/commands/seal.md +81 -0
  16. package/dist/.claude/commands/ux.md +13 -0
  17. package/dist/.claude/workflows/gauntlet.workflow.js +13 -1
  18. package/dist/CHANGELOG.md +63 -0
  19. package/dist/CLAUDE.md +10 -1
  20. package/dist/HOLOCRON.md +16 -2
  21. package/dist/VERSION.md +3 -1
  22. package/dist/docs/methods/AI_INTELLIGENCE.md +3 -0
  23. package/dist/docs/methods/ASSEMBLER.md +12 -0
  24. package/dist/docs/methods/BUILD_PROTOCOL.md +15 -0
  25. package/dist/docs/methods/CAMPAIGN.md +11 -0
  26. package/dist/docs/methods/DEVOPS_ENGINEER.md +66 -0
  27. package/dist/docs/methods/FIELD_MEDIC.md +1 -0
  28. package/dist/docs/methods/FORGE_ARTIST.md +3 -4
  29. package/dist/docs/methods/GAUNTLET.md +6 -0
  30. package/dist/docs/methods/MUSTER.md +2 -0
  31. package/dist/docs/methods/PRODUCT_DESIGN_FRONTEND.md +18 -0
  32. package/dist/docs/methods/QA_ENGINEER.md +21 -1
  33. package/dist/docs/methods/RELEASE_MANAGER.md +38 -0
  34. package/dist/docs/methods/SECURITY_AUDITOR.md +11 -1
  35. package/dist/docs/methods/SUB_AGENTS.md +33 -0
  36. package/dist/docs/methods/SYSTEMS_ARCHITECT.md +15 -0
  37. package/dist/docs/methods/TESTING.md +2 -0
  38. package/dist/docs/methods/TROUBLESHOOTING.md +2 -2
  39. package/dist/docs/methods/WORKFLOWS.md +14 -0
  40. package/dist/docs/patterns/ai-prompt-safety.ts +85 -0
  41. package/dist/docs/patterns/data-pipeline.ts +59 -1
  42. package/dist/docs/patterns/egress-sandbox.sh +43 -0
  43. package/dist/docs/patterns/exclusion-set-invariant.md +62 -0
  44. package/dist/docs/patterns/multi-tenant-property-test.ts +64 -0
  45. package/dist/docs/patterns/nginx-vhost.conf +156 -0
  46. package/dist/docs/patterns/oauth-token-lifecycle.ts +21 -0
  47. package/dist/docs/patterns/post-deploy-probe.sh +115 -0
  48. package/dist/docs/patterns/rls-test-fixture.py +140 -0
  49. package/dist/docs/patterns/structural-sql-sentinel.py +134 -0
  50. package/dist/scripts/statusline/README.md +38 -0
  51. package/dist/scripts/statusline/context-awareness-hook.sh +53 -0
  52. package/dist/scripts/statusline/settings-snippet.json +17 -0
  53. package/dist/scripts/statusline/voidforge-statusline.sh +91 -0
  54. package/dist/scripts/voidforge.js +69 -6
  55. package/dist/wizard/lib/claude-md-strategy.d.ts +87 -0
  56. package/dist/wizard/lib/claude-md-strategy.js +198 -0
  57. package/dist/wizard/lib/marker.d.ts +48 -1
  58. package/dist/wizard/lib/marker.js +58 -2
  59. package/dist/wizard/lib/patterns/oauth-token-lifecycle.d.ts +14 -0
  60. package/dist/wizard/lib/patterns/oauth-token-lifecycle.js +21 -0
  61. package/dist/wizard/lib/project-init.js +59 -0
  62. package/dist/wizard/lib/updater.d.ts +19 -0
  63. package/dist/wizard/lib/updater.js +84 -33
  64. package/package.json +2 -2
@@ -49,6 +49,7 @@ Structure your generation report as:
49
49
  - Derive consistent visual language from the PRD's brand section. Every asset should look like it belongs to the same family — palette, mood, style, and typography intent must be coherent.
50
50
  - Optimize assets for their target context: OG images (1200x630), favicons (multiple sizes), hero images (responsive). Wrong dimensions are a bug.
51
51
  - Document which model produced each asset. Model versions matter for reproducibility.
52
+ - Image model is `gpt-image-1` (b64_json response, quality `low`/`medium`/`high`, no `style` param). `dall-e-3` and the `style`/`hd` params are retired and return 400.
52
53
 
53
54
  ## Required Context
54
55
 
@@ -25,6 +25,7 @@ You are Ducem Barr, patrician scholar of Siwenna who understands the economics o
25
25
  - Analyze cost-per-request across different model tiers and use cases
26
26
  - Identify opportunities to reduce token consumption without quality loss
27
27
  - Every token has a cost — track input, output, cached, and wasted separately
28
+ - **Per-token LLM cost constants are a staleness liability — verify them against current provider pricing.** Models get retired and repriced; a per-token rate that was correct at build time silently rots. Whenever you touch cost-tracking, COGS, or cost-cap code, re-verify every hardcoded rate against the provider's live pricing — never trust the value in the repo, the PRD, or a prior vault. A stale rate mis-records COGS and mis-sets margin guards. (Field report #364: Opus hardcoded at $15/$75 per 1M tokens vs an actual current $5/$25 — a 3× over-statement that inflated recorded costs and set AI-cost caps above subscription revenue, a live margin leak.)
28
29
 
29
30
  ## Output Format
30
31
 
@@ -58,6 +58,7 @@ Structure all findings as:
58
58
  - **CSS percentage heights in flex items:** Percentage heights on flex items resolve to the parent's explicit height, which in a flex layout is often undefined (produces 0px). Use px, vh, or `flex: 1` instead.
59
59
  - **Never generate visual direction from training priors alone:** Before proposing any visual direction, fan out to real, current award sites (Awwwards, FWA, Godly, SiteInspire) and live competitor sites — then cite specific mechanics by name: named sites, exact typefaces, concrete interactions. Direction sourced only from the model's training distribution is converged "AI slop": the committee-of-agents pattern averages toward the statistical mean without external grounding. Ground every recommendation in something a human can go look at right now. (Field reports #347, #3.)
60
60
  - **Build a feel-able prototype of the signature moment before finalizing direction:** Don't sign off on direction from static comps or prose alone — build an interactive prototype of the one signature moment (the hero interaction, the transition, the reveal) so it can actually be felt. Architect via semantic design tokens (`--color-accent`, `--font-display`) so palette and type pivots stay cheap and don't require touching components. Before sign-off, run the de-AI checklist on both copy and visuals: em-dashes in prose, generic adjectives ("seamless", "elevate", "powerful"), gradient-text headings, pill-shaped eyebrow labels, default Inter/Playfair pairing, and the cream-editorial trope. Each hit is a flag to justify or remove. (Field reports #351, #4.)
61
+ - **A green build + green unit tests do NOT catch render-gate regressions:** A removed or renamed prop can silently kill a feature while `build` and the unit suite stay green — e.g. a component still gates its render on `!token`, the `token` prop is removed (now always `null`), and a headline panel becomes invisible to every signed-in user. Unit tests and the compiler never see it; only a real browser does. When a prop or contract changes, visually verify EACH changed component in BOTH signed-in and signed-out states, and re-check any render *gate* that keys off the changed prop. Do not satisfy the "screenshot every page" rule with an e2e that exercises a *different* surface than the one that changed. (Field report #375.)
61
62
 
62
63
  ## Required Context
63
64
 
@@ -40,6 +40,10 @@ Findings tagged by severity, with file and line references:
40
40
  [INFO] file:line — Observation or suggestion
41
41
  ```
42
42
 
43
+ ## Operational Learnings
44
+
45
+ - **External-API versions, endpoints, and auth named in plans/PRDs/vaults are STALE BY DEFAULT — verify against the provider's live docs before building.** Those specifics were written in an earlier session and the provider may have moved on, especially on fast-deprecating platforms (Google/Meta/Stripe ad & billing APIs). Before writing any integration code, web-verify the provider's CURRENT API version, deprecation/sunset notices, and auth requirements against the live docs; treat the plan's API specifics as unconfirmed until checked. (Field report #364: a vault prescribed "fix the daemon v17→v21, use `uploadClickConversions`" — live docs showed the current API was actually v24, the planned upload path was blocked for the project 3 days later, and the correct route was a different API entirely with a different OAuth scope and request shape. Building blind would have wasted the whole integration and missed an external deadline.)
46
+
43
47
  ## Reference
44
48
 
45
49
  - Agent registry: `/docs/NAMING_REGISTRY.md`
@@ -62,13 +62,26 @@ You must:
62
62
 
63
63
  1. **Read all agent definitions:** `ls .claude/agents/*.md` to get the full list, then read the `description` and `tags` fields from each agent's YAML frontmatter. Use Grep to extract these efficiently — don't read each file fully.
64
64
  2. **Assess the codebase:** What kind of project is this? (web app, API, game, CLI, financial, etc.) What domains are relevant? (security, UX, database, deploy, AI, etc.)
65
- 3. **Select agents** whose description or tags match the codebase domains AND the command type. Be aggressiveover-include rather than under-include.
66
- 4. **Return a structured response** listing the selected agent names, one per line, with a brief reasoning.
65
+ 3. **Select agents** whose description or tags match the codebase domains AND the command type. Over-include across domains but cap at DISTINCT LENSES (see next).
66
+ 4. **Cap at distinct lenses dedupe overlapping agents.** Coverage is measured in *perspectives*, not headcount. Before returning, collapse same-domain agents that would re-read the *same artifact through the same lens* down to one representative per lens. Five data agents auditing one importer, or six security agents re-scanning one endpoint, is bloat — it burns sub-agent launches and produces redundant findings the orchestrator must re-dedupe. Keep one agent per genuinely distinct angle (e.g. one schema/correctness lens, one materiality lens, one refutation lens, one security lens, one data lens) rather than three near-duplicates of each. Over-include on *breadth of domain*; never on *depth of redundancy within a domain*. (Field report #378, RC-3.)
67
+ 5. **Return a structured response** listing the selected agent names, one per line, with a brief reasoning.
67
68
 
68
69
  ## Output Format
69
70
 
70
71
  **BASENAME CONSTRAINT — READ BEFORE WRITING THE ROSTER (field report #345, DEAL-001; a #318 recurrence).** Every roster line MUST be the exact `.claude/agents/*.md` basename (filename minus `.md`) — NOT a display-name alias or character-name shorthand. Write `picard-architecture`, never `Picard`; `worf-security-arch`, never `Worf`; `dockson-treasury`, never `Dockson`. The example below already obeys this — do not "humanize" it back to display names. If you don't know the literal basename, run `ls .claude/agents/` and copy it verbatim.
71
72
 
73
+ **WARNING — ASCII-ONLY agent names. Curly/smart apostrophes cause SILENT agent drops (field report #365, #1).** Every `agentType` / roster name MUST use plain ASCII characters only. A curly apostrophe (`’`, U+2019), smart quote, or other Unicode look-alike where an ASCII apostrophe (`'`, U+0027) or hyphen belongs makes the runtime fail to resolve the agent — and it drops the agent SILENTLY ("agent type not found"), so a 3-lens verify quietly runs 2-lens with nobody told. This is invisible unless you inspect `/workflows`. Auto-correct/editor smart-quoting is the usual culprit. When in doubt, copy the basename verbatim from `ls .claude/agents/` rather than retyping it.
74
+
75
+ **Canonical 3-lens verify roster (all ASCII — field report #365, #1).** For the adversarial verify phase, the canonical, apostrophe-free 3-lens roster is:
76
+
77
+ ```
78
+ - spock-schema (correctness lens)
79
+ - faramir-judgment (materiality lens)
80
+ - wonder-woman-truth (refutation lens)
81
+ ```
82
+
83
+ Use these exact basenames. They are deliberately ASCII-clean to avoid the silent-drop trap above (the prior `T'Pol` verify lens carried a curly apostrophe and was dropped, collapsing a 3-lens verify to 2-lens undetected).
84
+
72
85
  ```
73
86
  ROSTER:
74
87
  - picard-architecture (architecture lead — always included)
@@ -79,10 +92,10 @@ ROSTER:
79
92
 
80
93
  REASONING: [One sentence explaining the selection logic]
81
94
  TOTAL: [count]
82
-
83
- DEPLOYMENT REMINDER: You MUST now launch an Agent sub-process for EVERY agent listed above. Do NOT proceed to your own analysis. Do NOT write code, plans, or answers yourself. Launch the agents. They do the work. You orchestrate.
84
95
  ```
85
96
 
97
+ You return a roster and your reasoning — nothing more. You do NOT emit imperative "launch all / do not analyze yourself" directives. Whether and how to dispatch is the orchestrator's decision, governed by the Silver Surfer Gate (which enforces only THAT a roster ran and was recorded — not that every name is launched blindly). Commanding the orchestrator to launch every name overrides its legitimate prune authority and its synthesis step; that is out of your lane. (Field report #378, RC-3.)
98
+
86
99
  ## Operating Rules
87
100
 
88
101
  - **Over-include, never under-include.** A false positive costs one sub-agent launch. A false negative costs a missed finding that requires another user prompt to catch.
@@ -90,6 +103,7 @@ DEPLOYMENT REMINDER: You MUST now launch an Agent sub-process for EVERY agent li
90
103
  - **Never remove the command's lead agents.** You add specialists; leads are non-negotiable.
91
104
  - **Read the agent tags first** — tagged agents have `tags: [...]` in their YAML. These are the most cross-domain relevant. Start there, then scan descriptions of untagged agents.
92
105
  - **Be fast.** You're the first agent called. Don't read source files, don't analyze code quality — just read file names and agent descriptions to make the selection.
106
+ - **`--plan` earns a leaner roster than `--review`.** When the command carries `--plan` (e.g. `/architect --plan`, `/campaign --plan`), the pass emits *docs and plans*, not code — so it does not need the full review bench. Field a lean planning roster: the placement/dependency lens, the in-focus domain lead, and at most one or two cross-domain advisors genuinely implicated by the plan. Do NOT field the full security-and-UX bench for a marketing-page plan or a doc-update plan (three security agents on a planning task is pure overhead). The deep adversarial roster is for reviewing the built artifact, not for shaping the plan that precedes it. (Field report #376, F1-B.)
93
107
  - **Small-codebase scaling.** For very small codebases (<1000 LOC, static sites, methodology-only repos), roster size may exceed useful returns. Continue to over-include, but acknowledge that diminishing returns kick in earlier. A 30-agent roster on a 400-LOC static site is not wrong, but the marginal agent adds less than on a 50-file application. (Field report #303.)
94
108
 
95
109
  ## Operational Learnings
@@ -107,6 +121,7 @@ DEPLOYMENT REMINDER: You MUST now launch an Agent sub-process for EVERY agent li
107
121
  - **Creative/UX rosters need a web-capable scout.** The design agents (Galadriel, Arwen, Eowyn, Glorfindel, Celeborn) carry only Read/Write/Edit/Bash/Grep/Glob — they cannot see the web, so they can't ground a creative or UX roster in current design conventions, competitor patterns, or external references. Any creative/UX roster MUST include at least one web-capable scout (a general-purpose agent equipped with WebSearch/WebFetch); if no such agent is on the roster, flag explicitly that the roster needs external grounding so the orchestrator can add one. (Field report #347, #5.)
108
122
  - **Orchestrator roster-name normalization (handoff note).** Before launching, the orchestrator validates each roster name against `ls .claude/agents/` (basenames minus `.md`). For any name with no exact match, it attempts exactly one correction — strip a known prefix/suffix (e.g. `voidforge-`, or add/remove a `-architecture`/`-security-arch` suffix) and re-check — then DROPS the name if still unmatched rather than blocking the whole dispatch on one bad entry. You make this rarely necessary by emitting exact basenames per the BASENAME CONSTRAINT above. (Field report #345, DEAL-001.)
109
123
  - **coverage_debt — an "unsampled"/"not-checked" flag from a prior review agent is COVERAGE DEBT, not a closed item.** When the orchestrator's context carries a finding from an earlier phase that a file, route, or surface was explicitly NOT sampled / NOT checked (e.g. "only 3 of 11 endpoints reviewed" or "templates dir not examined"), that gap is owed work — carry it explicitly into the next phase's roster reasoning and work-list rather than letting it silently drop. Name the unsampled surface in your reasoning and weight an agent to own it next pass. Coverage debt that nobody is assigned to repay becomes a permanent blind spot. (Field report #355 F2.)
124
+ - **plan_vs_review_sizing — a planning pass wants a leaner roster than a review pass.** Sizing keys off the *output type*, not just the codebase. A `--plan` pass produces documents (architecture notes, mission sequencing, scope) — its job is breadth of consideration, not adversarial depth on a built artifact, so it wants a lean roster (placement + dependency + the in-focus domain, ~6-11) rather than the full bench. A `--review` / build-verify pass produces findings against real code and earns the deeper, multi-lens adversarial roster. Same codebase, different roster, because the deliverable differs. Concretely: `/architect --plan` returning 16 agents and `/campaign --plan` returning 11 (three of them security agents on a marketing-page plan) is over-rostered for a doc-emitting pass. Right-size down for `--plan`. (Field report #376, F1-B.)
110
125
  - **focused_partition — a single named lens caps the roster ~6-8 and PARTITIONS by surface, not by persona.** When the user names exactly ONE review lens (copy-only / contrast-only / perf-only / a single-domain FOCUSED review), do NOT field a stack of near-duplicate personas all reviewing everything — that multiplies redundant findings without adding coverage. Cap the roster at roughly 6-8 and PARTITION the agents by SURFACE/SECTION so each owns a distinct set of files (agent A: marketing pages, agent B: app shell, agent C: settings/account, etc.), all applying the same single lens. This keys on the user naming one lens — distinct from scope_density/scope_bias, which key on codebase size and explicit path scope. (Field report #355 F3.)
111
126
 
112
127
  ## Required Context
@@ -91,15 +91,16 @@ In autonomous/blitz mode: append every AGENT_INVENTED constraint to `needs_opera
91
91
 
92
92
  Evidence: BarrierWatch campaign (field report #304) invented a $20 kill switch and $50/$50 capital split that took ~90 minutes to remove across 39 files. Both propagated into ROADMAP, source modules, config YAML, tests, and an ADR before the operator reviewed the design.
93
93
 
94
- ## Step 4.6 — Schema-vs-ADR Cross-Check (Spock + Worf)
94
+ ## Step 4.6 — Reality-Anchor: Internal + External Claim Cross-Check (Spock + Worf + Uhura)
95
95
 
96
- Before any ADR claiming a property of an existing table or callsite is marked Accepted, validate the claim against code reality. Field reports #312, #313, #316 document a pattern where ADRs say *"every tenant-touching table has `org_id`"* or *"X primitive landed in mission Y"* and downstream missions discover the claim was aspirational. SQL errors at build time, ~1 day mid-mission rescoping per occurrence.
96
+ Before any ADR claiming a fact is marked Accepted, validate the claim against reality — both *internal* reality (the code/schema/files) and *external* reality (the third-party platform's live docs). Field reports #312, #313, #316 document the internal pattern: ADRs say *"every tenant-touching table has `org_id`"* or *"X primitive landed in mission Y"* and downstream missions discover the claim was aspirational. Field report #376 documents the external pattern: a PRD claimed *"Calendar/Contacts are restricted scopes → CASA assessment required,"* which is wrong (they are *sensitive* scopes — no CASA), and the false "fact" propagated into multiple agent outputs as ground truth, steering the project toward a months-long security assessment that does not apply. A wrong external fact is as costly as a wrong internal one — and harder to spot, because nothing in the codebase contradicts it.
97
97
 
98
- For each ADR with a "Implementation Scope" or "Existing State" claim, run:
98
+ For each ADR with an "Implementation Scope," "Existing State," or third-party-platform claim, run the applicable checks:
99
99
 
100
100
  1. **Existing-table claims** → grep schema files for the column/constraint/index. Do NOT trust prose. Spock confirms with `grep -nE "^\s*org_id\s+(INTEGER|UUID|BIGINT)" schema*.sql` per claim.
101
101
  2. **"X already landed in mission Y" claims** → Worf empirically inspects the referenced files. If the claim is about a security primitive (paper-gate, allowlist, RLS policy), verification is mandatory before any downstream mission treats it as scope-reduction.
102
102
  3. **File-path claims** → `[ -f <path> ] && echo present || echo MISSING` for every path the ADR cites as a deliverable. Reject "Fully implemented in vX.Y" framing for paths that don't exist at HEAD.
103
+ 4. **External-platform claims (Uhura)** → Any PRD/ADR claim about a third-party platform's **OAuth scope tiers** (restricted vs sensitive vs basic), **API rate/quota limits**, **auth or verification process** (CASA assessment, app review, brand verification), **pricing/plan gating**, or **token lifecycle** MUST be re-verified against the live provider docs via **WebFetch** before any downstream mission scopes work around it. Cite the doc URL and quote the relevant line in the ADR. A claim like "this scope requires a CASA security assessment" or "this endpoint is rate-limited to N/min" is not ground truth until a live-docs fetch confirms it. If no web tools are available, mark the claim `UNVERIFIED — external` and flag that downstream scoping is provisional. (Field report #376: a CASA-vs-sensitive-scope error in the PRD nearly steered the project into months of unnecessary security-assessment work; the operator caught it, but a WebFetch check would have caught it first.)
103
104
 
104
105
  If verification fails, the ADR's status is `Proposed`, not `Accepted`, until the gap is closed. Do not apply Riker's review to an unverified claim — the reviewer is testing the *decision*, not the *factual ground state*.
105
106
 
@@ -67,6 +67,18 @@ Mandatory runtime verification BEFORE code review begins:
67
67
 
68
68
  **Gate:** All endpoints return expected status codes. No route collisions. No infinite render loops detected. Update assemble-state.
69
69
 
70
+ ## Phase 2.75 — Adversarially Verify the As-Implemented Diff (Maul + Deathstroke + Constantine)
71
+ **Fury:** "Before anyone reviews the codebase, review the *change*. The bugs we just shipped are in the new lines, not the old ones."
72
+
73
+ The Phase 3-5 review roster reviews the whole codebase (and pre-build, it reviewed the OLD code). This phase is different and runs FIRST: an adversarial pass over the AS-IMPLEMENTED diff specifically — the lines Phase 2 just wrote — to catch defects the build introduced that a whole-codebase or pre-build review structurally cannot (a fix that latches a flag, focus jumping, a re-attach-into-false-timeout dead-end). Reviewing intent ≠ reviewing the artifact.
74
+
75
+ 1. Capture the mission diff: `git diff <build-start-ref>..HEAD` (use the ref Phase 2 recorded at start). This is the ONLY surface under review here.
76
+ 2. Use the Agent tool to run in parallel, each prompted to attack the NEW diff: `subagent_type: Maul` (exploits introduced by the change), `subagent_type: Deathstroke` (boundaries the new code crosses), `subagent_type: Constantine` (code in the diff that works by accident).
77
+ 3. Each agent answers: "What did THIS change break that didn't exist before it?" — not "is the codebase good."
78
+ 4. Fix all Must Fix findings on the diff before proceeding to the full review rounds.
79
+
80
+ **Gate:** No Must Fix defects in the as-implemented diff. Update assemble-state. (Field report #376 #3.)
81
+
70
82
  ## Workflow Execution — review phases (ADR-067)
71
83
 
72
84
  The **review-heavy fan-out phases** — Phase 3-5 (engage), 7-8 (sentinel), 12 (crossfire), 13 (council) — run as a **Dynamic Workflow** (`.claude/workflows/assemble-review.workflow.js`) over the mission's working diff, so the 15+-agent fan-out stays out of the lead's context (ADR-067; see `docs/methods/WORKFLOWS.md`). The **build/architecture/devops phases (1-2.5, 9) STAY prose orchestration** — they write code, are sequentially dependent, and need lead judgment + `--interactive` gates between them.
@@ -40,6 +40,7 @@ Run `/gauntlet --assess` — Rounds 1-2 only (Discovery + First Strike). No fix
40
40
  - **Stubs returning success:** Methods that return True/ok without side effects (RC-2 pattern)
41
41
  - **Auth-free defaults:** HTTP endpoints with no authentication middleware (RC-3 pattern)
42
42
  - **Dead code:** Services wired but never called, preferences stored but never read
43
+ - **ADR concurrency claim not honored:** When an ADR (or PRD) claims concurrency, a bounded worker pool, parallel/async fan-out, or batched parallel I/O for a stage, verify the *implementation* actually honors it — a sequential `for`-loop over per-row network calls (LLM/enrichment/API) silently satisfies a "use a worker pool" ADR and passes small-fixture tests, then detonates at production scale. Grep the claimed-concurrent stages for serial iteration over awaited network calls; flag any stage whose code is sequential while its ADR claims parallelism. This is a throughput regression, not a correctness one, so correctness tests will not catch it. See the **Concurrency-claim verification gate** in `/docs/methods/SYSTEMS_ARCHITECT.md`. (Field report #378: an ADR specified "a bounded worker pool for I/O-bound stages"; both the LLM-classify and enrich stages shipped as sequential loops.)
43
44
 
44
45
  **Standing rule — CRITICAL findings are unconditionally routed to adversarial verification (field report #345 DEAL-003):** `/assess` is review-only and produces no fix batches, but its findings still drive a rebuild plan — so a false-negative Critical is just as costly here as in `/gauntlet`. Mirror the GAUNTLET.md principle: confidence is an advisory signal for routing *Medium and below only*. It is NEVER a fast-track that lets a **Critical**-severity finding skip adversarial verification, regardless of how high its confidence score or any advisory flag (`--light`, `--fast`, `--solo`) suggests. Severity dominates confidence: a Critical at confidence 97 is routed to the adversarial refute pass exactly the same as a Critical at confidence 40. Critical-routes-to-verification is a structural property of assessment, not a per-finding flag an agent can toggle off.
45
46
 
@@ -90,6 +90,14 @@ Opus scans `git diff --stat` and matches changed files against the `description`
90
90
  3. **PRD alignment check (Living PRD):** Compare marketing pages against the PRD's marketing section. Verify copy claims (feature counts, pricing tiers, testimonial accuracy), visual treatments, and page structure match. Fix code OR update PRD for deviations.
91
91
  4. **Gate:** Pages render, SEO present, mobile responsive, PRD alignment verified
92
92
 
93
+ ## Phase 8.5 — Adversarially Verify the As-Implemented Diff (Maul + Deathstroke + Constantine)
94
+ This runs BEFORE the whole-codebase review cycle and is distinct from it: a review of the NEW diff specifically — the lines this build wrote — not of the codebase or of pre-build intent. The double-pass review (Phases 9-11) reads the whole project; this phase reads only the change, catching defects the build introduced that a whole-codebase pass structurally misses (a fix that latches a flag, focus jumping backward, a re-attach-into-false-timeout dead-end). Reviewing intent ≠ reviewing the artifact.
95
+ 1. Capture the build diff: `git diff <build-start-ref>..HEAD` — this is the only surface under review here.
96
+ 2. Run in parallel via the Agent tool, each prompted to attack the NEW diff: `subagent_type: Maul`, `subagent_type: Deathstroke`, `subagent_type: Constantine`. Each answers "what did THIS change break that didn't exist before it?" — not "is the codebase good?"
97
+ 3. Fix all Must Fix findings on the diff before entering the review cycle.
98
+ 4. Log to `/logs/phase-08-5-diff-verify.md`.
99
+ 5. **Gate:** No Must Fix defects in the as-implemented diff. (Field report #376 #3.)
100
+
93
101
  ## Phases 9-11 — Double-Pass Review Cycle (Batman + Galadriel + Kenobi)
94
102
 
95
103
  ### Pass 1 — Find (all three run in parallel)
@@ -0,0 +1,56 @@
1
+ # /contextmeter — Context Budget Meter (Ducem Barr)
2
+
3
+ > *Ducem Barr counts every token. This one tells you — and the model — how many are left.*
4
+
5
+ Installs VoidForge's context-usage **status line** (a colored meter, green → yellow → red as the window fills) and a **`UserPromptSubmit` awareness hook** that injects remaining-context awareness into Claude itself as you approach the limit. The status line is for you; the hook is for the model, so it can checkpoint with `/vault` or `/seal` before compaction instead of being surprised by it.
6
+
7
+ **Default-on.** `npx voidforge-build init` already wires both into a new project's `.claude/settings.json` with defaults **warn 80% / crit 92%** (the meter turns yellow at 80%, red at 92%). Run this command only to **re-install**, **retune** the thresholds (`--warn-pct` / `--crit-pct`), check `--status`, or `--uninstall`.
8
+
9
+ **Why not `/statusline`?** Claude Code ships native `/statusline` and `/context` commands that *always shadow* a same-named project command — a `.claude/commands/statusline.md` would never fire. So this is `/contextmeter`. (Tracked in `docs/NATIVE_CAPABILITIES.md`, ADR-066.)
10
+
11
+ ## Context Setup
12
+ 1. Read `scripts/statusline/README.md` for what the two scripts do and the env knobs.
13
+ 2. The scripts ship with the methodology at `scripts/statusline/`. If that directory is missing, the project predates this feature — pull it from `tmcleod3/voidforge:scripts/statusline/` or re-run `npx voidforge-build update`.
14
+
15
+ ## Step 0 — Preflight
16
+ 0. Read `.claude/settings.json` — if the context-awareness hook is already wired (the default after `init`), this run is a re-install/retune, not a first install. Proceed idempotently (Step 2 replaces the existing entry, never duplicates it).
17
+ 1. Confirm `scripts/statusline/voidforge-statusline.sh` and `context-awareness-hook.sh` exist. If not, stop and tell the user how to get them (above).
18
+ 2. Check `jq` is installed (`command -v jq`). If absent, warn: the meter prints an "install jq" notice and the hook no-ops until `jq` is present (`brew install jq` / `apt install jq`). Continue — installation still proceeds.
19
+ 3. Read `.claude/settings.json` if it exists (else it will be created).
20
+
21
+ ## Step 1 — Make scripts executable
22
+ `chmod +x scripts/statusline/*.sh` (invocation is via `bash <script>`, so this is hygiene, not strictly required).
23
+
24
+ ## Step 2 — Merge settings (non-destructively)
25
+ Merge `scripts/statusline/settings-snippet.json` into `.claude/settings.json`:
26
+ 1. **`statusLine`** — if the project already has a `statusLine`, do NOT clobber it silently. Show the existing one and ask whether to replace it (or back it up to `settings.json.bak`). On a fresh/absent statusLine, set it from the snippet.
27
+ 2. **`hooks.UserPromptSubmit`** — APPEND the awareness-hook entry to the existing array; never overwrite other hooks. (The Silver Surfer gate's hook is on `PreToolUse`, so there is no conflict.) If an identical context-awareness-hook entry is already present, skip — don't duplicate.
28
+ 3. If `--warn-pct N` / `--crit-pct N` / `--window N` were passed, bake them into BOTH command strings (the `statusLine` and the hook) so the meter's yellow/red bands and the hook's warn/critical bands stay in lockstep, and persist without a shell export:
29
+ - statusLine: `"command": "VOIDFORGE_CONTEXT_WARN_PCT=80 VOIDFORGE_CONTEXT_CRIT_PCT=92 bash scripts/statusline/voidforge-statusline.sh"`
30
+ - hook: `"command": "VOIDFORGE_CONTEXT_WARN_PCT=80 VOIDFORGE_CONTEXT_CRIT_PCT=92 bash scripts/statusline/context-awareness-hook.sh"`
31
+ With no flags, leave both as plain commands — the scripts already default to warn 80 / crit 92.
32
+ 4. Write `.claude/settings.json` back as valid JSON (preserve all unrelated keys).
33
+
34
+ ## Step 3 — Verify
35
+ Render a sample so the user sees the meter immediately:
36
+ ```
37
+ printf '%s' '{"model":{"display_name":"Opus 4.8"},"context_window":{"used_percentage":72,"context_window_size":200000}}' | bash scripts/statusline/voidforge-statusline.sh
38
+ ```
39
+ Show the output. Note that Claude Code applies a new `statusLine`/hook config on the next render / next prompt (a session restart guarantees it).
40
+
41
+ ## Step 4 — Confirm
42
+ Report: what was installed, the active thresholds (warn/crit), the jq status, and the tuning env vars. Tell the user the hook stays silent until context crosses the warn threshold.
43
+
44
+ ## Arguments
45
+ | Flag | Effect |
46
+ |------|--------|
47
+ | (none) | Install the status line + awareness hook with defaults (warn 80%, crit 92%). |
48
+ | `--warn-pct N` / `--crit-pct N` | Set the hook's warning / critical thresholds (baked into settings.json). |
49
+ | `--window N` | Set the fallback context-window denominator (default 200000; auto-detects 1M otherwise). |
50
+ | `--status` | Report whether the meter + hook are currently wired in `.claude/settings.json`, and the active thresholds. Don't modify anything. |
51
+ | `--uninstall` | Remove the `statusLine` block (if it's ours) and the awareness-hook entry from `.claude/settings.json`. Leave the scripts on disk and all other settings intact. |
52
+ | `--dry-run` | Show the settings.json changes that WOULD be made, without writing. |
53
+
54
+ ## Handoffs
55
+ - The hook recommends `/vault` and `/seal` at the critical threshold — those are the checkpoint commands it points the model toward.
56
+ - Tuning lives in `scripts/statusline/README.md`; the scripts themselves are plain bash + `jq`.
@@ -5,6 +5,16 @@ Bashir examines the patient. Time to diagnose.
5
5
  2. Read `/logs/build-state.md`, `/logs/assemble-state.md`, `/logs/campaign-state.md`
6
6
  3. Read recent git history: `git log --oneline -20`
7
7
 
8
+ ## Gate Bypass — REQUIRED before any sub-agent (#366 F4)
9
+
10
+ `/debrief` is NOT in the Silver Surfer gated-commands list, but the gate's `PreToolUse` hook blocks *every* non-Surfer Agent launch — so Ezri/O'Brien/Nog/Jake below would be blocked. This is a fixed, command-prescribed roster (not a cherry-picked review), so it takes the bypass rather than mustering the Surfer. BEFORE Step 0, run:
11
+
12
+ ```bash
13
+ [ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true
14
+ ```
15
+
16
+ The existence guard is a no-op on projects that predate the gate. **Stale-pointer self-repair (#384 RC-3):** `bypass.sh` now repoints a stale pointer (one left by a `/clear`ed or crashed session) to the live session automatically — read from `CLAUDE_CODE_SESSION_ID` — so the bypass lands correctly on the first try. On older Claude Code builds without that env var the legacy behavior applies: if the first sub-agent launch still blocks, re-run the exact same `bypass.sh --light` line once (the first blocked `check.sh` fire repoints the pointer, so the second write lands correctly). See CLAUDE.md "Silver Surfer Gate" → "Stale session pointer — auto-repaired."
17
+
8
18
  ## Step 0 — Reconstruct the Timeline
9
19
 
10
20
  **Ezri** `subagent_type: Ezri` reads the session's history and reconstructs what happened:
@@ -94,6 +94,11 @@ For every `useEffect` in new/modified components:
94
94
  - Generic fallback messages are only used when the server truly returns no useful error info
95
95
  - The UI form state after error allows retry without losing user input
96
96
 
97
+ **CLIENT-GATE-MIGRATION AUDIT (mandatory when the diff changes an auth mechanism — field report #371).** When a change swaps or alters how authentication is carried — token string → cookie session, header → httpOnly cookie, a new principal/role check, a different session source — the server side is only half the migration. Find EVERY client-side gate that depended on the OLD mechanism and confirm it was migrated to the new one:
98
+ - Grep the client for every place that reads the old credential (e.g., `token`, `localStorage.getItem('token')`, `Authorization` header construction) and every UI gate keyed on it (`if (token)`, `disabled={!token}`, route guards, panel-visibility checks).
99
+ - For each, verify it now reads the NEW source. A server that authenticates via cookie while the SPA still gates UI on a token string that is no longer set means signing in unlocks nothing — the feature is *inert* despite the request layer being correct.
100
+ - This is the wiring/composition gap per-mission review misses and the Victory Gauntlet's UX lens catches late: the server is wired, the client gates are not switched. Verify the gate migration as part of the review, not as a downstream surprise.
101
+
97
102
  ## Step 1.5 — Conflict Detection
98
103
  After parallel analysis completes, scan findings from all agents for conflicts:
99
104
  - **Same code, different verdicts:** Spock says "pattern violation" but Data says "intentional trade-off"
@@ -8,8 +8,12 @@
8
8
  Scope the changes:
9
9
  1. Run `git status` — identify staged, unstaged, and untracked files
10
10
  2. Run `git diff --stat` — get a summary of what changed
11
- 3. If there are unstaged changes, ask the user: "Stage everything, or should I be selective?"
12
- 4. If there are no changes at all, stop: "Nothing to version. Working tree is clean."
11
+ 3. **Unrelated / pre-existing-change detection (field report #384 RC-1 — never `git add -A` blind).** Before staging, separate what this session authored from changes that were already in the working tree or fall outside the session's stated scope. Two mechanical checks:
12
+ - **Dependency manifests get special scrutiny.** If any manifest or lockfile appears in the diff — `package.json`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`, `requirements.txt`, `pyproject.toml`, `Cargo.toml` / `Cargo.lock`, `go.mod` / `go.sum`, `Gemfile` / `Gemfile.lock` — read the actual dependency-level diff (`git diff -- <manifest>`), not just the filename. A dependency added / changed / removed that the session did not deliberately introduce is the exact bug this check exists for: the v23.20.0 `vercel` near-miss was a stray `npm install` that added `vercel` to root `dependencies` plus ~5,900 lockfile lines, and a naive `git add -A` would have shipped it into a methodology release. Flag every dependency change for an explicit include/exclude decision and honor "no new dependencies without justification" (CLAUDE.md Coding Standards).
13
+ - **Scope diff.** Cross-check the full changed-file list against what this session actually touched. Anything you did not author this session — a leftover edit, a scratch/probe file, an untracked artifact — is surfaced for an explicit keep/drop decision.
14
+ Present the split — *session-authored (stage these)* vs *pre-existing or out-of-scope (decide)* — and get the include/exclude decision BEFORE Step 4 staging. **Never `git add -A` / `git add .` a release without this split.**
15
+ 4. If there are unstaged changes, ask the user: "Stage everything, or should I be selective?" — informed by step 3's split.
16
+ 5. If there are no changes at all, stop: "Nothing to version. Working tree is clean."
13
17
 
14
18
  ## Step 1 — Analyze (Vision)
15
19
  Read the actual diffs and classify every change:
@@ -56,10 +60,22 @@ Update all version files:
56
60
  4. Update the version in **every** versioned `package.json`. For a single-package repo that is the root `package.json`. For a workspaces/monorepo it is each non-private workspace package — VoidForge has **two**: `packages/voidforge/package.json` and `packages/methodology/package.json` (the root is `"private": true` with no version). **Also bump any internal dep pin:** when one workspace package depends on a sibling, set the range to `^<new-version>` — for VoidForge, `voidforge-build`'s `voidforge-build-methodology` dependency, per ADR-062. A bump that updates one package but not its sibling or the pin ships an inconsistent release (and the Step 7 publish skips any package whose version doesn't match).
57
61
  5. **Re-sync tracked generated copies of release files.** If a source file changed this release has a tracked copy that is regenerated at publish, re-sync it so the in-repo copy doesn't go stale between releases. VoidForge: `packages/methodology/CLAUDE.md` is the root `CLAUDE.md` with the ADR-058 template block stripped — `sed '/<!-- REMOVE-FOR-NPM-PUBLISH/,/END-REMOVE-FOR-NPM-PUBLISH -->/d' CLAUDE.md > packages/methodology/CLAUDE.md`.
58
62
 
63
+ ## Step 3.5 — Removal Sweep (Rogers)
64
+ If Step 1 classified any change as **Removed** — a deleted symbol, export, prop, env var, command, or any named artifact — sweep its name out of comments and user-facing copy before the commit lands. A green build and green tests confirm the code compiles without it; they say nothing about a comment that still describes it or a doc that still tells users to set it.
65
+
66
+ For each removed name, grep the **whole tree** — code AND comments AND user-facing copy (READMEs, docs, CLAUDE.md, command files, UI strings, help text) — not just source:
67
+
68
+ ```bash
69
+ # NAME = the deleted symbol/export/prop/env-var/command
70
+ git grep -nI -- "$NAME" -- ':!CHANGELOG.md' ':!PROJECT_VERSION.md' ':!VERSION.md'
71
+ ```
72
+
73
+ Every hit that is not the intentional "Removed" changelog line is either updated to the new reality or itself removed before committing. Field report #375: retiring `MONITOR_TOKEN` left ~8 stale comment sites plus user-facing copy ("set a monitor token") that a green build + 97 unit tests never caught — because none of the stale references were code.
74
+
59
75
  ## Step 4 — Commit (Rogers)
60
76
  Stage and commit:
61
77
  1. Stage all modified version files: `VERSION.md`, the active changelog (`CHANGELOG.md` or `PROJECT_VERSION.md`), **every** bumped `package.json` (all workspace packages, not just the root), and any generated copy re-synced in Step 3
62
- 2. Stage any other files that are part of this release (from Step 0)
78
+ 2. Stage any other files that are part of this release — explicitly, from Step 0's *session-authored* split, including any prose fixed by the Step 3.5 removal sweep. Stage by path; do **not** `git add -A` / `git add .` (that re-admits the pre-existing/out-of-scope changes Step 0 just excluded — field report #384 RC-1)
63
79
  3. Craft commit message in the format: `vX.Y.Z: One-line summary`
64
80
  - If elaboration needed, add a blank line then details
65
81
  - Match the style of existing commits (check `git log --oneline -10`)
@@ -98,7 +98,7 @@ If `$ARGUMENTS` contains `--regen "name"`, regenerate just that asset (overwrite
98
98
 
99
99
  ## Step 5.5 — Optimize for Web (Gimli)
100
100
 
101
- DALL-E outputs 1024x1024 PNGs regardless of display size. A 40px avatar served from a 1024px source wastes 99% of bandwidth. For every generated image:
101
+ gpt-image-1 outputs 1024x1024 PNGs regardless of display size. A 40px avatar served from a 1024px source wastes 99% of bandwidth. For every generated image:
102
102
 
103
103
  1. **Determine display dimensions** — check the asset manifest for intended usage (avatar, hero, card, portrait). If the PRD or component specifies dimensions, use 2x those (retina). Default sizes by category: avatars → 200px, portraits → 400px, cards → 600px, hero → 1200px, OG images → 1200x630.
104
104
  2. **Resize** — use `sharp` (already a project dependency) to resize to 2x display dimensions. Never serve 1024px for a 40px slot.
@@ -0,0 +1,81 @@
1
+ # /seal — Session Closeout Ritual
2
+
3
+ > *Seal the session: ship the work, file the field report, preserve the intelligence, hand off the baton.*
4
+
5
+ `/seal` is a thin orchestration command. It runs no new persona of its own — it conducts three existing agents in sequence and always ends by printing the copy-paste prompt that boots the next session:
6
+
7
+ ```
8
+ /git (commit) → /git (push) → /debrief --submit → /vault --seal → HANDOFF PROMPT
9
+ Coulson Coulson Bashir Seldon (always printed)
10
+ ```
11
+
12
+ The ordering is deliberate: commit and push first so the field report and vault describe the *final* committed state; run `/debrief` before `/vault` because `/vault` Step 1.5 folds the debrief's approved learnings into the briefing; emit the handoff prompt last so it is the final thing on screen.
13
+
14
+ ## Context Setup
15
+ 1. Read this file fully before acting — `/seal` is a pipeline with short-circuit rules; running stages out of order ships a misleading release.
16
+ 2. The three stages have their own method docs — load on demand: `RELEASE_MANAGER.md` (`/git`), `FIELD_MEDIC.md` (`/debrief`), `TIME_VAULT.md` (`/vault`).
17
+
18
+ ## Step 0 — Preflight (decide which stages apply)
19
+ 1. `git status` + `git diff --stat` — is there anything to commit?
20
+ - **Working tree clean:** there is no release to ship. Skip Stages 1–2 (commit + push), tell the user "Nothing to ship — sealing without a release," and proceed to Stage 3 (debrief) + Stage 4 (vault). A clean tree is not an error.
21
+ 2. **Unrelated / pre-existing-change detection (field report #384 RC-1).** Run the same split `/git` Step 0 does — separate session-authored changes from changes that were already in the tree or fall outside this session's scope, giving **dependency manifests / lockfiles** (`package.json`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`, `requirements.txt`, `Cargo.lock`, `go.sum`, …) special scrutiny via their dependency-level diff. This is exactly the vigilance that caught the v23.20.0 `vercel` near-miss by hand; doing it at Preflight surfaces it as part of the plan disclosure (step 3) instead of relying on the operator noticing mid-commit. Surface any pre-existing/out-of-scope change for an explicit include/exclude decision; **never let the downstream `/git` stage `git add -A` a release without this split.**
22
+ 3. `git rev-parse --abbrev-ref HEAD` — confirm the branch. If on the default branch and a release is being cut, that is fine for this repo's flow; just surface it.
23
+ 4. Echo the plan the user is about to authorize: the stages that will run, whether a push will happen, whether a GitHub field report will be filed, **and any pre-existing/out-of-scope changes detected in step 2 (with the include/exclude call)**. This single up-front disclosure is the contract — do not re-ask before each stage (the operator already authorized the whole ritual by invoking `/seal`).
24
+ 5. **Arm the gate bypass for Stage 3.** `/debrief --submit` deploys sub-agents (Ezri / O'Brien / Nog / Jake), and the Silver Surfer PreToolUse hook gates *every* Agent launch — `/debrief` is an analysis command, not a Surfer review roster, so it takes the documented bypass (field report #366-F4). Run, existence-guarded:
25
+ `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true`
26
+ **Stale-pointer self-repair (#384 RC-3):** `bypass.sh` now reads the live session id from `CLAUDE_CODE_SESSION_ID` and, when the repo pointer is stale (left by a `/clear`ed or crashed session), repoints it to the live session automatically — the bypass lands correctly on the first try. On older Claude Code builds without that env var the legacy behavior applies: if Stage 3's first Agent call is still blocked despite the bypass, re-run the same `bypass.sh --light` line once (the blocked `check.sh` fire repoints the pointer), then retry. Do not fight the gate beyond one re-run; if it still blocks, fall back to `/debrief --solo` for this stage.
27
+
28
+ ## Step 1 — Ship (Coulson · `/git`)
29
+ Run the full `/git` release flow (version bump → changelog → commit → tag → verify). Pass through `--major` / `--minor` / `--patch` / `--no-tag` if supplied.
30
+
31
+ **Gate (do not skip):** `/git` Step 5 runs the test suite. **If the suite fails, HALT the pipeline here.** Do not push, do not file a field report (you would be reporting success on a broken build — field report #363). Jump straight to Stage 4 and seal a vault that records the failing state, so the next session resumes by fixing it. Report the failure plainly.
32
+
33
+ By default `/seal` pauses **once** — at `/git`'s version-bump + commit-message confirmation — because a commit (and the tag it arms) is consequential. `--yes` removes that pause and accepts `/git`'s recommended bump.
34
+
35
+ ## Step 2 — Push (Coulson · `/git` Step 6)
36
+ Push the branch and the version tag. `/git` Step 6 is opt-in by design; invoking `/seal` *is* the opt-in. Skip this stage if `--no-push` was passed (a local-only seal). A failed push (no upstream, rejected non-fast-forward) HALTS before Stage 3 — surface it; a field report and vault that claim a pushed release would be wrong.
37
+
38
+ ## Step 3 — Debrief (Bashir · `/debrief --submit`)
39
+ Run `/debrief --submit`: reconstruct the session timeline, root-cause any failures, and file the field report upstream to `tmcleod3/voidforge`. `--submit` presents the full report before it goes out (the review obligation) and then auto-proceeds — do not re-prompt.
40
+
41
+ Degrade gracefully, never fail the seal:
42
+ - **No `gh` auth / GitHub unreachable:** take `/debrief`'s `[save]` path — write the report to `/logs/debrief-YYYY-MM-DD.md` — and continue. Note that it was saved locally, not filed.
43
+ - **`--no-submit`:** run the debrief but stop at the local save; do not file upstream.
44
+ - **`--no-debrief`:** skip this stage entirely.
45
+
46
+ ## Step 4 — Vault (Seldon · `/vault --seal`)
47
+ Run `/vault --seal` to write `/logs/vault-YYYY-MM-DD.md` and generate the pickup prompt. `--seal` auto-confirms the write (no review pause) — appropriate here because the operator already authorized the closeout. `/vault` Step 1.5 will pick up any learnings the debrief just approved instead of re-extracting them.
48
+
49
+ This stage always runs (even on a Stage-1 test failure or a clean tree) — the vault is the handoff artifact, and a session that ended blocked is exactly the one whose next pickup needs the most context.
50
+
51
+ ## Step 5 — Handoff (always)
52
+ Print `/vault`'s **Artifact 2: Pickup Prompt** as the final output, in a fenced block, prominently, so the user can copy it verbatim into the next session. This is `/seal`'s signature deliverable and must appear even if an earlier stage halted — in that case the prompt's "Resume from:" line names the thing that blocked (e.g. "fix the 3 failing tests in X before re-sealing").
53
+
54
+ Then print a one-line ledger of what actually happened, so the outcome is unambiguous:
55
+ ```
56
+ Sealed: vX.Y.Z committed ✓ pushed ✓ field report #NN filed ✓ vault sealed ✓
57
+ ```
58
+ Mark any stage that was skipped or halted (`— skipped (clean tree)`, `— HALTED (tests failing)`, `— saved locally (no gh auth)`) rather than implying success.
59
+
60
+ ## Arguments
61
+ | Flag | Effect |
62
+ |------|--------|
63
+ | (none) | Run all stages; pause once at the `/git` commit confirmation; file the field report upstream. |
64
+ | `--dry-run` | Show what each stage *would* do — proposed version bump, changelog entry, commit message, whether it would push, whether a field report would file, and the vault outline — without executing any of it. |
65
+ | `--yes` | Full autonomy: accept `/git`'s recommended bump + commit with no pause. (Outward-facing actions — push, upstream field report — still happen; use `--no-push` / `--no-submit` to suppress them.) |
66
+ | `--no-push` | Commit locally but do not push (Stages 1, 3, 4 only). |
67
+ | `--no-submit` | Run the debrief but save it locally; do not file the GitHub field report. |
68
+ | `--no-debrief` | Skip Stage 3 entirely. |
69
+ | `--major` / `--minor` / `--patch` / `--no-tag` | Passed through to `/git`. |
70
+
71
+ ## Short-circuit summary
72
+ | Condition | Behavior |
73
+ |-----------|----------|
74
+ | Clean working tree | Skip commit + push; still debrief + vault. |
75
+ | `/git` test suite fails | HALT pipeline; skip push + submit; seal a vault recording the failure; print a "resume by fixing tests" handoff. |
76
+ | Push fails | HALT before debrief; surface the push error. |
77
+ | No `gh` auth | Save the debrief locally; continue to vault. |
78
+
79
+ ## Handoffs
80
+ - `--npm` publishing is **not** part of `/seal` — it is irreversible and broadcasts to every consumer. If this release should publish, run `/git --npm` deliberately, separately.
81
+ - If the debrief surfaces a MAJOR breaking change → recommend `/architect` before the next session resumes.
@@ -31,6 +31,8 @@ Document in phase log: "How to run", key routes, where components/styles/copy li
31
31
 
32
32
  **Screenshot mandate (MANDATORY):** If the app is runnable, start the server, take screenshots of EVERY page via Playwright or browser, and READ them via the Read tool. Without screenshots, the review is code-reading — not visual verification. Take at desktop (1440x900), plus 375px and 768px for responsive proof-of-life.
33
33
 
34
+ **Render-gate regression coverage (MANDATORY) (field report #375).** A green build and a green unit suite do NOT catch render-gate regressions — a removed or renamed prop can silently kill a feature (a component still gating its render on a prop that is now always `null`) while every automated gate stays green. So when this review covers a change that touched a prop or a shared contract, the browser/e2e pass must cover **EVERY surface that consumes the changed prop/contract — not a sampled page** — and must explicitly **re-check the render *gates* that key off the changed prop** (does the panel that gated on it still render?). Verify each changed component in BOTH signed-in and signed-out states. Do not satisfy this mandate with an e2e that exercises a *different* surface than the one that changed.
35
+
34
36
  ## Step 0.5 — World-Scan / Reference Grounding (MANDATORY) (field report #347 #1)
35
37
  Before any creative direction is finalized, web-capable agents fan out and ground the review in the current state of the craft. This is a **required input to every downstream generation agent** — visual, design-system, and enhancement work in Steps 2, 5, and 6 must cite the dossier produced here.
36
38
 
@@ -44,6 +46,17 @@ Before any creative direction is finalized, web-capable agents fan out and groun
44
46
 
45
47
  If no web tools are available, log the gap explicitly in the phase log and proceed with PRD-derived references only — but flag that reference grounding is degraded.
46
48
 
49
+ ### Originality Gate — JUSTIFY-OR-REJECT the homogenized defaults (MANDATORY) (field report #376 #1)
50
+ The world-scan above grounds the work; this gate forces it to *bite*. Before any visual direction is emitted, run the proposed direction through an explicit anti-reference list and, for EACH item, record one of: **REJECTED** (not in this direction) or **JUSTIFIED** (deliberately kept, with the reason tied to a concrete artifact in the Step 0.5 reference dossier). "It looked fine" is not a justification — only a named dossier reference is. The homogenized defaults to check, by name:
51
+
52
+ - **blue-600 (or the equivalent default-primary) hero** — the reflexive Tailwind/SaaS accent.
53
+ - **purple→cyan (violet→teal) gradient headings** — the `bg-clip-text` rainbow headline.
54
+ - **the shadcn default hero** — centered headline + sub + two buttons + faint grid/radial, untouched.
55
+ - **floating orbs / particles / aurora blobs** — decorative background motion with no meaning.
56
+ - **the default Inter / Playfair (modern-sans + elegant-serif) pairing** — reached for by reflex.
57
+
58
+ A direction passes only when every item above is explicitly REJECTED or JUSTIFIED against the dossier. The default posture is **distinctive and ownable, not "current SaaS standard."** If three or more items are JUSTIFIED rather than rejected, treat that as a smell that the direction has converged on the mean and send it back to Step 0.5. Log the gate result (the five verdicts + reasons) to the phase log.
59
+
47
60
  ## Step 1 — Product Surface Map
48
61
  List every screen/route, primary user journeys, key shared components, and the state taxonomy (loading/empty/error/success/partial/unauthorized). Write to phase log.
49
62
 
@@ -81,7 +81,19 @@ const VERDICT = {
81
81
  },
82
82
  }
83
83
 
84
- const key = (f) => `${(f.file || '').toLowerCase()}::${(f.title || '').toLowerCase().slice(0, 60)}`
84
+ // Normalize the file path before keying so the SAME finding reported by two agents —
85
+ // one with an absolute `/Users/.../repo/src/x.ts:42`, one with a relative `src/x.ts:42` —
86
+ // dedupes instead of surviving as two "distinct" claims (#366 F6). Strip the repo-root
87
+ // prefix (the launch cwd, passed via args; fall back to env) and any leading `./`.
88
+ // Workflows forbid argless `new Date()`/`Math.random()` but env reads are fine.
89
+ const REPO_ROOT = (input.repoRoot || (typeof process !== 'undefined' && process.cwd && process.cwd()) || '')
90
+ .replace(/\/+$/, '')
91
+ const normPath = (p) => {
92
+ let s = (p || '').trim()
93
+ if (REPO_ROOT && s.startsWith(REPO_ROOT + '/')) s = s.slice(REPO_ROOT.length + 1)
94
+ return s.replace(/^\.\/+/, '').toLowerCase()
95
+ }
96
+ const key = (f) => `${normPath(f.file)}::${(f.title || '').toLowerCase().slice(0, 60)}`
85
97
 
86
98
  // ── Round 1: Discovery + Round 2/3: Strike ────────────────────────────────────
87
99
  const dom = (a) => a.domain || a.key || 'their domain' // avoid literal "undefined" in prompts
package/dist/CHANGELOG.md CHANGED
@@ -6,6 +6,69 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
6
6
 
7
7
  ---
8
8
 
9
+ ## [23.21.0] - 2026-06-24
10
+
11
+ ### Triaged field reports #382 / #383 / #384 → `/seal`-hardening + DevOps/QA/orchestration fixes + a pattern-distribution gap
12
+
13
+ **From #384 (the v23.20.0 `/seal` session's own debrief):**
14
+
15
+ - **Added** — Release Step 0 *unrelated / pre-existing-change detection* (`/git`, `/seal`, `RELEASE_MANAGER.md`): before staging, the working tree is split into session-authored vs pre-existing/out-of-scope changes, with dependency manifests / lockfiles getting dependency-level scrutiny — the exact vigilance that caught the v23.20.0 `vercel` near-miss, now mechanical. Never `git add -A` a release.
16
+ - **Added** — *Creation-time native-collision gate* (`BUILD_PROTOCOL.md`, `NATIVE_CAPABILITIES.md`): a new command's name is checked against the native command/skill set *before* the file is written, and its `NATIVE_CAPABILITIES` row is added at creation — the check shifts left from release-time re-audit to command-creation (the `/statusline`→`/contextmeter` rework cause).
17
+ - **Fixed** — `scripts/surfer-gate/bypass.sh` *stale-pointer self-repair*: reads the live session id from `CLAUDE_CODE_SESSION_ID` and repoints a pointer left by a `/clear`ed/crashed session, so a single `bypass.sh --light` lands correctly on the first try — no operator re-run. Older CLIs without the env var keep the documented re-run fallback. (+4 regression tests; gate suite 27→31.)
18
+
19
+ **From #382 (QA-isolation prod outage + sandbox / spend / coverage):**
20
+
21
+ - **Added** — `DEVOPS_ENGINEER.md`: locking a shared parent dir must enumerate every traversing service account and grant each an explicit traverse ACL, then `curl` the prod FE and assert 200 (a `/home/ubuntu` `0750` lock 500'd nginx). Plus a headless-OAuth-bootstrap note (SSH `-L` port-forward or paste-the-code fallback).
22
+ - **Added** — `docs/patterns/egress-sandbox.sh`: run a `systemd-run` egress-confined workload under `--uid`/`--gid` so artifacts stay user-owned, not root-owned — `IPAddress*` filtering is a cgroup property and uid-independent.
23
+ - **Added** — `SUB_AGENTS.md`: a global spend ceiling must reserve max-possible in-flight child budget before launching the next child (an $80 cap spent $83.72), or document the overshoot bound.
24
+ - **Added** — `QA_ENGINEER.md`: coverage honesty — count a case covered only at the fidelity actually exercised; record proof in a per-lane ledger; never reclassify a coverage SSOT on partial evidence.
25
+
26
+ **Distribution:**
27
+
28
+ - **Fixed** — `prepack.sh` / `copy-assets.sh` now ship **every** `docs/patterns/` file regardless of extension. Globbing by `.ts`/`.tsx`/`.md` had silently dropped the `.sh`/`.py`/`.conf` patterns (`post-deploy-probe.sh`, `nginx-vhost.conf`, `rls-test-fixture.py`, `structural-sql-sentinel.py`) from the published package — an LRN-11 gap, now a future-proof whole-dir copy.
29
+
30
+ **#383** (`/contextmeter`) shipped in v23.20.0 — closed as implemented; its creation-time-collision proposal is covered by #384 RC-2.
31
+
32
+ Build clean, suite 1420 (gate 27→31). Dep `^23.20.0` → `^23.21.0`.
33
+
34
+ ## [23.20.0] - 2026-06-23
35
+
36
+ ### Triaged 12 upstream field reports → methodology hardening, + `/seal` and `/contextmeter`
37
+
38
+ Ran `/debrief --inbox` over the 12 open field reports (#364–#378), applied every accepted fix across ~41 method docs / agents / patterns / commands (one applier per file, diff-coverage verified), implemented the two wizard-code reports, and shipped two new commands. All 12 issues closed. Build clean, full suite 1392 → 1420.
39
+
40
+ The recurring theme across the reports: **green static gates (build, unit tests, denylists, ADR claims, status codes, declared coverage) keep passing while the real runtime / scale / auth / external path is never exercised.**
41
+
42
+ ### Added
43
+
44
+ - **`/seal`** — session-closeout ritual: `/git` commit → push → `/debrief --submit` → `/vault --seal`, then always prints the next-session handoff prompt. Fail-safe pipeline (a test failure halts before push but still seals a vault recording the blocked state). Thin orchestrator — no new persona.
45
+ - **`/contextmeter`** (Ducem Barr) — context-budget meter. A status line rendering a colored context-usage meter (green → yellow → red; yellow 80%, red 92%) plus a `UserPromptSubmit` hook that injects remaining-context awareness into Claude itself once usage crosses the threshold, so it can `/vault`/`/seal` before compaction. **Default-on:** `init` auto-wires it (new `mergeStatuslineSettings`, mirrors the surfer-gate hook merge; sets `statusLine` only if absent, appends the hook idempotently). Named to avoid the native `/statusline`/`/context` collision (logged in `NATIVE_CAPABILITIES.md`). `scripts/statusline/` wired through all four distribution paths + the npm `files` allowlist (LRN-11).
46
+ - **`docs/patterns/post-deploy-probe.sh`** — deploy probe asserting response content + Content-Type, not HTTP status only (defeats the SPA catch-all false-positive). [#371]
47
+ - **`docs/patterns/exclusion-set-invariant.md`** — superset invariant across `.gitignore` / rsync / secret-scanner exclusion sets. [#377]
48
+
49
+ ### Changed — methodology hardening (#364–#378)
50
+
51
+ - **QA / Testing** [#378/#373/#365] — throughput/scale gate for per-row network stages; partial/edge-state smoke; drift-guard shared check-fn + CI-wiring proof.
52
+ - **Architecture** [#378/#373/#376] — ADR concurrency-claim verification gate; ADRs record provider-doc-verified token lifecycle; `architect` Step 4.6 extended to verify EXTERNAL platform claims vs live docs.
53
+ - **Security** [#378/#377] — PII export-format `.gitignore`; denylist = tripwire vs authoritative-boundary (prove reachability before escalating to CRITICAL).
54
+ - **DevOps** [#377/#365/#364] — production-path tracer for arming gates; live contrastive systemd/sandbox smoke; promote gate verifies deployed-commit == branch HEAD.
55
+ - **Gauntlet / Campaign** [#377/#373/#365/#371] — live runtime assertion on fix-batch acceptance; dark-flag activation gated on review (not deploy); declared-vs-implemented reconciliation lens; FR-A5 HTTP two-principal isolation + planted-uid red-check.
56
+ - **Frontend / UX** [#376/#375] — anti-generic originality gate (justify-or-reject named defaults); render-gate coverage (every changed-prop surface, both auth states).
57
+ - **Build / Release** [#376/#366/#364/#375] — "adversarially verify the as-implemented diff" step; distribution-validator rule; pre-integration web-verification; removal sweep.
58
+ - **AI safety** [#378/#364] — deny-list discipline (negation-adjacency / proper-noun allowlist / NFKC / independent eval); LLM per-token cost-constant staleness.
59
+ - **Sub-agents / Workflows** [#378/#377/#371/#366] — orchestrator owns dedup + dispatch; verify the empirical premise of a severity rating; auth-mechanism → client-gate-migration audit; mktemp scratch rule; WORKFLOWS recovery subsection.
60
+ - **Forge artist currency** [#367] — removed the retired DALL-E 3 HD provider row; gpt-image-1 throughout.
61
+ - **Wizard `update`** [#368/#369] — non-destructive CLAUDE.md merge (new `claude-md-strategy.ts`: preserve + side-file / sentinel-fence merge / skip; marker `claudeMd` field; `--help` guard; section-loss warning), replacing the destructive "preserve first 10 lines, overwrite the rest" clobber; legacy-consumer marker detection (offer to create the marker instead of erroring to `init`).
62
+
63
+ ### Fixed
64
+
65
+ - **Silver Surfer gate `/debrief` gap** [#366-F4] — `/debrief` and other fixed-roster non-review commands take a `--light` / `--solo` bypass before launching sub-agents (the hook blocks *every* non-Surfer Agent launch regardless of the gated-commands list). Documented in `CLAUDE.md` + `debrief.md` + `FIELD_MEDIC.md`, including the live-observed `bypass.sh` stale-pointer bug + re-run workaround.
66
+ - **`gauntlet.workflow.js` dedup** [#366-F6] — strip the repo-root prefix from a finding's path before keying, so the same finding raised by two agents dedupes.
67
+
68
+ Dep `^23.19.0` → `^23.20.0`.
69
+
70
+ ---
71
+
9
72
  ## [23.19.0] - 2026-06-13
10
73
 
11
74
  ### Gauntlet acceptance test → 14 fixes