thevoidforge-methodology 23.5.2 → 23.5.4

@@ -30,7 +30,7 @@ Before agent deployment, run the Herald to select the optimal roster:
  **`--light`** skips the Herald entirely — uses only the command's hardcoded core roster.
  **`--solo`** skips both Herald and all sub-agents — lead agent only.
 
- ## Phase 0 — AI Surface Map (`subagent_type: seldon-ai`)
+ ## Phase 0 — AI Surface Map (`subagent_type: Seldon`)
 
  Reconnaissance — find all AI integration points:
  1. Grep for LLM SDK imports (`anthropic`, `openai`, `@ai-sdk`, `langchain`)
@@ -43,22 +43,22 @@ Reconnaissance — find all AI integration points:
 
  Use the Agent tool to run all four in parallel:
 
- - **Agent 1** `subagent_type: salvor-model-selection` — Model selection: right model per call? Smaller/faster alternative? Latency budget met? Cost tracked?
- - **Agent 2** `subagent_type: gaal-prompt-arch` — Prompt architecture: structured, versioned, testable? System prompt separated? Output format specified? Edge cases? Few-shot?
- - **Agent 3** `subagent_type: hober-tool-schema` — Tool schemas: clear descriptions? Correct parameter types? Required vs optional? No overlapping tools? Return types documented?
- - **Agent 4** `subagent_type: bliss-ai-safety` — AI safety: prompt injection risk? PII in prompts? Output content safety? System prompt extractable? Jailbreak vectors?
+ - **Agent 1** `subagent_type: Salvor Hardin` — Model selection: right model per call? Smaller/faster alternative? Latency budget met? Cost tracked?
+ - **Agent 2** `subagent_type: Gaal Dornick` — Prompt architecture: structured, versioned, testable? System prompt separated? Output format specified? Edge cases? Few-shot?
+ - **Agent 3** `subagent_type: Hober Mallow` — Tool schemas: clear descriptions? Correct parameter types? Required vs optional? No overlapping tools? Return types documented?
+ - **Agent 4** `subagent_type: Bliss` — AI safety: prompt injection risk? PII in prompts? Output content safety? System prompt extractable? Jailbreak vectors?
 
  ## Phase 2 — Sequential Audits (7 agents)
 
  Run sequentially — each builds on the previous:
 
- - **Bel Riose** `subagent_type: bel-riose-orchestration` — Orchestration: completion/chain/agent loop/workflow? Reliability appropriate? Loops bounded? State persisted?
- - **The Mule** `subagent_type: mule-adversarial-ai` — Failure modes: hallucination, refusal, timeout, context overflow, API down. Fallback? Circuit breaker? Bounded retries?
- - **Ducem Barr** `subagent_type: ducem-token-economics` — Token economics: usage tracked? Caching? Context window efficient? System prompts deduplicated? Streaming?
- - **Bayta Darell** `subagent_type: bayta-evals` — Evaluation: golden datasets? Automated scoring? Regression suite for prompt changes? Quality degradation detection?
- - **Dors Venabili** `subagent_type: dors-observability` — Observability: trace logging? Inputs/outputs logged (PII-scrubbed)? Latency tracked? Quality scores?
- - **Janov Pelorat** `subagent_type: janov-context-eng` — Context engineering: RAG retrieval relevance? Embedding dimensionality? Chunking strategy?
- - **R. Daneel Olivaw** `subagent_type: daneel-model-migration` — Versioning: behavior change on model updates? Prompts pinned? Migration strategy?
+ - **Bel Riose** `subagent_type: Bel Riose` — Orchestration: completion/chain/agent loop/workflow? Reliability appropriate? Loops bounded? State persisted?
+ - **The Mule** `subagent_type: The Mule` — Failure modes: hallucination, refusal, timeout, context overflow, API down. Fallback? Circuit breaker? Bounded retries?
+ - **Ducem Barr** `subagent_type: Ducem Barr` — Token economics: usage tracked? Caching? Context window efficient? System prompts deduplicated? Streaming?
+ - **Bayta Darell** `subagent_type: Bayta Darell` — Evaluation: golden datasets? Automated scoring? Regression suite for prompt changes? Quality degradation detection?
+ - **Dors Venabili** `subagent_type: Dors Venabili` — Observability: trace logging? Inputs/outputs logged (PII-scrubbed)? Latency tracked? Quality scores?
+ - **Janov Pelorat** `subagent_type: Janov Pelorat` — Context engineering: RAG retrieval relevance? Embedding dimensionality? Chunking strategy?
+ - **R. Daneel Olivaw** `subagent_type: R. Daneel Olivaw` — Versioning: behavior change on model updates? Prompts pinned? Migration strategy?
 
  ## Phase 3 — Remediate
 
@@ -66,7 +66,7 @@ Fix all Critical and High findings. Use the standard finding format with confide
 
  ## Phase 4 — Re-Verify
 
- **The Mule** `subagent_type: mule-adversarial-ai` + **Wanda Seldon** `subagent_type: wanda-seldon-validation` re-probe all remediated areas. Wanda validates structured outputs. The Mule attempts adversarial bypass of fixes.
+ **The Mule** `subagent_type: The Mule` + **Wanda Seldon** `subagent_type: Wanda Seldon` re-probe all remediated areas. Wanda validates structured outputs. The Mule attempts adversarial bypass of fixes.
 
  ## Arguments
  - `--focus "topic"` → Bias Herald toward topic (natural-language, additive)
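
Every changed line in the hunks above swaps a slug-style `subagent_type` value for the agent's display name. A minimal sketch of that rewrite follows, assuming a hand-built lookup table: `RENAMES` and `migrate` are hypothetical names for illustration, not part of the package, and the table holds only four of the pairs visible in this diff.

```python
import re

# Hypothetical subset of the old-slug -> new-name pairs visible in this diff.
RENAMES = {
    "seldon-ai": "Seldon",
    "salvor-model-selection": "Salvor Hardin",
    "mule-adversarial-ai": "The Mule",
    "wanda-seldon-validation": "Wanda Seldon",
}

# Matches a backtick span of the form `subagent_type: <value>`.
PATTERN = re.compile(r"(`subagent_type: )([^`]+)(`)")


def migrate(text: str) -> str:
    """Rewrite known old slugs to new names; leave unknown values untouched."""

    def swap(match: re.Match) -> str:
        value = match.group(2)
        return match.group(1) + RENAMES.get(value, value) + match.group(3)

    return PATTERN.sub(swap, text)


if __name__ == "__main__":
    print(migrate("**The Mule** `subagent_type: mule-adversarial-ai` re-probes."))
```

Because the pattern only targets `subagent_type:` spans, bare slugs that appear elsewhere in a file are left alone, and running the function twice gives the same result as running it once.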
@@ -8,7 +8,7 @@ Opus scans `git diff --stat` and matches changed files against the `description`
 
  **Dispatch control:** `--light` skips dynamic dispatch (core only). `--solo` runs lead agent only.
 
- **Promoted agent:** **Riker** `subagent_type: riker-review` runs on every ADR written — challenges trade-offs.
+ **Promoted agent:** **Riker** `subagent_type: Riker` runs on every ADR written — challenges trade-offs.
 
  ## Herald Pre-Scan (ADR-047)
 
@@ -35,12 +35,12 @@ Before any deep analysis, scan the PRD frontmatter for structural contradictions
 
  ## Agent Deployment Manifest
 
- **Lead:** `subagent_type: picard-architecture`
+ **Lead:** `subagent_type: Picard`
  **Full bridge crew:** `spock-schema`, `uhura-integration`, `worf-security-arch`, `tuvok-deep-current`, `scotty-infrastructure`, `kim-api-design`, `janeway-novel-arch`, `torres-site-scanner`, `la-forge-reliability`, `data-tech-debt`, `crusher-diagnostics`, `archer-greenfield`, `pike-bold-decisions`, `riker-review`, `troi-prd-compliance`
 
  ## Step 0 — System Discovery
- - **Crusher** `subagent_type: crusher-diagnostics` — System health baseline: test coverage, build time, dependency age, code complexity.
- - **Archer** `subagent_type: archer-greenfield` — (greenfield only) Initial directory structure, module boundaries, naming conventions.
+ - **Crusher** `subagent_type: Crusher` — System health baseline: test coverage, build time, dependency age, code complexity.
+ - **Archer** `subagent_type: Archer` — (greenfield only) Initial directory structure, module boundaries, naming conventions.
 
  Produce: system identity, component inventory, data flow diagram (ASCII), dependency graph.
  Write to `/logs/` (phase-00 if during orient, or a dedicated architecture log).
@@ -48,31 +48,31 @@ Write to `/logs/` (phase-00 if during orient, or a dedicated architecture log).
  ## Step 1 — Parallel Analysis
  Use the Agent tool to run these in parallel — they are independent analysis tasks:
 
- - **Agent 1** `subagent_type: spock-schema` — Schema review: normalization, index/query alignment, nullable fields, audit fields, PII isolation, data lifecycle, backup/recovery.
- - **Agent 2** `subagent_type: uhura-integration` — Integration review: service inventory (purpose, failure mode, fallback, cost, lock-in), API version pinning, response validation, abstraction layers.
- - **Agent 3** `subagent_type: worf-security-arch` — Security implications of architectural decisions: PII colocation, unauthenticated internal state access, permissive service boundaries. Audits *design*, not code.
- - **Agent 4** `subagent_type: tuvok-deep-current` — Security architecture: auth flow design, token storage, session architecture, encryption at rest vs in transit. Where Worf flags implications, Tuvok designs solutions.
+ - **Agent 1** `subagent_type: Spock` — Schema review: normalization, index/query alignment, nullable fields, audit fields, PII isolation, data lifecycle, backup/recovery.
+ - **Agent 2** `subagent_type: Uhura` — Integration review: service inventory (purpose, failure mode, fallback, cost, lock-in), API version pinning, response validation, abstraction layers.
+ - **Agent 3** `subagent_type: Worf` — Security implications of architectural decisions: PII colocation, unauthenticated internal state access, permissive service boundaries. Audits *design*, not code.
+ - **Agent 4** `subagent_type: Tuvok` — Security architecture: auth flow design, token storage, session architecture, encryption at rest vs in transit. Where Worf flags implications, Tuvok designs solutions.
 
  Synthesize findings from all four agents.
 
  ## Step 2 — Service Architecture + API Design
- - **Scotty** `subagent_type: scotty-infrastructure` — Boundary assessment, monolith vs services, async vs sync decisions.
- - **Kim** `subagent_type: kim-api-design` — API surface review: REST conventions, error shapes, pagination, versioning.
- - **Janeway** `subagent_type: janeway-novel-arch` — (conditional) When standard monolith doesn't fit: event-sourcing, CQRS, serverless, edge computing.
+ - **Scotty** `subagent_type: Scotty` — Boundary assessment, monolith vs services, async vs sync decisions.
+ - **Kim** `subagent_type: Kim` — API surface review: REST conventions, error shapes, pagination, versioning.
+ - **Janeway** `subagent_type: Janeway` — (conditional) When standard monolith doesn't fit: event-sourcing, CQRS, serverless, edge computing.
  - Informed by Spock's schema, Uhura's integrations, and Worf/Tuvok's security findings.
 
  ## Step 3 — Scaling + Performance
- - **Scotty** `subagent_type: scotty-infrastructure` — First bottleneck identification, three-tier scaling plan (current → 10x vertical → 100x horizontal), cost estimates.
- - **Torres** `subagent_type: torres-site-scanner` — Performance architecture: N+1 patterns, missing indexes, connection pool sizing, caching strategy gaps.
+ - **Scotty** `subagent_type: Scotty` — First bottleneck identification, three-tier scaling plan (current → 10x vertical → 100x horizontal), cost estimates.
+ - **Torres** `subagent_type: Torres` — Performance architecture: N+1 patterns, missing indexes, connection pool sizing, caching strategy gaps.
 
  ## Step 4 — Parallel Analysis
  Use the Agent tool to run these in parallel — they are independent analysis tasks:
 
- - **Agent 1** `subagent_type: la-forge-reliability` — Failure analysis: for each component, answer "What happens when this fails?" (DB down, cache down, API down, worker crash).
- - **Agent 2** `subagent_type: data-tech-debt` — Tech debt catalog: wrong/missing abstraction, premature optimization, deferred decisions, dependency debt, documentation debt. Severity table with impact/risk/effort/urgency.
+ - **Agent 1** `subagent_type: La Forge` — Failure analysis: for each component, answer "What happens when this fails?" (DB down, cache down, API down, worker crash).
+ - **Agent 2** `subagent_type: Data` — Tech debt catalog: wrong/missing abstraction, premature optimization, deferred decisions, dependency debt, documentation debt. Severity table with impact/risk/effort/urgency.
 
  ## Step 5 — ADRs + Decision Review
- Write Architecture Decision Records to `/docs/adrs/` for every non-obvious choice. After writing, **Riker** `subagent_type: riker-review` reviews: challenges trade-offs, verifies alternatives were truly considered, checks for second-order effects.
+ Write Architecture Decision Records to `/docs/adrs/` for every non-obvious choice. After writing, **Riker** `subagent_type: Riker` reviews: challenges trade-offs, verifies alternatives were truly considered, checks for second-order effects.
  ```
  # ADR-001: [Title]
  ## Status: Accepted
@@ -147,10 +147,10 @@ Run the full `/test` protocol. Write missing unit tests, integration tests, and
 
  Use the Agent tool to run these in parallel — all are adversarial, read-only analysis:
 
- - `subagent_type: maul-red-team` — attacks code that passed /review. Looks for exploits in "clean" code.
- - `subagent_type: deathstroke-adversarial` — probes endpoints that /security hardened. Tests if remediations can be bypassed.
- - `subagent_type: loki-chaos` — chaos-tests features that /qa cleared. Finds what breaks under unexpected conditions.
- - `subagent_type: constantine-cursed-code` — hunts cursed code in FIXED areas specifically. Code that works by accident.
+ - `subagent_type: Maul` — attacks code that passed /review. Looks for exploits in "clean" code.
+ - `subagent_type: Deathstroke` — probes endpoints that /security hardened. Tests if remediations can be bypassed.
+ - `subagent_type: Loki` — chaos-tests features that /qa cleared. Finds what breaks under unexpected conditions.
+ - `subagent_type: Constantine` — hunts cursed code in FIXED areas specifically. Code that works by accident.
 
  Synthesize findings. **Conflict detection:** If any two agents produce conflicting findings on the same code (one says "fix," another says "by design" or "not exploitable"), trigger the debate protocol instead of listing both. See SUB_AGENTS.md "Agent Debate Protocol": Agent A states finding → Agent B responds → Agent A rebuts → Arbiter (Picard or user) decides. 3 exchanges max. Log the debate transcript as an ADR. Fix all Must Fix items. If any fixes were applied, re-run the four agents on the fixed areas only.
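
The debate protocol described in the line above (A states, B responds, A rebuts, then an arbiter decides, with three exchanges at most) can be modelled as a small bounded transcript. This is only a sketch under stated assumptions: `Debate`, `has_conflict`, and the verdict strings are hypothetical names, and SUB_AGENTS.md remains the authoritative description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

MAX_EXCHANGES = 3  # from the text: "3 exchanges max"


def has_conflict(verdicts: Dict[str, str]) -> bool:
    """True when agents reached different verdicts on the same code."""
    return len(set(verdicts.values())) > 1


@dataclass
class Debate:
    finding: str
    transcript: List[Tuple[str, str]] = field(default_factory=list)
    decision: Optional[str] = None

    def exchange(self, speaker: str, statement: str) -> None:
        """Record one turn; refuse once the exchange limit is reached."""
        if len(self.transcript) >= MAX_EXCHANGES:
            raise RuntimeError("exchange limit reached; the arbiter must decide")
        self.transcript.append((speaker, statement))

    def decide(self, arbiter: str, ruling: str) -> str:
        """Close the debate with the arbiter's ruling."""
        self.decision = f"{arbiter}: {ruling}"
        return self.decision


if __name__ == "__main__":
    verdicts = {"Maul": "fix", "Loki": "by design"}
    if has_conflict(verdicts):
        debate = Debate("input validation on an example endpoint")
        debate.exchange("Maul", "states finding")
        debate.exchange("Loki", "responds")
        debate.exchange("Maul", "rebuts")
        print(debate.decide("Picard", "fix"))
```

Keeping the transcript on the object makes it straightforward to log the whole exchange as an ADR afterwards, which the protocol asks for.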
 
@@ -161,11 +161,11 @@ Synthesize findings. **Conflict detection:** If any two agents produce conflicti
 
  Use the Agent tool to run these in parallel:
 
- - `subagent_type: spock-schema` — Did any security/QA/UX fix break code patterns or quality?
- - `subagent_type: ahsoka-access-control` — Did any review/QA fix introduce access control gaps?
- - `subagent_type: nightwing-regression` — Did any fix cause a regression? Run the full test suite.
- - `subagent_type: samwise-accessibility` — Did any fix break accessibility?
- - `subagent_type: troi-prd-compliance` — PRD compliance: read the PRD prose section-by-section, verify every claim against the implementation. Not just "does the route exist?" but "does the component render what the PRD describes?" Check numeric claims, visual treatments, copy accuracy. Flag asset gaps as BLOCKED. (Troi runs on the final Council iteration, or always when `--skip-build` is used for campaign victory gates.)
+ - `subagent_type: Spock` — Did any security/QA/UX fix break code patterns or quality?
+ - `subagent_type: Ahsoka` — Did any review/QA fix introduce access control gaps?
+ - `subagent_type: Nightwing` — Did any fix cause a regression? Run the full test suite.
+ - `subagent_type: Samwise` — Did any fix break accessibility?
+ - `subagent_type: Troi` — PRD compliance: read the PRD prose section-by-section, verify every claim against the implementation. Not just "does the route exist?" but "does the component render what the PRD describes?" Check numeric claims, visual treatments, copy accuracy. Flag asset gaps as BLOCKED. (Troi runs on the final Council iteration, or always when `--skip-build` is used for campaign victory gates.)
 
  **Conflict detection:** If Council members disagree (e.g., Spock says a fix broke patterns but Ahsoka says it's necessary for access control), trigger the debate protocol. Do not list both opinions — resolve via debate. Arbiter: Picard for code/architecture conflicts, Troi for PRD compliance conflicts.
 
@@ -40,8 +40,8 @@ Run `/gauntlet --assess` — Rounds 1-2 only (Discovery + First Strike). No fix
 
  ### Step 3 — PRD Gap Analysis
  If a PRD exists:
- 1. **Dax** `subagent_type: dax-legacy-wisdom` diffs PRD requirements against implemented features (structural + semantic)
- 2. **Troi** `subagent_type: troi-prd-compliance` reads PRD prose section-by-section and verifies claims against reality
+ 1. **Dax** `subagent_type: Dax` diffs PRD requirements against implemented features (structural + semantic)
+ 2. **Troi** `subagent_type: Troi` reads PRD prose section-by-section and verifies claims against reality
  3. Check for YAML frontmatter — if missing, flag it (see CAMPAIGN.md Step 1)
 
  If no PRD exists:
@@ -7,8 +7,8 @@ Opus scans `git diff --stat` and matches changed files against the `description`
  **Dispatch control:** `--light` skips dynamic dispatch (core only). `--solo` runs lead agent only.
 
  **Promoted agents:**
- - **Troi** `subagent_type: troi-prd-compliance` runs after every build mission completion — catches PRD drift before it compounds.
- - **Riker** `subagent_type: riker-review` runs whenever an ADR is written during the build — prevents rubber-stamped decisions.
+ - **Troi** `subagent_type: Troi` runs after every build mission completion — catches PRD drift before it compounds.
+ - **Riker** `subagent_type: Riker` runs whenever an ADR is written during the build — prevents rubber-stamped decisions.
 
  ## Herald Pre-Scan (ADR-047)
 
@@ -36,7 +36,7 @@ Before agent deployment, run the Herald to select the optimal roster:
  4. Extract from PRD: tech stack, database schema, API routes, page routes, integrations, env vars
  5. Read `/docs/LESSONS.md` — check for relevant lessons from previous projects. If any lessons match this project's tech stack (framework, database, auth, integrations), note them: "Lessons from prior builds: [list relevant ones]." These inform later phases — e.g., if a lesson says "React useEffect render loops escape review," trace render cycles proactively in Phase 4+.
  6. Flag any gaps or ambiguities — list them explicitly, don't guess
- 7. **Troi** `subagent_type: troi-prd-compliance` confirms PRD extraction: reads the PRD prose and verifies the extraction matches — catches misinterpretations before 8+ build phases propagate them.
+ 7. **Troi** `subagent_type: Troi` confirms PRD extraction: reads the PRD prose and verifies the extraction matches — catches misinterpretations before 8+ build phases propagate them.
  8. **Save PRD snapshot:** Copy `/docs/PRD.md` to `/docs/PRD-snapshot-phase0.md`. This is the baseline for drift detection — the Living PRD feature compares the evolving PRD against this snapshot at phase gates and at Victory.
  9. Write initial ADRs to `/docs/adrs/`
  10. Create `/logs/build-state.md` and `/logs/phase-00-orient.md` with extraction results + relevant lessons
@@ -125,6 +125,9 @@ Before agent deployment, run the Herald to select the optimal roster:
  ## Phase 12.5 — Wong's Pattern Usage Log
  After build and before launch, log which patterns were used: pattern name, framework adaptation, custom mods. Store in `docs/pattern-usage.json`. Feeds Wong's promotion analysis in `/debrief`.
 
+ ## Phase 12.75 — Distribution Verification Gate
+ If this build introduces a new shared file category (e.g., `.claude/agents/`, new patterns subdirectory), verify ALL 6 consumption paths include it: prepack.sh, copy-assets.sh, project-init.ts, updater.ts, FORGE_KEEPER.md, void.md. Missing one path = users silently miss the feature. (Field report #297.)
+
  ## Phase 13 — Launch (All agents)
  1. Full checklist: SSL, email, payments, analytics, monitoring, backups, security headers, legal, performance, mobile, accessibility, all tests passing
  2. Log final status to `/logs/phase-13-launch.md`
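
The Phase 12.75 gate added in this hunk amounts to a six-way membership check. A minimal sketch follows, assuming that "includes it" can be approximated by a plain substring match on each file's text. `missing_paths` and the `contents` mapping (file name to file text) are hypothetical; only the six file names come from the diff.

```python
from typing import Dict, List

# The six consumption paths named in Phase 12.75.
CONSUMPTION_PATHS = [
    "prepack.sh",
    "copy-assets.sh",
    "project-init.ts",
    "updater.ts",
    "FORGE_KEEPER.md",
    "void.md",
]


def missing_paths(category: str, contents: Dict[str, str]) -> List[str]:
    """Return the consumption paths whose text never mentions `category`.

    A path that is absent from `contents` counts as missing.
    """
    return [
        path
        for path in CONSUMPTION_PATHS
        if category not in contents.get(path, "")
    ]


if __name__ == "__main__":
    # Illustrative file texts only; real contents would be read from disk.
    contents = {path: "cp -r .claude/agents/ dist/" for path in CONSUMPTION_PATHS}
    contents["updater.ts"] = "// no mention of the new category"
    print(missing_paths(".claude/agents/", contents))
```

An empty result means the gate passes. Any returned name is a path where users would silently miss the new category.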
@@ -33,12 +33,12 @@ If `$ARGUMENTS` contains `--plan`, skip execution and update the plan instead:
 
  1. Read the current PRD (`/PRD-VOIDFORGE.md` or `/docs/PRD.md`) and `ROADMAP.md` (if it exists)
  2. Parse what the user wants to add from `$ARGUMENTS` (everything after `--plan`)
- 3. **Dax** (`subagent_type: dax-legacy-wisdom`) **analyzes** where it fits:
+ 3. **Dax** (`subagent_type: Dax`) **analyzes** where it fits:
  - Is it a new feature? → Add to the PRD under the right section (Core Features, Integrations, etc.)
  - Is it a bug fix or improvement? → Add to ROADMAP.md under the appropriate version
  - Is it a new version-worth of work? → Create a new version section in ROADMAP.md
  - Does it change priorities? → Reorder the roadmap accordingly
- 4. **Odo** (`subagent_type: odo-structural-anomaly`) **checks** dependencies: does this new item depend on something not yet built? Flag it.
+ 4. **Odo** (`subagent_type: Odo`) **checks** dependencies: does this new item depend on something not yet built? Flag it.
  5. Present the proposed changes to the user for review before writing
  6. On confirmation, write the updates to the PRD and/or ROADMAP.md
  7. Do NOT start building — planning mode only updates the plan
@@ -70,7 +70,7 @@ Before agent deployment, run the Herald to select the optimal roster:
 
  ## Execution Mode (default)
 
- ## Step 0 — Kira's Operational Reconnaissance (`subagent_type: kira-pragmatic`)
+ ## Step 0 — Kira's Operational Reconnaissance (`subagent_type: Kira`)
 
  Check for unfinished business:
 
@@ -100,9 +100,9 @@ If vault exists and `.env` is sparse (missing keys that the vault has):
  1. Run `voidforge deploy --env-only` to write vault credentials to `.env`
  2. In `--blitz` mode: auto-run without confirmation
  3. In normal mode: show what will be written, ask for confirmation
- 4. This runs BEFORE Dax's (`subagent_type: dax-legacy-wisdom`) full analysis so the populated `.env` is visible
+ 4. This runs BEFORE Dax's (`subagent_type: Dax`) full analysis so the populated `.env` is visible
 
- ## Step 1 — Dax's Strategic Analysis (`subagent_type: dax-legacy-wisdom`)
+ ## Step 1 — Dax's Strategic Analysis (`subagent_type: Dax`)
 
  Read the PRD and diff against the codebase:
 
@@ -113,7 +113,7 @@ Read the PRD and diff against the codebase:
  5. **Classify every requirement by type:** Code (buildable), Asset (needs external generation — images, illustrations, OG cards), Copy (text accuracy), Infrastructure (DNS, env vars, dashboards)
  6. Diff: what the PRD describes vs. what's implemented — **structural AND semantic** (not just "does the route exist?" but "does the component render what the PRD describes?")
  7. Produce the ordered mission list — each mission is 1-3 PRD sections, scoped to be buildable in one `/assemble` run
- 8. **Pike** (`subagent_type: pike-bold-decisions`) **challenges the ordering:** "Should we attempt a harder mission first while context is fresh?" Bold counterbalance to Dax's dependency-based ordering. If Pike's argument is stronger, reorder.
+ 8. **Pike** (`subagent_type: Pike`) **challenges the ordering:** "Should we attempt a harder mission first while context is fresh?" Bold counterbalance to Dax's dependency-based ordering. If Pike's argument is stronger, reorder.
  9. **Separately list BLOCKED items** — asset/infrastructure requirements that code can't satisfy
 
  **Priority cascade:**
@@ -124,7 +124,7 @@ Read the PRD and diff against the codebase:
  5. Skip sections flagged as no/none in frontmatter
  6. Asset/infrastructure requirements → flag as BLOCKED, don't include in code missions
 
- ## Step 2 — Odo's Prerequisite Check (`subagent_type: odo-structural-anomaly`)
+ ## Step 2 — Odo's Prerequisite Check (`subagent_type: Odo`)
 
  For the next mission on the list:
  - Are dependencies met? (e.g., Payments needs Auth)
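
Two rules from this hunk lend themselves to a short illustration: asset and infrastructure requirements are flagged BLOCKED rather than placed in code missions, and a mission cannot start until its dependencies are met. The sketch below assumes requirements arrive as `(name, type)` pairs and dependencies as a plain mapping. `split_requirements`, `unmet_dependencies`, and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Requirement types the priority cascade flags as BLOCKED.
BLOCKED_TYPES = {"Asset", "Infrastructure"}


def split_requirements(
    requirements: Iterable[Tuple[str, str]],
) -> Tuple[List[str], List[str]]:
    """Partition (name, type) pairs into (code-mission items, BLOCKED items)."""
    buildable: List[str] = []
    blocked: List[str] = []
    for name, req_type in requirements:
        (blocked if req_type in BLOCKED_TYPES else buildable).append(name)
    return buildable, blocked


def unmet_dependencies(
    mission: str, depends_on: Dict[str, List[str]], completed: Set[str]
) -> List[str]:
    """Return the dependencies of `mission` that are not yet completed."""
    return [dep for dep in depends_on.get(mission, []) if dep not in completed]


if __name__ == "__main__":
    reqs = [("Auth", "Code"), ("OG cards", "Asset"), ("DNS", "Infrastructure")]
    print(split_requirements(reqs))
    # The source's own example: Payments needs Auth.
    print(unmet_dependencies("Payments", {"Payments": ["Auth"]}, completed=set()))
```

Treating Copy requirements as buildable is an assumption here. The source only names asset and infrastructure items as BLOCKED.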
@@ -160,7 +160,7 @@ On confirmation (or immediately in `--blitz` mode):
  2. If `$ARGUMENTS` includes `--fast`, pass `--fast` to assemble (skip Crossfire + Council). Note: `--blitz` does NOT imply `--fast`.
  3. Monitor for context pressure symptoms (re-reading files, forgetting decisions). If noticed, ask user to run `/context` — only checkpoint if usage exceeds 70%.
 
- ## Step 4.5 — Gauntlet Checkpoint (`subagent_type: thanos-gauntlet`)
+ ## Step 4.5 — Gauntlet Checkpoint (`subagent_type: Thanos`)
 
  After every 4th mission (missions 4, 8, 12, etc.), run a Gauntlet checkpoint before continuing:
 
@@ -192,13 +192,13 @@ After `/assemble` completes:
 
  **Context pressure check:** Do NOT checkpoint based on mission count. Check actual context usage via `/context`. Only checkpoint when usage exceeds 70% (~700k tokens). Never pause a blitz based on mission count alone.
 
- ## Step 6 — Victory Condition (Gauntlet + Troi's Compliance Check) (`subagent_type: troi-prd-compliance`)
+ ## Step 6 — Victory Condition (Gauntlet + Troi's Compliance Check) (`subagent_type: Troi`)
 
  All PRD requirements are COMPLETE or explicitly BLOCKED:
 
  1. **Run `/gauntlet` (full 5 rounds)** — mandatory final Gauntlet on the complete codebase. This is non-negotiable, even with `--fast`. The Gauntlet tests the combined system across all domains: architecture, code review, UX, security, QA, DevOps, adversarial crossfire, and council convergence. Individual `/assemble` runs review one mission at a time; the Gauntlet reviews everything together.
  2. **Fix all Critical and High findings** from the Gauntlet.
- 3. **Troi** (`subagent_type: troi-prd-compliance`) **reads the PRD section-by-section** (runs as part of the Gauntlet Council round) — verifies every prose claim against the implementation. Not just "does the route exist?" but "does the component render what the PRD describes?" Checks numeric claims, visual treatments, copy accuracy, asset gaps.
+ 3. **Troi** (`subagent_type: Troi`) **reads the PRD section-by-section** (runs as part of the Gauntlet Council round) — verifies every prose claim against the implementation. Not just "does the route exist?" but "does the component render what the PRD describes?" Checks numeric claims, visual treatments, copy accuracy, asset gaps.
  4. Fix code discrepancies. Flag asset requirements as BLOCKED.
  5. Report: COMPLETE items, BLOCKED items (with reasons), deviations from PRD
  6. Victory only if: Gauntlet Council signs off AND user acknowledges all BLOCKED items
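
The campaign file's gating rules are simple predicates: a Gauntlet checkpoint after every fourth mission, a context checkpoint only when usage exceeds 70%, and victory only when the Council signs off and every BLOCKED item is acknowledged. A minimal sketch follows. The function names and argument shapes are hypothetical, and the one-million-token window in the example is inferred from the source's "70% (~700k tokens)".

```python
from typing import Iterable, Set

CONTEXT_THRESHOLD = 0.70  # "only checkpoint when usage exceeds 70%"
GAUNTLET_INTERVAL = 4     # "after every 4th mission (missions 4, 8, 12, etc.)"


def gauntlet_due(missions_completed: int) -> bool:
    """True right after missions 4, 8, 12, and so on."""
    return missions_completed > 0 and missions_completed % GAUNTLET_INTERVAL == 0


def should_checkpoint(used_tokens: int, window_tokens: int) -> bool:
    """Checkpoint on actual context usage, never on mission count."""
    return used_tokens / window_tokens > CONTEXT_THRESHOLD


def victory(council_signed_off: bool, blocked: Iterable[str], acknowledged: Set[str]) -> bool:
    """Victory needs Council sign-off AND every BLOCKED item acknowledged."""
    return council_signed_off and all(item in acknowledged for item in blocked)


if __name__ == "__main__":
    print(gauntlet_due(8))
    print(should_checkpoint(750_000, 1_000_000))
    print(victory(True, ["OG cards"], {"OG cards"}))
```

Keeping the two checkpoint predicates separate mirrors the text: the Gauntlet cadence depends on mission count, while the context checkpoint depends only on measured usage.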
@@ -7,7 +7,7 @@ Bashir examines the patient. Time to diagnose.
 
  ## Step 0 — Reconstruct the Timeline
 
- **Ezri** `subagent_type: ezri-session-analyst` reads the session's history and reconstructs what happened:
+ **Ezri** `subagent_type: Ezri` reads the session's history and reconstructs what happened:
 
  1. Read all `/logs/` files — build state, assemble state, campaign state, phase logs
  2. Read `git log` — all commits from this session/campaign
@@ -20,7 +20,7 @@ Default: auto-detect scope from available logs.
 
  ## Step 1 — Investigate Root Causes
 
- **O'Brien** `subagent_type: obrien-root-cause` investigates. For each failure, difficulty, or retry identified by Ezri:
+ **O'Brien** `subagent_type: O'Brien` investigates. For each failure, difficulty, or retry identified by Ezri:
 
  Classify the root cause:
  - **Methodology gap** — missing step, wrong order, blind spot in the protocol
@@ -34,7 +34,7 @@ Map each root cause to the VoidForge component responsible (which command, which
 
  ## Step 2 — Propose Solutions
 
- **Nog** `subagent_type: nog-solutions` proposes a fix for each root cause that works within VoidForge's existing framework:
+ **Nog** `subagent_type: Nog` proposes a fix for each root cause that works within VoidForge's existing framework:
 
  - New agent? → name it from the correct universe, define the role
  - New step in existing command? → specify where it goes in the sequence
@@ -56,7 +56,7 @@ Approved entries written to `docs/LEARNINGS.md` (created on first use). Hard cap
 
  ## Step 2.5b — Promotion Analysis
 
- After extraction, **Wong** `subagent_type: wong-documentation` checks `docs/LESSONS.md` for lesson clusters AND checks `docs/LEARNINGS.md` for promotable entries (appeared in 2+ projects):
+ After extraction, **Wong** `subagent_type: Wong` checks `docs/LESSONS.md` for lesson clusters AND checks `docs/LEARNINGS.md` for promotable entries (appeared in 2+ projects):
  - If 3+ lessons share the same category AND target the same method doc → Wong drafts a specific method doc update
  - Present for user approval: "Wong recommends promoting these lessons into [method doc] [section]: [proposed text]. Approve?"
  - If approved: apply the change, mark lessons as "Promoted to: [doc]" in LESSONS.md
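
Wong's clustering rule (three or more lessons sharing a category and targeting the same method doc) is a group-and-count. A minimal sketch follows, assuming each lesson is a dict with hypothetical keys `title`, `category`, and `target_doc`. The actual layout of `docs/LESSONS.md` is not shown in this diff.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

PROMOTION_THRESHOLD = 3  # "3+ lessons share the same category AND target the same method doc"


def promotable_clusters(
    lessons: Iterable[Dict[str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Group lesson titles by (category, target_doc); keep groups at or above the threshold."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for lesson in lessons:
        groups[(lesson["category"], lesson["target_doc"])].append(lesson["title"])
    return {
        key: titles
        for key, titles in groups.items()
        if len(titles) >= PROMOTION_THRESHOLD
    }


if __name__ == "__main__":
    # Illustrative lessons only; categories and doc names are made up.
    lessons = [
        {"title": "L1", "category": "testing", "target_doc": "QA.md"},
        {"title": "L2", "category": "testing", "target_doc": "QA.md"},
        {"title": "L3", "category": "testing", "target_doc": "QA.md"},
        {"title": "L4", "category": "testing", "target_doc": "DEVOPS.md"},
    ]
    print(promotable_clusters(lessons))
```

Each returned cluster is only a candidate. The source still requires user approval before any method doc is changed.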
@@ -64,7 +64,7 @@ After extraction, **Wong** `subagent_type: wong-documentation` checks `docs/LESS
 
  ## Step 3 — Write the Report
 
- **Jake** `subagent_type: jake-reporter` produces a structured post-mortem:
+ **Jake** `subagent_type: Jake` produces a structured post-mortem:
 
  ```markdown
  # Field Report — [Project Name]
@@ -28,27 +28,27 @@ Before agent deployment, run the Herald to select the optimal roster:
 
  ## Agent Deployment Manifest
 
- **Lead:** Kusanagi (`subagent_type: kusanagi-devops`)
+ **Lead:** Kusanagi (`subagent_type: Kusanagi`)
 
  **Core team (always deployed):**
- - **Senku** (`subagent_type: senku-provisioning`) — provisioning: server setup, dependencies, runtime, idempotent scripts
- - **Levi** (`subagent_type: levi-deploy`) — deployment: process management, zero-downtime, rollback scripts
- - **Spike** (`subagent_type: spike-routing`) — networking: reverse proxy, DNS, TLS, firewall, CORS headers
+ - **Senku** (`subagent_type: Senku`) — provisioning: server setup, dependencies, runtime, idempotent scripts
+ - **Levi** (`subagent_type: Levi`) — deployment: process management, zero-downtime, rollback scripts
+ - **Spike** (`subagent_type: Spike`) — networking: reverse proxy, DNS, TLS, firewall, CORS headers
  - **L** — monitoring: health checks, uptime, alerting, log aggregation (honorary — no agent definition)
- - **Bulma** (`subagent_type: bulma-engineering`) — backup: database dumps, file backup, retention, restore testing
+ - **Bulma** (`subagent_type: Bulma`) — backup: database dumps, file backup, retention, restore testing
  - **Holo** — cost: resource sizing, instance selection, cost estimation, optimization (honorary — no agent definition)
 
  **Extended team (deployed on full infra reviews):**
- - **Valkyrie** (`subagent_type: valkyrie-recovery`) — disaster recovery: failover, data center redundancy, RTO/RPO
- - **Vegeta** (`subagent_type: vegeta-monitoring`) — scaling: horizontal scaling, load balancing, auto-scaling policies
- - **Trunks** (`subagent_type: trunks-rollback`) — migration: database migration strategy, zero-downtime schema changes
- - **Mikasa** (`subagent_type: mikasa-protection`) — security hardening: SSH config, fail2ban, unattended upgrades
- - **Erwin** (`subagent_type: erwin-strategy`) — strategy: multi-environment management, staging/production parity
- - **Mustang** (`subagent_type: mustang-cleanup`) — orchestration: Docker Compose, container networking, service discovery
- - **Olivier** (`subagent_type: olivier-hardening`) — cold region: CDN configuration, edge caching, geographic distribution
- - **Hughes** (`subagent_type: hughes-observability`) — documentation: runbook writing, infrastructure diagrams, onboarding docs
- - **Calcifer** (`subagent_type: calcifer-daemon`) — energy: resource efficiency, idle scaling, sleep/wake optimization
- - **Duo** (`subagent_type: duo-teardown`) — CI/CD: GitHub Actions, pipeline design, automated testing in deploy
+ - **Valkyrie** (`subagent_type: Valkyrie`) — disaster recovery: failover, data center redundancy, RTO/RPO
+ - **Vegeta** (`subagent_type: Vegeta`) — scaling: horizontal scaling, load balancing, auto-scaling policies
+ - **Trunks** (`subagent_type: Trunks`) — migration: database migration strategy, zero-downtime schema changes
+ - **Mikasa** (`subagent_type: Mikasa`) — security hardening: SSH config, fail2ban, unattended upgrades
+ - **Erwin** (`subagent_type: Erwin`) — strategy: multi-environment management, staging/production parity
+ - **Mustang** (`subagent_type: Mustang`) — orchestration: Docker Compose, container networking, service discovery
+ - **Olivier** (`subagent_type: Olivier`) — cold region: CDN configuration, edge caching, geographic distribution
+ - **Hughes** (`subagent_type: Hughes`) — documentation: runbook writing, infrastructure diagrams, onboarding docs
+ - **Calcifer** (`subagent_type: Calcifer`) — energy: resource efficiency, idle scaling, sleep/wake optimization
+ - **Duo** (`subagent_type: Duo`) — CI/CD: GitHub Actions, pipeline design, automated testing in deploy
 
  ## Deploy Target Branching
 
@@ -36,11 +36,11 @@ Before agent deployment, run the Herald to select the optimal roster:
36
36
 
37
37
  Use the Agent tool to run all five in parallel — these are read-only analysis:
38
38
 
39
- - **Agent 1** `subagent_type: picard-architecture` — Schema review, service boundaries, dependency graph, scaling assessment. Read the full `/architect` protocol but produce findings only (no ADRs — this is review, not design).
40
- - **Agent 2** `subagent_type: stark-backend` — Pattern compliance, logic errors, type safety, cross-module data flow tracing. Read `/review` protocol. One pass across all source files.
41
- - **Agent 3** `subagent_type: galadriel-frontend` — Product surface map, usability walkthrough (Step 1.5), Éowyn's enchantment scan (Step 1.75). No fixes yet — discovery only.
42
- - **Agent 4** `subagent_type: kenobi-security` — List all endpoints, WebSocket handlers, file I/O, credential access points, user input parsing. Classify each by risk tier. No deep audit yet — just the map.
43
- - **Agent 5** `subagent_type: kusanagi-devops` — Scan deploy scripts, generated configs, provisioning scripts, CI/CD templates. Classify each by risk: hardcoded credentials, open ports, missing auth on generated services. No deep audit yet — just the map.
39
+ - **Agent 1** `subagent_type: Picard` — Schema review, service boundaries, dependency graph, scaling assessment. Read the full `/architect` protocol but produce findings only (no ADRs — this is review, not design).
40
+ - **Agent 2** `subagent_type: Stark` — Pattern compliance, logic errors, type safety, cross-module data flow tracing. Read `/review` protocol. One pass across all source files.
41
+ - **Agent 3** `subagent_type: Galadriel` — Product surface map, usability walkthrough (Step 1.5), Éowyn's enchantment scan (Step 1.75). No fixes yet — discovery only.
42
+ - **Agent 4** `subagent_type: Kenobi` — List all endpoints, WebSocket handlers, file I/O, credential access points, user input parsing. Classify each by risk tier. No deep audit yet — just the map.
43
+ - **Agent 5** `subagent_type: Kusanagi` — Scan deploy scripts, generated configs, provisioning scripts, CI/CD templates. Classify each by risk: hardcoded credentials, open ports, missing auth on generated services. No deep audit yet — just the map.
44
44
 
45
45
  Synthesize all five into a unified findings list. Log to `/logs/gauntlet-round-1.md`.
46
46
 
@@ -50,10 +50,10 @@ Synthesize all five into a unified findings list. Log to `/logs/gauntlet-round-1
50
50
 
51
51
  Use the Agent tool to run all four in parallel — full domain audits:
52
52
 
53
- - **Agent 1** `subagent_type: batman-qa` — Run the complete `/qa` protocol. Oracle + Red Hood + Alfred + Deathstroke + Constantine + Nightwing + Lucius. Every edge case, every error state, every boundary.
54
- - **Agent 2** `subagent_type: galadriel-frontend` — Run the complete `/ux` protocol. Elrond + Arwen + Samwise + Bilbo + Legolas + Gimli + Radagast + Éowyn. Usability, visual, a11y, copy, performance, edge cases, enchantment.
55
- - **Agent 3** `subagent_type: kenobi-security` — Run the complete `/security` protocol. Leia + Chewie + Rex + Maul parallel scans, then Yoda → Windu → Ahsoka → Padmé sequential audits.
56
- - **Agent 4** `subagent_type: stark-backend` — For every API endpoint, trace the full data path: client request → validation → service → database → response. For every file upload, trace: upload → storage → retrieval → display. For every credential, trace: entry → vault → usage → cleanup.
53
+ - **Agent 1** `subagent_type: Batman` — Run the complete `/qa` protocol. Oracle + Red Hood + Alfred + Deathstroke + Constantine + Nightwing + Lucius. Every edge case, every error state, every boundary.
54
+ - **Agent 2** `subagent_type: Galadriel` — Run the complete `/ux` protocol. Elrond + Arwen + Samwise + Bilbo + Legolas + Gimli + Radagast + Éowyn. Usability, visual, a11y, copy, performance, edge cases, enchantment.
55
+ - **Agent 3** `subagent_type: Kenobi` — Run the complete `/security` protocol. Leia + Chewie + Rex + Maul parallel scans, then Yoda → Windu → Ahsoka → Padmé sequential audits.
56
+ - **Agent 4** `subagent_type: Stark` — For every API endpoint, trace the full data path: client request → validation → service → database → response. For every file upload, trace: upload → storage → retrieval → display. For every credential, trace: entry → vault → usage → cleanup.
57
57
 
58
58
  Merge all findings. Deduplicate across domains.
59
59
 
@@ -76,10 +76,10 @@ This catches runtime bugs invisible to static analysis: IPv6 binding, native mod
76
76
 
77
77
  Use the Agent tool to run all four in parallel — targeted re-verification:
78
78
 
79
- - **Agent 1** `subagent_type: batman-qa` — Nightwing re-runs the test suite. Red Hood re-probes fixed areas. Deathstroke tests new boundaries created by the fixes. Focus on regressions.
80
- - **Agent 2** `subagent_type: galadriel-frontend` — Samwise re-audits a11y on all modified components. Radagast re-checks edge cases on fixed flows. Bilbo re-checks microcopy on any changed UI.
81
- - **Agent 3** `subagent_type: kenobi-security` — Maul re-probes all remediated vulnerabilities. Ahsoka verifies access control across every role boundary. Padmé verifies the primary user flow still works (critical path smoke test).
82
- - **Agent 4** `subagent_type: kusanagi-devops` — Run the complete `/devops` protocol with full team: Senku (provisioning), Levi (deploy), Spike (networking), L (monitoring), Bulma (backup), Holo (cost), Valkyrie (disaster recovery). Deploy scripts, monitoring, backups, health checks, page weight gate, security headers.
79
+ - **Agent 1** `subagent_type: Batman` — Nightwing re-runs the test suite. Red Hood re-probes fixed areas. Deathstroke tests new boundaries created by the fixes. Focus on regressions.
80
+ - **Agent 2** `subagent_type: Galadriel` — Samwise re-audits a11y on all modified components. Radagast re-checks edge cases on fixed flows. Bilbo re-checks microcopy on any changed UI.
81
+ - **Agent 3** `subagent_type: Kenobi` — Maul re-probes all remediated vulnerabilities. Ahsoka verifies access control across every role boundary. Padmé verifies the primary user flow still works (critical path smoke test).
82
+ - **Agent 4** `subagent_type: Kusanagi` — Run the complete `/devops` protocol with full team: Senku (provisioning), Levi (deploy), Spike (networking), L (monitoring), Bulma (backup), Holo (cost), Valkyrie (disaster recovery). Deploy scripts, monitoring, backups, health checks, page weight gate, security headers.
83
83
 
84
84
  **→ FIX BATCH 2:** Fix remaining findings.
85
85
 
@@ -89,11 +89,11 @@ Use the Agent tool to run all four in parallel — targeted re-verification:
89
89
 
90
90
  Use the Agent tool to run all five in parallel — pure adversarial:
91
91
 
92
- - `subagent_type: maul-red-team` — Attacks code that passed /review. Looks for exploits in "clean" code.
93
- - `subagent_type: deathstroke-adversarial` — Probes endpoints that /security hardened. Tests if remediations can be bypassed.
94
- - `subagent_type: loki-chaos` — Chaos-tests features that /qa cleared. What breaks under unexpected conditions?
95
- - `subagent_type: constantine-cursed-code` — Hunts cursed code in FIXED areas specifically. Code that only works by accident.
96
- - `subagent_type: eowyn-delight` — Final enchantment pass on the polished, hardened product. Where can delight still be added without compromising security or stability?
92
+ - `subagent_type: Maul` — Attacks code that passed /review. Looks for exploits in "clean" code.
93
+ - `subagent_type: Deathstroke` — Probes endpoints that /security hardened. Tests if remediations can be bypassed.
94
+ - `subagent_type: Loki` — Chaos-tests features that /qa cleared. What breaks under unexpected conditions?
95
+ - `subagent_type: Constantine` — Hunts cursed code in FIXED areas specifically. Code that only works by accident.
96
+ - `subagent_type: Eowyn` — Final enchantment pass on the polished, hardened product. Where can delight still be added without compromising security or stability?
97
97
 
98
98
  **→ FIX BATCH 3:** Fix all adversarial findings. If any fix is applied, re-run the affected adversarial agent on the fixed area only.
99
99
 
@@ -103,12 +103,12 @@ Use the Agent tool to run all five in parallel — pure adversarial:
103
103
 
104
104
  Use the Agent tool to run all six in parallel:
105
105
 
106
- - `subagent_type: spock-schema` — Did any QA/security/UX fix break code patterns or quality?
107
- - `subagent_type: ahsoka-access-control` — Did any fix introduce access control gaps?
108
- - `subagent_type: nightwing-regression` — Full regression: run the entire test suite. Any failures?
109
- - `subagent_type: samwise-accessibility` — Final accessibility audit on all modified components.
110
- - `subagent_type: padme-data-protection` — Critical path functional verification. Open the app, complete the main task, verify output.
111
- - `subagent_type: troi-prd-compliance` — PRD compliance: read the PRD prose section-by-section, verify every claim against the implementation. Numeric claims, visual treatments, copy accuracy.
106
+ - `subagent_type: Spock` — Did any QA/security/UX fix break code patterns or quality?
107
+ - `subagent_type: Ahsoka` — Did any fix introduce access control gaps?
108
+ - `subagent_type: Nightwing` — Full regression: run the entire test suite. Any failures?
109
+ - `subagent_type: Samwise` — Final accessibility audit on all modified components.
110
+ - `subagent_type: Padme` — Critical path functional verification. Open the app, complete the main task, verify output.
111
+ - `subagent_type: Troi` — PRD compliance: read the PRD prose section-by-section, verify every claim against the implementation. Numeric claims, visual treatments, copy accuracy.
112
112
 
113
113
  If the Council finds issues:
114
114
  1. Fix code discrepancies. Flag asset requirements as BLOCKED.
@@ -8,7 +8,7 @@ Opus scans `git diff --stat` and matches changed files against the `description`
8
8
 
9
9
  **Dispatch control:** `--light` skips dynamic dispatch (core only). `--solo` runs lead agent only.
10
10
 
11
- **Promoted agent:** **Constantine** `subagent_type: constantine-cursed-code` runs on every `/qa` final pass — finds code that works by accident.
11
+ **Promoted agent:** **Constantine** `subagent_type: Constantine` runs on every `/qa` final pass — finds code that works by accident.
12
12
 
13
13
  ## Herald Pre-Scan (ADR-047)
14
14
 
@@ -36,22 +36,26 @@ Before agent deployment, run the Herald to select the optimal roster:
36
36
  2. Create `/logs/phase-09-qa-audit.md` (or appropriate phase log)
37
37
 
38
38
  ## Step 1 — Attack Plan
39
- **Green Lantern** `subagent_type: green-lantern-scenarios` generates the test matrix first — what inputs x what states x what conditions should be tested. Then assign targets:
40
- - **Oracle** `subagent_type: oracle-static-analysis` — Static: critical flows, missing awaits, null checks, type mismatches, race conditions.
41
- - **Red Hood** `subagent_type: red-hood-aggressive` — Dynamic: empty/huge/unicode inputs, network failures, malformed JSON, rapid clicking.
42
- - **Alfred** `subagent_type: alfred-dependencies` — Dependencies: `npm audit`, outdated libs, deprecated APIs, version conflicts.
43
- - **Lucius** `subagent_type: lucius-config` — Config: .env completeness, secrets not in git, prod vs dev mismatches.
44
- - **Deathstroke** `subagent_type: deathstroke-adversarial` — Adversarial: bypass validations, chain interactions, exploit business logic.
45
- - **Constantine** `subagent_type: constantine-cursed-code` — Cursed code: unreachable branches, dead state, impossible conditions, accidental correctness.
46
- - **Cyborg** `subagent_type: cyborg-system-integration` — Integration: trace full data path across 3+ module boundaries, inconsistent response shapes.
47
- - **Raven** `subagent_type: raven-deep-analysis` — Deep analysis: bugs hidden beneath 3 layers of abstraction, logic correct per function but wrong in composition.
48
- - **Wonder Woman** `subagent_type: wonder-woman-truth` — Truth: code that says one thing and does another, misleading names, stale docs.
39
+ **Green Lantern** `subagent_type: Green Lantern` generates the test matrix first — what inputs x what states x what conditions should be tested. Then assign targets:
40
+ - **Oracle** `subagent_type: Oracle` — Static: critical flows, missing awaits, null checks, type mismatches, race conditions.
41
+ - **Red Hood** `subagent_type: Red Hood` — Dynamic: empty/huge/unicode inputs, network failures, malformed JSON, rapid clicking.
42
+ - **Alfred** `subagent_type: Alfred` — Dependencies: `npm audit`, outdated libs, deprecated APIs, version conflicts.
43
+ - **Lucius** `subagent_type: Lucius` — Config: .env completeness, secrets not in git, prod vs dev mismatches.
44
+ - **Deathstroke** `subagent_type: Deathstroke` — Adversarial: bypass validations, chain interactions, exploit business logic.
45
+ - **Constantine** `subagent_type: Constantine` — Cursed code: unreachable branches, dead state, impossible conditions, accidental correctness.
46
+ - **Cyborg** `subagent_type: Cyborg` — Integration: trace full data path across 3+ module boundaries, inconsistent response shapes.
47
+ - **Raven** `subagent_type: Raven` — Deep analysis: bugs hidden beneath 3 layers of abstraction, logic correct per function but wrong in composition.
48
+ - **Wonder Woman** `subagent_type: Wonder Woman` — Truth: code that says one thing and does another, misleading names, stale docs.
49
49
 
50
50
  ## Step 2 — Baseline
51
51
  Get the project running. Verify manually: app starts, primary flow works, auth works (if applicable), data persists, error states display.
52
52
 
53
+ **Dynamic count check:** Grep for hardcoded numeric claims ("263 agents", "37 patterns", etc.) across all pages and data files. Every count that can change between releases must be computed from the source, not hardcoded. (Field report #298.)
54
+
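A minimal sketch of the dynamic count check, assuming page and data-file contents have already been read into strings. The helper name `findHardcodedCounts` and the noun list are hypothetical and not part of the methodology; the sketch only illustrates flagging phrases such as "263 agents" for manual review.

```typescript
// Hypothetical helper: flags numeric claims such as "263 agents" in a text blob.
// The default noun list is an assumption; extend it to match the project's vocabulary.
interface CountClaim {
  line: number;  // 1-based line number within the scanned text
  claim: string; // the matched phrase, e.g. "263 agents"
}

function findHardcodedCounts(
  text: string,
  nouns: string[] = ["agents", "patterns", "commands"],
): CountClaim[] {
  // One or more digits, whitespace, then one of the tracked nouns.
  const pattern = new RegExp(`\\b\\d+\\s+(?:${nouns.join("|")})\\b`, "g");
  const claims: CountClaim[] = [];
  text.split("\n").forEach((content, index) => {
    // With the global flag, match() returns every match on the line, or null.
    const matches = content.match(pattern) ?? [];
    for (const claim of matches) {
      claims.push({ line: index + 1, claim });
    }
  });
  return claims;
}

const sampleText = "Ships with 263 agents.\nNo counts here.\nCovers 37 patterns and 5 leads.";
console.log(findHardcodedCounts(sampleText));
```

Each hit is a candidate only: a reviewer still has to decide whether the number can change between releases and should be computed from the source instead.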
55
+ **Cross-array uniqueness audit:** If the codebase uses multiple data arrays for entity categories (e.g., leadAgents + subAgents), verify no entity appears in more than one array. Duplicates inflate totals. (Field report #298.)
56
+
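The uniqueness audit can be sketched as follows, assuming each category array holds objects with a unique `id` field. The `Entity` shape, the function name `findCrossArrayDuplicates`, and the sample array contents are illustrative assumptions; only the array names `leadAgents` and `subAgents` come from the text above.

```typescript
// Assumed shape: every entity carries a unique string id.
interface Entity {
  id: string;
}

// Returns each id that appears in more than one category, with the categories that list it.
function findCrossArrayDuplicates(
  categories: Record<string, Entity[]>,
): Record<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const category of Object.keys(categories)) {
    for (const entity of categories[category] ?? []) {
      const list = owners.get(entity.id) ?? [];
      // Record each category once, even if the id repeats inside one array.
      if (list.indexOf(category) === -1) list.push(category);
      owners.set(entity.id, list);
    }
  }
  const duplicates: Record<string, string[]> = {};
  owners.forEach((list, id) => {
    if (list.length > 1) duplicates[id] = list;
  });
  return duplicates;
}

// Hypothetical data: "Picard" is listed in both arrays and would inflate the total.
const leadAgents: Entity[] = [{ id: "Kusanagi" }, { id: "Picard" }];
const subAgents: Entity[] = [{ id: "Senku" }, { id: "Picard" }];
console.log(findCrossArrayDuplicates({ leadAgents, subAgents }));
```

An empty result means a total computed by summing the array lengths is not inflated by cross-listed entities.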
53
57
  ## Step 2.5 — Smoke Tests
54
- After build + restart, **Flash** `subagent_type: flash-rapid-test` parallelizes curl commands against the running server for each new or modified feature:
58
+ After build + restart, **Flash** `subagent_type: Flash` parallelizes curl commands against the running server for each new or modified feature:
55
59
  - **Primary user flow:** Execute via curl/fetch against localhost — verify the end-to-end path works
56
60
  - **File uploads:** Upload a file, then fetch the returned URL and verify HTTP 200 + correct content-type
57
61
  - **Form submissions:** Submit valid data (verify 200), then submit invalid/duplicate data (verify error message is specific, not generic)
@@ -62,20 +66,20 @@ This catches integration failures that static code review misses. If the server
62
66
 
63
67
  ## Step 3 — Pass 1: Find Bugs (parallel analysis)
64
68
  Use the Agent tool to run these in parallel — these are read-only analysis tasks:
65
- - **Agent 1** `subagent_type: oracle-static-analysis` — Scan /src/lib/ and /src/app/ for logic flaws, missing awaits, unsafe assumptions.
66
- - **Agent 2** `subagent_type: red-hood-aggressive` — Test all API endpoints with malformed inputs, empty bodies, missing auth.
67
- - **Agent 3** `subagent_type: alfred-dependencies` — Run `npm audit`, check package.json for deprecated/vulnerable packages.
68
- - **Agent 4** `subagent_type: deathstroke-adversarial` — Adversarial probing: bypass validations, chain unexpected interactions, test authorization boundaries.
69
- - **Agent 5** `subagent_type: constantine-cursed-code` — Hunt cursed code: dead branches, impossible conditions, accidental correctness, shadowed variables.
70
- - **Agent 6** `subagent_type: batgirl-detail` — Deep per-module audit: every edge of every form, every boundary of every validation, every regex. Not broad -- *thorough*.
71
- - **Agent 7** `subagent_type: aquaman-deep-dive` — Deep dive on the hardest/largest module (500+ lines or 10+ functions). Exhaustive testing of one complex area.
69
+ - **Agent 1** `subagent_type: Oracle` — Scan /src/lib/ and /src/app/ for logic flaws, missing awaits, unsafe assumptions.
70
+ - **Agent 2** `subagent_type: Red Hood` — Test all API endpoints with malformed inputs, empty bodies, missing auth.
71
+ - **Agent 3** `subagent_type: Alfred` — Run `npm audit`, check package.json for deprecated/vulnerable packages.
72
+ - **Agent 4** `subagent_type: Deathstroke` — Adversarial probing: bypass validations, chain unexpected interactions, test authorization boundaries.
73
+ - **Agent 5** `subagent_type: Constantine` — Hunt cursed code: dead branches, impossible conditions, accidental correctness, shadowed variables.
74
+ - **Agent 6** `subagent_type: Batgirl` — Deep per-module audit: every edge of every form, every boundary of every validation, every regex. Not broad -- *thorough*.
75
+ - **Agent 7** `subagent_type: Aquaman` — Deep dive on the hardest/largest module (500+ lines or 10+ functions). Exhaustive testing of one complex area.
72
76
 
73
77
  Synthesize findings from all agents into a unified list.
74
78
 
75
- **Lucius** `subagent_type: lucius-config` reviews config separately (reads .env files -- sensitive, don't delegate to sub-agent).
79
+ **Lucius** `subagent_type: Lucius` reviews config separately (reads .env files -- sensitive, don't delegate to sub-agent).
76
80
 
77
81
  ## Step 3.5 — Automated Tests
78
- Run `npm test`. Analyze failures. Cross-reference with findings from Step 3. **Huntress** `subagent_type: huntress-flaky-bugs` identifies flaky/non-deterministic tests — race conditions, timing dependencies, order-dependent assertions. For every bug found, ask: "Can this be caught by an automated test?" If yes, write the test.
82
+ Run `npm test`. Analyze failures. Cross-reference with findings from Step 3. **Huntress** `subagent_type: Huntress` identifies flaky/non-deterministic tests — race conditions, timing dependencies, order-dependent assertions. For every bug found, ask: "Can this be caught by an automated test?" If yes, write the test.
79
83
 
80
84
  ## Step 4 — Bug Tracker
81
85
  Log all findings in this format in the phase log:
@@ -88,25 +92,25 @@ Severity: Critical (security/data loss) > High (broken flow) > Medium (degraded)
88
92
  **Confidence scoring is mandatory.** Every finding includes a confidence score (0-100). If confidence is below 60, launch a second agent from a different universe (e.g., if Oracle found it, escalate to Spock or Kenobi) to verify before including. If the second agent disagrees, drop the finding. High-confidence findings (90+) skip re-verification in Step 6.5.
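The thresholds above can be read as a small decision function. This is a sketch under assumptions: the `Decision` labels and the name `triageFinding` are hypothetical, and a finding between 60 and 89 is assumed to be included and re-verified normally in Step 6.5.

```typescript
// Hypothetical labels for the outcomes described in the confidence rule.
type Decision =
  | "needs-second-opinion"   // below 60, no second agent has weighed in yet
  | "drop"                   // below 60 and the second agent disagreed
  | "include"                // included, re-verified as usual in Step 6.5
  | "include-skip-reverify"; // 90+, skips re-verification in Step 6.5

function triageFinding(confidence: number, secondAgentAgrees?: boolean): Decision {
  if (!Number.isFinite(confidence) || confidence < 0 || confidence > 100) {
    throw new RangeError("confidence must be a number from 0 to 100");
  }
  if (confidence >= 90) return "include-skip-reverify";
  if (confidence >= 60) return "include";
  // Below 60: a second agent from a different universe must verify first.
  if (secondAgentAgrees === undefined) return "needs-second-opinion";
  return secondAgentAgrees ? "include" : "drop";
}

console.log(triageFinding(95));        // high confidence
console.log(triageFinding(72));        // ordinary finding
console.log(triageFinding(40));        // awaiting a second agent
console.log(triageFinding(40, false)); // second agent disagreed
```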
89
93
 
90
94
  ## Step 5 — Fix (small batches)
91
- One batch = fixes for one area or severity level. **Green Arrow** `subagent_type: green-arrow-precision` narrows vague findings to exact lines and conditions. After each batch:
95
+ One batch = fixes for one area or severity level. **Green Arrow** `subagent_type: Green Arrow` narrows vague findings to exact lines and conditions. After each batch:
92
96
  1. Re-run `npm test`
93
97
  2. Re-verify affected manual flows
94
98
  3. Update bug tracker in phase log
95
99
  4. Add new test for each fix where applicable
96
100
 
97
101
  ## Step 6 — Harden
98
- Normalize error handling (reference `/docs/patterns/error-handling.ts`). Add guardrails. Improve structured logging. **Superman** `subagent_type: superman-strength-test` verifies the codebase meets its own stated standards — linting clean, type-safe, naming conventions consistent, no unresolved TODOs.
102
+ Normalize error handling (reference `/docs/patterns/error-handling.ts`). Add guardrails. Improve structured logging. **Superman** `subagent_type: Superman` verifies the codebase meets its own stated standards — linting clean, type-safe, naming conventions consistent, no unresolved TODOs.
99
103
 
100
104
  ## Step 6.5 — Pass 2: Re-Verify Fixes
101
105
  After all fixes are applied, run a verification pass:
102
- - **Nightwing** `subagent_type: nightwing-regression` re-runs full test suite, reports any new failures
103
- - **Red Hood** `subagent_type: red-hood-aggressive` re-probes fixed areas — verify fixes hold under adversarial input
104
- - **Deathstroke** `subagent_type: deathstroke-adversarial` re-tests authorization boundaries and business logic exploits that were remediated
106
+ - **Nightwing** `subagent_type: Nightwing` re-runs full test suite, reports any new failures
107
+ - **Red Hood** `subagent_type: Red Hood` re-probes fixed areas — verify fixes hold under adversarial input
108
+ - **Deathstroke** `subagent_type: Deathstroke` re-tests authorization boundaries and business logic exploits that were remediated
105
109
 
106
110
  If Pass 2 finds new issues, fix and re-verify until clean.
107
111
 
108
112
  ## Step 7 — Regression Checklist
109
- **Nightwing** `subagent_type: nightwing-regression` builds the checklist. Template:
113
+ **Nightwing** `subagent_type: Nightwing` builds the checklist. Template:
110
114
 
111
115
  | # | Flow | Steps | Expected | Status |
112
116
  |---|------|-------|----------|--------|
@@ -38,38 +38,38 @@ List all files in scope and their types (API route, service, component, middlewa
38
38
 
39
39
  ## Agent Deployment Manifest
40
40
 
41
- **Lead:** `subagent_type: picard-architecture` — architecture lens, final arbiter
41
+ **Lead:** `subagent_type: Picard` — architecture lens, final arbiter
42
42
  **Core team (always deployed):**
43
- - `subagent_type: spock-schema` — pattern compliance + integration tracing
44
- - `subagent_type: seven-optimization` — code quality, dead code, complexity
45
- - `subagent_type: data-tech-debt` — maintainability, error paths, state flow
43
+ - `subagent_type: Spock` — pattern compliance + integration tracing
44
+ - `subagent_type: Seven` — code quality, dead code, complexity
45
+ - `subagent_type: Data` — maintainability, error paths, state flow
46
46
 
47
47
  **Stark's Marvel team (deployed on backend-heavy reviews):**
48
- - `subagent_type: rogers-api-design` — API design: HTTP semantics, response shapes, REST conventions
49
- - `subagent_type: banner-database` — database: query patterns, N+1, missing indexes
50
- - `subagent_type: strange-service-arch` — service architecture: separation of concerns, logic placement
51
- - `subagent_type: barton-smoke-test` — error handling: try/catch completeness, error propagation
52
- - `subagent_type: romanoff-integrations` — security implications (lightweight — flags for Kenobi)
53
- - `subagent_type: thor-queues` — performance: re-renders, expensive computations, memoization
54
- - `subagent_type: wanda-state` — state management: store design, prop drilling, context boundaries
55
- - `subagent_type: tchalla-quality` — API integration: external service calls, retry logic, fallback
48
+ - `subagent_type: Rogers` — API design: HTTP semantics, response shapes, REST conventions
49
+ - `subagent_type: Banner` — database: query patterns, N+1, missing indexes
50
+ - `subagent_type: Strange` — service architecture: separation of concerns, logic placement
51
+ - `subagent_type: Barton` — error handling: try/catch completeness, error propagation
52
+ - `subagent_type: Romanoff` — security implications (lightweight — flags for Kenobi)
53
+ - `subagent_type: Thor` — performance: re-renders, expensive computations, memoization
54
+ - `subagent_type: Wanda` — state management: store design, prop drilling, context boundaries
55
+ - `subagent_type: T'Challa` — API integration: external service calls, retry logic, fallback
56
56
 
57
57
  **Cross-domain agents (deployed based on content):**
58
- - `subagent_type: nightwing-regression` — auth flow end-to-end: signup→verify→login→protected→logout
59
- - `subagent_type: bilbo-microcopy` — copy audit: error messages, UI text, API descriptions
60
- - `subagent_type: troi-prd-compliance` — PRD compliance: does the code match what the PRD describes?
61
- - `subagent_type: constantine-cursed-code` — cursed code: accidental correctness, tautological checks, shadowed vars
62
- - `subagent_type: samwise-accessibility` — a11y spot-check: keyboard nav and ARIA
58
+ - `subagent_type: Nightwing` — auth flow end-to-end: signup→verify→login→protected→logout
59
+ - `subagent_type: Bilbo` — copy audit: error messages, UI text, API descriptions
60
+ - `subagent_type: Troi` — PRD compliance: does the code match what the PRD describes?
61
+ - `subagent_type: Constantine` — cursed code: accidental correctness, tautological checks, shadowed vars
62
+ - `subagent_type: Samwise` — a11y spot-check: keyboard nav and ARIA
63
63
 
64
64
  ## Step 1 — Parallel Analysis
65
65
  Use the Agent tool to run these in parallel — all are read-only analysis:
66
66
 
67
- - **Agent 1** `subagent_type: spock-schema` — Pattern compliance: check each file against its matching pattern in `/docs/patterns/` (api-route, service, component, middleware, error-handling, job-queue, multi-tenant). **INTEGRATION TRACING (mandatory):** When reviewed code generates URLs, references endpoints, constructs storage keys, or produces data consumed by other modules — read the consuming code to verify compatibility.
68
- - **Agent 2** `subagent_type: seven-optimization` — Code quality: unnecessary complexity, dead code, unused imports, duplicated logic, inconsistent naming, missing types/`any` usage, SRP violations.
69
- - **Agent 3** `subagent_type: data-tech-debt` — Maintainability + error paths + state flow: wrong abstractions, module coupling, missing boundary error handling, hardcoded values, misleading comments.
70
- - **Agent 4** `subagent_type: rogers-api-design` + `banner-database` + `strange-service-arch` — Backend review (if backend code in scope): REST conventions, response shapes, N+1 queries, indexes, separation of concerns.
71
- - **Agent 5** `subagent_type: nightwing-regression` + `constantine-cursed-code` — Cross-domain (if auth or complex logic in scope): auth flow tracing, accidental correctness detection.
72
- - **Agent 6** `subagent_type: bilbo-microcopy` + `troi-prd-compliance` — Copy + PRD (if UI or user-facing code in scope): clear error messages, PRD compliance verification.
67
+ - **Agent 1** `subagent_type: Spock` — Pattern compliance: check each file against its matching pattern in `/docs/patterns/` (api-route, service, component, middleware, error-handling, job-queue, multi-tenant). **INTEGRATION TRACING (mandatory):** When reviewed code generates URLs, references endpoints, constructs storage keys, or produces data consumed by other modules — read the consuming code to verify compatibility.
68
+ - **Agent 2** `subagent_type: Seven` — Code quality: unnecessary complexity, dead code, unused imports, duplicated logic, inconsistent naming, missing types/`any` usage, SRP violations.
69
+ - **Agent 3** `subagent_type: Data` — Maintainability + error paths + state flow: wrong abstractions, module coupling, missing boundary error handling, hardcoded values, misleading comments.
70
+ - **Agent 4** `subagent_type: Rogers` + `banner-database` + `strange-service-arch` — Backend review (if backend code in scope): REST conventions, response shapes, N+1 queries, indexes, separation of concerns.
71
+ - **Agent 5** `subagent_type: Nightwing` + `constantine-cursed-code` — Cross-domain (if auth or complex logic in scope): auth flow tracing, accidental correctness detection.
72
+ - **Agent 6** `subagent_type: Bilbo` + `troi-prd-compliance` — Copy + PRD (if UI or user-facing code in scope): clear error messages, PRD compliance verification.
73
73
 
74
74
  **ROUTE COLLISION CHECK (mandatory for web apps):** When a new router/route file is added, list ALL registered routes (method + path) across ALL routers. Check for duplicate method+path combinations. Frameworks like FastAPI silently shadow duplicate routes — the first registered wins.
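A minimal sketch of the collision check, assuming the registered routes have already been collected into a flat list. The `Route` shape, the `source` field, and `findRouteCollisions` are hypothetical names; the sketch compares literal path strings only, so two routes that differ only in a parameter name (for example `/users/{id}` and `/users/{user_id}`) would need extra normalization to be caught.

```typescript
// Assumed input: one entry per registered route, tagged with the file that registers it.
interface Route {
  method: string;
  path: string;
  source: string; // hypothetical: router or file name, used for reporting
}

interface Collision {
  route: string;     // "METHOD path"
  sources: string[]; // every registration of that method+path, in registration order
}

function findRouteCollisions(routes: Route[]): Collision[] {
  const byKey = new Map<string, string[]>();
  for (const route of routes) {
    // Methods are case-insensitive, so normalize before comparing.
    const key = `${route.method.toUpperCase()} ${route.path}`;
    const sources = byKey.get(key) ?? [];
    sources.push(route.source);
    byKey.set(key, sources);
  }
  const collisions: Collision[] = [];
  byKey.forEach((sources, route) => {
    if (sources.length > 1) collisions.push({ route, sources });
  });
  return collisions;
}

// Hypothetical route table: GET /users is registered by two routers.
const registered: Route[] = [
  { method: "GET", path: "/users", source: "users.ts" },
  { method: "post", path: "/users", source: "users.ts" },
  { method: "get", path: "/users", source: "admin.ts" },
];
console.log(findRouteCollisions(registered));
```

Because `sources` preserves registration order, the first entry is the handler that wins in a framework where the first registered route shadows later ones.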
75
75
 
@@ -116,8 +116,8 @@ Fix "Must Fix" and "Should Fix" items. After each batch:
116
116
 
117
117
  ## Step 3.5 — Re-Verify Fixes
118
118
  After fixes are applied:
119
- - **Spock** `subagent_type: spock-schema` re-checks pattern compliance on modified files
120
- - **Seven** `subagent_type: seven-optimization` confirms no new complexity or dead code introduced by fixes
119
+ - **Spock** `subagent_type: Spock` re-checks pattern compliance on modified files
120
+ - **Seven** `subagent_type: Seven` confirms no new complexity or dead code introduced by fixes
121
121
 
122
122
  If new issues found, fix and re-verify.
123
123
 
@@ -32,26 +32,26 @@ Before agent deployment, run the Herald to select the optimal roster:
32
32
 
33
33
  ### Phase 0.5 — First Strike
34
34
  Before the deep audits, two agents do fast recon:
35
- - **Han** `subagent_type: han-vuln-hunter` — Quick OWASP top 10 scan: finds the obvious vulnerabilities that shouldn't require deep analysis. Shoots first.
36
- - **Cassian** `subagent_type: cassian-recon` — Threat modeling and attack surface mapping: all endpoints, high-value targets, threat model that guides the rest of the audit.
35
+ - **Han** `subagent_type: Han` — Quick OWASP top 10 scan: finds the obvious vulnerabilities that shouldn't require deep analysis. Shoots first.
36
+ - **Cassian** `subagent_type: Cassian` — Threat modeling and attack surface mapping: all endpoints, high-value targets, threat model that guides the rest of the audit.
37
37
 
38
38
  ### Phase 1 — Independent audits (parallel analysis)
39
39
  Use the Agent tool to run these simultaneously — all are read-only analysis:
40
- - **Agent 1** `subagent_type: leia-secrets` — Secrets: scan for hardcoded secrets, verify .env gitignored, check git history for leaked keys, verify different secrets dev/prod.
41
- - **Agent 2** `subagent_type: chewie-dependency-audit` — Dependencies: `npm audit`, critical/high vulns, lock file committed, deprecated packages.
42
- - **Agent 3** `subagent_type: rex-infrastructure` + `bo-katan-perimeter` — Infrastructure + perimeter: security headers (HSTS, CSP, X-Frame-Options, CORS), TLS config, exposed ports/debug endpoints, firewall rules, CORS enforcement.
43
- - **Agent 4** `subagent_type: maul-red-team` — Red team: exploit each endpoint/flow, chain vulnerabilities, test trust boundaries, attempt privilege escalation. **RUNTIME EXPLOITATION (mandatory):** Execute actual attack requests via curl/fetch -- not just theorize.
40
+ - **Agent 1** `subagent_type: Leia` — Secrets: scan for hardcoded secrets, verify .env gitignored, check git history for leaked keys, verify different secrets dev/prod.
41
+ - **Agent 2** `subagent_type: Chewie` — Dependencies: `npm audit`, critical/high vulns, lock file committed, deprecated packages.
42
+ - **Agent 3** `subagent_type: Rex` + `bo-katan-perimeter` — Infrastructure + perimeter: security headers (HSTS, CSP, X-Frame-Options, CORS), TLS config, exposed ports/debug endpoints, firewall rules, CORS enforcement.
43
+ - **Agent 4** `subagent_type: Maul` — Red team: exploit each endpoint/flow, chain vulnerabilities, test trust boundaries, attempt privilege escalation. **RUNTIME EXPLOITATION (mandatory):** Execute actual attack requests via curl/fetch -- not just theorize.
44
44
 
45
45
  ### Phase 2 — Sequential audits (depend on understanding the codebase)
46
46
  These require full codebase context — run sequentially:
47
47
 
48
- - **Yoda** `subagent_type: yoda-auth` — Auth: password hashing (bcrypt >= 12 rounds), session management (httpOnly/secure/sameSite), OAuth (state param, redirect whitelist), reset tokens (single-use, expiring, rate limited). Reference `/docs/patterns/middleware.ts`.
49
- - **Windu** `subagent_type: windu-input-validation` — Input: SQL injection (parameterized queries), XSS (escaped output, CSP), SSRF (URL allowlist), command injection, path traversal.
50
- - **Ahsoka** `subagent_type: ahsoka-access-control` — Access control: IDOR checks, UUIDs not sequential IDs, server-side admin/tier verification, rate limiting. **AUTH CHAIN TRACING (mandatory):** Trace the full chain from middleware registration through service to DB query. Reference `/docs/patterns/multi-tenant.ts`.
51
- - **Padme** `subagent_type: padme-data-protection` — Data protection: PII catalog, PII not in logs/errors/URLs, GDPR deletion, encrypted backups.
52
- - **Qui-Gon** `subagent_type: qui-gon-subtle-vulns` — Subtle vulnerabilities: timing attacks, race conditions in auth flows, logic errors that pass standard checks.
53
- - **Sabine** `subagent_type: sabine-unconventional` — (conditional) Unconventional: supply chain attacks, dependency confusion, prototype pollution, CSP bypass via CDN.
54
- - **Bail Organa** `subagent_type: bail-organa-governance` — (conditional) Governance: GDPR data handling, SOC2 controls, HIPAA mapping.
48
+ - **Yoda** `subagent_type: Yoda` — Auth: password hashing (bcrypt >= 12 rounds), session management (httpOnly/secure/sameSite), OAuth (state param, redirect whitelist), reset tokens (single-use, expiring, rate limited). Reference `/docs/patterns/middleware.ts`.
49
+ - **Windu** `subagent_type: Windu` — Input: SQL injection (parameterized queries), XSS (escaped output, CSP), SSRF (URL allowlist), command injection, path traversal.
50
+ - **Ahsoka** `subagent_type: Ahsoka` — Access control: IDOR checks, UUIDs not sequential IDs, server-side admin/tier verification, rate limiting. **AUTH CHAIN TRACING (mandatory):** Trace the full chain from middleware registration through service to DB query. Reference `/docs/patterns/multi-tenant.ts`.
51
+ - **Padme** `subagent_type: Padme` — Data protection: PII catalog, PII not in logs/errors/URLs, GDPR deletion, encrypted backups.
52
+ - **Qui-Gon** `subagent_type: Qui-Gon` — Subtle vulnerabilities: timing attacks, race conditions in auth flows, logic errors that pass standard checks.
53
+ - **Sabine** `subagent_type: Sabine` — (conditional) Unconventional: supply chain attacks, dependency confusion, prototype pollution, CSP bypass via CDN.
54
+ - **Bail Organa** `subagent_type: Bail Organa` — (conditional) Governance: GDPR data handling, SOC2 controls, HIPAA mapping.
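As a rough illustration of the session-management check in the auth audit above (httpOnly/secure/sameSite), the sketch below inspects a raw `Set-Cookie` header value. The function and interface names are hypothetical, and it only tests for the presence of each attribute, not whether the `SameSite` value is appropriate.

```typescript
// Hypothetical checker: function and field names are assumptions for illustration.
interface CookieFlagReport {
  httpOnly: boolean;
  secure: boolean;
  sameSite: boolean;
}

function checkSessionCookie(setCookie: string): CookieFlagReport {
  // Attributes follow the leading name=value pair, separated by semicolons.
  const attrs = setCookie
    .split(";")
    .slice(1)
    .map((part) => part.trim().toLowerCase());
  return {
    httpOnly: attrs.indexOf("httponly") !== -1,
    secure: attrs.indexOf("secure") !== -1,
    sameSite: attrs.some((attr) => attr.indexOf("samesite=") === 0),
  };
}

console.log(checkSessionCookie("sid=abc123; Path=/; HttpOnly; Secure; SameSite=Lax"));
```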
55
55
 
56
56
  ### Phase 3 — Remediate
57
57
  Write all findings to `/logs/phase-11-security-audit.md` (or appropriate phase log):
@@ -71,9 +71,9 @@ Fix critical and high findings immediately. Medium findings get tracked. For eac
71
71
 
72
72
  ### Phase 4 — Re-Verification
73
73
  After remediations are applied:
74
- - **Maul** `subagent_type: maul-red-team` re-probes all remediated vulnerabilities — verify fixes hold under adversarial conditions. Execute actual HTTP requests against the running server.
75
- - **Anakin** `subagent_type: anakin-dark-side` attempts to bypass remediations using dark-side techniques — JWT algorithm confusion, auth library edge cases, prototype pollution, framework misuse.
76
- - **Din Djarin** `subagent_type: din-djarin-bounty` bounty-hunts for anything Maul and Anakin missed — post-remediation sweep.
74
+ - **Maul** `subagent_type: Maul` re-probes all remediated vulnerabilities — verify fixes hold under adversarial conditions. Execute actual HTTP requests against the running server.
75
+ - **Anakin** `subagent_type: Anakin` attempts to bypass remediations using dark-side techniques — JWT algorithm confusion, auth library edge cases, prototype pollution, framework misuse.
76
+ - **Din Djarin** `subagent_type: Din Djarin` bounty-hunts for anything Maul and Anakin missed — post-remediation sweep.
77
77
 
78
78
  If any agent finds new issues, fix and re-verify until clean.
79
79
 
@@ -29,15 +29,15 @@ Before agent deployment, run the Herald to select the optimal roster:
29
29
  **`--solo`** skips both Herald and all sub-agents — lead agent only.
30
30
 
31
31
  ## Step 0 — Orient
32
- **Oracle** `subagent_type: oracle-static-analysis` orients:
32
+ **Oracle** `subagent_type: Oracle` orients:
33
33
  1. Detect: test framework, test runner, test directory structure, existing coverage
34
34
  2. Run `npm test` to establish baseline — how many tests, how many pass, how many fail
35
35
  3. Document in phase log: framework, runner, config, current state
36
36
 
37
37
  ## Step 1 — Coverage Analysis (parallel)
38
38
  Use the Agent tool to run these in parallel:
39
- - **Agent 1** `subagent_type: oracle-static-analysis` — Gap analysis: scan all source files, check for corresponding test files, identify tested vs missing paths.
40
- - **Agent 2** `subagent_type: alfred-dependencies` — Test infrastructure: review test config, fixtures, factories, mocks, test utilities, test database, shared helpers.
39
+ - **Agent 1** `subagent_type: Oracle` — Gap analysis: scan all source files, check for corresponding test files, identify tested vs missing paths.
40
+ - **Agent 2** `subagent_type: Alfred` — Test infrastructure: review test config, fixtures, factories, mocks, test utilities, test database, shared helpers.
41
41
 
42
42
  Synthesize into a coverage map:
43
43
 
@@ -47,7 +47,7 @@ Synthesize into a coverage map:
47
47
  Priority: Critical path > User-facing > Internal > Utility
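The priority order above could be applied to the coverage map with a simple ranked sort. The tier identifiers and the `UntestedModule` shape below are assumptions made for this sketch; the source only specifies the ordering.

```typescript
// Hypothetical types: the source only gives the ordering, not a data model.
type Tier = "critical-path" | "user-facing" | "internal" | "utility";

interface UntestedModule {
  path: string;
  tier: Tier;
}

// Lower rank means "write tests for this first".
const TIER_RANK: Record<Tier, number> = {
  "critical-path": 0,
  "user-facing": 1,
  "internal": 2,
  "utility": 3,
};

function orderByPriority(modules: UntestedModule[]): UntestedModule[] {
  // Copy before sorting so the caller's array is left untouched.
  return modules.slice().sort((a, b) => TIER_RANK[a.tier] - TIER_RANK[b.tier]);
}

// Example paths are made up for illustration.
const queue = orderByPriority([
  { path: "src/utils/format.ts", tier: "utility" },
  { path: "src/checkout/pay.ts", tier: "critical-path" },
]);
console.log(queue.map((m) => m.path));
```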
48
48
 
49
49
  ## Step 2 — Test Architecture
50
- **Nightwing** `subagent_type: nightwing-regression` reviews existing tests for quality:
50
+ **Nightwing** `subagent_type: Nightwing` reviews existing tests for quality:
51
51
  - Are tests testing behavior or implementation details?
52
52
  - Are tests isolated (no test-order dependency)?
53
53
  - Are assertions specific (not just "doesn't throw")?
@@ -60,7 +60,7 @@ Flag anti-patterns:
60
60
  - Excessive mocking that hides real bugs
61
61
  - Tests coupled to implementation details
62
62
 
63
- ## Step 3 — Write Missing Tests (`subagent_type: batman-qa` leads)
63
+ ## Step 3 — Write Missing Tests (`subagent_type: Batman` leads)
64
64
  Write tests in priority order from Step 1. For each module:
65
65
 
66
66
  1. **Unit tests** for pure business logic (services, utils, validators)
@@ -79,7 +79,7 @@ Write tests in priority order from Step 1. For each module:
79
79
 
80
80
  Work in small batches — write tests for one module, run `npm test`, verify they pass, then move to the next.
81
81
 
82
- ## Step 3.5 — Integration Tests (`subagent_type: oracle-static-analysis`)
82
+ ## Step 3.5 — Integration Tests (`subagent_type: Oracle`)
83
83
  For each new feature, write at least one test that exercises the full cross-module path:
84
84
  - **File handling:** upload file → verify returned URL → fetch URL → verify 200 + correct content-type
85
85
  - **Form save with conflict:** submit with duplicate/conflicting value → verify response includes specific error message (not generic)
@@ -90,7 +90,7 @@ For each new feature, write at least one test that exercises the full cross-modu
90
90
  These can use mocked databases but MUST cross module boundaries — the test should touch at least two modules that would be reviewed by different agents.
91
91
 
92
92
  ## Step 4 — Hardening
93
- **Red Hood** `subagent_type: red-hood-aggressive` writes adversarial tests:
93
+ **Red Hood** `subagent_type: Red Hood` writes adversarial tests:
94
94
  - Boundary values (0, -1, MAX_INT, empty string, null, undefined)
95
95
  - Unicode and special characters in all string inputs
96
96
  - Concurrent operations (race conditions, double-submit)
@@ -7,13 +7,13 @@ Read `/docs/methods/HEARTBEAT.md` for daemon architecture.
7
7
 
8
8
  ## Agent Deployment Manifest
9
9
 
10
- **Lead:** Dockson (`subagent_type: dockson-treasury`)
10
+ **Lead:** Dockson (`subagent_type: Dockson`)
11
11
  **Core team:**
12
- - **Steris** (`subagent_type: steris-budget`) — budget allocation, forecasting, contingency plans
13
- - **Vin** (`subagent_type: vin-analytics`) — revenue analytics, attribution, pattern detection
14
- - **Szeth** (`subagent_type: szeth-compliance`) — financial compliance, tax records, platform ToS
15
- - **Breeze** (`subagent_type: breeze-platform-relations`) — platform relations, API credentials, OAuth management
16
- - **Wax** (`subagent_type: wax-paid-ads`) — spend execution, campaign budget management
12
+ - **Steris** (`subagent_type: Steris`) — budget allocation, forecasting, contingency plans
13
+ - **Vin** (`subagent_type: Vin`) — revenue analytics, attribution, pattern detection
14
+ - **Szeth** (`subagent_type: Szeth`) — financial compliance, tax records, platform ToS
15
+ - **Breeze** (`subagent_type: Breeze`) — platform relations, API credentials, OAuth management
16
+ - **Wax** (`subagent_type: Wax`) — spend execution, campaign budget management
17
17
 
18
18
  ## Prerequisites
19
19
 
@@ -32,11 +32,13 @@ Before agent deployment, run the Herald to select the optimal roster:
32
32
  Detect: framework, styling system, component library, routing, state management.
33
33
  Document in phase log: "How to run", key routes, where components/styles/copy live.
34
34
 
35
+ **Screenshot mandate (MANDATORY):** If the app is runnable, start the server, take screenshots of EVERY page via Playwright or browser, and READ them via the Read tool. Without screenshots, the review is code-reading — not visual verification. Take at desktop (1440x900), plus 375px and 768px for responsive proof-of-life.
36
+
35
37
  ## Step 1 — Product Surface Map
36
38
  List every screen/route, primary user journeys, key shared components, and the state taxonomy (loading/empty/error/success/partial/unauthorized). Write to phase log.
37
39
 
38
40
  ## Step 1.75 — Enchantment Review
39
- Before the auditors begin, **Eowyn** `subagent_type: eowyn-delight` dreams. Read the PRD's brand personality section. Walk through each primary flow and ask:
41
+ Before the auditors begin, **Eowyn** `subagent_type: Eowyn` dreams. Read the PRD's brand personality section. Walk through each primary flow and ask:
40
42
  - Where could this surprise and delight?
41
43
  - Where does functionality need warmth?
42
44
  - Do transitions breathe or just appear? (200ms ease-out minimum for panels, modals, state changes)
@@ -53,24 +55,24 @@ See `PRODUCT_DESIGN_FRONTEND.md` Step 1.75 for full Éowyn protocol.
53
55
 
54
56
  ## Step 2 — Parallel Analysis
55
57
  Use the Agent tool to run these simultaneously — all are read-only analysis:
56
- - **Agent 1** `subagent_type: elrond-ux-strategy` — UX: information architecture, navigation, task flows, friction points, discoverability, flow intuitiveness.
57
- - **Agent 2** `subagent_type: arwen-ui-polish` — Visual: spacing, typography, color usage, button hierarchy, visual consistency.
58
- - **Agent 3** `subagent_type: samwise-accessibility` — A11y: keyboard navigation, focus management, ARIA labels, color contrast, reduced motion. Keyboard-only testing.
59
- - **Agent 4** `subagent_type: celeborn-design-system` — Design system: spacing token consistency, typography scale, palette adherence, component naming conventions.
58
+ - **Agent 1** `subagent_type: Elrond` — UX: information architecture, navigation, task flows, friction points, discoverability, flow intuitiveness.
59
+ - **Agent 2** `subagent_type: Arwen` — Visual: spacing, typography, color usage, button hierarchy, visual consistency.
60
+ - **Agent 3** `subagent_type: Samwise` — A11y: keyboard navigation, focus management, ARIA labels, color contrast, reduced motion. Keyboard-only testing.
61
+ - **Agent 4** `subagent_type: Celeborn` — Design system: spacing token consistency, typography scale, palette adherence, component naming conventions.
60
62
 
61
- **Aragorn** `subagent_type: aragorn-orchestration` orchestrates when multiple findings conflict — prioritizes which matter most for users.
63
+ **Aragorn** `subagent_type: Aragorn` orchestrates when multiple findings conflict — prioritizes which matter most for users.
62
64
 
63
65
  Synthesize findings from all agents.
64
66
 
65
67
  ## Step 3 — Sequential Reviews
66
68
  These require interactive testing:
67
69
 
68
- - **Bilbo** `subagent_type: bilbo-microcopy` — Copy: all microcopy (labels, buttons, error messages, empty states, confirmations, destructive warnings). Clear and consistent?
69
- - **Pippin** `subagent_type: pippin-discovery` — Edge cases: resize to 320px, paste emoji in search, click back mid-flow, two tabs, light/dark toggle mid-animation.
70
- - **Frodo** `subagent_type: frodo-critical-path` — (conditional) Hardest flow: dedicated attention on the single most critical + complex flow. Skip if no single flow dominates.
71
- - **Legolas** `subagent_type: legolas-precision` — Code: component architecture, semantic HTML, CSS organization, state management. Reference `/docs/patterns/component.tsx`.
72
- - **Gimli** `subagent_type: gimli-performance` — Performance: loading states, skeleton screens, layout shift, optimistic UI, mobile responsiveness, touch targets (min 44px).
73
- - **Radagast** `subagent_type: radagast-edge-cases` — Edge cases + error states: empty/huge/unicode inputs, broken states, dangerous actions without confirmation, validation gaps.
70
+ - **Bilbo** `subagent_type: Bilbo` — Copy: all microcopy (labels, buttons, error messages, empty states, confirmations, destructive warnings). Clear and consistent?
71
+ - **Pippin** `subagent_type: Pippin` — Edge cases: resize to 320px, paste emoji in search, click back mid-flow, two tabs, light/dark toggle mid-animation.
72
+ - **Frodo** `subagent_type: Frodo` — (conditional) Hardest flow: dedicated attention on the single most critical + complex flow. Skip if no single flow dominates.
73
+ - **Legolas** `subagent_type: Legolas` — Code: component architecture, semantic HTML, CSS organization, state management. Reference `/docs/patterns/component.tsx`.
74
+ - **Gimli** `subagent_type: Gimli` — Performance: loading states, skeleton screens, layout shift, optimistic UI, mobile responsiveness, touch targets (min 44px).
75
+ - **Radagast** `subagent_type: Radagast` — Edge cases + error states: empty/huge/unicode inputs, broken states, dangerous actions without confirmation, validation gaps.
74
76
 
75
77
  **ERROR STATE TESTING (mandatory):** For every form/action in the UI:
76
78
  - Submit with intentionally invalid data (duplicate name, wrong format, missing required field)
@@ -90,10 +92,10 @@ Categories: UX, Visual, A11y, Copy, Performance, Edge Case
90
92
  **Confidence scoring is mandatory.** Every finding includes a confidence score (0-100). If confidence is below 60, escalate to a second agent from a different universe (e.g., if Samwise found it, escalate to Padmé or Nightwing) to verify before including. If the second agent disagrees, drop the finding. High-confidence findings (90+) skip re-verification in Step 7.5.
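The thresholds in the confidence rule above can be written down as a small triage function. This is a sketch under the stated thresholds only; the names `triageFinding` and `resolveEscalation` are hypothetical and do not come from VoidForge.

```typescript
// Hypothetical names; the thresholds (below 60, 90+) come from the rule above.
type Triage = "escalate" | "include" | "skip-reverify";

function triageFinding(confidence: number): Triage {
  if (confidence < 0 || confidence > 100) {
    throw new RangeError("confidence must be between 0 and 100");
  }
  if (confidence < 60) return "escalate"; // needs a second agent from a different universe
  if (confidence >= 90) return "skip-reverify"; // included, skips Step 7.5 re-verification
  return "include"; // included and re-verified as usual
}

function resolveEscalation(secondAgentAgrees: boolean): "include" | "drop" {
  // If the second agent disagrees, the finding is dropped.
  return secondAgentAgrees ? "include" : "drop";
}

console.log(triageFinding(45), resolveEscalation(false));
```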
91
93
 
92
94
  ## Step 5 — Enhancement Specs (before coding)
93
- For each fix: problem statement, proposed solution, acceptance criteria, a11y requirements (**Samwise** `subagent_type: samwise-accessibility` signs off), copy (**Bilbo** `subagent_type: bilbo-microcopy` signs off). **Faramir** `subagent_type: faramir-judgment` checks whether polish effort targets the right screens — high-traffic core flows, not low-traffic edge pages.
95
+ For each fix: problem statement, proposed solution, acceptance criteria, a11y requirements (**Samwise** `subagent_type: Samwise` signs off), copy (**Bilbo** `subagent_type: Bilbo` signs off). **Faramir** `subagent_type: Faramir` checks whether polish effort targets the right screens — high-traffic core flows, not low-traffic edge pages.
94
96
 
95
97
  ## Step 6 — Implement (small batches)
96
- One batch = one flow or component cluster (max ~200 lines changed). **Boromir** `subagent_type: boromir-hubris` checks: is the polish overengineered? Too many animations? Does complexity hurt performance? **Glorfindel** `subagent_type: glorfindel-rendering` handles the hardest rendering (canvas, WebGL, SVG -- conditional, only if the project has visual complexity). After each batch:
98
+ One batch = one flow or component cluster (max ~200 lines changed). **Boromir** `subagent_type: Boromir` checks: is the polish overengineered? Too many animations? Does complexity hurt performance? **Glorfindel** `subagent_type: Glorfindel` handles the hardest rendering (canvas, WebGL, SVG -- conditional, only if the project has visual complexity). After each batch:
97
99
  1. Re-run the app
98
100
  2. Re-walk the affected flow
99
101
  3. Test keyboard navigation
@@ -101,7 +103,7 @@ One batch = one flow or component cluster (max ~200 lines changed). **Boromir**
101
103
  5. Run `npm test` to catch regressions
102
104
 
103
105
  ## Step 7 — Harden Design System
104
- **Arwen** `subagent_type: arwen-ui-polish` leads. **Haldir** `subagent_type: haldir-boundaries` checks transitions between pages, states, and components — loading->success, error->retry, navigate->return. Are they smooth or jarring? Audit shared components (buttons, inputs, cards, modals, toasts) for:
106
+ **Arwen** `subagent_type: Arwen` leads. **Haldir** `subagent_type: Haldir` checks transitions between pages, states, and components — loading->success, error->retry, navigate->return. Are they smooth or jarring? Audit shared components (buttons, inputs, cards, modals, toasts) for:
105
107
  - Consistent variants (primary, secondary, danger, ghost)
106
108
  - Responsive behavior
107
109
  - Keyboard focus styles
@@ -109,9 +111,9 @@ One batch = one flow or component cluster (max ~200 lines changed). **Boromir**
109
111
 
110
112
  ## Step 7.5 — Pass 2: Re-Verify Fixes
111
113
  After all fixes are applied, run a verification pass:
112
- - **Samwise** `subagent_type: samwise-accessibility` re-audits accessibility on all modified components — verify a11y fixes didn't break other a11y properties
113
- - **Radagast** `subagent_type: radagast-edge-cases` re-checks edge cases on fixed flows — verify fixes hold under adversarial input
114
- - **Merry** `subagent_type: merry-pair-review` pair-verifies Pippin's edge case resolutions — one found it, the other confirms the fix
114
+ - **Samwise** `subagent_type: Samwise` re-audits accessibility on all modified components — verify a11y fixes didn't break other a11y properties
115
+ - **Radagast** `subagent_type: Radagast` re-checks edge cases on fixed flows — verify fixes hold under adversarial input
116
+ - **Merry** `subagent_type: Merry` pair-verifies Pippin's edge case resolutions — one found it, the other confirms the fix
115
117
 
116
118
  If Pass 2 finds new issues, fix and re-verify until Samwise, Radagast, and Merry sign off.
117
119
 
package/CHANGELOG.md CHANGED
@@ -6,6 +6,22 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
6
6
 
7
7
  ---
8
8
 
9
+ ## [23.5.4] - 2026-04-12
10
+
11
+ ### Fixed
12
+ - **3 command-doc sync gaps** — build.md now includes Phase 12.75 (distribution verification gate), ux.md now includes screenshot mandate, qa.md now includes dynamic count check + cross-array uniqueness audit
13
+ - **ROADMAP.md version** — updated from v23.5.0 to v23.5.3
14
+
15
+ ---
16
+
17
+ ## [23.5.3] - 2026-04-12
18
+
19
+ ### Fixed
20
+ - **All 201 `subagent_type` references used wrong format** — commands referenced agents by filename ID (`picard-architecture`) but Claude Code expects the YAML name field (`Picard`). Every agent reference in every command was broken. Fixed across 15 command files.
21
+ - **"What's Next" recommended `/build` instead of `/campaign`** — new projects should start with `/campaign` (reads PRD, sequences missions, deploys full agent teams) not `/build` (manual single-batch mode). Updated wizard UI and CLAUDE.md.
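To illustrate the `subagent_type` fix described in the first bullet, the sketch below rewrites filename-style IDs to display names in a command file's text. It is not the tool the maintainers used: the `AGENT_NAMES` table is a hand-written stand-in for names that would really be read from each agent file's YAML name field, and `rewriteSubagentTypes` is a hypothetical helper.

```typescript
// Hypothetical registry: a stand-in for names read from each agent's YAML name field.
const AGENT_NAMES: Record<string, string> = {
  "picard-architecture": "Picard",
  "maul-red-team": "Maul",
  "din-djarin-bounty": "Din Djarin",
};

function rewriteSubagentTypes(markdown: string, names: Record<string, string>): string {
  // Match `subagent_type: some-id` and capture the lowercase, hyphenated ID.
  const pattern = /subagent_type: ([a-z0-9]+(?:-[a-z0-9]+)*)/g;
  return markdown.replace(pattern, (match: string, id: string) => {
    // Leave unknown IDs untouched so they can be flagged for manual review.
    if (!Object.prototype.hasOwnProperty.call(names, id)) return match;
    return "subagent_type: " + names[id];
  });
}

const before = "- **Maul** `subagent_type: maul-red-team` re-probes all remediated vulnerabilities";
console.log(rewriteSubagentTypes(before, AGENT_NAMES));
```

Because the pattern only matches lowercase IDs, references that already use a capitalized name are left alone, so the rewrite can be run repeatedly without changing the result.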
22
+
23
+ ---
24
+
9
25
  ## [23.5.2] - 2026-04-12
10
26
 
11
27
  ### Fixed
package/CLAUDE.md CHANGED
@@ -255,4 +255,4 @@ The agents, characters, and personality are VoidForge's identity — they ship i
255
255
 
256
256
  ## How to Build
257
257
 
258
- Read the PRD. Run `/build`. Or see `/docs/methods/BUILD_PROTOCOL.md`.
258
+ Read the PRD. Run `/campaign` to build the entire PRD mission by mission. For a single feature, use `/assemble`. For manual batch control, use `/build`.
package/VERSION.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Version
2
2
 
3
- **Current:** 23.5.2
3
+ **Current:** 23.5.4
4
4
 
5
5
  ## Versioning Scheme
6
6
 
@@ -14,6 +14,8 @@ This project uses [Semantic Versioning](https://semver.org/):
14
14
 
15
15
  | Version | Date | Summary |
16
16
  |---------|------|---------|
17
+ | 23.5.4 | 2026-04-12 | Command-doc sync: build.md Phase 12.75, ux.md screenshots, qa.md dynamic counts |
18
+ | 23.5.3 | 2026-04-12 | Fix 201 broken subagent_type refs (filename→YAML name) + /campaign as default start command |
17
19
  | 23.5.2 | 2026-04-12 | /void auto-cleanup ~/.claude/ duplicates + git init stack trace fix |
18
20
  | 23.5.1 | 2026-04-12 | Fix CLI self-upgrade: wrong package name (voidforge → thevoidforge) + stale npx cache on re-exec |
19
21
  | 23.5.0 | 2026-04-12 | The Herald — intelligent agent dispatch: Haiku pre-scan, agent registry, 40 tags, --focus flag, 14 commands wired. ADR-047. Campaign 37. |
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "thevoidforge-methodology",
3
- "version": "23.5.2",
3
+ "version": "23.5.4",
4
4
  "description": "VoidForge methodology — agents, commands, methods, patterns.",
5
5
  "license": "MIT",
6
6
  "files": [