npm - @qball-inc/the-bulwark - Versions diffs - 1.0.1 → 1.2.0 - Mend

@qball-inc/the-bulwark 1.0.1 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (232)

package/.claude-plugin/plugin.json +2 -3
package/.gitattributes +48 -0
package/CHANGELOG.md +121 -0
package/LICENSE +21 -0
package/README.md +426 -368
package/agents/bulwark-fix-validator.md +643 -633
package/agents/bulwark-implementer.md +407 -391
package/agents/bulwark-issue-analyzer.md +310 -308
package/agents/bulwark-standards-reviewer.md +305 -221
package/agents/plan-creation-architect.md +325 -323
package/agents/plan-creation-eng-lead.md +354 -352
package/agents/plan-creation-po.md +302 -300
package/agents/plan-creation-qa-critic.md +336 -334
package/agents/product-ideation-competitive-analyzer.md +2 -0
package/agents/product-ideation-idea-validator.md +2 -0
package/agents/product-ideation-market-researcher.md +2 -0
package/agents/product-ideation-pattern-documenter.md +2 -0
package/agents/product-ideation-segment-analyzer.md +2 -0
package/agents/product-ideation-strategist.md +2 -0
package/agents/statusline-setup.md +99 -97
package/hooks/hooks.json +30 -1
package/package.json +6 -5
package/scripts/apply-section.sh +243 -0
package/scripts/hooks/check-template-drift.sh +191 -0
package/scripts/hooks/cleanup-review-registry.sh +106 -0
package/scripts/hooks/cleanup-stale.sh +19 -2
package/scripts/hooks/enforce-quality.sh +72 -23
package/scripts/hooks/lib/coverage_check.py +513 -0
package/scripts/hooks/suggest-pipeline-stop.sh +234 -0
package/scripts/hooks/suggest-pipeline.sh +12 -0
package/scripts/init.sh +64 -0
package/scripts/install-bun.sh +327 -0
package/scripts/install-just.sh +404 -0
package/scripts/toolchain-smoke-run.sh +219 -0
package/scripts/update.sh +342 -0
package/skills/anthropic-validator/SKILL.md +497 -607
package/skills/anthropic-validator/references/agents-checklist.md +144 -131
package/skills/anthropic-validator/references/agents-validation.md +90 -0
package/skills/anthropic-validator/references/commands-checklist.md +102 -102
package/skills/anthropic-validator/references/commands-validation.md +42 -0
package/skills/anthropic-validator/references/hooks-checklist.md +160 -151
package/skills/anthropic-validator/references/hooks-validation.md +82 -0
package/skills/anthropic-validator/references/mcp-checklist.md +136 -136
package/skills/anthropic-validator/references/mcp-validation.md +39 -0
package/skills/anthropic-validator/references/plugins-checklist.md +154 -148
package/skills/anthropic-validator/references/plugins-validation.md +68 -0
package/skills/anthropic-validator/references/skills-checklist.md +105 -85
package/skills/anthropic-validator/references/skills-validation.md +79 -0
package/skills/assertion-patterns/SKILL.md +298 -296
package/skills/bug-magnet-data/SKILL.md +286 -284
package/skills/bug-magnet-data/context/cli-args.md +91 -91
package/skills/bug-magnet-data/context/db-query.md +104 -104
package/skills/bug-magnet-data/context/file-contents.md +103 -103
package/skills/bug-magnet-data/context/http-body.md +91 -91
package/skills/bug-magnet-data/context/process-spawn.md +123 -123
package/skills/bug-magnet-data/data/booleans/boundaries.yaml +143 -143
package/skills/bug-magnet-data/data/collections/arrays.yaml +114 -114
package/skills/bug-magnet-data/data/collections/objects.yaml +123 -123
package/skills/bug-magnet-data/data/concurrency/race-conditions.yaml +118 -118
package/skills/bug-magnet-data/data/concurrency/state-machines.yaml +115 -115
package/skills/bug-magnet-data/data/dates/boundaries.yaml +137 -137
package/skills/bug-magnet-data/data/dates/invalid.yaml +132 -132
package/skills/bug-magnet-data/data/dates/timezone.yaml +118 -118
package/skills/bug-magnet-data/data/encoding/charset.yaml +79 -79
package/skills/bug-magnet-data/data/encoding/normalization.yaml +105 -105
package/skills/bug-magnet-data/data/formats/email.yaml +154 -154
package/skills/bug-magnet-data/data/formats/json.yaml +187 -187
package/skills/bug-magnet-data/data/formats/url.yaml +165 -165
package/skills/bug-magnet-data/data/language-specific/javascript.yaml +182 -182
package/skills/bug-magnet-data/data/language-specific/python.yaml +174 -174
package/skills/bug-magnet-data/data/language-specific/rust.yaml +148 -148
package/skills/bug-magnet-data/data/numbers/boundaries.yaml +161 -161
package/skills/bug-magnet-data/data/numbers/precision.yaml +89 -89
package/skills/bug-magnet-data/data/numbers/special.yaml +69 -69
package/skills/bug-magnet-data/data/strings/boundaries.yaml +109 -109
package/skills/bug-magnet-data/data/strings/injection.yaml +208 -208
package/skills/bug-magnet-data/data/strings/special-chars.yaml +190 -190
package/skills/bug-magnet-data/data/strings/unicode.yaml +139 -139
package/skills/bug-magnet-data/references/external-lists.md +115 -115
package/skills/bulwark-brainstorm/SKILL.md +566 -563
package/skills/bulwark-brainstorm/references/at-teammate-prompts.md +95 -60
package/skills/bulwark-brainstorm/references/role-critical-analyst.md +78 -78
package/skills/bulwark-brainstorm/references/role-development-lead.md +66 -66
package/skills/bulwark-brainstorm/references/role-product-delivery-lead.md +79 -79
package/skills/bulwark-brainstorm/references/role-product-manager.md +62 -62
package/skills/bulwark-brainstorm/references/role-project-sme.md +59 -59
package/skills/bulwark-brainstorm/references/role-technical-architect.md +66 -66
package/skills/bulwark-research/SKILL.md +300 -298
package/skills/bulwark-research/references/viewpoint-contrarian.md +63 -63
package/skills/bulwark-research/references/viewpoint-direct-investigation.md +62 -62
package/skills/bulwark-research/references/viewpoint-first-principles.md +65 -65
package/skills/bulwark-research/references/viewpoint-practitioner.md +62 -62
package/skills/bulwark-research/references/viewpoint-prior-art.md +66 -66
package/skills/bulwark-scaffold/SKILL.md +483 -330
package/skills/bulwark-statusline/SKILL.md +166 -161
package/skills/bulwark-statusline/scripts/statusline.sh +1 -1
package/skills/bulwark-verify/SKILL.md +532 -519
package/skills/code-review/SKILL.md +488 -428
package/skills/code-review/examples/anti-patterns/linting.ts +181 -181
package/skills/code-review/examples/anti-patterns/security.ts +91 -91
package/skills/code-review/examples/anti-patterns/standards.ts +195 -195
package/skills/code-review/examples/anti-patterns/type-safety.ts +108 -108
package/skills/code-review/examples/recommended/linting.ts +195 -195
package/skills/code-review/examples/recommended/security.ts +154 -154
package/skills/code-review/examples/recommended/standards.ts +231 -231
package/skills/code-review/examples/recommended/type-safety.ts +181 -181
package/skills/code-review/frameworks/angular.md +218 -218
package/skills/code-review/frameworks/django.md +235 -235
package/skills/code-review/frameworks/express.md +207 -207
package/skills/code-review/frameworks/fastapi.md +326 -0
package/skills/code-review/frameworks/flask.md +298 -298
package/skills/code-review/frameworks/generic.md +146 -146
package/skills/code-review/frameworks/react.md +152 -152
package/skills/code-review/frameworks/vue.md +244 -244
package/skills/code-review/references/linting-patterns.md +221 -221
package/skills/code-review/references/security-patterns.md +125 -125
package/skills/code-review/references/standards-patterns.md +246 -246
package/skills/code-review/references/type-safety-patterns.md +130 -130
package/skills/component-patterns/SKILL.md +133 -131
package/skills/component-patterns/references/pattern-cli-command.md +118 -118
package/skills/component-patterns/references/pattern-database.md +166 -166
package/skills/component-patterns/references/pattern-external-api.md +139 -139
package/skills/component-patterns/references/pattern-file-parser.md +168 -168
package/skills/component-patterns/references/pattern-http-server.md +162 -162
package/skills/component-patterns/references/pattern-process-spawner.md +133 -133
package/skills/continuous-feedback/SKILL.md +329 -327
package/skills/continuous-feedback/references/collect-instructions.md +81 -81
package/skills/continuous-feedback/references/specialize-code-review.md +82 -82
package/skills/continuous-feedback/references/specialize-general.md +98 -98
package/skills/continuous-feedback/references/specialize-test-audit.md +81 -81
package/skills/create-skill/SKILL.md +550 -359
package/skills/create-skill/agents/skill-eval-comparator.md +158 -0
package/skills/create-skill/agents/skill-eval-grader.md +168 -0
package/skills/create-skill/references/agent-conventions.md +194 -194
package/skills/create-skill/references/agent-template.md +195 -195
package/skills/create-skill/references/content-guidance.md +541 -291
package/skills/create-skill/references/decision-framework.md +232 -124
package/skills/create-skill/references/eval-scaffolding.md +468 -0
package/skills/create-skill/references/eval-shape.md +383 -0
package/skills/create-skill/references/scripts-conventions.md +142 -0
package/skills/create-skill/references/template-generator.md +183 -0
package/skills/create-skill/references/template-inversion.md +269 -0
package/skills/create-skill/references/template-pipeline.md +248 -217
package/skills/create-skill/references/template-research.md +234 -210
package/skills/create-skill/references/template-reviewer.md +231 -0
package/skills/create-skill/references/template-script-driven.md +185 -172
package/skills/create-skill/references/template-tool-wrapper.md +199 -0
package/skills/create-skill/scripts/check-description.ts +238 -0
package/skills/create-skill/scripts/check-skill-size.ts +201 -0
package/skills/create-skill/scripts/grade.ts +855 -0
package/skills/create-skill/scripts/run-loop.ts +297 -0
package/skills/create-subagent/SKILL.md +355 -353
package/skills/create-subagent/references/agent-conventions.md +268 -268
package/skills/create-subagent/references/content-guidance.md +232 -232
package/skills/create-subagent/references/decision-framework.md +134 -134
package/skills/create-subagent/references/template-single-agent.md +194 -192
package/skills/fix-bug/SKILL.md +243 -241
package/skills/governance-protocol/SKILL.md +118 -116
package/skills/init/SKILL.md +519 -341
package/skills/init/references/update-askuser-prompts.md +198 -0
package/skills/init/references/update-mode.md +305 -0
package/skills/init/references/update-section-anchor-diff.md +163 -0
package/skills/issue-debugging/SKILL.md +387 -385
package/skills/issue-debugging/references/anti-patterns.md +245 -245
package/skills/issue-debugging/references/debug-report-schema.md +227 -227
package/skills/mock-detection/SKILL.md +528 -511
package/skills/mock-detection/references/false-positive-prevention.md +402 -402
package/skills/mock-detection/references/stub-patterns.md +236 -236
package/skills/pipeline-templates/SKILL.md +262 -215
package/skills/pipeline-templates/references/code-change-workflow.md +277 -277
package/skills/pipeline-templates/references/code-review.md +348 -336
package/skills/pipeline-templates/references/fix-validation.md +421 -421
package/skills/pipeline-templates/references/new-feature.md +335 -335
package/skills/pipeline-templates/references/research-brainstorm.md +161 -161
package/skills/pipeline-templates/references/research-planning.md +257 -257
package/skills/pipeline-templates/references/test-audit.md +389 -389
package/skills/pipeline-templates/references/test-execution-fix.md +238 -238
package/skills/plan-creation/SKILL.md +531 -497
package/skills/plan-to-tasks/SKILL.md +151 -0
package/skills/plan-to-tasks/references/askuserquestion-prompts.md +75 -0
package/skills/plan-to-tasks/references/transform.md +253 -0
package/skills/product-ideation/SKILL.md +2 -0
package/skills/session-handoff/SKILL.md +167 -139
package/skills/session-handoff/references/examples.md +223 -223
package/skills/setup-lsp/SKILL.md +314 -312
package/skills/setup-lsp/references/server-registry.md +85 -85
package/skills/setup-lsp/references/troubleshooting.md +135 -135
package/skills/spec-drift-check/SKILL.md +287 -0
package/skills/spec-drift-check/evals/evals.json +33 -0
package/skills/spec-drift-check/evals/triggers.json +19 -0
package/skills/spec-drift-check/examples/clean-spec.md +52 -0
package/skills/spec-drift-check/examples/expected-output-clean.yaml +96 -0
package/skills/spec-drift-check/examples/expected-output-high-drift.yaml +78 -0
package/skills/spec-drift-check/examples/expected-output-low-drift.yaml +67 -0
package/skills/spec-drift-check/examples/high-drift-spec.md +49 -0
package/skills/spec-drift-check/examples/low-drift-spec.md +39 -0
package/skills/spec-drift-check/references/anti-patterns.md +65 -0
package/skills/spec-drift-check/references/output-template.md +142 -0
package/skills/spec-drift-check/references/step-1-claim-extraction.md +147 -0
package/skills/spec-drift-check/references/step-2-verification-methods.md +203 -0
package/skills/spec-drift-check/references/step-3-categorization.md +105 -0
package/skills/spec-drift-check/references/step-4-plan-adjustment.md +122 -0
package/skills/spec-drift-check/references/step-5-log-template.md +220 -0
package/skills/spec-drift-check/references/step-6-decision-matrix.md +136 -0
package/skills/subagent-output-templating/SKILL.md +417 -415
package/skills/subagent-output-templating/references/examples.md +440 -440
package/skills/subagent-prompting/SKILL.md +366 -364
package/skills/subagent-prompting/references/examples.md +342 -342
package/skills/test-audit/SKILL.md +545 -531
package/skills/test-audit/references/known-limitations.md +41 -41
package/skills/test-audit/references/priority-classification.md +30 -30
package/skills/test-audit/references/prompts/deep-mode-detection.md +83 -83
package/skills/test-audit/references/prompts/synthesis.md +58 -57
package/skills/test-audit/references/rewrite-instructions.md +46 -46
package/skills/test-audit/references/schemas/audit-output.yaml +131 -100
package/skills/test-audit/references/schemas/diagnostic-output.yaml +56 -49
package/skills/test-audit/references/two-gate-logic.md +43 -0
package/skills/test-audit/scripts/data-flow-analyzer.ts +508 -509
package/skills/test-audit/scripts/integration-mock-detector.ts +462 -462
package/skills/test-audit/scripts/skip-detector.ts +211 -211
package/skills/test-audit/scripts/verification-counter.ts +295 -295
package/skills/test-classification/SKILL.md +326 -310
package/skills/test-fixture-creation/SKILL.md +297 -295
package/Infographics/01_product-ideation.png +0 -0
package/Infographics/02_feature-research.png +0 -0
package/Infographics/03_brainstorm.png +0 -0
package/Infographics/04_plan-creation.png +0 -0
package/Infographics/05_code-review.png +0 -0
package/Infographics/06_test-audit.png +0 -0
package/Infographics/07_fix-bug.png +0 -0
package/skills/create-skill/references/template-reference-heavy.md +0 -111
package/skills/create-skill/references/template-simple.md +0 -80

package/skills/bulwark-brainstorm/references/at-teammate-prompts.md CHANGED

@@ -1,60 +1,95 @@
-# AT Teammate Prompt Structure (--exploratory mode)
-This reference defines the mandatory prompt sections for Agent Teams teammates in `--exploratory` mode. Load this file at Stage 3B only.
----
-## Prompt Sections
-Each teammate prompt MUST include these sections:
-**1. Role instructions** — from the corresponding `references/role-*.md` file
-**2. Input context** — problem statement, research synthesis (if available), SME output
-**3. Dual-Output Contract (SA2 — MANDATORY in every teammate prompt):**
-> You MUST produce TWO outputs:
->
-> **Output 1 — Full analysis (SA2 artifact):** Write your complete analysis to `$PROJECT_DIR/logs/brainstorm/{topic-slug}/{NN}-{role-slug}.md` using the output template provided. This is the permanent record.
->
-> **Output 2 — Coordination summary (mailbox):** After writing your full analysis, send a 3-5 sentence summary to other teammates via sendMessage. Include: your recommendation (proceed/modify/defer/kill), your top finding, and your strongest concern.
-**4. Peer Debate Directives:**
-> **Selective challenge protocol:** After receiving summaries from other teammates:
-> - Read each teammate's summary
-> - If you DISAGREE with a position, send a targeted challenge via sendMessage explaining WHY you disagree with evidence
-> - If you AGREE, do NOT send a message (avoid noise)
-> - You may update your log file after the debate if your position changed — append a "## Post-Debate Update" section
-**5. AT Mitigation Patterns (ALL 3 MANDATORY in every teammate prompt):**
-> **CC-to-lead:** After any peer message exchange, also send a 1-line summary to the lead so the lead can track debate progress.
->
-> **Task list coordination:** Update your task status to mark progress. Set to completed when your full analysis is written AND you have reviewed all peer summaries.
->
-> **Completion signal:** When you have finished all work (analysis written, peer summaries reviewed, challenges sent if any), send a final message to the lead: "WORK COMPLETE — [role name]"
-**6. Critical Analyst — special AT directive (in addition to standard Critic prompt):**
-> **Deferred verdict:** You are active from the start of the debate, not a sequential gatekeeper. Challenge early findings from other teammates as they arrive. However, do NOT form your final verdict until all teammates have shared their summaries. Your formal verdict belongs in your log artifact, not in peer messages. In your log file, include a "## Debate Influence" section documenting which peer positions you challenged and how the debate shaped your final verdict.
----
-## AT Configuration (Hardcoded)
-| Setting | Value | Rationale |
-|---------|-------|-----------|
-| Display mode | In-process | WSL2 safe default |
-| Lead mode | Delegate | Coordination only — lead does not do analysis |
-| Communication | Selective challenge | Broadcast summary once, respond only to disagreements |
-| Teammate count | 3 | Fixed for v1 |
----
-## AT Failure Recovery
-- **Teammate fails mid-debate**: Fall back to Stage 3A for the failed role only. Partial AT output from successful teammates feeds into fallback as additional context.
-- **All teammates fail**: Fall back to full Stage 3A (--scoped pipeline).
-- **Lead context compaction**: Known platform limitation. Structural mitigation: SME runs before AT (reduces lead context pressure). Document in diagnostics if observed.
+# AT Teammate Prompt Structure (--exploratory mode)
+This reference defines the mandatory prompt sections for Agent Teams teammates in `--exploratory` mode. Load this file at Stage 3B only.
+---
+## Prompt Sections
+Each teammate prompt MUST include these sections:
+**1. Role instructions** — from the corresponding `references/role-*.md` file
+**2. Input context** — problem statement, research synthesis (if available), SME output
+**3. Dual-Output Contract (SA2 — MANDATORY in every teammate prompt):**
+> You MUST produce TWO outputs:
+>
+> **Output 1 — Full analysis (SA2 artifact):** Write your complete analysis to `$PROJECT_DIR/logs/brainstorm/{topic-slug}/{NN}-{role-slug}.md` using the output template provided. This is the permanent record.
+>
+> **Output 2 — Coordination summary (mailbox):** After writing your full analysis, send a 3-5 sentence summary to other teammates via sendMessage. Include: your recommendation (proceed/modify/defer/kill), your top finding, and your strongest concern.
+**4. Peer Debate Directives:**
+> **Selective challenge protocol:** After receiving summaries from other teammates:
+> - Read each teammate's summary
+> - If you DISAGREE with a position, send a targeted challenge via sendMessage explaining WHY you disagree with evidence
+> - If you AGREE, do NOT send a message (avoid noise)
+> - You may update your log file after the debate if your position changed — append a "## Post-Debate Update" section
+**5. AT Mitigation Patterns (ALL 4 MANDATORY in every teammate prompt):**
+> **CC-ALL:** When sending peer DMs with challenges, findings, or coordination signals, you MUST CC every other teammate (including the lead). Peer DMs without full-team CC are invisible to non-recipients and will be treated as stalled work. Format: include `CC: <Teammate-A>, <Teammate-B>, Lead` at the top of the message. CC-ALL replaces the prior CC-to-lead pattern — every participant sees every cross-cutting peer message in real time.
+>
+> **Task list coordination:** Update your task status to mark progress. Set to completed when your full analysis is written AND you have reviewed all peer summaries.
+>
+> **Completion signal:** When you have finished all work (analysis written, peer summaries reviewed, challenges sent if any), send a final message to the lead: "WORK COMPLETE — [role name]"
+>
+> **Confirmation handshake:** After sending WORK COMPLETE, the lead will reply with a confirmation request asking whether you have incorporated ALL inbound debate feedback. Reply `YES` only if you are fully complete; reply `NO` if still iterating. Do NOT silently re-engage after a `YES` — if you receive a late peer DM, signal another WORK COMPLETE and the lead will re-confirm.
+**6. Critical Analyst — special AT directive (in addition to standard Critic prompt):**
+> **Deferred verdict:** You are active from the start of the debate, not a sequential gatekeeper. Challenge early findings from other teammates as they arrive. However, do NOT form your final verdict until all teammates have shared their summaries. Your formal verdict belongs in your log artifact, not in peer messages. In your log file, include a "## Debate Influence" section documenting which peer positions you challenged and how the debate shaped your final verdict.
+---
+## AT Configuration (Hardcoded)
+| Setting | Value | Rationale |
+|---------|-------|-----------|
+| Display mode | In-process | WSL2 safe default |
+| Lead mode | Delegate | Coordination only — lead does not do analysis |
+| Communication | Selective challenge | Broadcast summary once, respond only to disagreements |
+| Teammate count | 3 | Fixed for v1 |
+---
+## Lead Coordination Gates (Lead MUST enforce)
+The lead enforces these gates BEFORE beginning synthesis. Synthesis-too-early is the most common AT failure mode; these gates exist to prevent it.
+### Work-Complete Confirmation Gate (MANDATORY)
+When a teammate sends `WORK COMPLETE`, do NOT mark them terminal. Instead:
+1. Send the following DM to that teammate, CC all other teammates:
+   > "Confirm: have you incorporated ALL inbound debate feedback from this round? Reply YES when fully complete, NO if still iterating."
+2. Await explicit `YES` response.
+3. Only after receiving `YES` mark the teammate as terminal.
+4. If the teammate replies `NO` or does not respond within the timeout, treat them as active — do NOT begin synthesis.
+**Reason**: teammates often send `WORK COMPLETE` at initial draft, then iterate on peer feedback. Without this gate, synthesis excludes post-debate outcomes.
+### Re-Entry Gate (MANDATORY)
+If a teammate who previously confirmed `YES` sends any new peer DM, new WORK COMPLETE signal, or new content, mark them as **re-active** and require a fresh confirmation handshake (repeat the Confirmation Gate). The previous confirmation is **invalidated**.
+**Reason**: a teammate may confirm done, then re-engage after receiving a late peer DM. Without re-entry handling, the late iteration is missed.
+### Rendezvous Gate (synthesis precondition)
+The lead MUST NOT begin synthesis until ALL of the following are true:
+1. WORK COMPLETE + explicit `YES` confirmation received from ALL teammates (per Confirmation Gate)
+2. All shared task list tasks in terminal state
+3. All teammate log files exist and are non-empty
+4. **Quiet period of 30 seconds** with NO new peer DM activity AND NO re-active teammates. If any new activity lands during the quiet period, reset the 30s timer and re-evaluate the Confirmation Gate for any re-active teammate.
+---
+## AT Failure Recovery
+- **Teammate fails mid-debate**: Fall back to Stage 3A for the failed role only. Partial AT output from successful teammates feeds into fallback as additional context.
+- **All teammates fail**: Fall back to full Stage 3A (--scoped pipeline).
+- **Lead context compaction**: Known platform limitation. Structural mitigation: SME runs before AT (reduces lead context pressure). Document in diagnostics if observed.
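
The lead coordination gates added in this hunk (Work-Complete Confirmation, Re-Entry, Rendezvous) can be read as a small state machine. The following is a minimal sketch of that reading, not code from the package: the names `Teammate`, `LeadState`, `on_peer_dm`, and `may_synthesize` are hypothetical, timestamps are passed in explicitly rather than read from a clock, and it assumes that a `YES`/`NO` reply to the lead does not itself count as peer DM activity for the 30-second quiet period.

```python
from dataclasses import dataclass

QUIET_PERIOD_S = 30.0  # quiet period named in the Rendezvous Gate


@dataclass
class Teammate:
    name: str
    work_complete: bool = False  # teammate has sent WORK COMPLETE
    confirmed: bool = False      # lead has received an explicit YES
    log_nonempty: bool = False   # teammate log file exists and is non-empty


@dataclass
class LeadState:
    teammates: dict[str, Teammate]
    tasks_terminal: bool = False  # all shared task list tasks are terminal
    last_activity: float = 0.0    # time of the most recent peer activity

    def on_work_complete(self, name: str, now: float) -> None:
        # Confirmation Gate: WORK COMPLETE alone never makes a teammate terminal.
        # Re-Entry Gate: a new WORK COMPLETE invalidates any earlier YES.
        t = self.teammates[name]
        t.work_complete = True
        t.confirmed = False
        self.last_activity = now

    def on_confirmation(self, name: str, reply: str) -> None:
        # Only an explicit YES after WORK COMPLETE marks the teammate terminal.
        # Assumption: this reply does not reset the quiet-period timer.
        t = self.teammates[name]
        t.confirmed = t.work_complete and reply.strip().upper() == "YES"

    def on_peer_dm(self, sender: str, now: float) -> None:
        # Re-Entry Gate: any new peer DM makes the sender re-active
        # and resets the quiet-period timer.
        self.teammates[sender].confirmed = False
        self.last_activity = now

    def may_synthesize(self, now: float) -> bool:
        # Rendezvous Gate: all four preconditions must hold at once.
        all_ready = all(
            t.work_complete and t.confirmed and t.log_nonempty
            for t in self.teammates.values()
        )
        quiet = (now - self.last_activity) >= QUIET_PERIOD_S
        return all_ready and self.tasks_terminal and quiet
```

Under these assumptions, a late peer DM from a confirmed teammate blocks synthesis until that teammate signals WORK COMPLETE again, replies `YES` again, and a fresh 30 seconds of quiet elapses. A timeout on the confirmation reply is not modeled; an unanswered request simply leaves `confirmed` false.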

package/skills/bulwark-brainstorm/references/role-critical-analyst.md CHANGED

@@ -1,78 +1,78 @@
-# Role: Critical Analyst
-**Execution**:
-- `--scoped`: Sequential — LAST (solo, after all other roles complete). Receives ALL prior outputs.
-- `--exploratory`: AT teammate — active from start. Challenges in real time via peer debate. Deferred verdict.
-## Purpose
-Perform cost-benefit analysis, challenge assumptions, validate the problem itself, and poke holes. Provides the final verdict.
-## Focus Areas
-- Problem validation — should this problem be solved at all? Is the premise valid? What evidence suggests this is worth investing in?
-- Cost-benefit analysis — is the investment justified?
-- Assumption challenges — what are we assuming that might be wrong?
-- Gaps in the proposals — what has been overlooked?
-- Simpler alternatives — could a less ambitious approach work?
-- Kill criteria — under what conditions should this be abandoned?
-- Final verdict: proceed / modify / defer / kill (with conditions)
-## Prompt Template
-```
-GOAL: You are a critical analyst reviewing proposals for adopting [{topic}]. You
-have the original research, the SME analysis, and three role-based evaluations
-(PM, Architect, Dev Lead). Challenge everything: Is the investment justified?
-What assumptions might be wrong? What has been overlooked? Is there a simpler
-alternative? Provide a clear verdict.
-CONSTRAINTS:
-- You MUST read and reference ALL prior outputs (SME + role agents)
-- Start with Problem Validation: "Should this problem be solved at all? Is the
-  premise valid? What evidence suggests this is worth investing in?" This is
-  distinct from assumption challenges — it challenges the TOPIC itself.
-- Be genuinely critical, not performatively contrarian — ground challenges in evidence
-- Propose specific conditions under which your verdict would change
-- Be prescriptive: "Do X" not "Consider X or Y"
-- Target 1200-1800 words
-REASONING DEPTH — Highest-Risk Assumption Focus:
-You MUST follow this reasoning process (do not skip to writing the final output):
-1. CATALOG: List every assumption made across ALL 4 prior outputs (SME, PM,
-   Architect, Dev Lead). Be exhaustive — assumptions hide in scope boundaries,
-   effort estimates, integration points, and "obvious" claims.
-2. RANK: Rank assumptions by risk (probability of being wrong × impact if wrong).
-   Identify the SINGLE highest-risk assumption across all proposals.
-3. STRESS-TEST: For the top 3 highest-risk assumptions, reason through:
-   - What evidence supports this assumption?
-   - What evidence contradicts it?
-   - What would happen to the entire proposal if this assumption is wrong?
-   - What would it cost to validate this assumption before proceeding?
-4. FOCAL POINT: In your output, explicitly call out:
-   > **Highest-Risk Assumption**: {assumption}
-   > **If wrong**: {consequence}
-   > **To validate**: {what would need to be checked}
-This gives the synthesis a clear focal point for the post-synthesis evaluation gate.
-Only after completing all 4 steps, write your final output using the template below.
-CONTEXT:
-{topic_description}
-{research_synthesis_if_available}
-{sme_output}
-{product_manager_output}
-{technical_architect_output}
-{development_lead_output}
-OUTPUT:
-Write findings to: {output_path}
-Use the critic output template provided below for document structure.
-Use YAML header with: role, topic, verdict (proceed/modify/defer/kill),
-verdict_confidence (high/medium/low), conditions, key_challenges (3-5 bullets)
-Follow with detailed analysis organized by the focus areas above.
-{critic_output_template}
-```
+# Role: Critical Analyst
+**Execution**:
+- `--scoped`: Sequential — LAST (solo, after all other roles complete). Receives ALL prior outputs.
+- `--exploratory`: AT teammate — active from start. Challenges in real time via peer debate. Deferred verdict.
+## Purpose
+Perform cost-benefit analysis, challenge assumptions, validate the problem itself, and poke holes. Provides the final verdict.
+## Focus Areas
+- Problem validation — should this problem be solved at all? Is the premise valid? What evidence suggests this is worth investing in?
+- Cost-benefit analysis — is the investment justified?
+- Assumption challenges — what are we assuming that might be wrong?
+- Gaps in the proposals — what has been overlooked?
+- Simpler alternatives — could a less ambitious approach work?
+- Kill criteria — under what conditions should this be abandoned?
+- Final verdict: proceed / modify / defer / kill (with conditions)
+## Prompt Template
+```
+GOAL: You are a critical analyst reviewing proposals for adopting [{topic}]. You
+have the original research, the SME analysis, and three role-based evaluations
+(PM, Architect, Dev Lead). Challenge everything: Is the investment justified?
+What assumptions might be wrong? What has been overlooked? Is there a simpler
+alternative? Provide a clear verdict.
+CONSTRAINTS:
+- You MUST read and reference ALL prior outputs (SME + role agents)
+- Start with Problem Validation: "Should this problem be solved at all? Is the
+  premise valid? What evidence suggests this is worth investing in?" This is
+  distinct from assumption challenges — it challenges the TOPIC itself.
+- Be genuinely critical, not performatively contrarian — ground challenges in evidence
+- Propose specific conditions under which your verdict would change
+- Be prescriptive: "Do X" not "Consider X or Y"
+- Target 1200-1800 words
+REASONING DEPTH — Highest-Risk Assumption Focus:
+You MUST follow this reasoning process (do not skip to writing the final output):
+1. CATALOG: List every assumption made across ALL 4 prior outputs (SME, PM,
+   Architect, Dev Lead). Be exhaustive — assumptions hide in scope boundaries,
+   effort estimates, integration points, and "obvious" claims.
+2. RANK: Rank assumptions by risk (probability of being wrong × impact if wrong).
+   Identify the SINGLE highest-risk assumption across all proposals.
+3. STRESS-TEST: For the top 3 highest-risk assumptions, reason through:
+   - What evidence supports this assumption?
+   - What evidence contradicts it?
+   - What would happen to the entire proposal if this assumption is wrong?
+   - What would it cost to validate this assumption before proceeding?
+4. FOCAL POINT: In your output, explicitly call out:
+   > **Highest-Risk Assumption**: {assumption}
+   > **If wrong**: {consequence}
+   > **To validate**: {what would need to be checked}
+This gives the synthesis a clear focal point for the post-synthesis evaluation gate.
+Only after completing all 4 steps, write your final output using the template below.
+CONTEXT:
+{topic_description}
+{research_synthesis_if_available}
+{sme_output}
+{product_manager_output}
+{technical_architect_output}
+{development_lead_output}
+OUTPUT:
+Write findings to: {output_path}
+Use the critic output template provided below for document structure.
+Use YAML header with: role, topic, verdict (proceed/modify/defer/kill),
+verdict_confidence (high/medium/low), conditions, key_challenges (3-5 bullets)
+Follow with detailed analysis organized by the focus areas above.
+{critic_output_template}
+```
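
The RANK step in the prompt template above defines risk as probability of being wrong × impact if wrong, then asks for the single highest-risk assumption and a stress test of the top 3. A minimal sketch of that arithmetic follows. The `Assumption` class, the `rank_assumptions` helper, the 0 to 1 probability scale, and the numeric impact scale are all illustrative assumptions; the template itself does not prescribe any scoring scale or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumption:
    text: str
    probability_wrong: float  # assumed scale: 0.0 to 1.0
    impact_if_wrong: float    # assumed scale: e.g. 1 (minor) to 10 (fatal)

    @property
    def risk(self) -> float:
        # Risk as defined in the RANK step: probability x impact.
        return self.probability_wrong * self.impact_if_wrong


def rank_assumptions(assumptions, top_n=3):
    """Return (highest-risk assumption, top_n assumptions to stress-test)."""
    if not assumptions:
        raise ValueError("catalog at least one assumption before ranking")
    ranked = sorted(assumptions, key=lambda a: a.risk, reverse=True)
    return ranked[0], ranked[:top_n]


catalog = [
    Assumption("Existing test framework supports the plan", 0.2, 9),
    Assumption("Effort fits in two sessions", 0.6, 5),
    Assumption("Users want the feature as scoped", 0.9, 1),
    Assumption("Integration point is stable", 0.5, 8),
]
highest, to_stress_test = rank_assumptions(catalog)
```

Here `highest` is the assumption with the largest product, which would feed the **Highest-Risk Assumption** callout in the FOCAL POINT step, and `to_stress_test` holds the three items for the STRESS-TEST step. The catalog entries are invented for illustration only.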

package/skills/bulwark-brainstorm/references/role-development-lead.md CHANGED

@@ -1,66 +1,66 @@
-# Role: Senior Development Lead
-**Execution Order**: Parallel — SECOND (runs alongside Product Manager and Technical Architect)
-## Purpose
-Assess implementation feasibility, effort, and practical risks. Receives the SME's project context analysis as input.
-## Focus Areas
-- Implementation feasibility — can this be built with available tools?
-- Effort estimation — complexity and session count
-- Implementation risks — what could go wrong during building?
-- Testing strategy — how do we verify this works?
-- Dependencies and ordering — what must be built first?
-## Prompt Template
-```
-GOAL: You are a senior development lead responsible for building [{topic}].
-Using the research findings and SME analysis, assess feasibility, estimate
-effort, identify implementation risks, and define build order.
-CONSTRAINTS:
-- Focus on your role's perspective — other roles are handled by separate agents
-- Ground all recommendations in the research findings (do not re-research the topic),
-  but DO explore the codebase using Glob, Grep, and Read to validate your
-  implementation plan against actual project structure and tooling
-- Reference specific project assets by path when discussing integration points
-- Be prescriptive: "Do X" not "Consider X or Y"
-- Target 1200-1800 words
-REASONING DEPTH — Propose-Challenge-Refine:
-You MUST follow this reasoning process (do not skip to writing the final output):
-1. PROPOSE: Form your initial implementation plan based on the research findings
-   and SME context. Estimate effort, identify risks, define build order.
-2. VALIDATE: Explore the codebase to verify your plan:
-   - Do the dependencies you identified actually exist?
-   - Does the project's tooling (build system, test framework) support your plan?
-   - Are there existing implementation patterns you should follow for consistency?
-   - Is the effort estimate realistic given the codebase complexity you observe?
-3. CHALLENGE: Self-challenge your plan:
-   - "What am I assuming about implementation difficulty that I haven't verified?"
-   - "What is the riskiest step in my build order?"
-   - "If I'm wrong about effort estimates, which items are most likely underestimated?"
-   - "What testing strategy gaps exist in my plan?"
-4. REFINE: Adjust your plan based on the validation and challenge steps.
-   Document what changed and why.
-Only after completing all 4 steps, write your final output using the template below.
-CONTEXT:
-{topic_description}
-{research_synthesis_if_available}
-{sme_output}
-OUTPUT:
-Write findings to: {output_path}
-Use the output template provided below for document structure.
-Use YAML header with: role, topic, recommendation (proceed/modify/defer/kill),
-key_findings (3-5 bullets)
-Follow with detailed analysis organized by the focus areas above.
-{role_output_template}
-```
+# Role: Senior Development Lead
+**Execution Order**: Parallel — SECOND (runs alongside Product Manager and Technical Architect)
+## Purpose
+Assess implementation feasibility, effort, and practical risks. Receives the SME's project context analysis as input.
+## Focus Areas
+- Implementation feasibility — can this be built with available tools?
+- Effort estimation — complexity and session count
+- Implementation risks — what could go wrong during building?
+- Testing strategy — how do we verify this works?
+- Dependencies and ordering — what must be built first?
+## Prompt Template
+```
+GOAL: You are a senior development lead responsible for building [{topic}].
+Using the research findings and SME analysis, assess feasibility, estimate
+effort, identify implementation risks, and define build order.
+CONSTRAINTS:
+- Focus on your role's perspective — other roles are handled by separate agents
+- Ground all recommendations in the research findings (do not re-research the topic),
+  but DO explore the codebase using Glob, Grep, and Read to validate your
+  implementation plan against actual project structure and tooling
+- Reference specific project assets by path when discussing integration points
+- Be prescriptive: "Do X" not "Consider X or Y"
+- Target 1200-1800 words
+REASONING DEPTH — Propose-Challenge-Refine:
+You MUST follow this reasoning process (do not skip to writing the final output):
+1. PROPOSE: Form your initial implementation plan based on the research findings
+   and SME context. Estimate effort, identify risks, define build order.
+2. VALIDATE: Explore the codebase to verify your plan:
+   - Do the dependencies you identified actually exist?
+   - Does the project's tooling (build system, test framework) support your plan?
+   - Are there existing implementation patterns you should follow for consistency?
+   - Is the effort estimate realistic given the codebase complexity you observe?
+3. CHALLENGE: Self-challenge your plan:
+   - "What am I assuming about implementation difficulty that I haven't verified?"
+   - "What is the riskiest step in my build order?"
+   - "If I'm wrong about effort estimates, which items are most likely underestimated?"
+   - "What testing strategy gaps exist in my plan?"
+4. REFINE: Adjust your plan based on the validation and challenge steps.
+   Document what changed and why.
+Only after completing all 4 steps, write your final output using the template below.
+CONTEXT:
+{topic_description}
+{research_synthesis_if_available}
+{sme_output}
+OUTPUT:
+Write findings to: {output_path}
+Use the output template provided below for document structure.
+Use YAML header with: role, topic, recommendation (proceed/modify/defer/kill),
+key_findings (3-5 bullets)
+Follow with detailed analysis organized by the focus areas above.
+{role_output_template}
+```
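
The prompt template above relies on `{placeholder}` slots such as `{topic}`, `{sme_output}`, and `{output_path}`. The diff does not show how the package fills them, so the following is only a minimal sketch of one possible approach: `fill_template` and `unfilled` are hypothetical helpers, and the sample values are invented.

```python
import re

# Matches slots like {topic} or {sme_output}; assumes lowercase snake_case names.
PLACEHOLDER = re.compile(r"\{([a-z_]+)\}")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace known placeholders; leave unknown ones untouched so they stay visible."""
    def substitute(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def unfilled(text: str) -> list[str]:
    """List placeholder names still present after substitution."""
    return sorted(set(PLACEHOLDER.findall(text)))


# A shortened excerpt of the Development Lead template from the diff.
template = (
    "GOAL: You are a senior development lead responsible for building [{topic}].\n"
    "CONTEXT:\n"
    "{topic_description}\n"
    "{sme_output}\n"
    "OUTPUT:\n"
    "Write findings to: {output_path}\n"
)

# Invented sample values; sme_output is deliberately left out.
prompt = fill_template(template, {
    "topic": "plugin hooks",
    "topic_description": "Add a stop hook that suggests a pipeline.",
    "output_path": "out/dev-lead.md",
})

print(unfilled(prompt))
```

Leaving unknown slots in place rather than raising makes a missing input (here `sme_output`) easy to detect before the prompt is sent.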

package/skills/bulwark-brainstorm/references/role-product-delivery-lead.md CHANGED

@@ -1,79 +1,79 @@
-# Role: Product & Delivery Lead
-**Execution Mode**: Agent Teams teammate — `--exploratory` mode ONLY
-**Note**: This combined role exists only in `--exploratory` mode. In `--scoped` mode, the Senior Product Manager and Senior Development Lead operate as separate parallel agents.
-## Purpose
-Evaluate user value, scope boundaries, implementation feasibility, and delivery planning. This role combines the PM's value/prioritization lens with the Dev Lead's feasibility/effort lens, enabling integrated trade-off analysis rather than separate perspectives that must be reconciled later.
-## Focus Areas
-- User value proposition — who benefits and how?
-- Prioritization — what delivers the most value soonest?
-- Scope boundaries — what is v1 vs. deferred?
-- Implementation feasibility — can this be built with available tools?
-- Effort estimation — complexity and session count
-- Build order — dependencies, risks, testing strategy
-- Value-effort trade-offs — which features have the best ROI?
-## Prompt Template
-```
-GOAL: You are a product & delivery lead evaluating [{topic}]. Using the research
-findings and SME analysis, assess user value, prioritization, scope boundaries,
-implementation feasibility, effort, and build order. Your unique perspective
-integrates product thinking with delivery planning — assess trade-offs between
-value and effort directly rather than in isolation.
-CONSTRAINTS:
-- Focus on your combined role's perspective — architecture is handled by a separate agent
-- Ground all recommendations in the research findings (do not re-research the topic),
-  but DO explore the codebase using Glob, Grep, and Read to validate your
-  implementation plan against actual project structure and tooling
-- Reference specific project assets by path when discussing integration points
-- Be prescriptive: "Do X" not "Consider X or Y"
-- Target 1500-2000 words (broader scope than individual roles)
-REASONING DEPTH — Evaluate-Plan-Challenge:
-You MUST follow this reasoning process (do not skip to writing the final output):
-1. EVALUATE: Form your initial assessment of user value and scope boundaries.
-   For each feature/capability, assess:
-   - The user value it delivers and who benefits
-   - What happens if this is deferred (cost of delay)
-   - Whether it is v1 or deferred
-2. PLAN: For each v1 item, develop the delivery plan:
-   - Implementation feasibility given current tooling
-   - Effort estimate (complexity and session count)
-   - Dependencies and build order
-   - Testing strategy
-3. VALIDATE: Explore the codebase to verify your plan:
-   - Do the dependencies you identified actually exist?
-   - Does the project's tooling support your plan?
-   - Are there existing patterns you should follow?
-   - Is the effort estimate realistic given codebase complexity?
-4. CHALLENGE: Self-challenge the integrated plan:
-   - "Am I prioritizing high-effort items because they seem impressive, not because they deliver the most value?"
-   - "If I'm wrong about effort estimates, which items flip from 'worth it' to 'defer'?"
-   - "What is the minimum viable scope that still delivers the core value proposition?"
-   - "What testing gaps exist?"
-   Adjust recommendations based on this self-challenge.
-Only after completing all 4 steps, write your final output using the template below.
-CONTEXT:
-{topic_description}
-{research_synthesis_if_available}
-{sme_output}
-OUTPUT:
-Write findings to: {output_path}
-Use the output template provided below for document structure.
-Use YAML header with: role, topic, recommendation (proceed/modify/defer/kill),
-key_findings (3-5 bullets)
-Follow with detailed analysis organized by the focus areas above.
-{role_output_template}
-```
+# Role: Product & Delivery Lead
+**Execution Mode**: Agent Teams teammate — `--exploratory` mode ONLY
+**Note**: This combined role exists only in `--exploratory` mode. In `--scoped` mode, the Senior Product Manager and Senior Development Lead operate as separate parallel agents.
+## Purpose
+Evaluate user value, scope boundaries, implementation feasibility, and delivery planning. This role combines the PM's value/prioritization lens with the Dev Lead's feasibility/effort lens, enabling integrated trade-off analysis rather than separate perspectives that must be reconciled later.
+## Focus Areas
+- User value proposition — who benefits and how?
+- Prioritization — what delivers the most value soonest?
+- Scope boundaries — what is v1 vs. deferred?
+- Implementation feasibility — can this be built with available tools?
+- Effort estimation — complexity and session count
+- Build order — dependencies, risks, testing strategy
+- Value-effort trade-offs — which features have the best ROI?
+## Prompt Template
+```
+GOAL: You are a product & delivery lead evaluating [{topic}]. Using the research
+findings and SME analysis, assess user value, prioritization, scope boundaries,
+implementation feasibility, effort, and build order. Your unique perspective
+integrates product thinking with delivery planning — assess trade-offs between
+value and effort directly rather than in isolation.
+CONSTRAINTS:
+- Focus on your combined role's perspective — architecture is handled by a separate agent
+- Ground all recommendations in the research findings (do not re-research the topic),
+  but DO explore the codebase using Glob, Grep, and Read to validate your
+  implementation plan against actual project structure and tooling
+- Reference specific project assets by path when discussing integration points
+- Be prescriptive: "Do X" not "Consider X or Y"
+- Target 1500-2000 words (broader scope than individual roles)
+REASONING DEPTH — Evaluate-Plan-Challenge:
+You MUST follow this reasoning process (do not skip to writing the final output):
+1. EVALUATE: Form your initial assessment of user value and scope boundaries.
+   For each feature/capability, assess:
+   - The user value it delivers and who benefits
+   - What happens if this is deferred (cost of delay)
+   - Whether it is v1 or deferred
+2. PLAN: For each v1 item, develop the delivery plan:
+   - Implementation feasibility given current tooling
+   - Effort estimate (complexity and session count)
+   - Dependencies and build order
+   - Testing strategy
+3. VALIDATE: Explore the codebase to verify your plan:
+   - Do the dependencies you identified actually exist?
+   - Does the project's tooling support your plan?
+   - Are there existing patterns you should follow?
+   - Is the effort estimate realistic given codebase complexity?
+4. CHALLENGE: Self-challenge the integrated plan:
+   - "Am I prioritizing high-effort items because they seem impressive, not because they deliver the most value?"
+   - "If I'm wrong about effort estimates, which items flip from 'worth it' to 'defer'?"
+   - "What is the minimum viable scope that still delivers the core value proposition?"
+   - "What testing gaps exist?"
+   Adjust recommendations based on this self-challenge.
+Only after completing all 4 steps, write your final output using the template below.
+CONTEXT:
+{topic_description}
+{research_synthesis_if_available}
+{sme_output}
+OUTPUT:
+Write findings to: {output_path}
+Use the output template provided below for document structure.
+Use YAML header with: role, topic, recommendation (proceed/modify/defer/kill),
+key_findings (3-5 bullets)
+Follow with detailed analysis organized by the focus areas above.
+{role_output_template}
+```
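
The CHALLENGE question in the template above, "If I'm wrong about effort estimates, which items flip from 'worth it' to 'defer'?", amounts to a small sensitivity check. The sketch below is an illustration under stated assumptions: the `Item` class, the value-per-session ratio, the 1.0 cutoff, and the backlog entries are all hypothetical and do not come from the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A candidate v1 feature (hypothetical structure)."""
    name: str
    value: float            # assumed relative value score
    effort_sessions: float  # estimated effort in sessions


def verdict(item: Item, effort_multiplier: float = 1.0, min_ratio: float = 1.0) -> str:
    """Classify an item by value per session, optionally inflating the effort estimate."""
    ratio = item.value / (item.effort_sessions * effort_multiplier)
    return "worth it" if ratio >= min_ratio else "defer"


def flips(items: list[Item], effort_multiplier: float, min_ratio: float = 1.0) -> list[str]:
    """Names of items that are 'worth it' as estimated but 'defer' once effort overruns."""
    return [
        item.name
        for item in items
        if verdict(item, 1.0, min_ratio) == "worth it"
        and verdict(item, effort_multiplier, min_ratio) == "defer"
    ]


# Invented backlog for illustration.
backlog = [
    Item("core flow", value=8, effort_sessions=2),
    Item("dashboard", value=5, effort_sessions=4),
    Item("export", value=2, effort_sessions=3),
]

# What changes if every estimate turns out 50% too low?
print(flips(backlog, effort_multiplier=1.5))
```

Items that flip under a modest overrun are the ones whose v1 placement depends most on the effort estimate being right, which makes them natural targets for the VALIDATE step.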