npm - @qball-inc/the-bulwark - Versions diffs - 1.0.0 - Mend

@qball-inc/the-bulwark 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (175)

package/.claude-plugin/plugin.json +43 -0
package/agents/bulwark-fix-validator.md +633 -0
package/agents/bulwark-implementer.md +391 -0
package/agents/bulwark-issue-analyzer.md +308 -0
package/agents/bulwark-standards-reviewer.md +221 -0
package/agents/plan-creation-architect.md +323 -0
package/agents/plan-creation-eng-lead.md +352 -0
package/agents/plan-creation-po.md +300 -0
package/agents/plan-creation-qa-critic.md +334 -0
package/agents/product-ideation-competitive-analyzer.md +298 -0
package/agents/product-ideation-idea-validator.md +268 -0
package/agents/product-ideation-market-researcher.md +292 -0
package/agents/product-ideation-pattern-documenter.md +308 -0
package/agents/product-ideation-segment-analyzer.md +303 -0
package/agents/product-ideation-strategist.md +259 -0
package/agents/statusline-setup.md +97 -0
package/hooks/hooks.json +59 -0
package/package.json +45 -0
package/scripts/hooks/cleanup-stale.sh +13 -0
package/scripts/hooks/enforce-quality.sh +166 -0
package/scripts/hooks/implementer-quality.sh +256 -0
package/scripts/hooks/inject-protocol.sh +52 -0
package/scripts/hooks/suggest-pipeline.sh +175 -0
package/scripts/hooks/track-pipeline-start.sh +37 -0
package/scripts/hooks/track-pipeline-stop.sh +52 -0
package/scripts/init-rules.sh +35 -0
package/scripts/init.sh +151 -0
package/skills/anthropic-validator/SKILL.md +607 -0
package/skills/anthropic-validator/references/agents-checklist.md +131 -0
package/skills/anthropic-validator/references/commands-checklist.md +102 -0
package/skills/anthropic-validator/references/hooks-checklist.md +151 -0
package/skills/anthropic-validator/references/mcp-checklist.md +136 -0
package/skills/anthropic-validator/references/plugins-checklist.md +148 -0
package/skills/anthropic-validator/references/skills-checklist.md +85 -0
package/skills/assertion-patterns/SKILL.md +296 -0
package/skills/bug-magnet-data/SKILL.md +284 -0
package/skills/bug-magnet-data/context/cli-args.md +91 -0
package/skills/bug-magnet-data/context/db-query.md +104 -0
package/skills/bug-magnet-data/context/file-contents.md +103 -0
package/skills/bug-magnet-data/context/http-body.md +91 -0
package/skills/bug-magnet-data/context/process-spawn.md +123 -0
package/skills/bug-magnet-data/data/booleans/boundaries.yaml +143 -0
package/skills/bug-magnet-data/data/collections/arrays.yaml +114 -0
package/skills/bug-magnet-data/data/collections/objects.yaml +123 -0
package/skills/bug-magnet-data/data/concurrency/race-conditions.yaml +118 -0
package/skills/bug-magnet-data/data/concurrency/state-machines.yaml +115 -0
package/skills/bug-magnet-data/data/dates/boundaries.yaml +137 -0
package/skills/bug-magnet-data/data/dates/invalid.yaml +132 -0
package/skills/bug-magnet-data/data/dates/timezone.yaml +118 -0
package/skills/bug-magnet-data/data/encoding/charset.yaml +79 -0
package/skills/bug-magnet-data/data/encoding/normalization.yaml +105 -0
package/skills/bug-magnet-data/data/formats/email.yaml +154 -0
package/skills/bug-magnet-data/data/formats/json.yaml +187 -0
package/skills/bug-magnet-data/data/formats/url.yaml +165 -0
package/skills/bug-magnet-data/data/language-specific/javascript.yaml +182 -0
package/skills/bug-magnet-data/data/language-specific/python.yaml +174 -0
package/skills/bug-magnet-data/data/language-specific/rust.yaml +148 -0
package/skills/bug-magnet-data/data/numbers/boundaries.yaml +161 -0
package/skills/bug-magnet-data/data/numbers/precision.yaml +89 -0
package/skills/bug-magnet-data/data/numbers/special.yaml +69 -0
package/skills/bug-magnet-data/data/strings/boundaries.yaml +109 -0
package/skills/bug-magnet-data/data/strings/injection.yaml +208 -0
package/skills/bug-magnet-data/data/strings/special-chars.yaml +190 -0
package/skills/bug-magnet-data/data/strings/unicode.yaml +139 -0
package/skills/bug-magnet-data/references/external-lists.md +115 -0
package/skills/bulwark-brainstorm/SKILL.md +563 -0
package/skills/bulwark-brainstorm/references/at-teammate-prompts.md +60 -0
package/skills/bulwark-brainstorm/references/role-critical-analyst.md +78 -0
package/skills/bulwark-brainstorm/references/role-development-lead.md +66 -0
package/skills/bulwark-brainstorm/references/role-product-delivery-lead.md +79 -0
package/skills/bulwark-brainstorm/references/role-product-manager.md +62 -0
package/skills/bulwark-brainstorm/references/role-project-sme.md +59 -0
package/skills/bulwark-brainstorm/references/role-technical-architect.md +66 -0
package/skills/bulwark-research/SKILL.md +298 -0
package/skills/bulwark-research/references/viewpoint-contrarian.md +63 -0
package/skills/bulwark-research/references/viewpoint-direct-investigation.md +62 -0
package/skills/bulwark-research/references/viewpoint-first-principles.md +65 -0
package/skills/bulwark-research/references/viewpoint-practitioner.md +62 -0
package/skills/bulwark-research/references/viewpoint-prior-art.md +66 -0
package/skills/bulwark-scaffold/SKILL.md +330 -0
package/skills/bulwark-statusline/SKILL.md +161 -0
package/skills/bulwark-statusline/scripts/statusline.sh +144 -0
package/skills/bulwark-verify/SKILL.md +519 -0
package/skills/code-review/SKILL.md +428 -0
package/skills/code-review/examples/anti-patterns/linting.ts +181 -0
package/skills/code-review/examples/anti-patterns/security.ts +91 -0
package/skills/code-review/examples/anti-patterns/standards.ts +195 -0
package/skills/code-review/examples/anti-patterns/type-safety.ts +108 -0
package/skills/code-review/examples/recommended/linting.ts +195 -0
package/skills/code-review/examples/recommended/security.ts +154 -0
package/skills/code-review/examples/recommended/standards.ts +231 -0
package/skills/code-review/examples/recommended/type-safety.ts +181 -0
package/skills/code-review/frameworks/angular.md +218 -0
package/skills/code-review/frameworks/django.md +235 -0
package/skills/code-review/frameworks/express.md +207 -0
package/skills/code-review/frameworks/flask.md +298 -0
package/skills/code-review/frameworks/generic.md +146 -0
package/skills/code-review/frameworks/react.md +152 -0
package/skills/code-review/frameworks/vue.md +244 -0
package/skills/code-review/references/linting-patterns.md +221 -0
package/skills/code-review/references/security-patterns.md +125 -0
package/skills/code-review/references/standards-patterns.md +246 -0
package/skills/code-review/references/type-safety-patterns.md +130 -0
package/skills/component-patterns/SKILL.md +131 -0
package/skills/component-patterns/references/pattern-cli-command.md +118 -0
package/skills/component-patterns/references/pattern-database.md +166 -0
package/skills/component-patterns/references/pattern-external-api.md +139 -0
package/skills/component-patterns/references/pattern-file-parser.md +168 -0
package/skills/component-patterns/references/pattern-http-server.md +162 -0
package/skills/component-patterns/references/pattern-process-spawner.md +133 -0
package/skills/continuous-feedback/SKILL.md +327 -0
package/skills/continuous-feedback/references/collect-instructions.md +81 -0
package/skills/continuous-feedback/references/specialize-code-review.md +82 -0
package/skills/continuous-feedback/references/specialize-general.md +98 -0
package/skills/continuous-feedback/references/specialize-test-audit.md +81 -0
package/skills/create-skill/SKILL.md +359 -0
package/skills/create-skill/references/agent-conventions.md +194 -0
package/skills/create-skill/references/agent-template.md +195 -0
package/skills/create-skill/references/content-guidance.md +291 -0
package/skills/create-skill/references/decision-framework.md +124 -0
package/skills/create-skill/references/template-pipeline.md +217 -0
package/skills/create-skill/references/template-reference-heavy.md +111 -0
package/skills/create-skill/references/template-research.md +210 -0
package/skills/create-skill/references/template-script-driven.md +172 -0
package/skills/create-skill/references/template-simple.md +80 -0
package/skills/create-subagent/SKILL.md +353 -0
package/skills/create-subagent/references/agent-conventions.md +268 -0
package/skills/create-subagent/references/content-guidance.md +232 -0
package/skills/create-subagent/references/decision-framework.md +134 -0
package/skills/create-subagent/references/template-single-agent.md +192 -0
package/skills/fix-bug/SKILL.md +241 -0
package/skills/governance-protocol/SKILL.md +116 -0
package/skills/init/SKILL.md +341 -0
package/skills/issue-debugging/SKILL.md +385 -0
package/skills/issue-debugging/references/anti-patterns.md +245 -0
package/skills/issue-debugging/references/debug-report-schema.md +227 -0
package/skills/mock-detection/SKILL.md +511 -0
package/skills/mock-detection/references/false-positive-prevention.md +402 -0
package/skills/mock-detection/references/stub-patterns.md +236 -0
package/skills/pipeline-templates/SKILL.md +215 -0
package/skills/pipeline-templates/references/code-change-workflow.md +277 -0
package/skills/pipeline-templates/references/code-review.md +336 -0
package/skills/pipeline-templates/references/fix-validation.md +421 -0
package/skills/pipeline-templates/references/new-feature.md +335 -0
package/skills/pipeline-templates/references/research-brainstorm.md +161 -0
package/skills/pipeline-templates/references/research-planning.md +257 -0
package/skills/pipeline-templates/references/test-audit.md +389 -0
package/skills/pipeline-templates/references/test-execution-fix.md +238 -0
package/skills/plan-creation/SKILL.md +497 -0
package/skills/product-ideation/SKILL.md +372 -0
package/skills/product-ideation/references/analysis-frameworks.md +161 -0
package/skills/session-handoff/SKILL.md +139 -0
package/skills/session-handoff/references/examples.md +223 -0
package/skills/setup-lsp/SKILL.md +312 -0
package/skills/setup-lsp/references/server-registry.md +85 -0
package/skills/setup-lsp/references/troubleshooting.md +135 -0
package/skills/subagent-output-templating/SKILL.md +415 -0
package/skills/subagent-output-templating/references/examples.md +440 -0
package/skills/subagent-prompting/SKILL.md +364 -0
package/skills/subagent-prompting/references/examples.md +342 -0
package/skills/test-audit/SKILL.md +531 -0
package/skills/test-audit/references/known-limitations.md +41 -0
package/skills/test-audit/references/priority-classification.md +30 -0
package/skills/test-audit/references/prompts/deep-mode-detection.md +83 -0
package/skills/test-audit/references/prompts/synthesis.md +57 -0
package/skills/test-audit/references/rewrite-instructions.md +46 -0
package/skills/test-audit/references/schemas/audit-output.yaml +100 -0
package/skills/test-audit/references/schemas/diagnostic-output.yaml +49 -0
package/skills/test-audit/scripts/data-flow-analyzer.ts +509 -0
package/skills/test-audit/scripts/integration-mock-detector.ts +462 -0
package/skills/test-audit/scripts/package.json +20 -0
package/skills/test-audit/scripts/skip-detector.ts +211 -0
package/skills/test-audit/scripts/verification-counter.ts +295 -0
package/skills/test-classification/SKILL.md +310 -0
package/skills/test-fixture-creation/SKILL.md +295 -0

package/skills/plan-creation/SKILL.md ADDED

@@ -0,0 +1,497 @@
+---
+name: plan-creation
+description: Create structured implementation plans using a 4-role scrum team with optional Agent Teams peer debate
+user-invocable: true
+argument-hint: "<topic, filepath, or directory> [--research <synthesis-file>]"
+skills:
+  - subagent-prompting
+---
+# Plan Creation
+Create structured implementation plans through a 4-role collaborative scrum team: Product Owner, Technical Architect, Engineering & Delivery Lead, and QA/Critic. The Product Owner explores the codebase first, then Architect and Eng Lead analyze in parallel, and the QA/Critic challenges everything last. The orchestrator synthesizes all outputs into a hybrid Markdown + YAML plan.
+---
+## When to Use This Skill
+**Load this skill when the user request matches ANY of these patterns:**
+| Trigger Pattern | Example User Request |
+|-----------------|---------------------|
+| Implementation planning | "Create an implementation plan for X" |
+| Feature planning | "Plan how we'd build X" |
+| Project scoping | "Break down X into phases and workpackages" |
+| Post-research planning | "We've researched X, now create a plan" |
+| Task brief creation | "Create a task brief for X" |
+**DO NOT use for:**
+- Initial topic research (use `bulwark-research` first)
+- Feasibility brainstorming (use `bulwark-brainstorm`)
+- Quick technical questions (ask directly)
+- Code review or debugging (use `code-review` or `issue-debugging`)
+---
+## Dependencies
+| Category | Files | Requirement | When to Load |
+|----------|-------|-------------|--------------|
+| **Plan output template** | `templates/plan-output.md` | **REQUIRED** | Load at Stage 5 for plan structure |
+| **Critic output template** | `templates/critic-output.md` | **REQUIRED** | Include in QA/Critic agent prompt |
+| **Synthesis template** | `templates/synthesis-output.md` | **REQUIRED** | Use when writing synthesis |
+| **Diagnostic template** | `templates/diagnostic-output.yaml` | **REQUIRED** | Use at Stage 6 |
+| **Role output reference** | `templates/role-output.md` | OPTIONAL | Reference for parsing agent outputs |
+| **Subagent prompting** | `subagent-prompting` skill | **REQUIRED** | Load at Stage 1 for 4-part prompt template |
+| **Research synthesis** | `--research <file>` | OPTIONAL | If provided, include in all agent prompts |
+**Fallback behavior:**
+- If an agent fails to spawn: Re-spawn once. If it still fails, skip that role and document it in the synthesis under "Incomplete Coverage"
+- If PO fails: STOP — all downstream agents depend on PO output. Inform user.
+- If output template is missing: Use the schema from this SKILL.md directly
+- If research synthesis is not provided: Agents work from the problem statement alone (warn the user that plan quality may be lower)
+---
+## Mandatory Execution Checklist (BINDING)
+**Every item below is mandatory. No deviations. No substitutions. No skipping.**
+This skill uses a multi-stage pipeline. You are the orchestrator. Follow every item in order. Do NOT return to the user until all applicable items are checked.
+- [ ] **Stage 1 — Pre-Flight**: Topic parsed (from argument, --doc, or AskUserQuestion)
+- [ ] **Stage 1 — Pre-Flight**: subagent-prompting skill loaded
+- [ ] **Stage 1 — Pre-Flight**: If topic is ambiguous or under-specified, AskUserQuestion interview conducted (2-3 questions per round)
+- [ ] **Stage 1 — Pre-Flight**: If --research not provided, user warned via displayed message AND asked to confirm proceeding
+- [ ] **Stage 1 — Mode Detection**: `$CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` env var checked — you MUST check this, no exceptions
+- [ ] **Stage 1 — Mode Detection**: If env var is SET, user offered choice via AskUserQuestion (Agent Teams vs Task tool) — you MUST NOT default silently
+- [ ] **Stage 1 — Mode Detection**: If user selects Agent Teams, AT Confirmation Flow executed (RED banner + model class choice)
+- [ ] **Stage 2 — Product Owner**: PO spawned via Task tool (`plan-creation-po`, Opus) and output read
+- [ ] **Stage 3A or 3B**: Correct mode executed based on user's Stage 1 choice
+- [ ] **Stage 3A (Task tool)**: Architect + Eng Lead spawned in parallel, then QA/Critic spawned with all 3 prior outputs
+- [ ] **Stage 3B (Agent Teams)**: Agent files read, delegate mode entered, 3 teammates spawned with correct model class
+- [ ] **Stage 5 — Synthesis**: ALL role outputs read, synthesis written, plan drafted using template
+- [ ] **Stage 5 — Approval**: Plan presented to user via AskUserQuestion — you MUST NOT write the final plan without user approval
+- [ ] **Stage 5 — Plan Written**: Final plan written to `plans/{slug}/plan_v{N}.md`
+- [ ] **Stage 6 — Diagnostics**: Diagnostic YAML written to `$PROJECT_DIR/logs/diagnostics/`
+---
+## Usage
+```
+/plan-creation <topic-or-prompt> [--research <synthesis-file>]
+/plan-creation --doc <path-to-document> [--research <synthesis-file>]
+```
+**Arguments:**
+- `<topic-or-prompt>` - Free-text topic description or problem statement
+- `--doc <path>` - Use a document as the topic source
+- `--research <synthesis-file>` - Path to research synthesis (from bulwark-research or bulwark-brainstorm). Strongly recommended.
+**Examples:**
+- `/plan-creation "add user authentication" --research logs/research/auth/synthesis.md`
+- `/plan-creation --doc plans/proposal.md`
+- `/plan-creation "migrate database to PostgreSQL"`
+**Plan Versioning:**
+Plans are written to `plans/{slug}/plan_v{N}.md` with automatic version detection:
+| Scenario | Version | Example |
+|----------|---------|---------|
+| First plan for a topic | `v1` | `plans/add-auth/plan_v1.md` |
+| Minor revision (user iterates on current plan) | `v1.1`, `v1.2` | Approval gate feedback → revision |
+| Major version (full re-run or pivot) | `v2`, `v3` | New invocation for same slug |
+The skill checks for existing `plans/{slug}/plan_v*.md` files before writing. When ambiguous (re-run vs revision), it asks the user.
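The version rules in the table above can be sketched in a few lines of Python. This is only an illustration: the helper names `existing_versions` and `next_version` are hypothetical, and the `revision` flag stands in for the skill's judgment (or the user's answer) about whether a run is a minor revision or a fresh plan.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches plan_v1.md, plan_v1.2.md, plan_v3.md, and so on.
_VERSION_RE = re.compile(r"^plan_v(\d+)(?:\.(\d+))?\.md$")


def existing_versions(plan_dir: Path) -> list[tuple[int, int]]:
    """Return sorted (major, minor) pairs for plan_v*.md files in plan_dir."""
    versions = []
    if plan_dir.is_dir():
        for path in plan_dir.glob("plan_v*.md"):
            match = _VERSION_RE.match(path.name)
            if match:
                versions.append((int(match.group(1)), int(match.group(2) or 0)))
    return sorted(versions)


def next_version(versions: list[tuple[int, int]], revision: bool) -> str:
    """Minor bump for a revision of the latest plan, major bump otherwise."""
    if not versions:
        return "v1"  # first plan for this slug
    major, minor = versions[-1]
    if revision:
        return f"v{major}.{minor + 1}"
    return f"v{major + 1}"
```

Comparing versions as integer pairs rather than strings keeps `v1.10` ordered after `v1.9`.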
+---
+## Stages
+### Stage 1: Pre-Flight
+```
+Stage 1: Pre-Flight
+├── Read problem statement / document
+├── Load research synthesis if --research provided
+├── AskUserQuestion if ambiguous (iterative, 2-3 questions per round)
+├── Slugify topic for output directory
+├── Create output directory: $PROJECT_DIR/logs/plan-creation/{slug}/
+├── Load subagent-prompting skill
+├── Detect mode: Task tool (default) or Agent Teams (opt-in)
+└── Token budget check (warn if >30% consumed)
+```
+**AskUserQuestion Protocol (Pre-Spawn):**
+If the problem statement is ambiguous, under-specified, or could benefit from scope boundaries:
+1. Ask 2-3 clarifying questions using AskUserQuestion
+2. Assess whether the answers provide sufficient clarity to construct high-quality prompts
+3. If not, ask up to 3 more questions in a follow-up round
+4. Repeat until clarity is achieved (no hard cap on rounds, but each round is 2-3 questions max)
+5. If the problem statement is clear and well-scoped from the start, skip this step and note in diagnostics: `pre_flight_interview: skipped (problem statement sufficient)`
+If `--research` was not provided, warn the user: "No research synthesis provided. Plan quality is significantly higher when preceded by `/bulwark-research` or `/bulwark-brainstorm`. Proceed without research?"
+**Mode Detection:**
+1. Check `$CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` env var
+2. If env var is SET: offer user choice via AskUserQuestion — "Agent Teams enhanced mode is available. Use Agent Teams or Task tool?" Default to Task tool if user doesn't specify.
+3. If env var is NOT SET: use Task tool mode. If user explicitly requested Agent Teams, notify: "Agent Teams requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1. Using Task tool mode."
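The three mode-detection rules reduce to a small decision function. The sketch below assumes that any non-empty value counts as "set" (the skill text only shows `=1`); `detect_mode` and its return labels are hypothetical names used for illustration.

```python
from __future__ import annotations

import os
from typing import Optional

ENV_VAR = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def detect_mode(env: dict, user_requested_at: bool = False) -> tuple[str, Optional[str]]:
    """Return (next_step, notice).

    next_step is "ask_user" when the variable is set, because the user must
    be offered the choice, and "task_tool" otherwise.
    """
    # Assumption: a non-empty value means the variable is set.
    if env.get(ENV_VAR):
        return "ask_user", None
    if user_requested_at:
        return "task_tool", f"Agent Teams requires {ENV_VAR}=1. Using Task tool mode."
    return "task_tool", None


if __name__ == "__main__":
    print(detect_mode(dict(os.environ)))
```

Returning "ask_user" rather than "agent_teams" reflects the checklist rule that the orchestrator must not pick a mode silently when the variable is set.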
+**AT Confirmation Flow (if user selects Agent Teams):**
+Execute this confirmation flow BEFORE spawning any teammates:
+**Step 1 — Display RED warning banner** using ANSI color `\033[38;2;255;154;150m` (#FF9A96):
+```
+⚠️  NOTICE: Claude Code's Agent Teams is an experimental feature. Unexpected
+issues like teammates being stuck or unresponsive may occur. Agent Teams mode
+is also significantly more token-expensive than Task tool mode (4 concurrent
+agents vs sequential sub-agents).
+If you run into issues, start a new session and re-run /plan-creation.
+If the final output does not match expectations, re-run with Task tool mode.
+```
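For reference, the escape sequence quoted in Step 1 is the 24-bit ("truecolor") foreground form `ESC[38;2;R;G;Bm`. A minimal sketch of deriving it from the hex value follows; the helper names are hypothetical, and it assumes a terminal that supports 24-bit color.

```python
RESET = "\033[0m"  # restores the terminal's default colors


def truecolor(hex_color: str) -> str:
    """Build a 24-bit foreground escape sequence from a #RRGGBB string."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return f"\033[38;2;{r};{g};{b}m"


def banner(text: str, hex_color: str) -> str:
    """Wrap text in the color escape and reset afterwards."""
    return f"{truecolor(hex_color)}{text}{RESET}"


if __name__ == "__main__":
    print(banner("NOTICE: Agent Teams is an experimental feature.", "#FF9A96"))
```

Appending the reset sequence keeps the color from leaking into later output.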
+**Step 2 — AskUserQuestion: model class + mode confirmation**
+Present a single question with 3 options:
+| Option | Label | Description |
+|--------|-------|-------------|
+| 1 | Opus agents (Recommended) | Higher quality analysis, highest token cost. Opus-class agents for all roles. |
+| 2 | Sonnet agents | Good quality, lower token cost. Sonnet-class agents for all roles. |
+| 3 | Switch to Task tool mode | Cancel Agent Teams and use sequential sub-agents instead. |
+- Option 1: proceed with Opus agents in AT mode (Stage 3B)
+- Option 2: proceed with Sonnet agents in AT mode (Stage 3B)
+- Option 3: fall back to Task tool mode (Stage 3A), skip AT entirely
+### Stage 2: Product Owner (Opus, Sequential — First)
+```
+Stage 2: Product Owner
+├── Construct prompt using 4-part template
+│   ├── GOAL: Explore codebase and produce requirements analysis for {topic}
+│   ├── CONSTRAINTS: Do not make architectural decisions or estimate effort
+│   ├── CONTEXT: Problem statement + research synthesis (if available)
+│   └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/01-product-owner.md
+├── Spawn plan-creation-po agent via Task tool
+│   ├── subagent_type: plan-creation-po
+│   ├── model: opus (specified in agent frontmatter)
+│   ├── Agent autonomously explores codebase (Glob, Grep, Read)
+│   └── NO hardcoded document paths — agent discovers what's relevant
+├── Read PO output from logs/plan-creation/{slug}/01-product-owner.md
+└── Token budget check
+```
+**CRITICAL — PO Autonomy**: The PO agent MUST NOT receive hardcoded project document paths. Instead:
+- PO receives the problem statement and (optionally) research synthesis
+- PO is spawned as `plan-creation-po` subagent type
+- PO autonomously explores the codebase using Glob, Grep, Read
+- PO output documents which files it read and why
+This makes the skill portable across any project.
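As an illustration of the 4-part structure used in Stage 2, the sketch below assembles a PO prompt from the GOAL, CONSTRAINTS, CONTEXT, and OUTPUT fields listed above. The exact template wording lives in the `subagent-prompting` skill, which is not shown here, so the layout and the function names are assumptions.

```python
from __future__ import annotations

from typing import Optional


def build_prompt(goal: str, constraints: str, context: str, output_path: str) -> str:
    """Assemble the four labeled sections in a fixed order."""
    sections = [
        ("GOAL", goal),
        ("CONSTRAINTS", constraints),
        ("CONTEXT", context),
        ("OUTPUT", output_path),
    ]
    return "\n\n".join(f"{label}:\n{body.strip()}" for label, body in sections)


def po_prompt(topic: str, problem_statement: str, slug: str,
              research: Optional[str] = None,
              project_dir: str = "$PROJECT_DIR") -> str:
    """Build the Product Owner prompt; note that no project file paths are passed in."""
    context = problem_statement
    if research:
        context += "\n\nResearch synthesis:\n" + research
    return build_prompt(
        goal=f"Explore codebase and produce requirements analysis for {topic}",
        constraints="Do not make architectural decisions or estimate effort",
        context=context,
        output_path=f"{project_dir}/logs/plan-creation/{slug}/01-product-owner.md",
    )
```

The only path in the prompt is the output location, which matches the PO autonomy rule: the agent is told where to write, not what to read.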
+### Stage 3A: Scrum Team — Task Tool Mode
+```
+Stage 3A: Scrum Team (Task Tool Mode)
+├── Read PO output in full
+├── Construct 2 prompts using 4-part template:
+│   ├── Technical Architect:
+│   │   ├── GOAL: Analyze system design, components, integration, trade-offs for {topic}
+│   │   ├── CONSTRAINTS: Do not estimate effort or sequence work
+│   │   ├── CONTEXT: Problem statement + research synthesis + PO output (full text)
+│   │   └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/02-technical-architect.md
+│   └── Engineering & Delivery Lead:
+│       ├── GOAL: Produce WBS, estimates, dependencies, milestones, risk register for {topic}
+│       ├── CONSTRAINTS: Do not redesign architecture — work with Architect's design
+│       ├── CONTEXT: Problem statement + research synthesis + PO output (full text)
+│       └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/03-eng-delivery-lead.md
+├── Spawn BOTH agents in parallel via Task tool (single message, 2 Task tool calls)
+│   ├── subagent_type: plan-creation-architect (opus)
+│   └── subagent_type: plan-creation-eng-lead (sonnet)
+├── Read both outputs
+└── Token budget check (checkpoint if >55%)
+```
+**CRITICAL**: Spawn both agents in a single message with 2 Task tool calls. Do NOT spawn sequentially.
+**Note**: Both Architect and Eng Lead receive the PO output directly in their prompt CONTEXT. They do NOT read each other's output — they work independently in parallel. The QA/Critic cross-references their outputs in Stage 4.
+### Stage 3B: Scrum Team — Agent Teams Mode (Enhanced, Opt-In)
+**Pre-condition**: User selected Agent Teams in Pre-Flight AND confirmed model class in AT Confirmation Flow.
+```
+Stage 3B: Scrum Team (Agent Teams Mode)
+├── Read PO output in full (from Stage 2)
+├── Read agent definition files from .claude/agents/:
+│   ├── .claude/agents/plan-creation-architect.md
+│   ├── .claude/agents/plan-creation-eng-lead.md
+│   └── .claude/agents/plan-creation-qa-critic.md
+├── YOU (the orchestrator) become the delegate-mode Scrum Lead
+│   └── Your role: coordination ONLY — do not perform analysis yourself
+├── Enter delegate mode with 3 teammates
+├── Create shared task list with initial tasks:
+│   ├── "[Architect] Analyze system design for {topic}"
+│   ├── "[Eng Lead] Produce WBS and delivery plan for {topic}"
+│   └── "[QA/Critic] Adversarially review all analyses for {topic}"
+├── Spawn 3 teammates using agent file content as system prompts:
+│   ├── Technical Architect (model: user's choice from AT Confirmation)
+│   ├── Engineering & Delivery Lead (model: user's choice from AT Confirmation)
+│   └── QA / Critic (model: user's choice from AT Confirmation)
+├── Each teammate prompt MUST include (in addition to agent file content):
+│   ├── Problem statement + research synthesis (if available)
+│   ├── PO output (full text)
+│   ├── Dual-output contract (see below)
+│   ├── CC-to-lead instruction (see below)
+│   ├── Task list coordination instruction (see below)
+│   └── Rendezvous instruction (see below)
+├── Use in-process display mode (WSL2 safe — no tmux)
+├── Shutdown gate: see below
+└── Token budget check
+```
+**Dual-Output Contract** (include in EVERY teammate prompt):
+> You MUST produce two outputs:
+> 1. **Full analysis** → Write to `$PROJECT_DIR/logs/plan-creation/{slug}/{NN}-{role-name}.md` using the Write tool. This is your SA2-compliant artifact. Include all analysis, tables, and findings.
+> 2. **Coordination summary** → Send a 3-5 sentence summary of your key findings and conclusions to the Scrum Lead via mailbox. This is for coordination only — the full analysis is in the log file.
+**CC-to-Lead Instruction** (include in EVERY teammate prompt):
+> When sending peer DMs to other teammates with work instructions, challenges, or significant findings, also send a 1-line summary to the Scrum Lead. Example: "Sent Architect a challenge on component coupling — see my full analysis in logs."
+**Task List Coordination Instruction** (include in EVERY teammate prompt):
+> When you receive work from a peer via DM (e.g., "review this section", "stress-test this estimate"), create a new task in the shared task list describing the peer-dispatched work before starting it. Mark it in_progress immediately. This gives the Scrum Lead visibility into peer-coordinated work.
+**Rendezvous Instruction** (include in EVERY teammate prompt):
+> Your FINAL action before going idle is to send the Scrum Lead: "WORK COMPLETE — all tasks done, log written to {path}". Do NOT go idle without sending this message.
+**Shutdown Gate** (Scrum Lead logic — YOU enforce this):
+The Scrum Lead (you, the orchestrator) MUST NOT call `requestShutdown` for ANY teammate until ALL of the following are true:
+1. All shared task list tasks are in terminal state (completed or blocked)
+2. WORK COMPLETE message received from ALL 3 teammates
+3. All 3 log files exist and are non-empty:
+   - `logs/plan-creation/{slug}/02-technical-architect.md`
+   - `logs/plan-creation/{slug}/03-eng-delivery-lead.md`
+   - `logs/plan-creation/{slug}/04-qa-critic.md`
+If a teammate appears idle but has NOT sent WORK COMPLETE:
+- Check the shared task list for in-progress tasks assigned to that teammate
+- Send a status check message: "Status update? Are you still working on [task]?"
+- Do NOT send `requestShutdown` — they may be executing a long tool call
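The three shutdown conditions are a simple conjunction, sketched below. The task-state strings, the teammate identifiers, and the function name are hypothetical; only the three conditions themselves come from the text above.

```python
from pathlib import Path

TERMINAL_STATES = {"completed", "blocked"}
TEAMMATES = ("architect", "eng-lead", "qa-critic")  # hypothetical identifiers


def can_request_shutdown(task_states, work_complete_from, log_files) -> bool:
    """True only when all three shutdown-gate conditions hold."""
    # 1. Every shared task is in a terminal state.
    all_terminal = all(state in TERMINAL_STATES for state in task_states)
    # 2. Every teammate has sent its WORK COMPLETE message.
    all_reported = set(TEAMMATES) <= set(work_complete_from)
    # 3. Every expected log file exists and is non-empty.
    all_logged = all(
        path.is_file() and path.stat().st_size > 0
        for path in map(Path, log_files)
    )
    return all_terminal and all_reported and all_logged
```

An idle teammate alone never satisfies the gate, which is why the skill asks for a status check instead of a shutdown in that case.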
+**AT Completion Banner** (display after ALL teammates shut down):
+After Agent Teams execution completes successfully, display an AMBER banner using ANSI color `\033[38;2;255;244;176m` (#FFF4B0):
+```
+ℹ️  Agent Teams is an experimental feature. If the final plan does not match
+your expectations, try re-running /plan-creation with Task tool mode
+(sequential sub-agents). Individual role outputs are preserved in
+logs/plan-creation/{slug}/ for inspection.
+```
+Then proceed to Stage 5 (Synthesis) — Stage 4 is skipped in AT mode.
+**CRITICAL — Agent File Reuse**: The 3 agent files in `.claude/agents/` contain the role expertise (system prompts, output formats, tool constraints). For AT mode, read each file's content and use it as the teammate's system prompt. This keeps a single source of truth per role — the agent files serve both Task tool mode (via `subagent_type`) and AT mode (content embedded in teammate prompts).
+**Note**: In AT mode, the QA/Critic participates throughout — it can challenge the Architect and Eng Lead via peer DMs during their analysis, not just after. This is the primary quality advantage over Task tool mode.
+---
+### Stage 4: QA / Critic (Sonnet, Sequential — Last, Task Tool Mode ONLY)
+```
+Stage 4: QA / Critic
+├── Load templates/critic-output.md
+├── Read ALL 3 prior output files:
+│   ├── 01-product-owner.md
+│   ├── 02-technical-architect.md
+│   └── 03-eng-delivery-lead.md
+├── Construct prompt using 4-part template
+│   ├── GOAL: Adversarially review all prior analyses — challenge assumptions, identify gaps, stress-test estimates, produce APPROVE/MODIFY/REJECT verdict
+│   ├── CONSTRAINTS: Do not redesign or re-plan — only challenge and validate
+│   ├── CONTEXT: Problem statement + research synthesis + ALL 3 prior outputs (full text) + critic-output.md template
+│   └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/04-qa-critic.md
+├── Spawn plan-creation-qa-critic agent via Task tool
+│   ├── subagent_type: plan-creation-qa-critic
+│   └── model: sonnet (specified in agent frontmatter)
+├── Read Critic output
+└── Token budget check
+```
+**CRITICAL**: The QA/Critic MUST receive ALL 3 prior outputs in full. This is the entire point — the Critic cross-references PO requirements against Architect components against Eng Lead workpackages to find gaps.
+**Skip condition**: If Agent Teams mode was used (Stage 3B), skip Stage 4 entirely — the QA/Critic already participated throughout Stage 3B via peer debate. Proceed directly to Stage 5.
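The mode-dependent stage order can be summarized as follows. This is a sketch with hypothetical mode labels; the stage numbers are those used in this skill.

```python
def stage_sequence(mode: str) -> list:
    """Return the ordered stage labels for the chosen mode."""
    if mode == "task_tool":
        middle = ["3A", "4"]  # parallel Architect + Eng Lead, then QA/Critic
    elif mode == "agent_teams":
        middle = ["3B"]  # QA/Critic participates inside 3B, so Stage 4 is skipped
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["1", "2", *middle, "5", "6"]
```

Both modes share Stages 1, 2, 5, and 6; only the middle of the pipeline differs.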
+### Stage 5: Synthesis & Plan Output (SHARED — Mode-Aware)
+```
+Stage 5: Synthesis
+├── Read ALL 4 agent output files (MANDATORY — do not skip any):
+│   ├── logs/plan-creation/{slug}/01-product-owner.md
+│   ├── logs/plan-creation/{slug}/02-technical-architect.md
+│   ├── logs/plan-creation/{slug}/03-eng-delivery-lead.md
+│   └── logs/plan-creation/{slug}/04-qa-critic.md
+├── If Agent Teams mode: also review lead coordination notes from the AT session
+├── If any output is missing or empty → re-spawn that agent once (max 1 retry)
+│   └── In AT mode: re-spawning is NOT possible — document gap in "Incomplete Coverage"
+├── If retry fails → document gap in synthesis under "Incomplete Coverage"
+├── Load templates/synthesis-output.md
+├── Load templates/plan-output.md
+├── Write synthesis to $PROJECT_DIR/logs/plan-creation/{slug}/synthesis.md
+├── Compose plan draft:
+│   ├── Executive Summary from synthesis consensus + PO problem statement
+│   ├── YAML body from:
+│   │   ├── Phases and workpackages: Eng Lead's WBS + Architect's component structure
+│   │   ├── Milestones: Eng Lead's milestones
+│   │   ├── Dependency graph: Eng Lead's dependency analysis + Architect's integration order
+│   │   ├── Risks: Consolidated from all roles, prioritized by Critic
+│   │   └── Kill criteria: From Critic's verdict
+│   └── Apply Critic's MODIFY requirements (if verdict was MODIFY)
+├── Present draft plan to user via AskUserQuestion approval gate
+├── Critical Evaluation Gate (see below)
+├── Determine plan version:
+│   ├── Glob for existing plans/{slug}/plan_v*.md
+│   ├── If none found: version = v1 (first plan)
+│   ├── If user is iterating on the current plan (minor revision): bump minor (v1 → v1.1, v1.1 → v1.2)
+│   ├── If user is starting fresh or pivoting: bump major (v1 → v2, v2 → v3)
+│   └── When ambiguous, ask user: "Is this a revision of the existing plan (v1.1) or a new plan (v2)?"
+├── On approval: write final plan to plans/{slug}/plan_v{N}.md
+└── Token budget check (must be <65% after synthesis)
+```
+**Enforcement**: Do NOT begin writing synthesis until ALL available agent outputs have been read. The orchestrator must reference every agent's output at least once in the synthesis.
+#### Critical Evaluation Gate (Post-User Q&A)
+After each AskUserQuestion round, do NOT blindly incorporate user responses. Instead:
+**Step 1 — Classify each user response:**
+| Classification | Definition | Action |
+|---------------|------------|--------|
+| **Preference** | Scope, priority, or UX choice (e.g., "I'd prefer v1 to focus on X", "Let's defer Y") | Incorporate directly. These are user decisions — no validation needed. |
+| **Technical Claim** | Assertion about a technology, library, or API (e.g., "Library X supports this", "That API has rate limits") | **Do NOT incorporate.** Trigger Step 2. |
+| **Architectural Suggestion** | Proposed structural approach (e.g., "What if we structure it as a plugin?", "We could use event sourcing") | **Do NOT incorporate.** Trigger Step 2. |
+**Step 2 — For Technical Claims and Architectural Suggestions, present to user:**
+> "Your suggestion about [X] involves a technical claim / architectural approach that hasn't been validated against the codebase and research. I recommend a targeted follow-up with 2 focused agents (Technical Architect + QA/Critic) to verify feasibility and stress-test the approach.
+>
+> This will spawn 2 agents and consume additional token budget.
+>
+> [Run follow-up validation / Incorporate as-is with LOW confidence caveat]"
+**Step 3 — If follow-up validation approved:**
+1. Spawn 2 agents in parallel (single message, 2 Task tool calls):
+   - **Technical Architect** (`plan-creation-architect`) — validates the suggestion against the codebase and research
+   - **QA/Critic** (`plan-creation-qa-critic`) — stress-tests the suggestion
+2. Use the same 4-part prompt template (GOAL/CONSTRAINTS/CONTEXT/OUTPUT)
+3. Provide both agents with: original research synthesis, PO output, and the specific user suggestion
+4. Output to: `$PROJECT_DIR/logs/plan-creation/{slug}/followup-{NN}-architect.md` and `followup-{NN}-critic.md`
+5. Read both outputs, then update plan with validated findings
+6. Tag follow-up findings in plan with: `[Follow-up: validated]` or `[Follow-up: refuted]` or `[Follow-up: mixed — see details]`
+**Step 4 — If user declines follow-up:**
+Incorporate the user's suggestion into the plan with an explicit caveat:
+> **[Unvalidated — user suggestion, not verified against codebase or research]**: {suggestion}
+**Repeat**: After updating the plan, ask if user has additional input. Apply the same classification gate to each round. Each round with Technical Claim / Architectural Suggestion input that triggers validation consumes ~10-15% token budget (2 agents) — warn user if approaching 55%.
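
The four steps of the gate amount to a small decision table. The sketch below is one possible encoding, assuming the classification has already been made; the enum, the action strings, and the two-digit padding of `{NN}` are hypothetical choices, not something the skill prescribes.

```python
from enum import Enum


class Classification(Enum):
    PREFERENCE = "preference"
    TECHNICAL_CLAIM = "technical_claim"
    ARCHITECTURAL_SUGGESTION = "architectural_suggestion"


def gate_action(classification, followup_approved=None):
    """Map a classified user response to the gate's next action.

    followup_approved stays None until the user answers the Step 2 prompt.
    The returned strings are hypothetical labels.
    """
    if classification is Classification.PREFERENCE:
        return "incorporate"              # Step 1: user decision, no validation
    if followup_approved is None:
        return "ask_followup"             # Step 2: offer follow-up validation
    if followup_approved:
        return "spawn_validation"         # Step 3: Architect + QA/Critic
    return "incorporate_with_caveat"      # Step 4: tag as unvalidated


def followup_paths(project_dir, slug, round_number):
    """Build the two Step 3 log paths (zero-padding {NN} is an assumption)."""
    base = f"{project_dir}/logs/plan-creation/{slug}/followup-{round_number:02d}"
    return f"{base}-architect.md", f"{base}-critic.md"
```
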
+### Stage 6: Diagnostics (REQUIRED)
+```
+Stage 6: Diagnostics
+├── Write diagnostic YAML to $PROJECT_DIR/logs/diagnostics/plan-creation-{YYYYMMDD-HHMMSS}.yaml
+└── Verify Mandatory Execution Checklist (top of skill)
+```
+---
+## Execution Flow (F# Pipeline)
+```fsharp
+// Task tool mode (default)
+ProductOwner(topic, research?)                          // Stage 2: Opus, solo
+|> [Architect, EngDeliveryLead](po_output)              // Stage 3A: parallel Task tool
+|> QACritic(all_prior_outputs)                          // Stage 4: sequential Task tool
+|> Synthesis |> ApprovalGate |> PlanOutput(plan_v{N})   // Stage 5: versioned output
+// Agent Teams mode (enhanced, opt-in)
+ProductOwner(topic, research?)                          // Stage 2: Opus, Task tool (solo)
+|> AgentTeam[Architect, EngDeliveryLead, QACritic](po_output)  // Stage 3B: peer debate
+|> Synthesis |> ApprovalGate |> PlanOutput(plan_v{N})   // Stage 5: versioned output
+// Note: Stage 4 skipped in AT mode — QA/Critic participates throughout Stage 3B
+```
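
For readers who prefer plain data over pipeline notation, the two flows above can be written as an ordered list of stages. This is a sketch only: `stage_sequence` is a hypothetical name, and it covers just the stages that appear in the pipeline (Stage 6 diagnostics always follows).

```python
def stage_sequence(agent_teams=False):
    """Return the ordered (stage, roles) steps for one run."""
    steps = [("2", ["ProductOwner"])]
    if agent_teams:
        # Stage 3B: all three roles debate as peers; Stage 4 is skipped.
        steps.append(("3B", ["Architect", "EngDeliveryLead", "QACritic"]))
    else:
        # Stage 3A runs two roles in parallel, then the Critic runs alone.
        steps.append(("3A", ["Architect", "EngDeliveryLead"]))
        steps.append(("4", ["QACritic"]))
    steps.append(("5", ["Synthesis", "ApprovalGate", "PlanOutput"]))
    return steps
```
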
+---
+## Token Budget Management
+| Checkpoint | Threshold | Action |
+|------------|-----------|--------|
+| After constructing PO prompt | >30% consumed | Warn user: "4 agents will consume significant context" |
+| After reading Stage 3A outputs | Running tally | If approaching 55%, checkpoint with user |
+| After synthesis | Must be <65% | Leave room for plan approval + session closing |
+| Synthesis complete at >65% | Immediate | Write plan as-is, create handoff, do not start additional work |
+If token budget is insufficient to complete all 4 agents + synthesis, inform the user and suggest splitting (e.g., "PO + Architect/Eng Lead this session, Critic + synthesis next session").
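
The checkpoint table can be read as a lookup from (checkpoint, percent consumed) to an action. The sketch below assumes two things the table leaves open: "approaching 55%" is taken to mean within 5 percentage points, and exactly 65% is treated as over budget. The checkpoint keys and action strings are hypothetical.

```python
def budget_action(checkpoint, percent_used, approach_margin=5.0):
    """Return the action for a token-budget checkpoint.

    approach_margin is an assumption: how close to 55% counts as "approaching".
    """
    if checkpoint == "po_prompt":
        return "warn_user" if percent_used > 30 else "continue"
    if checkpoint == "stage_3a":
        near_limit = percent_used >= 55 - approach_margin
        return "checkpoint_with_user" if near_limit else "continue"
    if checkpoint == "synthesis":
        # Assumption: exactly 65% counts as over budget ("must be <65%").
        return "write_plan_and_handoff" if percent_used >= 65 else "continue"
    raise ValueError(f"unknown checkpoint: {checkpoint!r}")
```
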
+---
+## Error Handling
+**Both modes:**
+| Scenario | Action |
+|----------|--------|
+| Agent returns empty output | Re-spawn once (Task tool mode only; re-spawning is not possible in AT mode). If still empty, document gap in synthesis under "Incomplete Coverage". |
+| Agent returns truncated output | Accept as-is, note in diagnostics. |
+| Agent fails to spawn | Re-spawn once. If it still fails, skip the role and document the gap. |
+| PO fails | STOP — subsequent agents depend on PO. Inform user. |
+| Token budget exceeded mid-session | Stop spawning, synthesize from available outputs, note incomplete. |
+| Research synthesis not provided | Warn user that plan quality will be lower, then proceed. |
+| User rejects plan draft | Ask what needs to change, re-enter Critical Evaluation Gate. |
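
The retry-once rule for Task tool mode can be sketched as a wrapper around whatever actually spawns an agent. Everything here is illustrative: `spawn` is a hypothetical callable standing in for a Task tool call, and giving the PO one retry before stopping is an assumption (the table only says a PO failure stops the run).

```python
class PipelineStopped(RuntimeError):
    """Raised when a role that later agents depend on cannot produce output."""


def run_with_single_retry(spawn, role, critical_roles=("ProductOwner",)):
    """Call spawn(role) at most twice; return (output, gap_note)."""
    for _ in range(2):  # first attempt plus at most one re-spawn
        try:
            output = spawn(role)
        except Exception:
            output = None  # a failed spawn is handled like empty output
        if output and output.strip():
            return output, None
    if role in critical_roles:
        raise PipelineStopped(f"{role} failed; subsequent agents depend on it")
    return None, f"Incomplete Coverage: no output from {role} after one retry"
```

The gap note is meant to be carried into the synthesis rather than silently dropped.
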
+**Agent Teams mode only:**
+| Scenario | Action |
+|----------|--------|
+| Teammate appears stuck (no WORK COMPLETE, no task updates) | Send status check via mailbox. Wait for response before any shutdown attempt. |
+| Teammate never sends WORK COMPLETE | After status check + 1 follow-up, check if log file was written. If log exists and is non-empty, treat as implicit completion. Document in diagnostics. |
+| Peer DM traffic invisible to lead | Expected — this is an AT architectural constraint. Rely on CC-to-lead summaries and task list state. |
+| One teammate fails, others succeed | Document gap. Do NOT shut down working teammates — let them complete. Synthesize from available outputs. |
+| AT env var absent but user requested AT | Notify user, fall back to Task tool mode (Stage 3A). |
+| User selects "Switch to Task tool" in AT Confirmation | Execute Stage 3A instead. No AT infrastructure spawned. |
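
Two of the rows above are mechanical enough to sketch: the fallback when the AT environment variable is absent, and the "implicit completion" check on a teammate's log file. The function names are hypothetical, and the variable name is passed in as a parameter because this section does not state it.

```python
import os


def select_mode(user_requested_at, at_env_var, environ=None):
    """Return (mode, notice). at_env_var is supplied by the caller."""
    environ = os.environ if environ is None else environ
    if not user_requested_at:
        return "task_tool", None
    if at_env_var not in environ:
        # AT requested but unavailable: notify the user and use Stage 3A.
        return "task_tool", "Agent Teams unavailable; falling back to Task tool mode"
    return "agent_teams", None


def is_implicitly_complete(log_path):
    """A teammate that never sent WORK COMPLETE counts as done if its log is non-empty."""
    return os.path.isfile(log_path) and os.path.getsize(log_path) > 0
```

Per the table, the implicit-completion check applies only after a status check and one follow-up have gone unanswered.
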
+---
+## Diagnostic Output (REQUIRED)
+**MANDATORY**: You MUST write diagnostic output after every invocation. This is Stage 6 and cannot be skipped.
+Write to: `$PROJECT_DIR/logs/diagnostics/plan-creation-{YYYYMMDD-HHMMSS}.yaml`
+**Template**: Use `templates/diagnostic-output.yaml` for the schema. Fill in actual values from the session.
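
The diagnostic file name follows the `{YYYYMMDD-HHMMSS}` pattern, which maps directly onto a `strftime` format. A minimal sketch, assuming local time is acceptable (the skill does not say whether the timestamp is local or UTC); `diagnostic_path` is a hypothetical helper name.

```python
from datetime import datetime


def diagnostic_path(project_dir, now=None):
    """Build the Stage 6 diagnostic file path for the given (or current) time."""
    now = now or datetime.now()  # assumption: local time
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return f"{project_dir}/logs/diagnostics/plan-creation-{stamp}.yaml"
```
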
+---
+**Do NOT return to user until all applicable checkboxes can be marked complete.**