@arthai/agents 1.0.5 → 1.0.6

Files changed (130)
  1. package/README.md +33 -3
  2. package/VERSION +1 -1
  3. package/agents/troubleshooter.md +132 -0
  4. package/bin/cli.js +297 -0
  5. package/bundles/canvas.json +1 -1
  6. package/bundles/compass.json +1 -1
  7. package/bundles/counsel.json +1 -0
  8. package/bundles/cruise.json +1 -1
  9. package/bundles/forge.json +12 -1
  10. package/bundles/prism.json +1 -0
  11. package/bundles/scalpel.json +5 -2
  12. package/bundles/sentinel.json +8 -2
  13. package/bundles/shield.json +1 -0
  14. package/bundles/spark.json +1 -0
  15. package/compiler.sh +14 -0
  16. package/dist/plugins/canvas/.claude-plugin/plugin.json +1 -1
  17. package/dist/plugins/canvas/VERSION +1 -0
  18. package/dist/plugins/canvas/commands/planning.md +100 -11
  19. package/dist/plugins/canvas/hooks/hooks.json +16 -0
  20. package/dist/plugins/canvas/hooks/project-setup.sh +109 -0
  21. package/dist/plugins/canvas/templates/CLAUDE.md.managed-block +123 -0
  22. package/dist/plugins/canvas/templates/CLAUDE.md.template +111 -0
  23. package/dist/plugins/compass/.claude-plugin/plugin.json +1 -1
  24. package/dist/plugins/compass/VERSION +1 -0
  25. package/dist/plugins/compass/commands/planning.md +100 -11
  26. package/dist/plugins/compass/hooks/hooks.json +16 -0
  27. package/dist/plugins/compass/hooks/project-setup.sh +109 -0
  28. package/dist/plugins/compass/templates/CLAUDE.md.managed-block +123 -0
  29. package/dist/plugins/compass/templates/CLAUDE.md.template +111 -0
  30. package/dist/plugins/counsel/.claude-plugin/plugin.json +1 -1
  31. package/dist/plugins/counsel/VERSION +1 -0
  32. package/dist/plugins/counsel/hooks/hooks.json +10 -0
  33. package/dist/plugins/counsel/hooks/project-setup.sh +109 -0
  34. package/dist/plugins/counsel/templates/CLAUDE.md.managed-block +123 -0
  35. package/dist/plugins/counsel/templates/CLAUDE.md.template +111 -0
  36. package/dist/plugins/cruise/.claude-plugin/plugin.json +1 -1
  37. package/dist/plugins/cruise/VERSION +1 -0
  38. package/dist/plugins/cruise/hooks/hooks.json +16 -0
  39. package/dist/plugins/cruise/hooks/project-setup.sh +109 -0
  40. package/dist/plugins/cruise/templates/CLAUDE.md.managed-block +123 -0
  41. package/dist/plugins/cruise/templates/CLAUDE.md.template +111 -0
  42. package/dist/plugins/forge/.claude-plugin/plugin.json +1 -1
  43. package/dist/plugins/forge/VERSION +1 -0
  44. package/dist/plugins/forge/agents/troubleshooter.md +132 -0
  45. package/dist/plugins/forge/commands/implement.md +99 -1
  46. package/dist/plugins/forge/commands/planning.md +100 -11
  47. package/dist/plugins/forge/hooks/escalation-guard.sh +177 -0
  48. package/dist/plugins/forge/hooks/hooks.json +22 -0
  49. package/dist/plugins/forge/hooks/project-setup.sh +109 -0
  50. package/dist/plugins/forge/templates/CLAUDE.md.managed-block +123 -0
  51. package/dist/plugins/forge/templates/CLAUDE.md.template +111 -0
  52. package/dist/plugins/prime/.claude-plugin/plugin.json +1 -1
  53. package/dist/plugins/prime/VERSION +1 -0
  54. package/dist/plugins/prime/agents/troubleshooter.md +132 -0
  55. package/dist/plugins/prime/commands/calibrate.md +20 -0
  56. package/dist/plugins/prime/commands/ci-fix.md +36 -0
  57. package/dist/plugins/prime/commands/fix.md +23 -0
  58. package/dist/plugins/prime/commands/implement.md +99 -1
  59. package/dist/plugins/prime/commands/planning.md +100 -11
  60. package/dist/plugins/prime/commands/qa-incident.md +54 -0
  61. package/dist/plugins/prime/commands/restart.md +186 -30
  62. package/dist/plugins/prime/hooks/escalation-guard.sh +177 -0
  63. package/dist/plugins/prime/hooks/hooks.json +60 -0
  64. package/dist/plugins/prime/hooks/post-config-change-restart-reminder.sh +86 -0
  65. package/dist/plugins/prime/hooks/post-server-crash-watch.sh +120 -0
  66. package/dist/plugins/prime/hooks/pre-server-port-guard.sh +110 -0
  67. package/dist/plugins/prime/hooks/project-setup.sh +109 -0
  68. package/dist/plugins/prime/hooks/sync-agents.sh +99 -12
  69. package/dist/plugins/prime/templates/CLAUDE.md.managed-block +123 -0
  70. package/dist/plugins/prime/templates/CLAUDE.md.template +111 -0
  71. package/dist/plugins/prism/.claude-plugin/plugin.json +1 -1
  72. package/dist/plugins/prism/VERSION +1 -0
  73. package/dist/plugins/prism/commands/qa-incident.md +54 -0
  74. package/dist/plugins/prism/hooks/hooks.json +12 -0
  75. package/dist/plugins/prism/hooks/project-setup.sh +109 -0
  76. package/dist/plugins/prism/templates/CLAUDE.md.managed-block +123 -0
  77. package/dist/plugins/prism/templates/CLAUDE.md.template +111 -0
  78. package/dist/plugins/scalpel/.claude-plugin/plugin.json +1 -1
  79. package/dist/plugins/scalpel/VERSION +1 -0
  80. package/dist/plugins/scalpel/agents/troubleshooter.md +132 -0
  81. package/dist/plugins/scalpel/commands/ci-fix.md +36 -0
  82. package/dist/plugins/scalpel/commands/fix.md +23 -0
  83. package/dist/plugins/scalpel/hooks/escalation-guard.sh +177 -0
  84. package/dist/plugins/scalpel/hooks/hooks.json +24 -0
  85. package/dist/plugins/scalpel/hooks/project-setup.sh +109 -0
  86. package/dist/plugins/scalpel/templates/CLAUDE.md.managed-block +123 -0
  87. package/dist/plugins/scalpel/templates/CLAUDE.md.template +111 -0
  88. package/dist/plugins/sentinel/.claude-plugin/plugin.json +1 -1
  89. package/dist/plugins/sentinel/VERSION +1 -0
  90. package/dist/plugins/sentinel/agents/troubleshooter.md +132 -0
  91. package/dist/plugins/sentinel/commands/restart.md +186 -30
  92. package/dist/plugins/sentinel/hooks/escalation-guard.sh +177 -0
  93. package/dist/plugins/sentinel/hooks/hooks.json +64 -0
  94. package/dist/plugins/sentinel/hooks/post-config-change-restart-reminder.sh +86 -0
  95. package/dist/plugins/sentinel/hooks/post-server-crash-watch.sh +120 -0
  96. package/dist/plugins/sentinel/hooks/pre-server-port-guard.sh +110 -0
  97. package/dist/plugins/sentinel/hooks/project-setup.sh +109 -0
  98. package/dist/plugins/sentinel/templates/CLAUDE.md.managed-block +123 -0
  99. package/dist/plugins/sentinel/templates/CLAUDE.md.template +111 -0
  100. package/dist/plugins/shield/.claude-plugin/plugin.json +1 -1
  101. package/dist/plugins/shield/VERSION +1 -0
  102. package/dist/plugins/shield/hooks/hooks.json +22 -12
  103. package/dist/plugins/shield/hooks/project-setup.sh +109 -0
  104. package/dist/plugins/shield/templates/CLAUDE.md.managed-block +123 -0
  105. package/dist/plugins/shield/templates/CLAUDE.md.template +111 -0
  106. package/dist/plugins/spark/.claude-plugin/plugin.json +1 -1
  107. package/dist/plugins/spark/VERSION +1 -0
  108. package/dist/plugins/spark/commands/calibrate.md +20 -0
  109. package/dist/plugins/spark/hooks/hooks.json +10 -0
  110. package/dist/plugins/spark/hooks/project-setup.sh +109 -0
  111. package/dist/plugins/spark/templates/CLAUDE.md.managed-block +123 -0
  112. package/dist/plugins/spark/templates/CLAUDE.md.template +111 -0
  113. package/hook-defs.json +31 -0
  114. package/hooks/escalation-guard.sh +177 -0
  115. package/hooks/post-config-change-restart-reminder.sh +86 -0
  116. package/hooks/post-server-crash-watch.sh +120 -0
  117. package/hooks/pre-server-port-guard.sh +110 -0
  118. package/hooks/project-setup.sh +109 -0
  119. package/hooks/sync-agents.sh +99 -12
  120. package/install.sh +2 -2
  121. package/package.json +1 -1
  122. package/portable.manifest +7 -1
  123. package/skills/calibrate/SKILL.md +20 -0
  124. package/skills/ci-fix/SKILL.md +36 -0
  125. package/skills/fix/SKILL.md +23 -0
  126. package/skills/implement/SKILL.md +99 -1
  127. package/skills/license/SKILL.md +159 -0
  128. package/skills/planning/SKILL.md +100 -11
  129. package/skills/qa-incident/SKILL.md +54 -0
  130. package/skills/restart/SKILL.md +187 -31
@@ -0,0 +1,132 @@
1
+ ---
2
+ name: troubleshooter
3
+ description: "Specialized debugging agent for when other agents get stuck. Performs root cause analysis using error context, knowledge base, git history, and CLAUDE.md. Produces structured diagnosis with confidence level and recommended fix."
4
+ model: sonnet
5
+ ---
6
+
7
+ # Troubleshooter Agent
8
+
9
+ You are a specialized debugging agent. You are called when another agent or workflow
10
+ has failed multiple times and needs expert diagnosis.
11
+
12
+ ## When You Are Spawned
13
+
14
+ Another agent has hit a wall — they've tried 2-3 fixes and keep failing. Your job
15
+ is to diagnose the root cause and provide a fix with a confidence rating.
16
+
17
+ ## Your Process (follow in order)
18
+
19
+ ### 1. Understand the Problem (DO NOT SKIP)
20
+
21
+ Read the error context provided in your spawn prompt. Extract:
22
+ - **Exact error message** (not paraphrased)
23
+ - **What was being attempted** (the goal, not just the command)
24
+ - **What has already been tried** (and why each attempt failed)
25
+ - **The file(s) involved**
26
+
27
+ ### 2. Consult Knowledge Base (BEFORE forming any hypothesis)
28
+
29
+ Check these sources in order:
30
+
31
+ ```
32
+ .claude/knowledge/qa-knowledge/ → past incidents with error signatures
33
+ .claude/knowledge/shared/conventions.md → project-specific gotchas and rules
34
+ .claude/knowledge/shared/patterns.md → architecture patterns that may explain the error
35
+ .claude/knowledge/agents/ → per-agent learning files
36
+ CLAUDE.md → project configuration, test commands, services
37
+ ```
38
+
39
+ Search for:
40
+ - The exact error message (or key phrases)
41
+ - The file/module involved
42
+ - The command that failed
43
+ - Similar past incidents
44
+
45
+ **If you find a match:** Follow the documented fix. Do not reinvent.
46
+ **If no match:** Proceed to step 3.
47
+
48
+ ### 3. Gather Fresh Evidence
49
+
50
+ Read the actual source code around the error:
51
+ - The file mentioned in the error (read 50+ lines of context, not just the error line)
52
+ - Related files (imports, callers, configuration)
53
+ - Recent changes: `git log --oneline -10 -- <file>` and `git diff HEAD -- <file>`
54
+
55
+ Check the environment:
56
+ - `git status` — are there uncommitted changes that might cause the issue?
57
+ - Check if the right dependencies are installed (node_modules, venv, etc.)
58
+ - Check if services are running (ports, Docker containers)
59
+ - Check environment variables that the code expects
60
+
61
+ ### 4. Form Hypothesis (evidence-based only)
62
+
63
+ Based on steps 2-3, form ONE primary hypothesis and optionally one alternative.
64
+ Each hypothesis MUST cite evidence:
65
+
66
+ ```
67
+ HYPOTHESIS: [what I think is wrong]
68
+ EVIDENCE:
69
+ - [source]: [what I found that supports this]
70
+ - [source]: [what I found that supports this]
71
+ CONFIDENCE: HIGH / MEDIUM / LOW
72
+ - HIGH: evidence directly explains the error, fix is clear
73
+ - MEDIUM: evidence is consistent but not conclusive
74
+ - LOW: best guess based on limited evidence
75
+ ```
76
+
77
+ ### 5. Recommend Fix
78
+
79
+ Provide a specific, actionable fix:
80
+
81
+ ```
82
+ RECOMMENDED FIX:
83
+ File: [exact file path]
84
+ Change: [what to modify — be specific, not vague]
85
+ Why: [how this addresses the root cause]
86
+ Verify: [command to run to confirm the fix works]
87
+
88
+ ALTERNATIVE FIX (if confidence < HIGH):
89
+ File: [exact file path]
90
+ Change: [what to modify]
91
+ Why: [different hypothesis this addresses]
92
+ ```
93
+
94
+ ### 6. Output Format
95
+
96
+ Always produce this structured output:
97
+
98
+ ```markdown
99
+ ## Troubleshooter Diagnosis
100
+
101
+ **Error:** [exact error]
102
+ **Root Cause:** [1-2 sentence explanation]
103
+ **Confidence:** HIGH / MEDIUM / LOW
104
+
105
+ ### Evidence
106
+ - [source 1]: [finding]
107
+ - [source 2]: [finding]
108
+ - Knowledge base: [match found / no match]
109
+
110
+ ### Recommended Fix
111
+ - File: [path]
112
+ - Change: [specific change]
113
+ - Verify: [command]
114
+
115
+ ### What Was Wrong With Previous Attempts
116
+ - Attempt 1: [why it didn't work — specific reason]
117
+ - Attempt 2: [why it didn't work — specific reason]
118
+
119
+ ### If This Doesn't Work
120
+ - [Next diagnostic step to try]
121
+ - [What data to gather]
122
+ - [Whether to escalate to user — and what to ask them]
123
+ ```
124
+
125
+ ## Rules
126
+
127
+ 1. **Never guess.** Every claim must cite evidence from code, logs, KB, or git history.
128
+ 2. **Check KB first.** If a past incident matches, use that fix. Don't reinvent.
129
+ 3. **Be specific.** "Check the config" is not a fix. "Change line 42 of config.ts from X to Y" is.
130
+ 4. **Explain why previous attempts failed.** This is as valuable as the fix itself.
131
+ 5. **Know when to escalate.** If confidence is LOW and you can't gather more evidence, say so. Recommend what data to ask the user for.
132
+ 6. **Don't try the fix yourself.** Your job is diagnosis. The calling agent implements the fix.
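The knowledge-base lookup in step 2 of the troubleshooter process could be sketched as below. This is a minimal illustration, not part of the package: `kb_lookup` is a hypothetical helper, the search order is copied from the list in step 2, and matching is a plain fixed-string search that assumes the error signature is stored verbatim in the knowledge base.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the toolkit): search the knowledge-base
# locations from step 2, in the documented order, for a literal error signature.
# $1 = error signature, $2 = project root (defaults to the current directory).
kb_lookup() {
  local signature="$1" root="${2:-.}" src
  for src in \
    ".claude/knowledge/qa-knowledge" \
    ".claude/knowledge/shared/conventions.md" \
    ".claude/knowledge/shared/patterns.md" \
    ".claude/knowledge/agents" \
    "CLAUDE.md"; do
    # Skip sources that do not exist in this project.
    [ -e "$root/$src" ] || continue
    # -r recurses into directories, -F treats the signature as a literal string,
    # -q suppresses output, -- protects signatures that start with a dash.
    if grep -rqF -- "$signature" "$root/$src"; then
      printf 'MATCH %s\n' "$src"
      return 0
    fi
  done
  printf 'NO_MATCH\n'
  return 1
}
```

A call such as `kb_lookup "ECONNREFUSED 127.0.0.1:5432" "$CLAUDE_PROJECT_DIR"` reports the first source that contains the signature, which mirrors the "if you find a match, follow the documented fix" rule.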
@@ -72,6 +72,26 @@ This goes far deeper than `/scan`. Read **actual source code** to understand HOW
72
72
 
73
73
  #### Step 1.1: Foundation Scan
74
74
 
75
+ **Managed block check (belt-and-suspenders — runs before anything else):**
76
+
77
+ Check if CLAUDE.md has the toolkit managed block. If missing, inject it before proceeding:
78
+
79
+ ```bash
80
+ MANAGED_START="<!-- >>> claude-agents toolkit (DO NOT EDIT THIS BLOCK) >>> -->"
81
+ if [ -f "$CLAUDE_PROJECT_DIR/CLAUDE.md" ]; then
82
+ grep -qF "$MANAGED_START" "$CLAUDE_PROJECT_DIR/CLAUDE.md" || echo "MISSING_BLOCK"
83
+ fi
84
+ ```
85
+
86
+ If the managed block is missing:
87
+ 1. Read `~/.claude-agents/templates/CLAUDE.md.managed-block` (or `$CLAUDE_PROJECT_DIR/.claude/hooks/../templates/CLAUDE.md.managed-block` if installed via plugin)
88
+ 2. Inject it at the end of CLAUDE.md using the markers:
89
+ - Start: `<!-- >>> claude-agents toolkit (DO NOT EDIT THIS BLOCK) >>> -->`
90
+ - End: `<!-- <<< claude-agents toolkit <<< -->`
91
+ 3. Report: "Injected toolkit managed block into CLAUDE.md (was missing)"
92
+
93
+ This catches any install path that missed the injection — clone installs, manual setups, or projects that predate the managed block feature.
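The injection step described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the toolkit's own code: `inject_managed_block` is a hypothetical name, and the sketch assumes the template file holds only the block body, without the start and end markers.

```bash
#!/usr/bin/env bash
MANAGED_START="<!-- >>> claude-agents toolkit (DO NOT EDIT THIS BLOCK) >>> -->"
MANAGED_END="<!-- <<< claude-agents toolkit <<< -->"

# Hypothetical helper: append the managed block to CLAUDE.md unless the start
# marker is already present. $1 = CLAUDE.md path, $2 = block template path.
inject_managed_block() {
  local target="$1" template="$2"
  [ -f "$target" ] || return 1
  [ -f "$template" ] || return 1
  # Idempotence check: do nothing when the start marker already exists.
  if grep -qF -- "$MANAGED_START" "$target"; then
    echo "PRESENT"
    return 0
  fi
  # Assumption: the template does not already contain the markers, so they
  # are written around it here.
  {
    printf '\n%s\n' "$MANAGED_START"
    cat "$template"
    printf '%s\n' "$MANAGED_END"
  } >> "$target"
  echo "Injected toolkit managed block into CLAUDE.md (was missing)"
}
```

Checking for the start marker before appending makes the function safe to run on every calibrate pass, which is the point of the belt-and-suspenders check.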
94
+
75
95
  Run `/scan` first if CLAUDE.md has `<!-- TODO -->` placeholders or doesn't exist. This populates
76
96
  the basics (tech stack, services, test commands, infrastructure). Then proceed to deep scan.
77
97
 
@@ -160,6 +160,42 @@ gh run view <FAILED_RUN_ID> --log-failed 2>&1 | tail -200
160
160
  | **Build failures** | build errors | Read error, fix import/export/config |
161
161
  | **Migration** | Alembic/Django errors | Fix migration file |
162
162
  | **Dependency** | pip/npm install failures | Fix requirements/package.json |
163
+ | **Toolkit tests** | 15-manifest-coverage, 19-brownfield-assessment, 20-skill-runtime-safety | See Toolkit-Specific Test Fixes below |
164
+
165
+ #### Toolkit-Specific Test Fixes (claude-agents repo)
166
+
167
+ When CI fails on the mechanical test suite (`tests/run.sh`), these are the common failures and auto-fixes:
168
+
169
+ | Test | Failure message | Root cause | Auto-fix |
170
+ |------|----------------|-----------|----------|
171
+ | `20-skill-runtime-safety` | "regex-unsafe [brackets] in descriptions" | SKILL.md `description:` or `arguments:` field contains `[text]` | Replace `[text]` with `<text>` in the frontmatter field. Brackets break regex matching in Claude Code. |
172
+ | `20-skill-runtime-safety` | "Skills missing required frontmatter fields" | SKILL.md missing `user-invocable: true` or `arguments:` | Add missing field to the YAML frontmatter between `---` markers. Check `git show HEAD~1:path/to/SKILL.md` for the original. |
173
+ | `15-manifest-coverage` | "entries mapped to categories" | New file in `portable.manifest` not listed in any `get_category_items()` category in `install.sh` | Add the manifest entry to the appropriate category in `install.sh:get_category_items()`. |
174
+ | `15-manifest-coverage` | "Install creates all expected symlinks" | New file in `portable.manifest` but install didn't create the symlink | Usually follows from the category mapping fix above. |
175
+ | `15-manifest-coverage` | "Entry counts are consistent" | Mismatch between manifest entries and installed files | Check that new manifest entries have matching source files. |
176
+ | `19-brownfield-assessment` | "classify_file returns IDENTICAL" | Agent fixture is stale after editing an agent `.md` file | Update fixture: `cp agents/{name}.md tests/fixtures/claude-setups/poweruser/.claude/agents/` |
177
+
178
+ **Auto-fix sequence for toolkit tests:**
179
+
180
+ ```bash
181
+ # 1. Get the exact failure
182
+ gh run view <FAILED_RUN_ID> --log-failed 2>&1 | grep -E "FAIL|✗" | head -5
183
+
184
+ # 2. For bracket issues — find and fix ALL bracket descriptions
185
+ grep -rn 'description:.*\[' skills/*/SKILL.md
186
+ # Replace [text] with <text> in each match
187
+
188
+ # 3. For missing frontmatter — compare against last known good
189
+ git show HEAD~1:path/to/SKILL.md | head -6
190
+ # Restore missing fields
191
+
192
+ # 4. For manifest coverage — add to install.sh categories
193
+ grep "get_category_items" install.sh
194
+ # Add new entries to the right category
195
+
196
+ # 5. Verify locally before pushing
197
+ bash tests/run.sh --suite 15,20 --scenario a
198
+ ```
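The "Replace [text] with <text>" step in the sequence above is left manual. One way to automate it is sketched below, assuming the affected fields are single-line `description:` or `arguments:` entries; `fix_frontmatter_brackets` is a hypothetical helper and is not part of the toolkit. It prints the rewritten file to standard output instead of editing in place, because `sed -i` behaves differently on GNU and BSD systems.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rewrite [text] as <text>, but only on lines that start
# with "description:" or "arguments:". Other lines pass through unchanged.
# Limitation: a body line that happens to start with one of those keys is
# rewritten too, since the sketch does not track the frontmatter boundaries.
fix_frontmatter_brackets() {
  sed -E '/^(description|arguments):/ s/\[([^]]*)\]/<\1>/g' "$1"
}
```

Usage might be `fix_frontmatter_brackets skills/fix/SKILL.md > /tmp/SKILL.md`, followed by a review of the output before it replaces the original file.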
163
199
 
164
200
  **Attempt escalation:**
165
201
  - Attempt 1: Apply the obvious fix (auto-fix tools, direct code fix)
@@ -476,6 +476,29 @@ Select the right agent based on which layer the bug is in:
476
476
  If `.claude/project-profile.md` exists, read it to determine the platform and pick the right agent.
477
477
  If `/calibrate` generated custom agents (e.g., `ios-developer.md`), use those for platform-specific bugs.
478
478
 
479
+ **4.2b: Escalation protocol for fix agents**
480
+
481
+ Include this in the implementation agent's prompt:
482
+
483
+ ```
484
+ ## When Your Fix Doesn't Work (MANDATORY)
485
+
486
+ 1. After first failed attempt: re-read the root cause analysis from Step 1.
487
+ Is the root cause correct? If not, go back to Step 1.
488
+ 2. After second failed attempt: consult knowledge base:
489
+ - .claude/knowledge/qa-knowledge/ (error keywords)
490
+ - .claude/knowledge/shared/conventions.md (project gotchas)
491
+ - git log --all --grep="<error keyword>" --oneline -10
492
+ 3. After third failed attempt: STOP. Do not try another fix.
493
+ Generate a STUCK REPORT and send to team-lead:
494
+ - Error: [exact message]
495
+ - Root cause hypothesis: [from Step 1]
496
+ - Fix attempts: [1, 2, 3 with results]
497
+ - KB consultation results: [what you found]
498
+ - Recommendation: [re-investigate root cause / ask user for X / try different approach]
499
+ 4. If a troubleshooter agent is available, team-lead may spawn one.
500
+ ```
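The attempt ladder in the protocol above amounts to a small lookup from the number of failed attempts to the next action. A minimal sketch follows; the function name and the action labels are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical mapping from the count of failed fix attempts to the next step
# of the escalation protocol. $1 = number of failed attempts so far.
next_escalation_step() {
  case "$1" in
    0) echo "CONTINUE" ;;                    # nothing has failed yet
    1) echo "RECHECK_ROOT_CAUSE" ;;          # re-read the Step 1 analysis
    2) echo "CONSULT_KNOWLEDGE_BASE" ;;      # qa-knowledge, conventions, git log
    *) echo "STOP_AND_SEND_STUCK_REPORT" ;;  # third failure or later
  esac
}
```

Keeping the ladder in one place makes it harder for an agent to drift into a fourth or fifth attempt without escalating.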
501
+
479
502
  **Agent prompt includes:**
480
503
  ```
481
504
  1. Root cause analysis from Step 1
@@ -27,9 +27,15 @@ If no feature name is provided, use AskUserQuestion to get it.
27
27
  If the plan file exists:
28
28
  - Read it with the Read tool.
29
29
  - Parse the YAML frontmatter to extract the `layers` array (`frontend`, `backend`, or both).
30
+ - Parse the `spec` field from frontmatter (e.g., `spec: specs/feature-name.md`).
30
31
  - Use `layers` to determine which agents to spawn (see Agent Selection below).
31
32
  - Use the full file content as `PLAN`.
32
33
 
34
+ **Also check for a spec file** at `.claude/specs/{feature-name}.md` (written by `/planning` Phase 0):
35
+ - If it exists, read it and store as `FEATURE_SPEC`.
36
+ - Extract `USER_STORIES` (the ## User Stories section) and `EDGE_CASES` (the ## Edge Cases section).
37
+ - These are passed to implementation agents and QA for better coverage.
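Pulling `USER_STORIES` and `EDGE_CASES` out of the spec file could be done with a short awk filter, sketched below. `extract_section` is a hypothetical helper; it assumes each section starts with a level-two heading that exactly matches the title and runs until the next level-two heading. A line starting with `## ` inside a fenced block would end the section early, which this sketch does not handle.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the body of one "## <title>" section of a
# markdown file. $1 = file path, $2 = section title without the "## " prefix.
extract_section() {
  awk -v title="$2" '
    # Every level-two heading switches the flag on or off.
    /^## / { in_section = ($0 == "## " title); next }
    # Lines are printed only while inside the requested section.
    in_section { print }
  ' "$1"
}
```

For example, `USER_STORIES="$(extract_section .claude/specs/my-feature.md 'User Stories')"` would capture the stories, where `my-feature` stands in for the real feature name.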
38
+
33
39
  If the plan file does NOT exist:
34
40
  - Check conversation history for a recent `/planning` output. If found, use it as `PLAN` and infer layers from task breakdown.
35
41
  - If neither exists, ask the user with AskUserQuestion:
@@ -134,6 +140,43 @@ Include the results in the shared context block below so agents match existing
134
140
  patterns instead of inventing new ones. This is 60x cheaper than having each
135
141
  Sonnet agent independently explore the codebase.
136
142
 
143
+ ### 3c. Consult Knowledge Base (before agents start)
144
+
145
+ Before spawning implementation agents, check the knowledge base for relevant context:
146
+
147
+ ```
148
+ 1. .claude/knowledge/shared/conventions.md — coding rules and project gotchas
149
+ 2. .claude/knowledge/shared/patterns.md — architecture patterns to follow
150
+ 3. .claude/knowledge/qa-knowledge/ — past incidents in the same area
151
+ 4. git log --all --grep="fix:" --oneline -10 — recent bug fixes that may be relevant
152
+ ```
153
+
154
+ Include any relevant findings in the shared context block as `KNOWLEDGE_CONTEXT`.
155
+ This prevents agents from repeating past mistakes or contradicting established patterns.
156
+
157
+ ### 3d. Escalation Protocol for Implementation Agents
158
+
159
+ Add this to every implementation agent's prompt:
160
+
161
+ ```
162
+ ## When You Get Stuck (MANDATORY PROTOCOL)
163
+
164
+ If a command fails or a fix doesn't work:
165
+ 1. DO NOT retry the same approach more than twice
166
+ 2. After 2 failures with same error: STOP and consult knowledge base
167
+ - .claude/knowledge/shared/conventions.md
168
+ - .claude/knowledge/qa-knowledge/ (search for error keywords)
169
+ - git log --all --grep="<error keyword>" --oneline -10
170
+ 3. After 3 failures: escalate with a STUCK REPORT:
171
+ - Error: [exact message]
172
+ - Attempts: [what you tried, why each failed]
173
+ - Evidence: [logs, state, KB results]
174
+ - What you need: [access/data/decision]
175
+ - Recommendation: [your best option]
176
+ 4. Send the stuck report to team-lead via SendMessage
177
+ 5. If a troubleshooter agent is available, team-lead may spawn one to help
178
+ ```
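The STUCK REPORT fields listed in the protocol above can be rendered by a small formatter so that every report has the same shape. This is a sketch only: `stuck_report` is a hypothetical helper, and the `STUCK REPORT` header line is an assumption, since the source lists the fields but not an exact layout.

```bash
#!/usr/bin/env bash
# Hypothetical formatter for the five STUCK REPORT fields.
# Arguments, in order: error, attempts, evidence, need, recommendation.
stuck_report() {
  if [ "$#" -ne 5 ]; then
    echo "usage: stuck_report ERROR ATTEMPTS EVIDENCE NEED RECOMMENDATION" >&2
    return 2
  fi
  printf 'STUCK REPORT\n'   # header line is an assumption, see the lead-in
  printf -- '- Error: %s\n' "$1"
  printf -- '- Attempts: %s\n' "$2"
  printf -- '- Evidence: %s\n' "$3"
  printf -- '- What you need: %s\n' "$4"
  printf -- '- Recommendation: %s\n' "$5"
}
```

Rejecting calls with missing fields keeps incomplete reports from reaching team-lead.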
179
+
137
180
  ### 4. Build Shared Context Block
138
181
 
139
182
  ```
@@ -146,6 +189,16 @@ Auth: {AUTH_APPROACH}
146
189
  ## Implementation Plan
147
190
  {PLAN}
148
191
 
192
+ ## User Stories (from spec — trace your work to these)
193
+ {USER_STORIES}
194
+
195
+ (If no spec exists, this section is omitted.)
196
+
197
+ ## Edge Cases (from spec — handle these in your implementation)
198
+ {EDGE_CASES}
199
+
200
+ (If no spec exists, this section is omitted.)
201
+
149
202
  ## API Contract
150
203
  {API_CONTRACT}
151
204
 
@@ -204,7 +257,7 @@ Check `.claude/project-profile.md` first (if /calibrate has run). Otherwise the
204
257
 
205
258
  **Always spawn:**
206
259
  - **qa** (subagent_type="qa", model="sonnet", name="qa")
207
- - Prompt: "{SHARED_CONTEXT}\n\nYou are QA. Your job: (1) Review backend and frontend implementations as they complete. (2) Ask teammates 'why did you do X?' when something looks wrong. (3) Run validation checks (linters, type checkers, build commands). (4) Report issues back to the responsible teammate. (5) Mark your tasks done when all checks pass. Do NOT write code — only review and validate."
260
+ - Prompt: "{SHARED_CONTEXT}\n\nYou are QA. Your job: (1) Review backend and frontend implementations as they complete. (2) Verify each user story from the spec is covered by the implementation — flag any story that has no corresponding code. (3) Verify each edge case from the spec is handled — flag any unhandled edge case. (4) Ask teammates 'why did you do X?' when something looks wrong. (5) Run validation checks (linters, type checkers, build commands). (6) Report issues back to the responsible teammate. (7) Mark your tasks done when all checks pass. Do NOT write code — only review and validate.\n\nWhen reviewing, trace each acceptance criterion back to its user story ID (US-1, US-2, etc.) and confirm the implementation satisfies it. Check edge cases (EC-1, EC-2, etc.) have explicit handling in the code."
208
261
 
209
262
  ### 5b. Red Team Phase
210
263
 
@@ -339,6 +392,7 @@ After PASS (or user override of BLOCK):
339
392
 
340
393
  ### 6. Monitor + Coordinate
341
394
 
395
+ **Standard coordination:**
342
396
  - Watch TaskList for progress.
343
397
  - If backend finishes API endpoints, nudge frontend to unblock.
344
398
  - If a teammate is stuck, relay context from the other teammate.
@@ -347,6 +401,50 @@ After PASS (or user override of BLOCK):
347
401
  - If `REDTEAM_MODE=once`, defer Step 5b until all implementation steps are complete.
348
402
  - Track `REDTEAM_CYCLE`. If a BLOCK verdict is returned from Step 5b.4, pause all progress and escalate to the user before continuing.
349
403
 
404
+ **Escalation handling (when an agent sends a STUCK REPORT):**
405
+
406
+ When an agent reports they're stuck (via SendMessage with stuck report format):
407
+
408
+ 1. **Assess scope:** Is this a local issue (one file, one test) or systemic (architecture problem, wrong approach)?
409
+
410
+ 2. **If local issue (single file/test failure):**
411
+ - Check if another teammate can help (e.g., backend stuck on a frontend integration → ask frontend agent)
412
+ - Spawn a troubleshooter agent with the stuck report + error context
413
+ - Relay the troubleshooter's diagnosis back to the stuck agent
414
+ - If troubleshooter confidence is LOW → escalate to user with structured options
415
+
416
+ 3. **If systemic issue (architecture problem, multiple agents affected):**
417
+ - PAUSE all agents (don't let them keep building on a broken foundation)
418
+ - Escalate to user immediately:
419
+ ```
420
+ IMPLEMENTATION BLOCKED
421
+
422
+ What happened: [agent] hit [error] after [N] attempts
423
+ Scope: [local/systemic] — [why you think so]
424
+ Impact: [which tasks are blocked]
425
+ Troubleshooter says: [diagnosis if spawned]
426
+
427
+ Options:
428
+ [1] Fix the root cause (I'll explain what needs to change)
429
+ [2] Adjust the plan (scope down to avoid this area)
430
+ [3] Abort implementation (save work done so far)
431
+ ```
432
+
433
+ 4. **If two agents are stuck simultaneously:**
434
+ - This is almost always a systemic issue → treat as systemic
435
+ - Do NOT spawn two troubleshooters — diagnose once, fix at the root
436
+
437
+ 5. **If a task shows no progress for 3+ consecutive idle cycles:**
438
+ - Check in with the agent: "What's your status on Task #N?"
439
+ - If no meaningful progress → treat as stuck (even without explicit stuck report)
440
+
441
+ **Red team finding escalation:**
442
+
443
+ When red team finds issues that the developer can't fix:
444
+ - If the fix requires changes outside their file ownership → orchestrator makes the cross-cutting change
445
+ - If the fix requires a plan change → escalate to user: "Red team found [issue] that requires changing the plan. Original plan said [X], but we need [Y]. Approve?"
446
+ - If the fix is beyond the team's capability → acknowledge, log it, and add to the PR description as a known limitation
447
+
350
448
  ### 7. Cleanup Implementation Team
351
449
 
352
450
  - Send shutdown_request to all teammates.
@@ -58,7 +58,86 @@ Map user choices:
58
58
 
59
59
  Store the resolved mode as `DEBATE_MODE` (values: `lite`, `fast`, `full`).
60
60
 
61
- ### 3. Codebase Scan
61
+ ### 3. Phase 0: Spec Generation (before debate)
62
+
63
+ Before any debate rounds, the PM generates a **spec doc** that becomes the foundation for all subsequent work. This ensures user stories and edge cases are defined BEFORE architecture decisions.
64
+
65
+ **Create the specs directory:**
66
+ ```bash
67
+ mkdir -p .claude/specs
68
+ ```
69
+
70
+ **Spawn PM agent (subagent_type="product-manager", model="sonnet") for spec generation only:**
71
+
72
+ Prompt:
73
+ ```
74
+ You are a Product Manager generating a spec doc for the feature "{feature-name}".
75
+
76
+ Feature brief: {FEATURE_BRIEF}
77
+
78
+ Generate a spec doc with these sections:
79
+
80
+ ## User Stories
81
+ Write 3-7 user stories covering the happy path and key error states.
82
+ Format: "As a [user type], I want [action], so that [outcome]"
83
+ Each story must have:
84
+ - **Story ID**: US-1, US-2, etc.
85
+ - **Priority**: P0 (must-have for launch) or P1 (important but deferrable)
86
+ - **Acceptance**: specific testable condition that proves this story is done
87
+
88
+ Example:
89
+ US-1 [P0]: As a new developer, I want to install the toolkit with one command,
90
+ so that I can start using it without manual configuration.
91
+ Acceptance: `npx @arthai/agents install forge .` succeeds and skills are available.
92
+
93
+ ## User Journey
94
+ Step-by-step flow from the user's first interaction to completion.
95
+ Include:
96
+ - **Happy path**: numbered steps (1. User does X → 2. System responds Y → ...)
97
+ - **Decision points**: where the user makes a choice (mark with ◆)
98
+ - **Error branches**: where things can go wrong (mark with ✗ and show recovery path)
99
+
100
+ Format as a text flowchart:
101
+ ~~~
102
+ 1. User discovers feature
103
+ 2. User does [action]
104
+ ◆ Decision: [choice A] or [choice B]
105
+ → Choice A: proceed to step 3
106
+ → Choice B: proceed to step 5
107
+ 3. System responds with [result]
108
+ ✗ Error: [what went wrong] → Recovery: [how to fix]
109
+ 4. ...
110
+ ~~~
111
+
112
+ ## Edge Cases
113
+ Structured list of what can go wrong. For each:
114
+ - **ID**: EC-1, EC-2, etc.
115
+ - **Scenario**: what triggers this edge case
116
+ - **Expected behavior**: what should happen (not crash, not hang)
117
+ - **Severity**: Critical (blocks user) / High (degrades experience) / Medium (inconvenience)
118
+ - **Linked story**: which user story this edge case relates to
119
+
120
+ ## Success Criteria
121
+ Measurable outcomes tied to user stories. These become the acceptance criteria
122
+ that /implement and /qa use to validate the implementation.
123
+ - Each criterion references a story ID
124
+ - Each criterion is binary: pass or fail, no subjective judgment
125
+ ```
126
+
127
+ **Write the output to `.claude/specs/{feature-name}.md`** with this frontmatter:
128
+
129
+ ```markdown
130
+ ---
131
+ feature: {feature-name}
132
+ generated: {ISO date}
133
+ stories: {count}
134
+ edge_cases: {count}
135
+ ---
136
+ ```
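The `stories` and `edge_cases` counts in the frontmatter above can be derived from the spec body instead of being typed by hand. The sketch below assumes IDs follow the `US-<n>` and `EC-<n>` patterns used in the prompt and counts each distinct ID once; `spec_frontmatter` is a hypothetical helper, not part of the package.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print spec frontmatter with counts of distinct story
# and edge-case IDs. $1 = feature name, $2 = spec body file,
# $3 = optional ISO date (defaults to today's UTC date).
spec_frontmatter() {
  local feature="$1" body_file="$2" generated="${3:-$(date -u +%Y-%m-%d)}"
  local stories edge_cases
  # grep -o prints each ID on its own line; sort -u removes repeats such as a
  # story referenced again from an edge case. "|| true" keeps a no-match
  # result from aborting scripts that run with errexit and pipefail.
  stories="$({ grep -oE 'US-[0-9]+' "$body_file" || true; } | sort -u | wc -l | tr -d ' ')"
  edge_cases="$({ grep -oE 'EC-[0-9]+' "$body_file" || true; } | sort -u | wc -l | tr -d ' ')"
  printf -- '---\nfeature: %s\ngenerated: %s\nstories: %s\nedge_cases: %s\n---\n' \
    "$feature" "$generated" "$stories" "$edge_cases"
}
```

Deriving the counts keeps the frontmatter consistent with the body when the PM agent adds or removes a story.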
137
+
138
+ **Store the spec content as `FEATURE_SPEC`** — this is injected into the shared context block for all debate participants.
139
+
140
+ ### 4. Codebase Scan
62
141
 
63
142
  Spawn an `explore-light` subagent (model: haiku) to scan for relevant files:
64
143
 
@@ -68,7 +147,7 @@ prompt: "For a feature called '{feature-name}', find: (1) related backend routes
68
147
 
69
148
  Store the result as `CODEBASE_CONTEXT`.
70
149
 
71
- ### 4. Create Team + Tasks
150
+ ### 5. Create Team + Tasks
72
151
 
73
152
  Create team: `planning-{feature-name}`
74
153
 
@@ -76,12 +155,13 @@ Create these tasks:
76
155
 
77
156
  | Task | Owner | Subject |
78
157
  |------|-------|---------|
79
- | 1 | product-manager | Define product spec for {feature-name} |
158
+ | 0 | product-manager | Generate spec doc for {feature-name} (Phase 0 — already done above) |
159
+ | 1 | product-manager | Define product scope for {feature-name} (uses spec as input) |
80
160
  | 2 | architect | Design technical plan for {feature-name} |
81
161
  | 3 (if --design) | design-thinker | Create design brief for {feature-name} |
82
162
  | 4 (if not --lite) | devils-advocate | Challenge scope and feasibility for {feature-name} |
83
163
 
84
- ### 5. Build Shared Context Block
164
+ ### 6. Build Shared Context Block
85
165
 
86
166
  Compose this block to inject into every teammate's spawn prompt:
87
167
 
@@ -95,6 +175,11 @@ Auth: {AUTH_APPROACH}
95
175
  ## Feature: {feature-name}
96
176
  {FEATURE_BRIEF}
97
177
 
178
+ ## Spec Doc (from Phase 0)
179
+ {FEATURE_SPEC}
180
+
181
+ (Full spec at: .claude/specs/{feature-name}.md)
182
+
98
183
  ## Relevant Codebase
99
184
  {CODEBASE_CONTEXT}
100
185
 
@@ -114,13 +199,13 @@ This planning session runs in {DEBATE_MODE} mode.
114
199
  - If DA recommends KILL and user overrides, record as USER OVERRIDE in the plan.
115
200
  ```
116
201
 
117
- ### 6. Spawn Teammates (ALL IN ONE MESSAGE — parallel)
202
+ ### 7. Spawn Teammates (ALL IN ONE MESSAGE — parallel)
118
203
 
119
204
  Spawn all teammates in a **single message** with multiple Task tool calls:
120
205
 
121
206
  **Always spawn:**
122
207
  - **product-manager** (subagent_type="product-manager", model="opus")
123
- - Prompt: "{SHARED_CONTEXT}\n\nYou are the PM. Own the 'what' and 'why'.\n\nIn Round 1, you LEAD with your scope claim. Format your must-haves as:\n [M1] requirement — BECAUSE reason\n [M2] ...\nMaximum 5 MUST-HAVEs. Also list NICE-TO-HAVEs with CUT-IF conditions, EXPLICIT EXCLUSIONS, and success metrics.\n\nIn Round 2, you COUNTER the architect's approach: does it deliver user value, is it over-engineered, what is the time-to-value?\n\nIn Round 3 (full mode only), you DEFEND against the devil's advocate's risk case. Accept or reject each risk with evidence."
208
+ - Prompt: "{SHARED_CONTEXT}\n\nYou are the PM. Own the 'what' and 'why'. You generated the spec doc in Phase 0 — your user stories and edge cases are in the shared context above. Use them as the foundation for your scope claim.\n\nIn Round 1, you LEAD with your scope claim. Each must-have should trace to one or more user stories (reference by ID, e.g., 'US-1, US-3'). Format your must-haves as:\n [M1] requirement — BECAUSE reason (traces to US-X)\n [M2] ...\nMaximum 5 MUST-HAVEs. Also list NICE-TO-HAVEs with CUT-IF conditions, EXPLICIT EXCLUSIONS, and success metrics.\n\nIn Round 2, you COUNTER the architect's approach: does it deliver the user journey as specified? Is it over-engineered? What is the time-to-value? Reference edge cases the approach doesn't handle.\n\nIn Round 3 (full mode only), you DEFEND against the devil's advocate's risk case. Accept or reject each risk with evidence. Reference user stories to justify why a must-have cannot be cut."
124
209
 
125
210
  - **architect** (subagent_type="architect", model="opus")
126
211
  - Prompt: "{SHARED_CONTEXT}\n\nYou are the Architect. Own the 'how'.\n\nIn Round 1, you COUNTER the PM's scope from a feasibility lens. Challenge feasibility, flag hidden complexity, identify scope creep vectors, propose a counter-scope.\n\nIn Round 2, you LEAD with your technical approach: API contract, DB changes, architecture decision + WHY, task breakdown with S/M/L/XL estimates, implementation cost, dependencies, risks.\n\nIn Round 3 (full mode only), you DEFEND against the devil's advocate's risk case. Accept or reject each risk with evidence. Keep it simple for early-stage. Push back on scope that doesn't match the development stage."
@@ -140,7 +225,7 @@ Spawn all teammates in a **single message** with multiple Task tool calls:
140
225
  - **gtm-expert** (subagent_type="gtm-expert", model="sonnet")
141
226
  - Prompt: "{SHARED_CONTEXT}\n\nYou are the GTM Expert. Own distribution and launch strategy. Advise on positioning, viral mechanics, and launch sequencing. Challenge the team on how users will discover and adopt this feature. Your output feeds into Round 3 as additional evidence for the devil's advocate."
142
227
 
143
- ### 7. Structured Debate Protocol
228
+ ### 8. Structured Debate Protocol
144
229
 
145
230
  Facilitate the following rounds in sequence. Each phase completes before the next begins.
146
231
 
@@ -279,7 +364,7 @@ Each matched item is flagged as a RISK NOTE in the plan.
279
364
 
280
365
  ---
281
366
 
282
- ### 8. Convergence Logic
367
+ ### 9. Convergence Logic
283
368
 
284
369
  **Plan is APPROVED when ALL of the following are true after all applicable rounds complete:**
285
370
  - Rounds 1 and 2 have zero UNRESOLVED items
@@ -296,7 +381,7 @@ If user overrides a KILL recommendation, record as `USER OVERRIDE` in the plan.
296
381
 
297
382
  In lite and fast modes, skip convergence checks for rounds that were not run. Apply only the checks applicable to completed rounds.
298
383
 
299
- ### 9. Scope Lock
384
+ ### 10. Scope Lock
300
385
 
301
386
  After convergence, compute a `scope_hash`:
302
387
  - Concatenate all locked MUST-HAVE strings from Round 1 in order
@@ -305,7 +390,7 @@ After convergence, compute a `scope_hash`:
305
390
 
306
391
  When `/implement` loads the plan, it can verify the hash against the locked must-haves to detect tampering.
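A minimal Python sketch of the scope-lock check described above. The plan frontmatter names SHA-256, but the separator and encoding are not visible in this excerpt, so joining with a newline and encoding as UTF-8 are assumptions. The function names `scope_hash` and `verify_scope` are hypothetical.

```python
import hashlib


def scope_hash(must_haves: list[str]) -> str:
    """SHA-256 hex digest over the locked must-haves, in order.

    Assumption: items are joined with a newline and encoded as UTF-8.
    The actual separator is not shown in this excerpt.
    """
    joined = "\n".join(must_haves)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def verify_scope(must_haves: list[str], recorded_hash: str) -> bool:
    """True when the recorded hash still matches the must-haves."""
    return scope_hash(must_haves) == recorded_hash
```

Because the hash covers the strings in order, reordering, editing, or appending a must-have after the lock changes the digest and the check fails.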
307
392
 
308
- ### 10. Write Plan File + Present to User
393
+ ### 11. Write Plan File + Present to User
309
394
 
310
395
  Synthesize teammate outputs into a structured plan and **write it to `.claude/plans/{feature-name}.md`** using the Write tool. This file is read by `/implement` to auto-configure the implementation team.
311
396
 
@@ -317,6 +402,7 @@ feature: {feature-name}
317
402
  debate_mode: {DEBATE_MODE}
318
403
  scope_hash: {SHA-256 of locked must-haves}
319
404
  da_confidence: {HIGH|MEDIUM|LOW|N/A}
405
+ spec: specs/{feature-name}.md
320
406
  layers:
321
407
  - frontend # include if ANY frontend tasks exist
322
408
  - backend # include if ANY backend tasks exist
@@ -324,6 +410,9 @@ layers:
324
410
 
325
411
  # Planning Summary: {feature-name}
326
412
 
413
+ ## Spec Reference
414
+ See `.claude/specs/{feature-name}.md` for user stories, user journey, edge cases, and success criteria.
415
+
327
416
  ## Problem & User Segment (from PM)
328
417
  ...
329
418
 
@@ -394,7 +483,7 @@ layers:
394
483
 
395
484
  Present the plan to the user for review.
396
485
 
397
- ### 11. Cleanup
486
+ ### 12. Cleanup
398
487
 
399
488
  After user reviews the plan:
400
489
  - Send shutdown_request to all teammates.
@@ -40,6 +40,60 @@ affected_files:
40
40
 
41
41
  4. **Confirm to user**: "Incident logged. Next `/qa` run will generate a regression test targeting this issue."
42
42
 
43
+ ## Auto-Logging from Escalation (Circuit Breaker Resolution)
44
+
45
+ When an agent resolves a stuck situation (circuit breaker was tripped, then the issue was fixed),
46
+ the resolution should be logged automatically. This enables future agents to find the fix
47
+ without repeating the same debugging journey.
48
+
49
+ **Trigger:** Any workflow that resolves an error after the escalation guard tripped (3+ consecutive failures).
50
+
51
+ **Auto-log format** — create `.claude/qa-knowledge/incidents/{date}-escalation-{slug}.md`:
52
+
53
+ ```markdown
54
+ ---
55
+ date: {today YYYY-MM-DD}
56
+ severity: medium
57
+ status: covered
58
+ type: escalation-resolution
59
+ error_signature: {from .claude/.escalation-state.json}
60
+ affected_files:
61
+ - {files that were modified to fix the issue}
62
+ ---
63
+ # Escalation: {brief description of what was stuck}
64
+
65
+ ## Error
66
+ {exact error message that triggered the circuit breaker}
67
+
68
+ ## What Was Tried (failed)
69
+ 1. {attempt 1 from escalation state} → {result}
70
+ 2. {attempt 2} → {result}
71
+ 3. {attempt 3} → {result}
72
+
73
+ ## Root Cause
74
+ {what was actually wrong}
75
+
76
+ ## Fix Applied
77
+ {what change resolved the issue}
78
+
79
+ ## How to Prevent
80
+ {if applicable — what convention or check would catch this earlier}
81
+
82
+ ## Search Keywords
83
+ {error message fragments, file names, command patterns — for future KB searches}
84
+ ```
85
+
86
+ **Also update** `.claude/qa-knowledge/bug-patterns.md` if this represents a new pattern:
87
+ ```
88
+ ## {Pattern name}
89
+ - **Signature:** {error message or command pattern}
90
+ - **Root cause:** {common reason}
91
+ - **Fix:** {standard resolution}
92
+ - **First seen:** {date}
93
+ ```
94
+
95
+ This closes the learning loop: stuck → escalate → fix → log → future agents find the logged fix by searching the knowledge base.
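The auto-log step could be sketched in Python roughly as below. The slug rule, the helper names, and the JSON-style quoting of `error_signature` are assumptions made for illustration. Only the directory, the file-name pattern, and the frontmatter keys come from the added text. The schema of `.claude/.escalation-state.json` is not shown here, so the signature is passed in as a plain string.

```python
import json
import re
from datetime import date
from pathlib import Path

INCIDENTS_DIR = Path(".claude/qa-knowledge/incidents")


def slugify(text: str, max_len: int = 40) -> str:
    """Lowercase, collapse runs of non-alphanumerics to '-', then trim.

    Assumption: the source does not define how {slug} is derived.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def incident_path(description: str, today: date) -> Path:
    """Build {date}-escalation-{slug}.md under the incidents directory."""
    return INCIDENTS_DIR / f"{today.isoformat()}-escalation-{slugify(description)}.md"


def incident_frontmatter(error_signature: str, affected_files: list[str],
                         today: date) -> str:
    """Render the frontmatter block for an escalation-resolution incident."""
    lines = [
        "---",
        f"date: {today.isoformat()}",
        "severity: medium",
        "status: covered",
        "type: escalation-resolution",
        # Quoted so that colons in error messages do not break the YAML.
        f"error_signature: {json.dumps(error_signature)}",
        "affected_files:",
    ]
    lines += [f"  - {name}" for name in affected_files]
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Quoting the signature is a defensive choice: error messages often contain `: `, which would otherwise be read as a nested mapping.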
96
+
43
97
  ## If No Description Provided
44
98
 
45
99
  Ask: "Describe the issue you want to log (e.g., 'admin page crashes when user has no sessions')"