feed-the-machine 1.6.1 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (269) hide show
  1. package/LICENSE +21 -21
  2. package/README.md +170 -170
  3. package/bin/brain.py +1340 -0
  4. package/bin/convert_claude_skills_to_codex.py +490 -0
  5. package/bin/generate-manifest.mjs +463 -463
  6. package/bin/harden_codex_skills.py +141 -0
  7. package/bin/install.mjs +491 -491
  8. package/bin/migrate-eng-buddy-data.py +875 -0
  9. package/bin/playbook_engine/__init__.py +1 -0
  10. package/bin/playbook_engine/conftest.py +8 -0
  11. package/bin/playbook_engine/extractor.py +33 -0
  12. package/bin/playbook_engine/manager.py +102 -0
  13. package/bin/playbook_engine/models.py +84 -0
  14. package/bin/playbook_engine/registry.py +35 -0
  15. package/bin/playbook_engine/test_extractor.py +72 -0
  16. package/bin/playbook_engine/test_integration.py +129 -0
  17. package/bin/playbook_engine/test_manager.py +85 -0
  18. package/bin/playbook_engine/test_models.py +166 -0
  19. package/bin/playbook_engine/test_registry.py +67 -0
  20. package/bin/playbook_engine/test_tracer.py +86 -0
  21. package/bin/playbook_engine/tracer.py +93 -0
  22. package/bin/tasks_db.py +456 -0
  23. package/docs/HOOKS.md +243 -243
  24. package/docs/INBOX.md +233 -233
  25. package/ftm/SKILL.md +125 -122
  26. package/ftm-audit/SKILL.md +623 -623
  27. package/ftm-audit/references/protocols/PROJECT-PATTERNS.md +91 -91
  28. package/ftm-audit/references/protocols/RUNTIME-WIRING.md +66 -66
  29. package/ftm-audit/references/protocols/WIRING-CONTRACTS.md +135 -135
  30. package/ftm-audit/references/strategies/AUTO-FIX-STRATEGIES.md +69 -69
  31. package/ftm-audit/references/templates/REPORT-FORMAT.md +96 -96
  32. package/ftm-audit/scripts/run-knip.sh +23 -23
  33. package/ftm-audit.yml +2 -2
  34. package/ftm-brainstorm/SKILL.md +1003 -498
  35. package/ftm-brainstorm/evals/evals.json +180 -100
  36. package/ftm-brainstorm/evals/promptfoo.yaml +109 -109
  37. package/ftm-brainstorm/references/agent-prompts.md +552 -224
  38. package/ftm-brainstorm/references/plan-template.md +209 -121
  39. package/ftm-brainstorm.yml +2 -2
  40. package/ftm-browse/SKILL.md +454 -454
  41. package/ftm-browse/daemon/browser-manager.ts +206 -206
  42. package/ftm-browse/daemon/bun.lock +30 -30
  43. package/ftm-browse/daemon/cli.ts +347 -347
  44. package/ftm-browse/daemon/commands.ts +410 -410
  45. package/ftm-browse/daemon/main.ts +357 -357
  46. package/ftm-browse/daemon/package.json +17 -17
  47. package/ftm-browse/daemon/server.ts +189 -189
  48. package/ftm-browse/daemon/snapshot.ts +519 -519
  49. package/ftm-browse/daemon/tsconfig.json +22 -22
  50. package/ftm-browse.yml +4 -4
  51. package/ftm-capture/SKILL.md +370 -370
  52. package/ftm-capture.yml +4 -4
  53. package/ftm-codex-gate/SKILL.md +361 -361
  54. package/ftm-codex-gate.yml +2 -2
  55. package/ftm-config/SKILL.md +422 -345
  56. package/ftm-config.default.yml +125 -82
  57. package/ftm-config.yml +44 -2
  58. package/ftm-council/SKILL.md +416 -416
  59. package/ftm-council/references/prompts/CLAUDE-INVESTIGATION.md +60 -60
  60. package/ftm-council/references/prompts/CODEX-INVESTIGATION.md +58 -58
  61. package/ftm-council/references/prompts/GEMINI-INVESTIGATION.md +58 -58
  62. package/ftm-council/references/prompts/REBUTTAL-TEMPLATE.md +57 -57
  63. package/ftm-council/references/protocols/PREREQUISITES.md +47 -47
  64. package/ftm-council/references/protocols/STEP-0-FRAMING.md +46 -46
  65. package/ftm-council.yml +2 -2
  66. package/ftm-dashboard/SKILL.md +163 -163
  67. package/ftm-dashboard.yml +4 -4
  68. package/ftm-debug/SKILL.md +1037 -1037
  69. package/ftm-debug/references/phases/PHASE-0-INTAKE.md +58 -58
  70. package/ftm-debug/references/phases/PHASE-1-TRIAGE.md +46 -46
  71. package/ftm-debug/references/phases/PHASE-2-WAR-ROOM-AGENTS.md +279 -279
  72. package/ftm-debug/references/phases/PHASE-3-TO-6-EXECUTION.md +436 -436
  73. package/ftm-debug/references/protocols/BLACKBOARD.md +86 -86
  74. package/ftm-debug/references/protocols/EDGE-CASES.md +103 -103
  75. package/ftm-debug.yml +2 -2
  76. package/ftm-diagram/SKILL.md +277 -277
  77. package/ftm-diagram.yml +2 -2
  78. package/ftm-executor/SKILL.md +777 -777
  79. package/ftm-executor/references/STYLE-TEMPLATE.md +73 -73
  80. package/ftm-executor/references/phases/PHASE-0-VERIFICATION.md +62 -62
  81. package/ftm-executor/references/phases/PHASE-2-AGENT-ASSEMBLY.md +34 -34
  82. package/ftm-executor/references/phases/PHASE-3-WORKTREES.md +38 -38
  83. package/ftm-executor/references/phases/PHASE-4-5-AUDIT.md +72 -72
  84. package/ftm-executor/references/phases/PHASE-4-DISPATCH.md +66 -66
  85. package/ftm-executor/references/phases/PHASE-5-5-CODEX-GATE.md +73 -73
  86. package/ftm-executor/references/protocols/DOCUMENTATION-BOOTSTRAP.md +36 -36
  87. package/ftm-executor/references/protocols/MODEL-PROFILE.md +59 -59
  88. package/ftm-executor/references/protocols/PROGRESS-TRACKING.md +66 -66
  89. package/ftm-executor/runtime/ftm-runtime.mjs +252 -252
  90. package/ftm-executor/runtime/package.json +8 -8
  91. package/ftm-executor.yml +2 -2
  92. package/ftm-git/SKILL.md +441 -441
  93. package/ftm-git/evals/evals.json +26 -26
  94. package/ftm-git/evals/promptfoo.yaml +75 -75
  95. package/ftm-git/hooks/post-commit-experience.sh +92 -92
  96. package/ftm-git/references/patterns/SECRET-PATTERNS.md +104 -104
  97. package/ftm-git/references/protocols/REMEDIATION.md +139 -139
  98. package/ftm-git/scripts/pre-commit-secrets.sh +110 -110
  99. package/ftm-git.yml +2 -2
  100. package/ftm-inbox/backend/__pycache__/main.cpython-314.pyc +0 -0
  101. package/ftm-inbox/backend/adapters/_retry.py +64 -64
  102. package/ftm-inbox/backend/adapters/base.py +230 -230
  103. package/ftm-inbox/backend/adapters/freshservice.py +104 -104
  104. package/ftm-inbox/backend/adapters/gmail.py +125 -125
  105. package/ftm-inbox/backend/adapters/jira.py +136 -136
  106. package/ftm-inbox/backend/adapters/registry.py +192 -192
  107. package/ftm-inbox/backend/adapters/slack.py +110 -110
  108. package/ftm-inbox/backend/db/connection.py +54 -54
  109. package/ftm-inbox/backend/db/schema.py +78 -78
  110. package/ftm-inbox/backend/executor/__init__.py +7 -7
  111. package/ftm-inbox/backend/executor/engine.py +149 -149
  112. package/ftm-inbox/backend/executor/step_runner.py +98 -98
  113. package/ftm-inbox/backend/main.py +103 -103
  114. package/ftm-inbox/backend/models/__init__.py +1 -1
  115. package/ftm-inbox/backend/models/unified_task.py +36 -36
  116. package/ftm-inbox/backend/planner/__init__.py +6 -6
  117. package/ftm-inbox/backend/planner/__pycache__/__init__.cpython-314.pyc +0 -0
  118. package/ftm-inbox/backend/planner/__pycache__/generator.cpython-314.pyc +0 -0
  119. package/ftm-inbox/backend/planner/__pycache__/schema.cpython-314.pyc +0 -0
  120. package/ftm-inbox/backend/planner/generator.py +127 -127
  121. package/ftm-inbox/backend/planner/schema.py +34 -34
  122. package/ftm-inbox/backend/requirements.txt +5 -5
  123. package/ftm-inbox/backend/routes/__pycache__/plan.cpython-314.pyc +0 -0
  124. package/ftm-inbox/backend/routes/execute.py +186 -186
  125. package/ftm-inbox/backend/routes/health.py +52 -52
  126. package/ftm-inbox/backend/routes/inbox.py +68 -68
  127. package/ftm-inbox/backend/routes/plan.py +271 -271
  128. package/ftm-inbox/bin/launchagent.mjs +91 -91
  129. package/ftm-inbox/bin/setup.mjs +188 -188
  130. package/ftm-inbox/bin/start.sh +10 -10
  131. package/ftm-inbox/bin/status.sh +17 -17
  132. package/ftm-inbox/bin/stop.sh +8 -8
  133. package/ftm-inbox/config.example.yml +55 -55
  134. package/ftm-inbox/package-lock.json +2898 -2898
  135. package/ftm-inbox/package.json +26 -26
  136. package/ftm-inbox/postcss.config.js +6 -6
  137. package/ftm-inbox/src/app.css +199 -199
  138. package/ftm-inbox/src/app.html +18 -18
  139. package/ftm-inbox/src/lib/api.ts +166 -166
  140. package/ftm-inbox/src/lib/components/ExecutionLog.svelte +81 -81
  141. package/ftm-inbox/src/lib/components/InboxFeed.svelte +143 -143
  142. package/ftm-inbox/src/lib/components/PlanStep.svelte +271 -271
  143. package/ftm-inbox/src/lib/components/PlanView.svelte +206 -206
  144. package/ftm-inbox/src/lib/components/StreamPanel.svelte +99 -99
  145. package/ftm-inbox/src/lib/components/TaskCard.svelte +190 -190
  146. package/ftm-inbox/src/lib/components/ui/EmptyState.svelte +63 -63
  147. package/ftm-inbox/src/lib/components/ui/KawaiiCard.svelte +86 -86
  148. package/ftm-inbox/src/lib/components/ui/PillButton.svelte +106 -106
  149. package/ftm-inbox/src/lib/components/ui/StatusBadge.svelte +67 -67
  150. package/ftm-inbox/src/lib/components/ui/StreamDrawer.svelte +149 -149
  151. package/ftm-inbox/src/lib/components/ui/ThemeToggle.svelte +80 -80
  152. package/ftm-inbox/src/lib/theme.ts +47 -47
  153. package/ftm-inbox/src/routes/+layout.svelte +76 -76
  154. package/ftm-inbox/src/routes/+page.svelte +401 -401
  155. package/ftm-inbox/svelte.config.js +12 -12
  156. package/ftm-inbox/tailwind.config.ts +63 -63
  157. package/ftm-inbox/tsconfig.json +13 -13
  158. package/ftm-inbox/vite.config.ts +6 -6
  159. package/ftm-intent/SKILL.md +241 -241
  160. package/ftm-intent.yml +2 -2
  161. package/ftm-manifest.json +3794 -3794
  162. package/ftm-map/SKILL.md +291 -291
  163. package/ftm-map/scripts/db.py +712 -712
  164. package/ftm-map/scripts/index.py +415 -415
  165. package/ftm-map/scripts/parser.py +224 -224
  166. package/ftm-map/scripts/queries/go-tags.scm +20 -20
  167. package/ftm-map/scripts/queries/javascript-tags.scm +35 -35
  168. package/ftm-map/scripts/queries/python-tags.scm +31 -31
  169. package/ftm-map/scripts/queries/ruby-tags.scm +19 -19
  170. package/ftm-map/scripts/queries/rust-tags.scm +37 -37
  171. package/ftm-map/scripts/queries/typescript-tags.scm +41 -41
  172. package/ftm-map/scripts/query.py +301 -301
  173. package/ftm-map/scripts/ranker.py +377 -377
  174. package/ftm-map/scripts/requirements.txt +5 -5
  175. package/ftm-map/scripts/setup-hooks.sh +27 -27
  176. package/ftm-map/scripts/setup.sh +56 -56
  177. package/ftm-map/scripts/test_db.py +364 -364
  178. package/ftm-map/scripts/test_parser.py +174 -174
  179. package/ftm-map/scripts/test_query.py +183 -183
  180. package/ftm-map/scripts/test_ranker.py +199 -199
  181. package/ftm-map/scripts/views.py +591 -591
  182. package/ftm-map.yml +2 -2
  183. package/ftm-mind/SKILL.md +201 -1943
  184. package/ftm-mind/evals/promptfoo.yaml +142 -142
  185. package/ftm-mind/references/blackboard-protocol.md +110 -0
  186. package/ftm-mind/references/blackboard-schema.md +328 -328
  187. package/ftm-mind/references/complexity-guide.md +110 -110
  188. package/ftm-mind/references/complexity-sizing.md +138 -0
  189. package/ftm-mind/references/decide-act-protocol.md +172 -0
  190. package/ftm-mind/references/direct-execution.md +51 -0
  191. package/ftm-mind/references/environment-discovery.md +77 -0
  192. package/ftm-mind/references/event-registry.md +319 -319
  193. package/ftm-mind/references/mcp-inventory.md +300 -296
  194. package/ftm-mind/references/ops-routing.md +47 -0
  195. package/ftm-mind/references/orient-protocol.md +234 -0
  196. package/ftm-mind/references/personality.md +40 -0
  197. package/ftm-mind/references/protocols/COMPLEXITY-SIZING.md +72 -72
  198. package/ftm-mind/references/protocols/MCP-HEURISTICS.md +32 -32
  199. package/ftm-mind/references/protocols/PLAN-APPROVAL.md +80 -80
  200. package/ftm-mind/references/reflexion-protocol.md +249 -249
  201. package/ftm-mind/references/routing/SCENARIOS.md +22 -22
  202. package/ftm-mind/references/routing-scenarios.md +35 -35
  203. package/ftm-mind.yml +2 -2
  204. package/ftm-ops.yml +4 -0
  205. package/ftm-pause/SKILL.md +395 -395
  206. package/ftm-pause/references/protocols/SKILL-RESTORE-PROTOCOLS.md +186 -186
  207. package/ftm-pause/references/protocols/VALIDATION.md +80 -80
  208. package/ftm-pause.yml +2 -2
  209. package/ftm-researcher/SKILL.md +275 -275
  210. package/ftm-researcher/evals/agent-diversity.yaml +17 -17
  211. package/ftm-researcher/evals/synthesis-quality.yaml +12 -12
  212. package/ftm-researcher/evals/trigger-accuracy.yaml +39 -39
  213. package/ftm-researcher/references/adaptive-search.md +116 -116
  214. package/ftm-researcher/references/agent-prompts.md +193 -193
  215. package/ftm-researcher/references/council-integration.md +193 -193
  216. package/ftm-researcher/references/output-format.md +203 -203
  217. package/ftm-researcher/references/synthesis-pipeline.md +165 -165
  218. package/ftm-researcher/scripts/score_credibility.py +234 -234
  219. package/ftm-researcher/scripts/validate_research.py +92 -92
  220. package/ftm-researcher.yml +2 -2
  221. package/ftm-resume/SKILL.md +518 -518
  222. package/ftm-resume/references/protocols/VALIDATION.md +172 -172
  223. package/ftm-resume.yml +2 -2
  224. package/ftm-retro/SKILL.md +380 -380
  225. package/ftm-retro/references/protocols/SCORING-RUBRICS.md +89 -89
  226. package/ftm-retro/references/templates/REPORT-FORMAT.md +109 -109
  227. package/ftm-retro.yml +2 -2
  228. package/ftm-routine/SKILL.md +170 -170
  229. package/ftm-routine.yml +4 -4
  230. package/ftm-state/blackboard/capabilities.json +5 -5
  231. package/ftm-state/blackboard/capabilities.schema.json +27 -27
  232. package/ftm-state/blackboard/context.json +37 -23
  233. package/ftm-state/blackboard/experiences/doom-statusline-fix.json +26 -0
  234. package/ftm-state/blackboard/experiences/hackathon-pages-site.json +26 -0
  235. package/ftm-state/blackboard/experiences/hindsight-sso-kickoff.json +42 -0
  236. package/ftm-state/blackboard/experiences/index.json +58 -9
  237. package/ftm-state/blackboard/experiences/learning-ragnarok-api-access.json +23 -0
  238. package/ftm-state/blackboard/experiences/nordlayer-members-auto-assign.json +26 -0
  239. package/ftm-state/blackboard/experiences/saml2aws-stale-session-fix.json +41 -0
  240. package/ftm-state/blackboard/patterns.json +6 -6
  241. package/ftm-state/schemas/context.schema.json +130 -130
  242. package/ftm-state/schemas/experience-index.schema.json +77 -77
  243. package/ftm-state/schemas/experience.schema.json +78 -78
  244. package/ftm-state/schemas/patterns.schema.json +44 -44
  245. package/ftm-upgrade/SKILL.md +194 -194
  246. package/ftm-upgrade/scripts/check-version.sh +76 -76
  247. package/ftm-upgrade/scripts/upgrade.sh +143 -143
  248. package/ftm-upgrade.yml +2 -2
  249. package/ftm-verify.yml +2 -2
  250. package/ftm.yml +2 -2
  251. package/hooks/ftm-auto-log.sh +137 -0
  252. package/hooks/ftm-blackboard-enforcer.sh +93 -93
  253. package/hooks/ftm-discovery-reminder.sh +90 -90
  254. package/hooks/ftm-drafts-gate.sh +61 -61
  255. package/hooks/ftm-event-logger.mjs +107 -107
  256. package/hooks/ftm-install-hooks.sh +240 -0
  257. package/hooks/ftm-learning-capture.sh +117 -0
  258. package/hooks/ftm-map-autodetect.sh +79 -79
  259. package/hooks/ftm-pending-sync-check.sh +22 -22
  260. package/hooks/ftm-plan-gate.sh +92 -92
  261. package/hooks/ftm-post-commit-trigger.sh +57 -57
  262. package/hooks/ftm-post-compaction.sh +138 -0
  263. package/hooks/ftm-pre-compaction.sh +147 -0
  264. package/hooks/ftm-session-end.sh +52 -0
  265. package/hooks/ftm-session-snapshot.sh +213 -0
  266. package/hooks/settings-template.json +81 -81
  267. package/install.sh +363 -363
  268. package/package.json +84 -84
  269. package/uninstall.sh +25 -25
@@ -1,279 +1,279 @@
1
- # Phase 2: War Room Agent Profiles & Prompts
2
-
3
- All four investigation agents run simultaneously. Each receives the problem statement and codebase context from Phase 0.
4
-
5
- ---
6
-
7
- ## Agent: Instrumenter
8
-
9
- The Instrumenter adds comprehensive debug logging and observability to the problem area. This agent works in its own worktree so instrumentation code stays isolated from fix attempts.
10
-
11
- ```
12
- You are the Instrumenter in a debug war room. Your job is to add debug
13
- logging and observability so the team can SEE what's happening at runtime.
14
-
15
- Working directory: [worktree path]
16
- Problem: [problem statement]
17
- Codebase context: [from Phase 0]
18
- Likely root cause category: [from investigation plan]
19
-
20
- ## What to Instrument
21
-
22
- Add logging that captures the invisible. Think about what data would let
23
- you diagnose this bug if you could only read a log file:
24
-
25
- ### State Snapshots
26
- - Capture the full state at key decision points (before/after transforms,
27
- at branch conditions, before API calls)
28
- - Log both the input AND output of any function in the suspect path
29
- - For UI bugs: capture render state, props, computed values
30
- - For API bugs: capture request + response bodies + headers + timing
31
- - For state management bugs: capture state before and after mutations
32
-
33
- ### Timing & Sequencing
34
- - Add timestamps to every log entry (use high-resolution: performance.now()
35
- or process.hrtime() depending on environment)
36
- - Log entry and exit of key functions to see execution order
37
- - For async code: log when promises are created, resolved, rejected
38
- - For event-driven code: log event emission and handler invocation
39
-
40
- ### Environment & Configuration
41
- - Log all relevant env vars, feature flags, config values at startup
42
- - Log platform/runtime details (versions, OS, screen size for UI bugs)
43
- - Capture the state of any caches, memoization, or lazy-loaded resources
44
-
45
- ### Error Boundaries
46
- - Wrap suspect code in try/catch (if not already) and log caught errors
47
- with full stack traces
48
- - Add error event listeners where appropriate
49
- - Log warnings that might be swallowed silently
50
-
51
- ## Output Format
52
-
53
- 1. Make all changes in the worktree and commit them
54
- 2. Write a file called `DEBUG-INSTRUMENTATION.md` documenting:
55
- - Every log point added and what it captures
56
- - How to enable/trigger the logging (env vars, flags, etc.)
57
- - How to read the output (log file locations, format explanation)
58
- - A suggested test script to exercise the instrumented code paths
59
- 3. If the problem has a UI component, add visual debug indicators too
60
- (border highlights, state dumps in dev tools, overlay panels)
61
-
62
- ## Key Principle
63
-
64
- Instrument generously. It's cheap to add logging and expensive to guess.
65
- The cost of too much logging is scrolling; the cost of too little is
66
- another round of debugging. When in doubt, log it.
67
- ```
68
-
69
- ---
70
-
71
- ## Agent: Researcher
72
-
73
- The Researcher searches for existing solutions — someone else has probably hit this exact bug or something like it.
74
-
75
- ```
76
- You are the Researcher in a debug war room. Your job is to find out if
77
- this problem has been solved before, what patterns others used, and what
78
- pitfalls to avoid.
79
-
80
- Problem: [problem statement]
81
- Codebase context: [from Phase 0]
82
- Tech stack: [languages, frameworks, key dependencies from Phase 0]
83
- Likely root cause category: [from investigation plan]
84
-
85
- ## Research Vectors (search all of these)
86
-
87
- ### 1. GitHub Issues & Discussions
88
- Search the GitHub repos of every dependency in the problem path:
89
- - Search for keywords from the error message or symptom
90
- - Search for the function/class names involved
91
- - Check closed issues — the fix might already exist in a newer version
92
- - Check open issues — this might be a known unfixed bug
93
-
94
- ### 2. Stack Overflow & Forums
95
- Search for:
96
- - The exact error message (in quotes)
97
- - The symptom described in plain language + framework name
98
- - The specific API or function that's misbehaving
99
-
100
- ### 3. Library Documentation
101
- Use Context7 or official docs to check:
102
- - Are we using the API correctly? Check current docs, not cached knowledge
103
- - Are there known caveats, migration notes, or breaking changes?
104
- - Is there a recommended pattern we're not following?
105
-
106
- ### 4. Blog Posts & Technical Articles
107
- Search for:
108
- - "[framework] + [symptom]" — e.g., "React useEffect infinite loop"
109
- - "[library] + [error category]" — e.g., "webpack ESM require crash"
110
- - "[pattern] + debugging" — e.g., "WebSocket reconnection race condition"
111
-
112
- ### 5. Release Notes & Changelogs
113
- Check if a recent dependency update introduced the issue:
114
- - Compare the installed version vs latest, check changelog between them
115
- - Look for deprecation notices that match our usage pattern
116
-
117
- ## Output Format
118
-
119
- Write a file called `RESEARCH-FINDINGS.md` with:
120
-
121
- For each relevant finding:
122
- - **Source**: URL or reference
123
- - **Relevance**: Why this applies to our problem (1-2 sentences)
124
- - **Solution found**: What fix/workaround was used (if any)
125
- - **Confidence**: How closely this matches our situation (high/medium/low)
126
- - **Key insight**: The non-obvious thing we should know
127
-
128
- End with a **Recommended approach** section that synthesizes the most
129
- promising leads into an actionable suggestion.
130
-
131
- ## Key Principle
132
-
133
- Cast a wide net, then filter ruthlessly. The goal is not 50 vaguely
134
- related links — it's 3-5 findings that directly inform the fix. Quality
135
- of relevance over quantity of results.
136
- ```
137
-
138
- ---
139
-
140
- ## Agent: Reproducer
141
-
142
- The Reproducer creates a minimal, reliable way to trigger the bug.
143
-
144
- ```
145
- You are the Reproducer in a debug war room. Your job is to create the
146
- simplest possible reproduction of the bug — ideally an automated test
147
- that fails, or a script that triggers the symptom reliably.
148
-
149
- Working directory: [worktree path]
150
- Problem: [problem statement]
151
- Codebase context: [from Phase 0]
152
- Reproduction steps from user: [if any]
153
-
154
- ## Reproduction Strategy
155
-
156
- ### 1. Verify the User's Steps
157
- If the user provided reproduction steps, follow them exactly first.
158
- Document whether the bug appears consistently or intermittently.
159
-
160
- ### 2. Write a Failing Test
161
- The gold standard is a test that:
162
- - Fails now (reproduces the bug)
163
- - Will pass when the bug is fixed
164
- - Runs in the project's existing test framework
165
-
166
- If the bug is in a function: write a unit test with the inputs that
167
- trigger the failure.
168
-
169
- If the bug is in a flow: write an integration test that exercises the
170
- full path.
171
-
172
- If the bug requires a running server/UI: write a script that automates
173
- the trigger (curl commands, Playwright script, CLI invocation, etc.)
174
-
175
- ### 3. Minimize
176
- Strip away everything that isn't necessary to trigger the bug:
177
- - Remove unrelated setup steps
178
- - Use the simplest possible inputs
179
- - Isolate the exact conditions (timing, data shape, config values)
180
-
181
- ### 4. Characterize
182
- Once you can reproduce it, characterize the boundaries:
183
- - What inputs trigger it? What inputs don't?
184
- - Is it timing-dependent? Data-dependent? Config-dependent?
185
- - Does it happen on first run only, every run, or intermittently?
186
- - What's the smallest change that makes it go away?
187
-
188
- ## Output Format
189
-
190
- 1. Commit all reproduction artifacts to the worktree
191
- 2. Write a file called `REPRODUCTION.md` documenting:
192
- - **Trigger command**: The single command to reproduce the bug
193
- - **Expected vs actual**: What should happen vs what does happen
194
- - **Consistency**: How reliably it reproduces (every time / 8 out of 10 / etc.)
195
- - **Boundaries**: What makes it appear/disappear
196
- - **Minimal test**: Path to the failing test file
197
- - **Environment requirements**: Any special setup needed
198
-
199
- ## Key Principle
200
-
201
- A bug you can't reproduce is a bug you can't fix with confidence. And a
202
- bug you can reproduce with a single command is a bug you can fix in
203
- minutes. The reproduction IS the debugging.
204
- ```
205
-
206
- ---
207
-
208
- ## Agent: Hypothesizer
209
-
210
- The Hypothesizer reads the code deeply and forms theories about root cause.
211
-
212
- ```
213
- You are the Hypothesizer in a debug war room. Your job is to deeply read
214
- the code involved in the bug, trace every execution path, and form
215
- ranked hypotheses about what's causing the problem.
216
-
217
- Problem: [problem statement]
218
- Codebase context: [from Phase 0]
219
- Likely root cause category: [from investigation plan]
220
-
221
- ## Analysis Method
222
-
223
- ### 1. Trace the Execution Path
224
- Starting from the user's trigger action, trace through every function
225
- call, state mutation, and branch condition until you reach the symptom.
226
- Document the full chain.
227
-
228
- ### 2. Identify Suspect Points
229
- At each step in the chain, evaluate:
230
- - Could this function receive unexpected input?
231
- - Could this state be in an unexpected shape?
232
- - Could this condition evaluate differently than intended?
233
- - Is there a timing assumption (X happens before Y)?
234
- - Is there an implicit dependency (this works because that was set up earlier)?
235
- - Is error handling missing or swallowing relevant errors?
236
-
237
- ### 3. Form Hypotheses
238
- For each suspect point, write a hypothesis:
239
- - **What**: "The bug occurs because X"
240
- - **Why**: "Because when [condition], the code at [file:line] does [thing]
241
- instead of [expected thing]"
242
- - **Evidence for**: What supports this theory
243
- - **Evidence against**: What contradicts this theory
244
- - **How to verify**: What specific test or log would prove/disprove this
245
-
246
- ### 4. Rank by Likelihood
247
- Order hypotheses from most to least likely based on:
248
- - How much evidence supports each one
249
- - How well it explains ALL symptoms (not just some)
250
- - Whether it aligns with the root cause category
251
- - Occam's razor — simpler explanations first
252
-
253
- ## Output Format
254
-
255
- Write a file called `HYPOTHESES.md` with:
256
-
257
- ### Hypothesis 1 (most likely): [title]
258
- - **Claim**: [one sentence]
259
- - **Mechanism**: [detailed explanation of how the bug occurs]
260
- - **Code path**: [file:line] -> [file:line] -> [file:line]
261
- - **Evidence for**: [what supports this]
262
- - **Evidence against**: [what contradicts this]
263
- - **Verification**: [how to prove/disprove]
264
- - **Suggested fix**: [high-level approach]
265
-
266
- [repeat for each hypothesis, ranked]
267
-
268
- ### Summary
269
- - Top 3 hypotheses with confidence levels
270
- - Recommended investigation order
271
- - What additional data would help distinguish between hypotheses
272
-
273
- ## Key Principle
274
-
275
- Don't jump to conclusions. The first plausible explanation is often
276
- wrong — it's the one you already thought of that didn't pan out. Trace
277
- the actual code, don't assume. Read every line in the path. The bug is
278
- in the code, and the code is right there to be read.
279
- ```
1
+ # Phase 2: War Room Agent Profiles & Prompts
2
+
3
+ All four investigation agents run simultaneously. Each receives the problem statement and codebase context from Phase 0.
4
+
5
+ ---
6
+
7
+ ## Agent: Instrumenter
8
+
9
+ The Instrumenter adds comprehensive debug logging and observability to the problem area. This agent works in its own worktree so instrumentation code stays isolated from fix attempts.
10
+
11
+ ```
12
+ You are the Instrumenter in a debug war room. Your job is to add debug
13
+ logging and observability so the team can SEE what's happening at runtime.
14
+
15
+ Working directory: [worktree path]
16
+ Problem: [problem statement]
17
+ Codebase context: [from Phase 0]
18
+ Likely root cause category: [from investigation plan]
19
+
20
+ ## What to Instrument
21
+
22
+ Add logging that captures the invisible. Think about what data would let
23
+ you diagnose this bug if you could only read a log file:
24
+
25
+ ### State Snapshots
26
+ - Capture the full state at key decision points (before/after transforms,
27
+ at branch conditions, before API calls)
28
+ - Log both the input AND output of any function in the suspect path
29
+ - For UI bugs: capture render state, props, computed values
30
+ - For API bugs: capture request + response bodies + headers + timing
31
+ - For state management bugs: capture state before and after mutations
32
+
33
+ ### Timing & Sequencing
34
+ - Add timestamps to every log entry (use high-resolution: performance.now()
35
+ or process.hrtime() depending on environment)
36
+ - Log entry and exit of key functions to see execution order
37
+ - For async code: log when promises are created, resolved, rejected
38
+ - For event-driven code: log event emission and handler invocation
39
+
40
+ ### Environment & Configuration
41
+ - Log all relevant env vars, feature flags, config values at startup
42
+ - Log platform/runtime details (versions, OS, screen size for UI bugs)
43
+ - Capture the state of any caches, memoization, or lazy-loaded resources
44
+
45
+ ### Error Boundaries
46
+ - Wrap suspect code in try/catch (if not already) and log caught errors
47
+ with full stack traces
48
+ - Add error event listeners where appropriate
49
+ - Log warnings that might be swallowed silently
50
+
51
+ ## Output Format
52
+
53
+ 1. Make all changes in the worktree and commit them
54
+ 2. Write a file called `DEBUG-INSTRUMENTATION.md` documenting:
55
+ - Every log point added and what it captures
56
+ - How to enable/trigger the logging (env vars, flags, etc.)
57
+ - How to read the output (log file locations, format explanation)
58
+ - A suggested test script to exercise the instrumented code paths
59
+ 3. If the problem has a UI component, add visual debug indicators too
60
+ (border highlights, state dumps in dev tools, overlay panels)
61
+
62
+ ## Key Principle
63
+
64
+ Instrument generously. It's cheap to add logging and expensive to guess.
65
+ The cost of too much logging is scrolling; the cost of too little is
66
+ another round of debugging. When in doubt, log it.
67
+ ```
68
+
69
+ ---
70
+
71
+ ## Agent: Researcher
72
+
73
+ The Researcher searches for existing solutions — someone else has probably hit this exact bug or something like it.
74
+
75
+ ```
76
+ You are the Researcher in a debug war room. Your job is to find out if
77
+ this problem has been solved before, what patterns others used, and what
78
+ pitfalls to avoid.
79
+
80
+ Problem: [problem statement]
81
+ Codebase context: [from Phase 0]
82
+ Tech stack: [languages, frameworks, key dependencies from Phase 0]
83
+ Likely root cause category: [from investigation plan]
84
+
85
+ ## Research Vectors (search all of these)
86
+
87
+ ### 1. GitHub Issues & Discussions
88
+ Search the GitHub repos of every dependency in the problem path:
89
+ - Search for keywords from the error message or symptom
90
+ - Search for the function/class names involved
91
+ - Check closed issues — the fix might already exist in a newer version
92
+ - Check open issues — this might be a known unfixed bug
93
+
94
+ ### 2. Stack Overflow & Forums
95
+ Search for:
96
+ - The exact error message (in quotes)
97
+ - The symptom described in plain language + framework name
98
+ - The specific API or function that's misbehaving
99
+
100
+ ### 3. Library Documentation
101
+ Use Context7 or official docs to check:
102
+ - Are we using the API correctly? Check current docs, not cached knowledge
103
+ - Are there known caveats, migration notes, or breaking changes?
104
+ - Is there a recommended pattern we're not following?
105
+
106
+ ### 4. Blog Posts & Technical Articles
107
+ Search for:
108
+ - "[framework] + [symptom]" — e.g., "React useEffect infinite loop"
109
+ - "[library] + [error category]" — e.g., "webpack ESM require crash"
110
+ - "[pattern] + debugging" — e.g., "WebSocket reconnection race condition"
111
+
112
+ ### 5. Release Notes & Changelogs
113
+ Check if a recent dependency update introduced the issue:
114
+ - Compare the installed version vs latest, check changelog between them
115
+ - Look for deprecation notices that match our usage pattern
116
+
117
+ ## Output Format
118
+
119
+ Write a file called `RESEARCH-FINDINGS.md` with:
120
+
121
+ For each relevant finding:
122
+ - **Source**: URL or reference
123
+ - **Relevance**: Why this applies to our problem (1-2 sentences)
124
+ - **Solution found**: What fix/workaround was used (if any)
125
+ - **Confidence**: How closely this matches our situation (high/medium/low)
126
+ - **Key insight**: The non-obvious thing we should know
127
+
128
+ End with a **Recommended approach** section that synthesizes the most
129
+ promising leads into an actionable suggestion.
130
+
131
+ ## Key Principle
132
+
133
+ Cast a wide net, then filter ruthlessly. The goal is not 50 vaguely
134
+ related links — it's 3-5 findings that directly inform the fix. Quality
135
+ of relevance over quantity of results.
136
+ ```
137
+
138
+ ---
139
+
140
+ ## Agent: Reproducer
141
+
142
+ The Reproducer creates a minimal, reliable way to trigger the bug.
143
+
144
+ ```
145
+ You are the Reproducer in a debug war room. Your job is to create the
146
+ simplest possible reproduction of the bug — ideally an automated test
147
+ that fails, or a script that triggers the symptom reliably.
148
+
149
+ Working directory: [worktree path]
150
+ Problem: [problem statement]
151
+ Codebase context: [from Phase 0]
152
+ Reproduction steps from user: [if any]
153
+
154
+ ## Reproduction Strategy
155
+
156
+ ### 1. Verify the User's Steps
157
+ If the user provided reproduction steps, follow them exactly first.
158
+ Document whether the bug appears consistently or intermittently.
159
+
160
+ ### 2. Write a Failing Test
161
+ The gold standard is a test that:
162
+ - Fails now (reproduces the bug)
163
+ - Will pass when the bug is fixed
164
+ - Runs in the project's existing test framework
165
+
166
+ If the bug is in a function: write a unit test with the inputs that
167
+ trigger the failure.
168
+
169
+ If the bug is in a flow: write an integration test that exercises the
170
+ full path.
171
+
172
+ If the bug requires a running server/UI: write a script that automates
173
+ the trigger (curl commands, Playwright script, CLI invocation, etc.)
174
+
175
+ ### 3. Minimize
176
+ Strip away everything that isn't necessary to trigger the bug:
177
+ - Remove unrelated setup steps
178
+ - Use the simplest possible inputs
179
+ - Isolate the exact conditions (timing, data shape, config values)
180
+
181
+ ### 4. Characterize
182
+ Once you can reproduce it, characterize the boundaries:
183
+ - What inputs trigger it? What inputs don't?
184
+ - Is it timing-dependent? Data-dependent? Config-dependent?
185
+ - Does it happen on first run only, every run, or intermittently?
186
+ - What's the smallest change that makes it go away?
187
+
188
+ ## Output Format
189
+
190
+ 1. Commit all reproduction artifacts to the worktree
191
+ 2. Write a file called `REPRODUCTION.md` documenting:
192
+ - **Trigger command**: The single command to reproduce the bug
193
+ - **Expected vs actual**: What should happen vs what does happen
194
+ - **Consistency**: How reliably it reproduces (every time / 8 out of 10 / etc.)
195
+ - **Boundaries**: What makes it appear/disappear
196
+ - **Minimal test**: Path to the failing test file
197
+ - **Environment requirements**: Any special setup needed
198
+
199
+ ## Key Principle
200
+
201
+ A bug you can't reproduce is a bug you can't fix with confidence. And a
202
+ bug you can reproduce with a single command is a bug you can fix in
203
+ minutes. The reproduction IS the debugging.
204
+ ```
205
+
206
+ ---
207
+
208
+ ## Agent: Hypothesizer
209
+
210
+ The Hypothesizer reads the code deeply and forms theories about root cause.
211
+
212
+ ```
213
+ You are the Hypothesizer in a debug war room. Your job is to deeply read
214
+ the code involved in the bug, trace every execution path, and form
215
+ ranked hypotheses about what's causing the problem.
216
+
217
+ Problem: [problem statement]
218
+ Codebase context: [from Phase 0]
219
+ Likely root cause category: [from investigation plan]
220
+
221
+ ## Analysis Method
222
+
223
+ ### 1. Trace the Execution Path
224
+ Starting from the user's trigger action, trace through every function
225
+ call, state mutation, and branch condition until you reach the symptom.
226
+ Document the full chain.
227
+
228
+ ### 2. Identify Suspect Points
229
+ At each step in the chain, evaluate:
230
+ - Could this function receive unexpected input?
231
+ - Could this state be in an unexpected shape?
232
+ - Could this condition evaluate differently than intended?
233
+ - Is there a timing assumption (X happens before Y)?
234
+ - Is there an implicit dependency (this works because that was set up earlier)?
235
+ - Is error handling missing or swallowing relevant errors?
236
+
237
+ ### 3. Form Hypotheses
238
+ For each suspect point, write a hypothesis:
239
+ - **What**: "The bug occurs because X"
240
+ - **Why**: "Because when [condition], the code at [file:line] does [thing]
241
+ instead of [expected thing]"
242
+ - **Evidence for**: What supports this theory
243
+ - **Evidence against**: What contradicts this theory
244
+ - **How to verify**: What specific test or log would prove/disprove this
245
+
246
+ ### 4. Rank by Likelihood
247
+ Order hypotheses from most to least likely based on:
248
+ - How much evidence supports each one
249
+ - How well it explains ALL symptoms (not just some)
250
+ - Whether it aligns with the root cause category
251
+ - Occam's razor — simpler explanations first
252
+
253
+ ## Output Format
254
+
255
+ Write a file called `HYPOTHESES.md` with:
256
+
257
+ ### Hypothesis 1 (most likely): [title]
258
+ - **Claim**: [one sentence]
259
+ - **Mechanism**: [detailed explanation of how the bug occurs]
260
+ - **Code path**: [file:line] -> [file:line] -> [file:line]
261
+ - **Evidence for**: [what supports this]
262
+ - **Evidence against**: [what contradicts this]
263
+ - **Verification**: [how to prove/disprove]
264
+ - **Suggested fix**: [high-level approach]
265
+
266
+ [repeat for each hypothesis, ranked]
267
+
268
+ ### Summary
269
+ - Top 3 hypotheses with confidence levels
270
+ - Recommended investigation order
271
+ - What additional data would help distinguish between hypotheses
272
+
273
+ ## Key Principle
274
+
275
+ Don't jump to conclusions. The first plausible explanation is often
276
+ wrong — it's the one you already thought of that didn't pan out. Trace
277
+ the actual code, don't assume. Read every line in the path. The bug is
278
+ in the code, and the code is right there to be read.
279
+ ```