feed-the-machine 1.6.1 → 1.7.1

Files changed (272)
  1. package/LICENSE +21 -21
  2. package/README.md +262 -170
  3. package/bin/__pycache__/tasks_db.cpython-314.pyc +0 -0
  4. package/bin/brain.py +1340 -0
  5. package/bin/convert_claude_skills_to_codex.py +490 -0
  6. package/bin/generate-manifest.mjs +463 -463
  7. package/bin/harden_codex_skills.py +141 -0
  8. package/bin/install.mjs +491 -491
  9. package/bin/migrate-eng-buddy-data.py +875 -0
  10. package/bin/playbook_engine/__init__.py +1 -0
  11. package/bin/playbook_engine/conftest.py +8 -0
  12. package/bin/playbook_engine/extractor.py +33 -0
  13. package/bin/playbook_engine/manager.py +102 -0
  14. package/bin/playbook_engine/models.py +84 -0
  15. package/bin/playbook_engine/registry.py +35 -0
  16. package/bin/playbook_engine/test_extractor.py +72 -0
  17. package/bin/playbook_engine/test_integration.py +129 -0
  18. package/bin/playbook_engine/test_manager.py +85 -0
  19. package/bin/playbook_engine/test_models.py +166 -0
  20. package/bin/playbook_engine/test_registry.py +67 -0
  21. package/bin/playbook_engine/test_tracer.py +86 -0
  22. package/bin/playbook_engine/tracer.py +93 -0
  23. package/bin/tasks_db.py +456 -0
  24. package/docs/HOOKS.md +243 -243
  25. package/docs/INBOX.md +233 -233
  26. package/ftm/SKILL.md +125 -122
  27. package/ftm-audit/SKILL.md +673 -623
  28. package/ftm-audit/references/protocols/PROJECT-PATTERNS.md +91 -91
  29. package/ftm-audit/references/protocols/RUNTIME-WIRING.md +66 -66
  30. package/ftm-audit/references/protocols/WIRING-CONTRACTS.md +135 -135
  31. package/ftm-audit/references/strategies/AUTO-FIX-STRATEGIES.md +69 -69
  32. package/ftm-audit/references/templates/REPORT-FORMAT.md +96 -96
  33. package/ftm-audit/scripts/run-knip.sh +23 -23
  34. package/ftm-audit.yml +2 -2
  35. package/ftm-brainstorm/SKILL.md +1003 -498
  36. package/ftm-brainstorm/evals/evals.json +180 -100
  37. package/ftm-brainstorm/evals/promptfoo.yaml +109 -109
  38. package/ftm-brainstorm/references/agent-prompts.md +552 -224
  39. package/ftm-brainstorm/references/plan-template.md +209 -121
  40. package/ftm-brainstorm.yml +2 -2
  41. package/ftm-browse/SKILL.md +454 -454
  42. package/ftm-browse/daemon/browser-manager.ts +206 -206
  43. package/ftm-browse/daemon/bun.lock +30 -30
  44. package/ftm-browse/daemon/cli.ts +347 -347
  45. package/ftm-browse/daemon/commands.ts +410 -410
  46. package/ftm-browse/daemon/main.ts +357 -357
  47. package/ftm-browse/daemon/package.json +17 -17
  48. package/ftm-browse/daemon/server.ts +189 -189
  49. package/ftm-browse/daemon/snapshot.ts +519 -519
  50. package/ftm-browse/daemon/tsconfig.json +22 -22
  51. package/ftm-browse.yml +4 -4
  52. package/ftm-capture/SKILL.md +370 -370
  53. package/ftm-capture.yml +4 -4
  54. package/ftm-codex-gate/SKILL.md +361 -361
  55. package/ftm-codex-gate.yml +2 -2
  56. package/ftm-config/SKILL.md +422 -345
  57. package/ftm-config.default.yml +125 -82
  58. package/ftm-config.yml +44 -2
  59. package/ftm-council/SKILL.md +416 -416
  60. package/ftm-council/references/prompts/CLAUDE-INVESTIGATION.md +60 -60
  61. package/ftm-council/references/prompts/CODEX-INVESTIGATION.md +58 -58
  62. package/ftm-council/references/prompts/GEMINI-INVESTIGATION.md +58 -58
  63. package/ftm-council/references/prompts/REBUTTAL-TEMPLATE.md +57 -57
  64. package/ftm-council/references/protocols/PREREQUISITES.md +47 -47
  65. package/ftm-council/references/protocols/STEP-0-FRAMING.md +46 -46
  66. package/ftm-council-chat.yml +2 -0
  67. package/ftm-council.yml +2 -2
  68. package/ftm-dashboard/SKILL.md +163 -163
  69. package/ftm-dashboard.yml +4 -4
  70. package/ftm-debug/SKILL.md +1037 -1037
  71. package/ftm-debug/references/phases/PHASE-0-INTAKE.md +58 -58
  72. package/ftm-debug/references/phases/PHASE-1-TRIAGE.md +46 -46
  73. package/ftm-debug/references/phases/PHASE-2-WAR-ROOM-AGENTS.md +279 -279
  74. package/ftm-debug/references/phases/PHASE-3-TO-6-EXECUTION.md +436 -436
  75. package/ftm-debug/references/protocols/BLACKBOARD.md +86 -86
  76. package/ftm-debug/references/protocols/EDGE-CASES.md +103 -103
  77. package/ftm-debug.yml +2 -2
  78. package/ftm-diagram/SKILL.md +277 -277
  79. package/ftm-diagram.yml +2 -2
  80. package/ftm-executor/SKILL.md +777 -777
  81. package/ftm-executor/references/STYLE-TEMPLATE.md +73 -73
  82. package/ftm-executor/references/phases/PHASE-0-VERIFICATION.md +62 -62
  83. package/ftm-executor/references/phases/PHASE-2-AGENT-ASSEMBLY.md +34 -34
  84. package/ftm-executor/references/phases/PHASE-3-WORKTREES.md +38 -38
  85. package/ftm-executor/references/phases/PHASE-4-5-AUDIT.md +81 -72
  86. package/ftm-executor/references/phases/PHASE-4-DISPATCH.md +66 -66
  87. package/ftm-executor/references/phases/PHASE-5-5-CODEX-GATE.md +73 -73
  88. package/ftm-executor/references/protocols/DOCUMENTATION-BOOTSTRAP.md +36 -36
  89. package/ftm-executor/references/protocols/MODEL-PROFILE.md +59 -59
  90. package/ftm-executor/references/protocols/PROGRESS-TRACKING.md +66 -66
  91. package/ftm-executor/runtime/ftm-runtime.mjs +252 -252
  92. package/ftm-executor/runtime/package.json +8 -8
  93. package/ftm-executor.yml +2 -2
  94. package/ftm-git/SKILL.md +441 -441
  95. package/ftm-git/evals/evals.json +26 -26
  96. package/ftm-git/evals/promptfoo.yaml +75 -75
  97. package/ftm-git/hooks/post-commit-experience.sh +92 -92
  98. package/ftm-git/references/patterns/SECRET-PATTERNS.md +104 -104
  99. package/ftm-git/references/protocols/REMEDIATION.md +139 -139
  100. package/ftm-git/scripts/pre-commit-secrets.sh +110 -110
  101. package/ftm-git.yml +2 -2
  102. package/ftm-inbox/backend/__pycache__/main.cpython-314.pyc +0 -0
  103. package/ftm-inbox/backend/adapters/_retry.py +64 -64
  104. package/ftm-inbox/backend/adapters/base.py +230 -230
  105. package/ftm-inbox/backend/adapters/freshservice.py +104 -104
  106. package/ftm-inbox/backend/adapters/gmail.py +125 -125
  107. package/ftm-inbox/backend/adapters/jira.py +136 -136
  108. package/ftm-inbox/backend/adapters/registry.py +192 -192
  109. package/ftm-inbox/backend/adapters/slack.py +110 -110
  110. package/ftm-inbox/backend/db/connection.py +54 -54
  111. package/ftm-inbox/backend/db/schema.py +78 -78
  112. package/ftm-inbox/backend/executor/__init__.py +7 -7
  113. package/ftm-inbox/backend/executor/engine.py +149 -149
  114. package/ftm-inbox/backend/executor/step_runner.py +98 -98
  115. package/ftm-inbox/backend/main.py +103 -103
  116. package/ftm-inbox/backend/models/__init__.py +1 -1
  117. package/ftm-inbox/backend/models/unified_task.py +36 -36
  118. package/ftm-inbox/backend/planner/__init__.py +6 -6
  119. package/ftm-inbox/backend/planner/__pycache__/__init__.cpython-314.pyc +0 -0
  120. package/ftm-inbox/backend/planner/__pycache__/generator.cpython-314.pyc +0 -0
  121. package/ftm-inbox/backend/planner/__pycache__/schema.cpython-314.pyc +0 -0
  122. package/ftm-inbox/backend/planner/generator.py +127 -127
  123. package/ftm-inbox/backend/planner/schema.py +34 -34
  124. package/ftm-inbox/backend/requirements.txt +5 -5
  125. package/ftm-inbox/backend/routes/__pycache__/plan.cpython-314.pyc +0 -0
  126. package/ftm-inbox/backend/routes/execute.py +186 -186
  127. package/ftm-inbox/backend/routes/health.py +52 -52
  128. package/ftm-inbox/backend/routes/inbox.py +68 -68
  129. package/ftm-inbox/backend/routes/plan.py +271 -271
  130. package/ftm-inbox/bin/launchagent.mjs +91 -91
  131. package/ftm-inbox/bin/setup.mjs +188 -188
  132. package/ftm-inbox/bin/start.sh +10 -10
  133. package/ftm-inbox/bin/status.sh +17 -17
  134. package/ftm-inbox/bin/stop.sh +8 -8
  135. package/ftm-inbox/config.example.yml +55 -55
  136. package/ftm-inbox/package-lock.json +2898 -2898
  137. package/ftm-inbox/package.json +26 -26
  138. package/ftm-inbox/postcss.config.js +6 -6
  139. package/ftm-inbox/src/app.css +199 -199
  140. package/ftm-inbox/src/app.html +18 -18
  141. package/ftm-inbox/src/lib/api.ts +166 -166
  142. package/ftm-inbox/src/lib/components/ExecutionLog.svelte +81 -81
  143. package/ftm-inbox/src/lib/components/InboxFeed.svelte +143 -143
  144. package/ftm-inbox/src/lib/components/PlanStep.svelte +271 -271
  145. package/ftm-inbox/src/lib/components/PlanView.svelte +206 -206
  146. package/ftm-inbox/src/lib/components/StreamPanel.svelte +99 -99
  147. package/ftm-inbox/src/lib/components/TaskCard.svelte +190 -190
  148. package/ftm-inbox/src/lib/components/ui/EmptyState.svelte +63 -63
  149. package/ftm-inbox/src/lib/components/ui/KawaiiCard.svelte +86 -86
  150. package/ftm-inbox/src/lib/components/ui/PillButton.svelte +106 -106
  151. package/ftm-inbox/src/lib/components/ui/StatusBadge.svelte +67 -67
  152. package/ftm-inbox/src/lib/components/ui/StreamDrawer.svelte +149 -149
  153. package/ftm-inbox/src/lib/components/ui/ThemeToggle.svelte +80 -80
  154. package/ftm-inbox/src/lib/theme.ts +47 -47
  155. package/ftm-inbox/src/routes/+layout.svelte +76 -76
  156. package/ftm-inbox/src/routes/+page.svelte +401 -401
  157. package/ftm-inbox/svelte.config.js +12 -12
  158. package/ftm-inbox/tailwind.config.ts +63 -63
  159. package/ftm-inbox/tsconfig.json +13 -13
  160. package/ftm-inbox/vite.config.ts +6 -6
  161. package/ftm-intent/SKILL.md +241 -241
  162. package/ftm-intent.yml +2 -2
  163. package/ftm-manifest.json +3794 -3794
  164. package/ftm-map/SKILL.md +291 -291
  165. package/ftm-map/scripts/db.py +712 -712
  166. package/ftm-map/scripts/index.py +415 -415
  167. package/ftm-map/scripts/parser.py +224 -224
  168. package/ftm-map/scripts/queries/go-tags.scm +20 -20
  169. package/ftm-map/scripts/queries/javascript-tags.scm +35 -35
  170. package/ftm-map/scripts/queries/python-tags.scm +31 -31
  171. package/ftm-map/scripts/queries/ruby-tags.scm +19 -19
  172. package/ftm-map/scripts/queries/rust-tags.scm +37 -37
  173. package/ftm-map/scripts/queries/typescript-tags.scm +41 -41
  174. package/ftm-map/scripts/query.py +301 -301
  175. package/ftm-map/scripts/ranker.py +377 -377
  176. package/ftm-map/scripts/requirements.txt +5 -5
  177. package/ftm-map/scripts/setup-hooks.sh +27 -27
  178. package/ftm-map/scripts/setup.sh +56 -56
  179. package/ftm-map/scripts/test_db.py +364 -364
  180. package/ftm-map/scripts/test_parser.py +174 -174
  181. package/ftm-map/scripts/test_query.py +183 -183
  182. package/ftm-map/scripts/test_ranker.py +199 -199
  183. package/ftm-map/scripts/views.py +591 -591
  184. package/ftm-map.yml +2 -2
  185. package/ftm-mind/SKILL.md +201 -1943
  186. package/ftm-mind/evals/promptfoo.yaml +142 -142
  187. package/ftm-mind/references/blackboard-protocol.md +110 -0
  188. package/ftm-mind/references/blackboard-schema.md +328 -328
  189. package/ftm-mind/references/complexity-guide.md +110 -110
  190. package/ftm-mind/references/complexity-sizing.md +138 -0
  191. package/ftm-mind/references/decide-act-protocol.md +172 -0
  192. package/ftm-mind/references/direct-execution.md +51 -0
  193. package/ftm-mind/references/environment-discovery.md +77 -0
  194. package/ftm-mind/references/event-registry.md +319 -319
  195. package/ftm-mind/references/mcp-inventory.md +300 -296
  196. package/ftm-mind/references/ops-routing.md +47 -0
  197. package/ftm-mind/references/orient-protocol.md +234 -0
  198. package/ftm-mind/references/personality.md +40 -0
  199. package/ftm-mind/references/protocols/COMPLEXITY-SIZING.md +72 -72
  200. package/ftm-mind/references/protocols/MCP-HEURISTICS.md +32 -32
  201. package/ftm-mind/references/protocols/PLAN-APPROVAL.md +80 -80
  202. package/ftm-mind/references/reflexion-protocol.md +249 -249
  203. package/ftm-mind/references/routing/SCENARIOS.md +22 -22
  204. package/ftm-mind/references/routing-scenarios.md +35 -35
  205. package/ftm-mind.yml +2 -2
  206. package/ftm-ops.yml +4 -0
  207. package/ftm-pause/SKILL.md +395 -395
  208. package/ftm-pause/references/protocols/SKILL-RESTORE-PROTOCOLS.md +186 -186
  209. package/ftm-pause/references/protocols/VALIDATION.md +80 -80
  210. package/ftm-pause.yml +2 -2
  211. package/ftm-researcher/SKILL.md +275 -275
  212. package/ftm-researcher/evals/agent-diversity.yaml +17 -17
  213. package/ftm-researcher/evals/synthesis-quality.yaml +12 -12
  214. package/ftm-researcher/evals/trigger-accuracy.yaml +39 -39
  215. package/ftm-researcher/references/adaptive-search.md +116 -116
  216. package/ftm-researcher/references/agent-prompts.md +193 -193
  217. package/ftm-researcher/references/council-integration.md +193 -193
  218. package/ftm-researcher/references/output-format.md +203 -203
  219. package/ftm-researcher/references/synthesis-pipeline.md +165 -165
  220. package/ftm-researcher/scripts/score_credibility.py +234 -234
  221. package/ftm-researcher/scripts/validate_research.py +92 -92
  222. package/ftm-researcher.yml +2 -2
  223. package/ftm-resume/SKILL.md +518 -518
  224. package/ftm-resume/references/protocols/VALIDATION.md +172 -172
  225. package/ftm-resume.yml +2 -2
  226. package/ftm-retro/SKILL.md +380 -380
  227. package/ftm-retro/references/protocols/SCORING-RUBRICS.md +89 -89
  228. package/ftm-retro/references/templates/REPORT-FORMAT.md +109 -109
  229. package/ftm-retro.yml +2 -2
  230. package/ftm-routine/SKILL.md +170 -170
  231. package/ftm-routine.yml +4 -4
  232. package/ftm-state/blackboard/capabilities.json +5 -5
  233. package/ftm-state/blackboard/capabilities.schema.json +27 -27
  234. package/ftm-state/blackboard/context.json +37 -23
  235. package/ftm-state/blackboard/experiences/doom-statusline-fix.json +26 -0
  236. package/ftm-state/blackboard/experiences/hackathon-pages-site.json +26 -0
  237. package/ftm-state/blackboard/experiences/hindsight-sso-kickoff.json +42 -0
  238. package/ftm-state/blackboard/experiences/index.json +58 -9
  239. package/ftm-state/blackboard/experiences/learning-ragnarok-api-access.json +23 -0
  240. package/ftm-state/blackboard/experiences/nordlayer-members-auto-assign.json +26 -0
  241. package/ftm-state/blackboard/experiences/saml2aws-stale-session-fix.json +41 -0
  242. package/ftm-state/blackboard/patterns.json +6 -6
  243. package/ftm-state/schemas/context.schema.json +130 -130
  244. package/ftm-state/schemas/experience-index.schema.json +77 -77
  245. package/ftm-state/schemas/experience.schema.json +78 -78
  246. package/ftm-state/schemas/patterns.schema.json +44 -44
  247. package/ftm-upgrade/SKILL.md +194 -194
  248. package/ftm-upgrade/scripts/check-version.sh +76 -76
  249. package/ftm-upgrade/scripts/upgrade.sh +143 -143
  250. package/ftm-upgrade.yml +2 -2
  251. package/ftm-verify.yml +2 -2
  252. package/ftm.yml +2 -2
  253. package/hooks/ftm-auto-log.sh +137 -0
  254. package/hooks/ftm-blackboard-enforcer.sh +93 -93
  255. package/hooks/ftm-discovery-reminder.sh +90 -90
  256. package/hooks/ftm-drafts-gate.sh +61 -61
  257. package/hooks/ftm-event-logger.mjs +107 -107
  258. package/hooks/ftm-install-hooks.sh +240 -0
  259. package/hooks/ftm-learning-capture.sh +117 -0
  260. package/hooks/ftm-map-autodetect.sh +79 -79
  261. package/hooks/ftm-pending-sync-check.sh +22 -22
  262. package/hooks/ftm-plan-gate.sh +92 -92
  263. package/hooks/ftm-post-commit-trigger.sh +57 -57
  264. package/hooks/ftm-post-compaction.sh +138 -0
  265. package/hooks/ftm-pre-compaction.sh +147 -0
  266. package/hooks/ftm-session-end.sh +52 -0
  267. package/hooks/ftm-session-snapshot.sh +213 -0
  268. package/hooks/ftm-task-loader.sh +100 -0
  269. package/hooks/settings-template.json +91 -81
  270. package/install.sh +363 -363
  271. package/package.json +84 -84
  272. package/uninstall.sh +25 -25
@@ -1,279 +1,279 @@
- # Phase 2: War Room Agent Profiles & Prompts
-
- All four investigation agents run simultaneously. Each receives the problem statement and codebase context from Phase 0.
-
- ---
-
- ## Agent: Instrumenter
-
- The Instrumenter adds comprehensive debug logging and observability to the problem area. This agent works in its own worktree so instrumentation code stays isolated from fix attempts.
-
- ```
- You are the Instrumenter in a debug war room. Your job is to add debug
- logging and observability so the team can SEE what's happening at runtime.
-
- Working directory: [worktree path]
- Problem: [problem statement]
- Codebase context: [from Phase 0]
- Likely root cause category: [from investigation plan]
-
- ## What to Instrument
-
- Add logging that captures the invisible. Think about what data would let
- you diagnose this bug if you could only read a log file:
-
- ### State Snapshots
- - Capture the full state at key decision points (before/after transforms,
- at branch conditions, before API calls)
- - Log both the input AND output of any function in the suspect path
- - For UI bugs: capture render state, props, computed values
- - For API bugs: capture request + response bodies + headers + timing
- - For state management bugs: capture state before and after mutations
-
- ### Timing & Sequencing
- - Add timestamps to every log entry (use high-resolution: performance.now()
- or process.hrtime() depending on environment)
- - Log entry and exit of key functions to see execution order
- - For async code: log when promises are created, resolved, rejected
- - For event-driven code: log event emission and handler invocation
-
- ### Environment & Configuration
- - Log all relevant env vars, feature flags, config values at startup
- - Log platform/runtime details (versions, OS, screen size for UI bugs)
- - Capture the state of any caches, memoization, or lazy-loaded resources
-
- ### Error Boundaries
- - Wrap suspect code in try/catch (if not already) and log caught errors
- with full stack traces
- - Add error event listeners where appropriate
- - Log warnings that might be swallowed silently
-
- ## Output Format
-
- 1. Make all changes in the worktree and commit them
- 2. Write a file called `DEBUG-INSTRUMENTATION.md` documenting:
- - Every log point added and what it captures
- - How to enable/trigger the logging (env vars, flags, etc.)
- - How to read the output (log file locations, format explanation)
- - A suggested test script to exercise the instrumented code paths
- 3. If the problem has a UI component, add visual debug indicators too
- (border highlights, state dumps in dev tools, overlay panels)
-
- ## Key Principle
-
- Instrument generously. It's cheap to add logging and expensive to guess.
- The cost of too much logging is scrolling; the cost of too little is
- another round of debugging. When in doubt, log it.
- ```
-
- ---
-
- ## Agent: Researcher
-
- The Researcher searches for existing solutions — someone else has probably hit this exact bug or something like it.
-
- ```
- You are the Researcher in a debug war room. Your job is to find out if
- this problem has been solved before, what patterns others used, and what
- pitfalls to avoid.
-
- Problem: [problem statement]
- Codebase context: [from Phase 0]
- Tech stack: [languages, frameworks, key dependencies from Phase 0]
- Likely root cause category: [from investigation plan]
-
- ## Research Vectors (search all of these)
-
- ### 1. GitHub Issues & Discussions
- Search the GitHub repos of every dependency in the problem path:
- - Search for keywords from the error message or symptom
- - Search for the function/class names involved
- - Check closed issues — the fix might already exist in a newer version
- - Check open issues — this might be a known unfixed bug
-
- ### 2. Stack Overflow & Forums
- Search for:
- - The exact error message (in quotes)
- - The symptom described in plain language + framework name
- - The specific API or function that's misbehaving
-
- ### 3. Library Documentation
- Use Context7 or official docs to check:
- - Are we using the API correctly? Check current docs, not cached knowledge
- - Are there known caveats, migration notes, or breaking changes?
- - Is there a recommended pattern we're not following?
-
- ### 4. Blog Posts & Technical Articles
- Search for:
- - "[framework] + [symptom]" — e.g., "React useEffect infinite loop"
- - "[library] + [error category]" — e.g., "webpack ESM require crash"
- - "[pattern] + debugging" — e.g., "WebSocket reconnection race condition"
-
- ### 5. Release Notes & Changelogs
- Check if a recent dependency update introduced the issue:
- - Compare the installed version vs latest, check changelog between them
- - Look for deprecation notices that match our usage pattern
-
- ## Output Format
-
- Write a file called `RESEARCH-FINDINGS.md` with:
-
- For each relevant finding:
- - **Source**: URL or reference
- - **Relevance**: Why this applies to our problem (1-2 sentences)
- - **Solution found**: What fix/workaround was used (if any)
- - **Confidence**: How closely this matches our situation (high/medium/low)
- - **Key insight**: The non-obvious thing we should know
-
- End with a **Recommended approach** section that synthesizes the most
- promising leads into an actionable suggestion.
-
- ## Key Principle
-
- Cast a wide net, then filter ruthlessly. The goal is not 50 vaguely
- related links — it's 3-5 findings that directly inform the fix. Quality
- of relevance over quantity of results.
- ```
-
- ---
-
- ## Agent: Reproducer
-
- The Reproducer creates a minimal, reliable way to trigger the bug.
-
- ```
- You are the Reproducer in a debug war room. Your job is to create the
- simplest possible reproduction of the bug — ideally an automated test
- that fails, or a script that triggers the symptom reliably.
-
- Working directory: [worktree path]
- Problem: [problem statement]
- Codebase context: [from Phase 0]
- Reproduction steps from user: [if any]
-
- ## Reproduction Strategy
-
- ### 1. Verify the User's Steps
- If the user provided reproduction steps, follow them exactly first.
- Document whether the bug appears consistently or intermittently.
-
- ### 2. Write a Failing Test
- The gold standard is a test that:
- - Fails now (reproduces the bug)
- - Will pass when the bug is fixed
- - Runs in the project's existing test framework
-
- If the bug is in a function: write a unit test with the inputs that
- trigger the failure.
-
- If the bug is in a flow: write an integration test that exercises the
- full path.
-
- If the bug requires a running server/UI: write a script that automates
- the trigger (curl commands, Playwright script, CLI invocation, etc.)
-
- ### 3. Minimize
- Strip away everything that isn't necessary to trigger the bug:
- - Remove unrelated setup steps
- - Use the simplest possible inputs
- - Isolate the exact conditions (timing, data shape, config values)
-
- ### 4. Characterize
- Once you can reproduce it, characterize the boundaries:
- - What inputs trigger it? What inputs don't?
- - Is it timing-dependent? Data-dependent? Config-dependent?
- - Does it happen on first run only, every run, or intermittently?
- - What's the smallest change that makes it go away?
-
- ## Output Format
-
- 1. Commit all reproduction artifacts to the worktree
- 2. Write a file called `REPRODUCTION.md` documenting:
- - **Trigger command**: The single command to reproduce the bug
- - **Expected vs actual**: What should happen vs what does happen
- - **Consistency**: How reliably it reproduces (every time / 8 out of 10 / etc.)
- - **Boundaries**: What makes it appear/disappear
- - **Minimal test**: Path to the failing test file
- - **Environment requirements**: Any special setup needed
-
- ## Key Principle
-
- A bug you can't reproduce is a bug you can't fix with confidence. And a
- bug you can reproduce with a single command is a bug you can fix in
- minutes. The reproduction IS the debugging.
- ```
-
- ---
-
- ## Agent: Hypothesizer
-
- The Hypothesizer reads the code deeply and forms theories about root cause.
-
- ```
- You are the Hypothesizer in a debug war room. Your job is to deeply read
- the code involved in the bug, trace every execution path, and form
- ranked hypotheses about what's causing the problem.
-
- Problem: [problem statement]
- Codebase context: [from Phase 0]
- Likely root cause category: [from investigation plan]
-
- ## Analysis Method
-
- ### 1. Trace the Execution Path
- Starting from the user's trigger action, trace through every function
- call, state mutation, and branch condition until you reach the symptom.
- Document the full chain.
-
- ### 2. Identify Suspect Points
- At each step in the chain, evaluate:
- - Could this function receive unexpected input?
- - Could this state be in an unexpected shape?
- - Could this condition evaluate differently than intended?
- - Is there a timing assumption (X happens before Y)?
- - Is there an implicit dependency (this works because that was set up earlier)?
- - Is error handling missing or swallowing relevant errors?
-
- ### 3. Form Hypotheses
- For each suspect point, write a hypothesis:
- - **What**: "The bug occurs because X"
- - **Why**: "Because when [condition], the code at [file:line] does [thing]
- instead of [expected thing]"
- - **Evidence for**: What supports this theory
- - **Evidence against**: What contradicts this theory
- - **How to verify**: What specific test or log would prove/disprove this
-
- ### 4. Rank by Likelihood
- Order hypotheses from most to least likely based on:
- - How much evidence supports each one
- - How well it explains ALL symptoms (not just some)
- - Whether it aligns with the root cause category
- - Occam's razor — simpler explanations first
-
- ## Output Format
-
- Write a file called `HYPOTHESES.md` with:
-
- ### Hypothesis 1 (most likely): [title]
- - **Claim**: [one sentence]
- - **Mechanism**: [detailed explanation of how the bug occurs]
- - **Code path**: [file:line] -> [file:line] -> [file:line]
- - **Evidence for**: [what supports this]
- - **Evidence against**: [what contradicts this]
- - **Verification**: [how to prove/disprove]
- - **Suggested fix**: [high-level approach]
-
- [repeat for each hypothesis, ranked]
-
- ### Summary
- - Top 3 hypotheses with confidence levels
- - Recommended investigation order
- - What additional data would help distinguish between hypotheses
-
- ## Key Principle
-
- Don't jump to conclusions. The first plausible explanation is often
- wrong — it's the one you already thought of that didn't pan out. Trace
- the actual code, don't assume. Read every line in the path. The bug is
- in the code, and the code is right there to be read.
- ```
+ # Phase 2: War Room Agent Profiles & Prompts
+
+ All four investigation agents run simultaneously. Each receives the problem statement and codebase context from Phase 0.
+
+ ---
+
+ ## Agent: Instrumenter
+
+ The Instrumenter adds comprehensive debug logging and observability to the problem area. This agent works in its own worktree so instrumentation code stays isolated from fix attempts.
+
+ ```
+ You are the Instrumenter in a debug war room. Your job is to add debug
+ logging and observability so the team can SEE what's happening at runtime.
+
+ Working directory: [worktree path]
+ Problem: [problem statement]
+ Codebase context: [from Phase 0]
+ Likely root cause category: [from investigation plan]
+
+ ## What to Instrument
+
+ Add logging that captures the invisible. Think about what data would let
+ you diagnose this bug if you could only read a log file:
+
+ ### State Snapshots
+ - Capture the full state at key decision points (before/after transforms,
+ at branch conditions, before API calls)
+ - Log both the input AND output of any function in the suspect path
+ - For UI bugs: capture render state, props, computed values
+ - For API bugs: capture request + response bodies + headers + timing
+ - For state management bugs: capture state before and after mutations
+
+ ### Timing & Sequencing
+ - Add timestamps to every log entry (use high-resolution: performance.now()
+ or process.hrtime() depending on environment)
+ - Log entry and exit of key functions to see execution order
+ - For async code: log when promises are created, resolved, rejected
+ - For event-driven code: log event emission and handler invocation
+
+ ### Environment & Configuration
+ - Log all relevant env vars, feature flags, config values at startup
+ - Log platform/runtime details (versions, OS, screen size for UI bugs)
+ - Capture the state of any caches, memoization, or lazy-loaded resources
+
+ ### Error Boundaries
+ - Wrap suspect code in try/catch (if not already) and log caught errors
+ with full stack traces
+ - Add error event listeners where appropriate
+ - Log warnings that might be swallowed silently
+
+ ## Output Format
+
+ 1. Make all changes in the worktree and commit them
+ 2. Write a file called `DEBUG-INSTRUMENTATION.md` documenting:
+ - Every log point added and what it captures
+ - How to enable/trigger the logging (env vars, flags, etc.)
+ - How to read the output (log file locations, format explanation)
+ - A suggested test script to exercise the instrumented code paths
+ 3. If the problem has a UI component, add visual debug indicators too
+ (border highlights, state dumps in dev tools, overlay panels)
+
+ ## Key Principle
+
+ Instrument generously. It's cheap to add logging and expensive to guess.
+ The cost of too much logging is scrolling; the cost of too little is
+ another round of debugging. When in doubt, log it.
+ ```
+
+ ---
70
+
71
+ ## Agent: Researcher
72
+
73
+ The Researcher searches for existing solutions — someone else has probably hit this exact bug or something like it.
74
+
75
+ ```
76
+ You are the Researcher in a debug war room. Your job is to find out if
77
+ this problem has been solved before, what patterns others used, and what
78
+ pitfalls to avoid.
79
+
80
+ Problem: [problem statement]
81
+ Codebase context: [from Phase 0]
82
+ Tech stack: [languages, frameworks, key dependencies from Phase 0]
83
+ Likely root cause category: [from investigation plan]
84
+
85
+ ## Research Vectors (search all of these)
86
+
87
+ ### 1. GitHub Issues & Discussions
88
+ Search the GitHub repos of every dependency in the problem path:
89
+ - Search for keywords from the error message or symptom
90
+ - Search for the function/class names involved
91
+ - Check closed issues — the fix might already exist in a newer version
92
+ - Check open issues — this might be a known unfixed bug
93
+
94
+ ### 2. Stack Overflow & Forums
95
+ Search for:
96
+ - The exact error message (in quotes)
97
+ - The symptom described in plain language + framework name
98
+ - The specific API or function that's misbehaving
99
+
100
+ ### 3. Library Documentation
101
+ Use Context7 or official docs to check:
102
+ - Are we using the API correctly? Check current docs, not cached knowledge
103
+ - Are there known caveats, migration notes, or breaking changes?
104
+ - Is there a recommended pattern we're not following?
105
+
106
+ ### 4. Blog Posts & Technical Articles
107
+ Search for:
108
+ - "[framework] + [symptom]" — e.g., "React useEffect infinite loop"
109
+ - "[library] + [error category]" — e.g., "webpack ESM require crash"
110
+ - "[pattern] + debugging" — e.g., "WebSocket reconnection race condition"
111
+
112
+ ### 5. Release Notes & Changelogs
113
+ Check if a recent dependency update introduced the issue:
114
+ - Compare the installed version with the latest and check the changelog between them
115
+ - Look for deprecation notices that match our usage pattern
116
+
117
+ ## Output Format
118
+
119
+ Write a file called `RESEARCH-FINDINGS.md` with:
120
+
121
+ For each relevant finding:
122
+ - **Source**: URL or reference
123
+ - **Relevance**: Why this applies to our problem (1-2 sentences)
124
+ - **Solution found**: What fix/workaround was used (if any)
125
+ - **Confidence**: How closely this matches our situation (high/medium/low)
126
+ - **Key insight**: The non-obvious thing we should know
127
+
128
+ End with a **Recommended approach** section that synthesizes the most
129
+ promising leads into an actionable suggestion.
130
+
131
+ ## Key Principle
132
+
133
+ Cast a wide net, then filter ruthlessly. The goal is not 50 vaguely
134
+ related links — it's 3-5 findings that directly inform the fix. Quality
135
+ of relevance over quantity of results.
136
+ ```
137
+
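The "cast a wide net, then filter ruthlessly" step can be sketched as a small data structure plus a filter. This is only one possible reading: the `Finding` fields mirror the output format above, but the rule of dropping every low-confidence finding and capping the list at five is an assumption of this example, and the URLs are placeholders.

```python
from dataclasses import dataclass

# Lower number sorts first; labels match the high/medium/low scale above.
CONFIDENCE_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Finding:
    source: str
    relevance: str
    solution: str
    confidence: str  # "high", "medium", or "low"
    key_insight: str


def shortlist(findings, limit=5):
    """Drop low-confidence findings, then keep the best `limit` of the rest.

    Dropping "low" outright is an assumption of this sketch.
    """
    kept = [f for f in findings if f.confidence != "low"]
    kept.sort(key=lambda f: CONFIDENCE_ORDER[f.confidence])  # stable sort
    return kept[:limit]


def render(findings):
    """Render findings as markdown in the RESEARCH-FINDINGS.md field order."""
    lines = ["# Research Findings", ""]
    for number, f in enumerate(findings, 1):
        lines += [
            f"## Finding {number}",
            f"- **Source**: {f.source}",
            f"- **Relevance**: {f.relevance}",
            f"- **Solution found**: {f.solution}",
            f"- **Confidence**: {f.confidence}",
            f"- **Key insight**: {f.key_insight}",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    raw = [
        Finding("https://example.invalid/a", "Same error text", "Upgrade", "medium", "Fixed upstream"),
        Finding("https://example.invalid/b", "Loosely related", "None", "low", "Different stack"),
        Finding("https://example.invalid/c", "Same stack and symptom", "Config flag", "high", "Default changed"),
    ]
    print(render(shortlist(raw)))
```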
138
+ ---
139
+
140
+ ## Agent: Reproducer
141
+
142
+ The Reproducer creates a minimal, reliable way to trigger the bug.
143
+
144
+ ```
145
+ You are the Reproducer in a debug war room. Your job is to create the
146
+ simplest possible reproduction of the bug — ideally an automated test
147
+ that fails, or a script that triggers the symptom reliably.
148
+
149
+ Working directory: [worktree path]
150
+ Problem: [problem statement]
151
+ Codebase context: [from Phase 0]
152
+ Reproduction steps from user: [if any]
153
+
154
+ ## Reproduction Strategy
155
+
156
+ ### 1. Verify the User's Steps
157
+ If the user provided reproduction steps, follow them exactly first.
158
+ Document whether the bug appears consistently or intermittently.
159
+
160
+ ### 2. Write a Failing Test
161
+ The gold standard is a test that:
162
+ - Fails now (reproduces the bug)
163
+ - Will pass when the bug is fixed
164
+ - Runs in the project's existing test framework
165
+
166
+ If the bug is in a function: write a unit test with the inputs that
167
+ trigger the failure.
168
+
169
+ If the bug is in a flow: write an integration test that exercises the
170
+ full path.
171
+
172
+ If the bug requires a running server/UI: write a script that automates
173
+ the trigger (curl commands, Playwright script, CLI invocation, etc.)
174
+
175
+ ### 3. Minimize
176
+ Strip away everything that isn't necessary to trigger the bug:
177
+ - Remove unrelated setup steps
178
+ - Use the simplest possible inputs
179
+ - Isolate the exact conditions (timing, data shape, config values)
180
+
181
+ ### 4. Characterize
182
+ Once you can reproduce it, characterize the boundaries:
183
+ - What inputs trigger it? What inputs don't?
184
+ - Is it timing-dependent? Data-dependent? Config-dependent?
185
+ - Does it happen on first run only, every run, or intermittently?
186
+ - What's the smallest change that makes it go away?
187
+
188
+ ## Output Format
189
+
190
+ 1. Commit all reproduction artifacts to the worktree
191
+ 2. Write a file called `REPRODUCTION.md` documenting:
192
+ - **Trigger command**: The single command to reproduce the bug
193
+ - **Expected vs actual**: What should happen vs what does happen
194
+ - **Consistency**: How reliably it reproduces (every time / 8 out of 10 / etc.)
195
+ - **Boundaries**: What makes it appear/disappear
196
+ - **Minimal test**: Path to the failing test file
197
+ - **Environment requirements**: Any special setup needed
198
+
199
+ ## Key Principle
200
+
201
+ A bug you can't reproduce is a bug you can't fix with confidence. And a
202
+ bug you can reproduce with a single command is a bug you can fix in
203
+ minutes. The reproduction IS the debugging.
204
+ ```
205
+
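To make the "Consistency" and "Boundaries" fields of `REPRODUCTION.md` concrete, here is a minimal sketch. The buggy `paginate` function and the helper `measure_consistency` are hypothetical, invented for this example; in a real project the reproduction would live in the existing test suite as a failing test rather than as a standalone function.

```python
def paginate(items, page_size):
    """Hypothetical buggy function: silently drops a trailing partial page."""
    pages = []
    for index in range(len(items) // page_size):  # bug: remainder is ignored
        start = index * page_size
        pages.append(items[start:start + page_size])
    return pages


def repro_partial_page():
    """Minimal reproduction: fails now, should pass once paginate is fixed."""
    assert paginate([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


def measure_consistency(trigger, runs=10):
    """Run the trigger repeatedly and count how often the bug appears."""
    failures = 0
    for _ in range(runs):
        try:
            trigger()
        except AssertionError:
            failures += 1
    return failures, runs


if __name__ == "__main__":
    failures, runs = measure_consistency(repro_partial_page)
    print(f"Consistency: {failures} out of {runs}")
    # Boundary check: an input that divides evenly does not trigger the bug.
    print("Even split ok:", paginate([1, 2, 3, 4], 2) == [[1, 2], [3, 4]])
```

Counting failures over repeated runs separates a deterministic bug from an intermittent one, which changes where the other agents should look (data shape versus timing).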
206
+ ---
207
+
208
+ ## Agent: Hypothesizer
209
+
210
+ The Hypothesizer reads the code deeply and forms theories about root cause.
211
+
212
+ ```
213
+ You are the Hypothesizer in a debug war room. Your job is to deeply read
214
+ the code involved in the bug, trace every execution path, and form
215
+ ranked hypotheses about what's causing the problem.
216
+
217
+ Problem: [problem statement]
218
+ Codebase context: [from Phase 0]
219
+ Likely root cause category: [from investigation plan]
220
+
221
+ ## Analysis Method
222
+
223
+ ### 1. Trace the Execution Path
224
+ Starting from the user's trigger action, trace through every function
225
+ call, state mutation, and branch condition until you reach the symptom.
226
+ Document the full chain.
227
+
228
+ ### 2. Identify Suspect Points
229
+ At each step in the chain, evaluate:
230
+ - Could this function receive unexpected input?
231
+ - Could this state be in an unexpected shape?
232
+ - Could this condition evaluate differently than intended?
233
+ - Is there a timing assumption (X happens before Y)?
234
+ - Is there an implicit dependency (this works because that was set up earlier)?
235
+ - Is error handling missing or swallowing relevant errors?
236
+
237
+ ### 3. Form Hypotheses
238
+ For each suspect point, write a hypothesis:
239
+ - **What**: "The bug occurs because X"
240
+ - **Why**: "Because when [condition], the code at [file:line] does [thing]
241
+ instead of [expected thing]"
242
+ - **Evidence for**: What supports this theory
243
+ - **Evidence against**: What contradicts this theory
244
+ - **How to verify**: What specific test or log would prove/disprove this
245
+
246
+ ### 4. Rank by Likelihood
247
+ Order hypotheses from most to least likely based on:
248
+ - How much evidence supports each one
249
+ - How well it explains ALL symptoms (not just some)
250
+ - Whether it aligns with the root cause category
251
+ - Occam's razor — simpler explanations first
252
+
253
+ ## Output Format
254
+
255
+ Write a file called `HYPOTHESES.md` with:
256
+
257
+ ### Hypothesis 1 (most likely): [title]
258
+ - **Claim**: [one sentence]
259
+ - **Mechanism**: [detailed explanation of how the bug occurs]
260
+ - **Code path**: [file:line] -> [file:line] -> [file:line]
261
+ - **Evidence for**: [what supports this]
262
+ - **Evidence against**: [what contradicts this]
263
+ - **Verification**: [how to prove/disprove]
264
+ - **Suggested fix**: [high-level approach]
265
+
266
+ [repeat for each hypothesis, ranked]
267
+
268
+ ### Summary
269
+ - Top 3 hypotheses with confidence levels
270
+ - Recommended investigation order
271
+ - What additional data would help distinguish between hypotheses
272
+
273
+ ## Key Principle
274
+
275
+ Don't jump to conclusions. The first plausible explanation is often
276
+ wrong: it is usually the theory someone already tried that didn't pan
277
+ out. Trace the actual code, don't assume. Read every line in the path.
278
+ The bug is in the code, and the code is right there to be read.
279
+ ```
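
The ranking criteria in step 4 can be sketched as a sort key. The ordering used here (net evidence first, then whether the hypothesis explains all symptoms, then category alignment, then fewest assumptions as a stand-in for Occam's razor) is one assumed interpretation, not something the prompt specifies; the `Hypothesis` fields and both helper functions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    title: str
    claim: str
    evidence_for: list = field(default_factory=list)
    evidence_against: list = field(default_factory=list)
    explains_all_symptoms: bool = False
    matches_category: bool = False
    assumptions: int = 1  # fewer assumptions means a simpler explanation


def rank(hypotheses):
    """Sort from most to least likely under this sketch's assumed criteria."""
    def sort_key(h):
        net_evidence = len(h.evidence_for) - len(h.evidence_against)
        return (
            -net_evidence,                 # more net evidence first
            not h.explains_all_symptoms,   # full explanations first
            not h.matches_category,        # category matches first
            h.assumptions,                 # simpler explanations first
        )
    return sorted(hypotheses, key=sort_key)


def render_headings(ranked):
    """Produce the HYPOTHESES.md section headings for a ranked list."""
    headings = []
    for number, h in enumerate(ranked, 1):
        suffix = " (most likely)" if number == 1 else ""
        headings.append(f"### Hypothesis {number}{suffix}: {h.title}")
    return headings


if __name__ == "__main__":
    candidates = [
        Hypothesis("Bad config", "Timeout is misread", ["log line"], ["works in staging"]),
        Hypothesis("Race", "Handler runs before init", ["trace", "timing"]),
        Hypothesis("Stale cache", "Old entry is served", ["trace", "repro"],
                   explains_all_symptoms=True),
    ]
    for heading in render_headings(rank(candidates)):
        print(heading)
```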