feed-the-machine 1.0.0
- package/LICENSE +21 -0
- package/README.md +268 -0
- package/bin/generate-manifest.mjs +210 -0
- package/bin/install.mjs +114 -0
- package/ftm/SKILL.md +88 -0
- package/ftm-audit/SKILL.md +146 -0
- package/ftm-audit/references/protocols/PROJECT-PATTERNS.md +91 -0
- package/ftm-audit/references/protocols/RUNTIME-WIRING.md +66 -0
- package/ftm-audit/references/protocols/WIRING-CONTRACTS.md +135 -0
- package/ftm-audit/references/strategies/AUTO-FIX-STRATEGIES.md +69 -0
- package/ftm-audit/references/templates/REPORT-FORMAT.md +96 -0
- package/ftm-audit/scripts/run-knip.sh +23 -0
- package/ftm-audit.yml +2 -0
- package/ftm-brainstorm/SKILL.md +379 -0
- package/ftm-brainstorm/evals/evals.json +100 -0
- package/ftm-brainstorm/evals/promptfoo.yaml +109 -0
- package/ftm-brainstorm/references/agent-prompts.md +224 -0
- package/ftm-brainstorm/references/plan-template.md +121 -0
- package/ftm-brainstorm.yml +2 -0
- package/ftm-browse/SKILL.md +415 -0
- package/ftm-browse/daemon/browser-manager.ts +206 -0
- package/ftm-browse/daemon/bun.lock +30 -0
- package/ftm-browse/daemon/cli.ts +347 -0
- package/ftm-browse/daemon/commands.ts +410 -0
- package/ftm-browse/daemon/main.ts +357 -0
- package/ftm-browse/daemon/package.json +17 -0
- package/ftm-browse/daemon/server.ts +189 -0
- package/ftm-browse/daemon/snapshot.ts +519 -0
- package/ftm-browse/daemon/tsconfig.json +22 -0
- package/ftm-browse.yml +4 -0
- package/ftm-codex-gate/SKILL.md +302 -0
- package/ftm-codex-gate.yml +2 -0
- package/ftm-config/SKILL.md +310 -0
- package/ftm-config.default.yml +80 -0
- package/ftm-config.yml +2 -0
- package/ftm-council/SKILL.md +132 -0
- package/ftm-council/references/prompts/CLAUDE-INVESTIGATION.md +60 -0
- package/ftm-council/references/prompts/CODEX-INVESTIGATION.md +58 -0
- package/ftm-council/references/prompts/GEMINI-INVESTIGATION.md +58 -0
- package/ftm-council/references/prompts/REBUTTAL-TEMPLATE.md +57 -0
- package/ftm-council/references/protocols/PREREQUISITES.md +47 -0
- package/ftm-council/references/protocols/STEP-0-FRAMING.md +46 -0
- package/ftm-council.yml +2 -0
- package/ftm-dashboard.yml +4 -0
- package/ftm-debug/SKILL.md +146 -0
- package/ftm-debug/references/phases/PHASE-0-INTAKE.md +58 -0
- package/ftm-debug/references/phases/PHASE-1-TRIAGE.md +46 -0
- package/ftm-debug/references/phases/PHASE-2-WAR-ROOM-AGENTS.md +279 -0
- package/ftm-debug/references/phases/PHASE-3-TO-6-EXECUTION.md +436 -0
- package/ftm-debug/references/protocols/BLACKBOARD.md +86 -0
- package/ftm-debug/references/protocols/EDGE-CASES.md +103 -0
- package/ftm-debug.yml +2 -0
- package/ftm-diagram/SKILL.md +233 -0
- package/ftm-diagram.yml +2 -0
- package/ftm-executor/SKILL.md +657 -0
- package/ftm-executor/references/STYLE-TEMPLATE.md +73 -0
- package/ftm-executor/references/phases/PHASE-0-VERIFICATION.md +62 -0
- package/ftm-executor/references/phases/PHASE-2-AGENT-ASSEMBLY.md +34 -0
- package/ftm-executor/references/phases/PHASE-3-WORKTREES.md +38 -0
- package/ftm-executor/references/phases/PHASE-4-5-AUDIT.md +72 -0
- package/ftm-executor/references/phases/PHASE-4-DISPATCH.md +66 -0
- package/ftm-executor/references/phases/PHASE-5-5-CODEX-GATE.md +73 -0
- package/ftm-executor/references/protocols/DOCUMENTATION-BOOTSTRAP.md +36 -0
- package/ftm-executor/references/protocols/MODEL-PROFILE.md +44 -0
- package/ftm-executor/references/protocols/PROGRESS-TRACKING.md +66 -0
- package/ftm-executor/runtime/ftm-runtime.mjs +252 -0
- package/ftm-executor/runtime/package.json +8 -0
- package/ftm-executor.yml +2 -0
- package/ftm-git/SKILL.md +195 -0
- package/ftm-git/evals/evals.json +26 -0
- package/ftm-git/evals/promptfoo.yaml +75 -0
- package/ftm-git/hooks/post-commit-experience.sh +92 -0
- package/ftm-git/references/patterns/SECRET-PATTERNS.md +104 -0
- package/ftm-git/references/protocols/REMEDIATION.md +139 -0
- package/ftm-git/scripts/pre-commit-secrets.sh +110 -0
- package/ftm-git.yml +2 -0
- package/ftm-intent/SKILL.md +198 -0
- package/ftm-intent.yml +2 -0
- package/ftm-map.yml +2 -0
- package/ftm-mind/SKILL.md +986 -0
- package/ftm-mind/evals/promptfoo.yaml +142 -0
- package/ftm-mind/references/blackboard-schema.md +328 -0
- package/ftm-mind/references/complexity-guide.md +110 -0
- package/ftm-mind/references/event-registry.md +299 -0
- package/ftm-mind/references/mcp-inventory.md +296 -0
- package/ftm-mind/references/protocols/COMPLEXITY-SIZING.md +72 -0
- package/ftm-mind/references/protocols/MCP-HEURISTICS.md +32 -0
- package/ftm-mind/references/protocols/PLAN-APPROVAL.md +80 -0
- package/ftm-mind/references/reflexion-protocol.md +249 -0
- package/ftm-mind/references/routing/SCENARIOS.md +22 -0
- package/ftm-mind/references/routing-scenarios.md +35 -0
- package/ftm-mind.yml +2 -0
- package/ftm-pause/SKILL.md +133 -0
- package/ftm-pause/references/protocols/SKILL-RESTORE-PROTOCOLS.md +186 -0
- package/ftm-pause/references/protocols/VALIDATION.md +80 -0
- package/ftm-pause.yml +2 -0
- package/ftm-researcher.yml +2 -0
- package/ftm-resume/SKILL.md +166 -0
- package/ftm-resume/references/protocols/VALIDATION.md +172 -0
- package/ftm-resume.yml +2 -0
- package/ftm-retro/SKILL.md +189 -0
- package/ftm-retro/references/protocols/SCORING-RUBRICS.md +89 -0
- package/ftm-retro/references/templates/REPORT-FORMAT.md +109 -0
- package/ftm-retro.yml +2 -0
- package/ftm-routine.yml +4 -0
- package/ftm-state/blackboard/context.json +23 -0
- package/ftm-state/blackboard/experiences/index.json +9 -0
- package/ftm-state/blackboard/patterns.json +6 -0
- package/ftm-state/schemas/context.schema.json +130 -0
- package/ftm-state/schemas/experience-index.schema.json +77 -0
- package/ftm-state/schemas/experience.schema.json +78 -0
- package/ftm-state/schemas/patterns.schema.json +44 -0
- package/ftm-upgrade/SKILL.md +153 -0
- package/ftm-upgrade/scripts/check-version.sh +76 -0
- package/ftm-upgrade/scripts/upgrade.sh +143 -0
- package/ftm-upgrade.yml +2 -0
- package/ftm.yml +2 -0
- package/install.sh +102 -0
- package/package.json +74 -0
- package/uninstall.sh +25 -0
@@ -0,0 +1,58 @@
# Phase 0: Problem Intake

Before launching agents, understand what you're debugging. This happens in the main conversation thread; no agents yet.

## Step 1: Gather the Problem Statement

If the user hasn't already described the bug in detail, ask targeted questions (one at a time, skip what you already know from conversation history):

1. **What's happening?** The symptom. What does the user see/experience?
2. **What should be happening?** The expected behavior.
3. **What have you already tried?** Critical context. Don't duplicate wasted work.
4. **When did it start?** A recent change? Always been broken? Intermittent?
5. **Can you trigger it reliably?** Reproduction steps if they exist.

## Step 2: Codebase Reconnaissance

Spawn an **Explore agent** to scan the relevant area of the codebase:

```
Analyze the codebase around the reported problem area:

1. **Entry points**: What are the main files involved in this feature/behavior?
2. **Call graph**: Trace the execution path from trigger to symptom
3. **State flow**: What state (variables, stores, databases, caches) does this code touch?
4. **Dependencies**: What external libs, APIs, or services are in the path?
5. **Recent changes**: Check git log for recent modifications to relevant files
6. **Test coverage**: Are there existing tests for this code path? Do they pass?
7. **Configuration**: Environment variables, feature flags, build config that affect behavior
8. **Error handling**: Where does error handling exist? Where is it missing?

Focus on the area described by the user. Map the territory before anyone tries to change it.
```

Store the result as **codebase context**. Every subsequent agent receives this.

## Step 3: Formulate the Investigation Plan

Based on the problem statement and codebase context, decide:

1. **Which debug vectors are relevant?** Not every bug needs all 7 agents. A pure logic bug doesn't need instrumentation. A well-documented API issue might not need research. Pick what helps.
2. **What specific questions should each agent answer?** Generic "go investigate" prompts produce generic results. Targeted questions produce answers.
3. **What's the most likely root cause category?** (Race condition? State corruption? API contract mismatch? Build/config issue? Logic error? Missing error handling?) This focuses the investigation.

Present the investigation plan to the user:

```
Investigation Plan:
Problem: [one-line summary]
Likely category: [race condition / state bug / API mismatch / etc.]
Agents deploying:
- Instrumenter: [what they'll instrument and why]
- Researcher: [what they'll search for]
- Reproducer: [reproduction strategy]
- Hypothesizer: [which code paths they'll analyze]
Worktree strategy: [how many worktrees, branch naming]
```

Then proceed immediately unless the user objects.
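
The plan template above can be treated as structured data and rendered to text. The following is a minimal sketch, not part of the package: the `InvestigationPlan` and `AgentAssignment` types and the `renderPlan` helper are hypothetical names introduced here for illustration only.

```typescript
// Hypothetical types: field names are assumptions chosen to mirror the template.
interface AgentAssignment {
  role: string; // e.g. "Instrumenter"
  task: string; // what the agent will do and why
}

interface InvestigationPlan {
  problem: string;
  likelyCategory: string;
  agents: AgentAssignment[];
  worktreeStrategy: string;
}

// Render the plan in the same line order as the template in this document.
function renderPlan(plan: InvestigationPlan): string {
  const lines = [
    "Investigation Plan:",
    `Problem: ${plan.problem}`,
    `Likely category: ${plan.likelyCategory}`,
    "Agents deploying:",
    ...plan.agents.map((a) => `- ${a.role}: ${a.task}`),
    `Worktree strategy: ${plan.worktreeStrategy}`,
  ];
  return lines.join("\n");
}

// Illustrative values only.
const examplePlan: InvestigationPlan = {
  problem: "App crashes on startup after ESM migration",
  likelyCategory: "build/config issue",
  agents: [
    { role: "Researcher", task: "check migration guides for the bundler" },
    { role: "Hypothesizer", task: "trace the module loading path" },
  ],
  worktreeStrategy: "one worktree for the fix, branch debug/esm-crash/fix",
};
const renderedPlan = renderPlan(examplePlan);
```

Keeping the plan as data makes it easy to drop agents that are not relevant to the bug before the text is shown to the user.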
@@ -0,0 +1,46 @@
# Phase 1: Parallel Investigation (the war room)

Launch all investigation agents **simultaneously**. This is the core value: attacking from every angle at once.

## Agent Selection Guide

Not every bug needs all agents. Here's when to scale down:

| Bug Type | Skip These | Keep These |
|----------|-----------|------------|
| Pure logic error (wrong output) | Instrumenter | Researcher, Reproducer, Hypothesizer, Solver, Reviewer |
| Race condition / timing | None (use all) | All; timing bugs are the hardest |
| Known library bug (error message is googleable) | Hypothesizer | Researcher (primary), Solver, Reviewer |
| UI rendering glitch | Researcher (maybe) | Instrumenter (critical), Reproducer, Hypothesizer, Solver, Reviewer (with visual verification!) |
| Terminal/CLI visual output | Researcher (maybe) | Instrumenter, Reproducer, Hypothesizer, Solver, Reviewer (with visual verification!) |
| Build / config issue | Reproducer | Researcher (check migration guides), Hypothesizer, Solver, Reviewer |
| Intermittent / flaky | None (use all) | All; flaky bugs need every angle |
| Performance regression | Researcher | Instrumenter (profiling), Reproducer (benchmark), Hypothesizer, Solver, Reviewer |

When in doubt, use all of them. The cost of a redundant agent is some compute time. The cost of missing the right angle is another hour of debugging.
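
The selection table can be encoded as a lookup. This is a sketch under stated assumptions: the `BugType` keys and the `selectAgents` helper are hypothetical names, and the lists copy the "Keep These" column for the six roles named in the table. An unknown bug type falls back to every agent, following the "when in doubt" rule.

```typescript
type AgentRole =
  | "Instrumenter" | "Researcher" | "Reproducer"
  | "Hypothesizer" | "Solver" | "Reviewer";

// Hypothetical keys, one per row of the table.
type BugType =
  | "pure-logic" | "race-condition" | "known-library-bug" | "ui-rendering"
  | "cli-visual-output" | "build-config" | "flaky" | "performance-regression";

const ALL_AGENTS: AgentRole[] = [
  "Instrumenter", "Researcher", "Reproducer", "Hypothesizer", "Solver", "Reviewer",
];

// Transcription of the "Keep These" column.
const KEEP: Record<BugType, AgentRole[]> = {
  "pure-logic": ["Researcher", "Reproducer", "Hypothesizer", "Solver", "Reviewer"],
  "race-condition": ALL_AGENTS,
  "known-library-bug": ["Researcher", "Solver", "Reviewer"],
  "ui-rendering": ["Instrumenter", "Reproducer", "Hypothesizer", "Solver", "Reviewer"],
  "cli-visual-output": ["Instrumenter", "Reproducer", "Hypothesizer", "Solver", "Reviewer"],
  "build-config": ["Researcher", "Hypothesizer", "Solver", "Reviewer"],
  "flaky": ALL_AGENTS,
  "performance-regression": ["Instrumenter", "Reproducer", "Hypothesizer", "Solver", "Reviewer"],
};

// When the bug type is unknown, use every agent.
function selectAgents(bugType?: BugType): AgentRole[] {
  return bugType ? [...KEEP[bugType]] : [...ALL_AGENTS];
}
```

Rows marked "(maybe)" in the "Skip These" column are a judgment call; this sketch treats them as skipped.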

## Worktree Strategy

Every agent that makes code changes gets its own worktree:

```
.worktrees/
  debug-instrumentation/   (Instrumenter's logging)
  debug-reproduction/      (Reproducer's test cases)
  debug-fix/               (Solver's fix attempts)
```

Branch naming: `debug/<problem-slug>/<agent-role>`

Example: `debug/esm-crash/instrumentation`, `debug/esm-crash/fix`

This means:
- Every experiment is isolated and can be kept or discarded
- The Solver can have multiple fix attempts on separate branches
- The Reproducer's test stays clean from fix changes
- You can diff any agent's work against main to see exactly what they did
- **Commit after every meaningful change.** If a fix attempt fails, the commit history shows exactly what was tried

Ensure `.worktrees/` is in `.gitignore`.

After the fix is approved and merged, clean up all debug worktrees and branches.
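
The naming convention above can be generated rather than typed by hand. This is a minimal sketch: `toSlug`, `debugBranchName`, `worktreePath`, and `worktreeAddCommand` are hypothetical helpers, and the slug rule (lowercase, runs of non-alphanumerics collapsed to a hyphen) is an assumption, since the document does not define one. The sketch only builds the command string and never runs git.

```typescript
// Assumed slug rule: lowercase, collapse non-alphanumerics to "-", trim hyphens.
function toSlug(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// debug/<problem-slug>/<agent-role>
function debugBranchName(problem: string, agentRole: string): string {
  return `debug/${toSlug(problem)}/${toSlug(agentRole)}`;
}

// .worktrees/debug-<agent-role>, matching the directory layout shown above.
function worktreePath(agentRole: string): string {
  return `.worktrees/debug-${toSlug(agentRole)}`;
}

// Builds a `git worktree add -b <branch> <path>` command as a string.
function worktreeAddCommand(problem: string, agentRole: string): string {
  return `git worktree add -b ${debugBranchName(problem, agentRole)} ${worktreePath(agentRole)}`;
}

const exampleBranch = debugBranchName("ESM crash", "instrumentation");
const exampleCommand = worktreeAddCommand("ESM crash", "fix");
```

Deriving both the branch and the directory from the same role name keeps cleanup simple, because every debug worktree and branch can be found by prefix.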
@@ -0,0 +1,279 @@
# Phase 2: War Room Agent Profiles & Prompts

All four investigation agents run simultaneously. Each receives the problem statement and codebase context from Phase 0.

---

## Agent: Instrumenter

The Instrumenter adds comprehensive debug logging and observability to the problem area. This agent works in its own worktree so instrumentation code stays isolated from fix attempts.

```
You are the Instrumenter in a debug war room. Your job is to add debug
logging and observability so the team can SEE what's happening at runtime.

Working directory: [worktree path]
Problem: [problem statement]
Codebase context: [from Phase 0]
Likely root cause category: [from investigation plan]

## What to Instrument

Add logging that captures the invisible. Think about what data would let
you diagnose this bug if you could only read a log file:

### State Snapshots
- Capture the full state at key decision points (before/after transforms,
  at branch conditions, before API calls)
- Log both the input AND output of any function in the suspect path
- For UI bugs: capture render state, props, computed values
- For API bugs: capture request + response bodies + headers + timing
- For state management bugs: capture state before and after mutations

### Timing & Sequencing
- Add timestamps to every log entry (use high-resolution: performance.now()
  or process.hrtime() depending on environment)
- Log entry and exit of key functions to see execution order
- For async code: log when promises are created, resolved, rejected
- For event-driven code: log event emission and handler invocation

### Environment & Configuration
- Log all relevant env vars, feature flags, config values at startup
- Log platform/runtime details (versions, OS, screen size for UI bugs)
- Capture the state of any caches, memoization, or lazy-loaded resources

### Error Boundaries
- Wrap suspect code in try/catch (if not already) and log caught errors
  with full stack traces
- Add error event listeners where appropriate
- Log warnings that might be swallowed silently

## Output Format

1. Make all changes in the worktree and commit them
2. Write a file called `DEBUG-INSTRUMENTATION.md` documenting:
   - Every log point added and what it captures
   - How to enable/trigger the logging (env vars, flags, etc.)
   - How to read the output (log file locations, format explanation)
   - A suggested test script to exercise the instrumented code paths
3. If the problem has a UI component, add visual debug indicators too
   (border highlights, state dumps in dev tools, overlay panels)

## Key Principle

Instrument generously. It's cheap to add logging and expensive to guess.
The cost of too much logging is scrolling; the cost of too little is
another round of debugging. When in doubt, log it.
```
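
As an illustration of the entry/exit, input/output, and timestamp guidance in the prompt, here is a small sketch of a wrapper for synchronous functions. The `instrument` helper, its log format, and the injectable `sink` and `clock` parameters are assumptions made for this example; a real Instrumenter would adapt them to the project's logger.

```typescript
type LogSink = (entry: string) => void;

// High-resolution clock when the runtime exposes performance.now(),
// otherwise fall back to Date.now().
function nowMs(): number {
  const perf: { now(): number } | undefined = (globalThis as any).performance;
  return perf ? perf.now() : Date.now();
}

// Wrap a synchronous function so each call logs entry (with input),
// exit (with output), and any thrown error, each with a timestamp.
function instrument<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => R,
  sink: LogSink = console.log,
  clock: () => number = nowMs,
): (...args: A) => R {
  return (...args: A): R => {
    sink(`[${clock().toFixed(3)}] enter ${name} args=${JSON.stringify(args)}`);
    try {
      const result = fn(...args);
      sink(`[${clock().toFixed(3)}] exit ${name} result=${JSON.stringify(result)}`);
      return result;
    } catch (err) {
      const detail = err instanceof Error ? err.stack ?? err.message : String(err);
      sink(`[${clock().toFixed(3)}] throw ${name} ${detail}`);
      throw err; // rethrow so the wrapped code behaves as before
    }
  };
}

// Usage with a fake clock and an in-memory sink, so the output is deterministic.
const capturedLog: string[] = [];
let fakeTick = 0;
const tracedAdd = instrument(
  "add",
  (a: number, b: number) => a + b,
  (line) => capturedLog.push(line),
  () => fakeTick++,
);
const tracedSum = tracedAdd(2, 3);
```

Injecting the sink and clock keeps the wrapper testable; in practice the defaults would usually be enough.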

---

## Agent: Researcher

The Researcher searches for existing solutions, since someone else has probably hit this exact bug or something like it.

```
You are the Researcher in a debug war room. Your job is to find out if
this problem has been solved before, what patterns others used, and what
pitfalls to avoid.

Problem: [problem statement]
Codebase context: [from Phase 0]
Tech stack: [languages, frameworks, key dependencies from Phase 0]
Likely root cause category: [from investigation plan]

## Research Vectors (search all of these)

### 1. GitHub Issues & Discussions
Search the GitHub repos of every dependency in the problem path:
- Search for keywords from the error message or symptom
- Search for the function/class names involved
- Check closed issues: the fix might already exist in a newer version
- Check open issues: this might be a known unfixed bug

### 2. Stack Overflow & Forums
Search for:
- The exact error message (in quotes)
- The symptom described in plain language + framework name
- The specific API or function that's misbehaving

### 3. Library Documentation
Use Context7 or official docs to check:
- Are we using the API correctly? Check current docs, not cached knowledge
- Are there known caveats, migration notes, or breaking changes?
- Is there a recommended pattern we're not following?

### 4. Blog Posts & Technical Articles
Search for:
- "[framework] + [symptom]", e.g., "React useEffect infinite loop"
- "[library] + [error category]", e.g., "webpack ESM require crash"
- "[pattern] + debugging", e.g., "WebSocket reconnection race condition"

### 5. Release Notes & Changelogs
Check if a recent dependency update introduced the issue:
- Compare the installed version vs latest, check changelog between them
- Look for deprecation notices that match our usage pattern

## Output Format

Write a file called `RESEARCH-FINDINGS.md` with:

For each relevant finding:
- **Source**: URL or reference
- **Relevance**: Why this applies to our problem (1-2 sentences)
- **Solution found**: What fix/workaround was used (if any)
- **Confidence**: How closely this matches our situation (high/medium/low)
- **Key insight**: The non-obvious thing we should know

End with a **Recommended approach** section that synthesizes the most
promising leads into an actionable suggestion.

## Key Principle

Cast a wide net, then filter ruthlessly. The goal is not 50 vaguely
related links; it's 3-5 findings that directly inform the fix. Quality
of relevance over quantity of results.
```
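
The "filter ruthlessly" step can be sketched as a shortlist over structured findings. The `Finding` type mirrors the fields of the output format above; the `shortlist` helper and its rule (drop low-confidence findings, put high before medium, cap at five) are assumptions for illustration, not something the prompt prescribes. The URLs are placeholders.

```typescript
type Confidence = "high" | "medium" | "low";

// Fields follow the per-finding output format; names are assumptions.
interface Finding {
  source: string;
  relevance: string;
  solutionFound?: string;
  confidence: Confidence;
  keyInsight: string;
}

const CONFIDENCE_ORDER: Record<Confidence, number> = { high: 0, medium: 1, low: 2 };

// Assumed rule: discard "low", sort high before medium, keep at most `limit`.
// filter() returns a new array, so the input is not mutated by sort().
function shortlist(findings: Finding[], limit = 5): Finding[] {
  return findings
    .filter((f) => f.confidence !== "low")
    .sort((a, b) => CONFIDENCE_ORDER[a.confidence] - CONFIDENCE_ORDER[b.confidence])
    .slice(0, limit);
}

const sampleFindings: Finding[] = [
  { source: "https://example.com/a", relevance: "Similar symptom", confidence: "low", keyInsight: "Different version" },
  { source: "https://example.com/b", relevance: "Same error text", confidence: "medium", keyInsight: "Workaround exists" },
  { source: "https://example.com/c", relevance: "Same stack and version", confidence: "high", keyInsight: "Fixed upstream" },
];
const keptFindings = shortlist(sampleFindings);
```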

---

## Agent: Reproducer

The Reproducer creates a minimal, reliable way to trigger the bug.

```
You are the Reproducer in a debug war room. Your job is to create the
simplest possible reproduction of the bug, ideally an automated test
that fails, or a script that triggers the symptom reliably.

Working directory: [worktree path]
Problem: [problem statement]
Codebase context: [from Phase 0]
Reproduction steps from user: [if any]

## Reproduction Strategy

### 1. Verify the User's Steps
If the user provided reproduction steps, follow them exactly first.
Document whether the bug appears consistently or intermittently.

### 2. Write a Failing Test
The gold standard is a test that:
- Fails now (reproduces the bug)
- Will pass when the bug is fixed
- Runs in the project's existing test framework

If the bug is in a function: write a unit test with the inputs that
trigger the failure.

If the bug is in a flow: write an integration test that exercises the
full path.

If the bug requires a running server/UI: write a script that automates
the trigger (curl commands, Playwright script, CLI invocation, etc.)

### 3. Minimize
Strip away everything that isn't necessary to trigger the bug:
- Remove unrelated setup steps
- Use the simplest possible inputs
- Isolate the exact conditions (timing, data shape, config values)

### 4. Characterize
Once you can reproduce it, characterize the boundaries:
- What inputs trigger it? What inputs don't?
- Is it timing-dependent? Data-dependent? Config-dependent?
- Does it happen on first run only, every run, or intermittently?
- What's the smallest change that makes it go away?

## Output Format

1. Commit all reproduction artifacts to the worktree
2. Write a file called `REPRODUCTION.md` documenting:
   - **Trigger command**: The single command to reproduce the bug
   - **Expected vs actual**: What should happen vs what does happen
   - **Consistency**: How reliably it reproduces (every time / 8 out of 10 / etc.)
   - **Boundaries**: What makes it appear/disappear
   - **Minimal test**: Path to the failing test file
   - **Environment requirements**: Any special setup needed

## Key Principle

A bug you can't reproduce is a bug you can't fix with confidence. And a
bug you can reproduce with a single command is a bug you can fix in
minutes. The reproduction IS the debugging.
```
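
The **Consistency** field in `REPRODUCTION.md` can be measured rather than estimated. This is a minimal sketch, assuming the reproduction is wrapped in a synchronous callback: `measureConsistency` and the `trigger` callback are hypothetical names, and treating a thrown error as a successful reproduction is an assumption of this example.

```typescript
// `trigger` is a hypothetical callback that returns true (or throws)
// when the bug shows up on that run.
function measureConsistency(trigger: () => boolean, runs = 10): string {
  let reproduced = 0;
  for (let i = 0; i < runs; i++) {
    try {
      if (trigger()) reproduced++;
    } catch {
      reproduced++; // assumption: a thrown error counts as a reproduction
    }
  }
  if (reproduced === runs) return "every time";
  if (reproduced === 0) return "never";
  return `${reproduced} out of ${runs}`;
}

// Deterministic stand-in for a flaky bug: fails to reproduce on every fifth call.
let callCount = 0;
const flakyTrigger = (): boolean => callCount++ % 5 !== 0;
const consistencyReport = measureConsistency(flakyTrigger);
```

For genuinely intermittent bugs a larger `runs` value gives a more trustworthy figure than ten attempts.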

---

## Agent: Hypothesizer

The Hypothesizer reads the code deeply and forms theories about root cause.

```
You are the Hypothesizer in a debug war room. Your job is to deeply read
the code involved in the bug, trace every execution path, and form
ranked hypotheses about what's causing the problem.

Problem: [problem statement]
Codebase context: [from Phase 0]
Likely root cause category: [from investigation plan]

## Analysis Method

### 1. Trace the Execution Path
Starting from the user's trigger action, trace through every function
call, state mutation, and branch condition until you reach the symptom.
Document the full chain.

### 2. Identify Suspect Points
At each step in the chain, evaluate:
- Could this function receive unexpected input?
- Could this state be in an unexpected shape?
- Could this condition evaluate differently than intended?
- Is there a timing assumption (X happens before Y)?
- Is there an implicit dependency (this works because that was set up earlier)?
- Is error handling missing or swallowing relevant errors?

### 3. Form Hypotheses
For each suspect point, write a hypothesis:
- **What**: "The bug occurs because X"
- **Why**: "Because when [condition], the code at [file:line] does [thing]
  instead of [expected thing]"
- **Evidence for**: What supports this theory
- **Evidence against**: What contradicts this theory
- **How to verify**: What specific test or log would prove/disprove this

### 4. Rank by Likelihood
Order hypotheses from most to least likely based on:
- How much evidence supports each one
- How well it explains ALL symptoms (not just some)
- Whether it aligns with the root cause category
- Occam's razor: simpler explanations first

## Output Format

Write a file called `HYPOTHESES.md` with:

### Hypothesis 1 (most likely): [title]
- **Claim**: [one sentence]
- **Mechanism**: [detailed explanation of how the bug occurs]
- **Code path**: [file:line] -> [file:line] -> [file:line]
- **Evidence for**: [what supports this]
- **Evidence against**: [what contradicts this]
- **Verification**: [how to prove/disprove]
- **Suggested fix**: [high-level approach]

[repeat for each hypothesis, ranked]

### Summary
- Top 3 hypotheses with confidence levels
- Recommended investigation order
- What additional data would help distinguish between hypotheses

## Key Principle

Don't jump to conclusions. The first plausible explanation is often
wrong; it's usually the one someone already thought of that didn't pan
out. Trace the actual code; don't assume. Read every line in the path.
The bug is in the code, and the code is right there to be read.
```
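
The ranking criteria in the prompt can be made concrete with a simple score. This is only a sketch: the `Hypothesis` fields, the `hypothesisScore` function, and especially the numeric weights are assumptions chosen for illustration. The prompt names the criteria but does not assign weights, and ranking in practice is a judgment call.

```typescript
// Hypothetical structure mirroring the four ranking criteria.
interface Hypothesis {
  title: string;
  evidenceFor: string[];
  evidenceAgainst: string[];
  explainsAllSymptoms: boolean;
  matchesCategory: boolean;
  assumptions: number; // fewer assumptions means a simpler explanation
}

// Assumed weights: net evidence count, +2 for explaining every symptom,
// +1 for matching the root cause category, -0.5 per extra assumption.
function hypothesisScore(h: Hypothesis): number {
  return (
    h.evidenceFor.length -
    h.evidenceAgainst.length +
    (h.explainsAllSymptoms ? 2 : 0) +
    (h.matchesCategory ? 1 : 0) -
    0.5 * h.assumptions
  );
}

// Most likely first; copies the array so the caller's order is preserved.
function rankHypotheses(hypotheses: Hypothesis[]): Hypothesis[] {
  return [...hypotheses].sort((a, b) => hypothesisScore(b) - hypothesisScore(a));
}

const candidateHypotheses: Hypothesis[] = [
  {
    title: "Cache returns stale entry",
    evidenceFor: ["stale value seen in log"],
    evidenceAgainst: ["cache disabled in one failing run"],
    explainsAllSymptoms: false,
    matchesCategory: true,
    assumptions: 3,
  },
  {
    title: "Handler registered twice",
    evidenceFor: ["duplicate log lines", "double network request"],
    evidenceAgainst: [],
    explainsAllSymptoms: true,
    matchesCategory: true,
    assumptions: 1,
  },
];
const rankedHypotheses = rankHypotheses(candidateHypotheses);
```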