feed-the-machine 1.3.0 → 1.4.0

Files changed (45)
  1. package/ftm-audit/SKILL.md +383 -57
  2. package/ftm-brainstorm/SKILL.md +119 -51
  3. package/ftm-config/SKILL.md +1 -1
  4. package/ftm-council/SKILL.md +259 -31
  5. package/ftm-dashboard/SKILL.md +10 -10
  6. package/ftm-debug/SKILL.md +861 -54
  7. package/ftm-diagram/SKILL.md +1 -1
  8. package/ftm-executor/SKILL.md +6 -6
  9. package/ftm-git/SKILL.md +208 -22
  10. package/ftm-inbox/bin/start.sh +1 -1
  11. package/ftm-inbox/bin/status.sh +1 -1
  12. package/ftm-inbox/bin/stop.sh +1 -1
  13. package/ftm-intent/SKILL.md +0 -1
  14. package/ftm-map/SKILL.md +46 -14
  15. package/ftm-map/scripts/db.py +439 -118
  16. package/ftm-map/scripts/index.py +128 -54
  17. package/ftm-map/scripts/parser.py +89 -320
  18. package/ftm-map/scripts/queries/go-tags.scm +20 -0
  19. package/ftm-map/scripts/queries/javascript-tags.scm +19 -7
  20. package/ftm-map/scripts/queries/python-tags.scm +22 -8
  21. package/ftm-map/scripts/queries/ruby-tags.scm +19 -0
  22. package/ftm-map/scripts/queries/rust-tags.scm +37 -0
  23. package/ftm-map/scripts/queries/typescript-tags.scm +20 -8
  24. package/ftm-map/scripts/query.py +176 -24
  25. package/ftm-map/scripts/ranker.py +377 -0
  26. package/ftm-map/scripts/requirements.txt +3 -0
  27. package/ftm-map/scripts/setup.sh +11 -0
  28. package/ftm-map/scripts/test_db.py +355 -115
  29. package/ftm-map/scripts/test_parser.py +169 -101
  30. package/ftm-map/scripts/test_query.py +178 -61
  31. package/ftm-map/scripts/test_ranker.py +199 -0
  32. package/ftm-map/scripts/views.py +107 -61
  33. package/ftm-mind/SKILL.md +861 -11
  34. package/ftm-mind/references/event-registry.md +20 -0
  35. package/ftm-pause/SKILL.md +256 -37
  36. package/ftm-resume/SKILL.md +380 -75
  37. package/ftm-retro/SKILL.md +164 -27
  38. package/ftm-upgrade/SKILL.md +4 -4
  39. package/hooks/ftm-blackboard-enforcer.sh +2 -4
  40. package/install.sh +6 -1
  41. package/package.json +1 -1
  42. package/ftm-map/scripts/tests/fixtures/__init__.py +0 -0
  43. package/ftm-map/scripts/tests/fixtures/sample_project/api.ts +0 -16
  44. package/ftm-map/scripts/tests/fixtures/sample_project/auth.py +0 -15
  45. package/ftm-map/scripts/tests/fixtures/sample_project/utils.js +0 -16
@@ -25,53 +25,195 @@ If index.json is empty or no matches found, proceed normally without experience-
25
25
 
26
26
  # FTM Council
27
27
 
28
- Three peers — a subagent investigator, Codex, and Gemini — independently research the codebase and deliberate on a problem through structured rounds of debate. No single model is the authority. Each model explores the code on its own, forms its own conclusions from what it finds, and only then enters deliberation. The council converges through majority vote: when 2 of 3 agree, that's the decision. If 5 rounds pass without majority agreement, the orchestrator synthesizes the best elements from all three positions and presents the user with a clear summary of where the models agreed, where they diverged, and a recommended path forward.
28
+ Three AI peers — Claude, Codex, and Gemini — independently research the codebase and deliberate on a problem through structured rounds of debate. No single model is the authority. Each model explores the code on its own, forms its own conclusions from what it finds, and only then enters deliberation. The council converges through majority vote: when 2 of 3 agree, that's the decision. If 5 rounds pass without majority agreement, Claude synthesizes the best elements from all three positions and presents the user with a clear summary of where the models agreed, where they diverged, and a recommended path forward.
29
29
 
30
30
  ## Why Independent Research Matters
31
31
 
32
- Each model explores the codebase independently, with different attention patterns, different navigation instincts, and different focus areas. This produces genuinely diverse perspectives grounded in what each model actually found, not three reactions to one curated framing.
32
+ The whole point of a multi-model council is diverse perspectives. If Claude reads the code first and then tells the other models what it found, you get three models reacting to Claude's framing — not three independent investigations. That's a game of telephone, not a council.
33
+
34
+ Each model has different attention patterns, different ways of navigating code, and different instincts about what's relevant. Codex might grep for usage patterns Claude wouldn't think to check. Gemini might focus on a config file Claude skimmed past. By letting each model explore independently, you get genuinely different perspectives grounded in what each model actually found in the codebase — not just different opinions about the same Claude-curated snippet.
33
35
 
34
36
  ## Prerequisites
35
37
 
36
- Check tool availability before starting. Read `references/protocols/PREREQUISITES.md` for the full availability check, fallback logic, timeout configuration, and working directory setup.
38
+ The user needs both CLI tools installed and authenticated:
39
+ - **Codex**: `npm install -g @openai/codex` (authenticated via `codex login`)
40
+ - **Gemini**: `npm install -g @google/gemini-cli` (authenticated via Google)
37
41
 
38
- Quick check:
42
+ Before the first round, verify both are available:
39
43
  ```bash
40
44
  which codex && which gemini
41
45
  ```
42
- If either is missing, tell the user what to install and stop — don't try to run a degraded council without informing them.
46
+ If either is missing, tell the user what to install and stop — don't try to run a 2-model council.
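For an orchestrator written as a script rather than a shell session, the same availability check can be sketched in Python. This is a minimal sketch, not part of the package: `REQUIRED_CLIS` and `missing_clis` are hypothetical names, and the install hints are copied from the prerequisites list above.

```python
import shutil

# Hypothetical mapping: CLI name -> install hint taken from the prerequisites list.
REQUIRED_CLIS = {
    "codex": "npm install -g @openai/codex",
    "gemini": "npm install -g @google/gemini-cli",
}


def missing_clis(which=shutil.which):
    """Return {cli_name: install_hint} for every required CLI not found on PATH.

    `which` is injectable so the check can be exercised without the tools installed.
    """
    return {name: hint for name, hint in REQUIRED_CLIS.items() if which(name) is None}


if __name__ == "__main__":
    missing = missing_clis()
    for name, hint in missing.items():
        print(f"{name} is not installed. Install it with: {hint}")
    if missing:
        raise SystemExit("Council requires both CLIs; stopping.")
```

An empty result means both tools are on `PATH`; authentication still has to be verified separately.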
43
47
 
44
48
  ## The Protocol
45
49
 
46
50
  ### Auto-Invocation Mode
47
51
 
48
- Two invocation modes:
49
- 1. **User-invoked** (default): Frame the problem in Step 0, proceed through protocol.
50
- 2. **Auto-invoked**: Another skill provides a pre-framed conflict payload (containing `CONFLICT TYPE`, `ORIGINAL INTENT`, `CODEX'S CHANGE`, etc.). Skip Step 0, use the payload directly, include DEBUG.md history, run Steps 1-5, and return a structured `COUNCIL VERDICT` to the caller without user interaction.
52
+ The council can be invoked in two ways:
53
+
54
+ 1. **User-invoked** (default): The user asks for a council. You frame the problem in Step 0 and proceed through the protocol.
55
+ 2. **Auto-invoked**: Another skill (typically ftm-executor) invokes the council with a pre-framed conflict payload. Skip Step 0 — the problem is already framed.
56
+
57
+ **Detecting auto-invocation:**
58
+ If the invocation includes a structured conflict payload with these fields, you're in auto-invocation mode:
59
+ - `CONFLICT TYPE`
60
+ - `ORIGINAL INTENT`
61
+ - `CODEX'S CHANGE`
62
+ - `CODEX'S REASONING`
63
+ - `THE CODE IN QUESTION`
64
+ - `DEBUG.md HISTORY`
65
+ - `QUESTION FOR THE COUNCIL`
66
+
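The detection rule above amounts to checking that every payload field is present. The following is a minimal sketch under the assumption that the payload arrives as plain text containing those field labels; `is_auto_invocation` is a hypothetical helper, not something the skill defines.

```python
# Field labels copied from the list above.
REQUIRED_PAYLOAD_FIELDS = (
    "CONFLICT TYPE",
    "ORIGINAL INTENT",
    "CODEX'S CHANGE",
    "CODEX'S REASONING",
    "THE CODE IN QUESTION",
    "DEBUG.md HISTORY",
    "QUESTION FOR THE COUNCIL",
)


def is_auto_invocation(invocation_text: str) -> bool:
    """Return True only if every required field label appears in the invocation text.

    Assumption: a plain substring match is enough to recognise a label.
    """
    return all(field in invocation_text for field in REQUIRED_PAYLOAD_FIELDS)
```

Requiring all seven labels, rather than any one of them, avoids treating an ordinary user request that happens to mention "ORIGINAL INTENT" as a pipeline payload.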
67
+ **Auto-invocation protocol:**
68
+ 1. Skip Step 0 (problem is already framed by the calling skill)
69
+ 2. Use the conflict payload directly as the council prompt for all three models
70
+ 3. Add this context to each model's prompt: "This is an INTENT.md conflict from an automated build pipeline. Codex (gpt-5.4) made a code fix that contradicts the project's stated intent. You must decide: should the intent documentation be updated to match the fix, or should the fix be reverted to preserve the original intent?"
71
+ 4. Include the DEBUG.md history so models don't suggest approaches already tried
72
+ 5. Run through Steps 1-5 as normal (independent research → consensus check → rebuttals → verdict)
73
+ 6. Return the verdict in a structured format the calling skill can parse:
51
74
 
52
- ---
75
+ ```
76
+ COUNCIL VERDICT:
77
+ decision: "update_intent" | "revert_fix"
78
+ round: [which round consensus was reached]
79
+ agreed_by: [which 2 models agreed]
80
+ dissent: [the third model's position]
81
+ reasoning: [2-3 sentence explanation]
82
+ debug_log_entry: [formatted entry for DEBUG.md]
83
+ ```
84
+
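A calling skill can only parse the verdict if its fields are constrained. As an illustration, the structure above could be modelled like this; the class name, the validation rules, and the exact rendering are assumptions made for the sketch, and only the field names and the two `decision` values come from the format shown.

```python
from dataclasses import dataclass
from typing import Tuple

ALLOWED_DECISIONS = ("update_intent", "revert_fix")


@dataclass
class CouncilVerdict:
    decision: str
    round_number: int
    agreed_by: Tuple[str, str]
    dissent: str
    reasoning: str
    debug_log_entry: str

    def __post_init__(self):
        # Reject anything the calling skill would not know how to act on.
        if self.decision not in ALLOWED_DECISIONS:
            raise ValueError(f"decision must be one of {ALLOWED_DECISIONS}")
        if len(self.agreed_by) != 2:
            raise ValueError("a majority verdict names exactly two agreeing models")

    def render(self) -> str:
        """Render the verdict as 'key: value' lines under a COUNCIL VERDICT header."""
        return "\n".join([
            "COUNCIL VERDICT:",
            f'decision: "{self.decision}"',
            f"round: {self.round_number}",
            f"agreed_by: {', '.join(self.agreed_by)}",
            f"dissent: {self.dissent}",
            f"reasoning: {self.reasoning}",
            f"debug_log_entry: {self.debug_log_entry}",
        ])
```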
85
+ **Key difference from user-invoked:**
86
+ - In user-invoked mode, you show the user the framed prompt and wait for confirmation before starting
87
+ - In auto-invoked mode, you proceed immediately — the calling skill already validated the conflict
88
+ - In auto-invoked mode, you do NOT ask the user if they want to dig deeper into the dissent — you return the verdict directly to the calling skill
53
89
 
54
90
  ### Step 0: Frame the Problem
55
91
 
56
- > **Note:** Skipped in auto-invocation mode. If a structured conflict payload was provided, proceed directly to Step 1.
92
+ > **Note:** This step is skipped in auto-invocation mode. If a structured conflict payload was provided, proceed directly to Step 1 using the payload as the council prompt.
57
93
 
58
- Distill the user's request into a self-contained **council prompt** — a clear problem statement with investigation entry points but no pre-read code. Models read the code themselves.
94
+ Take the user's request and distill it into a clear **council prompt** — a self-contained problem statement that makes sense without conversation history. The prompt should describe the problem and what a good answer looks like, but it should NOT include pre-read code. The models will read the code themselves.
59
95
 
60
- Read `references/protocols/STEP-0-FRAMING.md` for the full framing guide, including what to include, what to exclude, and the structured payload format.
96
+ Include:
97
+ - The specific question or decision to be made
98
+ - File paths or areas of the codebase to start investigating (as pointers, not content)
99
+ - Error messages or symptoms if it's a debugging problem
100
+ - Decision criteria — what a good answer looks like
101
+ - Any constraints the user has mentioned
61
102
 
62
- Show the user the framed prompt before proceeding: "Here's what I'll send to the council — does this capture the problem?" Wait for confirmation or edits.
103
+ Do NOT include:
104
+ - Pre-read file contents (each model reads files itself)
105
+ - Your own analysis or opinion about the problem
106
+ - Summaries of what the code does (let each model discover that)
63
107
 
64
- ---
108
+ Show the user the framed prompt before proceeding: "Here's what I'll send to the council — does this capture the problem?" Wait for confirmation or edits.
65
109
 
66
110
  ### Step 1: Independent Research (Round 1)
67
111
 
68
- **You (the orchestrator) are NOT a peer in this step.** Do not form your own position yet. Spawn three independent investigations in parallel and collect the results.
112
+ This is the critical step. All three models explore the codebase independently and in parallel. Each one reads whatever files it thinks are relevant, follows whatever threads it wants, and arrives at its own position based on its own research.
69
113
 
70
- Read `references/prompts/CLAUDE-INVESTIGATION.md`, `references/prompts/CODEX-INVESTIGATION.md`, and `references/prompts/GEMINI-INVESTIGATION.md` for the full prompt templates for each model.
114
+ **You (Claude) are the orchestrator in this step, NOT a peer.** You do not form your own position yet. You spawn three independent investigations and collect the results.
71
115
 
72
- Present results with a structured comparison showing each model's research, position, key evidence, and an alignment check (agreement areas, divergence points, majority forming?).
116
+ Launch all three in parallel:
73
117
 
74
- ---
118
+ **Claude investigation** — spawn a subagent (this keeps the investigation isolated from your orchestrator context):
119
+
120
+ ```
121
+ You are one of three AI peers in a deliberation council. The other two peers are Codex (OpenAI) and Gemini (Google). Your job is to independently investigate the following problem by reading the codebase, then give your honest, well-reasoned position.
122
+
123
+ IMPORTANT: Do your own research. Read files, search code, trace through logic. Your position must be grounded in what you actually find in the code, not assumptions. Cite specific files and line numbers.
124
+
125
+ PROBLEM:
126
+ {council_prompt}
127
+
128
+ WORKING DIRECTORY: {cwd}
129
+
130
+ Instructions:
131
+ 1. Start by exploring the relevant parts of the codebase — read files, search for patterns, trace dependencies
132
+ 2. Take notes on what you find as you go
133
+ 3. After you've done sufficient research, formulate your position
134
+
135
+ Give your response in this format:
136
+ 1. RESEARCH SUMMARY: What files you examined, what you found (with file:line references)
137
+ 2. POSITION: Your clear stance (1-2 sentences)
138
+ 3. REASONING: Why you believe this, grounded in specific code you read
139
+ 4. CONCERNS: What could go wrong with your approach
140
+ 5. CONFIDENCE: High/Medium/Low and why
141
+ ```
142
+
143
+ **Codex** — spawn a subagent that runs:
144
+ ```bash
145
+ codex exec --full-auto "You are one of three AI peers in a deliberation council. The other two peers are Claude (Anthropic) and Gemini (Google). Your job is to independently investigate the following problem by reading the codebase, then give your honest, well-reasoned position.
146
+
147
+ IMPORTANT: Do your own research. Read files, search code, trace through logic. Your position must be grounded in what you actually find in the code, not assumptions. Cite specific files and line numbers.
148
+
149
+ PROBLEM:
150
+ {council_prompt}
151
+
152
+ Instructions:
153
+ 1. Start by exploring the relevant parts of the codebase — read files, search for patterns, trace dependencies
154
+ 2. Take notes on what you find as you go
155
+ 3. After you have done sufficient research, formulate your position
156
+
157
+ Give your response in this format:
158
+ 1. RESEARCH SUMMARY: What files you examined, what you found (with file:line references)
159
+ 2. POSITION: Your clear stance (1-2 sentences)
160
+ 3. REASONING: Why you believe this, grounded in specific code you read
161
+ 4. CONCERNS: What could go wrong with your approach
162
+ 5. CONFIDENCE: High/Medium/Low and why"
163
+ ```
164
+
165
+ The `--full-auto` flag gives Codex sandboxed read access to the workspace so it can explore files on its own.
166
+
167
+ **Gemini** — spawn a subagent that runs:
168
+ ```bash
169
+ gemini -p "You are one of three AI peers in a deliberation council. The other two peers are Claude (Anthropic) and Codex (OpenAI). Your job is to independently investigate the following problem by reading the codebase, then give your honest, well-reasoned position.
170
+
171
+ IMPORTANT: Do your own research. Read files, search code, trace through logic. Your position must be grounded in what you actually find in the code, not assumptions. Cite specific files and line numbers.
172
+
173
+ PROBLEM:
174
+ {council_prompt}
175
+
176
+ Instructions:
177
+ 1. Start by exploring the relevant parts of the codebase — read files, search for patterns, trace dependencies
178
+ 2. Take notes on what you find as you go
179
+ 3. After you have done sufficient research, formulate your position
180
+
181
+ Give your response in this format:
182
+ 1. RESEARCH SUMMARY: What files you examined, what you found (with file:line references)
183
+ 2. POSITION: Your clear stance (1-2 sentences)
184
+ 3. REASONING: Why you believe this, grounded in specific code you read
185
+ 4. CONCERNS: What could go wrong with your approach
186
+ 5. CONFIDENCE: High/Medium/Low and why" --yolo
187
+ ```
188
+
189
+ The `--yolo` flag lets Gemini auto-approve file reads so it can explore without getting stuck on permission prompts.
190
+
191
+ Collect all three responses. Present them to the user with a structured comparison that highlights what each model found:
192
+
193
+ ```
194
+ ## Round 1 — Independent Research
195
+
196
+ ### Claude
197
+ **Research**: [what files it read, what it focused on]
198
+ **Position**: ...
199
+ **Key evidence**: ...
200
+
201
+ ### Codex
202
+ **Research**: [what files it read, what it focused on]
203
+ **Position**: ...
204
+ **Key evidence**: ...
205
+
206
+ ### Gemini
207
+ **Research**: [what files it read, what it focused on]
208
+ **Position**: ...
209
+ **Key evidence**: ...
210
+
211
+ ### Alignment Check
212
+ - Agreement areas: ...
213
+ - Divergence points: ...
214
+ - Different research paths: [note if models looked at different files or focused on different aspects — this is valuable signal]
215
+ - Majority forming? [Yes — X and Y agree / No — all three differ]
216
+ ```
75
217
 
76
218
  ### Step 2: Check for Early Consensus
77
219
 
@@ -80,13 +222,47 @@ After each round, check if 2 of 3 positions substantially agree. "Substantially
80
222
  If majority exists → jump to **Step 5: Verdict**.
81
223
  If not → continue to the next rebuttal round.
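Once the orchestrator has judged which positions substantially agree and given each model a position label, the majority check itself is a simple tally. The sketch below assumes that labelling has already happened; `find_majority` and the label strings are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def find_majority(positions: Dict[str, str]) -> Optional[Tuple[str, List[str]]]:
    """Return (label, agreeing_models) if at least two models share a label, else None.

    `positions` maps a model name to the position label the orchestrator assigned it.
    """
    if not positions:
        return None
    label, votes = Counter(positions.values()).most_common(1)[0]
    if votes < 2:
        return None
    return label, sorted(model for model, held in positions.items() if held == label)


result = find_majority({"claude": "refactor", "codex": "patch", "gemini": "refactor"})
print(result)  # ('refactor', ['claude', 'gemini'])
```

Deciding whether two free-text positions are "substantially" the same remains a judgment call; the code only counts the labels it is given.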
82
224
 
83
- ---
84
-
85
225
  ### Step 3: Rebuttal Rounds (Rounds 2-5)
86
226
 
87
- Each model sees the other two models' previous positions and must respond directly to their evidence. Read `references/prompts/REBUTTAL-TEMPLATE.md` for the full rebuttal prompt template. Use same CLI flags for follow-up research. Present results highlighting what changed and whether consensus is forming.
227
+ For each subsequent round, each model sees the other two models' previous positions (including what they found in the code) and must respond directly. This is where the real deliberation happens: models engage with each other's evidence and arguments, not just opinions.
88
228
 
89
- ---
229
+ Build a rebuttal prompt that includes the previous round's research and positions:
230
+
231
+ For Codex and Gemini, the rebuttal prompt should include enough context for them to do targeted follow-up research if they want to verify the other models' claims:
232
+
233
+ ```
234
+ Round {N} of the deliberation council.
235
+
236
+ Here's what happened in the previous round. Each model independently researched the codebase and formed a position:
237
+
238
+ CLAUDE's research and position:
239
+ {claude_previous_full}
240
+
241
+ CODEX's research and position:
242
+ {codex_previous_full}
243
+
244
+ GEMINI's research and position:
245
+ {gemini_previous_full}
246
+
247
+ Now respond. You may do additional codebase research if you want to verify claims the other models made or investigate angles they raised. Then:
248
+
249
+ 1. Directly address the strongest point from each other model
250
+ 2. If another model cited code you haven't looked at, go read it and see if you agree with their interpretation
251
+ 3. State whether you've changed your position (and why, or why not)
252
+ 4. If you agree with another model, say so explicitly
253
+
254
+ UPDATED POSITION: [same/changed] ...
255
+ NEW EVIDENCE (if any): [anything new you found by following up on other models' research]
256
+ KEY RESPONSE TO {OTHER_MODEL_1}: ...
257
+ KEY RESPONSE TO {OTHER_MODEL_2}: ...
258
+ REMAINING DISAGREEMENTS: ...
259
+ ```
260
+
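Filling the template for each recipient is mechanical, so it can be sketched as a small function. This is an illustration only: `build_rebuttal_prompt` is a hypothetical helper, and the numbered instructions from the template are left out here to keep the sketch short.

```python
MODELS = ("CLAUDE", "CODEX", "GEMINI")


def build_rebuttal_prompt(round_number: int, previous: dict, recipient: str) -> str:
    """Assemble a self-contained rebuttal prompt for one model.

    `previous` maps each name in MODELS to that model's full previous-round output.
    """
    if recipient not in MODELS:
        raise ValueError(f"unknown model: {recipient}")
    # The two response headings address the models other than the recipient.
    others = [model for model in MODELS if model != recipient]
    history = "\n\n".join(
        f"{model}'s research and position:\n{previous[model]}" for model in MODELS
    )
    return "\n".join([
        f"Round {round_number} of the deliberation council.",
        "",
        "Here's what happened in the previous round. Each model independently "
        "researched the codebase and formed a position:",
        "",
        history,
        "",
        "UPDATED POSITION: [same/changed] ...",
        "NEW EVIDENCE (if any): ...",
        f"KEY RESPONSE TO {others[0]}: ...",
        f"KEY RESPONSE TO {others[1]}: ...",
        "REMAINING DISAGREEMENTS: ...",
    ])
```

Every prompt carries all three previous outputs, including the recipient's own, so a stateless CLI sees the same record the orchestrator holds.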
261
+ For rebuttal rounds, use the same CLI flags (`--full-auto` for Codex, `--yolo` for Gemini) so models can do follow-up research — they might want to verify a claim another model made by reading a file they hadn't looked at before.
262
+
263
+ The Claude rebuttal should also be done via a subagent so it stays isolated and doesn't anchor on the orchestrator's accumulated context.
264
+
265
+ Present the round results to the user with the structured comparison format. Highlight what changed, who moved, and whether consensus is forming. Pay special attention to cases where a model changed its mind after reading code another model pointed to — that's the council working as intended.
90
266
 
91
267
  ### Step 4: Repeat or Escalate
92
268
 
@@ -99,25 +275,77 @@ If after 5 rounds there's still no majority:
99
275
  - Note which models examined which parts of the codebase — incomplete research might explain persistent disagreement
100
276
  - Present the user with 2-3 concrete options (mapped to the council positions) and let them decide
101
277
 
102
- ---
103
-
104
278
  ### Step 5: Verdict
105
279
 
106
- When 2 of 3 agree, present: decision, which models agreed, dissent, evidence basis, why the majority won, what the dissent raised that's still valid, and recommended action. Ask if the user wants to proceed or dig into the dissent.
280
+ When 2 of 3 agree, present the verdict:
107
281
 
108
- **Auto-invocation:** Return structured `COUNCIL VERDICT` with `decision` (update_intent/revert_fix), `round`, `agreed_by`, `dissent`, `reasoning`, `debug_log_entry`. Do not ask the user — return directly to the calling skill.
282
+ ```
283
+ ## Council Verdict — Round {N}
109
284
 
110
- ---
285
+ **Decision**: {the agreed position}
286
+ **Agreed by**: {which two models}
287
+ **Dissent**: {the third model's remaining objection}
288
+
289
+ ### Evidence basis
290
+ {What code each model examined that led to this conclusion}
291
+
292
+ ### Why the majority position won
293
+ {Brief analysis of why the arguments were stronger}
294
+
295
+ ### The dissent is worth noting because
296
+ {What the dissenting model raised that's still valid — this often contains useful caveats}
297
+
298
+ ### Recommended action
299
+ {Concrete next steps based on the decision}
300
+ ```
301
+
302
+ Ask the user if they want to proceed with the verdict or if they want to dig deeper into the dissent.
303
+
304
+ **Auto-invocation verdict format:**
305
+
306
+ When auto-invoked, also return the verdict in the structured format the calling skill expects:
307
+
308
+ ```
309
+ COUNCIL VERDICT:
310
+ decision: "update_intent" | "revert_fix"
311
+ round: [N]
312
+ agreed_by: [model1, model2]
313
+ dissent: [model3's position summary]
314
+ reasoning: [why the majority position won]
315
+ debug_log_entry: |
316
+ ### Council Verdict — [timestamp]
317
+ **Conflict**: [brief description]
318
+ **Decision**: [update_intent/revert_fix]
319
+ **Agreed by**: [models]
320
+ **Reasoning**: [explanation]
321
+ **Dissent**: [third model's concern]
322
+ ```
323
+
324
+ Do not ask the user if they want to proceed — return the verdict directly to the calling skill.
111
325
 
112
326
  ## Practical Considerations
113
327
 
328
+ ### Timeouts
329
+ Independent research takes longer than simple prompting — each model is reading files, searching code, etc. Set timeouts at 300s (5 minutes) for Round 1 since that's the heavy research round. Rebuttal rounds can use 180s since they're doing less exploration. If one model times out, report it and continue with the other two.
330
+
331
+ ### Error Handling
332
+ If Codex or Gemini returns an error (auth failure, rate limit, sandbox issue, etc.):
333
+ - Report the error to the user
334
+ - Continue with the remaining models
335
+ - A 2-model debate is better than nothing, though you lose the tiebreaker benefit
336
+
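The timeout and error-handling rules above can be combined in one wrapper around the CLI call. The sketch below is one possible shape, not the skill's implementation: `run_model` and `timeout_for_round` are hypothetical names, and the caller supplies the actual command line.

```python
import subprocess
from typing import List, Optional, Tuple

ROUND_1_TIMEOUT_S = 300   # heavy research round
REBUTTAL_TIMEOUT_S = 180  # rounds 2-5 do less exploration


def timeout_for_round(round_number: int) -> int:
    return ROUND_1_TIMEOUT_S if round_number == 1 else REBUTTAL_TIMEOUT_S


def run_model(
    argv: List[str], timeout_s: float, cwd: Optional[str] = None
) -> Tuple[Optional[str], Optional[str]]:
    """Run one model CLI and return (output, None) on success or (None, error) on failure.

    A failure never raises, so the caller can report it and continue with the other models.
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s, cwd=cwd
        )
    except subprocess.TimeoutExpired:
        return None, f"timed out after {timeout_s}s"
    except FileNotFoundError:
        return None, f"command not found: {argv[0]}"
    if result.returncode != 0:
        return None, f"exit code {result.returncode}: {result.stderr.strip()}"
    return result.stdout, None
```

Passing `cwd=` to `subprocess.run` is an alternative to prefixing the command with `cd {cwd} &&`, and it avoids going through a shell.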
114
337
  ### Conversation State
115
- Orchestrator holds state between rounds. Codex and Gemini are stateless, so every prompt must include full history.
338
+ Between rounds, you (the orchestrator) hold state. Keep a running record of each model's research findings AND positions so you can construct accurate rebuttal prompts. Codex and Gemini are stateless between rounds, so every round's prompt must be self-contained — include the full history of what each model found and argued.
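One way to keep that running record is a small in-memory store keyed by round and model. This is a hypothetical sketch; the class name and the heading layout of the rendered history are assumptions, not something the skill prescribes.

```python
from collections import defaultdict


class CouncilState:
    """Holds every model's findings and position for each round."""

    def __init__(self):
        # round number -> {model name: full response text}
        self._rounds = defaultdict(dict)

    def record(self, round_number: int, model: str, response: str) -> None:
        self._rounds[round_number][model] = response

    def full_history(self) -> str:
        """Render all rounds in order so a stateless CLI can be given the whole record."""
        parts = []
        for round_number in sorted(self._rounds):
            parts.append(f"## Round {round_number}")
            for model, response in sorted(self._rounds[round_number].items()):
                parts.append(f"### {model}\n{response}")
        return "\n\n".join(parts)
```

Because the history is rebuilt from the stored responses each time, a model that timed out in one round simply has no entry for it.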
116
339
 
117
- ### When NOT to Council
118
- Trivial questions, pure execution requests, pure opinion with no code to investigate, or when the user says "just do it". Exception: always proceed when auto-invoked by ftm-executor.
340
+ ### Working Directory
341
+ Make sure Codex and Gemini run from the same working directory as the current session. This ensures they're all looking at the same codebase. Pass `cd {cwd} &&` before the CLI command if needed to ensure correct directory.
119
342
 
120
- ---
343
+ ### When NOT to Council
344
+ - Trivial questions with obvious answers (don't waste 3 research sessions on "should I use const or let")
345
+ - Questions where the user just needs execution, not deliberation
346
+ - Pure opinion questions with no code to investigate
347
+ - If the user says "just do it" — they want action, not debate
348
+ - When auto-invoked by ftm-executor — always proceed (the executor already determined a council is needed)
121
349
 
122
350
  ## Blackboard Write
123
351
 
@@ -118,6 +118,16 @@ Average plan modifications: 1.2 per plan
118
118
  - Keep output concise — one screen max
119
119
  - If data sources are empty (fresh install), show: "No data yet. Use FTM for a few sessions and check back."
120
120
 
121
+ ## Events
122
+
123
+ ### Listens To
124
+ - `task_completed` — for session stats tracking
125
+
126
+ ### Blackboard Read
127
+ - `context.json` — session metadata
128
+ - `experiences/index.json` — experience inventory
129
+ - `patterns.json` — pattern health
130
+
121
131
  ## Requirements
122
132
 
123
133
  - reference: `~/.claude/ftm-state/events.log` | optional | JSONL event log for skill invocation tracking
@@ -151,13 +161,3 @@ Average plan modifications: 1.2 per plan
151
161
 
152
162
  ### (none)
153
163
  ftm-dashboard is read-only and does not emit events directly. It listens to task_completed for session tracking only.
154
-
155
- ## Events
156
-
157
- ### Listens To
158
- - `task_completed` — for session stats tracking
159
-
160
- ### Blackboard Read
161
- - `context.json` — session metadata
162
- - `experiences/index.json` — experience inventory
163
- - `patterns.json` — pattern health