feed-the-machine 1.7.12 → 1.7.14

@@ -2,177 +2,116 @@
2
2
 
3
3
  ## Decide
4
4
 
5
- Decide turns the orientation model into one concrete next move.
5
+ ### 1. Choose execution mode
6
6
 
7
- ### 1. Choose the smallest correct execution mode
7
+ - `micro` → direct action
8
+ - `small` → pre-flight summary + action + verify
9
+ - `medium` → checkbox plan, wait for approval, execute
10
+ - `large` → `ftm-brainstorm` (no plan) or `ftm-executor` (plan exists)
8
11
 
9
- - `micro` -> direct action
10
- - `small` -> pre-flight summary, then direct action plus verification
11
- - `medium` -> numbered plan, wait for approval, then execute
12
- - `large` -> `ftm-brainstorm` if no plan exists, or `ftm-executor` if a plan exists
12
+ Double-check forced escalation signals from Complexity Sizing reference. If any fired → medium minimum.
13
13
 
14
- **Double-check before committing to a size**: Re-read the forced escalation signals from the Complexity Sizing reference. If any forced-medium signals fired, the task is medium regardless of how it feels.
14
+ ### 1.5 Plan Approval
15
15
 
16
- ### 1.5 Interactive Plan Approval
16
+ Read `ftm-config.yml` `execution.approval_mode`.
17
17
 
18
- Read `~/.claude/ftm-config.yml` field `execution.approval_mode`. This controls whether the user sees and approves the plan before execution begins.
18
+ **`auto`**: micro/small just go, medium outlines + executes, large routes to brainstorm/executor.
19
19
 
20
- #### Mode: `auto` (default legacy behavior)
21
- Skip this section entirely. Execute as before — micro/small just go, medium outlines steps and executes, large routes to brainstorm/executor.
20
+ **`plan_first`** (recommended):
21
+ - Small: pre-flight summary, proceed unless user objects
22
+ - Medium/large: present checkbox plan, wait for explicit approval
22
23
 
23
- #### Mode: `plan_first` (recommended for collaborative work)
24
+ Plan format is **mandatory**: `N. [ ] One-line action → target`. See `protocols/PLAN-APPROVAL.md` for spec + examples.
24
25
 
25
- **For small tasks**: Show a brief pre-flight summary before executing. Not a formal gate — just visibility:
26
-
27
- ```
28
- Quick summary before I start:
29
- - Read [file] to understand current behavior
30
- - Change [X] to [Y] in [file]
31
- - Verify: [test/lint/manual check]
32
-
33
- Going ahead unless you say otherwise.
34
- ```
35
-
36
- **For medium and large tasks**: Present a numbered task list and wait for the user to approve.
37
-
38
- **Step 0: Discovery Interview (if applicable).** Before generating the plan, check whether a Discovery Interview is needed (see Orient reference). If the task involves external systems, stakeholder coordination, or unfamiliar code, run the interview FIRST.
39
-
40
- **Step 1: Generate the plan.** Build a numbered checkbox list. This format is **mandatory** — no narrative steps, no prose paragraphs. Every plan MUST use: `N. [ ] One-line action → target`. See `references/protocols/PLAN-APPROVAL.md` for the full format spec, examples for code/ops/comms/infra tasks, and the list of NEVER-produce anti-patterns.
26
+ | User says | Action |
27
+ |---|---|
28
+ | approve/go/yes/lgtm | Execute all |
29
+ | skip N | Remove step, execute rest |
30
+ | only N,M | Execute only listed |
31
+ | for step N, [change] | Modify + execute all |
32
+ | add: [desc] after N | Insert, renumber, execute |
33
+ | deny/stop/cancel | Cancel entirely |
41
34
 
42
- **Step 2: Parse the user's response.**
35
+ Execute sequentially. Show `Step 2/5 done: [summary]` after each. If step fails → stop and report.
43
36
 
44
- | User says | Action |
45
- |-----------|--------|
46
- | `approve`, `go`, `yes`, `lgtm`, `ship it` | Execute all steps in order |
47
- | `skip N` or `skip N,M` | Remove those steps, execute the rest |
48
- | `only N,M,P` | Execute only the listed steps in order |
49
- | `for step N, [instruction]` | Replace step N's approach, then execute all |
50
- | `add: [description] after N` | Insert a new step, renumber, then execute all |
51
- | `deny`, `stop`, `cancel`, `no` | Cancel. Do not execute anything. |
52
- | A longer message with mixed feedback | Parse each instruction. Apply all modifications. Present revised plan and ask for final approval. |
37
+ **`always_ask`**: Same as plan_first but also gates small tasks. Only micro skips.
53
38
 
54
- **Step 3: Execute the approved plan.** Work through steps sequentially. After each step show: `Step 2/5 done: [summary].` If a step fails, stop and report.
39
+ ### 2. Direct vs routed
55
40
 
56
- **Step 4: Post-execution update.** Update blackboard with decisions and experience.
41
+ Direct when: micro/small, routing overhead adds no value, faster to just do it.
42
+ Skill when: specialized workflow improves result, user invoked it, medium/large.
57
43
 
58
- #### Mode: `always_ask`
59
- Same as `plan_first` but applies to **small** tasks too. Only micro tasks skip the approval gate.
44
+ ### 3. Supporting MCP reads
60
45
 
61
- #### Combining with explicit skill routing
62
- When routing to a skill, plan approval still applies if mode is `plan_first` or `always_ask`. Present the strategy for user control.
46
+ Fetch minimum required external context first (ticket, calendar, docs, browser state).
63
47
 
64
- ### 2. Choose direct vs routed execution
48
+ ### 4. Loop decision
65
49
 
66
- Use direct execution when:
67
- - the work is micro or small
68
- - routing overhead adds no value
69
- - the answer can be delivered faster than a delegated workflow
50
+ If next move reveals new information → plan to re-enter Observe after.
70
51
 
71
- Use a ftm skill when:
72
- - its specialized workflow will materially improve the result
73
- - the user explicitly invoked it
74
- - the task is medium/large and the skill is the right vehicle
52
+ ## Act
75
53
 
76
- ### 3. Choose any supporting MCP reads
54
+ ### Pre-Act Checkpoint (HARD GATE)
77
55
 
78
- If the request depends on external context, fetch the minimum required state first.
56
+ Before executing ANYTHING (Bash, MCP, Write, Edit, API calls):
79
57
 
80
- Examples:
81
- - Jira URL -> read the ticket first
82
- - meeting request -> read calendar first
83
- - internal policy question -> search Glean first
84
- - UI bug -> snapshot or inspect browser first
58
+ 1. **Checkbox plan presented?** Medium+ tasks require `N. [ ] action → target` format, approved by user. Prose is NOT a plan.
59
+ 2. **User approved?** Wait for explicit go/approve/yes.
60
+ 3. **Plan marker written?** Write to `~/.claude/ftm-state/.plan-presented` after approval.
61
+ 4. **External mutations approved?** Per Approval Gates in orient-protocol.
62
+ 5. None apply (micro/small, no forced escalation) → proceed.
85
63
 
86
- ### 4. Decide whether to loop
64
+ | Rationalization | Reality |
65
+ |---|---|
66
+ | "Do as much as you can" = implicit approval | That's the task description, not plan approval |
67
+ | "I know what to do, plan is overhead" | Plan is for the USER |
68
+ | "Just one small API call first" | One becomes five becomes a full unplanned execution |
69
+ | "User seems impatient" | 30-second plan saves 10 minutes of wrong work |
87
70
 
88
- If the next move will reveal new information, plan to re-enter Observe after the action.
71
+ Applies to ALL execution methods including Bash/curl/python. The plan-gate hook catches Edit/Write/MCP; this checkpoint catches everything else.
89
72
 
90
- ## Act
73
+ ### Compare Before You Loop (MANDATORY for external systems)
91
74
 
92
- Act is clean, decisive execution — but execution of **approved** work only.
75
+ **Never trial-and-error. Always compare first.**
93
76
 
94
- **HARD GATE: Pre-Act checkpoint**: Before executing ANYTHING (Bash, MCP, Write, Edit, API calls of any kind), verify ALL of these:
77
+ 1. **Find working reference**: GET a resource that already works the way you want
78
+ 2. **Diff** — compare field-by-field against the broken one. Fix is almost always a small, specific difference
79
+ 3. **Targeted change** — change ONLY what the diff revealed. Verify after each change
95
80
 
96
- 1. **Did you present a checkbox plan?** If the task is medium+ (forced escalation signals fired), you MUST have presented a `N. [ ] action → target` plan and received explicit user approval. "I'll do X, Y, Z" in prose is NOT a plan. Listing steps without `[ ]` checkboxes is NOT a plan. If you haven't presented one, STOP and present it now.
97
- 2. **Did the user approve it?** Look for "go", "approve", "yes", "lgtm", or similar. If the user hasn't responded to your plan yet, WAIT. Do not start executing.
98
- 3. **Is the plan marker written?** After approval, write to `~/.claude/ftm-state/.plan-presented` before executing. This signals to hooks that planning happened.
99
- 4. If the task involves external mutations (see Approval Gates), have you presented the specific actions and received approval?
100
- 5. If none of the above apply (micro/small task, no forced escalation), proceed.
81
+ **Loop detection red flags:**
82
+ - 3+ API calls to same system without success
83
+ - Trying different URL formats (underscore vs hyphen, internal vs display ID)
84
+ - Shuffling payload fields hoping one works
85
+ - Reading API docs for endpoint paths (playbook should have this)
101
86
 
102
- **The rationalization trap**: You will feel the urge to skip the plan because:
103
- - "The user said 'do as much as you can' — that's implicit approval" → NO. That's the task description, not plan approval.
104
- - "I know what needs to happen, presenting a plan is just overhead" → NO. The plan is for the USER, not for you.
105
- - "I'll just start with one small API call to check something" → NO. One call becomes five becomes a full execution without approval.
106
- - "The user seems impatient" → NO. A 30-second plan saves 10 minutes of unwanted work.
87
+ **On detection:** STOP. Tell user: "Tried N approaches, none worked. Comparing against working reference." Do step 1.
107
88
 
108
- **This applies to ALL execution methods** — Bash commands, MCP calls, Python scripts, curl, direct API calls. The plan-gate hook catches Edit/Write/MCP, but Bash API calls bypass it. This checkpoint is the only thing that catches those. Do not skip it.
89
+ See `references/incidents.md` → Braintrust Incident for the cost of skipping this.
109
90
 
110
91
  ### 1. Direct action
111
92
 
112
- For micro tasks:
113
- - do the work
114
- - summarize what changed
115
-
116
- For small tasks (when `approval_mode` is `plan_first` or `always_ask`):
117
- - show the pre-flight summary first
118
- - then do the work
119
- - verify
120
- - summarize what changed
93
+ Micro: do + summarize. Small (plan_first/always_ask): pre-flight → do → verify → summarize.
121
94
 
122
95
  ### 2. Skill routing
123
96
 
124
- Before invoking a skill, show one short routing line.
125
-
126
- Examples:
127
- - `Routing to ftm-debug: this is a flaky failure with real diagnostic uncertainty.`
128
- - `Routing to ftm-brainstorm: this is still design-stage and benefits from research-backed planning.`
129
-
130
- Then invoke the target skill with the full user input.
97
+ Show one routing line, then invoke: `Routing to ftm-debug: flaky failure with diagnostic uncertainty.`
131
98
 
132
99
  ### 3. MCP execution
133
100
 
134
- Use:
135
- - parallel reads when safe
136
- - sequential writes
137
- - approval gates only for external-facing actions
138
-
139
- ### 3.5. Draft-before-send protocol
140
-
141
- When composing Slack messages, emails, or any outbound communication, always save the draft locally before sending.
142
-
143
- **Drafts folder**: `.ftm-drafts/` in the project root (or `~/.claude/ftm-drafts/` if no project context).
101
+ Parallel reads, sequential writes, approval gates for external-facing actions.
144
102
 
145
- **Ensure the folder exists and is gitignored.** Save every draft before presenting or sending:
103
+ ### 3.5 Draft-before-send
146
104
 
147
- - Filename: `YYYY-MM-DD_HH-MM_<type>_<recipient-or-channel>.md`
148
- - Content includes frontmatter: type, to, subject (email only), drafted timestamp, status (draft/sent/cancelled)
149
-
150
- **Workflow:**
151
- 1. Compose the message
152
- 2. Save to `.ftm-drafts/`
153
- 3. Present to user for approval
154
- 4. If approved and sent, update `status: sent`
155
- 5. If cancelled or modified, update accordingly
105
+ Slack/email/outbound comms → save to `.ftm-drafts/` first. Filename: `YYYY-MM-DD_HH-MM_<type>_<recipient>.md`. Present for approval, update status on send/cancel.
156
106
 
157
107
  ### 4. Blackboard updates (mandatory)
158
108
 
159
- After every completed task, update the blackboard:
160
-
161
- 1. Update `context.json`: set `current_task` to reflect what was done, append to `recent_decisions`
162
- 2. Update `session_metadata.skills_invoked` if a skill was used
163
- 3. Write an experience file to `~/.claude/ftm-state/blackboard/experiences/YYYY-MM-DD_task-slug.json`
164
- 4. Update `~/.claude/ftm-state/blackboard/experiences/index.json` with the new entry
165
-
166
- The experience file should capture:
167
- - `task_type`, `tags`, `outcome`, `lessons`, `files_touched`, `stakeholders`, `decisions_made`
168
-
169
- Follow the schema and full-file write rules from `blackboard-schema.md`.
109
+ After every completed task:
110
+ 1. Update `context.json` — current_task, recent_decisions, session_metadata
111
+ 2. Write experience file to `experiences/YYYY-MM-DD_task-slug.json`
112
+ 3. Update `experiences/index.json`
113
+ 4. Include: task_type, tags, outcome, lessons, files_touched, stakeholders, decisions_made, code_patterns, api_gotchas
170
114
 
171
115
  ### 5. Loop
172
116
 
173
- After acting:
174
-
175
- - if complete, answer and stop
176
- - if new information appeared, return to Observe
177
- - if blocked by approval or missing info, ask the user
178
- - if the simple approach failed, re-orient and escalate one level
117
+ Complete → answer and stop. New info → re-observe. Blocked → ask user. Failed → re-orient, escalate one level.
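The "User says | Action" table in the new Plan Approval section can be illustrated with a small parser. This is a minimal sketch, not part of the package: the function name `apply_response` and its return convention are assumptions made for illustration, and it covers only the `approve`, `deny`, `skip N`, and `only N,M` rows. The `for step N, ...` and `add: ... after N` rows need free-text handling and are left out.

```python
import re
from typing import List, Optional

# Keywords taken from the approval table; everything else here is hypothetical.
APPROVE = {"approve", "go", "yes", "lgtm"}
DENY = {"deny", "stop", "cancel"}


def apply_response(steps: List[str], reply: str) -> Optional[List[str]]:
    """Return the steps to run, [] when cancelled, or None if the reply is not understood."""
    text = reply.strip().lower()
    if text in APPROVE:
        return list(steps)
    if text in DENY:
        return []
    match = re.fullmatch(r"(skip|only)\s+(\d+(?:\s*,\s*\d+)*)", text)
    if match is None:
        return None  # unrecognized reply: ask again rather than guess
    numbers = {int(n) for n in match.group(2).split(",")}
    if any(n < 1 or n > len(steps) for n in numbers):
        return None  # refers to a step that is not in the plan
    keep_listed = match.group(1) == "only"
    # "only" keeps the listed steps, "skip" keeps the others; order is preserved.
    return [s for i, s in enumerate(steps, start=1) if (i in numbers) == keep_listed]
```

Returning `None` for anything unrecognized keeps the gate closed by default, which matches the protocol's requirement to wait for explicit approval.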
@@ -0,0 +1,23 @@
1
+ # Incident Reference
2
+
3
+ Named incidents referenced by Orient and Decide-Act protocols. Read this file only when an incident name is cited and you need the full context.
4
+
5
+ ## Hindsight Incident (March 2026)
6
+
7
+ **What happened**: ftm-mind took an SSO setup task and autonomously created Okta groups, added users to production Okta, created Freshservice records, a service catalog item, and modified S3 workflow configs — all without presenting a plan or asking for approval once.
8
+
9
+ **Root cause**: No plan-first gate existed. The task "felt small" but touched 5+ external systems.
10
+
11
+ **What it taught us**: Any task that calls production APIs is forced-medium. Plans are mandatory. Approval gates are circuit breakers, not suggestions.
12
+
13
+ ## Braintrust Incident (April 2026)
14
+
15
+ **What happened**: Freshservice catalog items #626 and #621 were deleted and recreated as #631 and #632 to "fix" duplicate fields. This broke the S3 workflow config (assign_after_app_owner_approval), required emergency patching, and custom_lookup_bigint fields had to be re-added manually.
16
+
17
+ **Root cause**: Three knowledge sources existed (playbook, blackboard, brain.py) and none were consulted. Then, when trial-and-error failed, the model chose a destructive action (delete + recreate) without considering dependencies or asking for approval.
18
+
19
+ **What it taught us**:
20
+ 1. Always check playbooks before external system operations
21
+ 2. Never delete and recreate external resources — IDs are depended on
22
+ 3. Compare working references against broken ones instead of guessing
23
+ 4. A one-field diff (`requester_can_edit: "true"`) was the entire fix — discoverable in 30 seconds by comparing the working HR Acuity item against the broken ones
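The "compare working references against broken ones" lesson can be sketched as a field-by-field comparison of two payloads. This is an illustrative sketch only: `diff_fields`, the `ignore` parameter, and both example dictionaries are hypothetical and do not reproduce the real Freshservice records. Only the field name `requester_can_edit` comes from the incident notes.

```python
from typing import Any, Dict, Iterable, Tuple

MISSING = "<missing>"  # marker for a field present on one side only


def diff_fields(
    working: Dict[str, Any],
    broken: Dict[str, Any],
    ignore: Iterable[str] = (),
) -> Dict[str, Tuple[Any, Any]]:
    """Map each differing field to a (working_value, broken_value) pair."""
    skipped = set(ignore)
    changed = {}
    for key in sorted((set(working) | set(broken)) - skipped):
        left = working.get(key, MISSING)
        right = broken.get(key, MISSING)
        if left != right:
            changed[key] = (left, right)
    return changed


# Hypothetical payloads: a reference item that behaves correctly and one that does not.
working_item = {"id": 1, "name": "reference", "visibility": 1, "requester_can_edit": "true"}
broken_item = {"id": 2, "name": "target", "visibility": 1}

print(diff_fields(working_item, broken_item, ignore=("id", "name")))
```

Identity fields such as `id` and `name` are expected to differ, so they are passed in `ignore`; whatever remains is the candidate for a single targeted change.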
@@ -1,348 +1,155 @@
1
- # Orient Protocol — Full Detail
1
+ # Orient Protocol
2
2
 
3
3
  ## Capability Inventory: FTM Skills
4
4
 
5
- Orient must know all ftm capabilities before deciding whether to route or act directly.
6
-
7
5
  | Skill | Reach for it when... |
8
6
  |---|---|
9
- | `ftm-brainstorm` | The user is exploring ideas, designing a system, comparing approaches, or needs research-backed planning before build work exists. |
10
- | `ftm-executor` | The user has a plan doc or clearly wants autonomous implementation across multiple tasks or waves. |
11
- | `ftm-debug` | The core problem is broken behavior, an error, flaky tests, a crash, regression, race, or "why is this failing?" |
12
- | `ftm-audit` | The user wants wiring checks, dead code analysis, structural verification, or adversarial code hygiene review. |
13
- | `ftm-council` | The user wants multiple AI perspectives, debate, second opinions, or multi-model convergence. |
14
- | `ftm-codex-gate` | The user wants adversarial Codex review, validation, or a correctness stress test from Codex specifically. |
15
- | `ftm-intent` | The user wants function/module purpose documented or `INTENT.md` updated or reconciled. |
16
- | `ftm-diagram` | The user wants diagrams, architecture visuals, dependency maps, or Mermaid assets updated. |
17
- | `ftm-browse` | The task requires a browser, screenshots, DOM inspection, or visual verification. |
18
- | `ftm-pause` | The user wants to park the session and save resumable state. |
19
- | `ftm-resume` | The user wants to restore paused context and continue prior work. |
20
- | `ftm-upgrade` | The user wants ftm skills checked or upgraded. |
21
- | `ftm-retro` | The user wants a post-run retrospective, lessons learned, or execution review. |
22
- | `ftm-config` | The user wants ftm settings, model profile, or feature configuration changed. |
23
- | `ftm-git` | Any git commit or push is about to happen, the user asks to scan for secrets/credentials/API keys, or wants to verify no secrets are hardcoded before sharing code. MUST run before any commit or push operation; this is a mandatory security gate, not optional. |
24
- | `ftm-capture` | The user just completed a repeatable workflow and wants to save it as a reusable routine + playbook + reference doc. Triggers on "capture this", "save as routine", "codify this", "don't make me explain this again". Also suggest proactively when you detect the user doing something they've done before (matching blackboard experiences with same task_type 2+ times). |
25
- | `ftm-ops` | The user asks about tasks, capacity, burnout, stakeholders, meetings, incidents, patterns, or daily/weekly summaries. Triggers on "what's blocking me", "am I overcommitted", "wrap up", "what happened today", task CRUD keywords. |
26
-
27
- Routing heuristic:
28
-
29
- - If a task is self-contained and small enough, do it directly.
30
- - Route to a skill only when the skill's workflow adds clear value.
31
- - Explicit skill invocation is a strong route signal.
32
-
33
- ## MCP Inventory Reference
34
-
35
- Read `~/.claude/skills/ftm-mind/references/mcp-inventory.md` for full MCP server details.
36
-
37
- Orient must know the available MCPs and their contextual triggers.
38
-
39
- | MCP server | Reach for it when... |
7
+ | `ftm-brainstorm` | Exploring ideas, designing systems, comparing approaches, research-backed planning |
8
+ | `ftm-executor` | Has a plan doc or wants autonomous multi-task implementation |
9
+ | `ftm-debug` | Broken behavior, errors, flaky tests, crashes, regressions |
10
+ | `ftm-audit` | Wiring checks, dead code analysis, structural verification |
11
+ | `ftm-council` | Multiple AI perspectives, debate, second opinions |
12
+ | `ftm-codex-gate` | Adversarial Codex review or correctness stress test |
13
+ | `ftm-intent` | Function/module purpose docs or INTENT.md updates |
14
+ | `ftm-diagram` | Diagrams, architecture visuals, Mermaid assets |
15
+ | `ftm-browse` | Browser, screenshots, DOM inspection, visual verification |
16
+ | `ftm-pause` / `ftm-resume` | Park or restore session state |
17
+ | `ftm-upgrade` | Check or upgrade ftm skills |
18
+ | `ftm-retro` | Post-run retrospective or execution review |
19
+ | `ftm-config` | Settings, model profiles, feature configuration |
20
+ | `ftm-git` | MANDATORY before any commit/push: secret scanning gate |
21
+ | `ftm-capture` | Save repeatable workflow as routine/playbook. Also suggest proactively when blackboard shows same task_type 2+ times |
22
+ | `ftm-ops` | Tasks, capacity, burnout, stakeholders, meetings, incidents, daily/weekly summaries |
23
+
24
+ Routing: do it directly if small enough. Route to a skill only when the workflow adds clear value. Explicit invocation is a strong signal.
25
+
26
+ ## MCP Inventory
27
+
28
+ Read `references/mcp-inventory.md` for full details. Quick heuristics:
29
+
30
+ | Signal | MCP |
40
31
  |---|---|
41
- | `git` | You need repo state, diffs, history, branches, staging, or commits. |
42
- | `playwright` | You need browser automation, screenshots, UI interaction, console logs, or visual checks. |
43
- | `sequential-thinking` | The problem genuinely needs multi-step reflective reasoning or trade-off analysis. |
44
- | `slack` | You need to read Slack context, inspect channels or threads, or send a Slack update. |
45
- | `gmail` | You need inbox search, email reading, drafting, sending, labels, or filters. |
46
- | `mcp-atlassian-personal` | Personal Jira or Confluence reads and writes: tickets, sprints, docs, comments, status changes. Default Atlassian account. *(Server names are configurable via `ops.mcp_account_rules` in ftm-config.yml. This table shows defaults.)* |
47
- | `mcp-atlassian` | Admin-scope Jira or Confluence operations that must run with elevated org credentials. *(Configurable via `ops.mcp_account_rules.admin` in ftm-config.yml.)* |
48
- | `freshservice-mcp` | IT ticketing, requesters, agent groups, products, or service requests. |
49
- | `context7` | External library and framework documentation. |
50
- | `glean_default` | Internal company docs, policies, runbooks, and institutional knowledge. |
51
- | `apple-doc-mcp` | Apple platform docs for Swift, SwiftUI, UIKit, AppKit, and related APIs. |
52
- | `lusha` | Contact or company lookup and enrichment. |
53
- | `google-calendar` | Schedule inspection, free/busy checks, event search, drafting scheduling actions, and calendar changes. |
54
-
55
- ### MCP matching heuristics
56
-
57
- Use the smallest relevant MCP set.
58
-
59
- - Jira issue key or Atlassian URL -> `mcp-atlassian-personal` (or the configured personal account name)
60
- - "internal docs", "runbook", "company wiki", "Glean" -> `glean_default`
61
- - "how do I use X library" -> `context7`
62
- - "calendar", "meeting", "free time" -> `google-calendar`
63
- - "Slack", "channel", "thread", "notify" -> `slack`
64
- - "email", "Gmail", "draft" -> `gmail`
65
- - "ticket", "hardware", "access request" -> `freshservice-mcp`
66
- - "browser", "screenshot", "look at the page" -> `playwright`
67
- - "talk through trade-offs" -> `sequential-thinking`
68
- - "SwiftUI" or Apple framework names -> `apple-doc-mcp`
69
- - "find contact/company" -> `lusha`
70
-
71
- ### Multi-MCP chaining
72
-
73
- Detect mixed-domain requests early.
74
-
75
- Examples:
76
-
77
- - "check my calendar and draft a Slack message" -> `google-calendar` + `slack`
78
- - "read the Jira ticket, inspect the repo, then propose a fix" -> `mcp-atlassian-personal` + `git`
79
- - "search internal docs, then update a Confluence page" -> `glean_default` + `mcp-atlassian-personal`
80
-
81
- Rules:
82
-
83
- - parallelize reads when safe
84
- - gather state before proposing writes
85
- - chain writes sequentially
32
+ | Jira key or Atlassian URL | `mcp-atlassian-personal` |
33
+ | Internal docs, runbook, company wiki | `glean_default` |
34
+ | Library/framework docs | `context7` |
35
+ | Calendar, meeting, free time | `google-calendar` |
36
+ | Slack, channel, thread | `slack` |
37
+ | Email, Gmail, draft | `gmail` |
38
+ | Ticket, hardware, access request | `freshservice-mcp` |
39
+ | Browser, screenshot | `playwright` |
40
+ | Trade-off analysis | `sequential-thinking` |
41
+ | Apple framework | `apple-doc-mcp` |
42
+ | Contact/company lookup | `lusha` |
43
+
44
+ Multi-MCP: parallelize reads, gather state before writes, chain writes sequentially.
86
45
 
87
46
  ## Session Trajectory
88
47
 
89
- Do not orient from the last user message alone.
90
-
91
- Look for the arc:
92
-
93
- - What skill or action happened just before this?
94
- - What did we learn?
95
- - Is the user moving from ideation -> execution -> validation?
96
- - Did we already choose an approach that this request assumes?
97
-
98
- Trajectory cues:
99
-
100
- - brainstorm -> "ok go" usually means plan or executor
101
- - debug -> "check it now" usually means verify, test, or audit
102
- - executor -> "pause" means checkpoint, not new work
103
- - resume -> "what's next?" means restore and continue
104
-
105
- If a request branches away from the active thread, note that mentally and avoid corrupting the current session model.
48
+ Look for the arc, not just the last message:
49
+ - What happened just before? What did we learn?
50
+ - brainstorm → "ok go" = plan/executor
51
+ - debug → "check it now" = verify/test/audit
52
+ - executor → "pause" = checkpoint
53
+ - resume → "what's next?" = restore and continue
106
54
 
107
55
  ## Codebase State
108
56
 
109
- Orient must incorporate what is true in the repo right now.
110
-
111
- Check:
112
-
113
- - dirty worktree
114
- - recent commits
115
- - active branch
116
- - user changes in progress
117
- - whether the request conflicts with local state
118
-
119
- Use codebase state to answer:
120
-
121
- - is this safe to do directly?
122
- - do we need to avoid stepping on unfinished work?
123
- - is this request actually about the last commit or current unstaged diff?
124
- - should we inspect a particular module first because recent changes point there?
125
-
126
- Repo heuristics:
57
+ Check: dirty worktree, recent commits, active branch, in-progress changes, conflicts with request. Clean tree = lower cost of direct action. Uncommitted changes = continuity and risk.
127
58
 
128
- - uncommitted changes imply continuity and risk
129
- - a clean tree lowers the cost of direct action
130
- - a just-landed commit suggests review or regression-check behavior
131
- - a ticket-linked branch suggests the user expects ticket-driven execution
59
+ ## Approval Gates (HARD STOP)
132
60
 
133
- ## Approval Gates (HARD STOP, NOT OPTIONAL)
61
+ **Circuit breaker. External mutations require explicit user approval. No exceptions.**
134
62
 
135
- **This section is a circuit breaker, not a suggestion. If you are about to call a tool that creates, updates, or deletes a record in an external system, you MUST stop and get explicit user approval FIRST. No exceptions. No "the user implied it." No "it's part of the plan." STOP and ASK.**
63
+ See `references/incidents.md` → Hindsight Incident for why this exists.
136
64
 
137
- The reason this exists: in March 2026, ftm-mind took a Hindsight SSO task and autonomously created Okta groups, added users to production Okta, created Freshservice records, created a service catalog item, and modified S3 workflow configs — all without asking once.
65
+ ### Requires approval (STOP before each)
138
66
 
139
- ### What requires approval (STOP before each one)
67
+ Every individual external mutation. "User approved the plan" ≠ "user approved every API call."
140
68
 
141
- Every individual external mutation needs its own approval. "The user approved the plan" does not mean "the user approved every API call in the plan."
142
-
143
- - **Okta**: creating apps, groups, assigning users, modifying policies
144
- - **Freshservice**: creating tickets, records, catalog items, custom objects
145
- - **Jira / Confluence**: creating or updating issues, pages, comments
146
- - **Slack / Email**: sending messages (draft-before-send protocol applies)
147
- - **Calendar**: creating or modifying events
148
- - **S3 / cloud storage**: writing or modifying objects
149
- - **Browser forms**: submitting data through playwright/puppeteer
69
+ - **Okta**: create apps/groups, assign users, modify policies
70
+ - **Freshservice**: create tickets/records/catalog items/custom objects
71
+ - **Jira/Confluence**: create/update issues, pages, comments
72
+ - **Slack/Email**: send messages (draft-before-send applies)
73
+ - **Calendar**: create/modify events
74
+ - **S3/cloud**: write/modify objects
75
+ - **Browser forms**: submit data
150
76
  - **Deploys**: any production-affecting operation
151
- - **Git remote**: pushes, PR creation
152
-
153
- When multiple mutations are part of one plan, batch the approval request by phase — not one API call at a time, but not "approve the whole plan" either. Group related mutations and present per-phase.
154
-
155
- ### Destructive Actions (EXTRA HARD GATE — NEVER WITHOUT EXPLICIT CONFIRMATION)
156
-
157
- Deleting, replacing, or recreating external resources is a **separate, higher gate** than creating or updating them. These actions are often irreversible and break downstream dependencies you can't see.
158
-
159
- **NEVER do any of these without explicit user confirmation for each specific resource being destroyed:**
160
- - **DELETE any external resource** (catalog items, custom objects, Okta groups/apps, Jira issues, S3 objects)
161
- - **Recreate (delete + create)** to "fix" something — the new resource gets a different ID, breaking every automation that references the old one
162
- - **Overwrite S3 objects** that other systems read from
163
- - **Remove users from groups** or deactivate accounts
164
- - **Close/resolve tickets** that others may be watching
165
-
166
- **The "delete and recreate" trap**: When you can't update a resource cleanly via API, your instinct will be to delete it and create a fresh one. THIS IS ALMOST ALWAYS WRONG. External resources have IDs that other systems depend on — workflow configs, Lambda triggers, approval chains, custom object lookups, S3 references. Deleting breaks all of them silently. Instead:
167
- 1. Tell the user what you can't update via API
168
- 2. Suggest the minimal manual fix (admin UI link + exact steps)
169
- 3. Only delete if the user explicitly says "yes, delete it, I understand the dependencies"
77
+ - **Git remote**: push, PR creation
170
78
 
171
- **The April 2026 Braintrust incident**: ftm-mind deleted Freshservice catalog items #626 and #621 to "fix" duplicate fields, recreating them as #631 and #632. This broke the S3 workflow config (assign_after_app_owner_approval), required emergency patching, and the custom_lookup_bigint fields had to be re-added manually. The correct fix was: update only the roles field via API, and tell the user to delete the duplicate fields manually in the admin UI.
79
+ Batch by phase: not per-call, not whole-plan.
172
80
 
173
- ### What auto-proceeds (no approval needed)
81
+ ### Destructive Actions (EXTRA HARD GATE)
174
82
 
175
- - local code edits, documentation updates
176
- - tests, lint, builds, audits
177
- - local git operations (branch, commit, inspection)
178
- - reading from any MCP or API (GET requests)
179
- - blackboard reads and writes
180
- - saving drafts to `.ftm-drafts/`
83
+ **NEVER delete/recreate external resources without per-resource user confirmation.**
181
84
 
182
- ### The momentum trap
85
+ - DELETE any external resource
86
+ - Recreate (delete + create) — new ID breaks all automation referencing the old one
87
+ - Overwrite S3 objects other systems read
88
+ - Remove users from groups, deactivate accounts
89
+ - Close/resolve tickets others watch
183
90
 
184
- If you notice yourself thinking any of these, STOP; you are rationalizing past a gate:
91
+ When you can't update via API: tell the user, suggest manual fix (admin UI link + steps). Only delete if user explicitly confirms with dependency awareness. See `references/incidents.md` → Braintrust Incident.
185
92
 
186
- - "The user clearly wants this done, I'll just do it"
187
- - "This is part of the approved plan"
188
- - "I already started, might as well finish"
189
- - "It's just one more API call"
190
- - "The user will appreciate me being proactive"
93
+ ### Auto-proceeds (no approval)
191
94
 
192
- None of these override the gate. Present the action, wait for approval, then execute.
95
+ Local edits, tests, lint, builds, audits, local git, GET requests, blackboard reads/writes, saving drafts.
193
96
 
194
- ## Ask-the-User Heuristic
97
+ ### Rationalization traps
195
98
 
196
- Ask the user only when one of these is true:
197
-
198
- - two materially different interpretations are both plausible
199
- - an external-facing action needs approval
200
- - a required credential, path, or identifier is missing **AND the blackboard has no experience confirming access** (see Blackboard-First Rule below)
201
- - the user explicitly asked for options before action
202
- - **the task is medium+ and involves external systems, stakeholder coordination, or unfamiliar code** (see Discovery Interview below) **AND the blackboard doesn't already confirm repo-level access**
203
-
204
- When asking, ask one focused question with concrete choices.
205
-
206
- ### Blackboard-First Rule (MANDATORY before any access/auth questions)
99
+ | Thought | Reality |
100
+ |---|---|
101
+ | "The user clearly wants this" | Present the action, wait for approval |
102
+ | "It's part of the approved plan" | Each mutation needs its own gate |
103
+ | "I already started" | Sunk cost. Stop and ask |
104
+ | "Just one more API call" | That's how incidents start |
105
+ | "User will appreciate proactivity" | User will appreciate not breaking things |
207
106
 
208
- **Before asking ANY question about credentials, API access, authorization, permissions, or "do you have access to X" — check the blackboard first.**
107
+ ## Blackboard-First Rule (before any access/auth questions)
209
108
 
109
+ Before asking about credentials, API access, or authorization:
210
110
  1. Read `experiences/index.json`
211
- 2. Look for entries tagged with the current repo name, `api-access`, `full-access`, `credentials`, or the system being asked about (e.g., `freshservice`, `okta`, `jira`)
212
- 3. If a matching experience exists with `confidence >= 0.7`:
213
- - **Do NOT ask about access.** The user already established this.
214
- - **Do NOT run a discovery interview about authorization.** You have the answer.
215
- - **Just do the thing.** If the credentials don't work, you'll find out when the API call fails — and that's a better signal than a speculative question.
216
- 4. If no matching experience exists, proceed with asking.
217
-
218
- This rule exists because users set up repo-level context once (e.g., "my-tools repo has full API access to our admin systems") and expect Claude to remember it across every session. Asking "do you have admin access?" when the blackboard already says "yes, full access" is the #1 frustration signal.
219
-
220
- ### Access Declaration Detection (MANDATORY)
221
-
222
- When a user declares repo-level access — either explicitly or as part of a task — **immediately write a blackboard experience so it persists across sessions.** Do NOT wait until the task is complete. Write it during Orient, before acting.
223
-
224
- **Detection triggers** (any of these in the user's message):
225
- - "I have access to...", "I have credentials for...", "I'm authenticated to..."
226
- - "this repo has access to...", "we have API keys for..."
227
- - "just do it, I have the creds", "you have access here", "credentials are configured"
228
- - "I'm in [repo name] with my credentials"
229
- - The user tells you to stop asking and just use an API
230
- - An API call succeeds for the first time in a repo where no access experience exists
231
-
232
- **What to write** — create an experience file at `~/.claude/ftm-state/blackboard/experiences/learning-{repo-name}-api-access.json`:
233
-
234
- ```json
235
- {
236
- "id": "learning-{repo-name}-api-access",
237
- "timestamp": "{ISO 8601 now}",
238
- "task_type": "environment-knowledge",
239
- "tags": ["{repo-name}", "api-access", "environment", "learning"],
240
- "outcome": "success",
241
- "description": "User confirmed API access in {repo-name} repo. {any specifics they mentioned — which systems, what kind of access}.",
242
- "lessons": [
243
- "{repo-name} repo has configured access to {systems mentioned}",
244
- "Do not ask about credentials or authorization when working in this repo — just act"
245
- ],
246
- "confidence": 1.0,
247
- "code_patterns": [],
248
- "api_gotchas": []
249
- }
250
- ```
111
+ 2. Look for tags: current repo name, `api-access`, `full-access`, or the target system
112
+ 3. If match exists with confidence ≥ 0.7 → don't ask, just act
113
+ 4. No match → proceed with asking
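
The lookup steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ftm's actual implementation: it assumes `experiences/index.json` is either a list of entries or an object with an `entries` list, and that each entry carries `tags` and `confidence` fields like the experience files do. The function name `has_confirmed_access` is hypothetical.

```python
import json
from pathlib import Path

CONFIDENCE_THRESHOLD = 0.7  # threshold taken from step 3 of the rule


def has_confirmed_access(index_path, repo_name, target_system=None):
    """Return True if the blackboard index holds a confident access experience.

    Assumption: index.json is a list of entries, or an object with an
    "entries" list, and each entry has "tags" and "confidence".
    """
    path = Path(index_path)
    if not path.exists():
        return False  # no blackboard yet, so fall through to asking

    data = json.loads(path.read_text())
    entries = data.get("entries", []) if isinstance(data, dict) else data

    # Tags named in step 2: repo name, access tags, or the target system.
    wanted = {repo_name, "api-access", "full-access"}
    if target_system:
        wanted.add(target_system)

    for entry in entries:
        tags = set(entry.get("tags", []))
        if tags & wanted and entry.get("confidence", 0) >= CONFIDENCE_THRESHOLD:
            return True  # step 3: don't ask, just act
    return False  # step 4: no match, proceed with asking
```

The sketch follows the rule literally and accepts a match on any one of the listed tags. A stricter variant might require the repo name tag together with an access tag.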
251
114
 
252
- Also update `experiences/index.json` with the new entry.
115
+ ## Access Declaration Detection
253
116
 
254
- **On first successful API call:** If you make an API call in a repo and it succeeds, but no access experience exists for this repo, write one automatically. The success IS the proof of access. Tag it with the repo name and the system that worked (e.g., `freshservice`, `okta`).
117
+ When user declares repo-level access, **immediately** write a blackboard experience:
255
118
 
256
- **This is not optional.** Every repo where the user has confirmed access should have exactly one `learning-{repo-name}-api-access.json` experience. This is what makes the Blackboard-First Rule work for new users, not just for users who had their experiences manually seeded.
119
+ **Triggers**: "I have access to...", "credentials are configured", "just do it, I have the creds", user tells you to stop asking, or first successful API call in a repo without an access experience.
257
120
 
258
- ### Discovery Interview (medium+ tasks with external systems)
121
+ **Write**: `experiences/learning-{repo-name}-api-access.json` with tags `["{repo-name}", "api-access", "environment", "learning"]`, confidence 1.0. Update index.
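
As a rough sketch of the write step, the following builds the experience record using the field names from the JSON template that 1.7.12 carried inline. The helper name `write_access_experience` and the shape of `index.json` (a flat list of summaries) are assumptions made for illustration. The real index layout may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_access_experience(experiences_dir, repo_name, systems):
    """Write learning-{repo-name}-api-access.json and register it in index.json."""
    exp_dir = Path(experiences_dir)
    exp_dir.mkdir(parents=True, exist_ok=True)
    exp_id = f"learning-{repo_name}-api-access"

    experience = {
        "id": exp_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "task_type": "environment-knowledge",
        "tags": [repo_name, "api-access", "environment", "learning"],
        "outcome": "success",
        "description": f"User confirmed API access in {repo_name} repo.",
        "lessons": [
            f"{repo_name} repo has configured access to {', '.join(systems)}",
            "Do not ask about credentials or authorization when working in this repo",
        ],
        "confidence": 1.0,
        "code_patterns": [],
        "api_gotchas": [],
    }
    (exp_dir / f"{exp_id}.json").write_text(json.dumps(experience, indent=2))

    # Assumption: index.json is a flat list of {id, tags, confidence} summaries.
    index_path = exp_dir / "index.json"
    index = json.loads(index_path.read_text()) if index_path.exists() else []
    index = [e for e in index if e.get("id") != exp_id]  # one entry per repo
    index.append({"id": exp_id, "tags": experience["tags"], "confidence": 1.0})
    index_path.write_text(json.dumps(index, indent=2))
    return experience
```

Rewriting the index entry instead of appending blindly keeps a single access experience per repo even when several triggers fire in the same session.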
259
122
 
260
- When a task hits forced-medium or higher AND involves external systems, stakeholder coordination, or code you haven't read yet this session, run a brief discovery interview BEFORE generating the plan. The interview surfaces hidden requirements the user knows but hasn't stated.
123
+ ## Discovery Interview (medium+ with external systems)
261
124
 
262
- **Before running the interview, apply the Blackboard-First Rule above.** If the blackboard confirms access and the task is a straightforward API operation (add user, create ticket, update group), skip the interview entirely and just do it. The interview is for tasks with genuine unknowns — stakeholder coordination, multi-system migrations, policy changes — not for "use the Freshservice API to add an agent."
125
+ **Apply Blackboard-First Rule first.** If blackboard confirms access + task is a direct API operation → skip interview, just do it.
263
126
 
264
- The interview should be 2-4 focused questions:
127
+ Interview is for genuine unknowns only (stakeholder coordination, multi-system migrations, policy changes). 2-4 focused questions:
128
+ - Who else needs to know?
129
+ - Downstream dependencies?
130
+ - Timeline/approval constraints?
131
+ - Parts to leave as-is?
265
132
 
266
- - Who else needs to know about this change?
267
- - Are there downstream systems or automations that depend on what's changing?
268
- - Is there a timeline or dependency on someone else's approval?
269
- - Should we also draft a message to anyone about this?
270
- - Are there parts of this you want left alone for now vs. changed?
133
+ **Skip when**: user provided context, purely local, user said "just do it", or blackboard confirms access for a direct API op.
271
134
 
272
- **When to skip the interview:**
273
- - The user already provided comprehensive context
274
- - The task is purely local with no external dependencies
275
- - The user explicitly says "just do it" or "no questions, go"
276
- - **The blackboard has an experience confirming API access for this repo + the task is a direct API operation** (not stakeholder coordination or multi-system migration)
277
-
278
- ## Brain.py Task Loading (Observe Phase)
279
-
280
- During the Orient phase, enrich session context with the user's active operational state by loading tasks via brain.py:
135
+ ## Brain.py Task Loading
281
136
 
282
137
  ```
283
138
  python3 ~/.claude/skills/ftm/bin/brain.py --tasks --task-json
284
139
  ```
285
140
 
286
- Parse the JSON output for active tasks. Surface high-priority or blocking tasks via `TaskCreate` with the task details so they appear in the session task list. This gives ftm-mind awareness of what the user is carrying before deciding on the next move.
287
-
288
- Skip this step if:
289
- - brain.py is not present or returns an error (fail gracefully, do not block orientation)
290
- - The session context already contains recently loaded task state (within 15 minutes)
291
- - The request is purely local with no operational relevance (e.g., pure code edits)
292
-
293
- ## Playbook Lookup (MANDATORY before any external system operation)
294
-
295
- **Before executing any operation on an external system (Freshservice, Okta, Jira, Trelica, S3, etc.), check for an existing playbook.** This is not optional. Playbooks encode hard-won lessons — API quirks, encoding requirements, field types that can't be updated, correct endpoint paths. Skipping this step means repeating every mistake the playbook was written to prevent.
296
-
297
- **Step 1: Check brain.py playbooks.**
298
-
299
- ```
300
- python3 ~/.claude/skills/ftm/bin/brain.py --playbook-match "[describe the operation]" --playbook-match-source freshservice
301
- ```
302
-
303
- If a match returns with confidence > 0.2, read the full playbook before proceeding.
304
-
305
- ```
306
- python3 ~/.claude/skills/ftm/bin/brain.py --playbook-list
307
- ```
308
-
309
- Also list all playbooks and scan names — sometimes the match query misses a relevant one.
141
+ Load active tasks, surface high-priority via TaskCreate. Skip if brain.py absent, tasks loaded recently (15min), or request is purely local.
310
142
 
311
- **Step 2: Check repo-local playbooks.**
312
-
313
- ```
314
- ls docs/playbooks/ 2>/dev/null
315
- ```
143
+ ## Playbook Lookup (MANDATORY before external system ops)
316
144
 
317
- If the current repo has a `docs/playbooks/` directory, scan it for files matching the target system. Read any relevant playbook before writing a single line of code.
145
+ **Before any external system operation, check all three knowledge sources:**
318
146
 
319
- **Step 3: Check blackboard experiences.**
147
+ 1. `brain.py --playbook-match "[operation]"` + `--playbook-list`
148
+ 2. `ls docs/playbooks/` in current repo
149
+ 3. Blackboard experiences filtered by target system tags — check `code_patterns` and `api_gotchas`
320
150
 
321
- Read `experiences/index.json` and filter by tags matching the target system. Load matching experience files and check for `code_patterns` and `api_gotchas` fields.
322
-
323
- **What playbooks prevent:**
324
- - Using raw HTML when Freshservice requires entity-encoded HTML (`html.escape()`)
325
- - Trying to PUT on `service_catalog/items/{internal_id}` when the correct path is `service-catalog/items/{display_id}`
326
- - Including `custom_lookup_bigint` fields in API updates (they're admin-UI-only)
327
- - Deleting and recreating resources when an in-place update works
328
- - Repeating 10+ failed API calls to discover what the playbook already documents
329
-
330
- **The April 2026 Braintrust incident**: A playbook existed (`docs/playbooks/freshservice-service-catalog-item.md`), the blackboard had the lesson ("FS rich text tables require html.escape()"), and a brain.py playbook (`fs-hide-catalog-el`) was available. None were consulted. The result: 15+ failed API attempts, accidental creation of duplicate fields, then destructive deletion of two catalog items breaking S3 workflow automation.
331
-
332
- **If no playbook exists** and the operation succeeds after trial-and-error, the auto-playbook hook should trigger. If it doesn't, proactively invoke ftm-capture to save the working pattern.
151
+ If any source has relevant content, read it before writing code. See `references/incidents.md` → Braintrust Incident for what happens when you skip this.
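
Sources 2 and 3 of the lookup could be gathered with something like the sketch below. Source 1 remains the `brain.py` CLI call and is not wrapped here. The function name `local_playbook_hits` is hypothetical. The sketch also scans the experience files directly instead of going through `index.json`, to avoid assuming the index format, and it matches repo-local playbooks by file name only.

```python
import json
from pathlib import Path


def local_playbook_hits(repo_root, experiences_dir, system):
    """Collect repo-local playbooks and blackboard notes for one target system."""
    hits = {"playbooks": [], "code_patterns": [], "api_gotchas": []}

    # Source 2: docs/playbooks/ in the current repo, matched by file name.
    playbook_dir = Path(repo_root) / "docs" / "playbooks"
    if playbook_dir.is_dir():
        hits["playbooks"] = sorted(
            p.name for p in playbook_dir.iterdir() if system in p.name.lower()
        )

    # Source 3: blackboard experience files tagged with the target system.
    exp_dir = Path(experiences_dir)
    if exp_dir.is_dir():
        for path in sorted(exp_dir.glob("*.json")):
            if path.name == "index.json":
                continue  # the index is not an experience record
            exp = json.loads(path.read_text())
            if isinstance(exp, dict) and system in exp.get("tags", []):
                hits["code_patterns"].extend(exp.get("code_patterns", []))
                hits["api_gotchas"].extend(exp.get("api_gotchas", []))
    return hits
```

If any of the three lists comes back non-empty, that content should be read before any code is written against the external system.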
333
152
 
334
153
  ## Orient Synthesis
335
154
 
336
- Before leaving Orient, silently synthesize all signals into one internal picture:
337
-
338
- - current outcome the user wants
339
- - current task type
340
- - session continuity
341
- - codebase constraints
342
- - relevant lessons
343
- - relevant patterns
344
- - capability mix
345
- - smallest correct task size
346
- - whether approval or clarification is needed
347
-
348
- Orient is complete only when the next move feels obvious.
155
+ Before leaving Orient, have one clear internal picture: what the user wants, task type, session continuity, codebase constraints, relevant lessons/patterns, capability mix, correct task size, whether approval or clarification is needed. Orient is complete when the next move feels obvious.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "feed-the-machine",
3
- "version": "1.7.12",
3
+ "version": "1.7.14",
4
4
  "description": "A brain upgrade for Claude Code — 26 skills that teach it how to think before acting, remember across conversations, debug like a war room, run plans on autopilot with agent teams, and get second opinions from GPT & Gemini. Plus 15 hooks that automate the boring stuff.",
5
5
  "license": "MIT",
6
6
  "author": "kkudumu",