npm - feed-the-machine - Versions diffs - 1.7.13 → 1.7.15 - Mend

feed-the-machine 1.7.13 → 1.7.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/ftm-mind/references/decide-act-protocol.md +47 -162
package/ftm-mind/references/incidents.md +23 -0
package/ftm-mind/references/orient-protocol.md +79 -309
package/hooks/ftm-guard.sh +121 -0
package/hooks/settings-template.json +10 -0
package/package.json +1 -1

package/ftm-mind/references/decide-act-protocol.md CHANGED

@@ -2,205 +2,90 @@
 ## Decide
-Decide turns the orientation model into one concrete next move.
+### 1. Choose execution mode
-### 1. Choose the smallest correct execution mode
+- `micro` → direct action
+- `small` → pre-flight summary + action + verify
+- `medium` → checkbox plan, wait for approval, execute
+- `large` → `ftm-brainstorm` (no plan) or `ftm-executor` (plan exists)
-- `micro` -> direct action
-- `small` -> pre-flight summary, then direct action plus verification
-- `medium` -> numbered plan, wait for approval, then execute
-- `large` -> `ftm-brainstorm` if no plan exists, or `ftm-executor` if a plan exists
+Double-check forced escalation signals from Complexity Sizing reference. If any fired → medium minimum.
-**Double-check before committing to a size**: Re-read the forced escalation signals from the Complexity Sizing reference. If any forced-medium signals fired, the task is medium regardless of how it feels.
+### 1.5 Plan Approval
-### 1.5 Interactive Plan Approval
+Read `ftm-config.yml` → `execution.approval_mode`.
-Read `~/.claude/ftm-config.yml` field `execution.approval_mode`. This controls whether the user sees and approves the plan before execution begins.
+**`auto`**: micro/small just go, medium outlines + executes, large routes to brainstorm/executor.
-#### Mode: `auto` (default legacy behavior)
-Skip this section entirely. Execute as before — micro/small just go, medium outlines steps and executes, large routes to brainstorm/executor.
+**`plan_first`** (recommended):
+- Small: pre-flight summary, proceed unless user objects
+- Medium/large: present checkbox plan, wait for explicit approval
-#### Mode: `plan_first` (recommended for collaborative work)
-**For small tasks**: Show a brief pre-flight summary before executing. Not a formal gate — just visibility:
-```
-Quick summary before I start:
-- Read [file] to understand current behavior
-- Change [X] to [Y] in [file]
-- Verify: [test/lint/manual check]
-Going ahead unless you say otherwise.
-```
-**For medium and large tasks**: Present a numbered task list and wait for the user to approve.
-**Step 0: Discovery Interview (if applicable).** Before generating the plan, check whether a Discovery Interview is needed (see Orient reference). If the task involves external systems, stakeholder coordination, or unfamiliar code, run the interview FIRST.
-**Step 1: Generate the plan.** Build a numbered checkbox list. This format is **mandatory** — no narrative steps, no prose paragraphs. Every plan MUST use: `N. [ ] One-line action → target`. See `references/protocols/PLAN-APPROVAL.md` for the full format spec, examples for code/ops/comms/infra tasks, and the list of NEVER-produce anti-patterns.
-**Step 2: Parse the user's response.**
+Plan format is **mandatory**: `N. [ ] One-line action → target`. See `protocols/PLAN-APPROVAL.md` for spec + examples.
 | User says | Action |
-|-----------|--------|
-| `approve`, `go`, `yes`, `lgtm`, `ship it` | Execute all steps in order |
-| `skip N` or `skip N,M` | Remove those steps, execute the rest |
-| `only N,M,P` | Execute only the listed steps in order |
-| `for step N, [instruction]` | Replace step N's approach, then execute all |
-| `add: [description] after N` | Insert a new step, renumber, then execute all |
-| `deny`, `stop`, `cancel`, `no` | Cancel. Do not execute anything. |
-| A longer message with mixed feedback | Parse each instruction. Apply all modifications. Present revised plan and ask for final approval. |
-**Step 3: Execute the approved plan.** Work through steps sequentially. After each step show: `Step 2/5 done: [summary].` If a step fails, stop and report.
+|---|---|
+| approve/go/yes/lgtm | Execute all |
+| skip N | Remove step, execute rest |
+| only N,M | Execute only listed |
+| for step N, [change] | Modify + execute all |
+| add: [desc] after N | Insert, renumber, execute |
+| deny/stop/cancel | Cancel entirely |
-**Step 4: Post-execution update.** Update blackboard with decisions and experience.
+Execute sequentially. Show `Step 2/5 done: [summary]` after each. If step fails → stop and report.
-#### Mode: `always_ask`
-Same as `plan_first` but applies to **small** tasks too. Only micro tasks skip the approval gate.
+**`always_ask`**: Same as plan_first but also gates small tasks. Only micro skips.
-#### Combining with explicit skill routing
-When routing to a skill, plan approval still applies if mode is `plan_first` or `always_ask`. Present the strategy for user control.
+### 2. Direct vs routed
-### 2. Choose direct vs routed execution
+Direct when: micro/small, routing overhead adds no value, faster to just do it.
+Skill when: specialized workflow improves result, user invoked it, medium/large.
-Use direct execution when:
-- the work is micro or small
-- routing overhead adds no value
-- the answer can be delivered faster than a delegated workflow
+### 3. Supporting MCP reads
-Use a ftm skill when:
-- its specialized workflow will materially improve the result
-- the user explicitly invoked it
-- the task is medium/large and the skill is the right vehicle
+Fetch minimum required external context first (ticket, calendar, docs, browser state).
-### 3. Choose any supporting MCP reads
+### 4. Loop decision
-If the request depends on external context, fetch the minimum required state first.
-Examples:
-- Jira URL -> read the ticket first
-- meeting request -> read calendar first
-- internal policy question -> search Glean first
-- UI bug -> snapshot or inspect browser first
-### 4. Decide whether to loop
-If the next move will reveal new information, plan to re-enter Observe after the action.
+If next move reveals new information → plan to re-enter Observe after.
 ## Act
-Act is clean, decisive execution — but execution of **approved** work only.
-**HARD GATE — Pre-Act checkpoint**: Before executing ANYTHING (Bash, MCP, Write, Edit, API calls of any kind), verify ALL of these:
-1. **Did you present a checkbox plan?** If the task is medium+ (forced escalation signals fired), you MUST have presented a `N. [ ] action → target` plan and received explicit user approval. "I'll do X, Y, Z" in prose is NOT a plan. Listing steps without `[ ]` checkboxes is NOT a plan. If you haven't presented one, STOP and present it now.
-2. **Did the user approve it?** Look for "go", "approve", "yes", "lgtm", or similar. If the user hasn't responded to your plan yet, WAIT. Do not start executing.
-3. **Is the plan marker written?** After approval, write to `~/.claude/ftm-state/.plan-presented` before executing. This signals to hooks that planning happened.
-4. If the task involves external mutations (see Approval Gates), have you presented the specific actions and received approval?
-5. If none of the above apply (micro/small task, no forced escalation), proceed.
-**The rationalization trap**: You will feel the urge to skip the plan because:
-- "The user said 'do as much as you can' — that's implicit approval" → NO. That's the task description, not plan approval.
-- "I know what needs to happen, presenting a plan is just overhead" → NO. The plan is for the USER, not for you.
-- "I'll just start with one small API call to check something" → NO. One call becomes five becomes a full execution without approval.
-- "The user seems impatient" → NO. A 30-second plan saves 10 minutes of unwanted work.
-**This applies to ALL execution methods** — Bash commands, MCP calls, Python scripts, curl, direct API calls. The plan-gate hook catches Edit/Write/MCP, but Bash API calls bypass it. This checkpoint is the only thing that catches those. Do not skip it.
-### Compare Before You Loop (MANDATORY for external system operations)
+### Pre-Act Checkpoint
-When working with external systems (Freshservice, Okta, Jira, Trelica, any API), **never trial-and-error your way to a solution.** Instead:
+Before executing, verify:
-**Step 1: Find a working reference.**
-Before making any changes, find an existing resource that already works the way you want the target to work. Examples:
-- Updating a catalog item's roles table? GET the working one (HR Acuity #630) AND the broken one. Diff them field by field.
-- Fixing an Okta group mapping? GET a group that works correctly and compare its config to the broken one.
-- Updating a Jira automation? Read a working rule's config before touching the broken one.
+1. **Checkbox plan presented?** Medium+ tasks require `N. [ ] action → target` format, approved by user.
+2. **User approved?** Wait for explicit go/approve/yes.
+3. **Plan marker written?** Write to `~/.claude/ftm-state/.plan-presented` after approval.
+4. None apply (micro/small, no forced escalation) → proceed.
-**Step 2: Diff, don't guess.**
-Compare the working reference against the target. The fix is almost always a small, specific difference — a missing field option, a different encoding, a wrong position value. Find that diff. Don't hypothesize about what might be wrong.
-**Step 3: Make targeted changes.**
-Change ONLY what the diff revealed. One field at a time if needed. Verify after each change.
-**The trial-and-error trap**: When an API call fails, your instinct is to try a different endpoint, different payload, different method. After 3 failed attempts you're in a loop — guessing at combinations. STOP. Go back to Step 1. The answer is in the working reference, not in your next guess.
-**Red flags that you're in a loop:**
-- You've made 3+ API calls to the same system without a success
-- You're trying different URL path formats (underscore vs hyphen, internal ID vs display ID)
-- You're adding/removing fields from the payload hoping one combination works
-- You're reading API docs or source code to figure out the endpoint (the playbook should have this)
-**When you detect a loop:** STOP executing. Tell the user: "I've tried N approaches and none worked. Let me compare against a working reference before continuing." Then do Step 1.
-**The April 2026 lesson**: A one-field-option diff (`requester_can_edit: "true"`) was the entire fix for the Freshservice roles table not rendering. It took 15+ API calls, accidental field duplication, and destructive deletion of two catalog items to discover what a 30-second field-by-field comparison against the working HR Acuity item would have revealed immediately.
+**Note**: The `ftm-guard` hook enforces approval gates, destructive action prevention, playbook checks, and loop detection at the tool-call level. You don't need to self-check these — the hook will stop you. But you should still present plans and get approval before acting.
 ### 1. Direct action
-For micro tasks:
-- do the work
-- summarize what changed
-For small tasks (when `approval_mode` is `plan_first` or `always_ask`):
-- show the pre-flight summary first
-- then do the work
-- verify
-- summarize what changed
+Micro: do + summarize. Small (plan_first/always_ask): pre-flight → do → verify → summarize.
 ### 2. Skill routing
-Before invoking a skill, show one short routing line.
-Examples:
-- `Routing to ftm-debug: this is a flaky failure with real diagnostic uncertainty.`
-- `Routing to ftm-brainstorm: this is still design-stage and benefits from research-backed planning.`
-Then invoke the target skill with the full user input.
+Show one routing line, then invoke: `Routing to ftm-debug: flaky failure with diagnostic uncertainty.`
 ### 3. MCP execution
-Use:
-- parallel reads when safe
-- sequential writes
-- approval gates only for external-facing actions
+Parallel reads, sequential writes. The ftm-guard hook handles approval gates for external-facing actions.
-### 3.5. Draft-before-send protocol
+### 3.5 Draft-before-send
-When composing Slack messages, emails, or any outbound communication, always save the draft locally before sending.
-**Drafts folder**: `.ftm-drafts/` in the project root (or `~/.claude/ftm-drafts/` if no project context).
-**Ensure the folder exists and is gitignored.** Save every draft before presenting or sending:
-- Filename: `YYYY-MM-DD_HH-MM_<type>_<recipient-or-channel>.md`
-- Content includes frontmatter: type, to, subject (email only), drafted timestamp, status (draft/sent/cancelled)
-**Workflow:**
-1. Compose the message
-2. Save to `.ftm-drafts/`
-3. Present to user for approval
-4. If approved and sent, update `status: sent`
-5. If cancelled or modified, update accordingly
+Slack/email/outbound comms → save to `.ftm-drafts/` AND `~/.claude/ftm-ops/drafts/` first. Filename: `YYYY-MM-DD_HH-MM_<type>_<recipient>.md`. Present for approval, update status on send/cancel.
 ### 4. Blackboard updates (mandatory)
-After every completed task, update the blackboard:
-1. Update `context.json` — set `current_task` to reflect what was done, append to `recent_decisions`
-2. Update `session_metadata.skills_invoked` if a skill was used
-3. Write an experience file to `~/.claude/ftm-state/blackboard/experiences/YYYY-MM-DD_task-slug.json`
-4. Update `~/.claude/ftm-state/blackboard/experiences/index.json` with the new entry
-The experience file should capture:
-- `task_type`, `tags`, `outcome`, `lessons`, `files_touched`, `stakeholders`, `decisions_made`
-Follow the schema and full-file write rules from `blackboard-schema.md`.
+After every completed task:
+1. Update `context.json` — current_task, recent_decisions, session_metadata
+2. Write experience file to `experiences/YYYY-MM-DD_task-slug.json`
+3. Update `experiences/index.json`
+4. Include: task_type, tags, outcome, lessons, files_touched, stakeholders, decisions_made, code_patterns, api_gotchas
 ### 5. Loop
-After acting:
-- if complete, answer and stop
-- if new information appeared, return to Observe
-- if blocked by approval or missing info, ask the user
-- if the simple approach failed, re-orient and escalate one level
+Complete → answer and stop. New info → re-observe. Blocked → ask user. Failed → re-orient, escalate one level.
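
Editorial note: the "User says / Action" table in the 1.7.15 version of this file describes a small command grammar for responding to a checkbox plan. The sketch below shows one way those rows could be interpreted. It is not code from the package: `apply_plan_response`, its return convention (`None` for a cancelled plan), and the choice to raise on unrecognized input are all assumptions made for illustration.

```python
import re


def _numbers(text):
    """Extract the step numbers from a string like '2' or '1, 3'."""
    return {int(n) for n in re.findall(r"\d+", text)}


def apply_plan_response(steps, response):
    """Return the steps to execute after one user response (hypothetical helper).

    Returns None when the plan is cancelled. Unrecognized responses raise
    ValueError so the caller can ask again instead of guessing.
    """
    steps = list(steps)
    text = response.strip()
    lowered = text.lower()

    if lowered in {"approve", "go", "yes", "lgtm"}:
        return steps  # execute all
    if lowered in {"deny", "stop", "cancel"}:
        return None  # cancel entirely

    match = re.fullmatch(r"skip\s+([\d,\s]+)", lowered)
    if match:
        skipped = _numbers(match.group(1))
        return [s for i, s in enumerate(steps, start=1) if i not in skipped]

    match = re.fullmatch(r"only\s+([\d,\s]+)", lowered)
    if match:
        wanted = _numbers(match.group(1))
        return [s for i, s in enumerate(steps, start=1) if i in wanted]

    match = re.fullmatch(r"add:\s*(.+?)\s+after\s+(\d+)", text, flags=re.IGNORECASE)
    if match:
        position = int(match.group(2))
        if not 0 <= position <= len(steps):
            raise ValueError(f"no step {position} to insert after")
        # Inserting into the list renumbers the later steps implicitly.
        return steps[:position] + [match.group(1)] + steps[position:]

    match = re.fullmatch(r"for step\s+(\d+),\s*(.+)", text, flags=re.IGNORECASE)
    if match:
        index = int(match.group(1)) - 1
        if not 0 <= index < len(steps):
            raise ValueError(f"no step {index + 1} to modify")
        steps[index] = match.group(2)  # replace that step's approach
        return steps

    raise ValueError(f"unrecognized plan response: {response!r}")


plan = ["read config", "edit handler", "run tests"]
print(apply_plan_response(plan, "skip 2"))
print(apply_plan_response(plan, "add: run lint after 1"))
```

Raising on unrecognized input rather than defaulting to "approve" matches the intent of the protocol, which requires explicit approval for medium and large tasks.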

package/ftm-mind/references/incidents.md ADDED

@@ -0,0 +1,23 @@
+# Incident Reference
+Named incidents referenced by Orient and Decide-Act protocols. Read this file only when an incident name is cited and you need the full context.
+## Hindsight Incident (March 2026)
+**What happened**: ftm-mind took an SSO setup task and autonomously created Okta groups, added users to production Okta, created Freshservice records, a service catalog item, and modified S3 workflow configs — all without presenting a plan or asking for approval once.
+**Root cause**: No plan-first gate existed. The task "felt small" but touched 5+ external systems.
+**What it taught us**: Any task that calls production APIs is forced-medium. Plans are mandatory. Approval gates are circuit breakers, not suggestions.
+## Braintrust Incident (April 2026)
+**What happened**: Freshservice catalog items #626 and #621 were deleted and recreated as #631 and #632 to "fix" duplicate fields. This broke the S3 workflow config (assign_after_app_owner_approval), required emergency patching, and custom_lookup_bigint fields had to be re-added manually.
+**Root cause**: Three knowledge sources existed (playbook, blackboard, brain.py) and none were consulted. Then, when trial-and-error failed, the model chose a destructive action (delete + recreate) without considering dependencies or asking for approval.
+**What it taught us**:
+1. Always check playbooks before external system operations
+2. Never delete and recreate external resources — IDs are depended on
+3. Compare working references against broken ones instead of guessing
+4. A one-field diff (`requester_can_edit: "true"`) was the entire fix — discoverable in 30 seconds by comparing the working HR Acuity item against the broken ones
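
Editorial note: lesson 3 above ("compare working references against broken ones instead of guessing") amounts to a field-by-field diff of two records. A minimal sketch follows, assuming both records have already been fetched and are available as flat dictionaries. The `diff_fields` helper and the sample record shapes are hypothetical and do not reflect the actual Freshservice API payloads; only the `requester_can_edit` field name comes from the incident text.

```python
MISSING = "<missing>"  # placeholder for a field that one record lacks


def diff_fields(working, broken):
    """Return {field: (working_value, broken_value)} for every differing field."""
    differences = {}
    for key in sorted(set(working) | set(broken)):
        working_value = working.get(key, MISSING)
        broken_value = broken.get(key, MISSING)
        if working_value != broken_value:
            differences[key] = (working_value, broken_value)
    return differences


# Hypothetical field definitions, shaped only for illustration.
working_item = {"label": "Roles", "position": 3, "requester_can_edit": "true"}
broken_item = {"label": "Roles", "position": 3}

for field, (good, bad) in diff_fields(working_item, broken_item).items():
    print(f"{field}: working={good!r} broken={bad!r}")
```

The sketch compares top-level keys only. Nested payloads would need a recursive comparison, but the principle is the same: change only what the diff reveals.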

package/ftm-mind/references/orient-protocol.md CHANGED

@@ -1,348 +1,118 @@
-# Orient Protocol — Full Detail
+# Orient Protocol
 ## Capability Inventory: FTM Skills
-Orient must know all ftm capabilities before deciding whether to route or act directly.
 | Skill | Reach for it when... |
 |---|---|
-| `ftm-brainstorm` | The user is exploring ideas, designing a system, comparing approaches, or needs research-backed planning before build work exists. |
-| `ftm-executor` | The user has a plan doc or clearly wants autonomous implementation across multiple tasks or waves. |
-| `ftm-debug` | The core problem is broken behavior, an error, flaky tests, a crash, regression, race, or "why is this failing?" |
-| `ftm-audit` | The user wants wiring checks, dead code analysis, structural verification, or adversarial code hygiene review. |
-| `ftm-council` | The user wants multiple AI perspectives, debate, second opinions, or multi-model convergence. |
-| `ftm-codex-gate` | The user wants adversarial Codex review, validation, or a correctness stress test from Codex specifically. |
-| `ftm-intent` | The user wants function/module purpose documented or `INTENT.md` updated or reconciled. |
-| `ftm-diagram` | The user wants diagrams, architecture visuals, dependency maps, or Mermaid assets updated. |
-| `ftm-browse` | The task requires a browser, screenshots, DOM inspection, or visual verification. |
-| `ftm-pause` | The user wants to park the session and save resumable state. |
-| `ftm-resume` | The user wants to restore paused context and continue prior work. |
-| `ftm-upgrade` | The user wants ftm skills checked or upgraded. |
-| `ftm-retro` | The user wants a post-run retrospective, lessons learned, or execution review. |
-| `ftm-config` | The user wants ftm settings, model profile, or feature configuration changed. |
-| `ftm-git` | Any git commit or push is about to happen, the user asks to scan for secrets/credentials/API keys, or wants to verify no secrets are hardcoded before sharing code. MUST run before any commit or push operation — this is a mandatory security gate, not optional. |
-| `ftm-capture` | The user just completed a repeatable workflow and wants to save it as a reusable routine + playbook + reference doc. Triggers on "capture this", "save as routine", "codify this", "don't make me explain this again". Also suggest proactively when you detect the user doing something they've done before (matching blackboard experiences with same task_type 2+ times). |
-| `ftm-ops` | The user asks about tasks, capacity, burnout, stakeholders, meetings, incidents, patterns, or daily/weekly summaries. Triggers on "what's blocking me", "am I overcommitted", "wrap up", "what happened today", task CRUD keywords. |
-Routing heuristic:
-- If a task is self-contained and small enough, do it directly.
-- Route to a skill only when the skill's workflow adds clear value.
-- Explicit skill invocation is a strong route signal.
-## MCP Inventory Reference
-Read `~/.claude/skills/ftm-mind/references/mcp-inventory.md` for full MCP server details.
-Orient must know the available MCPs and their contextual triggers.
-| MCP server | Reach for it when... |
+| `ftm-brainstorm` | Exploring ideas, designing systems, comparing approaches, research-backed planning |
+| `ftm-executor` | Has a plan doc or wants autonomous multi-task implementation |
+| `ftm-debug` | Broken behavior, errors, flaky tests, crashes, regressions |
+| `ftm-audit` | Wiring checks, dead code analysis, structural verification |
+| `ftm-council` | Multiple AI perspectives, debate, second opinions |
+| `ftm-codex-gate` | Adversarial Codex review or correctness stress test |
+| `ftm-intent` | Function/module purpose docs or INTENT.md updates |
+| `ftm-diagram` | Diagrams, architecture visuals, Mermaid assets |
+| `ftm-browse` | Browser, screenshots, DOM inspection, visual verification |
+| `ftm-pause` / `ftm-resume` | Park or restore session state |
+| `ftm-upgrade` | Check or upgrade ftm skills |
+| `ftm-retro` | Post-run retrospective or execution review |
+| `ftm-config` | Settings, model profiles, feature configuration |
+| `ftm-git` | MANDATORY before any commit/push — secret scanning gate |
+| `ftm-capture` | Save repeatable workflow as routine/playbook. Also suggest proactively when blackboard shows same task_type 2+ times |
+| `ftm-ops` | Tasks, capacity, burnout, stakeholders, meetings, incidents, daily/weekly summaries |
+Routing: do it directly if small enough. Route to a skill only when the workflow adds clear value. Explicit invocation is a strong signal.
+## MCP Inventory
+Read `references/mcp-inventory.md` for full details. Quick heuristics:
+| Signal | MCP |
 |---|---|
-| `git` | You need repo state, diffs, history, branches, staging, or commits. |
-| `playwright` | You need browser automation, screenshots, UI interaction, console logs, or visual checks. |
-| `sequential-thinking` | The problem genuinely needs multi-step reflective reasoning or trade-off analysis. |
-| `slack` | You need to read Slack context, inspect channels or threads, or send a Slack update. |
-| `gmail` | You need inbox search, email reading, drafting, sending, labels, or filters. |
-| `mcp-atlassian-personal` | Personal Jira or Confluence reads and writes: tickets, sprints, docs, comments, status changes. Default Atlassian account. *(Server names are configurable via `ops.mcp_account_rules` in ftm-config.yml. This table shows defaults.)* |
-| `mcp-atlassian` | Admin-scope Jira or Confluence operations that must run with elevated org credentials. *(Configurable via `ops.mcp_account_rules.admin` in ftm-config.yml.)* |
-| `freshservice-mcp` | IT ticketing, requesters, agent groups, products, or service requests. |
-| `context7` | External library and framework documentation. |
-| `glean_default` | Internal company docs, policies, runbooks, and institutional knowledge. |
-| `apple-doc-mcp` | Apple platform docs for Swift, SwiftUI, UIKit, AppKit, and related APIs. |
-| `lusha` | Contact or company lookup and enrichment. |
-| `google-calendar` | Schedule inspection, free/busy checks, event search, drafting scheduling actions, and calendar changes. |
-### MCP matching heuristics
-Use the smallest relevant MCP set.
-- Jira issue key or Atlassian URL -> `mcp-atlassian-personal` (or the configured personal account name)
-- "internal docs", "runbook", "company wiki", "Glean" -> `glean_default`
-- "how do I use X library" -> `context7`
-- "calendar", "meeting", "free time" -> `google-calendar`
-- "Slack", "channel", "thread", "notify" -> `slack`
-- "email", "Gmail", "draft" -> `gmail`
-- "ticket", "hardware", "access request" -> `freshservice-mcp`
-- "browser", "screenshot", "look at the page" -> `playwright`
-- "talk through trade-offs" -> `sequential-thinking`
-- "SwiftUI" or Apple framework names -> `apple-doc-mcp`
-- "find contact/company" -> `lusha`
-### Multi-MCP chaining
-Detect mixed-domain requests early.
-Examples:
-- "check my calendar and draft a Slack message" -> `google-calendar` + `slack`
-- "read the Jira ticket, inspect the repo, then propose a fix" -> `mcp-atlassian-personal` + `git`
-- "search internal docs, then update a Confluence page" -> `glean_default` + `mcp-atlassian-personal`
-Rules:
-- parallelize reads when safe
-- gather state before proposing writes
-- chain writes sequentially
+| Jira key or Atlassian URL | `mcp-atlassian-personal` |
+| Internal docs, runbook, company wiki | `glean_default` |
+| Library/framework docs | `context7` |
+| Calendar, meeting, free time | `google-calendar` |
+| Slack, channel, thread | `slack` |
+| Email, Gmail, draft | `gmail` |
+| Ticket, hardware, access request | `freshservice-mcp` |
+| Browser, screenshot | `playwright` |
+| Trade-off analysis | `sequential-thinking` |
+| Apple framework | `apple-doc-mcp` |
+| Contact/company lookup | `lusha` |
+Multi-MCP: parallelize reads, gather state before writes, chain writes sequentially.
 ## Session Trajectory
-Do not orient from the last user message alone.
-Look for the arc:
-- What skill or action happened just before this?
-- What did we learn?
-- Is the user moving from ideation -> execution -> validation?
-- Did we already choose an approach that this request assumes?
-Trajectory cues:
-- brainstorm -> "ok go" usually means plan or executor
-- debug -> "check it now" usually means verify, test, or audit
-- executor -> "pause" means checkpoint, not new work
-- resume -> "what's next?" means restore and continue
-If a request branches away from the active thread, note that mentally and avoid corrupting the current session model.
+Look for the arc, not just the last message:
+- What happened just before? What did we learn?
+- brainstorm → "ok go" = plan/executor
+- debug → "check it now" = verify/test/audit
+- executor → "pause" = checkpoint
+- resume → "what's next?" = restore and continue
 ## Codebase State
-Orient must incorporate what is true in the repo right now.
-Check:
-- dirty worktree
-- recent commits
-- active branch
-- user changes in progress
-- whether the request conflicts with local state
-Use codebase state to answer:
-- is this safe to do directly?
-- do we need to avoid stepping on unfinished work?
-- is this request actually about the last commit or current unstaged diff?
-- should we inspect a particular module first because recent changes point there?
-Repo heuristics:
-- uncommitted changes imply continuity and risk
-- a clean tree lowers the cost of direct action
-- a just-landed commit suggests review or regression-check behavior
-- a ticket-linked branch suggests the user expects ticket-driven execution
-## Approval Gates (HARD STOP — NOT OPTIONAL)
-**This section is a circuit breaker, not a suggestion. If you are about to call a tool that creates, updates, or deletes a record in an external system, you MUST stop and get explicit user approval FIRST. No exceptions. No "the user implied it." No "it's part of the plan." STOP and ASK.**
-The reason this exists: in March 2026, ftm-mind took a Hindsight SSO task and autonomously created Okta groups, added users to production Okta, created Freshservice records, created a service catalog item, and modified S3 workflow configs — all without asking once.
-### What requires approval (STOP before each one)
-Every individual external mutation needs its own approval. "The user approved the plan" does not mean "the user approved every API call in the plan."
-- **Okta**: creating apps, groups, assigning users, modifying policies
-- **Freshservice**: creating tickets, records, catalog items, custom objects
-- **Jira / Confluence**: creating or updating issues, pages, comments
-- **Slack / Email**: sending messages (draft-before-send protocol applies)
-- **Calendar**: creating or modifying events
-- **S3 / cloud storage**: writing or modifying objects
-- **Browser forms**: submitting data through playwright/puppeteer
-- **Deploys**: any production-affecting operation
-- **Git remote**: pushes, PR creation
-When multiple mutations are part of one plan, batch the approval request by phase — not one API call at a time, but not "approve the whole plan" either. Group related mutations and present per-phase.
-### Destructive Actions (EXTRA HARD GATE — NEVER WITHOUT EXPLICIT CONFIRMATION)
-Deleting, replacing, or recreating external resources is a **separate, higher gate** than creating or updating them. These actions are often irreversible and break downstream dependencies you can't see.
-**NEVER do any of these without explicit user confirmation for each specific resource being destroyed:**
-- **DELETE any external resource** (catalog items, custom objects, Okta groups/apps, Jira issues, S3 objects)
-- **Recreate (delete + create)** to "fix" something — the new resource gets a different ID, breaking every automation that references the old one
-- **Overwrite S3 objects** that other systems read from
-- **Remove users from groups** or deactivate accounts
-- **Close/resolve tickets** that others may be watching
-**The "delete and recreate" trap**: When you can't update a resource cleanly via API, your instinct will be to delete it and create a fresh one. THIS IS ALMOST ALWAYS WRONG. External resources have IDs that other systems depend on — workflow configs, Lambda triggers, approval chains, custom object lookups, S3 references. Deleting breaks all of them silently. Instead:
-1. Tell the user what you can't update via API
-2. Suggest the minimal manual fix (admin UI link + exact steps)
-3. Only delete if the user explicitly says "yes, delete it, I understand the dependencies"
-**The April 2026 Braintrust incident**: ftm-mind deleted Freshservice catalog items #626 and #621 to "fix" duplicate fields, recreating them as #631 and #632. This broke the S3 workflow config (assign_after_app_owner_approval), required emergency patching, and the custom_lookup_bigint fields had to be re-added manually. The correct fix was: update only the roles field via API, and tell the user to delete the duplicate fields manually in the admin UI.
-### What auto-proceeds (no approval needed)
-- local code edits, documentation updates
-- tests, lint, builds, audits
-- local git operations (branch, commit, inspection)
-- reading from any MCP or API (GET requests)
-- blackboard reads and writes
-- saving drafts to `.ftm-drafts/`
-### The momentum trap
-If you notice yourself thinking any of these, STOP — you are rationalizing past a gate:
+Check: dirty worktree, recent commits, active branch, in-progress changes, conflicts with request. Clean tree = lower cost of direct action. Uncommitted changes = continuity and risk.
-- "The user clearly wants this done, I'll just do it"
-- "This is part of the approved plan"
-- "I already started, might as well finish"
-- "It's just one more API call"
-- "The user will appreciate me being proactive"
-None of these override the gate. Present the action, wait for approval, then execute.
-## Ask-the-User Heuristic
-Ask the user only when one of these is true:
-- two materially different interpretations are both plausible
-- an external-facing action needs approval
-- a required credential, path, or identifier is missing **AND the blackboard has no experience confirming access** (see Blackboard-First Rule below)
-- the user explicitly asked for options before action
-- **the task is medium+ and involves external systems, stakeholder coordination, or unfamiliar code** (see Discovery Interview below) **AND the blackboard doesn't already confirm repo-level access**
-When asking, ask one focused question with concrete choices.
-### Blackboard-First Rule (MANDATORY before any access/auth questions)
-**Before asking ANY question about credentials, API access, authorization, permissions, or "do you have access to X" — check the blackboard first.**
+## Blackboard-First Rule (before any access/auth questions)
+Before asking about credentials, API access, or authorization:
 1. Read `experiences/index.json`
-2. Look for entries tagged with the current repo name, `api-access`, `full-access`, `credentials`, or the system being asked about (e.g., `freshservice`, `okta`, `jira`)
-3. If a matching experience exists with `confidence >= 0.7`:
-   - **Do NOT ask about access.** The user already established this.
-   - **Do NOT run a discovery interview about authorization.** You have the answer.
-   - **Just do the thing.** If the credentials don't work, you'll find out when the API call fails — and that's a better signal than a speculative question.
-4. If no matching experience exists, proceed with asking.
-This rule exists because users set up repo-level context once (e.g., "my-tools repo has full API access to our admin systems") and expect Claude to remember it across every session. Asking "do you have admin access?" when the blackboard already says "yes, full access" is the #1 frustration signal.
-### Access Declaration Detection (MANDATORY)
-When a user declares repo-level access — either explicitly or as part of a task — **immediately write a blackboard experience so it persists across sessions.** Do NOT wait until the task is complete. Write it during Orient, before acting.
-**Detection triggers** (any of these in the user's message):
-- "I have access to...", "I have credentials for...", "I'm authenticated to..."
-- "this repo has access to...", "we have API keys for..."
-- "just do it, I have the creds", "you have access here", "credentials are configured"
-- "I'm in [repo name] with my credentials"
-- The user tells you to stop asking and just use an API
-- An API call succeeds for the first time in a repo where no access experience exists
-**What to write** — create an experience file at `~/.claude/ftm-state/blackboard/experiences/learning-{repo-name}-api-access.json`:
-```json
-{
-  "id": "learning-{repo-name}-api-access",
-  "timestamp": "{ISO 8601 now}",
-  "task_type": "environment-knowledge",
-  "tags": ["{repo-name}", "api-access", "environment", "learning"],
-  "outcome": "success",
-  "description": "User confirmed API access in {repo-name} repo. {any specifics they mentioned — which systems, what kind of access}.",
-  "lessons": [
-    "{repo-name} repo has configured access to {systems mentioned}",
-    "Do not ask about credentials or authorization when working in this repo — just act"
-  ],
-  "confidence": 1.0,
-  "code_patterns": [],
-  "api_gotchas": []
-}
-```
-Also update `experiences/index.json` with the new entry.
+2. Look for tags: current repo name, `api-access`, `full-access`, or the target system
+3. If match exists with confidence ≥ 0.7 → don't ask, just act
+4. No match → proceed with asking
-**On first successful API call:** If you make an API call in a repo and it succeeds, but no access experience exists for this repo, write one automatically. The success IS the proof of access. Tag it with the repo name and the system that worked (e.g., `freshservice`, `okta`).
+## Access Declaration Detection
-**This is not optional.** Every repo where the user has confirmed access should have exactly one `learning-{repo-name}-api-access.json` experience. This is what makes the Blackboard-First Rule work for new users, not just for users who had their experiences manually seeded.
+When user declares repo-level access, **immediately** write a blackboard experience:
-### Discovery Interview (medium+ tasks with external systems)
+**Triggers**: "I have access to...", "credentials are configured", "just do it, I have the creds", user tells you to stop asking, or first successful API call in a repo without an access experience.
-When a task hits forced-medium or higher AND involves external systems, stakeholder coordination, or code you haven't read yet this session, run a brief discovery interview BEFORE generating the plan. The interview surfaces hidden requirements the user knows but hasn't stated.
+**Write**: `experiences/learning-{repo-name}-api-access.json` with tags `["{repo-name}", "api-access", "environment", "learning"]`, confidence 1.0. Update index.
-**Before running the interview, apply the Blackboard-First Rule above.** If the blackboard confirms access and the task is a straightforward API operation (add user, create ticket, update group), skip the interview entirely and just do it. The interview is for tasks with genuine unknowns — stakeholder coordination, multi-system migrations, policy changes — not for "use the Freshservice API to add an agent."
+## Discovery Interview (medium+ with external systems)
-The interview should be 2-4 focused questions:
+**Apply Blackboard-First Rule first.** If blackboard confirms access + task is a direct API operation → skip interview, just do it.
-- Who else needs to know about this change?
-- Are there downstream systems or automations that depend on what's changing?
-- Is there a timeline or dependency on someone else's approval?
-- Should we also draft a message to anyone about this?
-- Are there parts of this you want left alone for now vs. changed?
+Interview is for genuine unknowns only (stakeholder coordination, multi-system migrations, policy changes). 2-4 focused questions:
+- Who else needs to know?
+- Downstream dependencies?
+- Timeline/approval constraints?
+- Parts to leave as-is?
-**When to skip the interview:**
-- The user already provided comprehensive context
-- The task is purely local with no external dependencies
-- The user explicitly says "just do it" or "no questions, go"
-- **The blackboard has an experience confirming API access for this repo + the task is a direct API operation** (not stakeholder coordination or multi-system migration)
+**Skip when**: user provided context, purely local, user said "just do it", or blackboard confirms access for a direct API op.
-## Brain.py Task Loading (Observe Phase)
-During the Orient phase, enrich session context with the user's active operational state by loading tasks via brain.py:
+## Brain.py Task Loading
 ```
 python3 ~/.claude/skills/ftm/bin/brain.py --tasks --task-json
 ```
-Parse the JSON output for active tasks. Surface high-priority or blocking tasks via `TaskCreate` with the task details so they appear in the session task list. This gives ftm-mind awareness of what the user is carrying before deciding on the next move.
-Skip this step if:
-- brain.py is not present or returns an error (fail gracefully, do not block orientation)
-- The session context already contains recently loaded task state (within 15 minutes)
-- The request is purely local with no operational relevance (e.g., pure code edits)
-## Playbook Lookup (MANDATORY before any external system operation)
-**Before executing any operation on an external system (Freshservice, Okta, Jira, Trelica, S3, etc.), check for an existing playbook.** This is not optional. Playbooks encode hard-won lessons — API quirks, encoding requirements, field types that can't be updated, correct endpoint paths. Skipping this step means repeating every mistake the playbook was written to prevent.
-**Step 1: Check brain.py playbooks.**
-```
-python3 ~/.claude/skills/ftm/bin/brain.py --playbook-match "[describe the operation]" --playbook-match-source freshservice
-```
-If a match returns with confidence > 0.2, read the full playbook before proceeding.
-```
-python3 ~/.claude/skills/ftm/bin/brain.py --playbook-list
-```
-Also list all playbooks and scan names — sometimes the match query misses a relevant one.
-**Step 2: Check repo-local playbooks.**
+Load active tasks, surface high-priority via TaskCreate. Skip if brain.py absent, tasks loaded recently (15min), or request is purely local.
-```
-ls docs/playbooks/ 2>/dev/null
-```
+## Playbook Lookup (MANDATORY before external system ops)
-If the current repo has a `docs/playbooks/` directory, scan it for files matching the target system. Read any relevant playbook before writing a single line of code.
+**Before any external system operation, check all three knowledge sources:**
-**Step 3: Check blackboard experiences.**
+1. `brain.py --playbook-match "[operation]"` + `--playbook-list`
+2. `ls docs/playbooks/` in current repo
+3. Blackboard experiences filtered by target system tags — check `code_patterns` and `api_gotchas`
-Read `experiences/index.json` and filter by tags matching the target system. Load matching experience files and check for `code_patterns` and `api_gotchas` fields.
+If any source has relevant content, read it before writing code. After checking, write a marker: `~/.claude/ftm-state/.playbook-checked-{system}` so the ftm-guard hook knows you checked.
-**What playbooks prevent:**
-- Using raw HTML when Freshservice requires entity-encoded HTML (`html.escape()`)
-- Trying to PUT on `service_catalog/items/{internal_id}` when the correct path is `service-catalog/items/{display_id}`
-- Including `custom_lookup_bigint` fields in API updates (they're admin-UI-only)
-- Deleting and recreating resources when an in-place update works
-- Repeating 10+ failed API calls to discover what the playbook already documents
-**The April 2026 Braintrust incident**: A playbook existed (`docs/playbooks/freshservice-service-catalog-item.md`), the blackboard had the lesson ("FS rich text tables require html.escape()"), and a brain.py playbook (`fs-hide-catalog-el`) was available. None were consulted. The result: 15+ failed API attempts, accidental creation of duplicate fields, then destructive deletion of two catalog items breaking S3 workflow automation.
+## Orient Synthesis
-**If no playbook exists** and the operation succeeds after trial-and-error, the auto-playbook hook should trigger. If it doesn't, proactively invoke ftm-capture to save the working pattern.
+Before leaving Orient, have one clear internal picture: what the user wants, task type, session continuity, codebase constraints, relevant lessons/patterns, capability mix, correct task size, whether approval or clarification is needed. Orient is complete when the next move feels obvious.
-## Orient Synthesis
+## Safety Protocols
-Before leaving Orient, silently synthesize all signals into one internal picture:
+**Approval gates, destructive action prevention, compare-before-you-loop, and loop detection are enforced by the `ftm-guard` hook**, which fires on every mutating tool call automatically. You do not need to self-enforce these — the hook will inject warnings if you're about to do something dangerous. But you should still be aware of them:
-- current outcome the user wants
-- current task type
-- session continuity
-- codebase constraints
-- relevant lessons
-- relevant patterns
-- capability mix
-- smallest correct task size
-- whether approval or clarification is needed
+- External mutations need user approval per-phase (not per-call, not whole-plan)
+- Destructive actions (delete, recreate) need per-resource confirmation
+- 3+ failed API calls = stop and compare against a working reference
+- Never trial-and-error; always diff a working resource first
-Orient is complete only when the next move feels obvious.
+See `references/incidents.md` for the full incident history behind these rules.
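
The Blackboard-First Rule in the new version of this file reduces to a tag match plus a confidence threshold. Below is a minimal sketch of that check, not the package's implementation. The function names `has_confirmed_access` and `load_index` are hypothetical, and the layout of `experiences/index.json` (a list of entries, or an object with an `entries` list, each entry carrying `tags` and `confidence`) is an assumption, since the diff does not show the index schema.

```python
import json
from pathlib import Path

# Threshold stated in the rule: a match needs confidence >= 0.7.
CONFIDENCE_THRESHOLD = 0.7


def load_index(path):
    """Load index entries from disk.

    Assumed layout (not confirmed by the diff): either a bare list of
    entries or an object with an "entries" list. A missing file yields [].
    """
    index_path = Path(path)
    if not index_path.exists():
        return []
    data = json.loads(index_path.read_text())
    if isinstance(data, dict):
        return data.get("entries", [])
    return data


def has_confirmed_access(entries, repo_name, target_system):
    """Return True when an entry with a matching tag meets the threshold."""
    wanted = {repo_name, "api-access", "full-access", target_system}
    for entry in entries:
        tags = set(entry.get("tags", []))
        if tags & wanted and entry.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD:
            return True
    return False


if __name__ == "__main__":
    sample = [{"tags": ["my-tools", "api-access"], "confidence": 1.0}]
    # True means "don't ask, just act"; False means "proceed with asking".
    print(has_confirmed_access(sample, "my-tools", "freshservice"))
```

This follows the rule literally: any one of the listed tags counts as a match. A stricter variant could require the repo name tag specifically, so that an `api-access` entry written for a different repo does not suppress the question.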

package/hooks/ftm-guard.sh ADDED

@@ -0,0 +1,121 @@
+#!/bin/sh
+# ftm-guard.sh
+# Safety system for all Claude Code sessions with ftm installed.
+# Fires on PreToolUse for mutating tools. Injects safety context
+# before Claude executes external mutations, destructive actions,
+# or enters trial-and-error loops.
+#
+# Hook: PreToolUse (matcher: Edit|Write|Bash|mcp__*)
+#
+# This is NOT gated on ftm session state — it protects ALL sessions.
+set -eu
+INPUT=$(cat)
+TOOL_NAME=$(echo "$INPUT" | jq -r '.tool_name // ""' 2>/dev/null)
+# Only gate mutating operations
+IS_MUTATING=false
+case "$TOOL_NAME" in
+  Edit|Write) IS_MUTATING=true ;;
+  Bash)
+    # Check if the bash command contains mutating API patterns
+    COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // ""' 2>/dev/null)
+    case "$COMMAND" in
+      *deleteFS*|*delete_custom_object*|*DELETE*|*putFS*|*postFS*) IS_MUTATING=true ;;
+      *"curl -X DELETE"*|*"curl -X PUT"*|*"curl -X POST"*|*"curl -X PATCH"*) IS_MUTATING=true ;;
+      *"requests.delete"*|*"requests.put"*|*"requests.post"*|*"requests.patch"*) IS_MUTATING=true ;;
+    esac
+    ;;
+esac
+# Also catch mutating MCP calls
+case "$TOOL_NAME" in
+  mcp__*create*|mcp__*update*|mcp__*delete*|mcp__*send*|mcp__*add*|mcp__*remove*|mcp__*apply*|mcp__*transition*|mcp__*commit*|mcp__*push*|mcp__*post_message*|mcp__*reply*|mcp__*modify*|mcp__*batch*|mcp__*convert*)
+    IS_MUTATING=true ;;
+esac
+if [ "$IS_MUTATING" != "true" ]; then
+  exit 0
+fi
+# --- Build safety context based on what's about to happen ---
+FTM_STATE="$HOME/.claude/ftm-state"
+CONTEXT_PARTS=""
+# Check 1: Is this a destructive action?
+IS_DESTRUCTIVE=false
+case "$TOOL_NAME" in
+  mcp__*delete*|mcp__*remove*) IS_DESTRUCTIVE=true ;;
+esac
+case "${COMMAND:-}" in
+  *deleteFS*|*delete_custom_object*|*"curl -X DELETE"*|*"requests.delete"*|*DELETE*) IS_DESTRUCTIVE=true ;;
+esac
+if [ "$IS_DESTRUCTIVE" = "true" ]; then
+  CONTEXT_PARTS="$CONTEXT_PARTS [DESTRUCTIVE ACTION GATE] You are about to DELETE an external resource. STOP. Confirm with the user FIRST — name the specific resource being deleted and warn about downstream dependencies (workflow configs, automation references, lookup fields). Never delete-and-recreate to fix something. See ftm-mind/references/incidents.md -> Braintrust Incident."
+fi
+# Check 2: Is there a playbook for this system?
+SYSTEM=""
+case "$TOOL_NAME" in
+  mcp__freshservice*) SYSTEM="freshservice" ;;
+  mcp__mcp-atlassian*) SYSTEM="jira" ;;
+  mcp__slack*) SYSTEM="slack" ;;
+  mcp__gmail*) SYSTEM="gmail" ;;
+esac
+# Also detect from bash commands
+case "${COMMAND:-}" in
+  *freshservice*|*getFS*|*putFS*|*postFS*|*deleteFS*) SYSTEM="freshservice" ;;
+  *okta*|*OktaGroup*|*OktaUser*) SYSTEM="okta" ;;
+esac
+if [ -n "$SYSTEM" ]; then
+  # Check if playbook was consulted this session
+  PLAYBOOK_MARKER="$FTM_STATE/.playbook-checked-$SYSTEM"
+  if [ ! -f "$PLAYBOOK_MARKER" ]; then
+    CONTEXT_PARTS="$CONTEXT_PARTS [PLAYBOOK CHECK] You are calling $SYSTEM APIs. Did you check for playbooks FIRST? Run: brain.py --playbook-match and check docs/playbooks/ and blackboard experiences with code_patterns. If you haven't, STOP and check now. Write to $PLAYBOOK_MARKER after checking."
+  fi
+fi
+# Check 3: Loop detection — count recent failures for this system
+ERROR_TRACKER="$FTM_STATE/.error-tracker.jsonl"
+if [ -f "$ERROR_TRACKER" ] && [ -n "$SYSTEM" ]; then
+  NOW=$(date +%s)
+  RECENT_ERRORS=$(python3 -c "
+import json
+count = 0
+cutoff = $NOW - 600
+for line in open('$ERROR_TRACKER'):
+    line = line.strip()
+    if not line: continue
+    try:
+        ev = json.loads(line)
+        if ev.get('module','').lower().find('$SYSTEM') >= 0 and ev.get('type') == 'error' and ev.get('ts',0) >= cutoff:
+            count += 1
+    except: pass
+print(count)
+" 2>/dev/null || echo "0")
+  if [ "$RECENT_ERRORS" -ge 3 ]; then
+    CONTEXT_PARTS="$CONTEXT_PARTS [LOOP DETECTED] $RECENT_ERRORS recent errors on $SYSTEM in the last 10 minutes. STOP trial-and-error. Find a WORKING reference resource, GET it, diff field-by-field against the broken one, and make targeted changes. The answer is in the diff, not in your next guess."
+  fi
+fi
+# Output combined safety context if any checks triggered
+if [ -n "$CONTEXT_PARTS" ]; then
+  # Escape for JSON
+  ESCAPED=$(echo "$CONTEXT_PARTS" | sed 's/"/\\"/g' | tr '\n' ' ')
+  cat <<JSONEOF
+{
+  "hookSpecificOutput": {
+    "hookEventName": "PreToolUse",
+    "additionalContext": "[ftm-guard]$ESCAPED"
+  }
+}
+JSONEOF
+fi
+exit 0
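
The hook's loop detection (Check 3) and its final JSON output can be sketched as standalone Python. This is an illustration, not the shipped script. The event fields `module`, `type`, and `ts` come from the inline Python in the hook; the function names `count_recent_errors` and `build_hook_output` are hypothetical. Passing the system name and timestamp as arguments avoids interpolating shell variables into Python source. `json.dumps` escapes backslashes and control characters as well as double quotes, whereas the `sed` expression in the hook escapes only double quotes.

```python
import json

WINDOW_SECONDS = 600  # the hook looks back 10 minutes
ERROR_THRESHOLD = 3   # the hook warns at 3 or more recent errors


def count_recent_errors(lines, system, now, window=WINDOW_SECONDS):
    """Count error events for `system` whose timestamp is inside the window."""
    cutoff = now - window
    count = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed lines, as the hook does
        if not isinstance(event, dict):
            continue
        module = str(event.get("module", "")).lower()
        ts = event.get("ts", 0)
        if not isinstance(ts, (int, float)):
            continue
        if system in module and event.get("type") == "error" and ts >= cutoff:
            count += 1
    return count


def build_hook_output(context):
    """Serialize the PreToolUse payload with full JSON string escaping."""
    return json.dumps(
        {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "additionalContext": "[ftm-guard]" + context,
            }
        }
    )


if __name__ == "__main__":
    events = [
        json.dumps({"module": "freshservice", "type": "error", "ts": 990 + i})
        for i in range(3)
    ]
    errors = count_recent_errors(events, "freshservice", now=1000)
    if errors >= ERROR_THRESHOLD:
        print(build_hook_output(f" [LOOP DETECTED] {errors} recent errors"))
```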

package/hooks/settings-template.json CHANGED

@@ -22,6 +22,16 @@
             "timeout": 5
           }
         ]
+      },
+      {
+        "matcher": "Bash|mcp__freshservice-mcp|mcp__mcp-atlassian|mcp__slack|mcp__gmail|mcp__git",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "~/.claude/hooks/ftm-guard.sh",
+            "timeout": 5
+          }
+        ]
       }
     ],
     "UserPromptSubmit": [

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "feed-the-machine",
-  "version": "1.7.13",
+  "version": "1.7.15",
   "description": "A brain upgrade for Claude Code — 26 skills that teach it how to think before acting, remember across conversations, debug like a war room, run plans on autopilot with agent teams, and get second opinions from GPT & Gemini. Plus 15 hooks that automate the boring stuff.",
   "license": "MIT",
   "author": "kkudumu",