feed-the-machine (npm) 1.7.12 → 1.7.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/ftm-mind/references/decide-act-protocol.md +68 -129
package/ftm-mind/references/incidents.md +23 -0
package/ftm-mind/references/orient-protocol.md +103 -296
package/package.json +1 -1

package/ftm-mind/references/decide-act-protocol.md CHANGED

@@ -2,177 +2,116 @@
 ## Decide
-Decide turns the orientation model into one concrete next move.
+### 1. Choose execution mode
-### 1. Choose the smallest correct execution mode
+- `micro` → direct action
+- `small` → pre-flight summary + action + verify
+- `medium` → checkbox plan, wait for approval, execute
+- `large` → `ftm-brainstorm` (no plan) or `ftm-executor` (plan exists)
-- `micro` -> direct action
-- `small` -> pre-flight summary, then direct action plus verification
-- `medium` -> numbered plan, wait for approval, then execute
-- `large` -> `ftm-brainstorm` if no plan exists, or `ftm-executor` if a plan exists
+Double-check forced escalation signals from Complexity Sizing reference. If any fired → medium minimum.
-**Double-check before committing to a size**: Re-read the forced escalation signals from the Complexity Sizing reference. If any forced-medium signals fired, the task is medium regardless of how it feels.
+### 1.5 Plan Approval
-### 1.5 Interactive Plan Approval
+Read `ftm-config.yml` → `execution.approval_mode`.
-Read `~/.claude/ftm-config.yml` field `execution.approval_mode`. This controls whether the user sees and approves the plan before execution begins.
+**`auto`**: micro/small just go, medium outlines + executes, large routes to brainstorm/executor.
-#### Mode: `auto` (default legacy behavior)
-Skip this section entirely. Execute as before — micro/small just go, medium outlines steps and executes, large routes to brainstorm/executor.
+**`plan_first`** (recommended):
+- Small: pre-flight summary, proceed unless user objects
+- Medium/large: present checkbox plan, wait for explicit approval
-#### Mode: `plan_first` (recommended for collaborative work)
+Plan format is **mandatory**: `N. [ ] One-line action → target`. See `protocols/PLAN-APPROVAL.md` for spec + examples.
-**For small tasks**: Show a brief pre-flight summary before executing. Not a formal gate — just visibility:
-```
-Quick summary before I start:
-- Read [file] to understand current behavior
-- Change [X] to [Y] in [file]
-- Verify: [test/lint/manual check]
-Going ahead unless you say otherwise.
-```
-**For medium and large tasks**: Present a numbered task list and wait for the user to approve.
-**Step 0: Discovery Interview (if applicable).** Before generating the plan, check whether a Discovery Interview is needed (see Orient reference). If the task involves external systems, stakeholder coordination, or unfamiliar code, run the interview FIRST.
-**Step 1: Generate the plan.** Build a numbered checkbox list. This format is **mandatory** — no narrative steps, no prose paragraphs. Every plan MUST use: `N. [ ] One-line action → target`. See `references/protocols/PLAN-APPROVAL.md` for the full format spec, examples for code/ops/comms/infra tasks, and the list of NEVER-produce anti-patterns.
+| User says | Action |
+|---|---|
+| approve/go/yes/lgtm | Execute all |
+| skip N | Remove step, execute rest |
+| only N,M | Execute only listed |
+| for step N, [change] | Modify + execute all |
+| add: [desc] after N | Insert, renumber, execute |
+| deny/stop/cancel | Cancel entirely |
-**Step 2: Parse the user's response.**
+Execute sequentially. Show `Step 2/5 done: [summary]` after each. If step fails → stop and report.
-| User says | Action |
-|-----------|--------|
-| `approve`, `go`, `yes`, `lgtm`, `ship it` | Execute all steps in order |
-| `skip N` or `skip N,M` | Remove those steps, execute the rest |
-| `only N,M,P` | Execute only the listed steps in order |
-| `for step N, [instruction]` | Replace step N's approach, then execute all |
-| `add: [description] after N` | Insert a new step, renumber, then execute all |
-| `deny`, `stop`, `cancel`, `no` | Cancel. Do not execute anything. |
-| A longer message with mixed feedback | Parse each instruction. Apply all modifications. Present revised plan and ask for final approval. |
+**`always_ask`**: Same as plan_first but also gates small tasks. Only micro skips.
-**Step 3: Execute the approved plan.** Work through steps sequentially. After each step show: `Step 2/5 done: [summary].` If a step fails, stop and report.
+### 2. Direct vs routed
-**Step 4: Post-execution update.** Update blackboard with decisions and experience.
+Direct when: micro/small, routing overhead adds no value, faster to just do it.
+Skill when: specialized workflow improves result, user invoked it, medium/large.
-#### Mode: `always_ask`
-Same as `plan_first` but applies to **small** tasks too. Only micro tasks skip the approval gate.
+### 3. Supporting MCP reads
-#### Combining with explicit skill routing
-When routing to a skill, plan approval still applies if mode is `plan_first` or `always_ask`. Present the strategy for user control.
+Fetch minimum required external context first (ticket, calendar, docs, browser state).
-### 2. Choose direct vs routed execution
+### 4. Loop decision
-Use direct execution when:
-- the work is micro or small
-- routing overhead adds no value
-- the answer can be delivered faster than a delegated workflow
+If next move reveals new information → plan to re-enter Observe after.
-Use a ftm skill when:
-- its specialized workflow will materially improve the result
-- the user explicitly invoked it
-- the task is medium/large and the skill is the right vehicle
+## Act
-### 3. Choose any supporting MCP reads
+### Pre-Act Checkpoint (HARD GATE)
-If the request depends on external context, fetch the minimum required state first.
+Before executing ANYTHING — Bash, MCP, Write, Edit, API calls:
-Examples:
-- Jira URL -> read the ticket first
-- meeting request -> read calendar first
-- internal policy question -> search Glean first
-- UI bug -> snapshot or inspect browser first
+1. **Checkbox plan presented?** Medium+ tasks require `N. [ ] action → target` format, approved by user. Prose is NOT a plan.
+2. **User approved?** Wait for explicit go/approve/yes.
+3. **Plan marker written?** Write to `~/.claude/ftm-state/.plan-presented` after approval.
+4. **External mutations approved?** Per Approval Gates in orient-protocol.
+5. None apply (micro/small, no forced escalation) → proceed.
-### 4. Decide whether to loop
+| Rationalization | Reality |
+|---|---|
+| "Do as much as you can" = implicit approval | That's the task description, not plan approval |
+| "I know what to do, plan is overhead" | Plan is for the USER |
+| "Just one small API call first" | One becomes five becomes a full unplanned execution |
+| "User seems impatient" | 30-second plan saves 10 minutes of wrong work |
-If the next move will reveal new information, plan to re-enter Observe after the action.
+Applies to ALL execution methods including Bash/curl/python. The plan-gate hook catches Edit/Write/MCP; this checkpoint catches everything else.
-## Act
+### Compare Before You Loop (MANDATORY for external systems)
-Act is clean, decisive execution — but execution of **approved** work only.
+**Never trial-and-error. Always compare first.**
-**HARD GATE — Pre-Act checkpoint**: Before executing ANYTHING (Bash, MCP, Write, Edit, API calls of any kind), verify ALL of these:
+1. **Find working reference** — GET a resource that already works the way you want
+2. **Diff** — compare field-by-field against the broken one. Fix is almost always a small, specific difference
+3. **Targeted change** — change ONLY what the diff revealed. Verify after each change
-1. **Did you present a checkbox plan?** If the task is medium+ (forced escalation signals fired), you MUST have presented a `N. [ ] action → target` plan and received explicit user approval. "I'll do X, Y, Z" in prose is NOT a plan. Listing steps without `[ ]` checkboxes is NOT a plan. If you haven't presented one, STOP and present it now.
-2. **Did the user approve it?** Look for "go", "approve", "yes", "lgtm", or similar. If the user hasn't responded to your plan yet, WAIT. Do not start executing.
-3. **Is the plan marker written?** After approval, write to `~/.claude/ftm-state/.plan-presented` before executing. This signals to hooks that planning happened.
-4. If the task involves external mutations (see Approval Gates), have you presented the specific actions and received approval?
-5. If none of the above apply (micro/small task, no forced escalation), proceed.
+**Loop detection red flags:**
+- 3+ API calls to same system without success
+- Trying different URL formats (underscore vs hyphen, internal vs display ID)
+- Shuffling payload fields hoping one works
+- Reading API docs for endpoint paths (playbook should have this)
-**The rationalization trap**: You will feel the urge to skip the plan because:
-- "The user said 'do as much as you can' — that's implicit approval" → NO. That's the task description, not plan approval.
-- "I know what needs to happen, presenting a plan is just overhead" → NO. The plan is for the USER, not for you.
-- "I'll just start with one small API call to check something" → NO. One call becomes five becomes a full execution without approval.
-- "The user seems impatient" → NO. A 30-second plan saves 10 minutes of unwanted work.
+**On detection:** STOP. Tell user: "Tried N approaches, none worked. Comparing against working reference." Do step 1.
-**This applies to ALL execution methods** — Bash commands, MCP calls, Python scripts, curl, direct API calls. The plan-gate hook catches Edit/Write/MCP, but Bash API calls bypass it. This checkpoint is the only thing that catches those. Do not skip it.
+See `references/incidents.md` → Braintrust Incident for the cost of skipping this.
 ### 1. Direct action
-For micro tasks:
-- do the work
-- summarize what changed
-For small tasks (when `approval_mode` is `plan_first` or `always_ask`):
-- show the pre-flight summary first
-- then do the work
-- verify
-- summarize what changed
+Micro: do + summarize. Small (plan_first/always_ask): pre-flight → do → verify → summarize.
 ### 2. Skill routing
-Before invoking a skill, show one short routing line.
-Examples:
-- `Routing to ftm-debug: this is a flaky failure with real diagnostic uncertainty.`
-- `Routing to ftm-brainstorm: this is still design-stage and benefits from research-backed planning.`
-Then invoke the target skill with the full user input.
+Show one routing line, then invoke: `Routing to ftm-debug: flaky failure with diagnostic uncertainty.`
 ### 3. MCP execution
-Use:
-- parallel reads when safe
-- sequential writes
-- approval gates only for external-facing actions
-### 3.5. Draft-before-send protocol
-When composing Slack messages, emails, or any outbound communication, always save the draft locally before sending.
-**Drafts folder**: `.ftm-drafts/` in the project root (or `~/.claude/ftm-drafts/` if no project context).
+Parallel reads, sequential writes, approval gates for external-facing actions.
-**Ensure the folder exists and is gitignored.** Save every draft before presenting or sending:
+### 3.5 Draft-before-send
-- Filename: `YYYY-MM-DD_HH-MM_<type>_<recipient-or-channel>.md`
-- Content includes frontmatter: type, to, subject (email only), drafted timestamp, status (draft/sent/cancelled)
-**Workflow:**
-1. Compose the message
-2. Save to `.ftm-drafts/`
-3. Present to user for approval
-4. If approved and sent, update `status: sent`
-5. If cancelled or modified, update accordingly
+Slack/email/outbound comms → save to `.ftm-drafts/` first. Filename: `YYYY-MM-DD_HH-MM_<type>_<recipient>.md`. Present for approval, update status on send/cancel.
 ### 4. Blackboard updates (mandatory)
-After every completed task, update the blackboard:
-1. Update `context.json` — set `current_task` to reflect what was done, append to `recent_decisions`
-2. Update `session_metadata.skills_invoked` if a skill was used
-3. Write an experience file to `~/.claude/ftm-state/blackboard/experiences/YYYY-MM-DD_task-slug.json`
-4. Update `~/.claude/ftm-state/blackboard/experiences/index.json` with the new entry
-The experience file should capture:
-- `task_type`, `tags`, `outcome`, `lessons`, `files_touched`, `stakeholders`, `decisions_made`
-Follow the schema and full-file write rules from `blackboard-schema.md`.
+After every completed task:
+1. Update `context.json` — current_task, recent_decisions, session_metadata
+2. Write experience file to `experiences/YYYY-MM-DD_task-slug.json`
+3. Update `experiences/index.json`
+4. Include: task_type, tags, outcome, lessons, files_touched, stakeholders, decisions_made, code_patterns, api_gotchas
 ### 5. Loop
-After acting:
-- if complete, answer and stop
-- if new information appeared, return to Observe
-- if blocked by approval or missing info, ask the user
-- if the simple approach failed, re-orient and escalate one level
+Complete → answer and stop. New info → re-observe. Blocked → ask user. Failed → re-orient, escalate one level.
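
Reviewer note: the response table in the updated Plan Approval section (approve / skip N / only N,M / for step N / add: ... after N / deny) can be read as a small command grammar. The following is a minimal sketch of that reading, not code from the package. The names `apply_response` and `render` are hypothetical, replacing a step's text with the user's instruction is a simplification of "Modify", and running `only N,M` in plan order is an assumption taken from the removed 1.7.12 wording.

```python
import re

# Keywords taken from the 1.7.14 table; matching is case-insensitive.
APPROVE = {"approve", "go", "yes", "lgtm"}
DENY = {"deny", "stop", "cancel"}


def _numbers(text, steps):
    """Extract 1-based step numbers and check they exist in the plan."""
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    for n in numbers:
        if not 1 <= n <= len(steps):
            raise ValueError(f"no step {n} in a {len(steps)}-step plan")
    return numbers


def apply_response(steps, response):
    """Return the steps to execute, or None if the plan is cancelled."""
    text = response.strip()
    lowered = text.lower()
    if lowered in APPROVE:
        return list(steps)
    if lowered in DENY:
        return None
    if lowered.startswith("skip "):
        skipped = set(_numbers(lowered[5:], steps))
        return [s for i, s in enumerate(steps, start=1) if i not in skipped]
    if lowered.startswith("only "):
        # Assumption: listed steps still run in plan order.
        return [steps[n - 1] for n in sorted(set(_numbers(lowered[5:], steps)))]
    match = re.fullmatch(r"for step (\d+),\s*(.+)", text, flags=re.IGNORECASE)
    if match:
        (n,) = _numbers(match.group(1), steps)
        revised = list(steps)
        revised[n - 1] = match.group(2)  # simplification: instruction replaces the step
        return revised
    match = re.fullmatch(r"add:\s*(.+?)\s+after\s+(\d+)", text, flags=re.IGNORECASE)
    if match:
        (n,) = _numbers(match.group(2), steps)
        revised = list(steps)
        revised.insert(n, match.group(1))  # index n is the slot right after step n
        return revised
    raise ValueError(f"unrecognized response: {response!r}")


def render(steps):
    """Format steps in the mandatory `N. [ ] action → target` shape."""
    return [f"{i}. [ ] {step}" for i, step in enumerate(steps, start=1)]
```

Renumbering after `skip` or `add:` falls out of calling `render` on the returned list. Anything the grammar does not recognize raises instead of guessing, which matches the protocol's stance that ambiguity is not approval.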

package/ftm-mind/references/incidents.md ADDED

@@ -0,0 +1,23 @@
+# Incident Reference
+Named incidents referenced by Orient and Decide-Act protocols. Read this file only when an incident name is cited and you need the full context.
+## Hindsight Incident (March 2026)
+**What happened**: ftm-mind took an SSO setup task and autonomously created Okta groups, added users to production Okta, created Freshservice records, a service catalog item, and modified S3 workflow configs — all without presenting a plan or asking for approval once.
+**Root cause**: No plan-first gate existed. The task "felt small" but touched 5+ external systems.
+**What it taught us**: Any task that calls production APIs is forced-medium. Plans are mandatory. Approval gates are circuit breakers, not suggestions.
+## Braintrust Incident (April 2026)
+**What happened**: Freshservice catalog items #626 and #621 were deleted and recreated as #631 and #632 to "fix" duplicate fields. This broke the S3 workflow config (assign_after_app_owner_approval), required emergency patching, and custom_lookup_bigint fields had to be re-added manually.
+**Root cause**: Three knowledge sources existed (playbook, blackboard, brain.py) and none were consulted. Then, when trial-and-error failed, the model chose a destructive action (delete + recreate) without considering dependencies or asking for approval.
+**What it taught us**:
+1. Always check playbooks before external system operations
+2. Never delete and recreate external resources — IDs are depended on
+3. Compare working references against broken ones instead of guessing
+4. A one-field diff (`requester_can_edit: "true"`) was the entire fix — discoverable in 30 seconds by comparing the working HR Acuity item against the broken ones
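
Reviewer note: lesson 3 above, and the "Compare Before You Loop" steps in decide-act-protocol.md, amount to a field-by-field comparison of a working resource against a broken one. A minimal sketch follows, assuming both resources have already been fetched as JSON-like dictionaries. The function name `diff_fields`, the `ignore` parameter, and the sample payloads are hypothetical and do not reflect the real Freshservice schema; only the `requester_can_edit: "true"` value on the working item comes from the incident text.

```python
MISSING = "<missing>"  # placeholder shown when a field is absent on one side


def diff_fields(working, broken, ignore=frozenset(), prefix=""):
    """Return {field_path: (working_value, broken_value)} for differing fields.

    Nested dictionaries are compared recursively and reported with dotted
    paths. Paths listed in `ignore` (identity fields such as id or name)
    are skipped because they are expected to differ.
    """
    differences = {}
    for key in sorted(set(working) | set(broken)):
        path = f"{prefix}{key}"
        if path in ignore:
            continue
        w = working.get(key, MISSING)
        b = broken.get(key, MISSING)
        if isinstance(w, dict) and isinstance(b, dict):
            differences.update(diff_fields(w, b, ignore, prefix=f"{path}."))
        elif w != b:
            differences[path] = (w, b)
    return differences


# Hypothetical payloads for illustration only.
working_item = {
    "id": 1,
    "name": "Working item",
    "requester_can_edit": "true",
    "config": {"visibility": "all"},
}
broken_item = {
    "id": 2,
    "name": "Broken item",
    "config": {"visibility": "all"},
}

for path, (good, bad) in diff_fields(
    working_item, broken_item, ignore={"id", "name"}
).items():
    print(f"{path}: working={good!r} broken={bad!r}")
```

The output of a comparison like this is the candidate list for the "targeted change" step: change only the reported fields, one at a time, and verify after each.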

package/ftm-mind/references/orient-protocol.md CHANGED

@@ -1,348 +1,155 @@
-# Orient Protocol — Full Detail
+# Orient Protocol
 ## Capability Inventory: FTM Skills
-Orient must know all ftm capabilities before deciding whether to route or act directly.
 | Skill | Reach for it when... |
 |---|---|
-| `ftm-brainstorm` | The user is exploring ideas, designing a system, comparing approaches, or needs research-backed planning before build work exists. |
-| `ftm-executor` | The user has a plan doc or clearly wants autonomous implementation across multiple tasks or waves. |
-| `ftm-debug` | The core problem is broken behavior, an error, flaky tests, a crash, regression, race, or "why is this failing?" |
-| `ftm-audit` | The user wants wiring checks, dead code analysis, structural verification, or adversarial code hygiene review. |
-| `ftm-council` | The user wants multiple AI perspectives, debate, second opinions, or multi-model convergence. |
-| `ftm-codex-gate` | The user wants adversarial Codex review, validation, or a correctness stress test from Codex specifically. |
-| `ftm-intent` | The user wants function/module purpose documented or `INTENT.md` updated or reconciled. |
-| `ftm-diagram` | The user wants diagrams, architecture visuals, dependency maps, or Mermaid assets updated. |
-| `ftm-browse` | The task requires a browser, screenshots, DOM inspection, or visual verification. |
-| `ftm-pause` | The user wants to park the session and save resumable state. |
-| `ftm-resume` | The user wants to restore paused context and continue prior work. |
-| `ftm-upgrade` | The user wants ftm skills checked or upgraded. |
-| `ftm-retro` | The user wants a post-run retrospective, lessons learned, or execution review. |
-| `ftm-config` | The user wants ftm settings, model profile, or feature configuration changed. |
-| `ftm-git` | Any git commit or push is about to happen, the user asks to scan for secrets/credentials/API keys, or wants to verify no secrets are hardcoded before sharing code. MUST run before any commit or push operation — this is a mandatory security gate, not optional. |
-| `ftm-capture` | The user just completed a repeatable workflow and wants to save it as a reusable routine + playbook + reference doc. Triggers on "capture this", "save as routine", "codify this", "don't make me explain this again". Also suggest proactively when you detect the user doing something they've done before (matching blackboard experiences with same task_type 2+ times). |
-| `ftm-ops` | The user asks about tasks, capacity, burnout, stakeholders, meetings, incidents, patterns, or daily/weekly summaries. Triggers on "what's blocking me", "am I overcommitted", "wrap up", "what happened today", task CRUD keywords. |
-Routing heuristic:
-- If a task is self-contained and small enough, do it directly.
-- Route to a skill only when the skill's workflow adds clear value.
-- Explicit skill invocation is a strong route signal.
-## MCP Inventory Reference
-Read `~/.claude/skills/ftm-mind/references/mcp-inventory.md` for full MCP server details.
-Orient must know the available MCPs and their contextual triggers.
-| MCP server | Reach for it when... |
+| `ftm-brainstorm` | Exploring ideas, designing systems, comparing approaches, research-backed planning |
+| `ftm-executor` | Has a plan doc or wants autonomous multi-task implementation |
+| `ftm-debug` | Broken behavior, errors, flaky tests, crashes, regressions |
+| `ftm-audit` | Wiring checks, dead code analysis, structural verification |
+| `ftm-council` | Multiple AI perspectives, debate, second opinions |
+| `ftm-codex-gate` | Adversarial Codex review or correctness stress test |
+| `ftm-intent` | Function/module purpose docs or INTENT.md updates |
+| `ftm-diagram` | Diagrams, architecture visuals, Mermaid assets |
+| `ftm-browse` | Browser, screenshots, DOM inspection, visual verification |
+| `ftm-pause` / `ftm-resume` | Park or restore session state |
+| `ftm-upgrade` | Check or upgrade ftm skills |
+| `ftm-retro` | Post-run retrospective or execution review |
+| `ftm-config` | Settings, model profiles, feature configuration |
+| `ftm-git` | MANDATORY before any commit/push — secret scanning gate |
+| `ftm-capture` | Save repeatable workflow as routine/playbook. Also suggest proactively when blackboard shows same task_type 2+ times |
+| `ftm-ops` | Tasks, capacity, burnout, stakeholders, meetings, incidents, daily/weekly summaries |
+Routing: do it directly if small enough. Route to a skill only when the workflow adds clear value. Explicit invocation is a strong signal.
+## MCP Inventory
+Read `references/mcp-inventory.md` for full details. Quick heuristics:
+| Signal | MCP |
 |---|---|
-| `git` | You need repo state, diffs, history, branches, staging, or commits. |
-| `playwright` | You need browser automation, screenshots, UI interaction, console logs, or visual checks. |
-| `sequential-thinking` | The problem genuinely needs multi-step reflective reasoning or trade-off analysis. |
-| `slack` | You need to read Slack context, inspect channels or threads, or send a Slack update. |
-| `gmail` | You need inbox search, email reading, drafting, sending, labels, or filters. |
-| `mcp-atlassian-personal` | Personal Jira or Confluence reads and writes: tickets, sprints, docs, comments, status changes. Default Atlassian account. *(Server names are configurable via `ops.mcp_account_rules` in ftm-config.yml. This table shows defaults.)* |
-| `mcp-atlassian` | Admin-scope Jira or Confluence operations that must run with elevated org credentials. *(Configurable via `ops.mcp_account_rules.admin` in ftm-config.yml.)* |
-| `freshservice-mcp` | IT ticketing, requesters, agent groups, products, or service requests. |
-| `context7` | External library and framework documentation. |
-| `glean_default` | Internal company docs, policies, runbooks, and institutional knowledge. |
-| `apple-doc-mcp` | Apple platform docs for Swift, SwiftUI, UIKit, AppKit, and related APIs. |
-| `lusha` | Contact or company lookup and enrichment. |
-| `google-calendar` | Schedule inspection, free/busy checks, event search, drafting scheduling actions, and calendar changes. |
-### MCP matching heuristics
-Use the smallest relevant MCP set.
-- Jira issue key or Atlassian URL -> `mcp-atlassian-personal` (or the configured personal account name)
-- "internal docs", "runbook", "company wiki", "Glean" -> `glean_default`
-- "how do I use X library" -> `context7`
-- "calendar", "meeting", "free time" -> `google-calendar`
-- "Slack", "channel", "thread", "notify" -> `slack`
-- "email", "Gmail", "draft" -> `gmail`
-- "ticket", "hardware", "access request" -> `freshservice-mcp`
-- "browser", "screenshot", "look at the page" -> `playwright`
-- "talk through trade-offs" -> `sequential-thinking`
-- "SwiftUI" or Apple framework names -> `apple-doc-mcp`
-- "find contact/company" -> `lusha`
-### Multi-MCP chaining
-Detect mixed-domain requests early.
-Examples:
-- "check my calendar and draft a Slack message" -> `google-calendar` + `slack`
-- "read the Jira ticket, inspect the repo, then propose a fix" -> `mcp-atlassian-personal` + `git`
-- "search internal docs, then update a Confluence page" -> `glean_default` + `mcp-atlassian-personal`
-Rules:
-- parallelize reads when safe
-- gather state before proposing writes
-- chain writes sequentially
+| Jira key or Atlassian URL | `mcp-atlassian-personal` |
+| Internal docs, runbook, company wiki | `glean_default` |
+| Library/framework docs | `context7` |
+| Calendar, meeting, free time | `google-calendar` |
+| Slack, channel, thread | `slack` |
+| Email, Gmail, draft | `gmail` |
+| Ticket, hardware, access request | `freshservice-mcp` |
+| Browser, screenshot | `playwright` |
+| Trade-off analysis | `sequential-thinking` |
+| Apple framework | `apple-doc-mcp` |
+| Contact/company lookup | `lusha` |
+Multi-MCP: parallelize reads, gather state before writes, chain writes sequentially.
 ## Session Trajectory
-Do not orient from the last user message alone.
-Look for the arc:
-- What skill or action happened just before this?
-- What did we learn?
-- Is the user moving from ideation -> execution -> validation?
-- Did we already choose an approach that this request assumes?
-Trajectory cues:
-- brainstorm -> "ok go" usually means plan or executor
-- debug -> "check it now" usually means verify, test, or audit
-- executor -> "pause" means checkpoint, not new work
-- resume -> "what's next?" means restore and continue
-If a request branches away from the active thread, note that mentally and avoid corrupting the current session model.
+Look for the arc, not just the last message:
+- What happened just before? What did we learn?
+- brainstorm → "ok go" = plan/executor
+- debug → "check it now" = verify/test/audit
+- executor → "pause" = checkpoint
+- resume → "what's next?" = restore and continue
 ## Codebase State
-Orient must incorporate what is true in the repo right now.
-Check:
-- dirty worktree
-- recent commits
-- active branch
-- user changes in progress
-- whether the request conflicts with local state
-Use codebase state to answer:
-- is this safe to do directly?
-- do we need to avoid stepping on unfinished work?
-- is this request actually about the last commit or current unstaged diff?
-- should we inspect a particular module first because recent changes point there?
-Repo heuristics:
+Check: dirty worktree, recent commits, active branch, in-progress changes, conflicts with request. Clean tree = lower cost of direct action. Uncommitted changes = continuity and risk.
-- uncommitted changes imply continuity and risk
-- a clean tree lowers the cost of direct action
-- a just-landed commit suggests review or regression-check behavior
-- a ticket-linked branch suggests the user expects ticket-driven execution
+## Approval Gates (HARD STOP)
-## Approval Gates (HARD STOP — NOT OPTIONAL)
+**Circuit breaker. External mutations require explicit user approval. No exceptions.**
-**This section is a circuit breaker, not a suggestion. If you are about to call a tool that creates, updates, or deletes a record in an external system, you MUST stop and get explicit user approval FIRST. No exceptions. No "the user implied it." No "it's part of the plan." STOP and ASK.**
+See `references/incidents.md` → Hindsight Incident for why this exists.
-The reason this exists: in March 2026, ftm-mind took a Hindsight SSO task and autonomously created Okta groups, added users to production Okta, created Freshservice records, created a service catalog item, and modified S3 workflow configs — all without asking once.
+### Requires approval (STOP before each)
-### What requires approval (STOP before each one)
+Every individual external mutation. "User approved the plan" ≠ "user approved every API call."
-Every individual external mutation needs its own approval. "The user approved the plan" does not mean "the user approved every API call in the plan."
-- **Okta**: creating apps, groups, assigning users, modifying policies
-- **Freshservice**: creating tickets, records, catalog items, custom objects
-- **Jira / Confluence**: creating or updating issues, pages, comments
-- **Slack / Email**: sending messages (draft-before-send protocol applies)
-- **Calendar**: creating or modifying events
-- **S3 / cloud storage**: writing or modifying objects
-- **Browser forms**: submitting data through playwright/puppeteer
+- **Okta**: create apps/groups, assign users, modify policies
+- **Freshservice**: create tickets/records/catalog items/custom objects
+- **Jira/Confluence**: create/update issues, pages, comments
+- **Slack/Email**: send messages (draft-before-send applies)
+- **Calendar**: create/modify events
+- **S3/cloud**: write/modify objects
+- **Browser forms**: submit data
 - **Deploys**: any production-affecting operation
-- **Git remote**: pushes, PR creation
-When multiple mutations are part of one plan, batch the approval request by phase — not one API call at a time, but not "approve the whole plan" either. Group related mutations and present per-phase.
-### Destructive Actions (EXTRA HARD GATE — NEVER WITHOUT EXPLICIT CONFIRMATION)
-Deleting, replacing, or recreating external resources is a **separate, higher gate** than creating or updating them. These actions are often irreversible and break downstream dependencies you can't see.
-**NEVER do any of these without explicit user confirmation for each specific resource being destroyed:**
-- **DELETE any external resource** (catalog items, custom objects, Okta groups/apps, Jira issues, S3 objects)
-- **Recreate (delete + create)** to "fix" something — the new resource gets a different ID, breaking every automation that references the old one
-- **Overwrite S3 objects** that other systems read from
-- **Remove users from groups** or deactivate accounts
-- **Close/resolve tickets** that others may be watching
-**The "delete and recreate" trap**: When you can't update a resource cleanly via API, your instinct will be to delete it and create a fresh one. THIS IS ALMOST ALWAYS WRONG. External resources have IDs that other systems depend on — workflow configs, Lambda triggers, approval chains, custom object lookups, S3 references. Deleting breaks all of them silently. Instead:
-1. Tell the user what you can't update via API
-2. Suggest the minimal manual fix (admin UI link + exact steps)
-3. Only delete if the user explicitly says "yes, delete it, I understand the dependencies"
+- **Git remote**: push, PR creation
-**The April 2026 Braintrust incident**: ftm-mind deleted Freshservice catalog items #626 and #621 to "fix" duplicate fields, recreating them as #631 and #632. This broke the S3 workflow config (assign_after_app_owner_approval), required emergency patching, and the custom_lookup_bigint fields had to be re-added manually. The correct fix was: update only the roles field via API, and tell the user to delete the duplicate fields manually in the admin UI.
+Batch by phase — not per-call, not whole-plan.
-### What auto-proceeds (no approval needed)
+### Destructive Actions (EXTRA HARD GATE)
-- local code edits, documentation updates
-- tests, lint, builds, audits
-- local git operations (branch, commit, inspection)
-- reading from any MCP or API (GET requests)
-- blackboard reads and writes
-- saving drafts to `.ftm-drafts/`
+**NEVER delete/recreate external resources without per-resource user confirmation.**
-### The momentum trap
+- DELETE any external resource
+- Recreate (delete + create) — new ID breaks all automation referencing the old one
+- Overwrite S3 objects other systems read
+- Remove users from groups, deactivate accounts
+- Close/resolve tickets others watch
-If you notice yourself thinking any of these, STOP — you are rationalizing past a gate:
+When you can't update via API: tell the user, suggest manual fix (admin UI link + steps). Only delete if user explicitly confirms with dependency awareness. See `references/incidents.md` → Braintrust Incident.
-- "The user clearly wants this done, I'll just do it"
-- "This is part of the approved plan"
-- "I already started, might as well finish"
-- "It's just one more API call"
-- "The user will appreciate me being proactive"
+### Auto-proceeds (no approval)
-None of these override the gate. Present the action, wait for approval, then execute.
+Local edits, tests, lint, builds, audits, local git, GET requests, blackboard reads/writes, saving drafts.
-## Ask-the-User Heuristic
+### Rationalization traps
-Ask the user only when one of these is true:
-- two materially different interpretations are both plausible
-- an external-facing action needs approval
-- a required credential, path, or identifier is missing **AND the blackboard has no experience confirming access** (see Blackboard-First Rule below)
-- the user explicitly asked for options before action
-- **the task is medium+ and involves external systems, stakeholder coordination, or unfamiliar code** (see Discovery Interview below) **AND the blackboard doesn't already confirm repo-level access**
-When asking, ask one focused question with concrete choices.
-### Blackboard-First Rule (MANDATORY before any access/auth questions)
+| Thought | Reality |
+|---|---|
+| "The user clearly wants this" | Present the action, wait for approval |
+| "It's part of the approved plan" | Each mutation needs its own gate |
+| "I already started" | Sunk cost. Stop and ask |
+| "Just one more API call" | That's how incidents start |
+| "User will appreciate proactivity" | User will appreciate not breaking things |
-**Before asking ANY question about credentials, API access, authorization, permissions, or "do you have access to X" — check the blackboard first.**
+## Blackboard-First Rule (before any access/auth questions)
+Before asking about credentials, API access, or authorization:
 1. Read `experiences/index.json`
-2. Look for entries tagged with the current repo name, `api-access`, `full-access`, `credentials`, or the system being asked about (e.g., `freshservice`, `okta`, `jira`)
-3. If a matching experience exists with `confidence >= 0.7`:
-   - **Do NOT ask about access.** The user already established this.
-   - **Do NOT run a discovery interview about authorization.** You have the answer.
-   - **Just do the thing.** If the credentials don't work, you'll find out when the API call fails — and that's a better signal than a speculative question.
-4. If no matching experience exists, proceed with asking.
-This rule exists because users set up repo-level context once (e.g., "my-tools repo has full API access to our admin systems") and expect Claude to remember it across every session. Asking "do you have admin access?" when the blackboard already says "yes, full access" is the #1 frustration signal.
-### Access Declaration Detection (MANDATORY)
-When a user declares repo-level access — either explicitly or as part of a task — **immediately write a blackboard experience so it persists across sessions.** Do NOT wait until the task is complete. Write it during Orient, before acting.
-**Detection triggers** (any of these in the user's message):
-- "I have access to...", "I have credentials for...", "I'm authenticated to..."
-- "this repo has access to...", "we have API keys for..."
-- "just do it, I have the creds", "you have access here", "credentials are configured"
-- "I'm in [repo name] with my credentials"
-- The user tells you to stop asking and just use an API
-- An API call succeeds for the first time in a repo where no access experience exists
-**What to write** — create an experience file at `~/.claude/ftm-state/blackboard/experiences/learning-{repo-name}-api-access.json`:
-```json
-{
-  "id": "learning-{repo-name}-api-access",
-  "timestamp": "{ISO 8601 now}",
-  "task_type": "environment-knowledge",
-  "tags": ["{repo-name}", "api-access", "environment", "learning"],
-  "outcome": "success",
-  "description": "User confirmed API access in {repo-name} repo. {any specifics they mentioned — which systems, what kind of access}.",
-  "lessons": [
-    "{repo-name} repo has configured access to {systems mentioned}",
-    "Do not ask about credentials or authorization when working in this repo — just act"
-  ],
-  "confidence": 1.0,
-  "code_patterns": [],
-  "api_gotchas": []
-}
-```
+2. Look for tags: current repo name, `api-access`, `full-access`, or the target system
+3. If match exists with confidence ≥ 0.7 → don't ask, just act
+4. No match → proceed with asking
-Also update `experiences/index.json` with the new entry.
+## Access Declaration Detection
-**On first successful API call:** If you make an API call in a repo and it succeeds, but no access experience exists for this repo, write one automatically. The success IS the proof of access. Tag it with the repo name and the system that worked (e.g., `freshservice`, `okta`).
+When user declares repo-level access, **immediately** write a blackboard experience:
-**This is not optional.** Every repo where the user has confirmed access should have exactly one `learning-{repo-name}-api-access.json` experience. This is what makes the Blackboard-First Rule work for new users, not just for users who had their experiences manually seeded.
+**Triggers**: "I have access to...", "credentials are configured", "just do it, I have the creds", user tells you to stop asking, or first successful API call in a repo without an access experience.
-### Discovery Interview (medium+ tasks with external systems)
+**Write**: `experiences/learning-{repo-name}-api-access.json` with tags `["{repo-name}", "api-access", "environment", "learning"]`, confidence 1.0. Update index.
-When a task hits forced-medium or higher AND involves external systems, stakeholder coordination, or code you haven't read yet this session, run a brief discovery interview BEFORE generating the plan. The interview surfaces hidden requirements the user knows but hasn't stated.
+## Discovery Interview (medium+ with external systems)
-**Before running the interview, apply the Blackboard-First Rule above.** If the blackboard confirms access and the task is a straightforward API operation (add user, create ticket, update group), skip the interview entirely and just do it. The interview is for tasks with genuine unknowns — stakeholder coordination, multi-system migrations, policy changes — not for "use the Freshservice API to add an agent."
+**Apply Blackboard-First Rule first.** If blackboard confirms access + task is a direct API operation → skip interview, just do it.
-The interview should be 2-4 focused questions:
+Interview is for genuine unknowns only (stakeholder coordination, multi-system migrations, policy changes). 2-4 focused questions:
+- Who else needs to know?
+- Downstream dependencies?
+- Timeline/approval constraints?
+- Parts to leave as-is?
-- Who else needs to know about this change?
-- Are there downstream systems or automations that depend on what's changing?
-- Is there a timeline or dependency on someone else's approval?
-- Should we also draft a message to anyone about this?
-- Are there parts of this you want left alone for now vs. changed?
+**Skip when**: user provided context, purely local, user said "just do it", or blackboard confirms access for a direct API op.
-**When to skip the interview:**
-- The user already provided comprehensive context
-- The task is purely local with no external dependencies
-- The user explicitly says "just do it" or "no questions, go"
-- **The blackboard has an experience confirming API access for this repo + the task is a direct API operation** (not stakeholder coordination or multi-system migration)
-## Brain.py Task Loading (Observe Phase)
-During the Orient phase, enrich session context with the user's active operational state by loading tasks via brain.py:
+## Brain.py Task Loading
 ```
 python3 ~/.claude/skills/ftm/bin/brain.py --tasks --task-json
 ```
-Parse the JSON output for active tasks. Surface high-priority or blocking tasks via `TaskCreate` with the task details so they appear in the session task list. This gives ftm-mind awareness of what the user is carrying before deciding on the next move.
-Skip this step if:
-- brain.py is not present or returns an error (fail gracefully, do not block orientation)
-- The session context already contains recently loaded task state (within 15 minutes)
-- The request is purely local with no operational relevance (e.g., pure code edits)
-## Playbook Lookup (MANDATORY before any external system operation)
-**Before executing any operation on an external system (Freshservice, Okta, Jira, Trelica, S3, etc.), check for an existing playbook.** This is not optional. Playbooks encode hard-won lessons — API quirks, encoding requirements, field types that can't be updated, correct endpoint paths. Skipping this step means repeating every mistake the playbook was written to prevent.
-**Step 1: Check brain.py playbooks.**
-```
-python3 ~/.claude/skills/ftm/bin/brain.py --playbook-match "[describe the operation]" --playbook-match-source freshservice
-```
-If a match returns with confidence > 0.2, read the full playbook before proceeding.
-```
-python3 ~/.claude/skills/ftm/bin/brain.py --playbook-list
-```
-Also list all playbooks and scan names — sometimes the match query misses a relevant one.
+Load active tasks, surface high-priority via TaskCreate. Skip if brain.py absent, tasks loaded recently (15min), or request is purely local.
-**Step 2: Check repo-local playbooks.**
-```
-ls docs/playbooks/ 2>/dev/null
-```
+## Playbook Lookup (MANDATORY before external system ops)
-If the current repo has a `docs/playbooks/` directory, scan it for files matching the target system. Read any relevant playbook before writing a single line of code.
+**Before any external system operation, check all three knowledge sources:**
-**Step 3: Check blackboard experiences.**
+1. `brain.py --playbook-match "[operation]"` + `--playbook-list`
+2. `ls docs/playbooks/` in current repo
+3. Blackboard experiences filtered by target system tags — check `code_patterns` and `api_gotchas`
-Read `experiences/index.json` and filter by tags matching the target system. Load matching experience files and check for `code_patterns` and `api_gotchas` fields.
-**What playbooks prevent:**
-- Using raw HTML when Freshservice requires entity-encoded HTML (`html.escape()`)
-- Trying to PUT on `service_catalog/items/{internal_id}` when the correct path is `service-catalog/items/{display_id}`
-- Including `custom_lookup_bigint` fields in API updates (they're admin-UI-only)
-- Deleting and recreating resources when an in-place update works
-- Repeating 10+ failed API calls to discover what the playbook already documents
-**The April 2026 Braintrust incident**: A playbook existed (`docs/playbooks/freshservice-service-catalog-item.md`), the blackboard had the lesson ("FS rich text tables require html.escape()"), and a brain.py playbook (`fs-hide-catalog-el`) was available. None were consulted. The result: 15+ failed API attempts, accidental creation of duplicate fields, then destructive deletion of two catalog items breaking S3 workflow automation.
-**If no playbook exists** and the operation succeeds after trial-and-error, the auto-playbook hook should trigger. If it doesn't, proactively invoke ftm-capture to save the working pattern.
+If any source has relevant content, read it before writing code. See `references/incidents.md` → Braintrust Incident for what happens when you skip this.
 ## Orient Synthesis
-Before leaving Orient, silently synthesize all signals into one internal picture:
-- current outcome the user wants
-- current task type
-- session continuity
-- codebase constraints
-- relevant lessons
-- relevant patterns
-- capability mix
-- smallest correct task size
-- whether approval or clarification is needed
-Orient is complete only when the next move feels obvious.
+Before leaving Orient, have one clear internal picture: what the user wants, task type, session continuity, codebase constraints, relevant lessons/patterns, capability mix, correct task size, whether approval or clarification is needed. Orient is complete when the next move feels obvious.
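
The Blackboard-First Rule and the Access Declaration write in the 1.7.14 text can be sketched as below. This is a minimal illustration, not the package's implementation: it assumes `experiences/index.json` is a JSON list of entries carrying `tags` and `confidence`, which the diff does not confirm. The function names `load_index`, `has_confirmed_access`, and `build_access_experience` are hypothetical. The record fields mirror the JSON template shown in the removed 1.7.12 lines.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

CONFIDENCE_THRESHOLD = 0.7  # threshold stated in the Blackboard-First Rule


def load_index(path):
    """Load the experience index; return an empty list if the file is absent.

    Assumption: the index is a JSON list of entry objects.
    """
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text(encoding="utf-8"))


def has_confirmed_access(index_entries, repo_name, target_system):
    """Return True when a tagged entry meets the confidence threshold.

    Tags checked follow the rule text: current repo name, `api-access`,
    `full-access`, or the target system.
    """
    wanted = {repo_name, "api-access", "full-access", target_system}
    for entry in index_entries:
        tags = set(entry.get("tags", []))
        confident = entry.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD
        if tags & wanted and confident:
            return True  # match found: don't ask, just act
    return False  # no match: proceed with asking


def build_access_experience(repo_name, systems):
    """Build the record for `learning-{repo-name}-api-access.json`."""
    return {
        "id": f"learning-{repo_name}-api-access",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "task_type": "environment-knowledge",
        "tags": [repo_name, "api-access", "environment", "learning"],
        "outcome": "success",
        "description": f"User confirmed API access in {repo_name} repo.",
        "lessons": [
            f"{repo_name} repo has configured access to {', '.join(systems)}"
        ],
        "confidence": 1.0,
        "code_patterns": [],
        "api_gotchas": [],
    }
```

Read literally, the rule matches on any one of the listed tags, so an `api-access` entry written for one repo would also suppress the question in another. A stricter variant could require the repo-name tag as well; the diff does not say which reading is intended.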

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "feed-the-machine",
-  "version": "1.7.12",
+  "version": "1.7.14",
   "description": "A brain upgrade for Claude Code — 26 skills that teach it how to think before acting, remember across conversations, debug like a war room, run plans on autopilot with agent teams, and get second opinions from GPT & Gemini. Plus 15 hooks that automate the boring stuff.",
   "license": "MIT",
   "author": "kkudumu",