npm - knit-mcp - Versions diffs - 0.9.0 → 0.11.2 - Mend

knit-mcp 0.9.0 → 0.11.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23) hide show

package/README.md +368 -215
package/dist/cache-3LPETDUT.js +19 -0
package/dist/{chunk-XFS2XGZI.js → chunk-27TA2ZQZ.js} +24 -0
package/dist/{chunk-KLNUEE3O.js → chunk-7UFS67HP.js} +9 -1
package/dist/{chunk-4K4FHOKE.js → chunk-HROSQ5MS.js} +256 -344
package/dist/{chunk-5NCSZYRJ.js → chunk-LV73YTVN.js} +5 -0
package/dist/{chunk-B2KZ5UR7.js → chunk-ORKWLA33.js} +1 -1
package/dist/{chunk-BU3VHX3W.js → chunk-RZOVZYTF.js} +12 -4
package/dist/{chunk-7PPC6IG6.js → chunk-ST4X7LZT.js} +60 -2
package/dist/chunk-TWHNYJAJ.js +328 -0
package/dist/{chunk-FLNV2IQC.js → chunk-VB2TIR6L.js} +2 -2
package/dist/{chunk-BAUQEFYY.js → chunk-WKQHCLLO.js} +45 -10
package/dist/cli.js +18 -9
package/dist/doctor-4DN2P2JR.js +179 -0
package/dist/{export-I5Y26WUL.js → export-CGSEUYZA.js} +3 -3
package/dist/{install-agents-D2KJQUH3.js → install-agents-OBDCWCPB.js} +8 -7
package/dist/{instructions-4FI32YZU.js → instructions-JARSXQPO.js} +1 -1
package/dist/{integration-scanner-PS47AHGM.js → integration-scanner-LBD2PIZ3.js} +3 -3
package/dist/{refresh-BXN32CNA.js → refresh-SMJ2NGIW.js} +3 -3
package/dist/{status-RQWRIM2Y.js → status-VJDB75X2.js} +2 -2
package/dist/{tools-EISDGPS5.js → tools-MIROTK2A.js} +1146 -91
package/package.json +2 -1
package/dist/cache-JSN6ETUF.js +0 -18

package/README.md CHANGED Viewed

@@ -1,73 +1,64 @@
 <p align="center">
   <a href="https://www.npmjs.com/package/knit-mcp"><img src="https://img.shields.io/npm/v/knit-mcp?style=for-the-badge&color=7c3aed&label=npm" alt="npm version" /></a>
-  <img src="https://img.shields.io/badge/license-MIT-blue?style=for-the-badge" alt="license" />
-  <a href="https://github.com/PDgit12/knit/actions/workflows/ci.yml"><img src="https://github.com/PDgit12/knit/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
-  <img src="https://img.shields.io/badge/node-%3E%3D18-339933?style=for-the-badge&logo=node.js&logoColor=white" alt="node" />
+  <a href="https://github.com/PDgit12/knit/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/PDgit12/knit/ci.yml?style=for-the-badge&label=CI&color=10b981" alt="CI" /></a>
+  <img src="https://img.shields.io/badge/license-MIT-3b82f6?style=for-the-badge" alt="license" />
+  <img src="https://img.shields.io/badge/node-%E2%89%A518-339933?style=for-the-badge&logo=node.js&logoColor=white" alt="node" />
+  <img src="https://img.shields.io/badge/tests-664%20passing-22c55e?style=for-the-badge" alt="tests" />
+  <img src="https://img.shields.io/badge/MCP%20tools-53-7c3aed?style=for-the-badge" alt="tools" />
 </p>
-<h1 align="center">knit</h1>
+<h1 align="center">🧶 knit</h1>
 <p align="center">
-  <strong>An intelligent command layer for Claude Code.</strong>
-  <br/>
-  Project-scoped memory, on-demand workflow protocol, parallel team worktrees,<br/>
-  and honest token accounting — all in one MCP server.
+  <strong>An intelligent command layer for Claude Code.</strong><br/>
+  Project-scoped memory · on-demand workflow · parallel team worktrees · honest token accounting.<br/>
+  <em>All in one MCP server.</em>
 </p>
-<br/>
-## What knit is
-Knit makes Claude Code do the right thing automatically because it can't predict how a user will phrase a request. It does three jobs at once:
-- **Memory** — every project keeps a brain at `~/.knit/projects/<hash>/`. Sessions compound: learnings, false positives, session summaries, and a static-analysis import graph are all queryable next session.
-- **Tokens** — `CLAUDE.md` is ~100 lines (project facts only). Workflow protocol is fetched on demand via `knit_get_workflow(phase)`. Knit is net-negative on context cost.
-- **Workflow** — a 4-tier classification (Inquiry / Trivial / Standard / Complex) with phase-triggered plan mode, quality-gated `LEARN`, and team-scoped git worktrees so parallel agents don't step on each other.
-It's a **single product**, not three. Every design choice has to win on memory + tokens + workflow together.
+<p align="center">
+  <a href="#-quick-start">Quick start</a> ·
+  <a href="#-what-knit-is">What it is</a> ·
+  <a href="#-whats-new-in-v0110">v0.11</a> ·
+  <a href="#-52-mcp-tools">Tools</a> ·
+  <a href="#-how-its-different">Comparison</a> ·
+  <a href="#-honest-comparison-vs-memory-libraries">vs mem0/Letta</a>
+</p>
-## What's new in v0.7.0
+---
-- **Universal protocol injection.** Knit sets the MCP server-level `instructions` field, so every MCP client (Claude Code, Cursor, Codex) sees Knit's flow at session start — *before* tool descriptions. Session 1 follows the protocol instead of stumbling onto it.
-- **Tier-gated tool surface.** 38 tools split into three tiers: Tier 1 (26 universal — memory, knowledge graph, workflow, classification, false-positive suppression, reflection, Protocol Guard config, diagnostics) is always exposed. Tier 2 (team worktrees, subagent installer) auto-activates when the project shape matches (≥3 detected domains, `.claude/agents/` exists) or via explicit opt-in. Tier 3 (admin/setup) is opt-in only. Solo-domain projects no longer see 9 team-worktree tools cluttering their decision space.
-- **`knit_list_features`** is the discoverability escape hatch — always available, always tells you what's hidden and exactly how to enable it (`knit_enable_feature({feature: "teams" | "subagents" | "admin"})`). Persisted to `~/.knit/projects/<hash>/features.json` so the choice survives sessions.
-- **Inquiry tier in the classifier.** Read-only "what / where / audit / explain" tasks now route to `tier: "inquiry"` with no plan mode and no phases — fixes a long-standing over-routing bug where audit-style questions hijacked Complex.
-- **CLAUDE.md cut ~88%** (16.7 KB → ~2 KB on typical projects). The per-turn context tax dropped sharply; all project-specific content (header, project map, domain architecture, build gates, false positives) stays intact.
-- **Lazy / minimal response modes.** `knit_load_session` returns the lean core by default; opt into more via `include=patterns,teams,metrics,recent_sessions,full_learnings,full_knowledge,all`. `knit_classify_task` returns the minimal shape by default; pass `verbose=true` for the diagnostic fields.
-- **Legacy CLAUDE.md migration.** Users upgrading from v0.5.x with `<!-- engram:start -->/<!-- engram:end -->` markers are auto-migrated — the legacy block is replaced cleanly with the new lean block instead of leaving an orphan.
+## 🧠 What knit is
-### Per-session token-budget table
+Knit makes Claude Code do the right thing automatically — because you can't predict how a user will phrase a request. It does three jobs at once:
-| Surface | v0.6.5 | v0.7.0 | Cut |
-|---|---|---|---|
-| CLAUDE.md per-turn | ~16.7 KB | ~2 KB | 88% |
-| Tool registry (typical project) | ~6–8 KB | ~3–4 KB | ~50% |
-| `knit_classify_task` response | ~500 tok | ~150 tok | 70% |
-| `knit_load_session` response | ~3–5 KB | ~1.5 KB | ~60% |
+| | |
+|---|---|
+| 🧠 **Memory** | Every project keeps a brain at `~/.knit/projects/<hash>/`. Sessions compound: learnings, false positives, session summaries, and a static-analysis import graph are all queryable next session. |
+| 🪶 **Tokens** | `CLAUDE.md` is ~2 KB (project facts only). Protocol depth is fetched on demand via `knit_get_workflow(phase)`. Knit is **net-negative** on context cost. |
+| 🛠️ **Workflow** | A 4-tier classification (Inquiry / Trivial / Standard / Complex) with phase-triggered plan mode, quality-gated `LEARN`, and team-scoped git worktrees so parallel agents don't step on each other. |
-### Upgrade note
+It's a **single product**, not three. Every design choice has to win on memory + tokens + workflow together.
-After running `npx knit-mcp@latest setup` (or just updating the version pin), **restart Claude Code**. The MCP server's `instructions` field and tier-gated `tools/list` only flow into the system prompt at handshake — the cached process from before the upgrade keeps the v0.6.5 behavior until restart.
+---
-## Setup (one time)
+## 🚀 Quick start
 ```bash
 npx knit-mcp@latest setup
 ```
-Adds the Knit MCP server to your Claude Code config (`~/.claude.json`). No per-project setup. Open Claude Code in any project and the first MCP tool call auto-initializes everything.
+Adds the Knit MCP server to your Claude Code config (`~/.claude.json`). **No per-project setup.** Open Claude Code in any project — the first MCP tool call auto-initializes the brain, hooks, and per-project CLAUDE.md block.
-**Supported shells:** macOS, Linux, WSL, Git Bash, and Windows PowerShell. The generated hooks use POSIX-style single-quoted `node -e '…'` payloads. Windows `cmd.exe` does not treat single quotes as delimiters and is not supported as the hook-runner shell — on Windows, use PowerShell (default in modern Windows Terminal) or Git Bash. If you hit a hook error on Windows, file an issue with the shell you're using.
+> **Supported shells:** macOS, Linux, WSL, Git Bash, PowerShell. Windows `cmd.exe` is not supported as the hook-runner shell — use PowerShell (default in modern Windows Terminal) or Git Bash.
-### Quiet mode (no hook enforcement)
+### Quiet mode
-Knit ships Protocol Guard in `warn` mode by default — hooks print reminders, they never block. If you want it fully silent (no PreToolUse classification gate, no reminder messages), run this once inside Claude Code:
+Knit ships **Protocol Guard in `warn` mode by default** — hooks print reminders, they never block. Fully silent:
-> `knit_set_protocol_strictness({ level: "off" })`
-The other hooks (LEARN compliance, KB metrics, final build verification) stay as observability nudges — they print, they don't gate. To remove them too, see Uninstall below.
+```js
+knit_set_protocol_strictness({ level: "off" })
+```
-### Uninstall
+### Uninstall in 30 seconds
 ```bash
 rm -rf ~/.knit                                 # all per-project + global memory
@@ -76,13 +67,173 @@ rm -rf ~/.knit                                 # all per-project + global memory
 Then:
 1. Remove `"knit-brain"` from `mcpServers` in `~/.claude.json`
 2. Delete the `<!-- knit:start --> ... <!-- knit:end -->` block from each project's `CLAUDE.md`
-3. Remove `_knitOwned` entries from each project's `.claude/settings.local.json` (or delete the file if Knit was the only thing in it)
+3. Remove `_knitOwned` entries from each project's `.claude/settings.local.json`
+Knit writes nowhere else on your machine.
+---
+## ✨ What's new in v0.9.0
+v0.9 closes the **enforcement story** — every honest limit from the v0.8 architecture got a structural fix.
+### Anti-hallucination
+- 📎 **Citation rule in the MCP `instructions` field.** Every session's system prompt now tells the agent: *"when you state a fact about this codebase, cite the Knit tool result that verified it — e.g. (per `knit_query_imports`). If you can't cite, say 'unverified' explicitly."* Makes hallucinations visible at the **claim level**.
+- 🔍 **`knit_verify_claim` tool.** Single-call fact-check against the knowledge graph. Parses *"A imports B"*, *"X exports Y"*, *"A is tested by B"*, *"X exists"* and returns `verified | contradicted | unparseable` with evidence.
-Total time: ~30 seconds per project. Knit doesn't write anywhere else on your machine.
+### Smarter retrieval
-## How data is stored
+- ⚡ **Auto-search inside `knit_classify_task`.** For `standard` / `complex` tier, classify now runs BM25 over (description + affected domains) automatically and embeds top-3 hits as `pre_emptive_learnings`. Closes the *"agent skipped `knit_search_learnings` before re-investigating"* gap with **zero extra calls**.
+- 📚 **`suggested_reads` from `knit_build_context`.** Curated list of files worth opening *before* editing — three signals: graph-importers (blast radius), graph-imports (likely needed), memory-mentions (files referenced by past learnings). Each entry carries `{ path, reason, via }`.
+- 🪜 **`knit_get_learning` — hierarchical retrieval.** Search returns headlines (summary + tags); the agent expands a specific learning by id only when needed. **Pay-per-detail.**
+- 🧮 **`knit_consolidate_learnings`.** Tag-Jaccard clustering of similar learnings → one pattern entry per cluster. Dry-run by default; `commit=true` persists with originals tagged `#consolidated` (preserved but deprioritized).
-Knit data is centralized — not in every repo's working tree:
+### Hook-level enforcement (`HOOKS_VERSION` 6 → 7)
+| Hook | What it does |
+|---|---|
+| **PreToolUse search-gate** | For `standard`/`complex` tasks, blocks Edit/Write (in `block` mode) or warns (default `warn`) when `knit_search_learnings` hasn't fired in the current turn. |
+| **PreToolUse content inspection** | Reads proposed Edit/Write content, parses local imports, warns on relative paths that don't resolve on disk — **catches hallucinated imports before they land**. |
+| **PostToolUse import validation** | After the file lands, re-parses imports and warns about unresolved relative paths — catches anything that slipped past the pre-check. |
+| **Stop-hook budget watch** | Cheap CLAUDE.md size check at session end; warns if it crosses the 12.5 KB over-budget threshold. Drift becomes visible even when the agent doesn't call `knit_brain_status`. |
+> **Upgrade note.** After `npx knit-mcp@latest setup`, **restart Claude Code**. The `instructions` field and tier-gated `tools/list` only flow into the system prompt at handshake. The `HOOKS_VERSION` bump auto-regenerates installed hooks on the next brain load — no manual `knit refresh` needed.
+---
+## 📉 Token budget — measured, not vibes
+| Surface | v0.6.5 | v0.9.0 | Cut |
+|---|---|---|---|
+| CLAUDE.md per-turn | ~16.7 KB | **~2 KB** | **88%** |
+| Tool registry (typical project) | ~6–8 KB | **~3–4 KB** | ~50% |
+| `knit_classify_task` response | ~500 tok | **~150 tok** | 70% |
+| `knit_load_session` response | ~3–5 KB | **~1.5 KB** | ~60% |
+Each surface gets a `healthy | warn | over-budget` verdict from `knit_brain_status.token_budget`. **Drift is a regression test, not a vibes claim.**
+---
+## 🛠️ 43 MCP Tools
+<details open>
+<summary><strong>🕸️ Knowledge graph</strong> <em>(Tier 1, ~5ms)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_query_imports` | Reverse dependencies — who imports this file. |
+| `knit_query_dependents` | Forward dependencies — what this file imports. |
+| `knit_query_exports` | Functions / classes / interfaces / types this file exposes. |
+| `knit_query_tests` | Test coverage for a file, or list all untested with `filter=untested`. |
+| `knit_find_fanout` | High-fanout files — the contracts to change carefully. |
+| `knit_verify_claim` | **v0.9.** Fact-check one claim against the graph — `verified \| contradicted \| unparseable` with evidence. |
+</details>
+<details open>
+<summary><strong>📚 Memory + retrieval</strong> <em>(Tier 1)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_load_session` | Call at session start — returns handoff, top learnings, false positives, project knowledge. Lazy by default; opt in via `include=patterns,teams,metrics,recent_sessions,full_learnings,full_knowledge,all`. |
+| `knit_search_learnings` | **v0.8+.** BM25 + import-graph hybrid. `query=text` for BM25, `domains=#tag` for tag filter, `files=src/a.ts` for graph-neighborhood boost. Fused via RRF (k=60). |
+| `knit_search_sessions` | BM25 over session summaries + branch + commits + tags. Branch-diversified (max 2 per branch) — one feature branch can't flood. |
+| `knit_search_global_learnings` | BM25 across the cross-project pool at `~/.knit/global/learnings.jsonl`. |
+| `knit_get_learning` | **v0.9.** Fetch one full learning by id. Pair with search (headlines) for hierarchical retrieval. |
+| `knit_record_learning` | Save a non-obvious insight. Quality check first; secret patterns redacted before persistence. |
+| `knit_record_global_learning` | Opt-in: cross-project pool when the insight generalizes beyond this project. |
+| `knit_record_false_positive` | Mark a finding as confirmed non-issue so future reviewers don't re-flag it. |
+| `knit_get_false_positives` | List confirmed non-issues to suppress in review prompts. |
+| `knit_save_session_summary` | Opt-in narrative — record only when this session accomplished something a future session would search for. |
+| `knit_save_handoff` | Save state when context degrades. `failed_attempts` is the load-bearing field. |
+| `knit_consolidate_learnings` | **v0.9.** Cluster similar learnings via tag-Jaccard → one pattern entry per cluster. Dry-run by default. |
+</details>
+<details>
+<summary><strong>🧭 Workflow + classification</strong> <em>(Tier 1)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_classify_task` | First call on every task. Returns tier (`inquiry \| trivial \| standard \| complex`), phases, `auto_plan_mode`. **v0.9.** For standard/complex, auto-runs BM25 and embeds top-3 hits as `pre_emptive_learnings`. |
+| `knit_build_context` | Domain context for the current task. **v0.9.** Includes `suggested_reads` — files worth opening (graph-importers, graph-imports, memory-mentions). |
+| `knit_get_workflow` | Fetch protocol depth for one phase on demand. Sections: `overview, tier, phases, research, ideate, plan, execute, optimize, review, tdd, learn, handoff, ship, tools`. |
+| `knit_get_suggestions` | Adaptive warnings from past patterns in given domains. |
+| `knit_reflect` | Detect patterns across recorded learnings (per-project + global pool). Useful with ≥3 entries. |
+| `knit_setup_project` | Describe a non-code project (legal, marketing, research) to bootstrap domain teams. |
+| `knit_prune_sessions` | Prune `sessions.jsonl` by age (default 90 days). Atomic rewrite. |
+</details>
+<details>
+<summary><strong>🛡️ Protocol Guard</strong> <em>(Tier 1)</em></summary>
+Runtime enforcement of the Knit protocol via PreToolUse and SessionStart hooks. Default strictness: `warn`.
+| Tool | What it does |
+|---|---|
+| `knit_set_protocol_strictness` | Set strictness: `off` (no checks), `warn` (reminder, default), `block` (hard-fail Edit/Write without prior `knit_classify_task` AND `knit_search_learnings`). |
+| `knit_get_protocol_strictness` | Read current strictness level. |
+</details>
+<details>
+<summary><strong>📊 Discoverability + diagnostics</strong> <em>(Tier 1)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_brain_status` | Brain health + **token-budget** verdicts per surface + `update_available` notification + integrations summary. |
+| `knit_list_features` | Surfaces hidden tools and tells you how to enable them. The escape hatch. |
+| `knit_enable_feature` | Flip on a Tier-2/3 feature (`teams`, `subagents`, `admin`). Emits `notifications/tools/list_changed` — new tools appear without a Claude Code restart. |
+| `knit_disable_feature` | Symmetric to enable. |
+| `knit_scan_integrations` | Re-detect existing workflow frameworks (Ruflo, gstack, CodeTour, Conductor, other MCP servers, custom CLAUDE.md sections). |
+| `knit_compounding_metrics` | Quantifies *"Knit gets cheaper over time"* — sessions, cache hits, reuse-ratio %, estimated tokens saved. Verdict: `cold \| warming \| compounding \| strong`. |
+</details>
+<details>
+<summary><strong>👥 Parallel team worktrees</strong> <em>(Tier 2 — auto-active with ≥3 domains)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_spawn_team_worktree` | Create a git worktree for a team so they can write in parallel without colliding. |
+| `knit_list_team_worktrees` | List active team worktrees. |
+| `knit_finalize_team_worktree` | Merge or discard a team's worktree; surfaces conflicts without destroying it. |
+| `knit_get_teams` | List auto-detected or custom teams. |
+| `knit_define_team` | Create a custom team. |
+| `knit_start_team_review` | Start a parallel review with a shared findings board. |
+| `knit_get_team_prompt` | Per-team prompt including other teams' findings. |
+| `knit_post_team_findings` | Post findings to the shared board. |
+| `knit_get_board_summary` | Cross-team summary, severity-gated. |
+</details>
+<details>
+<summary><strong>🤖 Subagents</strong> <em>(Tier 2 — auto-active when `.claude/agents/` exists)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_install_agent` | Install a VoltAgent subagent (e.g. `typescript-pro`) into `.claude/agents/knit-<name>.md`, personalized with project context. |
+</details>
+<details>
+<summary><strong>⚙️ Admin</strong> <em>(Tier 3 — opt-in only)</em></summary>
+| Tool | What it does |
+|---|---|
+| `knit_setup_project` | Bootstrap domain teams for a non-code project. One-time. |
+</details>
+> The **cross-project pool** (`knit_search_global_learnings` + `knit_record_global_learning`) holds the lessons that travel between projects — *"Stripe signature rules", "GitHub API pagination quirks", "Redis cluster failover behavior"* — the kind of thing future-you will be glad you wrote down once, somewhere.
+---
+## 💾 How data is stored
+Knit data is **centralized** — not in every repo's working tree:
 ```
 ~/.knit/
@@ -104,15 +255,17 @@ your-project/
     └── settings.local.json             ← per-machine hooks (knit-managed; gitignored by convention)
 ```
-The project's own `CLAUDE.md` is wrapped in `<!-- knit:start --> ... <!-- knit:end -->` markers. Knit regenerates only the block between markers — never clobbers anything else you write. If your project already has a `CLAUDE.md` without markers, knit writes a sidecar at `.claude/KNIT.md` instead.
+The project's own `CLAUDE.md` is wrapped in `<!-- knit:start --> ... <!-- knit:end -->` markers. Knit regenerates **only** the block between markers — never clobbers anything else. If your project already has a `CLAUDE.md` without markers, knit writes a sidecar at `.claude/KNIT.md` instead.
 Override the data location with `KNIT_HOME=/custom/path` (useful for sandboxes and tests).
-## Workflow on demand
+---
-The protocol is in MCP, not preloaded in every session. CLAUDE.md tells the agent to call `knit_get_workflow(phase)` when it needs the actual procedure. Sections:
+## 🧩 Workflow on demand
-```
+The protocol is in MCP, **not preloaded** in every session. CLAUDE.md tells the agent to call `knit_get_workflow(phase)` when it needs the actual procedure:
+```js
 knit_get_workflow({phase: "research"})    // RESEARCH phase details
 knit_get_workflow({phase: "plan"})        // PLAN + plan-mode rules
 knit_get_workflow({phase: "execute"})     // EXECUTE + TDD
@@ -127,111 +280,40 @@ knit_get_workflow({phase: "tools"})       // knit MCP tools reference
 Plus `overview`, `tier`, `phases`. Call with no `phase` to list all sections.
-**Effect:** v0.1's CLAUDE.md was ~700 lines / ~20 KB per session, every session. v0.2's is ~100 lines / ~2.7 KB. Protocol depth pulled only when needed.
+> **Effect:** v0.1's CLAUDE.md was ~700 lines / ~20 KB per session. v0.9's is ~100 lines / ~2 KB. **Protocol depth pulled only when needed.**
-## 35 MCP Tools
+---
-### Query the brain (read-only, cached, ~5ms)
+## 🌳 Parallel team worktrees
-| Tool | What it does |
-|------|--------------|
-| `knit_query_imports` | Reverse dependencies for a file. Use before edits. |
-| `knit_query_dependents` | What a file imports. |
-| `knit_query_exports` | What a file exposes. |
-| `knit_query_tests` | Test coverage for a file, or list all untested. |
-| `knit_find_fanout` | High-fanout files — the contracts. |
-| `knit_search_learnings` | Past lessons by domain tag. |
-| `knit_get_false_positives` | Confirmed non-issues to suppress in review. |
-| `knit_brain_status` | Brain health + **token accounting**. |
-| `knit_search_sessions` | Search past sessions by free text over summary+tags+branch. |
-| `knit_load_session` | Call at session start — returns last sessions, handoff, learnings, false positives, teams, project knowledge in one round trip. |
-### Update the brain (write — quality-gated)
-| Tool | What it does |
-|------|--------------|
-| `knit_classify_task` | First call on every task. Returns tier, phases, affected domains. |
-| `knit_build_context` | Domain context for the current task. |
-| `knit_record_learning` | Save a non-obvious insight. Quality check first. |
-| `knit_record_false_positive` | Mark a finding as a confirmed non-issue. |
-| `knit_save_session_summary` | Opt-in narrative summary of what this session did. |
-| `knit_save_handoff` | Save state when context degrades. |
-| `knit_setup_project` | Describe a non-code project (legal, marketing, research). |
-| `knit_prune_sessions` | Prune sessions.jsonl by age — keep recent N or drop entries older than N days. |
-| `knit_install_agent` | Install a single VoltAgent subagent (e.g. `typescript-pro`) into `.claude/agents/`. |
-### Protocol Guard (v0.5.0+)
-Runtime enforcement of the knit protocol via PreToolUse and SessionStart hooks. Default strictness: `warn`.
+A Complex task gets broken across multiple teams. Each team works in its own git worktree (sibling to the main repo, native `git worktree` convention). Multiple agents within one team share the team's worktree. The orchestrator collects each team's work, runs gates, and merges back.
-| Tool | What it does |
-|------|--------------|
-| `knit_set_protocol_strictness` | Set strictness: `off` (no checks), `warn` (reminder), `block` (hard-fail Edit/Write without prior `knit_classify_task`). |
-| `knit_get_protocol_strictness` | Read current strictness level for this project. |
+```
+/Users/p/my-repo                          ← main
+/Users/p/my-repo-knit-ui-<ts>             ← UI team
+/Users/p/my-repo-knit-api-security-<ts>   ← API & Security team
+```
-### Workflow on demand
+```js
+const ui = await knit_spawn_team_worktree({ team_name: "UI", task_description: "..." })
+// Spawn agents with ui.path; they cd there and work
+await knit_finalize_team_worktree({ team_name: "UI", action: "merge" })
+```
-| Tool | What it does |
-|------|--------------|
-| `knit_get_workflow` | Fetch protocol depth for one phase. |
+**Merge conflicts surface cleanly** — `knit_finalize_team_worktree` with `action: "merge"` returns `{status: "conflict", conflict_files: [...]}` without destroying the worktree. Resolve manually, then call again.
-### Parallel team worktrees
+Compatible with Claude Code's `EnterWorktree({path})` — knit's worktrees register via native `git worktree add`, so any session can switch into one.
-| Tool | What it does |
-|------|--------------|
-| `knit_spawn_team_worktree` | Create a git worktree for a team. |
-| `knit_list_team_worktrees` | List active team worktrees. |
-| `knit_finalize_team_worktree` | Merge or discard a team's worktree. |
+---
-### Team review board
+## 🧬 Subagents — VoltAgent + project personalization
-| Tool | What it does |
-|------|--------------|
-| `knit_get_teams` | List auto-detected or custom teams. |
-| `knit_define_team` | Create a custom team. |
-| `knit_start_team_review` | Start a parallel review with shared findings. |
-| `knit_get_team_prompt` | Per-team prompt including other teams' findings. |
-| `knit_post_team_findings` | Post findings to the shared board. |
-| `knit_get_board_summary` | Cross-team summary, severity-gated. |
+On first MCP call, knit installs **personalized subagents** into `<project>/.claude/agents/knit-<name>.md`. Each agent has:
-### Cross-project learnings (Model C, opt-in)
+1. **The VoltAgent base** — the curated system prompt from [github.com/VoltAgent/awesome-claude-code-subagents](https://github.com/VoltAgent/awesome-claude-code-subagents) (MIT, 131+ agents). Knit bundles the 6 most common (`code-reviewer`, `security-engineer`, `qa-expert`, `typescript-pro`, `python-pro`, `golang-pro`) so they install with zero network. Specialized agents fetch at a pinned SHA the first time knit needs them.
+2. **A knit context block** appended at the end with project name, stack, high-fanout files, recent relevant learnings, false positives to suppress, and the knit MCP tools the agent can call.
-| Tool | What it does |
-|------|--------------|
-| `knit_record_global_learning` | Opt-in: save an insight to `~/.knit/global/learnings.jsonl` when it generalizes beyond this project. |
-| `knit_search_global_learnings` | Free-text search across **all** of your projects' shared learnings. |
-| `knit_reflect` | Detect patterns across recorded learnings. Useful with ≥3 entries (which Model C makes easy to reach). |
-| `knit_get_suggestions` | Adaptive suggestions for the current task based on past patterns in given domains. |
-Per-project `knit_record_learning` stays primary. The global pool is for the lessons that travel between projects — "Stripe signature rules", "GitHub API pagination quirks", "Redis cluster failover behavior" — the kind of thing future-you will be glad you wrote down once, somewhere.
-## Subagents — VoltAgent + project personalization
-v0.4 closes the gap where knit's team configs referenced agent names
-(`typescript-pro`, `security-engineer`, etc.) without actually installing them.
-A fresh user opening Claude Code had none of those agents on disk, so teams
-fell back to generic prompts.
-Now: on first MCP call, knit **installs personalized subagents** into
-`<project>/.claude/agents/knit-<name>.md`. Each agent has:
-1. **The VoltAgent base** — the curated system prompt from
-   [github.com/VoltAgent/awesome-claude-code-subagents](https://github.com/VoltAgent/awesome-claude-code-subagents)
-   (MIT-licensed, 131+ agents). Knit bundles the 6 most common
-   (`code-reviewer`, `security-engineer`, `qa-expert`, `typescript-pro`,
-   `python-pro`, `golang-pro`) so they install with zero network. Specialized
-   agents fetch from VoltAgent at a pinned SHA the first time knit needs them.
-2. **An knit context block** appended at the end with project name, stack,
-   high-fanout files, recent relevant learnings, false positives to suppress,
-   and the knit MCP tools the agent can call.
-Each agent now has both VoltAgent's role expertise AND knit's project-specific
-context. When a team dispatches via Claude Code's Agent tool, the agent inherits
-both layers.
-**Never clobbers user-curated agents.** If you have your own
-`<project>/.claude/agents/typescript-pro.md`, knit writes
-`knit-typescript-pro.md` alongside it. Different filename, no conflict.
+**Never clobbers user-curated agents.** If you have your own `typescript-pro.md`, knit writes `knit-typescript-pro.md` alongside it. Different filename, no conflict.
 ```bash
 knit install-agents              # install agents this project's teams need
@@ -239,116 +321,181 @@ knit install-agents --all        # install every known agent
 knit install-agents --refresh    # re-fetch from network even if cached
 ```
-`KNIT_OFFLINE=1` disables network fetches (bundled-core still works).
-`KNIT_AGENT_REGISTRY_REF=main` overrides the pinned VoltAgent SHA.
-(Legacy `ENGRAM_OFFLINE` / `ENGRAM_AGENT_REGISTRY_REF` are still honored.)
-## Parallel team worktrees
-A Complex task gets broken across multiple teams. Each team works in its own git worktree (sibling to the main repo, native `git worktree` convention). Multiple agents within one team share the team's worktree. The orchestrator collects each team's work, runs gates, and merges back.
-```
-/Users/p/my-repo                          <- main
-/Users/p/my-repo-knit-ui-<ts>           <- UI team
-/Users/p/my-repo-knit-api-security-<ts> <- API & Security team
-```
-```js
-// Orchestrator workflow
-const ui = await knit_spawn_team_worktree({ team_name: "UI", task_description: "..." })
-// Spawn agents with ui.path; they cd there and work
-// ...
-await knit_finalize_team_worktree({ team_name: "UI", action: "merge" })
-```
-**Merge conflicts surface cleanly** — `knit_finalize_team_worktree` with `action: "merge"` returns `{status: "conflict", conflict_files: [...]}` without destroying the worktree. Resolve manually, then call again.
+`KNIT_OFFLINE=1` disables network fetches (bundled-core still works). `KNIT_AGENT_REGISTRY_REF=main` overrides the pinned SHA.
-Compatible with Claude Code's `EnterWorktree({path})` — knit's worktrees register via native `git worktree add`, so any session can switch into one.
+---
-## Token accounting
+## 💰 Token accounting
-`knit_brain_status` answers the only question that matters: is knit saving more than it costs?
+`knit_brain_status` answers the only question that matters: **is knit saving more than it costs?**
 ```json
 {
-  "token_accounting": {
-    "claude_md_kb": 2.7,
-    "session_count": 12,
-    "learnings_hit_rate_pct": 67,
-    "note": "Healthy."
+  "token_budget": {
+    "budgets": {
+      "claude_md":            { "bytes": 2048,  "target_bytes": 6500,  "verdict": "healthy" },
+      "tool_registry":        { "bytes": 8400,  "target_bytes": 8500,  "verdict": "healthy", "active_tool_count": 31, "total_tool_count": 43 },
+      "instructions":         { "bytes": 2200,  "target_bytes": 2500,  "verdict": "healthy" },
+      "per_session_overhead": { "bytes": 12648, "target_bytes": 17500, "verdict": "healthy" }
+    },
+    "overall_verdict": "healthy",
+    "compounding": {
+      "session_count": 12,
+      "total_learnings": 18,
+      "learnings_hit_rate_pct": 67,
+      "note": "Strong compounding — learnings are getting reused across sessions."
+    }
+  },
+  "update_available": {
+    "current": "0.8.0",
+    "latest":  "0.9.0",
+    "upgrade": "Restart Claude Code to spawn a fresh MCP — npx will auto-fetch the new version."
   }
 }
 ```
-Warnings surface when CLAUDE.md > 30 KB (knit is too heavy) or hit rate < 20 % on >10 learnings (most learnings unused — prune).
+Pair with `knit_compounding_metrics` for the value side of the ledger (sessions, hit rate, estimated tokens saved by skipped re-investigations).
-## CLI
+---
+## 💻 CLI
 ```bash
-knit setup       # One time: add MCP to Claude settings
-knit status      # Dashboard: sessions, learnings, hit rate, knowledge health
-knit refresh     # Force rebuild knowledge brain
+knit setup       # one time: add MCP to Claude settings
+knit status      # dashboard: sessions, learnings, hit rate, knowledge health
+knit refresh     # force rebuild knowledge brain
 ```
-Example `knit status` output:
+Example `knit status`:
 ```
 Knowledge Index
-  Files:        47 indexed (12,340 lines)
-  Imports:      23 edges mapped
-  Untested:     8 files
+  Files:        54 indexed (11,495 lines)
+  Imports:      46 edges mapped
+  Untested:     5 files
 Knowledge Base
-  Learnings:      12 total
-  Accessed:        8 (67% hit rate)
+  Learnings:      18 total
+  Accessed:       12 (67% hit rate)
   False positives: 3
-Token accounting
-  CLAUDE.md:       2.7 KB
-  Sessions logged: 14
-  Hit rate:        67% → Healthy
+Token budget (v0.9)
+  CLAUDE.md:           2.0 KB  → healthy
+  Tool registry:       8.4 KB  → healthy (31 active / 43 total)
+  Instructions:        2.2 KB  → healthy
+  Per-session total:   12.6 KB → healthy
+Compounding
+  Sessions logged:     14
+  Reuse ratio:         67%  → strong
+  Tokens saved (est.): 65,000
 ```
-## How it's different
+---
+## 🆚 How it's different
+|  | gstack (skills) | ECC (agents) | Ruflo (orchestration) | **Knit** |
+|--|---|---|---|---|
+| **Bet** | Slash-command flows | Agent rules | 100+ agents in swarms | **One disciplined agent, compounding memory** |
+| **Setup** | Install skills per-project | Manual `.claude/` setup | `npx ruflo init` (heavy) | **`npx knit-mcp setup` (light)** |
+| **Memory** | jsonl files in-tree | Memory directory | Vector DB + 4-tier consolidation | **Local, searchable, vectorless BM25 + graph fusion** |
+| **Token cost** | Skills loaded into context | Rules loaded into context | 314 tools advertised | **~2 KB CLAUDE.md, tier-gated registry, budget guardrail** |
+| **Parallel work** | None | None | Multi-agent swarms + federation | **Team-scoped git worktrees** |
+| **Cloud dependency** | None | None | Cognitum.One (cloud backbone) | **None — fully local** |
+| **Self-measurement** | None | None | Cost-tracker plugin | **`knit_brain_status.token_budget` + `knit_compounding_metrics`** |
+| **Anti-hallucination** | None | None | None advertised | **`knit_verify_claim` + citation rule + pre/post import validation** |
+| **Non-code projects** | No | No | Limited | **Description-driven via `knit_setup_project`** |
+**The bet:** Ruflo for agent quantity (swarms, federation, plugins). Knit for **agent quality** (memory, classification, token discipline, hallucination defense). Different markets. The integration scanner detects Ruflo when installed and tailors instructions to defer routing to it — Knit operates as the memory + classification substrate underneath.
+---
+## 🧭 Honest comparison vs memory libraries
+The mem0 / Letta / agentmemory comparison deserves a separate section because they're a different category — **memory-as-a-service libraries**, not MCP-native workflow layers. Reading their published benchmarks side-by-side:
-|  | gstack (skills) | ECC (agents) | Knit |
-|--|---|---|---|
-| Setup | Install skills per-project | Manual `.claude/` setup | One command. Done forever. |
-| Memory | jsonl files in-tree | Memory directory | `~/.knit/projects/<hash>/` — centralized, project-keyed, searchable sessions |
-| Token cost | Skills loaded into context | Rules loaded into context | Workflow fetched on-demand. CLAUDE.md is ~2.7 KB. |
-| Parallel work | None | None | Team-scoped git worktrees |
-| Self-measurement | None | None | `knit_brain_status.token_accounting` |
-| Non-code projects | No | No | Description-driven domains via `knit_setup_project` |
+| | mem0 | Letta (MemGPT) | agentmemory | **Knit** |
+|--|---|---|---|---|
+| **Published benchmark** | LOCOMO: 67–92% LLM-as-Judge; ~90% token reduction (1.7K vs 26K per conversation) | No head-to-head token-reduction number; "Letta Leaderboard" benchmarks *LLMs* on agentic memory, not Letta | LongMemEval-S: **95.2% R@5** with BM25+RRF+graph; 86.2% BM25-only | **Not yet measured.** Same architecture as agentmemory; no published number. |
+| **Retrieval architecture** | Vector + graph (Mem0g variant) | OS-inspired tiered memory (core/recall/archival) | BM25 + local vectors + KG fused via RRF (k=60) | BM25 + RRF + graph-traversal (fused via RRF k=60). Per-project + cross-project diversity caps. |
+| **Install shape** | SDK integration; managed cloud or self-hosted | SDK integration; self-hosted server | Python library | **`npx knit-mcp setup` → MCP server, zero glue.** Works with Claude Code / Cursor / Codex / any MCP host. |
+| **Workflow primitive** | None — pure memory | Agent-managed memory operations | None — pure retrieval | **4-tier classifier + plan-mode + protocol guard + parallel team worktrees.** |
+| **Self-calibration** | No | No | No | **Per-project classifier calibration** (v0.11): user FP feedback shifts thresholds; classifier gets less wrong over time. |
-## Migration from v0.1
+### What's honest about this
-If you have an existing project with knit v0.1 data at `<project>/.claude/`, knit v0.2 auto-migrates on the first MCP call:
+**Knit's measured retrieval on a 50-question synthetic harness (v0.11.2):**
+| Metric | Knit (v0.11.2 synthetic) | agentmemory (LongMemEval-S, published) |
+|---|---|---|
+| Top-1 accuracy | **86.0%** | not published in that form |
+| Recall@5 | **96.0%** | **95.2%** |
+Run it yourself: `npm run bench`. Source: [`benchmarks/retrieval-synthetic.ts`](./benchmarks/retrieval-synthetic.ts).
+**These numbers are NOT apples-to-apples with agentmemory's.** Their benchmark is 1,500 questions from real long conversations; Knit's is 50 hand-authored questions on a 7KB synthetic corpus. The numbers are close because the architecture is similar (BM25 + RRF), not because we've proven parity at scale. **Real comparison requires running LongMemEval-S on Knit** — on the roadmap for v0.13.
+**Knit isn't trying to be a better mem0.** It's a different product:
+- **MCP-native + zero-glue install** — mem0/Letta require SDK integration; Knit drops into any MCP host (Claude Code, Cursor, Codex) with one command.
+- **Workflow primitive** — the 4-tier classifier + plan-mode + protocol guard + team worktrees is what makes Knit a *command layer*, not a memory library.
+- **Per-project classifier calibration** (v0.11 slice 4) — `knit_record_false_positive` with a direction tag shifts thresholds over time. Nobody else does this; nobody else needs to, because they're memory libraries, not workflow routers.
+- **Measurable cheapness** — `knit_compounding_metrics` + `knit_get_metrics_history` make the "cheaper over time" claim *chartable per project*. mem0 publishes aggregate dataset numbers; Knit ships per-user instrumentation.
+### What's deferred
+LongMemEval-S R@5/R@10 + LOCOMO LLM-as-Judge runs are on the roadmap (v0.13+). Until they're published, treat any cross-system token-savings comparison as architectural-claim-only.
+---
+## 📜 Release history
+| Version | Headline |
+|---|---|
+| **v0.11.2** | Pre-publish polish · chunk cap (2000) + `errorResponse` envelope across handlers + CLAUDE.md generator surfaces v0.11 tools · new `engram doctor` install health-check CLI · upgrade-path smoke test caught + fixed a data-loss bug in cache.ts (Case B was wiping user permissions on upgrade) · 11 real exploit-payload integration tests prove C1/C2/H1 fixes hold · `npm run bench` ships a synthetic retrieval harness (50 Q&A) measuring 86% top-1 / 96% R@5. 53 tools, 664 tests. |
+| **v0.11.1** | Audit-driven hardening · 3 CRITICAL (source_id path traversal, post-edit tsc shell injection, live calibration bug) + 10 HIGH fixes from a 5-agent audit, implemented in 3 parallel `knit_spawn_team_worktree` teams. HOOKS_VERSION 11 (auto-upgrades existing users). New `knit_delete_requirements` tool. Honest comparison vs mem0/Letta added. 53 tools, 636 tests. |
+| **v0.11.0** | Verify Layer + auto-config foundation · mandatory `knit_verify_claim` REVIEW gate · post-edit diff verify + universal `tsc` check · drift detector · self-healing classifier (per-project calibration) · `knit_index_requirements` + `knit_generate_test_cases` (BM25 over long specs) · `knit_get_fingerprint` + `knit_infer_domains` + `knit_compose_template` (zero-config CLAUDE.md). 52 tools, 625 tests. |
+| **v0.10.0** | Token-economics release · risk × scope × change_kind classifier split · `context_budget_remaining` graceful degradation · per-project diversity cap on cross-project search · 11 new compounding-metrics fields + weekly snapshot persistence + `knit_get_metrics_history`. Makes "Knit makes Claude cheaper" a chartable number from day 1. |
+| **v0.9.0** | Hook-level enforcement · citation rule · `knit_verify_claim` · auto-search in classify · `suggested_reads` · `knit_get_learning` · `knit_consolidate_learnings`. |
+| **v0.8.x** | Vectorless RAG (BM25 + RRF) · graph-traversal retriever · per-project instruction tailoring · `knit_compounding_metrics` · integration scanner. |
+| **v0.7.x** | Universal protocol injection via MCP `instructions` · tier-gated tool surface · `knit_list_features` · Inquiry classification tier · CLAUDE.md cut 88% · lazy response modes · token-budget guardrail · legacy migration. |
+| **v0.5.x** | Protocol Guard — runtime enforcement via hooks (off / warn / block). |
+| **v0.4.x** | VoltAgent subagent integration · personalization layer · `engram install-agents` CLI · hybrid hook merging · `export obsidian`. |
+| **v0.3.x** | Centralized data at `~/.knit/projects/<hash>/` · marker-wrapped CLAUDE.md · on-demand workflow · cross-project learnings pool. |
+---
+## 🔄 Migration from v0.1
+If you have an existing project with knit v0.1 data at `<project>/.claude/`, knit auto-migrates on the first MCP call:
 1. Detects `<project>/.claude/knowledge.json` (or `knowledgebase.json`)
 2. Copies all knit data forward to `~/.knit/projects/<hash>/`
 3. Writes `<project>/.claude/MIGRATED.txt` breadcrumb explaining where the data went
 4. Leaves the old `.claude/` directory intact (delete at your discretion)
-No data loss, no dual-writes. Single migration per project.
+**No data loss, no dual-writes.** Single migration per project.
+---
-## Development
+## 🛠️ Development
 ```bash
 git clone https://github.com/PDgit12/knit.git
 cd knit
 npm install
-npm run dev       # Run CLI locally
-npm run test      # 295 tests
-npm run typecheck # TypeScript strict mode
-npm run build     # Compile CLI + MCP server
+npm run dev        # run CLI locally
+npm run test       # 492 tests
+npm run typecheck  # TypeScript strict mode
+npm run build      # compile CLI + MCP server
 ```
-## Architecture
+### Architecture
 ```
 knit (npm package)
 ├── dist/cli.js                 # CLI: setup, status, refresh
-└── dist/mcp/server.js          # MCP server: 27 tools, auto-init
+└── dist/mcp/server.js          # MCP server: 43 tools (tier-gated), auto-init
 per-project, in ~/.knit/projects/<hash>/
 ├── knowledge.json              # import graph + exports + test map
@@ -360,11 +507,17 @@ per-project, in ~/.knit/projects/<hash>/
 per-project, in <project>/
 ├── CLAUDE.md                   # ≤150-line thin shape, marker-wrapped
-└── .claude/settings.local.json # per-machine hooks, knit-managed (gitignored by convention)
+└── .claude/settings.local.json # per-machine hooks, knit-managed
 ```
-Zero external dependencies for the knowledge brain. 295 tests. Strict-mode TypeScript.
+**Zero external dependencies for the knowledge brain.** 492 tests. Strict-mode TypeScript.
-## License
+---
-MIT
+## 📄 License
+[MIT](./LICENSE) © 2026
+<p align="center">
+  <sub>If knit saved you tokens, <a href="https://github.com/PDgit12/knit">give it a ⭐</a>.</sub>
+</p>