knit-mcp 0.16.0 → 0.21.0

Files changed (30)
  1. package/README.md +237 -149
  2. package/dist/{cache-7S5DFFQ6.js → cache-LW4H6ZU5.js} +10 -9
  3. package/dist/chunk-2GDNMY7N.js +57 -0
  4. package/dist/{chunk-27TA2ZQZ.js → chunk-5EUQ2DCN.js} +12 -0
  5. package/dist/{chunk-BBQSWT4H.js → chunk-6BQPXFRL.js} +40 -0
  6. package/dist/{chunk-VB2TIR6L.js → chunk-DIU7RE5X.js} +2 -2
  7. package/dist/{chunk-OINYMLOV.js → chunk-DXV5NAQ3.js} +10 -4
  8. package/dist/{chunk-JE4BZQUD.js → chunk-F2OIMOX2.js} +53 -19
  9. package/dist/{chunk-QM4U75VE.js → chunk-IINE7UWL.js} +155 -155
  10. package/dist/{chunk-ZESAIRIL.js → chunk-PNIOLRIT.js} +62 -5
  11. package/dist/{chunk-2FAS6CV4.js → chunk-PQTYGVZN.js} +1 -1
  12. package/dist/{chunk-Q3GNWHEW.js → chunk-WPXK5IHO.js} +60 -9
  13. package/dist/{tools-7VJRV64S.js → chunk-WVRS7W5V.js} +352 -134
  14. package/dist/{chunk-OZCVBNHF.js → chunk-Y6MY4STM.js} +83 -6
  15. package/dist/cli.js +26 -13
  16. package/dist/doctor-SBLYY7VW.js +25 -0
  17. package/dist/{export-4BO6HCXP.js → export-QKUVOV3O.js} +3 -2
  18. package/dist/{install-agents-2JYKFLU6.js → install-agents-6SJ7FH57.js} +10 -9
  19. package/dist/{instructions-4SLOUME2.js → instructions-YFZZAY2P.js} +3 -1
  20. package/dist/{integration-scanner-LBD2PIZ3.js → integration-scanner-5O6XSGGP.js} +2 -2
  21. package/dist/{refresh-4X4HMDMT.js → refresh-XJRH2K2M.js} +4 -4
  22. package/dist/{setup-2YN36GWS.js → setup-4K7FICNS.js} +20 -10
  23. package/dist/{status-RPHO7QQO.js → status-J2Q4ACID.js} +4 -4
  24. package/dist/tools-5K6DUP2I.js +28 -0
  25. package/dist/{ui-GN4JT4XR.js → ui-W2SAVL73.js} +166 -82
  26. package/package.json +1 -1
  27. package/webapp/dist/assets/index-DxyZTqwU.js +40 -0
  28. package/webapp/dist/index.html +1 -1
  29. package/dist/doctor-2ESSKFZE.js +0 -14
  30. package/webapp/dist/assets/index-BvEqg_UZ.js +0 -40
package/README.md CHANGED
@@ -3,7 +3,7 @@
3
3
  <a href="https://github.com/PDgit12/knit/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/PDgit12/knit/ci.yml?style=for-the-badge&label=CI&color=10b981" alt="CI" /></a>
4
4
  <img src="https://img.shields.io/badge/license-MIT-3b82f6?style=for-the-badge" alt="license" />
5
5
  <img src="https://img.shields.io/badge/node-%E2%89%A518-339933?style=for-the-badge&logo=node.js&logoColor=white" alt="node" />
6
- <img src="https://img.shields.io/badge/MCP%20tools-55-7c3aed?style=for-the-badge" alt="tools" />
6
+ <img src="https://img.shields.io/badge/MCP%20tools-56-7c3aed?style=for-the-badge" alt="tools" />
7
7
  <img src="https://img.shields.io/badge/agents-6-10b981?style=for-the-badge" alt="agents supported" />
8
8
  <img src="https://img.shields.io/badge/local--first-100%25-3b82f6?style=for-the-badge" alt="local-first" />
9
9
  </p>
@@ -20,14 +20,14 @@
20
20
  <a href="#-quick-start">Quick start</a> ·
21
21
  <a href="#-what-knit-is">What it is</a> ·
22
22
  <a href="#-how-search-works">How search works</a> ·
23
- <a href="#-55-mcp-tools">Tools</a> ·
23
+ <a href="#-56-mcp-tools">Tools</a> ·
24
24
  <a href="#-the-dashboard">Dashboard</a> ·
25
- <a href="#-how-its-different">vs mem0/Letta</a>
25
+ <a href="#-why-knit">Why Knit</a>
26
26
  </p>
27
27
 
28
28
  ---
29
29
 
30
- ## 🧠 What knit is
30
+ ## 🧠 What Knit is
31
31
 
32
32
  Knit gives **any MCP-speaking coding agent** the right defaults automatically — because you can't predict how a user will phrase a request, and every agent (Claude Code, Cursor, Codex CLI, Cline, Continue, GitHub Copilot) ends up burning tokens re-discovering the same project facts. Knit does four jobs at once:
33
33
 
@@ -36,11 +36,11 @@ Knit gives **any MCP-speaking coding agent** the right defaults automatically
36
36
  | 🧠 **Memory** | Every project keeps a brain at `~/.knit/projects/<hash>/`. Sessions compound: learnings, false positives, session summaries, and a static-analysis import graph are all queryable next session. Cross-project pool at `~/.knit/global/`. |
37
37
  | 🪶 **Tokens** | `CLAUDE.md` is ~2 KB (project facts only). Protocol depth is fetched on demand via `knit_get_workflow(phase)`. Per-cache-hit savings ≈ 15K tokens (calibrated from instrumented RESEARCH phases — override via env). Reuse-ratio + ROI surfaced in the dashboard. |
38
38
  | 🛠️ **Workflow** | A 4-tier classification (Inquiry / Trivial / Standard / Complex) with phase-triggered plan mode, quality-gated `LEARN`, and team-scoped git worktrees so parallel agents don't step on each other. |
39
- | 📊 **Dashboard** | New in v0.13. `knit ui` opens a local-first analytics dashboard at `http://127.0.0.1:7421`: bento layout, brain savings, per-project ROI, **force-directed brain graph**, real-time sync via SSE. See [Dashboard](#-the-dashboard). |
39
+ | 📊 **Dashboard** | `knit` opens the brain — a local-first dashboard at `http://127.0.0.1:7421`: bento layout, brain savings, per-project ROI, **force-directed brain graph**, real-time sync via SSE. See [Dashboard](#-the-dashboard). |
40
40
 
41
41
  **Local-first** invariant: zero cloud calls in memory/retrieval/classification. Dashboard binds to `127.0.0.1` only, with Host/Origin validation + CSP headers. Your brain stays on your machine.
42
42
 
43
- It's a **single product**, not four. Every design choice has to win on memory + tokens + workflow + analytics together.
43
+ One product: every design choice wins on memory, tokens, workflow, and analytics together.
44
44
 
45
45
  ---
46
46
 
@@ -48,12 +48,28 @@ It's a **single product**, not four. Every design choice has to win on memory +
48
48
 
49
49
  ```bash
50
50
  npm install -g knit-mcp
51
- knit setup # adds Knit MCP to your agent's config (Claude Code / Cursor / Codex / etc.)
52
- knit ui # opens the brain dashboard at http://127.0.0.1:7421 (optional but recommended)
51
+ knit setup # one-time: register Knit with your agents (Claude Code / Cursor / Codex / …)
52
+ knit # open the brain — the dashboard at http://127.0.0.1:7421
53
53
  ```
54
54
 
55
+ Two commands: `knit setup` for one-time agent registration, then `knit` to open the brain. Agents communicate with the MCP server over stdio; that process is launched by the host, not invoked manually.
56
+
55
57
  **No per-project setup.** Open your MCP-speaking agent in any project — the first MCP tool call auto-initializes the brain, hooks, and per-project CLAUDE.md block.
56
58
 
59
+ ### First prompt — onboard your project
60
+
61
+ Once Knit is connected, open your project in your agent and paste this once. Fill in the brackets, or just describe the project in your own words — the agent does the rest:
62
+
63
+ > You have the Knit MCP connected. Call `knit_load_session`, then call `knit_onboard` with:
64
+ > - **project_description** — what this project is
65
+ > - **intent** — what I'm building right now
66
+ > - **strictness** — `off` | `warn` | `block` (how strictly to enforce the workflow)
67
+ > - **focus_domains** — comma-separated areas (e.g. `api, billing`)
68
+ >
69
+ > Then summarize what you configured and call `knit_classify_task` for my first task.
70
+
71
+ Knit persists these preferences and surfaces your project intent at the start of every session. It's a plain MCP tool, so the same prompt works on **any** host — Claude Code, Cursor, Codex, Cline, Continue, Copilot — new session or resumed.
72
+
57
73
  ### Adoption per agent
58
74
 
59
75
  v0.14: a single `knit setup` detects **every** installed MCP-speaking agent on
@@ -78,8 +94,6 @@ per-agent manual setup, no copy-pasted JSON.
78
94
 
79
95
  > **Supported shells:** macOS, Linux, WSL, Git Bash, PowerShell. Windows `cmd.exe` is not supported as the hook-runner shell — use PowerShell (default in modern Windows Terminal) or Git Bash.
80
96
 
81
- > **Supported shells:** macOS, Linux, WSL, Git Bash, PowerShell. Windows `cmd.exe` is not supported as the hook-runner shell — use PowerShell (default in modern Windows Terminal) or Git Bash.
82
-
83
97
  ### Quiet mode
84
98
 
85
99
  Knit ships **Protocol Guard in `warn` mode by default** — hooks print reminders, they never block. Fully silent:
@@ -88,83 +102,154 @@ Knit ships **Protocol Guard in `warn` mode by default** — hooks print reminder
88
102
  knit_set_protocol_strictness({ level: "off" })
89
103
  ```
90
104
 
91
- ### Uninstall in 30 seconds
105
+ ### Uninstall
106
+
107
+ One command kills all data:
92
108
 
93
109
  ```bash
94
110
  rm -rf ~/.knit # all per-project + global memory
95
111
  ```
96
112
 
97
- Then:
98
- 1. Remove `"knit-brain"` from `mcpServers` in `~/.claude.json`
99
- 2. Delete the `<!-- knit:start --> ... <!-- knit:end -->` block from each project's `CLAUDE.md`
100
- 3. Remove `_knitOwned` entries from each project's `.claude/settings.local.json`
113
+ Then remove Knit's registration from each agent you've used `knit setup` with:
114
+
115
+ | Agent | File to edit | What to remove |
116
+ |---|---|---|
117
+ | Claude Code (global) | `~/.claude.json` | the `"knit-brain"` entry under `mcpServers` |
118
+ | Claude Code (global) | `~/.claude/CLAUDE.md` | the `## Knit Brain (MCP)` block (appended by `knit setup`) |
119
+ | Cursor | `~/.cursor/mcp.json` or `.cursor/mcp.json` | the `"knit-brain"` entry under `mcpServers` |
120
+ | Codex CLI | `~/.codex/config.toml` | the `[mcp_servers.knit-brain]` section |
121
+ | Cline | `~/.cline/mcp.json` | the `"knit-brain"` entry under `mcpServers` |
122
+ | Continue | `.continue/mcpServers/knit-brain.yaml` | delete the file |
123
+ | VS Code Copilot | `.vscode/mcp.json` (or user `mcp.json`) | the `"knit-brain"` entry under `servers` |
124
+
125
+ Per-project residue to clean:
126
+
127
+ - `<project>/CLAUDE.md` — delete the `<!-- knit:start --> ... <!-- knit:end -->` block
128
+ - `<project>/.claude/settings.local.json` — remove hook entries tagged `_knitOwned: true`
129
+ - `<project>/.claude/KNIT.md` — sidecar written when CLAUDE.md had no markers; delete if present
130
+ - `<project>/.claude/agents/knit-*.md` — installed VoltAgent subagents; delete the `knit-` prefixed ones
131
+ - `<project>/AGENTS.md` — if you use Codex CLI or Cline, the marker-wrapped Knit block was written here; delete the block or the file
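
The manual cleanup steps above could be scripted. The following is a minimal sketch, not part of Knit: `stripKnitBlock` and `removeKnitServer` are hypothetical helper names, the marker strings and the `"knit-brain"` key come from the README, and the second helper only covers JSON configs that use an `mcpServers` object (not the Codex TOML file or the VS Code `servers` key).

```typescript
// Remove the marker-wrapped Knit block from the text of a CLAUDE.md or AGENTS.md file.
function stripKnitBlock(markdown: string): string {
  return markdown.replace(/<!-- knit:start -->[\s\S]*?<!-- knit:end -->\n?/g, "");
}

// Shape of a parsed MCP config file; only `mcpServers` matters for this sketch.
type McpConfig = { mcpServers?: Record<string, unknown>; [key: string]: unknown };

// Return a copy of the config without the "knit-brain" entry; other servers are kept.
function removeKnitServer(config: McpConfig): McpConfig {
  if (!config.mcpServers) return config;
  const rest: Record<string, unknown> = { ...config.mcpServers };
  delete rest["knit-brain"];
  return { ...config, mcpServers: rest };
}
```

Both helpers are pure functions, so the result can be inspected before it is written back to disk.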
101
132
 
102
133
  Knit writes nowhere else on your machine.
103
134
 
104
135
  ---
105
136
 
137
+ ## 🎬 A real session
138
+
139
+ A new TypeScript project, from install to a compounding brain:
140
+
141
+ 1. **Install + register.** `npm i -g knit-mcp && knit setup` — Knit registers with every MCP-speaking agent on the machine.
142
+ 2. **Onboard.** Open the project in your agent and paste the onboarding prompt. The agent calls `knit_onboard` — *"Project: a billing API. Intent: add Stripe webhooks. strictness: warn. focus: api, webhooks."* Knit persists those preferences and records the intent.
143
+ 3. **Ask for the feature.** The agent calls `knit_classify_task` → e.g. *complex, high-risk* → plan mode. It pulls context with `knit_build_context` (ripple effects), `knit_search_learnings` (anything learned before), and `knit_query_dependents` on the files it will touch.
144
+ 4. **Build + verify.** It implements, runs `knit_verify_claim` to check its claims against the knowledge graph, and `knit_record_learning` to save what was non-obvious.
145
+ 5. **Compound.** Next session, `knit_load_session` surfaces your intent plus that learning — the brain is already sharper. Run **`knit`** to see it: the dashboard shows the project, its knowledge index, learnings, and token ROI building over time. Hit **Refresh** to re-index or **Export brain** to write an Obsidian vault.
146
+
147
+ Every step is local, deterministic, and works on any MCP host.
148
+
106
149
  ## 🔍 How search works
107
150
 
108
151
  Knit's retrieval is **BM25 + Reciprocal Rank Fusion** over your learnings,
109
- session summaries, and the cross-project pool, with two cheap-but-honest
110
- lexical-bridging layers stacked on top: **2-gram fallback** for typos and
111
- rare compounds, and **curated coding-domain synonym expansion** for the
112
- most common semantic-gap pairs. No vector embeddings, no remote inference,
113
- no API calls.
114
-
115
- **Why this design choice (not an oversight):**
116
-
117
- - **Deterministic.** Same query → same ranking, every time. No model
118
- drift, no upgrade-day surprises.
119
- - **Fast.** Sub-millisecond on corpora ≤ 1K entries (your typical
120
- project memory). No cold start, no model load.
121
- - **Local-first.** Zero network calls. Your memory never leaves the
122
- machine.
123
- - **Auditable.** You can explain every hit by looking at term overlap
124
- + the synonym dictionary (50 pairs, hand-curated). No "the model
125
- said so."
126
- - **Honest at the boundary.** The bench has documented misses where
127
- even synonym expansion can't bridge the gap; we ship those visible,
128
- not hidden.
129
-
130
- **What it does well.** Exact term match, identifier search
131
- (`knit_classify_task`), rare-term emphasis (e.g. `PIPE_BUF`), multi-word
132
- ranking, tag filtering, cross-project diversification (max 2 per
133
- project), branch diversification on sessions (max 2 per branch). **Typo
134
- recovery via 2-gram fallback** (`knit_clasify` → `knit_classify_task`).
135
- **Synonym recovery via curated dictionary** (`hook` ↔ `webhook`,
136
- `schema` ↔ `migration`, `auth` ↔ `authentication`, `cache` ↔ `memo`,
137
- `deploy` ↔ `ship` ↔ `release`, etc.; see
138
- [`src/engine/retrieval/synonyms.ts`](src/engine/retrieval/synonyms.ts)
139
- for the full ~50-pair dictionary). Synonym matches scored at 0.4× of a
140
- direct BM25 hit so genuine matches always rank higher.
141
-
142
- **What it still cannot do.** Multi-word paraphrase ("how do schema
143
- changes ship" with no shared terms). Deep abstraction-level bridging
144
- ("data consistency" → "atomic temp+rename"). Question intent
145
- ("what's the right pattern for X"). Negation. Cross-entry synthesis
146
- ("based on the auth lessons, what should I do for OAuth"). These need
147
- either embeddings (model dependency + bundle weight, breaks local-first
148
- unless run locally via ONNX) or an LLM call layer (Knit-as-retrieval
149
- becomes Knit-as-agent, different identity). v0.20+ candidate: hybrid
150
- retrieval (BM25 + local embeddings via RRF) — opt-in, bench-gated.
151
-
152
- **The practical implication.** Search with words close to how you
153
- recorded the learning, OR words that have a synonym pair in the
154
- dictionary. If you write a learning about *webhook signatures*, you
155
- can now search either *webhook signatures* OR *hook signatures* —
156
- the dictionary bridges those. For genuinely different vocabulary that
157
- isn't in the synonym table, use `knit_search_global_learnings` to widen
158
- the corpus, or call `knit_search_sessions` to pull from past narrative
159
- summaries that may use more terms.
160
-
161
- **Bench numbers (v0.16):** synthetic 88.0% top-1 / **100% recall@5**,
162
- learnings (real-prose) 86.7% top-1 / 96.7% recall@5. Both default ON;
163
- opt-out via `enableNgramFallback: false` + `enableSynonyms: false` for
164
- a strict lexical-only baseline.
152
+ session summaries, and the cross-project pool, with two lexical-bridging
153
+ layers on top: a **2-gram fallback** for typos and rare compounds, and
154
+ **curated coding-domain synonym expansion** for common semantic-gap pairs.
155
+ No vector embeddings, no remote inference, no API calls.
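
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over several ranked result lists. It is an illustration under stated assumptions, not Knit's implementation: the function name, the document ids, and the constant `k = 60` (a commonly used default for RRF) are not taken from the source.

```typescript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank) per document.
// k = 60 is a common default for RRF, assumed here rather than read from Knit.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, descending; break ties by id so the output is deterministic.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([docId]) => docId);
}

// Hypothetical ids: one lexical ranking per store.
const fused = reciprocalRankFusion([
  ["learning-7", "learning-2", "learning-9"], // per-project learnings
  ["session-4", "learning-2"],                // session summaries
  ["learning-2", "global-1"],                 // cross-project pool
]);
console.log(fused[0]); // "learning-2": it appears in all three lists
```

Because the fused score depends only on ranks, lists whose raw scores are on different scales can be merged without normalization.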
156
+
157
+ The design is deliberate:
158
+
159
+ - **Deterministic** — same query, same ranking, every time. No model drift.
160
+ - **Fast**: sub-millisecond on typical project corpora (≤ 1K entries). No cold start.
161
+ - **Local-first** — zero network calls; your memory never leaves the machine.
162
+ - **Auditable**: every hit is explainable from term overlap plus the 50-pair synonym dictionary.
163
+
164
+ **Capabilities.** Exact term + identifier match (`knit_classify_task`),
165
+ rare-term emphasis (`PIPE_BUF`), multi-word ranking, tag filtering,
166
+ cross-project diversification (max 2 per project), branch diversification on
167
+ sessions (max 2 per branch), typo recovery via 2-gram fallback
168
+ (`knit_clasify` → `knit_classify_task`), and synonym recovery (`hook` ↔
169
+ `webhook`, `schema` ↔ `migration`, `auth` ↔ `authentication`, `cache` ↔
170
+ `memo`, `deploy` ↔ `ship` ↔ `release`; see
171
+ [`src/engine/retrieval/synonyms.ts`](src/engine/retrieval/synonyms.ts) for the
172
+ full ~50-pair dictionary). Synonym matches score at 0.4× a direct hit, so exact
173
+ matches always rank higher.
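
The effect of the 0.4× synonym weight can be illustrated with a simplified scorer. This is a sketch only: it counts matched terms instead of computing BM25 statistics, it handles only the case where the query term itself is absent, and `scoreDocument` plus the four-entry dictionary are hypothetical stand-ins for the real `synonyms.ts`.

```typescript
// Weights quoted in the README; the scoring itself is simplified (no BM25 term statistics).
const DIRECT_WEIGHT = 1.0;
const SYNONYM_WEIGHT = 0.4;

// Tiny illustrative subset of the pairs named in the README, stored symmetrically.
const SYNONYMS: Record<string, string[]> = {
  hook: ["webhook"],
  webhook: ["hook"],
  auth: ["authentication"],
  authentication: ["auth"],
};

function scoreDocument(queryTerms: string[], docTerms: string[]): number {
  const doc = new Set(docTerms);
  let score = 0;
  for (const term of queryTerms) {
    if (doc.has(term)) {
      score += DIRECT_WEIGHT; // exact match counts in full
    } else if ((SYNONYMS[term] ?? []).some((synonym) => doc.has(synonym))) {
      score += SYNONYM_WEIGHT; // synonym match counts at the discounted weight
    }
  }
  return score;
}

const query = ["hook", "signatures"];
const exactScore = scoreDocument(query, ["hook", "signatures"]);      // two direct hits
const synonymScore = scoreDocument(query, ["webhook", "signatures"]); // one synonym hit, one direct hit
console.log(exactScore > synonymScore); // true
```

With the discount below 1, a document that contains the query term itself always outscores one that only contains a synonym, all else being equal.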
174
+
175
+ **Benchmarks.** Synthetic 88.0% top-1 / **100% recall@5**; real-prose learnings
176
+ 86.7% top-1 / 96.7% recall@5. Both layers default on; set
177
+ `enableNgramFallback: false` + `enableSynonyms: false` for a strict
178
+ lexical-only baseline.
179
+
180
+ **Roadmap.** A hybrid retriever (BM25 + local embeddings, fused via RRF) for
181
+ paraphrase and abstraction-bridging is a v0.21+ candidate — opt-in,
182
+ bench-gated, and local-first.
165
183
 
166
184
  ---
167
185
 
186
+ ## ✨ What's new in v0.21.0
187
+
188
+ - **Onboarding (`knit_onboard`).** Paste the README prompt after connecting Knit, describe your project + how you want Knit to behave, and the agent persists your preferences (strictness, features, focus domains) and records the project intent — surfaced every session, on any MCP host.
189
+ - **Dashboard actions.** The dashboard can now **Refresh** (re-index a project) and **Export all projects** (Obsidian vault), in addition to viewing. Actions run as child processes (non-blocking) and stay loopback-bound + Host/Origin-gated.
190
+ - **56 tools** (Tier-1 37). Shipped after a second six-dimension audit (0 critical) and a real-life end-to-end run.
191
+
192
+ ## ✨ What's new in v0.20.0
193
+
194
+ v0.20 makes Knit a **fully-ready, dashboard-first brain** — a consolidated
195
+ release (internal phases v0.17–v0.20) shipped after a six-dimension deep-clean
196
+ audit (0 critical findings).
197
+
198
+ - **Brain freshness layer.** One shared primitive governs staleness across every
199
+ store, so the brain never serves data it can't vouch for: handoffs auto-clear
200
+ once resolved or stale, idle classifier signals decay, old cross-project
201
+ learnings drop from search, and a learning that names a now-deleted file is
202
+ flagged. Freshness drives prune/clear/flag only — never the bench-gated
203
+ retrieval ranking.
204
+ - **Tool count you can explain.** `knit doctor` and `knit_list_features` print
205
+ the live active count *with the reason* (e.g. `46 of 56 = 37 always-on + 9
206
+ teams [≥3 domains] · …`), so a number that legitimately varies by project
207
+ shape stops looking like a bug. A drift test pins the docs to the registry.
208
+ - **Stays on-protocol mid-session.** A throttled, escalating reminder rides the
209
+ MCP tool response when an agent drifts (e.g. records work before classifying)
210
+ — reaching every MCP host, not just Claude Code. Silence with
211
+ `knit_set_protocol_strictness({ level: "off" })`.
212
+ - **Dashboard-first.** Run **`knit`** to open the brain; the agent/stdio path is
213
+ unchanged. The dashboard gains a Knowledge-index view and a `knit doctor`
214
+ webapp health check. (v0.21 adds Refresh + Export actions to the dashboard;
215
+ `knit setup` remains CLI-only.)
216
+ - **Composes with your setup.** Scans Claude Code Skills
217
+ (`.claude/skills/<name>/SKILL.md`) alongside slash commands; positioning leads
218
+ with the integrated brain rather than competitor comparisons.
219
+
220
+ Security/hygiene from the audit: the command/Skill scanner now guards size and
221
+ rejects symlinks before reading (no OOM, no arbitrary-file reads into the brain).
222
+
223
+ ## ✨ What's new in v0.16.0
224
+
225
+ v0.16 is the **semantic-lite release**. Two retrieval improvements that
226
+ close the most common BM25 lexical gaps without an embedding model or
227
+ external API call. Both default ON, both bench-pinned non-regressive.
228
+
229
+ - **Curated synonym expansion.** Hand-curated dictionary of ~50
230
+ coding-domain synonym pairs (`webhook` ↔ `hook`, `schema` ↔
231
+ `migration`, `auth` ↔ `authentication`, `cache` ↔ `memo`, `deploy` ↔
232
+ `ship` ↔ `release`, etc.) in `src/engine/retrieval/synonyms.ts`. When
233
+ a query token has known synonyms, BM25 scores documents containing
234
+ those synonyms with a 0.4× discount weight (higher than the 2-gram
235
+ fallback's 0.25 because synonyms are conceptually closer than
236
+ near-spelling matches). Fires both as a fallback (term unmatched,
237
+ synonym matched) and a boost (term matched directly, synonym widens
238
+ reach).
239
+ - **2-gram fallback default ON.** `enableNgramFallback` flipped from
240
+ default `false` → default `true`. v0.15 introduced this as opt-in to
241
+ avoid bench regression risk; v0.16 flips the default after both
242
+ benches verified strictly stable.
243
+ - **FIFO-safe `handleIndexRequirements`.** Latent v0.12.1 hardening
244
+ bug: `openSync(O_RDONLY)` on a named pipe blocked indefinitely
245
+ before `fstat` could reject it. Now passes `O_NONBLOCK`; regular
246
+ files unaffected.
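
The FIFO fix in the last bullet follows a general pattern that can be sketched with Node's `fs` module. This is an assumed reconstruction of the idea, not the code of `handleIndexRequirements`: the helper name `openRegularFile` is hypothetical, and the fallback to `0` covers platforms where `O_NONBLOCK` is not defined.

```typescript
import * as fs from "node:fs";

// Open a path for reading only if it is a regular file.
// O_NONBLOCK keeps open() from waiting on a FIFO that has no writer;
// it has no effect on regular files.
function openRegularFile(path: string): number | null {
  const nonBlock = fs.constants.O_NONBLOCK ?? 0;
  const fd = fs.openSync(path, fs.constants.O_RDONLY | nonBlock);
  if (!fs.fstatSync(fd).isFile()) {
    fs.closeSync(fd); // FIFO, directory, device: reject without reading
    return null;
  }
  return fd; // caller is responsible for closing the descriptor
}
```

Checking with `fstatSync` on the already-open descriptor, rather than `statSync` on the path, avoids a race in which the path is swapped between the check and the open.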
247
+
248
+ Bench impact (v0.15 → v0.16): synthetic 86%/96% → **88%/100%**;
249
+ learnings 83.3%/96.7% → **86.7%/96.7%**. The synthetic recall@5 hit
250
+ 100% because synonym expansion closed the "hook events authenticated"
251
+ miss that BM25 alone couldn't bridge.
252
+
168
253
  ## ✨ What's new in v0.15.0
169
254
 
170
255
  v0.15 is the **deep-clean release**. A second six-dimension internal audit
@@ -221,6 +306,7 @@ return `{ status: 'protocol_required', next_action: '...' }` instead of
221
306
  proceeding — the agent reads the response, follows the breadcrumb, retries.
222
307
  This is the universality answer: same enforcement, transport layer instead
223
308
  of host layer. Default strictness stays `warn` so existing flows are unchanged.
309
+ (v0.20 extends this with mid-session re-surfacing — see *What's new in v0.20.0* above.)
224
310
 
225
311
  ### ⚡ Agent-native slash-command auto-detection
226
312
 
@@ -278,7 +364,7 @@ A single command opens a local-first analytics surface at `http://127.0.0.1:7421
278
364
 
279
365
  **Real-time sync via SSE.** The server watches `~/.knit/` via `fs.watch`; any agent recording a learning anywhere updates the open dashboard within ~250ms. No polling.
280
366
 
281
- ### 🔐 Security hardening (real, not theater)
367
+ ### 🔐 Security hardening
282
368
 
283
369
  The dashboard is a localhost HTTP server, which has real attack surface. v0.13 closes it:
284
370
 
@@ -316,30 +402,30 @@ The dashboard works regardless of which agent you use — it reads the brain fro
316
402
  | `knit_classify_task` response | ~500 tok | **~150 tok** | 70% |
317
403
  | `knit_load_session` response | ~3–5 KB | **~1.5 KB** | ~60% |
318
404
 
319
- Each surface gets a `healthy | warn | over-budget` verdict from `knit_brain_status.token_budget`. **Drift is a regression test, not a vibes claim.**
405
+ Each surface gets a `healthy | warn | over-budget` verdict from `knit_brain_status.token_budget`, enforced by a regression test.
320
406
 
321
407
  ---
322
408
 
323
409
  ## 📊 The dashboard
324
410
 
325
- Run `knit ui` to open the local analytics surface. **Single command**, no other CLI needed for normal operation:
411
+ Run **`knit`** to open the brain (the local analytics surface); `knit ui` is an explicit alias:
326
412
 
327
413
  ```bash
328
- knit ui
414
+ knit
329
415
  # Knit Dashboard — http://127.0.0.1:7421
330
416
  # Reading from: /Users/<you>/.knit
331
417
  # Press Ctrl-C to stop.
332
- # (automatically opens your default browser)
418
+ # (opens your default browser; visit the URL above if it does not)
333
419
  ```
334
420
 
335
421
  | Feature | What you see |
336
422
  |---|---|
337
423
  | **Bento home** | Big "Net tokens saved" hero card (dark), live recent activity (green "live" dot when SSE connected), memory hit-rate gauge, top projects by ROI as color-blocked cards |
338
424
  | **Brain graph** | Force-directed visualization of one project's learnings. Nodes sized by access count, colored by domain. Edges by Jaccard similarity over tags + domains. Click any node → side panel with the full lesson. Threshold slider live-recomputes the graph. |
339
- | **Per-project deep dive** | Hero card with verdict tone (cold/warming/compounding/strong), retrieval signals, classifications-by-tier breakdown, top domains heatmap, searchable learnings list |
340
- | **Health** | Install diagnostics — Node version, Knit version, ~/.knit permissions, MCP registration in `~/.claude.json` |
425
+ | **Per-project deep dive** | Hero card with verdict tone (cold/warming/compounding/strong), retrieval signals, classifications-by-tier breakdown, top domains heatmap, searchable learnings list, Knowledge index, and **Refresh** (re-index this project) + **Export all projects** (Obsidian vault) actions |
426
+ | **Health** | Install diagnostics — Node version, Knit version, ~/.knit permissions, per-agent MCP registration |
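
The brain graph's edge rule (Jaccard similarity over tags and domains, filtered by a threshold) can be sketched as follows. The `LearningNode` and `Edge` shapes and the function names are assumptions for illustration; the README only states that edges come from Jaccard similarity and that the threshold is tunable.

```typescript
// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0 when both sets are empty.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let intersection = 0;
  setA.forEach((item) => {
    if (setB.has(item)) intersection += 1;
  });
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Hypothetical node shape: tags and domains merged into one label list.
interface LearningNode {
  id: string;
  labels: string[];
}
type Edge = { source: string; target: string; weight: number };

// Connect every pair of nodes whose similarity meets the threshold.
function buildEdges(nodes: LearningNode[], threshold: number): Edge[] {
  const edges: Edge[] = [];
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const weight = jaccard(nodes[i].labels, nodes[j].labels);
      if (weight >= threshold) {
        edges.push({ source: nodes[i].id, target: nodes[j].id, weight });
      }
    }
  }
  return edges;
}
```

Raising the threshold prunes weak edges, which is presumably what the slider recomputes live; the pairwise loop is quadratic, which is acceptable for the roughly 1K-entry corpora the README describes.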
341
427
 
342
- **API endpoints** (all read-only, all 127.0.0.1 only):
428
+ **API endpoints** (127.0.0.1 only, Host/Origin-gated):
343
429
 
344
430
  - `GET /api/version` — runtime version + update check + security metadata
345
431
  - `GET /api/brain/summary` — global counts
@@ -347,20 +433,28 @@ knit ui
347
433
  - `GET /api/projects` — project list
348
434
  - `GET /api/projects/:id/learnings` — full learning entries
349
435
  - `GET /api/projects/:id/metrics` — compounding ROI for one project
436
+ - `GET /api/projects/:id/knowledge` — knowledge-index summary
350
437
  - `GET /api/projects/:id/graph` — force-directed node + edge data (Jaccard threshold tunable)
351
438
  - `GET /api/global/learnings` — cross-project pool
352
439
  - `GET /api/doctor` — install diagnostics
353
440
  - `GET /api/events` — Server-Sent Events stream for real-time sync
441
+ - `POST /api/projects/:id/refresh` — re-index a project (source path from its meta; spawned as a child process)
442
+ - `POST /api/export` — export all projects to a fixed `~/.knit/exports/` vault
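
Host/Origin gating on a loopback server is what blocks DNS-rebinding and cross-site requests from a browser. The check could look roughly like the sketch below. Everything beyond the port number is an assumption: the function name, the decision to accept `localhost` as well as `127.0.0.1`, and the choice to allow requests that carry no `Origin` header are illustrative, not Knit's documented behavior.

```typescript
const DASHBOARD_PORT = 7421; // port from the README

// Assumed allow-list: the loopback address and its usual hostname alias.
const ALLOWED_HOSTS = new Set([
  `127.0.0.1:${DASHBOARD_PORT}`,
  `localhost:${DASHBOARD_PORT}`,
]);

// Accept a request only if its Host header, and its Origin header when present,
// both point at the local dashboard.
function isAllowedRequest(host: string | undefined, origin: string | undefined): boolean {
  if (!host || !ALLOWED_HOSTS.has(host.toLowerCase())) return false;
  if (origin === undefined) return true; // assumption: non-browser clients send no Origin
  try {
    return ALLOWED_HOSTS.has(new URL(origin).host);
  } catch {
    return false; // malformed Origin header
  }
}
```

A rebinding attack reaches the server with an attacker-controlled `Host` value, so rejecting unknown hosts stops it even though the TCP connection itself arrives on `127.0.0.1`.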
354
443
 
355
444
  ---
356
445
 
357
- ## 🛠️ 55 MCP Tools
446
+ ## 🛠️ 56 MCP Tools
358
447
 
359
- > **49 active by default** at first handshake. The remaining 6 are tier-gated:
360
- > teams (9 tools, auto-on when ≥3 domains detected), subagents (1 tool, auto-on
361
- > when `.claude/agents/` exists), and admin (3 tools, opt-in via
362
- > `knit_enable_feature("admin")`). Call `knit_list_features` to see what's
363
- > available and how to enable.
448
+ > **37 always-on, up to 19 conditional, 56 total.** The active count varies by
449
+ > project shape, so it isn't one fixed number: it's `37` plus whichever
450
+ > conditional groups your project triggers: teams (9 tools, auto-on when ≥3
451
+ > domains detected), diagnostics (6 tools, on during your first session),
452
+ > subagents (1 tool, auto-on when `.claude/agents/` exists), and admin (3 tools,
453
+ > opt-in via `knit_enable_feature("admin")`). That's why one machine shows 46
454
+ > and another 44 — it reflects each project's shape. Run `knit doctor` (or call
455
+ > `knit_list_features`) for your project's **live count and the reason for it**.
456
+ > The groups below cover the main tools; `knit_list_features` is the
457
+ > authoritative live list.
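
The arithmetic behind the varying count can be written out directly. The group sizes below are the ones quoted in the note; `activeToolCount` is a hypothetical helper, and the combination shown for 44 is one way to reach that number, not necessarily the one a given machine reports.

```typescript
// Counts quoted in the README: 37 always-on tools plus four conditional groups.
const ALWAYS_ON: number = 37;
const CONDITIONAL = { teams: 9, diagnostics: 6, subagents: 1, admin: 3 } as const;
type Group = keyof typeof CONDITIONAL;

// Active tools = always-on tools + the size of every enabled conditional group.
function activeToolCount(enabled: Group[]): number {
  const unique = Array.from(new Set(enabled)); // ignore duplicate group names
  return unique.reduce<number>((sum, group) => sum + CONDITIONAL[group], ALWAYS_ON);
}

console.log(activeToolCount(["teams"]));                                        // 46
console.log(activeToolCount(["diagnostics", "subagents"]));                     // 44
console.log(activeToolCount(["teams", "diagnostics", "subagents", "admin"]));   // 56
```

The four groups sum to 19, which matches the "up to 19 conditional" figure and the 56-tool total.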
364
458
 
365
459
  <details open>
366
460
  <summary><strong>🕸️ Knowledge graph</strong> <em>(Tier 1, ~5ms)</em></summary>
@@ -406,8 +500,9 @@ knit ui
406
500
  | `knit_get_workflow` | Fetch protocol depth for one phase on demand. Sections: `overview, tier, phases, research, ideate, plan, execute, optimize, review, tdd, learn, handoff, ship, tools`. |
407
501
  | `knit_get_suggestions` | Adaptive warnings from past patterns in given domains. |
408
502
  | `knit_reflect` | Detect patterns across recorded learnings (per-project + global pool). Useful with ≥3 entries. |
409
- | `knit_setup_project` | Describe a non-code project (legal, marketing, research) to bootstrap domain teams. |
410
- | `knit_prune_sessions` | Prune `sessions.jsonl` by age (default 90 days). Atomic rewrite. |
503
+ | `knit_onboard` | **v0.21.** One-time onboarding: captures the project + how the user wants Knit, persists preferences (strictness, features, focus domains), records the project intent. |
504
+ | `knit_scan_agent_commands` | Scan each MCP host's slash-command + skill directories; surface user-defined commands so Knit composes with them. |
505
+ | `knit_suggest_command` | Per-phase lookup against scanned commands; returns the agent-native command to invoke. |
411
506
 
412
507
  </details>
413
508
 
@@ -430,7 +525,7 @@ Runtime enforcement of the Knit protocol via PreToolUse and SessionStart hooks.
430
525
  |---|---|
431
526
  | `knit_brain_status` | Brain health + **token-budget** verdicts per surface + `update_available` notification + integrations summary. |
432
527
  | `knit_list_features` | Surfaces hidden tools and tells you how to enable them. The escape hatch. |
433
- | `knit_enable_feature` | Flip on a Tier-2/3 feature (`teams`, `subagents`, `admin`). Emits `notifications/tools/list_changed` — new tools appear without a Claude Code restart. |
528
+ | `knit_enable_feature` | Flip on a Tier-2/3 feature (`teams`, `subagents`, `admin`). Emits `notifications/tools/list_changed` — new tools appear without an agent restart. |
434
529
  | `knit_disable_feature` | Symmetric to enable. |
435
530
  | `knit_scan_integrations` | Re-detect existing workflow frameworks (Ruflo, gstack, CodeTour, Conductor, other MCP servers, custom CLAUDE.md sections). |
436
531
  | `knit_compounding_metrics` | Quantifies *"Knit gets cheaper over time"* — sessions, cache hits, reuse-ratio %, estimated tokens saved. Verdict: `cold \| warming \| compounding \| strong`. |
@@ -468,7 +563,9 @@ Runtime enforcement of the Knit protocol via PreToolUse and SessionStart hooks.
468
563
 
469
564
  | Tool | What it does |
470
565
  |---|---|
471
- | `knit_setup_project` | Bootstrap domain teams for a non-code project. One-time. |
566
+ | `knit_setup_project` | Bootstrap domain teams for a non-code project (legal, marketing, research). One-time. |
567
+ | `knit_prune_sessions` | Prune `sessions.jsonl` by age (default 90 days). Atomic rewrite. Auto-prune handles this normally. |
568
+ | `knit_reset_calibration` | Wipe per-project classifier calibration. Discards accumulated tuning. |
472
569
 
473
570
  </details>
474
571
 
@@ -579,7 +676,7 @@ knit install-agents --refresh # re-fetch from network even if cached
579
676
  "token_budget": {
580
677
  "budgets": {
581
678
  "claude_md": { "bytes": 2048, "target_bytes": 6500, "verdict": "healthy" },
582
- "tool_registry": { "bytes": 8400, "target_bytes": 8500, "verdict": "healthy", "active_tool_count": 31, "total_tool_count": 43 },
679
+ "tool_registry": { "bytes": 8400, "target_bytes": 8500, "verdict": "healthy", "active_tool_count": 46, "total_tool_count": 56 },
583
680
  "instructions": { "bytes": 2200, "target_bytes": 2500, "verdict": "healthy" },
584
681
  "per_session_overhead": { "bytes": 12648, "target_bytes": 17500, "verdict": "healthy" }
585
682
  },
@@ -594,7 +691,7 @@ knit install-agents --refresh # re-fetch from network even if cached
594
691
  "update_available": {
595
692
  "current": "0.8.0",
596
693
  "latest": "0.9.0",
597
- "upgrade": "Restart Claude Code to spawn a fresh MCP — npx will auto-fetch the new version."
694
+ "upgrade": "Restart your agent to spawn a fresh MCP — npx will auto-fetch the new version."
598
695
  }
599
696
  }
600
697
  ```
@@ -605,10 +702,19 @@ Pair with `knit_compounding_metrics` for the value side of the ledger (sessions,
605
702
 
606
703
  ## 💻 CLI
607
704
 
705
+ The surface is dashboard-first: `knit` opens the brain, `knit setup` performs
706
+ one-time agent registration. The remaining commands are operational tooling for
707
+ scripting and CI; their views are progressively moving into the dashboard.
708
+
608
709
  ```bash
609
- knit setup # one time: add MCP to Claude settings
610
- knit status # dashboard: sessions, learnings, hit rate, knowledge health
611
- knit refresh # force rebuild knowledge brain
710
+ knit # open the brain (the dashboard at http://127.0.0.1:7421)
711
+ knit setup # one-time: detect installed MCP-speaking agents and register Knit in each
712
+ knit doctor # install health check: version, per-agent MCP registration, webapp bundle, knowledgebase
713
+ knit ui # explicit alias for the dashboard (same as bare `knit`)
714
+ knit status # terminal snapshot: sessions, learnings, hit rate, knowledge-index health
715
+ knit refresh # rebuild the knowledge index from source
716
+ knit install-agents # install subagent definitions into <project>/.claude/agents/
717
+ knit export <fmt> # export learnings (supported targets: obsidian)
612
718
  ```
613
719
 
614
720
  Example `knit status`:
@@ -624,11 +730,11 @@ Knowledge Base
624
730
  Accessed: 12 (67% hit rate)
625
731
  False positives: 3
626
732
 
627
- Token budget (v0.9)
733
+ Token budget (v0.16)
628
734
  CLAUDE.md: 2.0 KB → healthy
629
- Tool registry: 8.4 KB → healthy (31 active / 43 total)
630
- Instructions: 2.2 KB → healthy
631
- Per-session total: 12.6 KB → healthy
735
+ Tool registry: ~13 KB → warn (46 active / 56 total)
736
+ Instructions: ~4 KB → healthy
737
+ Per-session total: ~20 KB → healthy
632
738
 
633
739
  Compounding
634
740
  Sessions logged: 14
@@ -638,58 +744,32 @@ Compounding
638
744
 
639
745
  ---
640
746
 
641
- ## 🆚 How it's different
642
-
643
- | | gstack (skills) | ECC (agents) | Ruflo (orchestration) | **Knit** |
644
- |--|---|---|---|---|
645
- | **Bet** | Slash-command flows | Agent rules | 100+ agents in swarms | **One disciplined agent, compounding memory** |
646
- | **Setup** | Install skills per-project | Manual `.claude/` setup | `npx ruflo init` (heavy) | **`npx knit-mcp setup` (light)** |
647
- | **Memory** | jsonl files in-tree | Memory directory | Vector DB + 4-tier consolidation | **Local, searchable, vectorless BM25 + graph fusion** |
648
- | **Token cost** | Skills loaded into context | Rules loaded into context | 314 tools advertised | **~2 KB CLAUDE.md, tier-gated registry, budget guardrail** |
649
- | **Parallel work** | None | None | Multi-agent swarms + federation | **Team-scoped git worktrees** |
650
- | **Cloud dependency** | None | None | Cognitum.One (cloud backbone) | **None — fully local** |
651
- | **Self-measurement** | None | None | Cost-tracker plugin | **`knit_brain_status.token_budget` + `knit_compounding_metrics`** |
652
- | **Anti-hallucination** | None | None | None advertised | **`knit_verify_claim` + citation rule + pre/post import validation** |
653
- | **Non-code projects** | No | No | Limited | **Description-driven via `knit_setup_project`** |
654
-
655
- **The bet:** Ruflo for agent quantity (swarms, federation, plugins). Knit for **agent quality** (memory, classification, token discipline, hallucination defense). Different markets. The integration scanner detects Ruflo when installed and tailors instructions to defer routing to it — Knit operates as the memory + classification substrate underneath.
656
-
657
- ---
658
-
659
- ## 🧭 Honest comparison vs memory libraries
660
-
661
- The mem0 / Letta / agentmemory comparison deserves a separate section because they're a different category — **memory-as-a-service libraries**, not MCP-native workflow layers. Reading their published benchmarks side-by-side:
747
+ ## 🧠 Why Knit
662
748
 
663
- | | mem0 | Letta (MemGPT) | agentmemory | **Knit** |
664
- |--|---|---|---|---|
665
- | **Published benchmark** | LOCOMO: 67–92% LLM-as-Judge; ~90% token reduction (1.7K vs 26K per conversation) | No head-to-head token-reduction number; "Letta Leaderboard" benchmarks *LLMs* on agentic memory, not Letta | LongMemEval-S: **95.2% R@5** with BM25+RRF+graph; 86.2% BM25-only | **Not yet measured.** Same architecture as agentmemory; no published number. |
666
- | **Retrieval architecture** | Vector + graph (Mem0g variant) | OS-inspired tiered memory (core/recall/archival) | BM25 + local vectors + KG fused via RRF (k=60) | BM25 + RRF + graph-traversal (fused via RRF k=60). Per-project + cross-project diversity caps. |
667
- | **Install shape** | SDK integration; managed cloud or self-hosted | SDK integration; self-hosted server | Python library | **`npx knit-mcp setup` → MCP server, zero glue.** Works with Claude Code / Cursor / Codex / any MCP host. |
668
- | **Workflow primitive** | None — pure memory | Agent-managed memory operations | None — pure retrieval | **4-tier classifier + plan-mode + protocol guard + parallel team worktrees.** |
669
- | **Self-calibration** | No | No | No | **Per-project classifier calibration** (v0.11): user FP feedback shifts thresholds; classifier gets less wrong over time. |
749
+ Knit is a **project brain your agent plugs into** — a live code knowledge graph wired into ranked memory and a task classifier that routes work by impact. The pieces aren't sold separately; the value is the integration:
670
750
 
671
- ### What's honest about this
751
+ - **Graph-grounded recall**: memory ranked by what your change *structurally* touches (dependents, fanout), not just keyword overlap.
752
+ - **Impact classifier** — every task is sized (Inquiry → Trivial → Standard → Complex) and complex work auto-enters plan mode. The brain decides *how carefully* to handle a change, not just what to recall.
753
+ - **Self-calibrating** — `knit_record_false_positive` shifts the classifier's thresholds per project; it gets less wrong over time.
754
+ - **Token accounting** — `knit_compounding_metrics` makes "cheaper over time" chartable per project.
755
+ - **Parallel team worktrees** — multi-domain work fans out into isolated git worktrees so agents don't collide.
756
+ - **Brain integrity** — a freshness layer keeps every datum trustworthy: stale handoffs auto-clear, idle classifier signals decay, deleted-file references get flagged.
757
+ - **Fully local, zero-glue** — `npx knit-mcp setup` and it's a brain every MCP host (Claude Code, Cursor, Codex, Cline, Continue, Copilot) shares. No cloud, no SDK wiring.
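
To make the classifier and calibration bullets concrete, here is a rough sketch of tiered routing with per-project thresholds. Only the four tier names and the "complex work enters plan mode" rule come from the list above. The input signals, the threshold fields, and the `"over"` / `"under"` direction values are hypothetical stand-ins, not Knit's actual scoring.

```typescript
// The four tiers named above, in increasing order of impact.
const TIERS = ["Inquiry", "Trivial", "Standard", "Complex"] as const;
type Tier = (typeof TIERS)[number];

// HYPOTHETICAL signals; the real classifier inputs are not documented here.
interface TaskSignals {
  editsFiles: boolean; // false for pure questions
  dependents: number; // files that depend on the touched files
}

// HYPOTHETICAL per-project thresholds that feedback can shift.
interface Calibration {
  standardAt: number;
  complexAt: number;
}

function classify(s: TaskSignals, c: Calibration): { tier: Tier; planMode: boolean } {
  let tier: Tier;
  if (!s.editsFiles) tier = "Inquiry";
  else if (s.dependents >= c.complexAt) tier = "Complex";
  else if (s.dependents >= c.standardAt) tier = "Standard";
  else tier = "Trivial";
  // Only Complex work enters plan mode, per the bullet above.
  return { tier, planMode: tier === "Complex" };
}

/** "over" = the task was sized too high, so raise thresholds; "under" lowers them. */
function recordFalsePositive(c: Calibration, direction: "over" | "under"): Calibration {
  const delta = direction === "over" ? 1 : -1;
  const standardAt = Math.max(1, c.standardAt + delta);
  const complexAt = Math.max(standardAt + 1, c.complexAt + delta);
  return { standardAt, complexAt };
}
```

With thresholds `{ standardAt: 3, complexAt: 10 }`, a change touching 10 dependents is sized Complex. After one `"over"` report the thresholds become `{ standardAt: 4, complexAt: 11 }` and the same change is sized Standard.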
672
758
 
673
- **Knit's measured retrieval on a 50-question synthetic harness (v0.11.2):**
759
+ **"Why use Knit if my agent already has memory?"** Your agent's memory *stores notes*; Knit *decides* — it ranks recall by what your change structurally touches, classifies each task to set the right workflow depth, and tracks the cost over time. Graph-grounded routing, not a markdown notepad.
674
760
 
675
- | Metric | Knit (v0.11.2 synthetic) | agentmemory (LongMemEval-S, published) |
676
- |---|---|---|
677
- | Top-1 accuracy | **86.0%** | not published in that form |
678
- | Recall@5 | **96.0%** | **95.2%** |
679
-
680
- Run it yourself: `npm run bench`. Source: [`benchmarks/retrieval-synthetic.ts`](./benchmarks/retrieval-synthetic.ts).
761
+ Knit also **composes with** whatever else you run: `knit_scan_integrations` detects existing workflow frameworks and slash commands and defers to them where they fit — Knit stays the memory + classification brain underneath.
681
762
 
682
- **These numbers are NOT apples-to-apples with agentmemory's.** Their benchmark is 1,500 questions from real long conversations; Knit's is 50 hand-authored questions on a 7KB synthetic corpus. The numbers are close because the architecture is similar (BM25 + RRF), not because we've proven parity at scale. **Real comparison requires running LongMemEval-S on Knit** — on the roadmap for v0.13.
763
+ ### Retrieval benchmarks
683
764
 
684
- **Knit isn't trying to be a better mem0.** It's a different product:
685
- - **MCP-native + zero-glue install** — mem0/Letta require SDK integration; Knit drops into any MCP host (Claude Code, Cursor, Codex) with one command.
686
- - **Workflow primitive** — the 4-tier classifier + plan-mode + protocol guard + team worktrees is what makes Knit a *command layer*, not a memory library.
687
- - **Per-project classifier calibration** (v0.11 slice 4) — `knit_record_false_positive` with a direction tag shifts thresholds over time. Nobody else does this; nobody else needs to, because they're memory libraries, not workflow routers.
688
- - **Measurable cheapness** — `knit_compounding_metrics` + `knit_get_metrics_history` make the "cheaper over time" claim *chartable per project*. mem0 publishes aggregate dataset numbers; Knit ships per-user instrumentation.
765
+ Knit's retrieval is BM25 + reciprocal-rank fusion + graph traversal — **vectorless, deterministic, auditable**, no embedding model or cloud call. In-repo regression gates:
689
766
 
690
- ### What's deferred
767
+ | Harness | Top-1 | Recall@5 | Run it |
768
+ |---|---|---|---|
769
+ | 50-question synthetic | **88%** | **100%** | `npm run bench` |
770
+ | 30-question narrative prose | **86.7%** | **96.7%** | `npm run bench:learnings` |
691
771
 
692
- LongMemEval-S R@5/R@10 + LOCOMO LLM-as-Judge runs are on the roadmap (v0.13+). Until they're published, treat any cross-system token-savings comparison as architectural-claim-only.
772
+ These are focused in-repo regression gates that block a merge if retrieval degrades. A run on a standard long-memory benchmark and a hybrid BM25 + local-embeddings retriever are v0.21+ candidates.
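
For readers unfamiliar with reciprocal-rank fusion, the sketch below shows the general technique: each ranked list contributes `1 / (k + rank)` to a document's score, and the fused order sorts by total score. It is a generic illustration, not Knit's code. The function name and the sample file ids are made up, and `k = 60` is the value conventionally used for RRF.

```typescript
/** Fuse several ranked lists of document ids with reciprocal-rank fusion. */
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by score descending; break ties by id so the output is deterministic.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([id]) => id);
}

// Illustrative inputs: one list from keyword scoring, one from graph traversal.
const bm25 = ["auth.ts", "session.ts", "db.ts"];
const graph = ["session.ts", "router.ts", "auth.ts"];
console.log(reciprocalRankFusion([bm25, graph]));
// → ["session.ts", "auth.ts", "router.ts", "db.ts"]
```

Because the fusion uses only ranks, not raw scores, lists produced by different scorers can be combined without normalising their scales.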
693
773
 
694
774
  ---
695
775
 
@@ -697,11 +777,18 @@ LongMemEval-S R@5/R@10 + LOCOMO LLM-as-Judge runs are on the roadmap (v0.13+). U
697
777
 
698
778
  | Version | Headline |
699
779
  |---|---|
700
- | **v0.12.0** | **Picture Perfect: Structural Enforcement.** Diagnostic enforcing. Budget verdict surfaces in the MCP `instructions` field at handshake (before any tool description is read). `knit_load_session` carries `budget_health` + `learnings_health` nudges. `engram doctor` exits non-zero on over-budget; `engram setup` runs doctor as final step. New PostToolUse hook warns immediately on over-budget CLAUDE.md edits (HOOKS_VERSION 11→12; auto-rolls to existing users). This repo dogfoods: hand-curated 16KB CLAUDE.md migrated to lean 3.8KB plus an internal long-form sidecar. New `npm run bench:tokens` measures real MCP-on vs MCP-off cost: 93% smaller per-recall call, 50% smaller per-classify, payback at 3 recall calls. 53 tools, 705 tests. |
780
+ | **v0.21.0** | **Onboarding + dashboard actions.** `knit_onboard` captures the project + how the user wants Knit (preferences persisted, intent surfaced every session, host-agnostic). The dashboard gains **Refresh** + **Export all projects** actions (non-blocking child processes, Host/Origin-gated). New `GET /api/projects/:id/knowledge` + a `knit doctor` webapp check. Shipped after a second six-dimension audit (0 critical) + a real-life E2E. 56 tools. |
781
+ | **v0.20.0** | **Brain integrity + clarity + dashboard-first.** A freshness layer keeps every datum trustworthy (handoffs auto-clear, idle classifier signals decay, deleted-file references get flagged). `knit doctor`/`knit_list_features` explain the live tool count. Mid-session protocol re-surfacing keeps agents on-protocol across every MCP host. **`knit`** opens the brain dashboard; a read-only Knowledge-index view + Skills composition land. Removed competitor comparisons for intrinsic positioning. Shipped after a six-dimension deep-clean audit (0 critical). 55 tools, 855 tests. |
782
+ | **v0.16.0** | **Semantic-lite retrieval.** Curated coding-domain synonym dictionary (~50 pairs) closes the most common BM25 lexical gaps (`hook` ↔ `webhook`, `schema` ↔ `migration`, etc.) without an embedding model. 2-gram fallback for typos default ON after bench verification. Synthetic bench 88% top-1 / **100% recall@5** (was 96%); learnings 86.7% top-1 / 96.7% recall@5. Plus a FIFO-safe `O_NONBLOCK` fix to `handleIndexRequirements`. 55 tools, 818 tests. |
783
+ | **v0.15.0** | **Deep-clean audit release.** Six-dimension second audit + atomic-write helper applied to 9+ sites including `~/.claude.json` (a torn write there used to brick Claude Code). SHA256 sidecars on agent-fetcher cache writes detect tampering and re-fetch. `qs` CVE pinned via `npm overrides` → 0 vulns. Opt-in BM25 2-gram fallback for typos. `pruneLearningsByAge` + schema-validated `readLearnings`. Webapp DoctorView shows per-agent rows. Update notice surfaces in MCP `instructions` field for all 6 agents. 55 tools, 805+ tests. |
784
+ | **v0.14.1** | **Ship-readiness audit + atomicity hardening.** First six-dimension audit + 14 P1 fixes: `writeFileAtomic` helper across 9+ persistence paths; `handleSetupProject` redaction gap closed; `record_learning` substring dedup matches the description claim; soft-gate documented in instructions field; pre-publish leak gate. 55 tools. |
785
+ | **v0.14.0** | **Universality release.** Single `knit setup` detects + writes to every installed MCP-speaking agent (Claude Code, Cursor, Codex CLI, Cline, Continue, GitHub Copilot via VS Code Agent mode). Server-side soft-gates as the cross-platform protocol enforcement layer for agents without hook lifecycles. Slash-command auto-detection via `knit_scan_agent_commands` + `knit_suggest_command`. 55 tools. |
786
+ | **v0.13.0** | **Brain dashboard release.** `knit ui` opens a local-first analytics dashboard (Monetir-inspired bento, force-directed brain graph, real-time SSE sync, Host/Origin validation + CSP). Security hardening across every endpoint. Universal positioning copy across CLI + README. |
787
+ | **v0.12.0** | **Picture Perfect: Structural Enforcement.** Diagnostic → enforcing. Budget verdict surfaces in the MCP `instructions` field at handshake (before any tool description is read). `knit_load_session` carries `budget_health` + `learnings_health` nudges. `knit doctor` exits non-zero on over-budget; `knit setup` runs doctor as final step. New PostToolUse hook warns immediately on over-budget CLAUDE.md edits (HOOKS_VERSION 11→12; auto-rolls to existing users). This repo dogfoods: hand-curated 16KB CLAUDE.md migrated to lean 3.8KB plus an internal long-form sidecar. New `npm run bench:tokens` measures real MCP-on vs MCP-off cost: 93% smaller per-recall call, 50% smaller per-classify, payback at 3 recall calls. 53 tools, 705 tests. |
701
788
  | **v0.11.4** | Dogfood audit · ran a full audit of Knit's own codebase using its own `knit_spawn_team_worktree` primitive (4 parallel teams: Core Logic, Infrastructure, UI, Quality Assurance). Fixes: HIGH `engram refresh` no longer clobbers user-curated CLAUDE.md (now uses `spliceKnitBlock` like `cache.ts`); `saveSource`/`loadSource` validate `sourceId`; `appendGlobalLearning` propagates write failures; `redactSecrets` applied to `label`/`tags`/`domains` across all persistence boundaries; 100KB response ceiling on `knit_generate_test_cases`; full v0.11 tool surface now documented in `workflow-protocol.ts` generator (was frozen at the v0.4 surface). Plus: 16 key tools reclassified with `[PROTOCOL]`/`[REVIEW]`/`[MEMORY]`/`[GRAPH]` prefixes so the LLM picks the right tool reliably. 53 tools, 687 tests. |
702
789
  | **v0.11.3** | Propagation patch · `update_available` flag now surfaces in `knit_load_session` response (≈100% session reach vs. brain_status' low reach) + startup stderr nag on stale versions. Helps FUTURE upgrades land faster; doesn't retroactively reach v0.10.x users. 53 tools, 665 tests. |
703
790
  | **v0.11.2** | Pre-publish polish · chunk cap (2000) + `errorResponse` envelope across handlers + CLAUDE.md generator surfaces v0.11 tools · new `engram doctor` install health-check CLI · upgrade-path smoke test caught + fixed a data-loss bug in cache.ts (Case B was wiping user permissions on upgrade) · 11 real exploit-payload integration tests prove C1/C2/H1 fixes hold · `npm run bench` ships a synthetic retrieval harness (50 Q&A) measuring 86% top-1 / 96% R@5. 53 tools, 664 tests. |
704
- | **v0.11.1** | Audit-driven hardening · 3 CRITICAL (source_id path traversal, post-edit tsc shell injection, live calibration bug) + 10 HIGH fixes from a 5-agent audit, implemented in 3 parallel `knit_spawn_team_worktree` teams. HOOKS_VERSION 11 (auto-upgrades existing users). New `knit_delete_requirements` tool. Honest comparison vs mem0/Letta added. 53 tools, 636 tests. |
791
+ | **v0.11.1** | Audit-driven hardening · 3 CRITICAL (source_id path traversal, post-edit tsc shell injection, live calibration bug) + 10 HIGH fixes from a 5-agent audit, implemented in 3 parallel `knit_spawn_team_worktree` teams. HOOKS_VERSION 11 (auto-upgrades existing users). New `knit_delete_requirements` tool. 53 tools, 636 tests. |
705
792
  | **v0.11.0** | Verify Layer + auto-config foundation · mandatory `knit_verify_claim` REVIEW gate · post-edit diff verify + universal `tsc` check · drift detector · self-healing classifier (per-project calibration) · `knit_index_requirements` + `knit_generate_test_cases` (BM25 over long specs) · `knit_get_fingerprint` + `knit_infer_domains` + `knit_compose_template` (zero-config CLAUDE.md). 52 tools, 625 tests. |
706
793
  | **v0.10.0** | Token-economics release · risk × scope × change_kind classifier split · `context_budget_remaining` graceful degradation · per-project diversity cap on cross-project search · 11 new compounding-metrics fields + weekly snapshot persistence + `knit_get_metrics_history`. Makes "Knit makes Claude cheaper" a chartable number from day 1. |
707
794
  | **v0.9.0** | Hook-level enforcement · citation rule · `knit_verify_claim` · auto-search in classify · `suggested_reads` · `knit_get_learning` · `knit_consolidate_learnings`. |
@@ -733,17 +820,18 @@ git clone https://github.com/PDgit12/knit.git
733
820
  cd knit
734
821
  npm install
735
822
  npm run dev # run CLI locally
736
- npm run test # 492 tests
823
+ npm run test # 818 tests, ~8 s
737
824
  npm run typecheck # TypeScript strict mode
738
- npm run build # compile CLI + MCP server
825
+ npm run bench # retrieval bench: synthetic + learnings-shape
826
+ npm run build # compile CLI + MCP server + webapp
739
827
  ```
740
828
 
741
829
  ### Architecture
742
830
 
743
831
  ```
744
832
  knit (npm package)
745
- ├── dist/cli.js # CLI: setup, status, refresh
746
- └── dist/mcp/server.js # MCP server: 43 tools (tier-gated), auto-init
833
+ ├── dist/cli.js # CLI: setup, doctor, ui, status, refresh, install-agents, export
834
+ └── dist/mcp/server.js # MCP server: 56 tools (tier-gated), auto-init
747
835
 
748
836
  per-project, in ~/.knit/projects/<hash>/
749
837
  ├── knowledge.json # import graph + exports + test map
@@ -758,7 +846,7 @@ per-project, in <project>/
758
846
  └── .claude/settings.local.json # per-machine hooks, knit-managed
759
847
  ```
760
848
 
761
- **Zero external dependencies for the knowledge brain.** 492 tests. Strict-mode TypeScript.
849
+ **Zero external dependencies for the knowledge brain.** 818 tests, 0 `npm audit` vulnerabilities. Strict-mode TypeScript.
762
850
 
763
851
  ---
764
852