knit-mcp 0.11.4 → 0.16.0

package/README.md CHANGED
@@ -3,50 +3,80 @@
3
3
  <a href="https://github.com/PDgit12/knit/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/PDgit12/knit/ci.yml?style=for-the-badge&label=CI&color=10b981" alt="CI" /></a>
4
4
  <img src="https://img.shields.io/badge/license-MIT-3b82f6?style=for-the-badge" alt="license" />
5
5
  <img src="https://img.shields.io/badge/node-%E2%89%A518-339933?style=for-the-badge&logo=node.js&logoColor=white" alt="node" />
6
- <img src="https://img.shields.io/badge/tests-665%20passing-22c55e?style=for-the-badge" alt="tests" />
7
- <img src="https://img.shields.io/badge/MCP%20tools-53-7c3aed?style=for-the-badge" alt="tools" />
6
+ <img src="https://img.shields.io/badge/MCP%20tools-55-7c3aed?style=for-the-badge" alt="tools" />
7
+ <img src="https://img.shields.io/badge/agents-6-10b981?style=for-the-badge" alt="agents supported" />
8
+ <img src="https://img.shields.io/badge/local--first-100%25-3b82f6?style=for-the-badge" alt="local-first" />
8
9
  </p>
9
10
 
10
11
  <h1 align="center">🧶 knit</h1>
11
12
 
12
13
  <p align="center">
13
- <strong>An intelligent command layer for Claude Code.</strong><br/>
14
- Project-scoped memory · on-demand workflow · parallel team worktrees · honest token accounting.<br/>
15
- <em>All in one MCP server.</em>
14
+ <strong>Universal MCP brain for agentic coding platforms.</strong><br/>
15
+ Project-scoped memory · on-demand workflow · parallel team worktrees · live analytics dashboard.<br/>
16
+ <em>Works with Claude Code, Cursor, Codex CLI, Cline, Continue, and GitHub Copilot (via VS Code Agent mode) — anything that speaks MCP.</em>
16
17
  </p>
17
18
 
18
19
  <p align="center">
19
20
  <a href="#-quick-start">Quick start</a> ·
20
21
  <a href="#-what-knit-is">What it is</a> ·
21
- <a href="#-whats-new-in-v0110">v0.11</a> ·
22
- <a href="#-52-mcp-tools">Tools</a> ·
23
- <a href="#-how-its-different">Comparison</a> ·
24
- <a href="#-honest-comparison-vs-memory-libraries">vs mem0/Letta</a>
22
+ <a href="#-how-search-works">How search works</a> ·
23
+ <a href="#-55-mcp-tools">Tools</a> ·
24
+ <a href="#-the-dashboard">Dashboard</a> ·
25
+ <a href="#-how-its-different">vs mem0/Letta</a>
25
26
  </p>
26
27
 
27
28
  ---
28
29
 
29
30
  ## 🧠 What knit is
30
31
 
31
- Knit makes Claude Code do the right thing automatically — because you can't predict how a user will phrase a request. It does three jobs at once:
32
+ Knit gives **any MCP-speaking coding agent** the right defaults automatically — because you can't predict how a user will phrase a request, and every agent (Claude Code, Cursor, Codex CLI, Cline, Continue, GitHub Copilot) ends up burning tokens re-discovering the same project facts. Knit does four jobs at once:
32
33
 
33
34
  | | |
34
35
  |---|---|
35
- | 🧠 **Memory** | Every project keeps a brain at `~/.knit/projects/<hash>/`. Sessions compound: learnings, false positives, session summaries, and a static-analysis import graph are all queryable next session. |
36
- | 🪶 **Tokens** | `CLAUDE.md` is ~2 KB (project facts only). Protocol depth is fetched on demand via `knit_get_workflow(phase)`. Knit is **net-negative** on context cost. |
36
+ | 🧠 **Memory** | Every project keeps a brain at `~/.knit/projects/<hash>/`. Sessions compound: learnings, false positives, session summaries, and a static-analysis import graph are all queryable next session. Cross-project pool at `~/.knit/global/`. |
37
+ | 🪶 **Tokens** | `CLAUDE.md` is ~2 KB (project facts only). Protocol depth is fetched on demand via `knit_get_workflow(phase)`. Per-cache-hit savings ≈ 15K tokens (calibrated from instrumented RESEARCH phases — override via env). Reuse-ratio + ROI surfaced in the dashboard. |
37
38
  | 🛠️ **Workflow** | A 4-tier classification (Inquiry / Trivial / Standard / Complex) with phase-triggered plan mode, quality-gated `LEARN`, and team-scoped git worktrees so parallel agents don't step on each other. |
39
+ | 📊 **Dashboard** | New in v0.13. `knit ui` opens a local-first analytics dashboard at `http://127.0.0.1:7421` — bento layout, brain savings, per-project ROI, **force-directed brain graph**, real-time sync via SSE. See [Dashboard](#-the-dashboard). |
38
40
 
39
- It's a **single product**, not three. Every design choice has to win on memory + tokens + workflow together.
41
+ **Local-first** invariant: zero cloud calls in memory/retrieval/classification. Dashboard binds to `127.0.0.1` only, with Host/Origin validation + CSP headers. Your brain stays on your machine.
42
+
43
+ It's a **single product**, not four. Every design choice has to win on memory + tokens + workflow + analytics together.
40
44
 
41
45
  ---
42
46
 
43
47
  ## 🚀 Quick start
44
48
 
45
49
  ```bash
46
- npx knit-mcp@latest setup
50
+ npm install -g knit-mcp
51
+ knit setup # adds Knit MCP to your agent's config (Claude Code / Cursor / Codex / etc.)
52
+ knit ui # opens the brain dashboard at http://127.0.0.1:7421 (optional but recommended)
47
53
  ```
48
54
 
49
- Adds the Knit MCP server to your Claude Code config (`~/.claude.json`). **No per-project setup.** Open Claude Code in any project — the first MCP tool call auto-initializes the brain, hooks, and per-project CLAUDE.md block.
55
+ **No per-project setup.** Open your MCP-speaking agent in any project — the first MCP tool call auto-initializes the brain, hooks, and per-project CLAUDE.md block.
56
+
57
+ ### Adoption per agent
58
+
59
+ v0.14: a single `knit setup` detects **every** installed MCP-speaking agent on
60
+ your machine and writes Knit's config into each one's native format. No
61
+ per-agent manual setup, no copy-pasted JSON.
62
+
63
+ | Agent | Auto-detected by `knit setup` | Config format written | Hook support |
64
+ |---|---|---|---|
65
+ | Claude Code | ✅ `~/.claude.json` | JSON · `mcpServers` | ✅ PreToolUse / PostToolUse / Stop |
66
+ | Cursor | ✅ `.cursor/mcp.json` | JSON · `mcpServers` | ⚠️ approval flow only |
67
+ | Codex CLI | ✅ `~/.codex/config.toml` | **TOML** · `[mcp_servers.knit-brain]` | ⚠️ approval flow only |
68
+ | Cline | ✅ `~/.cline/mcp.json` + `AGENTS.md` | JSON · `mcpServers` | ⚠️ approval flow only |
69
+ | Continue | ✅ `.continue/mcpServers/knit-brain.yaml` | **YAML** per-server | ⚠️ approval flow only |
70
+ | GitHub Copilot (VS Code Agent mode) | ✅ `.vscode/mcp.json` | JSON · `servers` (unique key) | ⚠️ approval flow only |
71
+ | Any other MCP client | ✅ stdio works universally | per the client's docs | varies |
72
+
73
+ > **"Hook support" caveat:** only Claude Code has lifecycle hooks (PreToolUse /
74
+ > PostToolUse / Stop). For the other 5 agents Knit enforces the protocol via
75
+ > the MCP `instructions` field (handshake primer) + **server-side soft-gates**
76
+ > in tool responses — same effect as hooks, transport-layer instead of host-layer.
77
+ > Opt into block-strictness enforcement with `knit_set_protocol_strictness({level: 'block'})`.
78
+
79
+ > **Supported shells:** macOS, Linux, WSL, Git Bash, PowerShell. Windows `cmd.exe` is not supported as the hook-runner shell — use PowerShell (default in modern Windows Terminal) or Git Bash.
50
80
 
51
81
  > **Supported shells:** macOS, Linux, WSL, Git Bash, PowerShell. Windows `cmd.exe` is not supported as the hook-runner shell — use PowerShell (default in modern Windows Terminal) or Git Bash.
52
82
 
@@ -73,32 +103,207 @@ Knit writes nowhere else on your machine.
73
103
 
74
104
  ---
75
105
 
76
- ## What's new in v0.9.0
106
+ ## 🔍 How search works
107
+
108
+ Knit's retrieval is **BM25 + Reciprocal Rank Fusion** over your learnings,
109
+ session summaries, and the cross-project pool, with two cheap-but-honest
110
+ lexical-bridging layers stacked on top: **2-gram fallback** for typos and
111
+ rare compounds, and **curated coding-domain synonym expansion** for the
112
+ most common semantic-gap pairs. No vector embeddings, no remote inference,
113
+ no API calls.
114
+
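Reciprocal Rank Fusion, named above, merges several ranked lists by summing `1 / (k + rank)` for every document across the lists. The sketch below is illustrative only: the helper name `fuseRankings`, the sample ids, and the constant `k = 60` (a value commonly used for RRF) are assumptions, not taken from Knit's source.

```javascript
// Minimal Reciprocal Rank Fusion sketch (hypothetical helper, not Knit's code).
// Each input list is an array of document ids, best match first.
function fuseRankings(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties broken by id so output stays deterministic.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([id]) => id);
}

// Hypothetical per-corpus BM25 result lists.
const fromLearnings = ["L-12", "L-07", "L-03"];
const fromSessions = ["L-07", "S-21"];
const fromGlobal = ["L-07", "L-12"];
console.log(fuseRankings([fromLearnings, fromSessions, fromGlobal]));
```

An entry that appears near the top of several lists (here `L-07`) outranks one that tops only a single list, which is the property that makes rank fusion useful when the underlying scores are not comparable.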
115
+ **Why this design choice (not an oversight):**
116
+
117
+ - **Deterministic.** Same query → same ranking, every time. No model
118
+ drift, no upgrade-day surprises.
119
+ - **Fast.** Sub-millisecond on corpora ≤ 1K entries (your typical
120
+ project memory). No cold start, no model load.
121
+ - **Local-first.** Zero network calls. Your memory never leaves the
122
+ machine.
123
+ - **Auditable.** You can explain every hit by looking at term overlap
124
+ + the synonym dictionary (50 pairs, hand-curated). No "the model
125
+ said so."
126
+ - **Honest at the boundary.** The bench has documented misses where
127
+ even synonym expansion can't bridge the gap — we ship those visible,
128
+ not hidden.
129
+
130
+ **What it does well.** Exact term match, identifier search
131
+ (`knit_classify_task`), rare-term emphasis (e.g. `PIPE_BUF`), multi-word
132
+ ranking, tag filtering, cross-project diversification (max 2 per
133
+ project), branch diversification on sessions (max 2 per branch). **Typo
134
+ recovery via 2-gram fallback** (`knit_clasify` → `knit_classify_task`).
135
+ **Synonym recovery via curated dictionary** (`hook` ↔ `webhook`,
136
+ `schema` ↔ `migration`, `auth` ↔ `authentication`, `cache` ↔ `memo`,
137
+ `deploy` ↔ `ship` ↔ `release`, etc. — see
138
+ [`src/engine/retrieval/synonyms.ts`](src/engine/retrieval/synonyms.ts)
139
+ for the full ~50-pair dictionary). Synonym matches scored at 0.4× of a
140
+ direct BM25 hit so genuine matches always rank higher.
141
+
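To illustrate how a 2-gram fallback can recover a typo such as `knit_clasify`, here is a minimal sketch that scores character-bigram overlap with the Dice coefficient. The choice of Dice, the function names, and the candidate list are assumptions for illustration; the README does not say how Knit scores 2-grams.

```javascript
// Character 2-grams of a lowercased string, as a Set.
function bigrams(text) {
  const s = text.toLowerCase();
  const grams = new Set();
  for (let i = 0; i < s.length - 1; i++) grams.add(s.slice(i, i + 2));
  return grams;
}

// Dice coefficient over the two bigram sets: 2|A ∩ B| / (|A| + |B|).
function bigramSimilarity(a, b) {
  const ga = bigrams(a);
  const gb = bigrams(b);
  if (ga.size === 0 || gb.size === 0) return 0;
  let shared = 0;
  for (const g of ga) if (gb.has(g)) shared++;
  return (2 * shared) / (ga.size + gb.size);
}

// Pick the candidate term closest to the query (null if nothing overlaps).
function closestTerm(query, candidates) {
  let best = null;
  let bestScore = 0;
  for (const c of candidates) {
    const score = bigramSimilarity(query, c);
    if (score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}

// Hypothetical index vocabulary.
const terms = ["knit_classify_task", "knit_search_learnings", "knit_get_workflow"];
console.log(closestTerm("knit_clasify", terms));
```

Because the misspelled query still shares most of its bigrams with `knit_classify_task`, that term wins even though an exact token match would fail.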
142
+ **What it still cannot do.** Multi-word paraphrase ("how do schema
143
+ changes ship" with no shared terms). Deep abstraction-level bridging
144
+ ("data consistency" → "atomic temp+rename"). Question intent
145
+ ("what's the right pattern for X"). Negation. Cross-entry synthesis
146
+ ("based on the auth lessons, what should I do for OAuth"). These need
147
+ either embeddings (model dependency + bundle weight, breaks local-first
148
+ unless run locally via ONNX) or an LLM call layer (Knit-as-retrieval
149
+ becomes Knit-as-agent, different identity). v0.20+ candidate: hybrid
150
+ retrieval (BM25 + local embeddings via RRF) — opt-in, bench-gated.
151
+
152
+ **The practical implication.** Search with words close to how you
153
+ recorded the learning, OR words that have a synonym pair in the
154
+ dictionary. If you write a learning about *webhook signatures*, you
155
+ can now search either *webhook signatures* OR *hook signatures* —
156
+ the dictionary bridges those. For genuinely different vocabulary that
157
+ isn't in the synonym table, use `knit_search_global_learnings` to widen
158
+ the corpus, or call `knit_search_sessions` to pull from past narrative
159
+ summaries that may use more terms.
160
+
161
+ **Bench numbers (v0.16):** synthetic 88.0% top-1 / **100% recall@5**,
162
+ learnings (real-prose) 86.7% top-1 / 96.7% recall@5. Both default ON;
163
+ opt-out via `enableNgramFallback: false` + `enableSynonyms: false` for
164
+ a strict lexical-only baseline.
77
165
 
78
- v0.9 closes the **enforcement story** — every honest limit from the v0.8 architecture got a structural fix.
166
+ ---
79
167
 
80
- ### Anti-hallucination
168
+ ## ✨ What's new in v0.15.0
169
+
170
+ v0.15 is the **deep-clean release**. A second six-dimension internal audit
171
+ graded the post-v0.14.1 codebase and surfaced the deferred items — defense-
172
+ in-depth, retrieval honesty, UX parity, the trailing TODO debt. A single
173
+ audit-cleanup branch closed them all, then six parallel agents re-graded
174
+ the post-fix code to confirm nothing new slipped in.
175
+
176
+ - **Security defense-in-depth.** Every `git` invocation in `worktrees.ts`
177
+ migrated to `execFileSync` with array args (no shell). Agent fetcher
178
+ cache writes are SHA256-verified via sidecars; tampered caches force
179
+ a fresh fetch with stderr alert; pre-v0.15 caches backfilled on first
180
+ read. `qs` CVE (GHSA-q8mj-m7cp-5q26) pinned via npm `overrides` —
181
+ `npm audit` now reports 0 vulnerabilities.
182
+ - **Brain mechanics.** New `pruneLearningsByAge` parallels the sessions
183
+ pattern (atomic rewrite, conservatively preserves unparseable dates +
184
+ `#false-positive` entries). `readLearnings` schema-validates on read.
185
+ Opt-in BM25 2-gram fallback (`enableNgramFallback`, default off)
186
+ rescues typo-only queries without disturbing benchmarks.
187
+ - **Retrieval honesty.** New `bench:learnings` regression bench against
188
+ 30 real-learning-shape narrative entries — gates at top-1 ≥ 75% /
189
+ recall@5 ≥ 90% (currently 83.3% / 96.7%). Compounding-metrics response
190
+ now surfaces token-saved methodology with env-var overrides.
191
+ - **UX & instructions.** Webapp DoctorView shows per-agent rows (parity
192
+ with CLI `knit doctor`). Workflow `EXECUTE` + `REVIEW` phases now embed
193
+ `knit_suggest_command` hooks so the agent defers to user slash-commands
194
+ for test/lint/ship/qa/review. `buildUpdateNotice` surfaces npm-update
195
+ banner in the MCP instructions field — Cursor/Codex/Cline/Continue/
196
+ Copilot users now see updates at handshake.
197
+
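The sidecar verification described in the security bullet above can be sketched as follows. The `.sha256` suffix, the function names, and the return convention are assumptions for illustration; the README does not describe Knit's actual cache layout.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// Hex SHA-256 digest of a buffer or string.
function sha256Hex(data) {
  return crypto.createHash("sha256").update(data).digest("hex");
}

// Write the cache file plus a sidecar holding its digest (hypothetical layout).
function writeCacheWithSidecar(cachePath, contents) {
  fs.writeFileSync(cachePath, contents);
  fs.writeFileSync(cachePath + ".sha256", sha256Hex(contents));
}

// Return the cached contents, or null when the file or sidecar is missing
// or the digest does not match, so the caller can fetch a fresh copy.
function readVerifiedCache(cachePath) {
  try {
    const contents = fs.readFileSync(cachePath);
    const expected = fs.readFileSync(cachePath + ".sha256", "utf-8").trim();
    if (sha256Hex(contents) !== expected) {
      process.stderr.write("[sketch] cache digest mismatch: " + cachePath + "\n");
      return null;
    }
    return contents.toString("utf-8");
  } catch {
    return null;
  }
}
```

Treating a missing sidecar the same as a mismatch keeps the failure mode simple: anything that cannot be verified is refetched.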
198
+ ## ✨ What's new in v0.14.0
199
+
200
+ v0.14 is the **universality release**. Three coordinated shifts: every
201
+ MCP-speaking agent works out of the box, Knit composes with the slash
202
+ commands you already wrote, and enforcement works across all agents
203
+ (not just Claude Code).
204
+
205
+ ### 🌍 Six agents, one install
206
+
207
+ `knit setup` now detects every installed MCP-speaking agent and writes Knit's
208
+ config into each one's native format — JSON for Claude Code / Cursor / Cline /
209
+ VS Code (note: `servers` not `mcpServers` for VS Code), TOML for Codex CLI,
210
+ YAML for Continue. If Codex CLI or Cline is detected, a marker-wrapped
211
+ `AGENTS.md` is also written at project root (the cross-agent rules convention).
212
+ `knit doctor` now reports per-agent registration status, so you can see
213
+ which of your agents are wired up at a glance.
214
+
215
+ ### 🔧 Cross-platform protocol enforcement
216
+
217
+ Only Claude Code has hook lifecycles (PreToolUse / PostToolUse / Stop). For
218
+ the other 5 agents, v0.14 adds **server-side soft-gates** in MCP tool
219
+ responses. When strictness is set to `block`, protocol-critical handlers
220
+ return `{ status: 'protocol_required', next_action: '...' }` instead of
221
+ proceeding — the agent reads the response, follows the breadcrumb, retries.
222
+ This is the universality answer: same enforcement, transport layer instead
223
+ of host layer. Default strictness stays `warn` so existing flows are unchanged.
224
+
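A server-side soft-gate of this kind can be sketched as a wrapper around a tool handler. Only the response shape `{ status: 'protocol_required', next_action }` and the `warn` / `block` levels come from the text above; the wrapper name, the session-state shape, the `warning` field, and the choice of `classify` as the prerequisite step are hypothetical.

```javascript
// Wrap a handler so it refuses to proceed in `block` mode until a
// prerequisite step has been recorded (hypothetical state shape).
function withSoftGate(handler, { requires, nextAction }) {
  return (args, state) => {
    const satisfied = state.completedSteps.includes(requires);
    if (!satisfied && state.strictness === "block") {
      return { status: "protocol_required", next_action: nextAction };
    }
    const result = handler(args, state);
    if (!satisfied) {
      // `warn` mode: proceed, but attach a nudge to the response.
      return { ...result, warning: "Skipped step: " + requires };
    }
    return result;
  };
}

// Hypothetical gated handler.
const recordEdit = withSoftGate(
  (args) => ({ status: "ok", file: args.file }),
  { requires: "classify", nextAction: "Call knit_classify_task first, then retry." }
);

console.log(recordEdit({ file: "a.ts" }, { strictness: "block", completedSteps: [] }));
console.log(recordEdit({ file: "a.ts" }, { strictness: "warn", completedSteps: [] }));
```

Because the gate lives in the tool response, any MCP client sees it, with no dependency on host-side hooks.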
225
+ ### ⚡ Agent-native slash-command auto-detection
226
+
227
+ Two new Tier-1 MCP tools:
228
+
229
+ - `knit_scan_agent_commands` — scans `.claude/commands/`, `.cursor/rules/`,
230
+ `.clinerules/`, `~/.codex/prompts/`, `~/.continue/prompts/`, `.github/prompts/`
231
+ and surfaces every user-defined slash command + its description.
232
+ - `knit_suggest_command({phase})` — given a protocol phase (test/lint/review/
233
+ ship), returns matching commands so the agent can invoke `/test` (or
234
+ whatever you wrote) via the host's native slash mechanism, instead of
235
+ describing the work in prose.
236
+
237
+ Cached at `~/.knit/projects/<hash>/agent-commands.json` with a 1-hour TTL
238
+ (~10ms re-scan when stale). Read-only filesystem ops; Knit never executes
239
+ commands — the host agent invokes via its own mechanism.
240
+
241
+ Dashboard exposes the scan results at **`#/commands`** with searchable
242
+ per-agent listing.
243
+
244
+ ### 🛡️ Audit + hardening before publish
245
+
246
+ v0.14 included a deep-dive internal audit of every dashboard
247
+ endpoint, MCP handler, fs.watch race condition, and supply-chain dep. Five
248
+ inline fixes landed in commit `e4e1793`:
249
+ - `fs.watch` error handler now resets `watcher = null` so SSE recovers
250
+ cleanly after a watcher death (pre-fix, real-time sync silently stopped
251
+ until `knit ui` restart).
252
+ - JSON + SSE responses gained `X-Content-Type-Options: nosniff`,
253
+ `X-Frame-Options: DENY`, `Referrer-Policy: no-referrer` (pre-fix only on
254
+ HTML).
255
+ - `handleDefineTeam` + `handlePostTeamFindings` now call `redactSecrets` on
256
+ user-supplied team metadata + finding descriptions (pre-fix: raw write to
257
+ disk). 9 of 9 write handlers now redact uniformly.
258
+
259
+ CBSE-style attack class verified PASS on every dashboard endpoint:
260
+ Host-validation + Origin-validation + read-only contract + same-origin CSP
261
+ + hex-only project-id regex. No malicious-page-can-read-your-brain vector.
262
+
263
+ ## ✨ What's new in v0.13.0
264
+
265
+ v0.13 ships the **dashboard** — the visual surface on top of the brain. Plus security hardening and the universal positioning (works with every MCP-speaking agent).
266
+
267
+ ### 📊 Brain dashboard (`knit ui`)
268
+
269
+ A single command opens a local-first analytics surface at `http://127.0.0.1:7421` — bento layout inspired by modern fintech dashboards, color-blocked cards, generous spacing, real-time sync.
270
+
271
+ | View | What it shows |
272
+ |---|---|
273
+ | **Brain** (`#/`) | Hero card with net tokens saved across all projects, recent activity feed (live), memory hit-rate arc, top projects by ROI |
274
+ | **Graph** (`#/graph`) | Project picker → **force-directed brain graph**: every learning is a node, edges by Jaccard similarity over shared tags + domains. Click any node for the full lesson. Threshold slider. |
275
+ | **Cross-project** (`#/global`) | Cross-project learnings pool, filterable by source project |
276
+ | **Per-project** (`#/p/:id`) | Searchable learnings list, retrieval signals, ROI deep dive (`#/p/:id/metrics`), graph (`#/p/:id/graph`) |
277
+ | **Health** (`#/doctor`) | Install diagnostics: ~/.knit writable, MCP registered, version current |
81
278
 
82
- - 📎 **Citation rule in the MCP `instructions` field.** Every session's system prompt now tells the agent: *"when you state a fact about this codebase, cite the Knit tool result that verified it — e.g. (per `knit_query_imports`). If you can't cite, say 'unverified' explicitly."* Makes hallucinations visible at the **claim level**.
83
- - 🔍 **`knit_verify_claim` tool.** Single-call fact-check against the knowledge graph. Parses *"A imports B"*, *"X exports Y"*, *"A is tested by B"*, *"X exists"* and returns `verified | contradicted | unparseable` with evidence.
279
+ **Real-time sync via SSE.** The server watches `~/.knit/` via `fs.watch`; any agent recording a learning anywhere updates the open dashboard within ~250ms. No polling.
84
280
 
85
- ### Smarter retrieval
281
+ ### 🔐 Security hardening (real, not theater)
86
282
 
87
- - **Auto-search inside `knit_classify_task`.** For `standard` / `complex` tier, classify now runs BM25 over (description + affected domains) automatically and embeds top-3 hits as `pre_emptive_learnings`. Closes the *"agent skipped `knit_search_learnings` before re-investigating"* gap with **zero extra calls**.
88
- - 📚 **`suggested_reads` from `knit_build_context`.** Curated list of files worth opening *before* editing — three signals: graph-importers (blast radius), graph-imports (likely needed), memory-mentions (files referenced by past learnings). Each entry carries `{ path, reason, via }`.
89
- - 🪜 **`knit_get_learning` — hierarchical retrieval.** Search returns headlines (summary + tags); the agent expands a specific learning by id only when needed. **Pay-per-detail.**
90
- - 🧮 **`knit_consolidate_learnings`.** Tag-Jaccard clustering of similar learnings → one pattern entry per cluster. Dry-run by default; `commit=true` persists with originals tagged `#consolidated` (preserved but deprioritized).
283
+ The dashboard is a localhost HTTP server, which has real attack surface. v0.13 closes it:
91
284
 
92
- ### Hook-level enforcement (`HOOKS_VERSION` 6 → 7)
285
+ - **Host-header validation** — rejects requests whose `Host` isn't `127.0.0.1`/`localhost`. Blocks **DNS rebinding** (a malicious site you visit could resolve `evil.com` to 127.0.0.1 and trick your browser into reading the dashboard).
286
+ - **Origin-header validation** — cross-origin requests get `403`. Same defense pattern as PostgreSQL, Redis, Docker daemon, the React dev server.
287
+ - **Content-Security-Policy** on every HTML response — same-origin scripts only, no `'unsafe-eval'`, no external sources.
288
+ - **X-Frame-Options: DENY**, X-Content-Type-Options: nosniff, Referrer-Policy: no-referrer.
289
+ - **No mutation endpoints** in v0.13 (read-only dashboard). Setup wizard / refresh button stay deferred until proper CSRF protection lands.
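The Host and Origin checks listed above can be sketched as a single predicate. This is a minimal illustration, not Knit's implementation: the function name is hypothetical, and accepting requests that carry no `Origin` header is an assumption made here for plain same-origin GETs.

```javascript
// Accept only requests addressed to the loopback dashboard itself.
// Header names are lowercase, as Node's http module provides them.
function isAllowedRequest(headers, port) {
  const allowedHosts = new Set(["127.0.0.1:" + port, "localhost:" + port]);
  const host = (headers.host || "").toLowerCase();
  if (!allowedHosts.has(host)) return false; // rejects DNS-rebinding hostnames

  const origin = headers.origin;
  if (origin === undefined) return true; // assumption: no Origin header is acceptable
  const allowedOrigins = new Set([...allowedHosts].map((h) => "http://" + h));
  return allowedOrigins.has(origin.toLowerCase());
}

console.log(isAllowedRequest({ host: "127.0.0.1:7421" }, 7421));
console.log(isAllowedRequest({ host: "evil.com:7421" }, 7421));
```

In a Node `http` server this check would run at the top of the request handler, replying with `403` whenever it returns `false`.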
93
290
 
94
- | Hook | What it does |
95
- |---|---|
96
- | **PreToolUse search-gate** | For `standard`/`complex` tasks, blocks Edit/Write (in `block` mode) or warns (default `warn`) when `knit_search_learnings` hasn't fired in the current turn. |
97
- | **PreToolUse content inspection** | Reads proposed Edit/Write content, parses local imports, warns on relative paths that don't resolve on disk — **catches hallucinated imports before they land**. |
98
- | **PostToolUse import validation** | After the file lands, re-parses imports and warns about unresolved relative paths — catches anything that slipped past the pre-check. |
99
- | **Stop-hook budget watch** | Cheap CLAUDE.md size check at session end; warns if it crosses the 12.5 KB over-budget threshold. Drift becomes visible even when the agent doesn't call `knit_brain_status`. |
291
+ ### 🌍 Universal positioning
292
+
293
+ Knit is an MCP server. Anything that speaks MCP works:
294
+
295
+ - **Claude Code**: handshake via stdio; the `instructions` field carries the protocol primer
296
+ - **Cursor**: register the Knit MCP server in settings
297
+ - **Codex CLI**: `[mcp_servers.knit-brain]` table in `~/.codex/config.toml`
298
+ - **Cline / Continue**: both speak MCP, same setup
299
+
300
+ The dashboard works regardless of which agent you use — it reads the brain from disk.
100
301
 
101
- > **Upgrade note.** After `npx knit-mcp@latest setup`, **restart Claude Code**. The `instructions` field and tier-gated `tools/list` only flow into the system prompt at handshake. The `HOOKS_VERSION` bump auto-regenerates installed hooks on the next brain load — no manual `knit refresh` needed.
302
+ ### 🪙 Token-economy lever
303
+
304
+ `knit ui` notifies you when a new `knit-mcp` is available on npm — polls the registry every 5 minutes server-side, banner pops in the dashboard with the one-line `npm install -g knit-mcp@latest` command. No stale installs.
305
+
306
+ > **Upgrade note.** After `npm install -g knit-mcp@latest`, **restart your agent**. The `instructions` field flows into the system prompt at handshake. The `HOOKS_VERSION` bump auto-regenerates installed hooks on the next brain load — no manual `knit refresh` needed.
102
307
 
103
308
  ---
104
309
 
@@ -115,7 +320,47 @@ Each surface gets a `healthy | warn | over-budget` verdict from `knit_brain_stat
115
320
 
116
321
  ---
117
322
 
118
- ## 🛠️ 43 MCP Tools
323
+ ## 📊 The dashboard
324
+
325
+ Run `knit ui` to open the local analytics surface. **Single command**, no other CLI needed for normal operation:
326
+
327
+ ```bash
328
+ knit ui
329
+ # Knit Dashboard — http://127.0.0.1:7421
330
+ # Reading from: /Users/<you>/.knit
331
+ # Press Ctrl-C to stop.
332
+ # (automatically opens your default browser)
333
+ ```
334
+
335
+ | Feature | What you see |
336
+ |---|---|
337
+ | **Bento home** | Big "Net tokens saved" hero card (dark), live recent activity (green "live" dot when SSE connected), memory hit-rate gauge, top projects by ROI as color-blocked cards |
338
+ | **Brain graph** | Force-directed visualization of one project's learnings. Nodes sized by access count, colored by domain. Edges by Jaccard similarity over tags + domains. Click any node → side panel with the full lesson. Threshold slider live-recomputes the graph. |
339
+ | **Per-project deep dive** | Hero card with verdict tone (cold/warming/compounding/strong), retrieval signals, classifications-by-tier breakdown, top domains heatmap, searchable learnings list |
340
+ | **Health** | Install diagnostics — Node version, Knit version, ~/.knit permissions, MCP registration in `~/.claude.json` |
341
+
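The brain graph's edge rule, Jaccard similarity over tags plus domains with a tunable threshold, can be sketched as below. The learning shape (`id`, `tags`, `domains`), the function names, and the edge format are assumptions for illustration; only the similarity measure and the threshold idea come from the table above.

```javascript
// Jaccard similarity of two label lists: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const x of setA) if (setB.has(x)) shared++;
  return shared / (setA.size + setB.size - shared);
}

// Build undirected edges between learnings whose combined tags + domains
// overlap at or above the threshold (hypothetical learning shape).
function buildEdges(learnings, threshold) {
  const edges = [];
  for (let i = 0; i < learnings.length; i++) {
    for (let j = i + 1; j < learnings.length; j++) {
      const a = [...learnings[i].tags, ...learnings[i].domains];
      const b = [...learnings[j].tags, ...learnings[j].domains];
      const weight = jaccard(a, b);
      if (weight >= threshold) {
        edges.push({ source: learnings[i].id, target: learnings[j].id, weight });
      }
    }
  }
  return edges;
}

// Hypothetical sample data.
const sample = [
  { id: "a", tags: ["auth", "jwt"], domains: ["api"] },
  { id: "b", tags: ["auth", "oauth"], domains: ["api"] },
  { id: "c", tags: ["css"], domains: ["ui"] },
];
console.log(buildEdges(sample, 0.3));
```

Raising the threshold prunes weak links, which is presumably what the dashboard's slider does when it recomputes the graph.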
342
+ **API endpoints** (all read-only, all 127.0.0.1 only):
343
+
344
+ - `GET /api/version` — runtime version + update check + security metadata
345
+ - `GET /api/brain/summary` — global counts
346
+ - `GET /api/brain/aggregate` — cross-project ROI totals
347
+ - `GET /api/projects` — project list
348
+ - `GET /api/projects/:id/learnings` — full learning entries
349
+ - `GET /api/projects/:id/metrics` — compounding ROI for one project
350
+ - `GET /api/projects/:id/graph` — force-directed node + edge data (Jaccard threshold tunable)
351
+ - `GET /api/global/learnings` — cross-project pool
352
+ - `GET /api/doctor` — install diagnostics
353
+ - `GET /api/events` — Server-Sent Events stream for real-time sync
354
+
355
+ ---
356
+
357
+ ## 🛠️ 55 MCP Tools
358
+
359
+ > **49 active by default** at first handshake. The remaining 6 are tier-gated:
360
+ > teams (9 tools, auto-on when ≥3 domains detected), subagents (1 tool, auto-on
361
+ > when `.claude/agents/` exists), and admin (3 tools, opt-in via
362
+ > `knit_enable_feature("admin")`). Call `knit_list_features` to see what's
363
+ > available and how to enable.
119
364
 
120
365
  <details open>
121
366
  <summary><strong>🕸️ Knowledge graph</strong> <em>(Tier 1, ~5ms)</em></summary>
@@ -452,6 +697,7 @@ LongMemEval-S R@5/R@10 + LOCOMO LLM-as-Judge runs are on the roadmap (v0.13+). U
452
697
 
453
698
  | Version | Headline |
454
699
  |---|---|
700
+ | **v0.12.0** | **Picture Perfect: Structural Enforcement.** Diagnostic → enforcing. Budget verdict surfaces in the MCP `instructions` field at handshake (before any tool description is read). `knit_load_session` carries `budget_health` + `learnings_health` nudges. `engram doctor` exits non-zero on over-budget; `engram setup` runs doctor as final step. New PostToolUse hook warns immediately on over-budget CLAUDE.md edits (HOOKS_VERSION 11→12; auto-rolls to existing users). This repo dogfoods: hand-curated 16KB CLAUDE.md migrated to lean 3.8KB plus an internal long-form sidecar. New `npm run bench:tokens` measures real MCP-on vs MCP-off cost: 93% smaller per-recall call, 50% smaller per-classify, payback at 3 recall calls. 53 tools, 705 tests. |
455
701
  | **v0.11.4** | Dogfood audit · ran a full audit of Knit's own codebase using its own `knit_spawn_team_worktree` primitive (4 parallel teams: Core Logic, Infrastructure, UI, Quality Assurance). Fixes: HIGH `engram refresh` no longer clobbers user-curated CLAUDE.md (now uses `spliceKnitBlock` like `cache.ts`); `saveSource`/`loadSource` validate `sourceId`; `appendGlobalLearning` propagates write failures; `redactSecrets` applied to `label`/`tags`/`domains` across all persistence boundaries; 100KB response ceiling on `knit_generate_test_cases`; full v0.11 tool surface now documented in `workflow-protocol.ts` generator (was frozen at the v0.4 surface). Plus: 16 key tools reclassified with `[PROTOCOL]`/`[REVIEW]`/`[MEMORY]`/`[GRAPH]` prefixes so the LLM picks the right tool reliably. 53 tools, 687 tests. |
456
702
  | **v0.11.3** | Propagation patch · `update_available` flag now surfaces in `knit_load_session` response (≈100% session reach vs. brain_status' low reach) + startup stderr nag on stale versions. Helps FUTURE upgrades land faster; doesn't retroactively reach v0.10.x users. 53 tools, 665 tests. |
457
703
  | **v0.11.2** | Pre-publish polish · chunk cap (2000) + `errorResponse` envelope across handlers + CLAUDE.md generator surfaces v0.11 tools · new `engram doctor` install health-check CLI · upgrade-path smoke test caught + fixed a data-loss bug in cache.ts (Case B was wiping user permissions on upgrade) · 11 real exploit-payload integration tests prove C1/C2/H1 fixes hold · `npm run bench` ships a synthetic retrieval harness (50 Q&A) measuring 86% top-1 / 96% R@5. 53 tools, 664 tests. |
@@ -2,14 +2,15 @@ import {
2
2
  detectProjectRoot,
3
3
  getBrain,
4
4
  refreshBrain
5
- } from "./chunk-I63UMEBF.js";
6
- import "./chunk-HROSQ5MS.js";
7
- import "./chunk-GATMQQK5.js";
8
- import "./chunk-WKQHCLLO.js";
9
- import "./chunk-MOOVNMIN.js";
10
- import "./chunk-ST4X7LZT.js";
11
- import "./chunk-M3YZOJNW.js";
5
+ } from "./chunk-JE4BZQUD.js";
6
+ import "./chunk-QM4U75VE.js";
7
+ import "./chunk-V54QPQ6K.js";
8
+ import "./chunk-2FAS6CV4.js";
12
9
  import "./chunk-POXT5OYN.js";
10
+ import "./chunk-WKQHCLLO.js";
11
+ import "./chunk-FX3SVNHX.js";
12
+ import "./chunk-YRLAWCYW.js";
13
+ import "./chunk-BBQSWT4H.js";
13
14
  import "./chunk-VB2TIR6L.js";
14
15
  import "./chunk-7UFS67HP.js";
15
16
  import "./chunk-27TA2ZQZ.js";
@@ -15,7 +15,7 @@ import {
15
15
  } from "./chunk-27TA2ZQZ.js";
16
16
 
17
17
  // src/generators/settings.ts
18
- var HOOKS_VERSION = 11;
18
+ var HOOKS_VERSION = 12;
19
19
  function generateSettings(config, rootPath) {
20
20
  return {
21
21
  mcpServers: {
@@ -288,6 +288,39 @@ function generateHooks(config, rootPath) {
       }
     ]
   });
+  hooks.PostToolUse.push({
+    _knitOwned: true,
+    matcher: "Write|Edit|MultiEdit",
+    hooks: [
+      {
+        type: "command",
+        command: nodeHook(`
+          let d = "";
+          process.stdin.on("data", (c) => d += c);
+          process.stdin.on("end", () => {
+            try {
+              const fs = require("fs");
+              const path = require("path");
+              const i = JSON.parse(d);
+              const ti = i.tool_input || {};
+              const f = ti.file_path || (i.tool_response && i.tool_response.filePath) || "";
+              if (!f) return;
+              if (path.basename(f) !== "CLAUDE.md") return;
+              const TARGET = 6500;
+              const SLACK = 6500 * 1.25;
+              let size = 0;
+              try { size = fs.statSync(f).size; } catch { return; }
+              if (size <= TARGET) return;
+              const kb = Math.round(size/1024*10)/10;
+              const verdict = size > SLACK ? "over-budget" : "warn";
+              process.stderr.write("[knit] BUDGET " + verdict + ": " + f + " is now " + kb + "KB (target 6.5KB). Trim CLAUDE.md or run \\\`knit refresh\\\` to regenerate.\\n");
+            } catch (e) { try { process.stderr.write('[knit] claude-md size watch hook failed: ' + (e && e.message ? e.message : e) + '\\n'); } catch {} }
+          });
+        `),
+        timeout: 5
+      }
+    ]
+  });
   hooks.PostToolUse.push({
     _knitOwned: true,
     matcher: "Write|Edit|MultiEdit",
@@ -1,14 +1,30 @@
 // src/engine/learnings.ts
-import { readFileSync, writeFileSync, appendFileSync, existsSync, mkdirSync } from "fs";
+import { readFileSync, writeFileSync, appendFileSync, existsSync, mkdirSync, rmdirSync, renameSync } from "fs";
 import { dirname } from "path";
 function readLearnings(filePath) {
   if (!existsSync(filePath)) return [];
   const content = readFileSync(filePath, "utf-8");
   const entries = [];
   const sections = content.split(/^## /m).slice(1);
+  let parseFailures = 0;
+  let emptyShells = 0;
   for (const section of sections) {
     const entry = parseEntry(section);
-    if (entry) entries.push(entry);
+    if (!entry) {
+      parseFailures++;
+      continue;
+    }
+    if (!entry.summary.trim() || !entry.lesson.trim()) {
+      emptyShells++;
+      continue;
+    }
+    entries.push(entry);
+  }
+  if (parseFailures > 0 || emptyShells > 0) {
+    process.stderr.write(
+      `[knit] readLearnings(${filePath}): skipped ${parseFailures} unparseable, ${emptyShells} empty-shell entries
+`
+    );
   }
   return entries;
 }
@@ -325,7 +325,7 @@ function buildSummary(allFiles, sourceFiles, importGraph, testMap, rootPath) {
  const pkg = JSON.parse(readFileSync(pkgPath, "utf-8"));
  if (pkg.main) entryPoints.push(pkg.main);
  if (pkg.bin) {
- const bins = typeof pkg.bin === "string" ? [pkg.bin] : Object.values(pkg.bin);
+ const bins = typeof pkg.bin === "string" ? [pkg.bin] : Object.values(pkg.bin).filter((v) => typeof v === "string");
  entryPoints.push(...bins);
  }
  }
@@ -1,11 +1,17 @@
+ import {
+ installAgentsForProject,
+ pruneSessionsByAge
+ } from "./chunk-QM4U75VE.js";
+ import {
+ writeFileAtomic
+ } from "./chunk-V54QPQ6K.js";
  import {
  HOOKS_VERSION,
  generateSettings
- } from "./chunk-HROSQ5MS.js";
+ } from "./chunk-2FAS6CV4.js";
  import {
- installAgentsForProject,
- pruneSessionsByAge
- } from "./chunk-GATMQQK5.js";
+ prewarmLatestVersion
+ } from "./chunk-POXT5OYN.js";
  import {
  importFromMarkdown,
  loadKnowledgeBaseSafe,
@@ -14,16 +20,13 @@ import {
  import {
  buildKnowledge,
  buildReverseDependencies
- } from "./chunk-MOOVNMIN.js";
+ } from "./chunk-FX3SVNHX.js";
  import {
  scanProject
- } from "./chunk-ST4X7LZT.js";
+ } from "./chunk-YRLAWCYW.js";
  import {
  readLearnings
- } from "./chunk-M3YZOJNW.js";
- import {
- prewarmLatestVersion
- } from "./chunk-POXT5OYN.js";
+ } from "./chunk-BBQSWT4H.js";
  import {
  persistScanResult,
  scanIntegrations
@@ -50,7 +53,7 @@ import {
50
53
 
  // src/mcp/cache.ts
  import { execSync } from "child_process";
- import { existsSync, mkdirSync, writeFileSync, readFileSync, copyFileSync, readdirSync, statSync } from "fs";
+ import { existsSync, mkdirSync, readFileSync, copyFileSync, readdirSync, statSync } from "fs";
  import { join, basename, dirname } from "path";
58
 
56
59
  // src/generators/learnings.ts
@@ -111,6 +114,9 @@ function getBrain(rootPath) {
  const projectName = detectProjectName(rootPath);
  const kbLoad = loadKnowledgeBaseSafe(knowledgebasePath(rootPath), projectName);
  const knowledgeBase = kbLoad.kb;
+ if (!kbLoad.loadFailed && knowledgeBase.projectName !== projectName) {
+ knowledgeBase.projectName = projectName;
+ }
  const config = {
  name: projectName,
  packageManager: scan.packageManager,
@@ -119,7 +125,7 @@ function getBrain(rootPath) {
  targetAgent: "claude-code",
  tokenOptimization: "standard"
  };
- writeFileSync(knowledgePath(rootPath), JSON.stringify(knowledge, null, 2), "utf-8");
+ writeFileAtomic(knowledgePath(rootPath), JSON.stringify(knowledge, null, 2));
  if (!kbLoad.loadFailed) {
  saveKnowledgeBase(knowledgebasePath(rootPath), knowledgeBase);
  }
@@ -178,14 +184,14 @@ function autoInitialize(rootPath) {
  });
  const learningsPath = learningsFilePath(rootPath, projectName);
  if (!existsSync(learningsPath)) {
- writeFileSync(learningsPath, generateLearningsContent(config), "utf-8");
+ writeFileAtomic(learningsPath, generateLearningsContent(config));
  }
  const kbPath = knowledgebasePath(rootPath);
  const kb = loadKnowledgeBaseSafe(kbPath, projectName).kb;
  const entries = readLearnings(learningsPath);
  importFromMarkdown(kb, entries);
  saveKnowledgeBase(kbPath, kb);
- writeFileSync(knowledgePath(rootPath), JSON.stringify(knowledge, null, 2), "utf-8");
+ writeFileAtomic(knowledgePath(rootPath), JSON.stringify(knowledge, null, 2));
  }
  function migrateLegacyData(rootPath) {
  mkdirSync(projectDataDir(rootPath), { recursive: true });
@@ -219,7 +225,7 @@ can be deleted at your discretion. Future learnings, knowledge indexes, and
  session memory live in the new path.
  `;
  try {
- writeFileSync(breadcrumb, note, "utf-8");
+ writeFileAtomic(breadcrumb, note);
  } catch {
  }
  }
@@ -228,24 +234,23 @@ function writeProjectClaudeMd(rootPath, config, knowledge) {
  const claudeMdPath = join(rootPath, "CLAUDE.md");
  const block = generateClaudeMd(config, knowledge);
  if (!existsSync(claudeMdPath)) {
- writeFileSync(claudeMdPath, block, "utf-8");
+ writeFileAtomic(claudeMdPath, block);
  return;
  }
  const existing = readFileSync(claudeMdPath, "utf-8");
  if (existing.includes(KNIT_MARKER_START)) {
  const { content } = spliceKnitBlock(existing, block);
- writeFileSync(claudeMdPath, content, "utf-8");
+ writeFileAtomic(claudeMdPath, content);
  return;
  }
  const sidecarDir = join(rootPath, ".claude");
  const sidecarPath = join(sidecarDir, "KNIT.md");
- mkdirSync(sidecarDir, { recursive: true });
  const sidecar = `<!-- This file is Knit's per-project workflow. -->
  <!-- Your CLAUDE.md exists without Knit markers, so Knit wrote here instead of clobbering it. -->
  <!-- To include this content in CLAUDE.md, add: @.claude/KNIT.md -->
246
251
 
  ${block}`;
- writeFileSync(sidecarPath, sidecar, "utf-8");
+ writeFileAtomic(sidecarPath, sidecar);
  }
  function copyIfExists(src, dst) {
  if (existsSync(src) && !existsSync(dst)) {
@@ -258,8 +263,7 @@ function writeKnitHooks(rootPath, config) {
  const settingsPath = join(claudeDir, "settings.local.json");
  const fresh = generateSettings(config, rootPath);
  if (!existsSync(settingsPath)) {
- mkdirSync(claudeDir, { recursive: true });
- writeFileSync(settingsPath, JSON.stringify(fresh, null, 2), "utf-8");
+ writeFileAtomic(settingsPath, JSON.stringify(fresh, null, 2));
  return;
  }
  let existing;
@@ -299,8 +303,7 @@ function writeKnitHooks(rootPath, config) {
  _knitHooks: { ...fresh._knitHooks, merged: true }
  };
  delete merged._engramHooks;
- mkdirSync(claudeDir, { recursive: true });
- writeFileSync(settingsPath, JSON.stringify(merged, null, 2), "utf-8");
+ writeFileAtomic(settingsPath, JSON.stringify(merged, null, 2));
  }
  function detectProjectName(rootPath) {
  let name = basename(rootPath);