npm - sverklo - Versions diffs - 0.27.0 → 0.28.1 - Mend

sverklo 0.27.0 → 0.28.1

Files changed (92)

package/README.md +84 -74
package/dist/bin/sverklo.js +38 -10
package/dist/bin/sverklo.js.map +1 -1
package/dist/src/audit-prompt.js +10 -10
package/dist/src/doctor.js +23 -15
package/dist/src/doctor.js.map +1 -1
package/dist/src/init.d.ts +1 -1
package/dist/src/init.js +39 -29
package/dist/src/init.js.map +1 -1
package/dist/src/prove.d.ts +34 -0
package/dist/src/prove.js +199 -0
package/dist/src/prove.js.map +1 -0
package/dist/src/server/assets/dashboard.js +2 -2
package/dist/src/server/hints.js +40 -41
package/dist/src/server/hints.js.map +1 -1
package/dist/src/server/mcp-server.d.ts +10 -0
package/dist/src/server/mcp-server.js +218 -119
package/dist/src/server/mcp-server.js.map +1 -1
package/dist/src/server/prompts.js +37 -37
package/dist/src/server/prompts.js.map +1 -1
package/dist/src/server/tool-overrides.js +71 -59
package/dist/src/server/tool-overrides.js.map +1 -1
package/dist/src/server/tools/ask.js +7 -7
package/dist/src/server/tools/ask.js.map +1 -1
package/dist/src/server/tools/ast-grep.js +2 -2
package/dist/src/server/tools/ast-grep.js.map +1 -1
package/dist/src/server/tools/audit.js +5 -5
package/dist/src/server/tools/audit.js.map +1 -1
package/dist/src/server/tools/clusters.js +3 -3
package/dist/src/server/tools/clusters.js.map +1 -1
package/dist/src/server/tools/concepts.js +6 -6
package/dist/src/server/tools/concepts.js.map +1 -1
package/dist/src/server/tools/context.js +7 -7
package/dist/src/server/tools/context.js.map +1 -1
package/dist/src/server/tools/critique.js +4 -4
package/dist/src/server/tools/critique.js.map +1 -1
package/dist/src/server/tools/ctx-handles.js +3 -3
package/dist/src/server/tools/ctx-handles.js.map +1 -1
package/dist/src/server/tools/dependencies.js +3 -3
package/dist/src/server/tools/dependencies.js.map +1 -1
package/dist/src/server/tools/diff-search.js +2 -2
package/dist/src/server/tools/diff-search.js.map +1 -1
package/dist/src/server/tools/find-references.js +3 -3
package/dist/src/server/tools/find-references.js.map +1 -1
package/dist/src/server/tools/forget.js +3 -3
package/dist/src/server/tools/forget.js.map +1 -1
package/dist/src/server/tools/impact.js +1 -1
package/dist/src/server/tools/impact.js.map +1 -1
package/dist/src/server/tools/index-status.js +21 -21
package/dist/src/server/tools/index-status.js.map +1 -1
package/dist/src/server/tools/investigate.js +5 -5
package/dist/src/server/tools/investigate.js.map +1 -1
package/dist/src/server/tools/list-repos.js +1 -1
package/dist/src/server/tools/list-repos.js.map +1 -1
package/dist/src/server/tools/lookup.js +4 -4
package/dist/src/server/tools/lookup.js.map +1 -1
package/dist/src/server/tools/memories.js +3 -3
package/dist/src/server/tools/memories.js.map +1 -1
package/dist/src/server/tools/overview.js +3 -3
package/dist/src/server/tools/overview.js.map +1 -1
package/dist/src/server/tools/patterns.js +1 -1
package/dist/src/server/tools/patterns.js.map +1 -1
package/dist/src/server/tools/pin.js +2 -2
package/dist/src/server/tools/pin.js.map +1 -1
package/dist/src/server/tools/post-filter.js +3 -3
package/dist/src/server/tools/post-filter.js.map +1 -1
package/dist/src/server/tools/recall.js +4 -4
package/dist/src/server/tools/recall.js.map +1 -1
package/dist/src/server/tools/remember.js +2 -2
package/dist/src/server/tools/remember.js.map +1 -1
package/dist/src/server/tools/review-diff.js +5 -5
package/dist/src/server/tools/review-diff.js.map +1 -1
package/dist/src/server/tools/search-iterative.js +10 -10
package/dist/src/server/tools/search-iterative.js.map +1 -1
package/dist/src/server/tools/search.js +5 -5
package/dist/src/server/tools/search.js.map +1 -1
package/dist/src/server/tools/test-map.js +3 -3
package/dist/src/server/tools/test-map.js.map +1 -1
package/dist/src/server/tools/tier.js +4 -4
package/dist/src/server/tools/tier.js.map +1 -1
package/dist/src/server/tools/verify.js +4 -4
package/dist/src/server/tools/verify.js.map +1 -1
package/dist/src/server/tools/wakeup.js +3 -3
package/dist/src/server/tools/wakeup.js.map +1 -1
package/dist/src/server/trajectory.js +20 -8
package/dist/src/server/trajectory.js.map +1 -1
package/dist/src/utils/ollama.js +1 -1
package/dist/src/utils/ollama.js.map +1 -1
package/package.json +2 -2
package/src/skills/sverklo-onboard.md +4 -4
package/src/skills/sverklo-refactor.md +5 -5
package/src/skills/sverklo-review.md +3 -3

package/README.md CHANGED

@@ -6,19 +6,27 @@
   🇬🇧 <b>English</b> · 🇨🇳 <a href="./README-zh-CN.md">中文</a>
 </p>
-> *"The map is not the territory."* — Alfred Korzybski
->
-> Training data is the map. Your codebase is the territory. **Sverklo gives the agent the territory.**
+# Give your coding agent repo memory.
-**Local-first code intelligence.** Sverklo is the open-source MCP server that gives Claude Code, Cursor, Windsurf, and Zed a real symbol graph, blast-radius analysis, and git-pinned memory — so your AI coding agent stops hallucinating function names on large repos. The only code-intel MCP with a published benchmark and reproducible eval harness. MIT. Zero config. Your code never leaves the machine.
+Sverklo gives coding agents repo memory: symbols, callers, diffs, blast radius, and git-pinned decisions before they edit. It is an open-source local-first MCP server for Claude Code, Cursor, Windsurf, Codex CLI, and any MCP-speaking coding agent.
-> **Local-first code intelligence** ◦ No cloud upload ◦ No embedding lottery ◦ Single MCP tool call
+**Local-first** ◦ MIT ◦ no API keys ◦ no code upload ◦ first run downloads a local ONNX model
+Use grep when you know the exact string. Use Sverklo when the agent needs relationships: who calls this, what depends on it, what changed, and which project decisions still apply. The public bench covers 180 hand-verified tasks across 6 OSS codebases; the methodology and ground truth live in [sverklo/sverklo-bench](https://github.com/sverklo/sverklo-bench). [Bench](https://sverklo.com/bench/) · [paper](https://doi.org/10.5281/zenodo.19802051) · [90-second demo](https://www.youtube.com/watch?v=OX7aEgdlqhQ)
+```bash
+npm install -g sverklo
+cd your-project && sverklo init
+sverklo prove
+```
-**43× fewer input tokens than naive grep**, single tool call vs grep's 7-12 — measured on 90 hand-verified tasks across sverklo, express, and lodash. F1 0.56 overall (leader), 0.73 on definition lookup. [bench:primitives](https://sverklo.com/bench/) is reproducible from a fresh clone with one npm script. Methodology + ground truth lives in its own repo: [sverklo/sverklo-bench](https://github.com/sverklo/sverklo-bench). [Paper](https://doi.org/10.5281/zenodo.19802051) · [bench:swe](https://sverklo.com/blog/bench-swe-first-results/) — 38/65 perfect recall on 5 OSS repos, **including the runs we lose**.
+`sverklo init` writes the MCP config for your agent, appends local instructions to `AGENTS.md` or `CLAUDE.md`, and runs `sverklo doctor` to verify the handshake. `sverklo prove` then shows central files, a real symbol with callers, and the exact prompt to paste into your agent. Your code stays on your machine.
-`blind grep` returns 17,000 tokens of regex hits with no ranking, no semantic recall, no call-graph awareness. `embedding lottery` returns chunks ranked by cosine similarity without verifying any of them are load-bearing. Sverklo returns ~470 tokens of ranked, traceable, call-graph-aware results in a single tool call.
+> *"The map is not the territory."* — Alfred Korzybski
+>
+> Training data is the map. Your codebase is the territory. **Sverklo gives the agent the territory.**
-### One-click install
+### Editor shortcuts
 [![Install in Claude Code](https://img.shields.io/badge/Claude_Code-Install_Plugin-CC785C?style=for-the-badge&logoColor=white)](#claude-code) [![Install in Cursor](https://img.shields.io/badge/Cursor-Install_MCP-F14C28?style=for-the-badge&logo=cursor&logoColor=white)](cursor://anysphere.cursor-deeplink/mcp/install?name=sverklo&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInN2ZXJrbG8iXX0=) [![Install in VS Code](https://img.shields.io/badge/VS_Code-Install_MCP-0098FF?style=for-the-badge&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=sverklo&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22sverklo%22%5D%7D) [![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install_MCP-24bfa5?style=for-the-badge&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=sverklo&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22sverklo%22%5D%7D&quality=insiders) [![Install in Windsurf](https://img.shields.io/badge/Windsurf-sverklo_init-09B6A2?style=for-the-badge&logoColor=white)](#windsurf--zed--vs-code--jetbrains)
@@ -29,9 +37,9 @@
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.19802051.svg)](https://doi.org/10.5281/zenodo.19802051)
 [![GitHub stars](https://img.shields.io/github/stars/sverklo/sverklo?style=social)](https://github.com/sverklo/sverklo/stargazers)
-> ⭐ **If sverklo saved your AI from hallucinating, please star this repo** — it's the single most useful thing you can do to help others find it. Then [share with one teammate](https://twitter.com/intent/tweet?text=Local-first%20MCP%20code%20intelligence%20for%20AI%20coding%20agents%20%E2%80%94%20MIT%2C%20zero-deps%2C%20honest%20benchmark%20%40%20https%3A%2F%2Fsverklo.com%2Fbench%2F) who's tired of `getUserByEmail()` not existing in their codebase.
+> If `sverklo prove` surfaces useful repo context, please star the repo. It is the fastest way to help other agent-heavy teams find it.
-![Sverklo cuts agent context by 65 % vs grep — bench:primitives, 60 retrieval tasks, peer-reviewable](./docs/hero-token-comparison.png)
+![Sverklo bench:primitives token comparison](./docs/hero-token-comparison.png)
 [![Watch the 90-second demo: terminal + Claude Code MCP integration](https://i.ytimg.com/vi/OX7aEgdlqhQ/maxresdefault.jpg)](https://www.youtube.com/watch?v=OX7aEgdlqhQ)
@@ -67,9 +75,10 @@ Sverklo drills into your repo before the agent does — symbol graph, blast radi
 ```bash
 npm install -g sverklo
 cd your-project && sverklo init
+sverklo prove
 ```
-That's it. `sverklo init` auto-detects your installed AI coding agent (Claude Code, Cursor, Windsurf, Zed), writes the right MCP config, appends instructions to `AGENTS.md` if present (otherwise `CLAUDE.md`), and runs `sverklo doctor` to verify the setup. Works on macOS, Linux, and Windows. **No API keys. No cloud. Telemetry off by default.**
+That's it. `sverklo init` auto-detects your installed AI coding agent (Claude Code, Cursor, Windsurf, Zed), writes the right MCP config, appends instructions to `AGENTS.md` if present (otherwise `CLAUDE.md`), and runs `sverklo doctor` to verify the setup. `sverklo prove` shows the first useful repo-memory proof from your own codebase. Works on macOS, Linux, and Windows. **No API keys. No cloud. Telemetry off by default.**
 > The embedding model (`all-MiniLM-L6-v2` ONNX, ~86 MB) is downloaded from HuggingFace on first use into `~/.sverklo/models/` and cached forever — every subsequent run is fully offline.
@@ -81,7 +90,7 @@ That's it. `sverklo init` auto-detects your installed AI coding agent (Claude Co
 Likely you've seen tools that look adjacent. The honest one-paragraph answers, with detailed comparisons linked.
-**…just grep with extra steps?** No, but tuned grep is genuinely competitive on F1. On the [90-task bench](https://sverklo.com/bench/), sverklo leads overall F1 (0.56 vs smart-grep's 0.49) — but smart-grep still wins P2 reference finding outright. Sverklo wins by 43× on input tokens and 4-7× on tool-call count vs naive grep. For an AI agent inside a 200K context window, that's the load-bearing axis. For a human at a terminal, smart grep is fine.
+**…just grep with extra steps?** No, but tuned grep is genuinely competitive when you know the exact string. On the [180-task bench](https://sverklo.com/bench/), sverklo leads overall F1 (0.58 vs smart-grep's 0.34) while using about 35× fewer input tokens than naive grep and one tool call per task. For an AI agent inside a 200K context window, that's the load-bearing axis. For a human at a terminal, grep is still fine.
 **…just Sourcegraph Cody?** Same retrieval surface (hybrid BM25 + vector + graph), different deployment model and license. Cody is source-available with enterprise per-developer pricing ($9-19/dev/mo); sverklo is MIT and runs on a laptop with no signup. [Full comparison →](https://sverklo.com/vs/sourcegraph-cody/)
@@ -103,7 +112,7 @@ If something is missing here that you'd ask about, [open an issue](https://githu
 ## What's new in 0.20
-- **Contradiction detection on the bi-temporal memory layer.** `sverklo_memories mode:"conflicts"` surfaces pairs of active memories that share a pin (file path or symbol name) and may contradict — e.g., "JWT in middleware" vs "JWT in route handler" both pinned to `src/auth.ts`. Restricted to decision/preference/pattern categories (procedural/context are additive, not contradicting). Same-SHA pairs are skipped. Sorted by shared-pin count and age. Conservative by design: surfaces *candidates* for the agent or human to review, not auto-resolution. The bi-temporal model already preserved both sides of the contradiction; this just makes them findable.
+- **Contradiction detection on the bi-temporal memory layer.** `memories mode:"conflicts"` surfaces pairs of active memories that share a pin (file path or symbol name) and may contradict — e.g., "JWT in middleware" vs "JWT in route handler" both pinned to `src/auth.ts`. Restricted to decision/preference/pattern categories (procedural/context are additive, not contradicting). Same-SHA pairs are skipped. Sorted by shared-pin count and age. Conservative by design: surfaces *candidates* for the agent or human to review, not auto-resolution. The bi-temporal model already preserved both sides of the contradiction; this just makes them findable.
 ## What's new in 0.19
@@ -116,7 +125,7 @@ If something is missing here that you'd ask about, [open an issue](https://githu
 - **Windows pathing fixed.** `sverklo init` and `sverklo doctor` now work on Windows — absolute paths go through `path.basename()` and stored `relativePath` is normalized to forward slashes so every downstream consumer is cross-platform.
 - **`npm run bench:swe`** — third-party-reproducible cross-repo eval. Clones 5 OSS repos (express, nestjs, vite, prisma, fastapi), runs 65 grounded questions, prints aggregated recall. PRs that add questions are welcome.
 - **Tree-sitter parser opt-in.** `sverklo grammars install` (~3.5 MB across 6 languages) + `SVERKLO_PARSER=tree-sitter` routes the indexer through real ASTs for TypeScript/TSX/JavaScript/Python/Go/Rust. Silent regex fallback when grammars aren't installed. Plan to flip the default lives in [docs/parser-parity.md](./docs/parser-parity.md).
-- **Workspace shared memory.** `sverklo workspace memory <name> add/list/search` plus `sverklo_remember scope:"workspace"` from the agent — write a decision once, query it from every other repo in the workspace. `sverklo_recall` blends workspace results under project ones with a `[ws]` badge.
+- **Workspace shared memory.** `sverklo workspace memory <name> add/list/search` plus `remember scope:"workspace"` from the agent — write a decision once, query it from every other repo in the workspace. `recall` blends workspace results under project ones with a `[ws]` badge.
 - **`sverklo memory export`** — markdown / Notion / JSON. Migrate your team's decision log to wherever it actually lives.
 - **PR-bot inline review.** `sverklo review --format github-review-json` + the action's new `inline-comments: true` default posts per-line review comments via `pulls.createReview`, alongside the existing sticky summary.
 - **VS Code extension scaffold** at [`extensions/vscode/`](./extensions/vscode/) with a pre-built `sverklo-vscode-0.1.0.vsix`. Inline caller-count decorations on every function header (`⟵ 47 callers`). Marketplace publish workflow ships dormant; install with `code --install-extension extensions/vscode/sverklo-vscode-0.1.0.vsix` today.
@@ -130,11 +139,11 @@ Every one of these is a query a real engineer asked a real AI assistant last wee
 | The question | With Grep | With Sverklo |
 |---|---|---|
-| "Where is auth handled in this repo?" | `grep -r 'auth' .` -- 847 matches across tests, comments, unrelated vars, and one 2021 TODO | `sverklo_search "authentication flow"` -- top 5 files ranked by PageRank: middleware, JWT verifier, session store, login route, logout route |
-| "Can I safely rename `BillingAccount.charge`?" | `grep '\.charge('` -- 312 matches polluted by `recharge`, `discharge`, `Battery.charge` fixtures | `sverklo_impact BillingAccount.charge` -- 14 real callers, depth-ranked, with file paths and line numbers |
-| "Is this helper actually used anywhere?" | `grep -r 'parseFoo' .` -- 4 matches in 3 files. Are any real, or just string mentions? Read each one. | `sverklo_refs parseFoo` -- 0 real callers. Zero. Walk the symbol graph, not the text. Delete the function. |
-| "What's load-bearing in this codebase?" | `find . -name '*.ts' \| xargs wc -l \| sort` -- the biggest files. Not the most important ones. | `sverklo_overview` -- PageRank over the dep graph. The files the rest of the repo depends on, not the ones someone wrote too much code in. |
-| "Review this 40-file PR — what should I read first?" | Read them in the order git diff printed them | `sverklo_review_diff` -- risk-scored per file (touched-symbol importance x coverage x churn), prioritized order, flagged production files with no test changes |
+| "Where is auth handled in this repo?" | `grep -r 'auth' .` -- 847 matches across tests, comments, unrelated vars, and one 2021 TODO | `search "authentication flow"` -- top 5 files ranked by PageRank: middleware, JWT verifier, session store, login route, logout route |
+| "Can I safely rename `BillingAccount.charge`?" | `grep '\.charge('` -- 312 matches polluted by `recharge`, `discharge`, `Battery.charge` fixtures | `impact BillingAccount.charge` -- 14 real callers, depth-ranked, with file paths and line numbers |
+| "Is this helper actually used anywhere?" | `grep -r 'parseFoo' .` -- 4 matches in 3 files. Are any real, or just string mentions? Read each one. | `refs parseFoo` -- 0 real callers. Zero. Walk the symbol graph, not the text. Delete the function. |
+| "What's load-bearing in this codebase?" | `find . -name '*.ts' \| xargs wc -l \| sort` -- the biggest files. Not the most important ones. | `overview` -- PageRank over the dep graph. The files the rest of the repo depends on, not the ones someone wrote too much code in. |
+| "Review this 40-file PR — what should I read first?" | Read them in the order git diff printed them | `review_diff` -- risk-scored per file (touched-symbol importance x coverage x churn), prioritized order, flagged production files with no test changes |
 If the answer to your question is "exact string X exists somewhere," grep wins. Use grep. If the answer is "which 5 files actually matter here, ranked by the graph," you need sverklo.
@@ -161,10 +170,10 @@ If the answer to your question is "exact string X exists somewhere," grep wins.
 | Tool | What it does |
 |------|-------------|
-| `sverklo_search` | Hybrid BM25 + vector + PageRank search. Find code without knowing the literal string. |
-| `sverklo_refs` | All references to a symbol, with caller context. Proves dead code with certainty. |
-| `sverklo_impact` | Walk the symbol graph, return ranked transitive callers — the real blast radius. |
-| `sverklo_review_diff` | Risk-scored review of `git diff`: touched-symbol importance x coverage x churn. |
+| `search` | Hybrid BM25 + vector + PageRank search. Find code without knowing the literal string. |
+| `refs` | All references to a symbol, with caller context. Proves dead code with certainty. |
+| `impact` | Walk the symbol graph, return ranked transitive callers — the real blast radius. |
+| `review_diff` | Risk-scored review of `git diff`: touched-symbol importance x coverage x churn. |
 [See all 37 tools below.](#full-tool-reference)
@@ -206,60 +215,60 @@ Pre-existing cycles and fan-in spikes don't trip the gate — only violations *i
 ### Search — find code without knowing the literal string
 | Tool | What |
 |------|------|
-| `sverklo_search` | Hybrid BM25 + ONNX vector + PageRank, fused with Reciprocal Rank Fusion |
-| `sverklo_search_iterative` | Wider candidate pool with refinement hints between rounds |
-| `sverklo_investigate` | Parallel multi-channel fan-out (FTS / vector / path / symbol) with per-channel RRF |
-| `sverklo_ask` | Natural-language router — concepts + investigate + refs in one call |
-| `sverklo_overview` | Structural codebase map ranked by PageRank importance |
-| `sverklo_lookup` | Find any function, class, or type by name (typo-tolerant) |
-| `sverklo_context` | One-call onboarding — combines overview, code, and saved memories |
-| `sverklo_ast_grep` | Structural pattern matching across the AST, not just text |
-| `sverklo_concepts` | Browse the LLM-derived concept index (themes across the codebase) |
-| `sverklo_clusters` | Semantic clusters of related symbols, computed offline |
-| `sverklo_patterns` | Query symbols tagged with a design pattern (observer, repository, validator, ...) |
+| `search` | Hybrid BM25 + ONNX vector + PageRank, fused with Reciprocal Rank Fusion |
+| `search_iterative` | Wider candidate pool with refinement hints between rounds |
+| `investigate` | Parallel multi-channel fan-out (FTS / vector / path / symbol) with per-channel RRF |
+| `ask` | Natural-language router — concepts + investigate + refs in one call |
+| `overview` | Structural codebase map ranked by PageRank importance |
+| `lookup` | Find any function, class, or type by name (typo-tolerant) |
+| `context` | One-call onboarding — combines overview, code, and saved memories |
+| `ast_grep` | Structural pattern matching across the AST, not just text |
+| `concepts` | Browse the LLM-derived concept index (themes across the codebase) |
+| `clusters` | Semantic clusters of related symbols, computed offline |
+| `patterns` | Query symbols tagged with a design pattern (observer, repository, validator, ...) |
 ### Impact — refactor without the regression
 | Tool | What |
 |------|------|
-| `sverklo_impact` | Walk the symbol graph, return ranked transitive callers (the real blast radius) |
-| `sverklo_refs` | Find all references to a symbol, with caller context |
-| `sverklo_deps` | File dependency graph — both directions, importers and imports |
-| `sverklo_audit` | **Lint your codebase for AI-readiness.** God nodes, hub files, dead code, circular deps, security smells, A-F health grade — all in one call |
+| `impact` | Walk the symbol graph, return ranked transitive callers (the real blast radius) |
+| `refs` | Find all references to a symbol, with caller context |
+| `deps` | File dependency graph — both directions, importers and imports |
+| `audit` | **Lint your codebase for AI-readiness.** God nodes, hub files, dead code, circular deps, security smells, A-F health grade — all in one call |
 ### Review — diff-aware MR review with risk scoring
 | Tool | What |
 |------|------|
-| `sverklo_review_diff` | Risk-scored review of `git diff` — touched-symbol importance x coverage x churn |
-| `sverklo_critique` | Second-pass critique of a review — what did the first read miss |
-| `sverklo_test_map` | Which tests cover which changed symbols; flag untested production changes |
-| `sverklo_diff_search` | Semantic search restricted to the changed surface of a diff |
-| `sverklo_verify` | Verify a quoted code span is still present at the cited SHA — citation gate |
+| `review_diff` | Risk-scored review of `git diff` — touched-symbol importance x coverage x churn |
+| `critique` | Second-pass critique of a review — what did the first read miss |
+| `test_map` | Which tests cover which changed symbols; flag untested production changes |
+| `diff_search` | Semantic search restricted to the changed surface of a diff |
+| `verify` | Verify a quoted code span is still present at the cited SHA — citation gate |
 ### Memory — bi-temporal, git-aware, never stale
 | Tool | What |
 |------|------|
-| `sverklo_remember` | Save decisions, patterns, invariants — pinned to the current git SHA |
-| `sverklo_recall` | Semantic search over saved memories with staleness detection |
-| `sverklo_memories` | List all memories with health metrics (still valid / stale / orphaned) |
-| `sverklo_forget` | Delete a memory |
-| `sverklo_promote` / `sverklo_demote` | Move memories between tiers (core / archive) |
-| `sverklo_pin` / `sverklo_unpin` | Pin a memory to a file path or symbol so recall surfaces it without semantic search |
+| `remember` | Save decisions, patterns, invariants — pinned to the current git SHA |
+| `recall` | Semantic search over saved memories with staleness detection |
+| `memories` | List all memories with health metrics (still valid / stale / orphaned) |
+| `forget` | Delete a memory |
+| `promote` / `demote` | Move memories between tiers (core / archive) |
+| `pin` / `unpin` | Pin a memory to a file path or symbol so recall surfaces it without semantic search |
 ### Post-filter primitives — refine the last response without re-querying
 | Tool | What |
 |------|------|
-| `sverklo_grep_results` | Grep inside the previous result block instead of re-running the search |
-| `sverklo_head_results` | Take the first N hits from the previous response |
-| `sverklo_ctx_peek` | Peek at a referenced span by its handle without expanding it fully |
-| `sverklo_ctx_slice` | Slice a stored response by line range |
-| `sverklo_ctx_grep` | Grep within a stored context window |
-| `sverklo_ctx_stats` | Token-budget stats for stored response handles |
+| `grep_results` | Grep inside the previous result block instead of re-running the search |
+| `head_results` | Take the first N hits from the previous response |
+| `ctx_peek` | Peek at a referenced span by its handle without expanding it fully |
+| `ctx_slice` | Slice a stored response by line range |
+| `ctx_grep` | Grep within a stored context window |
+| `ctx_stats` | Token-budget stats for stored response handles |
 ### Index health
 | Tool | What |
 |------|------|
-| `sverklo_status` | Index health check, file counts, last update |
-| `sverklo_wakeup` | 500-token codebase summary for system prompts on agents that can't run MCP |
+| `status` | Index health check, file counts, last update |
+| `wakeup` | 500-token codebase summary for system prompts on agents that can't run MCP |
 </details>
@@ -289,11 +298,11 @@ If a launch post tells you a tool is great for everything, close the tab.
 ### How do I stop Claude Code from hallucinating about my codebase?
-Claude generates code from training-data patterns, not your repo. Without a symbol graph, it invents `getUserByEmail()` when your code uses `findByEmail()`. Sverklo grounds the agent in your actual symbol graph — `sverklo_lookup` and `sverklo_refs` resolve names to `file:line` and prove existence before the agent writes the call. Verifiable retrieval (`sverklo_verify`) lets the agent re-check that a quoted span is still present at the cited SHA, so a stale citation gets caught instead of confabulated.
+Claude generates code from training-data patterns, not your repo. Without a symbol graph, it invents `getUserByEmail()` when your code uses `findByEmail()`. Sverklo grounds the agent in your actual symbol graph — `lookup` and `refs` resolve names to `file:line` and prove existence before the agent writes the call. Verifiable retrieval (`verify`) lets the agent re-check that a quoted span is still present at the cited SHA, so a stale citation gets caught instead of confabulated.
 ### Is there a local-first MCP server for codebase memory?
-Yes — sverklo. `sverklo_remember` and `sverklo_recall` ship a bi-temporal memory layer: every memory is pinned to the git SHA it was authored on, and `valid_until_sha` + `superseded_by` preserve a timeline of supersessions instead of overwriting. Recall is hybrid (FTS5 + cosine over an ONNX embedding) and runs entirely in embedded SQLite. No cloud, no API keys, no external vector database — unlike most "memory MCP" projects which require Zilliz, Milvus, or a managed Postgres+pgvector.
+Yes — sverklo. `remember` and `recall` ship a bi-temporal memory layer: every memory is pinned to the git SHA it was authored on, and `valid_until_sha` + `superseded_by` preserve a timeline of supersessions instead of overwriting. Recall is hybrid (FTS5 + cosine over an ONNX embedding) and runs entirely in embedded SQLite. No cloud, no API keys, no external vector database — unlike most "memory MCP" projects which require Zilliz, Milvus, or a managed Postgres+pgvector.
 ### Is there an open-source alternative to Sourcegraph Cody I can run locally?
@@ -395,23 +404,23 @@ Real measurements on real codebases. Reproducible via `npm run bench` ([methodol
 - **Search p95 stays under 26 ms** even on a 4k-file monorepo
 - **Impact analysis is sub-millisecond** — indexed SQL join, not a string scan
-- **12 languages:** TS, JS, Vue, Python, Go, Rust, Java, C, C++, Ruby, PHP, C#
+- **24 languages:** 10 first-class structural parsers plus 14 regex-fallback languages
 ### Retrieval benchmark — bench:primitives
-Hybrid retrieval F1 vs grep baselines on a 90-task hand-verified evaluation across three OSS codebases (express, lodash, sverklo). Public report at **[sverklo.com/bench/](https://sverklo.com/bench/)** — including every slice where sverklo *loses*. Methodology repo: **[github.com/sverklo/sverklo-bench](https://github.com/sverklo/sverklo-bench)**.
+Hybrid retrieval F1 vs grep baselines on a 180-task hand-verified evaluation across six OSS codebases (express, lodash, sverklo, requests, flask, fastapi). Public report at **[sverklo.com/bench/](https://sverklo.com/bench/)** — including every slice where sverklo *loses*. Methodology repo: **[github.com/sverklo/sverklo-bench](https://github.com/sverklo/sverklo-bench)**.
-Latest run (sverklo v0.20.2, May 2026):
+Latest published 180-task run (sverklo v0.20.21, May 2026):
-| baseline | F1 | P1 (def) | P2 (refs) | P4 (deps) | input tokens | tool calls |
-|---|---:|---:|---:|---:|---:|---:|
-| naive-grep | 0.29 | 0.10 | 0.18 | 0.53 | 20,278 | 6.5 |
-| smart-grep (tuned) | 0.49 | 0.43 | **0.40** | 0.59 | 1,220 | 4.9 |
-| **sverklo** | **0.56** | **0.73** | 0.25 | **0.71** | **469** | **1.0** |
-| jcodemunch-mcp | 0.32 | **0.73** | 0.00 | 0.46 | 1,267 | 1.2 |
-| GitNexus | 0.25 | 0.27 | 0.00 | 0.30 | **372** | 1.2 |
+| baseline | F1 | avg input tokens | tool calls |
+|---|---:|---:|---:|
+| naive-grep | 0.25 | 22,704 | 6.3 |
+| smart-grep (tuned) | 0.34 | 714 | 3.2 |
+| jcodemunch-mcp | 0.29 | 1,907 | 1.2 |
+| GitNexus | 0.30 | 630 | 1.2 |
+| **sverklo** | **0.58** | **652** | **1.0** |
-Sverklo leads overall F1 (0.56 vs smart-grep's 0.49); ties jcodemunch-mcp on P1 definition lookup; smart-grep still wins P2 reference finding (0.40 vs sverklo's 0.25). Token economy: 43× fewer than naive grep, ~2.6× fewer than smart-grep, single tool call per task vs grep's 4-7.
+Sverklo leads overall F1, dominates P4 file-dependency questions, and keeps the honest loss slice visible: dead-code detection is where grep-style baselines remain strongest. Token economy: about 35× fewer input tokens than naive grep, with a single tool call per task.
 Reproduce: `npm run bench:quick`. Filter with `BASELINES=sverklo,jcodemunch DATASETS=express npm run bench:quick`.
@@ -440,9 +449,10 @@ Click the badge for your editor. Cursor / VS Code prompt to confirm, then sverkl
 ```bash
 npm install -g sverklo
 cd your-project && sverklo init
+sverklo prove
 ```
-`sverklo init` auto-detects which AI coding agents you have (Claude Code, Cursor, Windsurf, Zed, Antigravity) and writes the right MCP config files. Idempotent — safe to re-run. If sverklo doesn't appear in your agent after restart, run `sverklo doctor`.
+`sverklo init` auto-detects which AI coding agents you have (Claude Code, Cursor, Windsurf, Zed, Antigravity) and writes the right MCP config files. `sverklo prove` prints central files, a real caller graph, and a prompt to paste into your agent. Idempotent — safe to re-run. If sverklo doesn't appear in your agent after restart, run `sverklo doctor`.
 **Per-agent config locations** (`sverklo init` writes these for you):
 - Claude Code: `.mcp.json` at project root + appends to `CLAUDE.md` (or `AGENTS.md` if present)
@@ -486,7 +496,7 @@ Use this if you're contributing, debugging the indexer, or want to run a not-yet
 To run the bench:
 ```bash
-npm run bench:primitives
+npm run bench:quick
 ```
 Output lands in `benchmark/results/<timestamp>/`.
@@ -522,7 +532,7 @@ Inside Claude Code:
 /plugin install sverklo-skill@sverklo-marketplace
 ```
-Installs the bundled Skill (procedural instructions teaching Claude when to reach for `sverklo_search`, `sverklo_impact`, `sverklo_review_diff`, `sverklo_remember`, etc.) without touching your global skills directory.
+Installs the bundled Skill (procedural instructions teaching Claude when to reach for `search`, `impact`, `review_diff`, `remember`, etc.) without touching your global skills directory.
 > **First run note:** The ONNX embedding model (~90 MB) downloads automatically on first launch. Takes ~30 seconds, then every subsequent run is offline-capable.
@@ -554,7 +564,7 @@ sverklo audit --format sarif      # GitHub code-scanning alerts
 sverklo audit --format json       # machine-readable for CI gates
 ```
-Six formats: `markdown`, `html`, `json`, `sarif`, `csv`, `badges`. Pair with `sverklo_impact` (the MCP tool) when you want to see the per-symbol blast radius before refactoring.
+Six formats: `markdown`, `html`, `json`, `sarif`, `csv`, `badges`. Pair with `impact` (the MCP tool) when you want to see the per-symbol blast radius before refactoring.
 ---

package/dist/bin/sverklo.js CHANGED

@@ -52,6 +52,7 @@ if (command && command !== "--help" && command !== "-h") {
         const HELP_BLURBS = {
             init: "Set up sverklo in your project (.mcp.json + CLAUDE.md, auto-detects Claude Code/Cursor/Windsurf/Antigravity). With --global: one-time-per-machine setup — write SVERKLO_SNIPPET to ~/.claude/CLAUDE.md and ~/.codex/AGENTS.md, register the project, gitignore .sverklo/, import memories. Skips per-project boilerplate.",
             doctor: "Diagnose MCP setup issues. Run after `init` to verify the agent can reach sverklo.",
+            prove: "Show a first-run repo-memory proof: central files, a real symbol with callers, and a paste-ready agent prompt.",
             audit: "Run codebase audit and emit a graded report. Flags: --format markdown|html|json|graph|arch|obsidian, --output PATH, --open, --badge, --publish.",
             "audit-diff": "Incremental architectural quality gate. Audits `git diff` for new cycles + fan-in spikes. Flags: --against REF, --fan-in-threshold N, --format human|json, --show-existing, --verbose. Exits 1 on regression.",
             review: "Risk-scored diff review (CI-friendly). Flags: --ref REF, --ci, --format markdown|json, --max-files N, --fail-on low|medium|high.",
@@ -161,6 +162,14 @@ if (command === "register") {
     console.log(`Registry: ${getRegistryPath()}`);
     process.exit(0);
 }
+if (command === "prove") {
+    const flags = args.slice(1);
+    const projectPath = await resolveProjectPath(flags);
+    const { runProve } = await import("../src/prove.js");
+    const report = await runProve(projectPath);
+    process.stdout.write(report);
+    process.exit(0);
+}
 if (command === "unregister") {
     // Issue #73 (HaleTom, 2026-05-25): agents tearing down git worktrees
     // know the absolute path but not the internal repo name. Looking up
@@ -869,7 +878,7 @@ if (command === "telemetry") {
         console.log("  os          darwin / linux / win32");
         console.log("  node_major  the Node major version sverklo is running on");
         console.log("  event       one of 17 fixed event types");
-        console.log("  tool        sverklo_* tool name (when applicable)");
+        console.log("  tool        sverklo tool name (when applicable)");
         console.log("  outcome     ok / error / timeout");
         console.log("  duration_ms tool execution time");
         console.log("");
@@ -1084,7 +1093,7 @@ if (command === "profile") {
             console.log(`             ${tools.map((t) => t.replace(/^sverklo_/, "")).join(", ")}`);
             console.log();
         }
-        console.log("  full        36 tools  (all sverklo_* tools — default)");
+        console.log("  full        37 tools  (every first-party sverklo tool — default)");
         console.log("\n  Set with: SVERKLO_PROFILE=core sverklo init");
         console.log("  Or in .sverklo.yaml: profile: core");
         console.log("  See: https://sverklo.com/blog/we-already-shipped-mcp-code-mode/\n");
@@ -1112,8 +1121,14 @@ if (command === "profile") {
         // Structured doc — use it directly. The --days window doesn't apply
         // to cumulative stats; we use the full lifetime instead, and tell
         // the user when the doc started accumulating.
+        //
+        // v0.28.0: canonicalize legacy `sverklo_*` names in historical stats
+        // files so they collapse onto the new short names (`sverklo_search`
+        // + `search` → `search`). Without this, a long-running user's stats
+        // would show duplicate rows after the rename.
         for (const [tool, stat] of Object.entries(structuredStats.tools)) {
-            counts[tool] = stat.calls;
+            const canon = tool.startsWith("sverklo_") ? tool.slice("sverklo_".length) : tool;
+            counts[canon] = (counts[canon] || 0) + stat.calls;
         }
         total = structuredStats.totalCalls;
         const sinceStr = new Date(structuredStats.startedAt).toISOString().slice(0, 10);
@@ -1134,7 +1149,10 @@ if (command === "profile") {
             process.exit(0);
         }
         for (const c of calls) {
-            const tool = String(c.detail.tool);
+            const raw = String(c.detail.tool);
+            // v0.28.0: collapse legacy `sverklo_*` rows onto canonical names so
+            // upgraded users don't see split-personality stats.
+            const tool = raw.startsWith("sverklo_") ? raw.slice("sverklo_".length) : raw;
             counts[tool] = (counts[tool] || 0) + 1;
         }
         total = calls.length;
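The two hunks above apply the same rule: strip a leading `sverklo_` and sum calls under the resulting short name. A minimal standalone sketch of that rule follows. The helper names `canonicalToolName` and `mergeToolCounts` and the sample counts are hypothetical; only the prefix-stripping logic is taken from the diff.

```javascript
// Prefix that pre-0.28.0 stats rows may still carry.
const LEGACY_PREFIX = "sverklo_";

// Hypothetical helper: map a legacy tool name onto its short canonical name.
function canonicalToolName(name) {
  return name.startsWith(LEGACY_PREFIX) ? name.slice(LEGACY_PREFIX.length) : name;
}

// Hypothetical helper: sum call counts so legacy and short rows collapse together.
function mergeToolCounts(rawCounts) {
  const counts = {};
  for (const [tool, calls] of Object.entries(rawCounts)) {
    const canon = canonicalToolName(tool);
    counts[canon] = (counts[canon] || 0) + calls;
  }
  return counts;
}

// Sample data: one tool recorded under both the old and the new name.
const merged = mergeToolCounts({ sverklo_search: 12, search: 5, impact: 3 });
console.log(merged);
```

Accumulating with `+=` semantics rather than assigning is what prevents the later of two rows from overwriting the earlier one, which is the change the first hunk makes to `counts[tool] = stat.calls`.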
@@ -1147,8 +1165,7 @@ if (command === "profile") {
     console.log("  " + "-".repeat(70));
     for (const [tool, n] of ranked) {
         const pct = ((n / total) * 100).toFixed(1) + "%";
-        const display = tool.startsWith("sverklo_") ? tool : tool;
-        console.log("  " + display.padEnd(38) + String(n).padStart(10) + pct.padStart(10));
+        console.log("  " + tool.padEnd(38) + String(n).padStart(10) + pct.padStart(10));
     }
     console.log("  " + "-".repeat(70));
     // Compute coverage for every named profile so the user can see the
@@ -1157,6 +1174,15 @@ if (command === "profile") {
     // of real usage calls fall outside the profile.
     const profileOrder = ["core", "nav", "review", "lean", "research"];
     const fits = [];
+    // v0.28.0: first-party tools no longer carry a `sverklo_` prefix, so
+    // "missing from profile" is anyone-not-in-profile minus the Zilliz
+    // compat aliases (recorded but not first-party).
+    const COMPAT_ALIASES = new Set([
+        "index_codebase",
+        "search_code",
+        "clear_index",
+        "get_indexing_status",
+    ]);
     for (const name of profileOrder) {
         const profileTools = PROFILES[name];
         if (!profileTools)
@@ -1167,7 +1193,7 @@ if (command === "profile") {
         for (const [tool, n] of ranked) {
             if (profileSet.has(tool))
                 covered += n;
-            else if (tool.startsWith("sverklo_"))
+            else if (!COMPAT_ALIASES.has(tool))
                 missing.push(tool);
         }
         fits.push({ name, size: profileTools.length, coveragePct: covered / total, missing });
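The coverage change is easier to follow in isolation. The sketch below reimplements the loop as a hypothetical `profileFit` function with made-up profile contents and usage numbers; the alias list is copied from the diff. It also assumes `total` is the sum of the ranked counts, whereas the real code computes `total` earlier from the stats source.

```javascript
// Zilliz compat aliases, copied from the diff: recorded in stats but not first-party.
const COMPAT_ALIASES = new Set([
  "index_codebase",
  "search_code",
  "clear_index",
  "get_indexing_status",
]);

// Hypothetical helper: share of calls a profile covers, plus the tools it lacks.
function profileFit(name, profileTools, ranked) {
  const profileSet = new Set(profileTools);
  // Assumption: total is derived from the ranked rows themselves.
  const total = ranked.reduce((sum, [, n]) => sum + n, 0);
  let covered = 0;
  const missing = [];
  for (const [tool, n] of ranked) {
    if (profileSet.has(tool)) covered += n;
    else if (!COMPAT_ALIASES.has(tool)) missing.push(tool);
  }
  return {
    name,
    size: profileTools.length,
    coveragePct: total ? covered / total : 0,
    missing,
  };
}

// Sample usage: "remember" is outside the profile, "search_code" is a compat alias.
const fit = profileFit(
  "core",
  ["search", "impact"],
  [["search", 6], ["impact", 2], ["remember", 1], ["search_code", 1]],
);
console.log(fit);
```

With the old `startsWith("sverklo_")` test, short names such as `remember` would never be reported as missing after the rename. Excluding only the known compat aliases restores that report.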
@@ -2195,10 +2221,10 @@ if (command === "memory") {
                 `  --editor PATH    editor to invoke (default: $EDITOR or vi)\n\n` +
                 `Safety:\n` +
                 `  - Removing a memory's heading from the file does NOT delete it.\n` +
-                `    Use \`sverklo memory demote <id>\` (planned) or \`sverklo_demote\`\n` +
+                `    Use \`sverklo memory demote <id>\` (planned) or \`demote\`\n` +
                 `    from MCP for explicit deletion.\n` +
                 `  - Adding a new memory by hand is not supported here. Use\n` +
-                `    \`sverklo_remember\` from MCP or call the API directly.\n` +
+                `    \`remember\` from MCP or call the API directly.\n` +
                 `  - If the parser can't make sense of your edits, the change\n` +
                 `    is rejected and your SQLite store is left untouched.\n`);
             process.exit(0);
@@ -2555,13 +2581,14 @@ sverklo — code intelligence for AI agents
 Just installed? Run these two:
   sverklo init               Set up sverklo in your project (.mcp.json + CLAUDE.md)
-  sverklo doctor             Verify MCP dispatch end-to-end (initialize + tools/list + tools/call)
+  sverklo prove              Show a real repo-memory proof from your codebase
 Then restart your AI agent (Claude Code, Cursor, Windsurf, etc.) — sverklo tools become available automatically.
 Usage:
   sverklo init               Set up sverklo in your project (.mcp.json + CLAUDE.md)
   sverklo doctor             Diagnose MCP setup issues
+  sverklo prove [path]       Show central files, a real caller graph, and an agent prompt
   sverklo reindex [path]     Incremental rebuild of the index (changed files only)
                              Use --force to clear and rebuild from scratch.
                              Use --timing to see per-phase elapsed ms.
@@ -2604,6 +2631,7 @@ Setup / runtime:
 Quick start (single project):
   npm install -g sverklo
   cd your-project && sverklo init
+  sverklo prove
   claude   # start coding — sverklo tools are preferred automatically
 Quick start (multi-repo, global):