memtrace 0.2.0 → 0.2.3

Files changed (2)
  1. package/README.md +44 -26
  2. package/package.json +4 -4
package/README.md CHANGED
@@ -25,11 +25,10 @@ Index once. Every agent query after that resolves through graph traversal — ca
 
  ```bash
  npm install -g memtrace # binary + 12 skills + MCP server — one command
- memtrace start # launches the graph database
- memtrace index . # indexes your codebase in seconds
+ memtrace start # launches the graph database and auto-indexes the current project
  ```
 
- That's it. Claude picks up the skills and MCP tools automatically.
+ That's it. Run `memtrace start` from your project root — it spins up the graph database and kicks off indexing automatically. Claude and Cursor (v2.4+) pick up the skills and MCP tools automatically.
 
  ---
 
@@ -48,29 +47,33 @@ On top of that, the structural layer is comprehensive:
  - **Relationships are edges** — `CALLS`, `IMPLEMENTS`, `IMPORTS`, `EXPORTS`, `CONTAINS`
  - **Community detection** — Louvain algorithm identifies architectural modules automatically
  - **Hybrid search** — Tantivy BM25 + vector embeddings + Reciprocal Rank Fusion, all on top of the graph
- - **Rust-native** — compiled binary, no Python/JS runtime overhead, sub-15ms average query latency
+ - **Rust-native** — compiled binary, no Python/JS runtime overhead, sub-8ms average query latency
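
The hybrid-search bullet names Reciprocal Rank Fusion as the step that merges BM25 and vector results. A minimal Python sketch of that general technique follows. It is not Memtrace's implementation: the function name, the sample file names, and the smoothing constant `k=60` (a commonly used default) are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into a single ranking.

    Each id receives 1 / (k + rank) from every list it appears in;
    ids are then ordered by their summed score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword index and a vector index.
bm25_hits = ["auth.py", "session.py", "token.py"]
vector_hits = ["session.py", "crypto.py", "auth.py"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

An id that ranks well in both lists ends up ahead of one that ranks first in only one list, which is the property that makes the fusion useful for combining lexical and semantic retrieval.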
 
  The agent doesn't just search your code. It *remembers* it.
 
  ## Benchmarks
 
- All benchmarks run on the same machine, same codebase, same queries. No cherry-picking.
+ All four systems run on the same machine, same mempalace checkout, same 1,000 queries, same evaluator. Ground truth is extracted by Python's stdlib `ast` module — **not** from any tool's index — so no system gets a home-field advantage. Full reproduction scripts and raw results: [`benchmarks/fair/`](https://github.com/syncable-dev/memtrace-public/tree/main/benchmarks/fair).
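
The added paragraph says ground truth comes from Python's stdlib `ast` module rather than from any tool's index. As a rough idea of what such an extraction can look like, here is a small sketch. The helper name `extract_symbols`, the tuple layout, and the sample source are hypothetical; the benchmark's actual scripts live in the linked `benchmarks/fair/` directory and may differ.

```python
import ast


def extract_symbols(source, path):
    """Return (name, path, line) for each function and class definition."""
    tree = ast.parse(source)
    symbols = []
    for node in ast.walk(tree):
        # Functions, async functions, and classes are the symbol kinds kept here.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.append((node.name, path, node.lineno))
    # ast.walk is breadth-first, so sort by line number for a stable order.
    return sorted(symbols, key=lambda symbol: symbol[2])


# Hypothetical file contents standing in for a real module.
sample = '''
class Palace:
    def open(self):
        return True

async def fetch():
    pass
'''

symbols = extract_symbols(sample, "palace.py")
for name, path, line in symbols:
    print(f"{path}:{line} {name}")
```

Because the symbol list is derived from the parser alone, every tool under test is scored against the same reference set.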
 
- ### Does it find the right thing?
+ <img alt="Benchmark overview: Memtrace 96.7% Acc@1, 100% Acc@10, 9.16ms latency, 195 tokens — vs ChromaDB, GitNexus, CodeGrapher" src="https://raw.githubusercontent.com/syncable-dev/memtrace-public/main/assets/benchmarks/benchmark-overview.svg" width="720"/>
 
- <img alt="Search accuracy: Memtrace 97.3% vs ChromaDB 89.6% vs GitNexus 12.8%" src="https://raw.githubusercontent.com/syncable-dev/memtrace-public/main/assets/benchmarks/search-accuracy.svg" width="720"/>
+ ### Results (1,000 Python symbol-lookup queries on mempalace)
 
- ### How fast?
+ | Tool | Coverage | Acc@1 | Acc@5 | Acc@10 | Avg lat | Tokens |
+ |:-----|---------:|------:|------:|-------:|--------:|-------:|
+ | **Memtrace** (ArcadeDB) | **100.0%** | **96.7%** | **100.0%** | **100.0%** | **9.16 ms** | 195 |
+ | ChromaDB (all-MiniLM-L6-v2) | 100.0% | 62.3% | 86.1% | 87.9% | 58.5 ms | 1,937 |
+ | GitNexus (eval-server) | 99.5% | 27.1% | 89.7% | 89.9% | 191.2 ms | 213 |
+ | CodeGrapherContext (CLI) | 67.2% | 6.4% | 66.4% | 66.7% | 1627.2 ms | 221 |
 
- <img alt="Search latency: Memtrace 13.4ms vs ChromaDB 60.6ms vs GitNexus 172.7ms vs CodeGrapher 510.5ms" src="https://raw.githubusercontent.com/syncable-dev/memtrace-public/main/assets/benchmarks/search-latency.svg" width="720"/>
+ **What the numbers say, read fairly:**
 
- ### How much context does it save?
+ - **Memtrace** is exact-symbol lookup's sweet spot: 100% coverage, rank-1 hit in 96.7% of queries, and the correct file is in the top-10 every single time. 9 ms per query, 195 tokens per response.
+ - **ChromaDB** shows what semantic embeddings look like for this workload — 88% top-10 but rank-1 is probabilistic, and the response is 10× larger because it returns 800-char chunks rather than symbol metadata.
+ - **GitNexus** finds the right file 90% of the time — its response leads with execution *flows*, pushing standalone definitions down the list, which costs it rank-1 but not top-10.
+ - **CodeGrapherContext**'s 67.2% coverage means its parser extracted two-thirds of the symbols Python's AST finds. Among symbols it did index, top-10 hit rate is excellent (~99%). Latency is dominated by CLI re-initialising FalkorDB per call.
 
- <img alt="Token usage: Memtrace 319K vs ChromaDB 1.91M 83% reduction" src="https://raw.githubusercontent.com/syncable-dev/memtrace-public/main/assets/benchmarks/token-context.svg" width="720"/>
-
- ### How long to set up?
-
- <img alt="Indexing: Memtrace 1.5s vs Graphiti 6h vs Mem0 31m" src="https://raw.githubusercontent.com/syncable-dev/memtrace-public/main/assets/benchmarks/indexing-speed.svg" width="720"/>
+ **Where each tool shines** — this benchmark measures exact-symbol lookup only. Different workloads produce different rankings: ChromaDB wins on natural-language queries, GitNexus on execution-flow traces, Memtrace on exact lookup / typo tolerance / temporal queries / cross-service API topology. See [`benchmarks/fair/README.md`](https://github.com/syncable-dev/memtrace-public/tree/main/benchmarks/fair/README.md) for a per-workload breakdown.
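
The Acc@k columns and the "~99%" figure quoted for CodeGrapherContext can be reproduced with a few lines of Python. The sketch below uses the usual definition of Acc@k (the expected file appears among the first k results) and hypothetical toy data; the benchmark's own evaluator is not shown in this diff and may differ in detail. The conditional hit rate at the end is Acc@10 divided by coverage, using the two values from the results table.

```python
def accuracy_at_k(results, truth, k):
    """Fraction of queries whose expected file is among the first k results."""
    hits = 0
    for query, expected_file in truth.items():
        top_k = results.get(query, [])[:k]  # a query with no results counts as a miss
        if expected_file in top_k:
            hits += 1
    return hits / len(truth)


# Hypothetical toy data: four queries, one of which the tool could not answer.
truth = {"q1": "a.py", "q2": "b.py", "q3": "c.py", "q4": "d.py"}
results = {
    "q1": ["a.py", "x.py"],
    "q2": ["x.py", "b.py"],
    "q3": ["x.py", "y.py"],
}

acc_at_1 = accuracy_at_k(results, truth, 1)
acc_at_2 = accuracy_at_k(results, truth, 2)

# CodeGrapherContext figures from the table: Acc@10 = 66.7%, coverage = 67.2%.
# Dividing one by the other gives the hit rate among symbols it did index.
conditional_at_10 = 66.7 / 67.2 * 100

print(f"Acc@1={acc_at_1:.2f} Acc@2={acc_at_2:.2f} conditional@10={conditional_at_10:.1f}%")
```

Reading Acc@10 alongside coverage this way separates two different failure modes: symbols a tool never indexed, and symbols it indexed but ranked poorly.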
 
  <details>
  <summary><strong>Memtrace vs. general memory systems (Mem0, Graphiti)</strong></summary>
@@ -105,14 +108,16 @@ GitNexus and CodeGrapherContext both build AST-based code graphs with structural
  | Community detection (Louvain) | **Yes** | Yes | No |
  | Hybrid search (BM25 + vector + RRF) | **Yes — Tantivy + embeddings** | No | BM25 + optional embeddings |
  | Language | **Rust (compiled binary)** | JavaScript | Python |
- | Search accuracy (1K queries) | **97.3%** | 12.8% | 0%* |
- | Query latency (1K queries) | **13.4 ms avg** | 172.7 ms avg | 510.5 ms avg |
- | Tokens per query | **319 avg** | 254 avg | 23 avg |
- | Index time (1,500 files) | **1.5 sec** | 10.5 sec | ~3.5 min |
+ | Coverage (1K queries) | **100%** | 99.5% | 67.2% |
+ | Acc@1 (1K queries) | **96.7%** | 27.1% | 6.4% |
+ | Acc@10 (1K queries) | **100%** | 89.9% | 66.7% |
+ | Query latency (1K queries) | **9.16 ms avg** | 191.2 ms avg | 1627.2 ms avg |
+ | Tokens per query | **195 avg** | 213 avg | 221 avg |
+ | Index time (~250 files / 2.3K nodes / 5.8K edges) | **~4 sec** (≈500 ms of real work + ~3 s Docker / Bolt / schema DDL startup on first run) | ~6 sec | ~1 sec (cached) |
 
- *CGC's 0% reflects an output format mismatch — it returns symbol names without file paths, so our Acc@1 evaluator can't match them. CGC likely finds relevant symbols; the metric just can't confirm it. All numbers from [live benchmark](https://github.com/syncable-dev/memtrace-public/tree/main/benchmarks) on the same machine, same codebase, same 1,000 queries.
+ All numbers from [the fair benchmark](https://github.com/syncable-dev/memtrace-public/tree/main/benchmarks/fair) on the same machine, same mempalace checkout, same 1,000 queries. Ground truth is extracted by Python's stdlib `ast` — not from any tool's index — so no system is advantaged in the dataset itself.
 
- The latency difference is primarily Rust vs. interpreted runtimes, and Memgraph's Bolt protocol vs. HTTP/embedding pipelines. The feature difference is temporal memory and API topology — dimensions Memtrace adds on top of the shared AST-graph foundation.
+ The latency difference is primarily Rust vs. interpreted runtimes, and ArcadeDB's Graph-OLAP engine (native CSR projections, PageRank/betweenness as in-database procedures) vs. HTTP/embedding pipelines. The feature difference is temporal memory and API topology — dimensions Memtrace adds on top of the shared AST-graph foundation.
 
  </details>
 
@@ -215,14 +220,14 @@ Uses **Structural Significance Budgeting** to surface the minimum set of changes
  |:---------------|:---------------:|:-----------:|:--------|
  | **Claude Code** | ✅ | ✅ | `npm install -g memtrace` — fully automatic |
  | **Claude Desktop** | ✅ | ✅ | Automatic — shared with Claude Code |
- | **Cursor** | ✅ | Coming soon | Add MCP server manually |
+ | **Cursor** (v2.4+) | ✅ | | `npm install -g memtrace` — fully automatic |
  | **Windsurf** | ✅ | Coming soon | Add MCP server manually |
  | **VS Code (Copilot)** | ✅ | — | Add MCP server manually |
  | **Cline / Roo Code** | ✅ | — | Add MCP server manually |
  | **Codex CLI** | ✅ | Coming soon | Add MCP server manually |
  | **Any MCP client** | ✅ | — | Add MCP server manually |
 
- > **MCP tools** work with any editor or agent that supports the [Model Context Protocol](https://modelcontextprotocol.io). **Skills** are Claude-specific workflow prompts that teach the agent *how* to chain tools — they require Claude Code or Claude Desktop.
+ > **MCP tools** work with any editor or agent that supports the [Model Context Protocol](https://modelcontextprotocol.io). **Skills** are workflow prompts that teach the agent *how* to chain tools — Claude Code, Claude Desktop, and Cursor (v2.4+) all load them natively from the same `SKILL.md` format.
 
  ## Setup
 
@@ -238,7 +243,21 @@ claude plugin install memtrace-skills@memtrace --scope user
  claude mcp add memtrace -- memtrace mcp -e MEMTRACE_ARCADEDB_BOLT_URL=bolt://localhost:7687
  ```
 
- ### Other Editors (Cursor, Windsurf, VS Code, Cline)
+ ### Cursor
+
+ Cursor **v2.4+** supports Agent Skills natively, and `npm install -g memtrace` handles everything automatically — no separate Cursor plugin is needed because Cursor reads the same `SKILL.md` format as Claude.
+
+ What the installer writes:
+ - **MCP server** → `~/.cursor/mcp.json` (global — works in every project you open)
+ - **12 skills + 4 workflows** → `~/.cursor/skills/memtrace-*/SKILL.md`
+
+ For a **project-local** install (so the skills travel with your repo and teammates get them on clone), run inside the project:
+
+ ```bash
+ memtrace install --only cursor --local
+ ```
+
+ ### Other Editors (Windsurf, VS Code, Cline)
 
  After `npm install -g memtrace`, add the MCP server to your editor's config:
 
@@ -259,7 +278,6 @@ After `npm install -g memtrace`, add the MCP server to your editor's config:
 
  | Editor | Config file |
  |:-------|:------------|
- | **Cursor** | `.cursor/mcp.json` in your project root |
  | **Windsurf** | `~/.codeium/windsurf/mcp_config.json` |
  | **VS Code (Copilot)** | `.vscode/mcp.json` in your project root |
  | **Cline** | Cline MCP settings in the extension panel |
@@ -287,7 +305,7 @@ Rust · Go · TypeScript · JavaScript · Python · Java · C · C++ · C# · Sw
 
  | Dependency | Purpose |
  |:-----------|:--------|
- | **Memgraph** | Graph database — auto-managed via `memtrace start` |
+ | **ArcadeDB** | Graph + document + vector database — auto-managed via `memtrace start` (pulls `arcadedata/arcadedb:latest`) |
  | **Node.js ≥ 18** | npm installation |
  | **Git** | Temporal analysis (commit history) |
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "memtrace",
- "version": "0.2.0",
+ "version": "0.2.3",
  "description": "Code intelligence graph — MCP server + AI agent skills + visualization UI",
  "keywords": [
  "mcp",
@@ -36,9 +36,9 @@
  "fs-extra": "^11.0.0"
  },
  "optionalDependencies": {
- "@memtrace/darwin-arm64": "0.2.0",
- "@memtrace/linux-x64": "0.2.0",
- "@memtrace/win32-x64": "0.2.0"
+ "@memtrace/darwin-arm64": "0.2.3",
+ "@memtrace/linux-x64": "0.2.3",
+ "@memtrace/win32-x64": "0.2.3"
  },
  "engines": {
  "node": ">=18"