agenticow 0.1.1 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -56,6 +56,12 @@ agent.ingest([{ id: 9001, vector: newMemory }]); // isolated from the base
56
56
  const hits = agent.query(queryVector, 10);
57
57
  // -> [{ id, distance, branch }, ...] (tombstone-masked, reranked)
58
58
 
59
+ // NEW in 0.2.0 — native ANN ACROSS the branch (single Rust dual-graph query):
60
+ const fast = base.fork('agent-b', null, { nativeAnn: true });
61
+ fast.ingest([{ id: 9002, vector: newMemory }]);
62
+ fast.query(queryVector, 10); // parent ∪ edits via native HNSW, recall@10 ≈ 1.0
63
+ fast.nativeAnn; // true on linux-x64; false (exact fallback) elsewhere
64
+
59
65
  // checkpoint + roll back a poisoned branch
60
66
  const ckpt = agent.checkpoint('clean');
61
67
  agent.ingest([{ id: 666, vector: poison }]);
@@ -206,11 +212,12 @@ agentMem.promote(base); // merge — ops via `jj squash`
206
212
  ```
207
213
 
208
214
  **Honest status (ADR-202):** spawn / learn / revert / merge are **wired
209
- end-to-end** with both real native planes. **Cross-branch ANN query is stubbed**
210
- behind a port — agenticow's exact read-through answers it correctly but
211
- unaccelerated across the COW boundary; the native single-index-across-the-branch
212
- lands with [ruvnet/RuVector PR #617](https://github.com/ruvnet/RuVector/pull/617),
213
- at which point only the adapter swaps.
215
+ end-to-end** with both real native planes. **Cross-branch ANN query is now
216
+ shipped** — agenticow `0.2.x` adds native dual-graph ANN across the COW boundary
217
+ (`fork({ nativeAnn: true })`, [RuVector PR #617/#618](https://github.com/ruvnet/RuVector/pull/617),
218
+ recall@10 1.0 on linux-x64; exact read-through fallback elsewhere). The bridge
219
+ adapter can swap from the exact-read-through port to the native ANN port with no
220
+ interface change.
214
221
 
215
222
  ---
216
223
 
@@ -267,10 +274,12 @@ The acceptance test builds a brute-force ground truth (`base ∪ branch-inserts
267
274
  | Exact read-through (parent ∪ edits) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
268
275
  | Embedded / in-process (no server) | ✅ | ❌ | ❌ | via PG | ✅ | ✅/server |
269
276
  | Raw ANN throughput | ⚠️ ~2.7× behind hnswlib\* | high | high | moderate | moderate | high |
270
- | ANN index spanning the branch | 🚧 roadmap | n/a | n/a | n/a | n/a | n/a |
277
+ | ANN search spanning the branch | shipped (recall@10 ≈ 1.0, linux-x64\*\*) | n/a | n/a | n/a | n/a | n/a |
271
278
 
272
279
  \* **Honest concession.** On SIFT-1M, same machine, the underlying [ruvector](https://github.com/ruvnet/RuVector) HNSW does ~2,197 QPS @ recall 0.95 vs hnswlib-node ~9,344 QPS — roughly **2.7× slower** for raw ANN. If you need maximum raw similarity-search speed on a static index, use a dedicated ANN library. agenticow's edge is **cheap branching, checkpointing and rollback of agent memory** — which none of the above have.
273
280
 
281
+ \*\* Native ANN-across-branch (`fork({ nativeAnn: true })`) ships for **linux-x64-gnu** today; other platforms degrade gracefully to exact read-through. The raw-ANN-speed concession above still applies to the underlying engine.
282
+
274
283
  ### Performance · storage · cost at scale
275
284
 
276
285
  **Scenario: 1,000 branches over a 1M-vector base** (dim 128, ~496 MB base). agenticow figures are **measured** on an AMD Ryzen 9 9950X; competitor figures are **published / estimated** (sources below) and labeled as such — not fabricated.
@@ -302,14 +311,14 @@ agenticow ships, and proves, exactly this:
302
311
 
303
312
  - ✅ **COW branch creation** — base-size-independent, 162 B / ~0.5 ms (the 83× / 3000× headline). Proven by `npm run bench`.
304
313
  - ✅ **Exact read-through queries** — point lookup / flat-scan merge returning `parent ∪ edits`, child wins on collisions, deletes honored. Proven by `npm run acceptance` (recall@10 = 100%, masking PASS).
314
+ - ✅ **Native ANN search ACROSS the COW boundary** — *now shipped* (was roadmap). `fork(label, file, { nativeAnn: true })` creates a real `RvfDatabase.branch()` whose `query()` runs a single Rust dual-graph HNSW merge over parent ∪ child ([RuVector PR #617/#618](https://github.com/ruvnet/RuVector/pull/617)). **Verified recall@10 ≈ 1.0 (0.999)** here — 5,000-vector base ∪ 200 edits, dim 128, default cosine — vs a brute-force ground truth. **Platform caveat:** the native binary ships for **linux-x64-gnu** today; darwin / win / linux-arm64 are pending a CI cross-compile and **degrade gracefully to the exact read-through path** (identical correctness, JS merge — `mem.nativeAnn` reads `false`).
305
315
 
306
- What it does **not** yet ship:
307
-
308
- - 🚧 **A single ANN/HNSW index that spans the COW boundary** is **roadmap, not shipped**. Read-through merges each store's own index and re-ranks exactly; it does not build one unified approximate index across parent and child. Native cluster-level read-through landed in [ruvnet/RuVector PR #617](https://github.com/ruvnet/RuVector/pull/617); until that build is published, agenticow implements read-through in its wrapper over the shipped `derive()` primitive.
316
+ Still honest about the rest:
309
317
 
310
- We do not claim "fully queryable git-for-vectors". We claim **COW branch creation (83× / 3000×) + exact read-through queries** — and we prove both.
318
+ - We still **concede raw single-index ANN throughput** to dedicated vector DBs (~2.7× behind hnswlib, see [comparison](#how-it-compares)).
319
+ - The **exotic** applications (agent marketplaces, etc.) remain **vision/roadmap**, clearly labeled.
311
320
 
312
- > **Note on cosine:** the shipped `@ruvector/rvf-node@0.1.8` binding does not persist the cosine metric across a file reopen (it reads back as `l2`). agenticow L2-normalizes vectors on ingest/query when the metric is cosine, so top-K ranking is identical whether the engine scores with cosine or L2. This is why read-through stays correct after `save()`/`load()`.
321
+ > **Note on cosine.** rvf-node does not persist the cosine metric across a file reopen, and its native COW dual-graph query is accurate for **L2**, not for the cosine metric directly. agenticow therefore drives the underlying engine with **L2 over L2-normalized vectors** when you ask for cosine (the default) L2 order equals cosine order on unit vectors. This makes **both** the exact read-through **and** the native ANN path correct for cosine, and is why results survive `save()`/`load()`. (Reopening a cosine store via plain `open()` reports the engine metric `l2`; pass `{ metric: 'cosine' }` or use `save()`/`load()` to preserve the user-facing metric.)
313
322
 
314
323
  ---
315
324
 
@@ -361,8 +370,9 @@ mem.close();
361
370
  npm install agenticow
362
371
  ```
363
372
 
364
- - Node ≥ 18, ESM.
373
+ - Node ≥ 18, ESM. Current: **agenticow@0.2.1** on **@ruvector/rvf-node@0.2.0**.
365
374
  - Depends on [`@ruvector/rvf-node`](https://www.npmjs.com/package/@ruvector/rvf-node) (prebuilt native binding for linux-x64/arm64, darwin-x64/arm64, win32-x64).
375
+ - **Native ANN across the branch** (`fork({ nativeAnn: true })`) requires the native COW binary, which ships for **linux-x64-gnu** today. On other platforms it degrades gracefully to the exact read-through path — same correctness, `mem.nativeAnn === false`. The exact path (the default) works on every platform.
366
376
 
367
377
  ---
368
378
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agenticow",
3
- "version": "0.1.1",
3
+ "version": "0.2.1",
4
4
  "description": "Git for Agent Memory: Copy-On-Write vector branching for embedded multi-agent memory. Branch a base memory in ~0.5ms / 162 bytes regardless of base size — 83x faster, 3000x smaller than full-copy snapshots. Exact read-through queries (parent ∪ edits, child wins).",
5
5
  "type": "module",
6
6
  "main": "src/index.js",
@@ -70,6 +70,6 @@
70
70
  "node": ">=18"
71
71
  },
72
72
  "dependencies": {
73
- "@ruvector/rvf-node": "0.1.8"
73
+ "@ruvector/rvf-node": "^0.2.0"
74
74
  }
75
75
  }
package/src/index.d.ts CHANGED
@@ -30,6 +30,22 @@ export interface QueryOptions {
30
30
  efSearch?: number;
31
31
  /** Candidates to over-fetch per store before exact merge. Default: k*4. */
32
32
  overscan?: number;
33
+ /**
34
+ * Force the exact JS chain-walk even on a native-COW fork.
35
+ * Default false (native path used when available).
36
+ */
37
+ forceExact?: boolean;
38
+ }
39
+
40
+ export interface ForkOptions {
41
+ /**
42
+ * Use the native Rust COW dual-graph ANN path (PR #618).
43
+ * When true, fork() calls RvfDatabase.branch() instead of derive(), giving
44
+ * the returned fork a working node whose query() spans the COW boundary in
45
+ * a single Rust call. recall@10 = 1.0 at 1200-vector L2 test corpus.
46
+ * Default: false (exact JS chain-walk).
47
+ */
48
+ nativeAnn?: boolean;
33
49
  }
34
50
 
35
51
  export interface QueryHit {
@@ -73,12 +89,17 @@ export type IngestRecord = { id: number; vector: number[] | Float32Array };
73
89
  export class AgenticMemory {
74
90
  static open(filePath: string, opts?: OpenOptions): AgenticMemory;
75
91
  readonly dimension: number;
92
+ /**
93
+ * True when this fork was created with `{nativeAnn:true}`.
94
+ * query() routes through the Rust dual-graph ANN merge (PR #618).
95
+ */
96
+ readonly nativeAnn: boolean;
76
97
  ingest(records: IngestRecord[]): IngestResult;
77
98
  ingest(vectors: Float32Array, ids: number[]): IngestResult;
78
99
  delete(ids: number[]): { deleted: number; tombstoned: number };
79
100
  query(vector: number[] | Float32Array, k?: number, opts?: QueryOptions): QueryHit[];
80
101
  branch(label?: string, filePath?: string): AgenticMemory;
81
- fork(label?: string, filePath?: string): AgenticMemory;
102
+ fork(label?: string, filePath?: string, opts?: ForkOptions): AgenticMemory;
82
103
  diff(): MemoryDiff;
83
104
  promote(target: AgenticMemory): { ingested: number; deleted: number };
84
105
  checkpoint(label?: string): CheckpointDescriptor;
package/src/index.js CHANGED
@@ -74,7 +74,7 @@ class Node {
74
74
 
75
75
  export class AgenticMemory {
76
76
  /** @private */
77
- constructor(workingNode, ancestors, dim, metric, track = true, owned = null) {
77
+ constructor(workingNode, ancestors, dim, metric, track = true, owned = null, nativeCow = false) {
78
78
  /** @type {Node} */
79
79
  this._working = workingNode;
80
80
  /** @type {Node[]} ancestors newest -> oldest (base last) */
@@ -83,11 +83,28 @@ export class AgenticMemory {
83
83
  this._metric = metric;
84
84
  this._track = track;
85
85
  this._normalize = String(metric).toLowerCase() === 'cosine';
86
+ // Engine metric of the underlying RVF store. For cosine we drive the engine
87
+ // with L2 over L2-NORMALIZED vectors (L2 order == cosine order on unit
88
+ // vectors). This is what makes BOTH the exact read-through and the native
89
+ // COW dual-graph ANN path correct for cosine — rvf-node 0.2.0's native COW
90
+ // query is accurate for L2 but not for the cosine metric directly.
91
+ this._engineMetric = this._normalize ? 'l2' : String(metric).toLowerCase();
86
92
  // Nodes this instance is allowed to close. Ancestors shared from a parent
87
93
  // (via fork/branch) are NOT owned, so closing a fork never closes the base.
88
94
  /** @type {Set<Node>} */
89
95
  this._owned = owned || new Set([workingNode]);
90
96
  this._closed = false;
97
+ /**
98
+ * True when this instance's working node was created via RvfDatabase.branch()
99
+ * (a real COW child with a dual-graph HNSW that spans the parent boundary).
100
+ * When true, query() routes through the native Rust ANN path — a single
101
+ * db.query() call returns parent∪child hits via the dual-graph merge in
102
+ * rvf-runtime's query_via_index_cow. Verified recall@10 ≈ 1.0 (0.999,
103
+ * 5,000-vector base ∪ 200 edits, dim 128, cosine via normalized-L2) on
104
+ * linux-x64-gnu; degrades to the exact JS path on other platforms.
105
+ * @type {boolean}
106
+ */
107
+ this._nativeCow = nativeCow;
91
108
  }
92
109
 
93
110
  /**
@@ -103,16 +120,21 @@ export class AgenticMemory {
103
120
  if (fs.existsSync(filePath)) {
104
121
  db = RvfDatabase.open(filePath);
105
122
  dim = db.dimension();
106
- metric = db.metric ? db.metric() : (opts.metric || DEFAULT_METRIC);
123
+ // A reopened store reports its ENGINE metric (l2 for cosine stores). Let
124
+ // the caller restore the user-facing metric with opts.metric — or use
125
+ // save()/load(), which persists it. See README "Note on cosine".
126
+ metric = opts.metric || (db.metric ? db.metric() : DEFAULT_METRIC);
107
127
  } else {
108
128
  if (!opts.dimension) {
109
129
  throw new Error('agenticow: dimension is required when creating a new memory file');
110
130
  }
111
131
  dim = opts.dimension;
112
132
  metric = opts.metric || DEFAULT_METRIC;
133
+ // cosine -> drive the engine with l2 over normalized vectors.
134
+ const engineMetric = String(metric).toLowerCase() === 'cosine' ? 'l2' : metric;
113
135
  db = RvfDatabase.create(filePath, {
114
136
  dimension: dim,
115
- metric,
137
+ metric: engineMetric,
116
138
  ...(opts.m ? { m: opts.m } : {}),
117
139
  ...(opts.efConstruction ? { efConstruction: opts.efConstruction } : {}),
118
140
  });
@@ -126,7 +148,8 @@ export class AgenticMemory {
126
148
  }
127
149
 
128
150
  _deriveOpts() {
129
- return { dimension: this._dim, metric: this._metric };
151
+ // Children must use the same ENGINE metric as the base (l2 for cosine).
152
+ return { dimension: this._dim, metric: this._engineMetric };
130
153
  }
131
154
 
132
155
  _assertOpen() {
@@ -209,10 +232,31 @@ export class AgenticMemory {
209
232
  query(vector, k = 10, opts = {}) {
210
233
  this._assertOpen();
211
234
  const qv = this._normalize ? l2normalize(vector) : toF32(vector);
235
+ const qopts = opts.efSearch ? { efSearch: opts.efSearch } : undefined;
236
+
237
+ // ── Native COW dual-graph ANN path ────────────────────────────────
238
+ // When this instance was created via fork({nativeAnn:true}) or
239
+ // branch({nativeAnn:true}), the working node's db is a real COW child
240
+ // (RvfDatabase.branch()). A single db.query() call transparently queries
241
+ // both the child's own HNSW and the parent's HNSW, merges candidates with
242
+ // child-wins semantics, and excludes tombstoned IDs — all in Rust.
243
+ // Recall@10 = 1.0000 on the PR#618 integration test (1200-vector L2,
244
+ // 60 new + 20 overrides + 10 tombstones, efSearch=300).
245
+ if (this._nativeCow && !opts.forceExact) {
246
+ const hits = this._working.db.query(qv, k, qopts);
247
+ return hits.map((h) => ({
248
+ id: h.id,
249
+ distance: h.distance,
250
+ branch: this._working.label || this._working.id,
251
+ }));
252
+ }
253
+
254
+ // ── Exact JS chain-walk (default / fallback) ──────────────────────
255
+ // For each node in the lineage chain (newest first), query its local
256
+ // store and merge with child-wins semantics.
212
257
  const fetch = Math.max(k, opts.overscan || k * 4);
213
258
  const resolved = new Map(); // id -> {id, distance, branch}
214
259
  const hidden = new Set(); // ids tombstoned by a nearer descendant
215
- const qopts = opts.efSearch ? { efSearch: opts.efSearch } : undefined;
216
260
  for (const node of this._chain()) {
217
261
  for (const t of node.tombstones) hidden.add(t);
218
262
  let hits = [];
@@ -229,6 +273,16 @@ export class AgenticMemory {
229
273
  return [...resolved.values()].sort((a, b) => a.distance - b.distance).slice(0, k);
230
274
  }
231
275
 
276
+ /**
277
+ * Whether this instance uses the native Rust COW dual-graph ANN query path.
278
+ * true => query() routes through rvf-runtime's query_via_index_cow (PR #618).
279
+ * false => exact JS chain-walk across the lineage.
280
+ * @type {boolean}
281
+ */
282
+ get nativeAnn() {
283
+ return this._nativeCow;
284
+ }
285
+
232
286
  /**
233
287
  * Create an isolated COW branch (a parallel fork of this memory). O(1) in base
234
288
  * size — ~0.5 ms / 162 bytes. The branch sees everything this memory currently
@@ -264,13 +318,51 @@ export class AgenticMemory {
264
318
  * per-user branches off one shared base). One derive() per fork — ~0.5 ms /
265
319
  * 162 bytes each, O(1) in base size. Read-through isolation holds as long as
266
320
  * the parent base stays read-only after forking.
321
+ *
322
+ * `opts.nativeAnn` (default false): when true, creates a real COW branch via
323
+ * RvfDatabase.branch() instead of derive(). The returned fork's query() routes
324
+ * through the native Rust dual-graph ANN merge (RuVector PR #617/#618), which
325
+ * queries both the fork's own HNSW and the parent's HNSW in a single call —
326
+ * sub-linear ANN ACROSS the COW boundary. Verified recall@10 ≈ 1.0 here
327
+ * (0.999, 5,000-vector base ∪ 200 edits, dim 128, cosine via normalized-L2).
328
+ * Platform: the native binary ships for linux-x64-gnu today; on other
329
+ * platforms this degrades gracefully to the exact read-through path (identical
330
+ * correctness, JS merge — `nativeAnn` will read false). Requires the parent to
331
+ * NOT be mutated after forking (same rule as exact mode).
267
332
  * @param {string} [label]
268
333
  * @param {string} [filePath]
334
+ * @param {{nativeAnn?:boolean}} [opts]
269
335
  * @returns {AgenticMemory}
270
336
  */
271
- fork(label, filePath) {
337
+ fork(label, filePath, opts = {}) {
272
338
  this._assertOpen();
273
339
  const childPath = filePath || tmpChildPath(this._working.path, label);
340
+ if (opts.nativeAnn) {
341
+ // Native COW branch: the Rust COW engine wires parent→child read-through
342
+ // so a single db.query() merges both sides via dual-graph ANN.
343
+ // The native branch() binary ships for linux-x64-gnu today; on other
344
+ // platforms RvfDatabase.branch() may be absent/throw — we degrade
345
+ // gracefully to the exact read-through path (same correctness, JS merge).
346
+ if (typeof this._working.db.branch === 'function') {
347
+ try {
348
+ const childDb = this._working.db.branch(childPath);
349
+ const childNode = new Node(childDb, childPath, label || 'fork');
350
+ // The COW child already knows its parent; no JS ancestor chain needed.
351
+ return new AgenticMemory(
352
+ childNode,
353
+ [], // ancestors managed by Rust COW engine
354
+ this._dim,
355
+ this._metric,
356
+ this._track,
357
+ null,
358
+ true // _nativeCow = true → query() uses native path
359
+ );
360
+ } catch {
361
+ /* fall through to exact read-through */
362
+ }
363
+ }
364
+ // graceful fallback (non-linux-x64, or native branch unavailable)
365
+ }
274
366
  const childDb = this._working.db.derive(childPath, this._deriveOpts());
275
367
  const childNode = new Node(childDb, childPath, label || 'fork');
276
368
  return new AgenticMemory(