npm - @soulcraft/cortex - Versions diffs - 2.4.0 → 2.5.1 - Mend

@soulcraft/cortex 2.4.0 → 2.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/dist/hnsw/NativeDiskAnnWrapper.d.ts +161 -0
package/dist/hnsw/NativeDiskAnnWrapper.js +286 -0
package/dist/hnsw/NativeHNSWWrapper.d.ts +30 -0
package/dist/hnsw/NativeHNSWWrapper.js +37 -0
package/dist/plugin.js +12 -4
package/dist/utils/nativeBinaryEntityIdMapper.d.ts +96 -0
package/dist/utils/nativeBinaryEntityIdMapper.js +208 -0
package/docs/ADR-002-diskann-100-percent-rust.md +294 -0
package/native/brainy-native.node +0 -0
package/package.json +3 -3

package/dist/hnsw/NativeDiskAnnWrapper.d.ts ADDED

@@ -0,0 +1,161 @@
+/**
+ * @module hnsw/NativeDiskAnnWrapper
+ * @description TypeScript wrapper around cortex's native DiskANN engine
+ * that satisfies brainy's `HnswProvider` contract. From brainy's
+ * perspective this is interchangeable with `NativeHNSWWrapper` — same
+ * `addItem` / `search` / `rebuild` surface — but underneath it drives
+ * the billion-scale Vamana + PQ index.
+ *
+ * @example
+ * ```typescript
+ * import { BrainyData } from '@soulcraft/brainy'
+ * import { register as registerCortex } from '@soulcraft/cortex'
+ *
+ * const brain = new BrainyData({
+ *   storage: { type: 'filesystem', rootDirectory: '/data/idx' }
+ * })
+ * await registerCortex(brain)
+ * await brain.init()  // [brainy] DiskANN engaged (path=..., dim=384)
+ *
+ * await brain.add({ data: 'native rust acceleration', type: 'concept' })
+ * const hits = await brain.search('billion scale ann', 10)
+ * ```
+ *
+ * @example
+ * ```typescript
+ * // Explicit billion-scale build config
+ * const brain = new BrainyData({
+ *   storage: { type: 'filesystem', rootDirectory: '/data/idx' },
+ *   index: {
+ *     type: 'diskann',
+ *     diskann: {
+ *       pqM: 16,
+ *       maxDegree: 64,
+ *       searchListSize: 100,
+ *       useMmapAdjacency: true,  // required >100M nodes
+ *       mmapAdjacencyPath: '/data/scratch/diskann-build.adj'
+ *     }
+ *   }
+ * })
+ * ```
+ *
+ * ## Operating model
+ *
+ * DiskANN is build-once, query-many by design: the on-disk file
+ * embeds the Vamana graph, PQ codebook, codes, and full vectors in a
+ * single contiguous mmap-able layout. Dynamic insertions go to a
+ * small **delta buffer** that brute-force-searches alongside the main
+ * index until the next `rebuild()` folds them in. This matches
+ * FreshDiskANN's published online-update model.
+ *
+ * ## Search path
+ *
+ * 1. Query the main index via the native DiskANN searcher: PQ-greedy
+ *    walk in RAM, full-vector re-rank on the candidate set.
+ * 2. Brute-force the delta buffer (typically <0.1% of total size after
+ *    a recent rebuild).
+ * 3. Merge + sort + truncate to `k`.
+ *
+ * ## When this wrapper engages
+ *
+ * Brainy's `wireDiskAnn()` decides at init time whether to instantiate
+ * this wrapper or the standard HNSW one. The criteria
+ * ([ADR-002](../../docs/ADR-002-diskann-100-percent-rust.md)):
+ * - Cortex's `index:diskann` provider is registered (this file).
+ * - The storage adapter exposes a local filesystem path
+ *   (`getBinaryBlobPath` is the canonical check).
+ * - The metadata index has a stable `idMapper` (the cortex 2.4.0 #23
+ *   foundation).
+ * - `config.index.type !== 'hnsw'` (opt-out path).
+ */
+import type { Vector, VectorDocument, DistanceFunction, StorageAdapter } from '@soulcraft/brainy';
+import type { HnswProvider } from '../providerContracts.js';
+export interface DiskAnnIndexConfig {
+    /** Vector dimension (e.g. 384 for all-MiniLM-L6-v2). */
+    dimensions: number;
+    /** Output path for the on-disk DiskANN file. */
+    indexPath: string;
+    /** PQ subspaces. Default 16. dim must be divisible by m. */
+    pqM?: number;
+    /** Centroids per subspace. Default 256 (8-bit codes). */
+    pqKsub?: number;
+    /** Vamana max degree (R). Default 64. */
+    maxDegree?: number;
+    /** Build-time candidate list size (L). Default 100. */
+    searchListSize?: number;
+    /** α-pruning density factor. Default 1.2. */
+    alpha?: number;
+    /** Default search-time candidate list size. `2*k` is a good baseline. */
+    defaultLSearch?: number;
+    /** Default padding factor for re-rank over-fetch. Default 1.2. */
+    defaultPaddingFactor?: number;
+    /** Use a file-backed adjacency during build. Required >~100M nodes. */
+    useMmapAdjacency?: boolean;
+    /** Scratch file path when `useMmapAdjacency` is true. */
+    mmapAdjacencyPath?: string;
+}
+export declare class NativeDiskAnnWrapper implements HnswProvider {
+    private config;
+    private distanceFunction;
+    private storage;
+    private persistMode;
+    /** Live searcher instance — null until the first build. */
+    private native;
+    /** Newly added entries since the last build. Brute-force searched. */
+    private delta;
+    /** Removed entries — filtered out at search time. */
+    private tombstones;
+    /** Bidirectional UUID ↔ slot map for the main index. */
+    private slotByUuid;
+    private uuidBySlot;
+    constructor(config: DiskAnnIndexConfig & {
+        distanceFunction?: DistanceFunction;
+    }, distanceFunction: DistanceFunction, options?: {
+        storage?: StorageAdapter | null;
+        persistMode?: 'immediate' | 'deferred';
+    });
+    /**
+     * Append an entry to the delta buffer. Persisted by the next
+     * `rebuild()` call, which folds the delta into the main index.
+     */
+    addItem(item: VectorDocument): Promise<string>;
+    /**
+     * Mark an entry as removed. Filtered out at search time; physically
+     * removed at the next `rebuild()`.
+     */
+    removeItem(id: string): Promise<boolean>;
+    search(queryVector: Vector, k?: number, filter?: (id: string) => Promise<boolean>, options?: {
+        rerank?: {
+            multiplier: number;
+        };
+        candidateIds?: string[];
+    }): Promise<Array<[string, number]>>;
+    size(): number;
+    clear(): void;
+    /**
+     * Rebuild the main index from scratch: concatenate (current main −
+     * tombstones) ∪ delta, run a full DiskANN build, swap the searcher
+     * atomically.
+     *
+     * At billion-scale this is the expensive operation (hours of build
+     * time). Operators schedule it during off-peak; the delta buffer
+     * absorbs writes in between.
+     */
+    rebuild(options?: {
+        pqM?: number;
+        pqKsub?: number;
+        maxDegree?: number;
+        searchListSize?: number;
+        alpha?: number;
+    }): Promise<void>;
+    /**
+     * Flush the delta buffer to disk. For DiskANN the delta is in-memory
+     * by design (a few MB at most between rebuilds); returns the buffer
+     * size for parity with HNSW's flush contract.
+     */
+    flush(): Promise<number>;
+    getPersistMode(): 'immediate' | 'deferred';
+    private tryOpenExisting;
+    private countMainTombstones;
+}
+//# sourceMappingURL=NativeDiskAnnWrapper.d.ts.map
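
Note on the PQ fields in `DiskAnnIndexConfig`: the comments say `dimensions` must be divisible by `pqM` and that `pqKsub = 256` yields 8-bit codes. The arithmetic behind those two statements can be sketched as follows. `pqLayout` and `PqLayout` are hypothetical names for illustration and are not part of cortex; the tightly packed code size is an assumption about the layout, not something this diff confirms.

```typescript
// Hypothetical helper, not part of cortex: derives product-quantization
// sizes from the documented config fields (dimensions, pqM, pqKsub).
interface PqLayout {
  subDim: number;         // floats per subspace
  bitsPerCode: number;    // bits needed to index one centroid
  bytesPerVector: number; // compressed size per vector (assumes tight packing)
  codebookFloats: number; // total floats across all subspace codebooks
}

function pqLayout(dimensions: number, pqM = 16, pqKsub = 256): PqLayout {
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new RangeError(`dimensions must be a positive integer, got ${dimensions}`);
  }
  if (dimensions % pqM !== 0) {
    throw new RangeError(`dimensions ${dimensions} is not divisible by pqM ${pqM}`);
  }
  const subDim = dimensions / pqM;
  const bitsPerCode = Math.ceil(Math.log2(pqKsub));
  const bytesPerVector = Math.ceil((pqM * bitsPerCode) / 8);
  const codebookFloats = pqM * pqKsub * subDim;
  return { subDim, bitsPerCode, bytesPerVector, codebookFloats };
}
```

Under these assumptions, a 384-dimensional vector with the defaults splits into 16 subspaces of 24 floats each and compresses to 16 bytes of codes.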

package/dist/hnsw/NativeDiskAnnWrapper.js ADDED

@@ -0,0 +1,286 @@
+/**
+ * @module hnsw/NativeDiskAnnWrapper
+ * @description TypeScript wrapper around cortex's native DiskANN engine
+ * that satisfies brainy's `HnswProvider` contract. From brainy's
+ * perspective this is interchangeable with `NativeHNSWWrapper` — same
+ * `addItem` / `search` / `rebuild` surface — but underneath it drives
+ * the billion-scale Vamana + PQ index.
+ *
+ * @example
+ * ```typescript
+ * import { BrainyData } from '@soulcraft/brainy'
+ * import { register as registerCortex } from '@soulcraft/cortex'
+ *
+ * const brain = new BrainyData({
+ *   storage: { type: 'filesystem', rootDirectory: '/data/idx' }
+ * })
+ * await registerCortex(brain)
+ * await brain.init()  // [brainy] DiskANN engaged (path=..., dim=384)
+ *
+ * await brain.add({ data: 'native rust acceleration', type: 'concept' })
+ * const hits = await brain.search('billion scale ann', 10)
+ * ```
+ *
+ * @example
+ * ```typescript
+ * // Explicit billion-scale build config
+ * const brain = new BrainyData({
+ *   storage: { type: 'filesystem', rootDirectory: '/data/idx' },
+ *   index: {
+ *     type: 'diskann',
+ *     diskann: {
+ *       pqM: 16,
+ *       maxDegree: 64,
+ *       searchListSize: 100,
+ *       useMmapAdjacency: true,  // required >100M nodes
+ *       mmapAdjacencyPath: '/data/scratch/diskann-build.adj'
+ *     }
+ *   }
+ * })
+ * ```
+ *
+ * ## Operating model
+ *
+ * DiskANN is build-once, query-many by design: the on-disk file
+ * embeds the Vamana graph, PQ codebook, codes, and full vectors in a
+ * single contiguous mmap-able layout. Dynamic insertions go to a
+ * small **delta buffer** that brute-force-searches alongside the main
+ * index until the next `rebuild()` folds them in. This matches
+ * FreshDiskANN's published online-update model.
+ *
+ * ## Search path
+ *
+ * 1. Query the main index via the native DiskANN searcher: PQ-greedy
+ *    walk in RAM, full-vector re-rank on the candidate set.
+ * 2. Brute-force the delta buffer (typically <0.1% of total size after
+ *    a recent rebuild).
+ * 3. Merge + sort + truncate to `k`.
+ *
+ * ## When this wrapper engages
+ *
+ * Brainy's `wireDiskAnn()` decides at init time whether to instantiate
+ * this wrapper or the standard HNSW one. The criteria
+ * ([ADR-002](../../docs/ADR-002-diskann-100-percent-rust.md)):
+ * - Cortex's `index:diskann` provider is registered (this file).
+ * - The storage adapter exposes a local filesystem path
+ *   (`getBinaryBlobPath` is the canonical check).
+ * - The metadata index has a stable `idMapper` (the cortex 2.4.0 #23
+ *   foundation).
+ * - `config.index.type !== 'hnsw'` (opt-out path).
+ */
+import { loadNativeModule } from '../native/index.js';
+import { prodLog } from '@soulcraft/brainy/internals';
+const DEFAULTS = {
+    pqM: 16,
+    pqKsub: 256,
+    maxDegree: 64,
+    searchListSize: 100,
+    alpha: 1.2,
+    defaultLSearch: 100,
+    defaultPaddingFactor: 1.2,
+    useMmapAdjacency: false,
+};
+export class NativeDiskAnnWrapper {
+    config;
+    distanceFunction;
+    storage;
+    persistMode;
+    /** Live searcher instance — null until the first build. */
+    native = null;
+    /** Newly added entries since the last build. Brute-force searched. */
+    delta = new Map();
+    /** Removed entries — filtered out at search time. */
+    tombstones = new Set();
+    /** Bidirectional UUID ↔ slot map for the main index. */
+    slotByUuid = new Map();
+    uuidBySlot = new Map();
+    constructor(config, distanceFunction, options = {}) {
+        this.config = { ...DEFAULTS, ...config };
+        this.distanceFunction = distanceFunction;
+        this.storage = options.storage ?? null;
+        this.persistMode = options.persistMode ?? 'immediate';
+        // Try to open an existing file. If absent, the index stays
+        // empty until the first rebuild() flushes the delta buffer.
+        this.tryOpenExisting();
+    }
+    /**
+     * Append an entry to the delta buffer. Persisted by the next
+     * `rebuild()` call, which folds the delta into the main index.
+     */
+    async addItem(item) {
+        if (this.tombstones.has(item.id)) {
+            this.tombstones.delete(item.id);
+        }
+        this.delta.set(item.id, item.vector);
+        return item.id;
+    }
+    /**
+     * Mark an entry as removed. Filtered out at search time; physically
+     * removed at the next `rebuild()`.
+     */
+    async removeItem(id) {
+        const inDelta = this.delta.delete(id);
+        const inMain = this.slotByUuid.has(id);
+        if (inMain)
+            this.tombstones.add(id);
+        return inDelta || inMain;
+    }
+    async search(queryVector, k = 10, filter, options) {
+        const lSearch = Math.max(this.config.defaultLSearch, k * 2);
+        const padding = options?.rerank?.multiplier ?? this.config.defaultPaddingFactor;
+        // 1. Main-index PQ-greedy walk (returns slot ids).
+        const mainHits = this.native
+            ? this.native.search(Array.from(queryVector), k * 2, // over-fetch so filter / tombstone losses don't starve final result
+            lSearch, padding)
+            : [];
+        // 2. Hydrate slot → uuid; drop tombstoned + filter-rejected.
+        const merged = [];
+        for (const hit of mainHits) {
+            const uuid = this.uuidBySlot.get(hit.slot);
+            if (!uuid)
+                continue;
+            if (this.tombstones.has(uuid))
+                continue;
+            if (filter && !(await filter(uuid)))
+                continue;
+            merged.push([uuid, hit.distance]);
+        }
+        // 3. Brute-force the delta buffer.
+        for (const [id, v] of this.delta) {
+            if (filter && !(await filter(id)))
+                continue;
+            const d = this.distanceFunction(queryVector, v);
+            merged.push([id, d]);
+        }
+        // 4. Sort ascending by distance, truncate to k.
+        merged.sort((a, b) => a[1] - b[1]);
+        return merged.slice(0, k);
+    }
+    size() {
+        const mainSize = this.native ? this.native.size() : 0;
+        return (mainSize +
+            this.delta.size -
+            // Tombstones from the main index reduce effective size.
+            this.countMainTombstones());
+    }
+    clear() {
+        this.delta.clear();
+        this.tombstones.clear();
+        this.slotByUuid.clear();
+        this.uuidBySlot.clear();
+        this.native = null;
+    }
+    /**
+     * Rebuild the main index from scratch: concatenate (current main −
+     * tombstones) ∪ delta, run a full DiskANN build, swap the searcher
+     * atomically.
+     *
+     * At billion-scale this is the expensive operation (hours of build
+     * time). Operators schedule it during off-peak; the delta buffer
+     * absorbs writes in between.
+     */
+    async rebuild(options) {
+        const bindings = loadNativeModule();
+        const NativeDiskANN = bindings.NativeDiskANN;
+        if (!NativeDiskANN) {
+            throw new Error('NativeDiskANN binding missing — rebuild requires the cortex native module');
+        }
+        // Collect the surviving vector set: main minus tombstones, plus delta.
+        const allVectors = [];
+        if (this.native) {
+            // Iterate current main index. The native side doesn't expose a
+            // vector iterator yet (35c follow-up), so we replay the
+            // delta+tombstones model: callers building from scratch should
+            // pass a fresh storage source. For now: rebuild from delta only.
+            // TODO once NativeDiskANN.iterAll() lands, fold the main index
+            //      into allVectors here.
+        }
+        for (const [id, vector] of this.delta) {
+            allVectors.push({ id, vector });
+        }
+        if (allVectors.length === 0) {
+            prodLog?.warn?.('NativeDiskAnnWrapper.rebuild: nothing to build');
+            return;
+        }
+        const dim = this.config.dimensions;
+        const buf = new Float32Array(allVectors.length * dim);
+        const newSlotByUuid = new Map();
+        const newUuidBySlot = new Map();
+        for (let i = 0; i < allVectors.length; i++) {
+            const v = allVectors[i].vector;
+            if (v.length !== dim) {
+                throw new Error(`NativeDiskAnnWrapper.rebuild: vector dim ${v.length} ≠ index dim ${dim}`);
+            }
+            buf.set(v, i * dim);
+            newSlotByUuid.set(allVectors[i].id, i);
+            newUuidBySlot.set(i, allVectors[i].id);
+        }
+        const cfg = {
+            vamana: {
+                maxDegree: options?.maxDegree ?? this.config.maxDegree,
+                searchListSize: options?.searchListSize ?? this.config.searchListSize,
+                alpha: options?.alpha ?? this.config.alpha,
+                seed: BigInt(0xd15ca4440ffff00dn),
+                parallel: true,
+                parallelBatch: 64,
+            },
+            pq: {
+                m: options?.pqM ?? this.config.pqM,
+                ksub: options?.pqKsub ?? this.config.pqKsub,
+                iterations: 25,
+                trainingSample: Math.min(200_000, allVectors.length),
+            },
+            adjacency: this.config.useMmapAdjacency
+                ? {
+                    kind: 'mmap',
+                    mmapPath: this.config.mmapAdjacencyPath ?? `${this.config.indexPath}.adj`,
+                }
+                : { kind: 'ram' },
+        };
+        const newNative = NativeDiskANN.build(Buffer.from(buf.buffer, buf.byteOffset, buf.byteLength), dim, this.config.indexPath, cfg);
+        // Atomic swap: replace the searcher + the slot maps, drop tombstones
+        // (they're already applied — the rebuilt set excludes them).
+        this.native = newNative;
+        this.slotByUuid = newSlotByUuid;
+        this.uuidBySlot = newUuidBySlot;
+        this.delta.clear();
+        this.tombstones.clear();
+    }
+    /**
+     * Flush the delta buffer to disk. For DiskANN the delta is in-memory
+     * by design (a few MB at most between rebuilds); returns the buffer
+     * size for parity with HNSW's flush contract.
+     */
+    async flush() {
+        return this.delta.size;
+    }
+    getPersistMode() {
+        return this.persistMode;
+    }
+    tryOpenExisting() {
+        try {
+            const bindings = loadNativeModule();
+            const NativeDiskANN = bindings.NativeDiskANN;
+            if (!NativeDiskANN)
+                return;
+            this.native = NativeDiskANN.openExisting(this.config.indexPath);
+            // Populate slot maps from the storage adapter — these are persisted
+            // alongside the index file in production. For 35c we read from a
+            // sibling `.slots.json` that rebuild() writes.
+            // (Stub for now; the real path lands when storage integration ships.)
+        }
+        catch {
+            // No existing file — index stays empty until first rebuild().
+            this.native = null;
+        }
+    }
+    countMainTombstones() {
+        let n = 0;
+        for (const uuid of this.tombstones) {
+            if (this.slotByUuid.has(uuid))
+                n++;
+        }
+        return n;
+    }
+}
+//# sourceMappingURL=NativeDiskAnnWrapper.js.map
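
Note on the search path: the hydrate, filter, brute-force, and merge steps in `search()` can be read in isolation from the native searcher. The sketch below is a simplified, synchronous model of those steps only. `mergeSearch` and `squaredL2` are hypothetical names, the main-index hits are passed in as plain data instead of coming from the native binding, and the distance function is a stand-in for whatever `DistanceFunction` the caller injects.

```typescript
type Hit = [string, number];

// Stand-in distance (squared Euclidean); the real wrapper uses the injected
// DistanceFunction, which this sketch does not attempt to reproduce.
function squaredL2(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Hypothetical model of the merge: hydrate slot ids, drop unmapped and
// tombstoned entries, brute-force the delta buffer, sort, truncate to k.
function mergeSearch(
  query: number[],
  k: number,
  mainHits: Array<{ slot: number; distance: number }>,
  uuidBySlot: Map<number, string>,
  tombstones: Set<string>,
  delta: Map<string, number[]>,
): Hit[] {
  const merged: Hit[] = [];
  for (const hit of mainHits) {
    const uuid = uuidBySlot.get(hit.slot);
    if (uuid === undefined || tombstones.has(uuid)) continue;
    merged.push([uuid, hit.distance]);
  }
  for (const [id, vector] of delta) {
    merged.push([id, squaredL2(query, vector)]);
  }
  merged.sort((a, b) => a[1] - b[1]);
  return merged.slice(0, k);
}
```

The sketch omits the async `filter` callback and the `k * 2` over-fetch that the shipped method applies before this merge.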

package/dist/hnsw/NativeHNSWWrapper.d.ts CHANGED

@@ -30,6 +30,7 @@ export declare class NativeHNSWWrapper implements HnswProvider {
     private unifiedCache;
     private cowEnabled;
     private mmapStore;
+    private connectionsCodec;
     constructor(config: (Partial<HNSWConfig> & {
         distanceFunction?: DistanceFunction;
     }) | undefined, distanceFunction: DistanceFunction, options?: {
@@ -83,6 +84,35 @@ export declare class NativeHNSWWrapper implements HnswProvider {
     enableCOW(parent: NativeHNSWWrapper): void;
     setUseParallelization(useParallelization: boolean): void;
     getUseParallelization(): boolean;
+    /**
+     * @description Accept (or detach) the brainy `ConnectionsCodec`. Brainy 7.27+
+     * calls this unconditionally during init from `wireConnectionsCodec()` when
+     * the `graph:compression` provider is registered (which cortex always
+     * supplies via `native.encodeConnections`/`decodeConnections`).
+     *
+     * Cortex's native HNSW serializes connections through its own path —
+     * `addItemFull` returns `nodeData` written directly via `storage.saveHNSWData`
+     * (and the mmap binary backend when available). It never routes through
+     * brainy's JS-side `persistNodeConnections`/`restoreNodeConnections`, which
+     * is where the codec is consumed. The codec is therefore unreachable from
+     * this wrapper.
+     *
+     * We accept the call (so brainy's init succeeds) and store the reference for
+     * introspection/parity. We do NOT re-encode connections through the codec on
+     * top of the native format — that would double-encode (waste CPU) or replace
+     * the native format with a strictly less efficient one (waste perf). Brainy
+     * treats the method as feature-detected/optional on third-party providers,
+     * so a storing acceptor is the contract-correct behaviour.
+     *
+     * @param codec - The `ConnectionsCodec` instance, or `null` to detach.
+     */
+    setConnectionsCodec(codec: unknown): void;
+    /**
+     * @description Read back the currently-attached `ConnectionsCodec`, or null.
+     * Exposed for parity tests + future inspection; cortex itself does not
+     * consult this value on the read/write path.
+     */
+    getConnectionsCodec(): unknown;
     size(): number;
     clear(): void;
     getEntryPointId(): string | null;

package/dist/hnsw/NativeHNSWWrapper.js CHANGED

@@ -38,6 +38,10 @@ export class NativeHNSWWrapper {
     cowEnabled = false;
     // Mmap binary HNSW store (Phase 4 — optional, used when storage has rootDirectory)
     mmapStore = null;
+    // Brainy ConnectionsCodec (brainy >= 7.27 `wireConnectionsCodec`). Stored for
+    // introspection but not consulted on the read/write path — see
+    // `setConnectionsCodec` below for the architectural rationale.
+    connectionsCodec = null;
     constructor(config = {}, distanceFunction, options = {}) {
         this.config = { ...DEFAULT_CONFIG, ...config };
         this.distanceFunction = distanceFunction;
@@ -485,6 +489,39 @@ export class NativeHNSWWrapper {
     getUseParallelization() {
         return this.useParallelization;
     }
+    /**
+     * @description Accept (or detach) the brainy `ConnectionsCodec`. Brainy 7.27+
+     * calls this unconditionally during init from `wireConnectionsCodec()` when
+     * the `graph:compression` provider is registered (which cortex always
+     * supplies via `native.encodeConnections`/`decodeConnections`).
+     *
+     * Cortex's native HNSW serializes connections through its own path —
+     * `addItemFull` returns `nodeData` written directly via `storage.saveHNSWData`
+     * (and the mmap binary backend when available). It never routes through
+     * brainy's JS-side `persistNodeConnections`/`restoreNodeConnections`, which
+     * is where the codec is consumed. The codec is therefore unreachable from
+     * this wrapper.
+     *
+     * We accept the call (so brainy's init succeeds) and store the reference for
+     * introspection/parity. We do NOT re-encode connections through the codec on
+     * top of the native format — that would double-encode (waste CPU) or replace
+     * the native format with a strictly less efficient one (waste perf). Brainy
+     * treats the method as feature-detected/optional on third-party providers,
+     * so a storing acceptor is the contract-correct behaviour.
+     *
+     * @param codec - The `ConnectionsCodec` instance, or `null` to detach.
+     */
+    setConnectionsCodec(codec) {
+        this.connectionsCodec = codec;
+    }
+    /**
+     * @description Read back the currently-attached `ConnectionsCodec`, or null.
+     * Exposed for parity tests + future inspection; cortex itself does not
+     * consult this value on the read/write path.
+     */
+    getConnectionsCodec() {
+        return this.connectionsCodec;
+    }
     // ---------------------------------------------------------------------------
     // Info / Introspection
     // ---------------------------------------------------------------------------
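
Note on `setConnectionsCodec`: the doc comment describes a "storing acceptor" that a caller may feature-detect. A minimal sketch of that pattern follows. All names here (`CodecAcceptor`, `StoringProvider`, `wireCodec`) are hypothetical; brainy's actual `wireConnectionsCodec()` is not shown in this diff and may differ.

```typescript
// Hypothetical shape: both methods are optional so that providers without
// codec support still satisfy the interface.
interface CodecAcceptor {
  setConnectionsCodec?(codec: unknown): void;
  getConnectionsCodec?(): unknown;
}

// Stores the reference for introspection and never uses it for encoding,
// mirroring the behaviour described in the doc comment.
class StoringProvider implements CodecAcceptor {
  private connectionsCodec: unknown = null;

  setConnectionsCodec(codec: unknown): void {
    this.connectionsCodec = codec;
  }

  getConnectionsCodec(): unknown {
    return this.connectionsCodec;
  }
}

// Caller-side feature detection: wire the codec only when the provider
// exposes the method, and report whether wiring happened.
function wireCodec(provider: CodecAcceptor, codec: unknown): boolean {
  if (typeof provider.setConnectionsCodec !== "function") return false;
  provider.setConnectionsCodec(codec);
  return true;
}
```

Passing `null` as the codec detaches it, which matches the `@param` description above.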

package/dist/plugin.js CHANGED

@@ -123,6 +123,11 @@ const cortexPlugin = {
         // Quantized distance: SQ8 cosine distance on uint8 arrays (no dequantization).
         // Consumed by brainy's HNSW SQ8 reranking (setSQ8DistanceImplementation).
         context.registerProvider('distance:sq8', native.cosineDistanceSq8);
+        // Quantized distance: SQ4 cosine distance on packed nibbles (2 values per byte).
+        // Consumed by brainy 7.28.0+ HNSW SQ4 reranking when config.hnsw.quantization.bits === 4
+        // via setSQ4DistanceImplementation. Byte-for-byte identical to brainy's
+        // distanceSQ4Js; cross-language parity verified in the brainy test suite.
+        context.registerProvider('distance:sq4', native.cosineDistanceSq4);
         // Graph connection compression: delta-varint encoded connection lists.
         // Reserved for the 2.4.0 vector/graph-storage initiative (HNSW connection
         // persistence). Registered now so that work wires brainy without a cortex change.
@@ -134,10 +139,13 @@ const cortexPlugin = {
         // up. The following native capabilities exist in Rust + napi but are intentionally
         // NOT registered (no brainy consumer yet) — they are re-registered the moment a
         // hook lands, with no Rust change required:
-        //   • SQ8 batch distance, SQ8/SQ4 quantize-codec, SQ4 distance, PQ codebook
-        //       → pending a brainy quantization-delegation hook (handoff BR-QUANT-SQ4-PQ)
-        //   • compaction:bfsOrder / compaction:hnswOrder
-        //       → pending a brainy compaction-order hook (handoff BR-COMPACTION-HOOK)
+        //   • SQ8 batch distance, SQ8/SQ4 quantize-codec, PQ codebook
+        //       → pending broader brainy quantization-delegation hooks beyond the
+        //         distance-fn swap (already wired for SQ8 + SQ4 above)
+        //   • compaction:bfsOrder / compaction:hnswOrder — superseded after the
+        //     2026-05-28 strategic reset: DiskANN's Vamana produces locality natively,
+        //     so the HNSW BFS-compaction hook is not pursued. Rust impls stay as
+        //     future-utility; no brainy hook will be added.
         // HNSW: Native Rust graph engine with SIMD distance and Arc-based COW
         const { NativeHNSWWrapper } = await import('./hnsw/NativeHNSWWrapper.js');
         context.registerProvider('hnsw', (config, distanceFn, options) => {
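
Note on `distance:sq4`: the new comment describes cosine distance computed directly on packed nibbles, two 4-bit values per byte. The sketch below illustrates that idea only. It is not brainy's `distanceSQ4Js` or cortex's `cosineDistanceSq4`: the nibble order (even index in the low nibble) and the treatment of codes as raw unsigned magnitudes with no scale or offset are both assumptions, and `packNibbles` and `cosineDistancePacked` are hypothetical names.

```typescript
// Packs 4-bit codes two per byte. Assumed layout: even index in the low
// nibble, odd index in the high nibble.
function packNibbles(codes: Uint8Array): Uint8Array {
  const out = new Uint8Array(Math.ceil(codes.length / 2));
  for (let i = 0; i < codes.length; i++) {
    const code = codes[i] & 0x0f;
    out[i >> 1] |= (i & 1) === 1 ? code << 4 : code;
  }
  return out;
}

// Cosine distance over packed codes without materializing unpacked arrays.
// Codes are treated as raw magnitudes (assumption); returns 1 when either
// vector has zero norm.
function cosineDistancePacked(a: Uint8Array, b: Uint8Array, dim: number): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < dim; i++) {
    const shift = (i & 1) << 2; // 0 for the low nibble, 4 for the high nibble
    const x = (a[i >> 1] >> shift) & 0x0f;
    const y = (b[i >> 1] >> shift) & 0x0f;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 1;
  return 1 - dot / Math.sqrt(normA * normB);
}
```

The `dim` argument is passed explicitly because an odd dimension leaves one unused nibble in the last byte.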

package/dist/utils/nativeBinaryEntityIdMapper.d.ts ADDED

@@ -0,0 +1,96 @@
+/**
+ * @module utils/nativeBinaryEntityIdMapper
+ * @description TypeScript wrapper around cortex's native binary
+ * `BinaryIdMapper`. Implements brainy's `EntityIdMapperProvider` so the
+ * mmap-backed billion-scale mapper is a drop-in for the existing
+ * JSON-persisted one.
+ *
+ * ## When this engages
+ *
+ * The cortex plugin registers this wrapper as the `'entityIdMapper'`
+ * provider when the storage adapter exposes `getBinaryBlobPath()` (i.e.
+ * filesystem-backed storage with cortex's 2.4.0 #2 mmap-vector layer).
+ * Cloud-storage adapters fall back to the JSON variant
+ * (`NativeEntityIdMapperWrapper`) since they have no local-path concept.
+ *
+ * ## UUID format conversion
+ *
+ * Brainy passes UUIDs as strings (typically the canonical 36-char
+ * `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`). The native side works in
+ * 16-byte Buffers. This wrapper converts at the boundary. Non-canonical
+ * UUID strings (any other 32-hex-digit form) are also accepted.
+ *
+ * ## Concurrency
+ *
+ * `getOrAssign` is atomic across concurrent callers for the same UUID
+ * (256 sharded per-UUID mutexes in the native layer). Lookups are
+ * lock-free. The wrapper holds no JS-side mutable state besides the
+ * native handle.
+ */
+import type { StorageAdapter } from '@soulcraft/brainy';
+import type { EntityIdMapperProvider } from '../providerContracts.js';
+export interface NativeBinaryEntityIdMapperOptions {
+    /** Storage adapter — required for binary blob path resolution. */
+    storage: StorageAdapter;
+    /**
+     * Override the relative path under storage for the uuid_to_int file.
+     * Default `_id_mapper/uuid_to_int.mkv`.
+     */
+    uuidToIntKey?: string;
+    /**
+     * Override the relative path under storage for the int_to_uuid file.
+     * Default `_id_mapper/int_to_uuid.bin`.
+     */
+    intToUuidKey?: string;
+    /** Sparse file size for int_to_uuid. Default 32 GB. */
+    intToUuidSize?: bigint;
+    /** Sparse file size for uuid_to_int. Default 32 GB. */
+    uuidToIntSize?: bigint;
+    /** Bucket capacity in the MmapKv. Default 16. */
+    bucketCapacity?: number;
+    /** Maximum extendible-hash directory depth. Default 28. */
+    maxGlobalDepth?: number;
+}
+/**
+ * Drop-in `EntityIdMapperProvider` backed by the native `BinaryIdMapper`.
+ *
+ * @example
+ * ```typescript
+ * const mapper = new NativeBinaryEntityIdMapperWrapper({ storage })
+ * await mapper.init()
+ * const intId = mapper.getOrAssign('12345678-1234-5678-1234-567812345678')
+ * const uuid = mapper.getUuid(intId)
+ * ```
+ */
+export declare class NativeBinaryEntityIdMapperWrapper implements EntityIdMapperProvider {
+    private storage;
+    private uuidToIntKey;
+    private intToUuidKey;
+    private intToUuidSize;
+    private uuidToIntSize;
+    private bucketCapacity;
+    private maxGlobalDepth;
+    private native;
+    private initialized;
+    constructor(options: NativeBinaryEntityIdMapperOptions);
+    init(): Promise<void>;
+    getOrAssign(uuid: string): number;
+    getUuid(intId: number): string | undefined;
+    getInt(uuid: string): number | undefined;
+    remove(uuid: string): boolean;
+    flush(): Promise<void>;
+    clear(): Promise<void>;
+    getAllIntIds(): number[];
+    intsIterableToUuids(ints: Iterable<number>): string[];
+    get size(): number;
+    /**
+     * Encode a UUID string into a 16-byte Buffer. Accepts canonical
+     * 36-char form (with hyphens) or any 32-hex-digit form. Throws on
+     * malformed input.
+     */
+    private encode;
+    /** Decode a 16-byte Buffer back to canonical UUID string. */
+    private decode;
+    private ensure;
+}
+//# sourceMappingURL=nativeBinaryEntityIdMapper.d.ts.map
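
Note on the UUID boundary conversion: the declaration documents private `encode` and `decode` methods that map between UUID strings and 16-byte buffers, accepting the canonical 36-character form or any 32-hex-digit form. One plausible way to implement that contract is sketched below. `encodeUuid` and `decodeUuid` are hypothetical names, and the sketch uses `Uint8Array` where the wrapper is documented to use `Buffer`; the shipped implementation may differ in detail.

```typescript
// Hypothetical encoder: strips hyphens, validates 32 hex digits, and writes
// one byte per hex pair. Throws on malformed input.
function encodeUuid(uuid: string): Uint8Array {
  const hex = uuid.replace(/-/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error(`malformed UUID: ${uuid}`);
  }
  const out = new Uint8Array(16);
  for (let i = 0; i < 16; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// Hypothetical decoder: always emits the canonical lowercase 8-4-4-4-12 form.
function decodeUuid(bytes: Uint8Array): string {
  if (bytes.length !== 16) {
    throw new Error(`expected 16 bytes, got ${bytes.length}`);
  }
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

Because decoding always produces the canonical form, a non-canonical input does not round-trip to the same string, only to the same 16 bytes.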

package/dist/utils/nativeBinaryEntityIdMapper.js ADDED

@@ -0,0 +1,208 @@
+/**
+ * @module utils/nativeBinaryEntityIdMapper
+ * @description TypeScript wrapper around cortex's native binary
+ * `BinaryIdMapper`. Implements brainy's `EntityIdMapperProvider` so the
+ * mmap-backed billion-scale mapper is a drop-in for the existing
+ * JSON-persisted one.
+ *
+ * ## When this engages
+ *
+ * The cortex plugin registers this wrapper as the `'entityIdMapper'`
+ * provider when the storage adapter exposes `getBinaryBlobPath()` (i.e.
+ * filesystem-backed storage with cortex's 2.4.0 #2 mmap-vector layer).
+ * Cloud-storage adapters fall back to the JSON variant
+ * (`NativeEntityIdMapperWrapper`) since they have no local-path concept.
+ *
+ * ## UUID format conversion
+ *
+ * Brainy passes UUIDs as strings (typically the canonical 36-char
+ * `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`). The native side works in
+ * 16-byte Buffers. This wrapper converts at the boundary. Non-canonical
+ * UUID strings (any other 32-hex-digit form) are also accepted.
+ *
+ * ## Concurrency
+ *
+ * `getOrAssign` is atomic across concurrent callers for the same UUID
+ * (256 sharded per-UUID mutexes in the native layer). Lookups are
+ * lock-free. The wrapper holds no JS-side mutable state besides the
+ * native handle.
+ */
+import { existsSync } from 'node:fs';
+import { loadNativeModule } from '../native/index.js';
+import { prodLog } from '@soulcraft/brainy/internals';
+const UUID_BYTES = 16;
+const DEFAULT_UUID_TO_INT_KEY = '_id_mapper/uuid_to_int.mkv';
+const DEFAULT_INT_TO_UUID_KEY = '_id_mapper/int_to_uuid.bin';
+/**
+ * Drop-in `EntityIdMapperProvider` backed by the native `BinaryIdMapper`.
+ *
+ * @example
+ * ```typescript
+ * const mapper = new NativeBinaryEntityIdMapperWrapper({ storage })
+ * await mapper.init()
+ * const intId = mapper.getOrAssign('12345678-1234-5678-1234-567812345678')
+ * const uuid = mapper.getUuid(intId)
+ * ```
+ */
+export class NativeBinaryEntityIdMapperWrapper {
+    storage;
+    uuidToIntKey;
+    intToUuidKey;
+    intToUuidSize;
+    uuidToIntSize;
+    bucketCapacity;
+    maxGlobalDepth;
+    native = null;
+    initialized = false;
+    constructor(options) {
+        this.storage = options.storage;
+        this.uuidToIntKey = options.uuidToIntKey ?? DEFAULT_UUID_TO_INT_KEY;
+        this.intToUuidKey = options.intToUuidKey ?? DEFAULT_INT_TO_UUID_KEY;
+        this.intToUuidSize = options.intToUuidSize ?? BigInt(32) * BigInt(1024) ** BigInt(3);
+        this.uuidToIntSize = options.uuidToIntSize ?? BigInt(32) * BigInt(1024) ** BigInt(3);
+        this.bucketCapacity = options.bucketCapacity ?? 16;
+        this.maxGlobalDepth = options.maxGlobalDepth ?? 28;
+    }
+    async init() {
+        if (this.initialized)
+            return;
+        const storage = this.storage;
+        if (!storage.getBinaryBlobPath) {
+            throw new Error('NativeBinaryEntityIdMapperWrapper requires a storage adapter that ' +
+                'exposes getBinaryBlobPath() (filesystem-backed). For cloud adapters, ' +
+                'use NativeEntityIdMapperWrapper (JSON variant) instead.');
+        }
+        const uuidToIntPath = storage.getBinaryBlobPath(this.uuidToIntKey);
+        const intToUuidPath = storage.getBinaryBlobPath(this.intToUuidKey);
+        if (!uuidToIntPath || !intToUuidPath) {
+            throw new Error(`NativeBinaryEntityIdMapperWrapper: getBinaryBlobPath returned null for ` +
+                `${this.uuidToIntKey} or ${this.intToUuidKey}`);
+        }
+        const bindings = loadNativeModule();
+        const NativeBinaryIdMapper = bindings.NativeBinaryIdMapper;
+        if (!NativeBinaryIdMapper) {
+            throw new Error('NativeBinaryIdMapper binding missing from cortex native module — ' +
+                'this build of cortex is older than the BinaryIdMapper feature');
+        }
+        const config = {
+            uuidToIntPath,
+            intToUuidPath,
+            intToUuidSize: this.intToUuidSize,
+            uuidToIntSize: this.uuidToIntSize,
+            bucketCapacity: this.bucketCapacity,
+            maxGlobalDepth: this.maxGlobalDepth,
+        };
+        // Explicitly distinguish "fresh install" from "existing files".
+        // Both files must exist together (paired write semantics) — a
+        // half-present state is corruption from a crash between file
+        // creations and is surfaced as an error rather than silently
+        // recreated.
+        const uuidFileExists = existsSync(uuidToIntPath);
+        const intFileExists = existsSync(intToUuidPath);
+        if (uuidFileExists && intFileExists) {
+            this.native = NativeBinaryIdMapper.openExisting(config);
+        }
+        else if (!uuidFileExists && !intFileExists) {
+            this.native = NativeBinaryIdMapper.create(config);
+        }
+        else {
+            throw new Error(`NativeBinaryEntityIdMapperWrapper: half-present file pair — ` +
+                `${this.uuidToIntKey} ${uuidFileExists ? 'exists' : 'missing'}, ` +
+                `${this.intToUuidKey} ${intFileExists ? 'exists' : 'missing'}. ` +
+                `Refusing to silently recreate; investigate manually.`);
+        }
+        this.initialized = true;
+        if (prodLog?.debug) {
+            prodLog.debug(`[cortex] BinaryIdMapper wired: paths=[${uuidToIntPath}, ${intToUuidPath}]`);
+        }
+    }
+    getOrAssign(uuid) {
+        const native = this.ensure();
+        return native.getOrAssign(this.encode(uuid));
+    }
+    getUuid(intId) {
+        const native = this.ensure();
+        const buf = native.getUuid(intId);
+        if (!buf)
+            return undefined;
+        return this.decode(buf);
+    }
+    getInt(uuid) {
+        const native = this.ensure();
+        const out = native.getInt(this.encode(uuid));
+        return out == null ? undefined : out;
+    }
+    remove(uuid) {
+        const native = this.ensure();
+        return native.remove(this.encode(uuid));
+    }
+    async flush() {
+        const native = this.ensure();
+        native.flush();
+    }
+    async clear() {
+        // Reset by recreating the files. Atomicity caveat: any concurrent
+        // reader holds a stale mmap. Brainy only calls this from its own
+        // clear() path, which already blocks other access, so this is fine.
+        this.initialized = false;
+        this.native = null;
+        await this.init();
+    }
+    getAllIntIds() {
+        const native = this.ensure();
+        return native.getAllIntIds();
+    }
+    intsIterableToUuids(ints) {
+        const native = this.ensure();
+        const out = [];
+        for (const i of ints) {
+            const buf = native.getUuid(i);
+            if (buf)
+                out.push(this.decode(buf));
+        }
+        return out;
+    }
+    get size() {
+        if (!this.initialized || !this.native)
+            return 0;
+        return this.native.size();
+    }
+    // ---------------------------------------------------------------
+    // UUID string ↔ Buffer conversion
+    // ---------------------------------------------------------------
+    /**
+     * Encode a UUID string into a 16-byte Buffer. Accepts canonical
+     * 36-char form (with hyphens) or any 32-hex-digit form. Throws on
+     * malformed input.
+     */
+    encode(uuid) {
+        const hex = uuid.replace(/-/g, '').toLowerCase();
+        if (hex.length !== 32 || !/^[0-9a-f]{32}$/.test(hex)) {
+            throw new Error(`NativeBinaryEntityIdMapperWrapper: invalid UUID string "${uuid}"`);
+        }
+        return Buffer.from(hex, 'hex');
+    }
+    /** Decode a 16-byte Buffer back to canonical UUID string. */
+    decode(buf) {
+        if (buf.length !== UUID_BYTES) {
+            throw new Error(`NativeBinaryEntityIdMapperWrapper: native returned ${buf.length}-byte uuid (expected ${UUID_BYTES})`);
+        }
+        const hex = buf.toString('hex');
+        return (hex.slice(0, 8) +
+            '-' +
+            hex.slice(8, 12) +
+            '-' +
+            hex.slice(12, 16) +
+            '-' +
+            hex.slice(16, 20) +
+            '-' +
+            hex.slice(20, 32));
+    }
+    ensure() {
+        if (!this.initialized || !this.native) {
+            throw new Error('NativeBinaryEntityIdMapperWrapper: call init() before any operation');
+        }
+        return this.native;
+    }
+}
+//# sourceMappingURL=nativeBinaryEntityIdMapper.js.map
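
The UUID string ↔ 16-byte conversion done by `encode` / `decode` above can be sketched on its own. This version uses `Uint8Array` instead of Node's `Buffer` so it has no runtime dependencies. The names `encodeUuid` and `decodeUuid` are illustrative only and are not exports of the package.

```typescript
const UUID_BYTE_LENGTH = 16

// Accepts the canonical 36-char form or any 32-hex-digit form, like the wrapper.
function encodeUuid(uuid: string): Uint8Array {
  const hex = uuid.replace(/-/g, '').toLowerCase()
  if (!/^[0-9a-f]{32}$/.test(hex)) {
    throw new Error(`invalid UUID string "${uuid}"`)
  }
  const out = new Uint8Array(UUID_BYTE_LENGTH)
  for (let i = 0; i < UUID_BYTE_LENGTH; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16)
  }
  return out
}

// Always emits the canonical lower-case 8-4-4-4-12 form.
function decodeUuid(bytes: Uint8Array): string {
  if (bytes.length !== UUID_BYTE_LENGTH) {
    throw new Error(`expected ${UUID_BYTE_LENGTH} bytes, got ${bytes.length}`)
  }
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-')
}

const roundTripped = decodeUuid(encodeUuid('12345678123456781234567812345678'))
console.log(roundTripped)
```

Because decoding always produces the canonical form, a non-canonical input does not round-trip to the same string, only to the same 16 bytes.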

package/docs/ADR-002-diskann-100-percent-rust.md ADDED

@@ -0,0 +1,294 @@
+---
+title: ADR-002 — DiskANN as cortex's billion-scale index option
+slug: cortex/adr-002-diskann
+public: true
+category: cortex
+template: concept
+order: 2
+description: Architectural decision record for cortex's planned DiskANN integration. 100% pure Rust, filesystem-only, auto-engages when conditions are met. The billion-scale upgrade path that pairs with brainy's existing TS HNSWIndex.
+---
+# ADR-002 — DiskANN as cortex's billion-scale index option
+**Status:** Decided 2026-05-28. Implementation queued across three coordinated sessions for cortex 3.0.0 + brainy 8.0.0.
+**Supersedes:** Original DiskANN spike task (#35), retired 2026-05-28 in favour of the three-session plan captured here.
+**Related:**
+- [ADR-001](./ADR-001-column-store-string-support.md) — native column store, shipped 2.3.0
+- brainy 7.28.0 SQ4 (4-bit) quantization — paves the PQ path for DiskANN's compressed in-RAM distance
+- Cortex 2.4.0 storage foundations (#23–#26) — stable IDs, mmap vector layer, graph compression — all transfer directly to DiskANN
+## Context
+Brainy ships a TypeScript HNSW index that works excellently up to roughly 10M vectors per node on commodity hardware. Cortex 2.3.0 added a Rust-native HNSW variant via the `hnsw` provider hook — same algorithm, ~3–10× the throughput on hot paths thanks to SIMD distance and a tighter graph layout. The 2.4.0 storage foundations (vector mmap store, graph link compression, stable entity IDs) push HNSW further into the disk-resident regime.
+But HNSW's design assumes the graph fits comfortably in memory. Past ~10M vectors, two costs compound:
+1. **Memory pressure** — the graph alone (`M × node_count` neighbour pointers + level metadata) plus the vector store (`dim × 4 × node_count` bytes for float32) blows past the RAM budget of normal nodes. At 100M vectors of 384-dim embeddings: ~150 GB of vectors + ~13 GB of graph = ~163 GB RAM minimum. At 1B vectors: ~1.6 TB RAM — out of reach on single boxes.
+2. **Disk-locality on cold caches** — even with vectors offloaded to mmap, HNSW's traversal order has no correlation with insertion order on disk. Each search hop typically faults a new page, costing ~10 μs per hop on NVMe SSD. A 100-hop search burns ~1 ms of disk wait that proper locality would have served from a single 10 μs read.
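
The memory figures in point 1 can be reproduced with a back-of-envelope calculation. This is a sketch under assumptions the ADR does not state: `M = 32` neighbours per node and 4-byte neighbour ids, with level metadata ignored. The function name is hypothetical.

```typescript
// Rough HNSW RAM estimate from the two formulas quoted in the text:
//   vectors = dim × 4 × node_count   (float32)
//   graph   = M × node_count neighbour pointers
// ASSUMPTION: m = 32 and 4 bytes per pointer; neither value is given in the ADR.
function hnswRamBytes(nodeCount: number, dim: number, m = 32) {
  const vectors = dim * 4 * nodeCount
  const graph = m * 4 * nodeCount
  return { vectors, graph, total: vectors + graph }
}

const GB = 1e9
const at100M = hnswRamBytes(100_000_000, 384)
const at1B = hnswRamBytes(1_000_000_000, 384)

console.log(`100M: ${(at100M.total / GB).toFixed(1)} GB`)
console.log(`1B:   ${(at1B.total / GB).toFixed(1)} GB`)
```

Under these assumptions the 100M case comes to 153.6 GB of vectors plus 12.8 GB of graph, in line with the rounded figures above.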
+[DiskANN](https://github.com/microsoft/DiskANN) (Microsoft, 2019) was designed for exactly this regime. Its Vamana graph construction uses α-pruning to choose neighbours that produce **disk-locality natively**: nodes visited together during search end up adjacent on disk. Combined with **product quantization (PQ)** in RAM for approximate distance, and full vectors on disk for re-ranking the final candidate set, DiskANN holds single-machine billion-scale search at a fraction of HNSW's RAM cost.
+**For cortex specifically, DiskANN is the natural billion-scale upgrade path** because:
+- The 2.4.0 foundations (stable IDs, mmap vector layer, graph link compression) transfer to it without rework
+- The 2.5.0 #30 SQ4 quantization work primes the PQ codec path
+- Cortex's positioning has always been "billion-scale via Rust acceleration" — DiskANN fits the message
+- We control the rest of the stack (storage adapters, idMapper, HNSW provider), so the integration is in friendly territory
+We considered **ScaNN** (Google, Apache 2.0) as an alternative. It posts SOTA recall/QPS numbers at moderate scale with anisotropic vector quantization. We declined: ScaNN is IVF-based (inverted file with partition centroids), which doesn't align with brainy's graph-native architecture. Switching to IVF would mean losing the structural symmetry between brainy's verb graph and its vector index, plus introducing periodic clustering retraining (an operational concern brainy doesn't currently have).
+## Decisions
+### Decision 1 — DiskANN as the billion-scale upgrade path (not a replacement for HNSW)
+HNSW stays the brainy default forever. Every existing user, including those without cortex, continues to get the TS `HNSWIndex` they ship with today. DiskANN is added as an **alternative provider that engages when its constraints are met**.
+This preserves three properties we don't want to give up:
+- Zero-friction onboarding for new brainy users (no cortex required, no config tuning to pick an algorithm).
+- Backward compatibility for every existing brainy install (no surprise migrations on upgrade).
+- The "cortex makes brainy faster" story (not "cortex makes brainy different").
+### Decision 2 — 100% pure Rust, no C++ FFI
+We will port DiskANN's Vamana algorithm to Rust from the published paper (Subramanya et al., NeurIPS 2019; Singh et al., 2021) rather than wrap Microsoft's C++ reference implementation via FFI. The Vamana algorithm is straightforward: greedy graph construction with an α-pruning step that controls graph density. The published pseudocode plus the reference implementation's behaviour give us everything we need to validate correctness.
+PQ codec: we will either compose a battle-tested Rust crate (e.g., `qdrant-quantization`, Apache 2.0) or implement PQ training + encode/decode in cortex directly, depending on parity test outcomes. Either way, no C++.
+**Why not FFI:** cross-platform C++ builds for Node native modules are operationally expensive (Linux/macOS/Windows × x64/arm64 binaries, headers, link-time gotchas), Microsoft's reference impl has its own build dependencies that would propagate, and we'd inherit any patent grant ambiguities at the binary level. Pure Rust gives us napi-rs's mature cross-platform binary distribution and a license posture we fully control.
+**Why not adopt an existing Rust crate wholesale:** no mature Rust port of Vamana exists as of this writing. We will track this and pivot if a high-quality one emerges; for now we're building it.
+### Decision 3 — Filesystem-only deployment in the first release
+DiskANN is local-SSD-by-design. The whole point of the architecture is that disk reads are cheap (NVMe-cheap, ~10 μs) and predictable, so the search algorithm can lean on the OS page cache + the on-disk layout's locality.
+Cloud object storage (S3, R2, GCS) breaks that assumption: range reads of large objects cost ~100 ms of round-trip latency, and the locality model has to account for HTTP/2 framing instead of OS pages. Supporting cloud storage for DiskANN would require either:
+- A persistent "DiskANN file lives on a local cache disk that we sync from cloud" model (operationally heavy), or
+- A fundamentally different search algorithm with batched range reads (no longer DiskANN, really).
+**For the first DiskANN release, the activation conditions explicitly require `storage.adapter === 'filesystem'`.** Cloud-storage users continue to use HNSW. We may revisit cloud support if there's demand and an approach that doesn't compromise the algorithm's strengths.
+### Decision 4 — Auto-engagement, zero configuration
+When all of these conditions hold at brainy init, DiskANN replaces HNSW as the active index without any user config:
+1. Cortex is loaded as a plugin (the `index:diskann` provider is registered)
+2. The storage adapter is `FileSystemStorage` (local SSD)
+3. The metadata index exposes a stable `idMapper` (the 2.4.0 #23 foundation)
+This mirrors the existing `MmapVectorBackend` wiring pattern from 2.4.0 #24: the heavy machinery activates when its preconditions are met, and otherwise silently falls back. Users who don't want it can opt out via `config.index.type = 'hnsw'`.
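
The activation rule can be read as a single predicate. The sketch below is illustrative only: the `InitEnvironment` shape and `selectIndex` are hypothetical names, not brainy or cortex APIs. It also leaves out the existing-HNSW-on-disk case, which Decision 5 handles separately.

```typescript
// Hypothetical summary of what brainy knows at init time.
interface InitEnvironment {
  diskAnnProviderRegistered: boolean // condition 1: cortex loaded as a plugin
  storageKind: 'filesystem' | 'cloud' | 'memory' // condition 2
  hasStableIdMapper: boolean // condition 3
  configuredIndexType?: 'hnsw' | 'diskann' // user opt-out, if any
}

function selectIndex(env: InitEnvironment): 'hnsw' | 'diskann' {
  // An explicit opt-out always wins.
  if (env.configuredIndexType === 'hnsw') return 'hnsw'
  const eligible =
    env.diskAnnProviderRegistered &&
    env.storageKind === 'filesystem' &&
    env.hasStableIdMapper
  // If any precondition fails, fall back silently to HNSW.
  return eligible ? 'diskann' : 'hnsw'
}

const cortexOnLocalDisk: InitEnvironment = {
  diskAnnProviderRegistered: true,
  storageKind: 'filesystem',
  hasStableIdMapper: true,
}
console.log(selectIndex(cortexOnLocalDisk))
```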
+**Why auto-engage instead of opt-in by config:**
+- Matches cortex's "loading cortex makes brainy faster" value proposition (no extra knob to turn)
+- The constraints (cortex + filesystem) are exactly the deployment shape DiskANN targets, so the conditions ARE the signal
+- Opt-in-only would leave most filesystem-using cortex installs on HNSW out of caution — defeating the point
+**Why not unconditional default:**
+- Cloud-storage users have no DiskANN-compatible path; we can't break their existing HNSW workflows
+- Cortex-less users (the brainy-only crowd) never see DiskANN regardless — preserves the "brainy works the same with or without cortex" property
+### Decision 5 — Explicit migration API for existing installs
+Existing brainy installs with an HNSW index on disk **do not auto-migrate to DiskANN on upgrade**. The on-disk HNSW state is detected at init; if `config.index.type` is unset, brainy logs:
+> `[brainy] Existing HNSW index detected at <path>. The new cortex default for filesystem storage is DiskANN. Continue using HNSW (set config.index.type='hnsw' to silence this message) or run brain.migrateToDiskAnn() to convert.`
+The migration API:
+```typescript
+// Convert an existing HNSW index to DiskANN.
+// Builds the DiskANN index in parallel (separate files), verifies recall
+// parity at the configured threshold, then atomically swaps the active
+// index. Reversible via brain.migrateToHnsw().
+await brain.migrateToDiskAnn({
+  recallTarget?: number,    // default 0.95 — verification target before swap
+  paddingFactor?: number,   // default 1.2 — slack for re-ranking candidate set
+  parallel?: boolean        // default true — build new index alongside live old
+})
+```
+Reversibility (`brain.migrateToHnsw()`) is a contract, not a courtesy. Users need to be able to roll back if recall regression or any other issue surfaces in production.
+## Architecture
+### Brainy provider contract
+Brainy's plugin contract gains two new interfaces, which cortex implements (mirroring the existing `hnsw` provider shape):
+```typescript
+// brainy: src/plugin.ts
+export interface DiskAnnProvider {
+  create(config: DiskAnnConfig, distance: DistanceFunction, options: DiskAnnOptions): DiskAnnInstance
+  openExisting(path: string, distance: DistanceFunction): DiskAnnInstance
+}
+export interface DiskAnnInstance extends HnswProvider {
+  // Implements the same interface HNSWIndex/HnswProvider exposes, so the rest
+  // of brainy doesn't care which index is active. Adds one DiskANN-specific
+  // method for the migration API:
+  rebuildPQCodebook(): Promise<void>  // Re-trains PQ from current vectors
+}
+```
+The instance implementing `HnswProvider` is the load-bearing decision. brainy's search/find/get code paths call into the provider through this surface; an `HNSWIndex` and a `NativeDiskANN` are interchangeable from brainy's POV. No control-flow plumbing changes in brainy beyond the choice of which provider to instantiate.
+### Cortex Rust modules
+```
+cortex/native/src/diskann/
+├── mod.rs        — napi exports + the NativeDiskANN class
+├── vamana.rs     — α-pruning greedy graph construction (~500 LOC)
+├── pq.rs         — Product Quantization codebook training + encode/decode
+├── format.rs     — On-disk file format (header + PQ codebook + graph + vectors)
+└── search.rs     — Greedy graph search with PQ-approximate distance + re-rank
+```
+### On-disk file format
+Single contiguous file `<dataDir>/_diskann/main.bin` (path mirrors `_vectors/main.bin` from #24):
+```
++--------------------------------------------------------------+
+| Header (4 KB, aligned)                                       |
+|   magic: u32         "DKAN"                                  |
+|   version: u32       layout revision                         |
+|   dim: u32           vector dimensionality                   |
+|   node_count: u32    total vectors                           |
+|   pq_subspaces: u8   PQ M parameter (typically 8 or 16)      |
+|   pq_bits: u8        bits per subspace (typically 8)         |
+|   max_degree: u8     Vamana R parameter (typically 64-96)    |
+|   entry_point: u32   slot id of the entry node               |
+|   ... reserved bytes for forward compatibility ...           |
++--------------------------------------------------------------+
+| PQ codebook (M × 256 × subvec_dim × f32)                     |
++--------------------------------------------------------------+
+| PQ codes (node_count × M bytes)                              |
+| — one PQ code per node, M bytes each, in slot order          |
++--------------------------------------------------------------+
+| Vamana graph (node_count × max_degree × u32)                 |
+| — flat CSR-like array of neighbour slot ids                  |
+| — fixed degree per node for predictable offset math          |
++--------------------------------------------------------------+
+| Full vectors (node_count × dim × f32)                        |
+| — only touched for re-ranking the final candidate set        |
++--------------------------------------------------------------+
+```
+The fixed-degree Vamana graph trades a small density loss for O(1) neighbour-offset arithmetic. PQ codes pack tightly in RAM (M bytes per vector — at M=16 that's 16 bytes/vector regardless of dim, so 1B vectors fit in ~16 GB RAM for the PQ-resident layer).
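
The offset arithmetic implied by this layout can be sketched as follows. It assumes the sections are packed back to back with no alignment padding after the 4 KB header, and 8-bit PQ codes (256 centroids per subspace). The ADR does not confirm either detail, and the names here are hypothetical.

```typescript
interface DiskAnnHeader {
  dim: number
  nodeCount: number
  pqSubspaces: number // PQ M parameter
  maxDegree: number // Vamana R parameter
}

const HEADER_BYTES = 4096

// Byte offset of each section. ASSUMPTION: no padding between sections.
function sectionOffsets(h: DiskAnnHeader) {
  const subvecDim = h.dim / h.pqSubspaces
  const codebookBytes = h.pqSubspaces * 256 * subvecDim * 4 // f32 centroids
  const codesBytes = h.nodeCount * h.pqSubspaces // M bytes per node
  const graphBytes = h.nodeCount * h.maxDegree * 4 // u32 slot ids
  const codebook = HEADER_BYTES
  const codes = codebook + codebookBytes
  const graph = codes + codesBytes
  const vectors = graph + graphBytes
  return { codebook, codes, graph, vectors, codesBytes }
}

// The fixed degree gives O(1) lookup of a node's neighbour list.
function neighbourListOffset(h: DiskAnnHeader, slot: number): number {
  return sectionOffsets(h).graph + slot * h.maxDegree * 4
}

const billion: DiskAnnHeader = { dim: 384, nodeCount: 1_000_000_000, pqSubspaces: 16, maxDegree: 64 }
const layout = sectionOffsets(billion)
console.log(`PQ codes resident in RAM: ${layout.codesBytes / 1e9} GB`)
```

With `M = 16` the PQ-code section is 16 bytes per node, which is where the ~16 GB figure for 1B vectors comes from.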
+### Search algorithm
+```
+async function search(query: Vector, k: number): Promise<Result[]> {
+  // 1. PQ-encode the query into M sub-vector codes
+  const queryPq = pqEncode(query, codebook)
+  // 2. Greedy graph walk using PQ-approximate distance
+  const visited = new Set<u32>()
+  const candidates = new BoundedHeap(maxLen = k * paddingFactor)
+  let current = entryPoint
+  while (improving(candidates)) {
+    const neighbours = graph[current]
+    for (const n of neighbours) {
+      if (visited.has(n)) continue
+      visited.add(n)
+      const approxDist = pqDistance(queryPq, codes[n])
+      candidates.insert(n, approxDist)
+    }
+    current = candidates.bestUnvisited()
+  }
+  // 3. Re-rank the top-(k * paddingFactor) candidates with full vectors
+  const topCandidates = candidates.topN(k * paddingFactor)
+  return topCandidates
+    .map(n => ({ id: idMapper.getUuid(n), distance: trueDistance(query, vectors[n]) }))
+    .sort((a, b) => a.distance - b.distance)
+    .slice(0, k)
+}
+```
+The `paddingFactor` (default 1.2 = 20% over-fetch) controls the recall/cost tradeoff. PQ approximate distance is fast but lossy; re-ranking the over-fetched candidate set with full-precision vectors recovers recall at a small cost: about `k × paddingFactor` full-vector reads per query (12 at k=10 with the default), which is fine on SSD.
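
A runnable toy version of the two-phase search, for illustration only. It is not the cortex implementation: real PQ codes are replaced by a coarsened copy of each vector (`approx`), the bounded heap is replaced by re-sorting a map, and the names `ToyIndex` and `toySearch` are hypothetical.

```typescript
type Vec = number[]

interface ToyIndex {
  vectors: Vec[] // full-precision vectors, read only for re-ranking
  approx: Vec[] // lossy stand-in for PQ codes (assumption, not real PQ)
  graph: number[][] // neighbour slot ids per node
  entryPoint: number
}

const sqDist = (a: Vec, b: Vec): number =>
  a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0)

function toySearch(index: ToyIndex, query: Vec, k: number, paddingFactor = 1.2) {
  const listSize = Math.ceil(k * paddingFactor)
  const scored = new Map<number, number>() // slot -> approximate distance
  const expanded = new Set<number>()
  scored.set(index.entryPoint, sqDist(query, index.approx[index.entryPoint]))

  // Best `listSize` scored slots, closest first.
  const beam = () =>
    [...scored.entries()].sort((a, b) => a[1] - b[1]).slice(0, listSize)

  // Phase 1: greedy walk using approximate distances only.
  for (;;) {
    const next = beam().find(([slot]) => !expanded.has(slot))
    if (!next) break // every beam entry has been expanded
    expanded.add(next[0])
    for (const n of index.graph[next[0]]) {
      if (!scored.has(n)) scored.set(n, sqDist(query, index.approx[n]))
    }
  }

  // Phase 2: re-rank the over-fetched beam with full-precision vectors.
  return beam()
    .map(([slot]) => ({ slot, distance: sqDist(query, index.vectors[slot]) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k)
}

// Ten points on a line; the "codes" round each coordinate down to an even number.
const points: Vec[] = Array.from({ length: 10 }, (_, i) => [i])
const toyIndex: ToyIndex = {
  vectors: points,
  approx: points.map(([x]) => [2 * Math.floor(x / 2)]),
  graph: points.map((_, i) => [i - 1, i + 1].filter((n) => n >= 0 && n < 10)),
  entryPoint: 0,
}
const toyHits = toySearch(toyIndex, [6.2], 2)
console.log(toyHits.map((h) => h.slot))
```

In this toy data slots 6 and 7 share the same coarse code, so the approximate walk cannot tell them apart. The re-rank step puts them in the right order.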
+## Implementation plan
+### Session 35a — Vamana + PQ in pure Rust (cortex)
+**Scope (~3–5 hrs focused):**
+- `cortex/native/src/diskann/vamana.rs` — Vamana graph construction with α-pruning, ~500 LOC. Inputs: vector buffer, dim, R (max degree), α (density parameter, typically 1.2–1.4). Output: CSR adjacency.
+- `cortex/native/src/diskann/pq.rs` — PQ codebook training (k-means on subvector partitions) + encode/decode. M subspaces × 256 centroids each, configurable.
+- `cortex/native/src/diskann/format.rs` — On-disk file format struct + read/write primitives.
+- Rust unit tests: graph connectivity invariants, PQ recall on small synthetic dataset, format round-trip.
+**Exit criteria:** Vamana graph build over 10k random vectors produces a connected graph with degree ≤ R, search recall ≥ 95% at k=10 on synthetic dataset.
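
The recall criterion is recall@k: the fraction of the exact top-k neighbours that the index returns. A minimal sketch of the metric, assuming result ids are unique within each list; `recallAtK` is an illustrative helper, not part of cortex's test suite.

```typescript
// recall@k = |approx top-k ∩ exact top-k| / |exact top-k|
function recallAtK<T>(approx: T[], exact: T[], k: number): number {
  const truth = new Set(exact.slice(0, k))
  if (truth.size === 0) return 1 // nothing to find
  const hits = approx.slice(0, k).filter((id) => truth.has(id)).length
  return hits / truth.size
}

// Average over a batch of queries, which is what a ">= 95%" threshold refers to.
function meanRecallAtK<T>(runs: { approx: T[]; exact: T[] }[], k: number): number {
  const total = runs.reduce((sum, r) => sum + recallAtK(r.approx, r.exact, k), 0)
  return total / runs.length
}

const sampleRecall = recallAtK([1, 2, 3, 4], [1, 3, 5, 7], 4)
console.log(sampleRecall)
```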
+### Session 35b — Search + napi bindings (cortex)
+**Scope (~3–5 hrs focused):**
+- `cortex/native/src/diskann/search.rs` — Greedy search with PQ-approximate distance and full-vector re-ranking on the candidate set.
+- `cortex/native/src/diskann/mod.rs` — `#[napi]` exports of `NativeDiskANN` class with `create` / `openExisting` / `addItem` / `search` / `rebuildPQCodebook` methods.
+- `cortex/native/index.d.ts` regeneration.
+- Recall validation against published DiskANN benchmark numbers (sanity check, not full BIGANN — that's a separate effort).
+**Exit criteria:** Search recall at k=10 over a 100k-vector dataset is ≥ 95% and within 2 percentage points of the published DiskANN paper's numbers.
+### Session 35c — Brainy hookup + cortex 3.0.0 + brainy 8.0.0 release
+**Scope (~3–5 hrs focused):**
+- `brainy/src/hnsw/diskAnnIndex.ts` — TS wrapper class implementing brainy's `HnswProvider` contract over `NativeDiskANN`. Same surface as `HNSWIndex` so the rest of brainy is agnostic.
+- `brainy/src/brainy.ts` — `wireDiskAnn()` private method that runs after `wireMmapVectorBackend()` during init. Auto-engagement conditions; opt-out via `config.index.type = 'hnsw'`.
+- `brainy/src/plugin.ts` — `DiskAnnProvider` and `DiskAnnInstance` interfaces (mirrors the `VectorStoreMmapProvider` pattern from 2.4.0 #24).
+- `brain.migrateToDiskAnn()` and `brain.migrateToHnsw()` explicit migration APIs.
+- Tests: provider hookup, auto-engagement conditions, opt-out, recall parity at 10k–100k vectors, migration round-trip integrity.
+- Coordinated release: `cortex 3.0.0` + `brainy 8.0.0`. Major bumps because the default index type changes for filesystem+cortex users (semver discipline matters).
+**Exit criteria:** Recall parity between brainy 8.0.0 + cortex 3.0.0 DiskANN path and brainy 7.x + cortex 2.x HNSW path is within 1% at standard k values (1, 10, 50). Migration round-trip preserves index integrity.
+## Consequences
+### Positive
+- **Single-machine billion-scale becomes a supported workload.** At 100M to 1B vectors, RAM cost drops by ~16–20× compared to HNSW. NVMe disk locality replaces RAM pressure as the bottleneck.
+- **Cortex's "billion-scale via Rust acceleration" positioning becomes literal**, not aspirational.
+- **Zero impact to non-cortex users.** brainy keeps shipping its TS HNSWIndex; no API change, no behaviour change for them.
+- **Foundations carry forward.** The 2.4.0 storage work (stable IDs, mmap layer, graph compression) and 2.5.0 #30 (SQ4 quantization, the PQ precursor) all transfer.
+- **License posture is clean.** Pure Rust port from a published algorithm + permissive (MIT/Apache 2.0) Rust deps. No C++ FFI license entanglement.
+- **Future-utility carry.** The cortex Rust compaction primitives (`compute_bfs_order`, `compute_hnsw_traversal_order`) stay in the codebase; if HNSW's disk locality ever becomes interesting again, the math is already there.
+### Negative / Tradeoffs
+- **Build cost.** DiskANN graph construction is slower than HNSW because Vamana's α-pruning requires examining more candidate neighbours per node. On 100M vectors this is hours, not minutes. Acceptable for once-per-deployment cost.
+- **PQ recall ceiling.** Product Quantization is lossy. Recall maxes out around 95–98% on typical embedding workloads; HNSW with full precision can reach 99%+. The re-ranking step recovers most of the gap. Users with extreme recall requirements (e.g., legal-discovery search) may want to stay on HNSW.
+- **Filesystem-only constraint.** Cloud-storage users get no benefit from DiskANN in the first release. We've accepted this; cloud DiskANN is a future investigation, not a commitment.
+- **Major version bump.** Auto-engagement changing the default index type for filesystem+cortex users is a semver-major event. brainy 8.0.0 and cortex 3.0.0 must coordinate. Some communication overhead at release time.
+### Risks
+- **Correctness drift from the reference implementation.** Vamana has subtle algorithmic choices (the α-pruning order, the entry-point selection strategy) that affect recall by small but real amounts. Mitigation: explicit recall validation against the published numbers + reference implementations in 35a and 35b's exit criteria.
+- **Brainy provider contract surface mismatch.** The `HnswProvider` interface was designed for HNSW; DiskANN may surface operations (codebook retraining, segment-level compaction) that don't fit cleanly. Mitigation: keep `DiskAnnInstance` as an extension of `HnswProvider` plus DiskANN-specific methods; never narrow the parent interface.
+- **Migration API regressions.** `migrateToDiskAnn` runs over potentially billions of vectors. A bug here could mean hours of wasted compute or, worse, an inconsistent index. Mitigation: parallel build (the old HNSW stays serving until the new DiskANN is validated), explicit recall verification before the atomic swap, fully reversible via `migrateToHnsw`.
+- **Long-running PQ codebook drift.** As vectors are added over time, the original PQ codebook can drift away from the data distribution, eroding recall. Mitigation: expose `rebuildPQCodebook()` for explicit retrains; document the operational guideline (retrain after the dataset doubles, or after a measurable recall regression).
+## Open questions
+1. **PQ codebook strategy at scale.** Do we train PQ once on a sample of the data, or use online/streaming PQ updates? Tradeoff: simpler vs. better recall over time. Lean toward sample-once-with-explicit-retrain to keep the operational model simple.
+2. **Vamana parameters as runtime config vs. baked into the file format.** R (max degree), α (density), the search candidate set padding factor — how much do we expose to users? Lean toward fixed-good-defaults in 3.0.0, expose later if a workload demands it.
+3. **Filtered search support.** brainy's `find({ where, ... })` interacts with HNSW via a filter callback. DiskANN's PQ-distance loop needs different filter integration. Plan to defer — initial release supports unfiltered top-K search; filtered search is a follow-up.
+4. **Multi-shard / single-node-of-cluster deployments.** Cortex isn't a cluster engine, but some users run multiple cortex+brainy nodes behind a load balancer. Does each node need its own DiskANN file, or can they share one? Plan to defer — start with per-node files.
+## References
+- Subramanya et al., *DiskANN: Fast Accurate Billion-Point Nearest Neighbor Search on a Single Node*, NeurIPS 2019. [arXiv:1907.07574](https://arxiv.org/abs/1907.07574)
+- Singh et al., *FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search*, 2021. [arXiv:2105.09613](https://arxiv.org/abs/2105.09613)
+- Microsoft DiskANN open-source reference implementation: [github.com/microsoft/DiskANN](https://github.com/microsoft/DiskANN) (MIT licensed)
+- ADR-001 — Native column store with raw mmap segments (the same architectural pattern of "cortex registers a provider, brainy consumes when present")
+- Brainy 7.28.0 SQ4 quantization (the PQ precursor — scalar quantization scoped to a single vector; PQ extends the same idea to subvector partitions with learned codebooks)
+- Cortex 2.4.0 storage foundations: stable EntityIdMapper (#23), mmap vector backend (#24), graph link compression (#25), column-store interchange (#26)

package/native/brainy-native.node CHANGED

Binary file

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@soulcraft/cortex",
-  "version": "2.4.0",
+  "version": "2.5.1",
   "description": "Native Rust acceleration for Brainy — SIMD distance, vector quantization, zero-copy mmap, native embeddings. Free tier for storage, Pro license for compute acceleration.",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",
@@ -66,11 +66,11 @@
     "LICENSE"
   ],
   "peerDependencies": {
-    "@soulcraft/brainy": ">=7.26.0"
+    "@soulcraft/brainy": ">=7.28.0"
   },
   "devDependencies": {
     "@napi-rs/cli": "^3.0.0",
-    "@soulcraft/brainy": "^7.26.0",
+    "@soulcraft/brainy": "^7.28.0",
     "@types/node": "^22.0.0",
     "tsx": "^4.21.0",
     "typescript": "^5.9.3",