npm - eigen-db - Versions diffs - 4.4.0 → 5.0.0 - Mend

eigen-db 4.4.0 → 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/CHANGELOG.md +4 -0
package/README.md +38 -25
package/dist/eigen-db.js +190 -187
package/dist/eigen-db.js.map +1 -1
package/dist/eigen-db.umd.cjs +1 -1
package/dist/eigen-db.umd.cjs.map +1 -1
package/dist/result-set.d.ts +18 -7
package/dist/types.d.ts +5 -1
package/package.json +1 -1
package/src/lib/__tests__/result-set.test.ts +146 -27
package/src/lib/__tests__/vector-db.test.ts +188 -4
package/src/lib/result-set.ts +55 -24
package/src/lib/types.ts +5 -1
package/src/lib/vector-db.ts +8 -4

package/CHANGELOG.md

@@ -1,3 +1,7 @@
+# v5.0.0
+Changed: replaced `topK` with `limit` and `order` parameters in `query()` method
 # v4.4.0
 Added: `entries()`, `keys()`, `values()`, `delete()`, `has()` methods, and `dimensions` property
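
The v5.0.0 changelog entry above renames `topK` to `limit`, which breaks callers that still pass `topK`. A minimal migration sketch follows. The helper `migrateQueryOptions` and the `LegacyQueryOptions` interface are hypothetical names introduced here for illustration and are not part of eigen-db; the option fields mirror the `QueryOptions` interface shown in the README diff.

```typescript
// Hypothetical helper, not part of eigen-db: maps v4-style query options
// (with `topK`) to the v5 shape (with `limit`).

// v4.4.0 option shape, as shown on the removed side of the README diff.
interface LegacyQueryOptions {
  topK?: number;
  minSimilarity?: number;
  normalize?: boolean;
  iterable?: boolean;
}

// v5.0.0 option shape, as shown on the added side of the README diff.
interface QueryOptions {
  limit?: number;
  order?: "ascend" | "descend";
  minSimilarity?: number;
  maxSimilarity?: number;
  normalize?: boolean;
  iterable?: boolean;
}

function migrateQueryOptions(legacy: LegacyQueryOptions): QueryOptions {
  // Pull `topK` out so it is not copied into the new object.
  const { topK, ...rest } = legacy;
  // Only set `limit` when the caller actually supplied `topK`,
  // so the library default still applies otherwise.
  return topK === undefined ? { ...rest } : { ...rest, limit: topK };
}
```

A caller could then write `db.query(queryVector, migrateQueryOptions(oldOptions))` while call sites are updated one at a time. `order` is left unset, so the documented default of `"descend"` keeps the v4 ordering.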

package/README.md

@@ -1,6 +1,6 @@
 # Eigen DB
-High-performance vector database for the web.
+High-performance vector database for the web, powered by Web Assembly.
 `eigen-db` stores and queries embedding vectors in-browser, using:
@@ -16,7 +16,7 @@ npm install eigen-db
 ## Guide: Set up and query
-### 1) Open a database
+### Open a database
 ```ts
 import { DB } from "eigen-db";
@@ -39,7 +39,7 @@ const db = await DB.open({
 });
 ```
-### 2) Insert vectors
+### Insert vectors
 ```ts
 db.set("doc:1", embedding1);
@@ -56,7 +56,7 @@ Notes:
 - Each vector must be a `number[]` (or `Float32Array`) with exactly `dimensions` elements.
 - Duplicate keys use last-write-wins semantics.
-### 3) Look up, check, and remove vectors
+### Look up, check, and remove vectors
 ```ts
 db.get("doc:1"); // number[] | undefined
@@ -66,7 +66,7 @@ db.dimensions; // configured vector dimensions
 db.size; // number of entries
 ```
-### 4) Iterate over the database
+### Iterate over the database
 ```ts
 // Iterate over all keys
@@ -83,13 +83,13 @@ for (const [key, vector] of db.entries()) {
 const all = [...db];
 ```
-### 5) Query nearest vectors
+### Query nearest vectors
 ```ts
 const queryVector = embeddingQuery;
 // Returns a plain array of { key, similarity } sorted by descending similarity
-const results = db.query(queryVector, { topK: 10 });
+const results = db.query(queryVector, { limit: 10 });
 for (const { key, similarity } of results) {
   console.log(key, similarity);
@@ -99,7 +99,7 @@ for (const { key, similarity } of results) {
 For lazy iteration (useful for pagination or early stopping):
 ```ts
-const results = db.query(queryVector, { topK: 100, iterable: true });
+const results = db.query(queryVector, { limit: 100, iterable: true });
 // Iterate and break early — keys are resolved on demand
 for (const { key, similarity } of results) {
@@ -111,17 +111,27 @@ for (const { key, similarity } of results) {
 const all = [...results];
 ```
-Use `minSimilarity` to automatically cut off results below a threshold:
+Use `minSimilarity` and `maxSimilarity` to filter results by a similarity range:
 ```ts
 // Only return results with similarity ≥ 0.7 (inclusive)
 const results = db.query(queryVector, { minSimilarity: 0.7 });
-// Works with iterable mode too — iteration stops early at the threshold
-const results = db.query(queryVector, { minSimilarity: 0.7, iterable: true });
+// Only return results with similarity ≤ 0.5 (inclusive)
+const results = db.query(queryVector, { maxSimilarity: 0.5 });
+// Combine both for a range
+const results = db.query(queryVector, { minSimilarity: 0.3, maxSimilarity: 0.8 });
+```
+Use `order: "ascend"` to get the least similar results first (bottom-K):
+```ts
+// Least similar results first
+const bottomK = db.query(queryVector, { order: "ascend", limit: 10 });
 ```
-### 6) Persist and lifecycle
+### Persist and lifecycle
 ```ts
 await db.flush(); // persist current state
@@ -134,7 +144,7 @@ To delete all vectors and storage:
 await db.clear();
 ```
-### 7) Export and import
+### Export and import
 Export the entire database as a streaming binary file:
@@ -177,11 +187,11 @@ Similarity is the dot product of the query and stored vectors.
 **When to normalize:**
-| Scenario | Normalize? | Notes |
-| --- | --- | --- |
+| Scenario                                   | Normalize?       | Notes                                                                       |
+| ------------------------------------------ | ---------------- | --------------------------------------------------------------------------- |
 | Using embeddings from OpenAI, Cohere, etc. | `true` (default) | Embeddings may not be unit-length; normalization ensures cosine similarity. |
-| Vectors are already unit-length | Either | Setting `false` avoids redundant work. |
-| You need raw dot-product semantics | `false` | Similarity will be the raw dot product; range depends on vector magnitudes. |
+| Vectors are already unit-length            | Either           | Setting `false` avoids redundant work.                                      |
+| You need raw dot-product semantics         | `false`          | Similarity will be the raw dot product; range depends on vector magnitudes. |
 ## Full API Reference
@@ -303,8 +313,10 @@ interface SetOptions {
 ```ts
 interface QueryOptions {
-  topK?: number; // default: Infinity (all results)
+  limit?: number; // default: Infinity (all results)
+  order?: "ascend" | "descend"; // default: "descend" (most similar first)
   minSimilarity?: number; // inclusive lower bound on similarity; results below this are excluded
+  maxSimilarity?: number; // inclusive upper bound on similarity; results above this are excluded
   normalize?: boolean;
   iterable?: boolean; // when true, returns Iterable<ResultItem> instead of ResultItem[]
 }
@@ -349,12 +361,12 @@ Thrown when memory growth would exceed WASM 32-bit memory limits for the configu
 WASM SIMD vs pure JavaScript performance on 1536-dimensional vectors (OpenAI embedding size), measured with `vitest bench` (Node.js):
-| Operation | JS (ops/s) | WASM SIMD (ops/s) | Speedup |
-| --- | --- | --- | --- |
-| normalize (1536 dims) | 223,117 | 2,226,734 | **~10×** |
-| searchAll (100 vectors × 1536 dims) | 3,429 | 77,130 | **~22×** |
-| searchAll (1,000 vectors × 1536 dims) | 344 | 8,009 | **~23×** |
-| searchAll (10,000 vectors × 1536 dims) | 34 | 398 | **~12×** |
+| Operation                              | JS (ops/s) | WASM SIMD (ops/s) | Speedup  |
+| -------------------------------------- | ---------- | ----------------- | -------- |
+| normalize (1536 dims)                  | 223,117    | 2,226,734         | **~10×** |
+| searchAll (100 vectors × 1536 dims)    | 3,429      | 77,130            | **~22×** |
+| searchAll (1,000 vectors × 1536 dims)  | 344        | 8,009             | **~23×** |
+| searchAll (10,000 vectors × 1536 dims) | 34         | 398               | **~12×** |
 The WASM SIMD layer uses 2-vector outer loop unrolling (halving query memory reads) and 4× inner loop unrolling with multiple independent accumulators.
@@ -376,7 +388,8 @@ npm run dev
 ## Practical notes
 - Similarity is the dot product of query and stored vectors; with normalization enabled (default), this behaves like cosine similarity (1 = identical, -1 = opposite).
-- `topK` defaults to `Infinity`, returning all stored vectors sorted by similarity. Use `minSimilarity` to limit results by proximity.
+- `limit` defaults to `Infinity`, returning all stored vectors sorted by similarity. Use `minSimilarity` and `maxSimilarity` to filter results by proximity range.
+- `order` defaults to `"descend"` (most similar first). Use `"ascend"` to get least similar first.
 - Querying an empty database returns an empty array (`[]`).
 - `flush()` writes deduplicated state, and reopen preserves key-to-slot mapping.
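
The way `limit`, `order`, `minSimilarity`, and `maxSimilarity` combine can be modeled on a plain array of already-scored results. This is only a sketch of the semantics the README describes, not eigen-db's implementation: `selectResults` is a hypothetical function, and applying the range filter first, then the sort, then the limit is an assumption consistent with the "bottom-K" description.

```typescript
// Illustrative model only; eigen-db's internal implementation may differ.
interface ResultItem {
  key: string;
  similarity: number;
}

interface RangeOptions {
  limit?: number;
  order?: "ascend" | "descend";
  minSimilarity?: number;
  maxSimilarity?: number;
}

// Hypothetical function: filter by inclusive range, sort, then truncate.
function selectResults(scored: ResultItem[], options: RangeOptions = {}): ResultItem[] {
  const {
    limit = Infinity, // documented default: all results
    order = "descend", // documented default: most similar first
    minSimilarity = -Infinity,
    maxSimilarity = Infinity,
  } = options;

  return scored
    // Both bounds are inclusive, matching the README comments.
    .filter((r) => r.similarity >= minSimilarity && r.similarity <= maxSimilarity)
    // filter() returns a new array, so sorting does not mutate the input.
    .sort((a, b) =>
      order === "ascend" ? a.similarity - b.similarity : b.similarity - a.similarity,
    )
    // slice() clamps an Infinity end index to the array length.
    .slice(0, limit);
}

const sample: ResultItem[] = [
  { key: "a", similarity: 0.9 },
  { key: "b", similarity: 0.2 },
  { key: "c", similarity: 0.5 },
  { key: "d", similarity: 0.75 },
];

// Least similar two: keys "b" then "c".
const bottomTwo = selectResults(sample, { order: "ascend", limit: 2 });
// Inclusive range 0.5 to 0.75, most similar first: keys "d" then "c".
const midRange = selectResults(sample, { minSimilarity: 0.5, maxSimilarity: 0.75 });
```

Under this model, `{ order: "ascend", limit: k }` yields the k least similar entries, which is how the README describes bottom-K.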