@yoch/frozenminisearch 1.2.4 → 1.3.0
- package/CHANGELOG.md +26 -0
- package/README.md +30 -17
- package/dist/browser/index.d.ts +693 -0
- package/dist/browser/index.js +1 -0
- package/dist/cjs/index.cjs +3556 -3517
- package/dist/es/index.d.ts +61 -42
- package/dist/es/index.js +3556 -3517
- package/package.json +10 -3
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,32 @@
 
 ## Unreleased
 
+## v1.3.0 — `@yoch/frozenminisearch`
+
+Minor release: browser entry (`@yoch/frozenminisearch/browser`), portable default compression (`auto` → zlib), async browser MSv5 binary snapshots, Node ↔ browser zlib interoperability, and indexing parity fixes for custom tokenizers.
+
+### Added
+
+- **Browser entry** — `@yoch/frozenminisearch/browser` for read-only search and index build in the browser (`fromDocuments`, `fromJson`, `search`, `autoSuggest`, incremental builder).
+- **Browser binary I/O** — `saveBinaryAsync` / `loadBinaryAsync` on `Uint8Array` (`raw`, `zlib`, `auto`). No sync binary APIs and no zstd in the browser build.
+- **Wire portability layer** — `binaryBytes`, `binaryWireIo`, `fieldLengthMatrixWire`, and browser compression via native `CompressionStream` / `DecompressionStream`.
+- **Indexing parity gate** — `dev/parity/indexing-parity.test.js` compares `MiniSearch.addAll` vs `FrozenMiniSearch.fromDocuments` (index fingerprint + scores) across default, camelCase, `processTerm`, `stringifyField`, and Vocs-style profiles; builder, `fromJson`, and binary round-trips included.
+
+### Fixed
+
+- **Custom tokenizer indexing** — `isDefaultTokenize` now requires reference equality with the default tokenizer; split-equivalent wrappers no longer take the default fast path (fixes missing camelCase terms such as `create` from `createUser`).
+- **Field length with `processTerm`** — `fromDocuments` counts unique raw tokens per field (MiniSearch semantics) instead of distinct indexed terms after filtering.
+
+### Changed
+
+- **`compression: 'auto'`** — always tries zlib (then raw if it does not shrink). zstd remains opt-in via `compression: 'zstd'` on Node 22.15+; existing zstd snapshots still load on Node.
+
+### Improved
+
+- **CI** — cross-runtime smoke tests: Node zlib save → browser load and browser zlib save → Node load.
+- **Browser bundle size** — production `dist/browser/index.js` is ~67.6 KB raw and ~20.9 KB gzip (native compression streams, no `fflate`).
+- **`stringifyField` fast path** — skip redundant `toString()` when the field value is already a string and the default stringifier is in use.
+
 ## v1.2.4 — `@yoch/frozenminisearch`
 
 Patch release: faster frozen search and autoSuggest finalization, simplified AND gate heuristics, and small public exports for advanced callers. No MSv5 wire-format changes.
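Reviewer note on the custom tokenizer fix in the changelog: the sketch below illustrates why a behavioral probe can misclassify a wrapper tokenizer while a reference-equality check cannot. Everything here is a hypothetical stand-in rather than the package's internals. `defaultTokenize`, `camelCaseTokenize`, and `looksLikeDefault` are invented for illustration, and only the name `isDefaultTokenize` comes from the changelog; its body is an assumption.

```javascript
// Hypothetical stand-in for the library's default tokenizer (assumption).
const defaultTokenize = (text) => text.split(/[^\p{L}\p{N}]+/u).filter(Boolean)

// A wrapper that matches the default on plain words but also emits camelCase parts.
const camelCaseTokenize = (text) =>
  defaultTokenize(text).flatMap((token) => {
    const parts = token.split(/(?<=[a-z])(?=[A-Z])/)
    return parts.length > 1 ? [token, ...parts] : [token]
  })

// Hypothetical behavioral probe: compares output on a single sample string.
const looksLikeDefault = (fn) =>
  JSON.stringify(fn('hello world')) === JSON.stringify(defaultTokenize('hello world'))

// Reference-equality check as the changelog describes it; the body is an assumption.
const isDefaultTokenize = (fn) => fn === defaultTokenize

console.log(looksLikeDefault(camelCaseTokenize))  // true: the probe is fooled
console.log(isDefaultTokenize(camelCaseTokenize)) // false: the wrapper skips the fast path
console.log(camelCaseTokenize('createUser'))      // [ 'createUser', 'create', 'User' ]
```

With a probe-style check, the wrapper would be routed through a fast path that only knows the default splitting rules, so `create` would never be indexed. Comparing function identity is stricter but cannot be fooled by sample inputs.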
package/README.md
CHANGED
@@ -6,7 +6,7 @@
 
 [API documentation](https://yoch.github.io/frozenminisearch/)
 
-**Memory-optimized, read-only full-text search for Node.js.** FrozenMiniSearch keeps the serving API close to [MiniSearch](https://github.com/lucaong/minisearch) while using compact, immutable indexes for fixed corpora.
+**Memory-optimized, read-only full-text search for Node.js and browsers.** FrozenMiniSearch keeps the serving API close to [MiniSearch](https://github.com/lucaong/minisearch) while using compact, immutable indexes for fixed corpora.
 
 Use it when your documents are built offline, shipped to production, and queried many times. In that shape, frozen indexes use **~98-99% less index RAM** in the main benchmark set, save to compact binary snapshots, and load faster than MiniSearch JSON.
 
@@ -32,15 +32,15 @@ Same corpora, same BM25-style queries, MiniSearch 7.2.0 as the reference.
 
 | Scenario | Docs | Index RAM | Binary size | Load time | Search p50 |
 |----------|-----:|-----------|------------:|----------:|-----------:|
-| Divina, with stored text | 14,097 | 0.3 vs 16.0 MB (~98% less) | ~
-| Divina, index only | 14,097 | 0.2 vs 14.9 MB (~99% less) | ~
-| High-frequency terms | 10,000 | 4.4 vs 7.4 MB (~
-| Dense numeric ids | 100,000 | 0.9 vs 91.3 MB (~99% less) | ~
-| Uint16 doc id boundary | 65,535 | 0.6 vs 58.6 MB (~99% less) | ~
+| Divina, with stored text | 14,097 | 0.3 vs 16.0 MB (~98% less) | ~71% less | ~56% faster | ~21% faster |
+| Divina, index only | 14,097 | 0.2 vs 14.9 MB (~99% less) | ~74% less | ~80% faster | ~24% faster |
+| High-frequency terms | 10,000 | 4.4 vs 7.4 MB (~41% less) | ~92% less | ~85% faster | ~41% faster |
+| Dense numeric ids | 100,000 | 0.9 vs 91.3 MB (~99% less) | ~73% less | ~87% faster | ~33% faster |
+| Uint16 doc id boundary | 65,535 | 0.6 vs 58.6 MB (~99% less) | ~77% less | ~91% faster | ~53% faster |
 
-Across this full run, frozen is faster on **
+Across this full run, frozen is faster on **25/27** search cases. Divina `inferno` (exact, paired p50): mutable 18.1 µs → frozen 11.4 µs (**-7 µs**, ratio 0.72).
 
-Numbers are from `benchmarks/baselines/reference.json`, captured 2026-06-
+Numbers are from `benchmarks/baselines/reference.json`, captured 2026-06-21 on Node v24.16.0, 3 runs per scenario. Heap is measured with one index alive and should be read as a trend, not exact accounting.
 <!-- vs-reference:end -->
 
 ---
@@ -77,7 +77,20 @@ for (const doc of rows) builder.add(doc)
 const index = freezeFrozenIndexBuilder(builder)
 ```
 
-ESM and CommonJS are both supported (`main` → CJS, `module` → ESM).
+ESM and CommonJS are both supported on Node (`main` → CJS, `module` → ESM). For browsers and bundlers, use the dedicated browser entry (search, build, and **async** binary I/O):
+
+```javascript
+import FrozenMiniSearch from '@yoch/frozenminisearch/browser'
+
+const index = FrozenMiniSearch.fromDocuments(documents, options)
+index.search('ishmael', { prefix: true })
+
+// Load a zlib snapshot from CDN (Uint8Array)
+const buf = new Uint8Array(await (await fetch('/index.frozen')).arrayBuffer())
+const loaded = await FrozenMiniSearch.loadBinaryAsync(buf, options)
+```
+
+See [examples/plain_js_frozen/](examples/plain_js_frozen/) for a plain-JS demo (`yarn build` first).
 
 ---
 
@@ -127,15 +140,15 @@ MiniSearch is only needed if you still build mutable indexes. Frozen instances d
 - `search(query, searchOptions?)` — string, wildcard (`FrozenMiniSearch.wildcard`), or nested `QueryCombination`
 - `autoSuggest(queryString, options?)`
 - `has(id)`, `getStoredFields(id)`
-- `saveBinarySync` / `loadBinarySync`
+- `saveBinarySync` / `loadBinarySync` on **Node** (async variants too); browser entry supports **async** binary only (`Uint8Array`, `raw` / `zlib` / `auto`)
 
 Custom `tokenize` and `processTerm` functions are not stored in snapshots; pass the same functions again when loading.
 
 ---
 
-## Binary snapshots
+## Binary snapshots (Node)
 
-Binary snapshots are the preferred production format.
+Binary snapshots are the preferred production format on Node.js.
 
 ```javascript
 const buf = index.saveBinarySync()
@@ -143,16 +156,16 @@ const loaded = FrozenMiniSearch.loadBinarySync(buf, {}) // field names embedded
 ```
 
 - **Node ≥ 20**
-- `compression: 'auto'`
-- Use explicit compression when you need a
+- `compression: 'auto'` uses **zlib** when it shrinks the payload (portable on Node 20+ and in the browser build); falls back to raw when compression does not help.
+- Use explicit compression when you need a specific artifact:
 
 ```javascript
-const portable = index.saveBinarySync({ compression: 'zlib' })
+const portable = index.saveBinarySync({ compression: 'zlib' }) // CDN / browser
 const uncompressed = index.saveBinarySync({ compression: 'raw' })
-const bestRatio = index.saveBinarySync({ compression: 'zstd' }) // Node 22.15+
+const bestRatio = index.saveBinarySync({ compression: 'zstd' }) // Node 22.15+ only
 ```
 
-Raw
+Raw snapshots load in the browser without native compression APIs. zlib snapshots in the browser require `CompressionStream` / `DecompressionStream`. Browser binary I/O is async because it uses native browser stream APIs, but it still materializes the full compressed/decompressed payload in memory. zstd snapshots require Node 22.15+ (read/write on Node; not supported in the browser build).
 
 ---
 
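Reviewer note on the `compression: 'auto'` wording in the README hunk: the "try zlib, fall back to raw" behavior can be sketched with the native stream APIs the README names. This is a minimal illustration under assumptions, not the package's implementation. The helper names (`pipeThrough`, `zlibCompress`, `zlibDecompress`, `compressAuto`) and the `{ codec, bytes }` result shape are hypothetical, and the real MSv5 wire framing is not shown in this diff. The `'deflate'` format of `CompressionStream` is the zlib (RFC 1950) container.

```javascript
// Run bytes through a native transform stream and collect the whole result.
// Note: the full payload is materialized in memory, as the README warns.
async function pipeThrough(bytes, transform) {
  const stream = new Blob([bytes]).stream().pipeThrough(transform)
  return new Uint8Array(await new Response(stream).arrayBuffer())
}

// 'deflate' in the Compression Streams API means zlib-wrapped deflate.
const zlibCompress = (bytes) => pipeThrough(bytes, new CompressionStream('deflate'))
const zlibDecompress = (bytes) => pipeThrough(bytes, new DecompressionStream('deflate'))

// Hypothetical 'auto' policy: keep zlib only when it actually shrinks the payload.
async function compressAuto(bytes) {
  const packed = await zlibCompress(bytes)
  return packed.length < bytes.length
    ? { codec: 'zlib', bytes: packed }
    : { codec: 'raw', bytes }
}
```

A highly repetitive payload would come back tagged `zlib`, while a payload of a few bytes would stay `raw` because the zlib header and checksum outweigh any savings. The same globals exist in current browsers and in recent Node releases, which is what would make such snapshots portable in both directions.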