@yoch/frozenminisearch 1.0.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,33 @@
1
+ # Changelog
2
+
3
+ ## Unreleased
4
+
5
+ ## v1.0.0 — `@yoch/frozenminisearch`
6
+
7
+ First stable release on npm. Frozen-only read-only search for Node.js.
8
+
9
+ ### Breaking
10
+
11
+ - **Binary snapshots** — `loadBinarySync` / `loadBinaryAsync` read only the current frozen binary format; re-build from lucaong JSON if an older snapshot fails to load.
12
+ - **Removed `saveBinary()` / `loadBinary()`** — use `saveBinarySync` / `saveBinaryAsync` and `loadBinarySync` / `loadBinaryAsync`.
13
+
14
+ ## v1.0.0-beta.0 — `@yoch/frozenminisearch`
15
+
16
+ New standalone package (frozen-only) for read-only serving workloads.
17
+
18
+ ### Added
19
+
20
+ - **`FrozenMiniSearch`** as the default export — `fromDocuments`, builder, `saveBinarySync` / `loadBinarySync`
21
+ - **Migration loaders** — `fromMiniSearch`, `fromMiniSearchJson`, `fromMiniSearchSnapshot` (lucaong JSON wire format)
22
+ - **Modular benchmarks** — `npm run bench` with profiles `vs-reference`, `regression`, `dev`
23
+ - **Parity suite** — `dev/parity/` vs `minisearch` npm (functional invariants)
24
+
25
+ ### Removed from published API
26
+
27
+ - Mutable `MiniSearch` class and `freeze()` on the fork
28
+ - `freezeFromMiniSearch` (use `fromMiniSearchJson`)
29
+ - Read-only mutation stubs (`add`, `remove`, …)
30
+
31
+ ### Migration
32
+
33
+ - `new MiniSearch(opts).addAll(docs)` (lucaong) → `FrozenMiniSearch.fromDocuments(docs, opts)` or `fromMiniSearch(mutable, opts)` — see README
package/LICENSE.txt ADDED
@@ -0,0 +1,8 @@
1
+ Copyright 2022 Luca Ongaro
2
+ Copyright 2026 Yoch (Contributions)
3
+
4
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5
+
6
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
7
+
8
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,181 @@
1
+ # @yoch/frozenminisearch
2
+
3
+ **Read-only full-text search for Node.js** — compact frozen indexes, fast binary snapshots, and the same search API as [MiniSearch](https://github.com/lucaong/minisearch) by [Luca Ongaro](https://github.com/lucaong).
4
+
5
+ > **Current release:** `1.0.0` on npm
6
+
7
+ This package is a **standalone product**: no mutable `MiniSearch` class is published. Build indexes with `fromDocuments`, the incremental builder, or migrate from an existing lucaong index via `fromMiniSearchJson`.
8
+
9
+ ---
10
+
11
+ ## Why frozen instead of MiniSearch?
12
+
13
+ Use **mutable** lucaong `minisearch` when documents change (`add`, `remove`, `discard`). Use **frozen** when the corpus is fixed or shipped as a binary snapshot: it offers the same BM25 scoring, prefix/fuzzy search, `autoSuggest`, wildcard queries, and `AND` / `OR` / `AND_NOT` combinators. Parity with `minisearch@7` is validated in `dev/parity/` (scores compared with `toBeCloseTo` at precision 6).
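To make "precision 6" concrete, here is a minimal sketch of the comparison rule used by Jest-style `toBeCloseTo` matchers, assuming the parity suite runs on a Jest-compatible test runner. The helper name `isCloseTo` and the two scores are hypothetical illustration values, not output from the package.

```javascript
// Same rule as a Jest-style toBeCloseTo(expected, digits):
// the values match when |expected - received| < 10^-digits / 2.
function isCloseTo(received, expected, digits = 2) {
  return Math.abs(expected - received) < 10 ** -digits / 2
}

// Hypothetical BM25 scores for the same hit (made-up numbers, for illustration only).
const mutableScore = 4.2137261
const frozenScore = 4.2137263 // differs by about 2e-7

// At precision 6 the tolerance is 5e-7, so a 2e-7 drift still counts as equal.
console.log(isCloseTo(frozenScore, mutableScore, 6))
// A 1e-6 drift exceeds the tolerance and would fail the parity check.
console.log(isCloseTo(4.213727, 4.213726, 6))
```

In other words, scores may differ by floating-point noise below half a unit in the sixth decimal place, but not by more.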
14
+
15
+ <!-- vs-reference:start — npm run bench:readme -->
16
+ ### Measured vs lucaong MiniSearch (reference baseline)
17
+
18
+ Same BM25 queries on identical corpora. **Frozen wins on what we optimize for**: RAM, disk, cold load, and search throughput on real workloads.
19
+
20
+ | Scenario | Docs | Index RAM¹ | Disk (binary vs JSON)² | Cold load³ | Search p50⁴ |
21
+ |----------|-----:|------------|------------------------:|-----------:|------------:|
22
+ | Divina with storeFields | 14,097 | 1.1 vs 16.0 MB (~93% less) | ~73% less | ~71% faster | ~14% faster |
23
+ | Divina index only | 14,097 | 0.3 vs 14.9 MB (~98% less) | ~77% less | ~86% faster | ~3% faster |
24
+ | high-frequency terms (10k docs) | 10,000 | 0.2 vs 7.4 MB (~97% less) | ~94% less | ~90% faster | ~32% faster |
25
+ | Dense numeric ids (100k, identity lookup) | 100,000 | 1.7 vs 91.2 MB (~98% less) | ~88% less | ~90% faster | ~21% faster |
26
+ | Doc id Uint16 boundary (65535 docs) | 65,535 | 1.1 vs 58.6 MB (~98% less) | ~91% less | ~94% faster | ~38% faster |
27
+
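The "% less" figures in the Index RAM column can be re-derived from the two MB values in each row. The sketch below assumes the percentage is the relative reduction rounded to a whole number; the helper `percentLess` is a hypothetical name, and the benchmark tooling may round differently.

```javascript
// Relative reduction of `frozen` versus `mutable`, rounded to a whole percent.
function percentLess(frozen, mutable) {
  return Math.round((1 - frozen / mutable) * 100)
}

// Index RAM in MB, copied from the table above as [frozen, mutable].
const rows = {
  'Divina with storeFields': [1.1, 16.0],
  'Divina index only': [0.3, 14.9],
  'high-frequency terms (10k docs)': [0.2, 7.4],
  'Dense numeric ids (100k, identity lookup)': [1.7, 91.2],
  'Doc id Uint16 boundary': [1.1, 58.6],
}

// Map each scenario to its whole-percent RAM saving.
const savings = Object.fromEntries(
  Object.entries(rows).map(([name, [frozen, mutable]]) => [name, percentLess(frozen, mutable)])
)

console.log(savings)
```

Under that rounding assumption the computed values line up with the percentages shown in the table.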
28
+ **Headline:** 22/27 query benchmarks favor frozen (paired **hrtime** protocol v2). Divina `inferno` (exact, paired p50): mutable 16.3 µs → frozen 13.8 µs (**-2 µs**, ratio 0.87).
29
+
30
+ Decomposition (Divina exact): L0 lookup ~300 ns frozen, L1 `executeQuery` ~8.1 µs, L2 full `search` ~11.5 µs (finalize ≈ 3 µs).
31
+
32
+ | | lucaong `minisearch` | `@yoch/frozenminisearch` |
33
+ |---|------------------------|---------------------------|
34
+ | **Sweet spot** | Live index mutations | Fixed corpus, deploy from binary |
35
+ | **Production path** | `addAll` → `toJSON` | `fromDocuments` / `fromMiniSearch` → `saveBinarySync` → `loadBinarySync` |
36
+ | **Typical trade-off** | Higher RAM, JSON snapshots | One-time freeze, then compact binary |
37
+
38
+ <details>
39
+ <summary><strong>How to read these numbers (limits &amp; protocol)</strong></summary>
40
+
41
+ - **Captured:** 2026-06-07 · commit `2a9a90d` · Node v24.16.0 · minisearch **7.2.0** · **3** run(s)/scenario · protocol **v2** (hrtime-paired, batch target 3 ms).
42
+ - ¹ **Index RAM** — `measureHeap` with `--expose-gc`, one index alive. V8 overhead is extra; treat as **trend**, not accounting. Sporadic outliers happen (e.g. index-only Divina).
43
+ - ² **Disk** — `JSON.stringify(mutable)` vs `saveBinarySync`.
44
+ - ³ **Cold load** — median wall time to searchable index after read from disk format.
45
+ - ⁴ **Search p50** — paired mutable/frozen samples per iteration; sub-0.1 ms baselines reported in **µs** in full reports. Fast queries use **50** iterations, others **20**.
46
+ - **Not shown:** mutable `add`/`remove` (frozen is read-only by design). Freeze time is offline — see full suite for build metrics.
47
+ - **Reproduce:** `npm run bench -- run --profile=vs-reference` · **Update this block:** `npm run bench:readme` after refreshing `benchmarks/baselines/reference.json`.
48
+
49
+ </details>
50
+ <!-- vs-reference:end -->
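The paired p50 idea described in the protocol notes can be sketched as follows. This is not the package's benchmark harness: `pairedP50`, `timeNs`, and the two stand-in workloads are hypothetical, and the sketch omits the batching (3 ms target) that protocol v2 mentions. It only shows the core technique of timing both candidates back to back in every iteration with `process.hrtime.bigint()` and comparing medians.

```javascript
// Median of a numeric array (sorts a copy, leaves the input untouched).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// Time a single call in nanoseconds using the monotonic high-resolution clock.
function timeNs(fn) {
  const start = process.hrtime.bigint()
  fn()
  return Number(process.hrtime.bigint() - start)
}

// Run baseline and candidate back to back in each iteration, so that noise
// (GC pauses, CPU frequency changes) tends to hit both sides equally.
function pairedP50(baseline, candidate, iterations) {
  const baselineSamples = []
  const candidateSamples = []
  for (let i = 0; i < iterations; i++) {
    baselineSamples.push(timeNs(baseline))
    candidateSamples.push(timeNs(candidate))
  }
  const baselineNs = median(baselineSamples)
  const candidateNs = median(candidateSamples)
  return { baselineNs, candidateNs, ratio: candidateNs / baselineNs }
}

// Stand-in workloads; in a real run these would be the two search calls.
const slow = () => { let s = 0; for (let i = 0; i < 20000; i++) s += i; return s }
const fast = () => { let s = 0; for (let i = 0; i < 2000; i++) s += i; return s }

const result = pairedP50(slow, fast, 50)
console.log(result)
```

A ratio below 1 means the candidate's median is lower than the baseline's; absolute numbers vary from machine to machine.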
51
+
52
+ ---
53
+
54
+ ## Quick start
55
+
56
+ ```bash
57
+ npm install @yoch/frozenminisearch
58
+ ```
59
+
60
+ **Build from documents:**
61
+
62
+ ```javascript
63
+ import FrozenMiniSearch from '@yoch/frozenminisearch'
64
+
65
+ const options = { fields: ['title', 'text'], storeFields: ['title'] }
66
+ const index = FrozenMiniSearch.fromDocuments(documents, options)
67
+
68
+ index.search('ishmael', { prefix: true })
69
+ index.autoSuggest('zen ar')
70
+
71
+ const buf = index.saveBinarySync()
72
+ const loaded = FrozenMiniSearch.loadBinarySync(buf, options)
73
+ ```
74
+
75
+ **Incremental builder:**
76
+
77
+ ```javascript
78
+ import FrozenMiniSearch, {
79
+ createFrozenIndexBuilder,
80
+ freezeFrozenIndexBuilder,
81
+ } from '@yoch/frozenminisearch'
82
+
83
+ const builder = createFrozenIndexBuilder(options, { estimatedDocumentCount: rows.length })
84
+ for (const doc of rows) builder.add(doc)
85
+ const index = freezeFrozenIndexBuilder(builder)
86
+ ```
87
+
88
+ ESM and CommonJS are both supported (`main` → CJS, `module` → ESM).
89
+
90
+ ---
91
+
92
+ ## Migration
93
+
94
+ ### From lucaong `minisearch` JSON
95
+
96
+ ```javascript
97
+ import MiniSearch from 'minisearch' // build-time only
98
+ import FrozenMiniSearch from '@yoch/frozenminisearch'
99
+
100
+ const mutable = new MiniSearch(options)
101
+ mutable.addAll(documents)
102
+
103
+ // Option A — live instance
104
+ const frozen = FrozenMiniSearch.fromMiniSearch(mutable, options)
105
+
106
+ // Option B — serialized index (offline / ETL)
107
+ const json = JSON.stringify(mutable)
108
+ const frozen2 = FrozenMiniSearch.fromMiniSearchJson(json, options)
109
+ ```
110
+
111
+ `options.fields` must match the indexed fields in the snapshot when provided.
112
+
113
+ ### From lucaong `minisearch` (mutable → frozen)
114
+
115
+ | Before (mutable) | After (`@yoch/frozenminisearch`) |
116
+ |------------------|----------------------------------|
117
+ | `new MiniSearch(opts).addAll(docs)` then serve | `FrozenMiniSearch.fromDocuments(docs, opts)` or `fromMiniSearch(mutable, opts)` |
118
+ | lucaong JSON snapshot | `FrozenMiniSearch.fromMiniSearchJson(json)` or `fromMiniSearchSnapshot(obj)` |
119
+ | `import MiniSearch from 'minisearch'` | `import FrozenMiniSearch from '@yoch/frozenminisearch'` (+ lucaong `minisearch` only if you still build mutable indexes) |
120
+
121
+ ---
122
+
123
+ ## Search API (compatible with MiniSearch)
124
+
125
+ - `search(query, searchOptions?)` — string, wildcard (`FrozenMiniSearch.wildcard`), or nested `QueryCombination`
126
+ - `autoSuggest(queryString, options?)`
127
+ - `has(id)`, `getStoredFields(id)`
128
+ - `saveBinarySync` / `loadBinarySync` / async variants
129
+
130
+ Indexing is **not** available on a frozen instance — use `fromDocuments`, the builder, `fromMiniSearch*`, or `loadBinary*`.
131
+
132
+ ---
133
+
134
+ ## Binary snapshots
135
+
136
+ ```javascript
137
+ const buf = index.saveBinarySync()
138
+ const loaded = FrozenMiniSearch.loadBinarySync(buf, {}) // field names embedded in snapshot
139
+ ```
140
+
141
+ - **Node ≥ 22.15.0** (zstd via `node:zlib`)
142
+ - `loadBinary*` reads only the current binary format; re-build from lucaong JSON if an older binary fails to load
143
+ - `tokenize` / `processTerm` are not stored — pass the same functions at load when customized
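Because the snapshot format relies on zstd from `node:zlib`, it can help to check the runtime before loading a snapshot. The sketch below is a generic feature check and round trip using Node's own zstd functions; it does not reproduce the package's snapshot format, and `hasZstd` and `roundTrip` are hypothetical helper names. It uses `require` so it runs as CommonJS.

```javascript
const zlib = require('node:zlib')

// zstd support in node:zlib arrived in Node 22.15.0 (and 23.8.0);
// on older runtimes these functions are simply undefined.
const hasZstd =
  typeof zlib.zstdCompressSync === 'function' &&
  typeof zlib.zstdDecompressSync === 'function'

// Compress and decompress a buffer with zstd, or return null when unsupported.
function roundTrip(buffer) {
  if (!hasZstd) return null
  return zlib.zstdDecompressSync(zlib.zstdCompressSync(buffer))
}

// Arbitrary sample bytes standing in for snapshot data.
const sample = Buffer.from('frozen index bytes '.repeat(100))
const restored = roundTrip(sample)

console.log(
  hasZstd
    ? `zstd available, round trip intact: ${restored.equals(sample)}`
    : 'zstd unavailable in this Node version; upgrade to 22.15.0 or later'
)
```

Running a check like this at startup gives a clearer error than a failure deep inside a load call.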
144
+
145
+ ---
146
+
147
+ ## Benchmarks
148
+
149
+ See [benchmarks/README.md](benchmarks/README.md).
150
+
151
+ ```bash
152
+ npm run bench -- run --profile=vs-reference # compare frozen vs minisearch
153
+ npm run bench:diff # regression vs reference.json
154
+ npm run bench:readme # refresh comparison table above
155
+ ```
156
+
157
+ ---
158
+
159
+ ## Development
160
+
161
+ ```bash
162
+ yarn install
163
+ yarn test # src/ + dev/parity/
164
+ yarn build
165
+ node scripts/verify-npm-pack.cjs
166
+ ```
167
+
168
+ Parity tests import `minisearch` as a devDependency (reference). Optional upstream clone: `git submodule update --init vendor/minisearch`.
169
+
170
+ Design notes (freq adaptive, AND gating): [dev/docs/README.md](dev/docs/README.md).
171
+
172
+ ---
173
+
174
+ ## Changelog & credits
175
+
176
+ See [CHANGELOG.md](./CHANGELOG.md).
177
+
178
+ - **MiniSearch** — [Luca Ongaro](https://github.com/lucaong/minisearch) (MIT)
179
+ - **@yoch/frozenminisearch** — frozen indexes, packed radix tree, compact binary snapshots
180
+
181
+ Upstream docs: [MiniSearch](https://lucaong.github.io/minisearch/)