rumongo 0.1.0

package/BENCHMARKS.md ADDED
@@ -0,0 +1,448 @@
1
+ # rumongo — Benchmarks: Rust driver vs Node drivers
2
+
3
+ Progressive log of comparison results across build phases. Append new runs; do
4
+ not rewrite history. Each entry: date, what changed, environment, numbers,
5
+ interpretation.
6
+
7
+ ## Environment
8
+
9
+ - Host: Linux 6.2.0, 12 cores, 15 GiB RAM
10
+ - MongoDB server: **8.0.16**, local (`mongodb://localhost:27017`)
11
+ - Node.js: **v20.4.0**
12
+ - Rust: **1.96.0**; crates: `mongodb` 3.7.0, `bson` 2.15.0, `napi` 2.16
13
+ - Compare targets:
14
+ - **official** = official `mongodb` Node.js driver (npm `mongodb` ^6.8)
15
+ - **rust** = rumongo (this project)
16
+ - (Mongoose comparison: planned, Phase 5)
17
+
18
+ > ⚠️ All numbers below are **localhost** (≈0 network latency). Pipeline / fetch
19
+ > overlap wins show up under real network latency, not here. Treat localhost as
20
+ > a lower bound for those features and an upper bound for CPU-bound wins.
21
+
22
+ Bench scripts: [bench/compare.js](bench/compare.js) (single-query find),
23
+ [bench/pipeline.js](bench/pipeline.js) (sequential vs pipelined),
24
+ [bench/concurrent.js](bench/concurrent.js) (parallel queries + event-loop jitter).
25
+
26
+ ---
27
+
28
+ ## 2026-06-15 — Phase 1: scaffold + basic find()
29
+
30
+ Implementation: standard cursor, full BSON deserialize, returned to JS as JSON
31
+ strings (`JSON.parse` on the JS side). No optimization.
32
+
33
+ ### Single-query find, 10k docs, 20 iters (`bench/compare.js`)
34
+
35
+ | metric | official | rust | result |
36
+ |---|---|---|---|
37
+ | parity (result set) | 10000 | 10000 | **identical ✓** |
38
+ | mean (ms) | 216.5 | 171.4 | **rust 1.26× faster** |
39
+ | p50 (ms) | 221.4 | 160.0 | |
40
+
41
+ Interpretation: even unoptimized, native Rust BSON→JSON beats the Node driver
42
+ building JS objects field-by-field on the main thread; V8 `JSON.parse` is cheap.
43
+
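The headline ratio in the table is plain arithmetic on the per-iteration timings. A minimal sketch of how such a summary could be computed follows. The helper names and the nearest-rank percentile are assumptions for illustration, not the contents of `bench/compare.js`.

```js
// Hypothetical helpers; bench/compare.js may compute its statistics differently.

// Arithmetic mean of a list of timings in milliseconds.
function mean(samples) {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Nearest-rank percentile (p in 0..100), computed on a sorted copy.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// How many times faster the candidate is than the baseline.
function speedup(baselineMs, candidateMs) {
  return baselineMs / candidateMs;
}

// Means reported in the table above: official 216.5 ms, rust 171.4 ms.
console.log(`rust ${speedup(216.5, 171.4).toFixed(2)}x faster`); // prints "rust 1.26x faster"
```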
44
+ ---
45
+
46
+ ## 2026-06-15 — Phase 2: pipelined fetch (batched mpsc channel)
47
+
48
+ Implementation: spawned tokio task drives the cursor and pushes **batches** of
49
+ docs through a bounded mpsc channel; consumer serializes while fetcher prefetches.
50
+ Bounded channel = backpressure. (`pipeline` option toggles vs sequential.)
51
+
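The real fetcher is a tokio task feeding a bounded mpsc channel in Rust. The same backpressure idea can be illustrated in JavaScript: a producer that has to wait once the queue holds `capacity` batches. `BoundedQueue` and `runPipeline` are hypothetical names for this sketch and are not part of rumongo.

```js
// Illustration only: a single-producer, single-consumer bounded queue.
class BoundedQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
    this.waitingSenders = [];
    this.waitingReceivers = [];
    this.closed = false;
  }
  // Resolves once there is room, so a fast producer is slowed to consumer speed.
  async send(batch) {
    while (this.items.length >= this.capacity) {
      await new Promise((resolve) => this.waitingSenders.push(resolve));
    }
    this.items.push(batch);
    const wake = this.waitingReceivers.shift();
    if (wake) wake();
  }
  // Resolves with the next batch, or undefined once closed and drained.
  async recv() {
    while (this.items.length === 0) {
      if (this.closed) return undefined;
      await new Promise((resolve) => this.waitingReceivers.push(resolve));
    }
    const batch = this.items.shift();
    const wake = this.waitingSenders.shift();
    if (wake) wake();
    return batch;
  }
  close() {
    this.closed = true;
    for (const wake of this.waitingReceivers.splice(0)) wake();
  }
}

// Producer pushes batches while the consumer drains them concurrently.
async function runPipeline(batches, capacity) {
  const queue = new BoundedQueue(capacity);
  let maxDepth = 0;
  const producer = (async () => {
    for (const batch of batches) {
      await queue.send(batch);
      maxDepth = Math.max(maxDepth, queue.items.length);
    }
    queue.close();
  })();
  const received = [];
  for (let batch; (batch = await queue.recv()) !== undefined; ) received.push(...batch);
  await producer;
  return { received, maxDepth };
}
```

The queue depth never exceeds `capacity`, which is the property the bounded channel provides on the Rust side.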
52
+ ### Single-query: sequential vs pipelined, 50k docs, 15 iters (`bench/pipeline.js`)
53
+
54
+ | mode | mean (ms) | p50 (ms) |
55
+ |---|---|---|
56
+ | sequential | 515.3 | 507.4 |
57
+ | pipelined | 629.8 | 629.5 |
58
+
59
+ Result: pipelined **22% SLOWER** on localhost. Expected — `getMore` latency ≈ 0,
60
+ so prefetch-overlap saves nothing while task+channel scheduling costs ~20%.
61
+ (First attempt with a *per-document* channel was ~100% slower; batching the
62
+ channel cut the overhead.) The "30–40% faster" claim needs real network latency.
63
+
64
+ > Note: the table above is the batched version. Pipeline is kept as the
65
+ > foundation Phase 3 plugs into, and for latency/concurrency wins, not for
66
+ > single-query localhost speed.
67
+
68
+ ### Concurrency: 20 parallel queries × 20k docs, 8 iters (`bench/concurrent.js`)
69
+
70
+ | metric | official | rust | result |
71
+ |---|---|---|---|
72
+ | wall time, 20 concurrent (ms) | 6255.9 | 2342.0 | **rust 2.67× faster** ✓ |
73
+ | event-loop max jitter (ms) | 520.2 | 1565.3 | **rust 3× worse** ✗ |
74
+
75
+ Interpretation:
76
+ - **Throughput win is the real Phase 2 result:** concurrent queries' BSON work
77
+ spreads across tokio worker threads instead of serializing on Node's single
78
+ event loop → 2.67×.
79
+ - **Jitter regression is diagnostic, not a dead end:** we return JSON *strings*,
80
+ so the JS side runs 20× `JSON.parse(20k)` in a synchronous burst that blocks
81
+ the loop. The official driver spreads deserialization across arriving batches.
82
+ → The JSON-string boundary is now the bottleneck. **Phase 3 (off-thread parse,
83
+ no string round-trip) and Phase 4 (lazy, skip parse) target exactly this.**
84
+
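Event-loop jitter here means how late a repeating timer fires while queries run. The sketch below shows one way to measure it, assuming a 10 ms heartbeat like the one mentioned in later phases. `maxJitter` and `startHeartbeat` are hypothetical names, and `bench/concurrent.js` may measure differently.

```js
// Given the times (ms) at which a repeating timer actually fired, return the
// worst lateness relative to the intended interval.
function maxJitter(tickTimes, intervalMs) {
  let worst = 0;
  for (let i = 1; i < tickTimes.length; i++) {
    const lateness = tickTimes[i] - tickTimes[i - 1] - intervalMs;
    if (lateness > worst) worst = lateness;
  }
  return worst;
}

// Runtime wrapper: record tick times until stop() is called, then report.
function startHeartbeat(intervalMs = 10) {
  const ticks = [performance.now()];
  const timer = setInterval(() => ticks.push(performance.now()), intervalMs);
  return {
    stop() {
      clearInterval(timer);
      return maxJitter(ticks, intervalMs);
    },
  };
}
```

A bench would call `startHeartbeat()` before launching the queries and `stop()` after they settle. A synchronous burst such as 20 back-to-back `JSON.parse` calls shows up as one very late tick.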
85
+ ### Correctness (`__tests__/integration/`)
86
+
87
+ - basic.test.js: **5/5 pass**
88
+ - pipeline.test.js: **4/4 pass** (pipelined==sequential, backpressure with
89
+ `maxInflight=1`, abandoned cursor cleanup)
90
+
91
+ ### Operational findings
92
+
93
+ - `MongoClient.close()` added — without it the napi tokio runtime never drains
94
+ and Node hangs at exit.
95
+ - Streaming server monitoring makes `close()` take **~10001ms** (awaitable
96
+ `hello` blocks shutdown); `?serverMonitoringMode=poll` → **~1ms**. Tests/benches
97
+ use poll. Revisit graceful shutdown in Phase 7.
98
+
99
+ ---
100
+
101
+ ## 2026-06-15 — Phase 3: off-thread parse (RawBatchCursor + rayon)
102
+
103
+ Implementation: default path switched to `RawBatchCursor` (`.find(..).batch()`) —
104
+ raw server batches stream through the bounded channel, each parsed to JSON
105
+ strings on the **rayon** pool via `spawn_blocking` (parallel across cores, off the
106
+ async workers). Filters now parsed as **Extended JSON** (`{$oid}`→ObjectId,
107
+ `{$date}`→DateTime). Interface still JSON strings (native JS / lazy = Phase 4).
108
+ `pipeline:false` = Phase 1 standard-cursor baseline.
109
+
110
+ ### Parity suite — 23/23 PASS (`__tests__/parity/parity.test.js`)
111
+
112
+ Same query run against official Node driver and rust, results compared after
113
+ canonicalizing rich types. Covers: filters, projection in/out, sort asc/desc,
114
+ limit/skip/both, nested filter, `$gt/$lt/$gte/$lte`, `$in`, `$and`, `$or`, empty
115
+ result, ObjectId filter (EJSON), Date, nested doc, array, null, bool, int, float,
116
+ 10k set, abandoned cursor. **All pass.**
117
+
118
+ (Also: integration basic 5/5, pipeline 4/4 still pass.)
119
+
120
+ ### Bench: rayon vs baseline vs official (`bench/phase3.js`)
121
+
122
+ `rust-base` = `pipeline:false` (Phase 1 path); `rust-rayon` = Phase 3.
123
+
124
+ **Single query, 100k docs, 6 iters:**
125
+
126
+ | target | wall mean (ms) | max jitter (ms) |
127
+ |---|---|---|
128
+ | official | 2441.9 | 82.6 |
129
+ | rust-base | 1299.4 | 5.9 |
130
+ | rust-rayon | **641.5** | 9.2 |
131
+
132
+ → rayon **2.03× vs base**, **3.81× vs official**.
133
+
134
+ **20 concurrent queries, 100k docs each, 6 iters:**
135
+
136
+ | target | wall mean (ms) | max jitter (ms) |
137
+ |---|---|---|
138
+ | official | 37167.3 | 1221.3 |
139
+ | rust-base | 15553.6 | 9072.1 |
140
+ | rust-rayon | **12346.5** | 8140.9 |
141
+
142
+ → rayon **1.26× vs base**, **3.01× vs official**.
143
+
144
+ Interpretation:
145
+ - **Throughput is the Phase 3 win:** parallel parse across cores → 3.8× over
146
+ official on a single large query, 3× concurrent, 2× over the Phase 1 baseline
147
+ (meets the plan's "2–3× over Phase 1" gate). Concurrent gain over base is
148
+ smaller (1.26×) because 20 concurrent queries already saturate the 12 cores.
149
+ - **Jitter still high on concurrent (8141ms vs official 1221ms):** the rayon
150
+ parse is off-loop, but each query's result is still returned as JSON strings →
151
+ 20 synchronous `JSON.parse` bursts block the event loop. Single-query jitter is
152
+ fine (9ms). **The JSON-string boundary is the remaining bottleneck → Phase 4
153
+ (native JS values + lazy field access) targets exactly this.**
154
+
155
+ ---
156
+
157
+ ## 2026-06-15 — Phase 4: lazy zero-copy (RawDoc + Proxy)
158
+
159
+ Implementation: new `find_lazy()` returns `RawDoc` handles holding raw BSON bytes
160
+ — **no value parsing on return**. A field is parsed only when JS reads it
161
+ (`get_field`), via a native BSON→JS converter (String/number/bool/null, Date,
162
+ ObjectId→hex, nested doc/array, Buffer). A JS `Proxy` (ts/index.ts) makes
163
+ `doc.field` call `get_field`, while spread / `JSON.stringify` still see all fields
164
+ (ownKeys + descriptors). Eager `find()` is unchanged.
165
+
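The Proxy arrangement described above can be sketched as follows. This is not the code in `ts/index.ts`: `fakeHandle` is a hypothetical stand-in for the native `RawDoc` handle, and the trap set is the minimum needed for dot-access, spread and `JSON.stringify` to work.

```js
// Hypothetical stand-in for the native handle. It counts decodes so the
// laziness is observable; the real handle would decode from raw BSON bytes.
function fakeHandle(fields) {
  const handle = {
    decoded: 0,
    keys: () => Object.keys(fields),
    getField(name) {
      if (!Object.prototype.hasOwnProperty.call(fields, name)) return undefined;
      handle.decoded += 1;
      return fields[name];
    },
  };
  return handle;
}

// Wrap a handle so property reads decode on demand and enumeration sees every key.
function lazyDoc(handle) {
  const isField = (prop) => typeof prop === 'string' && handle.keys().includes(prop);
  return new Proxy({}, {
    // doc.field decodes just that field.
    get: (_target, prop) => (isField(prop) ? handle.getField(prop) : undefined),
    has: (_target, prop) => isField(prop),
    // Needed for spread, Object.keys and JSON.stringify.
    ownKeys: () => handle.keys(),
    // Report each field as enumerable; the value itself comes from the get trap.
    getOwnPropertyDescriptor: (_target, prop) =>
      (isField(prop) ? { enumerable: true, configurable: true } : undefined),
  });
}

const handle = fakeHandle({ name: 'Ada', age: 36, city: 'London' });
const doc = lazyDoc(handle);
console.log(doc.name, handle.decoded); // prints "Ada 1": only one field decoded
console.log(JSON.stringify(doc));      // enumerates and decodes all three fields
```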
166
+ ### Lazy tests — 6/6 PASS (`__tests__/lazy/lazy.test.js`)
167
+
168
+ getField primitives; Date/ObjectId/nested/array/Buffer; `keys()`; `to_object`
169
+ parity vs official (normalized); Proxy dot-access + spread + JSON.stringify;
170
+ partial access of a 40-field doc. (Eager 23 parity + 9 integration still pass.)
171
+
172
+ ### Bench: lazy vs eager vs official (`bench/lazy.js`)
173
+
174
+ 10 concurrent queries, 20k docs × 33 fields, **reading only 2 fields/doc**, 6 iters:
175
+
176
+ | target | wall mean (ms) | max jitter (ms) |
177
+ |---|---|---|
178
+ | official | 8362.2 | 645.2 |
179
+ | rust-eager | 2537.3 | 1322.0 |
180
+ | rust-lazy | **1144.1** | 991.1 |
181
+
182
+ → lazy **2.22× vs eager**, **7.31× vs official**.
183
+
184
+ Interpretation:
185
+ - **Throughput is the Phase 4 win:** skipping the 31 unread fields makes lazy
186
+ **7.3× faster than the official driver** and 2.2× faster than our own eager
187
+ path. This is the lever for the headline Mongoose win (Phase 5 Model layer
188
+ pushes projections so even fewer bytes are fetched).
189
+ - **Jitter (991ms) beats eager (1322ms) but not official (645ms):** `find_lazy`
190
+ still materializes one `RawDoc` per doc and each `getField` crosses the JS↔Rust
191
+ boundary — both touch the event loop. Near-zero jitter would need a streaming
192
+ iterator instead of an up-front handle array (future work).
193
+ - **Memory tradeoff:** one handle object + its byte buffer per doc. Holding ~1M
194
+ simultaneously OOMs Node's default 2GB heap (observed at 20×50k). Lazy is for
195
+ "wide docs, few fields read," not for buffering millions of docs at once.
196
+
197
+ ---
198
+
199
+ ## 2026-06-15 — Phase 4b: jitter investigation + streaming cursor
200
+
201
+ Goal: drive down the concurrent-query jitter from Phase 4 (lazy 991ms).
202
+
203
+ ### Diagnosis (single query, 50k docs)
204
+
205
+ | step | max jitter |
206
+ |---|---|
207
+ | A) findLazy return only (no access) | **0.9ms** |
208
+ | B) access 2 fields (sync loop) | **0.0ms** |
209
+ | C) eager find + JSON.parse all | **5.7ms** |
210
+
211
+ → Single/low-concurrency jitter is already near-zero. The Phase 4 concurrent
212
+ 991ms was **not** from marshaling.
213
+
214
+ ### Streaming cursor (`FindCursor.next_batch()`)
215
+
216
+ Added a cursor that hands back one batch at a time (process + drop before the
217
+ next), so peak live objects ≈ one batch. Bench, 10 concurrent, 20k×33 fields,
218
+ read 2 fields/doc:
219
+
220
+ | target | wall (ms) | max jitter (ms) |
221
+ |---|---|---|
222
+ | official | 7537 | 636.8 |
223
+ | lazy-array | 830 | 601.4 |
224
+ | lazy-cursor | 821 | 672.0 |
225
+
226
+ **Finding:** at 10× concurrency the jitter floor (~600ms) is the **same for the
227
+ official driver too**. It is not our parsing — it's the single JS thread being
228
+ saturated by 10 simultaneous CPU-bound query loops, so the 10ms timer can't fire
229
+ regardless of driver. `await`-yields don't help when 10 queries keep the thread
230
+ busy. The honest levers:
231
+ - **Do less main-thread work** → lazy already does (reads 2 of 33 fields): same
232
+ peak jitter as official but the busy window is **9× shorter** (821ms vs 7537ms),
233
+ so the loop is responsive again ~9× sooner.
234
+ - **Bound memory** → the cursor's real, measured win.
235
+
236
+ ### Memory: cursor survives what `findLazy` OOM'd
237
+
238
+ `findLazy` on 20×50k (=1M docs) → **OOM at 2080MB** (heap full of handles).
239
+ `FindCursor` on the same 1M docs → **peak RSS 1140MB, no OOM, 4447ms**.
240
+
241
+ Takeaways:
242
+ - Lazy/cursor jitter is near-zero at realistic concurrency; at extreme
243
+ concurrency it's main-thread-bound and equal to official, but lazy finishes
244
+ far sooner (less total work).
245
+ - Use `find` (eager) for small results, `findLazy` for wide-doc/few-field reads,
246
+ `findCursor` for large/streaming results (bounded memory).
247
+
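The bounded-memory pattern behind the cursor can be sketched as a batch-at-a-time fold. `fakeCursor` is a hypothetical in-memory stand-in. The assumption that `nextBatch()` resolves to `null` at the end is made for this sketch only, and rumongo's real end-of-stream signal may differ.

```js
// Hypothetical cursor: nextBatch() resolves to an array of docs, or null when done.
function fakeCursor(totalDocs, batchSize) {
  let next = 0;
  return {
    async nextBatch() {
      if (next >= totalDocs) return null;
      const batch = [];
      for (; next < totalDocs && batch.length < batchSize; next++) batch.push({ n: next });
      return batch;
    },
  };
}

// Fold over the cursor one batch at a time. Each batch can be collected before
// the next one is requested, so live objects stay near one batch.
async function sumField(cursor, field) {
  let total = 0;
  let peakBatch = 0;
  for (let batch; (batch = await cursor.nextBatch()) !== null; ) {
    peakBatch = Math.max(peakBatch, batch.length);
    for (const doc of batch) total += doc[field];
  }
  return { total, peakBatch };
}
```

Collecting every batch into one array would defeat the purpose; the memory bound holds only if each batch is processed and dropped.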
248
+ ---
249
+
250
+ ## 2026-06-15 — Phase 4c: worker-thread offload = near-zero main-loop jitter
251
+
252
+ Physics: napi builds JS values on the calling isolate, and JS is single-threaded
253
+ per isolate. We already decode BSON off-thread (rayon), but the final Rust→JS
254
+ materialization runs on whichever isolate owns the result. On the main isolate
255
+ under concurrency that saturates → jitter. The only escape is a *different
256
+ isolate* = a Worker thread.
257
+
258
+ Bench (`bench/worker.js`): 10 concurrent `findLazy` queries, 20k×33 fields,
259
+ 2 fields read. Main thread runs a 10ms heartbeat throughout.
260
+
261
+ | where the queries run | query wall (ms) | MAIN-loop max jitter (ms) |
262
+ |---|---|---|
263
+ | main thread | 1101 | 329.9 |
264
+ | **worker thread** | 919 | **0.7** |
265
+
266
+ → Offloading the addon to a worker drops main-loop jitter **330ms → 0.7ms**
267
+ (~470×) and even runs faster (no contention with the heartbeat).
268
+
269
+ Why this is a Rust-driver advantage: the worker's isolate does fetch + BSON→JS
270
+ in native code; the main isolate does nothing. The official Node driver's BSON
271
+ decode is JS, so a worker still pays full JS deserialize and worker→main transfer
272
+ is heavier.
273
+
274
+ Caveat: returning large result *data* to main still costs a structured clone.
275
+ Mitigate by (a) processing in the worker and returning summaries, or (b)
276
+ transferring the raw BSON Buffer (transferable, zero-copy) and lazy-parsing only
277
+ accessed fields on main. Recommended production shape: a small worker pool running
278
+ rumongo, main thread dispatches queries — main loop stays responsive under load.
279
+
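Option (a), processing in the worker and returning only a summary, can be sketched with Node's `worker_threads`. A synthetic loop stands in for the query, because the worker-side rumongo calls are not shown in this log. `runInWorker` and the summary shape are hypothetical.

```js
const { Worker } = require('node:worker_threads');

// Source that runs inside the worker isolate. In a real setup this is where
// the query and the per-doc work would happen; here a loop stands in for it.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 0; i < workerData.docCount; i++) sum += i % 90; // stand-in for doc.age
  parentPort.postMessage({ count: workerData.docCount, sum });
`;

// Run the work off the main isolate; only the small summary is cloned back.
function runInWorker(docCount) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { docCount } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

runInWorker(20000).then((summary) => console.log(summary));
```

The structured clone cost applies to whatever `postMessage` carries, so a two-field summary is cheap where 20k rows would not be.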
280
+ **Conclusion on jitter:** near-zero is achievable. Single/low concurrency →
281
+ already near-zero (Phase 4b). High concurrency on one isolate → main-thread-bound
282
+ (equal to official, but lazy finishes ~9× sooner). High concurrency with a worker
283
+ pool → main-loop jitter ~0.7ms. Lever summary: lazy (less work) + worker threads
284
+ (other isolate) + cursor (bounded memory).
285
+
286
+ ---
287
+
288
+ ## 2026-06-15 — Phase 4d: worker pool load sweep → opt-in (not default)
289
+
290
+ Built an opt-in worker pool (`worker/pool.js` + `worker/pool-worker.js`): N Node
291
+ worker threads, each with its own addon + MongoClient, round-robin dispatch.
292
+ Swept it vs direct main-thread use across loads (`bench/poolbench.js`, pool=6).
293
+
294
+ | load | mode | wall (ms) | main jitter (ms) |
295
+ |---|---|---|---|
296
+ | tiny (1 doc, 50 conc) | direct | 8.6 | 0.6 |
297
+ | | pool-data | 18.0 | 3.0 |
298
+ | | pool-reduced | 8.4 | 0.0 |
299
+ | small (100, 20 conc) | direct | 24.9 | 11.7 |
300
+ | | pool-data | 25.8 | 11.5 |
301
+ | | pool-reduced | 14.1 | 1.3 |
302
+ | med (1000, 12 conc) | direct | 121.9 | 93.2 |
303
+ | | pool-data | 125.1 | 108.9 |
304
+ | | pool-reduced | 69.4 | 0.9 |
305
+ | heavy (10000, 6 conc) | direct | 780.7 | 517.9 |
306
+ | | pool-data | 860.6 | 551.0 |
307
+ | | pool-reduced | **328.4** | **7.5** |
308
+
309
+ - **pool-data** (worker queries, ships rows to main): ties or LOSES at every load.
310
+ Main still parses the result and now also pays the cross-thread transfer. A
311
+ transparent "route find() through workers" buys nothing.
312
+ - **pool-reduced** (worker queries AND reduces, returns a summary): wins at every
313
+ load — jitter near-zero (7.5 vs 518ms heavy) and wall up to 2.4× faster. But it
314
+ requires pushing the data-processing INTO the worker; it is not a drop-in find().
315
+
316
+ **Decision: worker pool stays OPT-IN**, positioned for the "do the work in the
317
+ worker, return a small result" pattern (aggregations, transforms, counts, exports,
318
+ streaming to a socket from the worker). Not made default, because the only
319
+ universally-winning mode isn't a transparent `find()` replacement. The default
320
+ path remains the direct addon (already 3–7× faster than the official driver).
321
+
322
+ ---
323
+
324
+ ## 2026-06-16 — Phase 4e: generic worker reduce + larger-load sweep
325
+
326
+ Generalized the worker reduce: `pool.reduce(db, coll, filter, opts, reducerFn,
327
+ init)` ships the reducer as source and runs `(acc, doc) => acc` in the worker;
328
+ only the accumulator returns to main. (find() stays direct; reduce is the
329
+ worker-backed path.)
330
+
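Shipping a reducer "as source" has one consequence worth spelling out: the function is rebuilt from its text, so it cannot see variables from the scope where it was written. The sketch below illustrates the mechanism with hypothetical helper names; it is not the code in `worker/pool.js`.

```js
// Turn a reducer into text that can cross a thread boundary.
function serializeReducer(reducer) {
  return reducer.toString();
}

// Rebuild the reducer from its source. Variables from the original scope are
// gone at this point, which is why the reducer must be self-contained.
function reviveReducer(source) {
  return new Function(`"use strict"; return (${source});`)();
}

// What a worker could do with the revived reducer and the docs it fetched.
function runReduce(source, docs, init) {
  const reducer = reviveReducer(source);
  let acc = init;
  for (const doc of docs) acc = reducer(acc, doc);
  return acc;
}

const source = serializeReducer((acc, doc) => acc + doc.age);
console.log(runReduce(source, [{ age: 30 }, { age: 12 }], 0)); // prints 42
```

Anything the reducer needs beyond `acc` and `doc` has to travel inside the accumulator or be written as a literal in the function body.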
331
+ Sweep, 100k-doc collection, pool=6 (`bench/poolbench.js`):
332
+
333
+ | load (result size, conc) | direct wall/jit (ms) | pool-reduced wall/jit (ms) |
334
+ |---|---|---|
335
+ | small (100, 20) | 33 / 18 | 19 / 0.5 |
336
+ | med (1000, 12) | 154 / 127 | 65 / 0.8 |
337
+ | heavy (10000, 6) | 808 / 544 | 337 / 4.6 |
338
+ | huge (50000, 4) | 2879 / 1722 | 1420 / 14 |
339
+ | max (100000, 2) | 3118 / 163 | 2274 / 18 |
340
+
341
+ (pool-data omitted — ties/loses on wall at every load, same as Phase 4d.)
342
+
343
+ - pool-reduced wins wall ~1.4–2.4× (1.7× small, 2.4× med/heavy, 2.0× huge, 1.4× max) and keeps main-loop jitter ≤18ms while direct
344
+ jitter climbs to 1722ms on big result sets. The bigger the data, the bigger the
345
+ responsiveness win.
346
+ - Worker count is **fixed** at pool creation (default `cpus-2`). BSON work is
347
+ CPU-bound so >cores doesn't help; dynamic autoscaling would add cold-start
348
+ latency on spikes — deferred unless bursty traffic needs it.
349
+
350
+ Final stance: worker pool = opt-in; `reduce` runs in the worker by default within
351
+ the pool. Direct addon remains the default for `find` (returns docs).
352
+
353
+ ---
354
+
355
+ ## 2026-06-16 — Phase 5: Mongoose-style Model + projection pushdown
356
+
357
+ `ts/model.ts`: `Model.define(collection, schemaFields)` builds a cached
358
+ projection from the schema field list and pushes it down on every query, so
359
+ MongoDB only sends schema fields. Methods: `find`, `findOne`, `findById`
360
+ (hex-string id → ObjectId via Extended JSON), `getProjection`.
361
+
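Projection pushdown amounts to building the projection once and attaching it to every query's options. A minimal sketch under that reading follows. `buildProjection` and `withProjection` are hypothetical names, not the functions in `ts/model.ts`.

```js
// Build the projection once from the schema field list and freeze it so it
// can be shared across queries.
function buildProjection(schemaFields) {
  const projection = {};
  for (const name of Object.keys(schemaFields)) projection[name] = 1;
  return Object.freeze(projection);
}

// Merge the cached projection into per-call options such as sort or limit.
function withProjection(projection, opts = {}) {
  return { ...opts, projection };
}

const projection = buildProjection({ name: 1, age: 1 });
console.log(withProjection(projection, { limit: 10 }));
```

With an inclusion projection MongoDB still returns `_id` unless it is explicitly excluded, so `_id` does not need to appear in the schema field list.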
362
+ ### Model parity — 9/9 PASS vs Mongoose (`__tests__/model/model.test.js`)
363
+
364
+ find / find+filter / findOne / findById / sort / limit / **projection pushdown
365
+ (non-schema field excluded)** / empty→[] / no-match→null. Compared schema-field
366
+ values + counts on identical data (mongoose `versionKey:false`, `.lean()`).
367
+
368
+ ### Perf: Model vs Mongoose, 50k docs (6 fields), 10 iters (`bench/model.js`)
369
+
370
+ | target | mean (ms) |
371
+ |---|---|
372
+ | mongoose (hydrated) | 1228.1 |
373
+ | mongoose (.lean) | 607.2 |
374
+ | **rumongo Model** | **244.6** |
375
+
376
+ → **5.0× vs hydrated Mongoose**, 2.5× vs `.lean()`. (Eager, all 6 fields read.
377
+ With `findLazy` + few-field reads the multiple is higher — see Phase 4: 7.3× vs
378
+ the raw official driver when reading 2 of 33 fields.)
379
+
380
+ MIGRATION.md written (Mongoose → rumongo mapping + behavior differences).
381
+
382
+ ---
383
+
384
+ ## 2026-06-16 — Phase 6: consolidated preset benchmark suite
385
+
386
+ `bench/suite.js`: projection presets (few=4, small=9, medium=15, large=35,
387
+ full=45 fields) over a 45-field doc. Deterministic (data = f(index)), warmup +
388
+ 6 iters, mean ± sd. N=30k. Run against local MongoDB (network ~0 isolates
389
+ client-side cost — the mock wire-server from the plan was skipped: making both
390
+ the official Node driver and the Rust `mongodb` crate accept a hand-rolled
391
+ handshake is large and brittle, and localhost already removes network variance).
392
+
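"Deterministic (data = f(index))" and "mean ± sd" can be sketched as below. The field naming, the value formula, and the sample (n − 1) standard deviation are all assumptions for illustration; `bench/suite.js` may generate data and compute spread differently.

```js
// Hypothetical generator: every value is a pure function of (index, field number),
// so two runs see byte-identical data. _id is added on top of fieldCount fields.
function makeDoc(index, fieldCount = 45) {
  const doc = { _id: index };
  for (let f = 0; f < fieldCount; f++) {
    doc[`f${f}`] = (index * 31 + f * 7) % 1000;
  }
  return doc;
}

// Projection for a preset: the first n generated fields (few=4, small=9, ...).
function presetProjection(n) {
  const projection = {};
  for (let f = 0; f < n; f++) projection[`f${f}`] = 1;
  return projection;
}

// Mean and sample standard deviation of timings in ms; needs at least 2 samples.
function meanSd(samples) {
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  const squares = samples.reduce((sum, x) => sum + (x - mean) ** 2, 0);
  return { mean, sd: Math.sqrt(squares / (samples.length - 1)) };
}
```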
393
+ ### A) Driver find — official Node driver vs rumongo (eager)
394
+
395
+ | preset | fields | official (ms) | rumongo (ms) | speedup |
396
+ |---|---|---|---|---|
397
+ | few | 4 | 649±95 | 178±17 | **3.65×** |
398
+ | small | 9 | 792±62 | 304±16 | 2.61× |
399
+ | medium | 15 | 687±49 | 418±54 | 1.64× |
400
+ | large | 35 | 1532±132 | 841±61 | 1.82× |
401
+ | full | 45 | 2032±135 | 1031±59 | 1.97× |
402
+
403
+ ### B) ODM — mongoose `.lean()` vs rumongo Model
404
+
405
+ | preset | fields | mongoose (ms) | Model (ms) | speedup |
406
+ |---|---|---|---|---|
407
+ | few | 4 | 477±53 | 177±18 | 2.69× |
408
+ | small | 9 | 559±35 | 284±31 | 1.97× |
409
+ | medium | 15 | 680±68 | 405±35 | 1.68× |
410
+ | large | 35 | 1455±24 | 850±65 | 1.71× |
411
+ | full | 45 | 2041±167 | 1031±98 | 1.98× |
412
+
413
+ ### C) Event-loop jitter (full preset, single query)
414
+
415
+ official `maxJitter=149.2ms` · rumongo `maxJitter=13.8ms` (~10× lower).
416
+
417
+ ### Verdict vs plan targets (honest)
418
+
419
+ - **≥2× over official find:** met for few/small (2.61–3.65×); full lands just
419
+ under at 1.97×; medium/large reach 1.64–1.82× (projection cost dominates at mid widths).
421
+ - **≥15× over Mongoose:** NOT met by eager find. vs `.lean()` it's 1.7–2.7×; vs
422
+ hydrated Mongoose ~5× (Phase 5). A 15× figure depends on lazy + narrow field
422
+ reads; the closest measurement in this log is Phase 4 (7.3× vs the raw official
423
+ driver reading 2 of 33 fields), i.e. it's a property of the access pattern, not eager full-doc reads.
425
+ - **Near-zero jitter:** eager full read is 13.8ms (10× better than official, not
426
+ zero — JSON.parse remains). Near-zero needs lazy/worker paths (Phase 4b/4c).
427
+
428
+ Bottom line: 1.6–3.7× faster reads than the official driver, ~2× vs Mongoose
429
+ `.lean()` / ~5× vs hydrated, 10× lower jitter — consistently, across projection
430
+ sizes. The headline 15–20× is achievable but only under lazy/narrow-read or
431
+ worker-offload patterns, not eager full-document reads.
432
+
433
+ ---
434
+
435
+ ## Template for future entries
436
+
437
+ ```
438
+ ## YYYY-MM-DD — Phase N: <title>
439
+
440
+ Implementation: <what changed>
441
+
442
+ ### <scenario>, <dataset>, <iters> (`bench/<script>.js`)
443
+ | metric | official | rust | result |
444
+ |---|---|---|---|
445
+ | ... | | | |
446
+
447
+ Interpretation: <why the numbers look like this; what's next>
448
+ ```
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Piyush Bhangale
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/MIGRATION.md ADDED
@@ -0,0 +1,69 @@
1
+ # Migrating from Mongoose to rumongo
2
+
3
+ rumongo is a **read-path** replacement for Mongoose: faster reads via
4
+ Rust-native BSON parsing, off-thread, with optional lazy field access. It does
5
+ **not** cover writes, hooks, virtuals, populate, or validation — keep Mongoose
6
+ (or the official driver) for those.
7
+
8
+ ## Model definition
9
+
10
+ **Mongoose**
11
+ ```js
12
+ const User = mongoose.model('User', new mongoose.Schema({ name: String, age: Number }))
13
+ ```
14
+
15
+ **rumongo**
16
+ ```js
17
+ import { MongoClient, Model } from 'rumongo'
18
+ const client = await MongoClient.connect(uri)
19
+ const coll = client.collection('mydb', 'users')
20
+ const User = Model.define(coll, { name: 1, age: 1 }) // field list = projection
21
+ ```
22
+
23
+ The schema field list becomes a cached projection — MongoDB only sends those
24
+ fields (projection pushdown), so less data on the wire and less to parse.
25
+
26
+ ## Read methods
27
+
28
+ | Mongoose | rumongo | Notes |
29
+ |---|---|---|
30
+ | `User.find(filter)` | `User.find(filter)` | returns plain objects |
31
+ | `User.find(f).sort(s).limit(n)` | `User.find(f, { sort: s, limit: n })` | options object, not chained |
32
+ | `User.findOne(filter)` | `User.findOne(filter)` | `null` if no match |
33
+ | `User.findById(id)` | `User.findById(idHexString)` | pass the 24-char hex string |
34
+ | `.lean()` | (always) | rumongo always returns plain objects |
35
+ | `.select('name age')` | (automatic) | the schema fields are the projection |
36
+
37
+ ## Behavior differences
38
+
39
+ - **No Mongoose Documents.** Results are plain objects (like `.lean()`). No
40
+ `.save()`, no getters/setters, no virtuals.
41
+ - **`_id` is a hex string**, not an `ObjectId` instance. Compare with
42
+ `id === doc._id` (string), or convert.
43
+ - **Dates** come back as JS `Date` (via `findLazy`/`Model`) — same as Mongoose
44
+ `.lean()`.
45
+ - **No `__v`** version key (not in your schema → not projected).
46
+ - **Filters with BSON types** use Extended JSON: an ObjectId filter is
47
+ `{ _id: { $oid: '...' } }`. `findById` does this for you.
48
+ - **Writes / hooks / populate / validation: not supported.** Use Mongoose or the
49
+ official driver for the write path; use rumongo for hot read paths.
50
+
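Filters that carry BSON types are written as Extended JSON objects. The helpers below are a hypothetical convenience layer, not part of rumongo. Whether rumongo accepts the relaxed ISO-string form of `$date` is an assumption here; the `$oid` form is the one shown above.

```js
// Build an _id filter from a 24-character hex string, rejecting anything else
// early instead of sending a malformed filter to the server.
function oidFilter(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError('expected a 24-character hex string');
  }
  return { _id: { $oid: hex } };
}

// Half-open date range [from, to) on a field, using relaxed Extended JSON dates.
function dateRangeFilter(field, from, to) {
  return {
    [field]: {
      $gte: { $date: from.toISOString() },
      $lt: { $date: to.toISOString() },
    },
  };
}

console.log(JSON.stringify(oidFilter('507f1f77bcf86cd799439011')));
```

Since `_id` comes back as a hex string, a value read from one result can be passed straight into `oidFilter` for a follow-up query.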
51
+ ## Advanced: keep the event loop free under load
52
+
53
+ For heavy concurrent read+aggregate work, run the reduction inside a worker so the
54
+ main loop stays responsive (see README / `worker/pool.js`):
55
+
56
+ ```js
57
+ import { WorkerPool } from 'rumongo'
58
+ const pool = await WorkerPool.create({ uri, size: 6 })
59
+ const { acc } = await pool.reduce('mydb', 'users', { active: true }, {}, (a, d) => a + d.age, 0)
60
+ ```
61
+
62
+ ## Read APIs (pick by shape)
63
+
64
+ - `collection.find(filter, opts)` — eager, plain objects. Small/medium results.
65
+ - `collection.findLazy(filter, opts)` — Proxy docs; fields parse on access. Wide
66
+ docs where you read few fields.
67
+ - `collection.findCursor(filter, opts)` → `nextBatch()` — streaming, bounded
68
+ memory. Large result sets.
69
+ - `Model.find/findOne/findById` — Mongoose-style + automatic projection.