spl.js 1.0.0 → 1.0.1

package/.TODO.md ADDED
@@ -0,0 +1,792 @@
1
+ # Cursor Implementation for spl.js
2
+
3
+ ## Background: SQLite Cursors
4
+
5
+ ### SQLite C API
6
+
7
+ SQLite doesn't expose a separate "cursor" object. Instead, the **prepared statement** (`sqlite3_stmt`) acts as the cursor:
8
+
9
+ - `sqlite3_prepare_v2()` - Compile SQL into a prepared statement
10
+ - `sqlite3_step()` - Advance to next row (returns `SQLITE_ROW`, `SQLITE_DONE`, or an error code such as `SQLITE_BUSY`)
11
+ - `sqlite3_column_*()` - Extract column values from current row
12
+ - `sqlite3_reset()` - Rewind cursor to beginning for re-execution
13
+ - `sqlite3_finalize()` - Destroy statement and free resources
14
+
15
+ ### Advantages of Cursors
16
+
17
+ | Advantage | Description |
18
+ |-----------|-------------|
19
+ | Memory efficient | Rows fetched one-at-a-time, no need to load entire result set |
20
+ | Streaming | Process large datasets without memory exhaustion |
21
+ | Reusable | Can reset, bind new parameters, and re-execute without recompiling the SQL |
22
+ | Multiple concurrent | Many cursors can operate on same table independently |
23
+ | Fast seeking | Internal B-tree cursors enable efficient key-based lookups |
24
+
25
+ ### Disadvantages of Cursors
26
+
27
+ | Disadvantage | Description |
28
+ |--------------|-------------|
29
+ | Manual lifecycle | Must finalize or risk memory leaks |
30
+ | No row locking | Other operations can modify data under an active cursor |
31
+ | Forward-only by default | Standard iteration is sequential |
32
+ | Statement caching | Reusing statements requires careful reset management |
33
+ | Connection affinity | Cursors tied to their database connection |
34
+
35
+ ### References
36
+
37
+ - [SQLite C/C++ Interface Introduction](https://sqlite.org/cintro.html)
38
+ - [sqlite3_step() Documentation](https://sqlite.org/c3ref/step.html)
39
+ - [SQLite B-Tree Module](https://sqlite.org/btreemodule.html)
40
+ - [SQLite Bytecode Engine](https://sqlite.org/opcode.html)
41
+
42
+ ---
43
+
44
+ ## Current spl.js Architecture
45
+
46
+ ### Message Protocol (Main Thread -> Worker)
47
+
48
+ ```javascript
49
+ // Single operation
50
+ {
51
+ __id__: number, // Unique message ID for response matching
52
+ fn: string, // Function name (e.g., 'db.exec', 'fs.file')
53
+ id: number, // Database instance ID
54
+ args: any[] // Arguments to pass
55
+ }
56
+
57
+ // Batch operations
58
+ {
59
+ __id__: number,
60
+ fn: [
61
+ { id, fn, args }, // Array of multiple operations
62
+ ...
63
+ ]
64
+ }
65
+ ```
66
+
67
+ ### Response Protocol (Worker -> Main Thread)
68
+
69
+ ```javascript
70
+ {
71
+ __id__: number, // Echoes the request's __id__
72
+ res: any, // Result/response data
73
+ err: string // Error message if execution failed
74
+ }
75
+ ```
76
+
77
+ ### Key Architectural Patterns
78
+
79
+ 1. **Promise Queue**: Each message gets unique `__id__` stored in queue, response matches it back
80
+ 2. **Transferables**: Large ArrayBuffers transferred by reference, not copied
81
+ 3. **Batch Operations**: Multiple SQL operations sent in single message via async generator
82
+ 4. **Thenable Chaining**: Synchronous API stacks operations, sends in batch when awaited
83
+
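The promise-queue pattern in item 1 can be sketched as follows. This is a minimal illustration, not the spl.js source: `createRequestQueue`, `send`, and `onResponse` are hypothetical names, and only the message fields (`__id__`, `fn`, `id`, `args`, `res`, `err`) come from the protocol described above.

```javascript
// Hypothetical sketch of __id__-based request/response matching.
const createRequestQueue = (send) => {
  let nextId = 0;
  const pending = new Map(); // __id__ -> { resolve, reject }

  // Post one request; the returned promise settles when the matching response arrives.
  const request = (fn, id, args = []) =>
    new Promise((resolve, reject) => {
      const __id__ = ++nextId;
      pending.set(__id__, { resolve, reject });
      send({ __id__, fn, id, args });
    });

  // Route every worker response through here.
  const onResponse = ({ __id__, res, err }) => {
    const entry = pending.get(__id__);
    if (!entry) return; // Unknown or already settled message
    pending.delete(__id__);
    if (err) entry.reject(new Error(err));
    else entry.resolve(res);
  };

  return { request, onResponse, size: () => pending.size };
};
```

In a browser, `send` would presumably be `(msg) => worker.postMessage(msg)` and `onResponse` would be called from the worker's `message` event with `event.data`.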
84
+ ---
85
+
86
+ ## Research: How Other Libraries Handle This
87
+
88
+ ### Python sqlite3 (DB-API 2.0)
89
+
90
+ ```python
91
+ # Create cursor from connection
92
+ cur = con.cursor()
93
+
94
+ # execute() prepares but does NOT fetch all rows
95
+ cur.execute("SELECT * FROM users WHERE age > ?", (18,))
96
+
97
+ # User chooses fetch strategy AFTER execute:
98
+ row = cur.fetchone() # Single row
99
+ rows = cur.fetchmany(100) # Batch of 100
100
+ rows = cur.fetchall() # All at once
101
+
102
+ # Or iterate directly (cursor is iterable)
103
+ for row in cur.execute("SELECT * FROM users"):
104
+ print(row)
105
+ ```
106
+
107
+ **Key behavior found in the [CPython source](https://github.com/python/cpython/blob/main/Modules/_sqlite/cursor.c) (simplified):**
108
+
109
+ ```c
110
+ // In _pysqlite_query_execute():
111
+ rc = stmt_step(self->statement->st); // Step ONCE
112
+
113
+ if (rc == SQLITE_ROW) {
114
+ // SELECT with results → STOP HERE
115
+ // Statement left "open", positioned at first row
116
+ }
117
+ else if (rc == SQLITE_DONE) {
118
+ // INSERT/UPDATE/DELETE or empty SELECT
119
+ // Reset statement immediately
120
+ stmt_reset(self->statement);
121
+ }
122
+ ```
123
+
124
+ **Python steps ONCE** - just enough to determine if there's data:
125
+ - `SQLITE_ROW` → SELECT has data, statement stays open for lazy fetch
126
+ - `SQLITE_DONE` → INSERT/UPDATE/DELETE executed immediately, or empty SELECT
127
+
128
+ ### better-sqlite3 (Node.js)
129
+
130
+ ```javascript
131
+ // Prepare statement from db
132
+ const stmt = db.prepare('SELECT * FROM users WHERE age > ?');
133
+
134
+ // User chooses execution method:
135
+ const row = stmt.get(18); // First row only
136
+ const rows = stmt.all(18); // All rows at once
137
+ for (const row of stmt.iterate(18)) { // Iterate one-by-one
138
+ console.log(row);
139
+ }
140
+ ```
141
+
142
+ **Key insight**: `.prepare()` creates statement, then user chooses `.all()` vs `.iterate()`.
143
+
144
+ ### Pattern Comparison
145
+
146
+ | Library | Prepare | Get All | Iterate/Stream |
147
+ |---------|---------|---------|----------------|
148
+ | **Python sqlite3** | `cur.execute(sql)` | `cur.fetchall()` | `for row in cur:` |
149
+ | **better-sqlite3** | `db.prepare(sql)` | `stmt.all()` | `stmt.iterate()` |
150
+ | **spl.js current** | `db.exec(sql)` | `db.get.rows` | ❌ N/A |
151
+
152
+ **Both libraries separate preparation from fetching strategy.**
153
+
154
+ ### References
155
+
156
+ - [Python sqlite3 Documentation](https://docs.python.org/3/library/sqlite3.html)
157
+ - [CPython sqlite3 cursor.c source](https://github.com/python/cpython/blob/main/Modules/_sqlite/cursor.c)
158
+ - [better-sqlite3 API](https://github.com/WiseLibs/better-sqlite3/blob/master/docs/api.md)
159
+
160
+ ---
161
+
162
+ ## API Design Options (Undecided)
163
+
164
+ ### The `exec().cursor()` Problem
165
+
166
+ **Issue**: In synchronous code, `exec()` cannot know if `.cursor()` will be called after it returns.
167
+
168
+ Current spl.js `exec()` behavior:
169
+ 1. Prepares statement
170
+ 2. Binds parameters
171
+ 3. **Steps through ALL rows** (loads entire result into memory)
172
+ 4. Finalizes statement
173
+ 5. Returns `_db` for chaining
174
+
175
+ When `cursor()` is then called, the data is **already fully loaded in memory**, defeating the purpose of streaming/cursor-based iteration.
176
+
177
+ ---
178
+
179
+ ### Option A: `db.cursor(sql, params, batchSize?)`
180
+
181
+ Separate method, bypasses `exec()` entirely.
182
+
183
+ ```javascript
184
+ // Node.js (sync)
185
+ const cursor = db.cursor('SELECT * FROM users WHERE age > ?', [18], 100);
186
+
187
+ console.log(cursor.cols); // ['id', 'name', 'age']
188
+
189
+ while (!cursor.done) {
190
+ for (const user of cursor.objs) {
191
+ console.log(user.name);
192
+ }
193
+ }
194
+ ```
195
+
196
+ | Pros | Cons |
197
+ |------|------|
198
+ | No wasted work | New method to learn |
199
+ | Clear separation of concerns | Different pattern from exec() |
200
+ | Single prepare, single pass | |
201
+
202
+ **Implementation**: New `db.cursor()` method, straightforward.
203
+
204
+ ---
205
+
206
+ ### Option B: `exec(sql, params, { cursor: true/batchSize })`
207
+
208
+ Flag on exec() to return cursor instead of db.
209
+
210
+ ```javascript
211
+ // Node.js (sync)
212
+ const cursor = db.exec('SELECT * FROM users WHERE age > ?', [18], { cursor: true });
213
+ // or with explicit batch size:
214
+ const cursor = db.exec('SELECT * FROM users WHERE age > ?', [18], { cursor: 100 });
215
+
216
+ console.log(cursor.cols); // ['id', 'name', 'age']
217
+
218
+ while (!cursor.done) {
219
+ for (const user of cursor.objs) {
220
+ console.log(user.name);
221
+ }
222
+ }
223
+ ```
224
+
225
+ | Pros | Cons |
226
+ |------|------|
227
+ | No wasted work - exec() knows mode upfront | Return type changes based on options |
228
+ | Single entry point (no new method) | Chaining stops when cursor mode (returns cursor, not _db) |
229
+ | No breaking change | Options parameter adds complexity |
230
+ | Explicit intent | |
231
+
232
+ **Implementation**:
233
+ ```javascript
234
+ this.exec = (sql, par, opt = {}) => {
235
+ const { cursor: cursorMode } = opt;
236
+ const batchSize = typeof cursorMode === 'number' ? cursorMode : 100;
237
+
238
+ if (cursorMode) {
239
+ // Cursor mode: prepare + bind, step ONCE
240
+ const stmt = sqlite3_prepare_v2(...);
241
+ bind(stmt, par);
242
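+ const rc = sqlite3_step(stmt); // Step to first row; may already be SQLITE_DONE
243
+ const columns = getColumnNames(stmt);
244
+ const firstRow = rc === SQLITE_ROW ? readCurrentRow(stmt) : null; // Nothing to read on an empty result
245
+ return createCursor(stmt, columns, firstRow, batchSize);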
246
+ }
247
+
248
+ // Normal mode: existing behavior
249
+ // ... prepare, bind, step ALL, finalize ...
250
+ return _db;
251
+ };
252
+ ```
253
+
254
+ ---
255
+
256
+ ### Option C: `exec(...).cursor()` with Lazy exec (Python-style) ⭐
257
+
258
+ Modify `exec()` to step **once** like Python - just enough to determine if there's data.
259
+
260
+ ```javascript
261
+ // Usage stays the same:
262
+ db.exec('INSERT INTO users VALUES (?)', ['Alice']); // Executes immediately (DONE)
263
+ db.exec('SELECT * FROM users'); // Steps once, waits
264
+ db.get.rows; // NOW fetches all remaining
265
+ // OR
266
+ db.exec('SELECT * FROM users').cursor(100); // Returns cursor for streaming
267
+ ```
268
+
269
+ **Implementation**:
270
+
271
+ ```javascript
272
+ this.exec = (sql, par) => {
273
+ const stmt = sqlite3_prepare_v2(sql);
274
+ bind(stmt, par);
275
+
276
+ const rc = sqlite3_step(stmt); // Step ONCE
277
+
278
+ if (rc === SQLITE_DONE) {
279
+ // INSERT/UPDATE/DELETE or empty SELECT
280
+ // Executed immediately, finalize now
281
+ _result = result(getColumns(stmt), []); // Read column names before finalizing so an empty SELECT keeps its cols
281
+ sqlite3_finalize(stmt);
283
+ _pendingStmt = null;
284
+ }
285
+ else if (rc === SQLITE_ROW) {
286
+ // SELECT with data - save state, DON'T finalize
287
+ _pendingStmt = stmt;
288
+ _pendingColumns = getColumns(stmt);
289
+ _firstRow = readCurrentRow(stmt); // Save row we already stepped to
290
+ }
291
+
292
+ return _db;
293
+ };
294
+
295
+ // .get - consume all remaining rows (lazy)
296
+ Object.defineProperty(this, 'get', {
297
+ get: () => {
298
+ if (_pendingStmt) {
299
+ const rows = [_firstRow]; // Include first row
300
+ while (sqlite3_step(_pendingStmt) === SQLITE_ROW) {
301
+ rows.push(readCurrentRow(_pendingStmt));
302
+ }
303
+ sqlite3_finalize(_pendingStmt);
304
+ _result = result(_pendingColumns, rows);
305
+ _pendingStmt = null;
306
+ _firstRow = null;
307
+ }
308
+ return _result;
309
+ }
310
+ });
311
+
312
+ // .cursor() - return iterator using the open statement
313
+ this.cursor = (batchSize = 100) => {
314
+ if (!_pendingStmt) {
315
+ throw new Error('No pending SELECT or already consumed');
316
+ }
317
+ const cursor = createCursor(_pendingStmt, _pendingColumns, _firstRow, batchSize);
318
+ _pendingStmt = null; // Cursor now owns the statement
319
+ _firstRow = null;
320
+ return cursor;
321
+ };
322
+ ```
323
+
324
+ | Pros | Cons |
325
+ |------|------|
326
+ | `exec().cursor()` works efficiently | Subtle behavior change (lazy fetch) |
327
+ | Matches Python's proven pattern | `.get` must be called to consume results |
328
+ | INSERT/UPDATE/DELETE still immediate | Can only call `.get` OR `.cursor()`, not both |
329
+ | No new methods needed | Need to handle "dangling" statement if neither called |
330
+ | Single pass, no wasted work | |
331
+
332
+ **Edge cases to handle**:
333
+ - What if user calls `exec()` twice without consuming first? → Finalize previous `_pendingStmt`
334
+ - What if `db.close()` called with pending statement? → Finalize in close()
335
+ - Empty SELECT (no rows)? → `SQLITE_DONE` on first step, same as INSERT
336
+
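The three edge cases above can be exercised without SQLite by using a stand-in statement. The sketch below is an assumption-laden illustration: `fakePrepare` and `createDb` are hypothetical, and a real build would call `sqlite3_prepare_v2()`, `sqlite3_step()`, and `sqlite3_finalize()` instead. Only the result-code values (`SQLITE_ROW` = 100, `SQLITE_DONE` = 101) match SQLite.

```javascript
const SQLITE_ROW = 100;
const SQLITE_DONE = 101;

// Stand-in for a prepared statement over a fixed list of rows (hypothetical).
const fakePrepare = (rows) => {
  let i = -1;
  const stmt = {
    finalized: false,
    step() { i += 1; return i < rows.length ? SQLITE_ROW : SQLITE_DONE; },
    current() { return rows[i]; },
    finalize() { stmt.finalized = true; },
  };
  return stmt;
};

const createDb = (prepare) => {
  let pending = null; // At most one open statement per db

  const release = () => {
    if (pending) {
      pending.stmt.finalize();
      pending = null;
    }
  };

  const db = {
    exec(sql) {
      release(); // exec() twice: finalize the unconsumed statement first
      const stmt = prepare(sql);
      if (stmt.step() === SQLITE_ROW) {
        pending = { stmt, firstRow: stmt.current() }; // Leave open for .get or .cursor()
      } else {
        stmt.finalize(); // DONE on first step: write statement or empty SELECT
      }
      return db;
    },
    close() { release(); }, // close() with a pending statement
    get hasPending() { return pending !== null; },
  };
  return db;
};
```

Routing all three paths through one `release()` helper keeps the "single pending statement" invariant in one place.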
337
+ ---
338
+
339
+ ### Comparison Summary
340
+
341
+ | Aspect | Option A | Option B | Option C |
342
+ |--------|----------|----------|----------|
343
+ | API | `db.cursor()` | `exec(..., {cursor})` | `exec().cursor()` |
344
+ | Efficiency | ✅ Single pass | ✅ Single pass | ✅ Single pass |
345
+ | Breaking change | No | No | Subtle (lazy `.get`) |
346
+ | New method | Yes | No | No |
347
+ | Return type | New type | Conditional | Consistent |
348
+ | Chaining after | N/A | Stops | `.get` or `.cursor()` |
349
+ | Matches Python | Partially | No | ✅ Yes |
350
+
351
+ ---
352
+
353
+ ## Cursor Object Design
354
+
355
+ The cursor mirrors Result's API but fetches lazily in batches.
356
+
357
+ ### Result vs Cursor Comparison
358
+
359
+ | Property | Result | Cursor |
360
+ |----------|--------|--------|
361
+ | `.cols` | Getter (all columns) | Getter (all columns) |
362
+ | `.rows` | Getter (all data) | Getter (next batch as arrays) |
363
+ | `.objs` | Getter (all data) | Getter (next batch as objects) |
364
+ | `.flat` | Getter (all data) | Getter (next batch flattened) |
365
+ | `.first` | Getter (first row) | N/A (use `exec().get.first` instead) |
366
+ | `.done` | N/A | Getter (true if exhausted) |
367
+
368
+ ### Auto-Cleanup Behavior (Important for Documentation)
369
+
370
+ Cursors are **automatically finalized** in these scenarios:
371
+
372
+ 1. **Cursor exhausted** - When `.done` becomes true (all rows consumed)
373
+ 2. **New `exec()` called** - Finalizes any previous pending statement
374
+ 3. **`db.close()` called** - Finalizes all open cursors/statements
375
+
376
+ **No explicit `close()` or `reset()` needed.** Users can simply:
377
+ - Iterate until done (auto-cleanup)
378
+ - Call `exec()` again (previous cursor cleaned up)
379
+ - Let cursor go out of scope after `exec()` replaces it
380
+
381
+ ### Implementation
382
+
383
+ ```javascript
384
+ const createCursor = (stmt, columns, firstRow, batchSize) => {
385
+ let _done = false;
386
+ let _firstRow = firstRow; // Already stepped to first row in exec()
387
+
388
+ const fetchBatch = () => {
389
+ if (_done) return [];
390
+
391
+ const rows = [];
392
+
393
+ // Include first row if we have it
394
+ if (_firstRow) {
395
+ rows.push(_firstRow);
396
+ _firstRow = null;
397
+ }
398
+
399
+ // Fetch remaining up to batchSize
400
+ while (rows.length < batchSize) {
401
+ const rc = sqlite3_step(stmt);
402
+ if (rc === SQLITE_ROW) {
403
+ rows.push(readCurrentRow(stmt));
404
+ } else {
405
+ _done = true;
406
+ sqlite3_finalize(stmt); // Auto-finalize when exhausted
407
+ break;
408
+ }
409
+ }
410
+
411
+ return rows;
412
+ };
413
+
414
+ const toObj = (row) => columns.reduce((o, col, i) => {
415
+ o[col] = row[i];
416
+ return o;
417
+ }, {});
418
+
419
+ return Object.freeze({
420
+ // Metadata
421
+ get cols() { return columns.slice(); },
422
+ get done() { return _done; },
423
+
424
+ // Fetch getters - each access fetches next batch
425
+ get rows() { return fetchBatch(); },
426
+ get objs() { return fetchBatch().map(toObj); },
427
+ get flat() { return fetchBatch().flat(); },
428
+ });
429
+ };
430
+ ```
431
+
432
+ ### Usage Examples
433
+
434
+ ```javascript
435
+ // Node.js (sync)
436
+ const cursor = db.exec('SELECT id, name, age FROM users').cursor(100);
437
+
438
+ console.log(cursor.cols); // ['id', 'name', 'age']
439
+
440
+ while (!cursor.done) {
441
+ for (const user of cursor.objs) {
442
+ console.log(user.name);
443
+ }
444
+ }
445
+ // Cursor auto-finalized when exhausted
446
+
447
+ // For single row, don't use cursor:
448
+ const count = db.exec('SELECT COUNT(*) FROM users').get.first;
449
+
450
+ // Mix formats (each .rows/.objs/.flat fetches next batch)
451
+ const cursor2 = db.exec('SELECT * FROM logs').cursor(50);
452
+ const first50asArrays = cursor2.rows;
453
+ const next50asObjects = cursor2.objs;
454
+ // Keep fetching until cursor.done...
455
+ ```
456
+
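Because every `.rows`/`.objs`/`.flat` access consumes a batch, callers who want plain row-by-row iteration may prefer a small adapter. The generator below is a hypothetical helper (`eachRow` is not part of the proposed API); it assumes only the `.done` getter and the batch getters described above.

```javascript
// Hypothetical helper: flattens the batch-oriented cursor into a row-by-row iterator.
function* eachRow(cursor, format = 'objs') {
  while (!cursor.done) {
    // Each property access fetches the next batch, so read it exactly once per loop.
    yield* cursor[format];
  }
}
```

Usage would look like `for (const user of eachRow(cursor)) { ... }`. Breaking out of the loop early leaves the statement open until one of the auto-cleanup conditions applies.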
457
+ ---
458
+
459
+ ## Future API Exploration
460
+
461
+ Alternative designs to consider later:
462
+
463
+ ### `get.lazy` on Result
464
+
465
+ ```javascript
466
+ // Instead of db.exec().cursor(), access cursor via .get
467
+ const cursor = db.exec('SELECT * FROM users').get.lazy(100); // Returns cursor instead of all rows
468
+ // or with batchSize in spl (browser) options
469
+ const cursor = db.exec('SELECT * FROM users').get.lazy.objs;
470
+ ```
471
+
472
+ **Pros**: Consistent with existing `.get.rows`, `.get.objs` pattern
473
+ **Cons**: Result object would need reference to pending statement
474
+
475
+ ---
476
+
477
+ ## Transaction Isolation Considerations
478
+
479
+ **Important**: Multiple cursors on the same connection share transaction context.
480
+
481
+ | Scenario | Behavior |
482
+ |----------|----------|
483
+ | Cursor A reads, Cursor B reads | Both see same snapshot (within same transaction) |
484
+ | Cursor A reads, Cursor B writes | Cursor A may see B's changes (no isolation) |
485
+ | Cursor A reads, external write | Depends on transaction/WAL mode |
486
+
487
+ ### Implications
488
+
489
+ 1. **Read-only cursors** (typical use case): Safe to have multiple concurrent cursors
490
+ 2. **Mixed read/write**: Changes from one cursor visible to others immediately
491
+ 3. **No row-level locking**: SQLite uses database-level locking, not row-level
492
+
493
+ ### Recommendations
494
+
495
+ - Document that cursors share transaction state
496
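+ - For isolation from writers on other connections, wrap cursor use in an explicit transaction, e.g. `BEGIN IMMEDIATE` before opening cursors; writes made on the same connection remain visible to open cursors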
497
+ - Consider adding `cursor({ readonly: true })` option for future optimization
498
+ - WAL mode provides better concurrent read behavior
499
+
500
+ ---
501
+
502
+ ## Resource Lifecycle & Cleanup
503
+
504
+ This is a **critical design concern**, especially for the browser where we cannot rely on garbage collection to clean up worker-side resources.
505
+
506
+ ### The Core Problem
507
+
508
+ ```
509
+ ┌─────────────────────┐ ┌─────────────────────┐
510
+ │ Main Thread │ │ Worker │
511
+ ├─────────────────────┤ ├─────────────────────┤
512
+ │ │ │ │
513
+ │ cursor proxy ──────┼─── messages ──────►│ sqlite3_stmt │
514
+ │ (JS object) │ │ (real resource) │
515
+ │ │ │ │
516
+ │ goes out of │ │ ??? never │
517
+ │ scope... GC'd │ │ finalized │
518
+ │ │ │ MEMORY LEAK │
519
+ └─────────────────────┘ └─────────────────────┘
520
+ ```
521
+
522
+ The worker has **no way to know** when the main thread's proxy object is garbage collected.
523
+
524
+ ### Eager vs Lazy Resource Ownership
525
+
526
+ | Mode | Where Data Lives | Worker State After Return | Leak Risk |
527
+ |------|------------------|---------------------------|-----------|
528
+ | **Eager** (`get.rows`) | Main thread (transferred) | None | ✅ No leak |
529
+ | **Lazy** (cursor) | Worker (stmt open) | Holds `sqlite3_stmt` | ❌ Leaks if not drained |
530
+
531
+ With eager fetching, data is transferred to main thread and worker holds nothing. With lazy cursors, the worker retains the open statement until explicitly finalized or exhausted.
532
+
533
+ ### Possible Solutions
534
+
535
+ #### 1. Single Cursor/Result Per DB (Simplest) ⭐
536
+
537
+ ```javascript
538
+ // Only ONE pending statement allowed per db
539
+ // New exec() auto-finalizes previous
540
+ const cursor1 = db.exec('SELECT * FROM a').cursor();
541
+ const cursor2 = db.exec('SELECT * FROM b').cursor(); // cursor1 auto-finalized
542
+ ```
543
+
544
+ | Pros | Cons |
545
+ |------|------|
546
+ | No reference counting needed | Can't iterate two result sets simultaneously |
547
+ | No cursor IDs to manage | |
548
+ | Trivial cleanup logic | |
549
+ | Matches 99% of real use cases | |
550
+
551
+ **Implementation**: Single `_pendingStmt` per db. New `exec()` calls `sqlite3_finalize()` on previous.
552
+
553
+ #### 2. Explicit Close Required
554
+
555
+ ```javascript
556
+ const cursor = db.exec('SELECT...').cursor();
557
+ try {
558
+ while (!cursor.done) { /* ... */ }
559
+ } finally {
560
+ cursor.close(); // REQUIRED - user's responsibility
561
+ }
562
+ ```
563
+
564
+ | Pros | Cons |
565
+ |------|------|
566
+ | Deterministic cleanup | Users forget to close |
567
+ | Familiar open/close pattern | Exceptions can skip cleanup unless wrapped in `try`/`finally` |
568
+ | Allows multiple concurrent cursors | No enforcement mechanism |
569
+
570
+ #### 3. Callback/Scoped Pattern
571
+
572
+ ```javascript
573
+ db.withCursor('SELECT...', [params], (cursor) => {
574
+ while (!cursor.done) {
575
+ console.log(cursor.objs);
576
+ }
577
+ }); // Auto-finalized after callback returns
578
+ ```
579
+
580
+ | Pros | Cons |
581
+ |------|------|
582
+ | Guaranteed cleanup | Callback style awkward |
583
+ | Works with sync and async | Nesting hell with multiple cursors |
584
+ | Clear ownership boundaries | Doesn't compose well with async/await |
585
+
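A scoped helper along these lines could cover both sync and async callbacks. This is a sketch under assumptions: it is written as a free function taking `db`, it uses the Option A signature `db.cursor(sql, params)`, and it assumes the cursor exposes a `close()` method, which the design above has not settled.

```javascript
// Hypothetical helper; cursor.close() is an assumed method, not a decided API.
const withCursor = (db, sql, params, callback) => {
  const cursor = db.cursor(sql, params);
  let result;
  try {
    result = callback(cursor);
  } catch (err) {
    cursor.close(); // Sync callback threw: clean up, then rethrow
    throw err;
  }
  if (result && typeof result.then === 'function') {
    // Async callback: close once the promise settles, whether it resolves or rejects
    return result.finally(() => cursor.close());
  }
  cursor.close(); // Sync callback returned normally
  return result;
};
```

The cursor is closed exactly once on every path, and the callback's return value (or promise) is passed through to the caller.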
586
+ #### 4. FinalizationRegistry (Browser Only)
587
+
588
+ **Note**: Only relevant for browser implementation where main thread cannot directly access worker-side resources. In Node.js, resources are in the same thread and can be cleaned up directly.
589
+
590
+ ```javascript
591
+ // Browser main thread
592
+ const registry = new FinalizationRegistry((cursorId) => {
593
+ worker.postMessage({ fn: 'cursor.finalize', id: cursorId });
594
+ });
595
+
596
+ function createCursorProxy(cursorId) {
597
+ const proxy = { /* cursor methods */ };
598
+ registry.register(proxy, cursorId); // Clean up when proxy is GC'd
599
+ return proxy;
600
+ }
601
+ ```
602
+
603
+ | Pros | Cons |
604
+ |------|------|
605
+ | Automatic, transparent | GC timing non-deterministic |
606
+ | No user action required | Hard to test (can't force GC) |
607
+ | Good browser support (Chrome 84+, Firefox 79+, Safari 14.1+) | Spec warns against relying on it for "correctness" |
608
+ | Page close is fine (worker dies anyway, OS reclaims) | |
609
+
610
+ **When FinalizationRegistry is appropriate for this use case**:
611
+
612
+ The spec warning ("don't rely on this for correctness") applies when cleanup *must* happen before some operation. For cursor cleanup, we only need *eventual* cleanup to prevent unbounded memory growth:
613
+
614
+ | Scenario | Outcome | Problem? |
615
+ |----------|---------|----------|
616
+ | Cursor abandoned, page open | GC eventually runs, finalizer cleans up | ✅ Works |
617
+ | Cursor abandoned, page closed | Worker terminated, OS reclaims | ✅ No leak |
618
+ | Long pause before GC | Memory temporarily higher | ⚠️ Minor, transient |
619
+
620
+ **Verdict**: Suitable as a **safety net** for browser multi-cursor support, not as the sole cleanup mechanism. Pair with optional explicit `close()` for users who need deterministic cleanup.
621
+
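Pairing the registry with an explicit `close()` might look like the sketch below. `createCursorRegistry` and `postToWorker` are hypothetical names; the `cursor.finalize` message follows the example earlier in this section. `FinalizationRegistry.prototype.register` accepts an optional third argument, an unregister token, which is used here so that an explicit close cancels the safety net.

```javascript
// Hypothetical sketch: explicit close() plus FinalizationRegistry as a fallback.
const createCursorRegistry = (postToWorker) => {
  const registry = new FinalizationRegistry((cursorId) => {
    // Runs some time after an unclosed proxy has been garbage collected
    postToWorker({ fn: 'cursor.finalize', id: cursorId });
  });

  return (cursorId) => {
    let closed = false;
    const proxy = {
      get closed() { return closed; },
      close() {
        if (closed) return; // Idempotent
        closed = true;
        registry.unregister(proxy); // Safety net no longer needed
        postToWorker({ fn: 'cursor.finalize', id: cursorId });
      },
    };
    // The held value is only the id; the proxy doubles as its own unregister token.
    registry.register(proxy, cursorId, proxy);
    return proxy;
  };
};
```

The registry callback must not capture the proxy itself, otherwise the proxy would never become collectable; here it only sees the numeric id.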
622
+ #### 5. Max Cursors + LRU Eviction
623
+
624
+ ```javascript
625
+ // Worker maintains: Map<cursorId, { stmt, lastAccess, dbId }>
626
+ // When cursors.size > MAX_CURSORS, finalize least recently used
627
+ const MAX_CURSORS_PER_DB = 10;
628
+ ```
629
+
630
+ | Pros | Cons |
631
+ |------|------|
632
+ | Bounded memory usage | Surprising behavior (cursor silently invalidated) |
633
+ | Handles forgetful users | Added complexity |
634
+ | Configurable limits | Arbitrary limit choice |
635
+
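A worker-side table with LRU eviction could be sketched as below. The names (`createCursorTable`, `open`, `touch`) are hypothetical. The sketch relies on the fact that a JavaScript `Map` iterates in insertion order, so re-inserting an entry on access keeps the least recently used cursor at the front.

```javascript
// Hypothetical worker-side cursor table with least-recently-used eviction.
const createCursorTable = (maxCursors, finalize) => {
  const cursors = new Map(); // cursorId -> stmt, oldest access first

  // Look up a cursor and mark it as most recently used.
  const touch = (cursorId) => {
    if (!cursors.has(cursorId)) {
      throw new Error(`Cursor ${cursorId} is closed or was evicted`);
    }
    const stmt = cursors.get(cursorId);
    cursors.delete(cursorId);
    cursors.set(cursorId, stmt);
    return stmt;
  };

  // Register a new cursor, finalizing the oldest ones if over the limit.
  const open = (cursorId, stmt) => {
    cursors.set(cursorId, stmt);
    while (cursors.size > maxCursors) {
      const [oldestId, oldestStmt] = cursors.entries().next().value;
      cursors.delete(oldestId);
      finalize(oldestStmt);
    }
  };

  return { open, touch, size: () => cursors.size };
};
```

Throwing from `touch()` turns the "silently invalidated" downside into an explicit error on the next access.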
636
+ #### 6. Timeout-Based Cleanup
637
+
638
+ ```javascript
639
+ // Worker: if cursor not accessed for N seconds, auto-finalize
640
+ const CURSOR_IDLE_TIMEOUT_MS = 30000;
641
+ ```
642
+
643
+ | Pros | Cons |
644
+ |------|------|
645
+ | Handles abandoned cursors | Long iterations might timeout |
646
+ | No user action required | Unpredictable for users |
647
+ | Bounded leak duration | Adds timer complexity |
648
+
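Idle-timeout cleanup can be kept testable by passing the clock in rather than reading it inside the logic. In this sketch, `createIdleSweeper`, `touch`, and `sweep` are hypothetical names; a worker would presumably call `sweep()` from a periodic timer.

```javascript
// Hypothetical idle sweeper: finalize cursors not accessed within timeoutMs.
const createIdleSweeper = (timeoutMs, finalize) => {
  const lastAccess = new Map(); // cursorId -> last access time in ms

  return {
    // Record an access; call on every fetch for that cursor.
    touch: (cursorId, now = Date.now()) => {
      lastAccess.set(cursorId, now);
    },
    // Finalize every cursor idle for at least timeoutMs; returns the evicted ids.
    sweep: (now = Date.now()) => {
      const evicted = [];
      for (const [cursorId, seen] of lastAccess) {
        if (now - seen >= timeoutMs) {
          lastAccess.delete(cursorId); // Deleting during Map iteration is safe
          finalize(cursorId);
          evicted.push(cursorId);
        }
      }
      return evicted;
    },
  };
};
```

Refreshing the timestamp on each fetch means a long iteration only times out if the consumer pauses longer than the limit between batches.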
649
+ ### Recommended Approach: Layered Strategy
650
+
651
+ #### Layer 1: Single Pending Statement Per DB (Default)
652
+
653
+ For the initial implementation, enforce **one pending statement per db connection**:
654
+
655
+ ```javascript
656
+ // This "just works" - at most one statement is ever open per db
657
+ const cursor = db.exec('SELECT * FROM users').cursor();
658
+ while (!cursor.done) { ... }
659
+
660
+ // Or if user forgets to drain:
661
+ db.exec('SELECT * FROM a').cursor(); // opens cursor
662
+ db.exec('SELECT * FROM b'); // auto-closes previous, no leak
663
+ db.close(); // also auto-closes any pending
664
+ ```
665
+
666
+ **Why this works**:
667
+ - Matches how most users actually query (one at a time)
668
+ - Zero reference counting complexity
669
+ - Leak is bounded to one statement per db - the next `exec()` or `db.close()` cleans up the previous one
670
+ - No IDs needed in worker (just one `_pendingStmt` per db)
671
+
672
+ #### Layer 2: Explicit Multi-Cursor (Future, Opt-in)
673
+
674
+ If users genuinely need multiple concurrent cursors, require explicit lifecycle management:
675
+
676
+ ```javascript
677
+ // Opt-in to complexity via named cursors
678
+ const cursor1 = db.exec('SELECT * FROM a').cursor({ name: 'cur1' });
679
+ const cursor2 = db.exec('SELECT * FROM b').cursor({ name: 'cur2' });
680
+ // Both remain open until explicitly closed or db.close()
681
+ cursor1.close();
682
+ cursor2.close();
683
+ ```
684
+
685
+ Or use the callback pattern for guaranteed cleanup:
686
+
687
+ ```javascript
688
+ await db.withCursor('SELECT * FROM users', [], async (cursor) => {
689
+ while (!cursor.done) {
690
+ await processBatch(cursor.objs);
691
+ }
692
+ }); // Cursor guaranteed finalized here
693
+ ```
694
+
695
+ #### Browser Implementation: FinalizationRegistry as Safety Net
696
+
697
+ For browser multi-cursor support, use FinalizationRegistry as a fallback for cursors that aren't explicitly closed:
698
+
699
+ ```javascript
700
+ // Browser: belt-and-suspenders approach
701
+ const cursor = db.exec('SELECT...').cursor();
702
+
703
+ // Option A: Rely on FinalizationRegistry (automatic, eventual)
704
+ // When cursor proxy is GC'd, worker receives cleanup message
705
+
706
+ // Option B: Explicit close (deterministic, immediate)
707
+ cursor.close();
708
+
709
+ // Both work. Explicit close is faster, but forgotten cursors
710
+ // still get cleaned up eventually via FinalizationRegistry.
711
+ ```
712
+
713
+ This doesn't apply to Node.js where resources are in the same thread and cleanup is straightforward.
714
+
715
+ ### Browser-Specific Considerations
716
+
717
+ | Event | Cleanup Behavior |
718
+ |-------|------------------|
719
+ | `db.close()` | Finalize all pending statements for that db |
720
+ | New `exec()` on same db | Finalize previous pending statement |
721
+ | Cursor exhausted (`.done` = true) | Auto-finalize |
722
+ | Worker terminated | OS reclaims memory (not a leak, but abrupt) |
723
+ | Page unload | Worker terminated, see above |
724
+ | Tab backgrounded | No automatic cleanup (cursors remain open) |
725
+
726
+ ### Questions to Resolve Before Implementation
727
+
728
+ 1. **Do we ever need multiple concurrent cursors per db?**
729
+ - If no → Single-cursor-per-db is sufficient, dramatically simpler
730
+ - If yes → Need explicit lifecycle (close/callback) or accept leak risk
731
+
732
+ 2. **Should eager Results also become "pending" with lazy exec (Option C)?**
733
+ - Currently: `exec()` fetches all, returns immediately
734
+ - With Option C: `exec()` steps once, `.get` fetches rest
735
+ - This means even eager access leaves state until `.get` is called
736
+
737
+ 3. **Error on cursor access after invalidation?**
738
+ ```javascript
739
+ const cursor = db.exec('SELECT...').cursor();
740
+ db.exec('SELECT...'); // cursor invalidated
741
+ cursor.rows; // Error? Empty array? Undefined behavior?
742
+ ```
743
+ Recommendation: Throw a clear error such as "Cursor invalidated by subsequent exec()"
744
+
745
+ 4. **Should `cursor.close()` exist even in single-cursor mode?**
746
+ - Pro: Explicit cleanup without needing another exec()
747
+ - Pro: Familiar pattern for users from other libraries
748
+ - Con: Suggests close() is required (it isn't in single-cursor mode)
749
+
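For question 3, one way to raise the recommended error is a per-db generation counter: every `exec()` bumps it, and each cursor remembers the generation it was created in. The sketch below is hypothetical (`createDbState`, `bump`, and `guard` are illustrative names), and it assumes the single-cursor-per-db model.

```javascript
// Hypothetical sketch: a generation counter that invalidates older cursors.
const createDbState = () => {
  let generation = 0;
  return {
    // Call at the start of every exec(); cursors from earlier generations become stale.
    bump: () => ++generation,
    // Call when creating a cursor; returns a check to run before every fetch.
    guard: () => {
      const mine = generation;
      return () => {
        if (mine !== generation) {
          throw new Error('Cursor invalidated by subsequent exec()');
        }
      };
    },
  };
};
```

A cursor would call the returned check at the top of its batch fetch, so a stale cursor fails loudly instead of stepping a finalized statement.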
750
+ ---
751
+
752
+ ## Implementation Checklist
753
+
754
+ *Note: Checklist depends on which API option is chosen (A, B, or C).*
755
+
756
+ ### Core (spl.js) - Sync API for Node.js
757
+
758
+ **For Option C (Python-style lazy exec):**
759
+ - [ ] Modify `exec()` to step ONCE and detect SQLITE_ROW vs SQLITE_DONE
760
+ - [ ] Add `_pendingStmt`, `_pendingColumns`, `_firstRow` state variables
761
+ - [ ] Modify `.get` getter to consume pending statement lazily
762
+ - [ ] Implement `cursor(batchSize)` method that returns cursor object
763
+ - [ ] Implement `createCursor()` with `.cols`, `.rows`, `.objs`, `.flat`, `.done` getters
764
+ - [ ] Auto-finalize cursor when exhausted
765
+ - [ ] Finalize pending statement on new `exec()` call
766
+ - [ ] Finalize all pending statements on `db.close()`
767
+
768
+ **For Options A or B:**
769
+ - [ ] Implement `db.cursor()` method (Option A) or `exec(..., {cursor})` flag (Option B)
770
+ - [ ] Implement `createCursor()` with getters
771
+ - [ ] Handle cleanup scenarios
772
+
773
+ ### Worker (spl-worker.js) - Bridge
774
+ - [ ] Add cursor state management (Map of cursor ID → cursor object)
775
+ - [ ] Add `cursor.cols` handler (returns column names)
776
+ - [ ] Add `cursor.fetch` handler (returns next batch as rows)
777
+ - [ ] Add `cursor.done` handler (returns exhaustion status)
778
+ - [ ] Handle format conversion (rows/objs/flat) on worker side
779
+
780
+ ### Browser (spl-web.js) - Async API
781
+ - [ ] Wrap cursor operations in Thenable pattern
782
+ - [ ] Implement cursor proxy with async getters
783
+ - [ ] Add transferables support for ArrayBuffer columns in batches
784
+
785
+ ### Testing & Documentation
786
+ - [ ] Write tests for cursor iteration until done
787
+ - [ ] Test auto-cleanup when exhausted
788
+ - [ ] Test cleanup on new exec() call
789
+ - [ ] Test cleanup on db.close()
790
+ - [ ] Test transaction isolation behavior
791
+ - [ ] Document cursor API in README
792
+ - [ ] Document auto-cleanup behavior