spl.js 1.0.0 → 1.0.1
- package/.TODO.md +792 -0
- package/.claude/settings.local.json +6 -1
- package/Makefile +3 -1
- package/dist/index.js +1 -1
- package/dist/index.mjs +1 -1
- package/dist/index.wasm +0 -0
- package/doc/spatialite_functions.md +110 -126
- package/package.json +1 -1
- package/test/node.js +58 -1
- package/test/test_function_list.js +63 -0
package/.TODO.md
ADDED
|
@@ -0,0 +1,792 @@
|
|
|
1
|
+
# Cursor Implementation for spl.js
|
|
2
|
+
|
|
3
|
+
## Background: SQLite Cursors
|
|
4
|
+
|
|
5
|
+
### SQLite C API
|
|
6
|
+
|
|
7
|
+
SQLite doesn't expose a separate "cursor" object. Instead, the **prepared statement** (`sqlite3_stmt`) acts as the cursor:
|
|
8
|
+
|
|
9
|
+
- `sqlite3_prepare_v2()` - Compile SQL into a prepared statement
|
|
10
|
+
- `sqlite3_step()` - Advance to next row (returns `SQLITE_ROW` or `SQLITE_DONE`)
|
|
11
|
+
- `sqlite3_column_*()` - Extract column values from current row
|
|
12
|
+
- `sqlite3_reset()` - Rewind cursor to beginning for re-execution
|
|
13
|
+
- `sqlite3_finalize()` - Destroy statement and free resources
|
|
14
|
+
|
|
15
|
+
### Advantages of Cursors
|
|
16
|
+
|
|
17
|
+
| Advantage | Description |
|
|
18
|
+
|-----------|-------------|
|
|
19
|
+
| Memory efficient | Rows fetched one-at-a-time, no need to load entire result set |
|
|
20
|
+
| Streaming | Process large datasets without memory exhaustion |
|
|
21
|
+
| Reusable | Can bind new parameters and re-execute after `sqlite3_reset()` |
|
|
22
|
+
| Multiple concurrent | Many cursors can operate on same table independently |
|
|
23
|
+
| Fast seeking | Internal B-tree cursors enable efficient key-based lookups |
|
|
24
|
+
|
|
25
|
+
### Disadvantages of Cursors
|
|
26
|
+
|
|
27
|
+
| Disadvantage | Description |
|
|
28
|
+
|--------------|-------------|
|
|
29
|
+
| Manual lifecycle | Must finalize or risk memory leaks |
|
|
30
|
+
| No row locking | Other operations can modify data under an active cursor |
|
|
31
|
+
| Forward-only by default | Standard iteration is sequential |
|
|
32
|
+
| Statement caching | Reusing statements requires careful reset management |
|
|
33
|
+
| Connection affinity | Cursors tied to their database connection |
|
|
34
|
+
|
|
35
|
+
### References
|
|
36
|
+
|
|
37
|
+
- [SQLite C/C++ Interface Introduction](https://sqlite.org/cintro.html)
|
|
38
|
+
- [sqlite3_step() Documentation](https://sqlite.org/c3ref/step.html)
|
|
39
|
+
- [SQLite B-Tree Module](https://sqlite.org/btreemodule.html)
|
|
40
|
+
- [SQLite Bytecode Engine](https://sqlite.org/opcode.html)
|
|
41
|
+
|
|
42
|
+
---
|
|
43
|
+
|
|
44
|
+
## Current spl.js Architecture
|
|
45
|
+
|
|
46
|
+
### Message Protocol (Main Thread -> Worker)
|
|
47
|
+
|
|
48
|
+
```javascript
|
|
49
|
+
// Single operation
|
|
50
|
+
{
|
|
51
|
+
__id__: number, // Unique message ID for response matching
|
|
52
|
+
fn: string, // Function name (e.g., 'db.exec', 'fs.file')
|
|
53
|
+
id: number, // Database instance ID
|
|
54
|
+
args: any[] // Arguments to pass
|
|
55
|
+
}
|
|
56
|
+
|
|
57
|
+
// Batch operations
|
|
58
|
+
{
|
|
59
|
+
__id__: number,
|
|
60
|
+
fn: [
|
|
61
|
+
{ id, fn, args }, // Array of multiple operations
|
|
62
|
+
...
|
|
63
|
+
]
|
|
64
|
+
}
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
### Response Protocol (Worker -> Main Thread)
|
|
68
|
+
|
|
69
|
+
```javascript
|
|
70
|
+
{
|
|
71
|
+
__id__: number, // Echoes the request's __id__
|
|
72
|
+
res: any, // Result/response data
|
|
73
|
+
err: string // Error message if execution failed
|
|
74
|
+
}
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
### Key Architectural Patterns
|
|
78
|
+
|
|
79
|
+
1. **Promise Queue**: Each message gets a unique `__id__` stored in a queue; the response is matched back to its pending promise by that `__id__`
|
|
80
|
+
2. **Transferables**: Large ArrayBuffers transferred by reference, not copied
|
|
81
|
+
3. **Batch Operations**: Multiple SQL operations sent in single message via async generator
|
|
82
|
+
4. **Thenable Chaining**: Synchronous API stacks operations, sends in batch when awaited
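The promise-queue pattern (item 1) can be sketched roughly as follows. This is an illustration under stated assumptions, not the spl.js source: `createClient`, `pending`, and `pendingCount` are hypothetical names, and the worker is replaced by a plain callback so the snippet runs standalone.

```javascript
// Hypothetical sketch of request/response matching via __id__.
// `post` stands in for worker.postMessage; `onMessage` is what the
// worker's message event handler would call with the response payload.
function createClient(post) {
  let nextId = 0;
  const pending = new Map(); // __id__ -> { resolve, reject }

  const send = (fn, id, args) => new Promise((resolve, reject) => {
    const __id__ = ++nextId;
    pending.set(__id__, { resolve, reject });
    post({ __id__, fn, id, args });
  });

  const onMessage = ({ __id__, res, err }) => {
    const entry = pending.get(__id__);
    if (!entry) return; // unknown or already settled message
    pending.delete(__id__);
    if (err) entry.reject(new Error(err));
    else entry.resolve(res);
  };

  return { send, onMessage, pendingCount: () => pending.size };
}

// In-process stand-in for the worker: posted messages are just collected.
const outbox = [];
const client = createClient((msg) => outbox.push(msg));
```

Because responses are looked up by `__id__`, they can arrive in any order without being attributed to the wrong caller.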
|
|
83
|
+
|
|
84
|
+
---
|
|
85
|
+
|
|
86
|
+
## Research: How Other Libraries Handle This
|
|
87
|
+
|
|
88
|
+
### Python sqlite3 (DB-API 2.0)
|
|
89
|
+
|
|
90
|
+
```python
|
|
91
|
+
# Create cursor from connection
|
|
92
|
+
cur = con.cursor()
|
|
93
|
+
|
|
94
|
+
# execute() prepares but does NOT fetch all rows
|
|
95
|
+
cur.execute("SELECT * FROM users WHERE age > ?", (18,))
|
|
96
|
+
|
|
97
|
+
# User chooses fetch strategy AFTER execute:
|
|
98
|
+
row = cur.fetchone() # Single row
|
|
99
|
+
rows = cur.fetchmany(100) # Batch of 100
|
|
100
|
+
rows = cur.fetchall() # All at once
|
|
101
|
+
|
|
102
|
+
# Or iterate directly (cursor is iterable)
|
|
103
|
+
for row in cur.execute("SELECT * FROM users"):
|
|
104
|
+
print(row)
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
**Key behavior discovered in [CPython source](https://github.com/python/cpython/blob/main/Modules/_sqlite/cursor.c):**
|
|
108
|
+
|
|
109
|
+
```c
|
|
110
|
+
// In _pysqlite_query_execute():
|
|
111
|
+
rc = stmt_step(self->statement->st); // Step ONCE
|
|
112
|
+
|
|
113
|
+
if (rc == SQLITE_ROW) {
|
|
114
|
+
// SELECT with results → STOP HERE
|
|
115
|
+
// Statement left "open", positioned at first row
|
|
116
|
+
}
|
|
117
|
+
else if (rc == SQLITE_DONE) {
|
|
118
|
+
// INSERT/UPDATE/DELETE or empty SELECT
|
|
119
|
+
// Reset statement immediately
|
|
120
|
+
stmt_reset(self->statement);
|
|
121
|
+
}
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
**Python steps ONCE** - just enough to determine if there's data:
|
|
125
|
+
- `SQLITE_ROW` → SELECT has data, statement stays open for lazy fetch
|
|
126
|
+
- `SQLITE_DONE` → INSERT/UPDATE/DELETE executed immediately, or empty SELECT
|
|
127
|
+
|
|
128
|
+
### better-sqlite3 (Node.js)
|
|
129
|
+
|
|
130
|
+
```javascript
|
|
131
|
+
// Prepare statement from db
|
|
132
|
+
const stmt = db.prepare('SELECT * FROM users WHERE age > ?');
|
|
133
|
+
|
|
134
|
+
// User chooses execution method:
|
|
135
|
+
const row = stmt.get(18); // First row only
|
|
136
|
+
const rows = stmt.all(18); // All rows at once
|
|
137
|
+
for (const row of stmt.iterate(18)) { // Iterate one-by-one
|
|
138
|
+
console.log(row);
|
|
139
|
+
}
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
**Key insight**: `.prepare()` creates statement, then user chooses `.all()` vs `.iterate()`.
|
|
143
|
+
|
|
144
|
+
### Pattern Comparison
|
|
145
|
+
|
|
146
|
+
| Library | Prepare | Get All | Iterate/Stream |
|
|
147
|
+
|---------|---------|---------|----------------|
|
|
148
|
+
| **Python sqlite3** | `cur.execute(sql)` | `cur.fetchall()` | `for row in cur:` |
|
|
149
|
+
| **better-sqlite3** | `db.prepare(sql)` | `stmt.all()` | `stmt.iterate()` |
|
|
150
|
+
| **spl.js current** | `db.exec(sql)` | `db.get.rows` | ❌ N/A |
|
|
151
|
+
|
|
152
|
+
**Both libraries separate preparation from fetching strategy.**
|
|
153
|
+
|
|
154
|
+
### References
|
|
155
|
+
|
|
156
|
+
- [Python sqlite3 Documentation](https://docs.python.org/3/library/sqlite3.html)
|
|
157
|
+
- [CPython sqlite3 cursor.c source](https://github.com/python/cpython/blob/main/Modules/_sqlite/cursor.c)
|
|
158
|
+
- [better-sqlite3 API](https://github.com/WiseLibs/better-sqlite3/blob/master/docs/api.md)
|
|
159
|
+
|
|
160
|
+
---
|
|
161
|
+
|
|
162
|
+
## API Design Options (Undecided)
|
|
163
|
+
|
|
164
|
+
### The `exec().cursor()` Problem
|
|
165
|
+
|
|
166
|
+
**Issue**: In synchronous code, `exec()` cannot know if `.cursor()` will be called after it returns.
|
|
167
|
+
|
|
168
|
+
Current spl.js `exec()` behavior:
|
|
169
|
+
1. Prepares statement
|
|
170
|
+
2. Binds parameters
|
|
171
|
+
3. **Steps through ALL rows** (loads entire result into memory)
|
|
172
|
+
4. Finalizes statement
|
|
173
|
+
5. Returns `_db` for chaining
|
|
174
|
+
|
|
175
|
+
When `cursor()` is then called, the data is **already fully loaded in memory**, defeating the purpose of streaming/cursor-based iteration.
|
|
176
|
+
|
|
177
|
+
---
|
|
178
|
+
|
|
179
|
+
### Option A: `db.cursor(sql, params, batchSize?)`
|
|
180
|
+
|
|
181
|
+
Separate method, bypasses `exec()` entirely.
|
|
182
|
+
|
|
183
|
+
```javascript
|
|
184
|
+
// Node.js (sync)
|
|
185
|
+
const cursor = db.cursor('SELECT * FROM users WHERE age > ?', [18], 100);
|
|
186
|
+
|
|
187
|
+
console.log(cursor.cols); // ['id', 'name', 'age']
|
|
188
|
+
|
|
189
|
+
while (!cursor.done) {
|
|
190
|
+
for (const user of cursor.objs) {
|
|
191
|
+
console.log(user.name);
|
|
192
|
+
}
|
|
193
|
+
}
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
| Pros | Cons |
|
|
197
|
+
|------|------|
|
|
198
|
+
| No wasted work | New method to learn |
|
|
199
|
+
| Clear separation of concerns | Different pattern from exec() |
|
|
200
|
+
| Single prepare, single pass | |
|
|
201
|
+
|
|
202
|
+
**Implementation**: New `db.cursor()` method, straightforward.
|
|
203
|
+
|
|
204
|
+
---
|
|
205
|
+
|
|
206
|
+
### Option B: `exec(sql, params, { cursor: true/batchSize })`
|
|
207
|
+
|
|
208
|
+
Flag on exec() to return cursor instead of db.
|
|
209
|
+
|
|
210
|
+
```javascript
|
|
211
|
+
// Node.js (sync)
|
|
212
|
+
const cursor = db.exec('SELECT * FROM users WHERE age > ?', [18], { cursor: true });
|
|
213
|
+
// or with explicit batch size:
|
|
214
|
+
const batched = db.exec('SELECT * FROM users WHERE age > ?', [18], { cursor: 100 }); // distinct name: `cursor` is already declared above
|
|
215
|
+
|
|
216
|
+
console.log(cursor.cols); // ['id', 'name', 'age']
|
|
217
|
+
|
|
218
|
+
while (!cursor.done) {
|
|
219
|
+
for (const user of cursor.objs) {
|
|
220
|
+
console.log(user.name);
|
|
221
|
+
}
|
|
222
|
+
}
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
| Pros | Cons |
|
|
226
|
+
|------|------|
|
|
227
|
+
| No wasted work - exec() knows mode upfront | Return type changes based on options |
|
|
228
|
+
| Single entry point (no new method) | Chaining stops when cursor mode (returns cursor, not _db) |
|
|
229
|
+
| No breaking change | Options parameter adds complexity |
|
|
230
|
+
| Explicit intent | |
|
|
231
|
+
|
|
232
|
+
**Implementation**:
|
|
233
|
+
```javascript
|
|
234
|
+
this.exec = (sql, par, opt = {}) => {
|
|
235
|
+
const { cursor: cursorMode } = opt;
|
|
236
|
+
const batchSize = typeof cursorMode === 'number' ? cursorMode : 100;
|
|
237
|
+
|
|
238
|
+
if (cursorMode) {
|
|
239
|
+
// Cursor mode: prepare + bind, step ONCE
|
|
240
|
+
const stmt = sqlite3_prepare_v2(...);
|
|
241
|
+
bind(stmt, par);
|
|
242
|
+
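const rc = sqlite3_step(stmt); // Step to first row; SQLITE_DONE means the result is empty
|
|
243
|
+
const columns = getColumnNames(stmt);
|
|
244
|
+
const firstRow = rc === SQLITE_ROW ? readCurrentRow(stmt) : null; // No row to read on SQLITE_DONE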
|
|
245
|
+
return createCursor(stmt, columns, firstRow, batchSize);
|
|
246
|
+
}
|
|
247
|
+
|
|
248
|
+
// Normal mode: existing behavior
|
|
249
|
+
// ... prepare, bind, step ALL, finalize ...
|
|
250
|
+
return _db;
|
|
251
|
+
};
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
---
|
|
255
|
+
|
|
256
|
+
### Option C: `exec(...).cursor()` with Lazy exec (Python-style) ⭐
|
|
257
|
+
|
|
258
|
+
Modify `exec()` to step **once** like Python - just enough to determine if there's data.
|
|
259
|
+
|
|
260
|
+
```javascript
|
|
261
|
+
// Usage stays the same:
|
|
262
|
+
db.exec('INSERT INTO users VALUES (?)', ['Alice']); // Executes immediately (DONE)
|
|
263
|
+
db.exec('SELECT * FROM users'); // Steps once, waits
|
|
264
|
+
db.get.rows; // NOW fetches all remaining
|
|
265
|
+
// OR
|
|
266
|
+
db.exec('SELECT * FROM users').cursor(100); // Returns cursor for streaming
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
**Implementation**:
|
|
270
|
+
|
|
271
|
+
```javascript
|
|
272
|
+
this.exec = (sql, par) => {
|
|
273
|
+
const stmt = sqlite3_prepare_v2(sql);
|
|
274
|
+
bind(stmt, par);
|
|
275
|
+
|
|
276
|
+
const rc = sqlite3_step(stmt); // Step ONCE
|
|
277
|
+
|
|
278
|
+
if (rc === SQLITE_DONE) {
|
|
279
|
+
// INSERT/UPDATE/DELETE or empty SELECT
|
|
280
|
+
// Executed immediately, finalize now
|
|
281
|
+
sqlite3_finalize(stmt);
|
|
282
|
+
_result = result([], []);
|
|
283
|
+
_pendingStmt = null;
|
|
284
|
+
}
|
|
285
|
+
else if (rc === SQLITE_ROW) {
|
|
286
|
+
// SELECT with data - save state, DON'T finalize
|
|
287
|
+
_pendingStmt = stmt;
|
|
288
|
+
_pendingColumns = getColumns(stmt);
|
|
289
|
+
_firstRow = readCurrentRow(stmt); // Save row we already stepped to
|
|
290
|
+
}
|
|
291
|
+
|
|
292
|
+
return _db;
|
|
293
|
+
};
|
|
294
|
+
|
|
295
|
+
// .get - consume all remaining rows (lazy)
|
|
296
|
+
Object.defineProperty(this, 'get', {
|
|
297
|
+
get: () => {
|
|
298
|
+
if (_pendingStmt) {
|
|
299
|
+
const rows = [_firstRow]; // Include first row
|
|
300
|
+
while (sqlite3_step(_pendingStmt) === SQLITE_ROW) {
|
|
301
|
+
rows.push(readCurrentRow(_pendingStmt));
|
|
302
|
+
}
|
|
303
|
+
sqlite3_finalize(_pendingStmt);
|
|
304
|
+
_result = result(_pendingColumns, rows);
|
|
305
|
+
_pendingStmt = null;
|
|
306
|
+
_firstRow = null;
|
|
307
|
+
}
|
|
308
|
+
return _result;
|
|
309
|
+
}
|
|
310
|
+
});
|
|
311
|
+
|
|
312
|
+
// .cursor() - return iterator using the open statement
|
|
313
|
+
this.cursor = (batchSize = 100) => {
|
|
314
|
+
if (!_pendingStmt) {
|
|
315
|
+
throw new Error('No pending SELECT or already consumed');
|
|
316
|
+
}
|
|
317
|
+
const cursor = createCursor(_pendingStmt, _pendingColumns, _firstRow, batchSize);
|
|
318
|
+
_pendingStmt = null; // Cursor now owns the statement
|
|
319
|
+
_firstRow = null;
|
|
320
|
+
return cursor;
|
|
321
|
+
};
|
|
322
|
+
```
|
|
323
|
+
|
|
324
|
+
| Pros | Cons |
|
|
325
|
+
|------|------|
|
|
326
|
+
| `exec().cursor()` works efficiently | Subtle behavior change (lazy fetch) |
|
|
327
|
+
| Matches Python's proven pattern | `.get` must be called to consume results |
|
|
328
|
+
| INSERT/UPDATE/DELETE still immediate | Can only call `.get` OR `.cursor()`, not both |
|
|
329
|
+
| No new methods needed | Need to handle "dangling" statement if neither called |
|
|
330
|
+
| Single pass, no wasted work | |
|
|
331
|
+
|
|
332
|
+
**Edge cases to handle**:
|
|
333
|
+
- What if user calls `exec()` twice without consuming first? → Finalize previous `_pendingStmt`
|
|
334
|
+
- What if `db.close()` called with pending statement? → Finalize in close()
|
|
335
|
+
- Empty SELECT (no rows)? → `SQLITE_DONE` on first step, same as INSERT
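The three edge cases above can be sketched with an in-memory stand-in for the prepared statement. Everything here is hypothetical scaffolding (`FakeStmt`, `createDb`, `hasPending`); only the result-code values 100 and 101 are SQLite's real `SQLITE_ROW` and `SQLITE_DONE`.

```javascript
const SQLITE_ROW = 100;
const SQLITE_DONE = 101;

// Hypothetical stand-in for sqlite3_stmt: steps over an array of rows and
// records whether it has been finalized.
class FakeStmt {
  constructor(rows) { this.rows = rows; this.pos = -1; this.finalized = false; }
  step() {
    if (this.finalized) throw new Error('statement already finalized');
    this.pos += 1;
    return this.pos < this.rows.length ? SQLITE_ROW : SQLITE_DONE;
  }
  finalize() { this.finalized = true; }
}

// `prepare` is an assumed factory: sql -> statement.
function createDb(prepare) {
  let pendingStmt = null;

  const dropPending = () => {
    if (pendingStmt) pendingStmt.finalize();
    pendingStmt = null;
  };

  const db = {
    exec(sql) {
      dropPending();                 // exec() twice: finalize the previous statement
      const stmt = prepare(sql);
      if (stmt.step() === SQLITE_ROW) {
        pendingStmt = stmt;          // keep open for .get or .cursor()
      } else {
        stmt.finalize();             // empty SELECT: same path as INSERT
      }
      return db;
    },
    close() { dropPending(); },      // close() with a pending statement
    hasPending: () => pendingStmt !== null,
  };
  return db;
}
```

Routing all three cases through one `dropPending()` helper keeps the "at most one open statement" invariant in a single place.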
|
|
336
|
+
|
|
337
|
+
---
|
|
338
|
+
|
|
339
|
+
### Comparison Summary
|
|
340
|
+
|
|
341
|
+
| Aspect | Option A | Option B | Option C |
|
|
342
|
+
|--------|----------|----------|----------|
|
|
343
|
+
| API | `db.cursor()` | `exec(..., {cursor})` | `exec().cursor()` |
|
|
344
|
+
| Efficiency | ✅ Single pass | ✅ Single pass | ✅ Single pass |
|
|
345
|
+
| Breaking change | No | No | Subtle (lazy `.get`) |
|
|
346
|
+
| New method | Yes | No | No |
|
|
347
|
+
| Return type | New type | Conditional | Consistent |
|
|
348
|
+
| Chaining after | N/A | Stops | `.get` or `.cursor()` |
|
|
349
|
+
| Matches Python | Partially | No | ✅ Yes |
|
|
350
|
+
|
|
351
|
+
---
|
|
352
|
+
|
|
353
|
+
## Cursor Object Design
|
|
354
|
+
|
|
355
|
+
The cursor mirrors Result's API but fetches lazily in batches.
|
|
356
|
+
|
|
357
|
+
### Result vs Cursor Comparison
|
|
358
|
+
|
|
359
|
+
| Property | Result | Cursor |
|
|
360
|
+
|----------|--------|--------|
|
|
361
|
+
| `.cols` | Getter (all columns) | Getter (all columns) |
|
|
362
|
+
| `.rows` | Getter (all data) | Getter (next batch as arrays) |
|
|
363
|
+
| `.objs` | Getter (all data) | Getter (next batch as objects) |
|
|
364
|
+
| `.flat` | Getter (all data) | Getter (next batch flattened) |
|
|
365
|
+
| `.first` | Getter (first row) | N/A (use `exec().get.first` instead) |
|
|
366
|
+
| `.done` | N/A | Getter (true if exhausted) |
|
|
367
|
+
|
|
368
|
+
### Auto-Cleanup Behavior (Important for Documentation)
|
|
369
|
+
|
|
370
|
+
Cursors are **automatically finalized** in these scenarios:
|
|
371
|
+
|
|
372
|
+
1. **Cursor exhausted** - When `.done` becomes true (all rows consumed)
|
|
373
|
+
2. **New `exec()` called** - Finalizes any previous pending statement
|
|
374
|
+
3. **`db.close()` called** - Finalizes all open cursors/statements
|
|
375
|
+
|
|
376
|
+
**No explicit `close()` or `reset()` needed.** Users can simply:
|
|
377
|
+
- Iterate until done (auto-cleanup)
|
|
378
|
+
- Call `exec()` again (previous cursor cleaned up)
|
|
379
|
+
- Let cursor go out of scope after `exec()` replaces it
|
|
380
|
+
|
|
381
|
+
### Implementation
|
|
382
|
+
|
|
383
|
+
```javascript
|
|
384
|
+
const createCursor = (stmt, columns, firstRow, batchSize) => {
|
|
385
|
+
let _done = false;
|
|
386
|
+
let _firstRow = firstRow; // Already stepped to first row in exec()
|
|
387
|
+
|
|
388
|
+
const fetchBatch = () => {
|
|
389
|
+
if (_done) return [];
|
|
390
|
+
|
|
391
|
+
const rows = [];
|
|
392
|
+
|
|
393
|
+
// Include first row if we have it
|
|
394
|
+
if (_firstRow) {
|
|
395
|
+
rows.push(_firstRow);
|
|
396
|
+
_firstRow = null;
|
|
397
|
+
}
|
|
398
|
+
|
|
399
|
+
// Fetch remaining up to batchSize
|
|
400
|
+
while (rows.length < batchSize) {
|
|
401
|
+
const rc = sqlite3_step(stmt);
|
|
402
|
+
if (rc === SQLITE_ROW) {
|
|
403
|
+
rows.push(readCurrentRow(stmt));
|
|
404
|
+
} else {
|
|
405
|
+
_done = true;
|
|
406
|
+
sqlite3_finalize(stmt); // Auto-finalize when exhausted
|
|
407
|
+
break;
|
|
408
|
+
}
|
|
409
|
+
}
|
|
410
|
+
|
|
411
|
+
return rows;
|
|
412
|
+
};
|
|
413
|
+
|
|
414
|
+
const toObj = (row) => columns.reduce((o, col, i) => {
|
|
415
|
+
o[col] = row[i];
|
|
416
|
+
return o;
|
|
417
|
+
}, {});
|
|
418
|
+
|
|
419
|
+
return Object.freeze({
|
|
420
|
+
// Metadata
|
|
421
|
+
get cols() { return columns.slice(); },
|
|
422
|
+
get done() { return _done; },
|
|
423
|
+
|
|
424
|
+
// Fetch getters - each access fetches next batch
|
|
425
|
+
get rows() { return fetchBatch(); },
|
|
426
|
+
get objs() { return fetchBatch().map(toObj); },
|
|
427
|
+
get flat() { return fetchBatch().flat(); },
|
|
428
|
+
});
|
|
429
|
+
};
|
|
430
|
+
```
|
|
431
|
+
|
|
432
|
+
### Usage Examples
|
|
433
|
+
|
|
434
|
+
```javascript
|
|
435
|
+
// Node.js (sync)
|
|
436
|
+
const cursor = db.exec('SELECT id, name, age FROM users').cursor(100);
|
|
437
|
+
|
|
438
|
+
console.log(cursor.cols); // ['id', 'name', 'age']
|
|
439
|
+
|
|
440
|
+
while (!cursor.done) {
|
|
441
|
+
for (const user of cursor.objs) {
|
|
442
|
+
console.log(user.name);
|
|
443
|
+
}
|
|
444
|
+
}
|
|
445
|
+
// Cursor auto-finalized when exhausted
|
|
446
|
+
|
|
447
|
+
// For single row, don't use cursor:
|
|
448
|
+
const count = db.exec('SELECT COUNT(*) FROM users').get.first;
|
|
449
|
+
|
|
450
|
+
// Mix formats (each .rows/.objs/.flat fetches next batch)
|
|
451
|
+
const cursor2 = db.exec('SELECT * FROM logs').cursor(50);
|
|
452
|
+
const first50asArrays = cursor2.rows;
|
|
453
|
+
const next50asObjects = cursor2.objs;
|
|
454
|
+
// Keep fetching until cursor.done...
|
|
455
|
+
```
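Since every access to `.rows`, `.objs`, or `.flat` consumes a batch, reading a getter twice in one loop iteration would silently skip data. One way to hide that is a generator wrapper, sketched below. `eachRow` and `fakeCursor` are hypothetical names, and the fake cursor only imitates the `.done`/`.rows` contract described above.

```javascript
// Hypothetical helper: turns a batch cursor (anything with `.done` and
// `.rows`) into a row-by-row iterator.
function* eachRow(cursor) {
  while (!cursor.done) {
    // Read .rows exactly once per loop; each access pulls the next batch.
    const batch = cursor.rows;
    yield* batch;
  }
}

// Stand-in cursor over an in-memory array, for demonstration only.
function fakeCursor(data, batchSize) {
  let pos = 0;
  let done = false;
  return {
    get done() { return done; },
    get rows() {
      const batch = data.slice(pos, pos + batchSize);
      pos += batch.length;
      if (batch.length < batchSize) done = true; // short batch means exhausted
      return batch;
    },
  };
}

const seen = [...eachRow(fakeCursor([[1], [2], [3], [4], [5]], 2))];
```

With such a wrapper, callers could write `for (const row of eachRow(cursor))` and never touch the batch getters directly.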
|
|
456
|
+
|
|
457
|
+
---
|
|
458
|
+
|
|
459
|
+
## Future API Exploration
|
|
460
|
+
|
|
461
|
+
Alternative designs to consider later:
|
|
462
|
+
|
|
463
|
+
### `get.lazy` on Result
|
|
464
|
+
|
|
465
|
+
```javascript
|
|
466
|
+
// Instead of db.exec().cursor(), access cursor via .get
|
|
467
|
+
const cursor = db.exec('SELECT * FROM users').get.lazy(100); // Returns cursor instead of all rows
|
|
468
|
+
// or with batchSize in spl (browser) options
|
|
469
|
+
const cursor = db.exec('SELECT * FROM users').get.lazy.objs
|
|
470
|
+
```
|
|
471
|
+
|
|
472
|
+
**Pros**: Consistent with existing `.get.rows`, `.get.objs` pattern
|
|
473
|
+
**Cons**: Result object would need reference to pending statement
|
|
474
|
+
|
|
475
|
+
---
|
|
476
|
+
|
|
477
|
+
## Transaction Isolation Considerations
|
|
478
|
+
|
|
479
|
+
**Important**: Multiple cursors on the same connection share transaction context.
|
|
480
|
+
|
|
481
|
+
| Scenario | Behavior |
|
|
482
|
+
|----------|----------|
|
|
483
|
+
| Cursor A reads, Cursor B reads | Both see same snapshot (within same transaction) |
|
|
484
|
+
| Cursor A reads, Cursor B writes | Cursor A may see B's changes (no isolation) |
|
|
485
|
+
| Cursor A reads, external write | Depends on transaction/WAL mode |
|
|
486
|
+
|
|
487
|
+
### Implications
|
|
488
|
+
|
|
489
|
+
1. **Read-only cursors** (typical use case): Safe to have multiple concurrent cursors
|
|
490
|
+
2. **Mixed read/write**: Changes from one cursor visible to others immediately
|
|
491
|
+
3. **No row-level locking**: SQLite uses database-level locking, not row-level
|
|
492
|
+
|
|
493
|
+
### Recommendations
|
|
494
|
+
|
|
495
|
+
- Document that cursors share transaction state
|
|
496
|
+
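- For isolation from writes made by other connections, wrap cursor use in an explicit transaction (e.g. `BEGIN IMMEDIATE` before opening cursors); statements on the same connection still see each other's changes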
|
|
497
|
+
- Consider adding `cursor({ readonly: true })` option for future optimization
|
|
498
|
+
- WAL mode provides better concurrent read behavior
|
|
499
|
+
|
|
500
|
+
---
|
|
501
|
+
|
|
502
|
+
## Resource Lifecycle & Cleanup
|
|
503
|
+
|
|
504
|
+
This is a **critical design concern**, especially for the browser where we cannot rely on garbage collection to clean up worker-side resources.
|
|
505
|
+
|
|
506
|
+
### The Core Problem
|
|
507
|
+
|
|
508
|
+
```
|
|
509
|
+
┌─────────────────────┐ ┌─────────────────────┐
|
|
510
|
+
│ Main Thread │ │ Worker │
|
|
511
|
+
├─────────────────────┤ ├─────────────────────┤
|
|
512
|
+
│ │ │ │
|
|
513
|
+
│ cursor proxy ──────┼─── messages ──────►│ sqlite3_stmt │
|
|
514
|
+
│ (JS object) │ │ (real resource) │
|
|
515
|
+
│ │ │ │
|
|
516
|
+
│ goes out of │ │ ??? never │
|
|
517
|
+
│ scope... GC'd │ │ finalized │
|
|
518
|
+
│ │ │ MEMORY LEAK │
|
|
519
|
+
└─────────────────────┘ └─────────────────────┘
|
|
520
|
+
```
|
|
521
|
+
|
|
522
|
+
The worker has **no way to know** when the main thread's proxy object is garbage collected.
|
|
523
|
+
|
|
524
|
+
### Eager vs Lazy Resource Ownership
|
|
525
|
+
|
|
526
|
+
| Mode | Where Data Lives | Worker State After Return | Leak Risk |
|
|
527
|
+
|------|------------------|---------------------------|-----------|
|
|
528
|
+
| **Eager** (`get.rows`) | Main thread (transferred) | None | ✅ No leak |
|
|
529
|
+
| **Lazy** (cursor) | Worker (stmt open) | Holds `sqlite3_stmt` | ❌ Leaks if not drained |
|
|
530
|
+
|
|
531
|
+
With eager fetching, data is transferred to main thread and worker holds nothing. With lazy cursors, the worker retains the open statement until explicitly finalized or exhausted.
|
|
532
|
+
|
|
533
|
+
### Possible Solutions
|
|
534
|
+
|
|
535
|
+
#### 1. Single Cursor/Result Per DB (Simplest) ⭐
|
|
536
|
+
|
|
537
|
+
```javascript
|
|
538
|
+
// Only ONE pending statement allowed per db
|
|
539
|
+
// New exec() auto-finalizes previous
|
|
540
|
+
const cursor1 = db.exec('SELECT * FROM a').cursor();
|
|
541
|
+
const cursor2 = db.exec('SELECT * FROM b').cursor(); // cursor1 auto-finalized
|
|
542
|
+
```
|
|
543
|
+
|
|
544
|
+
| Pros | Cons |
|
|
545
|
+
|------|------|
|
|
546
|
+
| No reference counting needed | Can't iterate two result sets simultaneously |
|
|
547
|
+
| No cursor IDs to manage | |
|
|
548
|
+
| Trivial cleanup logic | |
|
|
549
|
+
| Matches 99% of real use cases | |
|
|
550
|
+
|
|
551
|
+
**Implementation**: Single `_pendingStmt` per db. New `exec()` calls `sqlite3_finalize()` on previous.
|
|
552
|
+
|
|
553
|
+
#### 2. Explicit Close Required
|
|
554
|
+
|
|
555
|
+
```javascript
|
|
556
|
+
const cursor = db.exec('SELECT...').cursor();
|
|
557
|
+
try {
|
|
558
|
+
while (!cursor.done) { /* ... */ }
|
|
559
|
+
} finally {
|
|
560
|
+
cursor.close(); // REQUIRED - user's responsibility
|
|
561
|
+
}
|
|
562
|
+
```
|
|
563
|
+
|
|
564
|
+
| Pros | Cons |
|
|
565
|
+
|------|------|
|
|
566
|
+
| Deterministic cleanup | Users forget to close |
|
|
567
|
+
| Familiar acquire/release pattern | Exceptions skip cleanup unless wrapped in `try`/`finally` |
|
|
568
|
+
| Allows multiple concurrent cursors | No enforcement mechanism |
|
|
569
|
+
|
|
570
|
+
#### 3. Callback/Scoped Pattern
|
|
571
|
+
|
|
572
|
+
```javascript
|
|
573
|
+
db.withCursor('SELECT...', [params], (cursor) => {
|
|
574
|
+
while (!cursor.done) {
|
|
575
|
+
console.log(cursor.objs);
|
|
576
|
+
}
|
|
577
|
+
}); // Auto-finalized after callback returns
|
|
578
|
+
```
|
|
579
|
+
|
|
580
|
+
| Pros | Cons |
|
|
581
|
+
|------|------|
|
|
582
|
+
| Guaranteed cleanup | Callback style awkward |
|
|
583
|
+
| Works with sync and async | Nesting hell with multiple cursors |
|
|
584
|
+
| Clear ownership boundaries | Doesn't compose well with async/await |
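A scoped helper of this kind could be implemented roughly as below. This is a sketch, not a proposed final signature: it takes a hypothetical `openCursor` factory instead of SQL text so it can run standalone, and it assumes the cursor exposes a `close()` method.

```javascript
// Hypothetical implementation of the scoped pattern. The cursor is closed
// whether the callback returns, throws, or returns a promise.
function withCursor(openCursor, callback) {
  const cursor = openCursor();
  let result;
  try {
    result = callback(cursor);
  } catch (err) {
    cursor.close();          // sync failure: clean up, then rethrow
    throw err;
  }
  if (result && typeof result.then === 'function') {
    // Async callback: defer cleanup until the promise settles.
    return Promise.resolve(result).finally(() => cursor.close());
  }
  cursor.close();            // sync success
  return result;
}
```

Handling the thenable case explicitly is what lets the same helper serve both the sync Node.js API and an async browser API.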
|
|
585
|
+
|
|
586
|
+
#### 4. FinalizationRegistry (Browser Only)
|
|
587
|
+
|
|
588
|
+
**Note**: Only relevant for browser implementation where main thread cannot directly access worker-side resources. In Node.js, resources are in the same thread and can be cleaned up directly.
|
|
589
|
+
|
|
590
|
+
```javascript
|
|
591
|
+
// Browser main thread
|
|
592
|
+
const registry = new FinalizationRegistry((cursorId) => {
|
|
593
|
+
worker.postMessage({ fn: 'cursor.finalize', id: cursorId });
|
|
594
|
+
});
|
|
595
|
+
|
|
596
|
+
function createCursorProxy(cursorId) {
|
|
597
|
+
const proxy = { /* cursor methods */ };
|
|
598
|
+
registry.register(proxy, cursorId); // Clean up when proxy is GC'd
|
|
599
|
+
return proxy;
|
|
600
|
+
}
|
|
601
|
+
```
|
|
602
|
+
|
|
603
|
+
| Pros | Cons |
|
|
604
|
+
|------|------|
|
|
605
|
+
| Automatic, transparent | GC timing non-deterministic |
|
|
606
|
+
| No user action required | Hard to test (can't force GC) |
|
|
607
|
+
| Good browser support (Chrome 84+, Firefox 79+, Safari 14.1+) | Spec warns against relying on it for "correctness" |
|
|
608
|
+
| Page close is fine (worker dies anyway, OS reclaims) | |
|
|
609
|
+
|
|
610
|
+
**When FinalizationRegistry is appropriate for this use case**:
|
|
611
|
+
|
|
612
|
+
The spec warning ("don't rely on this for correctness") applies when cleanup *must* happen before some operation. For cursor cleanup, we only need *eventual* cleanup to prevent unbounded memory growth:
|
|
613
|
+
|
|
614
|
+
| Scenario | Outcome | Problem? |
|
|
615
|
+
|----------|---------|----------|
|
|
616
|
+
| Cursor abandoned, page open | GC eventually runs, finalizer cleans up | ✅ Works |
|
|
617
|
+
| Cursor abandoned, page closed | Worker terminated, OS reclaims | ✅ No leak |
|
|
618
|
+
| Long pause before GC | Memory temporarily higher | ⚠️ Minor, transient |
|
|
619
|
+
|
|
620
|
+
**Verdict**: Suitable as a **safety net** for browser multi-cursor support, not as the sole cleanup mechanism. Pair with optional explicit `close()` for users who need deterministic cleanup.
|
|
621
|
+
|
|
622
|
+
#### 5. Max Cursors + LRU Eviction
|
|
623
|
+
|
|
624
|
+
```javascript
|
|
625
|
+
// Worker maintains: Map<cursorId, { stmt, lastAccess, dbId }>
|
|
626
|
+
// When cursors.size > MAX_CURSORS, finalize least recently used
|
|
627
|
+
const MAX_CURSORS_PER_DB = 10;
|
|
628
|
+
```
|
|
629
|
+
|
|
630
|
+
| Pros | Cons |
|
|
631
|
+
|------|------|
|
|
632
|
+
| Bounded memory usage | Surprising behavior (cursor silently invalidated) |
|
|
633
|
+
| Handles forgetful users | Added complexity |
|
|
634
|
+
| Configurable limits | Arbitrary limit choice |
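The eviction logic itself is small. The sketch below relies on the fact that a JavaScript `Map` iterates in insertion order; `CursorRegistry` and its methods are hypothetical names, and statements are assumed to expose `finalize()`.

```javascript
// Hypothetical worker-side registry. Re-inserting an entry on access keeps
// the least recently used cursor at the front of the Map.
class CursorRegistry {
  constructor(maxCursors = 10) {
    this.max = maxCursors;
    this.cursors = new Map(); // cursorId -> stmt (anything with finalize())
  }

  add(id, stmt) {
    this.cursors.set(id, stmt);
    while (this.cursors.size > this.max) {
      const [oldestId, oldest] = this.cursors.entries().next().value;
      oldest.finalize();
      this.cursors.delete(oldestId);
    }
  }

  get(id) {
    const stmt = this.cursors.get(id);
    if (!stmt) throw new Error(`Cursor ${id} was closed or evicted`);
    this.cursors.delete(id);   // move to the most recently used position
    this.cursors.set(id, stmt);
    return stmt;
  }
}
```

Throwing from `get()` addresses the "silently invalidated" concern in the table: an evicted cursor fails loudly on its next access.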
|
|
635
|
+
|
|
636
|
+
#### 6. Timeout-Based Cleanup
|
|
637
|
+
|
|
638
|
+
```javascript
|
|
639
|
+
// Worker: if cursor not accessed for N seconds, auto-finalize
|
|
640
|
+
const CURSOR_IDLE_TIMEOUT_MS = 30000;
|
|
641
|
+
```
|
|
642
|
+
|
|
643
|
+
| Pros | Cons |
|
|
644
|
+
|------|------|
|
|
645
|
+
| Handles abandoned cursors | Long iterations might timeout |
|
|
646
|
+
| No user action required | Unpredictable for users |
|
|
647
|
+
| Bounded leak duration | Adds timer complexity |
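A sweep function for this approach might look like the following. It is a sketch under assumptions: `sweepIdleCursors` is a hypothetical name, entries are assumed to have the shape `{ stmt, lastAccess }`, and the current time is passed in so the logic can be exercised without real timers.

```javascript
const CURSOR_IDLE_TIMEOUT_MS = 30000;

// Hypothetical idle sweep over a Map of cursorId -> { stmt, lastAccess }.
// Returns the ids that were finalized.
function sweepIdleCursors(cursors, now, timeoutMs = CURSOR_IDLE_TIMEOUT_MS) {
  const evicted = [];
  for (const [id, entry] of cursors) {
    if (now - entry.lastAccess >= timeoutMs) {
      entry.stmt.finalize();
      cursors.delete(id);   // deleting the current entry during Map iteration is safe
      evicted.push(id);
    }
  }
  return evicted;
}
```

In a worker this could be driven by `setInterval(() => sweepIdleCursors(cursors, Date.now()), 5000)`, with each fetch handler updating `lastAccess`.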
|
|
648
|
+
|
|
649
|
+
### Recommended Approach: Layered Strategy
|
|
650
|
+
|
|
651
|
+
#### Layer 1: Single Pending Statement Per DB (Default)
|
|
652
|
+
|
|
653
|
+
For the initial implementation, enforce **one pending statement per db connection**:
|
|
654
|
+
|
|
655
|
+
```javascript
|
|
656
|
+
// This "just works" - no leaks possible
|
|
657
|
+
const cursor = db.exec('SELECT * FROM users').cursor();
|
|
658
|
+
while (!cursor.done) { ... }
|
|
659
|
+
|
|
660
|
+
// Or if user forgets to drain:
|
|
661
|
+
db.exec('SELECT * FROM a').cursor(); // opens cursor
|
|
662
|
+
db.exec('SELECT * FROM b'); // auto-closes previous, no leak
|
|
663
|
+
db.close(); // also auto-closes any pending
|
|
664
|
+
```
|
|
665
|
+
|
|
666
|
+
**Why this works**:
|
|
667
|
+
- Matches how most users actually query (one at a time)
|
|
668
|
+
- Zero reference counting complexity
|
|
669
|
+
- Impossible to leak - next operation cleans up previous
|
|
670
|
+
- No IDs needed in worker (just one `_pendingStmt` per db)
|
|
671
|
+
|
|
672
|
+
#### Layer 2: Explicit Multi-Cursor (Future, Opt-in)
|
|
673
|
+
|
|
674
|
+
If users genuinely need multiple concurrent cursors, require explicit lifecycle management:
|
|
675
|
+
|
|
676
|
+
```javascript
|
|
677
|
+
// Opt-in to complexity via named cursors
|
|
678
|
+
const cursor1 = db.exec('SELECT * FROM a').cursor({ name: 'cur1' });
|
|
679
|
+
const cursor2 = db.exec('SELECT * FROM b').cursor({ name: 'cur2' });
|
|
680
|
+
// Both remain open until explicitly closed or db.close()
|
|
681
|
+
cursor1.close();
|
|
682
|
+
cursor2.close();
|
|
683
|
+
```
|
|
684
|
+
|
|
685
|
+
Or use the callback pattern for guaranteed cleanup:
|
|
686
|
+
|
|
687
|
+
```javascript
|
|
688
|
+
await db.withCursor('SELECT * FROM users', [], async (cursor) => {
|
|
689
|
+
while (!cursor.done) {
|
|
690
|
+
await processBatch(cursor.objs);
|
|
691
|
+
}
|
|
692
|
+
}); // Cursor guaranteed finalized here
|
|
693
|
+
```
|
|
694
|
+
|
|
695
|
+
#### Browser Implementation: FinalizationRegistry as Safety Net
|
|
696
|
+
|
|
697
|
+
For browser multi-cursor support, use FinalizationRegistry as a fallback for cursors that aren't explicitly closed:
|
|
698
|
+
|
|
699
|
+
```javascript
|
|
700
|
+
// Browser: belt-and-suspenders approach
|
|
701
|
+
const cursor = db.exec('SELECT...').cursor();
|
|
702
|
+
|
|
703
|
+
// Option A: Rely on FinalizationRegistry (automatic, eventual)
|
|
704
|
+
// When cursor proxy is GC'd, worker receives cleanup message
|
|
705
|
+
|
|
706
|
+
// Option B: Explicit close (deterministic, immediate)
|
|
707
|
+
cursor.close();
|
|
708
|
+
|
|
709
|
+
// Both work. Explicit close is faster, but forgotten cursors
|
|
710
|
+
// still get cleaned up eventually via FinalizationRegistry.
|
|
711
|
+
```
|
|
712
|
+
|
|
713
|
+
This doesn't apply to Node.js where resources are in the same thread and cleanup is straightforward.
|
|
714
|
+
|
|
715
|
+
### Browser-Specific Considerations
|
|
716
|
+
|
|
717
|
+
| Event | Cleanup Behavior |
|
|
718
|
+
|-------|------------------|
|
|
719
|
+
| `db.close()` | Finalize all pending statements for that db |
|
|
720
|
+
| New `exec()` on same db | Finalize previous pending statement |
|
|
721
|
+
| Cursor exhausted (`.done` = true) | Auto-finalize |
|
|
722
|
+
| Worker terminated | OS reclaims memory (not a leak, but abrupt) |
|
|
723
|
+
| Page unload | Worker terminated, see above |
|
|
724
|
+
| Tab backgrounded | No automatic cleanup (cursors remain open) |
|
|
725
|
+
|
|
726
|
+
### Questions to Resolve Before Implementation
|
|
727
|
+
|
|
728
|
+
1. **Do we ever need multiple concurrent cursors per db?**
|
|
729
|
+
- If no → Single-cursor-per-db is sufficient, dramatically simpler
|
|
730
|
+
- If yes → Need explicit lifecycle (close/callback) or accept leak risk
|
|
731
|
+
|
|
732
|
+
2. **Should eager Results also become "pending" with lazy exec (Option C)?**
|
|
733
|
+
- Currently: `exec()` fetches all, returns immediately
|
|
734
|
+
- With Option C: `exec()` steps once, `.get` fetches rest
|
|
735
|
+
- This means even eager access leaves state until `.get` is called
|
|
736
|
+
|
|
737
|
+
3. **Error on cursor access after invalidation?**
|
|
738
|
+
```javascript
|
|
739
|
+
const cursor = db.exec('SELECT...').cursor();
|
|
740
|
+
db.exec('SELECT...'); // cursor invalidated
|
|
741
|
+
cursor.rows; // Error? Empty array? Undefined behavior?
|
|
742
|
+
```
|
|
743
|
+
Recommendation: Throw clear error "Cursor invalidated by subsequent exec()"
|
|
744
|
+
|
|
745
|
+
4. **Should `cursor.close()` exist even in single-cursor mode?**
|
|
746
|
+
- Pro: Explicit cleanup without needing another exec()
|
|
747
|
+
- Pro: Familiar pattern for users from other libraries
|
|
748
|
+
- Con: Suggests close() is required (it isn't in single-cursor mode)
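For question 3, one cheap way to detect an invalidated cursor is a generation counter. The sketch below is hypothetical throughout (`createGuardedDb`, the `batches` argument standing in for real query results); it only demonstrates the guard, not statement handling.

```javascript
// Hypothetical guard: the db keeps a generation counter that exec() bumps;
// each cursor remembers the generation it was created under.
function createGuardedDb() {
  let generation = 0;
  return {
    exec() { generation += 1; return this; },
    cursor(batches) {
      const mine = generation;
      let i = 0;
      const check = () => {
        if (mine !== generation) {
          throw new Error('Cursor invalidated by subsequent exec()');
        }
      };
      return {
        get done() { check(); return i >= batches.length; },
        get rows() { check(); return batches[i++] ?? []; },
      };
    },
  };
}
```

The check costs one integer comparison per access and needs no cursor IDs, which fits the single-pending-statement design.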
|
|
749
|
+
|
|
750
|
+
---
|
|
751
|
+
|
|
752
|
+
## Implementation Checklist
|
|
753
|
+
|
|
754
|
+
*Note: Checklist depends on which API option is chosen (A, B, or C).*
|
|
755
|
+
|
|
756
|
+
### Core (spl.js) - Sync API for Node.js
|
|
757
|
+
|
|
758
|
+
**For Option C (Python-style lazy exec):**
|
|
759
|
+
- [ ] Modify `exec()` to step ONCE and detect SQLITE_ROW vs SQLITE_DONE
|
|
760
|
+
- [ ] Add `_pendingStmt`, `_pendingColumns`, `_firstRow` state variables
|
|
761
|
+
- [ ] Modify `.get` getter to consume pending statement lazily
|
|
762
|
+
- [ ] Implement `cursor(batchSize)` method that returns cursor object
|
|
763
|
+
- [ ] Implement `createCursor()` with `.cols`, `.rows`, `.objs`, `.flat`, `.done` getters
|
|
764
|
+
- [ ] Auto-finalize cursor when exhausted
|
|
765
|
+
- [ ] Finalize pending statement on new `exec()` call
|
|
766
|
+
- [ ] Finalize all pending statements on `db.close()`
|
|
767
|
+
|
|
768
|
+
**For Options A or B:**
|
|
769
|
+
- [ ] Implement `db.cursor()` method (Option A) or `exec(..., {cursor})` flag (Option B)
|
|
770
|
+
- [ ] Implement `createCursor()` with getters
|
|
771
|
+
- [ ] Handle cleanup scenarios
|
|
772
|
+
|
|
773
|
+
### Worker (spl-worker.js) - Bridge
|
|
774
|
+
- [ ] Add cursor state management (Map of cursor ID → cursor object)
|
|
775
|
+
- [ ] Add `cursor.cols` handler (returns column names)
|
|
776
|
+
- [ ] Add `cursor.fetch` handler (returns next batch as rows)
|
|
777
|
+
- [ ] Add `cursor.done` handler (returns exhaustion status)
|
|
778
|
+
- [ ] Handle format conversion (rows/objs/flat) on worker side
|
|
779
|
+
|
|
780
|
+
### Browser (spl-web.js) - Async API
|
|
781
|
+
- [ ] Wrap cursor operations in Thenable pattern
|
|
782
|
+
- [ ] Implement cursor proxy with async getters
|
|
783
|
+
- [ ] Add transferables support for ArrayBuffer columns in batches
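For the transferables item, the worker would need the list of `ArrayBuffer` values in a batch before posting it. A minimal sketch, assuming rows are arrays of column values and using the hypothetical name `collectTransferables`:

```javascript
// Hypothetical helper: collects the distinct ArrayBuffers in a batch of rows
// so they can be passed as the transfer list of postMessage.
function collectTransferables(rows) {
  const buffers = new Set(); // a Set avoids listing the same buffer twice
  for (const row of rows) {
    for (const value of row) {
      if (value instanceof ArrayBuffer) buffers.add(value);
    }
  }
  return [...buffers];
}
```

The worker could then call `postMessage({ __id__, res: rows }, collectTransferables(rows))`. Deduplicating matters because a transfer list that names the same buffer twice is rejected.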
|
|
784
|
+
|
|
785
|
+
### Testing & Documentation
|
|
786
|
+
- [ ] Write tests for cursor iteration until done
|
|
787
|
+
- [ ] Test auto-cleanup when exhausted
|
|
788
|
+
- [ ] Test cleanup on new exec() call
|
|
789
|
+
- [ ] Test cleanup on db.close()
|
|
790
|
+
- [ ] Test transaction isolation behavior
|
|
791
|
+
- [ ] Document cursor API in README
|
|
792
|
+
- [ ] Document auto-cleanup behavior
|