@mauryasumit/driftdb 2.0.0

Files changed (82)
  1. package/README.md +810 -0
  2. package/dist/db.d.ts +30 -0
  3. package/dist/db.d.ts.map +1 -0
  4. package/dist/db.js +115 -0
  5. package/dist/db.js.map +1 -0
  6. package/dist/index.d.ts +8 -0
  7. package/dist/index.d.ts.map +1 -0
  8. package/dist/index.js +12 -0
  9. package/dist/index.js.map +1 -0
  10. package/dist/orm/model.d.ts +35 -0
  11. package/dist/orm/model.d.ts.map +1 -0
  12. package/dist/orm/model.js +34 -0
  13. package/dist/orm/model.js.map +1 -0
  14. package/dist/orm/query-builder.d.ts +8 -0
  15. package/dist/orm/query-builder.d.ts.map +1 -0
  16. package/dist/orm/query-builder.js +90 -0
  17. package/dist/orm/query-builder.js.map +1 -0
  18. package/dist/orm/repository.d.ts +38 -0
  19. package/dist/orm/repository.d.ts.map +1 -0
  20. package/dist/orm/repository.js +107 -0
  21. package/dist/orm/repository.js.map +1 -0
  22. package/dist/orm/schema.d.ts +20 -0
  23. package/dist/orm/schema.d.ts.map +1 -0
  24. package/dist/orm/schema.js +81 -0
  25. package/dist/orm/schema.js.map +1 -0
  26. package/dist/queue/queue.d.ts +17 -0
  27. package/dist/queue/queue.d.ts.map +1 -0
  28. package/dist/queue/queue.js +109 -0
  29. package/dist/queue/queue.js.map +1 -0
  30. package/dist/storage/s3-adapter.d.ts +21 -0
  31. package/dist/storage/s3-adapter.d.ts.map +1 -0
  32. package/dist/storage/s3-adapter.js +133 -0
  33. package/dist/storage/s3-adapter.js.map +1 -0
  34. package/dist/sync/change-log.d.ts +15 -0
  35. package/dist/sync/change-log.d.ts.map +1 -0
  36. package/dist/sync/change-log.js +78 -0
  37. package/dist/sync/change-log.js.map +1 -0
  38. package/dist/sync/engine.d.ts +31 -0
  39. package/dist/sync/engine.d.ts.map +1 -0
  40. package/dist/sync/engine.js +210 -0
  41. package/dist/sync/engine.js.map +1 -0
  42. package/dist/sync/snapshot-manager.d.ts +17 -0
  43. package/dist/sync/snapshot-manager.d.ts.map +1 -0
  44. package/dist/sync/snapshot-manager.js +91 -0
  45. package/dist/sync/snapshot-manager.js.map +1 -0
  46. package/dist/types.d.ts +120 -0
  47. package/dist/types.d.ts.map +1 -0
  48. package/dist/types.js +3 -0
  49. package/dist/types.js.map +1 -0
  50. package/dist/utils/compress.d.ts +3 -0
  51. package/dist/utils/compress.d.ts.map +1 -0
  52. package/dist/utils/compress.js +16 -0
  53. package/dist/utils/compress.js.map +1 -0
  54. package/dist/utils/crypto.d.ts +4 -0
  55. package/dist/utils/crypto.d.ts.map +1 -0
  56. package/dist/utils/crypto.js +35 -0
  57. package/dist/utils/crypto.js.map +1 -0
  58. package/dist/utils/id.d.ts +3 -0
  59. package/dist/utils/id.d.ts.map +1 -0
  60. package/dist/utils/id.js +13 -0
  61. package/dist/utils/id.js.map +1 -0
  62. package/dist/utils/retry.d.ts +5 -0
  63. package/dist/utils/retry.d.ts.map +1 -0
  64. package/dist/utils/retry.js +36 -0
  65. package/dist/utils/retry.js.map +1 -0
  66. package/package.json +55 -0
  67. package/src/db.ts +154 -0
  68. package/src/index.ts +24 -0
  69. package/src/orm/model.ts +95 -0
  70. package/src/orm/query-builder.ts +100 -0
  71. package/src/orm/repository.ts +156 -0
  72. package/src/orm/schema.ts +92 -0
  73. package/src/queue/queue.ts +138 -0
  74. package/src/storage/s3-adapter.ts +181 -0
  75. package/src/sync/change-log.ts +101 -0
  76. package/src/sync/engine.ts +249 -0
  77. package/src/sync/snapshot-manager.ts +80 -0
  78. package/src/types.ts +130 -0
  79. package/src/utils/compress.ts +14 -0
  80. package/src/utils/crypto.ts +33 -0
  81. package/src/utils/id.ts +10 -0
  82. package/src/utils/retry.ts +38 -0
package/README.md ADDED

# 🌊 DriftDB

**Local-first SQLite database with automatic S3 sync.**

DriftDB keeps all reads and writes local (fast, offline-capable), and automatically drifts your changes to Amazon S3 in the background — no Redis, no Kafka, no external infrastructure required.

---

## Table of Contents

- [Why DriftDB?](#why-driftdb)
- [Architecture Overview](#architecture-overview)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [API Reference](#api-reference)
  - [DB](#db)
  - [Repository (schema-based API)](#repository-schema-based-api)
  - [Model (class-based API)](#model-class-based-api)
  - [Column Builder](#column-builder)
- [S3 Sync Engine](#s3-sync-engine)
  - [How Sync Works](#how-sync-works)
  - [S3 Layout](#s3-layout)
  - [Snapshots](#snapshots)
- [Compression & Encryption](#compression--encryption)
- [Failure Recovery](#failure-recovery)
- [Performance](#performance)
- [Configuration Reference](#configuration-reference)
- [Advanced Usage](#advanced-usage)
- [Trade-offs & Design Decisions](#trade-offs--design-decisions)

---

## Why DriftDB?

| Feature | DriftDB | Traditional DB |
|---|---|---|
| Offline reads/writes | ✅ Always works | ❌ Network required |
| Zero infrastructure | ✅ Only S3 | ❌ Servers, clusters |
| Local-first speed | ✅ SQLite speed | ❌ Network latency |
| Durable replication | ✅ S3 backed | ✅ |
| Type-safe ORM | ✅ | Varies |
| Crash recovery | ✅ Idempotent sync | Varies |

DriftDB is ideal for:

- **CLI tools** that need persistent local state with cloud backup
- **Edge / embedded applications** that must work offline
- **Single-tenant SaaS** where each customer gets their own isolated SQLite+S3 pair
- **Developer tools, agents, pipelines** that need durable local storage without an ops burden

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────┐
│                  Your Application                   │
│                                                     │
│  const Users = db.define('users', schema)           │
│  await Users.create({ name: 'Alice' })   ◄── fast   │
│  await Users.filter({ age: { $gte: 18 } })          │
└──────────────────────────┬──────────────────────────┘
                           │ all reads/writes local
                           ▼
┌─────────────────────────────────────────────────────┐
│                  SQLite (WAL mode)                  │
│                                                     │
│  ┌──────────────┐   ┌────────────────┐              │
│  │ Your tables  │   │ _driftdb_log   │  change log  │
│  │ (users, ...) │   │ _driftdb_queue │  sync queue  │
│  └──────────────┘   └────────────────┘              │
└──────────────────────────┬──────────────────────────┘
                           │ async, non-blocking
                           ▼
┌─────────────────────────────────────────────────────┐
│               Background Sync Engine                │
│                                                     │
│  • Batches pending log entries                      │
│  • Uploads compressed JSON logs to S3               │
│  • Periodic full snapshots                          │
│  • Retry with exponential backoff                   │
└──────────────────────────┬──────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│                      Amazon S3                      │
│                                                     │
│  nodes/{nodeId}/logs/000000000001-000000000100.json │
│  nodes/{nodeId}/snapshots/1704067200000.sqlite      │
│  nodes/{nodeId}/manifest.json                       │
└─────────────────────────────────────────────────────┘
```

---

## Installation

```bash
npm install @mauryasumit/driftdb
```

**Requirements:**

- Node.js >= 18
- An S3-compatible bucket (AWS S3, MinIO, LocalStack, etc.)

---

## Quick Start

### Schema-based API (recommended)

```typescript
import { DB, Column } from '@mauryasumit/driftdb';

const db = new DB({
  sqlitePath: './data/myapp.sqlite',
  s3Config: {
    bucket: 'my-app-backups',
    region: 'us-east-1',
  },
});

// Define a model
const Users = db.define('users', {
  name: Column.text().required().build(),
  email: Column.text().unique().required().build(),
  age: Column.integer().build(),
});

// Create
const alice = await Users.create({ name: 'Alice', email: 'alice@example.com', age: 30 });

// Query
const adults = await Users.find({
  where: { age: { $gte: 18 } },
  orderBy: { name: 'ASC' },
  limit: 10,
});

// Update
await Users.update({ id: alice.id }, { name: 'Alice Smith' });

// Delete
await Users.delete({ id: alice.id });

// Metrics
console.log(db.getMetrics());

db.close();
```

### Class-based API

```typescript
import { DB, Model, Column } from '@mauryasumit/driftdb';
import type { ModelSchema } from '@mauryasumit/driftdb';

const db = new DB({ sqlitePath: './data.sqlite' });

class User extends Model {
  static tableName = 'users';
  static schema: ModelSchema = {
    name: Column.text().required().build(),
    email: Column.text().unique().build(),
  };

  name!: string;
  email!: string;
}

db.registerModel(User);

const alice = await User.create({ name: 'Alice', email: 'alice@example.com' });
const users = await User.filter({ name: 'Alice' });
await User.update({ id: alice.id }, { name: 'Alice Smith' });
```

---

## API Reference

### DB

The main entry point. Manages the SQLite connection, ORM registrations, and sync engine.

```typescript
const db = new DB(config); // config: DBConfig (see Configuration Reference)
```

| Method | Description |
|---|---|
| `db.define(tableName, schema)` | Create a typed `Repository` for a table |
| `db.registerModel(ModelClass)` | Register a class-based `Model` |
| `db.getMetrics()` | Returns `SyncMetrics` |
| `db.flush()` | Force an immediate sync tick |
| `db.snapshot()` | Manually trigger a full DB snapshot upload |
| `db.startSync()` | Start background sync (auto-started if `s3Config` provided) |
| `db.stopSync()` | Stop background sync |
| `db.transaction(fn)` | Run `fn` in a SQLite transaction |
| `db.integrityCheck()` | Returns `true` if the DB passes `PRAGMA integrity_check` |
| `db.vacuum()` | Run `VACUUM` to reclaim space |
| `db.raw()` | Access the raw `better-sqlite3` Database instance |
| `db.getNodeId()` | Returns the unique node ID for this instance |
| `db.close()` | Stop sync and close the SQLite connection |

---

### Repository (schema-based API)

Returned by `db.define(tableName, schema)`.

```typescript
const Users = db.define('users', {
  name: { type: 'TEXT', notNull: true },
  email: { type: 'TEXT', unique: true },
  age: { type: 'INTEGER' },
});
```

Every record automatically gets these base fields:

| Field | Type | Description |
|---|---|---|
| `id` | `string` | UUID primary key |
| `createdAt` | `number` | Unix timestamp (ms) |
| `updatedAt` | `number` | Unix timestamp (ms), auto-updated |

#### `create(data)`

```typescript
const user = await Users.create({ name: 'Alice', email: 'alice@example.com' });
// Returns the full record including id, createdAt, updatedAt
```

#### `findById(id)`

```typescript
const user = await Users.findById('some-uuid');
// Returns record or null
```

#### `findOne(where)`

```typescript
const user = await Users.findOne({ email: 'alice@example.com' });
```

#### `find(options)`

```typescript
const users = await Users.find({
  where: { age: { $gte: 18 } },
  orderBy: { name: 'ASC' },
  limit: 20,
  offset: 0,
});
```

#### `filter(where, options?)`

Shorthand for `find` with just a `where` clause:

```typescript
const active = await Users.filter({ active: 1 });
```

#### `update(where, data)`

```typescript
const affected = await Users.update({ id: user.id }, { name: 'Alice Smith' });
// Returns number of rows updated
```

#### `delete(where)`

```typescript
const deleted = await Users.delete({ email: 'alice@example.com' });
// Returns number of rows deleted
```

#### `deleteById(id)`

```typescript
const ok = await Users.deleteById('some-uuid');
// Returns boolean
```

#### `count(where?)`

```typescript
const total = await Users.count();
const adults = await Users.count({ age: { $gte: 18 } });
```

#### `upsert(where, data)`

Creates a record if none matches `where`, otherwise updates:

```typescript
const user = await Users.upsert(
  { email: 'alice@example.com' },
  { name: 'Alice', email: 'alice@example.com' }
);
```

#### `raw<R>(sql, params?)`

Execute arbitrary SQL and return typed results:

```typescript
const rows = Users.raw<{ name: string; count: number }>(
  'SELECT name, COUNT(*) as count FROM users GROUP BY name'
);
```

---

### Where Clause Operators

All `where` arguments support the following operators:

```typescript
// Equality
{ name: 'Alice' }

// Comparison
{ age: { $gt: 18 } }
{ age: { $gte: 18 } }
{ age: { $lt: 65 } }
{ age: { $lte: 65 } }

// Not equal
{ status: { $ne: 'deleted' } }

// IN list
{ role: { $in: ['admin', 'moderator'] } }

// LIKE pattern
{ name: { $like: 'Ali%' } }

// NULL check
{ deletedAt: null }

// Combine multiple fields (implicit AND)
{ age: { $gte: 18 }, active: 1 }
```

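These operators translate mechanically to parameterized SQL. The sketch below is not DriftDB's actual query builder; it is only an illustration, under assumed semantics, of how such a `where` object can compile to a `WHERE` clause with bound parameters:

```typescript
// Hypothetical translator, for illustration only (not the package's internals).
type Op = '$gt' | '$gte' | '$lt' | '$lte' | '$ne' | '$in' | '$like';
const SQL_OP: Record<Op, string> = {
  $gt: '>', $gte: '>=', $lt: '<', $lte: '<=', $ne: '!=', $in: 'IN', $like: 'LIKE',
};

function toSql(where: Record<string, unknown>): { clause: string; params: unknown[] } {
  const parts: string[] = [];
  const params: unknown[] = [];
  for (const [field, cond] of Object.entries(where)) {
    if (cond === null) {
      parts.push(`${field} IS NULL`);              // NULL check
    } else if (typeof cond === 'object') {
      for (const [op, val] of Object.entries(cond as Record<string, unknown>)) {
        if (op === '$in') {
          const arr = val as unknown[];
          parts.push(`${field} IN (${arr.map(() => '?').join(', ')})`);
          params.push(...arr);
        } else {
          parts.push(`${field} ${SQL_OP[op as Op]} ?`);
          params.push(val);
        }
      }
    } else {
      parts.push(`${field} = ?`);                  // plain equality
      params.push(cond);
    }
  }
  return { clause: parts.join(' AND '), params };  // implicit AND
}
```

For example, `toSql({ age: { $gte: 18 }, active: 1 })` yields the clause `age >= ? AND active = ?` with parameters `[18, 1]`.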
---

### Model (class-based API)

Extend `Model` to define a model with static CRUD methods. You must define `tableName` and `schema` as static properties.

```typescript
class Post extends Model {
  static tableName = 'posts';
  static schema: ModelSchema = {
    title: Column.text().required().build(),
    content: Column.text().build(),
    published: Column.boolean().default(false).build(),
    views: Column.integer().default(0).build(),
  };

  title!: string;
  content!: string;
  published!: boolean;
  views!: number;
}

db.registerModel(Post);
```

All `Repository` methods are available as static methods on the class:

```typescript
await Post.create({ title: 'Hello', content: '...' });
await Post.findById(id);
await Post.findOne({ title: 'Hello' });
await Post.find({ where: { published: true }, orderBy: { views: 'DESC' } });
await Post.filter({ published: true });
await Post.update({ id }, { views: 100 });
await Post.delete({ id });
await Post.count();
await Post.upsert({ title: 'Hello' }, { title: 'Hello', views: 0 });
```

---

### Column Builder

The `Column` builder provides a fluent API for defining schemas:

```typescript
import { Column } from '@mauryasumit/driftdb';

const schema = {
  name: Column.text().required().build(),
  email: Column.text().unique().required().build(),
  score: Column.real().default(0).build(),
  active: Column.boolean().default(true).build(),
  data: Column.blob().build(),
  createdBy: Column.integer().index().build(),
};
```

| Builder | SQLite Type | Description |
|---|---|---|
| `Column.text()` | `TEXT` | UTF-8 string |
| `Column.integer()` | `INTEGER` | 64-bit integer |
| `Column.real()` | `REAL` | 64-bit float |
| `Column.boolean()` | `BOOLEAN` | Stored as 0/1 |
| `Column.blob()` | `BLOB` | Raw binary |

| Modifier | Description |
|---|---|
| `.required()` | Adds `NOT NULL` constraint |
| `.unique()` | Adds `UNIQUE` constraint |
| `.default(value)` | Adds `DEFAULT value` |
| `.index()` | Creates an index on this column |
| `.build()` | Returns the `ColumnDef` object |

---

## S3 Sync Engine

### How Sync Works

1. **Every write** to a repository appends an entry to the `_driftdb_log` table in SQLite. This is synchronous and local — zero network cost.

2. **Every `syncIntervalMs`** (default: 5 seconds), the sync engine:
   - Reads pending (un-synced) log entries up to `maxBatchSize`
   - Serializes them into a compressed (optionally encrypted) JSON batch
   - Uploads the batch to S3
   - Marks those entries as synced in SQLite
   - Updates the node's `manifest.json` on S3
   - If `latestSequence % snapshotEveryNLogs === 0`, enqueues a snapshot job

3. **On crash/restart**, the sync engine calls `resetStuck()` on the queue, which re-queues any jobs that were "processing" when the process died. Since S3 keys are deterministic, re-uploading is safe.

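Put together, a single sync tick looks roughly like the sketch below. The interfaces here are illustrative stand-ins for DriftDB's internals, not the package's exported API, and the upload is shown synchronously for brevity:

```typescript
// Illustrative shapes only; these are NOT exported by the package.
interface LogEntry { sequence: number; table: string; operation: string; data: unknown; }
interface LogStore { pending(limit: number): LogEntry[]; markSynced(upTo: number): void; }
interface Uploader { put(key: string, body: string): void; }

function syncTick(log: LogStore, s3: Uploader, nodeId: string, maxBatchSize = 100): void {
  const entries = log.pending(maxBatchSize);
  if (entries.length === 0) return;

  const from = entries[0].sequence;
  const to = entries[entries.length - 1].sequence;
  const pad = (n: number) => String(n).padStart(12, '0');

  // Deterministic key: retrying the same batch rewrites the same object (idempotent).
  const key = `nodes/${nodeId}/logs/${pad(from)}-${pad(to)}.json`;
  s3.put(key, JSON.stringify({ version: 1, nodeId, fromSequence: from, toSequence: to, entries }));

  // Entries are marked synced only after the upload succeeds.
  log.markSynced(to);
}
```

Because the key is derived purely from the sequence range, a batch that is uploaded twice (say, after a crash) lands on the same S3 object.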
### S3 Layout

```
{prefix}/
  nodes/{nodeId}/
    logs/
      000000000001-000000000100.json   ← batch: sequences 1–100
      000000000101-000000000200.json   ← batch: sequences 101–200
    snapshots/
      1704067200000.sqlite             ← full snapshot at this timestamp
    manifest.json                      ← pointer to latest snapshot + sequence
```

**`manifest.json` example:**
```json
{
  "nodeId": "a1b2c3d4e5f6",
  "latestSnapshotKey": "nodes/a1b2c3/snapshots/1704067200000.sqlite",
  "latestSnapshotTimestamp": 1704067200000,
  "latestLogSequence": 1500,
  "updatedAt": 1704070800000
}
```

**Log batch example:**
```json
{
  "version": 1,
  "nodeId": "a1b2c3d4e5f6",
  "fromSequence": 1,
  "toSequence": 100,
  "entries": [
    {
      "sequence": 1,
      "timestamp": 1704067200000,
      "table": "users",
      "operation": "insert",
      "data": { "id": "uuid-...", "name": "Alice", "email": "alice@example.com" }
    },
    {
      "sequence": 2,
      "timestamp": 1704067201000,
      "table": "users",
      "operation": "update",
      "data": { "where": { "id": "uuid-..." }, "data": { "name": "Alice Smith" } }
    }
  ]
}
```

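The two JSON documents above imply the following shapes, written out here as assumed TypeScript equivalents (field names follow the examples; these types are not exported by the package):

```typescript
// Shapes implied by the examples above (assumed, for reference only).
interface DriftManifest {
  nodeId: string;
  latestSnapshotKey: string;
  latestSnapshotTimestamp: number; // Unix ms
  latestLogSequence: number;       // highest sequence covered by uploaded logs
  updatedAt: number;               // Unix ms
}

interface DriftLogEntry {
  sequence: number;
  timestamp: number;               // Unix ms
  table: string;
  operation: 'insert' | 'update' | 'delete'; // 'delete' assumed from the Repository API
  data: unknown;                   // operation-specific payload
}

interface DriftLogBatch {
  version: number;
  nodeId: string;
  fromSequence: number;
  toSequence: number;
  entries: DriftLogEntry[];
}

// The manifest example parses cleanly into this shape:
const manifest: DriftManifest = JSON.parse(
  '{"nodeId":"a1b2c3d4e5f6","latestSnapshotKey":"nodes/a1b2c3/snapshots/1704067200000.sqlite",' +
  '"latestSnapshotTimestamp":1704067200000,"latestLogSequence":1500,"updatedAt":1704070800000}'
);
```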
### Snapshots

A snapshot is a full copy of the SQLite database file uploaded to S3. Snapshots serve as recovery checkpoints — on a fresh node, you download the latest snapshot, then apply any log entries after it.

Snapshots are triggered:

1. **Automatically** every `snapshotEveryNLogs` log entries (default: 1000)
2. **Manually** via `await db.snapshot()`

Before taking a snapshot, DriftDB runs `PRAGMA wal_checkpoint(FULL)` to flush the WAL and ensure a consistent file.

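Restore on a fresh node therefore means: download the snapshot named in `manifest.json`, then replay any log batches whose range ends after the sequence the snapshot already covers. The batch-selection step can be sketched as follows (key parsing assumes the layout shown above; `appliedUpTo` is an assumed parameter name, and this is not the package's `SnapshotManager` API):

```typescript
// Pick the log batch keys that still need replaying after a snapshot restore.
// `appliedUpTo` is the highest sequence already contained in the snapshot.
function batchesToReplay(logKeys: string[], appliedUpTo: number): string[] {
  return logKeys
    .filter((key) => {
      // Keys look like nodes/{nodeId}/logs/000000000001-000000000100.json
      const m = key.match(/(\d{12})-(\d{12})\.json$/);
      return m !== null && Number(m[2]) > appliedUpTo;
    })
    .sort(); // zero-padded names sort in sequence order
}
```

The zero-padding in the key scheme is what makes a plain lexicographic sort equal to sequence order.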
---

## Compression & Encryption

Both are optional and apply to S3 uploads only. Local SQLite is never compressed or encrypted by DriftDB.

### Compression

Enabled by default (`compression: true`). Uses Node.js built-in `zlib` (gzip, `Z_BEST_SPEED`). Typical JSON log batches compress 60–80%.

```typescript
const db = new DB({
  sqlitePath: './data.sqlite',
  s3Config: { bucket: 'my-bucket', region: 'us-east-1' },
  compression: true, // default: true
});
```

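Under the hood this is plain `zlib` gzip, equivalent to something like the sketch below (the exact options DriftDB passes are internal; `Z_BEST_SPEED` is the documented level):

```typescript
import { gzipSync, gunzipSync, constants } from 'node:zlib';

// Compress a JSON log batch as described: gzip at Z_BEST_SPEED.
const batch = JSON.stringify({ fromSequence: 1, toSequence: 100, entries: [] });
const packed = gzipSync(batch, { level: constants.Z_BEST_SPEED });

// Reading it back is a plain gunzip + JSON.parse.
const restored = JSON.parse(gunzipSync(packed).toString('utf8'));
```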
### Encryption

Uses AES-256-GCM via Node.js built-in `crypto`. Encryption happens **after** compression. Each upload uses a fresh random IV, and the auth tag is prepended to the ciphertext.

```typescript
const db = new DB({
  sqlitePath: './data.sqlite',
  s3Config: { bucket: 'my-bucket', region: 'us-east-1' },
  encryption: {
    key: 'a'.repeat(64), // 64 hex chars = 32 bytes = AES-256
  },
});
```

Generate a secure key:
```bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
```

> **Important:** Store the key securely (e.g., AWS Secrets Manager, environment variable). Losing the key means losing access to all encrypted uploads.

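The scheme described above can be reproduced with Node's `crypto` alone. This sketch assumes a wire layout of `iv + authTag + ciphertext`; the IV placement is an assumption, since the docs only state that the tag is prepended to the ciphertext and that each upload gets a fresh IV:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Illustrative only; the actual wire layout is internal to DriftDB.
function encrypt(plain: Buffer, keyHex: string): Buffer {
  const key = Buffer.from(keyHex, 'hex');   // 32 bytes for AES-256
  const iv = randomBytes(12);               // fresh IV per upload
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ct = Buffer.concat([cipher.update(plain), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ct]); // iv(12) + tag(16) + ct
}

function decrypt(blob: Buffer, keyHex: string): Buffer {
  const key = Buffer.from(keyHex, 'hex');
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ct = blob.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag); // GCM rejects tampered ciphertexts at final()
  return Buffer.concat([decipher.update(ct), decipher.final()]);
}
```

The GCM auth tag is what turns a flipped bit in transit into a hard decryption failure instead of silent corruption.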
---

## Failure Recovery

DriftDB is designed to survive crashes, network failures, and partial uploads.

### Crash during upload

The sync job stays in the `_driftdb_queue` table with status `processing`. On restart, `resetStuck()` re-queues it as `pending`. The upload is retried from scratch. Since S3 keys are deterministic (based on the sequence range), re-uploading the same batch is a safe no-op.

### Network failure

Failed jobs are retried with **exponential backoff + jitter**:

```
attempt 1:  500ms delay
attempt 2: 1000ms delay
attempt 3: 2000ms delay
attempt 4: 4000ms delay
attempt 5: 8000ms delay
...doubling each attempt, capped at maxDelayMs
```

Configure via `retryConfig`:
```typescript
const db = new DB({
  sqlitePath: './data.sqlite',
  retryConfig: {
    maxRetries: 5,
    baseDelayMs: 500,
    maxDelayMs: 30_000,
  },
});
```

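The schedule above follows the standard exponential-backoff formula. A sketch, with an illustrative jitter term (DriftDB's exact jitter strategy is internal):

```typescript
// delay = min(baseDelayMs * 2^(attempt - 1), maxDelayMs), plus optional jitter.
function backoffDelay(
  attempt: number,     // 1-based retry attempt
  baseDelayMs = 500,
  maxDelayMs = 30_000,
  jitterRatio = 0,     // e.g. 0.1 adds up to 10% random slack
): number {
  const capped = Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
  return capped + Math.random() * jitterRatio * capped;
}
```

With the defaults, attempt 1 waits 500ms, attempt 5 waits 8000ms, and from attempt 7 on the delay stays pinned at `maxDelayMs`.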
### SQLite corruption prevention

- **WAL mode** is always enabled (`PRAGMA journal_mode = WAL`) — writes are safe from corruption even if the process is killed mid-write
- Before snapshots, `PRAGMA wal_checkpoint(FULL)` ensures a consistent file is captured
- `db.integrityCheck()` runs `PRAGMA integrity_check` and returns `true` if the DB is healthy

### Verifying DB health on startup

```typescript
const db = new DB({ sqlitePath: './data.sqlite' });

if (!db.integrityCheck()) {
  console.error('DB corrupted — restore from S3 snapshot');
  process.exit(1);
}
```

---

591
+
592
+ ### Local operations
593
+
594
+ DriftDB uses `better-sqlite3`, which is **synchronous** and consistently one of the fastest SQLite drivers for Node.js. Typical benchmarks:
595
+
596
+ | Operation | Speed |
597
+ |---|---|
598
+ | Single insert | ~50,000 ops/sec |
599
+ | Bulk insert (transaction) | ~500,000 rows/sec |
600
+ | Point lookup by id | ~100,000 ops/sec |
601
+ | Range query (indexed) | ~50,000 rows/sec |
602
+
603
+ ### Sync performance
604
+
605
+ - Log entries are **batched** — a single S3 `PutObject` covers up to `maxBatchSize` (default: 100) operations
606
+ - S3 uploads never block the main thread — they run in the Node.js async I/O pool
607
+ - The sync timer calls `timer.unref()` so it does not keep the process alive after your app logic finishes
608
+
609
+ ### Bulk inserts
610
+
611
+ Wrap bulk inserts in a transaction to avoid per-row WAL flushes:
612
+
613
+ ```typescript
614
+ db.transaction(() => {
615
+ for (const item of largeArray) {
616
+ Items.create(item); // sync internally
617
+ }
618
+ });
619
+ ```
620
+
621
+ ### SQLite pragmas applied by default
622
+
623
+ | Pragma | Value | Reason |
624
+ |---|---|---|
625
+ | `journal_mode` | `WAL` | Non-blocking reads during writes |
626
+ | `synchronous` | `NORMAL` | Fast without sacrificing crash safety |
627
+ | `foreign_keys` | `ON` | Referential integrity |
628
+ | `cache_size` | `-64000` | 64MB page cache |
629
+ | `temp_store` | `MEMORY` | Temp tables in RAM |
630
+
631
+ ---
632
+
633
+ ## Configuration Reference
634
+
635
+ ```typescript
636
+ interface DBConfig {
637
+ sqlitePath: string; // Path to SQLite file, or ':memory:'
638
+
639
+ s3Config?: {
640
+ bucket: string; // S3 bucket name
641
+ region: string; // AWS region
642
+ prefix?: string; // Key prefix (default: 'driftdb')
643
+ accessKeyId?: string; // AWS credentials (optional if using IAM role)
644
+ secretAccessKey?: string;
645
+ endpoint?: string; // Custom endpoint for MinIO/LocalStack
646
+ forcePathStyle?: boolean; // Required for MinIO (default: true if endpoint set)
647
+ };
648
+
649
+ nodeId?: string; // Stable ID for this instance (auto-generated if omitted)
650
+ autoSync?: boolean; // Start sync automatically (default: true if s3Config set)
651
+ syncIntervalMs?: number; // How often to sync (default: 5000ms)
652
+ snapshotEveryNLogs?: number; // Take snapshot every N log entries (default: 1000)
653
+ maxBatchSize?: number; // Max log entries per S3 upload (default: 100)
654
+ compression?: boolean; // gzip compress uploads (default: true)
655
+
656
+ encryption?: {
657
+ key: string; // 64-char hex = 32-byte AES-256 key
658
+ };
659
+
660
+ retryConfig?: {
661
+ maxRetries: number; // Max retry attempts (default: 5)
662
+ baseDelayMs: number; // Base delay for exponential backoff (default: 500)
663
+ maxDelayMs: number; // Max delay cap (default: 30000)
664
+ };
665
+ }
666
+ ```
667
+
668
+ ---
669
+
670
+ ## Advanced Usage
671
+
672
+ ### Using with LocalStack / MinIO
673
+
674
+ ```typescript
675
+ const db = new DB({
676
+ sqlitePath: './local.sqlite',
677
+ s3Config: {
678
+ bucket: 'local-test',
679
+ region: 'us-east-1',
680
+ endpoint: 'http://localhost:4566', // LocalStack
681
+ forcePathStyle: true,
682
+ accessKeyId: 'test',
683
+ secretAccessKey: 'test',
684
+ },
685
+ });
686
+ ```
687
+
688
+ ### Manual sync control
689
+
690
+ Disable auto-sync and flush on demand:
691
+
692
+ ```typescript
693
+ const db = new DB({
694
+ sqlitePath: './data.sqlite',
695
+ s3Config: { bucket: 'my-bucket', region: 'us-east-1' },
696
+ autoSync: false,
697
+ });
698
+
699
+ // Do work...
700
+ await Users.create({ name: 'Alice', email: 'alice@example.com' });
701
+
702
+ // Flush when ready
703
+ await db.flush();
704
+ ```
705
+
706
+ ### Monitoring sync metrics
707
+
708
+ ```typescript
709
+ const metrics = db.getMetrics();
710
+
711
+ console.log({
712
+ isRunning: metrics.isRunning, // Is sync engine active
713
+ pendingChanges: metrics.pendingChanges, // Un-synced log entries
714
+ totalSynced: metrics.totalSynced, // Total entries synced this session
715
+ syncErrors: metrics.syncErrors, // Cumulative error count
716
+ dbSizeBytes: metrics.dbSizeBytes, // SQLite file size
717
+ lastSyncAt: metrics.lastSyncAt, // Timestamp of last successful sync
718
+ lastSnapshotAt: metrics.lastSnapshotAt, // Timestamp of last snapshot
719
+ });
720
+ ```
721
+
722
+ ### Raw SQL access
723
+
724
+ ```typescript
725
+ const db = new DB({ sqlitePath: './data.sqlite' });
726
+
727
+ // Via repository
728
+ const Users = db.define('users', schema);
729
+ const rows = Users.raw<{ name: string; cnt: number }>(
730
+ 'SELECT name, COUNT(*) as cnt FROM users GROUP BY name HAVING cnt > ?',
731
+ [5]
732
+ );
733
+
734
+ // Via raw better-sqlite3 handle
735
+ const sqlite = db.raw();
736
+ sqlite.prepare('CREATE INDEX IF NOT EXISTS ...').run();
737
+ ```
738
+
739
+ ### Multiple models with relations
740
+
741
+ ```typescript
742
+ const Teams = db.define('teams', {
743
+ name: Column.text().required().build(),
744
+ });
745
+
746
+ const Members = db.define('members', {
747
+ teamId: Column.text().required().index().build(),
748
+ userId: Column.text().required().build(),
749
+ role: Column.text().default('member').build(),
750
+ });
751
+
752
+ const team = await Teams.create({ name: 'Engineering' });
753
+ const alice = await Users.create({ name: 'Alice', email: 'a@example.com' });
754
+ await Members.create({ teamId: team.id, userId: alice.id, role: 'lead' });
755
+
756
+ // Query with raw SQL for joins
757
+ const teamMembers = Members.raw<{ name: string; role: string }>(
758
+ `SELECT u.name, m.role
759
+ FROM members m
760
+ JOIN users u ON u.id = m.userId
761
+ WHERE m.teamId = ?`,
762
+ [team.id]
763
+ );
764
+ ```
765
+
766
+ ---
767
+
## Trade-offs & Design Decisions

### Why SQLite?

SQLite is a battle-tested, zero-dependency embedded database. It runs in-process with no separate server, handles GB-scale datasets, and supports full ACID transactions. `better-sqlite3` gives us a synchronous API that maps naturally to local-first patterns.

### Why S3 for persistence?

S3 is the simplest, cheapest, and most universally available durable object store. There's no infrastructure to manage, it scales without any capacity planning on your part, and 11 nines of durability is effectively permanent. For backup and replication purposes, it's a perfect fit.

### Why incremental logs instead of full DB sync?

Uploading the full SQLite file on every write would be:

- Slow for large databases (GB-scale)
- Expensive in S3 PUT costs
- Bandwidth-intensive

Incremental log batching means only the delta is uploaded. A 100-entry batch might be a few KB compressed, regardless of total DB size.

### Why not CRDTs or multi-master sync?

DriftDB is designed for **single-writer per node** scenarios. If you need multi-master conflict resolution, CRDTs (like those in `cr-sqlite`) are the right tool. DriftDB prioritizes simplicity and reliability for the common case.

### Why is sync one-way by default?

Pushing local changes to S3 is the core use case. Pull-based restore (downloading the latest snapshot) is supported via `SnapshotManager.restoreLatest()` but is intentionally a manual operation — DriftDB does not auto-merge remote changes into your local DB during runtime.

### SQLite WAL mode trade-off

WAL mode (`journal_mode = WAL`) means:

- ✅ Readers don't block writers
- ✅ Writers don't block readers
- ✅ Crash safety — incomplete writes are rolled back
- ⚠️ Two extra files: `.sqlite-wal` and `.sqlite-shm` (normal, safe to ignore)
- ⚠️ Slightly larger disk footprint until `wal_checkpoint` runs

DriftDB automatically runs `wal_checkpoint(FULL)` before every snapshot to keep WAL size bounded.

---

## License

MIT