@harperfast/rocksdb-js 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,1095 @@
1
+ # rocksdb-js
2
+
3
+ A Node.js binding for the RocksDB library.
4
+
5
+ ## Features
6
+
7
+ - Supports optimistic and pessimistic transactions
8
+ - Hybrid sync/async data retrieval
9
+ - Range queries return an iterable with array-like methods and lazy evaluation
10
+ - Transaction log system for recording transaction related data
11
+ - Custom stores provide ability to override default database interactions
12
+ - Efficient binary key and value encoding
13
+ - Designed for Node.js and Bun on Linux, macOS, and Windows
14
+
15
+ ## Example
16
+
17
+ ```typescript
18
+ import { RocksDatabase, type Transaction } from '@harperfast/rocksdb-js';
+
+ const db = RocksDatabase.open('/path/to/db');
19
+
20
+ for (const key of ['a', 'b', 'c', 'd', 'e']) {
21
+ await db.put(key, `value ${key}`);
22
+ }
23
+
24
+ console.log(await db.get('b')); // `value b`
25
+
26
+ for (const { key, value } of db.getRange({ start: 'b', end: 'd' })) {
27
+ console.log(`${key} = ${value}`);
28
+ }
29
+
30
+ await db.transaction(async (txn: Transaction) => {
31
+ await txn.put('f', 'value f');
32
+ await txn.remove('c');
33
+ });
34
+ ```
35
+
36
+ ## Usage
37
+
38
+ ### `new RocksDatabase(path, options?)`
39
+
40
+ Creates a new database instance.
41
+
42
+ - `path: string` The path to write the database files to. This path does not need to exist, but the
43
+ parent directories do.
44
+ - `options: object` [optional]
45
+ - `disableWAL: boolean` Whether to disable the RocksDB write ahead log.
46
+ - `name: string` The column family name. Defaults to `"default"`.
47
+ - `noBlockCache: boolean` When `true`, disables the block cache. Block caching is enabled by
48
+ default and the cache is shared across all database instances.
49
+ - `parallelismThreads: number` The number of background threads to use for flush and compaction.
50
+ Defaults to `1`.
51
+ - `pessimistic: boolean` When `true`, throws conflict errors when they occur instead of waiting
52
+ until commit. Defaults to `false`.
53
+ - `store: Store` A custom store that handles all interaction between the `RocksDatabase` or
54
+ `Transaction` instances and the native database interface. See [Custom Store](#custom-store) for
55
+ more information.
56
+ - `transactionLogMaxAgeThreshold: number` The threshold for the transaction log file's last
57
+ modified time to be older than the retention period before it is rotated to the next sequence
58
+ number. The value must be between `0.0` and `1.0`. A threshold of `0.0` disables the age check.
59
+ Defaults to `0.75`.
60
+ - `transactionLogMaxSize: number` The maximum size of a transaction log file. If a log file is
61
+ empty, the first log entry will always be added regardless of whether it's larger than the max size. If
62
+ a log file is not empty and the entry is larger than the space available, the log file is
63
+ rotated to the next sequence number. Defaults to 16 MB.
64
+ - `transactionLogRetention: string | number` The number of minutes to retain transaction logs
65
+ before purging. Defaults to `'3d'` (3 days).
66
+ - `transactionLogsPath: string` The path to store transaction logs. Defaults to
67
+ `"${db.path}/transaction_logs"`.
68
+
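+ For example, a database might be opened with a few of these options (the option values below are
+ purely illustrative):
+
+ ```typescript
+ import { RocksDatabase } from '@harperfast/rocksdb-js';
+
+ const db = new RocksDatabase('/path/to/db', {
+   name: 'users',                 // column family name
+   pessimistic: true,             // surface conflict errors as they occur
+   parallelismThreads: 4,         // background flush/compaction threads
+   transactionLogRetention: '1d'  // keep transaction logs for one day
+ });
+ db.open();
+ ```
+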
69
+ ### `db.close()`
70
+
71
+ Closes a database. This function can be called multiple times and will only close an opened
72
+ database. A database instance can be reopened once it's closed.
73
+
74
+ ```typescript
75
+ const db = RocksDatabase.open('foo');
76
+ db.close();
77
+ ```
78
+
79
+ ### `db.config(options)`
80
+
81
+ Sets global database settings.
82
+
83
+ - `options: object`
84
+ - `blockCacheSize: number` The amount of memory in bytes to use to cache uncompressed blocks.
85
+ Defaults to 32MB. Setting it to `0` (zero) disables the block cache for databases opened in the
86
+ future. Existing block caches for any open databases are resized immediately. Negative values throw an error.
87
+
88
+ ```typescript
89
+ RocksDatabase.config({
90
+ blockCacheSize: 100 * 1024 * 1024, // 100MB
91
+ });
92
+ ```
93
+
94
+ ### `db.open(): RocksDatabase`
95
+
96
+ Opens the database at the given path. This must be called before performing any data operations.
97
+
98
+ ```typescript
99
+ import { RocksDatabase } from '@harperfast/rocksdb-js';
100
+
101
+ const db = new RocksDatabase('path/to/db');
102
+ db.open();
103
+ ```
104
+
105
+ There's also a static `open()` method for convenience that does the same thing:
106
+
107
+ ```typescript
108
+ const db = RocksDatabase.open('path/to/db');
109
+ ```
110
+
111
+ ## Data Operations
112
+
113
+ ### `db.clear(options?): Promise<number>`
114
+
115
+ Asynchronously removes all data in the current database.
116
+
117
+ - `options: object`
118
+ - `batchSize?: number` The number of records to remove at once. Defaults to `10000`.
119
+
120
+ Returns the number of entries that were removed.
121
+
122
+ Note: This does not remove data from other column families within the same database path.
123
+
124
+ ```typescript
125
+ for (let i = 0; i < 10; i++) {
126
+ db.putSync(`key${i}`, `value${i}`);
127
+ }
128
+ const entriesRemoved = await db.clear();
129
+ console.log(entriesRemoved); // 10
130
+ ```
131
+
132
+ ### `db.clearSync(options?): number`
133
+
134
+ Synchronous version of `db.clear()`.
135
+
136
+ - `options: object`
137
+ - `batchSize?: number` The number of records to remove at once. Defaults to `10000`.
138
+
139
+ ```typescript
140
+ for (let i = 0; i < 10; i++) {
141
+ db.putSync(`key${i}`, `value${i}`);
142
+ }
143
+ const entriesRemoved = db.clearSync();
144
+ console.log(entriesRemoved); // 10
145
+ ```
146
+
147
+ ### `db.drop(): Promise<void>`
148
+
149
+ Removes all entries in the database. If the database was opened with a `name`, the database will be
150
+ deleted on close.
151
+
152
+ ```typescript
153
+ const db = RocksDatabase.open('path/to/db', { name: 'users' });
154
+ await db.drop();
155
+ db.close();
156
+ ```
157
+
158
+ ### `db.dropSync(): void`
159
+
160
+ Synchronous version of `db.drop()`.
161
+
162
+ ```typescript
163
+ const db = RocksDatabase.open('path/to/db');
164
+ db.dropSync();
165
+ db.close();
166
+ ```
167
+
168
+ ### `db.get(key: Key, options?: GetOptions): MaybePromise<any>`
169
+
170
+ Retrieves the value for a given key. If the key does not exist, it will resolve to `undefined`.
171
+
172
+ ```typescript
173
+ const result = await db.get('foo');
174
+ assert.equal(result, 'foo');
175
+ ```
176
+
177
+ If the value is in the memtable or block cache, `get()` will immediately return the value
178
+ synchronously instead of returning a promise.
179
+
180
+ ```typescript
181
+ const result = db.get('foo');
182
+ const value = result instanceof Promise ? (await result) : result;
183
+ assert.equal(value, 'foo');
184
+ ```
185
+
186
+ Note that all errors are returned as rejected promises.
187
+
188
+ ### `db.getSync(key: Key, options?: GetOptions): any`
189
+
190
+ Synchronous version of `get()`.
191
+
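+ For example, reading a value synchronously (this blocks the current thread until the read completes):
+
+ ```typescript
+ const value = db.getSync('foo'); // `undefined` if the key does not exist
+ ```
+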
192
+ ### `db.getKeys(options?: IteratorOptions): ExtendedIterable`
193
+
194
+ Retrieves all keys within a range.
195
+
196
+ ```typescript
197
+ for (const key of db.getKeys()) {
198
+ console.log(key);
199
+ }
200
+ ```
201
+
202
+ ### `db.getKeysCount(options?: RangeOptions): number`
203
+
204
+ Retrieves the number of keys within a range.
205
+
206
+ ```typescript
207
+ const total = db.getKeysCount();
208
+ const range = db.getKeysCount({ start: 'a', end: 'z' });
209
+ ```
210
+
211
+ ### `db.getMonotonicTimestamp(): number`
212
+
213
+ Returns a monotonically increasing timestamp in milliseconds, represented as
214
+ a decimal number.
215
+
216
+ ```typescript
217
+ const ts = db.getMonotonicTimestamp();
218
+ console.log(ts); // 1764307857213.739
219
+ ```
220
+
221
+ ### `db.getOldestSnapshotTimestamp(): number`
222
+
223
+ Returns a number representing a unix timestamp of the oldest unreleased snapshot.
224
+
225
+ Snapshots are only created during transactions. When the database is opened in optimistic mode (the
226
+ default), the snapshot will be created on the first read. When the database is opened in pessimistic
227
+ mode, the snapshot will be created on the first read or write.
228
+
229
+ ```typescript
230
+ import { setTimeout } from 'node:timers/promises';
+
+ console.log(db.getOldestSnapshotTimestamp()); // returns `0`, no snapshots
231
+
232
+ const promise = db.transaction(async (txn) => {
233
+ // perform a read to create a snapshot
234
+ await txn.get('foo');
235
+ await setTimeout(100);
236
+ });
237
+
238
+ console.log(db.getOldestSnapshotTimestamp()); // returns `1752102248558`
239
+
240
+ await promise;
241
+ // transaction completes, snapshot released
242
+
243
+ console.log(db.getOldestSnapshotTimestamp()); // returns `0`, no snapshots
244
+ ```
245
+
246
+ ### `db.getDBProperty(propertyName: string): string`
247
+
248
+ Gets a RocksDB database property as a string.
249
+
250
+ - `propertyName: string` The name of the property to retrieve (e.g., `'rocksdb.levelstats'`).
251
+
252
+ ```typescript
253
+ const db = RocksDatabase.open('/path/to/database');
254
+ const levelStats = db.getDBProperty('rocksdb.levelstats');
255
+ const stats = db.getDBProperty('rocksdb.stats');
256
+ ```
257
+
258
+ ### `db.getDBIntProperty(propertyName: string): number`
259
+
260
+ Gets a RocksDB database property as an integer.
261
+
262
+ - `propertyName: string` The name of the property to retrieve (e.g., `'rocksdb.num-blob-files'`).
263
+
264
+ ```typescript
265
+ const db = RocksDatabase.open('/path/to/database');
266
+ const blobFiles = db.getDBIntProperty('rocksdb.num-blob-files');
267
+ const numKeys = db.getDBIntProperty('rocksdb.estimate-num-keys');
268
+ ```
269
+
270
+ ### `db.getRange(options?: IteratorOptions): ExtendedIterable`
271
+
272
+ Retrieves a range of keys and their values. Supports both synchronous and asynchronous iteration.
273
+
274
+ ```typescript
275
+ // sync
276
+ for (const { key, value } of db.getRange()) {
277
+ console.log({ key, value });
278
+ }
279
+
280
+ // async
281
+ for await (const { key, value } of db.getRange()) {
282
+ console.log({ key, value });
283
+ }
284
+
285
+ // key range
286
+ for (const { key, value } of db.getRange({ start: 'a', end: 'z' })) {
287
+ console.log({ key, value });
288
+ }
289
+ ```
290
+
291
+ ### `db.getUserSharedBuffer(key: Key, defaultBuffer: ArrayBuffer, options?)`
292
+
293
+ Creates a new buffer with the contents of `defaultBuffer` that can be accessed across threads. This
294
+ is useful for storing data such as flags, counters, or any ArrayBuffer-based data.
295
+
296
+ - `options?: object`
297
+ - `callback?: () => void` An optional callback that is called when `notify()` on the returned buffer is
298
+ called.
299
+
300
+ Returns a new `ArrayBuffer` with two additional methods:
301
+
302
+ - `notify()` - Invokes the `options.callback`, if specified.
303
+ - `cancel()` - Removes the callback; future `notify()` calls do nothing.
304
+
305
+ Note: If a shared buffer already exists for the given `key`, the returned `ArrayBuffer` will
306
+ reference this existing shared buffer. Once all `ArrayBuffer` instances have gone out of scope and
307
+ been garbage collected, the underlying memory and notify callback will be freed.
308
+
309
+ ```typescript
310
+ const done = new Uint8Array(db.getUserSharedBuffer('isDone', new ArrayBuffer(1)));
311
+ done[0] = 0;
312
+
313
+ if (done[0] !== 1) {
314
+ done[0] = 1;
315
+ }
316
+ ```
317
+
318
+ ```typescript
319
+ const incrementer = new BigInt64Array(
320
+ db.getUserSharedBuffer('next-id', new BigInt64Array(1).buffer)
321
+ );
322
+ incrementer[0] = 1n;
323
+
324
+ function getNextId() {
325
+ return Atomics.add(incrementer, 0, 1n);
326
+ }
327
+ ```
328
+
329
+ ### `db.put(key: Key, value: any, options?: PutOptions): Promise`
330
+
331
+ Stores a value for a given key.
332
+
333
+ ```typescript
334
+ await db.put('foo', 'bar');
335
+ ```
336
+
337
+ ### `db.putSync(key: Key, value: any, options?: PutOptions): void`
338
+
339
+ Synchronous version of `put()`.
340
+
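+ For example, writing a value synchronously:
+
+ ```typescript
+ db.putSync('foo', 'bar');
+ ```
+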
341
+ ### `db.remove(key: Key): Promise`
342
+
343
+ Removes the value for a given key.
344
+
345
+ ```typescript
346
+ await db.remove('foo');
347
+ ```
348
+
349
+ ### `db.removeSync(key: Key): void`
350
+
351
+ Synchronous version of `remove()`.
352
+
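+ For example, removing a value synchronously:
+
+ ```typescript
+ db.removeSync('foo');
+ ```
+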
353
+ ## Transactions
354
+
355
+ ### `db.transaction(async (txn: Transaction) => void | Promise<any>): Promise<any>`
356
+
357
+ Executes all database operations in the specified callback within a single transaction. If the
358
+ callback completes without error, the database operations are automatically committed. However, if
359
+ an error is thrown during the callback, all database operations will be rolled back.
360
+
361
+ ```typescript
362
+ import type { Transaction } from '@harperfast/rocksdb-js';
363
+ await db.transaction(async (txn: Transaction) => {
364
+ await txn.put('foo', 'baz');
365
+ });
366
+ ```
367
+
368
+ Additionally, you may pass the transaction into any database data method:
369
+
370
+ ```typescript
371
+ await db.transaction(async (transaction: Transaction) => {
372
+ await db.put('foo', 'baz', { transaction });
373
+ });
374
+ ```
375
+
376
+ Note that `db.transaction()` returns whatever value the transaction callback returns:
377
+
378
+ ```typescript
379
+ const isBar = await db.transaction(async (txn: Transaction) => {
380
+ const foo = await txn.get('foo');
381
+ return foo === 'bar';
382
+ });
383
+ console.log(isBar ? 'Foo is bar' : 'Foo is not bar');
384
+ ```
385
+
386
+ ### `db.transactionSync((txn: Transaction) => any): any`
387
+
388
+ Executes a transaction callback and commits synchronously. Once the transaction callback returns,
389
+ the commit is executed synchronously and blocks the current thread until finished.
390
+
391
+ Inside a synchronous transaction, use `getSync()`, `putSync()`, and `removeSync()`.
392
+
393
+ ```typescript
394
+ import type { Transaction } from '@harperfast/rocksdb-js';
395
+ db.transactionSync((txn: Transaction) => {
396
+ txn.putSync('foo', 'baz');
397
+ });
398
+ ```
399
+
400
+ ### Class: `Transaction`
401
+
402
+ The transaction callback is passed a `Transaction` instance, which contains all of the same data
403
+ operation methods as the `RocksDatabase` instance, plus:
404
+
405
+ - `txn.abort()`
406
+ - `txn.commit()`
407
+ - `txn.commitSync()`
408
+ - `txn.getTimestamp()`
409
+ - `txn.id`
410
+ - `txn.setTimestamp(ts)`
411
+
412
+ #### `txn.abort(): void`
413
+
414
+ Rolls back and closes the transaction. This method is automatically called after the transaction
415
+ callback returns, so you shouldn't need to call it, but it's safe to do so. Once called, no further
416
+ transaction operations are permitted.
417
+
418
+ #### `txn.commit(): Promise<void>`
419
+
420
+ Commits and closes the transaction. This is a non-blocking operation and runs on a background
421
+ thread. Once called, no further transaction operations are permitted.
422
+
423
+ #### `txn.commitSync(): void`
424
+
425
+ Synchronously commits and closes the transaction. This is a blocking operation on the main thread.
426
+ Once called, no further transaction operations are permitted.
427
+
428
+ #### `txn.getTimestamp(): number`
429
+
430
+ Retrieves the transaction start timestamp in seconds as a decimal. It defaults to the time at which
431
+ the transaction was created.
432
+
433
+ #### `txn.id`
434
+
435
+ Type: `number`
436
+
437
+ The transaction ID represented as a 32-bit unsigned integer. Transaction IDs are unique to the
438
+ RocksDB database path, regardless of the database name/column family.
439
+
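+ For example, inspecting a transaction's id and start timestamp inside the callback (the logged
+ values are illustrative):
+
+ ```typescript
+ await db.transaction(async (txn) => {
+   console.log(txn.id);             // e.g. 42
+   console.log(txn.getTimestamp()); // start time in seconds, e.g. 1752102248.558
+ });
+ ```
+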
440
+ #### `txn.setTimestamp(ts?: number): void`
441
+
442
+ Overrides the transaction start timestamp. If called without a timestamp, it will set the timestamp
443
+ to the current time. The value must be in seconds, with sub-second precision in the decimal portion.
444
+
445
+ ```typescript
446
+ await db.transaction(async (txn) => {
447
+ txn.setTimestamp(Date.now() / 1000);
448
+ });
449
+ ```
450
+
451
+ ## Events
452
+
453
+ ### Event: `'aftercommit'`
454
+
455
+ The `'aftercommit'` event is emitted after a transaction has been committed and the transaction has
456
+ completed, including waiting for the async worker thread to finish.
457
+
458
+ - `result: object`
459
+ - `next: null`
460
+ - `last: null`
461
+ - `txnId: number` The id of the transaction that was just committed.
462
+
463
+ ### Event: `'beforecommit'`
464
+
465
+ The `'beforecommit'` event is emitted before a transaction is about to be committed.
466
+
467
+ ### Event: `'begin-transaction'`
468
+
469
+ The `'begin-transaction'` event is emitted right before the transaction function is executed.
470
+
471
+ ### Event: `'committed'`
472
+
473
+ The `'committed'` event is emitted after the transaction has been written. When this event is
474
+ emitted, the transaction is still cleaning up. If you need to know when the transaction is fully
475
+ complete, use the `'aftercommit'` event.
476
+
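+ These events can be observed with the listener API described in the next section. A minimal sketch,
+ assuming transaction events are delivered through the same `on()`/`addListener()` methods:
+
+ ```typescript
+ db.on('aftercommit', ({ txnId }) => {
+   console.log(`transaction ${txnId} fully committed`);
+ });
+
+ await db.transaction(async (txn) => {
+   await txn.put('foo', 'bar');
+ });
+ ```
+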
477
+ ## Event API
478
+
479
+ `rocksdb-js` provides an EventEmitter-like API that lets you asynchronously notify events to one or
480
+ more synchronous listener callbacks. Events are scoped by database path.
481
+
482
+ Unlike `EventEmitter`, events are emitted asynchronously, but in the same order that the listeners
483
+ were added.
484
+
485
+ ```typescript
486
+ const callback = (name) => console.log(`Hi from ${name}`);
487
+ db.addListener('foo', callback);
488
+ db.notify('foo');
489
+ db.notify('foo', 'bar');
490
+ db.removeListener('foo', callback);
491
+ ```
492
+
493
+ ### `addListener(event: string, callback: () => void): void`
494
+
495
+ Adds a listener callback for the given event.
496
+
497
+ ```typescript
498
+ db.addListener('foo', () => {
499
+ // this callback will be executed asynchronously
500
+ });
501
+
502
+ db.addListener(1234, (...args) => {
503
+ console.log(args);
504
+ });
505
+ ```
506
+
507
+ ### `listeners(event: string): number`
508
+
509
+ Gets the number of listeners for the given event.
510
+
511
+ ```typescript
512
+ db.listeners('foo'); // 0
513
+ db.addListener('foo', () => {});
514
+ db.listeners('foo'); // 1
515
+ ```
516
+
517
+ ### `on(event: string, callback: () => void): void`
518
+
519
+ Alias for `addListener()`.
520
+
521
+ ### `once(event: string, callback: () => void): void`
522
+
523
+ Adds a one-time listener, then automatically removes it.
524
+
525
+ ```typescript
526
+ db.once('foo', () => {
527
+ console.log('This will only ever be called once');
528
+ });
529
+ ```
530
+
531
+ ### `removeListener(event: string, callback: () => void): boolean`
532
+
533
+ Removes an event listener. You must specify the exact same callback that was used in
534
+ `addListener()`.
535
+
536
+ ```typescript
537
+ const callback = () => {};
538
+ db.addListener('foo', callback);
539
+
540
+ db.removeListener('foo', callback); // return `true`
541
+ db.removeListener('foo', callback); // return `false`, callback not found
542
+ ```
543
+
544
+ ### `off(event: string, callback: () => void): boolean`
545
+
546
+ Alias for `removeListener()`.
547
+
548
+ ### `notify(event: string, ...args?): boolean`
549
+
550
+ Calls all listeners for the given event. Returns `true` if any callbacks were found, otherwise `false`.
551
+
552
+ Unlike `EventEmitter`, events are emitted asynchronously, but in the same order that the listeners
553
+ were added.
554
+
555
+ You can optionally emit one or more arguments. Note that the arguments must be serializable. In
556
+ other words, `undefined`, `null`, strings, booleans, numbers, arrays, and objects are supported.
557
+
558
+ ```typescript
559
+ db.notify('foo');
560
+ db.notify(1234);
561
+ db.notify({ key: 'bar' }, { value: 'baz' });
562
+ ```
563
+
564
+ ## Exclusive Locking
565
+
566
+ `rocksdb-js` includes a handful of functions for executing code with thread-safe, mutually exclusive access.
567
+
568
+ ### `db.hasLock(key: Key): boolean`
569
+
570
+ Returns `true` if the database has a lock for the given key, otherwise `false`.
571
+
572
+ ```typescript
573
+ db.hasLock('foo'); // false
574
+ db.tryLock('foo'); // true
575
+ db.hasLock('foo'); // true
576
+ ```
577
+
578
+ ### `db.tryLock(key: Key, onUnlocked?: () => void): boolean`
579
+
580
+ Attempts to acquire a lock for a given key. If the lock is available, the function returns `true`
581
+ and the optional `onUnlocked` callback is never called. If the lock is not available, the function
582
+ returns `false` and the `onUnlocked` callback is queued until the lock is released.
583
+
584
+ When a database is closed, all locks associated to it will be unlocked.
585
+
586
+ ```typescript
587
+ db.tryLock('foo', () => {
588
+ console.log('never fired');
589
+ }); // true, callback ignored
590
+
591
+ db.tryLock('foo', () => {
592
+ console.log('hello world');
593
+ }); // false, already locked, callback queued
594
+
595
+ db.unlock('foo'); // fires second lock callback
596
+ ```
597
+
598
+ The `onUnlocked` callback function can be used to signal to retry acquiring the lock:
599
+
600
+ ```typescript
601
+ function doSomethingExclusively() {
602
+ // if lock is unavailable, queue up callback to recursively retry
603
+ if (db.tryLock('foo', () => doSomethingExclusively())) {
604
+ // lock acquired, do something exclusive
605
+
606
+ db.unlock('foo');
607
+ }
608
+ }
609
+ ```
610
+
611
+ ### `db.unlock(key): boolean`
612
+
613
+ Releases the lock on the given key and calls any queued `onUnlocked` callback handlers. Returns
614
+ `true` if the lock was released or `false` if the lock did not exist.
615
+
616
+ ```typescript
617
+ db.tryLock('foo');
618
+ db.unlock('foo'); // true
619
+ db.unlock('foo'); // false, already unlocked
620
+ ```
621
+
622
+ ### `db.withLock(key: Key, callback: () => void | Promise<void>): Promise<void>`
623
+
624
+ Runs a function with guaranteed exclusive access across all threads.
625
+
626
+ ```typescript
627
+ await db.withLock('key', async () => {
628
+ // do something exclusive
629
+ console.log(db.hasLock('key')); // true
630
+ });
631
+ ```
632
+
633
+ If there is more than one simultaneous lock request, subsequent requests are blocked until the lock is
634
+ available.
635
+
636
+ ```typescript
637
+ await Promise.all([
638
+ db.withLock('key', () => {
639
+ console.log('first lock blocking for 100ms');
640
+ return new Promise(resolve => setTimeout(resolve, 100));
641
+ }),
642
+ db.withLock('key', () => {
643
+ console.log('second lock blocking for 100ms');
644
+ return new Promise(resolve => setTimeout(resolve, 100));
645
+ }),
646
+ db.withLock('key', () => {
647
+ console.log('third lock acquired');
648
+ }),
649
+ ]);
650
+ ```
651
+
652
+ Note: If the `callback` throws an error, Node.js suppresses the error. Node.js 18.3.0 introduced a
653
+ `--force-node-api-uncaught-exceptions-policy` flag which will cause errors to emit the
654
+ `'uncaughtException'` event. Future Node.js releases will enable this flag by default.
655
+
656
+ ### `db.flush(): Promise<void>`
657
+
658
+ Flushes all in-memory data to disk asynchronously.
659
+
660
+ ```typescript
661
+ await db.flush();
662
+ ```
663
+
664
+ ### `db.flushSync(): void`
665
+
666
+ Flushes all in-memory data to disk synchronously. Note that this can be an expensive operation, so
667
+ it is recommended to use `flush()` if you want to keep the event loop free.
668
+
669
+ ```typescript
670
+ db.flushSync();
671
+ ```
672
+
673
+ ## Transaction Log
674
+
675
+ A user-controlled API for logging transactions. This API is designed to be generic so that you can
676
+ log gets, puts, and deletes, but also arbitrary entries.
677
+
678
+ ### `db.listLogs(): string[]`
679
+
680
+ Returns an array of log store names.
681
+
682
+ ```typescript
683
+ const names = db.listLogs();
684
+ ```
685
+
686
+ ### `db.purgeLogs(options?): string[]`
687
+
688
+ Deletes transaction log files older than the `transactionLogRetention` (defaults to 3 days).
689
+
690
+ - `options: object`
691
+ - `destroy?: boolean` When `true`, deletes transaction log stores including all log sequence files
692
+ on disk.
693
+ - `name?: string` The name of a store to limit the purging to.
694
+
695
+ Returns an array with the full path of each log file deleted.
696
+
697
+ ```typescript
698
+ const removed = db.purgeLogs();
699
+ console.log(`Removed ${removed.length} log files`);
700
+ ```
701
+
702
+ ### `db.useLog(name): TransactionLog`
703
+
704
+ Gets or creates a `TransactionLog` instance. Internally, the `TransactionLog` interfaces with a
705
+ shared transaction log store that is used by all threads. Multiple worker threads can use the same
706
+ log at the same time.
707
+
708
+ - `name: string | number` The name of the log. Numeric log names are converted to a string.
709
+
710
+ ```typescript
711
+ const log1 = db.useLog('foo');
712
+ const log2 = db.useLog('foo'); // gets existing instance (e.g. log1 === log2)
713
+ const log3 = db.useLog(123);
714
+ ```
715
+
716
+ `Transaction` instances also provide a `useLog()` method that binds the returned transaction log to
717
+ the transaction so you don't need to pass in the transaction id every time you add an entry.
718
+
719
+ ```typescript
720
+ await db.transaction(async (txn) => {
721
+ const log = txn.useLog('foo');
722
+ log.addEntry(Buffer.from('hello'));
723
+ });
724
+ ```
725
+
726
+ ### Class: `TransactionLog`
727
+
728
+ A `TransactionLog` lets you add arbitrary data bound to a transaction that is automatically written
729
+ to disk right before the transaction is committed. You may add multiple entries per transaction. The
730
+ underlying architecture is thread safe.
731
+
732
+ - `log.addEntry()`
733
+ - `log.query()`
734
+
735
+ #### `log.addEntry(data, transactionId): void`
736
+
737
+ Adds an entry to the transaction log.
738
+
739
+ - `data: Buffer | Uint8Array` The entry data to store. There is no inherent limit beyond what
740
+ Node.js can handle.
741
+ - `transactionId: number` The id of the related transaction, used to batch entries on commit.
742
+
743
+ ```typescript
744
+ const log = db.useLog('foo');
745
+ await db.transaction(async (txn) => {
746
+ log.addEntry(Buffer.from('hello'), txn.id);
747
+ });
748
+ ```
749
+
750
+ If using `txn.useLog()` (instead of `db.useLog()`), you can omit the transaction id from
751
+ `addEntry()` calls.
752
+
753
+ ```typescript
754
+ await db.transaction(async (txn) => {
755
+ const log = txn.useLog('foo');
756
+ log.addEntry(Buffer.from('hello'));
757
+ });
758
+ ```
759
+
760
+ Note that the `TransactionLog` class also has internal methods `_getMemoryMapOfFile`,
761
+ `_findPosition`, and `_getLastCommittedPosition` that should not be used directly and may change in
762
+ any version.
763
+
764
+ #### `log.query(options?): IterableIterator<TransactionLogEntry>`
765
+
766
+ Returns an iterable/iterator that streams all log entries for the given filter.
767
+
768
+ - `options: object`
769
+ - `start?: number` The transaction start timestamp.
770
+ - `end?: number` The transaction end timestamp.
771
+ - `exclusiveStart?: boolean` When `true`, this will only match transactions with timestamps after
772
+ the start timestamp.
773
+ - `exactStart?: boolean` When `true`, this will only match and iterate starting from a transaction
774
+ with the given start timestamp. Once the specified transaction is found, all subsequent
775
+ transactions will be returned (regardless of whether their timestamp comes before the `start`
776
+ time). This can be combined with `exclusiveStart`, finding the specified transaction, and returning
777
+ all transactions that follow. By default, all transactions equal to or greater than the start
778
+ timestamp will be included.
779
+ - `readUncommitted?: boolean` When `true`, this will include uncommitted transaction entries.
780
+ Normally, transaction entries that haven't finished committing are not included. This is
781
+ particularly useful for replaying transaction logs on startup where many entries may have been
782
+ written to the log but are no longer considered committed if they were not flushed to disk.
783
+ - `startFromLastFlushed?: boolean` When `true`, this will only match transactions that have been
784
+ flushed from RocksDB's memtables to disk (and are within any provided `start` and `end` filters,
785
+ if included). This is useful for replaying transaction logs on startup where many entries may
786
+ have been written to the log but are no longer considered committed if they were not flushed to
787
+ disk.
788
+
789
+ The iterator produces an object with the log entry timestamp and data.
790
+
791
+ - `object`
792
+ - `data: Buffer` The entry data.
793
+ - `timestamp: number` The entry timestamp used to collate entries by transaction.
794
+ - `endTxn: boolean` This is `true` when the entry is the last entry in a transaction.
795
+
796
+ ```typescript
797
+ const log = db.useLog('foo');
798
+ const iter = log.query({});
799
+ for (const entry of iter) {
800
+ console.log(entry);
801
+ }
802
+
803
+ const lastHour = Date.now() - (60 * 60 * 1000);
804
+ const rangeIter = log.query({ start: lastHour, end: Date.now() });
805
+ for (const entry of rangeIter) {
806
+ console.log(entry.timestamp, entry.data);
807
+ }
808
+ ```
809
+
810
+ #### `log.getLogFileSize(sequenceNumber?: number): number`
811
+
812
+ Returns the size of the given transaction log sequence file in bytes. Omit the sequence number to
813
+ get the total size of all the transaction log sequence files for this log.
814
+
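+ For example (the sequence number below is illustrative):
+
+ ```typescript
+ const log = db.useLog('foo');
+ const totalSize = log.getLogFileSize();  // total size of all sequence files for this log
+ const fileSize = log.getLogFileSize(1);  // size of sequence file 1 only
+ ```
+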
815
+ ### Transaction Log Parser
816
+
817
+ #### `parseTransactionLog(file)`
818
+
819
+ In general, you should use `log.query()` to query the transaction log, however, if you need to load
820
+ an entire transaction log into memory and want detailed information about entries, you can use the
821
+ `parseTransactionLog()` utility function.
822
+
823
+ ```typescript
824
+ const everything = parseTransactionLog('/path/to/file.txnlog');
825
+ console.log(everything);
826
+ ```
827
+
828
+ Returns an object containing all of the information in the log file.
829
+
830
+ - `size: number` The size of the file.
831
+ - `version: number` The log file format version.
832
+ - `entries: LogEntry[]` An array of transaction log entries.
833
+ - `data: Buffer` The entry data.
834
+ - `flags: number` Transaction related flags.
835
+ - `length: number` The size of the entry data.
836
+ - `timestamp: number` The entry timestamp.
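+ For example, iterating the parsed entries using the fields documented above:
+
+ ```typescript
+ const { version, entries } = parseTransactionLog('/path/to/file.txnlog');
+ console.log(`log format v${version}, ${entries.length} entries`);
+ for (const { timestamp, length, data } of entries) {
+   console.log(timestamp, length, data);
+ }
+ ```
+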
837
+
838
+ ### `shutdown(): void`
839
+
840
+ The `shutdown()` function flushes all in-memory data to disk and waits for any outstanding compactions to
841
+ finish for all open databases. It is highly recommended to call this in a `process` `exit` event
842
+ listener (on the main thread), to ensure that all data is flushed to disk before the process exits:
843
+
844
+ ```typescript
845
+ import { shutdown } from '@harperfast/rocksdb-js';
846
+ process.on('exit', shutdown);
847
+ ```
848
+
849
+ ## Custom Store
850
+
851
+ The store is a class that sits between the `RocksDatabase` or `Transaction` instance and the native
852
+ RocksDB interface. It owns the native RocksDB instance along with various settings including
853
+ encoding and the db name. It handles all interactions with the native RocksDB instance.
854
+
855
+ The default `Store` contains the following methods which can be overridden:
856
+
857
+ - `constructor(path, options?)`
858
+ - `close()`
859
+ - `decodeKey(key)`
860
+ - `decodeValue(value)`
861
+ - `encodeKey(key)`
862
+ - `encodeValue(value)`
863
+ - `get(context, key, resolve, reject, txnId?)`
864
+ - `getCount(context, options?, txnId?)`
865
+ - `getRange(context, options?)`
866
+ - `getSync(context, key, options?)`
867
+ - `getUserSharedBuffer(key, defaultBuffer?)`
868
+ - `hasLock(key)`
869
+ - `isOpen()`
870
+ - `listLogs()`
871
+ - `open()`
872
+ - `putSync(context, key, value, options?)`
873
+ - `removeSync(context, key, options?)`
874
+ - `tryLock(key, onUnlocked?)`
875
+ - `unlock(key)`
876
+ - `useLog(context, name)`
877
+ - `withLock(key, callback?)`
878
+
879
+ To use it, extend the default `Store` and pass in an instance of your store into the `RocksDatabase`
880
+ constructor.
881
+
882
+ ```typescript
883
+ import { RocksDatabase, Store } from '@harperfast/rocksdb-js';
884
+
885
+ class MyStore extends Store {
886
+ get(context, key, resolve, reject, txnId) {
887
+ console.log('Getting:', key);
888
+ return super.get(context, key, resolve, reject, txnId);
889
+ }
890
+
891
+ putSync(context, key, value, options) {
892
+ console.log('Putting:', key);
893
+ return super.putSync(context, key, value, options);
894
+ }
895
+ }
896
+
897
+ const myStore = new MyStore('path/to/db');
898
+ const db = RocksDatabase.open(myStore);
899
+ await db.put('foo', 'bar');
900
+ console.log(await db.get('foo'));
901
+ ```
902
+
903
+ > [!IMPORTANT]
904
+ > If your custom store overrides `putSync()` without calling `super.putSync()` and it performs its
905
+ > own `this.encodeKey(key)`, then you MUST encode the VALUE before you encode the KEY.
906
+ >
907
+ > Keys are encoded into a shared buffer. If the database is opened with the `sharedStructuresKey`
908
+ > option, encoding the value will load and save the structures which encodes the
909
+ > `sharedStructuresKey` overwriting the encoded key in the shared key buffer, so it's ultra
910
+ > important that you encode the value first!
911
+
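+ A sketch of that ordering for a store that overrides `putSync()` and performs its own encoding (the
+ `writeNative()` call is a hypothetical stand-in for however your store writes to the native
+ interface):
+
+ ```typescript
+ class MyEncodingStore extends Store {
+   putSync(context, key, value, options) {
+     // encode the value FIRST; value encoding may write shared structures into the shared key buffer
+     const encodedValue = this.encodeValue(value);
+     // only then encode the key into the shared key buffer
+     const encodedKey = this.encodeKey(key);
+     // hypothetical native write, for illustration only
+     return this.writeNative(context, encodedKey, encodedValue, options);
+   }
+ }
+ ```
+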
912
+ ## Interfaces
913
+
914
+ ### `RocksDBOptions`
915
+
916
+ - `options: object`
917
+ - `adaptiveReadahead: boolean` When `true`, RocksDB enables enhancements for prefetching the
918
+ data. Defaults to `true`. Note that RocksDB defaults this to `false`.
919
+ - `asyncIO: boolean` When `true`, RocksDB will prefetch some data asynchronously and apply it if reads are
920
+ sequential and its internal automatic prefetching is enabled. Defaults to `true`. Note that RocksDB
921
+ defaults this to `false`.
922
+ - `autoReadaheadSize: boolean` When `true`, RocksDB will auto-tune the readahead size during scans
923
+ internally based on the block cache data when block caching is enabled, an end key (e.g. upper
924
+ bound) is set, and the prefix is the same as the start key. Defaults to `true`.
925
+ - `backgroundPurgeOnIteratorCleanup: boolean` When `true`, after the iterator is closed, a
926
+ background job is scheduled to flush the job queue and delete obsolete files. Defaults to
927
+ `true`. Note that RocksDB defaults this to `false`.
928
+ - `fillCache: boolean` When `true`, the iterator will fill the block cache. Filling the block
929
+ cache is not desirable for bulk scans and could impact eviction order. Defaults to `false`. Note
930
+ that RocksDB defaults this to `true`.
931
+ - `readaheadSize: number` The RocksDB readahead size. RocksDB does auto-readahead for iterators
932
+ when there are more than two reads for a table file. The readahead starts at 8KB and doubles on
933
+ every additional read up to 256KB. This option can help if most of the range scans are large and
934
+ if a larger readahead than that enabled by auto-readahead is needed. Using a large readahead
935
+ size (> 2MB) can typically improve the performance of forward iteration on spinning disks.
936
+ Defaults to `0`.
937
+ - `tailing: boolean` When `true`, creates a "tailing iterator" which is a special iterator that
938
+ has a view of the complete database including newly added data and is optimized for sequential
939
+ reads. This will return records that were inserted into the database after the creation of the
940
+ iterator. Defaults to `false`.
941
+
942
+ ### `RangeOptions`
943
+
944
+ Extends `RocksDBOptions`.
945
+
946
+ - `options: object`
947
+ - `end: Key | Uint8Array` The range end key, otherwise known as the "upper bound". Defaults to the
948
+ last key in the database.
949
+ - `exclusiveStart: boolean` When `true`, the iterator will exclude the first key if it matches the
950
+ start key. Defaults to `false`.
951
+ - `inclusiveEnd: boolean` When `true`, the iterator will include the last key if it matches the
952
+ end key. Defaults to `false`.
953
+ - `start: Key | Uint8Array` The range start key, otherwise known as the "lower bound". Defaults to
954
+ the first key in the database.
955
+
956
+ ### `IteratorOptions`
957
+
958
+ Extends `RangeOptions`.
959
+
960
+ - `options: object`
961
+ - `reverse: boolean` When `true`, the iterator will iterate in reverse order. Defaults to `false`.
962
+
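+ Putting these together, an iterator can be tuned per call (the keys and option values below are
+ illustrative):
+
+ ```typescript
+ for (const { key, value } of db.getRange({
+   start: 'a',          // lower bound
+   end: 'm',            // upper bound
+   inclusiveEnd: true,  // include 'm' if it exists
+   reverse: true,       // iterate from the end key down to the start key
+   fillCache: false     // avoid polluting the block cache during a bulk scan
+ })) {
+   console.log(key, value);
+ }
+ ```
+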
963
+ ## Development
964
+
965
+ This package requires Node.js 18 or higher, pnpm, and a C++ compiler.
966
+
967
+ > [!TIP]
968
+ > Enable pnpm log streaming to see full build output:
969
+ >
970
+ > ```
971
+ > pnpm config set stream true
972
+ > ```
973
+
974
+ ### Building
975
+
976
+ There are two things being built: the native binding and the TypeScript code. Each of those can be
977
+ built to be debug friendly.
978
+
979
+ | Description | Command |
980
+ | -------------------------------------------- | ---------------------------------------- |
981
+ | Production build (minified + native binding) | `pnpm build` |
982
+ | TypeScript only (minified) | `pnpm build:bundle` |
983
+ | TypeScript only (unminified) | `pnpm build:debug` |
984
+ | Native binding only (prod) | `pnpm rebuild` |
985
+ | Native binding only (with debug logging) | `pnpm rebuild:debug` |
986
+ | Debug build everything | `pnpm build:debug && pnpm rebuild:debug` |
987
+
988
+ When building the native binding, it will download the appropriate prebuilt RocksDB library for your
989
+ platform and architecture from the
990
+ [rocksdb-prebuilds](https://github.com/HarperFast/rocksdb-prebuilds) GitHub repository. It defaults
991
+ to the pinned version in the `package.json` file. You can override this by setting the
992
+ `ROCKSDB_VERSION` environment variable. For example:
993
+
994
+ ```bash
995
+ ROCKSDB_VERSION=10.9.1 pnpm build
996
+ ```
997
+
998
+ You may also specify `latest` to use the latest prebuilt version.
999
+
1000
+ ```bash
1001
+ ROCKSDB_VERSION=latest pnpm build
1002
+ ```
1003
+
1004
+ Optionally, you may also create a `.env` file in the root of the project to specify various
1005
+ settings. For example:
1006
+
1007
+ ```bash
1008
+ echo "ROCKSDB_VERSION=10.9.1" >> .env
1009
+ ```
1010
+
1011
+ ### Linux C runtime versions
1012
+
1013
+ When you compile `rocksdb-js`, you can specify the `ROCKSDB_LIBC` environment variable to choose
1014
+ either `glibc` (default) or `musl`.
1015
+
1016
+ ```bash
1017
+ ROCKSDB_LIBC=musl pnpm rebuild
1018
+ ```
1019
+
1020
+ ### Building RocksDB from Source
1021
+
1022
+ To build RocksDB from source, simply set the `ROCKSDB_PATH` environment variable to the path of the
1023
+ local `rocksdb` repo:
1024
+
1025
+ ```bash
1026
+ git clone https://github.com/facebook/rocksdb.git /path/to/rocksdb
1027
+ echo "ROCKSDB_PATH=/path/to/rocksdb" >> .env
1028
+ pnpm rebuild
1029
+ ```
1030
+
1031
+ ### Debugging
1032
+
1033
+ It is often helpful to do a debug build and see the internal debug logging of the native binding.
1034
+ You can do a debug build by running:
1035
+
1036
+ ```bash
1037
+ pnpm rebuild:debug
1038
+ ```
1039
+
1040
+ Each debug log message is prefixed with the thread id. Most debug log messages include the instance
1041
+ address, making it easier to trace through the log output.
1042
+
1043
+ #### Debugging on macOS
1044
+
1045
+ In the event Node.js crashes, re-run Node.js in `lldb`:
1046
+
1047
+ ```bash
1048
+ lldb node
1049
+ # Then in lldb:
1050
+ # (lldb) run your-program.js
1051
+ # When the crash occurs, print the stack trace:
1052
+ # (lldb) bt
1053
+ ```
1054
+
1055
+ ### Testing
1056
+
1057
+ To run the tests, run:
1058
+
1059
+ ```bash
1060
+ pnpm coverage
1061
+ ```
1062
+
1063
+ To run the tests without code coverage, run:
1064
+
1065
+ ```bash
1066
+ pnpm test
1067
+ ```
1068
+
1069
+ To run a specific test suite, for example `"ranges"`, run:
1070
+
1071
+ ```bash
1072
+ pnpm test ranges
1073
+ # or
1074
+ pnpm test test/ranges
1075
+ ```
1076
+
1077
+ To run a specific unit test, for example all tests that mention `"column family"`, run:
1078
+
1079
+ ```bash
1080
+ pnpm test -t "column family"
1081
+ ```
1082
+
1083
+ Vitest's terminal renderer will often overwrite the debug log output, so it's highly recommended to
1084
+ specify the `CI=1` environment variable to prevent Vitest from erasing log output:
1085
+
1086
+ ```bash
1087
+ CI=1 pnpm test
1088
+ ```
1089
+
1090
+ By default, the test runner deletes all test databases after the tests finish. To keep the temp
1091
+ databases for closer inspection, set the `KEEP_FILES=1` environment variable:
1092
+
1093
+ ```bash
1094
+ CI=1 KEEP_FILES=1 pnpm test
1095
+ ```