xitdb 0.10.0 → 0.12.0

package/README.md CHANGED
@@ -3,19 +3,23 @@
  <br/>
  <br/>
  <b>Choose your flavor:</b>
- <a href="https://github.com/radarroark/xitdb">Zig</a> |
- <a href="https://github.com/radarroark/xitdb-java">Java</a> |
+ <a href="https://github.com/xit-vcs/xitdb">Zig</a> |
+ <a href="https://github.com/xit-vcs/xitdb-java">Java</a> |
  <a href="https://github.com/codeboost/xitdb-clj">Clojure</a> |
- <a href="https://github.com/radarroark/xitdb-ts">TypeScript</a>
+ <a href="https://github.com/xit-vcs/xitdb-ts">TypeScript</a> |
+ <a href="https://github.com/xit-vcs/xitdb-go">Go</a>
  </p>
 
- * Each transaction efficiently creates a new "copy" of the database, and past copies can still be read from.
- * It supports writing to a file as well as purely in-memory use.
+ * Each transaction efficiently creates a new "copy" of the database, and past copies can still be read from and reverted to.
+ * Supports storing in a single file as well as purely in-memory use.
+ * Runs as a library (embedded in process).
+ * Incrementally reads and writes, so file-based databases can contain larger-than-memory datasets.
+ * Reads never block writes, and a database can be read from multiple threads/processes without locks.
  * No query engine of any kind. You just write data structures (primarily an `ArrayList` and `HashMap`) that can be nested arbitrarily.
  * No dependencies besides the JavaScript standard library.
- * It is available [on npm](https://www.npmjs.com/package/xitdb).
+ * Available [on npm](https://www.npmjs.com/package/xitdb).
 
- This database was originally made for the [xit version control system](https://github.com/radarroark/xit), but I bet it has a lot of potential for other projects. The combination of being immutable and having an API similar to in-memory data structures is pretty powerful. Consider using it [instead of SQLite](https://gist.github.com/radarroark/03a0724484e1111ef4c05d72a935c42c) for your TypeScript projects: it's simpler, it's pure TypeScript, and it creates no impedance mismatch with your program the way SQL databases do.
+ This database was originally made for the [xit version control system](https://github.com/xit-vcs/xit), but I bet it has a lot of potential for other projects. The combination of being immutable and having an API similar to in-memory data structures is pretty powerful. Consider using it [instead of SQLite](https://gist.github.com/xeubie/03a0724484e1111ef4c05d72a935c42c) for your TypeScript projects: it's simpler, it's pure TypeScript, and it creates no impedance mismatch with your program the way SQL databases do.
 
  * [Example](#example)
  * [Initializing a Database](#initializing-a-database)
@@ -24,6 +28,7 @@ This database was originally made for the [xit version control system](https://g
  * [Large Byte Arrays](#large-byte-arrays)
  * [Iterators](#iterators)
  * [Hashing](#hashing)
+ * [Compaction](#compaction)
 
  ## Example
 
@@ -414,3 +419,18 @@ switch (hashIdStr) {
  throw new Error('Invalid hash algorithm');
  }
  ```
+
+ ## Compaction
+
+ Normally, an immutable database grows forever, because old data is never deleted. To reclaim disk space and clear the history, xitdb supports compaction. This involves completely rebuilding the database file to only contain the data accessible from the latest copy (i.e., "moment") of the database.
+
+ ```typescript
+ using compactCore = await CoreBufferedFile.create('compact.db');
+ const compactDb = await db.compact(compactCore);
+
+ // read from the new compacted db
+ const history = new ReadArrayList(compactDb.rootCursor());
+ expect(await history.count()).toBe(1);
+ ```
+
+ This compacted database will be in a separate file. If you want to delete the original database and replace it with this one, you'll need to do that yourself. It is not possible to compact a database in-place (using the same file as the target database); doing so would fail and would render your original database unreadable.
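The new README text leaves the final file swap to the caller. A minimal sketch of that step, assuming Node's built-in `node:fs` module and assuming both the original and the compacted database have already been closed. `replaceWithCompacted` and the file names are hypothetical and not part of xitdb:

```typescript
import * as fs from "node:fs";

// Hypothetical helper (not part of xitdb): moves the compacted database file
// over the original one. Assumes no open handles to either file remain.
function replaceWithCompacted(originalPath: string, compactedPath: string): void {
  // renameSync replaces an existing destination file, so the old database
  // is dropped and the compacted file takes over its path in one call.
  fs.renameSync(compactedPath, originalPath);
}
```

A rename only works within one filesystem, so this sketch assumes the compacted file was created next to the original (for example `compact.db` beside the source database).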
@@ -223,6 +223,7 @@ export declare class Database {
      static create(core: Core, hasher: Hasher): Promise<Database>;
      rootCursor(): WriteCursor;
      freeze(): Promise<void>;
+     compact(targetCore: Core): Promise<Database>;
      truncate(): Promise<void>;
      checkHashBytes(hash: Uint8Array): Uint8Array;
      checkHash(target: HashMapGetTarget): Uint8Array;
package/dist/index.js CHANGED
@@ -2317,6 +2317,43 @@ class Database3 {
        throw new ExpectedTxStartException;
      }
    }
+   async compact(targetCore) {
+     const offsetMap = new Map;
+     const hasher = new Hasher(this.hasher.algorithm, this.header.hashId);
+     const target = await Database3.create(targetCore, hasher);
+     if (this.header.tag === 0 /* NONE */)
+       return target;
+     if (this.header.tag !== 2 /* ARRAY_LIST */)
+       throw new UnexpectedTagException;
+     await this.core.seek(Header.LENGTH);
+     const sourceReader = this.core.reader();
+     const headerBytes = new Uint8Array(ArrayListHeader.LENGTH);
+     await sourceReader.readFully(headerBytes);
+     const sourceHeader = ArrayListHeader.fromBytes(headerBytes);
+     if (sourceHeader.size === 0)
+       return target;
+     const lastKey = sourceHeader.size - 1;
+     const shift = lastKey < SLOT_COUNT ? 0 : Math.floor(Math.log(lastKey) / Math.log(SLOT_COUNT));
+     const lastSlotPtr = await this.readArrayListSlot(sourceHeader.ptr, lastKey, shift, 0 /* READ_ONLY */, true);
+     const momentSlot = lastSlotPtr.slot;
+     const targetWriter = targetCore.writer();
+     await targetCore.seek(Header.LENGTH);
+     const targetArrayListPtr = Header.LENGTH + TopLevelArrayListHeader.LENGTH;
+     await targetWriter.write(new TopLevelArrayListHeader(0, new ArrayListHeader(targetArrayListPtr, 1)).toBytes());
+     await targetWriter.write(new Uint8Array(INDEX_BLOCK_SIZE));
+     const remappedMoment = await remapSlot(this.core, targetCore, this.header.hashSize, offsetMap, momentSlot);
+     await targetCore.seek(targetArrayListPtr);
+     await targetWriter.write(remappedMoment.toBytes());
+     target.header = target.header.withTag(2 /* ARRAY_LIST */);
+     await targetCore.seek(0);
+     await target.header.write(targetCore);
+     await targetCore.flush();
+     const fileSize = await targetCore.length();
+     await targetCore.seek(Header.LENGTH + ArrayListHeader.LENGTH);
+     await targetWriter.writeLong(fileSize);
+     await targetCore.flush();
+     return target;
+   }
    async truncate() {
      if (this.header.tag !== 2 /* ARRAY_LIST */)
        return;
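The `shift` computed in `compact` is the floor of the base-`SLOT_COUNT` logarithm of `lastKey`, which is the depth needed to reach the last slot of the top-level list. As a sketch only, the same value can be derived with integer division, which avoids depending on floating-point rounding in `Math.log`. `shiftForKey` is a hypothetical helper, and the slot count of 16 in the usage lines is an assumption for illustration; the diff does not show the value of `SLOT_COUNT`:

```typescript
// Hypothetical helper: floor(log_slotCount(lastKey)), computed with integers.
// Returns 0 when lastKey fits in a single block of `slotCount` slots.
function shiftForKey(lastKey: number, slotCount: number): number {
  if (!Number.isInteger(lastKey) || lastKey < 0) {
    throw new RangeError("lastKey must be a non-negative integer");
  }
  if (!Number.isInteger(slotCount) || slotCount < 2) {
    throw new RangeError("slotCount must be an integer >= 2");
  }
  let shift = 0;
  let remaining = lastKey;
  // Each division by slotCount strips one level of the index tree.
  while (remaining >= slotCount) {
    remaining = Math.floor(remaining / slotCount);
    shift++;
  }
  return shift;
}

// Assumed slot count of 16, for illustration only.
console.log(shiftForKey(15, 16));  // 0
console.log(shiftForKey(16, 16));  // 1
console.log(shiftForKey(256, 16)); // 2
```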
@@ -3141,6 +3178,216 @@ class Database3 {
      return new LinkedArrayListHeader(nextShift, rootPtr, headerA.size + headerB.size);
    }
  }
+ async function remapSlot(sourceCore, targetCore, hashSize, offsetMap, slot) {
+   switch (slot.tag) {
+     case 0 /* NONE */:
+     case 8 /* UINT */:
+     case 9 /* INT */:
+     case 10 /* FLOAT */:
+     case 7 /* SHORT_BYTES */:
+       return slot;
+     case 6 /* BYTES */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapBytes(sourceCore, targetCore, slot);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     case 1 /* INDEX */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapIndex(sourceCore, targetCore, hashSize, offsetMap, slot);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     case 2 /* ARRAY_LIST */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapArrayList(sourceCore, targetCore, hashSize, offsetMap, slot);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     case 3 /* LINKED_ARRAY_LIST */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapLinkedArrayList(sourceCore, targetCore, hashSize, offsetMap, slot);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     case 4 /* HASH_MAP */:
+     case 11 /* HASH_SET */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapHashMapOrSet(sourceCore, targetCore, hashSize, offsetMap, slot, false);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     case 12 /* COUNTED_HASH_MAP */:
+     case 13 /* COUNTED_HASH_SET */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapHashMapOrSet(sourceCore, targetCore, hashSize, offsetMap, slot, true);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     case 5 /* KV_PAIR */: {
+       const mapped = offsetMap.get(Number(slot.value));
+       if (mapped !== undefined)
+         return new Slot(mapped, slot.tag, slot.full);
+       const newOffset = await remapKvPair(sourceCore, targetCore, hashSize, offsetMap, slot);
+       offsetMap.set(Number(slot.value), newOffset);
+       return new Slot(newOffset, slot.tag, slot.full);
+     }
+     default:
+       throw new UnexpectedTagException;
+   }
+ }
+ async function remapBytes(sourceCore, targetCore, slot) {
+   await sourceCore.seek(Number(slot.value));
+   const sourceReader = sourceCore.reader();
+   const length = await sourceReader.readLong();
+   const formatTagSize = slot.full ? 2 : 0;
+   const totalPayload = length + formatTagSize;
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   await targetWriter.writeLong(length);
+   let remaining = totalPayload;
+   while (remaining > 0) {
+     const chunk = Math.min(remaining, 4096);
+     const buf = new Uint8Array(chunk);
+     await sourceReader.readFully(buf);
+     await targetWriter.write(buf);
+     remaining -= chunk;
+   }
+   return newOffset;
+ }
+ async function remapIndex(sourceCore, targetCore, hashSize, offsetMap, slot) {
+   await sourceCore.seek(Number(slot.value));
+   const sourceReader = sourceCore.reader();
+   const blockBytes = new Uint8Array(INDEX_BLOCK_SIZE);
+   await sourceReader.readFully(blockBytes);
+   const remappedSlots = [];
+   for (let i = 0;i < SLOT_COUNT; i++) {
+     const slotBytes = blockBytes.slice(i * Slot.LENGTH, (i + 1) * Slot.LENGTH);
+     const childSlot = Slot.fromBytes(slotBytes);
+     remappedSlots.push(await remapSlot(sourceCore, targetCore, hashSize, offsetMap, childSlot));
+   }
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   for (const s of remappedSlots) {
+     await targetWriter.write(s.toBytes());
+   }
+   return newOffset;
+ }
+ async function remapArrayList(sourceCore, targetCore, hashSize, offsetMap, slot) {
+   await sourceCore.seek(Number(slot.value));
+   const sourceReader = sourceCore.reader();
+   const headerBytes = new Uint8Array(ArrayListHeader.LENGTH);
+   await sourceReader.readFully(headerBytes);
+   const header = ArrayListHeader.fromBytes(headerBytes);
+   const indexSlot = new Slot(header.ptr, 1 /* INDEX */);
+   const remappedIndex = await remapSlot(sourceCore, targetCore, hashSize, offsetMap, indexSlot);
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   await targetWriter.write(new ArrayListHeader(Number(remappedIndex.value), header.size).toBytes());
+   return newOffset;
+ }
+ async function remapLinkedArrayList(sourceCore, targetCore, hashSize, offsetMap, slot) {
+   await sourceCore.seek(Number(slot.value));
+   const sourceReader = sourceCore.reader();
+   const headerBytes = new Uint8Array(LinkedArrayListHeader.LENGTH);
+   await sourceReader.readFully(headerBytes);
+   const header = LinkedArrayListHeader.fromBytes(headerBytes);
+   const remappedPtr = await remapLinkedArrayListBlock(sourceCore, targetCore, hashSize, offsetMap, header.ptr);
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   await targetWriter.write(new LinkedArrayListHeader(header.shift, remappedPtr, header.size).toBytes());
+   return newOffset;
+ }
+ async function remapLinkedArrayListBlock(sourceCore, targetCore, hashSize, offsetMap, blockOffset) {
+   const mapped = offsetMap.get(blockOffset);
+   if (mapped !== undefined)
+     return mapped;
+   await sourceCore.seek(blockOffset);
+   const sourceReader = sourceCore.reader();
+   const blockBytes = new Uint8Array(LINKED_ARRAY_LIST_INDEX_BLOCK_SIZE);
+   await sourceReader.readFully(blockBytes);
+   const slots = [];
+   for (let i = 0;i < SLOT_COUNT; i++) {
+     const slotBytes = blockBytes.slice(i * LinkedArrayListSlot2.LENGTH, (i + 1) * LinkedArrayListSlot2.LENGTH);
+     slots.push(LinkedArrayListSlot2.fromBytes(slotBytes));
+   }
+   const remappedSlots = [];
+   for (const s of slots) {
+     if (s.slot.tag === 1 /* INDEX */) {
+       const remappedPtr = await remapLinkedArrayListBlock(sourceCore, targetCore, hashSize, offsetMap, Number(s.slot.value));
+       remappedSlots.push(new LinkedArrayListSlot2(s.size, new Slot(remappedPtr, 1 /* INDEX */, s.slot.full)));
+     } else if (s.slot.empty()) {
+       remappedSlots.push(s);
+     } else {
+       const remapped = await remapSlot(sourceCore, targetCore, hashSize, offsetMap, s.slot);
+       remappedSlots.push(new LinkedArrayListSlot2(s.size, remapped));
+     }
+   }
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   for (const s of remappedSlots) {
+     await targetWriter.write(s.toBytes());
+   }
+   offsetMap.set(blockOffset, newOffset);
+   return newOffset;
+ }
+ async function remapHashMapOrSet(sourceCore, targetCore, hashSize, offsetMap, slot, counted) {
+   await sourceCore.seek(Number(slot.value));
+   const sourceReader = sourceCore.reader();
+   let countValue = -1;
+   if (counted) {
+     countValue = await sourceReader.readLong();
+   }
+   const blockBytes = new Uint8Array(INDEX_BLOCK_SIZE);
+   await sourceReader.readFully(blockBytes);
+   const remappedSlots = [];
+   for (let i = 0;i < SLOT_COUNT; i++) {
+     const slotBytes = blockBytes.slice(i * Slot.LENGTH, (i + 1) * Slot.LENGTH);
+     const childSlot = Slot.fromBytes(slotBytes);
+     remappedSlots.push(await remapSlot(sourceCore, targetCore, hashSize, offsetMap, childSlot));
+   }
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   if (counted) {
+     await targetWriter.writeLong(countValue);
+   }
+   for (const s of remappedSlots) {
+     await targetWriter.write(s.toBytes());
+   }
+   return newOffset;
+ }
+ async function remapKvPair(sourceCore, targetCore, hashSize, offsetMap, slot) {
+   await sourceCore.seek(Number(slot.value));
+   const sourceReader = sourceCore.reader();
+   const kvPairBytes = new Uint8Array(KeyValuePair.length(hashSize));
+   await sourceReader.readFully(kvPairBytes);
+   const kvPair = KeyValuePair.fromBytes(kvPairBytes, hashSize);
+   const remappedKey = await remapSlot(sourceCore, targetCore, hashSize, offsetMap, kvPair.keySlot);
+   const remappedValue = await remapSlot(sourceCore, targetCore, hashSize, offsetMap, kvPair.valueSlot);
+   const newOffset = await targetCore.length();
+   await targetCore.seek(newOffset);
+   const targetWriter = targetCore.writer();
+   await targetWriter.write(new KeyValuePair(remappedValue, remappedKey, kvPair.hash).toBytes());
+   return newOffset;
+ }
  // src/read-array-list.ts
  class ReadArrayList {
    cursor;
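The `remap*` functions in this hunk share one pattern: copy the children first, append the rewritten node at the end of the target, and record old offset to new offset in `offsetMap` so that data shared by several parents is copied only once. A self-contained sketch of that pattern over an in-memory array instead of a file follows. `StoreNode` and `copyReachable` are hypothetical names, the layout is far simpler than xitdb's on-disk format, and the sketch assumes the data forms an acyclic graph:

```typescript
// Hypothetical node: a payload plus the offsets (array indexes) of its children.
interface StoreNode {
  value: string;
  children: number[];
}

// Copies the node at `offset`, and everything reachable from it, from `source`
// into `target`, and returns the node's new offset. `offsetMap` memoizes
// old offset -> new offset so a shared node is copied only once.
function copyReachable(
  source: StoreNode[],
  target: StoreNode[],
  offsetMap: Map<number, number>,
  offset: number,
): number {
  const mapped = offsetMap.get(offset);
  if (mapped !== undefined) return mapped;

  const node = source[offset];
  if (node === undefined) throw new RangeError(`no node at offset ${offset}`);

  // Children go first, so their new offsets are known when the parent is written.
  const children = node.children.map((child) =>
    copyReachable(source, target, offsetMap, child),
  );

  const newOffset = target.length; // append at the end of the target
  target.push({ value: node.value, children });
  offsetMap.set(offset, newOffset);
  return newOffset;
}

// Node 3 is the latest root, node 1 is shared by two parents,
// and node 0 is stale history that nothing reachable points to.
const source: StoreNode[] = [
  { value: "stale", children: [] },
  { value: "shared", children: [] },
  { value: "left", children: [1] },
  { value: "root", children: [2, 1] },
];
const target: StoreNode[] = [];
const rootOffset = copyReachable(source, target, new Map(), 3);
console.log(target.length, rootOffset); // 3 2
```

Only three of the four nodes reach the target: the stale node is dropped and the shared node is written once, which is the space saving compaction is after.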
package/package.json CHANGED
@@ -1,11 +1,11 @@
  {
    "name": "xitdb",
-   "version": "0.10.0",
+   "version": "0.12.0",
    "description": "An immutable database",
    "license": "MIT",
    "repository": {
      "type": "git",
-     "url": "https://github.com/radarroark/xitdb-ts.git"
+     "url": "https://github.com/xit-vcs/xitdb-ts.git"
    },
    "type": "module",
    "main": "dist/index.js",