@rljson/bs 0.0.18 → 0.0.19

package/README.public.md CHANGED
@@ -8,8 +8,480 @@ found in the LICENSE file in the root of this package.
8
8
 
9
9
  # @rljson/bs
10
10
 
11
- Blob storage interface and implementations for rljson.
11
+ Content-addressable blob storage interface and implementations for rljson.
12
12
 
13
- ## Example
13
+ ## Overview
14
14
 
15
- [src/example.ts](src/example.ts)
15
+ `@rljson/bs` provides a unified interface for blob storage with content-addressable semantics. All blobs are identified by their SHA256 hash, ensuring automatic deduplication and data integrity.
16
+
17
+ ### Key Features
18
+
19
+ - **Content-Addressable Storage**: Blobs are identified by SHA256 hash of their content
20
+ - **Automatic Deduplication**: Identical content is stored only once
21
+ - **Multiple Implementations**: In-memory, peer-to-peer, server-based, and multi-tier
22
+ - **Type-Safe**: Full TypeScript support with comprehensive type definitions
23
+ - **Stream Support**: Efficient handling of large blobs via ReadableStreams
24
+ - **100% Test Coverage**: Every code path is exercised by the test suite
25
+
26
+ ## Installation
27
+
28
+ ```bash
29
+ npm install @rljson/bs
30
+ ```
31
+
32
+ ## Quick Start
33
+
34
+ ### In-Memory Storage
35
+
36
+ The simplest implementation for testing or temporary storage:
37
+
38
+ ```typescript
39
+ import { BsMem } from '@rljson/bs';
40
+
41
+ // Create an in-memory blob storage
42
+ const bs = new BsMem();
43
+
44
+ // Store a blob - returns SHA256 hash as blobId
45
+ const { blobId } = await bs.setBlob('Hello, World!');
46
+ console.log(blobId); // e.g., "dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f"
47
+
48
+ // Retrieve the blob
49
+ const { content } = await bs.getBlob(blobId);
50
+ console.log(content.toString()); // "Hello, World!"
51
+
52
+ // Check if blob exists
53
+ const exists = await bs.blobExists(blobId);
54
+ console.log(exists); // true
55
+
56
+ // List all blobs
57
+ const { blobs } = await bs.listBlobs();
58
+ console.log(blobs.length); // 1
59
+ ```
60
+
61
+ ### Multi-Tier Storage (Cache + Remote)
62
+
63
+ Combine multiple storage backends with automatic caching:
64
+
65
+ ```typescript
66
+ import { BsMulti, BsMem, BsPeer, PeerSocketMock } from '@rljson/bs';
67
+
68
+ // Setup remote storage (simulated)
69
+ const remoteStore = new BsMem();
70
+ const remoteSocket = new PeerSocketMock(remoteStore);
71
+ const remotePeer = new BsPeer(remoteSocket);
72
+ await remotePeer.init();
73
+
74
+ // Setup local cache
75
+ const localCache = new BsMem();
76
+
77
+ // Create multi-tier storage with cache-first strategy
78
+ const bs = new BsMulti([
79
+ { bs: localCache, priority: 0, read: true, write: true }, // Cache first
80
+ { bs: remotePeer, priority: 1, read: true, write: false }, // Remote fallback
81
+ ]);
82
+ await bs.init();
83
+
84
+ // Store blob - writes only to writable stores (here, the local cache)
85
+ const { blobId } = await bs.setBlob('Cached content');
86
+
87
+ // Read from cache first, falls back to remote
88
+ // Automatically hot-swaps remote blobs to cache
89
+ const { content } = await bs.getBlob(blobId);
90
+ ```
91
+
92
+ ## Core Concepts
93
+
94
+ ### Content-Addressable Storage
95
+
96
+ Every blob is identified by the SHA256 hash of its content. This means:
97
+
98
+ - **Automatic Deduplication**: Storing the same content twice returns the same `blobId`
99
+ - **Data Integrity**: The `blobId` serves as a cryptographic checksum
100
+ - **Location Independence**: Blobs can be identified and verified anywhere
101
+
102
+ ```typescript
103
+ const bs = new BsMem();
104
+
105
+ const result1 = await bs.setBlob('Same content');
106
+ const result2 = await bs.setBlob('Same content');
107
+
108
+ // Both return the same blobId
109
+ console.log(result1.blobId === result2.blobId); // true
110
+ ```
111
+
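To make the idea of content addressing concrete, here is a minimal, self-contained sketch that does not use `@rljson/bs` at all. It assumes the blob ID is the lowercase hex SHA-256 digest of the content (consistent with the 64-character example ID in the Quick Start); `sha256Hex` and `ToyContentStore` are hypothetical names for illustration only, not part of the package.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: derive an ID as the hex SHA-256 digest of the content.
function sha256Hex(content: string | Uint8Array): string {
  return createHash('sha256').update(content).digest('hex');
}

// Toy content-addressed store: the Map key is always the hash of the value,
// so storing identical content twice cannot create a second entry.
class ToyContentStore {
  private readonly blobs = new Map<string, Uint8Array>();

  set(content: string): string {
    const blobId = sha256Hex(content);
    if (!this.blobs.has(blobId)) {
      this.blobs.set(blobId, new TextEncoder().encode(content));
    }
    return blobId;
  }

  get size(): number {
    return this.blobs.size;
  }
}

const store = new ToyContentStore();
const first = store.set('Same content');
const second = store.set('Same content');

console.log(first === second); // true
console.log(store.size); // 1
```

Because the ID is derived from the bytes, a reader can also recompute the hash after download and compare it with the requested ID to detect corruption.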
112
+ ### Blob Properties
113
+
114
+ All blobs have associated metadata:
115
+
116
+ ```typescript
117
+ interface BlobProperties {
118
+ blobId: string; // SHA256 hash of content
119
+ size: number; // Size in bytes
120
+ contentType: string; // MIME type (default: 'application/octet-stream')
121
+ createdAt: Date; // Creation timestamp
122
+ metadata?: Record<string, string>; // Optional custom metadata
123
+ }
124
+ ```
125
+
126
+ ## Implementations
127
+
128
+ ### BsMem - In-Memory Storage
129
+
130
+ Fast, ephemeral storage for testing and temporary data:
131
+
132
+ ```typescript
133
+ import { BsMem } from '@rljson/bs';
134
+
135
+ const bs = new BsMem();
136
+ const { blobId } = await bs.setBlob('Temporary data');
137
+ ```
138
+
139
+ **Use Cases:**
140
+
141
+ - Unit testing
142
+ - Temporary caching
143
+ - Development and prototyping
144
+
145
+ **Limitations:**
146
+
147
+ - Data lost when process ends
148
+ - Limited by available RAM
149
+
150
+ ### BsPeer - Peer-to-Peer Storage
151
+
152
+ Access remote blob storage over a socket connection:
153
+
154
+ ```typescript
155
+ import { BsMem, BsPeer, PeerSocketMock } from '@rljson/bs';
156
+
157
+ // Create a peer connected to a remote storage
158
+ const remoteStorage = new BsMem();
159
+ const socket = new PeerSocketMock(remoteStorage);
160
+ const peer = new BsPeer(socket);
161
+ await peer.init();
162
+
163
+ // Use like any other Bs implementation
164
+ const { blobId } = await peer.setBlob('Remote data');
165
+ const { content } = await peer.getBlob(blobId);
166
+
167
+ // Close connection when done
168
+ await peer.close();
169
+ ```
170
+
171
+ **Use Cases:**
172
+
173
+ - Distributed systems
174
+ - Client-server architectures
175
+ - Remote backup
176
+
177
+ ### BsServer - Server-Side Handler
178
+
179
+ Handle blob storage requests from remote peers:
180
+
181
+ ```typescript
182
+ import { BsServer, BsMem, SocketMock } from '@rljson/bs';
183
+
184
+ // Server-side setup
185
+ const storage = new BsMem();
186
+ const server = new BsServer(storage);
187
+
188
+ // Handle incoming connection
189
+ const clientSocket = new SocketMock();
190
+ const serverSocket = clientSocket.createPeer();
191
+ server.handleConnection(serverSocket);
192
+
193
+ // Client can now access storage through clientSocket
194
+ ```
195
+
196
+ **Use Cases:**
197
+
198
+ - Building blob storage services
199
+ - Network protocol implementation
200
+ - API backends
201
+
202
+ ### BsMulti - Multi-Tier Storage
203
+
204
+ Combine multiple storage backends with configurable priorities:
205
+
206
+ ```typescript
207
+ import { BsMulti, BsMem } from '@rljson/bs';
208
+
209
+ const fastCache = new BsMem();
210
+ const mainStorage = new BsMem();
211
+ const backup = new BsMem();
212
+
213
+ const bs = new BsMulti([
214
+ { bs: fastCache, priority: 0, read: true, write: true }, // L1 cache
215
+ { bs: mainStorage, priority: 1, read: true, write: true }, // Main storage
216
+ { bs: backup, priority: 2, read: true, write: false }, // Read-only backup
217
+ ]);
218
+ await bs.init();
219
+ ```
220
+
221
+ **Features:**
222
+
223
+ - **Priority-Based Reads**: Reads from lowest priority number first
224
+ - **Hot-Swapping**: Automatically caches blobs from remote to local
225
+ - **Parallel Writes**: Writes to all writable stores simultaneously
226
+ - **Deduplication**: Merges results from all readable stores and removes duplicate entries
227
+
228
+ **Use Cases:**
229
+
230
+ - Local cache + remote storage
231
+ - Local network storage infrastructure
232
+ - Backup and archival systems
233
+ - Distributed blob storage across network nodes
234
+
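The priority-based read and hot-swap behaviour listed above can be pictured with a small toy model. This is only a sketch under stated assumptions: it is synchronous (the real `Bs` methods are asynchronous), it uses plain `Map`s instead of `Bs` instances, and it assumes a hit is copied into every writable tier with a lower priority number. `ToyTier` and `readWithHotSwap` are hypothetical names, not package APIs.

```typescript
// Toy tier description, loosely mirroring the BsMulti configuration entries.
interface ToyTier {
  name: string;
  blobs: Map<string, string>;
  priority: number;
  read: boolean;
  write: boolean;
}

function readWithHotSwap(
  tiers: ToyTier[],
  blobId: string,
): { content: string; from: string } {
  // Lowest priority number is consulted first.
  const readable = tiers
    .filter((tier) => tier.read)
    .sort((a, b) => a.priority - b.priority);

  for (const tier of readable) {
    const content = tier.blobs.get(blobId);
    if (content === undefined) continue; // miss: fall through to the next tier

    // Assumed hot-swap rule: copy the hit into faster, writable tiers.
    for (const faster of tiers) {
      if (faster.write && faster.priority < tier.priority) {
        faster.blobs.set(blobId, content);
      }
    }
    return { content, from: tier.name };
  }
  throw new Error(`Blob not found: ${blobId}`);
}

const cache: ToyTier = {
  name: 'cache', blobs: new Map(), priority: 0, read: true, write: true,
};
const remote: ToyTier = {
  name: 'remote', blobs: new Map([['id-1', 'payload']]), priority: 1, read: true, write: false,
};

const firstRead = readWithHotSwap([remote, cache], 'id-1');
const secondRead = readWithHotSwap([remote, cache], 'id-1');
console.log(firstRead.from, secondRead.from); // remote cache
```

The first read is served by the remote tier and populates the cache; the second read never reaches the remote tier.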
235
+ ## API Reference
236
+
237
+ ### Bs Interface
238
+
239
+ All implementations conform to the `Bs` interface:
240
+
241
+ #### `setBlob(content: Buffer | string | ReadableStream): Promise<BlobProperties>`
242
+
243
+ Stores a blob and returns its properties including the SHA256 `blobId`.
244
+
245
+ ```typescript
246
+ // From string
247
+ const { blobId } = await bs.setBlob('Hello');
248
+
249
+ // From Buffer
250
+ const buffer = Buffer.from('World');
251
+ await bs.setBlob(buffer);
252
+
253
+ // From ReadableStream
254
+ const stream = new ReadableStream({
255
+ start(controller) {
256
+ controller.enqueue(new TextEncoder().encode('Stream data'));
257
+ controller.close();
258
+ }
259
+ });
260
+ await bs.setBlob(stream);
261
+ ```
262
+
263
+ #### `getBlob(blobId: string, options?: DownloadBlobOptions): Promise<{ content: Buffer; properties: BlobProperties }>`
264
+
265
+ Retrieves a blob by its ID.
266
+
267
+ ```typescript
268
+ const { content, properties } = await bs.getBlob(blobId);
269
+ console.log(content.toString());
270
+ console.log(properties.size);
271
+
272
+ // With range request (partial content)
273
+ const { content: partial } = await bs.getBlob(blobId, {
274
+ range: { start: 0, end: 99 } // First 100 bytes
275
+ });
276
+ ```
277
+
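The comment "First 100 bytes" for `{ start: 0, end: 99 }` suggests that `end` is inclusive. Under that assumption, the slicing can be sketched as follows; `ByteRange` and `sliceInclusive` are hypothetical names used only for illustration, and the package's own validation rules may differ.

```typescript
// Hypothetical shape mirroring the `range` option shown above.
interface ByteRange {
  start: number;
  end: number; // assumed inclusive
}

function sliceInclusive(content: Uint8Array, range: ByteRange): Uint8Array {
  if (range.start < 0 || range.end < range.start) {
    throw new RangeError(`Invalid range ${range.start}-${range.end}`);
  }
  // subarray() takes an exclusive end index, hence end + 1.
  // Indices past the end of the data are clamped by subarray().
  return content.subarray(range.start, range.end + 1);
}

const data = new TextEncoder().encode('Hello, World!');
const firstFive = sliceInclusive(data, { start: 0, end: 4 });
console.log(new TextDecoder().decode(firstFive)); // "Hello"
```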
278
+ #### `getBlobStream(blobId: string): Promise<ReadableStream<Uint8Array>>`
279
+
280
+ Retrieves a blob as a stream for efficient handling of large files.
281
+
282
+ ```typescript
283
+ const stream = await bs.getBlobStream(blobId);
284
+ const reader = stream.getReader();
285
+
286
+ while (true) {
287
+ const { done, value } = await reader.read();
288
+ if (done) break;
289
+ // Process chunk
290
+ console.log('Chunk size:', value.length);
291
+ }
292
+ ```
293
+
294
+ #### `deleteBlob(blobId: string): Promise<void>`
295
+
296
+ Deletes a blob from storage.
297
+
298
+ ```typescript
299
+ await bs.deleteBlob(blobId);
300
+ ```
301
+
302
+ **Note:** In production systems with content-addressable storage, consider implementing reference counting before deletion.
303
+
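One way to approach the reference counting mentioned in the note is to track, outside the blob store, how many logical owners point at each `blobId`, and to call `deleteBlob` only when the last owner releases it. The sketch below is an assumption-laden illustration: `BlobRefCounter` is a hypothetical class, not something the package provides, and it keeps counts in memory only.

```typescript
// Hypothetical in-memory reference counter kept alongside a blob store.
class BlobRefCounter {
  private readonly counts = new Map<string, number>();

  // Record one more owner of the blob and return the new count.
  retain(blobId: string): number {
    const next = (this.counts.get(blobId) ?? 0) + 1;
    this.counts.set(blobId, next);
    return next;
  }

  // Drop one owner. Returns true when no owners remain, meaning the caller
  // may delete the blob. Unknown IDs are treated as unreferenced.
  release(blobId: string): boolean {
    const current = this.counts.get(blobId) ?? 0;
    if (current <= 1) {
      this.counts.delete(blobId);
      return true;
    }
    this.counts.set(blobId, current - 1);
    return false;
  }
}

const refs = new BlobRefCounter();
refs.retain('abc');
refs.retain('abc'); // same content stored by a second owner

const afterFirstRelease = refs.release('abc');
const afterSecondRelease = refs.release('abc');
console.log(afterFirstRelease, afterSecondRelease); // false true
```

In a real system the caller would run `await bs.deleteBlob(blobId)` only when `release` returns `true`, and the counts would need to be persisted together with the blobs.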
304
+ #### `blobExists(blobId: string): Promise<boolean>`
305
+
306
+ Checks if a blob exists.
307
+
308
+ ```typescript
309
+ if (await bs.blobExists(blobId)) {
310
+ console.log('Blob found');
311
+ }
312
+ ```
313
+
314
+ #### `getBlobProperties(blobId: string): Promise<BlobProperties>`
315
+
316
+ Gets blob metadata without downloading content.
317
+
318
+ ```typescript
319
+ const props = await bs.getBlobProperties(blobId);
320
+ console.log(`Blob size: ${props.size} bytes`);
321
+ console.log(`Created: ${props.createdAt}`);
322
+ ```
323
+
324
+ #### `listBlobs(options?: ListBlobsOptions): Promise<ListBlobsResult>`
325
+
326
+ Lists all blobs with optional filtering and pagination.
327
+
328
+ ```typescript
329
+ // List all blobs
330
+ const { blobs } = await bs.listBlobs();
331
+
332
+ // With prefix filter
333
+ const result = await bs.listBlobs({ prefix: 'abc' });
334
+
335
+ // Paginated listing
336
+ let continuationToken: string | undefined;
337
+ do {
338
+ const result = await bs.listBlobs({
339
+ maxResults: 100,
340
+ continuationToken
341
+ });
342
+
343
+ console.log(`Got ${result.blobs.length} blobs`);
344
+ continuationToken = result.continuationToken;
345
+ } while (continuationToken);
346
+ ```
347
+
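The pagination loop above relies on continuation tokens. The README does not say how tokens are encoded, so the following sketch simply assumes the token is the last ID of the previous page over a sorted listing; `listPage`, `PageOptions`, and `Page` are hypothetical names, and the default page size of 1000 is likewise an assumption.

```typescript
interface PageOptions {
  prefix?: string;
  maxResults?: number;
  continuationToken?: string | undefined;
}

interface Page {
  ids: string[];
  continuationToken?: string | undefined;
}

// Assumption: the token is the last ID returned on the previous page.
function listPage(allIds: string[], options: PageOptions = {}): Page {
  const { prefix = '', maxResults = 1000, continuationToken } = options;

  const candidates = allIds
    .filter((id) => id.startsWith(prefix))
    .sort()
    .filter((id) => continuationToken === undefined || id > continuationToken);

  const ids = candidates.slice(0, maxResults);
  const hasMore = candidates.length > ids.length;
  return { ids, continuationToken: hasMore ? ids[ids.length - 1] : undefined };
}

const allIds = ['a1', 'a2', 'a3', 'b1', 'b2'];
const pages: string[][] = [];
let token: string | undefined;
do {
  const page = listPage(allIds, { maxResults: 2, continuationToken: token });
  pages.push(page.ids);
  token = page.continuationToken;
} while (token);

console.log(pages); // [ [ 'a1', 'a2' ], [ 'a3', 'b1' ], [ 'b2' ] ]
```

Callers should treat the token as opaque: pass back whatever the previous call returned and stop when it is absent.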
348
+ #### `generateSignedUrl(blobId: string, expiresIn: number, permissions?: 'read' | 'delete'): Promise<string>`
349
+
350
+ Generates a signed URL for temporary access to a blob.
351
+
352
+ ```typescript
353
+ // Read-only URL valid for 1 hour
354
+ const url = await bs.generateSignedUrl(blobId, 3600);
355
+
356
+ // Delete permission URL
357
+ const deleteUrl = await bs.generateSignedUrl(blobId, 300, 'delete');
358
+ ```
359
+
360
+ ## Advanced Usage
361
+
362
+ ### Custom Storage Implementation
363
+
364
+ Implement the `Bs` interface to create custom storage backends:
365
+
366
+ ```typescript
367
+ import { Bs, BlobProperties } from '@rljson/bs';
368
+
369
+ class MyCustomStorage implements Bs {
370
+ async setBlob(content: Buffer | string | ReadableStream): Promise<BlobProperties> {
371
+ // Your implementation
372
+ }
373
+
374
+ async getBlob(blobId: string) {
375
+ // Your implementation
376
+ }
377
+
378
+ // Implement other methods...
379
+ }
380
+ ```
381
+
382
+ ### Multi-Tier Patterns
383
+
384
+ **Local Cache + Remote Storage:**
385
+
386
+ ```typescript
387
+ const bs = new BsMulti([
388
+ { bs: localCache, priority: 0, read: true, write: true },
389
+ { bs: remoteStorage, priority: 1, read: true, write: false },
390
+ ]);
391
+ ```
392
+
393
+ **Write-Through Cache:**
394
+
395
+ ```typescript
396
+ const bs = new BsMulti([
397
+ { bs: localCache, priority: 0, read: true, write: true },
398
+ { bs: remoteStorage, priority: 1, read: true, write: true }, // Also writable
399
+ ]);
400
+ ```
401
+
402
+ **Multi-Region Replication:**
403
+
404
+ ```typescript
405
+ const bs = new BsMulti([
406
+ { bs: regionUs, priority: 0, read: true, write: true },
407
+ { bs: regionEu, priority: 1, read: true, write: true },
408
+ { bs: regionAsia, priority: 2, read: true, write: true },
409
+ ]);
410
+ ```
411
+
412
+ ### Error Handling
413
+
414
+ All methods throw errors for invalid operations:
415
+
416
+ ```typescript
417
+ try {
418
+ await bs.getBlob('nonexistent-id');
419
+ } catch (error) {
420
+ console.error('Blob not found:', (error as Error).message);
421
+ }
422
+
423
+ // BsMulti gracefully handles partial failures
424
+ const multi = new BsMulti([
425
+ { bs: failingStore, priority: 0, read: true, write: false },
426
+ { bs: workingStore, priority: 1, read: true, write: false },
427
+ ]);
428
+
429
+ // Falls back to workingStore if failingStore errors
430
+ const { content } = await multi.getBlob(blobId);
431
+ ```
432
+
433
+ ## Testing
434
+
435
+ The package includes comprehensive test utilities:
436
+
437
+ ```typescript
438
+ import { BsMem } from '@rljson/bs';
439
+
440
+ describe('My Tests', () => {
441
+ let bs: BsMem;
442
+
443
+ beforeEach(() => {
444
+ bs = new BsMem();
445
+ });
446
+
447
+ it('should store and retrieve blobs', async () => {
448
+ const { blobId } = await bs.setBlob('test data');
449
+ const { content } = await bs.getBlob(blobId);
450
+ expect(content.toString()).toBe('test data');
451
+ });
452
+ });
453
+ ```
454
+
455
+ ## Performance Considerations
456
+
457
+ ### Memory Usage
458
+
459
+ - `BsMem` stores all data in RAM - suitable for small to medium datasets
460
+ - Use streams (`getBlobStream`) for large blobs to avoid loading entire content into memory
461
+ - `BsMulti` with local cache reduces network overhead
462
+
463
+ ### Network Efficiency
464
+
465
+ - Use `BsPeer` for remote access with minimal protocol overhead
466
+ - `BsMulti` automatically caches frequently accessed blobs
467
+ - Content-addressable nature prevents redundant transfers
468
+
469
+ ### Deduplication
470
+
471
+ - Identical content stored multiple times occupies space only once
472
+ - Particularly effective for:
473
+ - Version control systems
474
+ - Backup solutions
475
+ - Build artifact storage
476
+
477
+ ## License
478
+
479
+ MIT
480
+
481
+ ## Contributing
482
+
483
+ Issues and pull requests welcome at <https://github.com/rljson/bs>
484
+
485
+ ## Related Packages
486
+
487
+ - `@rljson/io` - Data table storage interface and implementations