@rljson/bs 0.0.18 → 0.0.20

package/README.public.md CHANGED
@@ -1,6 +1,6 @@
1
1
  <!--
2
2
  @license
3
- Copyright (c) 2025 Rljson
3
+ Copyright (c) 2026 Rljson
4
4
 
5
5
  Use of this source code is governed by terms that can be
6
6
  found in the LICENSE file in the root of this package.
@@ -8,8 +8,635 @@ found in the LICENSE file in the root of this package.
8
8
 
9
9
  # @rljson/bs
10
10
 
11
- Blob storage interface and implementations for rljson.
11
+ Content-addressable blob storage interface and implementations for TypeScript/JavaScript.
12
12
 
13
- ## Example
13
+ ## Overview
14
14
 
15
- [src/example.ts](src/example.ts)
15
+ `@rljson/bs` provides a unified interface for blob storage with content-addressable semantics. All blobs are identified by their SHA256 hash, ensuring automatic deduplication, data integrity verification, and location independence.
16
+
17
+ ### Key Features
18
+
19
+ - **Content-Addressable Storage**: Blobs are identified by SHA256 hash of their content
20
+ - **Automatic Deduplication**: Identical content is stored only once across the entire system
21
+ - **Multiple Implementations**: In-memory, peer-to-peer, server-based, and multi-tier
22
+ - **Type-Safe**: Full TypeScript support with comprehensive type definitions
23
+ - **Stream Support**: Efficient handling of large blobs via ReadableStreams
24
+ - **Network Layer**: Built-in peer-to-peer and client-server implementations
25
+ - **100% Test Coverage**: Fully tested with comprehensive test suite
26
+
27
+ ## Installation
28
+
29
+ ```bash
30
+ npm install @rljson/bs
31
+ ```
32
+
33
+ ## Quick Start
34
+
35
+ ### Basic Usage: In-Memory Storage
36
+
37
+ The simplest implementation for testing or temporary storage:
38
+
39
+ ```typescript
40
+ import { BsMem } from '@rljson/bs';
41
+
42
+ // Create an in-memory blob storage
43
+ const bs = new BsMem();
44
+
45
+ // Store a blob - returns SHA256 hash as blobId
46
+ const { blobId, size } = await bs.setBlob('Hello, World!');
47
+ console.log(blobId); // "dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f"
48
+ console.log(size); // 13
49
+
50
+ // Retrieve the blob
51
+ const { content } = await bs.getBlob(blobId);
52
+ console.log(content.toString()); // "Hello, World!"
53
+
54
+ // Check if blob exists
55
+ const exists = await bs.blobExists(blobId);
56
+ console.log(exists); // true
57
+
58
+ // Get blob properties without downloading
59
+ const props = await bs.getBlobProperties(blobId);
60
+ console.log(props.createdAt); // Timestamp
61
+
62
+ // List all blobs
63
+ const { blobs } = await bs.listBlobs();
64
+ console.log(blobs.length); // 1
65
+ ```
66
+
67
+ ### Client-Server Architecture
68
+
69
+ Access remote blob storage over a socket connection:
70
+
71
+ ```typescript
72
+ import { BsMem, BsServer, BsPeer, SocketMock } from '@rljson/bs';
73
+
74
+ // Server setup
75
+ const storage = new BsMem();
76
+ const server = new BsServer(storage);
77
+
78
+ // Client setup
79
+ const socket = new SocketMock(); // Use real socket in production
80
+ await server.addSocket(socket);
81
+ const client = new BsPeer(socket);
82
+ await client.init();
83
+
84
+ // Client can now access server storage
85
+ const { blobId } = await client.setBlob('Remote data');
86
+ const { content } = await client.getBlob(blobId);
87
+ console.log(content.toString()); // "Remote data"
88
+
89
+ // Close connection
90
+ await client.close();
91
+ ```
92
+
93
+ ### Multi-Tier Storage (Cache + Remote)
94
+
95
+ Combine multiple storage backends with automatic caching:
96
+
97
+ ```typescript
98
+ import { BsMulti, BsMem, BsPeer } from '@rljson/bs';
99
+
100
+ // Setup local cache
101
+ const localCache = new BsMem();
102
+
103
+ // Setup remote storage (via BsPeer)
104
+ const remotePeer = new BsPeer(remoteSocket);
105
+ await remotePeer.init();
106
+
107
+ // Create multi-tier storage with cache-first strategy
108
+ const bs = new BsMulti([
109
+ { bs: localCache, priority: 0, read: true, write: true }, // Cache first
110
+ { bs: remotePeer, priority: 1, read: true, write: false }, // Remote fallback
111
+ ]);
112
+ await bs.init();
113
+
114
+ // Store blob - writes to cache only (writable stores)
115
+ const { blobId } = await bs.setBlob('Cached content');
116
+
117
+ // Read from cache first, falls back to remote if not found
118
+ // Automatically hot-swaps remote blobs to cache for future reads
119
+ const { content } = await bs.getBlob(blobId);
120
+ ```
121
+
122
+ ## Core Concepts
123
+
124
+ ### Content-Addressable Storage
125
+
126
+ Every blob is identified by the SHA256 hash of its content:
127
+
128
+ ```typescript
129
+ const bs = new BsMem();
130
+
131
+ const result1 = await bs.setBlob('Same content');
132
+ const result2 = await bs.setBlob('Same content');
133
+
134
+ // Both return the same blobId (automatic deduplication)
135
+ console.log(result1.blobId === result2.blobId); // true
136
+
137
+ // Different content = different blobId
138
+ const result3 = await bs.setBlob('Different content');
139
+ console.log(result1.blobId !== result3.blobId); // true
140
+ ```
141
+
142
+ **Benefits:**
143
+
144
+ - **Automatic Deduplication**: Identical content stored once, regardless of how many times you call `setBlob`
145
+ - **Data Integrity**: The `blobId` serves as a cryptographic checksum
146
+ - **Location Independence**: Blobs can be identified and verified anywhere
147
+ - **Cache Efficiency**: Content can be cached anywhere and verified by its hash
148
+
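The checksum property listed above can be reproduced outside the library. The following is a minimal sketch, assuming a `blobId` is the lowercase hex SHA256 digest of the raw bytes (this matches the `Hello, World!` value shown in Quick Start). `contentAddress` and `matchesBlobId` are hypothetical helpers for illustration, not part of `@rljson/bs`.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper (not part of @rljson/bs): hex-encoded SHA256 of the bytes.
function contentAddress(content: string | Uint8Array): string {
  return createHash('sha256').update(content).digest('hex');
}

// Hypothetical helper: true when the content still hashes to the given blobId.
function matchesBlobId(blobId: string, content: string | Uint8Array): boolean {
  return contentAddress(content) === blobId;
}

const id = contentAddress('Hello, World!');
console.log(id); // dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
console.log(matchesBlobId(id, 'Hello, World!')); // true
console.log(matchesBlobId(id, 'Hello, World?')); // false
```

A cache or mirror that holds a copy of a blob could run a check like `matchesBlobId` before serving it, without contacting the original store.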
149
+ ### Blob Properties
150
+
151
+ All blobs have associated metadata:
152
+
153
+ ```typescript
154
+ interface BlobProperties {
155
+ blobId: string; // SHA256 hash of content (64 hex characters)
156
+ size: number; // Size in bytes
157
+ createdAt: Date; // Creation timestamp
158
+ }
159
+ ```
160
+
161
+ ## API Reference
162
+
163
+ ### Bs Interface
164
+
165
+ All implementations conform to the `Bs` interface:
166
+
167
+ #### `setBlob(content: Buffer | string | ReadableStream): Promise<BlobProperties>`
168
+
169
+ Stores a blob and returns its properties including the SHA256 `blobId`.
170
+
171
+ ```typescript
172
+ // From string
173
+ const { blobId } = await bs.setBlob('Hello');
174
+
175
+ // From Buffer
176
+ const buffer = Buffer.from('World', 'utf8');
177
+ await bs.setBlob(buffer);
178
+
179
+ // From ReadableStream (for large files)
180
+ const stream = new ReadableStream({
181
+ start(controller) {
182
+ controller.enqueue(new TextEncoder().encode('Stream data'));
183
+ controller.close();
184
+ }
185
+ });
186
+ await bs.setBlob(stream);
187
+ ```
188
+
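Streamed content can be content-addressed without buffering it first, because SHA256 can be fed one chunk at a time. The sketch below illustrates that idea with a plain iterable of chunks. `hashChunks` is a hypothetical helper, and whether `@rljson/bs` hashes streams this way internally is an assumption, not something the README states.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: hashes chunks one at a time so the full content never
// has to sit in memory. Illustrative only; not taken from @rljson/bs.
function hashChunks(chunks: Iterable<Uint8Array>): { blobId: string; size: number } {
  const hash = createHash('sha256');
  let size = 0;
  for (const chunk of chunks) {
    hash.update(chunk);
    size += chunk.byteLength;
  }
  return { blobId: hash.digest('hex'), size };
}

const encoder = new TextEncoder();
const chunked = hashChunks([encoder.encode('Hello, '), encoder.encode('World!')]);
const whole = hashChunks([encoder.encode('Hello, World!')]);

console.log(chunked.blobId === whole.blobId); // true: chunk boundaries do not affect the ID
console.log(chunked.size); // 13
```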
189
+ #### `getBlob(blobId: string, options?: DownloadBlobOptions): Promise<{ content: Buffer; properties: BlobProperties }>`
190
+
191
+ Retrieves a blob by its ID.
192
+
193
+ ```typescript
194
+ const { content, properties } = await bs.getBlob(blobId);
195
+ console.log(content.toString());
196
+ console.log(properties.size);
197
+
198
+ // With range request (partial content)
199
+ const { content: partial } = await bs.getBlob(blobId, {
200
+ range: { start: 0, end: 99 } // First 100 bytes
201
+ });
202
+ ```
203
+
204
+ #### `getBlobStream(blobId: string): Promise<ReadableStream<Uint8Array>>`
205
+
206
+ Retrieves a blob as a stream for efficient handling of large files.
207
+
208
+ ```typescript
209
+ const stream = await bs.getBlobStream(blobId);
210
+ const reader = stream.getReader();
211
+
212
+ while (true) {
213
+ const { done, value } = await reader.read();
214
+ if (done) break;
215
+ // Process chunk
216
+ console.log('Chunk size:', value.length);
217
+ }
218
+ ```
219
+
220
+ #### `deleteBlob(blobId: string): Promise<void>`
221
+
222
+ Deletes a blob from storage.
223
+
224
+ ```typescript
225
+ await bs.deleteBlob(blobId);
226
+
227
+ // Note: In production with content-addressable storage,
228
+ // consider reference counting before deletion
229
+ ```
230
+
231
+ #### `blobExists(blobId: string): Promise<boolean>`
232
+
233
+ Checks if a blob exists without downloading it.
234
+
235
+ ```typescript
236
+ if (await bs.blobExists(blobId)) {
237
+ console.log('Blob found');
238
+ }
239
+ ```
240
+
241
+ #### `getBlobProperties(blobId: string): Promise<BlobProperties>`
242
+
243
+ Gets blob metadata without downloading content.
244
+
245
+ ```typescript
246
+ const props = await bs.getBlobProperties(blobId);
247
+ console.log(`Blob size: ${props.size} bytes`);
248
+ console.log(`Created: ${props.createdAt.toISOString()}`);
249
+ ```
250
+
251
+ #### `listBlobs(options?: ListBlobsOptions): Promise<ListBlobsResult>`
252
+
253
+ Lists all blobs with optional filtering and pagination.
254
+
255
+ ```typescript
256
+ // List all blobs
257
+ const { blobs } = await bs.listBlobs();
258
+
259
+ // With prefix filter (blobs starting with "abc")
260
+ const result = await bs.listBlobs({ prefix: 'abc' });
261
+
262
+ // Paginated listing
263
+ let continuationToken: string | undefined;
264
+ do {
265
+ const result = await bs.listBlobs({
266
+ maxResults: 100,
267
+ continuationToken
268
+ });
269
+
270
+ console.log(`Got ${result.blobs.length} blobs`);
271
+ continuationToken = result.continuationToken;
272
+ } while (continuationToken);
273
+ ```
274
+
275
+ #### `generateSignedUrl(blobId: string, expiresIn: number, permissions?: 'read' | 'delete'): Promise<string>`
276
+
277
+ Generates a signed URL for temporary access to a blob.
278
+
279
+ ```typescript
280
+ // Read-only URL valid for 1 hour (3600 seconds)
281
+ const url = await bs.generateSignedUrl(blobId, 3600);
282
+
283
+ // Delete permission URL valid for 5 minutes
284
+ const deleteUrl = await bs.generateSignedUrl(blobId, 300, 'delete');
285
+ ```
286
+
287
+ ## Implementations
288
+
289
+ ### BsMem - In-Memory Storage
290
+
291
+ Fast, ephemeral storage for testing and temporary data.
292
+
293
+ ```typescript
294
+ import { BsMem } from '@rljson/bs';
295
+
296
+ const bs = new BsMem();
297
+ const { blobId } = await bs.setBlob('Temporary data');
298
+ ```
299
+
300
+ **Use Cases:**
301
+
302
+ - Unit testing
303
+ - Temporary caching
304
+ - Development and prototyping
305
+ - Fast local storage for small datasets
306
+
307
+ **Limitations:**
308
+
309
+ - Data lost when process ends
310
+ - Limited by available RAM
311
+ - Single-process only
312
+
313
+ ### BsPeer - Peer-to-Peer Storage Client
314
+
315
+ Access remote blob storage over a socket connection.
316
+
317
+ ```typescript
318
+ import { BsPeer } from '@rljson/bs';
319
+
320
+ // Create a peer connected to a remote storage
321
+ const peer = new BsPeer(socket);
322
+ await peer.init();
323
+
324
+ // Use like any other Bs implementation
325
+ const { blobId } = await peer.setBlob('Remote data');
326
+ const { content } = await peer.getBlob(blobId);
327
+
328
+ // Close connection when done
329
+ await peer.close();
330
+ ```
331
+
332
+ **Use Cases:**
333
+
334
+ - Client-server architectures
335
+ - Distributed systems
336
+ - Remote backup
337
+ - Accessing centralized storage
338
+
339
+ **Features:**
340
+
341
+ - Async socket-based communication
342
+ - Error-first callback pattern (Node.js style)
343
+ - Connection state management
344
+ - Automatic retry support
345
+
346
+ ### BsServer - Server-Side Handler
347
+
348
+ Handle blob storage requests from remote peers.
349
+
350
+ ```typescript
351
+ import { BsServer, BsMem, SocketMock } from '@rljson/bs';
352
+
353
+ // Server-side setup
354
+ const storage = new BsMem();
355
+ const server = new BsServer(storage);
356
+
357
+ // Handle incoming connection
358
+ const clientSocket = new SocketMock(); // Use real socket in production
359
+ await server.addSocket(clientSocket);
360
+
361
+ // Client can now access storage through socket
362
+ ```
363
+
364
+ **Use Cases:**
365
+
366
+ - Building blob storage services
367
+ - Network protocol implementation
368
+ - API backends
369
+ - Multi-client storage systems
370
+
371
+ **Features:**
372
+
373
+ - Multiple client support
374
+ - Socket lifecycle management
375
+ - Automatic method mapping
376
+ - Error handling
377
+
378
+ ### BsPeerBridge - PULL Architecture Bridge (Read-Only)
379
+
380
+ Exposes local blob storage for server to PULL from (read-only access).
381
+
382
+ ```typescript
383
+ import { BsPeerBridge, BsMem } from '@rljson/bs';
384
+
385
+ // Client-side: expose local storage for server to read
386
+ const localStorage = new BsMem();
387
+ const bridge = new BsPeerBridge(localStorage, socket);
388
+ bridge.start();
389
+
390
+ // Server can now read from client's local storage
391
+ // but CANNOT write to it (PULL architecture)
392
+ ```
393
+
394
+ **Architecture Pattern:**
395
+
396
+ - **PULL-only**: Server can read from client, but cannot write
397
+ - **Read Operations Only**: `getBlob`, `getBlobStream`, `blobExists`, `getBlobProperties`, `listBlobs`
398
+ - **No Write Operations**: Does not expose `setBlob`, `deleteBlob`, or `generateSignedUrl`
399
+
400
+ **Use Cases:**
401
+
402
+ - Client exposes local cache for server to access
403
+ - Distributed storage where server pulls from clients
404
+ - Peer-to-peer networks with read-only sharing
405
+
406
+ ### BsMulti - Multi-Tier Storage
407
+
408
+ Combine multiple storage backends with configurable priorities.
409
+
410
+ ```typescript
411
+ import { BsMulti, BsMem } from '@rljson/bs';
412
+
413
+ const fastCache = new BsMem();
414
+ const mainStorage = new BsMem();
415
+ const backup = new BsMem();
416
+
417
+ const bs = new BsMulti([
418
+ { bs: fastCache, priority: 0, read: true, write: true }, // L1 cache
419
+ { bs: mainStorage, priority: 1, read: true, write: true }, // Main storage
420
+ { bs: backup, priority: 2, read: true, write: false }, // Read-only backup
421
+ ]);
422
+ await bs.init();
423
+ ```
424
+
425
+ **Features:**
426
+
427
+ - **Priority-Based Reads**: Reads from lowest priority number first (0 = highest priority)
428
+ - **Hot-Swapping**: Automatically caches blobs from remote to local on read
429
+ - **Parallel Writes**: Writes to all writable stores simultaneously
430
+ - **Deduplication**: Merges results from all readable stores when listing
431
+ - **Graceful Fallback**: If highest priority fails, falls back to next priority
432
+
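The read and write rules in the feature list can be pictured with a toy model. This is a sketch under stated assumptions, not the `BsMulti` implementation: plain `Map`s stand in for `Bs` stores, `Tier`, `tieredSet`, and `tieredGet` are hypothetical names, and hot-swapping is assumed to mean "copy a hit into every writable tier that is consulted earlier".

```typescript
// Toy model: Maps stand in for Bs implementations; all names are hypothetical.
interface Tier {
  name: string;
  store: Map<string, string>;
  priority: number; // lower number = consulted first
  read: boolean;
  write: boolean;
}

// Write to every writable tier.
function tieredSet(tiers: Tier[], blobId: string, content: string): void {
  for (const tier of tiers) {
    if (tier.write) tier.store.set(blobId, content);
  }
}

// Read in priority order; on a hit in a later tier, copy the blob into
// writable tiers that are consulted earlier (assumed hot-swap behavior).
function tieredGet(tiers: Tier[], blobId: string): string | undefined {
  const readable = tiers.filter((t) => t.read).sort((a, b) => a.priority - b.priority);
  for (const tier of readable) {
    const content = tier.store.get(blobId);
    if (content === undefined) continue;
    for (const earlier of tiers) {
      if (earlier.write && earlier.priority < tier.priority) {
        earlier.store.set(blobId, content);
      }
    }
    return content;
  }
  return undefined;
}

const cache: Tier = { name: 'cache', store: new Map(), priority: 0, read: true, write: true };
const remote: Tier = {
  name: 'remote',
  store: new Map([['abc', 'remote blob']]),
  priority: 1,
  read: true,
  write: false,
};
const tiers = [remote, cache]; // array order is irrelevant; priority decides

console.log(cache.store.has('abc')); // false
console.log(tieredGet(tiers, 'abc')); // "remote blob"
console.log(cache.store.has('abc')); // true: the hit was copied into the cache
```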
433
+ **Use Cases:**
434
+
435
+ - Local cache + remote storage
436
+ - Multi-region storage replication
437
+ - Local network storage infrastructure
438
+ - Backup and archival systems
439
+ - Hierarchical storage management (HSM)
440
+
441
+ ## Common Patterns
442
+
443
+ ### Local Cache + Remote Storage (PULL Architecture)
444
+
445
+ ```typescript
446
+ const localCache = new BsMem();
447
+ const remotePeer = new BsPeer(remoteSocket);
448
+ await remotePeer.init();
449
+
450
+ const bs = new BsMulti([
451
+ { bs: localCache, priority: 0, read: true, write: true },
452
+ { bs: remotePeer, priority: 1, read: true, write: false }, // Read-only
453
+ ]);
454
+
455
+ // Writes go to cache only
456
+ const { blobId } = await bs.setBlob('data');
457
+
458
+ // Reads check cache first, then remote
459
+ // Remote blobs are automatically cached
460
+ const { content } = await bs.getBlob(blobId);
461
+ ```
462
+
463
+ ### Write-Through Cache
464
+
465
+ ```typescript
466
+ const bs = new BsMulti([
467
+ { bs: localCache, priority: 0, read: true, write: true },
468
+ { bs: remoteStorage, priority: 1, read: true, write: true }, // Also writable
469
+ ]);
470
+
471
+ // Writes go to both cache and remote simultaneously
472
+ await bs.setBlob('data');
473
+ ```
474
+
475
+ ### Multi-Region Replication
476
+
477
+ ```typescript
478
+ const bs = new BsMulti([
479
+ { bs: regionUs, priority: 0, read: true, write: true },
480
+ { bs: regionEu, priority: 1, read: true, write: true },
481
+ { bs: regionAsia, priority: 2, read: true, write: true },
482
+ ]);
483
+
484
+ // Writes replicate to all regions
485
+ // Reads come from fastest responding region
486
+ ```
487
+
488
+ ### Client-Server with BsPeerBridge (PULL Pattern)
489
+
490
+ ```typescript
491
+ // Client setup
492
+ const clientStorage = new BsMem();
493
+ const bridge = new BsPeerBridge(clientStorage, socketToServer);
494
+ bridge.start(); // Exposes read-only access to server
495
+
496
+ const bsPeer = new BsPeer(socketToServer);
497
+ await bsPeer.init();
498
+
499
+ const clientBs = new BsMulti([
500
+ { bs: clientStorage, priority: 0, read: true, write: true }, // Local storage
501
+ { bs: bsPeer, priority: 1, read: true, write: false }, // Server (read-only)
502
+ ]);
503
+
504
+ // Server can pull from client via bridge
505
+ // Client can pull from server via bsPeer
506
+ ```
507
+
508
+ ## Error Handling
509
+
510
+ All methods throw errors for invalid operations:
511
+
512
+ ```typescript
513
+ try {
514
+ await bs.getBlob('nonexistent-id');
515
+ } catch (error) {
516
+ console.error('Blob not found:', error instanceof Error ? error.message : error);
517
+ }
518
+
519
+ // BsMulti gracefully handles partial failures
520
+ const multi = new BsMulti([
521
+ { bs: failingStore, priority: 0, read: true, write: false },
522
+ { bs: workingStore, priority: 1, read: true, write: false },
523
+ ]);
524
+
525
+ // Falls back to workingStore if failingStore errors
526
+ const { content } = await multi.getBlob(blobId);
527
+ ```
528
+
529
+ ## Testing
530
+
531
+ `BsMem` can be used directly in unit tests, for example with Vitest:
532
+
533
+ ```typescript
534
+ import { BsMem } from '@rljson/bs';
535
+ import { describe, it, expect, beforeEach } from 'vitest';
536
+
537
+ describe('My Tests', () => {
538
+ let bs: BsMem;
539
+
540
+ beforeEach(() => {
541
+ bs = new BsMem();
542
+ });
543
+
544
+ it('should store and retrieve blobs', async () => {
545
+ const { blobId } = await bs.setBlob('test data');
546
+ const { content } = await bs.getBlob(blobId);
547
+ expect(content.toString()).toBe('test data');
548
+ });
549
+
550
+ it('should deduplicate identical content', async () => {
551
+ const result1 = await bs.setBlob('same');
552
+ const result2 = await bs.setBlob('same');
553
+ expect(result1.blobId).toBe(result2.blobId);
554
+ });
555
+ });
556
+ ```
557
+
558
+ ## Performance Considerations
559
+
560
+ ### Memory Usage
561
+
562
+ - `BsMem` stores all data in RAM - suitable for small to medium datasets
563
+ - Use streams (`getBlobStream`) for large blobs to avoid loading entire content into memory
564
+ - `BsMulti` with local cache reduces network overhead significantly
565
+
566
+ ### Network Efficiency
567
+
568
+ - Use `BsPeer` for remote access with minimal protocol overhead
569
+ - `BsMulti` automatically caches frequently accessed blobs locally
570
+ - Content-addressable nature prevents redundant transfers (same content = same hash)
571
+ - Hot-swapping in `BsMulti` reduces repeated network requests
572
+
573
+ ### Deduplication Benefits
574
+
575
+ - Identical content stored multiple times occupies space only once
576
+ - Particularly effective for:
577
+ - Version control systems (many files unchanged between versions)
578
+ - Backup solutions (incremental backups with deduplication)
579
+ - Build artifact storage (shared dependencies)
580
+ - Document management (attachments, templates)
581
+
582
+ ## Migration Guide
583
+
584
+ ### From Traditional Blob Storage
585
+
586
+ Traditional blob storage typically uses arbitrary identifiers:
587
+
588
+ ```typescript
589
+ // Traditional
590
+ await blobStore.put('my-file-id', content);
591
+ const data = await blobStore.get('my-file-id');
592
+ ```
593
+
594
+ With content-addressable storage, the ID is derived from content:
595
+
596
+ ```typescript
597
+ // Content-addressable
598
+ const { blobId } = await bs.setBlob(content); // blobId = SHA256(content)
599
+ const { content: retrieved } = await bs.getBlob(blobId);
600
+ ```
601
+
602
+ **Key Differences:**
603
+
604
+ 1. **No custom IDs**: You cannot choose blob IDs, they are computed
605
+ 2. **Automatic deduplication**: Same content = same ID
606
+ 3. **Verify on read**: You can verify content integrity by recomputing the hash
607
+ 4. **External metadata**: Store file names, tags, etc. separately (e.g., in @rljson/io)
608
+
609
+ ## Frequently Asked Questions
610
+
611
+ ### Q: How do I organize blobs into folders or containers?
612
+
613
+ A: The Bs interface provides a flat storage pool. Organizational metadata (folders, tags, file names) should be stored separately, such as in a database or using `@rljson/io` (data table storage). Reference blobs by their `blobId`.
614
+
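As a rough illustration of that separation, the sketch below keeps a path-to-`blobId` index next to the flat blob pool. `BlobIndex` is a hypothetical in-memory class and the `hash-a`/`hash-b` IDs are placeholders. A real system would persist this mapping in a database or in `@rljson/io`, as the answer suggests.

```typescript
// Hypothetical metadata index: maps human-readable paths to blobIds.
// Held in memory here only to keep the example self-contained.
class BlobIndex {
  private readonly byPath = new Map<string, string>();

  // Associate a path with a blobId; several paths may share one blobId.
  link(path: string, blobId: string): void {
    this.byPath.set(path, blobId);
  }

  // Look up the blobId stored under a path.
  resolve(path: string): string | undefined {
    return this.byPath.get(path);
  }

  // List all paths inside a "folder", which is just a path prefix.
  list(folder: string): string[] {
    const prefix = folder.endsWith('/') ? folder : `${folder}/`;
    return Array.from(this.byPath.keys())
      .filter((path) => path.startsWith(prefix))
      .sort();
  }
}

const index = new BlobIndex();
index.link('reports/2025/q1.pdf', 'hash-a');
index.link('reports/2025/q1-copy.pdf', 'hash-a'); // same content, same blobId
index.link('images/logo.png', 'hash-b');

console.log(index.list('reports').length); // 2
console.log(index.resolve('images/logo.png')); // hash-b
```

Because two paths can point at one `blobId`, copying or renaming a "file" only touches the index; the blob itself is never duplicated.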
615
+ ### Q: What happens if I delete a blob that's referenced elsewhere?
616
+
617
+ A: The blob is permanently deleted. In production systems with shared blobs, implement reference counting before deletion.
618
+
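One way to approach reference counting is sketched below. `BlobRefCounter` is a hypothetical helper, not something `@rljson/bs` provides, and it keeps counts in memory only. A production version would need durable, concurrency-safe counts.

```typescript
// Hypothetical helper: tracks how many owners reference each blobId.
class BlobRefCounter {
  private readonly counts = new Map<string, number>();

  // Record one more reference and return the new count.
  retain(blobId: string): number {
    const next = (this.counts.get(blobId) ?? 0) + 1;
    this.counts.set(blobId, next);
    return next;
  }

  // Drop one reference; returns true when no references remain,
  // i.e. when the caller may go ahead and delete the blob.
  release(blobId: string): boolean {
    const current = this.counts.get(blobId) ?? 0;
    if (current <= 1) {
      this.counts.delete(blobId);
      return true;
    }
    this.counts.set(blobId, current - 1);
    return false;
  }
}

const refs = new BlobRefCounter();
refs.retain('abc'); // document A attaches the blob
refs.retain('abc'); // document B attaches identical content (same blobId)

console.log(refs.release('abc')); // false: B still references it
console.log(refs.release('abc')); // true: last reference gone
```

With a counter like this, deletion would be guarded as `if (refs.release(blobId)) await bs.deleteBlob(blobId);`.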
619
+ ### Q: Can I use this in the browser?
620
+
621
+ A: Yes, but you'll need to provide your own Socket implementation for network communication, or use `BsMem` for local-only storage.
622
+
623
+ ### Q: How does BsMulti handle write conflicts?
624
+
625
+ A: `BsMulti` writes to all writable stores in parallel, and if any write fails, the error is thrown. Because the `blobId` is derived from the content, two writes under the same ID always carry identical bytes, so stores cannot end up holding conflicting versions of a blob.
626
+
627
+ ### Q: Why is BsPeerBridge read-only?
628
+
629
+ A: BsPeerBridge implements the PULL architecture pattern, where the server can read from client storage but cannot modify it. This prevents the server from pushing unwanted data to clients. Use BsPeer for client-to-server writes.
630
+
631
+ ## License
632
+
633
+ MIT
634
+
635
+ ## Contributing
636
+
637
+ Issues and pull requests welcome at <https://github.com/rljson/bs>
638
+
639
+ ## Related Packages
640
+
641
+ - `@rljson/io` - Data table storage interface and implementations
642
+ - `@rljson/hash` - Cryptographic hashing utilities