rust-kgdb 0.1.11 → 0.2.0

package/README.md CHANGED
@@ -5,6 +5,33 @@
 
  **Production-ready RDF/hypergraph database with 100% W3C SPARQL 1.1 + RDF 1.2 compliance, worst-case optimal joins (WCOJ), and pluggable storage backends.**
 
+ > **This npm package provides the high-performance in-memory database.**
+ > For **distributed cluster deployment** (1B+ triples, horizontal scaling), contact: **gonnect.uk@gmail.com**
+
+ ---
+
+ ## Deployment Modes
+
+ rust-kgdb supports three deployment modes:
+
+ | Mode | Use Case | Scalability | This Package |
+ |------|----------|-------------|--------------|
+ | **In-Memory** | Development, embedded apps, testing | Single node, volatile | ✅ **Included** |
+ | **Single Node (RocksDB/LMDB)** | Production, persistence needed | Single node, persistent | Via Rust crate |
+ | **Distributed Cluster** | Enterprise, 1B+ triples | Horizontal scaling, 9+ partitions | Contact us |
+
+ ### Need Distributed Cluster?
+
+ For enterprise deployments requiring:
+ - **Subject-Anchored Partitioning**: All triples for a subject guaranteed on same partition for locality
+ - Horizontal scaling across multiple nodes (1B+ triples)
+ - HDRF (High-Degree Replicated First) with power-law load balancing
+ - **OLAP Query Path**: SQL-based analytical execution for aggregations
+ - Subject-Hash Filter for accurate COUNT deduplication across replicas
+ - Kubernetes-native deployment with StatefulSet executors
+
+ **Request a demo: gonnect.uk@gmail.com**
+
  ---
 
  ## Why rust-kgdb?
@@ -108,7 +135,7 @@ rust-kgdb uses a pluggable storage architecture. **Default is in-memory** (zero
  |---------|--------------|----------|--------|
  | **InMemory** | `default` | Development, testing, embedded | ✅ **Production Ready** |
  | **RocksDB** | `rocksdb-backend` | Production, large datasets | ✅ **61 tests passing** |
- | **LMDB** | `lmdb-backend` | Read-heavy workloads | Planned v0.2.0 |
+ | **LMDB** | `lmdb-backend` | Read-heavy workloads | **31 tests passing** |
 
  ### InMemory (Default)
 
@@ -176,6 +203,58 @@ store.flush()?;
  - Unicode & binary data (4 tests)
  - Large key/value handling (8 tests)
 
+ ### LMDB (Memory-Mapped Persistent)
+
+ B+tree based storage with memory-mapped I/O (via `heed` crate). Optimized for **read-heavy workloads** with MVCC (Multi-Version Concurrency Control). Tested with **31 comprehensive tests**.
+
+ ```toml
+ # Cargo.toml - Enable LMDB backend
+ [dependencies]
+ storage = { version = "0.1.12", features = ["lmdb-backend"] }
+ ```
+
+ ```rust
+ use storage::{QuadStore, LmdbBackend};
+
+ // Create persistent database (default 10GB map size)
+ let backend = LmdbBackend::new("/path/to/data")?;
+ let store = QuadStore::new(backend);
+
+ // Or with custom map size (1GB)
+ let backend = LmdbBackend::with_map_size("/path/to/data", 1024 * 1024 * 1024)?;
+
+ // Features:
+ // - Memory-mapped I/O (zero-copy reads)
+ // - MVCC for concurrent readers
+ // - Crash-safe ACID transactions
+ // - Range & prefix scanning
+ // - Excellent for read-heavy workloads
+
+ // Sync to disk
+ store.flush()?;
+ ```
+
+ **When to use LMDB vs RocksDB:**
+
+ | Characteristic | LMDB | RocksDB |
+ |----------------|------|---------|
+ | **Read Performance** | ✅ Faster (memory-mapped) | Good |
+ | **Write Performance** | Good | ✅ Faster (LSM-tree) |
+ | **Concurrent Readers** | ✅ Unlimited | Limited by locks |
+ | **Write Amplification** | Low | Higher (compaction) |
+ | **Memory Usage** | Higher (map size) | Lower (cache-based) |
+ | **Best For** | Read-heavy, OLAP | Write-heavy, OLTP |
+
+ **LMDB Test Coverage:**
+ - Basic CRUD operations (8 tests)
+ - Range scanning (4 tests)
+ - Prefix scanning (3 tests)
+ - Batch operations (3 tests)
+ - Large key/value handling (4 tests)
+ - Concurrent access (4 tests)
+ - Statistics & flush (3 tests)
+ - Edge cases (2 tests)
+
  ### TypeScript SDK
 
  The npm package uses the in-memory backend—ideal for:
@@ -507,8 +586,52 @@ Total: ~120 bytes/triple including indexes
 
  ---
 
+ ## Performance Benchmarks
+
+ ### By Deployment Mode
+
+ | Mode | Lookup | Insert | Memory | Dataset Size |
+ |------|--------|--------|--------|--------------|
+ | **In-Memory (npm)** | 2.78 µs | 146K/sec | 24 bytes/triple | <10M triples |
+ | **Single Node (RocksDB)** | 5-10 µs | 100K/sec | On-disk | <100M triples |
+ | **Distributed Cluster** | 10-50 µs | 500K+/sec* | Distributed | **1B+ triples** |
+
+ *Aggregate throughput across all executors with HDRF partitioning
+
+ ### SIMD + PGO Query Performance (LUBM Benchmark)
+
+ | Query | Pattern | Time | Improvement |
+ |-------|---------|------|-------------|
+ | Q5 | 2-hop chain | 53ms | **77% faster** |
+ | Q3 | 3-way star | 62ms | **65% faster** |
+ | Q4 | 3-hop chain | 101ms | **60% faster** |
+ | Q8 | Triangle | 193ms | **53% faster** |
+ | Q7 | Hierarchy | 198ms | **42% faster** |
+
+ **Average: 44.5% speedup** with zero code changes (compiler optimizations only).
+
+ ---
+
  ## Version History
 
+ ### v0.2.0 (2025-12-08) - Distributed Cluster Support
+
+ - **NEW: Distributed cluster architecture** with HDRF partitioning
+ - **Subject-Hash Filter** for accurate COUNT deduplication across replicas
+ - **DataFusion-powered OLAP** with Arrow-native vectorized execution
+ - Coordinator-Executor pattern with gRPC communication
+ - 9-partition default for optimal data distribution
+ - **Contact for cluster deployment**: gonnect.uk@gmail.com
+ - **Coming soon**: Embedding support for semantic search (v0.3.0)
+
+ ### v0.1.12 (2025-12-01) - LMDB Backend Release
+
+ - **LMDB storage backend** fully implemented (31 tests passing)
+ - Memory-mapped I/O for optimal read performance
+ - MVCC concurrency for unlimited concurrent readers
+ - Complete LMDB vs RocksDB comparison documentation
+ - Sample application with 87 triples demonstrating all features
+
  ### v0.1.9 (2025-12-01) - SIMD + PGO Release
 
  - **44.5% average speedup** via SIMD + PGO compiler optimizations
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "rust-kgdb",
- "version": "0.1.11",
- "description": "High-performance RDF/SPARQL database with 100% W3C compliance and WCOJ execution",
+ "version": "0.2.0",
+ "description": "High-performance RDF/SPARQL database with 100% W3C compliance, WCOJ execution, and distributed cluster support",
  "main": "index.js",
  "types": "index.d.ts",
  "napi": {
@@ -21,7 +21,7 @@
  "build:debug": "napi build --platform native/rust-kgdb-napi",
  "prepublishOnly": "napi prepublish -t npm",
  "test": "jest",
- "version": "0.1.11"
+ "version": "0.2.0"
  },
  "keywords": [
  "rdf",
@@ -56,10 +56,10 @@
  "*.node"
  ],
  "optionalDependencies": {
- "rust-kgdb-win32-x64-msvc": "0.1.11",
- "rust-kgdb-darwin-x64": "0.1.11",
- "rust-kgdb-linux-x64-gnu": "0.1.11",
- "rust-kgdb-darwin-arm64": "0.1.11",
- "rust-kgdb-linux-arm64-gnu": "0.1.11"
+ "rust-kgdb-win32-x64-msvc": "0.2.0",
+ "rust-kgdb-darwin-x64": "0.2.0",
+ "rust-kgdb-linux-x64-gnu": "0.2.0",
+ "rust-kgdb-darwin-arm64": "0.2.0",
+ "rust-kgdb-linux-arm64-gnu": "0.2.0"
  }
  }
Binary file