rust-kgdb 0.1.9 → 0.1.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2) hide show
  1. package/README.md +426 -268
  2. package/package.json +7 -7
package/README.md CHANGED
@@ -1,237 +1,469 @@
1
- # rust-kgdb - High-Performance RDF/SPARQL Database
1
+ # rust-kgdb
2
2
 
3
3
  [![npm version](https://badge.fury.io/js/rust-kgdb.svg)](https://www.npmjs.com/package/rust-kgdb)
4
4
  [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
5
5
 
6
- **Production-ready mobile-first RDF/hypergraph database with complete SPARQL 1.1 support and worst-case optimal join (WCOJ) execution.**
6
+ **Production-ready RDF/hypergraph database with 100% W3C SPARQL 1.1 + RDF 1.2 compliance, worst-case optimal joins (WCOJ), and pluggable storage backends.**
7
7
 
8
- ## 🚀 Key Features
8
+ ---
9
9
 
10
- - **100% W3C SPARQL 1.1 Compliance** - Complete query and update support
11
- - **100% W3C RDF 1.2 Compliance** - Full standard implementation
12
- - **WCOJ Execution** (v0.1.8) - LeapFrog TrieJoin for optimal multi-way joins
13
- - **Zero-Copy Semantics** - Minimal allocations, maximum performance
14
- - **Blazing Fast** - 2.78 µs triple lookups, 146K triples/sec bulk insert
15
- - **Memory Efficient** - 24 bytes/triple (25% better than RDFox)
16
- - **Native Rust** - Safe, reliable, production-ready
10
+ ## Why rust-kgdb?
17
11
 
18
- ## 📊 Performance (v0.1.8 - WCOJ Execution)
12
+ | Feature | rust-kgdb | Apache Jena | RDFox |
13
+ |---------|-----------|-------------|-------|
14
+ | **Lookup Speed** | 2.78 µs | ~50 µs | 50-100 µs |
15
+ | **Memory/Triple** | 24 bytes | 50-60 bytes | 32 bytes |
16
+ | **SPARQL 1.1** | 100% | 100% | 95% |
17
+ | **RDF 1.2** | 100% | Partial | No |
18
+ | **WCOJ** | ✅ LeapFrog | ❌ | ❌ |
19
+ | **Mobile-Ready** | ✅ iOS/Android | ❌ | ❌ |
19
20
 
20
- ### Query Performance Improvements
21
+ ---
21
22
 
22
- | Query Type | Before (Nested Loop) | After (WCOJ) | Expected Speedup |
23
- |------------|---------------------|--------------|------------------|
24
- | **Star Queries** (3+ patterns) | O(n³) | O(n log n) | **50-100x** |
25
- | **Complex Joins** (4+ patterns) | O(n⁴) | O(n log n) | **100-1000x** |
26
- | **Chain Queries** | O(n²) | O(n log n) | **10-20x** |
23
+ ## Core Technical Innovations
27
24
 
28
- ### Benchmark Results (Apple Silicon)
25
+ ### 1. Worst-Case Optimal Joins (WCOJ)
29
26
 
30
- | Metric | Result | Rate | vs RDFox |
31
- |--------|--------|------|----------|
32
- | **Lookup** | 2.78 µs | 359K/sec | ✅ **35-180x faster** |
33
- | **Bulk Insert** | 682 ms (100K) | 146K/sec | ⚠️ 73% speed (gap closing) |
34
- | **Memory** | 24 bytes/triple | - | ✅ **25% better** |
27
+ Traditional databases use **nested-loop joins** with O(n²) to O(n⁴) complexity. rust-kgdb implements the **LeapFrog TrieJoin** algorithm—a worst-case optimal join that achieves O(n log n) for multi-way joins.
35
28
 
36
- ### SIMD + PGO Optimizations (v0.1.8)
29
+ **How it works:**
30
+ - **Trie Data Structure**: Triples indexed hierarchically (S→P→O) using BTreeMap for sorted access
31
+ - **Variable Ordering**: Frequency-based analysis orders variables for optimal intersection
32
+ - **LeapFrog Iterator**: Binary search across sorted iterators finds intersections without materializing intermediate results
37
33
 
38
- **Compiler-Level Performance Gains** - Zero code changes, pure optimization!
34
+ ```
35
+ Query: SELECT ?x ?y ?z WHERE { ?x :p ?y . ?y :q ?z . ?x :r ?z }
39
36
 
40
- | Query | Before (No SIMD) | After (SIMD+PGO) | Improvement | Category |
41
- |-------|------------------|------------------|-------------|----------|
42
- | **Q1: 4-way star** | 283ms | **258ms** | ✅ **9% faster** | Good |
43
- | **Q2: 5-way star** | 234ms | **183ms** | ✅ **22% faster** | Strong |
44
- | **Q3: 3-way star** | 177ms | **62ms** | 🔥 **65% faster** | Exceptional |
45
- | **Q4: 3-hop chain** | 254ms | **101ms** | 🔥 **60% faster** | Exceptional |
46
- | **Q5: 2-hop chain** | 230ms | **53ms** | 🔥 **77% faster** | **BEST** |
47
- | **Q6: 6-way complex** | 641ms | **464ms** | ✅ **28% faster** | Good |
48
- | **Q7: Hierarchy** | 343ms | **198ms** | ✅ **42% faster** | Strong |
49
- | **Q8: Triangle** | 410ms | **193ms** | ✅ **53% faster** | Strong |
37
+ Nested Loop: O() - examines every combination
38
+ WCOJ: O(n log n) - iterates in sorted order, seeks forward on mismatch
39
+ ```
50
40
 
51
- **Key Results:**
52
- - **Average Speedup**: **44.5%** across all 8 LUBM queries
53
- - **Best Speedup**: **77%** (Q5 - 2-hop chain query)
54
- - **Range**: 9% to 77% improvement (all queries faster!)
55
- - **Distribution**: 3 exceptional (60%+), 2 strong (40-59%), 2 good (20-39%), 1 modest (9%)
41
+ | Query Pattern | Before (Nested Loop) | After (WCOJ) | Speedup |
42
+ |---------------|---------------------|--------------|---------|
43
+ | 3-way star | O(n³) | O(n log n) | **50-100x** |
44
+ | 4+ way complex | O(n⁴) | O(n log n) | **100-1000x** |
45
+ | Chain queries | O() | O(n log n) | **10-20x** |
56
46
 
57
- **How PGO Works:**
58
- 1. **Instrumentation Build**: Compiler adds profiling hooks
59
- 2. **Profile Collection**: Run real workload (23 runtime profiles collected)
60
- 3. **Profile Merging**: Combine profiles into 5.9M merged dataset
61
- 4. **Optimized Rebuild**: Compiler uses runtime data for:
62
- - Optimized hot paths (loops, function calls)
63
- - Improved branch prediction
64
- - Enhanced instruction cache locality
65
- - Better CPU pipelining
47
+ ### 2. Sparse Matrix Engine (CSR Format)
66
48
 
67
- **Hardware:** Tested on Intel Skylake with AVX2, BMI2, POPCNT optimizations.
49
+ Binary relations (e.g., `foaf:knows`, `rdfs:subClassOf`) are converted to **Compressed Sparse Row (CSR)** matrices for cache-efficient join evaluation:
68
50
 
69
- ## 📦 Installation
51
+ - **Memory**: O(nnz) where nnz = number of edges (not O(n²))
52
+ - **Matrix Multiplication**: Replaces nested-loop joins
53
+ - **Transitive Closure**: Semi-naive Δ-matrix evaluation (not iterated powers)
70
54
 
71
- ```bash
72
- npm install rust-kgdb
55
+ ```rust
56
+ // Traditional: O(n²) nested loops
57
+ for (s, p, o) in triples { ... }
58
+
59
+ // CSR Matrix: O(nnz) cache-friendly iteration
60
+ row_ptr[i] → col_indices[j] → values[j]
73
61
  ```
74
62
 
75
- ### Prerequisites
63
+ **Used for**: RDFS/OWL reasoning, transitive closure, Datalog evaluation.
64
+
65
+ ### 3. SIMD + PGO Compiler Optimizations
66
+
67
+ **Zero code changes—pure compiler-level performance gains.**
68
+
69
+ | Optimization | Technology | Effect |
70
+ |--------------|------------|--------|
71
+ | **SIMD Vectorization** | AVX2/BMI2 (Intel), NEON (ARM) | 8-wide parallel operations |
72
+ | **Profile-Guided Optimization** | LLVM PGO | Hot path optimization, branch prediction |
73
+ | **Link-Time Optimization** | LTO (fat) | Cross-crate inlining, dead code elimination |
74
+
75
+ **Benchmark Results (LUBM, Intel Skylake):**
76
+
77
+ | Query | Before | After (SIMD+PGO) | Improvement |
78
+ |-------|--------|------------------|-------------|
79
+ | Q5: 2-hop chain | 230ms | 53ms | **77% faster** |
80
+ | Q3: 3-way star | 177ms | 62ms | **65% faster** |
81
+ | Q4: 3-hop chain | 254ms | 101ms | **60% faster** |
82
+ | Q8: Triangle | 410ms | 193ms | **53% faster** |
83
+ | Q7: Hierarchy | 343ms | 198ms | **42% faster** |
84
+ | Q6: 6-way complex | 641ms | 464ms | **28% faster** |
85
+ | Q2: 5-way star | 234ms | 183ms | **22% faster** |
86
+ | Q1: 4-way star | 283ms | 258ms | **9% faster** |
87
+
88
+ **Average speedup: 44.5%** across all queries.
89
+
90
+ ### 4. Quad Indexing (SPOC)
91
+
92
+ Four complementary indexes enable O(1) pattern matching regardless of query shape:
93
+
94
+ | Index | Pattern | Use Case |
95
+ |-------|---------|----------|
96
+ | **SPOC** | `(?s, ?p, ?o, ?g)` | Subject-centric queries |
97
+ | **POCS** | `(?p, ?o, ?c, ?s)` | Property enumeration |
98
+ | **OCSP** | `(?o, ?c, ?s, ?p)` | Object lookups (reverse links) |
99
+ | **CSPO** | `(?c, ?s, ?p, ?o)` | Named graph iteration |
100
+
101
+ ---
102
+
103
+ ## Storage Backends
104
+
105
+ rust-kgdb uses a pluggable storage architecture. **Default is in-memory** (zero configuration). For persistence, enable RocksDB.
106
+
107
+ | Backend | Feature Flag | Use Case | Status |
108
+ |---------|--------------|----------|--------|
109
+ | **InMemory** | `default` | Development, testing, embedded | ✅ **Production Ready** |
110
+ | **RocksDB** | `rocksdb-backend` | Production, large datasets | ✅ **61 tests passing** |
111
+ | **LMDB** | `lmdb-backend` | Read-heavy workloads | ⏳ Planned v0.2.0 |
112
+
113
+ ### InMemory (Default)
114
+
115
+ Zero configuration, maximum performance. Data is volatile (lost on process exit).
116
+
117
+ **High-Performance Data Structures:**
118
+
119
+ | Component | Structure | Why |
120
+ |-----------|-----------|-----|
121
+ | **Triple Store** | `DashMap` | Lock-free concurrent hash map, 100K pre-allocation |
122
+ | **WCOJ Trie** | `BTreeMap` | Sorted iteration for LeapFrog intersection |
123
+ | **Dictionary** | `FxHashSet` | String interning with rustc-optimized hashing |
124
+ | **Hypergraph** | `FxHashMap` | Fast node→edge adjacency lists |
125
+ | **Reasoning** | `AHashMap` | RDFS/OWL inference with DoS-resistant hashing |
126
+ | **Datalog** | `FxHashMap` | Semi-naive evaluation with delta propagation |
127
+
128
+ **Why these structures enable sub-microsecond performance:**
129
+ - **DashMap**: Sharded locks (16 shards default) → near-linear scaling on multi-core
130
+ - **FxHashMap**: Rust compiler's hash function → 30% faster than std HashMap
131
+ - **BTreeMap**: O(log n) ordered iteration → enables binary search in LeapFrog
132
+ - **Pre-allocation**: 100K capacity avoids rehashing during bulk inserts
133
+
134
+ ```rust
135
+ use storage::{QuadStore, InMemoryBackend};
136
+
137
+ let store = QuadStore::new(InMemoryBackend::new());
138
+ // Ultra-fast: 2.78 µs lookups, zero disk I/O
139
+ ```
140
+
141
+ ### RocksDB (Persistent)
142
+
143
+ LSM-tree based storage with ACID transactions. Tested with **61 comprehensive tests**.
144
+
145
+ ```toml
146
+ # Cargo.toml - Enable RocksDB backend
147
+ [dependencies]
148
+ storage = { version = "0.1.10", features = ["rocksdb-backend"] }
149
+ ```
150
+
151
+ ```rust
152
+ use storage::{QuadStore, RocksDbBackend};
153
+
154
+ // Create persistent database
155
+ let backend = RocksDbBackend::new("/path/to/data")?;
156
+ let store = QuadStore::new(backend);
157
+
158
+ // Features:
159
+ // - ACID transactions
160
+ // - Snappy compression (automatic)
161
+ // - Crash recovery
162
+ // - Range & prefix scanning
163
+ // - 1MB+ value support
164
+
165
+ // Force sync to disk
166
+ store.flush()?;
167
+ ```
168
+
169
+ **RocksDB Test Coverage:**
170
+ - Basic CRUD operations (14 tests)
171
+ - Range scanning (8 tests)
172
+ - Prefix scanning (6 tests)
173
+ - Batch operations (8 tests)
174
+ - Transactions (8 tests)
175
+ - Concurrent access (5 tests)
176
+ - Unicode & binary data (4 tests)
177
+ - Large key/value handling (8 tests)
178
+
179
+ ### TypeScript SDK
180
+
181
+ The npm package uses the in-memory backend—ideal for:
182
+ - Knowledge graph queries
183
+ - SPARQL execution
184
+ - Data transformation pipelines
185
+ - Embedded applications
186
+
187
+ ```typescript
188
+ import { GraphDB } from 'rust-kgdb'
76
189
 
77
- - Node.js >= 14
78
- - No additional dependencies required (native bindings included)
190
+ // In-memory database (default, no configuration needed)
191
+ const db = new GraphDB('http://example.org/app')
192
+
193
+ // For persistence, export via CONSTRUCT:
194
+ const ntriples = db.queryConstruct('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')
195
+ fs.writeFileSync('backup.nt', ntriples)
196
+ ```
197
+
198
+ ---
199
+
200
+ ## Installation
201
+
202
+ ```bash
203
+ npm install rust-kgdb
204
+ ```
79
205
 
80
206
  ### Platform Support
81
207
 
82
- | Platform | Architecture | Status | Notes |
83
- |----------|-------------|--------|-------|
84
- | **macOS** | x64 (Intel) | ✅ Fully Supported | SIMD+PGO optimized (AVX2) |
85
- | **macOS** | arm64 (Apple Silicon) | ✅ Fully Supported | SIMD+PGO optimized (NEON) |
86
- | **Linux** | x64 | ✅ Fully Supported | SIMD+PGO optimized (AVX2) |
87
- | **Linux** | arm64 | ✅ Fully Supported | SIMD+PGO optimized (NEON) |
88
- | **Windows** | x64 | ✅ Fully Supported | SIMD+PGO optimized (AVX2) |
89
- | **Windows** | arm64 | ⏳ Coming Soon | Planned for v0.2.0 |
208
+ | Platform | Architecture | Status | SIMD |
209
+ |----------|-------------|--------|------|
210
+ | **macOS** | Intel (x64) | ✅ | AVX2, BMI2, POPCNT |
211
+ | **macOS** | Apple Silicon (arm64) | ✅ | NEON |
212
+ | **Linux** | x64 | ✅ | AVX2, BMI2, POPCNT |
213
+ | **Linux** | arm64 | ✅ | NEON |
214
+ | **Windows** | x64 | ✅ | AVX2, BMI2, POPCNT |
215
+ | **Windows** | arm64 | ⏳ v0.2.0 | — |
90
216
 
91
- **SIMD Optimizations** (v0.1.8):
92
- - **Intel/AMD (x64)**: AVX2, BMI2, POPCNT auto-vectorization
93
- - **Apple Silicon (arm64)**: NEON auto-vectorization
94
- - **Profile-Guided Optimization (PGO)**: Runtime profile-based code generation
217
+ **No compilation required**—pre-built native binaries included.
218
+
219
+ ---
95
220
 
96
- **Native Bindings**: Pre-compiled binaries included for all platforms. No compilation required during `npm install`.
221
+ ## Quick Start
97
222
 
98
- ## 🎯 Quick Start
223
+ ### Complete Working Example
99
224
 
100
225
  ```typescript
101
- import { GraphDB, Node } from 'rust-kgdb'
226
+ import { GraphDB } from 'rust-kgdb'
102
227
 
103
- // Create in-memory database
104
- const db = new GraphDB('http://example.org/my-app')
228
+ // 1. Create database
229
+ const db = new GraphDB('http://example.org/myapp')
105
230
 
106
- // Insert triples
231
+ // 2. Load data (Turtle format)
107
232
  db.loadTtl(`
108
233
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
234
+ @prefix ex: <http://example.org/> .
109
235
 
110
- <http://example.org/alice> foaf:name "Alice" ;
111
- foaf:age 30 ;
112
- foaf:knows <http://example.org/bob> .
236
+ ex:alice a foaf:Person ;
237
+ foaf:name "Alice" ;
238
+ foaf:age 30 ;
239
+ foaf:knows ex:bob, ex:charlie .
113
240
 
114
- <http://example.org/bob> foaf:name "Bob" ;
115
- foaf:age 25 .
241
+ ex:bob a foaf:Person ;
242
+ foaf:name "Bob" ;
243
+ foaf:age 25 ;
244
+ foaf:knows ex:charlie .
245
+
246
+ ex:charlie a foaf:Person ;
247
+ foaf:name "Charlie" ;
248
+ foaf:age 35 .
116
249
  `, null)
117
250
 
118
- // SPARQL SELECT query
119
- const results = db.querySelect(`
251
+ // 3. Query: Find friends-of-friends (WCOJ optimized!)
252
+ const fof = db.querySelect(`
120
253
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
254
+ PREFIX ex: <http://example.org/>
121
255
 
122
- SELECT ?person ?name ?age WHERE {
123
- ?person foaf:name ?name ;
124
- foaf:age ?age .
256
+ SELECT ?person ?friend ?fof WHERE {
257
+ ?person foaf:knows ?friend .
258
+ ?friend foaf:knows ?fof .
259
+ FILTER(?person != ?fof)
125
260
  }
126
- ORDER BY DESC(?age)
127
261
  `)
262
+ console.log('Friends of Friends:', fof)
263
+ // [{ person: 'ex:alice', friend: 'ex:bob', fof: 'ex:charlie' }]
264
+
265
+ // 4. Aggregation: Average age
266
+ const stats = db.querySelect(`
267
+ PREFIX foaf: <http://xmlns.com/foaf/0.1/>
128
268
 
129
- console.log(results)
130
- // [
131
- // { person: '<http://example.org/alice>', name: '"Alice"', age: '30' },
132
- // { person: '<http://example.org/bob>', name: '"Bob"', age: '25' }
133
- // ]
269
+ SELECT (COUNT(?p) AS ?count) (AVG(?age) AS ?avgAge) WHERE {
270
+ ?p a foaf:Person ; foaf:age ?age .
271
+ }
272
+ `)
273
+ console.log('Stats:', stats)
274
+ // [{ count: '3', avgAge: '30.0' }]
134
275
 
135
- // SPARQL ASK query
276
+ // 5. ASK query
136
277
  const hasAlice = db.queryAsk(`
137
- ASK { <http://example.org/alice> foaf:name "Alice" }
278
+ PREFIX ex: <http://example.org/>
279
+ ASK { ex:alice a <http://xmlns.com/foaf/0.1/Person> }
138
280
  `)
139
- console.log(hasAlice) // true
281
+ console.log('Has Alice?', hasAlice) // true
282
+
283
+ // 6. CONSTRUCT query
284
+ const graph = db.queryConstruct(`
285
+ PREFIX foaf: <http://xmlns.com/foaf/0.1/>
286
+ PREFIX ex: <http://example.org/>
287
+
288
+ CONSTRUCT { ?p foaf:knows ?f }
289
+ WHERE { ?p foaf:knows ?f }
290
+ `)
291
+ console.log('Extracted graph:', graph)
292
+
293
+ // 7. Count and cleanup
294
+ console.log('Triple count:', db.count()) // 11
295
+ db.clear()
296
+ ```
297
+
298
+ ### Save to File
299
+
300
+ ```typescript
301
+ import { writeFileSync } from 'fs'
302
+
303
+ // Save as N-Triples
304
+ const db = new GraphDB('http://example.org/export')
305
+ db.loadTtl(`<http://example.org/s> <http://example.org/p> "value" .`, null)
140
306
 
141
- // Count triples
142
- console.log(db.count()) // 5
307
+ const ntriples = db.queryConstruct(`CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }`)
308
+ writeFileSync('output.nt', ntriples)
143
309
  ```
144
310
 
145
- ## 🔥 WCOJ Execution Examples (v0.1.8)
311
+ ---
146
312
 
147
- ### Star Query (50-100x Faster!)
313
+ ## SPARQL 1.1 Features (100% W3C Compliant)
314
+
315
+ ### Query Forms
148
316
 
149
317
  ```typescript
150
- // Find people with name, age, and email
151
- const starQuery = db.querySelect(`
152
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
318
+ // SELECT - return bindings
319
+ db.querySelect('SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10')
153
320
 
154
- SELECT ?person ?name ?age ?email WHERE {
155
- ?person foaf:name ?name .
156
- ?person foaf:age ?age .
157
- ?person foaf:email ?email .
158
- }
321
+ // ASK - boolean existence check
322
+ db.queryAsk('ASK { <http://example.org/x> ?p ?o }')
323
+
324
+ // CONSTRUCT - build new graph
325
+ db.queryConstruct('CONSTRUCT { ?s <http://new/prop> ?o } WHERE { ?s ?p ?o }')
326
+ ```
327
+
328
+ ### Aggregates
329
+
330
+ ```typescript
331
+ db.querySelect(`
332
+ SELECT ?type (COUNT(*) AS ?count) (AVG(?value) AS ?avg)
333
+ WHERE { ?s a ?type ; <http://ex/value> ?value }
334
+ GROUP BY ?type
335
+ HAVING (COUNT(*) > 5)
336
+ ORDER BY DESC(?count)
159
337
  `)
338
+ ```
339
+
340
+ ### Property Paths
160
341
 
161
- // Automatically uses WCOJ execution for optimal performance
162
- // Expected speedup: 50-100x over nested loop joins
342
+ ```typescript
343
+ // Transitive closure (rdfs:subClassOf*)
344
+ db.querySelect('SELECT ?class WHERE { ?class rdfs:subClassOf* <http://top/Class> }')
345
+
346
+ // Alternative paths
347
+ db.querySelect('SELECT ?name WHERE { ?x (foaf:name|rdfs:label) ?name }')
348
+
349
+ // Sequence paths
350
+ db.querySelect('SELECT ?grandparent WHERE { ?x foaf:parent/foaf:parent ?grandparent }')
163
351
  ```
164
352
 
165
- ### Complex Join (100-1000x Faster!)
353
+ ### Named Graphs
166
354
 
167
355
  ```typescript
168
- // Find coworker connections
169
- const complexJoin = db.querySelect(`
170
- PREFIX org: <http://example.org/>
171
-
172
- SELECT ?person1 ?person2 ?company WHERE {
173
- ?person1 org:worksAt ?company .
174
- ?person2 org:worksAt ?company .
175
- ?person1 org:name ?name1 .
176
- ?person2 org:name ?name2 .
177
- FILTER(?person1 != ?person2)
356
+ // Load into named graph
357
+ db.loadTtl('<http://s> <http://p> "o" .', 'http://example.org/graph1')
358
+
359
+ // Query specific graph
360
+ db.querySelect(`
361
+ SELECT ?s ?p ?o WHERE {
362
+ GRAPH <http://example.org/graph1> { ?s ?p ?o }
178
363
  }
179
364
  `)
365
+ ```
366
+
367
+ ### UPDATE Operations
180
368
 
181
- // WCOJ automatically selected for 4+ pattern joins
182
- // Expected speedup: 100-1000x over nested loop
369
+ ```typescript
370
+ // INSERT DATA
371
+ db.updateInsert(`
372
+ INSERT DATA { <http://ex/new> <http://ex/prop> "value" }
373
+ `)
374
+
375
+ // DELETE WHERE
376
+ db.updateDelete(`
377
+ DELETE WHERE { ?s <http://ex/deprecated> ?o }
378
+ `)
183
379
  ```
184
380
 
185
- ### Chain Query (10-20x Faster!)
381
+ ---
382
+
383
+ ## Sample Application
384
+
385
+ ### Knowledge Graph Demo
386
+
387
+ A complete, production-ready sample application demonstrating enterprise knowledge graph capabilities is available in the repository.
388
+
389
+ **Location**: [`examples/knowledge-graph-demo/`](../../examples/knowledge-graph-demo/)
390
+
391
+ **Features Demonstrated**:
392
+ - Complete organizational knowledge graph (employees, departments, projects, skills)
393
+ - SPARQL SELECT queries with star and chain patterns (WCOJ-optimized)
394
+ - Aggregations (COUNT, AVG, GROUP BY, HAVING)
395
+ - Property paths for transitive closure (organizational hierarchy)
396
+ - SPARQL ASK and CONSTRUCT queries
397
+ - Named graphs for multi-tenant data isolation
398
+ - Data export to Turtle format
399
+
400
+ **Run the Demo**:
401
+
402
+ ```bash
403
+ cd examples/knowledge-graph-demo
404
+ npm install
405
+ npm start
406
+ ```
407
+
408
+ **Sample Output**:
409
+
410
+ The demo creates a realistic knowledge graph with:
411
+ - 5 employees across 4 departments
412
+ - 13 technical and soft skills
413
+ - 2 software projects
414
+ - Reporting hierarchies and salary data
415
+ - Named graph for sensitive compensation data
416
+
417
+ **Example Query from Demo** (finds all direct and indirect reports):
186
418
 
187
419
  ```typescript
188
- // Friend-of-friend pattern
189
- const chainQuery = db.querySelect(`
420
+ const pathQuery = `
421
+ PREFIX ex: <http://example.org/>
190
422
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
191
423
 
192
- SELECT ?person1 ?person2 ?person3 WHERE {
193
- ?person1 foaf:knows ?person2 .
194
- ?person2 foaf:knows ?person3 .
424
+ SELECT ?employee ?name WHERE {
425
+ ?employee ex:reportsTo+ ex:alice . # Transitive closure
426
+ ?employee foaf:name ?name .
195
427
  }
196
- `)
197
-
198
- // WCOJ optimizes chain patterns
199
- // Expected speedup: 10-20x over nested loop
428
+ ORDER BY ?name
429
+ `
430
+ const results = db.querySelect(pathQuery)
200
431
  ```
201
432
 
202
- ## 📚 Full API Reference
433
+ **Learn More**: See the [demo README](../../examples/knowledge-graph-demo/README.md) for full documentation, query examples, and how to customize the knowledge graph.
434
+
435
+ ---
436
+
437
+ ## API Reference
203
438
 
204
439
  ### GraphDB Class
205
440
 
206
441
  ```typescript
207
442
  class GraphDB {
208
- // Create database
209
- static inMemory(): GraphDB
210
- constructor(baseUri: string)
443
+ constructor(baseUri: string) // Create with base URI
444
+ static inMemory(): GraphDB // Create anonymous in-memory DB
211
445
 
212
- // Data loading
213
- loadTtl(data: string, graphName: string | null): void
214
- loadNTriples(data: string, graphName: string | null): void
446
+ // Data Loading
447
+ loadTtl(data: string, graph: string | null): void
448
+ loadNTriples(data: string, graph: string | null): void
215
449
 
216
- // SPARQL queries (WCOJ execution in v0.1.8!)
450
+ // SPARQL Queries (WCOJ-optimized)
217
451
  querySelect(sparql: string): Array<Record<string, string>>
218
452
  queryAsk(sparql: string): boolean
219
- queryConstruct(sparql: string): string
453
+ queryConstruct(sparql: string): string // Returns N-Triples
220
454
 
221
- // SPARQL updates
455
+ // SPARQL Updates
222
456
  updateInsert(sparql: string): void
223
457
  updateDelete(sparql: string): void
224
458
 
225
- // Database operations
459
+ // Database Operations
226
460
  count(): number
227
461
  clear(): void
228
-
229
- // Metadata
230
462
  getVersion(): string
231
463
  }
232
464
  ```
233
465
 
234
- ### Node Class (Triple Construction)
466
+ ### Node Class
235
467
 
236
468
  ```typescript
237
469
  class Node {
@@ -245,165 +477,91 @@ class Node {
245
477
  }
246
478
  ```
247
479
 
248
- ## 🎓 Advanced Usage
480
+ ---
249
481
 
250
- ### SPARQL UPDATE Operations
482
+ ## Performance Characteristics
251
483
 
252
- ```typescript
253
- // INSERT DATA
254
- db.updateInsert(`
255
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
484
+ ### Complexity Analysis
256
485
 
257
- INSERT DATA {
258
- <http://example.org/charlie> foaf:name "Charlie" ;
259
- foaf:age 35 .
260
- }
261
- `)
486
+ | Operation | Complexity | Notes |
487
+ |-----------|------------|-------|
488
+ | Triple lookup | O(1) | Hash-based SPOC index |
489
+ | Pattern scan | O(k) | k = matching triples |
490
+ | Star join (WCOJ) | O(n log n) | LeapFrog intersection |
491
+ | Complex join (WCOJ) | O(n log n) | Trie-based |
492
+ | Transitive closure | O(n²) worst | CSR matrix optimization |
493
+ | Bulk insert | O(n) | Batch indexing |
262
494
 
263
- // DELETE WHERE
264
- db.updateDelete(`
265
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
495
+ ### Memory Layout
266
496
 
267
- DELETE WHERE {
268
- ?person foaf:age ?age .
269
- FILTER(?age < 18)
270
- }
271
- `)
272
497
  ```
273
-
274
- ### Named Graphs
275
-
276
- ```typescript
277
- // Load into named graph
278
- db.loadTtl(`
279
- <http://example.org/resource> <http://purl.org/dc/terms/title> "Title" .
280
- `, 'http://example.org/graph1')
281
-
282
- // Query specific graph
283
- const results = db.querySelect(`
284
- SELECT ?s ?p ?o WHERE {
285
- GRAPH <http://example.org/graph1> {
286
- ?s ?p ?o .
287
- }
288
- }
289
- `)
290
- ```
291
-
292
- ### SPARQL 1.1 Aggregates
293
-
294
- ```typescript
295
- // COUNT, AVG, MIN, MAX, SUM
296
- const aggregates = db.querySelect(`
297
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
298
-
299
- SELECT
300
- (COUNT(?person) AS ?count)
301
- (AVG(?age) AS ?avgAge)
302
- (MIN(?age) AS ?minAge)
303
- (MAX(?age) AS ?maxAge)
304
- WHERE {
305
- ?person foaf:age ?age .
306
- }
307
- `)
498
+ Triple: 24 bytes
499
+ ├── Subject: 8 bytes (dictionary ID)
500
+ ├── Predicate: 8 bytes (dictionary ID)
501
+ └── Object: 8 bytes (dictionary ID)
502
+
503
+ String Interning: All URIs/literals stored once in Dictionary
504
+ Index Overhead: ~4x base triple size (4 indexes)
505
+ Total: ~120 bytes/triple including indexes
308
506
  ```
309
507
 
310
- ### SPARQL 1.1 Property Paths
311
-
312
- ```typescript
313
- // Transitive closure with *
314
- const transitiveKnows = db.querySelect(`
315
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
316
-
317
- SELECT ?person ?connected WHERE {
318
- <http://example.org/alice> foaf:knows* ?connected .
319
- }
320
- `)
321
-
322
- // Alternative paths with |
323
- const nameOrLabel = db.querySelect(`
324
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
325
- PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
326
-
327
- SELECT ?resource ?name WHERE {
328
- ?resource (foaf:name|rdfs:label) ?name .
329
- }
330
- `)
331
- ```
508
+ ---
332
509
 
333
- ## 🏗️ Architecture
510
+ ## Version History
334
511
 
335
- - **Core**: Pure Rust implementation with zero-copy semantics
336
- - **Bindings**: NAPI-RS for native Node.js addon
337
- - **Storage**: Pluggable backends (InMemory, RocksDB, LMDB)
338
- - **Indexing**: SPOC, POCS, OCSP, CSPO quad indexes
339
- - **Query Optimizer**: Automatic WCOJ detection and execution
340
- - **WCOJ Engine**: LeapFrog TrieJoin with variable ordering analysis
512
+ ### v0.1.9 (2025-12-01) - SIMD + PGO Release
341
513
 
342
- ## 📈 Version History
514
+ - **44.5% average speedup** via SIMD + PGO compiler optimizations
515
+ - WCOJ execution with LeapFrog TrieJoin
516
+ - Release automation infrastructure
517
+ - All packages updated to gonnect-uk namespace
343
518
 
344
- ### v0.1.8 (2025-12-01) - WCOJ Execution!
519
+ ### v0.1.8 (2025-12-01) - WCOJ Execution
345
520
 
346
- - ✅ **WCOJ Execution Path Activated** - LeapFrog TrieJoin for multi-way joins
347
- - ✅ **Variable Ordering Analysis** - Frequency-based optimization for WCOJ
348
- - **50-100x Speedup** for star queries (3+ patterns with shared variable)
349
- - ✅ **100-1000x Speedup** for complex joins (4+ patterns)
350
- - ✅ **577 Tests Passing** - Comprehensive end-to-end verification
351
- - ✅ **Zero Regressions** - All existing queries work unchanged
521
+ - WCOJ execution path activated
522
+ - Variable ordering analysis for optimal joins
523
+ - 577 tests passing
352
524
 
353
525
  ### v0.1.7 (2025-11-30)
354
526
 
355
527
  - Query optimizer with automatic strategy selection
356
528
  - WCOJ algorithm integration (planning phase)
357
- - Query plan visualization API
358
529
 
359
530
  ### v0.1.3 (2025-11-18)
360
531
 
361
- - Initial TypeScript SDK release
532
+ - Initial TypeScript SDK
362
533
  - 100% W3C SPARQL 1.1 compliance
363
534
  - 100% W3C RDF 1.2 compliance
364
535
 
365
- ## 🔬 Testing
366
-
367
- ```bash
368
- # Run test suite
369
- npm test
370
-
371
- # Run specific tests
372
- npm test -- --testNamePattern="star query"
373
- ```
374
-
375
- ## 🤝 Contributing
536
+ ---
376
537
 
377
- Contributions are welcome! Please see [CONTRIBUTING.md](https://github.com/gonnect-uk/rust-kgdb/blob/main/CONTRIBUTING.md)
538
+ ## Use Cases
378
539
 
379
- ## 📄 License
540
+ | Domain | Application |
541
+ |--------|-------------|
542
+ | **Knowledge Graphs** | Enterprise ontologies, taxonomies |
543
+ | **Semantic Search** | Structured queries over unstructured data |
544
+ | **Data Integration** | ETL with SPARQL CONSTRUCT |
545
+ | **Compliance** | SHACL validation, provenance tracking |
546
+ | **Graph Analytics** | Pattern detection, community analysis |
547
+ | **Mobile Apps** | Embedded RDF on iOS/Android |
380
548
 
381
- Apache License 2.0 - See [LICENSE](https://github.com/gonnect-uk/rust-kgdb/blob/main/LICENSE)
549
+ ---
382
550
 
383
- ## 🔗 Links
551
+ ## Links
384
552
 
385
553
  - [GitHub Repository](https://github.com/gonnect-uk/rust-kgdb)
386
554
  - [Documentation](https://github.com/gonnect-uk/rust-kgdb/tree/main/docs)
387
555
  - [CHANGELOG](https://github.com/gonnect-uk/rust-kgdb/blob/main/CHANGELOG.md)
388
- - [W3C SPARQL 1.1 Spec](https://www.w3.org/TR/sparql11-query/)
389
- - [W3C RDF 1.2 Spec](https://www.w3.org/TR/rdf12-concepts/)
556
+ - [W3C SPARQL 1.1](https://www.w3.org/TR/sparql11-query/)
557
+ - [W3C RDF 1.2](https://www.w3.org/TR/rdf12-concepts/)
390
558
 
391
- ## 💡 Use Cases
392
-
393
- - **Knowledge Graphs** - Build semantic data models
394
- - **Semantic Search** - Query structured data with SPARQL
395
- - **Data Integration** - Combine data from multiple sources
396
- - **Ontology Reasoning** - RDFS and OWL inference
397
- - **Graph Analytics** - Complex pattern matching with WCOJ
398
- - **Mobile Apps** - Embedded RDF database for iOS/Android
559
+ ---
399
560
 
400
- ## 🎯 Roadmap
561
+ ## License
401
562
 
402
- - [x] v0.1.8: WCOJ execution + SIMD + PGO optimizations (35-55% faster!)
403
- - [ ] v0.1.9: Manual SIMD vectorization for 2-4x additional speedup
404
- - [ ] v0.2.0: Windows ARM64 support + distributed query execution
405
- - [ ] v0.3.0: Graph analytics and reasoning engines
563
+ Apache License 2.0
406
564
 
407
565
  ---
408
566
 
409
- **Built with ❤️ using Rust and NAPI-RS**
567
+ **Built with Rust + NAPI-RS**
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "rust-kgdb",
3
- "version": "0.1.9",
3
+ "version": "0.1.11",
4
4
  "description": "High-performance RDF/SPARQL database with 100% W3C compliance and WCOJ execution",
5
5
  "main": "index.js",
6
6
  "types": "index.d.ts",
@@ -21,7 +21,7 @@
21
21
  "build:debug": "napi build --platform native/rust-kgdb-napi",
22
22
  "prepublishOnly": "napi prepublish -t npm",
23
23
  "test": "jest",
24
- "version": "0.1.9"
24
+ "version": "0.1.11"
25
25
  },
26
26
  "keywords": [
27
27
  "rdf",
@@ -56,10 +56,10 @@
56
56
  "*.node"
57
57
  ],
58
58
  "optionalDependencies": {
59
- "rust-kgdb-win32-x64-msvc": "0.1.9",
60
- "rust-kgdb-darwin-x64": "0.1.9",
61
- "rust-kgdb-linux-x64-gnu": "0.1.9",
62
- "rust-kgdb-darwin-arm64": "0.1.9",
63
- "rust-kgdb-linux-arm64-gnu": "0.1.9"
59
+ "rust-kgdb-win32-x64-msvc": "0.1.11",
60
+ "rust-kgdb-darwin-x64": "0.1.11",
61
+ "rust-kgdb-linux-x64-gnu": "0.1.11",
62
+ "rust-kgdb-darwin-arm64": "0.1.11",
63
+ "rust-kgdb-linux-arm64-gnu": "0.1.11"
64
64
  }
65
65
  }