rust-kgdb 0.1.9 → 0.1.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2) hide show
  1. package/README.md +373 -269
  2. package/package.json +7 -7
package/README.md CHANGED
@@ -1,237 +1,415 @@
1
- # rust-kgdb - High-Performance RDF/SPARQL Database
1
+ # rust-kgdb
2
2
 
3
3
  [![npm version](https://badge.fury.io/js/rust-kgdb.svg)](https://www.npmjs.com/package/rust-kgdb)
4
4
  [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
5
5
 
6
- **Production-ready mobile-first RDF/hypergraph database with complete SPARQL 1.1 support and worst-case optimal join (WCOJ) execution.**
6
+ **Production-ready RDF/hypergraph database with 100% W3C SPARQL 1.1 + RDF 1.2 compliance, worst-case optimal joins (WCOJ), and pluggable storage backends.**
7
7
 
8
- ## 🚀 Key Features
8
+ ---
9
9
 
10
- - **100% W3C SPARQL 1.1 Compliance** - Complete query and update support
11
- - **100% W3C RDF 1.2 Compliance** - Full standard implementation
12
- - **WCOJ Execution** (v0.1.8) - LeapFrog TrieJoin for optimal multi-way joins
13
- - **Zero-Copy Semantics** - Minimal allocations, maximum performance
14
- - **Blazing Fast** - 2.78 µs triple lookups, 146K triples/sec bulk insert
15
- - **Memory Efficient** - 24 bytes/triple (25% better than RDFox)
16
- - **Native Rust** - Safe, reliable, production-ready
10
+ ## Why rust-kgdb?
17
11
 
18
- ## 📊 Performance (v0.1.8 - WCOJ Execution)
12
+ | Feature | rust-kgdb | Apache Jena | RDFox |
13
+ |---------|-----------|-------------|-------|
14
+ | **Lookup Speed** | 2.78 µs | ~50 µs | 50-100 µs |
15
+ | **Memory/Triple** | 24 bytes | 50-60 bytes | 32 bytes |
16
+ | **SPARQL 1.1** | 100% | 100% | 95% |
17
+ | **RDF 1.2** | 100% | Partial | No |
18
+ | **WCOJ** | ✅ LeapFrog | ❌ | ❌ |
19
+ | **Mobile-Ready** | ✅ iOS/Android | ❌ | ❌ |
19
20
 
20
- ### Query Performance Improvements
21
+ ---
21
22
 
22
- | Query Type | Before (Nested Loop) | After (WCOJ) | Expected Speedup |
23
- |------------|---------------------|--------------|------------------|
24
- | **Star Queries** (3+ patterns) | O(n³) | O(n log n) | **50-100x** |
25
- | **Complex Joins** (4+ patterns) | O(n⁴) | O(n log n) | **100-1000x** |
26
- | **Chain Queries** | O(n²) | O(n log n) | **10-20x** |
23
+ ## Core Technical Innovations
27
24
 
28
- ### Benchmark Results (Apple Silicon)
25
+ ### 1. Worst-Case Optimal Joins (WCOJ)
29
26
 
30
- | Metric | Result | Rate | vs RDFox |
31
- |--------|--------|------|----------|
32
- | **Lookup** | 2.78 µs | 359K/sec | ✅ **35-180x faster** |
33
- | **Bulk Insert** | 682 ms (100K) | 146K/sec | ⚠️ 73% speed (gap closing) |
34
- | **Memory** | 24 bytes/triple | - | ✅ **25% better** |
27
+ Traditional databases use **nested-loop joins** with O(n²) to O(n⁴) complexity. rust-kgdb implements the **LeapFrog TrieJoin** algorithm—a worst-case optimal join that achieves O(n log n) for multi-way joins.
35
28
 
36
- ### SIMD + PGO Optimizations (v0.1.8)
29
+ **How it works:**
30
+ - **Trie Data Structure**: Triples indexed hierarchically (S→P→O) using BTreeMap for sorted access
31
+ - **Variable Ordering**: Frequency-based analysis orders variables for optimal intersection
32
+ - **LeapFrog Iterator**: Binary search across sorted iterators finds intersections without materializing intermediate results
37
33
 
38
- **Compiler-Level Performance Gains** - Zero code changes, pure optimization!
34
+ ```
35
+ Query: SELECT ?x ?y ?z WHERE { ?x :p ?y . ?y :q ?z . ?x :r ?z }
39
36
 
40
- | Query | Before (No SIMD) | After (SIMD+PGO) | Improvement | Category |
41
- |-------|------------------|------------------|-------------|----------|
42
- | **Q1: 4-way star** | 283ms | **258ms** | ✅ **9% faster** | Good |
43
- | **Q2: 5-way star** | 234ms | **183ms** | ✅ **22% faster** | Strong |
44
- | **Q3: 3-way star** | 177ms | **62ms** | 🔥 **65% faster** | Exceptional |
45
- | **Q4: 3-hop chain** | 254ms | **101ms** | 🔥 **60% faster** | Exceptional |
46
- | **Q5: 2-hop chain** | 230ms | **53ms** | 🔥 **77% faster** | **BEST** |
47
- | **Q6: 6-way complex** | 641ms | **464ms** | ✅ **28% faster** | Good |
48
- | **Q7: Hierarchy** | 343ms | **198ms** | ✅ **42% faster** | Strong |
49
- | **Q8: Triangle** | 410ms | **193ms** | ✅ **53% faster** | Strong |
37
+ Nested Loop: O() - examines every combination
38
+ WCOJ: O(n log n) - iterates in sorted order, seeks forward on mismatch
39
+ ```
50
40
 
51
- **Key Results:**
52
- - **Average Speedup**: **44.5%** across all 8 LUBM queries
53
- - **Best Speedup**: **77%** (Q5 - 2-hop chain query)
54
- - **Range**: 9% to 77% improvement (all queries faster!)
55
- - **Distribution**: 3 exceptional (60%+), 2 strong (40-59%), 2 good (20-39%), 1 modest (9%)
41
+ | Query Pattern | Before (Nested Loop) | After (WCOJ) | Speedup |
42
+ |---------------|---------------------|--------------|---------|
43
+ | 3-way star | O(n³) | O(n log n) | **50-100x** |
44
+ | 4+ way complex | O(n⁴) | O(n log n) | **100-1000x** |
45
+ | Chain queries | O() | O(n log n) | **10-20x** |
56
46
 
57
- **How PGO Works:**
58
- 1. **Instrumentation Build**: Compiler adds profiling hooks
59
- 2. **Profile Collection**: Run real workload (23 runtime profiles collected)
60
- 3. **Profile Merging**: Combine profiles into 5.9M merged dataset
61
- 4. **Optimized Rebuild**: Compiler uses runtime data for:
62
- - Optimized hot paths (loops, function calls)
63
- - Improved branch prediction
64
- - Enhanced instruction cache locality
65
- - Better CPU pipelining
47
+ ### 2. Sparse Matrix Engine (CSR Format)
66
48
 
67
- **Hardware:** Tested on Intel Skylake with AVX2, BMI2, POPCNT optimizations.
49
+ Binary relations (e.g., `foaf:knows`, `rdfs:subClassOf`) are converted to **Compressed Sparse Row (CSR)** matrices for cache-efficient join evaluation:
68
50
 
69
- ## 📦 Installation
51
+ - **Memory**: O(nnz) where nnz = number of edges (not O(n²))
52
+ - **Matrix Multiplication**: Replaces nested-loop joins
53
+ - **Transitive Closure**: Semi-naive Δ-matrix evaluation (not iterated powers)
70
54
 
71
- ```bash
72
- npm install rust-kgdb
55
+ ```rust
56
+ // Traditional: O(n²) nested loops
57
+ for (s, p, o) in triples { ... }
58
+
59
+ // CSR Matrix: O(nnz) cache-friendly iteration
60
+ row_ptr[i] → col_indices[j] → values[j]
61
+ ```
62
+
63
+ **Used for**: RDFS/OWL reasoning, transitive closure, Datalog evaluation.
64
+
65
+ ### 3. SIMD + PGO Compiler Optimizations
66
+
67
+ **Zero code changes—pure compiler-level performance gains.**
68
+
69
+ | Optimization | Technology | Effect |
70
+ |--------------|------------|--------|
71
+ | **SIMD Vectorization** | AVX2/BMI2 (Intel), NEON (ARM) | 8-wide parallel operations |
72
+ | **Profile-Guided Optimization** | LLVM PGO | Hot path optimization, branch prediction |
73
+ | **Link-Time Optimization** | LTO (fat) | Cross-crate inlining, dead code elimination |
74
+
75
+ **Benchmark Results (LUBM, Intel Skylake):**
76
+
77
+ | Query | Before | After (SIMD+PGO) | Improvement |
78
+ |-------|--------|------------------|-------------|
79
+ | Q5: 2-hop chain | 230ms | 53ms | **77% faster** |
80
+ | Q3: 3-way star | 177ms | 62ms | **65% faster** |
81
+ | Q4: 3-hop chain | 254ms | 101ms | **60% faster** |
82
+ | Q8: Triangle | 410ms | 193ms | **53% faster** |
83
+ | Q7: Hierarchy | 343ms | 198ms | **42% faster** |
84
+ | Q6: 6-way complex | 641ms | 464ms | **28% faster** |
85
+ | Q2: 5-way star | 234ms | 183ms | **22% faster** |
86
+ | Q1: 4-way star | 283ms | 258ms | **9% faster** |
87
+
88
+ **Average speedup: 44.5%** across all queries.
89
+
90
+ ### 4. Quad Indexing (SPOC)
91
+
92
+ Four complementary indexes enable O(1) pattern matching regardless of query shape:
93
+
94
+ | Index | Pattern | Use Case |
95
+ |-------|---------|----------|
96
+ | **SPOC** | `(?s, ?p, ?o, ?g)` | Subject-centric queries |
97
+ | **POCS** | `(?p, ?o, ?c, ?s)` | Property enumeration |
98
+ | **OCSP** | `(?o, ?c, ?s, ?p)` | Object lookups (reverse links) |
99
+ | **CSPO** | `(?c, ?s, ?p, ?o)` | Named graph iteration |
100
+
101
+ ---
102
+
103
+ ## Storage Backends
104
+
105
+ rust-kgdb uses a pluggable storage architecture. **Default is in-memory** (zero configuration). For persistence, enable RocksDB.
106
+
107
+ | Backend | Feature Flag | Use Case | Status |
108
+ |---------|--------------|----------|--------|
109
+ | **InMemory** | `default` | Development, testing, embedded | ✅ **Production Ready** |
110
+ | **RocksDB** | `rocksdb-backend` | Production, large datasets | ✅ **61 tests passing** |
111
+ | **LMDB** | `lmdb-backend` | Read-heavy workloads | ⏳ Planned v0.2.0 |
112
+
113
+ ### InMemory (Default)
114
+
115
+ Zero configuration, maximum performance. Data is volatile (lost on process exit).
116
+
117
+ **High-Performance Data Structures:**
118
+
119
+ | Component | Structure | Why |
120
+ |-----------|-----------|-----|
121
+ | **Triple Store** | `DashMap` | Lock-free concurrent hash map, 100K pre-allocation |
122
+ | **WCOJ Trie** | `BTreeMap` | Sorted iteration for LeapFrog intersection |
123
+ | **Dictionary** | `FxHashSet` | String interning with rustc-optimized hashing |
124
+ | **Hypergraph** | `FxHashMap` | Fast node→edge adjacency lists |
125
+ | **Reasoning** | `AHashMap` | RDFS/OWL inference with DoS-resistant hashing |
126
+ | **Datalog** | `FxHashMap` | Semi-naive evaluation with delta propagation |
127
+
128
+ **Why these structures enable sub-microsecond performance:**
129
+ - **DashMap**: Sharded locks (16 shards default) → near-linear scaling on multi-core
130
+ - **FxHashMap**: Rust compiler's hash function → 30% faster than std HashMap
131
+ - **BTreeMap**: O(log n) ordered iteration → enables binary search in LeapFrog
132
+ - **Pre-allocation**: 100K capacity avoids rehashing during bulk inserts
133
+
134
+ ```rust
135
+ use storage::{QuadStore, InMemoryBackend};
136
+
137
+ let store = QuadStore::new(InMemoryBackend::new());
138
+ // Ultra-fast: 2.78 µs lookups, zero disk I/O
139
+ ```
140
+
141
+ ### RocksDB (Persistent)
142
+
143
+ LSM-tree based storage with ACID transactions. Tested with **61 comprehensive tests**.
144
+
145
+ ```toml
146
+ # Cargo.toml - Enable RocksDB backend
147
+ [dependencies]
148
+ storage = { version = "0.1.10", features = ["rocksdb-backend"] }
149
+ ```
150
+
151
+ ```rust
152
+ use storage::{QuadStore, RocksDbBackend};
153
+
154
+ // Create persistent database
155
+ let backend = RocksDbBackend::new("/path/to/data")?;
156
+ let store = QuadStore::new(backend);
157
+
158
+ // Features:
159
+ // - ACID transactions
160
+ // - Snappy compression (automatic)
161
+ // - Crash recovery
162
+ // - Range & prefix scanning
163
+ // - 1MB+ value support
164
+
165
+ // Force sync to disk
166
+ store.flush()?;
167
+ ```
168
+
169
+ **RocksDB Test Coverage:**
170
+ - Basic CRUD operations (14 tests)
171
+ - Range scanning (8 tests)
172
+ - Prefix scanning (6 tests)
173
+ - Batch operations (8 tests)
174
+ - Transactions (8 tests)
175
+ - Concurrent access (5 tests)
176
+ - Unicode & binary data (4 tests)
177
+ - Large key/value handling (8 tests)
178
+
179
+ ### TypeScript SDK
180
+
181
+ The npm package uses the in-memory backend—ideal for:
182
+ - Knowledge graph queries
183
+ - SPARQL execution
184
+ - Data transformation pipelines
185
+ - Embedded applications
186
+
187
+ ```typescript
188
+ import { GraphDB } from 'rust-kgdb'
189
+
190
+ // In-memory database (default, no configuration needed)
191
+ const db = new GraphDB('http://example.org/app')
192
+
193
+ // For persistence, export via CONSTRUCT:
194
+ const ntriples = db.queryConstruct('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')
195
+ fs.writeFileSync('backup.nt', ntriples)
73
196
  ```
74
197
 
75
- ### Prerequisites
198
+ ---
76
199
 
77
- - Node.js >= 14
78
- - No additional dependencies required (native bindings included)
200
+ ## Installation
201
+
202
+ ```bash
203
+ npm install rust-kgdb
204
+ ```
79
205
 
80
206
  ### Platform Support
81
207
 
82
- | Platform | Architecture | Status | Notes |
83
- |----------|-------------|--------|-------|
84
- | **macOS** | x64 (Intel) | ✅ Fully Supported | SIMD+PGO optimized (AVX2) |
85
- | **macOS** | arm64 (Apple Silicon) | ✅ Fully Supported | SIMD+PGO optimized (NEON) |
86
- | **Linux** | x64 | ✅ Fully Supported | SIMD+PGO optimized (AVX2) |
87
- | **Linux** | arm64 | ✅ Fully Supported | SIMD+PGO optimized (NEON) |
88
- | **Windows** | x64 | ✅ Fully Supported | SIMD+PGO optimized (AVX2) |
89
- | **Windows** | arm64 | ⏳ Coming Soon | Planned for v0.2.0 |
208
+ | Platform | Architecture | Status | SIMD |
209
+ |----------|-------------|--------|------|
210
+ | **macOS** | Intel (x64) | ✅ | AVX2, BMI2, POPCNT |
211
+ | **macOS** | Apple Silicon (arm64) | ✅ | NEON |
212
+ | **Linux** | x64 | ✅ | AVX2, BMI2, POPCNT |
213
+ | **Linux** | arm64 | ✅ | NEON |
214
+ | **Windows** | x64 | ✅ | AVX2, BMI2, POPCNT |
215
+ | **Windows** | arm64 | ⏳ v0.2.0 | — |
90
216
 
91
- **SIMD Optimizations** (v0.1.8):
92
- - **Intel/AMD (x64)**: AVX2, BMI2, POPCNT auto-vectorization
93
- - **Apple Silicon (arm64)**: NEON auto-vectorization
94
- - **Profile-Guided Optimization (PGO)**: Runtime profile-based code generation
217
+ **No compilation required**—pre-built native binaries included.
95
218
 
96
- **Native Bindings**: Pre-compiled binaries included for all platforms. No compilation required during `npm install`.
219
+ ---
97
220
 
98
- ## 🎯 Quick Start
221
+ ## Quick Start
222
+
223
+ ### Complete Working Example
99
224
 
100
225
  ```typescript
101
- import { GraphDB, Node } from 'rust-kgdb'
226
+ import { GraphDB } from 'rust-kgdb'
102
227
 
103
- // Create in-memory database
104
- const db = new GraphDB('http://example.org/my-app')
228
+ // 1. Create database
229
+ const db = new GraphDB('http://example.org/myapp')
105
230
 
106
- // Insert triples
231
+ // 2. Load data (Turtle format)
107
232
  db.loadTtl(`
108
233
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
234
+ @prefix ex: <http://example.org/> .
235
+
236
+ ex:alice a foaf:Person ;
237
+ foaf:name "Alice" ;
238
+ foaf:age 30 ;
239
+ foaf:knows ex:bob, ex:charlie .
109
240
 
110
- <http://example.org/alice> foaf:name "Alice" ;
111
- foaf:age 30 ;
112
- foaf:knows <http://example.org/bob> .
241
+ ex:bob a foaf:Person ;
242
+ foaf:name "Bob" ;
243
+ foaf:age 25 ;
244
+ foaf:knows ex:charlie .
113
245
 
114
- <http://example.org/bob> foaf:name "Bob" ;
115
- foaf:age 25 .
246
+ ex:charlie a foaf:Person ;
247
+ foaf:name "Charlie" ;
248
+ foaf:age 35 .
116
249
  `, null)
117
250
 
118
- // SPARQL SELECT query
119
- const results = db.querySelect(`
251
+ // 3. Query: Find friends-of-friends (WCOJ optimized!)
252
+ const fof = db.querySelect(`
120
253
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
254
+ PREFIX ex: <http://example.org/>
121
255
 
122
- SELECT ?person ?name ?age WHERE {
123
- ?person foaf:name ?name ;
124
- foaf:age ?age .
256
+ SELECT ?person ?friend ?fof WHERE {
257
+ ?person foaf:knows ?friend .
258
+ ?friend foaf:knows ?fof .
259
+ FILTER(?person != ?fof)
125
260
  }
126
- ORDER BY DESC(?age)
127
261
  `)
262
+ console.log('Friends of Friends:', fof)
263
+ // [{ person: 'ex:alice', friend: 'ex:bob', fof: 'ex:charlie' }]
264
+
265
+ // 4. Aggregation: Average age
266
+ const stats = db.querySelect(`
267
+ PREFIX foaf: <http://xmlns.com/foaf/0.1/>
128
268
 
129
- console.log(results)
130
- // [
131
- // { person: '<http://example.org/alice>', name: '"Alice"', age: '30' },
132
- // { person: '<http://example.org/bob>', name: '"Bob"', age: '25' }
133
- // ]
269
+ SELECT (COUNT(?p) AS ?count) (AVG(?age) AS ?avgAge) WHERE {
270
+ ?p a foaf:Person ; foaf:age ?age .
271
+ }
272
+ `)
273
+ console.log('Stats:', stats)
274
+ // [{ count: '3', avgAge: '30.0' }]
134
275
 
135
- // SPARQL ASK query
276
+ // 5. ASK query
136
277
  const hasAlice = db.queryAsk(`
137
- ASK { <http://example.org/alice> foaf:name "Alice" }
278
+ PREFIX ex: <http://example.org/>
279
+ ASK { ex:alice a <http://xmlns.com/foaf/0.1/Person> }
280
+ `)
281
+ console.log('Has Alice?', hasAlice) // true
282
+
283
+ // 6. CONSTRUCT query
284
+ const graph = db.queryConstruct(`
285
+ PREFIX foaf: <http://xmlns.com/foaf/0.1/>
286
+ PREFIX ex: <http://example.org/>
287
+
288
+ CONSTRUCT { ?p foaf:knows ?f }
289
+ WHERE { ?p foaf:knows ?f }
138
290
  `)
139
- console.log(hasAlice) // true
291
+ console.log('Extracted graph:', graph)
140
292
 
141
- // Count triples
142
- console.log(db.count()) // 5
293
+ // 7. Count and cleanup
294
+ console.log('Triple count:', db.count()) // 11
295
+ db.clear()
143
296
  ```
144
297
 
145
- ## 🔥 WCOJ Execution Examples (v0.1.8)
298
+ ### Save to File
299
+
300
+ ```typescript
301
+ import { writeFileSync } from 'fs'
146
302
 
147
- ### Star Query (50-100x Faster!)
303
+ // Save as N-Triples
304
+ const db = new GraphDB('http://example.org/export')
305
+ db.loadTtl(`<http://example.org/s> <http://example.org/p> "value" .`, null)
306
+
307
+ const ntriples = db.queryConstruct(`CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }`)
308
+ writeFileSync('output.nt', ntriples)
309
+ ```
310
+
311
+ ---
312
+
313
+ ## SPARQL 1.1 Features (100% W3C Compliant)
314
+
315
+ ### Query Forms
148
316
 
149
317
  ```typescript
150
- // Find people with name, age, and email
151
- const starQuery = db.querySelect(`
152
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
318
+ // SELECT - return bindings
319
+ db.querySelect('SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10')
153
320
 
154
- SELECT ?person ?name ?age ?email WHERE {
155
- ?person foaf:name ?name .
156
- ?person foaf:age ?age .
157
- ?person foaf:email ?email .
158
- }
159
- `)
321
+ // ASK - boolean existence check
322
+ db.queryAsk('ASK { <http://example.org/x> ?p ?o }')
160
323
 
161
- // Automatically uses WCOJ execution for optimal performance
162
- // Expected speedup: 50-100x over nested loop joins
324
+ // CONSTRUCT - build new graph
325
+ db.queryConstruct('CONSTRUCT { ?s <http://new/prop> ?o } WHERE { ?s ?p ?o }')
163
326
  ```
164
327
 
165
- ### Complex Join (100-1000x Faster!)
328
+ ### Aggregates
166
329
 
167
330
  ```typescript
168
- // Find coworker connections
169
- const complexJoin = db.querySelect(`
170
- PREFIX org: <http://example.org/>
171
-
172
- SELECT ?person1 ?person2 ?company WHERE {
173
- ?person1 org:worksAt ?company .
174
- ?person2 org:worksAt ?company .
175
- ?person1 org:name ?name1 .
176
- ?person2 org:name ?name2 .
177
- FILTER(?person1 != ?person2)
178
- }
331
+ db.querySelect(`
332
+ SELECT ?type (COUNT(*) AS ?count) (AVG(?value) AS ?avg)
333
+ WHERE { ?s a ?type ; <http://ex/value> ?value }
334
+ GROUP BY ?type
335
+ HAVING (COUNT(*) > 5)
336
+ ORDER BY DESC(?count)
179
337
  `)
338
+ ```
180
339
 
181
- // WCOJ automatically selected for 4+ pattern joins
182
- // Expected speedup: 100-1000x over nested loop
340
+ ### Property Paths
341
+
342
+ ```typescript
343
+ // Transitive closure (rdfs:subClassOf*)
344
+ db.querySelect('SELECT ?class WHERE { ?class rdfs:subClassOf* <http://top/Class> }')
345
+
346
+ // Alternative paths
347
+ db.querySelect('SELECT ?name WHERE { ?x (foaf:name|rdfs:label) ?name }')
348
+
349
+ // Sequence paths
350
+ db.querySelect('SELECT ?grandparent WHERE { ?x foaf:parent/foaf:parent ?grandparent }')
183
351
  ```
184
352
 
185
- ### Chain Query (10-20x Faster!)
353
+ ### Named Graphs
186
354
 
187
355
  ```typescript
188
- // Friend-of-friend pattern
189
- const chainQuery = db.querySelect(`
190
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
356
+ // Load into named graph
357
+ db.loadTtl('<http://s> <http://p> "o" .', 'http://example.org/graph1')
191
358
 
192
- SELECT ?person1 ?person2 ?person3 WHERE {
193
- ?person1 foaf:knows ?person2 .
194
- ?person2 foaf:knows ?person3 .
359
+ // Query specific graph
360
+ db.querySelect(`
361
+ SELECT ?s ?p ?o WHERE {
362
+ GRAPH <http://example.org/graph1> { ?s ?p ?o }
195
363
  }
196
364
  `)
365
+ ```
197
366
 
198
- // WCOJ optimizes chain patterns
199
- // Expected speedup: 10-20x over nested loop
367
+ ### UPDATE Operations
368
+
369
+ ```typescript
370
+ // INSERT DATA
371
+ db.updateInsert(`
372
+ INSERT DATA { <http://ex/new> <http://ex/prop> "value" }
373
+ `)
374
+
375
+ // DELETE WHERE
376
+ db.updateDelete(`
377
+ DELETE WHERE { ?s <http://ex/deprecated> ?o }
378
+ `)
200
379
  ```
201
380
 
202
- ## 📚 Full API Reference
381
+ ---
382
+
383
+ ## API Reference
203
384
 
204
385
  ### GraphDB Class
205
386
 
206
387
  ```typescript
207
388
  class GraphDB {
208
- // Create database
209
- static inMemory(): GraphDB
210
- constructor(baseUri: string)
389
+ constructor(baseUri: string) // Create with base URI
390
+ static inMemory(): GraphDB // Create anonymous in-memory DB
211
391
 
212
- // Data loading
213
- loadTtl(data: string, graphName: string | null): void
214
- loadNTriples(data: string, graphName: string | null): void
392
+ // Data Loading
393
+ loadTtl(data: string, graph: string | null): void
394
+ loadNTriples(data: string, graph: string | null): void
215
395
 
216
- // SPARQL queries (WCOJ execution in v0.1.8!)
396
+ // SPARQL Queries (WCOJ-optimized)
217
397
  querySelect(sparql: string): Array<Record<string, string>>
218
398
  queryAsk(sparql: string): boolean
219
- queryConstruct(sparql: string): string
399
+ queryConstruct(sparql: string): string // Returns N-Triples
220
400
 
221
- // SPARQL updates
401
+ // SPARQL Updates
222
402
  updateInsert(sparql: string): void
223
403
  updateDelete(sparql: string): void
224
404
 
225
- // Database operations
405
+ // Database Operations
226
406
  count(): number
227
407
  clear(): void
228
-
229
- // Metadata
230
408
  getVersion(): string
231
409
  }
232
410
  ```
233
411
 
234
- ### Node Class (Triple Construction)
412
+ ### Node Class
235
413
 
236
414
  ```typescript
237
415
  class Node {
@@ -245,165 +423,91 @@ class Node {
245
423
  }
246
424
  ```
247
425
 
248
- ## 🎓 Advanced Usage
426
+ ---
249
427
 
250
- ### SPARQL UPDATE Operations
428
+ ## Performance Characteristics
251
429
 
252
- ```typescript
253
- // INSERT DATA
254
- db.updateInsert(`
255
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
430
+ ### Complexity Analysis
256
431
 
257
- INSERT DATA {
258
- <http://example.org/charlie> foaf:name "Charlie" ;
259
- foaf:age 35 .
260
- }
261
- `)
432
+ | Operation | Complexity | Notes |
433
+ |-----------|------------|-------|
434
+ | Triple lookup | O(1) | Hash-based SPOC index |
435
+ | Pattern scan | O(k) | k = matching triples |
436
+ | Star join (WCOJ) | O(n log n) | LeapFrog intersection |
437
+ | Complex join (WCOJ) | O(n log n) | Trie-based |
438
+ | Transitive closure | O(n²) worst | CSR matrix optimization |
439
+ | Bulk insert | O(n) | Batch indexing |
262
440
 
263
- // DELETE WHERE
264
- db.updateDelete(`
265
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
441
+ ### Memory Layout
266
442
 
267
- DELETE WHERE {
268
- ?person foaf:age ?age .
269
- FILTER(?age < 18)
270
- }
271
- `)
272
443
  ```
273
-
274
- ### Named Graphs
275
-
276
- ```typescript
277
- // Load into named graph
278
- db.loadTtl(`
279
- <http://example.org/resource> <http://purl.org/dc/terms/title> "Title" .
280
- `, 'http://example.org/graph1')
281
-
282
- // Query specific graph
283
- const results = db.querySelect(`
284
- SELECT ?s ?p ?o WHERE {
285
- GRAPH <http://example.org/graph1> {
286
- ?s ?p ?o .
287
- }
288
- }
289
- `)
444
+ Triple: 24 bytes
445
+ ├── Subject: 8 bytes (dictionary ID)
446
+ ├── Predicate: 8 bytes (dictionary ID)
447
+ └── Object: 8 bytes (dictionary ID)
448
+
449
+ String Interning: All URIs/literals stored once in Dictionary
450
+ Index Overhead: ~4x base triple size (4 indexes)
451
+ Total: ~120 bytes/triple including indexes
290
452
  ```
291
453
 
292
- ### SPARQL 1.1 Aggregates
293
-
294
- ```typescript
295
- // COUNT, AVG, MIN, MAX, SUM
296
- const aggregates = db.querySelect(`
297
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
298
-
299
- SELECT
300
- (COUNT(?person) AS ?count)
301
- (AVG(?age) AS ?avgAge)
302
- (MIN(?age) AS ?minAge)
303
- (MAX(?age) AS ?maxAge)
304
- WHERE {
305
- ?person foaf:age ?age .
306
- }
307
- `)
308
- ```
309
-
310
- ### SPARQL 1.1 Property Paths
311
-
312
- ```typescript
313
- // Transitive closure with *
314
- const transitiveKnows = db.querySelect(`
315
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
316
-
317
- SELECT ?person ?connected WHERE {
318
- <http://example.org/alice> foaf:knows* ?connected .
319
- }
320
- `)
321
-
322
- // Alternative paths with |
323
- const nameOrLabel = db.querySelect(`
324
- PREFIX foaf: <http://xmlns.com/foaf/0.1/>
325
- PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
326
-
327
- SELECT ?resource ?name WHERE {
328
- ?resource (foaf:name|rdfs:label) ?name .
329
- }
330
- `)
331
- ```
454
+ ---
332
455
 
333
- ## 🏗️ Architecture
456
+ ## Version History
334
457
 
335
- - **Core**: Pure Rust implementation with zero-copy semantics
336
- - **Bindings**: NAPI-RS for native Node.js addon
337
- - **Storage**: Pluggable backends (InMemory, RocksDB, LMDB)
338
- - **Indexing**: SPOC, POCS, OCSP, CSPO quad indexes
339
- - **Query Optimizer**: Automatic WCOJ detection and execution
340
- - **WCOJ Engine**: LeapFrog TrieJoin with variable ordering analysis
458
+ ### v0.1.9 (2025-12-01) - SIMD + PGO Release
341
459
 
342
- ## 📈 Version History
460
+ - **44.5% average speedup** via SIMD + PGO compiler optimizations
461
+ - WCOJ execution with LeapFrog TrieJoin
462
+ - Release automation infrastructure
463
+ - All packages updated to gonnect-uk namespace
343
464
 
344
- ### v0.1.8 (2025-12-01) - WCOJ Execution!
465
+ ### v0.1.8 (2025-12-01) - WCOJ Execution
345
466
 
346
- - ✅ **WCOJ Execution Path Activated** - LeapFrog TrieJoin for multi-way joins
347
- - ✅ **Variable Ordering Analysis** - Frequency-based optimization for WCOJ
348
- - **50-100x Speedup** for star queries (3+ patterns with shared variable)
349
- - ✅ **100-1000x Speedup** for complex joins (4+ patterns)
350
- - ✅ **577 Tests Passing** - Comprehensive end-to-end verification
351
- - ✅ **Zero Regressions** - All existing queries work unchanged
467
+ - WCOJ execution path activated
468
+ - Variable ordering analysis for optimal joins
469
+ - 577 tests passing
352
470
 
353
471
  ### v0.1.7 (2025-11-30)
354
472
 
355
473
  - Query optimizer with automatic strategy selection
356
474
  - WCOJ algorithm integration (planning phase)
357
- - Query plan visualization API
358
475
 
359
476
  ### v0.1.3 (2025-11-18)
360
477
 
361
- - Initial TypeScript SDK release
478
+ - Initial TypeScript SDK
362
479
  - 100% W3C SPARQL 1.1 compliance
363
480
  - 100% W3C RDF 1.2 compliance
364
481
 
365
- ## 🔬 Testing
366
-
367
- ```bash
368
- # Run test suite
369
- npm test
370
-
371
- # Run specific tests
372
- npm test -- --testNamePattern="star query"
373
- ```
374
-
375
- ## 🤝 Contributing
482
+ ---
376
483
 
377
- Contributions are welcome! Please see [CONTRIBUTING.md](https://github.com/gonnect-uk/rust-kgdb/blob/main/CONTRIBUTING.md)
484
+ ## Use Cases
378
485
 
379
- ## 📄 License
486
+ | Domain | Application |
487
+ |--------|-------------|
488
+ | **Knowledge Graphs** | Enterprise ontologies, taxonomies |
489
+ | **Semantic Search** | Structured queries over unstructured data |
490
+ | **Data Integration** | ETL with SPARQL CONSTRUCT |
491
+ | **Compliance** | SHACL validation, provenance tracking |
492
+ | **Graph Analytics** | Pattern detection, community analysis |
493
+ | **Mobile Apps** | Embedded RDF on iOS/Android |
380
494
 
381
- Apache License 2.0 - See [LICENSE](https://github.com/gonnect-uk/rust-kgdb/blob/main/LICENSE)
495
+ ---
382
496
 
383
- ## 🔗 Links
497
+ ## Links
384
498
 
385
499
  - [GitHub Repository](https://github.com/gonnect-uk/rust-kgdb)
386
500
  - [Documentation](https://github.com/gonnect-uk/rust-kgdb/tree/main/docs)
387
501
  - [CHANGELOG](https://github.com/gonnect-uk/rust-kgdb/blob/main/CHANGELOG.md)
388
- - [W3C SPARQL 1.1 Spec](https://www.w3.org/TR/sparql11-query/)
389
- - [W3C RDF 1.2 Spec](https://www.w3.org/TR/rdf12-concepts/)
502
+ - [W3C SPARQL 1.1](https://www.w3.org/TR/sparql11-query/)
503
+ - [W3C RDF 1.2](https://www.w3.org/TR/rdf12-concepts/)
390
504
 
391
- ## 💡 Use Cases
392
-
393
- - **Knowledge Graphs** - Build semantic data models
394
- - **Semantic Search** - Query structured data with SPARQL
395
- - **Data Integration** - Combine data from multiple sources
396
- - **Ontology Reasoning** - RDFS and OWL inference
397
- - **Graph Analytics** - Complex pattern matching with WCOJ
398
- - **Mobile Apps** - Embedded RDF database for iOS/Android
505
+ ---
399
506
 
400
- ## 🎯 Roadmap
507
+ ## License
401
508
 
402
- - [x] v0.1.8: WCOJ execution + SIMD + PGO optimizations (35-55% faster!)
403
- - [ ] v0.1.9: Manual SIMD vectorization for 2-4x additional speedup
404
- - [ ] v0.2.0: Windows ARM64 support + distributed query execution
405
- - [ ] v0.3.0: Graph analytics and reasoning engines
509
+ Apache License 2.0
406
510
 
407
511
  ---
408
512
 
409
- **Built with ❤️ using Rust and NAPI-RS**
513
+ **Built with Rust + NAPI-RS**
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "rust-kgdb",
3
- "version": "0.1.9",
3
+ "version": "0.1.10",
4
4
  "description": "High-performance RDF/SPARQL database with 100% W3C compliance and WCOJ execution",
5
5
  "main": "index.js",
6
6
  "types": "index.d.ts",
@@ -21,7 +21,7 @@
21
21
  "build:debug": "napi build --platform native/rust-kgdb-napi",
22
22
  "prepublishOnly": "napi prepublish -t npm",
23
23
  "test": "jest",
24
- "version": "0.1.9"
24
+ "version": "0.1.10"
25
25
  },
26
26
  "keywords": [
27
27
  "rdf",
@@ -56,10 +56,10 @@
56
56
  "*.node"
57
57
  ],
58
58
  "optionalDependencies": {
59
- "rust-kgdb-win32-x64-msvc": "0.1.9",
60
- "rust-kgdb-darwin-x64": "0.1.9",
61
- "rust-kgdb-linux-x64-gnu": "0.1.9",
62
- "rust-kgdb-darwin-arm64": "0.1.9",
63
- "rust-kgdb-linux-arm64-gnu": "0.1.9"
59
+ "rust-kgdb-win32-x64-msvc": "0.1.10",
60
+ "rust-kgdb-darwin-x64": "0.1.10",
61
+ "rust-kgdb-linux-x64-gnu": "0.1.10",
62
+ "rust-kgdb-darwin-arm64": "0.1.10",
63
+ "rust-kgdb-linux-arm64-gnu": "0.1.10"
64
64
  }
65
65
  }