rust-kgdb 0.1.9 → 0.1.10
- package/README.md +373 -269
- package/package.json +7 -7
package/README.md
CHANGED
|
@@ -1,237 +1,415 @@
|
|
|
1
|
-
# rust-kgdb
|
|
1
|
+
# rust-kgdb
|
|
2
2
|
|
|
3
3
|
[](https://www.npmjs.com/package/rust-kgdb)
|
|
4
4
|
[](https://opensource.org/licenses/Apache-2.0)
|
|
5
5
|
|
|
6
|
-
**Production-ready
|
|
6
|
+
**Production-ready RDF/hypergraph database with 100% W3C SPARQL 1.1 + RDF 1.2 compliance, worst-case optimal joins (WCOJ), and pluggable storage backends.**
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
---
|
|
9
9
|
|
|
10
|
-
|
|
11
|
-
- **100% W3C RDF 1.2 Compliance** - Full standard implementation
|
|
12
|
-
- **WCOJ Execution** (v0.1.8) - LeapFrog TrieJoin for optimal multi-way joins
|
|
13
|
-
- **Zero-Copy Semantics** - Minimal allocations, maximum performance
|
|
14
|
-
- **Blazing Fast** - 2.78 µs triple lookups, 146K triples/sec bulk insert
|
|
15
|
-
- **Memory Efficient** - 24 bytes/triple (25% better than RDFox)
|
|
16
|
-
- **Native Rust** - Safe, reliable, production-ready
|
|
10
|
+
## Why rust-kgdb?
|
|
17
11
|
|
|
18
|
-
|
|
12
|
+
| Feature | rust-kgdb | Apache Jena | RDFox |
|
|
13
|
+
|---------|-----------|-------------|-------|
|
|
14
|
+
| **Lookup Speed** | 2.78 µs | ~50 µs | 50-100 µs |
|
|
15
|
+
| **Memory/Triple** | 24 bytes | 50-60 bytes | 32 bytes |
|
|
16
|
+
| **SPARQL 1.1** | 100% | 100% | 95% |
|
|
17
|
+
| **RDF 1.2** | 100% | Partial | No |
|
|
18
|
+
| **WCOJ** | ✅ LeapFrog | ❌ | ❌ |
|
|
19
|
+
| **Mobile-Ready** | ✅ iOS/Android | ❌ | ❌ |
|
|
19
20
|
|
|
20
|
-
|
|
21
|
+
---
|
|
21
22
|
|
|
22
|
-
|
|
23
|
-
|------------|---------------------|--------------|------------------|
|
|
24
|
-
| **Star Queries** (3+ patterns) | O(n³) | O(n log n) | **50-100x** |
|
|
25
|
-
| **Complex Joins** (4+ patterns) | O(n⁴) | O(n log n) | **100-1000x** |
|
|
26
|
-
| **Chain Queries** | O(n²) | O(n log n) | **10-20x** |
|
|
23
|
+
## Core Technical Innovations
|
|
27
24
|
|
|
28
|
-
###
|
|
25
|
+
### 1. Worst-Case Optimal Joins (WCOJ)
|
|
29
26
|
|
|
30
|
-
|
|
31
|
-
|--------|--------|------|----------|
|
|
32
|
-
| **Lookup** | 2.78 µs | 359K/sec | ✅ **35-180x faster** |
|
|
33
|
-
| **Bulk Insert** | 682 ms (100K) | 146K/sec | ⚠️ 73% speed (gap closing) |
|
|
34
|
-
| **Memory** | 24 bytes/triple | - | ✅ **25% better** |
|
|
27
|
+
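Engines that evaluate joins pairwise with **nested-loop joins** pay O(n²) to O(n⁴) on multi-pattern queries. rust-kgdb implements the **LeapFrog TrieJoin** algorithm, a worst-case optimal join whose running time is bounded by the largest possible output size for the query (up to a logarithmic factor), summarized below as O(n log n) for multi-way joins.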
|
|
35
28
|
|
|
36
|
-
|
|
29
|
+
**How it works:**
|
|
30
|
+
- **Trie Data Structure**: Triples indexed hierarchically (S→P→O) using BTreeMap for sorted access
|
|
31
|
+
- **Variable Ordering**: Frequency-based analysis orders variables for optimal intersection
|
|
32
|
+
- **LeapFrog Iterator**: Binary search across sorted iterators finds intersections without materializing intermediate results
|
|
37
33
|
|
|
38
|
-
|
|
34
|
+
```
|
|
35
|
+
Query: SELECT ?x ?y ?z WHERE { ?x :p ?y . ?y :q ?z . ?x :r ?z }
|
|
39
36
|
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
| **Q2: 5-way star** | 234ms | **183ms** | ✅ **22% faster** | Strong |
|
|
44
|
-
| **Q3: 3-way star** | 177ms | **62ms** | 🔥 **65% faster** | Exceptional |
|
|
45
|
-
| **Q4: 3-hop chain** | 254ms | **101ms** | 🔥 **60% faster** | Exceptional |
|
|
46
|
-
| **Q5: 2-hop chain** | 230ms | **53ms** | 🔥 **77% faster** | **BEST** |
|
|
47
|
-
| **Q6: 6-way complex** | 641ms | **464ms** | ✅ **28% faster** | Good |
|
|
48
|
-
| **Q7: Hierarchy** | 343ms | **198ms** | ✅ **42% faster** | Strong |
|
|
49
|
-
| **Q8: Triangle** | 410ms | **193ms** | ✅ **53% faster** | Strong |
|
|
37
|
+
Nested Loop: O(n³) - examines every combination
|
|
38
|
+
WCOJ: O(n log n) - iterates in sorted order, seeks forward on mismatch
|
|
39
|
+
```
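The LeapFrog iterator described above can be illustrated with a minimal sketch. It intersects several sorted lists of dictionary IDs by repeatedly seeking each list forward to the current candidate value. The function names `seek` and `leapfrogIntersect` are hypothetical and are not taken from the rust-kgdb source; the real engine runs this step once per trie level over `BTreeMap`-backed iterators.

```typescript
// Hypothetical helper: first index >= `from` whose value is >= `target` (binary search).
function seek(sorted: number[], from: number, target: number): number {
  let lo = from;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Intersect k sorted, duplicate-free lists without building intermediate results.
function leapfrogIntersect(lists: number[][]): number[] {
  const out: number[] = [];
  if (lists.length === 0 || lists.some((l) => l.length === 0)) return out;

  const pos = lists.map(() => 0);
  // Every list must reach at least the largest head value.
  let target = Math.max(...lists.map((l) => l[0]));
  let agreed = 0; // number of consecutive lists positioned exactly on `target`
  let i = 0;

  while (true) {
    pos[i] = seek(lists[i], pos[i], target);
    if (pos[i] >= lists[i].length) return out; // one list is exhausted: done

    const value = lists[i][pos[i]];
    if (value === target) {
      agreed++;
      if (agreed >= lists.length) {
        out.push(target); // all lists agree on this value
        pos[i]++; // step past the match; the next value becomes the new target
        if (pos[i] >= lists[i].length) return out;
        target = lists[i][pos[i]];
        agreed = 1;
      }
    } else {
      target = value; // overshoot: the other lists must catch up to this value
      agreed = 1;
    }
    i = (i + 1) % lists.length;
  }
}

console.log(leapfrogIntersect([[1, 3, 4, 7, 9], [3, 4, 5, 9], [0, 3, 9, 10]])); // [ 3, 9 ]
```

Each seek skips ahead by binary search instead of scanning, which is why a mismatch in one list can discard many candidates in the others at once.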
|
|
50
40
|
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
-
|
|
54
|
-
|
|
55
|
-
|
|
41
|
+
| Query Pattern | Before (Nested Loop) | After (WCOJ) | Speedup |
|
|
42
|
+
|---------------|---------------------|--------------|---------|
|
|
43
|
+
| 3-way star | O(n³) | O(n log n) | **50-100x** |
|
|
44
|
+
| 4+ way complex | O(n⁴) | O(n log n) | **100-1000x** |
|
|
45
|
+
| Chain queries | O(n²) | O(n log n) | **10-20x** |
|
|
56
46
|
|
|
57
|
-
|
|
58
|
-
1. **Instrumentation Build**: Compiler adds profiling hooks
|
|
59
|
-
2. **Profile Collection**: Run real workload (23 runtime profiles collected)
|
|
60
|
-
3. **Profile Merging**: Combine profiles into 5.9M merged dataset
|
|
61
|
-
4. **Optimized Rebuild**: Compiler uses runtime data for:
|
|
62
|
-
- Optimized hot paths (loops, function calls)
|
|
63
|
-
- Improved branch prediction
|
|
64
|
-
- Enhanced instruction cache locality
|
|
65
|
-
- Better CPU pipelining
|
|
47
|
+
### 2. Sparse Matrix Engine (CSR Format)
|
|
66
48
|
|
|
67
|
-
**
|
|
49
|
+
Binary relations (e.g., `foaf:knows`, `rdfs:subClassOf`) are converted to **Compressed Sparse Row (CSR)** matrices for cache-efficient join evaluation:
|
|
68
50
|
|
|
69
|
-
|
|
51
|
+
- **Memory**: O(nnz) where nnz = number of edges (not O(n²))
|
|
52
|
+
- **Matrix Multiplication**: Replaces nested-loop joins
|
|
53
|
+
- **Transitive Closure**: Semi-naive Δ-matrix evaluation (not iterated powers)
|
|
70
54
|
|
|
71
|
-
```
|
|
72
|
-
|
|
55
|
+
```rust
|
|
56
|
+
// Traditional: O(n²) nested loops
|
|
57
|
+
for (s, p, o) in triples { ... }
|
|
58
|
+
|
|
59
|
+
// CSR Matrix: O(nnz) cache-friendly iteration
|
|
60
|
+
row_ptr[i] → col_indices[j] → values[j]
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
**Used for**: RDFS/OWL reasoning, transitive closure, Datalog evaluation.
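As a rough illustration of the CSR layout and the delta-based closure mentioned above, the sketch below builds `rowPtr`/`colIndices` arrays from an edge list and computes the nodes reachable from one start node. All names (`Csr`, `buildCsr`, `neighbors`, `reachable`) are assumptions made for this example, node IDs are assumed to be dense integers, and the single-source traversal is only a simplified stand-in for the engine's semi-naive Δ-matrix evaluation.

```typescript
interface Csr {
  rowPtr: Uint32Array; // length = nodeCount + 1; row i spans rowPtr[i]..rowPtr[i + 1]
  colIndices: Uint32Array; // length = nnz, one entry per edge
}

// Build CSR from (source, target) pairs using counting sort on the source node.
function buildCsr(nodeCount: number, edges: Array<[number, number]>): Csr {
  const rowPtr = new Uint32Array(nodeCount + 1);
  for (const [src] of edges) rowPtr[src + 1]++; // out-degree per node
  for (let i = 0; i < nodeCount; i++) rowPtr[i + 1] += rowPtr[i]; // prefix sums

  const colIndices = new Uint32Array(edges.length);
  const cursor = rowPtr.slice(0, nodeCount); // next write position per row
  for (const [src, dst] of edges) colIndices[cursor[src]++] = dst;
  return { rowPtr, colIndices };
}

// Neighbors of a node are one contiguous slice: this is the cache-friendly part.
function neighbors(csr: Csr, node: number): number[] {
  return Array.from(csr.colIndices.subarray(csr.rowPtr[node], csr.rowPtr[node + 1]));
}

// Delta-based reachability: each round expands only the nodes found in the previous round.
function reachable(csr: Csr, start: number): number[] {
  const seen = new Set<number>();
  let delta = [start];
  while (delta.length > 0) {
    const nextDelta: number[] = [];
    for (const u of delta) {
      for (let j = csr.rowPtr[u]; j < csr.rowPtr[u + 1]; j++) {
        const v = csr.colIndices[j];
        if (!seen.has(v)) {
          seen.add(v);
          nextDelta.push(v);
        }
      }
    }
    delta = nextDelta;
  }
  return Array.from(seen).sort((a, b) => a - b);
}

// Edges: 0→1, 1→2, 0→2, 3→0
const csrExample = buildCsr(4, [[0, 1], [1, 2], [0, 2], [3, 0]]);
console.log(neighbors(csrExample, 0)); // [ 1, 2 ]
console.log(reachable(csrExample, 3)); // [ 0, 1, 2 ]
```

Storage is two flat arrays of total length `nodeCount + 1 + nnz`, which matches the O(nnz) memory claim rather than an O(n²) dense matrix.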
|
|
64
|
+
|
|
65
|
+
### 3. SIMD + PGO Compiler Optimizations
|
|
66
|
+
|
|
67
|
+
**Zero code changes—pure compiler-level performance gains.**
|
|
68
|
+
|
|
69
|
+
| Optimization | Technology | Effect |
|
|
70
|
+
|--------------|------------|--------|
|
|
71
|
+
| **SIMD Vectorization** | AVX2/BMI2 (Intel), NEON (ARM) | 8-wide parallel operations |
|
|
72
|
+
| **Profile-Guided Optimization** | LLVM PGO | Hot path optimization, branch prediction |
|
|
73
|
+
| **Link-Time Optimization** | LTO (fat) | Cross-crate inlining, dead code elimination |
|
|
74
|
+
|
|
75
|
+
**Benchmark Results (LUBM, Intel Skylake):**
|
|
76
|
+
|
|
77
|
+
| Query | Before | After (SIMD+PGO) | Improvement |
|
|
78
|
+
|-------|--------|------------------|-------------|
|
|
79
|
+
| Q5: 2-hop chain | 230ms | 53ms | **77% faster** |
|
|
80
|
+
| Q3: 3-way star | 177ms | 62ms | **65% faster** |
|
|
81
|
+
| Q4: 3-hop chain | 254ms | 101ms | **60% faster** |
|
|
82
|
+
| Q8: Triangle | 410ms | 193ms | **53% faster** |
|
|
83
|
+
| Q7: Hierarchy | 343ms | 198ms | **42% faster** |
|
|
84
|
+
| Q6: 6-way complex | 641ms | 464ms | **28% faster** |
|
|
85
|
+
| Q2: 5-way star | 234ms | 183ms | **22% faster** |
|
|
86
|
+
| Q1: 4-way star | 283ms | 258ms | **9% faster** |
|
|
87
|
+
|
|
88
|
+
**Average improvement: 44.5% lower query time** across the eight queries (mean of the per-query percentages).
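The percentages in the benchmark table can be reproduced from the before/after timings. The sketch below treats "N% faster" as the relative reduction in query time, which is an interpretation that matches the published numbers; the helper names are hypothetical.

```typescript
// Timings in milliseconds, copied from the benchmark table above: [query, before, after].
const timings: Array<[string, number, number]> = [
  ["Q5", 230, 53], ["Q3", 177, 62], ["Q4", 254, 101], ["Q8", 410, 193],
  ["Q7", 343, 198], ["Q6", 641, 464], ["Q2", 234, 183], ["Q1", 283, 258],
];

// Relative reduction in query time, rounded to a whole percent.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

// The same change expressed as a throughput multiplier.
function speedupFactor(before: number, after: number): number {
  return before / after;
}

const reductions = timings.map(([, before, after]) => reductionPercent(before, after));
const averageReduction = reductions.reduce((sum, r) => sum + r, 0) / reductions.length;

console.log(reductions); // [ 77, 65, 60, 53, 42, 28, 22, 9 ]
console.log(averageReduction); // 44.5
console.log(speedupFactor(230, 53).toFixed(2)); // 4.34
```

Note the difference between the two views: a 77% reduction in query time for Q5 corresponds to roughly a 4.3x speedup factor.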
|
|
89
|
+
|
|
90
|
+
### 4. Quad Indexing (SPOC)
|
|
91
|
+
|
|
92
|
+
Four complementary indexes enable O(1) pattern matching regardless of query shape:
|
|
93
|
+
|
|
94
|
+
| Index | Pattern | Use Case |
|
|
95
|
+
|-------|---------|----------|
|
|
96
|
+
| **SPOC** | `(?s, ?p, ?o, ?c)` | Subject-centric queries |
|
|
97
|
+
| **POCS** | `(?p, ?o, ?c, ?s)` | Property enumeration |
|
|
98
|
+
| **OCSP** | `(?o, ?c, ?s, ?p)` | Object lookups (reverse links) |
|
|
99
|
+
| **CSPO** | `(?c, ?s, ?p, ?o)` | Named graph iteration |
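One common way to use a set of permuted indexes like the four above is to pick the index whose key order begins with the longest run of bound terms, so the bound terms form a scan prefix. The sketch below shows that selection rule only; it is an assumption about how such indexes are typically chosen, not a description of rust-kgdb's actual planner, and `chooseIndex` is a hypothetical name.

```typescript
type QuadPosition = "s" | "p" | "o" | "c";
type IndexName = "SPOC" | "POCS" | "OCSP" | "CSPO";

// Key order of each index, as listed in the table above.
const INDEX_ORDERS: Record<IndexName, QuadPosition[]> = {
  SPOC: ["s", "p", "o", "c"],
  POCS: ["p", "o", "c", "s"],
  OCSP: ["o", "c", "s", "p"],
  CSPO: ["c", "s", "p", "o"],
};

// Number of leading key components that are bound in the pattern.
function boundPrefixLength(order: QuadPosition[], bound: Set<QuadPosition>): number {
  let n = 0;
  while (n < order.length && bound.has(order[n])) n++;
  return n;
}

// Choose the index with the longest bound prefix; ties go to the first index listed.
function chooseIndex(boundPositions: QuadPosition[]): IndexName {
  const bound = new Set<QuadPosition>(boundPositions);
  let best: IndexName = "SPOC";
  let bestLength = -1;
  for (const indexName of Object.keys(INDEX_ORDERS) as IndexName[]) {
    const length = boundPrefixLength(INDEX_ORDERS[indexName], bound);
    if (length > bestLength) {
      best = indexName;
      bestLength = length;
    }
  }
  return best;
}

console.log(chooseIndex(["s", "p"])); // SPOC
console.log(chooseIndex(["p", "o"])); // POCS
console.log(chooseIndex(["o"])); // OCSP
console.log(chooseIndex(["c"])); // CSPO
```

A pattern with no bound terms falls back to SPOC here, which amounts to a full scan in subject order.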
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## Storage Backends
|
|
104
|
+
|
|
105
|
+
rust-kgdb uses a pluggable storage architecture. **Default is in-memory** (zero configuration). For persistence, enable RocksDB.
|
|
106
|
+
|
|
107
|
+
| Backend | Feature Flag | Use Case | Status |
|
|
108
|
+
|---------|--------------|----------|--------|
|
|
109
|
+
| **InMemory** | `default` | Development, testing, embedded | ✅ **Production Ready** |
|
|
110
|
+
| **RocksDB** | `rocksdb-backend` | Production, large datasets | ✅ **61 tests passing** |
|
|
111
|
+
| **LMDB** | `lmdb-backend` | Read-heavy workloads | ⏳ Planned v0.2.0 |
|
|
112
|
+
|
|
113
|
+
### InMemory (Default)
|
|
114
|
+
|
|
115
|
+
Zero configuration, maximum performance. Data is volatile (lost on process exit).
|
|
116
|
+
|
|
117
|
+
**High-Performance Data Structures:**
|
|
118
|
+
|
|
119
|
+
| Component | Structure | Why |
|
|
120
|
+
|-----------|-----------|-----|
|
|
121
|
+
| **Triple Store** | `DashMap` | Sharded concurrent hash map, 100K pre-allocation |
|
|
122
|
+
| **WCOJ Trie** | `BTreeMap` | Sorted iteration for LeapFrog intersection |
|
|
123
|
+
| **Dictionary** | `FxHashSet` | String interning with rustc-optimized hashing |
|
|
124
|
+
| **Hypergraph** | `FxHashMap` | Fast node→edge adjacency lists |
|
|
125
|
+
| **Reasoning** | `AHashMap` | RDFS/OWL inference with DoS-resistant hashing |
|
|
126
|
+
| **Datalog** | `FxHashMap` | Semi-naive evaluation with delta propagation |
|
|
127
|
+
|
|
128
|
+
**Why these structures enable microsecond-scale performance:**
|
|
129
|
+
- **DashMap**: Sharded locks (shard count derived from the number of CPU cores by default) → near-linear scaling on multi-core
|
|
130
|
+
- **FxHashMap**: Rust compiler's hash function → 30% faster than std HashMap
|
|
131
|
+
- **BTreeMap**: O(log n) seeks with in-order iteration → enables the seek step in LeapFrog
|
|
132
|
+
- **Pre-allocation**: 100K capacity avoids rehashing during bulk inserts
|
|
133
|
+
|
|
134
|
+
```rust
|
|
135
|
+
use storage::{QuadStore, InMemoryBackend};
|
|
136
|
+
|
|
137
|
+
let store = QuadStore::new(InMemoryBackend::new());
|
|
138
|
+
// Ultra-fast: 2.78 µs lookups, zero disk I/O
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
### RocksDB (Persistent)
|
|
142
|
+
|
|
143
|
+
LSM-tree based storage with ACID transactions. Tested with **61 comprehensive tests**.
|
|
144
|
+
|
|
145
|
+
```toml
|
|
146
|
+
# Cargo.toml - Enable RocksDB backend
|
|
147
|
+
[dependencies]
|
|
148
|
+
storage = { version = "0.1.10", features = ["rocksdb-backend"] }
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
```rust
|
|
152
|
+
use storage::{QuadStore, RocksDbBackend};
|
|
153
|
+
|
|
154
|
+
// Create persistent database
|
|
155
|
+
let backend = RocksDbBackend::new("/path/to/data")?;
|
|
156
|
+
let store = QuadStore::new(backend);
|
|
157
|
+
|
|
158
|
+
// Features:
|
|
159
|
+
// - ACID transactions
|
|
160
|
+
// - Snappy compression (automatic)
|
|
161
|
+
// - Crash recovery
|
|
162
|
+
// - Range & prefix scanning
|
|
163
|
+
// - 1MB+ value support
|
|
164
|
+
|
|
165
|
+
// Force sync to disk
|
|
166
|
+
store.flush()?;
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
**RocksDB Test Coverage:**
|
|
170
|
+
- Basic CRUD operations (14 tests)
|
|
171
|
+
- Range scanning (8 tests)
|
|
172
|
+
- Prefix scanning (6 tests)
|
|
173
|
+
- Batch operations (8 tests)
|
|
174
|
+
- Transactions (8 tests)
|
|
175
|
+
- Concurrent access (5 tests)
|
|
176
|
+
- Unicode & binary data (4 tests)
|
|
177
|
+
- Large key/value handling (8 tests)
|
|
178
|
+
|
|
179
|
+
### TypeScript SDK
|
|
180
|
+
|
|
181
|
+
The npm package uses the in-memory backend—ideal for:
|
|
182
|
+
- Knowledge graph queries
|
|
183
|
+
- SPARQL execution
|
|
184
|
+
- Data transformation pipelines
|
|
185
|
+
- Embedded applications
|
|
186
|
+
|
|
187
|
+
```typescript
|
|
188
|
+
import { GraphDB } from 'rust-kgdb'
import * as fs from 'fs'
|
|
189
|
+
|
|
190
|
+
// In-memory database (default, no configuration needed)
|
|
191
|
+
const db = new GraphDB('http://example.org/app')
|
|
192
|
+
|
|
193
|
+
// For persistence, export via CONSTRUCT:
|
|
194
|
+
const ntriples = db.queryConstruct('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')
|
|
195
|
+
fs.writeFileSync('backup.nt', ntriples)
|
|
73
196
|
```
|
|
74
197
|
|
|
75
|
-
|
|
198
|
+
---
|
|
76
199
|
|
|
77
|
-
|
|
78
|
-
|
|
200
|
+
## Installation
|
|
201
|
+
|
|
202
|
+
```bash
|
|
203
|
+
npm install rust-kgdb
|
|
204
|
+
```
|
|
79
205
|
|
|
80
206
|
### Platform Support
|
|
81
207
|
|
|
82
|
-
| Platform | Architecture | Status |
|
|
83
|
-
|
|
84
|
-
| **macOS** |
|
|
85
|
-
| **macOS** |
|
|
86
|
-
| **Linux** | x64 | ✅
|
|
87
|
-
| **Linux** | arm64 | ✅
|
|
88
|
-
| **Windows** | x64 | ✅
|
|
89
|
-
| **Windows** | arm64 | ⏳
|
|
208
|
+
| Platform | Architecture | Status | SIMD |
|
|
209
|
+
|----------|-------------|--------|------|
|
|
210
|
+
| **macOS** | Intel (x64) | ✅ | AVX2, BMI2, POPCNT |
|
|
211
|
+
| **macOS** | Apple Silicon (arm64) | ✅ | NEON |
|
|
212
|
+
| **Linux** | x64 | ✅ | AVX2, BMI2, POPCNT |
|
|
213
|
+
| **Linux** | arm64 | ✅ | NEON |
|
|
214
|
+
| **Windows** | x64 | ✅ | AVX2, BMI2, POPCNT |
|
|
215
|
+
| **Windows** | arm64 | ⏳ v0.2.0 | — |
|
|
90
216
|
|
|
91
|
-
**
|
|
92
|
-
- **Intel/AMD (x64)**: AVX2, BMI2, POPCNT auto-vectorization
|
|
93
|
-
- **Apple Silicon (arm64)**: NEON auto-vectorization
|
|
94
|
-
- **Profile-Guided Optimization (PGO)**: Runtime profile-based code generation
|
|
217
|
+
**No compilation required**—pre-built native binaries included.
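The pre-built binaries ship as per-platform packages listed under `optionalDependencies` in `package.json`. The sketch below maps a platform/architecture pair to the matching package name. The package names are copied from `package.json`; the lookup function `nativePackageFor` is hypothetical, and the real loader generated by NAPI-RS may resolve binaries differently (for example, it may also distinguish libc variants on Linux).

```typescript
// Package names copied from the optionalDependencies list in package.json.
const NATIVE_PACKAGES: Record<string, string> = {
  "win32-x64": "rust-kgdb-win32-x64-msvc",
  "darwin-x64": "rust-kgdb-darwin-x64",
  "darwin-arm64": "rust-kgdb-darwin-arm64",
  "linux-x64": "rust-kgdb-linux-x64-gnu",
  "linux-arm64": "rust-kgdb-linux-arm64-gnu",
};

// Returns the package expected to hold the native addon, or undefined if unsupported.
function nativePackageFor(platform: string, arch: string): string | undefined {
  return NATIVE_PACKAGES[`${platform}-${arch}`];
}

console.log(nativePackageFor("darwin", "arm64")); // rust-kgdb-darwin-arm64
console.log(nativePackageFor("win32", "arm64")); // undefined
```

In Node.js the arguments would typically be `process.platform` and `process.arch`. An `undefined` result for Windows arm64 is consistent with the platform table, where that target is still pending.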
|
|
95
218
|
|
|
96
|
-
|
|
219
|
+
---
|
|
97
220
|
|
|
98
|
-
##
|
|
221
|
+
## Quick Start
|
|
222
|
+
|
|
223
|
+
### Complete Working Example
|
|
99
224
|
|
|
100
225
|
```typescript
|
|
101
|
-
import { GraphDB
|
|
226
|
+
import { GraphDB } from 'rust-kgdb'
|
|
102
227
|
|
|
103
|
-
// Create
|
|
104
|
-
const db = new GraphDB('http://example.org/
|
|
228
|
+
// 1. Create database
|
|
229
|
+
const db = new GraphDB('http://example.org/myapp')
|
|
105
230
|
|
|
106
|
-
//
|
|
231
|
+
// 2. Load data (Turtle format)
|
|
107
232
|
db.loadTtl(`
|
|
108
233
|
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
|
|
234
|
+
@prefix ex: <http://example.org/> .
|
|
235
|
+
|
|
236
|
+
ex:alice a foaf:Person ;
|
|
237
|
+
foaf:name "Alice" ;
|
|
238
|
+
foaf:age 30 ;
|
|
239
|
+
foaf:knows ex:bob, ex:charlie .
|
|
109
240
|
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
|
|
241
|
+
ex:bob a foaf:Person ;
|
|
242
|
+
foaf:name "Bob" ;
|
|
243
|
+
foaf:age 25 ;
|
|
244
|
+
foaf:knows ex:charlie .
|
|
113
245
|
|
|
114
|
-
|
|
115
|
-
|
|
246
|
+
ex:charlie a foaf:Person ;
|
|
247
|
+
foaf:name "Charlie" ;
|
|
248
|
+
foaf:age 35 .
|
|
116
249
|
`, null)
|
|
117
250
|
|
|
118
|
-
//
|
|
119
|
-
const
|
|
251
|
+
// 3. Query: Find friends-of-friends (WCOJ optimized!)
|
|
252
|
+
const fof = db.querySelect(`
|
|
120
253
|
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
254
|
+
PREFIX ex: <http://example.org/>
|
|
121
255
|
|
|
122
|
-
SELECT ?person ?
|
|
123
|
-
?person foaf:
|
|
124
|
-
|
|
256
|
+
SELECT ?person ?friend ?fof WHERE {
|
|
257
|
+
?person foaf:knows ?friend .
|
|
258
|
+
?friend foaf:knows ?fof .
|
|
259
|
+
FILTER(?person != ?fof)
|
|
125
260
|
}
|
|
126
|
-
ORDER BY DESC(?age)
|
|
127
261
|
`)
|
|
262
|
+
console.log('Friends of Friends:', fof)
|
|
263
|
+
// [{ person: 'ex:alice', friend: 'ex:bob', fof: 'ex:charlie' }]
|
|
264
|
+
|
|
265
|
+
// 4. Aggregation: Average age
|
|
266
|
+
const stats = db.querySelect(`
|
|
267
|
+
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
128
268
|
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
269
|
+
SELECT (COUNT(?p) AS ?count) (AVG(?age) AS ?avgAge) WHERE {
|
|
270
|
+
?p a foaf:Person ; foaf:age ?age .
|
|
271
|
+
}
|
|
272
|
+
`)
|
|
273
|
+
console.log('Stats:', stats)
|
|
274
|
+
// [{ count: '3', avgAge: '30.0' }]
|
|
134
275
|
|
|
135
|
-
//
|
|
276
|
+
// 5. ASK query
|
|
136
277
|
const hasAlice = db.queryAsk(`
|
|
137
|
-
|
|
278
|
+
PREFIX ex: <http://example.org/>
|
|
279
|
+
ASK { ex:alice a <http://xmlns.com/foaf/0.1/Person> }
|
|
280
|
+
`)
|
|
281
|
+
console.log('Has Alice?', hasAlice) // true
|
|
282
|
+
|
|
283
|
+
// 6. CONSTRUCT query
|
|
284
|
+
const graph = db.queryConstruct(`
|
|
285
|
+
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
286
|
+
PREFIX ex: <http://example.org/>
|
|
287
|
+
|
|
288
|
+
CONSTRUCT { ?p foaf:knows ?f }
|
|
289
|
+
WHERE { ?p foaf:knows ?f }
|
|
138
290
|
`)
|
|
139
|
-
console.log(
|
|
291
|
+
console.log('Extracted graph:', graph)
|
|
140
292
|
|
|
141
|
-
// Count
|
|
142
|
-
console.log(db.count())
|
|
293
|
+
// 7. Count and cleanup
|
|
294
|
+
console.log('Triple count:', db.count()) // 12
|
|
295
|
+
db.clear()
|
|
143
296
|
```
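`querySelect` returns every binding as a string, so numeric results such as `count` and `avgAge` need converting before use. The helper below is a small sketch of that step. `numericColumns` is a hypothetical name, and it assumes values arrive as plain lexical forms like `'30.0'`, as in the example output above; if a binding carried a datatype suffix, the conversion would throw.

```typescript
type Row = Record<string, string>;

// Convert the named columns of one result row from strings to numbers.
function numericColumns(row: Row, columns: string[]): Record<string, number> {
  const out: Record<string, number> = {};
  for (const column of columns) {
    const value = Number(row[column]);
    if (Number.isNaN(value)) {
      throw new Error(`Column ${column} is not numeric: ${row[column]}`);
    }
    out[column] = value;
  }
  return out;
}

// Row shape taken from the aggregation example above.
const statsRows: Row[] = [{ count: "3", avgAge: "30.0" }];
const parsedStats = numericColumns(statsRows[0], ["count", "avgAge"]);
console.log(parsedStats); // { count: 3, avgAge: 30 }
```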
|
|
144
297
|
|
|
145
|
-
|
|
298
|
+
### Save to File
|
|
299
|
+
|
|
300
|
+
```typescript
|
|
301
|
+
import { writeFileSync } from 'fs'
import { GraphDB } from 'rust-kgdb'
|
|
146
302
|
|
|
147
|
-
|
|
303
|
+
// Save as N-Triples
|
|
304
|
+
const db = new GraphDB('http://example.org/export')
|
|
305
|
+
db.loadTtl(`<http://example.org/s> <http://example.org/p> "value" .`, null)
|
|
306
|
+
|
|
307
|
+
const ntriples = db.queryConstruct(`CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }`)
|
|
308
|
+
writeFileSync('output.nt', ntriples)
|
|
309
|
+
```
|
|
310
|
+
|
|
311
|
+
---
|
|
312
|
+
|
|
313
|
+
## SPARQL 1.1 Features (100% W3C Compliant)
|
|
314
|
+
|
|
315
|
+
### Query Forms
|
|
148
316
|
|
|
149
317
|
```typescript
|
|
150
|
-
//
|
|
151
|
-
|
|
152
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
318
|
+
// SELECT - return bindings
|
|
319
|
+
db.querySelect('SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10')
|
|
153
320
|
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
?person foaf:age ?age .
|
|
157
|
-
?person foaf:email ?email .
|
|
158
|
-
}
|
|
159
|
-
`)
|
|
321
|
+
// ASK - boolean existence check
|
|
322
|
+
db.queryAsk('ASK { <http://example.org/x> ?p ?o }')
|
|
160
323
|
|
|
161
|
-
//
|
|
162
|
-
|
|
324
|
+
// CONSTRUCT - build new graph
|
|
325
|
+
db.queryConstruct('CONSTRUCT { ?s <http://new/prop> ?o } WHERE { ?s ?p ?o }')
|
|
163
326
|
```
|
|
164
327
|
|
|
165
|
-
###
|
|
328
|
+
### Aggregates
|
|
166
329
|
|
|
167
330
|
```typescript
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
?person2 org:worksAt ?company .
|
|
175
|
-
?person1 org:name ?name1 .
|
|
176
|
-
?person2 org:name ?name2 .
|
|
177
|
-
FILTER(?person1 != ?person2)
|
|
178
|
-
}
|
|
331
|
+
db.querySelect(`
|
|
332
|
+
SELECT ?type (COUNT(*) AS ?count) (AVG(?value) AS ?avg)
|
|
333
|
+
WHERE { ?s a ?type ; <http://ex/value> ?value }
|
|
334
|
+
GROUP BY ?type
|
|
335
|
+
HAVING (COUNT(*) > 5)
|
|
336
|
+
ORDER BY DESC(?count)
|
|
179
337
|
`)
|
|
338
|
+
```
|
|
180
339
|
|
|
181
|
-
|
|
182
|
-
|
|
340
|
+
### Property Paths
|
|
341
|
+
|
|
342
|
+
```typescript
|
|
343
|
+
// Transitive closure (rdfs:subClassOf*)
|
|
344
|
+
db.querySelect('PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class WHERE { ?class rdfs:subClassOf* <http://top/Class> }')
|
|
345
|
+
|
|
346
|
+
// Alternative paths
|
|
347
|
+
db.querySelect('PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?name WHERE { ?x (foaf:name|rdfs:label) ?name }')
|
|
348
|
+
|
|
349
|
+
// Sequence paths
|
|
350
|
+
db.querySelect('PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?grandparent WHERE { ?x foaf:parent/foaf:parent ?grandparent }')
|
|
183
351
|
```
|
|
184
352
|
|
|
185
|
-
###
|
|
353
|
+
### Named Graphs
|
|
186
354
|
|
|
187
355
|
```typescript
|
|
188
|
-
//
|
|
189
|
-
|
|
190
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
356
|
+
// Load into named graph
|
|
357
|
+
db.loadTtl('<http://s> <http://p> "o" .', 'http://example.org/graph1')
|
|
191
358
|
|
|
192
|
-
|
|
193
|
-
|
|
194
|
-
|
|
359
|
+
// Query specific graph
|
|
360
|
+
db.querySelect(`
|
|
361
|
+
SELECT ?s ?p ?o WHERE {
|
|
362
|
+
GRAPH <http://example.org/graph1> { ?s ?p ?o }
|
|
195
363
|
}
|
|
196
364
|
`)
|
|
365
|
+
```
|
|
197
366
|
|
|
198
|
-
|
|
199
|
-
|
|
367
|
+
### UPDATE Operations
|
|
368
|
+
|
|
369
|
+
```typescript
|
|
370
|
+
// INSERT DATA
|
|
371
|
+
db.updateInsert(`
|
|
372
|
+
INSERT DATA { <http://ex/new> <http://ex/prop> "value" }
|
|
373
|
+
`)
|
|
374
|
+
|
|
375
|
+
// DELETE WHERE
|
|
376
|
+
db.updateDelete(`
|
|
377
|
+
DELETE WHERE { ?s <http://ex/deprecated> ?o }
|
|
378
|
+
`)
|
|
200
379
|
```
|
|
201
380
|
|
|
202
|
-
|
|
381
|
+
---
|
|
382
|
+
|
|
383
|
+
## API Reference
|
|
203
384
|
|
|
204
385
|
### GraphDB Class
|
|
205
386
|
|
|
206
387
|
```typescript
|
|
207
388
|
class GraphDB {
|
|
208
|
-
// Create
|
|
209
|
-
static inMemory(): GraphDB
|
|
210
|
-
constructor(baseUri: string)
|
|
389
|
+
constructor(baseUri: string) // Create with base URI
|
|
390
|
+
static inMemory(): GraphDB // Create anonymous in-memory DB
|
|
211
391
|
|
|
212
|
-
// Data
|
|
213
|
-
loadTtl(data: string,
|
|
214
|
-
loadNTriples(data: string,
|
|
392
|
+
// Data Loading
|
|
393
|
+
loadTtl(data: string, graph: string | null): void
|
|
394
|
+
loadNTriples(data: string, graph: string | null): void
|
|
215
395
|
|
|
216
|
-
// SPARQL
|
|
396
|
+
// SPARQL Queries (WCOJ-optimized)
|
|
217
397
|
querySelect(sparql: string): Array<Record<string, string>>
|
|
218
398
|
queryAsk(sparql: string): boolean
|
|
219
|
-
queryConstruct(sparql: string): string
|
|
399
|
+
queryConstruct(sparql: string): string // Returns N-Triples
|
|
220
400
|
|
|
221
|
-
// SPARQL
|
|
401
|
+
// SPARQL Updates
|
|
222
402
|
updateInsert(sparql: string): void
|
|
223
403
|
updateDelete(sparql: string): void
|
|
224
404
|
|
|
225
|
-
// Database
|
|
405
|
+
// Database Operations
|
|
226
406
|
count(): number
|
|
227
407
|
clear(): void
|
|
228
|
-
|
|
229
|
-
// Metadata
|
|
230
408
|
getVersion(): string
|
|
231
409
|
}
|
|
232
410
|
```
|
|
233
411
|
|
|
234
|
-
### Node Class
|
|
412
|
+
### Node Class
|
|
235
413
|
|
|
236
414
|
```typescript
|
|
237
415
|
class Node {
|
|
@@ -245,165 +423,91 @@ class Node {
|
|
|
245
423
|
}
|
|
246
424
|
```
|
|
247
425
|
|
|
248
|
-
|
|
426
|
+
---
|
|
249
427
|
|
|
250
|
-
|
|
428
|
+
## Performance Characteristics
|
|
251
429
|
|
|
252
|
-
|
|
253
|
-
// INSERT DATA
|
|
254
|
-
db.updateInsert(`
|
|
255
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
430
|
+
### Complexity Analysis
|
|
256
431
|
|
|
257
|
-
|
|
258
|
-
|
|
259
|
-
|
|
260
|
-
|
|
261
|
-
|
|
432
|
+
| Operation | Complexity | Notes |
|
|
433
|
+
|-----------|------------|-------|
|
|
434
|
+
| Triple lookup | O(1) | Hash-based SPOC index |
|
|
435
|
+
| Pattern scan | O(k) | k = matching triples |
|
|
436
|
+
| Star join (WCOJ) | O(n log n) | LeapFrog intersection |
|
|
437
|
+
| Complex join (WCOJ) | O(n log n) | Trie-based |
|
|
438
|
+
| Transitive closure | O(n²) worst | CSR matrix optimization |
|
|
439
|
+
| Bulk insert | O(n) | Batch indexing |
|
|
262
440
|
|
|
263
|
-
|
|
264
|
-
db.updateDelete(`
|
|
265
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
441
|
+
### Memory Layout
|
|
266
442
|
|
|
267
|
-
DELETE WHERE {
|
|
268
|
-
?person foaf:age ?age .
|
|
269
|
-
FILTER(?age < 18)
|
|
270
|
-
}
|
|
271
|
-
`)
|
|
272
443
|
```
|
|
273
|
-
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
// Query specific graph
|
|
283
|
-
const results = db.querySelect(`
|
|
284
|
-
SELECT ?s ?p ?o WHERE {
|
|
285
|
-
GRAPH <http://example.org/graph1> {
|
|
286
|
-
?s ?p ?o .
|
|
287
|
-
}
|
|
288
|
-
}
|
|
289
|
-
`)
|
|
444
|
+
Triple: 24 bytes
|
|
445
|
+
├── Subject: 8 bytes (dictionary ID)
|
|
446
|
+
├── Predicate: 8 bytes (dictionary ID)
|
|
447
|
+
└── Object: 8 bytes (dictionary ID)
|
|
448
|
+
|
|
449
|
+
String Interning: All URIs/literals stored once in Dictionary
|
|
450
|
+
Index Overhead: ~4x base triple size (4 indexes)
|
|
451
|
+
Total: ~120 bytes/triple including indexes
|
|
290
452
|
```
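The layout above relies on string interning: each URI or literal is stored once and triples hold only fixed-width IDs. The sketch below shows the idea together with the size arithmetic. `TermDictionary` and `encodeTriple` are hypothetical names, and `Float64Array` is used here only as a convenient way to get three 8-byte slots in TypeScript (exact for IDs below 2^53); the Rust implementation is not assumed to work this way.

```typescript
// Maps each distinct term to a small integer ID and back.
class TermDictionary {
  private readonly ids = new Map<string, number>();
  private readonly terms: string[] = [];

  intern(term: string): number {
    const existing = this.ids.get(term);
    if (existing !== undefined) return existing; // already stored once
    const id = this.terms.length;
    this.ids.set(term, id);
    this.terms.push(term);
    return id;
  }

  resolve(id: number): string | undefined {
    return this.terms[id];
  }

  get size(): number {
    return this.terms.length;
  }
}

// Size arithmetic from the layout above.
const ID_BYTES = 8;
const TRIPLE_BYTES = 3 * ID_BYTES; // subject + predicate + object = 24
const INDEX_COUNT = 4;
const TOTAL_BYTES = TRIPLE_BYTES + INDEX_COUNT * TRIPLE_BYTES; // 24 + 96 = 120

// Encode a triple as three 8-byte slots holding dictionary IDs.
function encodeTriple(dict: TermDictionary, s: string, p: string, o: string): Float64Array {
  return Float64Array.of(dict.intern(s), dict.intern(p), dict.intern(o));
}

const dict = new TermDictionary();
const t1 = encodeTriple(dict, "ex:alice", "foaf:knows", "ex:bob");
const t2 = encodeTriple(dict, "ex:bob", "foaf:knows", "ex:alice");
console.log(t1.byteLength, TOTAL_BYTES); // 24 120
console.log(dict.size); // 3 (six term occurrences, three distinct strings)
```

The two triples mention six terms but the dictionary holds only three strings, which is where the per-triple saving comes from on graphs that repeat URIs heavily.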
|
|
291
453
|
|
|
292
|
-
|
|
293
|
-
|
|
294
|
-
```typescript
|
|
295
|
-
// COUNT, AVG, MIN, MAX, SUM
|
|
296
|
-
const aggregates = db.querySelect(`
|
|
297
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
298
|
-
|
|
299
|
-
SELECT
|
|
300
|
-
(COUNT(?person) AS ?count)
|
|
301
|
-
(AVG(?age) AS ?avgAge)
|
|
302
|
-
(MIN(?age) AS ?minAge)
|
|
303
|
-
(MAX(?age) AS ?maxAge)
|
|
304
|
-
WHERE {
|
|
305
|
-
?person foaf:age ?age .
|
|
306
|
-
}
|
|
307
|
-
`)
|
|
308
|
-
```
|
|
309
|
-
|
|
310
|
-
### SPARQL 1.1 Property Paths
|
|
311
|
-
|
|
312
|
-
```typescript
|
|
313
|
-
// Transitive closure with *
|
|
314
|
-
const transitiveKnows = db.querySelect(`
|
|
315
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
316
|
-
|
|
317
|
-
SELECT ?person ?connected WHERE {
|
|
318
|
-
<http://example.org/alice> foaf:knows* ?connected .
|
|
319
|
-
}
|
|
320
|
-
`)
|
|
321
|
-
|
|
322
|
-
// Alternative paths with |
|
|
323
|
-
const nameOrLabel = db.querySelect(`
|
|
324
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
325
|
-
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
|
|
326
|
-
|
|
327
|
-
SELECT ?resource ?name WHERE {
|
|
328
|
-
?resource (foaf:name|rdfs:label) ?name .
|
|
329
|
-
}
|
|
330
|
-
`)
|
|
331
|
-
```
|
|
454
|
+
---
|
|
332
455
|
|
|
333
|
-
##
|
|
456
|
+
## Version History
|
|
334
457
|
|
|
335
|
-
|
|
336
|
-
- **Bindings**: NAPI-RS for native Node.js addon
|
|
337
|
-
- **Storage**: Pluggable backends (InMemory, RocksDB, LMDB)
|
|
338
|
-
- **Indexing**: SPOC, POCS, OCSP, CSPO quad indexes
|
|
339
|
-
- **Query Optimizer**: Automatic WCOJ detection and execution
|
|
340
|
-
- **WCOJ Engine**: LeapFrog TrieJoin with variable ordering analysis
|
|
458
|
+
### v0.1.9 (2025-12-01) - SIMD + PGO Release
|
|
341
459
|
|
|
342
|
-
|
|
460
|
+
- **44.5% average speedup** via SIMD + PGO compiler optimizations
|
|
461
|
+
- WCOJ execution with LeapFrog TrieJoin
|
|
462
|
+
- Release automation infrastructure
|
|
463
|
+
- All packages updated to gonnect-uk namespace
|
|
343
464
|
|
|
344
|
-
### v0.1.8 (2025-12-01) - WCOJ Execution
|
|
465
|
+
### v0.1.8 (2025-12-01) - WCOJ Execution
|
|
345
466
|
|
|
346
|
-
-
|
|
347
|
-
-
|
|
348
|
-
-
|
|
349
|
-
- ✅ **100-1000x Speedup** for complex joins (4+ patterns)
|
|
350
|
-
- ✅ **577 Tests Passing** - Comprehensive end-to-end verification
|
|
351
|
-
- ✅ **Zero Regressions** - All existing queries work unchanged
|
|
467
|
+
- WCOJ execution path activated
|
|
468
|
+
- Variable ordering analysis for optimal joins
|
|
469
|
+
- 577 tests passing
|
|
352
470
|
|
|
353
471
|
### v0.1.7 (2025-11-30)
|
|
354
472
|
|
|
355
473
|
- Query optimizer with automatic strategy selection
|
|
356
474
|
- WCOJ algorithm integration (planning phase)
|
|
357
|
-
- Query plan visualization API
|
|
358
475
|
|
|
359
476
|
### v0.1.3 (2025-11-18)
|
|
360
477
|
|
|
361
|
-
- Initial TypeScript SDK
|
|
478
|
+
- Initial TypeScript SDK
|
|
362
479
|
- 100% W3C SPARQL 1.1 compliance
|
|
363
480
|
- 100% W3C RDF 1.2 compliance
|
|
364
481
|
|
|
365
|
-
|
|
366
|
-
|
|
367
|
-
```bash
|
|
368
|
-
# Run test suite
|
|
369
|
-
npm test
|
|
370
|
-
|
|
371
|
-
# Run specific tests
|
|
372
|
-
npm test -- --testNamePattern="star query"
|
|
373
|
-
```
|
|
374
|
-
|
|
375
|
-
## 🤝 Contributing
|
|
482
|
+
---
|
|
376
483
|
|
|
377
|
-
|
|
484
|
+
## Use Cases
|
|
378
485
|
|
|
379
|
-
|
|
486
|
+
| Domain | Application |
|
|
487
|
+
|--------|-------------|
|
|
488
|
+
| **Knowledge Graphs** | Enterprise ontologies, taxonomies |
|
|
489
|
+
| **Semantic Search** | Structured queries over unstructured data |
|
|
490
|
+
| **Data Integration** | ETL with SPARQL CONSTRUCT |
|
|
491
|
+
| **Compliance** | SHACL validation, provenance tracking |
|
|
492
|
+
| **Graph Analytics** | Pattern detection, community analysis |
|
|
493
|
+
| **Mobile Apps** | Embedded RDF on iOS/Android |
|
|
380
494
|
|
|
381
|
-
|
|
495
|
+
---
|
|
382
496
|
|
|
383
|
-
##
|
|
497
|
+
## Links
|
|
384
498
|
|
|
385
499
|
- [GitHub Repository](https://github.com/gonnect-uk/rust-kgdb)
|
|
386
500
|
- [Documentation](https://github.com/gonnect-uk/rust-kgdb/tree/main/docs)
|
|
387
501
|
- [CHANGELOG](https://github.com/gonnect-uk/rust-kgdb/blob/main/CHANGELOG.md)
|
|
388
|
-
- [W3C SPARQL 1.1
|
|
389
|
-
- [W3C RDF 1.2
|
|
502
|
+
- [W3C SPARQL 1.1](https://www.w3.org/TR/sparql11-query/)
|
|
503
|
+
- [W3C RDF 1.2](https://www.w3.org/TR/rdf12-concepts/)
|
|
390
504
|
|
|
391
|
-
|
|
392
|
-
|
|
393
|
-
- **Knowledge Graphs** - Build semantic data models
|
|
394
|
-
- **Semantic Search** - Query structured data with SPARQL
|
|
395
|
-
- **Data Integration** - Combine data from multiple sources
|
|
396
|
-
- **Ontology Reasoning** - RDFS and OWL inference
|
|
397
|
-
- **Graph Analytics** - Complex pattern matching with WCOJ
|
|
398
|
-
- **Mobile Apps** - Embedded RDF database for iOS/Android
|
|
505
|
+
---
|
|
399
506
|
|
|
400
|
-
##
|
|
507
|
+
## License
|
|
401
508
|
|
|
402
|
-
|
|
403
|
-
- [ ] v0.1.9: Manual SIMD vectorization for 2-4x additional speedup
|
|
404
|
-
- [ ] v0.2.0: Windows ARM64 support + distributed query execution
|
|
405
|
-
- [ ] v0.3.0: Graph analytics and reasoning engines
|
|
509
|
+
Apache License 2.0
|
|
406
510
|
|
|
407
511
|
---
|
|
408
512
|
|
|
409
|
-
**Built with
|
|
513
|
+
**Built with Rust + NAPI-RS**
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "rust-kgdb",
|
|
3
|
-
"version": "0.1.
|
|
3
|
+
"version": "0.1.10",
|
|
4
4
|
"description": "High-performance RDF/SPARQL database with 100% W3C compliance and WCOJ execution",
|
|
5
5
|
"main": "index.js",
|
|
6
6
|
"types": "index.d.ts",
|
|
@@ -21,7 +21,7 @@
|
|
|
21
21
|
"build:debug": "napi build --platform native/rust-kgdb-napi",
|
|
22
22
|
"prepublishOnly": "napi prepublish -t npm",
|
|
23
23
|
"test": "jest",
|
|
24
|
-
"version": "0.1.
|
|
24
|
+
"version": "0.1.10"
|
|
25
25
|
},
|
|
26
26
|
"keywords": [
|
|
27
27
|
"rdf",
|
|
@@ -56,10 +56,10 @@
|
|
|
56
56
|
"*.node"
|
|
57
57
|
],
|
|
58
58
|
"optionalDependencies": {
|
|
59
|
-
"rust-kgdb-win32-x64-msvc": "0.1.
|
|
60
|
-
"rust-kgdb-darwin-x64": "0.1.
|
|
61
|
-
"rust-kgdb-linux-x64-gnu": "0.1.
|
|
62
|
-
"rust-kgdb-darwin-arm64": "0.1.
|
|
63
|
-
"rust-kgdb-linux-arm64-gnu": "0.1.
|
|
59
|
+
"rust-kgdb-win32-x64-msvc": "0.1.10",
|
|
60
|
+
"rust-kgdb-darwin-x64": "0.1.10",
|
|
61
|
+
"rust-kgdb-linux-x64-gnu": "0.1.10",
|
|
62
|
+
"rust-kgdb-darwin-arm64": "0.1.10",
|
|
63
|
+
"rust-kgdb-linux-arm64-gnu": "0.1.10"
|
|
64
64
|
}
|
|
65
65
|
}
|