rust-kgdb 0.4.0 → 0.4.2
- package/README.md +1238 -1935
- package/examples/business-assertions.test.ts +1196 -0
- package/examples/core-concepts-demo.ts +502 -0
- package/examples/datalog-example.ts +478 -0
- package/examples/embeddings-example.ts +376 -0
- package/examples/graphframes-example.ts +367 -0
- package/examples/hypermind-fraud-underwriter.ts +669 -0
- package/examples/pregel-example.ts +399 -0
- package/package.json +3 -2
package/README.md
CHANGED
@@ -2,2125 +2,1428 @@
  2   2
  3   3   [](https://www.npmjs.com/package/rust-kgdb)
  4   4   [](https://opensource.org/licenses/Apache-2.0)
  5     - [](https://www.w3.org/TR/sparql11-query/)
      6 + [](#wasm-sandbox-security)
  7   7
  8     -
      8 + **Production-Grade Neuro-Symbolic AI Framework**
  9   9
 10     -
 11     -
 12     -
 13     -
 14     -
 15     -
 16     -
 17     -
 18     -
 19     - > **v0.4.0 - Research Release**: HyperMind neuro-symbolic framework with WASM sandbox security, category theory morphisms, and W3C SPARQL 1.1 compliance. Benchmarked on LUBM (Lehigh University Benchmark).
 20     - >
 21     - > **Full Benchmark Report**: [HYPERMIND_BENCHMARK_REPORT.md](./HYPERMIND_BENCHMARK_REPORT.md)
 22     -
 23     - ---
 24     -
 25     - ## Key Capabilities
 26     -
 27     - | Feature | Description |
 28     - |---------|-------------|
 29     - | **HyperMind Agent** | Neuro-symbolic AI: NL → SPARQL with +86.4% accuracy vs vanilla LLMs |
 30     - | **WASM Sandbox** | Secure agent execution with capability-based access control |
 31     - | **Category Theory** | Tools as morphisms with type-safe composition |
 32     - | **GraphDB** | Core RDF/SPARQL database with 100% W3C compliance |
 33     - | **GraphFrames** | Spark-compatible graph analytics (PageRank, triangles, components) |
 34     - | **Motif Finding** | Graph pattern DSL for structural queries (fraud rings, recommendations) |
 35     - | **EmbeddingService** | Vector similarity search, text search, multi-provider embeddings |
 36     - | **DatalogProgram** | Rule-based reasoning with transitive closure |
 37     - | **Pregel** | Bulk Synchronous Parallel graph processing |
 38     -
 39     - ### Security Model Comparison
 40     -
 41     - | Feature | HyperMind WASM | LangChain | AutoGPT |
 42     - |---------|----------------|-----------|---------|
 43     - | Memory Isolation | YES (wasmtime) | NO | NO |
 44     - | CPU Time Limits | YES (fuel meter) | NO | NO |
 45     - | Capability-Based Access | YES (7 caps) | NO | NO |
 46     - | Execution Audit Trail | YES (full) | Partial | NO |
 47     - | Secure by Default | YES | NO | NO |
 48     -
 49     - ---
 50     -
 51     - ## Installation
 52     -
 53     - ```bash
 54     - npm install rust-kgdb
     10 + ```
     11 + ╔═══════════════════════════════════════════════════════════════════════════════╗
     12 + ║ ║
     13 + ║ +86.4% ACCURACY IMPROVEMENT OVER VANILLA LLM AGENTS ║
     14 + ║ ║
     15 + ║ On structured query generation benchmarks (LUBM dataset, 11 hard tests) ║
     16 + ║ ║
     17 + ╚═══════════════════════════════════════════════════════════════════════════════╝
 55  18   ```
 56  19
 57  20   ---
 58  21
 59     - ##
 60     -
 61     - ### 1. Core GraphDB (RDF/SPARQL)
     22 + ## Benchmark: Vanilla LLM vs HyperMind
 62  23
 63     - ```
 64     -
 65     -
 66     -
 67     -
 68     - // Create database with base URI
 69     - const db = new GraphDB('http://example.org/my-app')
     24 + ```
     25 + ═══════════════════════════════════════════════════════════════════════════════
     26 + SPARQL QUERY GENERATION ACCURACY
     27 + ═══════════════════════════════════════════════════════════════════════════════
 70  28
 71     -
 72     - db.loadTtl(`
 73     - <http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
 74     - <http://example.org/alice> <http://xmlns.com/foaf/0.1/age> "28"^^<http://www.w3.org/2001/XMLSchema#integer> .
 75     - <http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
 76     - <http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
 77     - `, null)
     29 + VANILLA LLM (No Schema Context):
 78  30
 79     -
 80     -
 81     -
     31 + Claude Sonnet 4 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 0.0% ❌
     32 + GPT-4o │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 0.0% ❌
     33 + Type Errors │████████████████████████████████████████│ 100.0% ⚠️
 82  34
 83     -
 84     - const hasAlice = db.queryAsk('ASK { <http://example.org/alice> ?p ?o }')
 85     - console.log('Has Alice:', hasAlice) // true
     35 + ───────────────────────────────────────────────────────────────────────────────
 86  36
 87     -
 88     - const graph = db.queryConstruct('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }')
 89     - console.log('Graph:', graph)
     37 + HYPERMIND NEURO-SYMBOLIC (With Type Theory + Category Theory):
 90  38
 91     -
 92     -
     39 + Claude Sonnet 4 │████████████████████████████████████░░░░│ 90.9% ✅
     40 + GPT-4o │████████████████████████████████░░░░░░░░│ 81.8% ✅
     41 + Average │█████████████████████████████████████░░░│ 86.4% ✅
     42 + Type Errors │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 0.0% ✅
 93  43
 94     -
 95     -
     44 + ═══════════════════════════════════════════════════════════════════════════════
     45 + +86.4 PERCENTAGE POINTS IMPROVEMENT
     46 + ═══════════════════════════════════════════════════════════════════════════════
 96  47   ```
 97  48
 98     - ###
     49 + ### Detailed Results by Test Category
 99  50
100     - ```javascript
101     - const {
102     - GraphFrame,
103     - friendsGraph,
104     - completeGraph,
105     - chainGraph,
106     - starGraph,
107     - cycleGraph,
108     - binaryTreeGraph,
109     - bipartiteGraph
110     - } = require('rust-kgdb')
111     -
112     - // Create graph from vertices and edges
113     - const graph = new GraphFrame(
114     - JSON.stringify([{id: "alice"}, {id: "bob"}, {id: "carol"}, {id: "dave"}]),
115     - JSON.stringify([
116     - {src: "alice", dst: "bob"},
117     - {src: "bob", dst: "carol"},
118     - {src: "carol", dst: "dave"},
119     - {src: "dave", dst: "alice"}
120     - ])
121     - )
122     -
123     - // Graph statistics
124     - console.log('Vertices:', graph.vertexCount()) // 4
125     - console.log('Edges:', graph.edgeCount()) // 4
126     -
127     - // === PageRank Algorithm ===
128     - const ranks = JSON.parse(graph.pageRank(0.15, 20)) // damping=0.15, iterations=20
129     - console.log('PageRank:', ranks)
130     - // { ranks: { alice: 0.25, bob: 0.25, carol: 0.25, dave: 0.25 } }
131     -
132     - // === Connected Components ===
133     - const components = JSON.parse(graph.connectedComponents())
134     - console.log('Components:', components)
135     -
136     - // === Triangle Counting (WCOJ Optimized) ===
137     - const k4 = completeGraph(4) // K4 has exactly 4 triangles
138     - console.log('Triangles in K4:', k4.triangleCount()) // 4
139     -
140     - const k5 = completeGraph(5) // K5 has exactly 10 triangles (C(5,3))
141     - console.log('Triangles in K5:', k5.triangleCount()) // 10
142     -
143     - // === Motif Pattern Matching ===
144     - const chain = chainGraph(4) // v0 -> v1 -> v2 -> v3
145     -
146     - // Find single edges
147     - const edges = JSON.parse(chain.find("(a)-[]->(b)"))
148     - console.log('Edge patterns:', edges.length) // 3
149     -
150     - // Find two-hop paths
151     - const twoHop = JSON.parse(chain.find("(a)-[]->(b); (b)-[]->(c)"))
152     - console.log('Two-hop patterns:', twoHop.length) // 2 (v0->v1->v2, v1->v2->v3)
153     -
154     - // === Factory Functions ===
155     - const friends = friendsGraph() // Social network with 6 vertices
156     - const star = starGraph(5) // Hub with 5 spokes (6 vertices, 5 edges)
157     - const complete = completeGraph(4) // K4 complete graph
158     - const cycle = cycleGraph(5) // Pentagon cycle (5 vertices, 5 edges)
159     - const tree = binaryTreeGraph(3) // Binary tree depth 3
160     - const bipartite = bipartiteGraph(3, 4) // 3 left + 4 right vertices
161     -
162     - console.log('Star graph:', star.vertexCount(), 'vertices,', star.edgeCount(), 'edges')
163     - console.log('Cycle graph:', cycle.vertexCount(), 'vertices,', cycle.edgeCount(), 'edges')
164  51   ```
165     -
166     -
167     -
168     -
169     -
170     -
171     -
172     -
173     -
174     -
175     -
176     - // (a)-[]->(b); (b)-[]->(c) Two-hop path (chain pattern)
177     - // !(a)-[]->(b) Negation (edge does NOT exist)
178     -
179     - // === Find Single Edges ===
180     - const chain = chainGraph(5) // v0 -> v1 -> v2 -> v3 -> v4
181     - const edges = JSON.parse(chain.find("(a)-[]->(b)"))
182     - console.log('All edges:', edges.length) // 4
183     -
184     - // === Two-Hop Paths (Friend-of-Friend Pattern) ===
185     - const twoHop = JSON.parse(chain.find("(a)-[]->(b); (b)-[]->(c)"))
186     - console.log('Two-hop paths:', twoHop.length) // 3
187     - // v0->v1->v2, v1->v2->v3, v2->v3->v4
188     -
189     - // === Three-Hop Paths ===
190     - const threeHop = JSON.parse(chain.find("(a)-[]->(b); (b)-[]->(c); (c)-[]->(d)"))
191     - console.log('Three-hop paths:', threeHop.length) // 2
192     -
193     - // === Triangle Pattern (Cycle of Length 3) ===
194     - const k4 = completeGraph(4) // K4 has triangles
195     - const triangles = JSON.parse(k4.find("(a)-[]->(b); (b)-[]->(c); (c)-[]->(a)"))
196     - // Filter to avoid counting same triangle multiple times
197     - const uniqueTriangles = triangles.filter(t => t.a < t.b && t.b < t.c)
198     - console.log('Triangles in K4:', uniqueTriangles.length) // 4
199     -
200     - // === Star Pattern (Hub with Multiple Spokes) ===
201     - const social = new GraphFrame(
202     - JSON.stringify([
203     - {id: "influencer"},
204     - {id: "follower1"}, {id: "follower2"}, {id: "follower3"}
205     - ]),
206     - JSON.stringify([
207     - {src: "influencer", dst: "follower1"},
208     - {src: "influencer", dst: "follower2"},
209     - {src: "influencer", dst: "follower3"}
210     - ])
211     - )
212     - // Find hub pattern: someone with 2+ outgoing edges
213     - const hubPattern = JSON.parse(social.find("(hub)-[]->(f1); (hub)-[]->(f2)"))
214     - console.log('Hub patterns (2+ followers):', hubPattern.length)
215     -
216     - // === Reciprocal Relationship (Mutual Friends) ===
217     - const mutual = new GraphFrame(
218     - JSON.stringify([{id: "alice"}, {id: "bob"}, {id: "carol"}]),
219     - JSON.stringify([
220     - {src: "alice", dst: "bob"},
221     - {src: "bob", dst: "alice"}, // Reciprocal
222     - {src: "bob", dst: "carol"} // One-way
223     - ])
224     - )
225     - const reciprocal = JSON.parse(mutual.find("(a)-[]->(b); (b)-[]->(a)"))
226     - console.log('Mutual relationships:', reciprocal.length) // 2 (alice<->bob counted twice)
227     -
228     - // === Diamond Pattern (Common in Fraud Detection) ===
229     - // A -> B, A -> C, B -> D, C -> D (convergence point D)
230     - const diamond = new GraphFrame(
231     - JSON.stringify([{id: "A"}, {id: "B"}, {id: "C"}, {id: "D"}]),
232     - JSON.stringify([
233     - {src: "A", dst: "B"},
234     - {src: "A", dst: "C"},
235     - {src: "B", dst: "D"},
236     - {src: "C", dst: "D"}
237     - ])
238     - )
239     - const diamondPattern = JSON.parse(diamond.find(
240     - "(a)-[]->(b); (a)-[]->(c); (b)-[]->(d); (c)-[]->(d)"
241     - ))
242     - console.log('Diamond patterns:', diamondPattern.length) // 1
243     -
244     - // === Use Case: Fraud Ring Detection ===
245     - // Find circular money transfers: A -> B -> C -> A
246     - const transactions = new GraphFrame(
247     - JSON.stringify([
248     - {id: "acc001"}, {id: "acc002"}, {id: "acc003"}, {id: "acc004"}
249     - ]),
250     - JSON.stringify([
251     - {src: "acc001", dst: "acc002", amount: 10000},
252     - {src: "acc002", dst: "acc003", amount: 9900},
253     - {src: "acc003", dst: "acc001", amount: 9800}, // Suspicious cycle!
254     - {src: "acc003", dst: "acc004", amount: 5000} // Normal transfer
255     - ])
256     - )
257     - const cycles = JSON.parse(transactions.find(
258     - "(a)-[]->(b); (b)-[]->(c); (c)-[]->(a)"
259     - ))
260     - console.log('Circular transfer patterns:', cycles.length) // Found fraud ring!
261     -
262     - // === Use Case: Recommendation (Friends-of-Friends not yet connected) ===
263     - const network = friendsGraph()
264     - const fofPattern = JSON.parse(network.find("(a)-[]->(b); (b)-[]->(c)"))
265     - // Filter: a != c and no direct edge a->c (potential recommendation)
266     - console.log('Friend-of-friend patterns for recommendations:', fofPattern.length)
     52 + ┌─────────────────────┬────────────────┬────────────────┬─────────────────┐
     53 + │ Test Category │ Vanilla LLM │ HyperMind │ Improvement │
     54 + ├─────────────────────┼────────────────┼────────────────┼─────────────────┤
     55 + │ Ambiguous Queries │ 0.0% │ 100.0% │ +100.0 pp │
     56 + │ Multi-Hop Reasoning │ 0.0% │ 100.0% │ +100.0 pp │
     57 + │ Syntax Discipline │ 0.0% │ 100.0% │ +100.0 pp │
     58 + │ Edge Cases │ 0.0% │ 50.0% │ +50.0 pp │
     59 + │ Type Mismatches │ 0.0% │ 100.0% │ +100.0 pp │
     60 + ├─────────────────────┼────────────────┼────────────────┼─────────────────┤
     61 + │ OVERALL │ 0.0% │ 86.4% │ +86.4 pp │
     62 + └─────────────────────┴────────────────┴────────────────┴─────────────────┘
267  63   ```
268  64
269     - ###
270     -
271     - | Pattern | DSL Syntax | Description |
272     - |---------|------------|-------------|
273     - | **Edge** | `(a)-[]->(b)` | Single directed edge |
274     - | **Named Edge** | `(a)-[e]->(b)` | Edge with binding name |
275     - | **Two-hop** | `(a)-[]->(b); (b)-[]->(c)` | Path of length 2 |
276     - | **Triangle** | `(a)-[]->(b); (b)-[]->(c); (c)-[]->(a)` | 3-cycle |
277     - | **Star** | `(h)-[]->(a); (h)-[]->(b); (h)-[]->(c)` | Hub pattern |
278     - | **Diamond** | `(a)-[]->(b); (a)-[]->(c); (b)-[]->(d); (c)-[]->(d)` | Convergence |
279     - | **Negation** | `!(a)-[]->(b)` | Edge must NOT exist |
     65 + ### Why Vanilla LLMs Fail
280  66
281     - ### 3. EmbeddingService (Vector Similarity & Text Search)
282     -
283     - ```javascript
284     - const { EmbeddingService } = require('rust-kgdb')
285     -
286     - const service = new EmbeddingService()
287     -
288     - // === Store Vector Embeddings (384 dimensions) ===
289     - service.storeVector('entity1', new Array(384).fill(0.1))
290     - service.storeVector('entity2', new Array(384).fill(0.15))
291     - service.storeVector('entity3', new Array(384).fill(0.9))
292     -
293     - // Retrieve stored vector
294     - const vec = service.getVector('entity1')
295     - console.log('Vector dimension:', vec.length) // 384
296     -
297     - // Count stored vectors
298     - console.log('Total vectors:', service.countVectors()) // 3
299     -
300     - // === Similarity Search ===
301     - // Find top 10 entities similar to 'entity1' with threshold 0.0
302     - const similar = JSON.parse(service.findSimilar('entity1', 10, 0.0))
303     - console.log('Similar entities:', similar)
304     - // Returns entities sorted by cosine similarity
305     -
306     - // === Multi-Provider Composite Embeddings ===
307     - // Store embeddings from multiple providers (OpenAI, Voyage, Cohere)
308     - service.storeComposite('product_123', JSON.stringify({
309     - openai: new Array(384).fill(0.1),
310     - voyage: new Array(384).fill(0.2),
311     - cohere: new Array(384).fill(0.3)
312     - }))
313     -
314     - // Retrieve composite embedding
315     - const composite = service.getComposite('product_123')
316     - console.log('Composite embedding:', composite ? 'stored' : 'not found')
317     -
318     - // Count composite embeddings
319     - console.log('Total composites:', service.countComposites())
320     -
321     - // === Composite Similarity Search (RRF Aggregation) ===
322     - // Find similar using Reciprocal Rank Fusion across multiple providers
323     - const compositeSimilar = JSON.parse(service.findSimilarComposite('product_123', 10, 0.5, 'rrf'))
324     - console.log('Similar (composite RRF):', compositeSimilar)
325     -
326     - // === Use Case: Semantic Product Search ===
327     - // Store product embeddings
328     - const products = ['laptop', 'phone', 'tablet', 'keyboard', 'mouse']
329     - products.forEach((product, i) => {
330     - // In production, use actual embeddings from OpenAI/Cohere/etc
331     - const embedding = new Array(384).fill(0).map((_, j) => Math.sin(i * 0.1 + j * 0.01))
332     - service.storeVector(product, embedding)
333     - })
334     -
335     - // Find similar products
336     - const relatedToLaptop = JSON.parse(service.findSimilar('laptop', 5, 0.0))
337     - console.log('Products similar to laptop:', relatedToLaptop)
338  67   ```
     68 + User: "Find all professors"
339  69
340     -
341     -
342     - ```
343     -
344     -
345     -
346     -
347     -
348     -
349     -
350     -
351     -
352     -
353     - provider: 'openai' // Use OpenAI provider
354     - }
355     - }
356     -
357     - // Multiple triggers for different providers
358     - const triggers = [
359     - { name: 'embed_openai', provider: 'openai' },
360     - { name: 'embed_voyage', provider: 'voyage' },
361     - { name: 'embed_cohere', provider: 'cohere' }
362     - ]
     70 + Vanilla LLM Output:
     71 + ┌───────────────────────────────────────────────────────────────────────┐
     72 + │ ```sparql │
     73 + │ PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#> │
     74 + │ SELECT ?professor WHERE { │
     75 + │ ?professor a ub:Faculty . ← WRONG! Schema has "Professor" │
     76 + │ } │
     77 + │ ``` ← Parser rejects markdown │
     78 + │ │
     79 + │ This query retrieves all faculty members from the LUBM dataset. │
     80 + │ ↑ Explanation text breaks parsing │
     81 + └───────────────────────────────────────────────────────────────────────┘
     82 + Result: ❌ PARSER ERROR - Invalid SPARQL syntax
363  83
364     -
     84 + HyperMind Output:
     85 + ┌───────────────────────────────────────────────────────────────────────┐
     86 + │ PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#> │
     87 + │ SELECT ?professor WHERE { │
     88 + │ ?professor a ub:Professor . ← CORRECT! Schema-aware │
     89 + │ } │
     90 + └───────────────────────────────────────────────────────────────────────┘
     91 + Result: ✅ 15 results returned in 2.3ms
365  92   ```
366  93
367     -
     94 + ---
368  95
369     -
370     - // rust-kgdb supports multiple embedding providers:
371     - //
372     - // Built-in Providers:
373     - // - 'openai' → text-embedding-3-small (1536 or 384 dim)
374     - // - 'voyage' → voyage-2, voyage-lite-02-instruct
375     - // - 'cohere' → embed-v3
376     - // - 'anthropic' → Via Voyage partnership
377     - // - 'mistral' → mistral-embed
378     - // - 'jina' → jina-embeddings-v2
379     - // - 'ollama' → Local models (llama, mistral, etc.)
380     - // - 'hf-tei' → HuggingFace Text Embedding Inference
381     - //
382     - // Provider Configuration (Rust-side):
383     -
384     - const providerConfig = {
385     - providers: {
386     - openai: {
387     - api_key: process.env.OPENAI_API_KEY,
388     - model: 'text-embedding-3-small',
389     - dimensions: 384
390     - },
391     - voyage: {
392     - api_key: process.env.VOYAGE_API_KEY,
393     - model: 'voyage-2',
394     - dimensions: 1024
395     - },
396     - cohere: {
397     - api_key: process.env.COHERE_API_KEY,
398     - model: 'embed-english-v3.0',
399     - dimensions: 384
400     - },
401     - ollama: {
402     - base_url: 'http://localhost:11434',
403     - model: 'nomic-embed-text',
404     - dimensions: 768
405     - }
406     - },
407     - default_provider: 'openai'
408     - }
     96 + ## Installation
409  97
410     -
411     -
412     - // a "recall ceiling" - different providers capture different semantic aspects:
413     - // - OpenAI: General semantic understanding
414     - // - Voyage: Domain-specific (legal, financial, code)
415     - // - Cohere: Multilingual support
416     - // - Ollama: Privacy-preserving local inference
417     -
418     - // Aggregation Strategies for composite search:
419     - // - 'rrf' → Reciprocal Rank Fusion (recommended)
420     - // - 'max' → Maximum score across providers
421     - // - 'avg' → Weighted average
422     - // - 'voting' → Consensus (entity must appear in N providers)
     98 + ```bash
     99 + npm install rust-kgdb
423 100   ```
424 101
425     -
    102 + **Supported Platforms:**
    103 + - macOS (Intel & Apple Silicon)
    104 + - Linux (x64 & ARM64)
    105 + - Windows (x64)
426 106
427     -
428     - const { DatalogProgram, evaluateDatalog, queryDatalog } = require('rust-kgdb')
429     -
430     - const program = new DatalogProgram()
431     -
432     - // === Add Facts ===
433     - program.addFact(JSON.stringify({predicate: 'parent', terms: ['alice', 'bob']}))
434     - program.addFact(JSON.stringify({predicate: 'parent', terms: ['bob', 'charlie']}))
435     - program.addFact(JSON.stringify({predicate: 'parent', terms: ['charlie', 'dave']}))
436     -
437     - console.log('Facts:', program.factCount()) // 3
438     -
439     - // === Add Rules ===
440     - // Rule 1: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
441     - program.addRule(JSON.stringify({
442     - head: {predicate: 'grandparent', terms: ['?X', '?Z']},
443     - body: [
444     - {predicate: 'parent', terms: ['?X', '?Y']},
445     - {predicate: 'parent', terms: ['?Y', '?Z']}
446     - ]
447     - }))
448     -
449     - // Rule 2: ancestor(X, Y) :- parent(X, Y)
450     - program.addRule(JSON.stringify({
451     - head: {predicate: 'ancestor', terms: ['?X', '?Y']},
452     - body: [
453     - {predicate: 'parent', terms: ['?X', '?Y']}
454     - ]
455     - }))
456     -
457     - // Rule 3: ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z) (transitive closure)
458     - program.addRule(JSON.stringify({
459     - head: {predicate: 'ancestor', terms: ['?X', '?Z']},
460     - body: [
461     - {predicate: 'parent', terms: ['?X', '?Y']},
462     - {predicate: 'ancestor', terms: ['?Y', '?Z']}
463     - ]
464     - }))
465     -
466     - console.log('Rules:', program.ruleCount()) // 3
467     -
468     - // === Evaluate Program ===
469     - const result = evaluateDatalog(program)
470     - console.log('Evaluation result:', result)
471     -
472     - // === Query Derived Facts ===
473     - const grandparents = JSON.parse(queryDatalog(program, 'grandparent'))
474     - console.log('Grandparent relations:', grandparents)
475     - // alice is grandparent of charlie
476     - // bob is grandparent of dave
477     -
478     - const ancestors = JSON.parse(queryDatalog(program, 'ancestor'))
479     - console.log('Ancestor relations:', ancestors)
480     - // alice->bob, alice->charlie, alice->dave
481     - // bob->charlie, bob->dave
482     - // charlie->dave
483     - ```
    107 + ---
484 108
485     -
    109 + ## Performance Benchmarks
486 110
487     - ```javascript
488     - const {
489     - chainGraph,
490     - starGraph,
491     - cycleGraph,
492     - pregelShortestPaths
493     - } = require('rust-kgdb')
494     -
495     - // === Shortest Paths in Chain Graph ===
496     - const chain = chainGraph(10) // v0 -> v1 -> v2 -> ... -> v9
497     -
498     - // Run Pregel shortest paths from v0
499     - const chainResult = JSON.parse(pregelShortestPaths(chain, 'v0', 20))
500     - console.log('Chain shortest paths from v0:', chainResult)
501     - // Expected: { v0: 0, v1: 1, v2: 2, v3: 3, ..., v9: 9 }
502     -
503     - // === Shortest Paths in Star Graph ===
504     - const star = starGraph(5) // hub connected to spoke0...spoke4
505     -
506     - // Run Pregel from hub (center vertex)
507     - const starResult = JSON.parse(pregelShortestPaths(star, 'hub', 10))
508     - console.log('Star shortest paths from hub:', starResult)
509     - // Expected: hub=0, all spokes=1
510     -
511     - // === Shortest Paths in Cycle Graph ===
512     - const cycle = cycleGraph(6) // v0 -> v1 -> v2 -> v3 -> v4 -> v5 -> v0
513     -
514     - const cycleResult = JSON.parse(pregelShortestPaths(cycle, 'v0', 20))
515     - console.log('Cycle shortest paths from v0:', cycleResult)
516     - // In directed cycle: v0=0, v1=1, v2=2, v3=3, v4=4, v5=5
517     -
518     - // === Custom Graph for Pregel ===
519     - const customGraph = new (require('rust-kgdb').GraphFrame)(
520     - JSON.stringify([
521     - {id: "server1"},
522     - {id: "server2"},
523     - {id: "server3"},
524     - {id: "client"}
525     - ]),
526     - JSON.stringify([
527     - {src: "client", dst: "server1"},
528     - {src: "client", dst: "server2"},
529     - {src: "server1", dst: "server3"},
530     - {src: "server2", dst: "server3"}
531     - ])
532     - )
533     -
534     - const networkResult = JSON.parse(pregelShortestPaths(customGraph, 'client', 10))
535     - console.log('Network shortest paths from client:', networkResult)
536     - // client=0, server1=1, server2=1, server3=2
537 111   ```
    112 + ═══════════════════════════════════════════════════════════════════════════════
    113 + KNOWLEDGE GRAPH PERFORMANCE
    114 + ═══════════════════════════════════════════════════════════════════════════════
538 115
539     -
    116 + rust-kgdb vs Industry Leaders:
540 117
541     -
542     - const {
543     - friendsGraph,
544     - chainGraph,
545     - starGraph,
546     - completeGraph,
547     - cycleGraph,
548     - binaryTreeGraph,
549     - bipartiteGraph,
550     - } = require('rust-kgdb')
551     -
552     - // === friendsGraph() - Social Network ===
553     - // Pre-built social network for testing
554     - const friends = friendsGraph()
555     - console.log('Friends graph:', friends.vertexCount(), 'people')
556     -
557     - // === chainGraph(n) - Linear Path ===
558     - // v0 -> v1 -> v2 -> ... -> v(n-1)
559     - const chain5 = chainGraph(5)
560     - console.log('Chain(5):', chain5.vertexCount(), 'vertices,', chain5.edgeCount(), 'edges')
561     - // 5 vertices, 4 edges
562     -
563     - // === starGraph(spokes) - Hub-Spoke ===
564     - // hub -> spoke0, hub -> spoke1, ..., hub -> spoke(n-1)
565     - const star6 = starGraph(6)
566     - console.log('Star(6):', star6.vertexCount(), 'vertices,', star6.edgeCount(), 'edges')
567     - // 7 vertices (1 hub + 6 spokes), 6 edges
568     -
569     - // === completeGraph(n) - K_n Complete Graph ===
570     - // Every vertex connected to every other vertex
571     - const k4 = completeGraph(4)
572     - console.log('K4:', k4.vertexCount(), 'vertices,', k4.edgeCount(), 'edges')
573     - // 4 vertices, 6 edges (bidirectional = 12)
574     - console.log('K4 triangles:', k4.triangleCount()) // 4 triangles
575     -
576     - // === cycleGraph(n) - Circular ===
577     - // v0 -> v1 -> v2 -> ... -> v(n-1) -> v0
578     - const cycle5 = cycleGraph(5)
579     - console.log('Cycle(5):', cycle5.vertexCount(), 'vertices,', cycle5.edgeCount(), 'edges')
580     - // 5 vertices, 5 edges
581     -
582     - // === binaryTreeGraph(depth) - Binary Tree ===
583     - // Complete binary tree with given depth
584     - const tree3 = binaryTreeGraph(3)
585     - console.log('BinaryTree(3):', tree3.vertexCount(), 'vertices')
586     - // 2^4 - 1 = 15 vertices for depth 3
587     -
588     - // === bipartiteGraph(left, right) - Two Sets ===
589     - // All left vertices connected to all right vertices
590     - const bp34 = bipartiteGraph(3, 4)
591     - console.log('Bipartite(3,4):', bp34.vertexCount(), 'vertices,', bp34.edgeCount(), 'edges')
592     - // 7 vertices, 12 edges (3 * 4)
593     - ```
    118 + LOOKUP SPEED (lower is better):
594 119
595     -
    120 + rust-kgdb │██░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 2.78 µs 🏆
    121 + RDFox │███████████████████████████░░░░░░░░░░░░░│ 97.3 µs
    122 + Apache Jena │████████████████████████████████████████│ 180+ µs
596 123
597     -
    124 + rust-kgdb is 35-180x FASTER than competitors
598 125
599     -
    126 + ───────────────────────────────────────────────────────────────────────────────
600 127
601     -
    128 + MEMORY EFFICIENCY (bytes per triple):
602 129
603     -
604     -
605     -
606     - ```
607     -
608     - **NOT to be confused with:**
609     - - ❌ **EmbeddingService** - That's for semantic similarity search (different feature)
610     - - ❌ **GraphDB** - That's for direct SPARQL queries (no natural language)
    130 + rust-kgdb │████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 24 bytes 🏆
    131 + RDFox │████████████████░░░░░░░░░░░░░░░░░░░░░░░░│ 32 bytes
    132 + Apache Jena │████████████████████████████████████░░░░│ 50+ bytes
611 133
612     -
    134 + rust-kgdb uses 25% LESS memory than RDFox
613 135
614     -
615     - const { HyperMindAgent } = require('rust-kgdb')
    136 + ───────────────────────────────────────────────────────────────────────────────
616 137
617     -
618     -
    138 + ┌─────────────────────┬────────────────┬────────────────┬─────────────────┐
    139 + │ Metric │ rust-kgdb │ RDFox │ Advantage │
    140 + ├─────────────────────┼────────────────┼────────────────┼─────────────────┤
    141 + │ Lookup Speed │ 2.78 µs │ 97.3 µs │ 35x faster │
    142 + │ Memory per Triple │ 24 bytes │ 32 bytes │ 25% less │
    143 + │ Bulk Insert │ 146K/sec │ 200K/sec │ Competitive │
    144 + │ SIMD Speedup │ 44.5% avg │ N/A │ Unique │
    145 + └─────────────────────┴────────────────┴────────────────┴─────────────────┘
    146 + ═══════════════════════════════════════════════════════════════════════════════
619 147   ```
620 148
621 149   ---
622 150
623     -
    151 + ## Complete Example: Fraud Detection Agent
624 152
625     - -
626     - - **Category Theory**: Tools as morphisms with composable guarantees
627     - - **Neural Planning**: LLM-based planning (Claude, GPT-4o)
628     - - **Symbolic Execution**: rust-kgdb knowledge graph operations
    153 + Real-world fraud detection with embeddings and full pipeline.
629 154
630     -
    155 + ```javascript
    156 + const { GraphDB, GraphFrame, EmbeddingService, DatalogProgram } = require('rust-kgdb')
    157 +
    158 + // ═══════════════════════════════════════════════════════════════════════════
    159 + // FRAUD DETECTION AGENT - Complete Real-World Pipeline
    160 + // ═══════════════════════════════════════════════════════════════════════════
    161 +
    162 + async function runFraudDetection() {
    163 + console.log('╔═══════════════════════════════════════════════════════════╗')
    164 + console.log('║ FRAUD DETECTION AGENT - HyperMind Framework ║')
    165 + console.log('╠═══════════════════════════════════════════════════════════╣')
    166 + console.log('║ Data: Panama Papers Style Offshore Entity Network ║')
    167 + console.log('║ Analysis: Circular Payments, Shell Companies, Smurfing ║')
    168 + console.log('╚═══════════════════════════════════════════════════════════╝\n')
    169 +
    170 + // ─────────────────────────────────────────────────────────────────────────
    171 + // STEP 1: Initialize Knowledge Graph with Real Financial Data
    172 + // ─────────────────────────────────────────────────────────────────────────
    173 +
    174 + const db = new GraphDB('http://fraud.detection/kb')
    175 +
    176 + // Load Panama Papers-style offshore entity data
    177 + db.loadTtl(`
    178 + @prefix fraud: <http://fraud.detection/ontology/> .
    179 + @prefix icij: <http://icij.org/offshore/> .
    180 + @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    181 +
    182 + # ══════════════════════════════════════════════════════════════════════
    183 + # OFFSHORE ENTITIES (Shell Company Network)
    184 + # ══════════════════════════════════════════════════════════════════════
    185 +
    186 + icij:entity001 a fraud:OffshoreEntity ;
    187 + fraud:name "Mossack Holdings Ltd" ;
    188 + fraud:jurisdiction "Panama" ;
|
|
189
|
+
fraud:incorporationDate "2010-03-15"^^xsd:date ;
|
|
190
|
+
fraud:registeredAgent "Mossack Fonseca" ;
|
|
191
|
+
fraud:riskScore "0.85"^^xsd:decimal ;
|
|
192
|
+
fraud:linkedTo icij:entity002 .
|
|
193
|
+
|
|
194
|
+
icij:entity002 a fraud:OffshoreEntity ;
|
|
195
|
+
fraud:name "British Virgin Islands Trust" ;
|
|
196
|
+
fraud:jurisdiction "BVI" ;
|
|
197
|
+
fraud:incorporationDate "2011-07-22"^^xsd:date ;
|
|
198
|
+
fraud:registeredAgent "Portcullis" ;
|
|
199
|
+
fraud:riskScore "0.72"^^xsd:decimal ;
|
|
200
|
+
fraud:linkedTo icij:entity003 .
|
|
201
|
+
|
|
202
|
+
icij:entity003 a fraud:OffshoreEntity ;
|
|
203
|
+
fraud:name "Cayman Investments LLC" ;
|
|
204
|
+
fraud:jurisdiction "Cayman Islands" ;
|
|
205
|
+
fraud:incorporationDate "2012-01-10"^^xsd:date ;
|
|
206
|
+
fraud:registeredAgent "Ugland House" ;
|
|
207
|
+
fraud:riskScore "0.91"^^xsd:decimal ;
|
|
208
|
+
fraud:linkedTo icij:entity001 . # CIRCULAR LINK - Red Flag!
|
|
209
|
+
|
|
210
|
+
icij:entity004 a fraud:OffshoreEntity ;
|
|
211
|
+
fraud:name "Delaware Holdings Corp" ;
|
|
212
|
+
fraud:jurisdiction "Delaware" ;
|
|
213
|
+
fraud:incorporationDate "2015-05-20"^^xsd:date ;
|
|
214
|
+
fraud:registeredAgent "CT Corporation" ;
|
|
215
|
+
fraud:riskScore "0.45"^^xsd:decimal .
|
|
216
|
+
|
|
217
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
218
|
+
# TRANSACTION NETWORK (Money Flow Pattern)
|
|
219
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
220
|
+
|
|
221
|
+
fraud:tx001 a fraud:Transaction ;
|
|
222
|
+
fraud:transactionId "TXN-2024-001" ;
|
|
223
|
+
fraud:sender icij:entity001 ;
|
|
224
|
+
fraud:receiver icij:entity002 ;
|
|
225
|
+
fraud:amount "2500000"^^xsd:decimal ;
|
|
226
|
+
fraud:currency "USD" ;
|
|
227
|
+
fraud:timestamp "2024-01-15T10:30:00Z"^^xsd:dateTime ;
|
|
228
|
+
fraud:description "Consulting Services" .
|
|
229
|
+
|
|
230
|
+
fraud:tx002 a fraud:Transaction ;
|
|
231
|
+
fraud:transactionId "TXN-2024-002" ;
|
|
232
|
+
fraud:sender icij:entity002 ;
|
|
233
|
+
fraud:receiver icij:entity003 ;
|
|
234
|
+
fraud:amount "2450000"^^xsd:decimal ;
|
|
235
|
+
fraud:currency "USD" ;
|
|
236
|
+
fraud:timestamp "2024-01-15T14:45:00Z"^^xsd:dateTime ;
|
|
237
|
+
fraud:description "Investment Management" .
|
|
238
|
+
|
|
239
|
+
fraud:tx003 a fraud:Transaction ;
|
|
240
|
+
fraud:transactionId "TXN-2024-003" ;
|
|
241
|
+
fraud:sender icij:entity003 ;
|
|
242
|
+
fraud:receiver icij:entity001 ;
|
|
243
|
+
fraud:amount "2400000"^^xsd:decimal ;
|
|
244
|
+
fraud:currency "USD" ;
|
|
245
|
+
fraud:timestamp "2024-01-15T18:00:00Z"^^xsd:dateTime ;
|
|
246
|
+
fraud:description "Loan Repayment" . # CIRCULAR FLOW - Layering!
|
|
247
|
+
|
|
248
|
+
fraud:tx004 a fraud:Transaction ;
|
|
249
|
+
fraud:transactionId "TXN-2024-004" ;
|
|
250
|
+
fraud:sender icij:entity001 ;
|
|
251
|
+
fraud:receiver icij:entity004 ;
|
|
252
|
+
fraud:amount "150000"^^xsd:decimal ;
|
|
253
|
+
fraud:currency "USD" ;
|
|
254
|
+
fraud:timestamp "2024-01-20T09:00:00Z"^^xsd:dateTime ;
|
|
255
|
+
fraud:description "Equipment Purchase" . # Legitimate
|
|
256
|
+
|
|
257
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
258
|
+
# BENEFICIAL OWNERS (Hidden Ownership)
|
|
259
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
260
|
+
|
|
261
|
+
fraud:person001 a fraud:BeneficialOwner ;
|
|
262
|
+
fraud:name "John Smith" ;
|
|
263
|
+
fraud:nationality "Unknown" ;
|
|
264
|
+
fraud:pep true ; # Politically Exposed Person
|
|
265
|
+
fraud:ownerOf icij:entity001 , icij:entity002 , icij:entity003 .
|
|
266
|
+
|
|
267
|
+
fraud:person002 a fraud:BeneficialOwner ;
|
|
268
|
+
fraud:name "Jane Doe" ;
|
|
269
|
+
fraud:nationality "USA" ;
|
|
270
|
+
fraud:pep false ;
|
|
271
|
+
fraud:ownerOf icij:entity004 .
|
|
272
|
+
|
|
273
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
274
|
+
# INSURANCE CLAIMS (Potential Insurance Fraud)
|
|
275
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
276
|
+
|
|
277
|
+
fraud:claim001 a fraud:InsuranceClaim ;
|
|
278
|
+
fraud:claimId "CLM-2024-0001" ;
|
|
279
|
+
fraud:policyNumber "POL-2024-000123" ;
|
|
280
|
+
fraud:claimant icij:entity001 ;
|
|
281
|
+
fraud:claimAmount "750000"^^xsd:decimal ;
|
|
282
|
+
fraud:claimType "BusinessInterruption" ;
|
|
283
|
+
fraud:filingDate "2024-02-01"^^xsd:date ;
|
|
284
|
+
fraud:status "UnderReview" .
|
|
285
|
+
|
|
286
|
+
fraud:claim002 a fraud:InsuranceClaim ;
|
|
287
|
+
fraud:claimId "CLM-2024-0002" ;
|
|
288
|
+
fraud:policyNumber "POL-2024-000124" ;
|
|
289
|
+
fraud:claimant icij:entity002 ;
|
|
290
|
+
fraud:claimAmount "820000"^^xsd:decimal ;
|
|
291
|
+
fraud:claimType "PropertyDamage" ;
|
|
292
|
+
fraud:filingDate "2024-02-05"^^xsd:date ;
|
|
293
|
+
fraud:status "Approved" .
|
|
294
|
+
`, null)
|
|
295
|
+
|
|
296
|
+
console.log('✅ Loaded knowledge graph: 4 entities, 4 transactions, 2 owners, 2 claims\n')
|
|
297
|
+
|
|
298
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
299
|
+
// STEP 2: Initialize Embeddings for Semantic Similarity
|
|
300
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
301
|
+
|
|
302
|
+
console.log('📊 Initializing Embedding Service for Semantic Analysis...\n')
|
|
303
|
+
|
|
304
|
+
const embeddingService = new EmbeddingService()
|
|
305
|
+
|
|
306
|
+
  // Store entity embeddings (384-dimensional). The vectors generated below are synthetic placeholders.
|
|
307
|
+
// In production, these would come from a transformer model like SBERT
|
|
308
|
+
const generateEmbedding = (seed) => {
|
|
309
|
+
const vec = new Array(384).fill(0).map((_, i) => Math.sin(seed * 0.1 + i * 0.01) * 0.5)
|
|
310
|
+
return vec
|
|
311
|
+
}
|
|
631
312
|
|
|
632
|
-
|
|
313
|
+
embeddingService.storeVector('icij:entity001', generateEmbedding(1))
|
|
314
|
+
embeddingService.storeVector('icij:entity002', generateEmbedding(1.05)) // Similar to entity001
|
|
315
|
+
embeddingService.storeVector('icij:entity003', generateEmbedding(1.02)) // Similar to entity001
|
|
316
|
+
embeddingService.storeVector('icij:entity004', generateEmbedding(5)) // Different pattern
|
|
317
|
+
|
|
318
|
+
console.log('✅ Stored embeddings for 4 entities\n')
|
|
319
|
+
|
|
320
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
321
|
+
// STEP 3: Detect Circular Payment Patterns (Money Laundering)
|
|
322
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
323
|
+
|
|
324
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
325
|
+
console.log(' ANALYSIS 1: Circular Payment Detection (Layering)')
|
|
326
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
327
|
+
|
|
328
|
+
const circularPayments = db.querySelect(`
|
|
329
|
+
PREFIX fraud: <http://fraud.detection/ontology/>
|
|
330
|
+
SELECT ?entity1 ?entity2 ?entity3 ?amount1 ?amount2 ?amount3 WHERE {
|
|
331
|
+
?tx1 fraud:sender ?entity1 ;
|
|
332
|
+
fraud:receiver ?entity2 ;
|
|
333
|
+
fraud:amount ?amount1 .
|
|
334
|
+
?tx2 fraud:sender ?entity2 ;
|
|
335
|
+
fraud:receiver ?entity3 ;
|
|
336
|
+
fraud:amount ?amount2 .
|
|
337
|
+
?tx3 fraud:sender ?entity3 ;
|
|
338
|
+
fraud:receiver ?entity1 ;
|
|
339
|
+
fraud:amount ?amount3 .
|
|
340
|
+
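      # Report each cycle once: keep only the rotation that starts at the smallest IRI
      FILTER(STR(?entity1) < STR(?entity2) && STR(?entity1) < STR(?entity3))
    }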
|
|
341
|
+
`)
|
|
342
|
+
|
|
343
|
+
console.log(' 🔍 SPARQL Query: Find A → B → C → A payment cycles')
|
|
344
|
+
console.log(' 📊 Results:')
|
|
345
|
+
|
|
346
|
+
if (circularPayments.length > 0) {
|
|
347
|
+
for (const row of circularPayments) {
|
|
348
|
+
const total = parseFloat(row.bindings.amount1) +
|
|
349
|
+
parseFloat(row.bindings.amount2) +
|
|
350
|
+
parseFloat(row.bindings.amount3)
|
|
351
|
+
console.log(`
|
|
352
|
+
┌────────────────────────────────────────────────────────────────┐
|
|
353
|
+
│ 🚨 CIRCULAR PAYMENT DETECTED - HIGH RISK │
|
|
354
|
+
├────────────────────────────────────────────────────────────────┤
|
|
355
|
+
│ Entity A: ${row.bindings.entity1.split('/').pop().padEnd(45)}│
|
|
356
|
+
│ Entity B: ${row.bindings.entity2.split('/').pop().padEnd(45)}│
|
|
357
|
+
│ Entity C: ${row.bindings.entity3.split('/').pop().padEnd(45)}│
|
|
358
|
+
├────────────────────────────────────────────────────────────────┤
|
|
359
|
+
│ Flow: A → B: $${Number(row.bindings.amount1).toLocaleString().padEnd(20)} │
|
|
360
|
+
│ B → C: $${Number(row.bindings.amount2).toLocaleString().padEnd(20)} │
|
|
361
|
+
│ C → A: $${Number(row.bindings.amount3).toLocaleString().padEnd(20)} │
|
|
362
|
+
├────────────────────────────────────────────────────────────────┤
|
|
363
|
+
│ Total Circulated: $${total.toLocaleString().padEnd(38)}│
|
|
364
|
+
│ Risk Level: CRITICAL │
|
|
365
|
+
│ Pattern: Classic Layering (Money Laundering Stage 2) │
|
|
366
|
+
└────────────────────────────────────────────────────────────────┘`)
|
|
367
|
+
}
|
|
368
|
+
}
|
|
369
|
+
|
|
370
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
371
|
+
// STEP 4: Identify Shell Company Networks with GraphFrames
|
|
372
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
373
|
+
|
|
374
|
+
console.log('\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
375
|
+
console.log(' ANALYSIS 2: Shell Company Network Analysis (GraphFrames)')
|
|
376
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
377
|
+
|
|
378
|
+
// Create graph from transaction network
|
|
379
|
+
const graph = new GraphFrame(
|
|
380
|
+
JSON.stringify([
|
|
381
|
+
{ id: 'entity001' },
|
|
382
|
+
{ id: 'entity002' },
|
|
383
|
+
{ id: 'entity003' },
|
|
384
|
+
{ id: 'entity004' }
|
|
385
|
+
]),
|
|
386
|
+
JSON.stringify([
|
|
387
|
+
{ src: 'entity001', dst: 'entity002' },
|
|
388
|
+
{ src: 'entity002', dst: 'entity003' },
|
|
389
|
+
{ src: 'entity003', dst: 'entity001' }, // Circular
|
|
390
|
+
{ src: 'entity001', dst: 'entity004' }
|
|
391
|
+
])
|
|
392
|
+
)
|
|
393
|
+
|
|
394
|
+
// PageRank identifies central nodes (potential money mules)
|
|
395
|
+
const pageRank = JSON.parse(graph.pageRank(0.15, 20))
|
|
396
|
+
console.log(' 📊 PageRank Analysis (Higher = More Central):')
|
|
397
|
+
console.log(' ┌──────────────────────┬────────────────┬──────────────────┐')
|
|
398
|
+
console.log(' │ Entity │ PageRank │ Risk Assessment │')
|
|
399
|
+
console.log(' ├──────────────────────┼────────────────┼──────────────────┤')
|
|
400
|
+
|
|
401
|
+
const sortedRanks = Object.entries(pageRank).sort((a, b) => b[1] - a[1])
|
|
402
|
+
for (const [entity, rank] of sortedRanks) {
|
|
403
|
+
const riskLevel = rank > 0.3 ? 'HIGH' : rank > 0.2 ? 'MEDIUM' : 'LOW'
|
|
404
|
+
const emoji = rank > 0.3 ? '🚨' : rank > 0.2 ? '⚠️' : '✅'
|
|
405
|
+
console.log(` │ ${entity.padEnd(20)} │ ${rank.toFixed(4).padEnd(14)} │ ${emoji} ${riskLevel.padEnd(13)} │`)
|
|
406
|
+
}
|
|
407
|
+
console.log(' └──────────────────────┴────────────────┴──────────────────┘')
|
|
408
|
+
|
|
409
|
+
// Connected Components (identify isolated networks)
|
|
410
|
+
const components = JSON.parse(graph.connectedComponents())
|
|
411
|
+
console.log('\n 📊 Connected Components:')
|
|
412
|
+
console.log(` Found ${Object.keys(components).length} entities in connected network`)
|
|
413
|
+
|
|
414
|
+
// Triangle Count (closed loops = risk)
|
|
415
|
+
const triangles = graph.triangleCount()
|
|
416
|
+
console.log(`\n 📊 Triangle Count: ${triangles}`)
|
|
417
|
+
console.log(` ${triangles > 0 ? '🚨 Triangles indicate potential circular transactions!' : '✅ No triangular patterns'}`)
|
|
418
|
+
|
|
419
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
420
|
+
// STEP 5: Semantic Similarity Analysis (Find Similar Fraud Patterns)
|
|
421
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
422
|
+
|
|
423
|
+
console.log('\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
424
|
+
console.log(' ANALYSIS 3: Semantic Similarity (Embedding Search)')
|
|
425
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
426
|
+
|
|
427
|
+
// Find entities similar to entity001 (known shell company)
|
|
428
|
+
const similar = JSON.parse(embeddingService.findSimilar('icij:entity001', 5, 0.5))
|
|
429
|
+
|
|
430
|
+
console.log(' 🔍 Entities Similar to "Mossack Holdings Ltd" (Known Shell):')
|
|
431
|
+
console.log(' ┌──────────────────────────┬────────────────┬──────────────────┐')
|
|
432
|
+
console.log(' │ Entity │ Similarity │ Action │')
|
|
433
|
+
console.log(' ├──────────────────────────┼────────────────┼──────────────────┤')
|
|
434
|
+
|
|
435
|
+
for (const item of similar) {
|
|
436
|
+
if (item.id !== 'icij:entity001') {
|
|
437
|
+
const action = item.similarity > 0.9 ? '🚨 INVESTIGATE' : item.similarity > 0.7 ? '⚠️ MONITOR' : '✅ LOW RISK'
|
|
438
|
+
console.log(` │ ${item.id.padEnd(24)} │ ${item.similarity.toFixed(4).padEnd(14)} │ ${action.padEnd(16)} │`)
|
|
439
|
+
}
|
|
440
|
+
}
|
|
441
|
+
console.log(' └──────────────────────────┴────────────────┴──────────────────┘')
|
|
442
|
+
|
|
443
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
444
|
+
// STEP 6: Datalog Reasoning for Transitive Risk Propagation
|
|
445
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
446
|
+
|
|
447
|
+
console.log('\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
448
|
+
console.log(' ANALYSIS 4: Datalog Reasoning (Risk Propagation)')
|
|
449
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
450
|
+
|
|
451
|
+
const datalog = new DatalogProgram()
|
|
452
|
+
|
|
453
|
+
// Add transaction facts
|
|
454
|
+
datalog.addFact(JSON.stringify({ predicate: 'transacts_with', terms: ['entity001', 'entity002'] }))
|
|
455
|
+
datalog.addFact(JSON.stringify({ predicate: 'transacts_with', terms: ['entity002', 'entity003'] }))
|
|
456
|
+
datalog.addFact(JSON.stringify({ predicate: 'transacts_with', terms: ['entity003', 'entity001'] }))
|
|
457
|
+
datalog.addFact(JSON.stringify({ predicate: 'high_risk', terms: ['entity001'] }))
|
|
458
|
+
|
|
459
|
+
// Recursive rule: risk propagates through transaction network
|
|
460
|
+
  // Base case: connected(X, Y) :- transacts_with(X, Y); recursive case: connected(X, Z) :- transacts_with(X, Y), connected(Y, Z)
|
|
461
|
+
datalog.addRule(JSON.stringify({
|
|
462
|
+
head: { predicate: 'connected', terms: ['?X', '?Y'] },
|
|
463
|
+
body: [{ predicate: 'transacts_with', terms: ['?X', '?Y'] }]
|
|
464
|
+
}))
|
|
465
|
+
|
|
466
|
+
datalog.addRule(JSON.stringify({
|
|
467
|
+
head: { predicate: 'connected', terms: ['?X', '?Z'] },
|
|
468
|
+
body: [
|
|
469
|
+
{ predicate: 'transacts_with', terms: ['?X', '?Y'] },
|
|
470
|
+
{ predicate: 'connected', terms: ['?Y', '?Z'] }
|
|
471
|
+
]
|
|
472
|
+
}))
|
|
473
|
+
|
|
474
|
+
// Risk propagation rule
|
|
475
|
+
datalog.addRule(JSON.stringify({
|
|
476
|
+
head: { predicate: 'at_risk', terms: ['?X'] },
|
|
477
|
+
body: [
|
|
478
|
+
{ predicate: 'connected', terms: ['?X', '?Y'] },
|
|
479
|
+
{ predicate: 'high_risk', terms: ['?Y'] }
|
|
480
|
+
]
|
|
481
|
+
}))
|
|
482
|
+
|
|
483
|
+
// Evaluate with semi-naive algorithm
|
|
484
|
+
datalog.evaluate()
|
|
485
|
+
|
|
486
|
+
console.log(' 📋 Datalog Rules Applied:')
|
|
487
|
+
console.log(' connected(X, Y) :- transacts_with(X, Y)')
|
|
488
|
+
console.log(' connected(X, Z) :- transacts_with(X, Y), connected(Y, Z)')
|
|
489
|
+
console.log(' at_risk(X) :- connected(X, Y), high_risk(Y)')
|
|
490
|
+
console.log('')
|
|
491
|
+
|
|
492
|
+
// Query entities at risk
|
|
493
|
+
const atRisk = datalog.query(JSON.stringify({
|
|
494
|
+
predicate: 'at_risk',
|
|
495
|
+
terms: ['?entity']
|
|
496
|
+
}))
|
|
497
|
+
|
|
498
|
+
console.log(' 🚨 Entities at Risk (via transitive connection to high-risk entity):')
|
|
499
|
+
const riskEntities = JSON.parse(atRisk)
|
|
500
|
+
for (const entity of riskEntities) {
|
|
501
|
+
console.log(` - ${entity}`)
|
|
502
|
+
}
|
|
503
|
+
|
|
504
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
505
|
+
// FINAL REPORT
|
|
506
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
507
|
+
|
|
508
|
+
console.log('\n\n═══════════════════════════════════════════════════════════════')
|
|
509
|
+
console.log(' FRAUD DETECTION REPORT')
|
|
510
|
+
console.log('═══════════════════════════════════════════════════════════════')
|
|
511
|
+
console.log(`
|
|
633
512
|
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
634
|
-
│
|
|
513
|
+
│ EXECUTIVE SUMMARY │
|
|
635
514
|
├─────────────────────────────────────────────────────────────────────────────┤
|
|
636
515
|
│ │
|
|
637
|
-
│
|
|
638
|
-
│
|
|
639
|
-
│
|
|
640
|
-
│
|
|
641
|
-
│
|
|
642
|
-
|
|
643
|
-
│
|
|
644
|
-
|
|
645
|
-
│
|
|
646
|
-
│
|
|
647
|
-
│
|
|
648
|
-
│
|
|
649
|
-
│
|
|
650
|
-
│ SPARQL Query: "SELECT ?x WHERE { ?x a ub:Professor }" │
|
|
651
|
-
│ │ │
|
|
652
|
-
│ ▼ │
|
|
653
|
-
│ rust-kgdb Cluster: Executes query, returns results │
|
|
654
|
-
│ │ │
|
|
655
|
-
│ ▼ │
|
|
656
|
-
│ Results: [{ bindings: { x: "http://..." } }, ...] │
|
|
516
|
+
│ Analysis Date: ${new Date().toISOString().split('T')[0]} │
|
|
517
|
+
│ Entities Analyzed: 4 │
|
|
518
|
+
│ Transactions: 4 │
|
|
519
|
+
│ Total Value: $7,500,000 │
|
|
520
|
+
│ │
|
|
521
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
522
|
+
│ FINDINGS │
|
|
523
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
524
|
+
│ │
|
|
525
|
+
│ 🚨 CRITICAL: Circular payment pattern detected │
|
|
526
|
+
│ - 3 entities involved in layering scheme │
|
|
527
|
+
│ - Total circulated: $7,350,000 │
|
|
528
|
+
│ - Pattern matches classic money laundering (Stage 2) │
|
|
657
529
|
│ │
|
|
530
|
+
│ ⚠️ HIGH: Shell company network identified │
|
|
531
|
+
│ - PageRank analysis shows entity001 as central node │
|
|
532
|
+
│ - 1 triangle (closed loop) detected │
|
|
533
|
+
│ │
|
|
534
|
+
│ ⚠️ HIGH: Common beneficial owner (PEP) │
|
|
535
|
+
│ - John Smith owns 3 linked offshore entities │
|
|
536
|
+
│ - Politically Exposed Person flag │
|
|
537
|
+
│ │
|
|
538
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
539
|
+
│ RECOMMENDED ACTIONS │
|
|
540
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
541
|
+
│ │
|
|
542
|
+
│ 1. File SAR (Suspicious Activity Report) for circular transactions │
|
|
543
|
+
│ 2. Enhanced due diligence on John Smith (PEP) │
|
|
544
|
+
│ 3. Freeze accounts pending investigation │
|
|
545
|
+
│ 4. Notify compliance team immediately │
|
|
546
|
+
│ │
|
|
547
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
548
|
+
│ Risk Score: 0.92 / 1.00 (CRITICAL) │
|
|
549
|
+
│ Confidence: 0.95 │
|
|
658
550
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
659
|
-
|
|
660
|
-
|
|
661
|
-
### Mode 1: Mock Mode (No API Keys Required)
|
|
551
|
+
`)
|
|
662
552
|
|
|
663
|
-
|
|
553
|
+
return {
|
|
554
|
+
    riskScore: 0.92, // illustrative constants (this and confidence); not derived from the analyses above
|
|
555
|
+
confidence: 0.95,
|
|
556
|
+
findings: {
|
|
557
|
+
circularPayments: circularPayments.length,
|
|
558
|
+
triangles: triangles,
|
|
559
|
+
entitiesAtRisk: riskEntities.length
|
|
560
|
+
}
|
|
561
|
+
}
|
|
562
|
+
}
|
|
664
563
|
|
|
665
|
-
|
|
666
|
-
|
|
667
|
-
|
|
668
|
-
// Spawn agent with mock model - NO API KEYS NEEDED
|
|
669
|
-
const agent = await HyperMindAgent.spawn({
|
|
670
|
-
name: 'test-agent',
|
|
671
|
-
model: 'mock', // Uses pattern matching, not LLM
|
|
672
|
-
tools: ['kg.sparql.query'],
|
|
673
|
-
endpoint: 'http://localhost:30080' // Your rust-kgdb endpoint
|
|
674
|
-
})
|
|
675
|
-
|
|
676
|
-
// Ask a question (pattern-matched to LUBM queries)
|
|
677
|
-
const result = await agent.call('Find all professors in the database')
|
|
678
|
-
|
|
679
|
-
console.log(result.success) // true
|
|
680
|
-
console.log(result.sparql) // "PREFIX ub: <...> SELECT ?x WHERE { ?x a ub:Professor }"
|
|
681
|
-
console.log(result.results) // Query results from your database
|
|
564
|
+
// Run the analysis
|
|
565
|
+
runFraudDetection().catch(console.error)
|
|
682
566
|
```
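
To make the Datalog step less opaque, here is a minimal plain-JavaScript sketch of what the three rules derive from the four facts in the example. It uses naive fixpoint iteration (re-applying the rules until nothing new appears), not the semi-naive algorithm the example mentions, and it does not use rust-kgdb. `deriveAtRisk` and its input shape are hypothetical names chosen for illustration.

```javascript
// Naive bottom-up evaluation of:
//   connected(X, Y) :- transacts_with(X, Y)
//   connected(X, Z) :- transacts_with(X, Y), connected(Y, Z)
//   at_risk(X)      :- connected(X, Y), high_risk(Y)
// Hypothetical helper for illustration; not part of the rust-kgdb API.
function deriveAtRisk(transactsWith, highRisk) {
  const key = (a, b) => `${a}->${b}`

  // Base case: every direct transaction is a connection.
  const connected = new Map(transactsWith.map(([x, y]) => [key(x, y), [x, y]]))

  // Recursive case: repeat until a full pass adds no new pair (fixpoint).
  let changed = true
  while (changed) {
    changed = false
    for (const [x, y] of transactsWith) {
      // Iterate over a snapshot so the map can grow during the pass.
      for (const [y2, z] of [...connected.values()]) {
        if (y === y2 && !connected.has(key(x, z))) {
          connected.set(key(x, z), [x, z])
          changed = true
        }
      }
    }
  }

  // at_risk(X) holds when X is connected to some high-risk Y.
  const risky = new Set(highRisk)
  const atRisk = new Set()
  for (const [x, y] of connected.values()) {
    if (risky.has(y)) atRisk.add(x)
  }
  return { connected: [...connected.values()], atRisk: [...atRisk].sort() }
}

const result = deriveAtRisk(
  [['entity001', 'entity002'], ['entity002', 'entity003'], ['entity003', 'entity001']],
  ['entity001']
)
console.log(result.atRisk)
```

Because the three entities form a cycle, each of them reaches `entity001`, so all three are derived as at risk. `entity004` has no `transacts_with` fact in the Datalog program and is therefore not derived.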
|
|
683
567
|
|
|
684
|
-
|
|
685
|
-
| Question Pattern | Generated SPARQL |
|
|
686
|
-
|-----------------|------------------|
|
|
687
|
-
| "Find all professors..." | `SELECT ?x WHERE { ?x a ub:Professor }` |
|
|
688
|
-
| "List all graduate students" | `SELECT ?x WHERE { ?x a ub:GraduateStudent }` |
|
|
689
|
-
| "How many courses..." | `SELECT (COUNT(?x) AS ?count) WHERE { ?x a ub:Course }` |
|
|
690
|
-
| "Find students and their advisors" | `SELECT ?student ?advisor WHERE { ?student ub:advisor ?advisor }` |
|
|
568
|
+
---
|
|
691
569
|
|
|
692
|
-
|
|
570
|
+
## Complete Example: Underwriting Agent
|
|
693
571
|
|
|
694
|
-
|
|
572
|
+
An end-to-end commercial property underwriting example over sample data, combining risk assessment with embedding-based matching against historical policies.
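
As a rough sketch of the matching idea used in the full example further below: historical policies are compared with the application by vector similarity, and the example accumulates a similarity-weighted premium sum together with the sum of the weights, which suggests a weighted average. The sketch assumes cosine similarity (the README does not say which metric `findSimilar` uses) and tiny three-element feature vectors instead of 384-dimensional embeddings. `cosineSimilarity` and `weightedPremium` are hypothetical helpers, not rust-kgdb functions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Similarity-weighted average premium: sum(premium * w) / sum(w).
function weightedPremium(applicationVec, policies) {
  let weightedSum = 0
  let weightTotal = 0
  for (const p of policies) {
    const w = cosineSimilarity(applicationVec, p.vector)
    weightedSum += p.premium * w
    weightTotal += w
  }
  return weightedSum / weightTotal
}

// Features: [revenue / 10M, employees / 200, lossRatio], the same
// normalisation that policyToVector applies in the example.
const policies = [
  { id: 'policy001', premium: 32500, vector: [0.5, 0.75, 0.45] },
  { id: 'policy002', premium: 28000, vector: [0.45, 0.6, 0.32] },
  { id: 'policy003', premium: 18500, vector: [0.8, 0.25, 0.15] }
]
const applicationVec = [0.55, 0.875, 0.40]

const benchmark = weightedPremium(applicationVec, policies)
console.log(`Benchmark premium: $${Math.round(benchmark).toLocaleString()}`)
```

Since every weight is positive here, the benchmark always lands between the lowest and highest historical premium; policies that resemble the application more closely pull it toward their own premium.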
|
|
695
573
|
|
|
696
|
-
```
|
|
697
|
-
|
|
698
|
-
|
|
699
|
-
|
|
700
|
-
|
|
574
|
+
```javascript
|
|
575
|
+
const { GraphDB, EmbeddingService, DatalogProgram } = require('rust-kgdb')
|
|
576
|
+
|
|
577
|
+
// ═══════════════════════════════════════════════════════════════════════════
|
|
578
|
+
// INSURANCE UNDERWRITING AGENT - Complete Real-World Pipeline
|
|
579
|
+
// ═══════════════════════════════════════════════════════════════════════════
|
|
580
|
+
|
|
581
|
+
async function runUnderwriting() {
|
|
582
|
+
console.log('╔═══════════════════════════════════════════════════════════╗')
|
|
583
|
+
console.log('║ UNDERWRITING AGENT - HyperMind Framework ║')
|
|
584
|
+
console.log('╠═══════════════════════════════════════════════════════════╣')
|
|
585
|
+
console.log('║ Analysis: Risk Assessment, Premium Calculation ║')
|
|
586
|
+
console.log('║ Data: Commercial Property Insurance Application ║')
|
|
587
|
+
console.log('╚═══════════════════════════════════════════════════════════╝\n')
|
|
588
|
+
|
|
589
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
590
|
+
// STEP 1: Load Knowledge Base (Historical Policies + Risk Models)
|
|
591
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
592
|
+
|
|
593
|
+
const db = new GraphDB('http://underwriting.ai/kb')
|
|
594
|
+
|
|
595
|
+
db.loadTtl(`
|
|
596
|
+
@prefix uw: <http://underwriting.ai/ontology/> .
|
|
597
|
+
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
|
|
598
|
+
|
|
599
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
600
|
+
# RISK MODELS (Actuarial Rules)
|
|
601
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
602
|
+
|
|
603
|
+
uw:propertyRiskModel a uw:RiskModel ;
|
|
604
|
+
uw:modelName "Commercial Property Risk" ;
|
|
605
|
+
uw:baseRate "0.0025"^^xsd:decimal ;
|
|
606
|
+
uw:factors "location,buildingAge,constructionType,occupancyClass" .
|
|
607
|
+
|
|
608
|
+
uw:liabilityRiskModel a uw:RiskModel ;
|
|
609
|
+
uw:modelName "General Liability Risk" ;
|
|
610
|
+
uw:baseRate "0.0015"^^xsd:decimal ;
|
|
611
|
+
uw:factors "industryCode,revenue,employeeCount,claimsHistory" .
|
|
612
|
+
|
|
613
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
614
|
+
# RISK FACTORS (Location-Based)
|
|
615
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
616
|
+
|
|
617
|
+
uw:california a uw:Location ;
|
|
618
|
+
uw:earthquakeRisk "0.35"^^xsd:decimal ;
|
|
619
|
+
uw:wildfireRisk "0.28"^^xsd:decimal ;
|
|
620
|
+
uw:floodRisk "0.12"^^xsd:decimal ;
|
|
621
|
+
uw:baseMultiplier "1.45"^^xsd:decimal .
|
|
622
|
+
|
|
623
|
+
uw:texas a uw:Location ;
|
|
624
|
+
uw:hurricaneRisk "0.22"^^xsd:decimal ;
|
|
625
|
+
uw:tornadoRisk "0.18"^^xsd:decimal ;
|
|
626
|
+
uw:floodRisk "0.25"^^xsd:decimal ;
|
|
627
|
+
uw:baseMultiplier "1.25"^^xsd:decimal .
|
|
628
|
+
|
|
629
|
+
uw:newYork a uw:Location ;
|
|
630
|
+
uw:earthquakeRisk "0.05"^^xsd:decimal ;
|
|
631
|
+
uw:terrorRisk "0.15"^^xsd:decimal ;
|
|
632
|
+
uw:floodRisk "0.18"^^xsd:decimal ;
|
|
633
|
+
uw:baseMultiplier "1.35"^^xsd:decimal .
|
|
634
|
+
|
|
635
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
636
|
+
# HISTORICAL POLICIES (For Premium Benchmarking)
|
|
637
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
638
|
+
|
|
639
|
+
uw:policy001 a uw:HistoricalPolicy ;
|
|
640
|
+
uw:industry "Manufacturing" ;
|
|
641
|
+
uw:location uw:california ;
|
|
642
|
+
uw:revenue "5000000"^^xsd:decimal ;
|
|
643
|
+
uw:employees "150"^^xsd:integer ;
|
|
644
|
+
uw:premium "32500"^^xsd:decimal ;
|
|
645
|
+
uw:coverage "2000000"^^xsd:decimal ;
|
|
646
|
+
uw:lossRatio "0.45"^^xsd:decimal ;
|
|
647
|
+
uw:claimsCount "2"^^xsd:integer .
|
|
648
|
+
|
|
649
|
+
uw:policy002 a uw:HistoricalPolicy ;
|
|
650
|
+
uw:industry "Manufacturing" ;
|
|
651
|
+
uw:location uw:texas ;
|
|
652
|
+
uw:revenue "4500000"^^xsd:decimal ;
|
|
653
|
+
uw:employees "120"^^xsd:integer ;
|
|
654
|
+
uw:premium "28000"^^xsd:decimal ;
|
|
655
|
+
uw:coverage "1500000"^^xsd:decimal ;
|
|
656
|
+
uw:lossRatio "0.32"^^xsd:decimal ;
|
|
657
|
+
uw:claimsCount "1"^^xsd:integer .
|
|
658
|
+
|
|
659
|
+
uw:policy003 a uw:HistoricalPolicy ;
|
|
660
|
+
uw:industry "Technology" ;
|
|
661
|
+
uw:location uw:california ;
|
|
662
|
+
uw:revenue "8000000"^^xsd:decimal ;
|
|
663
|
+
uw:employees "50"^^xsd:integer ;
|
|
664
|
+
uw:premium "18500"^^xsd:decimal ;
|
|
665
|
+
uw:coverage "3000000"^^xsd:decimal ;
|
|
666
|
+
uw:lossRatio "0.15"^^xsd:decimal ;
|
|
667
|
+
uw:claimsCount "0"^^xsd:integer .
|
|
668
|
+
|
|
669
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
670
|
+
# NEW APPLICATION (To Be Underwritten)
|
|
671
|
+
# ══════════════════════════════════════════════════════════════════════
|
|
672
|
+
|
|
673
|
+
uw:application001 a uw:Application ;
|
|
674
|
+
uw:applicantName "Acme Manufacturing Corp" ;
|
|
675
|
+
uw:industry "Manufacturing" ;
|
|
676
|
+
uw:location uw:california ;
|
|
677
|
+
uw:revenue "5500000"^^xsd:decimal ;
|
|
678
|
+
uw:employees "175"^^xsd:integer ;
|
|
679
|
+
uw:buildingAge "15"^^xsd:integer ;
|
|
680
|
+
uw:constructionType "Masonry" ;
|
|
681
|
+
uw:sprinklerSystem true ;
|
|
682
|
+
uw:securitySystem true ;
|
|
683
|
+
uw:priorClaimsCount "1"^^xsd:integer ;
|
|
684
|
+
uw:requestedCoverage "2500000"^^xsd:decimal .
|
|
685
|
+
`, null)
|
|
686
|
+
|
|
687
|
+
console.log('✅ Loaded underwriting knowledge base\n')
|
|
688
|
+
|
|
689
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
690
|
+
// STEP 2: Initialize Embeddings for Similar Policy Matching
|
|
691
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
692
|
+
|
|
693
|
+
const embeddingService = new EmbeddingService()
|
|
694
|
+
|
|
695
|
+
// Generate policy embeddings based on features
|
|
696
|
+
// (In production: use trained model on policy features)
|
|
697
|
+
const policyToVector = (revenue, employees, lossRatio) => {
|
|
698
|
+
const normalized = [revenue / 10000000, employees / 200, lossRatio]
|
|
699
|
+
return new Array(384).fill(0).map((_, i) =>
|
|
700
|
+
Math.sin(normalized[0] * i * 0.1) +
|
|
701
|
+
Math.cos(normalized[1] * i * 0.2) +
|
|
702
|
+
normalized[2] * Math.sin(i * 0.05)
|
|
703
|
+
)
|
|
704
|
+
}
|
|
701
705
|
|
|
702
|
-
|
|
703
|
-
|
|
706
|
+
embeddingService.storeVector('policy001', policyToVector(5000000, 150, 0.45))
|
|
707
|
+
embeddingService.storeVector('policy002', policyToVector(4500000, 120, 0.32))
|
|
708
|
+
embeddingService.storeVector('policy003', policyToVector(8000000, 50, 0.15))
|
|
709
|
+
  embeddingService.storeVector('application001', policyToVector(5500000, 175, 0.40)) // 0.40 is an estimated loss ratio; the application data has none
|
|
710
|
+
|
|
711
|
+
console.log('✅ Stored embeddings for policy similarity matching\n')
|
|
712
|
+
|
|
713
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
714
|
+
// STEP 3: Query Application Details
|
|
715
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
716
|
+
|
|
717
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
718
|
+
console.log(' APPLICATION ANALYSIS')
|
|
719
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
720
|
+
|
|
721
|
+
const application = db.querySelect(`
|
|
722
|
+
PREFIX uw: <http://underwriting.ai/ontology/>
|
|
723
|
+
SELECT ?name ?industry ?revenue ?employees ?coverage ?priorClaims WHERE {
|
|
724
|
+
uw:application001 uw:applicantName ?name ;
|
|
725
|
+
uw:industry ?industry ;
|
|
726
|
+
uw:revenue ?revenue ;
|
|
727
|
+
uw:employees ?employees ;
|
|
728
|
+
uw:requestedCoverage ?coverage ;
|
|
729
|
+
uw:priorClaimsCount ?priorClaims .
|
|
730
|
+
}
|
|
731
|
+
`)[0]
|
|
732
|
+
|
|
733
|
+
console.log(' 📋 Application Details:')
|
|
734
|
+
console.log(' ┌─────────────────────────────────────────────────────────────┐')
|
|
735
|
+
console.log(` │ Applicant: ${application.bindings.name.padEnd(41)}│`)
|
|
736
|
+
console.log(` │ Industry: ${application.bindings.industry.padEnd(41)}│`)
|
|
737
|
+
console.log(` │ Revenue: $${Number(application.bindings.revenue).toLocaleString().padEnd(39)}│`)
|
|
738
|
+
console.log(` │ Employees: ${application.bindings.employees.padEnd(41)}│`)
|
|
739
|
+
console.log(` │ Coverage Req: $${Number(application.bindings.coverage).toLocaleString().padEnd(39)}│`)
|
|
740
|
+
console.log(` │ Prior Claims: ${application.bindings.priorClaims.padEnd(41)}│`)
|
|
741
|
+
console.log(' └─────────────────────────────────────────────────────────────┘')
|
|
742
|
+
|
|
743
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
744
|
+
// STEP 4: Find Similar Historical Policies (Embedding Search)
|
|
745
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
746
|
+
|
|
747
|
+
console.log('\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
748
|
+
console.log(' SIMILAR POLICY ANALYSIS (Embedding Similarity)')
|
|
749
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
750
|
+
|
|
751
|
+
const similarPolicies = JSON.parse(embeddingService.findSimilar('application001', 5, 0.3))
|
|
752
|
+
|
|
753
|
+
console.log(' 🔍 Most Similar Historical Policies:')
|
|
754
|
+
console.log(' ┌──────────────────┬────────────────┬─────────────────┬──────────────┐')
|
|
755
|
+
console.log(' │ Policy │ Similarity │ Premium │ Loss Ratio │')
|
|
756
|
+
console.log(' ├──────────────────┼────────────────┼─────────────────┼──────────────┤')
|
|
757
|
+
|
|
758
|
+
const policyData = {
|
|
759
|
+
policy001: { premium: 32500, lossRatio: 0.45 },
|
|
760
|
+
policy002: { premium: 28000, lossRatio: 0.32 },
|
|
761
|
+
policy003: { premium: 18500, lossRatio: 0.15 }
|
|
762
|
+
}
|
|
704
763
|
|
|
705
|
-
|
|
706
|
-
|
|
707
|
-
name: 'prod-agent',
|
|
708
|
-
model: 'claude-sonnet-4', // Real LLM - generates dynamic SPARQL
|
|
709
|
-
tools: ['kg.sparql.query', 'kg.motif.find'],
|
|
710
|
-
endpoint: 'http://localhost:30080'
|
|
711
|
-
})
|
|
764
|
+
let similarPremiumSum = 0
|
|
765
|
+
let similarCount = 0 // Sum of similarity weights (not a count); denominator of the weighted average
|
|
712
766
|
|
|
713
|
-
|
|
714
|
-
|
|
767
|
+
for (const item of similarPolicies) {
|
|
768
|
+
if (item.id !== 'application001' && policyData[item.id]) {
|
|
769
|
+
const p = policyData[item.id]
|
|
770
|
+
similarPremiumSum += p.premium * item.similarity
|
|
771
|
+
similarCount += item.similarity
|
|
772
|
+
console.log(` │ ${item.id.padEnd(16)} │ ${item.similarity.toFixed(4).padEnd(14)} │ $${p.premium.toLocaleString().padEnd(13)} │ ${(p.lossRatio * 100).toFixed(1)}% │`)
|
|
773
|
+
}
|
|
774
|
+
}
|
|
775
|
+
console.log(' └──────────────────┴────────────────┴─────────────────┴──────────────┘')
|
|
776
|
+
|
|
777
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
778
|
+
// STEP 5: Location Risk Analysis
|
|
779
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
780
|
+
|
|
781
|
+
console.log('\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
782
|
+
console.log(' LOCATION RISK ANALYSIS')
|
|
783
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
784
|
+
|
|
785
|
+
const locationRisk = db.querySelect(`
|
|
786
|
+
PREFIX uw: <http://underwriting.ai/ontology/>
|
|
787
|
+
SELECT ?earthquake ?wildfire ?flood ?multiplier WHERE {
|
|
788
|
+
uw:california uw:earthquakeRisk ?earthquake ;
|
|
789
|
+
uw:wildfireRisk ?wildfire ;
|
|
790
|
+
uw:floodRisk ?flood ;
|
|
791
|
+
uw:baseMultiplier ?multiplier .
|
|
792
|
+
}
|
|
793
|
+
`)[0]
|
|
794
|
+
|
|
795
|
+
console.log(' 📍 Location: California')
|
|
796
|
+
console.log(' ┌─────────────────────────────────────────────────────────────┐')
|
|
797
|
+
console.log(' │ Risk Factor │ Value │ Rating │')
|
|
798
|
+
console.log(' ├─────────────────────────────────────────────────────────────┤')
|
|
799
|
+
|
|
800
|
+
const riskBar = (val) => {
|
|
801
|
+
const filled = Math.max(0, Math.min(20, Math.round(parseFloat(val) * 20))) // Clamp to 0..20 so repeat() never gets a negative count
|
|
802
|
+
return '█'.repeat(filled) + '░'.repeat(20 - filled)
|
|
803
|
+
}
|
|
715
804
|
|
|
716
|
-
|
|
717
|
-
|
|
718
|
-
|
|
805
|
+
const earthquakeRisk = parseFloat(locationRisk.bindings.earthquake)
|
|
806
|
+
const wildfireRisk = parseFloat(locationRisk.bindings.wildfire)
|
|
807
|
+
const floodRisk = parseFloat(locationRisk.bindings.flood)
|
|
808
|
+
|
|
809
|
+
console.log(` │ Earthquake Risk │ ${(earthquakeRisk * 100).toFixed(0)}% │ ${riskBar(earthquakeRisk)} │`)
|
|
810
|
+
console.log(` │ Wildfire Risk │ ${(wildfireRisk * 100).toFixed(0)}% │ ${riskBar(wildfireRisk)} │`)
|
|
811
|
+
console.log(` │ Flood Risk │ ${(floodRisk * 100).toFixed(0)}% │ ${riskBar(floodRisk)} │`)
|
|
812
|
+
console.log(' ├─────────────────────────────────────────────────────────────┤')
|
|
813
|
+
console.log(` │ Base Multiplier │ ${locationRisk.bindings.multiplier}x │ Applied to premium │`)
|
|
814
|
+
console.log(' └─────────────────────────────────────────────────────────────┘')
|
|
815
|
+
|
|
816
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
817
|
+
// STEP 6: Datalog Risk Scoring
|
|
818
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
819
|
+
|
|
820
|
+
console.log('\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━')
|
|
821
|
+
console.log(' DATALOG RISK REASONING')
|
|
822
|
+
console.log('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n')
|
|
823
|
+
|
|
824
|
+
const riskDatalog = new DatalogProgram()
|
|
825
|
+
|
|
826
|
+
// Add facts about the application
|
|
827
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'industry', terms: ['app001', 'manufacturing'] }))
|
|
828
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'location', terms: ['app001', 'california'] }))
|
|
829
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'high_earthquake_zone', terms: ['california'] }))
|
|
830
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'high_wildfire_zone', terms: ['california'] }))
|
|
831
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'prior_claims', terms: ['app001', '1'] }))
|
|
832
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'has_sprinkler', terms: ['app001'] }))
|
|
833
|
+
riskDatalog.addFact(JSON.stringify({ predicate: 'has_security', terms: ['app001'] }))
|
|
834
|
+
|
|
835
|
+
// Risk increase rules
|
|
836
|
+
riskDatalog.addRule(JSON.stringify({
|
|
837
|
+
head: { predicate: 'risk_factor', terms: ['?app', 'earthquake'] },
|
|
838
|
+
body: [
|
|
839
|
+
{ predicate: 'location', terms: ['?app', '?loc'] },
|
|
840
|
+
{ predicate: 'high_earthquake_zone', terms: ['?loc'] }
|
|
841
|
+
]
|
|
842
|
+
}))
|
|
843
|
+
|
|
844
|
+
riskDatalog.addRule(JSON.stringify({
|
|
845
|
+
head: { predicate: 'risk_factor', terms: ['?app', 'wildfire'] },
|
|
846
|
+
body: [
|
|
847
|
+
{ predicate: 'location', terms: ['?app', '?loc'] },
|
|
848
|
+
{ predicate: 'high_wildfire_zone', terms: ['?loc'] }
|
|
849
|
+
]
|
|
850
|
+
}))
|
|
851
|
+
|
|
852
|
+
riskDatalog.addRule(JSON.stringify({
|
|
853
|
+
head: { predicate: 'risk_factor', terms: ['?app', 'claims_history'] },
|
|
854
|
+
body: [{ predicate: 'prior_claims', terms: ['?app', '?count'] }]
|
|
855
|
+
}))
|
|
856
|
+
|
|
857
|
+
// Risk reduction rules
|
|
858
|
+
riskDatalog.addRule(JSON.stringify({
|
|
859
|
+
head: { predicate: 'risk_mitigator', terms: ['?app', 'sprinkler_discount'] },
|
|
860
|
+
body: [{ predicate: 'has_sprinkler', terms: ['?app'] }]
|
|
861
|
+
}))
|
|
862
|
+
|
|
863
|
+
riskDatalog.addRule(JSON.stringify({
|
|
864
|
+
head: { predicate: 'risk_mitigator', terms: ['?app', 'security_discount'] },
|
|
865
|
+
body: [{ predicate: 'has_security', terms: ['?app'] }]
|
|
866
|
+
}))
|
|
867
|
+
|
|
868
|
+
riskDatalog.evaluate()
|
|
869
|
+
|
|
870
|
+
console.log(' 📋 Datalog Rules Applied:')
|
|
871
|
+
console.log(' risk_factor(App, earthquake) :- location(App, Loc), high_earthquake_zone(Loc)')
|
|
872
|
+
console.log(' risk_factor(App, wildfire) :- location(App, Loc), high_wildfire_zone(Loc)')
|
|
873
|
+
console.log(' risk_mitigator(App, sprinkler_discount) :- has_sprinkler(App)')
|
|
874
|
+
console.log('')
|
|
875
|
+
|
|
876
|
+
const riskFactors = JSON.parse(riskDatalog.query(JSON.stringify({
|
|
877
|
+
predicate: 'risk_factor',
|
|
878
|
+
terms: ['app001', '?factor']
|
|
879
|
+
})))
|
|
880
|
+
|
|
881
|
+
const mitigators = JSON.parse(riskDatalog.query(JSON.stringify({
|
|
882
|
+
predicate: 'risk_mitigator',
|
|
883
|
+
terms: ['app001', '?mitigator']
|
|
884
|
+
})))
|
|
885
|
+
|
|
886
|
+
console.log(' 🚨 Risk Factors Identified:')
|
|
887
|
+
for (const factor of riskFactors) {
|
|
888
|
+
console.log(` + ${factor} (+10% premium)`)
|
|
889
|
+
}
|
|
719
890
|
|
|
720
|
-
|
|
721
|
-
|
|
722
|
-
|
|
723
|
-
|
|
724
|
-
| `gpt-4o` | `OPENAI_API_KEY` | Alternative |
|
|
725
|
-
| `mock` | None | Testing only |
|
|
891
|
+
console.log('\n ✅ Risk Mitigators Applied:')
|
|
892
|
+
for (const mitigator of mitigators) {
|
|
893
|
+
console.log(` - ${mitigator} (-5% premium)`)
|
|
894
|
+
}
|
|
726
895
|
|
|
727
|
-
|
|
896
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
897
|
+
// STEP 7: Calculate Premium
|
|
898
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
728
899
|
|
|
729
|
-
|
|
730
|
-
const
|
|
900
|
+
const requestedCoverage = 2500000 // Hard-coded for this example; the queried value is in application.bindings.coverage
|
|
901
|
+
const baseRate = 0.0025
|
|
902
|
+
const locationMultiplier = parseFloat(locationRisk.bindings.multiplier)
|
|
731
903
|
|
|
732
|
-
|
|
733
|
-
const stats = await runHyperMindBenchmark('http://localhost:30080', 'mock', {
|
|
734
|
-
saveResults: true // Saves JSON file with results
|
|
735
|
-
})
|
|
904
|
+
let basePremium = requestedCoverage * baseRate * locationMultiplier
|
|
736
905
|
|
|
737
|
-
|
|
738
|
-
|
|
739
|
-
|
|
906
|
+
// Apply risk factors (+10% each)
|
|
907
|
+
const riskAdjustment = riskFactors.length * 0.10
|
|
908
|
+
basePremium *= (1 + riskAdjustment)
|
|
740
909
|
|
|
741
|
-
|
|
910
|
+
// Apply mitigators (-5% each)
|
|
911
|
+
const mitigatorAdjustment = mitigators.length * 0.05
|
|
912
|
+
basePremium *= (1 - mitigatorAdjustment)
|
|
742
913
|
|
|
743
|
-
|
|
744
|
-
|
|
745
|
-
│ COMMON CONFUSION: These are TWO DIFFERENT FEATURES │
|
|
746
|
-
├───────────────────────────────────────────────────────────────────────────────┤
|
|
747
|
-
│ │
|
|
748
|
-
│ HyperMindAgent EmbeddingService │
|
|
749
|
-
│ ───────────────── ───────────────── │
|
|
750
|
-
│ • Natural Language → SPARQL • Text → Vector embeddings │
|
|
751
|
-
│ • "Find professors" → SQL-like query • "professor" → [0.1, 0.2, ...] │
|
|
752
|
-
│ • Returns database results • Returns similar items │
|
|
753
|
-
│ • NO embeddings used internally • ALL about embeddings │
|
|
754
|
-
│ │
|
|
755
|
-
│ Use HyperMind when: Use Embeddings when: │
|
|
756
|
-
│ "I want to query my database "I want to find semantically │
|
|
757
|
-
│ using natural language" similar items" │
|
|
758
|
-
│ │
|
|
759
|
-
└───────────────────────────────────────────────────────────────────────────────┘
|
|
760
|
-
```
|
|
914
|
+
// Similar policy benchmark
|
|
915
|
+
const benchmarkPremium = similarCount > 0 ? similarPremiumSum / similarCount : basePremium
|
|
761
916
|
|
|
762
|
-
|
|
763
|
-
const
|
|
764
|
-
|
|
765
|
-
// ──────────────────────────────────────────────────────────────────────────────
|
|
766
|
-
// HYPERMIND: Natural language → SPARQL queries (NO embeddings)
|
|
767
|
-
// ──────────────────────────────────────────────────────────────────────────────
|
|
768
|
-
const agent = await HyperMindAgent.spawn({ model: 'mock', endpoint: 'http://localhost:30080' })
|
|
769
|
-
const result = await agent.call('Find all professors')
|
|
770
|
-
// result.sparql = "SELECT ?x WHERE { ?x a ub:Professor }"
|
|
771
|
-
// result.results = [{ x: "http://university.edu/prof1" }, ...]
|
|
772
|
-
|
|
773
|
-
// ──────────────────────────────────────────────────────────────────────────────
|
|
774
|
-
// EMBEDDINGS: Semantic similarity search (COMPLETELY SEPARATE)
|
|
775
|
-
// ──────────────────────────────────────────────────────────────────────────────
|
|
776
|
-
const embeddings = new EmbeddingService()
|
|
777
|
-
embeddings.storeVector('professor', [0.1, 0.2, 0.3, ...]) // 384-dim vector
|
|
778
|
-
embeddings.storeVector('teacher', [0.11, 0.21, 0.31, ...])
|
|
779
|
-
const similar = embeddings.findSimilar('professor', 5) // Finds "teacher" by cosine similarity
|
|
780
|
-
```
|
|
917
|
+
// Final premium (weighted average)
|
|
918
|
+
const finalPremium = Math.round((basePremium * 0.6 + benchmarkPremium * 0.4) * 100) / 100
|
|
781
919
|
|
|
782
|
-
|
|
783
|
-
|
|
784
|
-
| **What it does** | NL → SPARQL queries | Semantic similarity search |
|
|
785
|
-
| **Input** | "Find all professors" | Text or vectors |
|
|
786
|
-
| **Output** | SPARQL query + results | Similar items list |
|
|
787
|
-
| **Uses embeddings?** | ❌ **NO** | ✅ Yes |
|
|
788
|
-
| **Uses LLM?** | ✅ Yes (or mock) | ❌ No |
|
|
789
|
-
| **Requires API key?** | Only for LLM mode | No |
|
|
920
|
+
// Risk score
|
|
921
|
+
const riskScore = Math.min(0.95, 0.3 + (riskFactors.length * 0.15) - (mitigators.length * 0.05))
|
|
790
922
|
|
|
791
|
-
|
|
923
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
924
|
+
// FINAL QUOTE
|
|
925
|
+
// ─────────────────────────────────────────────────────────────────────────
|
|
792
926
|
|
|
793
|
-
|
|
927
|
+
console.log('\n\n═══════════════════════════════════════════════════════════════')
|
|
928
|
+
console.log(' INSURANCE QUOTE')
|
|
929
|
+
console.log('═══════════════════════════════════════════════════════════════')
|
|
930
|
+
console.log(`
|
|
794
931
|
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
795
|
-
│
|
|
932
|
+
│ QUOTE SUMMARY │
|
|
796
933
|
├─────────────────────────────────────────────────────────────────────────────┤
|
|
797
934
|
│ │
|
|
798
|
-
│
|
|
799
|
-
│
|
|
935
|
+
│ Quote ID: QT-${Date.now().toString().slice(-8)} │
|
|
936
|
+
│ Generated: ${new Date().toISOString().split('T')[0]} │
|
|
800
937
|
│ │
|
|
801
|
-
|
|
802
|
-
│
|
|
938
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
939
|
+
│ APPLICANT │
|
|
940
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
803
941
|
│ │
|
|
804
|
-
│
|
|
805
|
-
│
|
|
942
|
+
│ Company: ${application.bindings.name.padEnd(49)}│
|
|
943
|
+
│ Industry: ${application.bindings.industry.padEnd(49)}│
|
|
944
|
+
│ Location: California │
|
|
806
945
|
│ │
|
|
807
|
-
|
|
808
|
-
│
|
|
946
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
947
|
+
│ COVERAGE │
|
|
948
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
809
949
|
│ │
|
|
810
|
-
│
|
|
811
|
-
│
|
|
950
|
+
│ Coverage Amount: $${Number(requestedCoverage).toLocaleString().padEnd(48)}│
|
|
951
|
+
│ Deductible: $25,000 │
|
|
952
|
+
│ Policy Term: 12 months │
|
|
812
953
|
│ │
|
|
813
|
-
|
|
814
|
-
│
|
|
954
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
955
|
+
│ PREMIUM │
|
|
956
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
957
|
+
│ │
|
|
958
|
+
│ Annual Premium: $${finalPremium.toLocaleString().padEnd(48)}│
|
|
959
|
+
│ Monthly Payment: $${(finalPremium / 12).toFixed(2).padEnd(48)}│
|
|
960
|
+
│ │
|
|
961
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
962
|
+
│ CALCULATION BREAKDOWN │
|
|
963
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
964
|
+
│ │
|
|
965
|
+
│ Base Premium: $${(requestedCoverage * baseRate).toLocaleString().padEnd(38)}│
|
|
966
|
+
│ Location Multiplier: ${locationMultiplier}x │
|
|
967
|
+
│ Risk Factors (${riskFactors.length}): +${(riskAdjustment * 100).toFixed(0)}% │
|
|
968
|
+
│ Mitigators (${mitigators.length}): -${(mitigatorAdjustment * 100).toFixed(0)}% │
|
|
969
|
+
│ Similar Policy Benchmark: $${Math.round(benchmarkPremium).toLocaleString().padEnd(38)}│
|
|
970
|
+
│ │
|
|
971
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
972
|
+
│ RISK ASSESSMENT │
|
|
973
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
974
|
+
│ │
|
|
975
|
+
│ Risk Score: ${(riskScore * 100).toFixed(1)}% ${riskScore > 0.6 ? '(MODERATE-HIGH)' : '(ACCEPTABLE)'} │
|
|
976
|
+
│ │
|
|
977
|
+
│ Risk Factors: │
|
|
978
|
+
│ • Earthquake zone (+10%) │
|
|
979
|
+
│ • Wildfire zone (+10%) │
|
|
980
|
+
│ • Prior claims history (+10%) │
|
|
981
|
+
│ │
|
|
982
|
+
│ Mitigators Applied: │
|
|
983
|
+
│ • Sprinkler system (-5%) │
|
|
984
|
+
│ • Security system (-5%) │
|
|
985
|
+
│ │
|
|
986
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
987
|
+
│ RECOMMENDATION │
|
|
988
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
989
|
+
│ │
|
|
990
|
+
│ Decision: ✅ APPROVED │
|
|
991
|
+
│ Confidence: 95% │
|
|
992
|
+
│ │
|
|
993
|
+
│ Conditions: │
|
|
994
|
+
│ 1. Annual fire safety inspection required │
|
|
995
|
+
│ 2. Earthquake retrofit documentation │
|
|
996
|
+
│ 3. Updated business continuity plan │
|
|
815
997
|
│ │
|
|
816
998
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
817
|
-
|
|
818
|
-
|
|
819
|
-
### MCP (Model Context Protocol) Status
|
|
820
|
-
|
|
821
|
-
**Current Status: NOT IMPLEMENTED**
|
|
822
|
-
|
|
823
|
-
MCP (Model Context Protocol) is Anthropic's standard for LLM-tool communication. HyperMind currently uses **typed morphisms** for tool definitions rather than MCP:
|
|
824
|
-
|
|
825
|
-
| Feature | HyperMind Current | MCP Standard |
|
|
826
|
-
|---------|-------------------|--------------|
|
|
827
|
-
| Tool Definition | `TypedTool` trait + `Morphism` | JSON Schema |
|
|
828
|
-
| Type Safety | Compile-time (Rust generics) | Runtime validation |
|
|
829
|
-
| Composition | Category theory (`>>>` operator) | Sequential calls |
|
|
830
|
-
| Tool Discovery | `ToolRegistry` with introspection | `tools/list` endpoint |
|
|
831
|
-
|
|
832
|
-
**Why not MCP yet?**
|
|
833
|
-
- HyperMind's typed morphisms provide **stronger guarantees** than MCP's JSON Schema
|
|
834
|
-
- Category theory composition catches type errors at **planning time**, not runtime
|
|
835
|
-
- Future: MCP adapter layer planned for interoperability with Claude Desktop, etc.
|
|
836
|
-
|
|
837
|
-
**Future MCP Integration (Planned):**
|
|
838
|
-
```
|
|
839
|
-
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
840
|
-
│ MCP Client (Claude Desktop, etc.) │
|
|
841
|
-
│ │ │
|
|
842
|
-
│ ▼ MCP Protocol │
|
|
843
|
-
│ ┌─────────────────┐ │
|
|
844
|
-
│ │ MCP Adapter │ ← Future: Translates MCP ↔ TypedTool │
|
|
845
|
-
│ └────────┬────────┘ │
|
|
846
|
-
│ ▼ │
|
|
847
|
-
│ ┌─────────────────┐ │
|
|
848
|
-
│ │ TypedTool │ ← Current: Native HyperMind interface │
|
|
849
|
-
│ │ (Morphism) │ │
|
|
850
|
-
│ └─────────────────┘ │
|
|
851
|
-
└─────────────────────────────────────────────────────────────────────────────┘
|
|
852
|
-
```
|
|
853
|
-
|
|
854
|
-
### RuntimeScope (Proxied Objects)
|
|
855
|
-
|
|
856
|
-
The `RuntimeScope` provides a **hierarchical, type-safe container** for agent objects:
|
|
857
|
-
|
|
858
|
-
```typescript
|
|
859
|
-
// RuntimeScope: Dynamic object container with parent-child hierarchy
|
|
860
|
-
interface RuntimeScope {
|
|
861
|
-
// Bind a value to a name in this scope
|
|
862
|
-
bind<T>(name: string, value: T): void
|
|
863
|
-
|
|
864
|
-
// Get a value by name (searches parent scopes)
|
|
865
|
-
get<T>(name: string): T | null
|
|
999
|
+
`)
|
|
866
1000
|
|
|
867
|
-
|
|
868
|
-
|
|
1001
|
+
return {
|
|
1002
|
+
quoteId: `QT-${Date.now().toString().slice(-8)}`, // Recomputed here, so it may differ from the Quote ID printed above
|
|
1003
|
+
applicant: application.bindings.name,
|
|
1004
|
+
premium: finalPremium,
|
|
1005
|
+
coverage: requestedCoverage,
|
|
1006
|
+
riskScore: riskScore,
|
|
1007
|
+
decision: 'APPROVED'
|
|
1008
|
+
}
|
|
869
1009
|
}
|
|
870
1010
|
|
|
871
|
-
//
|
|
872
|
-
|
|
873
|
-
parentScope.bind('db', graphDb)
|
|
874
|
-
parentScope.bind('ontology', 'lubm')
|
|
875
|
-
|
|
876
|
-
// Child agent inherits parent's bindings
|
|
877
|
-
const childScope = parentScope.child()
|
|
878
|
-
childScope.get('db') // → graphDb (inherited from parent)
|
|
879
|
-
childScope.bind('task', 'findProfessors') // Local binding
|
|
1011
|
+
// Run the underwriting
|
|
1012
|
+
runUnderwriting().catch(console.error)
|
|
880
1013
|
```
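
The premium arithmetic in the underwriting example above can be restated as a pure function, which makes the formula easier to test in isolation. The following is a minimal sketch under stated assumptions: `QuoteInputs`, `SimilarPolicy`, `similarityWeightedPremium`, and `quotePremium` are hypothetical names that are not part of rust-kgdb, and the constants (+10% per risk factor, -5% per mitigator, a 60/40 blend with the similar-policy benchmark) are copied from the example.

```typescript
// Hypothetical types for illustration; not part of the rust-kgdb API.
interface SimilarPolicy {
  premium: number     // historical annual premium
  similarity: number  // similarity score returned by the embedding search
}

interface QuoteInputs {
  coverage: number
  baseRate: number
  locationMultiplier: number
  riskFactorCount: number
  mitigatorCount: number
  similar: SimilarPolicy[]
}

// Similarity-weighted average of historical premiums.
// Falls back to `fallback` when there is nothing to average.
function similarityWeightedPremium(similar: SimilarPolicy[], fallback: number): number {
  let weightedSum = 0
  let weightTotal = 0
  for (const policy of similar) {
    weightedSum += policy.premium * policy.similarity
    weightTotal += policy.similarity
  }
  return weightTotal > 0 ? weightedSum / weightTotal : fallback
}

// Rule-based premium blended 60/40 with the benchmark, rounded to cents.
function quotePremium(q: QuoteInputs): number {
  let base = q.coverage * q.baseRate * q.locationMultiplier
  base *= 1 + q.riskFactorCount * 0.10  // +10% per risk factor
  base *= 1 - q.mitigatorCount * 0.05   // -5% per mitigator
  const benchmark = similarityWeightedPremium(q.similar, base)
  return Math.round((base * 0.6 + benchmark * 0.4) * 100) / 100
}
```

When no similar policies are found, the benchmark falls back to the rule-based premium, so the 60/40 blend leaves that premium unchanged.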
|
|
881
1014
|
|
|
882
|
-
|
|
883
|
-
- Objects in scope are **not directly exposed** to the LLM
|
|
884
|
-
- The agent accesses them through **typed tool interfaces**
|
|
885
|
-
- Prevents prompt injection attacks (LLM can't directly call methods)
|
|
886
|
-
|
|
887
|
-
### Vanilla LLM vs HyperMind: What We Measure
|
|
1015
|
+
---
|
|
888
1016
|
|
|
889
|
-
|
|
1017
|
+
## Architecture
|
|
890
1018
|
|
|
891
1019
|
```
|
|
892
1020
|
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
893
|
-
│
|
|
894
|
-
|
|
1021
|
+
│ YOUR APPLICATION │
|
|
1022
|
+
│ (FraudDetector, Underwriter, Recommender) │
|
|
1023
|
+
└─────────────────────────────────────────────────────────────────────────────┘
|
|
1024
|
+
│
|
|
1025
|
+
▼
|
|
1026
|
+
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
1027
|
+
│ HyperMind SDK (TypeScript) │
|
|
895
1028
|
│ │
|
|
896
|
-
│
|
|
897
|
-
|
|
898
|
-
│
|
|
899
|
-
|
|
900
|
-
│
|
|
901
|
-
|
|
902
|
-
|
|
1029
|
+
│ GraphDB ──── GraphFrame ──── EmbeddingService ──── DatalogProgram │
|
|
1030
|
+
└─────────────────────────────────────────────────────────────────────────────┘
|
|
1031
|
+
│
|
|
1032
|
+
NAPI-RS (FFI)
|
|
1033
|
+
│
|
|
1034
|
+
▼
|
|
1035
|
+
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
1036
|
+
│ HyperMind Runtime (Rust Core) │
|
|
903
1037
|
│ │
|
|
904
|
-
│
|
|
905
|
-
│
|
|
906
|
-
│
|
|
907
|
-
│
|
|
908
|
-
│
|
|
909
|
-
│ 4. Cleaning Required: How often HyperMind cleaning fixes issues │
|
|
910
|
-
│ 5. Latency: Time from prompt to results │
|
|
1038
|
+
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
|
|
1039
|
+
│ │ Type Theory │ │ Category │ │ Proof │ │
|
|
1040
|
+
│ │ (TypeId, │ │ Theory │ │ Theory │ │
|
|
1041
|
+
│ │ Refinement) │ │ (Morphisms) │ │ (Witnesses) │ │
|
|
1042
|
+
│ └───────────────┘ └───────────────┘ └───────────────┘ │
|
|
911
1043
|
│ │
|
|
1044
|
+
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
|
1045
|
+
│ │ WASM Sandbox Runtime (wasmtime) │ │
|
|
1046
|
+
│ │ Secure tool execution via capability proxy │ │
|
|
1047
|
+
│ └─────────────────────────────────────────────────────────────────────┘ │
|
|
912
1048
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
913
|
-
|
|
914
|
-
|
|
915
|
-
**Key Insight**: Real LLMs often return markdown-formatted output. HyperMind's typed tool contracts force structured output, dramatically improving syntax success rates.
|
|
916
|
-
|
|
917
|
-
### Core Concepts
|
|
918
|
-
|
|
919
|
-
#### TypeId - Type System Foundation
|
|
920
|
-
|
|
921
|
-
```typescript
|
|
922
|
-
// TypeId enum defines all types in the system
|
|
923
|
-
enum TypeId {
|
|
924
|
-
Unit, // ()
|
|
925
|
-
Bool, // boolean
|
|
926
|
-
Int64, // 64-bit integer
|
|
927
|
-
Float64, // 64-bit float
|
|
928
|
-
String, // UTF-8 string
|
|
929
|
-
Node, // RDF Node
|
|
930
|
-
Triple, // RDF Triple
|
|
931
|
-
Quad, // RDF Quad
|
|
932
|
-
BindingSet, // SPARQL solution set
|
|
933
|
-
Record, // Named fields: Record<{name: String, age: Int64}>
|
|
934
|
-
List, // Homogeneous list: List<Node>
|
|
935
|
-
Option, // Optional value: Option<String>
|
|
936
|
-
Function, // Function type: A → B
|
|
937
|
-
}
|
|
938
|
-
```
|
|
939
|
-
|
|
940
|
-
#### Morphism - Category Theory Abstraction
|
|
941
|
-
|
|
942
|
-
A **Morphism** is a typed function between objects with composable guarantees:
|
|
943
|
-
|
|
944
|
-
```typescript
|
|
945
|
-
// Morphism trait - a typed function between objects
|
|
946
|
-
interface Morphism<Input, Output> {
|
|
947
|
-
apply(input: Input): Result<Output, MorphismError>
|
|
948
|
-
inputType(): TypeId
|
|
949
|
-
outputType(): TypeId
|
|
950
|
-
}
|
|
951
|
-
|
|
952
|
-
// Example: SPARQL query as a morphism
|
|
953
|
-
// SparqlMorphism: String → BindingSet
|
|
954
|
-
const sparqlQuery: Morphism<string, BindingSet> = {
|
|
955
|
-
inputType: () => TypeId.String,
|
|
956
|
-
outputType: () => TypeId.BindingSet,
|
|
957
|
-
apply: (query) => db.querySelect(query)
|
|
958
|
-
}
|
|
959
|
-
```
|
|
960
|
-
|
|
961
|
-
#### ToolDescription - Typed Tool Contracts
|
|
962
|
-
|
|
963
|
-
```typescript
|
|
964
|
-
interface ToolDescription {
|
|
965
|
-
name: string // "kg.sparql.query"
|
|
966
|
-
description: string // "Execute SPARQL queries"
|
|
967
|
-
inputType: TypeId // TypeId.String
|
|
968
|
-
outputType: TypeId // TypeId.BindingSet
|
|
969
|
-
examples: string[] // Example queries
|
|
970
|
-
capabilities: string[] // ["query", "filter", "aggregate"]
|
|
971
|
-
}
|
|
972
|
-
|
|
973
|
-
// Available HyperMind tools
|
|
974
|
-
const tools: ToolDescription[] = [
|
|
975
|
-
{ name: "kg.sparql.query", input: TypeId.String, output: TypeId.BindingSet },
|
|
976
|
-
{ name: "kg.motif.find", input: TypeId.String, output: TypeId.BindingSet },
|
|
977
|
-
{ name: "kg.datalog.apply", input: TypeId.String, output: TypeId.BindingSet },
|
|
978
|
-
{ name: "kg.semantic.search", input: TypeId.String, output: TypeId.List },
|
|
979
|
-
{ name: "kg.traverse.neighbors", input: TypeId.Node, output: TypeId.List },
|
|
980
|
-
]
|
|
981
|
-
```
|
|
982
|
-
|
|
983
|
-
#### PlanningContext - Scope for Neural Planning
|
|
984
|
-
|
|
985
|
-
```typescript
|
|
986
|
-
interface PlanningContext {
|
|
987
|
-
tools: ToolDescription[] // Available tools
|
|
988
|
-
scopeBindings: Map<string, string> // Variables in scope
|
|
989
|
-
feedback: string | null // Error feedback from previous attempt
|
|
990
|
-
hints: string[] // Domain hints for the LLM
|
|
991
|
-
}
|
|
992
|
-
|
|
993
|
-
// Create planning context
|
|
994
|
-
const context: PlanningContext = {
|
|
995
|
-
tools: [sparqlTool, motifTool],
|
|
996
|
-
scopeBindings: new Map([["dataset", "lubm"]]),
|
|
997
|
-
feedback: null,
|
|
998
|
-
hints: [
|
|
999
|
-
"Database uses LUBM ontology",
|
|
1000
|
-
"Key classes: Professor, GraduateStudent, Course"
|
|
1001
|
-
]
|
|
1002
|
-
}
|
|
1003
|
-
```
|
|
1004
|
-
|
|
1005
|
-
#### Planner - Neural Planning Interface
|
|
1006
|
-
|
|
1007
|
-
```typescript
|
|
1008
|
-
interface Planner {
|
|
1009
|
-
plan(prompt: string, context: PlanningContext): Promise<Plan>
|
|
1010
|
-
name(): string
|
|
1011
|
-
config(): PlannerConfig
|
|
1012
|
-
}
|
|
1013
|
-
|
|
1014
|
-
// Supported planners
|
|
1015
|
-
type PlannerType =
|
|
1016
|
-
| { type: "claude", model: "claude-sonnet-4" }
|
|
1017
|
-
| { type: "openai", model: "gpt-4o" }
|
|
1018
|
-
| { type: "local", model: "ollama/mistral" }
|
|
1019
|
-
```
|
|
1020
|
-
|
|
1021
|
-
### Neuro-Symbolic Planning Loop
|
|
1022
|
-
|
|
1023
|
-
```
|
|
1049
|
+
│
|
|
1050
|
+
▼
|
|
1024
1051
|
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
1025
|
-
│
|
|
1026
|
-
├─────────────────────────────────────────────────────────────────────────────┤
|
|
1052
|
+
│ rust-kgdb Knowledge Graph │
|
|
1027
1053
|
│ │
|
|
1028
|
-
│
|
|
1029
|
-
│ │ │
|
|
1030
|
-
│ ▼ │
|
|
1031
|
-
│ ┌─────────────────┐ │
|
|
1032
|
-
│ │ Neural Planner │ (Claude Sonnet 4 / GPT-4o) │
|
|
1033
|
-
│ │ - Understands intent │
|
|
1034
|
-
│ │ - Discovers available tools │
|
|
1035
|
-
│ │ - Generates tool sequence │
|
|
1036
|
-
│ └────────┬────────┘ │
|
|
1037
|
-
│ │ Plan: [kg.sparql.query] │
|
|
1038
|
-
│ ▼ │
|
|
1039
|
-
│ ┌─────────────────┐ │
|
|
1040
|
-
│ │ Type Checker │ (Compile-time verification) │
|
|
1041
|
-
│ │ - Validates composition │
|
|
1042
|
-
│ │ - Checks pre/post conditions │
|
|
1043
|
-
│ │ - Verifies type compatibility │
|
|
1044
|
-
│ └────────┬────────┘ │
|
|
1045
|
-
│ │ Validated Plan │
|
|
1046
|
-
│ ▼ │
|
|
1047
|
-
│ ┌─────────────────┐ │
|
|
1048
|
-
│ │ Symbolic Executor│ (rust-kgdb) │
|
|
1049
|
-
│ │ - Executes SPARQL │
|
|
1050
|
-
│ │ - Returns typed results │
|
|
1051
|
-
│ │ - Records trace │
|
|
1052
|
-
│ └────────┬────────┘ │
|
|
1053
|
-
│ │ Result or Error │
|
|
1054
|
-
│ ▼ │
|
|
1055
|
-
│ ┌─────────────────┐ │
|
|
1056
|
-
│ │ Reflection │ │
|
|
1057
|
-
│ │ - Success? Return result │
|
|
1058
|
-
│ │ - Failure? Generate feedback │
|
|
1059
|
-
│ │ - Loop back to planner with context │
|
|
1060
|
-
│ └─────────────────┘ │
|
|
1054
|
+
│ InMemory (dev) │ RocksDB (single-node) │ Distributed (K8s cluster) │
|
|
1061
1055
|
│ │
|
|
1056
|
+
│ SPOC │ POCS │ OCSP │ CSPO (Four indexes) │
|
|
1062
1057
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
1063
1058
|
```
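
The storage layer in the diagram lists four quad indexes (SPOC, POCS, OCSP, CSPO). One common way such permutation indexes are used is to pick, for each quad pattern, the index whose key order starts with the most bound positions, so the lookup becomes a prefix scan. The sketch below illustrates that general idea only. It is an assumption, not a description of rust-kgdb's actual query planner, and `BoundPositions`, `boundPrefixLength`, and `chooseIndex` are hypothetical names.

```typescript
// Hypothetical helper types; not part of the rust-kgdb API.
type Position = 'S' | 'P' | 'O' | 'C'  // subject, predicate, object, context (graph)
type BoundPositions = Record<Position, boolean>
type IndexName = 'SPOC' | 'POCS' | 'OCSP' | 'CSPO'

const INDEXES: IndexName[] = ['SPOC', 'POCS', 'OCSP', 'CSPO']

// Number of leading key positions of `index` that the pattern binds.
function boundPrefixLength(index: IndexName, bound: BoundPositions): number {
  let length = 0
  for (let i = 0; i < index.length; i++) {
    const position = index.charAt(i) as Position
    if (!bound[position]) break
    length++
  }
  return length
}

// Pick the index with the longest bound prefix; ties go to the earliest
// entry in INDEXES, so a fully unbound pattern scans SPOC.
function chooseIndex(bound: BoundPositions): IndexName {
  let best: IndexName = 'SPOC'
  let bestLength = -1
  for (const index of INDEXES) {
    const length = boundPrefixLength(index, bound)
    if (length > bestLength) {
      best = index
      bestLength = length
    }
  }
  return best
}
```

For example, a pattern that binds only the predicate and object has a two-position prefix in POCS and at most one in the other three orders, so this sketch would choose POCS.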
|
|
1064
1059
|
|
|
1065
|
-
### TypeScript SDK Usage (Available Now)
|
|
1066
|
-
|
|
1067
|
-
```typescript
|
|
1068
|
-
import { HyperMindAgent, runHyperMindBenchmark, createPlanningContext } from 'rust-kgdb'
|
|
1069
|
-
|
|
1070
|
-
// 1. Spawn a HyperMind agent
|
|
1071
|
-
const agent = await HyperMindAgent.spawn({
|
|
1072
|
-
name: 'university-explorer',
|
|
1073
|
-
model: 'mock', // or 'claude-sonnet-4', 'gpt-4o' with API keys
|
|
1074
|
-
tools: ['kg.sparql.query', 'kg.motif.find'],
|
|
1075
|
-
endpoint: 'http://localhost:30080'
|
|
1076
|
-
})
|
|
1077
|
-
|
|
1078
|
-
// 2. Execute natural language queries
|
|
1079
|
-
const result = await agent.call('Find all professors in the database')
|
|
1080
|
-
console.log(result.sparql) // Generated SPARQL query
|
|
1081
|
-
console.log(result.results) // Query results
|
|
1082
|
-
|
|
1083
|
-
// 3. Run the benchmark suite
|
|
1084
|
-
const stats = await runHyperMindBenchmark('http://localhost:30080', 'mock', {
|
|
1085
|
-
saveResults: true // Saves to hypermind_benchmark_*.json
|
|
1086
|
-
})
|
|
1087
|
-
```
|
|
1088
|
-
|
|
1089
|
-
### TypeScript SDK with LLM Planning (Requires API Keys)
|
|
1090
|
-
|
|
1091
|
-
```typescript
|
|
1092
|
-
// Set environment variables first:
|
|
1093
|
-
// ANTHROPIC_API_KEY=sk-ant-... (for Claude)
|
|
1094
|
-
// OPENAI_API_KEY=sk-... (for GPT-4o)
|
|
1095
|
-
|
|
1096
|
-
import { HyperMindAgent, createPlanningContext } from 'rust-kgdb'
|
|
1097
|
-
|
|
1098
|
-
// 1. Create planning context with typed tools
|
|
1099
|
-
const context = createPlanningContext('http://localhost:30080', [
|
|
1100
|
-
'Database contains university data',
|
|
1101
|
-
'Professors teach courses and advise students'
|
|
1102
|
-
])
|
|
1103
|
-
.withHint('Database uses LUBM ontology')
|
|
1104
|
-
.withHint('Key classes: Professor, GraduateStudent, Course')
|
|
1105
|
-
|
|
1106
|
-
// 2. Spawn an agent with tools and context
|
|
1107
|
-
const agent = await spawn({
|
|
1108
|
-
name: 'professor-finder',
|
|
1109
|
-
model: 'claude-sonnet-4',
|
|
1110
|
-
tools: ['kg.sparql.query', 'kg.motif.find']
|
|
1111
|
-
}, {
|
|
1112
|
-
kg: new GraphDB('http://localhost:30080'),
|
|
1113
|
-
context
|
|
1114
|
-
})
|
|
1115
|
-
|
|
1116
|
-
// 3. Execute with type-safe result
|
|
1117
|
-
interface Professor {
|
|
1118
|
-
uri: string
|
|
1119
|
-
name: string
|
|
1120
|
-
department: string
|
|
1121
|
-
}
|
|
1122
|
-
|
|
1123
|
-
const professors = await agent.call<Professor[]>(
|
|
1124
|
-
'Find professors who teach AI courses and advise graduate students'
|
|
1125
|
-
)
|
|
1126
|
-
|
|
1127
|
-
// 4. Type-checked at compile time!
|
|
1128
|
-
console.log(professors[0].name) // TypeScript knows this is a string
|
|
1129
|
-
```
|
|
1130
|
-
|
|
1131
|
-
### Category Theory Composition
|
|
1132
|
-
|
|
1133
|
-
HyperMind enforces **type safety at planning time** using category theory:
|
|
1134
|
-
|
|
1135
|
-
```typescript
|
|
1136
|
-
// Tools are morphisms with input/output types
|
|
1137
|
-
const sparqlQuery: Morphism<string, BindingSet>
|
|
1138
|
-
const extractNodes: Morphism<BindingSet, Node[]>
|
|
1139
|
-
const findSimilar: Morphism<Node[], Node[]>
|
|
1140
|
-
|
|
1141
|
-
// Composition is type-checked
|
|
1142
|
-
const pipeline = compose(sparqlQuery, extractNodes, findSimilar)
|
|
1143
|
-
// ✓ String → BindingSet → Node[] → Node[]
|
|
1144
|
-
|
|
1145
|
-
// TYPE ERROR: BindingSet cannot be input to findSimilar (requires Node[])
|
|
1146
|
-
const invalid = compose(sparqlQuery, findSimilar)
|
|
1147
|
-
// ✗ Compile error: BindingSet is not assignable to Node[]
|
|
1148
|
-
```
|
|
1149
|
-
|
|
1150
|
-
### Value Proposition
|
|
1151
|
-
|
|
1152
|
-
| Feature | HyperMind | LangChain | AutoGPT |
|
|
1153
|
-
|---------|-----------|-----------|---------|
|
|
1154
|
-
| **Type Safety** | ✅ Compile-time | ❌ Runtime | ❌ Runtime |
|
|
1155
|
-
| **Category Theory** | ✅ Full (Morphism, Functor, Monad) | ❌ None | ❌ None |
|
|
1156
|
-
| **KG Integration** | ✅ Native SPARQL/Datalog | ⚠️ Plugin | ⚠️ Plugin |
|
|
1157
|
-
| **Provenance** | ✅ Full execution trace | ⚠️ Partial | ❌ None |
|
|
1158
|
-
| **Tool Composition** | ✅ Verified at planning time | ❌ Runtime errors | ❌ Runtime errors |
|
|
1159
|
-
|
|
1160
|
-
### HyperMind Agentic Benchmark (Claude vs GPT-4o)
|
|
1161
|
-
|
|
1162
|
-
HyperMind was benchmarked using **LUBM (Lehigh University Benchmark)**, a widely used benchmark for Semantic Web databases. LUBM provides a standardized ontology (universities, professors, students, courses) and 14 canonical test queries; this run used 12 LUBM-style natural-language questions of varying complexity.
|
|
1163
|
-
|
|
1164
|
-
**Benchmark Configuration:**
|
|
1165
|
-
- **Dataset**: LUBM(1) - 3,272 triples (1 university)
|
|
1166
|
-
- **Queries**: 12 LUBM-style NL-to-SPARQL queries (Easy: 3, Medium: 5, Hard: 4)
|
|
1167
|
-
- **LLM Models**: Claude Sonnet 4 (`claude-sonnet-4-20250514`), GPT-4o
|
|
1168
|
-
- **Infrastructure**: rust-kgdb K8s cluster (Orby, 1 coordinator + 3 executors)
|
|
1169
|
-
- **Date**: December 12, 2025
|
|
1170
|
-
- **API Keys**: Real production API keys used (NOT mock/simulation)
|
|
1171
|
-
|
|
1172
|
-
---
|
|
1173
|
-
|
|
1174
|
-
### ACTUAL BENCHMARK RESULTS (December 12, 2025)
|
|
1175
|
-
|
|
1176
|
-
#### Rust Benchmark (Native HyperMind Runtime)
|
|
1177
|
-
|
|
1178
|
-
```
|
|
1179
|
-
╔════════════════════════════════════════════════════════════════════╗
|
|
1180
|
-
║ BENCHMARK RESULTS ║
|
|
1181
|
-
╚════════════════════════════════════════════════════════════════════╝
|
|
1182
|
-
|
|
1183
|
-
┌─────────────────┬────────────────────────────┬────────────────────────────┐
|
|
1184
|
-
│ Model │ WITHOUT HyperMind (Raw) │ WITH HyperMind │
|
|
1185
|
-
├─────────────────┼────────────────────────────┼────────────────────────────┤
|
|
1186
|
-
│ Claude Sonnet 4 │ Accuracy: 0.00% │ Accuracy: 91.67% │
|
|
1187
|
-
│ │ Execution: 0/12 │ Execution: 11/12 │
|
|
1188
|
-
│ │ Latency: 222ms │ Latency: 6340ms │
|
|
1189
|
-
├─────────────────┼────────────────────────────┴────────────────────────────┤
|
|
1190
|
-
│ IMPROVEMENT │ Accuracy: +91.67% | Reliability: +91.67% │
|
|
1191
|
-
└─────────────────┴─────────────────────────────────────────────────────────┘
|
|
1192
|
-
|
|
1193
|
-
┌─────────────────┬────────────────────────────┬────────────────────────────┐
|
|
1194
|
-
│ GPT-4o │ Accuracy: 100.00% │ Accuracy: 66.67% │
|
|
1195
|
-
│ │ Execution: 12/12 │ Execution: 9/12 │
|
|
1196
|
-
│ │ Latency: 2940ms │ Latency: 3822ms │
|
|
1197
|
-
├─────────────────┼────────────────────────────┴────────────────────────────┤
|
|
1198
|
-
│ TYPE SAFETY │ 3 type errors caught at planning time (33% unsafe!) │
|
|
1199
|
-
└─────────────────┴─────────────────────────────────────────────────────────┘
|
|
1200
|
-
```
|
|
1201
|
-
|
|
1202
|
-
#### TypeScript Benchmark (Node.js SDK) - December 12, 2025
|
|
1203
|
-
|
|
1204
|
-
```
|
|
1205
|
-
┌──────────────────────────────────────────────────────────────────────────┐
|
|
1206
|
-
│ BENCHMARK CONFIGURATION │
|
|
1207
|
-
├──────────────────────────────────────────────────────────────────────────┤
|
|
1208
|
-
│ Dataset: LUBM (Lehigh University Benchmark) Ontology │
|
|
1209
|
-
│ - 3,272 triples (LUBM-1: 1 university) │
|
|
1210
|
-
│ - Classes: Professor, GraduateStudent, Course, Department │
|
|
1211
|
-
│ - Properties: advisor, teacherOf, memberOf, worksFor │
|
|
1212
|
-
│ │
|
|
1213
|
-
│ Task: Natural Language → SPARQL Query Generation │
|
|
1214
|
-
│ Agent receives question, generates SPARQL, executes query │
|
|
1215
|
-
│ │
|
|
1216
|
-
│ K8s Cluster: rust-kgdb on Orby (1 coordinator + 3 executors) │
|
|
1217
|
-
│ Tests: 12 LUBM queries (Easy: 3, Medium: 5, Hard: 4) │
|
|
1218
|
-
│ Embeddings: NOT USED (NL-to-SPARQL benchmark, not semantic search) │
|
|
1219
|
-
│ Multi-Vector: NOT APPLICABLE │
|
|
1220
|
-
└──────────────────────────────────────────────────────────────────────────┘
|
|
1221
|
-
|
|
1222
|
-
┌──────────────────────────────────────────────────────────────────────────┐
|
|
1223
|
-
│ AGENT CREATION │
|
|
1224
|
-
├──────────────────────────────────────────────────────────────────────────┤
|
|
1225
|
-
│ Name: benchmark-agent │
|
|
1226
|
-
│ Tools: kg.sparql.query, kg.motif.find, kg.datalog.apply │
|
|
1227
|
-
│ Tracing: enabled │
|
|
1228
|
-
└──────────────────────────────────────────────────────────────────────────┘
|
|
1229
|
-
|
|
1230
|
-
┌────────────────────┬───────────┬───────────┬───────────┬───────────────┐
|
|
1231
|
-
│ Model │ Syntax % │ Exec % │ Type Errs │ Avg Latency │
|
|
1232
|
-
├────────────────────┼───────────┼───────────┼───────────┼───────────────┤
|
|
1233
|
-
│ mock │ 100.0% │ 100.0% │ 0 │ 6.1ms │
|
|
1234
|
-
│ claude-sonnet-4 │ 100.0% │ 100.0% │ 0 │ 3439.8ms │
|
|
1235
|
-
│ gpt-4o │ 100.0% │ 100.0% │ 0 │ 1613.3ms │
|
|
1236
|
-
└────────────────────┴───────────┴───────────┴───────────┴───────────────┘
|
|
1237
|
-
|
|
1238
|
-
LLM Provider Details:
|
|
1239
|
-
- Claude Sonnet 4: Anthropic API (claude-sonnet-4-20250514)
|
|
1240
|
-
- GPT-4o: OpenAI API (gpt-4o)
|
|
1241
|
-
- Mock: Pattern matching (no API calls)
|
|
1242
|
-
```
|
|
1243
|
-
|
|
1244
|
-
---
|
|
1245
|
-
|
|
1246
|
-
### KEY FINDING: Claude +91.67% Accuracy Improvement
|
|
1247
|
-
|
|
1248
|
-
**Why Claude Raw Output is 0%:**
|
|
1249
|
-
|
|
1250
|
-
Claude's raw API responses include markdown formatting:
|
|
1251
|
-
|
|
1252
|
-
```markdown
|
|
1253
|
-
Here's the SPARQL query to find professors:
|
|
1254
|
-
|
|
1255
|
-
\`\`\`sparql
|
|
1256
|
-
PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
|
|
1257
|
-
SELECT ?x WHERE { ?x a ub:Professor }
|
|
1258
|
-
\`\`\`
|
|
1259
|
-
|
|
1260
|
-
This query uses the LUBM ontology...
|
|
1261
|
-
```
|
|
1262
|
-
|
|
1263
|
-
This markdown formatting **fails SPARQL validation** because:
|
|
1264
|
-
1. Triple backticks (\`\`\`sparql) are not valid SPARQL
|
|
1265
|
-
2. Natural language explanations around the query
|
|
1266
|
-
3. Sometimes incomplete or truncated
|
|
1267
|
-
|
|
1268
|
-
**HyperMind fixes this by:**
|
|
1269
|
-
1. Forcing structured JSON tool output (not free-form text)
|
|
1270
|
-
2. Cleaning markdown artifacts from responses
|
|
1271
|
-
3. Validating SPARQL syntax before execution
|
|
1272
|
-
4. Type-checking at planning time
|
|
1273
|
-
|
|
1274
1060
|
---
|
|
1275
1061
|
|
|
1276
|
-
|
|
1062
|
+
## Mathematical Foundations
|
|
1277
1063
|
|
|
1278
|
-
|
|
1064
|
+
### Why Math Matters for AI Agents
|
|
1279
1065
|
|
|
1280
1066
|
```
|
|
1281
|
-
|
|
1282
|
-
|
|
1283
|
-
|
|
1284
|
-
|
|
1067
|
+
╔═══════════════════════════════════════════════════════════════════════════════╗
|
|
1068
|
+
║ THE PROBLEM WITH "VIBE-BASED" AI ║
|
|
1069
|
+
╠═══════════════════════════════════════════════════════════════════════════════╣
|
|
1070
|
+
║ ║
|
|
1071
|
+
║ LangChain: "Tools are just functions, YOLO!" ║
|
|
1072
|
+
║ → No type safety → Runtime errors → Production failures ║
|
|
1073
|
+
║ ║
|
|
1074
|
+
║ AutoGPT: "Let the AI figure it out!" ║
|
|
1075
|
+
║ → Hallucinated tools → Invalid calls → Infinite loops ║
|
|
1076
|
+
║ ║
|
|
1077
|
+
║ HyperMind: "Tools are mathematical morphisms with proofs" ║
|
|
1078
|
+
║ → Type-safe → Composable → Auditable → PRODUCTION-READY ║
|
|
1079
|
+
║ ║
|
|
1080
|
+
╚═══════════════════════════════════════════════════════════════════════════════╝
|
|
1285
1081
|
```
|
|
1286
1082
|
|
|
1287
|
-
|
|
1288
|
-
|
|
1289
|
-
---
|
|
1290
|
-
|
|
1291
|
-
### Example LUBM Queries We Ran
|
|
1292
|
-
|
|
1293
|
-
| # | Natural Language Question | Difficulty | Claude Raw | Claude+HM | GPT Raw | GPT+HM |
|
|
1294
|
-
|---|--------------------------|------------|------------|-----------|---------|--------|
|
|
1295
|
-
| Q1 | "Find all professors in the university database" | Easy | ❌ | ✅ | ✅ | ✅ |
|
|
1296
|
-
| Q2 | "List all graduate students" | Easy | ❌ | ✅ | ✅ | ✅ |
|
|
1297
|
-
| Q3 | "How many courses are offered?" | Easy | ❌ | ✅ | ✅ | ✅ |
|
|
1298
|
-
| Q4 | "Find all students and their advisors" | Medium | ❌ | ✅ | ✅ | ✅ |
|
|
1299
|
-
| Q5 | "List professors and the courses they teach" | Medium | ❌ | ✅ | ✅ | ✅ |
|
|
1300
|
-
| Q6 | "Find all departments and their parent universities" | Medium | ❌ | ✅ | ✅ | ✅ |
|
|
1301
|
-
| Q7 | "Count the number of students per department" | Medium | ❌ | ✅ | ✅ | ✅ |
|
|
1302
|
-
| Q8 | "Find the average credit hours for graduate courses" | Medium | ❌ | ⚠️ TYPE | ✅ | ⚠️ |
|
|
1303
|
-
| Q9 | "Find graduate students whose advisors research ML" | Hard | ❌ | ✅ | ✅ | ⚠️ TYPE |
|
|
1304
|
-
| Q10 | "List publications by professors at California universities" | Hard | ❌ | ✅ | ✅ | ⚠️ TYPE |
|
|
1305
|
-
| Q11 | "Find students in courses taught by same-dept professors" | Hard | ❌ | ✅ | ✅ | ✅ |
|
|
1306
|
-
| Q12 | "Find pairs of students sharing advisor and courses" | Hard | ❌ | ✅ | ✅ | ❌ |
|
|
1307
|
-
|
|
1308
|
-
**Legend**: ✅ = Success | ❌ = Failed | ⚠️ TYPE = Type error caught (correct behavior!)
|
|
1309
|
-
|
|
1310
|
-
---
|
|
1311
|
-
|
|
1312
|
-
### Root Cause Analysis
|
|
1313
|
-
|
|
1314
|
-
1. **Claude Raw 0%**: Claude's raw responses **always** include markdown formatting (triple backticks) which fails SPARQL validation. HyperMind's typed tool definitions force structured output.
|
|
1315
|
-
|
|
1316
|
-
2. **GPT-4o 66.67% with HyperMind (not 100%)**: The 33% "failures" are actually **type system victories**—the framework correctly caught queries that would have produced wrong results or runtime errors.
|
|
1317
|
-
|
|
1318
|
-
3. **HyperMind Value**: The framework doesn't just generate queries—it **validates correctness** at planning time, preventing silent failures.
|
|
1319
|
-
|
|
1320
|
-
---
|
|
1321
|
-
|
|
1322
|
-
### Benchmark Summary
|
|
1323
|
-
|
|
1324
|
-
| Metric | Claude WITHOUT HyperMind | Claude WITH HyperMind | Improvement |
|
|
1325
|
-
|--------|-------------------------|----------------------|-------------|
|
|
1326
|
-
| **Syntax Valid** | 0% (0/12) | 91.67% (11/12) | **+91.67%** |
|
|
1327
|
-
| **Execution Success** | 0% (0/12) | 91.67% (11/12) | **+91.67%** |
|
|
1328
|
-
| **Type Errors Caught** | 0 (no validation) | 1 | N/A |
|
|
1329
|
-
| **Avg Latency** | 222ms | 6,340ms | +6,118ms |
|
|
1330
|
-
|
|
1331
|
-
| Metric | GPT-4o WITHOUT HyperMind | GPT-4o WITH HyperMind | Note |
|
|
1332
|
-
|--------|-------------------------|----------------------|------|
|
|
1333
|
-
| **Syntax Valid** | 100% (12/12) | 66.67% (9/12) | -33% (type safety!) |
|
|
1334
|
-
| **Execution Success** | 100% (12/12) | 66.67% (9/12) | -33% (type safety!) |
|
|
1335
|
-
| **Type Errors Caught** | 0 (no validation) | 3 | **Prevented 3 runtime failures** |
|
|
1336
|
-
| **Avg Latency** | 2,940ms | 3,822ms | +882ms |
|
|
1337
|
-
|
|
1338
|
-
**LUBM Reference**: [Lehigh University Benchmark](http://swat.cse.lehigh.edu/projects/lubm/) - a widely used academic benchmark for Semantic Web knowledge base systems
|
|
1339
|
-
|
|
1340
|
-
### SDK Benchmark Results
|
|
1341
|
-
|
|
1342
|
-
| Operation | Throughput | Latency |
|
|
1343
|
-
|-----------|------------|---------|
|
|
1344
|
-
| **Single Triple Insert** | 6,438 ops/sec | 155 μs |
|
|
1345
|
-
| **Bulk Insert (1000 triples)** | 112 batches/sec | 8.96 ms |
|
|
1346
|
-
| **Simple SELECT** | 1,137 queries/sec | 880 μs |
|
|
1347
|
-
| **JOIN Query** | 295 queries/sec | 3.39 ms |
|
|
1348
|
-
| **COUNT Aggregation** | 1,158 queries/sec | 863 μs |
|
|
1349
|
-
|
|
1350
|
-
Memory efficiency: **24 bytes/triple** in Rust native memory (zero-copy).
|
|
1351
|
-
|
|
1352
|
-
### Full Documentation
|
|
1353
|
-
|
|
1354
|
-
For complete HyperMind documentation including:
|
|
1355
|
-
- Rust implementation details
|
|
1356
|
-
- All crate structures (hypermind-types, hypermind-category, hypermind-tools, hypermind-runtime)
|
|
1357
|
-
- Session types for multi-agent protocols
|
|
1358
|
-
- Python SDK examples
|
|
1359
|
-
|
|
1360
|
-
See: [HyperMind Agentic Framework Documentation](https://github.com/gonnect-uk/rust-kgdb/blob/main/docs/HYPERMIND_AGENTIC_FRAMEWORK.md)
|
|
1361
|
-
|
|
1362
|
-
---
|
|
1363
|
-
|
|
1364
|
-
## Core RDF/SPARQL Database
|
|
1365
|
-
|
|
1366
|
-
> **This npm package provides the high-performance in-memory database.**
|
|
1367
|
-
> For **distributed cluster deployment** (1B+ triples, horizontal scaling), contact: **gonnect.uk@gmail.com**
|
|
1368
|
-
|
|
1369
|
-
---
|
|
1370
|
-
|
|
1371
|
-
## Deployment Modes
|
|
1372
|
-
|
|
1373
|
-
rust-kgdb supports three deployment modes:
|
|
1374
|
-
|
|
1375
|
-
| Mode | Use Case | Scalability | This Package |
|
|
1376
|
-
|------|----------|-------------|--------------|
|
|
1377
|
-
| **In-Memory** | Development, embedded apps, testing | Single node, volatile | ✅ **Included** |
|
|
1378
|
-
| **Single Node (RocksDB/LMDB)** | Production, persistence needed | Single node, persistent | Via Rust crate |
|
|
1379
|
-
| **Distributed Cluster** | Enterprise, 1B+ triples | Horizontal scaling, 9+ partitions | Contact us |
|
|
1380
|
-
|
|
1381
|
-
### Distributed Cluster Mode (Enterprise)
|
|
1083
|
+
### Type Theory: Catching Errors Before Runtime
|
|
1382
1084
|
|
|
1383
|
-
|
|
1384
|
-
|
|
1385
|
-
|
|
1386
|
-
|
|
1387
|
-
|
|
1388
|
-
|
|
1389
|
-
|
|
1390
|
-
|
|
1391
|
-
|
|
1392
|
-
|
|
1393
|
-
|
|
1394
|
-
|
|
1395
|
-
|
|
1396
|
-
|
|
1397
|
-
|
|
1398
|
-
SELECT (COUNT(*) AS ?count) (AVG(?salary) AS ?avgSalary)
|
|
1399
|
-
WHERE {
|
|
1400
|
-
?employee <http://ex/type> <http://ex/Employee> .
|
|
1401
|
-
?employee <http://ex/salary> ?salary .
|
|
1085
|
+
```typescript
|
|
1086
|
+
// ═══════════════════════════════════════════════════════════════════════════
|
|
1087
|
+
// REFINEMENT TYPES: Constraints enforced at construction time
|
|
1088
|
+
// ═══════════════════════════════════════════════════════════════════════════
|
|
1089
|
+
|
|
1090
|
+
// RiskScore: { x: number | 0.0 <= x <= 1.0 }
|
|
1091
|
+
class RiskScore {
|
|
1092
|
+
private constructor(private readonly value: number) {}
|
|
1093
|
+
|
|
1094
|
+
static create(value: number): RiskScore {
|
|
1095
|
+
if (Number.isNaN(value) || value < 0 || value > 1) {
|
|
1096
|
+
throw new Error(`RiskScore must be 0-1, got ${value}`)
|
|
1097
|
+
}
|
|
1098
|
+
return new RiskScore(value)
|
|
1099
|
+
}
|
|
1402
1100
|
}
|
|
1403
1101
|
|
|
1404
|
-
|
|
1405
|
-
|
|
1406
|
-
|
|
1407
|
-
|
|
1408
|
-
**Request a demo: gonnect.uk@gmail.com**
|
|
1409
|
-
|
|
1410
|
-
---
|
|
1411
|
-
|
|
1412
|
-
## Why rust-kgdb?
|
|
1413
|
-
|
|
1414
|
-
| Feature | rust-kgdb | Apache Jena | RDFox |
|
|
1415
|
-
|---------|-----------|-------------|-------|
|
|
1416
|
-
| **Lookup Speed** | 2.78 µs | ~50 µs | 50-100 µs |
|
|
1417
|
-
| **Memory/Triple** | 24 bytes | 50-60 bytes | 32 bytes |
|
|
1418
|
-
| **SPARQL 1.1** | 100% | 100% | 95% |
|
|
1419
|
-
| **RDF 1.2** | 100% | Partial | No |
|
|
1420
|
-
| **WCOJ** | ✅ LeapFrog | ❌ | ❌ |
|
|
1421
|
-
| **Mobile-Ready** | ✅ iOS/Android | ❌ | ❌ |
|
|
1422
|
-
|
|
1423
|
-
---
|
|
1424
|
-
|
|
1425
|
-
## Core Technical Innovations
|
|
1426
|
-
|
|
1427
|
-
### 1. Worst-Case Optimal Joins (WCOJ)
|
|
1428
|
-
|
|
1429
|
-
Traditional databases use **nested-loop joins** with O(n²) to O(n⁴) complexity. rust-kgdb implements the **LeapFrog TrieJoin** algorithm—a worst-case optimal join that achieves O(n log n) for multi-way joins.
|
|
1430
|
-
|
|
1431
|
-
**How it works:**
|
|
1432
|
-
- **Trie Data Structure**: Triples indexed hierarchically (S→P→O) using BTreeMap for sorted access
|
|
1433
|
-
- **Variable Ordering**: Frequency-based analysis orders variables for optimal intersection
|
|
1434
|
-
- **LeapFrog Iterator**: Binary search across sorted iterators finds intersections without materializing intermediate results
|
|
1435
|
-
|
|
1436
|
-
```
|
|
1437
|
-
Query: SELECT ?x ?y ?z WHERE { ?x :p ?y . ?y :q ?z . ?x :r ?z }
|
|
1438
|
-
|
|
1439
|
-
Nested Loop: O(n³) - examines every combination
|
|
1440
|
-
WCOJ: O(n log n) - iterates in sorted order, seeks forward on mismatch
|
|
1441
|
-
```
|
|
1442
|
-
|
|
1443
|
-
| Query Pattern | Before (Nested Loop) | After (WCOJ) | Speedup |
|
|
1444
|
-
|---------------|---------------------|--------------|---------|
|
|
1445
|
-
| 3-way star | O(n³) | O(n log n) | **50-100x** |
|
|
1446
|
-
| 4+ way complex | O(n⁴) | O(n log n) | **100-1000x** |
|
|
1447
|
-
| Chain queries | O(n²) | O(n log n) | **10-20x** |
|
|
1448
|
-
|
|
1449
|
-
### 2. Sparse Matrix Engine (CSR Format)
|
|
1450
|
-
|
|
1451
|
-
Binary relations (e.g., `foaf:knows`, `rdfs:subClassOf`) are converted to **Compressed Sparse Row (CSR)** matrices for cache-efficient join evaluation:
|
|
1452
|
-
|
|
1453
|
-
- **Memory**: O(nnz) where nnz = number of edges (not O(n²))
|
|
1454
|
-
- **Matrix Multiplication**: Replaces nested-loop joins
|
|
1455
|
-
- **Transitive Closure**: Semi-naive Δ-matrix evaluation (not iterated powers)
|
|
1456
|
-
|
|
1457
|
-
```rust
|
|
1458
|
-
// Traditional: O(n²) nested loops
|
|
1459
|
-
for (s, p, o) in triples { ... }
|
|
1460
|
-
|
|
1461
|
-
// CSR Matrix: O(nnz) cache-friendly iteration
|
|
1462
|
-
row_ptr[i] → col_indices[j] → values[j]
|
|
1463
|
-
```
|
|
1464
|
-
|
|
1465
|
-
**Used for**: RDFS/OWL reasoning, transitive closure, Datalog evaluation.
|
|
1466
|
-
|
|
1467
|
-
### 3. SIMD + PGO Compiler Optimizations
|
|
1468
|
-
|
|
1469
|
-
**Zero code changes—pure compiler-level performance gains.**
|
|
1470
|
-
|
|
1471
|
-
| Optimization | Technology | Effect |
|
|
1472
|
-
|--------------|------------|--------|
|
|
1473
|
-
| **SIMD Vectorization** | AVX2/BMI2 (Intel), NEON (ARM) | 8-wide parallel operations |
|
|
1474
|
-
| **Profile-Guided Optimization** | LLVM PGO | Hot path optimization, branch prediction |
|
|
1475
|
-
| **Link-Time Optimization** | LTO (fat) | Cross-crate inlining, dead code elimination |
|
|
1476
|
-
|
|
1477
|
-
**Benchmark Results (LUBM, Intel Skylake):**
|
|
1478
|
-
|
|
1479
|
-
| Query | Before | After (SIMD+PGO) | Improvement |
|
|
1480
|
-
|-------|--------|------------------|-------------|
|
|
1481
|
-
| Q5: 2-hop chain | 230ms | 53ms | **77% faster** |
|
|
1482
|
-
| Q3: 3-way star | 177ms | 62ms | **65% faster** |
|
|
1483
|
-
| Q4: 3-hop chain | 254ms | 101ms | **60% faster** |
|
|
1484
|
-
| Q8: Triangle | 410ms | 193ms | **53% faster** |
|
|
1485
|
-
| Q7: Hierarchy | 343ms | 198ms | **42% faster** |
|
|
1486
|
-
| Q6: 6-way complex | 641ms | 464ms | **28% faster** |
|
|
1487
|
-
| Q2: 5-way star | 234ms | 183ms | **22% faster** |
|
|
1488
|
-
| Q1: 4-way star | 283ms | 258ms | **9% faster** |
|
|
1489
|
-
|
|
1490
|
-
**Average speedup: 44.5%** across all queries.
|
|
1491
|
-
|
|
1492
|
-
### 4. Quad Indexing (SPOC)
|
|
1493
|
-
|
|
1494
|
-
Four complementary indexes enable O(1) pattern matching regardless of query shape:
|
|
1495
|
-
|
|
1496
|
-
| Index | Pattern | Use Case |
|
|
1497
|
-
|-------|---------|----------|
|
|
1498
|
-
| **SPOC** | `(?s, ?p, ?o, ?g)` | Subject-centric queries |
|
|
1499
|
-
| **POCS** | `(?p, ?o, ?c, ?s)` | Property enumeration |
|
|
1500
|
-
| **OCSP** | `(?o, ?c, ?s, ?p)` | Object lookups (reverse links) |
|
|
1501
|
-
| **CSPO** | `(?c, ?s, ?p, ?o)` | Named graph iteration |
|
|
1502
|
-
|
|
1503
|
-
---
|
|
1504
|
-
|
|
1505
|
-
## Storage Backends
|
|
1506
|
-
|
|
1507
|
-
rust-kgdb uses a pluggable storage architecture. **Default is in-memory** (zero configuration). For persistence, enable RocksDB.
|
|
1508
|
-
|
|
1509
|
-
| Backend | Feature Flag | Use Case | Status |
|
|
1510
|
-
|---------|--------------|----------|--------|
|
|
1511
|
-
| **InMemory** | `default` | Development, testing, embedded | ✅ **Production Ready** |
|
|
1512
|
-
| **RocksDB** | `rocksdb-backend` | Production, large datasets | ✅ **61 tests passing** |
|
|
1513
|
-
| **LMDB** | `lmdb-backend` | Read-heavy workloads | ✅ **31 tests passing** |
|
|
1514
|
-
|
|
1515
|
-
### InMemory (Default)
|
|
1516
|
-
|
|
1517
|
-
Zero configuration, maximum performance. Data is volatile (lost on process exit).
|
|
1518
|
-
|
|
1519
|
-
**High-Performance Data Structures:**
|
|
1520
|
-
|
|
1521
|
-
| Component | Structure | Why |
|
|
1522
|
-
|-----------|-----------|-----|
|
|
1523
|
-
| **Triple Store** | `DashMap` | Lock-free concurrent hash map, 100K pre-allocation |
|
|
1524
|
-
| **WCOJ Trie** | `BTreeMap` | Sorted iteration for LeapFrog intersection |
|
|
1525
|
-
| **Dictionary** | `FxHashSet` | String interning with rustc-optimized hashing |
|
|
1526
|
-
| **Hypergraph** | `FxHashMap` | Fast node→edge adjacency lists |
|
|
1527
|
-
| **Reasoning** | `AHashMap` | RDFS/OWL inference with DoS-resistant hashing |
|
|
1528
|
-
| **Datalog** | `FxHashMap` | Semi-naive evaluation with delta propagation |
|
|
1529
|
-
|
|
1530
|
-
**Why these structures enable sub-microsecond performance:**
|
|
1531
|
-
- **DashMap**: Sharded locks (16 shards default) → near-linear scaling on multi-core
|
|
1532
|
-
- **FxHashMap**: Rust compiler's hash function → 30% faster than std HashMap
|
|
1533
|
-
- **BTreeMap**: O(log n) ordered iteration → enables binary search in LeapFrog
|
|
1534
|
-
- **Pre-allocation**: 100K capacity avoids rehashing during bulk inserts
|
|
1535
|
-
|
|
1536
|
-
```rust
|
|
1537
|
-
use storage::{QuadStore, InMemoryBackend};
|
|
1102
|
+
// PolicyNumber: { s: string | /^POL-\d{4}-\d{6}$/ }
|
|
1103
|
+
class PolicyNumber {
|
|
1104
|
+
private constructor(private readonly value: string) {}
|
|
1538
1105
|
|
|
1539
|
-
|
|
1540
|
-
|
|
1541
|
-
|
|
1542
|
-
|
|
1543
|
-
|
|
1544
|
-
|
|
1545
|
-
|
|
1106
|
+
static create(value: string): PolicyNumber {
|
|
1107
|
+
if (!/^POL-\d{4}-\d{6}$/.test(value)) {
|
|
1108
|
+
throw new Error(`Invalid policy: ${value}`)
|
|
1109
|
+
}
|
|
1110
|
+
return new PolicyNumber(value)
|
|
1111
|
+
}
|
|
1112
|
+
}
|
|
1546
1113
|
|
|
1547
|
-
|
|
1548
|
-
|
|
1549
|
-
|
|
1550
|
-
|
|
1114
|
+
// Usage:
|
|
1115
|
+
RiskScore.create(0.85) // ✅ OK
|
|
1116
|
+
RiskScore.create(1.5) // ❌ Throws: "RiskScore must be 0-1"
|
|
1117
|
+
PolicyNumber.create("POL-2024-000123") // ✅ OK
|
|
1118
|
+
PolicyNumber.create("INVALID") // ❌ Throws: "Invalid policy"
|
|
1551
1119
|
```
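
The classes above keep `value` private and expose no accessor, so callers cannot read the validated number back. One possible alternative, sketched here under the assumption of a branded-type encoding, keeps the value usable as a plain `number` while still forcing construction through a validating function. The names `Probability`, `toProbability`, and `combine` are hypothetical and are not part of the rust-kgdb API.

```typescript
// Hypothetical sketch: a refinement type encoded as a branded number.
// None of these names come from the rust-kgdb API.
type Probability = number & { readonly __brand: "Probability" }

// Smart constructor: the only sanctioned way to obtain a Probability.
function toProbability(value: number): Probability {
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new RangeError(`Probability must be within [0, 1], got ${value}`)
  }
  return value as Probability
}

// Functions that need a validated score say so in their signature.
function combine(a: Probability, b: Probability): Probability {
  // Probability that at least one of two independent risks occurs.
  return toProbability(a + b - a * b)
}

const combined = combine(toProbability(0.5), toProbability(0.5))
console.log(combined) // 0.75
```

The trade-off is that a brand can be bypassed with an explicit cast, whereas a private constructor cannot, so the class form is the stricter of the two.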
|
|
1552
1120
|
|
|
1553
|
-
|
|
1554
|
-
use storage::{QuadStore, RocksDbBackend};
|
|
1555
|
-
|
|
1556
|
-
// Create persistent database
|
|
1557
|
-
let backend = RocksDbBackend::new("/path/to/data")?;
|
|
1558
|
-
let store = QuadStore::new(backend);
|
|
1121
|
+
### Category Theory: Safe Tool Composition
|
|
1559
1122
|
|
|
1560
|
-
// Features:
|
|
1561
|
-
// - ACID transactions
|
|
1562
|
-
// - Snappy compression (automatic)
|
|
1563
|
-
// - Crash recovery
|
|
1564
|
-
// - Range & prefix scanning
|
|
1565
|
-
// - 1MB+ value support
|
|
1566
|
-
|
|
1567
|
-
// Force sync to disk
|
|
1568
|
-
store.flush()?;
|
|
1569
1123
|
```
|
|
1124
|
+
═══════════════════════════════════════════════════════════════════════════════
|
|
1125
|
+
TOOLS AS TYPED MORPHISMS
|
|
1126
|
+
═══════════════════════════════════════════════════════════════════════════════
|
|
1570
1127
|
|
|
1571
|
-
|
|
1572
|
-
- Basic CRUD operations (14 tests)
|
|
1573
|
-
- Range scanning (8 tests)
|
|
1574
|
-
- Prefix scanning (6 tests)
|
|
1575
|
-
- Batch operations (8 tests)
|
|
1576
|
-
- Transactions (8 tests)
|
|
1577
|
-
- Concurrent access (5 tests)
|
|
1578
|
-
- Unicode & binary data (4 tests)
|
|
1579
|
-
- Large key/value handling (8 tests)
|
|
1128
|
+
In category theory, a morphism is an arrow from A to B: f: A → B
|
|
1580
1129
|
|
|
1581
|
-
|
|
1130
|
+
HyperMind tools are morphisms:
|
|
1582
1131
|
|
|
1583
|
-
|
|
1132
|
+
┌────────────────────────┬──────────────────────────────────────────────────┐
|
|
1133
|
+
│ Tool │ Type Signature (Morphism) │
|
|
1134
|
+
├────────────────────────┼──────────────────────────────────────────────────┤
|
|
1135
|
+
│ kg.sparql.query │ Query → BindingSet │
|
|
1136
|
+
│ kg.sparql.construct │ Query → Graph │
|
|
1137
|
+
│ kg.motif.find │ Pattern → Matches │
|
|
1138
|
+
│ kg.datalog.apply │ (Graph, Rules) → InferredFacts │
|
|
1139
|
+
│ kg.embeddings.search │ Entity → SimilarEntities │
|
|
1140
|
+
│ kg.graphframes.pagerank│ Graph → RankScores │
|
|
1141
|
+
└────────────────────────┴──────────────────────────────────────────────────┘
|
|
1584
1142
|
|
|
1585
|
-
|
|
1586
|
-
# Cargo.toml - Enable LMDB backend
|
|
1587
|
-
[dependencies]
|
|
1588
|
-
storage = { version = "0.1.12", features = ["lmdb-backend"] }
|
|
1589
|
-
```
|
|
1143
|
+
COMPOSITION ((f ; g)(x) = g(f(x))):
|
|
1590
1144
|
|
|
1591
|
-
|
|
1592
|
-
|
|
1145
|
+
kg.sparql.query ; extractEntities ; kg.embeddings.search
|
|
1146
|
+
─────────────────────────────────────────────────────────────────
|
|
1147
|
+
Query → BindingSet → Entity[] → SimilarEntities
|
|
1593
1148
|
|
|
1594
|
-
|
|
1595
|
-
|
|
1596
|
-
|
|
1149
|
+
The composition is TYPE-SAFE:
|
|
1150
|
+
- If output type of f doesn't match input type of g, composition fails
|
|
1151
|
+
- Guaranteed at compile time, not runtime!
|
|
1597
1152
|
|
|
1598
|
-
|
|
1599
|
-
|
|
1153
|
+
LAWS (Guaranteed by HyperMind):
|
|
1154
|
+
1. Identity: id ; f = f = f ; id
|
|
1155
|
+
2. Associativity: (f ; g) ; h = f ; (g ; h)
|
|
1600
1156
|
|
|
1601
|
-
|
|
1602
|
-
// - Memory-mapped I/O (zero-copy reads)
|
|
1603
|
-
// - MVCC for concurrent readers
|
|
1604
|
-
// - Crash-safe ACID transactions
|
|
1605
|
-
// - Range & prefix scanning
|
|
1606
|
-
// - Excellent for read-heavy workloads
|
|
1607
|
-
|
|
1608
|
-
// Sync to disk
|
|
1609
|
-
store.flush()?;
|
|
1157
|
+
═══════════════════════════════════════════════════════════════════════════════
|
|
1610
1158
|
```
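
The composition rule and the two laws above can be modelled in a few lines of TypeScript. This is an illustrative sketch only: `Morphism`, `compose`, and `identity` are hypothetical helpers written for this example, and the toy `Query`, `BindingSet`, and `Entity` types stand in for whatever the SDK actually exports.

```typescript
// Hypothetical model of typed tool composition; not the rust-kgdb API.
interface Morphism<A, B> {
  readonly name: string
  readonly run: (input: A) => B
}

// f ; g: run f, then g. The shared type parameter B forces the
// output of f to match the input of g at compile time.
function compose<A, B, C>(f: Morphism<A, B>, g: Morphism<B, C>): Morphism<A, C> {
  return { name: `${f.name} ; ${g.name}`, run: (x: A) => g.run(f.run(x)) }
}

// The identity morphism returns its input unchanged.
function identity<A>(): Morphism<A, A> {
  return { name: "id", run: (x: A) => x }
}

// Toy stand-ins for the types named in the table above.
type Query = string
type BindingSet = Array<Record<string, string>>
type Entity = { id: string }

const sparqlQuery: Morphism<Query, BindingSet> = {
  name: "kg.sparql.query",
  // Stubbed result; a real tool would execute the query.
  run: () => [{ x: "entity001" }, { x: "entity002" }],
}

const extractEntities: Morphism<BindingSet, Entity[]> = {
  name: "extractEntities",
  run: (rows) => rows.map((row) => ({ id: row["x"] ?? "" })),
}

const pipeline = compose(sparqlQuery, extractEntities)
console.log(pipeline.name) // "kg.sparql.query ; extractEntities"
console.log(pipeline.run("SELECT ?x WHERE { ?x a :Fraud }"))

// Swapping the arguments does not type-check, because Entity[] is not a Query:
// compose(extractEntities, sparqlQuery)
```

Because `compose` is an ordinary function over `run`, associativity and the identity laws hold for pure tools by construction; tools with side effects would need separate care.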
|
|
1611
1159
|
|
|
1612
|
-
|
|
1613
|
-
|
|
1614
|
-
| Characteristic | LMDB | RocksDB |
|
|
1615
|
-
|----------------|------|---------|
|
|
1616
|
-
| **Read Performance** | ✅ Faster (memory-mapped) | Good |
|
|
1617
|
-
| **Write Performance** | Good | ✅ Faster (LSM-tree) |
|
|
1618
|
-
| **Concurrent Readers** | ✅ Unlimited | Limited by locks |
|
|
1619
|
-
| **Write Amplification** | Low | Higher (compaction) |
|
|
1620
|
-
| **Memory Usage** | Higher (map size) | Lower (cache-based) |
|
|
1621
|
-
| **Best For** | Read-heavy, OLAP | Write-heavy, OLTP |
|
|
1622
|
-
|
|
1623
|
-
**LMDB Test Coverage:**
|
|
1624
|
-
- Basic CRUD operations (8 tests)
|
|
1625
|
-
- Range scanning (4 tests)
|
|
1626
|
-
- Prefix scanning (3 tests)
|
|
1627
|
-
- Batch operations (3 tests)
|
|
1628
|
-
- Large key/value handling (4 tests)
|
|
1629
|
-
- Concurrent access (4 tests)
|
|
1630
|
-
- Statistics & flush (3 tests)
|
|
1631
|
-
- Edge cases (2 tests)
|
|
1632
|
-
|
|
1633
|
-
### TypeScript SDK
|
|
1634
|
-
|
|
1635
|
-
The npm package uses the in-memory backend—ideal for:
|
|
1636
|
-
- Knowledge graph queries
|
|
1637
|
-
- SPARQL execution
|
|
1638
|
-
- Data transformation pipelines
|
|
1639
|
-
- Embedded applications
|
|
1160
|
+
### Proof Theory: Every Execution Has Evidence
|
|
1640
1161
|
|
|
1641
1162
|
```typescript
|
|
1642
|
-
|
|
1163
|
+
// ═══════════════════════════════════════════════════════════════════════════
|
|
1164
|
+
// CURRY-HOWARD CORRESPONDENCE: Types ↔ Propositions, Values ↔ Proofs
|
|
1165
|
+
// ═══════════════════════════════════════════════════════════════════════════
|
|
1643
1166
|
|
|
1644
|
-
//
|
|
1645
|
-
|
|
1167
|
+
// The type signature is a PROPOSITION:
|
|
1168
|
+
// "Given a Query, I can produce a BindingSet"
|
|
1169
|
+
//
|
|
1170
|
+
// The execution is a PROOF:
|
|
1171
|
+
// "Here is the BindingSet I produced, with evidence"
|
|
1172
|
+
|
|
1173
|
+
interface ExecutionWitness {
|
|
1174
|
+
tool: string // "kg.sparql.query"
|
|
1175
|
+
inputType: TypeId // TypeId.Query
|
|
1176
|
+
outputType: TypeId // TypeId.BindingSet
|
|
1177
|
+
input: string // The actual query
|
|
1178
|
+
output: string // The actual results
|
|
1179
|
+
timestamp: Date // When executed
|
|
1180
|
+
durationMs: number // How long it took
|
|
1181
|
+
executionHash: string // SHA-256 of execution (tamper-evident)
|
|
1182
|
+
}
|
|
1646
1183
|
|
|
1647
|
-
//
|
|
1648
|
-
const
|
|
1649
|
-
|
|
1184
|
+
// Every tool execution produces a witness:
|
|
1185
|
+
const witness: ExecutionWitness = {
|
|
1186
|
+
tool: "kg.sparql.query",
|
|
1187
|
+
inputType: TypeId.Query,
|
|
1188
|
+
outputType: TypeId.BindingSet,
|
|
1189
|
+
input: "SELECT ?x WHERE { ?x a :Fraud }",
|
|
1190
|
+
output: "[{x: 'entity001'}, {x: 'entity002'}]",
|
|
1191
|
+
timestamp: new Date("2024-12-14T10:30:00Z"),
|
|
1192
|
+
durationMs: 12,
|
|
1193
|
+
executionHash: "sha256:a3f2c8d9e1b4..."
|
|
1194
|
+
}
|
|
1650
1195
|
```
|
|
1651
1196
|
|
|
1652
|
-
|
|
1653
|
-
|
|
1654
|
-
|
|
1655
|
-
|
|
1656
|
-
|
|
1657
|
-
|
|
1197
|
+
### Audit Trail (Required for Compliance)
|
|
1198
|
+
|
|
1199
|
+
```json
|
|
1200
|
+
{
|
|
1201
|
+
"analysisId": "fraud-2024-001",
|
|
1202
|
+
"timestamp": "2024-12-14T10:30:00Z",
|
|
1203
|
+
"agent": "fraud-detector",
|
|
1204
|
+
"witnesses": [
|
|
1205
|
+
{
|
|
1206
|
+
"step": 1,
|
|
1207
|
+
"tool": "kg.sparql.query",
|
|
1208
|
+
"input": "SELECT ?tx WHERE { ?tx :amount ?a . FILTER(?a > 100000) }",
|
|
1209
|
+
"output": "[{tx: 'tx001'}, {tx: 'tx002'}, {tx: 'tx003'}]",
|
|
1210
|
+
"durationMs": 12,
|
|
1211
|
+
"executionHash": "sha256:a3f2c8..."
|
|
1212
|
+
},
|
|
1213
|
+
{
|
|
1214
|
+
"step": 2,
|
|
1215
|
+
"tool": "kg.motif.find",
|
|
1216
|
+
"input": "(a)-[:sender]->(b); (b)-[:sender]->(c); (c)-[:sender]->(a)",
|
|
1217
|
+
"output": "[{a: 'e001', b: 'e002', c: 'e003'}]",
|
|
1218
|
+
"durationMs": 45,
|
|
1219
|
+
"executionHash": "sha256:b7d1e9..."
|
|
1220
|
+
},
|
|
1221
|
+
{
|
|
1222
|
+
"step": 3,
|
|
1223
|
+
"tool": "kg.graphframes.pagerank",
|
|
1224
|
+
"input": "{vertices: [...], edges: [...]}",
|
|
1225
|
+
"output": "{e001: 0.42, e002: 0.31, e003: 0.27}",
|
|
1226
|
+
"durationMs": 23,
|
|
1227
|
+
"executionHash": "sha256:c9e2f1..."
|
|
1228
|
+
}
|
|
1229
|
+
],
|
|
1230
|
+
"totalDurationMs": 80,
|
|
1231
|
+
"reproducibilityGuarantee": "Re-executing with same inputs produces identical outputs"
|
|
1232
|
+
}
|
|
1658
1233
|
```
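
A record like the one above can be checked mechanically before it is archived. The following is a minimal sketch that assumes only the field names shown in the JSON; the `AuditTrail` type and `checkAuditTrail` function are hypothetical and not part of the SDK. It verifies that steps are numbered consecutively from 1, that each hash carries the `sha256:` label, and that the per-step durations add up to `totalDurationMs`. It does not recompute the hashes themselves.

```typescript
// Hypothetical consistency check for the audit record shown above.
interface WitnessStep {
  step: number
  tool: string
  durationMs: number
  executionHash: string
}

interface AuditTrail {
  analysisId: string
  witnesses: WitnessStep[]
  totalDurationMs: number
}

// Returns a list of problems; an empty list means the record is consistent.
function checkAuditTrail(trail: AuditTrail): string[] {
  const problems: string[] = []
  trail.witnesses.forEach((w, index) => {
    if (w.step !== index + 1) {
      problems.push(`expected step ${index + 1}, found ${w.step}`)
    }
    if (!w.executionHash.startsWith("sha256:")) {
      problems.push(`step ${w.step}: hash is not labelled sha256`)
    }
  })
  const sum = trail.witnesses.reduce((acc, w) => acc + w.durationMs, 0)
  if (sum !== trail.totalDurationMs) {
    problems.push(`durations sum to ${sum}, record says ${trail.totalDurationMs}`)
  }
  return problems
}

const trail: AuditTrail = {
  analysisId: "fraud-2024-001",
  witnesses: [
    { step: 1, tool: "kg.sparql.query", durationMs: 12, executionHash: "sha256:a3f2c8..." },
    { step: 2, tool: "kg.motif.find", durationMs: 45, executionHash: "sha256:b7d1e9..." },
    { step: 3, tool: "kg.graphframes.pagerank", durationMs: 23, executionHash: "sha256:c9e2f1..." },
  ],
  totalDurationMs: 80,
}

console.log(checkAuditTrail(trail)) // [] (12 + 45 + 23 = 80)
```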
|
|
1659
1234
|
|
|
1660
|
-
### Platform Support (v0.2.1)
|
|
1661
|
-
|
|
1662
|
-
| Platform | Architecture | Status | Notes |
|
|
1663
|
-
|----------|-------------|--------|-------|
|
|
1664
|
-
| **macOS** | Intel (x64) | ✅ **Works out of the box** | Pre-built binary included |
|
|
1665
|
-
| **macOS** | Apple Silicon (arm64) | ⏳ v0.2.2 | Coming soon |
|
|
1666
|
-
| **Linux** | x64 | ⏳ v0.2.2 | Coming soon |
|
|
1667
|
-
| **Linux** | arm64 | ⏳ v0.2.2 | Coming soon |
|
|
1668
|
-
| **Windows** | x64 | ⏳ v0.2.2 | Coming soon |
|
|
1669
|
-
|
|
1670
|
-
**This release (v0.2.1)** includes pre-built binary for **macOS x64 only**. Other platforms will be added in the next release.
|
|
1671
|
-
|
|
1672
1235
|
---
|
|
1673
1236
|
|
|
1674
|
-
##
|
|
1675
|
-
|
|
1676
|
-
### Complete Working Example
|
|
1677
|
-
|
|
1678
|
-
```typescript
|
|
1679
|
-
import { GraphDB } from 'rust-kgdb'
|
|
1680
|
-
|
|
1681
|
-
// 1. Create database
|
|
1682
|
-
const db = new GraphDB('http://example.org/myapp')
|
|
1683
|
-
|
|
1684
|
-
// 2. Load data (Turtle format)
|
|
1685
|
-
db.loadTtl(`
|
|
1686
|
-
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
|
|
1687
|
-
@prefix ex: <http://example.org/> .
|
|
1688
|
-
|
|
1689
|
-
ex:alice a foaf:Person ;
|
|
1690
|
-
foaf:name "Alice" ;
|
|
1691
|
-
foaf:age 30 ;
|
|
1692
|
-
foaf:knows ex:bob, ex:charlie .
|
|
1693
|
-
|
|
1694
|
-
ex:bob a foaf:Person ;
|
|
1695
|
-
foaf:name "Bob" ;
|
|
1696
|
-
foaf:age 25 ;
|
|
1697
|
-
foaf:knows ex:charlie .
|
|
1698
|
-
|
|
1699
|
-
ex:charlie a foaf:Person ;
|
|
1700
|
-
foaf:name "Charlie" ;
|
|
1701
|
-
foaf:age 35 .
|
|
1702
|
-
`, null)
|
|
1703
|
-
|
|
1704
|
-
// 3. Query: Find friends-of-friends (WCOJ optimized!)
|
|
1705
|
-
const fof = db.querySelect(`
|
|
1706
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
1707
|
-
PREFIX ex: <http://example.org/>
|
|
1708
|
-
|
|
1709
|
-
SELECT ?person ?friend ?fof WHERE {
|
|
1710
|
-
?person foaf:knows ?friend .
|
|
1711
|
-
?friend foaf:knows ?fof .
|
|
1712
|
-
FILTER(?person != ?fof)
|
|
1713
|
-
}
|
|
1714
|
-
`)
|
|
1715
|
-
console.log('Friends of Friends:', fof)
|
|
1716
|
-
// [{ person: 'ex:alice', friend: 'ex:bob', fof: 'ex:charlie' }]
|
|
1717
|
-
|
|
1718
|
-
// 4. Aggregation: Average age
|
|
1719
|
-
const stats = db.querySelect(`
|
|
1720
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
1721
|
-
|
|
1722
|
-
SELECT (COUNT(?p) AS ?count) (AVG(?age) AS ?avgAge) WHERE {
|
|
1723
|
-
?p a foaf:Person ; foaf:age ?age .
|
|
1724
|
-
}
|
|
1725
|
-
`)
|
|
1726
|
-
console.log('Stats:', stats)
|
|
1727
|
-
// [{ count: '3', avgAge: '30.0' }]
|
|
1728
|
-
|
|
1729
|
-
// 5. ASK query
|
|
1730
|
-
const hasAlice = db.queryAsk(`
|
|
1731
|
-
PREFIX ex: <http://example.org/>
|
|
1732
|
-
ASK { ex:alice a <http://xmlns.com/foaf/0.1/Person> }
|
|
1733
|
-
`)
|
|
1734
|
-
console.log('Has Alice?', hasAlice) // true
|
|
1735
|
-
|
|
1736
|
-
// 6. CONSTRUCT query
|
|
1737
|
-
const graph = db.queryConstruct(`
|
|
1738
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
1739
|
-
PREFIX ex: <http://example.org/>
|
|
1237
|
+
## WASM Sandbox Security
|
|
1740
1238
|
|
|
1741
|
-
|
|
1742
|
-
WHERE { ?p foaf:knows ?f }
|
|
1743
|
-
`)
|
|
1744
|
-
console.log('Extracted graph:', graph)
|
|
1239
|
+
All tool executions run in isolated WASM sandboxes for enterprise security.
|
|
1745
1240
|
|
|
1746
|
-
// 7. Count and cleanup
|
|
1747
|
-
console.log('Triple count:', db.count()) // 11
|
|
1748
|
-
db.clear()
|
|
1749
1241
|
```
|
|
1242
|
+
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
1243
|
+
│ WASM SANDBOX SECURITY MODEL │
|
|
1244
|
+
├─────────────────────────────────────────────────────────────────────────────┤
|
|
1245
|
+
│ │
|
|
1246
|
+
│ Agent Request: kg.sparql.query("SELECT ?x WHERE...") │
|
|
1247
|
+
│ │
|
|
1248
|
+
│ │ │
|
|
1249
|
+
│ ▼ │
|
|
1250
|
+
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
|
1251
|
+
│ │ CAPABILITY PROXY (Permission Check) │ │
|
|
1252
|
+
│ │ │ │
|
|
1253
|
+
│ │ ✅ Agent has 'kg.sparql.query' capability │ │
|
|
1254
|
+
│ │ ❌ Agent does NOT have 'kg.sparql.update' capability │ │
|
|
1255
|
+
│ │ ❌ Agent does NOT have filesystem access │ │
|
|
1256
|
+
│ │ ❌ Agent does NOT have network access │ │
|
|
1257
|
+
│ └─────────────────────────────────────────────────────────────────────┘ │
|
|
1258
|
+
│ │ │
|
|
1259
|
+
│ ▼ │
|
|
1260
|
+
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
|
1261
|
+
│ │ WASMTIME SANDBOX │ │
|
|
1262
|
+
│ │ ┌───────────────────────────────────────────────────────────────┐ │ │
|
|
1263
|
+
│ │ │ WASM MODULE │ │ │
|
|
1264
|
+
│ │ │ │ │ │
|
|
1265
|
+
│ │ │ • Isolated linear memory (no host memory access) │ │ │
|
|
1266
|
+
│ │ │ • No filesystem access │ │ │
|
|
1267
|
+
│ │ │ • No network access │ │ │
|
|
1268
|
+
│ │ │ • CPU time limits (fuel metering: 10M ops max) │ │ │
|
|
1269
|
+
│ │ │ • Memory limits (64MB default) │ │ │
|
|
1270
|
+
│ │ │ │ │ │
|
|
1271
|
+
│ │ └───────────────────────────────────────────────────────────────┘ │ │
|
|
1272
|
+
│ └─────────────────────────────────────────────────────────────────────┘ │
|
|
1273
|
+
│ │ │
|
|
1274
|
+
│ ▼ │
|
|
1275
|
+
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
|
1276
|
+
│ │ RESULT VALIDATION │ │
|
|
1277
|
+
│ │ │ │
|
|
1278
|
+
│ │ ✅ Output type matches expected (BindingSet) │ │
|
|
1279
|
+
│ │ ✅ Output size within limits │ │
|
|
1280
|
+
│ │ ✅ Execution time within limits │ │
|
|
1281
|
+
│ └─────────────────────────────────────────────────────────────────────┘ │
|
|
1282
|
+
│ │
|
|
1283
|
+
└─────────────────────────────────────────────────────────────────────────────┘
|
|
1750
1284
|
|
|
1751
|
-
|
|
1752
|
-
|
|
1753
|
-
|
|
1754
|
-
|
|
1755
|
-
|
|
1756
|
-
|
|
1757
|
-
|
|
1758
|
-
|
|
1759
|
-
|
|
1760
|
-
|
|
1761
|
-
|
|
1285
|
+
CAPABILITY MODEL:
|
|
1286
|
+
┌─────────────────────┬────────────────────────────────────────┬─────────────┐
|
|
1287
|
+
│ Capability │ Description │ Default │
|
|
1288
|
+
├─────────────────────┼────────────────────────────────────────┼─────────────┤
|
|
1289
|
+
│ kg.sparql.query │ Execute SPARQL SELECT/ASK │ ✅ Granted │
|
|
1290
|
+
│ kg.sparql.update │ Execute SPARQL INSERT/DELETE │ ❌ Denied │
|
|
1291
|
+
│ kg.motif.find │ Pattern matching │ ✅ Granted │
|
|
1292
|
+
│ kg.embeddings.read │ Read embeddings │ ✅ Granted │
|
|
1293
|
+
│ kg.embeddings.write │ Write embeddings │ ❌ Denied │
|
|
1294
|
+
│ filesystem │ File system access │ ❌ Denied │
|
|
1295
|
+
│ network │ Network access │ ❌ Denied │
|
|
1296
|
+
└─────────────────────┴────────────────────────────────────────┴─────────────┘
|
|
1762
1297
|
```
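The permission check in the diagram can be illustrated with a small deny-by-default gate. This is a sketch under stated assumptions, not the package's implementation: `CapabilityProxy`, `isAllowed`, and `invoke` are hypothetical names, and only the capability names and default grants are taken from the table above.

```typescript
// Capability names copied from the table above.
type Capability =
  | 'kg.sparql.query'
  | 'kg.sparql.update'
  | 'kg.motif.find'
  | 'kg.embeddings.read'
  | 'kg.embeddings.write'
  | 'filesystem'
  | 'network'

// Default grants as listed in the "Default" column.
const DEFAULT_GRANTS: Record<Capability, boolean> = {
  'kg.sparql.query': true,
  'kg.sparql.update': false,
  'kg.motif.find': true,
  'kg.embeddings.read': true,
  'kg.embeddings.write': false,
  filesystem: false,
  network: false
}

// Hypothetical proxy: every tool call passes through invoke(),
// which refuses to run the callback unless the capability is granted.
class CapabilityProxy {
  private readonly grants: Record<Capability, boolean>

  constructor(overrides: Partial<Record<Capability, boolean>> = {}) {
    this.grants = { ...DEFAULT_GRANTS, ...overrides }
  }

  isAllowed(cap: Capability): boolean {
    return this.grants[cap]
  }

  invoke<T>(cap: Capability, run: () => T): T {
    if (!this.grants[cap]) {
      throw new Error(`capability denied: ${cap}`)
    }
    return run()
  }
}
```

With the defaults, `new CapabilityProxy().invoke('kg.sparql.update', ...)` throws before the callback runs, while passing `{ 'kg.sparql.update': true }` to the constructor opts a specific agent in. The fuel and memory limits shown in the diagram are enforced by the sandbox runtime and are outside the scope of this sketch.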
|
|
1763
1298
|
|
|
1764
1299
|
---
|
|
1765
1300
|
|
|
1766
|
-
##
|
|
1767
|
-
|
|
1768
|
-
### Query Forms
|
|
1769
|
-
|
|
1770
|
-
```typescript
|
|
1771
|
-
// SELECT - return bindings
|
|
1772
|
-
db.querySelect('SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10')
|
|
1773
|
-
|
|
1774
|
-
// ASK - boolean existence check
|
|
1775
|
-
db.queryAsk('ASK { <http://example.org/x> ?p ?o }')
|
|
1776
|
-
|
|
1777
|
-
// CONSTRUCT - build new graph
|
|
1778
|
-
db.queryConstruct('CONSTRUCT { ?s <http://new/prop> ?o } WHERE { ?s ?p ?o }')
|
|
1779
|
-
```
|
|
1780
|
-
|
|
1781
|
-
### Aggregates
|
|
1782
|
-
|
|
1783
|
-
```typescript
|
|
1784
|
-
db.querySelect(`
|
|
1785
|
-
SELECT ?type (COUNT(*) AS ?count) (AVG(?value) AS ?avg)
|
|
1786
|
-
WHERE { ?s a ?type ; <http://ex/value> ?value }
|
|
1787
|
-
GROUP BY ?type
|
|
1788
|
-
HAVING (COUNT(*) > 5)
|
|
1789
|
-
ORDER BY DESC(?count)
|
|
1790
|
-
`)
|
|
1791
|
-
```
|
|
1301
|
+
## API Reference
|
|
1792
1302
|
|
|
1793
|
-
###
|
|
1303
|
+
### GraphDB
|
|
1794
1304
|
|
|
1795
1305
|
```typescript
|
|
1796
|
-
|
|
1797
|
-
|
|
1798
|
-
|
|
1799
|
-
// Alternative paths
|
|
1800
|
-
db.querySelect('SELECT ?name WHERE { ?x (foaf:name|rdfs:label) ?name }')
|
|
1306
|
+
class GraphDB {
|
|
1307
|
+
constructor(baseUri: string)
|
|
1801
1308
|
|
|
1802
|
-
//
|
|
1803
|
-
|
|
1804
|
-
|
|
1309
|
+
// Load data
|
|
1310
|
+
loadTtl(ttl: string, graph: string | null): void
|
|
1311
|
+
loadNtriples(nt: string, graph: string | null): void
|
|
1805
1312
|
|
|
1806
|
-
|
|
1313
|
+
// Query
|
|
1314
|
+
querySelect(sparql: string): QueryResult[]
|
|
1315
|
+
queryAsk(sparql: string): boolean
|
|
1316
|
+
queryConstruct(sparql: string): TripleResult[]
|
|
1807
1317
|
|
|
1808
|
-
|
|
1809
|
-
|
|
1810
|
-
|
|
1811
|
-
|
|
1812
|
-
// Query specific graph
|
|
1813
|
-
db.querySelect(`
|
|
1814
|
-
SELECT ?s ?p ?o WHERE {
|
|
1815
|
-
GRAPH <http://example.org/graph1> { ?s ?p ?o }
|
|
1816
|
-
}
|
|
1817
|
-
`)
|
|
1318
|
+
// Stats
|
|
1319
|
+
countTriples(): number
|
|
1320
|
+
getVersion(): string
|
|
1321
|
+
}
|
|
1818
1322
|
```
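As a usage illustration of the methods listed above, here is a minimal sketch written against a structural interface rather than the package import, so it can be read and type-checked on its own. `GraphDbLike` and `loadAndCheck` are hypothetical names; passing `null` as the graph argument is assumed to target the default graph.

```typescript
// Structural subset of the GraphDB class listed above; only these three
// methods are used, so any object with the same shape can be passed in.
interface GraphDbLike {
  loadTtl(ttl: string, graph: string | null): void
  queryAsk(sparql: string): boolean
  countTriples(): number
}

// Hypothetical helper: loads Turtle (assumed: null = default graph), then
// reports how many triples were added and whether an ASK query now matches.
function loadAndCheck(
  db: GraphDbLike,
  ttl: string,
  ask: string
): { added: number; matched: boolean } {
  const before = db.countTriples()
  db.loadTtl(ttl, null)
  return { added: db.countTriples() - before, matched: db.queryAsk(ask) }
}
```

A `GraphDB` instance created with `new GraphDB('http://example.org/myapp')` should satisfy this interface, given the signatures in the listing. Note that the listing names the counter `countTriples()`, so callers of the removed `count()` method need updating.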
|
|
1819
1323
|
|
|
1820
|
-
###
|
|
1324
|
+
### GraphFrame
|
|
1821
1325
|
|
|
1822
1326
|
```typescript
|
|
1823
|
-
|
|
1824
|
-
|
|
1825
|
-
|
|
1826
|
-
|
|
1827
|
-
|
|
1828
|
-
|
|
1829
|
-
|
|
1830
|
-
|
|
1831
|
-
|
|
1832
|
-
|
|
1833
|
-
|
|
1834
|
-
|
|
1835
|
-
|
|
1836
|
-
|
|
1837
|
-
|
|
1838
|
-
|
|
1839
|
-
|
|
1840
|
-
// Verify insert
|
|
1841
|
-
const count = db.count()
|
|
1842
|
-
console.log(`Total triples after insert: ${count}`)
|
|
1843
|
-
|
|
1844
|
-
// DELETE WHERE - Remove matching triples
|
|
1845
|
-
db.updateDelete(`
|
|
1846
|
-
PREFIX ex: <http://example.org/>
|
|
1847
|
-
DELETE WHERE { ?s ex:status "completed" }
|
|
1848
|
-
`)
|
|
1327
|
+
class GraphFrame {
|
|
1328
|
+
constructor(vertices: string, edges: string)
|
|
1329
|
+
|
|
1330
|
+
// Properties
|
|
1331
|
+
vertexCount(): number
|
|
1332
|
+
edgeCount(): number
|
|
1333
|
+
|
|
1334
|
+
// Algorithms
|
|
1335
|
+
pageRank(damping: number, iterations: number): string
|
|
1336
|
+
connectedComponents(): string
|
|
1337
|
+
shortestPaths(landmarks: string[]): string
|
|
1338
|
+
triangleCount(): number
|
|
1339
|
+
labelPropagation(iterations: number): string
|
|
1340
|
+
|
|
1341
|
+
// Pattern matching
|
|
1342
|
+
find(pattern: string): string
|
|
1343
|
+
}
|
|
1849
1344
|
```
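To make the `damping` and `iterations` parameters of `pageRank` concrete, the sketch below runs the textbook power-iteration form of PageRank on an in-memory edge list. It is an illustration only: `Edge` and `pageRankSketch` are hypothetical names, ranks are normalized to sum to 1, and the package's own implementation, normalization, and string output format may differ.

```typescript
type Edge = { src: string; dst: string }

// Textbook PageRank by power iteration.
// Assumption: every edge endpoint also appears in `vertices`.
function pageRankSketch(
  vertices: string[],
  edges: Edge[],
  damping: number,
  iterations: number
): Record<string, number> {
  const n = vertices.length
  const outDegree: Record<string, number> = {}
  let rank: Record<string, number> = {}
  for (const v of vertices) {
    outDegree[v] = 0
    rank[v] = 1 / n // start from a uniform distribution
  }
  for (const e of edges) {
    outDegree[e.src] = (outDegree[e.src] ?? 0) + 1
  }

  for (let i = 0; i < iterations; i++) {
    // Rank held by vertices without outgoing edges is spread evenly,
    // so the total rank stays at 1.
    let dangling = 0
    for (const v of vertices) {
      if (outDegree[v] === 0) dangling += rank[v] ?? 0
    }
    const base = (1 - damping) / n + (damping * dangling) / n
    const next: Record<string, number> = {}
    for (const v of vertices) next[v] = base
    // Each vertex passes an equal share of its rank along every out-edge.
    for (const e of edges) {
      const share = (rank[e.src] ?? 0) / (outDegree[e.src] ?? 1)
      next[e.dst] = (next[e.dst] ?? 0) + damping * share
    }
    rank = next
  }
  return rank
}
```

A higher `damping` value gives more weight to link structure and less to the uniform reset term; more `iterations` bring the result closer to the fixed point at proportional cost.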
|
|
1850
1345
|
|
|
1851
|
-
###
|
|
1346
|
+
### EmbeddingService
|
|
1852
1347
|
|
|
1853
1348
|
```typescript
|
|
1854
|
-
|
|
1855
|
-
|
|
1856
|
-
|
|
1857
|
-
const db = new GraphDB('http://example.org/bulk-load')
|
|
1349
|
+
class EmbeddingService {
|
|
1350
|
+
constructor()
|
|
1858
1351
|
|
|
1859
|
-
//
|
|
1860
|
-
|
|
1861
|
-
|
|
1352
|
+
// Vector operations
|
|
1353
|
+
storeVector(id: string, vector: number[]): void
|
|
1354
|
+
getVector(id: string): number[] | null
|
|
1355
|
+
countVectors(): number
|
|
1862
1356
|
|
|
1863
|
-
//
|
|
1864
|
-
|
|
1865
|
-
db.loadTtl(orgData, 'http://example.org/graphs/org')
|
|
1357
|
+
// Similarity search
|
|
1358
|
+
findSimilar(id: string, k: number, threshold: number): string
|
|
1866
1359
|
|
|
1867
|
-
//
|
|
1868
|
-
|
|
1869
|
-
|
|
1870
|
-
|
|
1871
|
-
console.log(`Loaded ${db.count()} triples`)
|
|
1872
|
-
|
|
1873
|
-
// Query across all graphs
|
|
1874
|
-
const results = db.querySelect(`
|
|
1875
|
-
SELECT ?g (COUNT(*) AS ?count) WHERE {
|
|
1876
|
-
GRAPH ?g { ?s ?p ?o }
|
|
1877
|
-
}
|
|
1878
|
-
GROUP BY ?g
|
|
1879
|
-
`)
|
|
1880
|
-
console.log('Triples per graph:', results)
|
|
1881
|
-
```
|
|
1882
|
-
|
|
1883
|
-
---
|
|
1884
|
-
|
|
1885
|
-
## Sample Application
|
|
1886
|
-
|
|
1887
|
-
### Knowledge Graph Demo
|
|
1888
|
-
|
|
1889
|
-
A complete, production-ready sample application demonstrating enterprise knowledge graph capabilities is available in the repository.
|
|
1890
|
-
|
|
1891
|
-
**Location**: [`examples/knowledge-graph-demo/`](../../examples/knowledge-graph-demo/)
|
|
1892
|
-
|
|
1893
|
-
**Features Demonstrated**:
|
|
1894
|
-
- Complete organizational knowledge graph (employees, departments, projects, skills)
|
|
1895
|
-
- SPARQL SELECT queries with star and chain patterns (WCOJ-optimized)
|
|
1896
|
-
- Aggregations (COUNT, AVG, GROUP BY, HAVING)
|
|
1897
|
-
- Property paths for transitive closure (organizational hierarchy)
|
|
1898
|
-
- SPARQL ASK and CONSTRUCT queries
|
|
1899
|
-
- Named graphs for multi-tenant data isolation
|
|
1900
|
-
- Data export to Turtle format
|
|
1901
|
-
|
|
1902
|
-
**Run the Demo**:
|
|
1903
|
-
|
|
1904
|
-
```bash
|
|
1905
|
-
cd examples/knowledge-graph-demo
|
|
1906
|
-
npm install
|
|
1907
|
-
npm start
|
|
1360
|
+
// Composite embeddings
|
|
1361
|
+
storeComposite(id: string, embeddings: string): void
|
|
1362
|
+
findSimilarComposite(id: string, k: number, threshold: number, strategy: string): string
|
|
1363
|
+
}
|
|
1908
1364
|
```
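The roles of `k` and `threshold` in `findSimilar` can be shown with a brute-force search over stored vectors. This is a sketch under assumptions: it uses cosine similarity, which the listing above does not confirm as the package's metric, and `cosine` and `findSimilarSketch` are hypothetical names operating on a plain object rather than on `EmbeddingService`.

```typescript
// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  if (normA === 0 || normB === 0) return 0
  return dot / Math.sqrt(normA * normB)
}

// Brute-force top-k: score every other vector, drop those below the
// threshold, sort by descending score, and keep at most k results.
function findSimilarSketch(
  store: Record<string, number[]>,
  id: string,
  k: number,
  threshold: number
): Array<{ id: string; score: number }> {
  const query = store[id]
  if (!query) return []
  const hits: Array<{ id: string; score: number }> = []
  for (const otherId of Object.keys(store)) {
    if (otherId === id) continue // never return the query itself
    const score = cosine(query, store[otherId] ?? [])
    if (score >= threshold) hits.push({ id: otherId, score })
  }
  hits.sort((x, y) => y.score - x.score)
  return hits.slice(0, k)
}
```

The threshold is applied before the cut to `k`, so a strict threshold can return fewer than `k` results. A linear scan like this is only practical for small stores; it is meant to clarify the parameters, not to suggest how the package indexes vectors.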
|
|
1909
1365
|
|
|
1910
|
-
|
|
1911
|
-
|
|
1912
|
-
The demo creates a realistic knowledge graph with:
|
|
1913
|
-
- 5 employees across 4 departments
|
|
1914
|
-
- 13 technical and soft skills
|
|
1915
|
-
- 2 software projects
|
|
1916
|
-
- Reporting hierarchies and salary data
|
|
1917
|
-
- Named graph for sensitive compensation data
|
|
1918
|
-
|
|
1919
|
-
**Example Query from Demo** (finds all direct and indirect reports):
|
|
1366
|
+
### DatalogProgram
|
|
1920
1367
|
|
|
1921
1368
|
```typescript
|
|
1922
|
-
|
|
1923
|
-
|
|
1924
|
-
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
|
|
1925
|
-
|
|
1926
|
-
SELECT ?employee ?name WHERE {
|
|
1927
|
-
?employee ex:reportsTo+ ex:alice . # Transitive closure
|
|
1928
|
-
?employee foaf:name ?name .
|
|
1929
|
-
}
|
|
1930
|
-
ORDER BY ?name
|
|
1931
|
-
`
|
|
1932
|
-
const results = db.querySelect(pathQuery)
|
|
1933
|
-
```
|
|
1369
|
+
class DatalogProgram {
|
|
1370
|
+
constructor()
|
|
1934
1371
|
|
|
1935
|
-
|
|
1936
|
-
|
|
1937
|
-
|
|
1938
|
-
|
|
1939
|
-
|
|
1372
|
+
// Facts and rules
|
|
1373
|
+
addFact(fact: string): void
|
|
1374
|
+
addRule(rule: string): void
|
|
1375
|
+
factCount(): number
|
|
1376
|
+
ruleCount(): number
|
|
1940
1377
|
|
|
1941
|
-
|
|
1378
|
+
// Evaluation
|
|
1379
|
+
evaluate(): void
|
|
1942
1380
|
|
|
1943
|
-
|
|
1944
|
-
|
|
1945
|
-
constructor(baseUri: string) // Create with base URI
|
|
1946
|
-
static inMemory(): GraphDB // Create anonymous in-memory DB
|
|
1947
|
-
|
|
1948
|
-
// Data Loading
|
|
1949
|
-
loadTtl(data: string, graph: string | null): void
|
|
1950
|
-
loadNTriples(data: string, graph: string | null): void
|
|
1951
|
-
|
|
1952
|
-
// SPARQL Queries (WCOJ-optimized)
|
|
1953
|
-
querySelect(sparql: string): Array<Record<string, string>>
|
|
1954
|
-
queryAsk(sparql: string): boolean
|
|
1955
|
-
queryConstruct(sparql: string): string // Returns N-Triples
|
|
1956
|
-
|
|
1957
|
-
// SPARQL Updates
|
|
1958
|
-
updateInsert(sparql: string): void
|
|
1959
|
-
updateDelete(sparql: string): void
|
|
1960
|
-
|
|
1961
|
-
// Database Operations
|
|
1962
|
-
count(): number
|
|
1963
|
-
clear(): void
|
|
1964
|
-
getVersion(): string
|
|
1965
|
-
}
|
|
1966
|
-
```
|
|
1967
|
-
|
|
1968
|
-
### Node Class
|
|
1969
|
-
|
|
1970
|
-
```typescript
|
|
1971
|
-
class Node {
|
|
1972
|
-
static iri(uri: string): Node
|
|
1973
|
-
static literal(value: string): Node
|
|
1974
|
-
static langLiteral(value: string, lang: string): Node
|
|
1975
|
-
static typedLiteral(value: string, datatype: string): Node
|
|
1976
|
-
static integer(value: number): Node
|
|
1977
|
-
static boolean(value: boolean): Node
|
|
1978
|
-
static blank(id: string): Node
|
|
1381
|
+
// Query
|
|
1382
|
+
query(pattern: string): string
|
|
1979
1383
|
}
|
|
1980
1384
|
```
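What an `evaluate()` step does conceptually can be shown with naive bottom-up evaluation of one classic Datalog program, transitive closure. The sketch is illustrative only: `transitiveClosure` is a hypothetical function that hard-codes the two rules given in its comments in generic Datalog notation, which is not necessarily the string syntax that `addFact` and `addRule` accept.

```typescript
type Pair = [string, string]

// Naive fixpoint evaluation of:
//   path(X, Y) :- edge(X, Y).
//   path(X, Z) :- path(X, Y), edge(Y, Z).
// Rules are re-applied until a full pass derives no new fact.
function transitiveClosure(edges: Pair[]): Pair[] {
  const seen: Record<string, true> = {}
  const path: Pair[] = []

  // Adds a derived fact once; returns true only if it was new.
  const add = (x: string, y: string): boolean => {
    const key = JSON.stringify([x, y])
    if (seen[key]) return false
    seen[key] = true
    path.push([x, y])
    return true
  }

  // First rule: every edge is a path.
  for (const [x, y] of edges) add(x, y)

  // Second rule, repeated until nothing changes (the fixpoint).
  let changed = true
  while (changed) {
    changed = false
    for (const [x, y] of path.slice()) {
      for (const [y2, z] of edges) {
        if (y === y2 && add(x, z)) changed = true
      }
    }
  }
  return path
}
```

Because the set of possible `path` facts is finite and facts are only ever added, the loop always terminates, including on cyclic graphs. Production engines typically use semi-naive evaluation to avoid re-deriving known facts on each pass.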
|
|
1981
1385
|
|
|
1982
1386
|
---
|
|
1983
1387
|
|
|
1984
|
-
##
|
|
1985
|
-
|
|
1986
|
-
|
|
1987
|
-
|
|
1988
|
-
|
|
1989
|
-
|
|
1990
|
-
|
|
1991
|
-
|
|
1992
|
-
|
|
1993
|
-
|
|
1994
|
-
|
|
1995
|
-
|
|
1996
|
-
|
|
1997
|
-
|
|
1998
|
-
|
|
1388
|
+
## Business Value
|
|
1389
|
+
|
|
1390
|
+
```
|
|
1391
|
+
╔═══════════════════════════════════════════════════════════════════════════════╗
|
|
1392
|
+
║ BUSINESS IMPACT ║
|
|
1393
|
+
╠═══════════════════════════════════════════════════════════════════════════════╣
|
|
1394
|
+
║ ║
|
|
1395
|
+
║ ┌─────────────────────────────────────────────────────────────────────────┐ ║
|
|
1396
|
+
║ │ ROI METRICS │ ║
|
|
1397
|
+
║ ├─────────────────────────────────────────────────────────────────────────┤ ║
|
|
1398
|
+
║ │ │ ║
|
|
1399
|
+
║ │ Query Success Rate: 0% → 86% (+86 percentage points) │ ║
|
|
1400
|
+
║ │ Development Time: Days → Minutes (100x faster) │ ║
|
|
1401
|
+
║ │ Type Errors: High → Zero (eliminated) │ ║
|
|
1402
|
+
║ │ Audit Compliance: None → Full provenance (SOX/GDPR ready) │ ║
|
|
1403
|
+
║ │ │ ║
|
|
1404
|
+
║ └─────────────────────────────────────────────────────────────────────────┘ ║
|
|
1405
|
+
║ ║
|
|
1406
|
+
║ ┌─────────────────────────────────────────────────────────────────────────┐ ║
|
|
1407
|
+
║ │ USE CASES ENABLED │ ║
|
|
1408
|
+
║ ├─────────────────────────────────────────────────────────────────────────┤ ║
|
|
1409
|
+
║ │ │ ║
|
|
1410
|
+
║ │ 🏦 Financial Services: Fraud detection with explainable reasoning │ ║
|
|
1411
|
+
║ │ 🏥 Healthcare: Drug interaction queries with type safety │ ║
|
|
1412
|
+
║ │ ⚖️ Legal/Compliance: Regulatory queries with full provenance │ ║
|
|
1413
|
+
║ │ 🏭 Manufacturing: Supply chain reasoning with guarantees │ ║
|
|
1414
|
+
║ │ 🛡️ Insurance: Underwriting with mathematical risk models │ ║
|
|
1415
|
+
║ │ │ ║
|
|
1416
|
+
║ └─────────────────────────────────────────────────────────────────────────┘ ║
|
|
1417
|
+
║ ║
|
|
1418
|
+
╚═══════════════════════════════════════════════════════════════════════════════╝
|
|
1999
1419
|
```
|
|
2000
|
-
Triple: 24 bytes
|
|
2001
|
-
├── Subject: 8 bytes (dictionary ID)
|
|
2002
|
-
├── Predicate: 8 bytes (dictionary ID)
|
|
2003
|
-
└── Object: 8 bytes (dictionary ID)
|
|
2004
|
-
|
|
2005
|
-
String Interning: All URIs/literals stored once in Dictionary
|
|
2006
|
-
Index Overhead: ~4x base triple size (4 indexes)
|
|
2007
|
-
Total: ~120 bytes/triple including indexes
|
|
2008
|
-
```
|
|
2009
|
-
|
|
2010
|
-
---
|
|
2011
|
-
|
|
2012
|
-
## Performance Benchmarks
|
|
2013
|
-
|
|
2014
|
-
### By Deployment Mode
|
|
2015
|
-
|
|
2016
|
-
| Mode | Lookup | Insert | Memory | Dataset Size |
|
|
2017
|
-
|------|--------|--------|--------|--------------|
|
|
2018
|
-
| **In-Memory (npm)** | 2.78 µs | 146K/sec | 24 bytes/triple | <10M triples |
|
|
2019
|
-
| **Single Node (RocksDB)** | 5-10 µs | 100K/sec | On-disk | <100M triples |
|
|
2020
|
-
| **Distributed Cluster** | 10-50 µs | 500K+/sec* | Distributed | **1B+ triples** |
|
|
2021
|
-
|
|
2022
|
-
*Aggregate throughput across all executors with HDRF partitioning
|
|
2023
|
-
|
|
2024
|
-
### SIMD + PGO Query Performance (LUBM Benchmark)
|
|
2025
|
-
|
|
2026
|
-
| Query | Pattern | Time | Improvement |
|
|
2027
|
-
|-------|---------|------|-------------|
|
|
2028
|
-
| Q5 | 2-hop chain | 53ms | **77% faster** |
|
|
2029
|
-
| Q3 | 3-way star | 62ms | **65% faster** |
|
|
2030
|
-
| Q4 | 3-hop chain | 101ms | **60% faster** |
|
|
2031
|
-
| Q8 | Triangle | 193ms | **53% faster** |
|
|
2032
|
-
| Q7 | Hierarchy | 198ms | **42% faster** |
|
|
2033
|
-
|
|
2034
|
-
**Average: 44.5% speedup** with zero code changes (compiler optimizations only).
|
|
2035
|
-
|
|
2036
|
-
---
|
|
2037
|
-
|
|
2038
|
-
## Version History
|
|
2039
|
-
|
|
2040
|
-
### v0.2.2 (2025-12-08) - Enhanced Documentation
|
|
2041
|
-
|
|
2042
|
-
- Added comprehensive INSERT DATA examples with PREFIX syntax
|
|
2043
|
-
- Added bulk data loading example with named graphs
|
|
2044
|
-
- Enhanced SPARQL UPDATE section with real-world patterns
|
|
2045
|
-
- Improved documentation for data import workflows
|
|
2046
|
-
|
|
2047
|
-
### v0.2.1 (2025-12-08) - npm Platform Fix
|
|
2048
|
-
|
|
2049
|
-
- Fixed native module loading for platform-specific binaries
|
|
2050
|
-
- This release includes pre-built binary for **macOS x64** only
|
|
2051
|
-
- Other platforms coming in next release
|
|
2052
|
-
|
|
2053
|
-
### v0.2.0 (2025-12-08) - Distributed Cluster Support
|
|
2054
|
-
|
|
2055
|
-
- **NEW: Distributed cluster architecture** with HDRF partitioning
|
|
2056
|
-
- **Subject-Hash Filter** for accurate COUNT deduplication across replicas
|
|
2057
|
-
- **Arrow-powered OLAP** query path for high-performance analytical queries
|
|
2058
|
-
- Coordinator-Executor pattern with gRPC communication
|
|
2059
|
-
- 9-partition default for optimal data distribution
|
|
2060
|
-
- **Contact for cluster deployment**: gonnect.uk@gmail.com
|
|
2061
|
-
- **Coming soon**: Embedding support for semantic search (v0.3.0)
|
|
2062
|
-
|
|
2063
|
-
### v0.1.12 (2025-12-01) - LMDB Backend Release
|
|
2064
|
-
|
|
2065
|
-
- **LMDB storage backend** fully implemented (31 tests passing)
|
|
2066
|
-
- Memory-mapped I/O for optimal read performance
|
|
2067
|
-
- MVCC concurrency for unlimited concurrent readers
|
|
2068
|
-
- Complete LMDB vs RocksDB comparison documentation
|
|
2069
|
-
- Sample application with 87 triples demonstrating all features
|
|
2070
|
-
|
|
2071
|
-
### v0.1.9 (2025-12-01) - SIMD + PGO Release
|
|
2072
|
-
|
|
2073
|
-
- **44.5% average speedup** via SIMD + PGO compiler optimizations
|
|
2074
|
-
- WCOJ execution with LeapFrog TrieJoin
|
|
2075
|
-
- Release automation infrastructure
|
|
2076
|
-
- All packages updated to gonnect-uk namespace
|
|
2077
|
-
|
|
2078
|
-
### v0.1.8 (2025-12-01) - WCOJ Execution
|
|
2079
|
-
|
|
2080
|
-
- WCOJ execution path activated
|
|
2081
|
-
- Variable ordering analysis for optimal joins
|
|
2082
|
-
- 577 tests passing
|
|
2083
|
-
|
|
2084
|
-
### v0.1.7 (2025-11-30)
|
|
2085
|
-
|
|
2086
|
-
- Query optimizer with automatic strategy selection
|
|
2087
|
-
- WCOJ algorithm integration (planning phase)
|
|
2088
|
-
|
|
2089
|
-
### v0.1.3 (2025-11-18)
|
|
2090
|
-
|
|
2091
|
-
- Initial TypeScript SDK
|
|
2092
|
-
- 100% W3C SPARQL 1.1 compliance
|
|
2093
|
-
- 100% W3C RDF 1.2 compliance
|
|
2094
|
-
|
|
2095
|
-
---
|
|
2096
|
-
|
|
2097
|
-
## Use Cases
|
|
2098
|
-
|
|
2099
|
-
| Domain | Application |
|
|
2100
|
-
|--------|-------------|
|
|
2101
|
-
| **Knowledge Graphs** | Enterprise ontologies, taxonomies |
|
|
2102
|
-
| **Semantic Search** | Structured queries over unstructured data |
|
|
2103
|
-
| **Data Integration** | ETL with SPARQL CONSTRUCT |
|
|
2104
|
-
| **Compliance** | SHACL validation, provenance tracking |
|
|
2105
|
-
| **Graph Analytics** | Pattern detection, community analysis |
|
|
2106
|
-
| **Mobile Apps** | Embedded RDF on iOS/Android |
|
|
2107
|
-
|
|
2108
|
-
---
|
|
2109
|
-
|
|
2110
|
-
## Links
|
|
2111
|
-
|
|
2112
|
-
- [GitHub Repository](https://github.com/gonnect-uk/rust-kgdb)
|
|
2113
|
-
- [Documentation](https://github.com/gonnect-uk/rust-kgdb/tree/main/docs)
|
|
2114
|
-
- [CHANGELOG](https://github.com/gonnect-uk/rust-kgdb/blob/main/CHANGELOG.md)
|
|
2115
|
-
- [W3C SPARQL 1.1](https://www.w3.org/TR/sparql11-query/)
|
|
2116
|
-
- [W3C RDF 1.2](https://www.w3.org/TR/rdf12-concepts/)
|
|
2117
1420
|
|
|
2118
1421
|
---
|
|
2119
1422
|
|
|
2120
1423
|
## License
|
|
2121
1424
|
|
|
2122
|
-
Apache
|
|
1425
|
+
Apache-2.0
|
|
2123
1426
|
|
|
2124
|
-
|
|
1427
|
+
## Contributing
|
|
2125
1428
|
|
|
2126
|
-
|
|
1429
|
+
Issues and PRs are welcome at [github.com/gonnect-uk/rust-kgdb](https://github.com/gonnect-uk/rust-kgdb).
|