rust-kgdb 0.6.64 → 0.6.67
- package/README.md +782 -294
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# rust-kgdb
|
|
2
2
|
|
|
3
|
-
High-performance
|
|
3
|
+
High-performance embedded knowledge graph database with neuro-symbolic AI agent framework.
|
|
4
4
|
|
|
5
5
|
## The Problem With AI Today
|
|
6
6
|
|
|
@@ -20,42 +20,27 @@ This keeps happening:
|
|
|
20
20
|
|
|
21
21
|
Every time, the same pattern: The AI sounds confident. The AI is wrong. People get hurt.
|
|
22
22
|
|
|
23
|
-
## The Solution
|
|
23
|
+
## The Solution: Grounded AI
|
|
24
24
|
|
|
25
|
-
What if AI stopped
|
|
25
|
+
What if AI stopped inventing answers and started querying real data?
|
|
26
26
|
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
The AI translates intent into queries. The database finds facts. The AI never makes up data.
|
|
32
|
-
|
|
33
|
-
rust-kgdb is a knowledge graph database with an AI layer that cannot hallucinate because it only returns data from your actual systems.
|
|
34
|
-
|
|
35
|
-
## The Business Value
|
|
36
|
-
|
|
37
|
-
For Enterprises:
|
|
38
|
-
- Zero hallucinations - Every answer traces back to your actual data
|
|
39
|
-
- Full audit trail - Regulators can verify every AI decision (SOX, GDPR, FDA 21 CFR Part 11)
|
|
40
|
-
- No infrastructure - Runs embedded in your app, no servers to manage
|
|
27
|
+
```
|
|
28
|
+
Traditional LLM:
|
|
29
|
+
User Question --> LLM --> Hallucinated Answer
|
|
41
30
|
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
- 132K writes/sec - Handle enterprise transaction volumes
|
|
31
|
+
Grounded AI (rust-kgdb + HyperAgent):
|
|
32
|
+
User Question --> LLM Plans Query --> Database Executes --> Verified Answer
|
|
33
|
+
```
|
|
46
34
|
|
|
47
|
-
|
|
48
|
-
- 86.4% SPARQL accuracy - vs 0% with vanilla LLMs on LUBM benchmark
|
|
49
|
-
- 16ms similarity search - Find related entities across 10K vectors
|
|
50
|
-
- Schema-aware generation - AI uses YOUR ontology, not guessed class names
|
|
35
|
+
The AI translates intent into queries. The database finds facts. The AI never makes up data.
|
|
51
36
|
|
|
52
37
|
## What Is rust-kgdb?
|
|
53
38
|
|
|
54
|
-
|
|
39
|
+
**rust-kgdb** is two things in one npm package:
|
|
55
40
|
|
|
56
|
-
###
|
|
41
|
+
### 1. Embedded Knowledge Graph Database (rust-kgdb Core)
|
|
57
42
|
|
|
58
|
-
A high-performance RDF/SPARQL database that runs inside your application. No server. No Docker. No config.
|
|
43
|
+
A high-performance RDF/SPARQL database that runs inside your application. No server. No Docker. No config. Like SQLite for knowledge graphs.
|
|
59
44
|
|
|
60
45
|
```
|
|
61
46
|
+-----------------------------------------------------------------------------+
|
|
@@ -71,123 +56,270 @@ A high-performance RDF/SPARQL database that runs inside your application. No ser
|
|
|
71
56
|
+-----------------------------------------------------------------------------+
|
|
72
57
|
```
|
|
73
58
|
|
|
74
|
-
|
|
75
|
-
|--------|-----------|-------|-------------|
|
|
76
|
-
| Lookup | 449 ns | 5,000+ ns | 10,000+ ns |
|
|
77
|
-
| Memory/Triple | 24 bytes | 32 bytes | 50-60 bytes |
|
|
78
|
-
| Bulk Insert | 146K/sec | 200K/sec | 50K/sec |
|
|
79
|
-
|
|
80
|
-
Like SQLite - but for knowledge graphs.
|
|
81
|
-
|
|
82
|
-
### HyperMind: Neuro-Symbolic Agent Framework
|
|
59
|
+
### 2. Neuro-Symbolic AI Framework (HyperAgent)
|
|
83
60
|
|
|
84
61
|
An AI agent layer that uses the database to prevent hallucinations. The LLM plans, the database executes.
|
|
85
62
|
|
|
86
63
|
```
|
|
87
64
|
+-----------------------------------------------------------------------------+
|
|
88
|
-
|
|
|
65
|
+
| HYPERAGENT FRAMEWORK |
|
|
89
66
|
| |
|
|
90
67
|
| +-----------+ +-----------+ +-----------+ +-----------+ |
|
|
91
|
-
| |LLMPlanner | |
|
|
92
|
-
| |(Claude/GPT| |
|
|
68
|
+
| |LLMPlanner | | Memory | | ProofDAG | |WasmSandbox| |
|
|
69
|
+
| |(Claude/GPT| |(Hypergraph| | (Audit) | | (Security)| |
|
|
93
70
|
| +-----------+ +-----------+ +-----------+ +-----------+ |
|
|
94
71
|
| |
|
|
95
|
-
| Type Theory:
|
|
96
|
-
| Category Theory: Tools
|
|
97
|
-
| Proof Theory: Every execution produces cryptographic audit trail
|
|
72
|
+
| Type Theory: Tools have typed signatures (Query -> BindingSet) |
|
|
73
|
+
| Category Theory: Tools compose safely (f . g verified at plan time) |
|
|
74
|
+
| Proof Theory: Every execution produces cryptographic audit trail |
|
|
98
75
|
+-----------------------------------------------------------------------------+
|
|
99
76
|
```
|
|
100
77
|
|
|
101
|
-
|
|
102
|
-
|-----------|---------------|-------------|
|
|
103
|
-
| Vanilla LLM | 0% | - |
|
|
104
|
-
| LangChain | 0% | 71.4% |
|
|
105
|
-
| DSPy | 14.3% | 71.4% |
|
|
106
|
-
| HyperMind | - | 71.4% |
|
|
78
|
+
### How They Work Together
|
|
107
79
|
|
|
108
|
-
|
|
80
|
+
```
|
|
81
|
+
+-----------------------------------------------------------------------------------+
|
|
82
|
+
| USER: "Find providers with suspicious billing patterns" |
|
|
83
|
+
+-----------------------------------------------------------------------------------+
|
|
84
|
+
|
|
|
85
|
+
v
|
|
86
|
+
+-----------------------------------------------------------------------------------+
|
|
87
|
+
| HYPERAGENT: Intent Analysis (deterministic, no LLM) |
|
|
88
|
+
| Keywords: "suspicious" -> FRAUD_DETECTION, "providers" -> Provider class |
|
|
89
|
+
+-----------------------------------------------------------------------------------+
|
|
90
|
+
|
|
|
91
|
+
v
|
|
92
|
+
+-----------------------------------------------------------------------------------+
|
|
93
|
+
| HYPERAGENT: Schema Binding |
|
|
94
|
+
| Your ontology has: Provider, Claim, denialRate, hasPattern properties |
|
|
95
|
+
+-----------------------------------------------------------------------------------+
|
|
96
|
+
|
|
|
97
|
+
v
|
|
98
|
+
+-----------------------------------------------------------------------------------+
|
|
99
|
+
| HYPERAGENT: Query Generation (schema-driven) |
|
|
100
|
+
| SELECT ?p ?rate WHERE { ?p a :Provider ; :denialRate ?rate . FILTER(?rate > 0.2)}|
|
|
101
|
+
+-----------------------------------------------------------------------------------+
|
|
102
|
+
|
|
|
103
|
+
v
|
|
104
|
+
+-----------------------------------------------------------------------------------+
|
|
105
|
+
| rust-kgdb CORE: Execute Query (449ns per lookup) |
|
|
106
|
+
| Returns: [{p: "PROV001", rate: "0.34"}] |
|
|
107
|
+
+-----------------------------------------------------------------------------------+
|
|
108
|
+
|
|
|
109
|
+
v
|
|
110
|
+
+-----------------------------------------------------------------------------------+
|
|
111
|
+
| HYPERAGENT: Format Response + Audit Trail |
|
|
112
|
+
| "Provider PROV001 has 34% denial rate" + SHA-256 proof of data source |
|
|
113
|
+
+-----------------------------------------------------------------------------------+
|
|
114
|
+
```
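
The last stage in the diagram attaches a "SHA-256 proof of data source" to the answer. The README does not show the proof format at this point, so the following is only a minimal sketch of how a hash-linked execution record could work, using Node's built-in `crypto` module. The helper names (`proofStep`, `buildChain`, `verifyChain`) and the record fields are hypothetical and are not part of the rust-kgdb API.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: hash one execution step together with the hash of
// the previous step, so tampering with a step breaks every later hash.
function proofStep(prevHash, step) {
  const payload = JSON.stringify({ prevHash, step });
  return createHash('sha256').update(payload).digest('hex');
}

// Build a hash chain over the recorded stages.
function buildChain(steps) {
  const chain = [];
  let prev = '';
  for (const step of steps) {
    const hash = proofStep(prev, step);
    chain.push({ step, hash });
    prev = hash;
  }
  return chain;
}

// Recompute every hash and compare it with the stored one.
function verifyChain(chain) {
  let prev = '';
  for (const { step, hash } of chain) {
    if (proofStep(prev, step) !== hash) return false;
    prev = hash;
  }
  return true;
}

const chain = buildChain([
  { stage: 'query', sparql: 'SELECT ?p ?rate WHERE { ?p a :Provider ; :denialRate ?rate . FILTER(?rate > 0.2) }' },
  { stage: 'result', bindings: [{ p: 'PROV001', rate: '0.34' }] },
  { stage: 'answer', text: 'Provider PROV001 has 34% denial rate' },
]);

console.log(verifyChain(chain)); // true

// Altering a recorded result invalidates the chain.
chain[1].step.bindings[0].rate = '0.10';
console.log(verifyChain(chain)); // false
```

Because each hash covers the previous one, an auditor only needs the final hash to detect a change anywhere earlier in the trace.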
|
|
109
115
|
|
|
110
|
-
##
|
|
116
|
+
## Why rust-kgdb?
|
|
117
|
+
|
|
118
|
+
### Performance Comparison
|
|
119
|
+
|
|
120
|
+
| Metric | rust-kgdb | RDFox | Apache Jena |
|
|
121
|
+
|--------|-----------|-------|-------------|
|
|
122
|
+
| Lookup Speed | 449 ns | 5,000+ ns | 10,000+ ns |
|
|
123
|
+
| Memory per Triple | 24 bytes | 32 bytes | 50-60 bytes |
|
|
124
|
+
| Bulk Insert | 146K/sec | 200K/sec | 50K/sec |
|
|
125
|
+
|
|
126
|
+
**Benchmark Sources:**
|
|
127
|
+
- rust-kgdb: Criterion benchmarks on LUBM(1) dataset (3,272 triples), Apple Silicon M1
|
|
128
|
+
- RDFox: [Oxford Semantic Technologies](https://www.oxfordsemantic.tech/product) published benchmarks
|
|
129
|
+
- Apache Jena: [Jena TDB Performance](https://jena.apache.org/documentation/tdb/performance.html)
|
|
130
|
+
|
|
131
|
+
**How We Measured:**
|
|
132
|
+
```bash
|
|
133
|
+
# rust-kgdb benchmarks (Criterion statistical analysis)
|
|
134
|
+
cargo bench --package storage --bench triple_store_benchmark
|
|
135
|
+
|
|
136
|
+
# LUBM data generation
|
|
137
|
+
./tools/lubm_generator 1 /tmp/lubm_1.nt # 3,272 triples
|
|
138
|
+
./tools/lubm_generator 10 /tmp/lubm_10.nt # ~32K triples
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
### Why Are Lookups 11x Faster Than RDFox?
|
|
142
|
+
|
|
143
|
+
1. **Zero-Copy Semantics**: All data structures use borrowed references. No cloning in hot paths.
|
|
144
|
+
2. **String Interning**: Dictionary interns all URIs once. References are 8-byte IDs, not heap strings.
|
|
145
|
+
3. **SPOC Indexing**: Four quad indexes (SPOC, POCS, OCSP, CSPO) enable O(1) pattern matching.
|
|
146
|
+
4. **Rust Performance**: No garbage collection pauses. Predictable latency.
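
Points 2 and 3 can be illustrated with a small sketch. It is written in JavaScript for readability and is not the engine's Rust implementation: the `Dictionary` and `TinyStore` classes are hypothetical, they handle triples rather than quads, and they keep only two index orders instead of four.

```javascript
// Hypothetical dictionary: each distinct term string is stored once and
// referred to by a small integer ID afterwards.
class Dictionary {
  constructor() {
    this.ids = new Map(); // term string -> ID
    this.terms = [];      // ID -> term string
  }
  intern(term) {
    let id = this.ids.get(term);
    if (id === undefined) {
      id = this.terms.length;
      this.terms.push(term);
      this.ids.set(term, id);
    }
    return id;
  }
  resolve(id) {
    return this.terms[id];
  }
}

// Append a value to the list stored under a key, creating the list if needed.
function push(map, key, value) {
  const list = map.get(key);
  if (list) list.push(value);
  else map.set(key, [value]);
}

// Hypothetical store with two index orders: (subject, predicate) -> objects
// and (predicate, object) -> subjects.
class TinyStore {
  constructor() {
    this.dict = new Dictionary();
    this.bySP = new Map();
    this.byPO = new Map();
  }
  add(s, p, o) {
    const [si, pi, oi] = [s, p, o].map((t) => this.dict.intern(t));
    push(this.bySP, `${si}|${pi}`, oi);
    push(this.byPO, `${pi}|${oi}`, si);
  }
  // Pattern (s, p, ?o): one hash lookup, then ID -> string resolution.
  objects(s, p) {
    const si = this.dict.ids.get(s);
    const pi = this.dict.ids.get(p);
    if (si === undefined || pi === undefined) return [];
    return (this.bySP.get(`${si}|${pi}`) || []).map((id) => this.dict.resolve(id));
  }
  // Pattern (?s, p, o): answered from the other index order.
  subjects(p, o) {
    const pi = this.dict.ids.get(p);
    const oi = this.dict.ids.get(o);
    if (pi === undefined || oi === undefined) return [];
    return (this.byPO.get(`${pi}|${oi}`) || []).map((id) => this.dict.resolve(id));
  }
}

const store = new TinyStore();
store.add('ex:alice', 'foaf:knows', 'ex:bob');
store.add('ex:alice', 'foaf:name', '"Alice"');
store.add('ex:carol', 'foaf:knows', 'ex:bob');

console.log(store.objects('ex:alice', 'foaf:knows')); // [ 'ex:bob' ]
console.log(store.subjects('foaf:knows', 'ex:bob'));  // [ 'ex:alice', 'ex:carol' ]
console.log(store.dict.terms.length);                 // 6
```

Keeping one index per access pattern means a query with a bound prefix never scans unrelated triples, at the cost of storing each triple once per index.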
|
|
147
|
+
|
|
148
|
+
## Why HyperAgent?
|
|
149
|
+
|
|
150
|
+
### Framework Comparison (LUBM Benchmark)
|
|
151
|
+
|
|
152
|
+
| Framework | Without Schema | With Schema | Notes |
|
|
153
|
+
|-----------|----------------|-------------|-------|
|
|
154
|
+
| Vanilla LLM | 0% | N/A | Hallucinates class names |
|
|
155
|
+
| LangChain | 0% | 71.4% | Needs manual schema injection |
|
|
156
|
+
| DSPy | 14.3% | 71.4% | Better prompting, still needs schema |
|
|
157
|
+
| HyperAgent | N/A | 86.4% | Schema auto-discovered from KG |
|
|
158
|
+
|
|
159
|
+
**Benchmark Dataset:** LUBM(1) - 3,272 triples, 30 OWL classes, 23 properties
|
|
160
|
+
**Test Queries:** 7 standard LUBM queries (Q1-Q7)
|
|
161
|
+
|
|
162
|
+
**How We Measured:**
|
|
163
|
+
```bash
|
|
164
|
+
# Framework comparison benchmark
|
|
165
|
+
OPENAI_API_KEY=... python3 benchmark-frameworks.py
|
|
166
|
+
|
|
167
|
+
# HyperMind vs Vanilla LLM
|
|
168
|
+
ANTHROPIC_API_KEY=... node vanilla-vs-hypermind-benchmark.js
|
|
169
|
+
```
|
|
170
|
+
|
|
171
|
+
### Why 86.4% vs 0%?
|
|
172
|
+
|
|
173
|
+
Vanilla LLMs fail because they guess class names:
|
|
174
|
+
- LLM guesses: `Professor`, `Course`, `teaches`
|
|
175
|
+
- Actual ontology: `ub:FullProfessor`, `ub:GraduateCourse`, `ub:teacherOf`
|
|
176
|
+
|
|
177
|
+
HyperAgent reads YOUR schema first, then generates queries using YOUR class names.
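
The idea can be sketched in a few lines. This is an illustration only: `extractSchema` and `resolveTerm` are hypothetical helpers, the namespace is a placeholder, and the case-insensitive substring match is an assumed heuristic rather than the matching logic HyperAgent actually uses.

```javascript
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';

// Collect the class and property IRIs that actually occur in the data.
function extractSchema(triples) {
  const classes = new Set();
  const properties = new Set();
  for (const [, p, o] of triples) {
    if (p === RDF_TYPE) classes.add(o);
    else properties.add(p);
  }
  return { classes: [...classes], properties: [...properties] };
}

// Local name of an IRI: the part after the last '#' or '/'.
function localName(iri) {
  return iri.slice(Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/')) + 1);
}

// Map a guessed name onto terms that exist in the schema.
// Assumption: a simple case-insensitive substring match on local names.
function resolveTerm(guess, candidates) {
  const needle = guess.toLowerCase();
  return candidates.filter((iri) => localName(iri).toLowerCase().includes(needle));
}

// Placeholder namespace standing in for the ontology's `ub:` prefix.
const UB = 'http://example.org/ub#';
const triples = [
  [UB + 'prof1', RDF_TYPE, UB + 'FullProfessor'],
  [UB + 'course1', RDF_TYPE, UB + 'GraduateCourse'],
  [UB + 'prof1', UB + 'teacherOf', UB + 'course1'],
];

const schema = extractSchema(triples);
console.log(resolveTerm('Professor', schema.classes));  // [ 'http://example.org/ub#FullProfessor' ]
console.log(resolveTerm('Course', schema.classes));     // [ 'http://example.org/ub#GraduateCourse' ]
console.log(resolveTerm('teaches', schema.properties)); // []
```

The empty result for `teaches` is the useful part: a guessed name with no counterpart in the schema is caught before a query is sent, and the generator has to choose from `schema.properties` instead.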
|
|
178
|
+
|
|
179
|
+
## Installation
|
|
111
180
|
|
|
112
181
|
```bash
|
|
113
182
|
npm install rust-kgdb
|
|
114
183
|
```
|
|
115
184
|
|
|
185
|
+
**Platforms:** macOS (Intel/Apple Silicon), Linux (x64/ARM64), Windows (x64)
|
|
186
|
+
**Requirements:** Node.js 14+
|
|
187
|
+
|
|
188
|
+
## Quick Start
|
|
189
|
+
|
|
116
190
|
### Basic Database Usage
|
|
117
191
|
|
|
118
192
|
```javascript
|
|
119
|
-
const { GraphDB } = require('rust-kgdb');
|
|
193
|
+
const { GraphDB, getVersion } = require('rust-kgdb');
|
|
120
194
|
|
|
121
|
-
|
|
122
|
-
const db = new GraphDB('http://lawfirm.com/');
|
|
195
|
+
console.log('rust-kgdb version:', getVersion());
|
|
123
196
|
|
|
124
|
-
//
|
|
197
|
+
// Create embedded database (no server needed)
|
|
198
|
+
const db = new GraphDB('http://example.org/');
|
|
199
|
+
|
|
200
|
+
// Load RDF data (N-Triples syntax, which is also valid Turtle)
|
|
125
201
|
db.loadTtl(`
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
202
|
+
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
|
|
203
|
+
<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
|
|
204
|
+
<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
|
|
205
|
+
`, null);
|
|
130
206
|
|
|
131
|
-
// Query with SPARQL (449ns
|
|
207
|
+
// Query with SPARQL (449ns per lookup)
|
|
132
208
|
const results = db.querySelect(`
|
|
133
|
-
SELECT ?
|
|
134
|
-
|
|
135
|
-
?case :court ?court
|
|
209
|
+
SELECT ?name WHERE {
|
|
210
|
+
?person <http://xmlns.com/foaf/0.1/name> ?name
|
|
136
211
|
}
|
|
137
212
|
`);
|
|
138
|
-
|
|
213
|
+
console.log(results);
|
|
214
|
+
// [{bindings: {name: '"Alice"'}}, {bindings: {name: '"Bob"'}}]
|
|
215
|
+
|
|
216
|
+
// Count triples
|
|
217
|
+
console.log('Triple count:', db.countTriples()); // 3
|
|
139
218
|
```
|
|
140
219
|
|
|
141
|
-
### With
|
|
220
|
+
### With HyperAgent (Grounded AI)
|
|
142
221
|
|
|
143
222
|
```javascript
|
|
144
223
|
const { GraphDB, HyperMindAgent } = require('rust-kgdb');
|
|
145
224
|
|
|
146
225
|
const db = new GraphDB('http://insurance.org/');
|
|
147
226
|
db.loadTtl(`
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
227
|
+
<http://insurance.org/PROV001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://insurance.org/Provider> .
|
|
228
|
+
<http://insurance.org/PROV001> <http://insurance.org/name> "ABC Medical" .
|
|
229
|
+
<http://insurance.org/PROV001> <http://insurance.org/denialRate> "0.34" .
|
|
230
|
+
<http://insurance.org/PROV001> <http://insurance.org/flaggedBy> <http://insurance.org/SIU_2024_Q1> .
|
|
231
|
+
`, null);
|
|
232
|
+
|
|
233
|
+
// Create agent with knowledge graph binding
|
|
234
|
+
const agent = new HyperMindAgent({
|
|
235
|
+
kg: db, // REQUIRED: GraphDB instance
|
|
236
|
+
name: 'fraud-detector', // Optional: Agent name
|
|
237
|
+
apiKey: process.env.OPENAI_API_KEY // Optional: LLM API key for summarization
|
|
238
|
+
});
|
|
239
|
+
|
|
240
|
+
// Natural language query -> Grounded results (call from an async function; CommonJS modules have no top-level await)
|
|
241
|
+
const result = await agent.call("Which providers show suspicious billing patterns?");
|
|
154
242
|
|
|
155
243
|
console.log(result.answer);
|
|
156
|
-
// "
|
|
244
|
+
// "Provider PROV001 (ABC Medical): 34% denial rate, flagged by SIU Q1 2024"
|
|
245
|
+
|
|
246
|
+
console.log(result.explanation);
|
|
247
|
+
// Full execution trace showing SPARQL queries generated
|
|
157
248
|
|
|
158
|
-
console.log(result.
|
|
159
|
-
//
|
|
249
|
+
console.log(result.proof);
|
|
250
|
+
// Cryptographic proof DAG with SHA-256 hashes
|
|
160
251
|
```
|
|
161
252
|
|
|
162
253
|
## Core Components
|
|
163
254
|
|
|
164
|
-
### GraphDB: SPARQL Engine
|
|
255
|
+
### GraphDB: SPARQL 1.1 Engine
|
|
165
256
|
|
|
166
257
|
```javascript
|
|
167
258
|
const { GraphDB } = require('rust-kgdb');
|
|
168
|
-
|
|
169
259
|
const db = new GraphDB('http://example.org/');
|
|
170
260
|
|
|
171
|
-
// Load
|
|
172
|
-
db.loadTtl(
|
|
261
|
+
// Load data
|
|
262
|
+
db.loadTtl(`
|
|
263
|
+
<http://example.org/alice> <http://example.org/knows> <http://example.org/bob> .
|
|
264
|
+
<http://example.org/alice> <http://example.org/age> "30" .
|
|
265
|
+
<http://example.org/bob> <http://example.org/knows> <http://example.org/charlie> .
|
|
266
|
+
<http://example.org/bob> <http://example.org/age> "25" .
|
|
267
|
+
<http://example.org/charlie> <http://example.org/age> "35" .
|
|
268
|
+
`, null);
|
|
269
|
+
|
|
270
|
+
// SELECT query
|
|
271
|
+
const friends = db.querySelect(`
|
|
272
|
+
SELECT ?person ?friend WHERE {
|
|
273
|
+
?person <http://example.org/knows> ?friend
|
|
274
|
+
}
|
|
275
|
+
`);
|
|
173
276
|
|
|
174
|
-
//
|
|
175
|
-
const
|
|
277
|
+
// FILTER with comparison (ages are stored as plain string literals, so this compares strings)
|
|
278
|
+
const adults = db.querySelect(`
|
|
279
|
+
SELECT ?person ?age WHERE {
|
|
280
|
+
?person <http://example.org/age> ?age .
|
|
281
|
+
FILTER(?age >= "30")
|
|
282
|
+
}
|
|
283
|
+
`);
|
|
284
|
+
|
|
285
|
+
// OPTIONAL pattern
|
|
286
|
+
const withAge = db.querySelect(`
|
|
287
|
+
SELECT ?person ?age WHERE {
|
|
288
|
+
?person <http://example.org/knows> ?someone .
|
|
289
|
+
OPTIONAL { ?person <http://example.org/age> ?age }
|
|
290
|
+
}
|
|
291
|
+
`);
|
|
176
292
|
|
|
177
|
-
//
|
|
178
|
-
const
|
|
293
|
+
// CONSTRUCT new triples
|
|
294
|
+
const inferred = db.queryConstruct(`
|
|
295
|
+
CONSTRUCT { ?a <http://example.org/friendOfFriend> ?c }
|
|
296
|
+
WHERE {
|
|
297
|
+
?a <http://example.org/knows> ?b .
|
|
298
|
+
?b <http://example.org/knows> ?c .
|
|
299
|
+
FILTER(?a != ?c)
|
|
300
|
+
}
|
|
301
|
+
`);
|
|
179
302
|
|
|
180
|
-
// Named
|
|
181
|
-
db.loadTtl('
|
|
303
|
+
// Named Graphs
|
|
304
|
+
db.loadTtl('<http://example.org/data1> <http://example.org/value> "100" .', 'http://example.org/graph1');
|
|
305
|
+
const fromGraph = db.querySelect(`
|
|
306
|
+
SELECT ?s ?v FROM <http://example.org/graph1> WHERE {
|
|
307
|
+
?s <http://example.org/value> ?v
|
|
308
|
+
}
|
|
309
|
+
`);
|
|
182
310
|
|
|
183
|
-
//
|
|
184
|
-
|
|
311
|
+
// Aggregation with Apache Arrow OLAP
|
|
312
|
+
const stats = db.querySelect(`
|
|
313
|
+
SELECT (COUNT(?person) as ?count) (AVG(?age) as ?avgAge) WHERE {
|
|
314
|
+
?person <http://example.org/age> ?age
|
|
315
|
+
}
|
|
316
|
+
`);
|
|
185
317
|
```
|
|
186
318
|
|
|
187
319
|
### GraphFrame: Graph Analytics
|
|
188
320
|
|
|
189
321
|
```javascript
|
|
190
|
-
const { GraphFrame, friendsGraph } = require('rust-kgdb');
|
|
322
|
+
const { GraphFrame, friendsGraph, chainGraph, starGraph, completeGraph, cycleGraph } = require('rust-kgdb');
|
|
191
323
|
|
|
192
324
|
// Create from vertices and edges
|
|
193
325
|
const gf = new GraphFrame(
|
|
@@ -199,137 +331,230 @@ const gf = new GraphFrame(
|
|
|
199
331
|
])
|
|
200
332
|
);
|
|
201
333
|
|
|
202
|
-
//
|
|
203
|
-
|
|
204
|
-
console.log('
|
|
205
|
-
|
|
206
|
-
|
|
334
|
+
// PageRank (damping=0.15, iterations=20)
|
|
335
|
+
const pagerank = gf.pageRank(0.15, 20);
|
|
336
|
+
console.log('PageRank:', JSON.parse(pagerank));
|
|
337
|
+
|
|
338
|
+
// Connected Components (Union-Find algorithm)
|
|
339
|
+
const components = gf.connectedComponents();
|
|
340
|
+
console.log('Components:', JSON.parse(components));
|
|
341
|
+
|
|
342
|
+
// Triangle Count
|
|
343
|
+
const triangles = gf.triangleCount();
|
|
344
|
+
console.log('Triangles:', triangles); // 1
|
|
345
|
+
|
|
346
|
+
// Shortest Paths (Dijkstra)
|
|
347
|
+
const paths = gf.shortestPaths(['alice']);
|
|
348
|
+
console.log('Shortest paths:', JSON.parse(paths));
|
|
207
349
|
|
|
208
|
-
//
|
|
209
|
-
const
|
|
350
|
+
// Label Propagation (Community Detection)
|
|
351
|
+
const communities = gf.labelPropagation(10);
|
|
352
|
+
console.log('Communities:', JSON.parse(communities));
|
|
353
|
+
|
|
354
|
+
// Degree Distribution
|
|
355
|
+
console.log('In-degrees:', JSON.parse(gf.inDegrees()));
|
|
356
|
+
console.log('Out-degrees:', JSON.parse(gf.outDegrees()));
|
|
357
|
+
|
|
358
|
+
// Factory functions for common graphs
|
|
359
|
+
const chain = chainGraph(10); // Linear path
|
|
360
|
+
const star = starGraph(5); // Hub with spokes
|
|
361
|
+
const complete = completeGraph(4); // Fully connected
|
|
362
|
+
const cycle = cycleGraph(6); // Ring
|
|
210
363
|
```
|
|
211
364
|
|
|
212
|
-
###
|
|
365
|
+
### Motif Finding: Pattern Matching DSL
|
|
213
366
|
|
|
214
367
|
```javascript
|
|
215
|
-
const {
|
|
368
|
+
const { GraphFrame } = require('rust-kgdb');
|
|
216
369
|
|
|
217
|
-
const
|
|
370
|
+
const gf = new GraphFrame(
|
|
371
|
+
JSON.stringify([{id:'a'}, {id:'b'}, {id:'c'}, {id:'d'}]),
|
|
372
|
+
JSON.stringify([
|
|
373
|
+
{src:'a', dst:'b'},
|
|
374
|
+
{src:'b', dst:'c'},
|
|
375
|
+
{src:'c', dst:'a'},
|
|
376
|
+
{src:'d', dst:'a'}
|
|
377
|
+
])
|
|
378
|
+
);
|
|
218
379
|
|
|
219
|
-
//
|
|
220
|
-
|
|
221
|
-
|
|
380
|
+
// Find simple edges: (a)-[e]->(b)
|
|
381
|
+
const edges = gf.find('(a)-[e]->(b)');
|
|
382
|
+
console.log('Edges:', JSON.parse(edges).length); // 4
|
|
222
383
|
|
|
223
|
-
//
|
|
224
|
-
|
|
384
|
+
// Find chains: (a)-[e1]->(b); (b)-[e2]->(c)
|
|
385
|
+
const chains = gf.find('(a)-[e1]->(b); (b)-[e2]->(c)');
|
|
386
|
+
|
|
387
|
+
// Find triangles: (a)-[e1]->(b); (b)-[e2]->(c); (c)-[e3]->(a)
|
|
388
|
+
const triangles = gf.find('(a)-[e1]->(b); (b)-[e2]->(c); (c)-[e3]->(a)');
|
|
389
|
+
|
|
390
|
+
// Find stars: hub with multiple connections
|
|
391
|
+
const stars = gf.find('(hub)-[e1]->(spoke1); (hub)-[e2]->(spoke2)');
|
|
225
392
|
|
|
226
|
-
//
|
|
227
|
-
const
|
|
393
|
+
// Fraud pattern: circular payments
|
|
394
|
+
const circular = gf.find('(a)-[pay1]->(b); (b)-[pay2]->(c); (c)-[pay3]->(a)');
|
|
228
395
|
```
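
To make the motif semantics concrete, here is a minimal sketch of how such a pattern could be evaluated with a backtracking join over the edge list. `parseMotif` and `findMotif` are hypothetical helpers, not the library's implementation, and the shape of the returned bindings is an assumption.

```javascript
// Parse "(a)-[e1]->(b); (b)-[e2]->(c)" into [{src:'a', edge:'e1', dst:'b'}, ...].
function parseMotif(pattern) {
  return pattern.split(';').map((part) => {
    const m = part.trim().match(/^\((\w+)\)-\[(\w+)\]->\((\w+)\)$/);
    if (!m) throw new Error(`Unsupported motif term: ${part}`);
    return { src: m[1], edge: m[2], dst: m[3] };
  });
}

// Backtracking join: bind one motif term at a time against the edge list,
// keeping only edges consistent with the variables bound so far.
function findMotif(edges, pattern) {
  const terms = parseMotif(pattern);
  const results = [];
  function extend(i, binding) {
    if (i === terms.length) {
      results.push(binding);
      return;
    }
    const { src, edge, dst } = terms[i];
    for (const e of edges) {
      if (binding[src] !== undefined && binding[src] !== e.src) continue;
      if (binding[dst] !== undefined && binding[dst] !== e.dst) continue;
      extend(i + 1, { ...binding, [src]: e.src, [dst]: e.dst, [edge]: e });
    }
  }
  extend(0, {});
  return results;
}

// Same four edges as the GraphFrame above.
const edgeList = [
  { src: 'a', dst: 'b' },
  { src: 'b', dst: 'c' },
  { src: 'c', dst: 'a' },
  { src: 'd', dst: 'a' },
];

const single = findMotif(edgeList, '(a)-[e]->(b)');
const tri = findMotif(edgeList, '(a)-[e1]->(b); (b)-[e2]->(c); (c)-[e3]->(a)');
console.log(single.length); // 4
console.log(tri.length);    // 3

// Keep only the rotation whose first vertex is the smallest.
const unique = tri.filter((m) => m.a < m.b && m.a < m.c);
console.log(unique.length); // 1
```

In this sketch the single directed triangle a -> b -> c -> a is reported three times, once per rotation of the variable bindings, so a canonical-rotation filter is needed if each cycle should be counted once.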
|
|
229
396
|
|
|
230
397
|
### DatalogProgram: Rule-Based Reasoning
|
|
231
398
|
|
|
232
399
|
```javascript
|
|
233
|
-
const { DatalogProgram, evaluateDatalog } = require('rust-kgdb');
|
|
400
|
+
const { DatalogProgram, evaluateDatalog, queryDatalog } = require('rust-kgdb');
|
|
234
401
|
|
|
235
402
|
const datalog = new DatalogProgram();
|
|
236
403
|
|
|
237
|
-
// Add facts
|
|
238
|
-
datalog.addFact(JSON.stringify({predicate:'
|
|
239
|
-
datalog.addFact(JSON.stringify({predicate:'
|
|
404
|
+
// Add base facts
|
|
405
|
+
datalog.addFact(JSON.stringify({predicate:'parent', terms:['alice','bob']}));
|
|
406
|
+
datalog.addFact(JSON.stringify({predicate:'parent', terms:['bob','charlie']}));
|
|
407
|
+
datalog.addFact(JSON.stringify({predicate:'parent', terms:['charlie','dave']}));
|
|
240
408
|
|
|
241
|
-
//
|
|
409
|
+
// Transitive closure rule: ancestor(X,Y) :- parent(X,Y)
|
|
242
410
|
datalog.addRule(JSON.stringify({
|
|
243
|
-
head: {predicate:'
|
|
411
|
+
head: {predicate:'ancestor', terms:['?X','?Y']},
|
|
244
412
|
body: [
|
|
245
|
-
{predicate:'
|
|
246
|
-
|
|
413
|
+
{predicate:'parent', terms:['?X','?Y']}
|
|
414
|
+
]
|
|
415
|
+
}));
|
|
416
|
+
|
|
417
|
+
// Recursive rule: ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z)
|
|
418
|
+
datalog.addRule(JSON.stringify({
|
|
419
|
+
head: {predicate:'ancestor', terms:['?X','?Z']},
|
|
420
|
+
body: [
|
|
421
|
+
{predicate:'parent', terms:['?X','?Y']},
|
|
422
|
+
{predicate:'ancestor', terms:['?Y','?Z']}
|
|
247
423
|
]
|
|
248
424
|
}));
|
|
249
425
|
|
|
250
|
-
//
|
|
426
|
+
// Semi-naive evaluation (fixpoint)
|
|
251
427
|
const inferred = evaluateDatalog(datalog);
|
|
428
|
+
console.log('Inferred facts:', JSON.parse(inferred));
|
|
429
|
+
// ancestor(alice,bob), ancestor(alice,charlie), ancestor(alice,dave)
|
|
430
|
+
// ancestor(bob,charlie), ancestor(bob,dave)
|
|
431
|
+
// ancestor(charlie,dave)
|
|
432
|
+
|
|
433
|
+
// Query specific predicate
|
|
434
|
+
const ancestors = queryDatalog(datalog, 'ancestor');
|
|
435
|
+
console.log('Ancestors:', JSON.parse(ancestors));
|
|
436
|
+
```
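
The comment above mentions semi-naive evaluation. As a minimal sketch of what that means for the two `ancestor` rules, the following plain JavaScript hand-codes the fixpoint loop for this one program. It is not the library's evaluator, and `ancestorsSemiNaive` is a hypothetical name.

```javascript
// Base facts from the example above.
const parent = [['alice', 'bob'], ['bob', 'charlie'], ['charlie', 'dave']];

function ancestorsSemiNaive(parentFacts) {
  const key = (x, y) => `${x},${y}`;
  const all = new Set();

  // Rule 1: ancestor(X,Y) :- parent(X,Y).
  // Seeds both the result set and the first delta.
  let delta = [];
  for (const [x, y] of parentFacts) {
    if (!all.has(key(x, y))) {
      all.add(key(x, y));
      delta.push([x, y]);
    }
  }

  // Rule 2: ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z).
  // Semi-naive step: join parent only with the facts derived in the
  // previous round, instead of with every ancestor fact known so far.
  let rounds = 0;
  while (delta.length > 0) {
    rounds += 1;
    const next = [];
    for (const [x, y] of parentFacts) {
      for (const [y2, z] of delta) {
        if (y === y2 && !all.has(key(x, z))) {
          all.add(key(x, z));
          next.push([x, z]);
        }
      }
    }
    delta = next; // fixpoint reached when a round derives nothing new
  }
  return { facts: [...all].sort(), rounds };
}

const { facts, rounds } = ancestorsSemiNaive(parent);
console.log(facts.length); // 6
console.log(rounds);       // 3
```

The six facts match the list in the comments above. Restricting each round to the newest facts avoids re-deriving the same tuples on every pass.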
|
|
437
|
+
|
|
438
|
+
### Datalog vs SPARQL vs Motif: When to Use What
|
|
439
|
+
|
|
440
|
+
| Use Case | Best Tool | Why |
|
|
441
|
+
|----------|-----------|-----|
|
|
442
|
+
| Simple lookups | SPARQL SELECT | Direct pattern matching, 449ns |
|
|
443
|
+
| Transitive closure | Datalog | Recursive rules, fixpoint evaluation |
|
|
444
|
+
| Graph patterns | Motif | Visual DSL, multiple edges |
|
|
445
|
+
| Aggregations | SPARQL + Arrow | OLAP optimized |
|
|
446
|
+
| Fraud rings | Motif | Circular pattern detection |
|
|
447
|
+
| Inference | Datalog | Rule chaining |
|
|
448
|
+
|
|
449
|
+
**Example: Same Query, Different Tools**
|
|
450
|
+
|
|
451
|
+
```javascript
|
|
452
|
+
// Find all ancestors - Datalog (recursive, elegant)
|
|
453
|
+
datalog.addRule(JSON.stringify({
|
|
454
|
+
head: {predicate:'ancestor', terms:['?X','?Z']},
|
|
455
|
+
body: [
|
|
456
|
+
{predicate:'parent', terms:['?X','?Y']},
|
|
457
|
+
{predicate:'ancestor', terms:['?Y','?Z']}
|
|
458
|
+
]
|
|
459
|
+
}));
|
|
460
|
+
|
|
461
|
+
// Find all ancestors - SPARQL (property paths)
|
|
462
|
+
db.querySelect(`
|
|
463
|
+
SELECT ?ancestor ?descendant WHERE {
|
|
464
|
+
?ancestor <http://example.org/parent>+ ?descendant
|
|
465
|
+
}
|
|
466
|
+
`);
|
|
467
|
+
|
|
468
|
+
// Find triangles - Motif (visual, intuitive)
|
|
469
|
+
gf.find('(a)-[e1]->(b); (b)-[e2]->(c); (c)-[e3]->(a)');
|
|
470
|
+
|
|
471
|
+
// Find triangles - SPARQL (verbose)
|
|
472
|
+
db.querySelect(`
|
|
473
|
+
SELECT ?a ?b ?c WHERE {
|
|
474
|
+
?a <http://example.org/knows> ?b .
|
|
475
|
+
?b <http://example.org/knows> ?c .
|
|
476
|
+
?c <http://example.org/knows> ?a .
|
|
477
|
+
FILTER(STR(?a) < STR(?b) && STR(?a) < STR(?c))
|
|
478
|
+
}
|
|
479
|
+
`);
|
|
480
|
+
```
|
|
481
|
+
|
|
482
|
+
### EmbeddingService: Vector Similarity (HNSW)
|
|
483
|
+
|
|
484
|
+
```javascript
|
|
485
|
+
const { EmbeddingService } = require('rust-kgdb');
|
|
486
|
+
|
|
487
|
+
const embeddings = new EmbeddingService();
|
|
488
|
+
|
|
489
|
+
// Store 384-dimensional vectors
|
|
490
|
+
const vector1 = new Array(384).fill(0).map((_, i) => Math.sin(i / 10));
|
|
491
|
+
const vector2 = new Array(384).fill(0).map((_, i) => Math.cos(i / 10));
|
|
492
|
+
embeddings.storeVector('entity1', vector1);
|
|
493
|
+
embeddings.storeVector('entity2', vector2);
|
|
494
|
+
|
|
495
|
+
// Retrieve vector
|
|
496
|
+
const retrieved = embeddings.getVector('entity1');
|
|
497
|
+
console.log('Vector length:', retrieved.length); // 384
|
|
498
|
+
|
|
499
|
+
// Build HNSW index for fast similarity search
|
|
500
|
+
embeddings.rebuildIndex();
|
|
501
|
+
|
|
502
|
+
// Find similar entities (16ms for 10K vectors)
|
|
503
|
+
const similar = embeddings.findSimilar('entity1', 10, 0.7);
|
|
504
|
+
console.log('Similar:', JSON.parse(similar));
|
|
505
|
+
|
|
506
|
+
// Graceful handling of missing entities
|
|
507
|
+
const graceful = embeddings.findSimilarGraceful('nonexistent', 5, 0.5);
|
|
508
|
+
console.log('Graceful:', JSON.parse(graceful)); // []
|
|
509
|
+
|
|
510
|
+
// Delete vector
|
|
511
|
+
embeddings.deleteVector('entity2');
|
|
512
|
+
|
|
513
|
+
// Metrics
|
|
514
|
+
console.log('Metrics:', JSON.parse(embeddings.getMetrics()));
|
|
515
|
+
console.log('Cache stats:', JSON.parse(embeddings.getCacheStats()));
|
|
516
|
+
```
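
For intuition about what an HNSW index approximates, here is a brute-force sketch of thresholded top-k search. It assumes cosine similarity as the metric and treats the last two arguments of `findSimilar` as `k` and a minimum score. Both are assumptions, since the README does not state them. `cosine` and `bruteForceSimilar` are hypothetical helpers.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Exact search: score every stored vector, keep those at or above the
// threshold, and return the k best. An approximate index trades a little
// recall for avoiding this full scan.
function bruteForceSimilar(store, queryId, k, threshold) {
  const query = store.get(queryId);
  if (!query) return []; // unknown entity: empty result instead of an error
  return [...store]
    .filter(([id]) => id !== queryId)
    .map(([id, vec]) => ({ id, score: cosine(query, vec) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const vectors = new Map();
vectors.set('entity1', Array.from({ length: 384 }, (_, i) => Math.sin(i / 10)));
vectors.set('entity2', Array.from({ length: 384 }, (_, i) => Math.cos(i / 10)));
vectors.set('entity3', Array.from({ length: 384 }, (_, i) => 2 * Math.sin(i / 10)));

console.log(bruteForceSimilar(vectors, 'entity1', 10, 0.7).map((hit) => hit.id));
// [ 'entity3' ]
```

`entity3` is a scaled copy of `entity1`, so its cosine similarity is about 1, while the sine and cosine vectors are nearly orthogonal and fall below the 0.7 threshold.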
|
|
517
|
+
|
|
518
|
+
### Embedding Triggers: Auto-Generate on Insert
|
|
519
|
+
|
|
520
|
+
```javascript
|
|
521
|
+
const { GraphDB, EmbeddingService } = require('rust-kgdb');
|
|
522
|
+
|
|
523
|
+
const db = new GraphDB('http://example.org/');
|
|
524
|
+
const embeddings = new EmbeddingService();
|
|
525
|
+
|
|
526
|
+
// Trigger callback: generate embedding when entity inserted
|
|
527
|
+
embeddings.onTripleInsert('subject', 'predicate', 'object', null);
|
|
528
|
+
|
|
529
|
+
// In production, configure provider:
|
|
530
|
+
// - OpenAI: text-embedding-3-small (1536 dims by default; request dimensions=384 to match the 384-dim vectors above)
|
|
531
|
+
// - Ollama: nomic-embed-text (local)
|
|
532
|
+
// - Anthropic: (coming soon)
|
|
533
|
+
```
|
|
534
|
+
|
|
535
|
+
### Pregel: Bulk Synchronous Parallel
|
|
536
|
+
|
|
537
|
+
```javascript
|
|
538
|
+
const { chainGraph, pregelShortestPaths } = require('rust-kgdb');
|
|
539
|
+
|
|
540
|
+
const graph = chainGraph(10);
|
|
541
|
+
|
|
542
|
+
// Run Pregel shortest paths from source vertex
|
|
543
|
+
const result = pregelShortestPaths(graph, 'v0', 20);
|
|
544
|
+
const parsed = JSON.parse(result);
|
|
545
|
+
console.log('Supersteps:', parsed.supersteps);
|
|
546
|
+
console.log('Distances:', parsed.values);
|
|
547
|
+
```
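
The bulk synchronous parallel model behind this call can be sketched in plain JavaScript. The sketch assumes that `chainGraph(10)` yields vertices `v0` to `v9` joined by directed unit-weight edges. The README only shows `v0` as a vertex id, so treat that layout as an assumption. `makeChain` and `bspShortestPaths` are hypothetical helpers, and the library may count supersteps differently.

```javascript
// Assumed chain: v0 -> v1 -> ... -> v(n-1), every edge with weight 1.
function makeChain(n) {
  const edges = new Map();
  for (let i = 0; i < n; i++) {
    edges.set(`v${i}`, i + 1 < n ? [`v${i + 1}`] : []);
  }
  return edges;
}

// Single-source shortest paths in the vertex-centric style: in each
// superstep, every vertex that received messages takes the minimum, and
// if that improves its distance it notifies its out-neighbours.
function bspShortestPaths(edges, source, maxSupersteps) {
  const dist = new Map([...edges.keys()].map((v) => [v, Infinity]));
  let inbox = new Map([[source, [0]]]);
  let supersteps = 0;
  while (inbox.size > 0 && supersteps < maxSupersteps) {
    supersteps += 1;
    const outbox = new Map();
    for (const [v, messages] of inbox) {
      const best = Math.min(...messages);
      if (best < dist.get(v)) {
        dist.set(v, best);
        for (const w of edges.get(v)) {
          if (!outbox.has(w)) outbox.set(w, []);
          outbox.get(w).push(best + 1);
        }
      }
    }
    inbox = outbox; // barrier: messages become visible in the next superstep
  }
  return { supersteps, values: Object.fromEntries(dist) };
}

const bsp = bspShortestPaths(makeChain(10), 'v0', 20);
console.log(bsp.supersteps); // 10
console.log(bsp.values.v9);  // 9
```

On a chain the distance information advances one vertex per superstep, which is why ten vertices need ten supersteps here and why the iteration cap of 20 is never reached.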
|
|
323
548
|
|
|
324
549
|
## Agent Memory: Deep Flashback
|
|
325
550
|
|
|
326
|
-
Most AI agents forget everything between sessions.
|
|
551
|
+
Most AI agents forget everything between sessions. HyperAgent stores memory in the same knowledge graph as your data.
|
|
327
552
|
|
|
328
553
|
```
|
|
329
554
|
+-----------------------------------------------------------------------------+
|
|
330
555
|
| MEMORY HYPERGRAPH |
|
|
331
556
|
| |
|
|
332
|
-
| AGENT MEMORY LAYER
|
|
557
|
+
| AGENT MEMORY LAYER (Episodes) |
|
|
333
558
|
| +-----------+ +-----------+ +-----------+ |
|
|
334
559
|
| |Episode:001| |Episode:002| |Episode:003| |
|
|
335
560
|
| |"Fraud ring| |"Denied | |"Follow-up | |
|
|
@@ -337,9 +562,9 @@ Most AI agents forget everything between sessions. HyperMind stores memory in th
|
|
|
337
562
|
| +-----+-----+ +-----+-----+ +-----+-----+ |
|
|
338
563
|
| | | | |
|
|
339
564
|
| +-----------------+-----------------+ |
|
|
340
|
-
| | HyperEdges
|
|
565
|
+
| | HyperEdges |
|
|
341
566
|
| v |
|
|
342
|
-
| KNOWLEDGE GRAPH LAYER
|
|
567
|
+
| KNOWLEDGE GRAPH LAYER (Facts) |
|
|
343
568
|
| +-----------------------------------------------------------------+ |
|
|
344
569
|
| | Provider:P001 -----> Claim:C123 <----- Claimant:John | |
|
|
345
570
|
| | | | | | |
|
|
@@ -347,181 +572,444 @@ Most AI agents forget everything between sessions. HyperMind stores memory in th
|
|
|
347
572
|
| | riskScore: 0.87 amount: 50000 address: "123 Main" | |
|
|
348
573
|
| +-----------------------------------------------------------------+ |
|
|
349
574
|
| |
|
|
350
|
-
| SAME QUAD STORE - Single SPARQL query traverses BOTH!
|
|
575
|
+
| SAME QUAD STORE - Single SPARQL query traverses BOTH layers! |
|
|
351
576
|
+-----------------------------------------------------------------------------+
|
|
352
577
|
```
|
|
353
578
|
|
|
354
|
-
|
|
355
|
-
- Embeddings enable semantic search over past queries
|
|
356
|
-
- Temporal decay prioritizes recent, relevant memories
|
|
357
|
-
- Single SPARQL query traverses both memory AND knowledge graph
|
|
579
|
+
### Memory Retrieval Depth Benchmark
|
|
358
580
|
|
|
359
|
-
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
|
|
581
|
+
| Depth | Recall | Search Speed | Write Speed |
|
|
582
|
+
|-------|--------|--------------|-------------|
|
|
583
|
+
| 1K queries | 97% | 2.1ms | 145K ops/sec |
|
|
584
|
+
| 5K queries | 95% | 8.4ms | 138K ops/sec |
|
|
585
|
+
| 10K queries | 94% | 16.7ms | 132K ops/sec |
|
|
586
|
+
| 50K queries | 91% | 84ms | 125K ops/sec |
|
|
363
587
|
|
|
364
|
-
|
|
588
|
+
**Benchmark:** `node memory-retrieval-benchmark.js` on darwin-x64
|
|
365
589
|
|
|
366
|
-
###
|
|
590
|
+
### Memory Features
|
|
367
591
|
|
|
368
592
|
```javascript
|
|
369
|
-
const
|
|
370
|
-
|
|
371
|
-
|
|
372
|
-
|
|
373
|
-
|
|
374
|
-
|
|
593
|
+
const { HyperMindAgent, GraphDB } = require('rust-kgdb');
|
|
594
|
+
|
|
595
|
+
const db = new GraphDB('http://example.org/');
|
|
596
|
+
const agent = new HyperMindAgent({ kg: db, name: 'memory-agent' });
|
|
597
|
+
|
|
598
|
+
// Conversation knowledge extraction (the `await` calls below must run inside an async function)
|
|
599
|
+
// Agent auto-extracts entities from chat into KG
|
|
600
|
+
const result1 = await agent.call("Provider P001 submitted 5 claims totaling $47,000");
|
|
601
|
+
// Stored: :Conversation_001 :mentions :Provider_P001 .
|
|
602
|
+
// Stored: :Provider_P001 :claimCount "5" ; :claimTotal "47000" .
|
|
603
|
+
|
|
604
|
+
// Later queries use extracted knowledge
|
|
605
|
+
const result2 = await agent.call("What do we know about Provider P001?");
|
|
606
|
+
// Returns facts from BOTH original data AND conversation
|
|
607
|
+
|
|
608
|
+
// Idempotent responses (semantic hashing)
|
|
609
|
+
const result3 = await agent.call("Which providers have high denial rates?");
|
|
610
|
+
// First call: 450ms (compute + cache)
|
|
375
611
|
|
|
376
|
-
const
|
|
377
|
-
//
|
|
612
|
+
const result4 = await agent.call("Show me providers with lots of denials");
|
|
613
|
+
// Second call: 2ms (cache hit - same semantic meaning)
|
|
378
614
|
```
|
|
379
615
|
|
|
380
|
-
|
|
616
|
+
## Embedded vs Clustered Deployment
|
|
617
|
+
|
|
618
|
+
### Embedded Mode (Default)
|
|
381
619
|
|
|
382
620
|
```javascript
|
|
383
|
-
const db = new GraphDB('http://
|
|
384
|
-
|
|
385
|
-
|
|
386
|
-
|
|
387
|
-
|
|
388
|
-
|
|
621
|
+
const db = new GraphDB('http://example.org/'); // In-memory, zero config
|
|
622
|
+
```
|
|
623
|
+
|
|
624
|
+
- **Storage:** RAM only (HashMap-based SPOC indexes)
|
|
625
|
+
- **Performance:** 449ns lookups, 146K triples/sec insert
|
|
626
|
+
- **Persistence:** None (data lost on restart)
|
|
627
|
+
- **Scaling:** Single process, up to ~100M triples
|
|
628
|
+
- **Use case:** Development, testing, embedded apps
|
|
629
|
+
|
|
630
|
+
### Clustered Mode (1B+ triples)
|
|
389
631
|
|
|
390
|
-
const result = await agent.ask("What should we avoid prescribing to Patient 7291?");
|
|
391
|
-
// Returns ACTUAL interactions from your formulary, not made-up drug names
|
|
392
632
|
```
|
|
633
|
+
+-----------------------------------------------------------------------------+
|
|
634
|
+
| DISTRIBUTED CLUSTER ARCHITECTURE |
|
|
635
|
+
| |
|
|
636
|
+
| +-------------------+ |
|
|
637
|
+
| | COORDINATOR | <- Routes queries, manages partitions |
|
|
638
|
+
| | (Raft consensus) | |
|
|
639
|
+
| +--------+----------+ |
|
|
640
|
+
| | |
|
|
641
|
+
| +--------+--------+--------+--------+ |
|
|
642
|
+
| | | | | | |
|
|
643
|
+
| v v v v v |
|
|
644
|
+
| +----+ +----+ +----+ +----+ +----+ |
|
|
645
|
+
| |Exec| |Exec| |Exec| |Exec| |Exec| <- Partition executors |
|
|
646
|
+
| | 0 | | 1 | | 2 | | 3 | | 4 | |
|
|
647
|
+
| +----+ +----+ +----+ +----+ +----+ |
|
|
648
|
+
| | | | | | |
|
|
649
|
+
| v v v v v |
|
|
650
|
+
| [===] [===] [===] [===] [===] <- Local RocksDB partitions |
|
|
651
|
+
| |
|
|
652
|
+
| HDRF Partitioning: Subject-anchored streaming (load factor < 1.1) |
|
|
653
|
+
| Shadow Partitions: Zero-downtime rebalancing (~10ms pause) |
|
|
654
|
+
| Apache Arrow: Columnar OLAP for analytical queries |
|
|
655
|
+
+-----------------------------------------------------------------------------+
|
|
656
|
+
```
|
|
657
|
+
|
|
658
|
+
**Deployment:**
|
|
659
|
+
```bash
|
|
660
|
+
# Kubernetes deployment
|
|
661
|
+
kubectl apply -f infra/k8s/coordinator.yaml
|
|
662
|
+
kubectl apply -f infra/k8s/executor.yaml
|
|
663
|
+
|
|
664
|
+
# Helm chart
|
|
665
|
+
helm install rust-kgdb ./infra/helm -n rust-kgdb --create-namespace
|
|
393
666
|
|
|
394
|
-
|
|
667
|
+
# Verify cluster
|
|
668
|
+
kubectl get pods -n rust-kgdb
|
|
669
|
+
```
|
|
670
|
+
|
|
671
|
+
### Memory in Clustered Mode
|
|
672
|
+
|
|
673
|
+
Agent memory scales with the cluster:
|
|
674
|
+
- Episodes partitioned by agent ID (locality)
|
|
675
|
+
- Embeddings replicated for fast similarity search
|
|
676
|
+
- Cross-partition queries via coordinator routing
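As a rough illustration of "episodes partitioned by agent ID", the sketch below routes every episode of one agent to the same partition with a plain hash. This is a stand-in, not the HDRF scheme named in the diagram, and `fnv1a`, `partitionFor`, and the sample episodes are hypothetical names introduced here.

```javascript
// FNV-1a, a small non-cryptographic 32-bit string hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Map an agent ID to one of `partitionCount` partitions.
function partitionFor(agentId, partitionCount) {
  return fnv1a(agentId) % partitionCount;
}

const PARTITIONS = 5; // the diagram above draws five executors (0 to 4)

const episodes = [
  { agent: 'fraud-detector', text: 'Provider P001 flagged' },
  { agent: 'underwriter', text: 'Acme Corp assessed' },
  { agent: 'fraud-detector', text: 'Ring of three claimants found' },
];

// Episodes from the same agent always land on the same partition (locality).
const placement = episodes.map((episode) => ({
  ...episode,
  partition: partitionFor(episode.agent, PARTITIONS),
}));
console.log(placement);
```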
|
|
677
|
+
|
|
678
|
+
## Concurrency Benchmarks
|
|
679
|
+
|
|
680
|
+
Measured with `node concurrency-benchmark.js` on darwin-x64:
|
|
681
|
+
|
|
682
|
+
### Write Scaling
|
|
683
|
+
+| Workers | Ops/Sec | Scaling Factor |
+|---------|---------|----------------|
+| 1 | 66,422 | 1.00x |
+| 2 | 79,480 | 1.20x |
+| 4 | 95,655 | 1.44x |
+| 8 | 111,357 | 1.68x |
+| 16 | 132,087 | 1.99x |
|
|
691
|
+
|
|
692
|
+
### Read Scaling
|
|
693
|
+
+| Workers | Ops/Sec | Scaling Factor |
+|---------|---------|----------------|
+| 1 | 290 | 1.00x |
+| 2 | 305 | 1.05x |
+| 4 | 307 | 1.06x |
+| 8 | 282 | 0.97x |
+| 16 | 302 | 1.04x |
|
|
701
|
+
|
|
702
|
+
### GraphFrame Scaling
|
|
703
|
+
+| Workers | Ops/Sec | Scaling Factor |
+|---------|---------|----------------|
+| 1 | 5,987 | 1.00x |
+| 2 | 6,532 | 1.09x |
+| 4 | 6,494 | 1.08x |
+| 8 | 6,715 | 1.12x |
+| 16 | 6,516 | 1.09x |
|
|
711
|
+
|
|
712
|
+
**Interpretation:**
|
|
713
|
+
- Writes scale sublinearly: throughput roughly doubles (1.99x) from 1 to 16 workers (lock-free dictionary)
|
|
714
|
+
- Reads plateau at roughly 300 ops/sec regardless of worker count (SPARQL parsing overhead dominates)
|
|
715
|
+
- GraphFrame stable (compute-bound, not I/O-bound)
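The scaling factors in the tables above are each row's throughput divided by the single-worker throughput. A minimal sketch that recomputes them for the write benchmark and adds parallel efficiency (speedup divided by worker count); the `scalingReport` helper is a name introduced here, and the numbers are copied from the Write Scaling table.

```javascript
// Write-scaling rows copied from the table above: [workers, ops/sec].
const writeScaling = [
  [1, 66422],
  [2, 79480],
  [4, 95655],
  [8, 111357],
  [16, 132087],
];

// Speedup is relative to the single-worker row; efficiency = speedup / workers.
function scalingReport(rows) {
  const baseline = rows[0][1];
  return rows.map(([workers, opsPerSec]) => {
    const speedup = opsPerSec / baseline;
    return { workers, speedup, efficiency: speedup / workers };
  });
}

const report = scalingReport(writeScaling);
for (const { workers, speedup, efficiency } of report) {
  console.log(
    `${workers} workers: ${speedup.toFixed(2)}x speedup, ` +
    `${(efficiency * 100).toFixed(0)}% parallel efficiency`
  );
}
```

Efficiency makes the shape of the curve explicit: a perfectly linear benchmark would stay at 100% on every row.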
|
|
716
|
+
|
|
717
|
+
## Real-World Examples
|
|
718
|
+
|
|
719
|
+
### Fraud Detection (NICB Dataset Patterns)
|
|
720
|
+
|
|
721
|
+
Based on National Insurance Crime Bureau fraud indicators:
|
|
395
722
|
|
|
396
723
|
```javascript
|
|
397
|
-
const
|
|
724
|
+
const { GraphDB, HyperMindAgent, DatalogProgram, evaluateDatalog, GraphFrame } = require('rust-kgdb');
|
|
725
|
+
|
|
726
|
+
// Create database with claims data
|
|
727
|
+
const db = new GraphDB('http://insurance.org/');
|
|
398
728
|
db.loadTtl(`
|
|
399
|
-
|
|
400
|
-
|
|
401
|
-
|
|
402
|
-
|
|
403
|
-
|
|
729
|
+
<http://insurance.org/PROV001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://insurance.org/Provider> .
|
|
730
|
+
<http://insurance.org/PROV001> <http://insurance.org/name> "ABC Medical" .
|
|
731
|
+
<http://insurance.org/PROV001> <http://insurance.org/denialRate> "0.34" .
|
|
732
|
+
<http://insurance.org/PROV001> <http://insurance.org/totalClaims> "89" .
|
|
733
|
+
<http://insurance.org/PROV001> <http://insurance.org/hasPattern> <http://insurance.org/UnbundledBilling> .
|
|
734
|
+
|
|
735
|
+
<http://insurance.org/CLMT001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://insurance.org/Claimant> .
|
|
736
|
+
<http://insurance.org/CLMT001> <http://insurance.org/address> "123 Main St" .
|
|
737
|
+
<http://insurance.org/CLMT002> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://insurance.org/Claimant> .
|
|
738
|
+
<http://insurance.org/CLMT002> <http://insurance.org/address> "123 Main St" .
|
|
739
|
+
<http://insurance.org/CLMT001> <http://insurance.org/knows> <http://insurance.org/CLMT002> .
|
|
740
|
+
`, null);
|
|
741
|
+
|
|
742
|
+
// Method 1: SPARQL for simple queries
|
|
743
|
+
const highDenial = db.querySelect(`
|
|
744
|
+
SELECT ?provider ?rate WHERE {
|
|
745
|
+
?provider <http://insurance.org/denialRate> ?rate .
|
|
746
|
+
FILTER(?rate > "0.2")  # string comparison: denialRate is stored as a plain literal
|
|
747
|
+
}
|
|
404
748
|
`);
|
|
405
749
|
|
|
406
|
-
//
|
|
750
|
+
// Method 2: Datalog for collusion detection
|
|
751
|
+
const datalog = new DatalogProgram();
|
|
752
|
+
datalog.addFact(JSON.stringify({predicate:'knows', terms:['CLMT001','CLMT002']}));
|
|
753
|
+
datalog.addFact(JSON.stringify({predicate:'sameAddress', terms:['CLMT001','CLMT002']}));
|
|
407
754
|
datalog.addRule(JSON.stringify({
|
|
408
|
-
head: {predicate:'potential_collusion', terms:['?X','?Y'
|
|
755
|
+
head: {predicate:'potential_collusion', terms:['?X','?Y']},
|
|
409
756
|
body: [
|
|
410
|
-
{predicate:'claimant', terms:['?X']},
|
|
411
|
-
{predicate:'claimant', terms:['?Y']},
|
|
412
757
|
{predicate:'knows', terms:['?X','?Y']},
|
|
413
|
-
{predicate:'
|
|
414
|
-
{predicate:'claimsWith', terms:['?Y','?P']}
|
|
758
|
+
{predicate:'sameAddress', terms:['?X','?Y']}
|
|
415
759
|
]
|
|
416
760
|
}));
|
|
761
|
+
const collusion = evaluateDatalog(datalog);
|
|
417
762
|
|
|
418
|
-
|
|
419
|
-
|
|
763
|
+
// Method 3: Motif for ring detection
|
|
764
|
+
const gf = new GraphFrame(
|
|
765
|
+
JSON.stringify([{id:'CLMT001'}, {id:'CLMT002'}, {id:'CLMT003'}]),
|
|
766
|
+
JSON.stringify([
|
|
767
|
+
{src:'CLMT001', dst:'CLMT002'},
|
|
768
|
+
{src:'CLMT002', dst:'CLMT003'},
|
|
769
|
+
{src:'CLMT003', dst:'CLMT001'}
|
|
770
|
+
])
|
|
771
|
+
);
|
|
772
|
+
const rings = gf.find('(a)-[e1]->(b); (b)-[e2]->(c); (c)-[e3]->(a)');
|
|
773
|
+
|
|
774
|
+
// Method 4: HyperAgent for natural language
|
|
775
|
+
const agent = new HyperMindAgent({ kg: db, name: 'fraud-detector' });
|
|
776
|
+
const result = await agent.call("Find suspicious billing patterns");
|
|
420
777
|
```
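To make the ring motif `(a)-[e1]->(b); (b)-[e2]->(c); (c)-[e3]->(a)` concrete, here is a plain-JavaScript sketch of what such a pattern matches. It does not use `GraphFrame`; the `findDirectedTriangles` helper is hypothetical, and the library's output format may differ.

```javascript
// Same three edges as the GraphFrame example above.
const edges = [
  { src: 'CLMT001', dst: 'CLMT002' },
  { src: 'CLMT002', dst: 'CLMT003' },
  { src: 'CLMT003', dst: 'CLMT001' },
];

// Enumerate every binding (a, b, c) with edges a->b, b->c and c->a.
function findDirectedTriangles(edgeList) {
  const successors = new Map();
  for (const { src, dst } of edgeList) {
    if (!successors.has(src)) successors.set(src, new Set());
    successors.get(src).add(dst);
  }
  const matches = [];
  for (const [a, aOut] of successors) {
    for (const b of aOut) {
      for (const c of successors.get(b) ?? []) {
        if (successors.get(c)?.has(a)) matches.push({ a, b, c });
      }
    }
  }
  return matches;
}

const bindings = findDirectedTriangles(edges);

// Each ring is matched once per rotation, so collapse rotations into one ring.
const uniqueRings = new Set(bindings.map(({ a, b, c }) => [a, b, c].sort().join(',')));

console.log(bindings.length, 'bindings,', uniqueRings.size, 'distinct ring(s)');
```

Pattern matchers of this kind usually return one row per variable binding, which is why a single ring can show up several times before deduplication.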
|
|
421
778
|
|
|
422
|
-
|
|
779
|
+
### Underwriting (ISO/ACORD Dataset Patterns)
|
|
423
780
|
|
|
424
|
-
|
|
781
|
+
Based on insurance industry standard data models:
|
|
425
782
|
|
|
426
|
-
```
|
|
427
|
-
|
|
428
|
-
node vanilla-vs-hypermind-benchmark.js
|
|
429
|
-
```
|
|
783
|
+
```javascript
|
|
784
|
+
const { GraphDB, HyperMindAgent, EmbeddingService } = require('rust-kgdb');
|
|
430
785
|
|
|
431
|
-
|
|
786
|
+
const db = new GraphDB('http://underwriting.org/');
|
|
787
|
+
db.loadTtl(`
|
|
788
|
+
<http://underwriting.org/APP001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://underwriting.org/Applicant> .
|
|
789
|
+
<http://underwriting.org/APP001> <http://underwriting.org/name> "Acme Corp" .
|
|
790
|
+
<http://underwriting.org/APP001> <http://underwriting.org/industry> "Manufacturing" .
|
|
791
|
+
<http://underwriting.org/APP001> <http://underwriting.org/employees> "250" .
|
|
792
|
+
<http://underwriting.org/APP001> <http://underwriting.org/creditScore> "720" .
|
|
793
|
+
|
|
794
|
+
<http://underwriting.org/COMP001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://underwriting.org/Applicant> .
|
|
795
|
+
<http://underwriting.org/COMP001> <http://underwriting.org/industry> "Manufacturing" .
|
|
796
|
+
<http://underwriting.org/COMP001> <http://underwriting.org/employees> "230" .
|
|
797
|
+
<http://underwriting.org/COMP001> <http://underwriting.org/premium> "625000" .
|
|
798
|
+
`, null);
|
|
799
|
+
|
|
800
|
+
// Embeddings for similarity search
|
|
801
|
+
const embeddings = new EmbeddingService();
|
|
802
|
+
const appVector = new Array(384).fill(0).map((_, i) => Math.sin(i / 10));
|
|
803
|
+
embeddings.storeVector('APP001', appVector);
|
|
804
|
+
embeddings.storeVector('COMP001', appVector.map(x => x * 0.95));
|
|
805
|
+
embeddings.rebuildIndex();
|
|
432
806
|
|
|
433
|
-
|
|
434
|
-
|
|
435
|
-
|
|
436
|
-
|
|
437
|
-
|
|
807
|
+
// Find similar accounts
|
|
808
|
+
const similar = embeddings.findSimilar('APP001', 5, 0.7);
|
|
809
|
+
|
|
810
|
+
// Direct SPARQL for comparables
|
|
811
|
+
const comparables = db.querySelect(`
|
|
812
|
+
SELECT ?company ?employees ?premium WHERE {
|
|
813
|
+
?company <http://underwriting.org/industry> "Manufacturing" .
|
|
814
|
+
?company <http://underwriting.org/employees> ?employees .
|
|
815
|
+
OPTIONAL { ?company <http://underwriting.org/premium> ?premium }
|
|
816
|
+
}
|
|
817
|
+
`);
|
|
818
|
+
|
|
819
|
+
// HyperAgent for risk assessment
|
|
820
|
+
const agent = new HyperMindAgent({
|
|
821
|
+
kg: db,
|
|
822
|
+
embeddings: embeddings,
|
|
823
|
+
name: 'underwriter'
|
|
824
|
+
});
|
|
825
|
+
const risk = await agent.call("Assess risk profile for Acme Corp");
|
|
826
|
+
```
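The two vectors stored in the example differ only by a scale factor of 0.95. Assuming the similarity measure is cosine similarity (an assumption; the README does not state which metric `findSimilar` uses), scaling does not change the score, so `COMP001` would clear the 0.7 threshold. A self-contained check, with `cosineSimilarity` as a helper written for this sketch:

```javascript
// Cosine similarity: dot(u, v) / (|u| * |v|).
function cosineSimilarity(u, v) {
  if (u.length !== v.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normU = 0;
  let normV = 0;
  for (let i = 0; i < u.length; i++) {
    dot += u[i] * v[i];
    normU += u[i] * u[i];
    normV += v[i] * v[i];
  }
  return dot / (Math.sqrt(normU) * Math.sqrt(normV));
}

// Same vectors as the underwriting example above.
const appVector = new Array(384).fill(0).map((_, i) => Math.sin(i / 10));
const compVector = appVector.map((x) => x * 0.95);

const similarity = cosineSimilarity(appVector, compVector);
console.log(similarity.toFixed(6));
```

With real embeddings the vectors would point in different directions, and the threshold argument would start to filter candidates.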
|
|
438
827
|
|
|
439
|
-
|
|
440
|
-
|
|
441
|
-
|
|
442
|
-
|
|
443
|
-
|
|
|
444
|
-
|
|
445
|
-
|
|
|
446
|
-
|
|
447
|
-
|
|
448
|
-
|
|
449
|
-
|
|
|
450
|
-
|
|
451
|
-
|
|
|
452
|
-
|
|
453
|
-
|
|
454
|
-
|
|
455
|
-
|
|
|
456
|
-
|
|
457
|
-
|
|
|
458
|
-
|
|
|
459
|
-
|
|
|
460
|
-
|
|
|
461
|
-
|
|
|
462
|
-
|
|
|
828
|
+
## Complete Feature List
|
|
829
|
+
|
|
830
|
+
### Core Database
|
|
831
|
+
+| Feature | Description | Performance |
+|---------|-------------|-------------|
+| SPARQL 1.1 Query | SELECT, CONSTRUCT, ASK, DESCRIBE | 449ns lookups |
+| SPARQL 1.1 Update | INSERT, DELETE, LOAD, CLEAR | 146K/sec |
+| RDF 1.2 | Quoted triples, annotations | W3C compliant |
+| Named Graphs | Quad store with graph isolation | O(1) switching |
+| Triple Indexing | SPOC/POCS/OCSP/CSPO | Sub-microsecond |
+| Storage Backends | InMemory, RocksDB, LMDB | Pluggable |
+| Apache Arrow OLAP | Columnar aggregations | Vectorized |
|
|
841
|
+
|
|
842
|
+
### Graph Analytics (GraphFrame)
|
|
843
|
+
+| Algorithm | Complexity | Description |
+|-----------|------------|-------------|
+| PageRank | O(V+E) per iteration | Damping, iterations configurable |
+| Connected Components | O(V+E) | Union-Find |
+| Triangle Count | O(E^1.5) | Optimized |
+| Shortest Paths | O(V+E) | Dijkstra |
+| Label Propagation | O(V+E) per iteration | Community detection |
+| Motif Finding | Pattern-dependent | DSL: `(a)-[e]->(b)` |
+| Pregel | BSP model | Custom vertex programs |
|
|
853
|
+
|
|
854
|
+
### AI/ML Features
|
|
855
|
+
+| Feature | Performance | Description |
+|---------|-------------|-------------|
+| HNSW Embeddings | 16ms/10K | 384-dimensional vectors |
+| Similarity Search | O(log n) | Approximate nearest neighbor |
+| Embedding Triggers | Auto on INSERT | OpenAI/Ollama providers |
+| Agent Memory | 94% recall @ 10K | Episodic + semantic |
+| Semantic Caching | 2ms hit | Hash-based deduplication |
|
|
863
|
+
|
|
864
|
+
### Reasoning Engine
|
|
865
|
+
+| Feature | Algorithm | Description |
+|---------|-----------|-------------|
+| Datalog | Semi-naive | Recursive rules |
+| Transitive Closure | Fixpoint | ancestor(X,Y) |
+| Stratified Negation | Stratified | NOT in bodies |
+| Rule Chaining | Forward | Multi-hop inference |
|
|
872
|
+
|
|
873
|
+
### Security and Audit
|
|
874
|
+
+| Feature | Implementation | Description |
+|---------|----------------|-------------|
+| WASM Sandbox | Fuel metering | 1M ops max |
+| Capabilities | Set-based | ReadKG, WriteKG |
+| ProofDAG | SHA-256 | Cryptographic audit |
+| Tool Validation | Type checking | Morphism composition |
|
|
881
|
+
|
|
882
|
+
### HyperAgent Framework
|
|
883
|
+
+| Feature | Description |
+|---------|-------------|
+| Schema-Aware Query Gen | Uses YOUR ontology |
+| Deterministic Planning | No LLM for queries |
+| Multi-Step Execution | SPARQL + Datalog + Motif |
+| Memory Hypergraph | Episodes link to KG |
+| Conversation Extraction | Auto-extract entities |
+| Idempotent Responses | Same question = same answer |
|
|
892
|
+
|
|
893
|
+
### Standards Compliance
|
|
894
|
+
+| Standard | Status |
+|----------|--------|
+| SPARQL 1.1 Query | 100% |
+| SPARQL 1.1 Update | 100% |
+| RDF 1.2 | 100% |
+| Turtle | 100% |
+| N-Triples | 100% |
|
|
463
902
|
|
|
464
903
|
## API Reference
|
|
465
904
|
|
|
466
905
|
### GraphDB
|
|
467
906
|
|
|
468
907
|
```javascript
|
|
469
|
-
const db = new GraphDB(baseUri)
|
|
470
|
-
db.loadTtl(turtle, graphUri)
|
|
471
|
-
db.querySelect(sparql)
|
|
472
|
-
db.queryConstruct(sparql)
|
|
473
|
-
db.countTriples()
|
|
474
|
-
db.clear()
|
|
908
|
+
const db = new GraphDB(baseUri) // Create database
|
|
909
|
+
db.loadTtl(turtle, graphUri) // Load RDF data
|
|
910
|
+
db.querySelect(sparql) // SELECT query -> results[]
|
|
911
|
+
db.queryConstruct(sparql) // CONSTRUCT -> triples string
|
|
912
|
+
db.countTriples() // Count triples -> number
|
|
913
|
+
db.clear() // Clear all data
|
|
914
|
+
db.getGraphUri() // Get base URI -> string
|
|
475
915
|
```
|
|
476
916
|
|
|
477
917
|
### GraphFrame
|
|
478
918
|
|
|
479
919
|
```javascript
|
|
480
920
|
const gf = new GraphFrame(verticesJson, edgesJson)
|
|
481
|
-
gf.
|
|
482
|
-
gf.
|
|
483
|
-
gf.
|
|
484
|
-
gf.
|
|
485
|
-
gf.
|
|
921
|
+
gf.vertexCount() // -> number
|
|
922
|
+
gf.edgeCount() // -> number
|
|
923
|
+
gf.pageRank(dampingFactor, iterations) // -> JSON string
|
|
924
|
+
gf.connectedComponents() // -> JSON string
|
|
925
|
+
gf.triangleCount() // -> number
|
|
926
|
+
gf.shortestPaths(landmarks) // -> JSON string
|
|
927
|
+
gf.labelPropagation(iterations) // -> JSON string
|
|
928
|
+
gf.find(motifPattern) // -> JSON string
|
|
929
|
+
gf.inDegrees() // -> JSON string
|
|
930
|
+
gf.outDegrees() // -> JSON string
|
|
931
|
+
gf.degrees() // -> JSON string
|
|
932
|
+
gf.toJson() // -> JSON string
|
|
486
933
|
```
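For readers unfamiliar with the two arguments of `gf.pageRank(dampingFactor, iterations)`, the following plain-JavaScript sketch shows the textbook power-iteration form of PageRank. It is not the library's implementation: the `pageRank` function, its defaults, and its handling of dangling vertices are choices made for this illustration.

```javascript
// Textbook PageRank by power iteration.
// Assumes every edge endpoint appears in `vertices`.
function pageRank(vertices, edges, dampingFactor = 0.85, iterations = 20) {
  const n = vertices.length;
  const outDegree = new Map(vertices.map((v) => [v, 0]));
  for (const { src } of edges) outDegree.set(src, outDegree.get(src) + 1);

  let rank = new Map(vertices.map((v) => [v, 1 / n]));
  for (let step = 0; step < iterations; step++) {
    // Rank held by vertices with no outgoing edges is spread evenly.
    let danglingMass = 0;
    for (const v of vertices) {
      if (outDegree.get(v) === 0) danglingMass += rank.get(v);
    }
    const base = (1 - dampingFactor) / n + (dampingFactor * danglingMass) / n;
    const next = new Map(vertices.map((v) => [v, base]));
    // Each vertex passes its rank along its outgoing edges in equal shares.
    for (const { src, dst } of edges) {
      next.set(dst, next.get(dst) + (dampingFactor * rank.get(src)) / outDegree.get(src));
    }
    rank = next;
  }
  return rank;
}

const ranks = pageRank(
  ['hub', 'a', 'b'],
  [
    { src: 'a', dst: 'hub' },
    { src: 'b', dst: 'hub' },
    { src: 'hub', dst: 'a' },
  ]
);
console.log(Object.fromEntries(ranks));
```

Each iteration visits every vertex and every edge once, which is where the per-iteration O(V+E) cost comes from.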
|
|
487
934
|
|
|
488
935
|
### EmbeddingService
|
|
489
936
|
|
|
490
937
|
```javascript
|
|
491
938
|
const emb = new EmbeddingService()
|
|
492
|
-
emb.storeVector(entityId, float32Array)
|
|
493
|
-
emb.
|
|
494
|
-
emb.
|
|
939
|
+
emb.storeVector(entityId, float32Array) // Store vector
|
|
940
|
+
emb.getVector(entityId) // -> Float32Array | null
|
|
941
|
+
emb.deleteVector(entityId) // Delete vector
|
|
942
|
+
emb.rebuildIndex() // Build HNSW index
|
|
943
|
+
emb.findSimilar(entityId, k, threshold) // -> JSON string
|
|
944
|
+
emb.findSimilarGraceful(entityId, k, t) // -> JSON string (no throw)
|
|
945
|
+
emb.isEnabled() // -> boolean
|
|
946
|
+
emb.getMetrics() // -> JSON string
|
|
947
|
+
emb.getCacheStats() // -> JSON string
|
|
948
|
+
emb.onTripleInsert(s, p, o, g) // Trigger hook
|
|
495
949
|
```
|
|
496
950
|
|
|
497
951
|
### DatalogProgram
|
|
498
952
|
|
|
499
953
|
```javascript
|
|
500
954
|
const dl = new DatalogProgram()
|
|
501
|
-
dl.addFact(factJson)
|
|
502
|
-
dl.addRule(ruleJson)
|
|
503
|
-
|
|
955
|
+
dl.addFact(factJson) // Add fact
|
|
956
|
+
dl.addRule(ruleJson) // Add rule
|
|
957
|
+
dl.factCount() // -> number
|
|
958
|
+
dl.ruleCount() // -> number
|
|
959
|
+
evaluateDatalog(dl) // -> JSON string (all inferred)
|
|
960
|
+
queryDatalog(dl, predicate) // -> JSON string (specific)
|
|
961
|
+
```
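Semi-naive evaluation, the strategy this README names for its Datalog engine, can be sketched without the library. The example below computes `ancestor` from `parent` facts using the two usual rules, `ancestor(X,Y) :- parent(X,Y)` and `ancestor(X,Z) :- ancestor(X,Y), parent(Y,Z)`. The function name and the array-of-pairs fact format are assumptions made for this sketch, not the `DatalogProgram` JSON format.

```javascript
// Semi-naive fixpoint: each round joins only the facts derived in the
// previous round (the delta) against parent/2, never the full relation.
function ancestorClosure(parentFacts) {
  const keyOf = (x, y) => `${x}\u0000${y}`;

  // Index parent(Y, Z) by Y so the join is a lookup.
  const childrenOf = new Map();
  for (const [parent, child] of parentFacts) {
    if (!childrenOf.has(parent)) childrenOf.set(parent, []);
    childrenOf.get(parent).push(child);
  }

  const known = new Set();
  const ancestors = [];
  let delta = [];

  // Rule 1: ancestor(X, Y) :- parent(X, Y).
  for (const [x, y] of parentFacts) {
    if (!known.has(keyOf(x, y))) {
      known.add(keyOf(x, y));
      ancestors.push([x, y]);
      delta.push([x, y]);
    }
  }

  // Rule 2: ancestor(X, Z) :- ancestor(X, Y), parent(Y, Z).
  while (delta.length > 0) {
    const nextDelta = [];
    for (const [x, y] of delta) {
      for (const z of childrenOf.get(y) || []) {
        if (!known.has(keyOf(x, z))) {
          known.add(keyOf(x, z));
          ancestors.push([x, z]);
          nextDelta.push([x, z]);
        }
      }
    }
    delta = nextDelta; // stop once a round derives nothing new
  }
  return ancestors;
}

const derived = ancestorClosure([
  ['alice', 'bob'],
  ['bob', 'carol'],
  ['carol', 'dave'],
]);
console.log(derived);
```

Because already-known facts are never re-joined, the loop also terminates on cyclic input.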
|
|
962
|
+
|
|
963
|
+
### HyperMindAgent
|
|
964
|
+
|
|
965
|
+
```javascript
|
|
966
|
+
const agent = new HyperMindAgent({
|
|
967
|
+
kg: db, // REQUIRED: GraphDB
|
|
968
|
+
embeddings: embeddingService, // Optional: EmbeddingService
|
|
969
|
+
name: 'agent-name', // Optional: string
|
|
970
|
+
apiKey: process.env.OPENAI_API_KEY, // Optional: LLM API key
|
|
971
|
+
sandbox: { // Optional: security config
|
|
972
|
+
capabilities: ['ReadKG'],
|
|
973
|
+
fuelLimit: 1000000
|
|
974
|
+
}
|
|
975
|
+
})
|
|
976
|
+
|
|
977
|
+
const result = await agent.call(question) // Natural language query
|
|
978
|
+
// result.answer -> string (human-readable)
|
|
979
|
+
// result.explanation -> string (execution trace)
|
|
980
|
+
// result.proof -> object (SHA-256 audit trail)
|
|
504
981
|
```
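The `result.proof` field is described above as a SHA-256 audit trail. One common way to build such a trail is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that idea with Node's built-in `crypto` module. It is a linear chain rather than a DAG, and `appendStep`, `verifyChain`, and the step fields are hypothetical, not the library's proof format.

```javascript
const crypto = require('crypto');

const GENESIS = '0'.repeat(64); // placeholder "previous hash" for the first entry

function sha256Hex(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Append a step whose hash covers both its content and the previous hash.
function appendStep(chain, step) {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  const hash = sha256Hex(prevHash + JSON.stringify(step));
  chain.push({ step, prevHash, hash });
  return chain;
}

// Recompute every hash; an edit to any earlier step breaks the chain.
function verifyChain(chain) {
  let prevHash = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== sha256Hex(prevHash + JSON.stringify(entry.step))) return false;
    prevHash = entry.hash;
  }
  return true;
}

const trail = [];
appendStep(trail, { tool: 'sparql', input: 'SELECT ?provider ...', rows: 1 });
appendStep(trail, { tool: 'datalog', input: 'potential_collusion', rows: 1 });

console.log('intact:', verifyChain(trail));
```

An auditor who holds only the final hash can detect tampering with any step by replaying the chain.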
|
|
505
982
|
|
|
506
983
|
### Factory Functions
|
|
507
984
|
|
|
508
985
|
```javascript
|
|
509
|
-
friendsGraph()
|
|
510
|
-
chainGraph(n)
|
|
511
|
-
starGraph(n)
|
|
512
|
-
completeGraph(n)
|
|
513
|
-
cycleGraph(n)
|
|
986
|
+
friendsGraph() // Sample social graph
|
|
987
|
+
chainGraph(n) // Linear path: v0 -> v1 -> ... -> vn-1
|
|
988
|
+
starGraph(n) // Hub with n spokes
|
|
989
|
+
completeGraph(n) // Fully connected Kn
|
|
990
|
+
cycleGraph(n) // Ring: v0 -> v1 -> ... -> vn-1 -> v0
|
|
991
|
+
binaryTreeGraph(depth) // Binary tree
|
|
992
|
+
bipartiteGraph(m, n) // Bipartite Km,n
|
|
514
993
|
```
|
|
515
994
|
|
|
516
|
-
##
|
|
995
|
+
## Running Benchmarks
|
|
517
996
|
|
|
518
997
|
```bash
|
|
519
|
-
|
|
520
|
-
|
|
998
|
+
# Core engine benchmarks
|
|
999
|
+
node benchmark.js
|
|
521
1000
|
|
|
522
|
-
|
|
1001
|
+
# Concurrency benchmarks
|
|
1002
|
+
node concurrency-benchmark.js
|
|
523
1003
|
|
|
524
|
-
|
|
1004
|
+
# Memory retrieval benchmarks
|
|
1005
|
+
node memory-retrieval-benchmark.js
|
|
1006
|
+
|
|
1007
|
+
# HyperMind vs Vanilla LLM (requires API key)
|
|
1008
|
+
ANTHROPIC_API_KEY=... node vanilla-vs-hypermind-benchmark.js
|
|
1009
|
+
|
|
1010
|
+
# Framework comparison (requires Python + API key)
|
|
1011
|
+
OPENAI_API_KEY=... python3 benchmark-frameworks.py
|
|
1012
|
+
```
|
|
525
1013
|
|
|
526
1014
|
## License
|
|
527
1015
|
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "rust-kgdb",
|
|
3
|
-
"version": "0.6.
|
|
3
|
+
"version": "0.6.67",
|
|
4
4
|
"description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
|
|
5
5
|
"main": "index.js",
|
|
6
6
|
"types": "index.d.ts",
|