rust-kgdb 0.6.9 → 0.6.13
- package/CHANGELOG.md +46 -0
- package/README.archive.md +2632 -0
- package/README.md +839 -2267
- package/examples/fraud-detection-agent.js +458 -7
- package/examples/underwriting-agent.js +651 -20
- package/hypermind-agent.js +2221 -76
- package/index.js +28 -0
- package/ontology/agent-memory.ttl +421 -0
- package/package.json +10 -2
package/README.md
CHANGED
|
@@ -4,742 +4,515 @@
|
|
|
4
4
|
[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
|
|
5
5
|
[W3C SPARQL 1.1](https://www.w3.org/TR/sparql11-query/)
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
**High-Performance Knowledge Graph Database for Node.js**
|
|
8
8
|
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
## The Problem
|
|
9
|
+
Native Rust RDF/SPARQL engine with graph analytics, embeddings, and rule-based reasoning.
|
|
12
10
|
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
It returned this broken output:
|
|
11
|
+
---
|
|
16
12
|
|
|
17
|
-
|
|
18
|
-
```sparql
|
|
19
|
-
SELECT ?professor WHERE { ?professor a ub:Faculty . }
|
|
20
|
-
```
|
|
21
|
-
This query retrieves faculty members from the knowledge graph.
|
|
22
|
-
```
|
|
13
|
+
## What rust-kgdb Provides
|
|
23
14
|
|
|
24
|
-
|
|
15
|
+
### Core Database
|
|
16
|
+
- **GraphDB** - W3C compliant RDF quad store with SPOC/POCS/OCSP/CSPO indexes
|
|
17
|
+
- **SPARQL 1.1** - Full query and update support (64 builtin functions)
|
|
18
|
+
- **RDF 1.2** - Complete standard implementation
|
|
25
19
|
|
|
26
|
-
|
|
20
|
+
### Graph Analytics (GraphFrames)
|
|
21
|
+
- **PageRank** - Iterative ranking algorithm
|
|
22
|
+
- **Connected Components** - Union-find based component detection
|
|
23
|
+
- **Shortest Paths** - Landmark-based path finding
|
|
24
|
+
- **Triangle Count** - Graph density measurement
|
|
25
|
+
- **Motif Finding** - Pattern matching DSL (e.g., `"(a)-[e1]->(b); (b)-[e2]->(c)"`)
|
|
26
|
+
- **Label Propagation** - Community detection
|
|
27
|
+
- **Pregel API** - Bulk Synchronous Parallel computation model
|
|
27
28
|
|
|
28
|
-
|
|
29
|
+
### Why GraphFrames + SQL over SPARQL?
|
|
29
30
|
|
|
30
|
-
|
|
31
|
+
SPARQL excels at graph pattern matching but struggles with:
|
|
32
|
+
- **Aggregations over large result sets** - SQL's columnar execution is 10-100x faster
|
|
33
|
+
- **Window functions** - Running totals, rankings, moving averages
|
|
34
|
+
- **Join optimization** - Apache DataFusion's query planner with predicate pushdown
|
|
35
|
+
- **Interoperability** - Export to Parquet, connect to BI tools
|
|
31
36
|
|
|
32
|
-
|
|
37
|
+
GraphFrames bridges this gap: your data stays in RDF, but analytics run on Apache Arrow columnar format via DataFusion.
|
|
33
38
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
│
|
|
40
|
-
┌────────────────────────────────────▼────────────────────────────────────────────┐
|
|
41
|
-
│ HYPERMIND AGENT FRAMEWORK (SDK Layer) │
|
|
42
|
-
│ ┌────────────────────────────────────────────────────────────────────────────┐ │
|
|
43
|
-
│ │ Mathematical Abstractions (High-Level) │ │
|
|
44
|
-
│ │ • TypeId: Hindley-Milner type system with refinement types │ │
|
|
45
|
-
│ │ • LLMPlanner: Natural language → typed tool pipelines │ │
|
|
46
|
-
│ │ • WasmSandbox: WASM isolation with capability-based security │ │
|
|
47
|
-
│ │ • AgentBuilder: Fluent composition of typed tools │ │
|
|
48
|
-
│ │ • ExecutionWitness: Cryptographic proofs (SHA-256) │ │
|
|
49
|
-
│ └────────────────────────────────────────────────────────────────────────────┘ │
|
|
50
|
-
│ │ │
|
|
51
|
-
│ Category Theory: Tools as Morphisms (A → B) │
|
|
52
|
-
│ Proof Theory: Every execution has a witness │
|
|
53
|
-
└────────────────────────────────────┬────────────────────────────────────────────┘
|
|
54
|
-
│ NAPI-RS Bindings
|
|
55
|
-
┌────────────────────────────────────▼────────────────────────────────────────────┐
|
|
56
|
-
│ RUST CORE ENGINE (Native Performance) │
|
|
57
|
-
│ ┌────────────────────────────────────────────────────────────────────────────┐ │
|
|
58
|
-
│ │ GraphDB │ RDF/SPARQL quad store │ 2.78µs lookups, 24 bytes/triple│
|
|
59
|
-
│ │ GraphFrame │ Graph algorithms │ WCOJ optimal joins, PageRank │
|
|
60
|
-
│ │ EmbeddingService │ Vector similarity │ HNSW index, 1-hop ARCADE cache│
|
|
61
|
-
│ │ DatalogProgram │ Rule-based reasoning │ Semi-naive evaluation │
|
|
62
|
-
│ │ Pregel │ BSP graph processing │ Iterative algorithms │
|
|
63
|
-
│ └────────────────────────────────────────────────────────────────────────────┘ │
|
|
64
|
-
│ │
|
|
65
|
-
│ W3C Standards: SPARQL 1.1 (100%) | RDF 1.2 | OWL 2 RL | SHACL | RDFS │
|
|
66
|
-
│ Storage Backends: InMemory | RocksDB | LMDB │
|
|
67
|
-
│ Distribution: HDRF Partitioning | Raft Consensus | gRPC │
|
|
68
|
-
└──────────────────────────────────────────────────────────────────────────────────┘
|
|
69
|
-
```
|
|
39
|
+
### Distributed Cluster (v0.2.0)
|
|
40
|
+
- **HDRF Partitioning** - High-Degree-Replicated-First streaming partitioner
|
|
41
|
+
- **Coordinator + Executors** - gRPC-based query distribution
|
|
42
|
+
- **Raft Consensus** - Distributed coordination (planned)
|
|
43
|
+
- **Kubernetes Native** - Helm charts included
|
|
70
44
|
|
|
71
|
-
|
|
45
|
+
### AI & Embeddings
|
|
46
|
+
- **EmbeddingService** - HNSW approximate nearest neighbor search
|
|
47
|
+
- **1-Hop ARCADE Cache** - Neighbor-aware embedding retrieval
|
|
48
|
+
- **Multiple Providers** - OpenAI, Ollama, Anthropic, or custom
|
|
72
49
|
|
|
73
|
-
###
|
|
50
|
+
### Reasoning
|
|
51
|
+
- **Datalog** - Semi-naive rule evaluation with stratified negation
|
|
52
|
+
- **HyperMindAgent** - Pattern-based intent classification (no LLM calls)
|
|
74
53
|
|
|
75
|
-
|
|
54
|
+
### Mathematical Foundations (HyperMind Framework)
|
|
76
55
|
|
|
77
|
-
|
|
78
|
-
|-----------|---------------|-------------|-------|
|
|
79
|
-
| **GraphDB** | Rust via NAPI-RS | 2.78µs lookups | Zero-copy RDF quad store |
|
|
80
|
-
| **GraphFrame** | Rust via NAPI-RS | WCOJ optimal | PageRank, triangles, components |
|
|
81
|
-
| **EmbeddingService** | Rust via NAPI-RS | Sub-ms search | HNSW index + 1-hop cache |
|
|
82
|
-
| **DatalogProgram** | Rust via NAPI-RS | Semi-naive eval | Rule-based reasoning |
|
|
83
|
-
| **Pregel** | Rust via NAPI-RS | BSP model | Iterative graph algorithms |
|
|
84
|
-
| **TypeId** | Rust via NAPI-RS | N/A | Hindley-Milner type system |
|
|
85
|
-
| **LLMPlanner** | JavaScript + HTTP | LLM latency | Orchestrates Rust tools via Claude/GPT |
|
|
86
|
-
| **WasmSandbox** | Rust via NAPI-RS | Capability check | WASM isolation runtime |
|
|
87
|
-
| **AgentBuilder** | Rust via NAPI-RS | N/A | Fluent tool composition |
|
|
88
|
-
| **ExecutionWitness** | Rust via NAPI-RS | SHA-256 | Cryptographic audit proofs |
|
|
56
|
+
The HyperMind agent framework is built on three mathematical pillars:
|
|
89
57
|
|
|
90
|
-
|
|
58
|
+
| Theory | Purpose | Implementation |
|
|
59
|
+
|--------|---------|----------------|
|
|
60
|
+
| **Type Theory** | Compile-time contracts for tool inputs/outputs | Hindley-Milner type inference, refinement types |
|
|
61
|
+
| **Category Theory** | Tool composition with mathematical guarantees | Morphisms (A → B), functors, natural transformations |
|
|
62
|
+
| **Proof Theory** | Every execution produces a verifiable witness | Curry-Howard correspondence, proof DAGs |
|
|
91
63
|
|
|
92
|
-
|
|
64
|
+
**Example**: A fraud detection query composes morphisms:
|
|
65
|
+
```
|
|
66
|
+
Query → BindingSet → RiskScore → FraudReport
|
|
67
|
+
(morphism) (morphism) (morphism)
|
|
68
|
+
```
|
|
69
|
+
Each step has typed contracts. Composition is validated at compile time.
|
|
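One way to picture the typed pipeline is with plain JavaScript objects that carry their input and output type names. The sketch below is illustrative only: `morphism` and `compose` are hypothetical helpers, not part of the rust-kgdb or HyperMind API, and the type check runs when the pipeline is composed (before any data flows) rather than in a compiler. The three steps are toy stand-ins for the stages in the diagram.

```javascript
// Hypothetical helpers for illustration; not part of the rust-kgdb API.
function morphism(name, from, to, fn) {
  return { name, from, to, fn }
}

// Compose morphisms left to right. The pipeline is rejected up front if any
// step's output type does not match the next step's input type.
function compose(...steps) {
  if (steps.length === 0) throw new TypeError('compose needs at least one morphism')
  for (let i = 1; i < steps.length; i++) {
    const prev = steps[i - 1]
    const next = steps[i]
    if (prev.to !== next.from) {
      throw new TypeError(`${prev.name}: ${prev.to} does not feed ${next.name}: ${next.from}`)
    }
  }
  return morphism(
    steps.map(s => s.name).join(' >> '),
    steps[0].from,
    steps[steps.length - 1].to,
    input => steps.reduce((value, step) => step.fn(value), input)
  )
}

// Toy stand-ins for Query → BindingSet → RiskScore → FraudReport.
const runQuery = morphism('runQuery', 'Query', 'BindingSet',
  query => [{ provider: 'P001', claims: 12 }, { provider: 'P002', claims: 2 }])
const scoreRisk = morphism('scoreRisk', 'BindingSet', 'RiskScore',
  rows => rows.map(r => ({ provider: r.provider, risk: Math.min(1, r.claims / 10) })))
const buildReport = morphism('buildReport', 'RiskScore', 'FraudReport',
  scores => ({ flagged: scores.filter(s => s.risk > 0.8).map(s => s.provider) }))

const detectFraud = compose(runQuery, scoreRisk, buildReport)
const report = detectFraud.fn('SELECT ?provider WHERE { ?provider a ?type }')
console.log(detectFraud.from, '→', detectFraud.to, report)
```

Swapping two steps, for example `compose(scoreRisk, runQuery)`, throws a `TypeError` at composition time, which is the property the typed contracts are meant to provide.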
93
70
|
|
|
94
|
-
|
|
71
|
+
### Security: Object Capability Model (WASM Sandbox)
|
|
95
72
|
|
|
96
|
-
|
|
73
|
+
Unlike MCP (Model Context Protocol), which relies on trust-based access, rust-kgdb uses an **Object Capability (OCAP) security model**:
|
|
97
74
|
|
|
98
|
-
|
|
75
|
+
| Aspect | MCP | rust-kgdb WASM Sandbox |
|
|
76
|
+
|--------|-----|------------------------|
|
|
77
|
+
| **Access Control** | Trust-based (server decides) | Capability-based (code has what it's given) |
|
|
78
|
+
| **Isolation** | Process boundaries | WASM linear memory isolation |
|
|
79
|
+
| **Resource Limits** | None built-in | Fuel metering (CPU), memory limits |
|
|
80
|
+
| **Audit Trail** | Optional logging | Built-in execution trace |
|
|
99
81
|
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
82
|
+
**Capabilities** granted to agents:
|
|
83
|
+
```javascript
|
|
84
|
+
const agent = new HyperMindAgent({
|
|
85
|
+
kg: db,
|
|
86
|
+
sandbox: {
|
|
87
|
+
capabilities: ['ReadKG', 'ExecuteTool'], // No WriteKG = read-only
|
|
88
|
+
fuelLimit: 1_000_000 // CPU budget
|
|
89
|
+
}
|
|
90
|
+
})
|
|
103
91
|
```
|
|
104
92
|
|
|
105
|
-
|
|
93
|
+
Available capabilities: `ReadKG`, `WriteKG`, `ExecuteTool`, `SpawnAgent`, `HttpAccess`
|
|
106
94
|
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
**
|
|
95
|
+
**Why OCAP over MCP?**
|
|
96
|
+
- **Principle of Least Authority**: Agent only has capabilities explicitly granted
|
|
97
|
+
- **No Ambient Authority**: Can't access resources just because they exist
|
|
98
|
+
- **Composable Security**: Capabilities can be attenuated when passed down
|
|
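The three properties above can be sketched in a few lines of plain JavaScript. This is not the WASM sandbox itself: `grant`, `attenuate`, and `guarded` are hypothetical names chosen for illustration, and only the capability names come from the README. The point is that authority lives in an object the agent is handed, and a delegated copy can only be narrower than its parent.

```javascript
// Capability names are taken from the README; every helper below is hypothetical.
const KNOWN_CAPABILITIES = ['ReadKG', 'WriteKG', 'ExecuteTool', 'SpawnAgent', 'HttpAccess']

function grant(names) {
  const held = new Set(names) // private to this closure, so holders cannot add to it
  for (const name of held) {
    if (!KNOWN_CAPABILITIES.includes(name)) throw new Error(`unknown capability: ${name}`)
  }
  return Object.freeze({
    has: name => held.has(name),
    list: () => [...held],
    // Attenuation: a child grant may only narrow what the parent already holds.
    attenuate(subset) {
      for (const name of subset) {
        if (!held.has(name)) throw new Error(`cannot delegate ${name}: not held by parent`)
      }
      return grant(subset)
    }
  })
}

// Wrap an action so it runs only when the required capability is present.
function guarded(capabilities, required, action) {
  return (...args) => {
    if (!capabilities.has(required)) throw new Error(`missing capability: ${required}`)
    return action(...args)
  }
}

const parent = grant(['ReadKG', 'ExecuteTool'])
const child = parent.attenuate(['ReadKG']) // read-only sub-agent

const triples = []
const readCount = guarded(child, 'ReadKG', () => triples.length)
const writeTriple = guarded(child, 'WriteKG', triple => triples.push(triple))

console.log(readCount()) // allowed: child holds ReadKG
try {
  writeTriple('<s> <p> <o> .')
} catch (err) {
  console.log(err.message) // child was never given WriteKG
}
```

Because the set of held capabilities is closed over rather than exposed, there is no ambient way for the child to acquire `WriteKG`; it would have to be granted explicitly by something that already holds it.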
110
99
|
|
|
111
100
|
---
|
|
112
101
|
|
|
113
|
-
##
|
|
114
|
-
|
|
115
|
-
Fixing SPARQL syntax is table stakes. Here's what keeps enterprise architects up at night:
|
|
116
|
-
|
|
117
|
-
**Scenario**: Your fraud detection agent correctly identified a circular payment ring last Tuesday. Today, an analyst asks: *"Show me similar patterns to what we found last week."*
|
|
118
|
-
|
|
119
|
-
The LLM response: *"I don't have access to previous conversations. Can you describe what you're looking for?"*
|
|
102
|
+
## Architecture Layers
|
|
120
103
|
|
|
121
|
-
|
|
104
|
+
### Layer Diagram
|
|
122
105
|
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
106
|
+
```
|
|
107
|
+
┌─────────────────────────────────────────────────────────────────────────┐
|
|
108
|
+
│ YOUR APPLICATION │
|
|
109
|
+
│ (Fraud Detection, Risk Analysis, Compliance) │
|
|
110
|
+
└────────────────────────────────┬────────────────────────────────────────┘
|
|
111
|
+
│
|
|
112
|
+
┌────────────────────────────────▼────────────────────────────────────────┐
|
|
113
|
+
│ LAYER 1: SDK BINDINGS │
|
|
114
|
+
│ TypeScript (NAPI-RS) | Python (UniFFI) | Kotlin (UniFFI) | Swift │
|
|
115
|
+
└────────────────────────────────┬────────────────────────────────────────┘
|
|
116
|
+
│
|
|
117
|
+
┌────────────────────────────────▼────────────────────────────────────────┐
|
|
118
|
+
│ LAYER 2: HYPERMIND FRAMEWORK │
|
|
119
|
+
├─────────────────────────────────────────────────────────────────────────┤
|
|
120
|
+
│ Intent Classification │ Tool Orchestration │ Memory Management │
|
|
121
|
+
│ (keyword patterns) │ (morphism compose) │ (episode storage) │
|
|
122
|
+
├─────────────────────────────────────────────────────────────────────────┤
|
|
123
|
+
│ Type Theory │ Category Theory │ Proof Theory │
|
|
124
|
+
│ (Hindley-Milner) │ (morphisms A→B) │ (Curry-Howard) │
|
|
125
|
+
├─────────────────────────────────────────────────────────────────────────┤
|
|
126
|
+
│ WASM Sandbox: Object Capability Security + Fuel Metering │
|
|
127
|
+
└────────────────────────────────┬────────────────────────────────────────┘
|
|
128
|
+
│
|
|
129
|
+
┌────────────────────────────────▼────────────────────────────────────────┐
|
|
130
|
+
│ LAYER 3: RUST CORE ENGINES │
|
|
131
|
+
├──────────────────┬──────────────────┬──────────────────┬────────────────┤
|
|
132
|
+
│ RDF/SPARQL │ GraphFrames │ Embeddings │ Datalog │
|
|
133
|
+
│ • Quad Store │ • DataFusion SQL │ • HNSW ANN │ • Semi-naive │
|
|
134
|
+
│ • SPOC Indexes │ • Arrow Columnar │ • 1-Hop Cache │ • Stratified │
|
|
135
|
+
│ • 64 Builtins │ • Pregel BSP │ • Multi-Provider │ • Negation │
|
|
136
|
+
└──────────────────┴──────────────────┴──────────────────┴────────────────┘
|
|
137
|
+
│
|
|
138
|
+
┌────────────────────────────────▼────────────────────────────────────────┐
|
|
139
|
+
│ LAYER 4: STORAGE │
|
|
140
|
+
│ InMemory (HashMap) │ RocksDB (LSM-tree) │ LMDB (B+tree, mmap) │
|
|
141
|
+
└────────────────────────────────┬────────────────────────────────────────┘
|
|
142
|
+
│
|
|
143
|
+
┌────────────────────────────────▼────────────────────────────────────────┐
|
|
144
|
+
│ LAYER 5: DISTRIBUTED (v0.2.0) │
|
|
145
|
+
│ HDRF Partitioner │ gRPC Protocol │ Coordinator/Executor │ Raft (planned)│
|
|
146
|
+
└─────────────────────────────────────────────────────────────────────────┘
|
|
147
|
+
```
|
|
130
148
|
|
|
131
|
-
|
|
132
|
-
1. **Temporal awareness** - What we decided *last month* vs *yesterday*
|
|
133
|
-
2. **Semantic edges** - The decision *relates to* these specific claims
|
|
134
|
-
3. **Epistemological stratification** - Fact vs inference vs hypothesis
|
|
135
|
-
4. **Proof chain** - *Why* we decided this, not just *that* we did
|
|
149
|
+
### Memory Hypergraph: Temporal + Long-Term Knowledge
|
|
136
150
|
|
|
137
|
-
|
|
151
|
+
The Memory Hypergraph solves a fundamental AI agent problem: **memory persistence across sessions**.
|
|
138
152
|
|
|
139
|
-
|
|
153
|
+
**Two Storage Layers, One Quad Store**:
|
|
140
154
|
|
|
141
|
-
|
|
155
|
+
| Layer | Purpose | Lifespan | Named Graph |
|
|
156
|
+
|-------|---------|----------|-------------|
|
|
157
|
+
| **Temporal Memory** | Agent episodes, conversations, findings | Session → months | `https://gonnect.ai/memory/` |
|
|
158
|
+
| **Long-Term Knowledge** | Domain facts, entities, relationships | Permanent | Default graph |
|
|
142
159
|
|
|
143
|
-
|
|
160
|
+
**How They Connect**:
|
|
144
161
|
|
|
145
162
|
```
|
|
146
|
-
|
|
147
|
-
│
|
|
148
|
-
│
|
|
149
|
-
│
|
|
150
|
-
│ │
|
|
151
|
-
│ │
|
|
152
|
-
│ │
|
|
153
|
-
│ │
|
|
154
|
-
│ │
|
|
155
|
-
│ │
|
|
156
|
-
│ │
|
|
157
|
-
│ │ │
|
|
158
|
-
│ │
|
|
159
|
-
│ │
|
|
160
|
-
│ │
|
|
161
|
-
│ │
|
|
162
|
-
│
|
|
163
|
-
│
|
|
164
|
-
│ │
|
|
165
|
-
│
|
|
166
|
-
|
|
167
|
-
│
|
|
168
|
-
│ │
|
|
169
|
-
│
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
│
|
|
173
|
-
│
|
|
174
|
-
│
|
|
175
|
-
│ │
|
|
176
|
-
│ │
|
|
177
|
-
│ │
|
|
178
|
-
│ │
|
|
179
|
-
│ │
|
|
180
|
-
│
|
|
181
|
-
│
|
|
182
|
-
│
|
|
183
|
-
│
|
|
184
|
-
│
|
|
185
|
-
│
|
|
186
|
-
│
|
|
187
|
-
│
|
|
188
|
-
│
|
|
189
|
-
│
|
|
190
|
-
|
|
191
|
-
│ │ │ │
|
|
192
|
-
│ │ Default: α=0.3, β=0.5, γ=0.2 │ │
|
|
193
|
-
│ └─────────────────────────────────────────────────────────────────────────┘ │
|
|
194
|
-
│ │
|
|
195
|
-
└─────────────────────────────────────────────────────────────────────────────────┘
|
|
163
|
+
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
164
|
+
│ TEMPORAL MEMORY LAYER │
|
|
165
|
+
│ (Named Graph: https://gonnect.ai/memory/) │
|
|
166
|
+
│ │
|
|
167
|
+
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
168
|
+
│ │ Episode:001 │────→│ Episode:002 │────→│ Episode:003 │ │
|
|
169
|
+
│ │ │ │ │ │ │ │
|
|
170
|
+
│ │ prompt: │ │ prompt: │ │ prompt: │ │
|
|
171
|
+
│ │ "Investigate │ │ "Check claim │ │ "Summarize │ │
|
|
172
|
+
│ │ P001" │ │ C123" │ │ investigation"│ │
|
|
173
|
+
│ │ │ │ │ │ │ │
|
|
174
|
+
│ │ timestamp: │ │ timestamp: │ │ timestamp: │ │
|
|
175
|
+
│ │ Dec 10 9:00 │ │ Dec 12 14:30 │ │ Dec 14 11:00 │ │
|
|
176
|
+
│ │ │ │ │ │ │ │
|
|
177
|
+
│ │ success: ✓ │ │ success: ✓ │ │ success: ✓ │ │
|
|
178
|
+
│ │ │ │ │ │ │ │
|
|
179
|
+
│ │ accessCount: │ │ accessCount: │ │ accessCount: │ │
|
|
180
|
+
│ │ 5 │ │ 3 │ │ 1 │ │
|
|
181
|
+
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
|
182
|
+
│ │ am:kgEntity │ am:kgEntity │ am:kgEntity │
|
|
183
|
+
└──────────┼────────────────────┼────────────────────┼────────────────────────┘
|
|
184
|
+
│ │ │
|
|
185
|
+
│ HYPER-EDGES │ (link temporal │ to permanent)
|
|
186
|
+
│ ═══════════ │ │
|
|
187
|
+
▼ ▼ ▼
|
|
188
|
+
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
189
|
+
│ LONG-TERM KNOWLEDGE LAYER │
|
|
190
|
+
│ (Default Graph) │
|
|
191
|
+
│ │
|
|
192
|
+
│ ┌────────────────┐ ┌────────────────┐ │
|
|
193
|
+
│ │ Provider:P001 │───submittedClaim──→│ Claim:C123 │ │
|
|
194
|
+
│ │ │ │ │ │
|
|
195
|
+
│ │ riskScore: 0.87│ │ amount: $50000 │ │
|
|
196
|
+
│ │ name: "MedCorp"│ │ status: "open" │ │
|
|
197
|
+
│ └────────────────┘ └───────┬────────┘ │
|
|
198
|
+
│ │ │
|
|
199
|
+
│ filedBy│ │
|
|
200
|
+
│ ▼ │
|
|
201
|
+
│ ┌────────────────┐ │
|
|
202
|
+
│ │ Claimant:C001 │ │
|
|
203
|
+
│ │ │ │
|
|
204
|
+
│ │ name: "J.Smith"│ │
|
|
205
|
+
│ │ riskScore: 0.85│ │
|
|
206
|
+
│ └────────────────┘ │
|
|
207
|
+
└─────────────────────────────────────────────────────────────────────────────┘
|
|
196
208
|
```
|
|
197
209
|
|
|
198
|
-
|
|
199
|
-
|
|
200
|
-
**Without Memory Hypergraph** (LangChain, LlamaIndex):
|
|
201
|
-
```javascript
|
|
202
|
-
// Ask about last week's findings
|
|
203
|
-
agent.chat("What fraud patterns did we find with Provider P001?")
|
|
204
|
-
// Response: "I don't have that information. Could you describe what you're looking for?"
|
|
205
|
-
// Cost: Re-run entire fraud detection pipeline ($5 in API calls, 30 seconds)
|
|
210
|
+
**Memory Scoring Formula** (for retrieval):
|
|
206
211
|
```
|
|
212
|
+
Score = α × Recency + β × Relevance + γ × Importance
|
|
213
|
+
(0.3) (0.5) (0.2)
|
|
207
214
|
|
|
208
|
-
|
|
209
|
-
|
|
210
|
-
|
|
211
|
-
|
|
212
|
-
query: "Provider P001 fraud",
|
|
213
|
-
kgFilter: { predicate: ":amount", operator: ">", value: 25000 },
|
|
214
|
-
limit: 10
|
|
215
|
-
})
|
|
216
|
-
|
|
217
|
-
// Returns typed results:
|
|
218
|
-
// {
|
|
219
|
-
// episode: "Episode:001",
|
|
220
|
-
// finding: "Fraud ring detected in Provider P001",
|
|
221
|
-
// kgContext: {
|
|
222
|
-
// provider: "Provider:P001",
|
|
223
|
-
// claims: [{ id: "Claim:C123", amount: 50000 }],
|
|
224
|
-
// riskScore: 0.87
|
|
225
|
-
// },
|
|
226
|
-
// semanticHash: "semhash:fraud-provider-p001-ring-detection"
|
|
227
|
-
// }
|
|
215
|
+
Recency = 0.995^hours_since_episode (decays ~11% per day, since 0.995^24 ≈ 0.887)
|
|
216
|
+
Relevance = cosine_similarity(query_embedding, episode_embedding)
|
|
217
|
+
Importance = log10(access_count + 1) / log10(max_access + 1)
|
|
218
|
+
```
|
|
228
219
|
|
|
229
|
-
|
|
230
|
-
|
|
231
|
-
|
|
232
|
-
|
|
220
|
+
**Rolling Context Window** (adaptive retrieval):
|
|
221
|
+
```
|
|
222
|
+
Pass 1: Search last 1 hour → 0 episodes → expand window
|
|
223
|
+
Pass 2: Search last 24 hours → 1 episode → expand window
|
|
224
|
+
Pass 3: Search last 7 days → 3 episodes → sufficient context!
|
|
233
225
|
```
|
|
234
226
|
|
|
235
|
-
**
|
|
227
|
+
**Single Query Traverses Both Layers**:
|
|
236
228
|
```sparql
|
|
237
229
|
PREFIX am: <https://gonnect.ai/ontology/agent-memory#>
|
|
238
|
-
PREFIX : <http://insurance.org/>
|
|
230
|
+
PREFIX ins: <http://insurance.org/>
|
|
239
231
|
|
|
240
|
-
|
|
232
|
+
# Find past investigations and current risk scores
|
|
233
|
+
SELECT ?episode ?finding ?providerRisk ?claimAmount WHERE {
|
|
234
|
+
# Temporal layer: past agent memory
|
|
241
235
|
GRAPH <https://gonnect.ai/memory/> {
|
|
242
|
-
?episode a am:Episode ;
|
|
243
|
-
|
|
236
|
+
?episode a am:Episode ;
|
|
237
|
+
am:prompt ?finding ;
|
|
238
|
+
am:kgEntity ?provider .
|
|
244
239
|
}
|
|
245
|
-
|
|
246
|
-
|
|
240
|
+
# Long-term layer: current facts
|
|
241
|
+
?provider ins:riskScore ?providerRisk .
|
|
242
|
+
?provider ins:submittedClaim ?claim .
|
|
243
|
+
?claim ins:amount ?claimAmount .
|
|
247
244
|
}
|
|
245
|
+
ORDER BY DESC(?providerRisk)
|
|
248
246
|
```
|
|
249
|
-
*You never write this - the typed API builds it for you.*
|
|
250
247
|
|
|
251
|
-
|
|
248
|
+
**Key Benefits**:
|
|
249
|
+
- **Session Persistence**: Agent remembers past investigations
|
|
250
|
+
- **Contextual Recall**: "What did we find about P001 last week?"
|
|
251
|
+
- **Idempotent Responses**: Same question → same answer (semantic hash)
|
|
252
|
+
- **Full Provenance**: Every conclusion traceable to source episodes + KG facts
|
|
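The memory scoring formula and the rolling context window described in this section can be combined into a small retrieval routine. The following is a minimal sketch under stated assumptions, not the engine's implementation: the function names, the episode object shape, the `minEpisodes = 3` stopping rule, and the zero-importance fallback when nothing has been accessed are all illustrative choices. The weights (0.3, 0.5, 0.2), the 0.995 hourly decay, and the 1 hour / 24 hours / 7 days windows come from the README.

```javascript
const HOUR_MS = 60 * 60 * 1000
const WEIGHTS = { recency: 0.3, relevance: 0.5, importance: 0.2 } // α, β, γ defaults from the README

function cosineSimilarity(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

function scoreEpisode(episode, queryEmbedding, now, maxAccess) {
  const hours = Math.max(0, (now - episode.timestamp) / HOUR_MS)
  const recency = Math.pow(0.995, hours)
  const relevance = cosineSimilarity(queryEmbedding, episode.embedding)
  // Assumption: treat importance as 0 when nothing has been accessed yet,
  // which avoids the 0/0 the formula would otherwise produce.
  const importance = maxAccess > 0
    ? Math.log10(episode.accessCount + 1) / Math.log10(maxAccess + 1)
    : 0
  return WEIGHTS.recency * recency + WEIGHTS.relevance * relevance + WEIGHTS.importance * importance
}

// Widen the time window (1 hour, 24 hours, 7 days) until enough episodes are found.
function retrieve(episodes, queryEmbedding, now, minEpisodes = 3, windowsHours = [1, 24, 168]) {
  const maxAccess = Math.max(0, ...episodes.map(e => e.accessCount))
  let result = { windowHours: 0, hits: [] }
  for (const windowHours of windowsHours) {
    const hits = episodes.filter(e => now - e.timestamp <= windowHours * HOUR_MS)
    result = { windowHours, hits }
    if (hits.length >= minEpisodes) break
  }
  const ranked = result.hits
    .map(e => ({ id: e.id, score: scoreEpisode(e, queryEmbedding, now, maxAccess) }))
    .sort((a, b) => b.score - a.score)
  return { windowHours: result.windowHours, episodes: ranked }
}

// Toy data mirroring the three episodes in the diagram (2-dim embeddings for brevity).
const now = Date.UTC(2024, 11, 15, 9, 0)
const episodes = [
  { id: 'Episode:001', timestamp: Date.UTC(2024, 11, 10, 9, 0), accessCount: 5, embedding: [1, 0] },
  { id: 'Episode:002', timestamp: Date.UTC(2024, 11, 12, 14, 30), accessCount: 3, embedding: [0.6, 0.8] },
  { id: 'Episode:003', timestamp: Date.UTC(2024, 11, 14, 11, 0), accessCount: 1, embedding: [0, 1] }
]
const context = retrieve(episodes, [1, 0], now)
console.log(context.windowHours, context.episodes)
```

With this toy data the 1-hour pass finds nothing, the 24-hour pass finds one episode, and the 7-day pass finds all three, matching the three passes shown above. The oldest episode still ranks first because relevance carries the largest weight.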
252
253
|
|
|
253
|
-
|
|
254
|
+
### Agent Identity & Session Persistence
|
|
254
255
|
|
|
255
|
-
|
|
256
|
-
┌─────────────────────────────────────────────────────────────────────────────────┐
|
|
257
|
-
│ ROLLING CONTEXT WINDOW │
|
|
258
|
-
│ │
|
|
259
|
-
│ Query: "What did we find about Provider P001?" │
|
|
260
|
-
│ │
|
|
261
|
-
│ Pass 1: Search last 1 hour → 0 episodes found → expand │
|
|
262
|
-
│ Pass 2: Search last 24 hours → 1 episode found (not enough) → expand │
|
|
263
|
-
│ Pass 3: Search last 7 days → 3 episodes found → within token budget ✓ │
|
|
264
|
-
│ │
|
|
265
|
-
│ Context returned: │
|
|
266
|
-
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
|
|
267
|
-
│ │ Episode 003 (Dec 15): "Follow-up investigation on P001..." │ │
|
|
268
|
-
│ │ Episode 002 (Dec 12): "Underwriting denied claim from P001..." │ │
|
|
269
|
-
│ │ Episode 001 (Dec 10): "Fraud ring detected in Provider P001..." │ │
|
|
270
|
-
│ │ │ │
|
|
271
|
-
│ │ Estimated tokens: 847 / 8192 max │ │
|
|
272
|
-
│ │ Time window: 7 days │ │
|
|
273
|
-
│ │ Search passes: 3 │ │
|
|
274
|
-
│ └──────────────────────────────────────────────────────────────────────────┘ │
|
|
275
|
-
│ │
|
|
276
|
-
└─────────────────────────────────────────────────────────────────────────────────┘
|
|
277
|
-
```
|
|
278
|
-
|
|
279
|
-
### Idempotent Responses via Semantic Hashing
|
|
280
|
-
|
|
281
|
-
Same question = Same answer. Even with **different wording**. Critical for compliance.
|
|
256
|
+
Each agent has a persistent identity stored in the Memory Hypergraph:
|
|
282
257
|
|
|
283
258
|
```javascript
|
|
284
|
-
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
|
|
288
|
-
// Second call (different wording, same intent): Cache HIT!
|
|
289
|
-
const result2 = await agent.call("Show me P001's claim patterns")
|
|
290
|
-
// Cache HIT - same semantic hash: semhash:fraud-provider-p001-claims-analysis
|
|
291
|
-
|
|
292
|
-
// Third call (exact same): Also cache hit
|
|
293
|
-
const result3 = await agent.call("Analyze claims from Provider P001")
|
|
294
|
-
// Cache HIT - same semantic hash: semhash:fraud-provider-p001-claims-analysis
|
|
295
|
-
|
|
296
|
-
// Compliance officer: "Why are these identical?"
|
|
297
|
-
// You: "Semantic hashing - same meaning, same output, regardless of phrasing."
|
|
259
|
+
const agent = new HyperMindAgent({
|
|
260
|
+
kg: db,
|
|
261
|
+
name: 'fraud-detector-alpha' // Agent identity
|
|
262
|
+
})
|
|
298
263
|
```
|
|
299
264
|
|
|
300
|
-
**
|
|
301
|
-
|
|
302
|
-
**Research Foundation**:
|
|
303
|
-
- **SimHash** (Charikar, 2002) - Random hyperplane projections for cosine similarity
|
|
304
|
-
- **Semantic Hashing** (Salakhutdinov & Hinton, 2009) - Deep autoencoders for binary codes
|
|
305
|
-
- **Learning to Hash** (Wang et al., 2018) - Survey of neural hashing methods
|
|
306
|
-
|
|
307
|
-
**Implementation**: 384-dim embeddings → LSH with 64 hyperplanes → 64-bit semantic hash
|
|
308
|
-
|
|
309
|
-
**Benefits**:
|
|
310
|
-
- **Semantic deduplication** - "Find fraud" and "Detect fraudulent activity" hit same cache
|
|
311
|
-
- **Cost reduction** - Avoid redundant LLM calls for paraphrased questions
|
|
312
|
-
- **Consistency** - Same answer for same intent, audit-ready
|
|
313
|
-
- **Sub-linear lookup** - O(1) hash lookup vs O(n) embedding comparison
|
|
314
|
-
|
|
315
|
-
---
|
|
316
|
-
|
|
317
|
-
## What This Is
|
|
318
|
-
|
|
319
|
-
**World's first mobile-native knowledge graph database with clustered distribution and mathematically-grounded HyperMind agent framework.**
|
|
320
|
-
|
|
321
|
-
Most graph databases were designed for servers. Most AI agents are built on prompt engineering and hope. We built both from the ground up - the database for performance, the agent framework for correctness:
|
|
322
|
-
|
|
323
|
-
1. **Mobile-First**: Runs natively on iOS and Android with zero-copy FFI
|
|
324
|
-
2. **Standalone + Clustered**: Same codebase scales from smartphone to Kubernetes
|
|
325
|
-
3. **Open Standards**: W3C SPARQL 1.1, RDF 1.2, OWL 2 RL, SHACL - no vendor lock-in
|
|
326
|
-
4. **Mathematical Foundations**: Type theory, category theory, proof theory - not prompt engineering
|
|
327
|
-
5. **Worst-Case Optimal Joins**: WCOJ algorithm guarantees O(N^(ρ/2)) complexity
|
|
328
|
-
|
|
329
|
-
---
|
|
330
|
-
|
|
331
|
-
## Published Benchmarks
|
|
332
|
-
|
|
333
|
-
We don't make claims we can't prove. All measurements use **publicly available, peer-reviewed benchmarks**.
|
|
334
|
-
|
|
335
|
-
**Public Benchmarks Used:**
|
|
336
|
-
- **LUBM** (Lehigh University Benchmark) - Standard RDF/SPARQL benchmark since 2005
|
|
337
|
-
- **SP2Bench** - DBLP-based SPARQL performance benchmark
|
|
338
|
-
- **W3C SPARQL 1.1 Conformance Suite** - Official W3C test cases
|
|
339
|
-
|
|
340
|
-
| Metric | Value | Why It Matters |
|
|
341
|
-
|--------|-------|----------------|
|
|
342
|
-
| **Lookup Latency** | 2.78 µs | 35x faster than RDFox |
|
|
343
|
-
| **Memory per Triple** | 24 bytes | 25% more efficient than RDFox |
|
|
344
|
-
| **Bulk Insert** | 146K triples/sec | Production-ready throughput |
|
|
345
|
-
| **SPARQL Accuracy** | 86.4% | vs 0% vanilla LLM (LUBM benchmark) |
|
|
346
|
-
| **W3C Compliance** | 100% | Full SPARQL 1.1 + RDF 1.2 |
|
|
347
|
-
|
|
348
|
-
### How We Measured
|
|
349
|
-
|
|
350
|
-
- **Dataset**: LUBM benchmark (industry standard since 2005)
|
|
351
|
-
- **Hardware**: Apple Silicon M2 MacBook Pro
|
|
352
|
-
- **Methodology**: 10,000+ iterations, cold-start, statistical analysis
|
|
353
|
-
- **Comparison**: Apache Jena 4.x, RDFox 7.x under identical conditions
|
|
354
|
-
|
|
355
|
-
**Try it yourself:**
|
|
356
|
-
```bash
|
|
357
|
-
node hypermind-benchmark.js # Compare HyperMind vs Vanilla LLM accuracy
|
|
265
|
+
**Agent Memory Structure**:
|
|
358
266
|
```
|
|
359
|
-
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
|
|
363
|
-
|
|
364
|
-
|
|
365
|
-
|
|
366
|
-
|
|
367
|
-
|
|
368
|
-
|
|
369
|
-
|
|
370
|
-
?claim :amount ?amt .
|
|
371
|
-
FILTER(?amt > 50000)
|
|
372
|
-
?claim :provider ?prov .
|
|
373
|
-
?prov :flaggedCount ?flags .
|
|
374
|
-
FILTER(?flags > 3)
|
|
375
|
-
}
|
|
267
|
+
┌────────────────────────────────────────────────────────────────────────────┐
|
|
268
|
+
│ Agent: fraud-detector-alpha │
|
|
269
|
+
│ Created: 2024-12-10 09:00:00 │
|
|
270
|
+
│ Total Episodes: 47 │
|
|
271
|
+
│ Last Active: 2024-12-15 14:30:00 │
|
|
272
|
+
├────────────────────────────────────────────────────────────────────────────┤
|
|
273
|
+
│ Session 1 (Dec 10) │ Session 2 (Dec 12) │ Session 3... │
|
|
274
|
+
│ ├─ Episode:001 │ ├─ Episode:010 │ │
|
|
275
|
+
│ ├─ Episode:002 │ ├─ Episode:011 │ │
|
|
276
|
+
│ └─ Episode:003 │ └─ Episode:012 │ │
|
|
277
|
+
└────────────────────────────────────────────────────────────────────────────┘
|
|
376
278
|
```
|
|
377
279
|
|
|
378
|
-
|
|
280
|
+
**Cross-Session Continuity**:
|
|
281
|
+
```javascript
|
|
282
|
+
// Monday: First investigation
|
|
283
|
+
const agent = new HyperMindAgent({ kg: db, name: 'fraud-detector' })
|
|
284
|
+
await agent.call('Investigate Provider P001')
|
|
285
|
+
// Memory stored: Episode:001 → linked to Provider:P001
|
|
379
286
|
|
|
380
|
-
|
|
287
|
+
// Wednesday: Agent recalls Monday's work
|
|
288
|
+
const agentWed = new HyperMindAgent({ kg: db, name: 'fraud-detector' }) // later session, same agent name
|
|
289
|
+
await agentWed.call('What did we find about P001?')
|
|
290
|
+
// Returns: "On Monday at 9:00am, we investigated P001 and found..."
|
|
291
|
+
```
|
|
381
292
|
|
|
382
|
-
|
|
293
|
+
**SPARQL to Query Agent History**:
|
|
294
|
+
```sparql
|
|
295
|
+
PREFIX am: <https://gonnect.ai/ontology/agent-memory#>
|
|
383
296
|
|
|
384
|
-
|
|
385
|
-
|
|
386
|
-
|
|
297
|
+
SELECT ?episode ?prompt ?timestamp ?success WHERE {
|
|
298
|
+
GRAPH <https://gonnect.ai/memory/> {
|
|
299
|
+
?episode a am:Episode ;
|
|
300
|
+
am:agent "fraud-detector-alpha" ;
|
|
301
|
+
am:prompt ?prompt ;
|
|
302
|
+
am:timestamp ?timestamp ;
|
|
303
|
+
am:success ?success .
|
|
304
|
+
}
|
|
305
|
+
}
|
|
306
|
+
ORDER BY DESC(?timestamp)
|
|
307
|
+
LIMIT 10
|
|
387
308
|
```
|
|
388
309
|
|
|
389
|
-
|
|
310
|
+
### Memory Ontology Specification
|
|
390
311
|
|
|
391
|
-
|
|
312
|
+
The agent memory system uses a formal OWL ontology available at [`ontology/agent-memory.ttl`](./ontology/agent-memory.ttl).
|
|
392
313
|
|
|
393
|
-
**
|
|
314
|
+
**Namespace**: `http://hypermind.ai/memory#` (prefix: `am:`)
|
|
394
315
|
|
|
395
|
-
|
|
396
|
-
┌─────────────────────────────────────────────────────────────────────────┐
|
|
397
|
-
│ NEURO-SYMBOLIC PIPELINE │
|
|
398
|
-
│ │
|
|
399
|
-
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
400
|
-
│ │ NEURAL │ │ SYMBOLIC │ │ NEURAL │ │
|
|
401
|
-
│ │ (Discovery) │ ───▶ │ (Reasoning) │ ───▶ │ (Explain) │ │
|
|
402
|
-
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
|
403
|
-
│ │
|
|
404
|
-
│ "Find similar" "Apply rules" "Summarize for │
|
|
405
|
-
│ Embeddings search Datalog inference human consumption" │
|
|
406
|
-
│ HNSW index Semi-naive eval LLM generation │
|
|
407
|
-
│ Sub-ms latency Deterministic Cryptographic proof │
|
|
408
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
409
|
-
```
|
|
316
|
+
**Core Classes**:
|
|
410
317
|
|
|
411
|
-
|
|
318
|
+
| Class | Description |
|
|
319
|
+
|-------|-------------|
|
|
320
|
+
| `am:Episode` | A discrete interaction record (prompt → response) |
|
|
321
|
+
| `am:ExecutionRecord` | Tool execution within an episode |
|
|
322
|
+
| `am:Agent` | Persistent agent identity |
|
|
323
|
+
| `am:Session` | Bounded interaction period |
|
|
324
|
+
| `am:ProofDAG` | Reasoning chain (Curry-Howard proof witness) |
|
|
412
325
|
|
|
413
|
-
|
|
326
|
+
**Key Properties**:
|
|
414
327
|
|
|
415
|
-
|
|
416
|
-
|
|
328
|
+
| Property | Domain | Range | Description |
|
|
329
|
+
|----------|--------|-------|-------------|
|
|
330
|
+
| `am:prompt` | Episode | xsd:string | User prompt that initiated the episode |
|
|
331
|
+
| `am:success` | Episode | xsd:boolean | Whether execution succeeded |
|
|
332
|
+
| `am:timestamp` | Episode | xsd:dateTime | When the episode occurred |
|
|
333
|
+
| `am:durationMs` | Episode | xsd:integer | Execution time in milliseconds |
|
|
334
|
+
| `am:accessCount` | Episode | xsd:integer | Retrieval count (for importance scoring) |
|
|
335
|
+
| `am:linksToEntity` | Episode | rdfs:Resource | **Hyper-edge to KG entity** |
|
|
336
|
+
| `am:embedding` | Episode | xsd:string | 384-dim vector (JSON array) |
|
|
337
|
+
| `am:tool` | ExecutionRecord | xsd:string | Tool identifier (e.g., 'kg.sparql.query') |
|
|
338
|
+
| `am:performedBy` | Episode | Agent | Agent that executed the episode |
|
|
417
339
|
|
|
418
|
-
|
|
419
|
-
service.onTripleInsert('CLM001', 'claimant', 'P001', null)
|
|
420
|
-
service.onTripleInsert('P001', 'knows', 'P002', null)
|
|
340
|
+
**Hyper-Edge Pattern** (linking temporal memory to KG):
|
|
421
341
|
|
|
422
|
-
|
|
423
|
-
|
|
342
|
+
```turtle
|
|
343
|
+
@prefix am: <http://hypermind.ai/memory#> .
|
|
344
|
+
@prefix ins: <http://insurance.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
|
|
424
345
|
|
|
425
|
-
|
|
426
|
-
|
|
346
|
+
# Episode links to multiple KG entities via hyper-edges
|
|
347
|
+
<episode:001> a am:Episode ;
|
|
348
|
+
am:prompt "Investigate fraud ring involving P001 and C123" ;
|
|
349
|
+
am:success true ;
|
|
350
|
+
am:timestamp "2025-12-15T10:30:00Z"^^xsd:dateTime ;
|
|
351
|
+
am:linksToEntity ins:P001 ; # Hyper-edge to Provider
|
|
352
|
+
am:linksToEntity ins:C123 ; # Hyper-edge to Claim
|
|
353
|
+
am:performedBy <agent:fraud-detector> .
|
|
427
354
|
```
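The hyper-edges above are what make it possible to ask "which episodes touched this entity?". A minimal sketch of such a lookup, assuming the `am:` vocabulary shown in the Turtle example; the helper name `episodesForEntityQuery` is hypothetical and not part of rust-kgdb:

```javascript
// Hypothetical helper (not a rust-kgdb API): builds a SPARQL query that
// lists episodes linked to one KG entity through am:linksToEntity.
const AM = 'http://hypermind.ai/memory#'

function episodesForEntityQuery(entityIri) {
  // Reject characters that are not allowed inside a SPARQL IRI reference,
  // so the value cannot break out of the <...> delimiters.
  if (/[<>"{}|^`\\\s]/.test(entityIri)) {
    throw new Error(`Not a valid IRI: ${entityIri}`)
  }
  return [
    `PREFIX am: <${AM}>`,
    'SELECT ?episode ?prompt ?timestamp WHERE {',
    '  ?episode a am:Episode ;',
    `           am:linksToEntity <${entityIri}> ;`,
    '           am:prompt ?prompt ;',
    '           am:timestamp ?timestamp .',
    '}',
    'ORDER BY DESC(?timestamp)'
  ].join('\n')
}
```

The resulting string could then be passed to `db.querySelect(...)` as in the Quick Start; whether episodes live in the default graph or in one of the named graphs listed below depends on how the memory store is configured.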
|
|
428
355
|
|
|
429
|
-
**
|
|
356
|
+
**Named Graphs**:
|
|
430
357
|
|
|
431
|
-
|
|
358
|
+
| Graph | Purpose |
|
|
359
|
+
|-------|---------|
|
|
360
|
+
| `http://hypermind.ai/memory/` | Default episodic memory storage |
|
|
361
|
+
| `http://memory.hypermind.ai/` | Long-term persistent memory |
|
|
432
362
|
|
|
433
|
-
|
|
363
|
+
The ontology is constructed from:
|
|
364
|
+
1. **User conversations** - Prompts and natural language queries
|
|
365
|
+
2. **Agent responses** - Results, explanations, proofs
|
|
366
|
+
3. **Temporal metadata** - Timestamps, durations, access patterns
|
|
367
|
+
4. **KG linkage** - Hyper-edges connecting episodes to business entities
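The four inputs above can be sketched as a single record serialized to Turtle. This is an illustration only: the record shape (`iri`, `prompt`, `success`, `timestamp`, `durationMs`, `entities`, `agent`) and the function names are assumptions, while the `am:` properties come from the Key Properties table.

```javascript
// Escape characters that would terminate or corrupt a Turtle string literal.
function escapeLiteral(text) {
  return text
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r')
}

// Hypothetical serializer: turns one episode record into Turtle that
// follows the hyper-edge pattern (one am:linksToEntity per KG entity).
function episodeToTurtle(episode) {
  const lines = [
    '@prefix am: <http://hypermind.ai/memory#> .',
    '@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .',
    '',
    `<${episode.iri}> a am:Episode ;`,
    `    am:prompt "${escapeLiteral(episode.prompt)}" ;`,
    `    am:success ${episode.success ? 'true' : 'false'} ;`,
    `    am:timestamp "${episode.timestamp}"^^xsd:dateTime ;`,
    `    am:durationMs ${Math.round(episode.durationMs)} ;`,
    ...episode.entities.map(iri => `    am:linksToEntity <${iri}> ;`),
    `    am:performedBy <${episode.agent}> .`
  ]
  return lines.join('\n')
}
```

The output can be loaded with `db.loadTtl(turtle, null)` in the same way as the other Turtle snippets in this README.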
|
|
434
368
|
|
|
435
|
-
###
|
|
369
|
+
### Schema-Aware GraphDB (v0.6.13+)
|
|
436
370
|
|
|
437
|
-
|
|
371
|
+
Automatic schema extraction at load time - internal to the engine:
|
|
438
372
|
|
|
439
373
|
```javascript
|
|
440
|
-
const {
|
|
374
|
+
const { GraphDB, createSchemaAwareGraphDB, wrapWithSchemaAwareness } = require('rust-kgdb')
|
|
441
375
|
|
|
442
|
-
//
|
|
443
|
-
const
|
|
376
|
+
// Option 1: Create new schema-aware database
|
|
377
|
+
const db = createSchemaAwareGraphDB('http://example.org/', {
|
|
378
|
+
autoExtract: true // Extract schema after every load operation
|
|
379
|
+
})
|
|
444
380
|
|
|
445
|
-
//
|
|
446
|
-
|
|
447
|
-
|
|
448
|
-
service.storeVector('entity3', cohereEmbedding) // 384-dim
|
|
381
|
+
// Option 2: Wrap existing database
|
|
382
|
+
const rawDb = new GraphDB('http://example.org/')
|
|
383
|
+
const schemaDb = wrapWithSchemaAwareness(rawDb, { autoExtract: true })
|
|
449
384
|
|
|
450
|
-
//
|
|
451
|
-
|
|
452
|
-
|
|
453
|
-
|
|
385
|
+
// Load data - schema extraction happens automatically
|
|
386
|
+
db.loadTtl(`
|
|
387
|
+
@prefix : <http://example.org/> .
|
|
388
|
+
:alice a :Person ; :knows :bob .
|
|
389
|
+
:bob a :Person ; :age 30 .
|
|
390
|
+
`, null)
|
|
454
391
|
|
|
455
|
-
|
|
392
|
+
// Wait for schema to be ready (handles race conditions)
|
|
393
|
+
const schema = await db.waitForSchema()
|
|
394
|
+
console.log('Classes:', schema.context.classes) // ['Person']
|
|
395
|
+
console.log('Predicates:', schema.context.predicates) // ['knows', 'age']
|
|
396
|
+
```
|
|
456
397
|
|
|
457
|
-
|
|
398
|
+
**Key Features**:
|
|
399
|
+
- **Auto-extraction**: Schema extracted asynchronously after `loadTtl()`, `loadNtriples()`, `updateInsert()`
|
|
400
|
+
- **Race condition handling**: `waitForSchema()` returns a promise that resolves once extraction completes
|
|
401
|
+
- **Caching**: Schema cached globally via `SCHEMA_CACHE` (5 minute TTL)
|
|
402
|
+
- **No redundant extraction**: Only triggers on data modifications, not reads
|
|
458
403
|
|
|
459
|
-
|
|
460
|
-
// Store embeddings from multiple providers for the same entity
|
|
461
|
-
service.storeComposite('CLM001', JSON.stringify({
|
|
462
|
-
openai: await openai.embed('Insurance claim for soft tissue injury'),
|
|
463
|
-
voyage: await voyage.embed('Insurance claim for soft tissue injury'),
|
|
464
|
-
cohere: await cohere.embed('Insurance claim for soft tissue injury')
|
|
465
|
-
}))
|
|
404
|
+
### Schema Caching (v0.6.12+)
|
|
466
405
|
|
|
467
|
-
|
|
468
|
-
const rrfResults = service.findSimilarComposite('CLM001', 10, 0.7, 'rrf') // Reciprocal Rank Fusion
|
|
469
|
-
const maxResults = service.findSimilarComposite('CLM001', 10, 0.7, 'max') // Max score
|
|
470
|
-
const voteResults = service.findSimilarComposite('CLM001', 10, 0.7, 'voting') // Majority voting
|
|
471
|
-
```
|
|
406
|
+
Cross-agent schema sharing via global singleton:
|
|
472
407
|
|
|
473
|
-
|
|
408
|
+
```javascript
|
|
409
|
+
const { SCHEMA_CACHE, SchemaCache, SchemaContext } = require('rust-kgdb')
|
|
474
410
|
|
|
475
|
-
|
|
411
|
+
// Global singleton - shared across all agents
|
|
412
|
+
SCHEMA_CACHE.set('http://insurance.org/', schema)
|
|
413
|
+
const cached = SCHEMA_CACHE.get('http://insurance.org/')
|
|
476
414
|
|
|
477
|
-
|
|
478
|
-
|
|
479
|
-
|
|
480
|
-
|
|
481
|
-
|
|
482
|
-
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })
|
|
483
|
-
|
|
484
|
-
async function getOpenAIEmbedding(text) {
|
|
485
|
-
const response = await openai.embeddings.create({
|
|
486
|
-
model: 'text-embedding-3-small',
|
|
487
|
-
input: text,
|
|
488
|
-
dimensions: 384 // Match rust-kgdb's 384-dim format
|
|
489
|
-
})
|
|
490
|
-
return response.data[0].embedding
|
|
491
|
-
}
|
|
415
|
+
// Cache-aside pattern for automatic computation
|
|
416
|
+
const schema = await SCHEMA_CACHE.getOrCompute(
|
|
417
|
+
'http://insurance.org/',
|
|
418
|
+
async () => SchemaContext.fromKG(db)
|
|
419
|
+
)
|
|
492
420
|
|
|
493
|
-
//
|
|
494
|
-
|
|
495
|
-
// Note: Anthropic recommends Voyage AI for embeddings
|
|
496
|
-
// ============================================================
|
|
497
|
-
async function getVoyageEmbedding(text) {
|
|
498
|
-
// Using fetch directly (no SDK required)
|
|
499
|
-
const response = await fetch('https://api.voyageai.com/v1/embeddings', {
|
|
500
|
-
method: 'POST',
|
|
501
|
-
headers: {
|
|
502
|
-
'Authorization': `Bearer ${process.env.VOYAGE_API_KEY}`,
|
|
503
|
-
'Content-Type': 'application/json'
|
|
504
|
-
},
|
|
505
|
-
body: JSON.stringify({ input: text, model: 'voyage-2' })
|
|
506
|
-
})
|
|
507
|
-
const data = await response.json()
|
|
508
|
-
return data.data[0].embedding.slice(0, 384) // Truncate to 384-dim
|
|
509
|
-
}
|
|
421
|
+
// Invalidate on data changes
|
|
422
|
+
SCHEMA_CACHE.invalidate('http://insurance.org/')
|
|
510
423
|
|
|
511
|
-
//
|
|
512
|
-
//
|
|
513
|
-
// ============================================================
|
|
514
|
-
function getMockEmbedding(text) {
|
|
515
|
-
return new Array(384).fill(0).map((_, i) =>
|
|
516
|
-
Math.sin(text.charCodeAt(i % text.length) * 0.1) * 0.5 + 0.5
|
|
517
|
-
)
|
|
518
|
-
}
|
|
424
|
+
// Monitor cache performance
|
|
425
|
+
console.log(SCHEMA_CACHE.getStats()) // { hits: 42, misses: 3, evictions: 1 }
|
|
519
426
|
```
|
|
520
427
|
|
|
521
|
-
|
|
522
|
-
|
|
523
|
-
|
|
428
|
+
**Cache Configuration** (via `CONFIG.SCHEMA_CACHE_TTL_MS`):
|
|
429
|
+
- Default TTL: 5 minutes (300,000 ms)
|
|
430
|
+
- Eviction: Automatic when cache exceeds 100 entries
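To make the TTL and eviction behaviour concrete, here is a minimal, self-contained sketch of a cache with the same surface (`get`, `set`, `getOrCompute`, `invalidate`, `getStats`). It is not the `SchemaCache` implementation: the class name `TtlCache`, the injectable `now` clock, and the oldest-inserted eviction policy are all assumptions, since the README only says that eviction happens above 100 entries.

```javascript
// Illustrative TTL cache; NOT the rust-kgdb SchemaCache source.
class TtlCache {
  constructor({ ttlMs = 300000, maxEntries = 100, now = Date.now } = {}) {
    this.ttlMs = ttlMs
    this.maxEntries = maxEntries
    this.now = now // injectable clock, handy for tests
    this.entries = new Map() // Map preserves insertion order
    this.stats = { hits: 0, misses: 0, evictions: 0 }
  }

  get(key) {
    const entry = this.entries.get(key)
    if (!entry || this.now() - entry.storedAt > this.ttlMs) {
      if (entry) this.entries.delete(key) // drop the expired entry
      this.stats.misses++
      return undefined
    }
    this.stats.hits++
    return entry.value
  }

  set(key, value) {
    this.entries.delete(key) // re-inserting moves the key to the newest slot
    this.entries.set(key, { value, storedAt: this.now() })
    // Assumed policy: evict the oldest-inserted entries above the limit.
    while (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value
      this.entries.delete(oldestKey)
      this.stats.evictions++
    }
  }

  // Cache-aside: return the cached value or compute, store, and return it.
  async getOrCompute(key, compute) {
    const cached = this.get(key)
    if (cached !== undefined) return cached
    const value = await compute()
    this.set(key, value)
    return value
  }

  invalidate(key) {
    this.entries.delete(key)
  }

  getStats() {
    return { ...this.stats }
  }
}
```

Passing a fake clock (`new TtlCache({ now: () => fakeTime })`) lets the expiry path be exercised without waiting five minutes.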
|
|
524
431
|
|
|
525
|
-
###
|
|
432
|
+
### Context Theory (v0.6.11+)
|
|
526
433
|
|
|
527
|
-
|
|
434
|
+
Type-theoretic schema validation based on Spivak's Ologs:
|
|
528
435
|
|
|
529
436
|
```javascript
|
|
530
|
-
const {
|
|
531
|
-
|
|
532
|
-
//
|
|
533
|
-
const
|
|
534
|
-
|
|
535
|
-
|
|
536
|
-
|
|
537
|
-
|
|
538
|
-
|
|
539
|
-
|
|
540
|
-
|
|
541
|
-
|
|
542
|
-
|
|
543
|
-
async function ingestClaim(claim) {
|
|
544
|
-
// 1. Insert structured data into knowledge graph
|
|
545
|
-
db.loadTtl(`
|
|
546
|
-
@prefix : <http://insurance.org/> .
|
|
547
|
-
:${claim.id} a :Claim ;
|
|
548
|
-
:amount "${claim.amount}" ;
|
|
549
|
-
:description "${claim.description}" ;
|
|
550
|
-
:claimant :${claim.claimantId} ;
|
|
551
|
-
:provider :${claim.providerId} .
|
|
552
|
-
`, null)
|
|
553
|
-
|
|
554
|
-
// 2. Generate and store embedding for semantic search
|
|
555
|
-
const vector = await getEmbedding(claim.description)
|
|
556
|
-
embeddings.storeVector(claim.id, vector)
|
|
557
|
-
|
|
558
|
-
// 3. Update 1-hop cache for neighbor-aware search
|
|
559
|
-
embeddings.onTripleInsert(claim.id, 'claimant', claim.claimantId, null)
|
|
560
|
-
embeddings.onTripleInsert(claim.id, 'provider', claim.providerId, null)
|
|
561
|
-
|
|
562
|
-
// 4. Rebuild index after batch inserts (or periodically)
|
|
563
|
-
embeddings.rebuildIndex()
|
|
564
|
-
|
|
565
|
-
return { tripleCount: db.countTriples(), embeddingStored: true }
|
|
566
|
-
}
|
|
567
|
-
|
|
568
|
-
// Process batch with embedding triggers
|
|
569
|
-
async function processBatch(claims) {
|
|
570
|
-
for (const claim of claims) {
|
|
571
|
-
await ingestClaim(claim)
|
|
572
|
-
console.log(`Ingested: ${claim.id}`)
|
|
437
|
+
const { SchemaContext, TypeJudgment, QueryValidator, ProofDAG } = require('rust-kgdb')
|
|
438
|
+
|
|
439
|
+
// Extract schema as category (Objects = Classes, Morphisms = Properties)
|
|
440
|
+
const schema = SchemaContext.fromKG(db)
|
|
441
|
+
console.log(schema.objects) // Classes: ['Claim', 'Provider', 'Claimant']
|
|
442
|
+
console.log(schema.morphisms) // Properties: ['submittedBy', 'amount', 'riskScore']
|
|
443
|
+
|
|
444
|
+
// Validate SPARQL queries against schema
|
|
445
|
+
const validator = new QueryValidator(schema)
|
|
446
|
+
const result = validator.validate(`
|
|
447
|
+
SELECT ?claim ?amount WHERE {
|
|
448
|
+
?claim :amount ?amount .
|
|
449
|
+
?claim :unknownPredicate ?x .
|
|
573
450
|
}
|
|
574
|
-
|
|
575
|
-
|
|
576
|
-
|
|
577
|
-
|
|
578
|
-
|
|
451
|
+
`)
|
|
452
|
+
// result: { valid: false, errors: ['unknownPredicate not in schema morphisms'] }
|
|
453
|
+
|
|
454
|
+
// Build proof DAG for verifiable reasoning
|
|
455
|
+
const proof = new ProofDAG()
|
|
456
|
+
proof.addNode('sparql_result', { bindings: [/* ... */] })
|
|
457
|
+
proof.addNode('datalog_inference', { rule: 'fraud_rule' })
|
|
458
|
+
proof.setRoot('conclusion', {
|
|
459
|
+
derives_from: ['sparql_result', 'datalog_inference']
|
|
460
|
+
})
|
|
461
|
+
console.log(proof.hash) // Deterministic hash for auditability
|
|
579
462
|
```
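The predicate check performed by the validator can be approximated in a few lines. The sketch below is not the `QueryValidator` source; it assumes simple `subject predicate object .` patterns with prefixed-name predicates, and the function name `validatePredicates` is hypothetical.

```javascript
// Illustrative check: every predicate used in the query must be one of the
// schema's morphisms. Handles only plain "s p o ." triple patterns.
function validatePredicates(query, morphisms) {
  const known = new Set(morphisms)
  const open = query.indexOf('{')
  const close = query.lastIndexOf('}')
  if (open === -1 || close <= open) {
    return { valid: false, errors: ['no graph pattern found'] }
  }

  const errors = []
  // Split the WHERE body on the " ." that terminates each triple pattern.
  const patterns = query.slice(open + 1, close).split(/\s+\.(?=\s|$)/)
  for (const pattern of patterns) {
    const terms = pattern.trim().split(/\s+/)
    if (terms.length < 3) continue // blank or incomplete pattern
    const predicate = terms[1]
    // Variables and the rdf:type shorthand "a" are not schema morphisms.
    if (predicate.startsWith('?') || predicate === 'a') continue
    const localName = predicate.slice(predicate.indexOf(':') + 1)
    if (!known.has(localName)) {
      errors.push(`${localName} not in schema morphisms`)
    }
  }
  return { valid: errors.length === 0, errors }
}
```

A production validator would parse the query properly (property paths, `;` and `,` abbreviations, full IRIs); the point here is only that validation reduces to a membership test against the schema's morphisms.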
|
|
580
463
|
|
|
581
|
-
|
|
464
|
+
**Mathematical Foundation**:
|
|
465
|
+
- Schema as category (Spivak's Ologs)
|
|
466
|
+
- Queries as functors (structure-preserving)
|
|
467
|
+
- Type judgments: Γ ⊢ t : T (context proves term has type)
|
|
468
|
+
- Curry-Howard correspondence for proof witnesses
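A deterministic proof hash requires that the same set of nodes always serializes to the same bytes, regardless of insertion order or object key order. The sketch below shows one way to get that property with Node's built-in `crypto` module; the class `SketchProofDAG` and its canonicalization scheme are assumptions for illustration, not the `ProofDAG` implementation shipped in the package.

```javascript
const { createHash } = require('crypto')

// Recursively sort object keys so that JSON.stringify output is stable.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize)
  if (value && typeof value === 'object') {
    const sorted = {}
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key])
    }
    return sorted
  }
  return value
}

// Illustrative proof DAG: nodes keyed by id, one designated root.
class SketchProofDAG {
  constructor() {
    this.nodes = new Map()
    this.rootId = null
  }

  addNode(id, payload) {
    this.nodes.set(id, payload)
    return this
  }

  setRoot(id, payload) {
    this.addNode(id, payload)
    this.rootId = id
    return this
  }

  // SHA-256 over a canonical serialization: node ids sorted, keys sorted.
  get hash() {
    const entries = [...this.nodes.entries()]
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([id, payload]) => [id, canonicalize(payload)])
    const body = JSON.stringify({ root: this.rootId, nodes: entries })
    return createHash('sha256').update(body).digest('hex')
  }
}
```

Because the hash depends only on content, two agents that derive the same conclusion from the same evidence produce the same digest, which is what makes it usable as an audit reference.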
|
|
582
469
|
|
|
583
|
-
|
|
584
|
-
┌─────────────────────────────────────────────────────────────────────────┐
|
|
585
|
-
│ GRAPH INGESTION PIPELINE │
|
|
586
|
-
│ │
|
|
587
|
-
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
|
|
588
|
-
│ │ Data Source │ │ Transform │ │ Enrich │ │
|
|
589
|
-
│ │ (JSON/CSV) │────▶│ (to RDF) │────▶│ (+Embeddings)│ │
|
|
590
|
-
│ └───────────────┘ └───────────────┘ └───────┬───────┘ │
|
|
591
|
-
│ │ │
|
|
592
|
-
│ ┌───────────────────────────────────────────────────┼───────────────┐ │
|
|
593
|
-
│ │ TRIGGERS │ │ │
|
|
594
|
-
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┴─────────────┐ │ │
|
|
595
|
-
│ │ │ Embedding │ │ 1-Hop │ │ HNSW Index │ │ │
|
|
596
|
-
│ │ │ Generation │ │ Cache │ │ Rebuild │ │ │
|
|
597
|
-
│ │ │ (per entity)│ │ Update │ │ (batch/periodic) │ │ │
|
|
598
|
-
│ │ └─────────────┘ └─────────────┘ └───────────────────────────┘ │ │
|
|
599
|
-
│ └───────────────────────────────────────────────────────────────────┘ │
|
|
600
|
-
│ │ │
|
|
601
|
-
│ ▼ │
|
|
602
|
-
│ ┌───────────────────────────────────────────────────────────────────┐ │
|
|
603
|
-
│ │ RUST CORE (NAPI-RS) │ │
|
|
604
|
-
│ │ GraphDB (triples) │ EmbeddingService (vectors) │ HNSW (index) │ │
|
|
605
|
-
│ └───────────────────────────────────────────────────────────────────┘ │
|
|
606
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
607
|
-
```
|
|
470
|
+
### Bring Your Own Ontology (BYOO) - Enterprise Support
|
|
608
471
|
|
|
609
|
-
|
|
472
|
+
For organizations with existing ontology teams:
|
|
610
473
|
|
|
611
|
-
|
|
474
|
+
```javascript
|
|
475
|
+
const { SchemaContext } = require('rust-kgdb')
|
|
612
476
|
|
|
613
|
-
|
|
477
|
+
// Load enterprise ontology (TTL/OWL format)
|
|
478
|
+
const ontologyTtl = `
|
|
479
|
+
@prefix owl: <http://www.w3.org/2002/07/owl#> .
|
|
480
|
+
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
|
|
481
|
+
@prefix ins: <http://insurance.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
|
|
614
482
|
|
|
615
|
-
|
|
483
|
+
ins:Claim a owl:Class ;
|
|
484
|
+
rdfs:label "Insurance Claim" .
|
|
616
485
|
|
|
617
|
-
|
|
618
|
-
|
|
619
|
-
│ HYPERAGENT FRAMEWORK │
|
|
620
|
-
│ │
|
|
621
|
-
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
|
622
|
-
│ │ GOVERNANCE LAYER │ │
|
|
623
|
-
│ │ Policy Engine | Capability Grants | Audit Trail | Compliance │ │
|
|
624
|
-
│ └─────────────────────────────────────────────────────────────────┘ │
|
|
625
|
-
│ │ │
|
|
626
|
-
│ ┌───────────────────────────────┼─────────────────────────────────┐ │
|
|
627
|
-
│ │ RUNTIME LAYER │ │
|
|
628
|
-
│ │ ┌──────────────┐ ┌───────┴───────┐ ┌──────────────┐ │ │
|
|
629
|
-
│ │ │ LLMPlanner │ │ PlanExecutor │ │ WasmSandbox │ │ │
|
|
630
|
-
│ │ │ (Claude/GPT)│───▶│ (Type-safe) │───▶│ (Isolated) │ │ │
|
|
631
|
-
│ │ └──────────────┘ └───────────────┘ └──────┬───────┘ │ │
|
|
632
|
-
│ └──────────────────────────────────────────────────┼──────────────┘ │
|
|
633
|
-
│ │ │
|
|
634
|
-
│ ┌──────────────────────────────────────────────────┼──────────────┐ │
|
|
635
|
-
│ │ PROXY LAYER │ │ │
|
|
636
|
-
│ │ Object Proxy: All tool calls flow through typed morphism layer │ │
|
|
637
|
-
│ │ ┌────────────────────────────────────────────────┴───────────┐ │ │
|
|
638
|
-
│ │ │ proxy.call('kg.sparql.query', { query }) → BindingSet │ │ │
|
|
639
|
-
│ │ │ proxy.call('kg.motif.find', { pattern }) → List<Match> │ │ │
|
|
640
|
-
│ │ │ proxy.call('kg.datalog.infer', { rules }) → List<Fact> │ │ │
|
|
641
|
-
│ │ │ proxy.call('kg.embeddings.search', { entity }) → Similar │ │ │
|
|
642
|
-
│ │ └────────────────────────────────────────────────────────────┘ │ │
|
|
643
|
-
│ └─────────────────────────────────────────────────────────────────┘ │
|
|
644
|
-
│ │
|
|
645
|
-
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
|
646
|
-
│ │ MEMORY LAYER │ │
|
|
647
|
-
│ │ Working Memory | Long-term Memory | Episodic Memory │ │
|
|
648
|
-
│ │ (Current context) (Knowledge graph) (Execution history) │ │
|
|
649
|
-
│ └─────────────────────────────────────────────────────────────────┘ │
|
|
650
|
-
│ │
|
|
651
|
-
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
|
652
|
-
│ │ SCOPE LAYER │ │
|
|
653
|
-
│ │ Namespace isolation | Resource limits | Capability boundaries │ │
|
|
654
|
-
│ └─────────────────────────────────────────────────────────────────┘ │
|
|
655
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
656
|
-
```
|
|
486
|
+
ins:Provider a owl:Class ;
|
|
487
|
+
rdfs:label "Healthcare Provider" .
|
|
657
488
|
|
|
658
|
-
|
|
489
|
+
ins:submittedBy a owl:ObjectProperty ;
|
|
490
|
+
rdfs:domain ins:Claim ;
|
|
491
|
+
rdfs:range ins:Provider .
|
|
659
492
|
|
|
660
|
-
|
|
661
|
-
|
|
662
|
-
|
|
663
|
-
|
|
664
|
-
maxExecutionTime: 30000, // 30 second timeout
|
|
665
|
-
allowedTools: ['kg.sparql.query', 'kg.datalog.infer'],
|
|
666
|
-
deniedTools: ['kg.update', 'kg.delete'], // Read-only
|
|
667
|
-
auditLevel: 'full' // Log all tool calls
|
|
668
|
-
})
|
|
669
|
-
```
|
|
670
|
-
|
|
671
|
-
**Runtime Layer**: Type-safe plan execution
|
|
672
|
-
```javascript
|
|
673
|
-
const { LLMPlanner, TOOL_REGISTRY } = require('rust-kgdb/hypermind-agent')
|
|
493
|
+
ins:amount a owl:DatatypeProperty ;
|
|
494
|
+
rdfs:domain ins:Claim ;
|
|
495
|
+
rdfs:range xsd:decimal .
|
|
496
|
+
`
|
|
674
497
|
|
|
675
|
-
|
|
676
|
-
const
|
|
677
|
-
// plan.steps: [{tool: 'kg.sparql.query', args: {...}}, ...]
|
|
678
|
-
// plan.confidence: 0.92
|
|
679
|
-
```
|
|
498
|
+
// Create schema from external ontology
|
|
499
|
+
const ontologySchema = SchemaContext.fromOntology(db, ontologyTtl)
|
|
680
500
|
|
|
681
|
-
|
|
682
|
-
|
|
683
|
-
const
|
|
684
|
-
capabilities: ['ReadKG', 'ExecuteTool'],
|
|
685
|
-
fuelLimit: 1000000
|
|
686
|
-
})
|
|
501
|
+
// Or merge ontology with KG-derived schema
|
|
502
|
+
const kgSchema = SchemaContext.fromKG(db)
|
|
503
|
+
const mergedSchema = SchemaContext.merge(ontologySchema, kgSchema)
|
|
687
504
|
|
|
688
|
-
|
|
689
|
-
|
|
690
|
-
|
|
505
|
+
// Use in HyperMind agent
|
|
506
|
+
const agent = new HyperMindAgent({
|
|
507
|
+
kg: db,
|
|
508
|
+
schema: mergedSchema // Agent uses your enterprise ontology
|
|
691
509
|
})
|
|
692
|
-
|
|
693
|
-
// All calls are logged, metered, and capability-checked
|
|
694
|
-
const result = await proxy['kg.sparql.query']({ query: 'SELECT ?x WHERE { ?x a :Fraud }' })
|
|
695
|
-
```
|
|
696
|
-
|
|
697
|
-
**Memory Layer**: Context management across agent lifecycle
|
|
698
|
-
```javascript
|
|
699
|
-
const agent = new AgentBuilder('investigator')
|
|
700
|
-
.withMemory({
|
|
701
|
-
working: { maxSize: 1024 * 1024 }, // 1MB working memory
|
|
702
|
-
episodic: { retentionDays: 30 }, // 30-day execution history
|
|
703
|
-
longTerm: db // Knowledge graph as long-term memory
|
|
704
|
-
})
|
|
705
510
|
```
|
|
706
511
|
|
|
707
|
-
**
|
|
708
|
-
|
|
709
|
-
|
|
710
|
-
|
|
711
|
-
namespace: 'fraud-detection',
|
|
712
|
-
resourceLimits: {
|
|
713
|
-
maxTriples: 1000000,
|
|
714
|
-
maxEmbeddings: 100000,
|
|
715
|
-
maxConcurrentQueries: 10
|
|
716
|
-
}
|
|
717
|
-
})
|
|
718
|
-
```
|
|
719
|
-
|
|
720
|
-
---
|
|
721
|
-
|
|
722
|
-
## Feature Overview
|
|
723
|
-
|
|
724
|
-
| Category | Feature | What It Does |
|
|
725
|
-
|----------|---------|--------------|
|
|
726
|
-
| **Core** | GraphDB | High-performance RDF/SPARQL quad store |
|
|
727
|
-
| **Core** | SPOC Indexes | Four-way indexing (SPOC/POCS/OCSP/CSPO) |
|
|
728
|
-
| **Core** | Dictionary | String interning with 8-byte IDs |
|
|
729
|
-
| **Analytics** | GraphFrames | PageRank, connected components, triangles |
|
|
730
|
-
| **Analytics** | Motif Finding | Pattern matching DSL |
|
|
731
|
-
| **Analytics** | Pregel | BSP parallel graph processing |
|
|
732
|
-
| **AI** | Embeddings | HNSW similarity with 1-hop ARCADE cache |
|
|
733
|
-
| **AI** | HyperMind | Neuro-symbolic agent framework |
|
|
734
|
-
| **Reasoning** | Datalog | Semi-naive evaluation engine |
|
|
735
|
-
| **Reasoning** | RDFS Reasoner | Subclass/subproperty inference |
|
|
736
|
-
| **Reasoning** | OWL 2 RL | Rule-based OWL reasoning |
|
|
737
|
-
| **Ontology** | SHACL | W3C shapes constraint validation |
|
|
738
|
-
| **Joins** | WCOJ | Worst-case optimal join algorithm |
|
|
739
|
-
| **Distribution** | HDRF | Streaming graph partitioning |
|
|
740
|
-
| **Distribution** | Raft | Consensus for coordination |
|
|
741
|
-
| **Mobile** | iOS/Android | Swift and Kotlin bindings via UniFFI |
|
|
742
|
-
| **Storage** | InMemory/RocksDB/LMDB | Three backend options |
|
|
512
|
+
**Use Cases**:
|
|
513
|
+
- **Large Enterprises**: Central ontology team defines schemas
|
|
514
|
+
- **Industry Standards**: Use FIBO, HL7 FHIR, or domain-specific ontologies
|
|
515
|
+
- **Governance**: Schema changes go through formal approval process
|
|
743
516
|
|
|
744
517
|
---
|
|
745
518
|
|
|
@@ -755,13 +528,15 @@ npm install rust-kgdb
|
|
|
755
528
|
|
|
756
529
|
## Quick Start
|
|
757
530
|
|
|
531
|
+
### 1. Knowledge Graph
|
|
532
|
+
|
|
758
533
|
```javascript
|
|
759
|
-
const { GraphDB,
|
|
534
|
+
const { GraphDB, getVersion } = require('rust-kgdb')
|
|
535
|
+
|
|
536
|
+
console.log('Version:', getVersion()) // "0.2.0"
|
|
760
537
|
|
|
761
|
-
|
|
762
|
-
const db = new GraphDB('http://example.org/myapp')
|
|
538
|
+
const db = new GraphDB('http://example.org/')
|
|
763
539
|
|
|
764
|
-
// 2. Load RDF data (Turtle format)
|
|
765
540
|
db.loadTtl(`
|
|
766
541
|
@prefix : <http://example.org/> .
|
|
767
542
|
:alice :knows :bob .
|
|
@@ -769,16 +544,20 @@ db.loadTtl(`
|
|
|
769
544
|
:charlie :knows :alice .
|
|
770
545
|
`, null)
|
|
771
546
|
|
|
772
|
-
console.log(`Loaded ${db.countTriples()} triples`)
|
|
547
|
+
console.log(`Loaded ${db.countTriples()} triples`) // 3
|
|
773
548
|
|
|
774
|
-
// 3. Query with SPARQL
|
|
775
549
|
const results = db.querySelect(`
|
|
776
550
|
PREFIX : <http://example.org/>
|
|
777
551
|
SELECT ?person WHERE { ?person :knows :bob }
|
|
778
552
|
`)
|
|
779
|
-
console.log(
|
|
553
|
+
console.log(results) // [{ bindings: { person: 'http://example.org/alice' } }]
|
|
554
|
+
```
|
|
555
|
+
|
|
556
|
+
### 2. Graph Analytics
|
|
557
|
+
|
|
558
|
+
```javascript
|
|
559
|
+
const { GraphFrame } = require('rust-kgdb')
|
|
780
560
|
|
|
781
|
-
// 4. Graph analytics
|
|
782
561
|
const graph = new GraphFrame(
|
|
783
562
|
JSON.stringify([{id:'alice'}, {id:'bob'}, {id:'charlie'}]),
|
|
784
563
|
JSON.stringify([
|
|
@@ -787,1430 +566,386 @@ const graph = new GraphFrame(
|
|
|
787
566
|
{src:'charlie', dst:'alice'}
|
|
788
567
|
])
|
|
789
568
|
)
|
|
790
|
-
console.log('Triangles:', graph.triangleCount()) // 1
|
|
791
|
-
console.log('PageRank:', graph.pageRank(0.15, 20))
|
|
792
|
-
|
|
793
|
-
// 5. Semantic similarity
|
|
794
|
-
const embeddings = new EmbeddingService()
|
|
795
|
-
embeddings.storeVector('alice', new Array(384).fill(0.5))
|
|
796
|
-
embeddings.storeVector('bob', new Array(384).fill(0.6))
|
|
797
|
-
embeddings.rebuildIndex()
|
|
798
|
-
console.log('Similar to alice:', embeddings.findSimilar('alice', 5, 0.3))
|
|
799
|
-
|
|
800
|
-
// 6. Datalog reasoning
|
|
801
|
-
const datalog = new DatalogProgram()
|
|
802
|
-
datalog.addFact(JSON.stringify({predicate:'knows', terms:['alice','bob']}))
|
|
803
|
-
datalog.addFact(JSON.stringify({predicate:'knows', terms:['bob','charlie']}))
|
|
804
|
-
datalog.addRule(JSON.stringify({
|
|
805
|
-
head: {predicate:'connected', terms:['?X','?Z']},
|
|
806
|
-
body: [
|
|
807
|
-
{predicate:'knows', terms:['?X','?Y']},
|
|
808
|
-
{predicate:'knows', terms:['?Y','?Z']}
|
|
809
|
-
]
|
|
810
|
-
}))
|
|
811
|
-
console.log('Inferred:', evaluateDatalog(datalog))
|
|
812
|
-
```
|
|
813
569
|
|
|
814
|
-
|
|
815
|
-
|
|
816
|
-
|
|
817
|
-
|
|
818
|
-
```
|
|
819
|
-
╔═══════════════════════════════════════════════╗
|
|
820
|
-
║ THE HYPERMIND ARCHITECTURE ║
|
|
821
|
-
╚═══════════════════════════════════════════════╝
|
|
822
|
-
|
|
823
|
-
Natural Language
|
|
824
|
-
│
|
|
825
|
-
▼
|
|
826
|
-
┌───────────────────────────────────┐
|
|
827
|
-
│ LLM (Neural) │
|
|
828
|
-
│ "Find circular payment patterns │
|
|
829
|
-
│ in claims from last month" │
|
|
830
|
-
└───────────────────────────────────┘
|
|
831
|
-
│
|
|
832
|
-
▼
|
|
833
|
-
┌───────────────────────────────────────────────────────────────────────┐
|
|
834
|
-
│ TYPE THEORY LAYER │
|
|
835
|
-
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
|
|
836
|
-
│ │ TypeId System │ │ Refinement │ │ Session Types │ │
|
|
837
|
-
│ │ (compile-time) │ │ Types │ │ (protocols) │ │
|
|
838
|
-
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
|
|
839
|
-
│ ERRORS CAUGHT HERE, NOT RUNTIME │
|
|
840
|
-
└───────────────────────────────────────────────────────────────────────┘
|
|
841
|
-
│
|
|
842
|
-
▼
|
|
843
|
-
┌───────────────────────────────────────────────────────────────────────┐
|
|
844
|
-
│ CATEGORY THEORY LAYER │
|
|
845
|
-
│ │
|
|
846
|
-
│ kg.sparql.query ────► kg.motif.find ────► kg.datalog │
|
|
847
|
-
│ (Query → Bindings) (Pattern → Matches) (Rules → Facts) │
|
|
848
|
-
│ │
|
|
849
|
-
│ f: A → B g: B → C h: C → D │
|
|
850
|
-
│ g ∘ f: A → C (COMPOSITION IS TYPE-SAFE) │
|
|
851
|
-
└───────────────────────────────────────────────────────────────────────┘
|
|
852
|
-
│
|
|
853
|
-
▼
|
|
854
|
-
┌───────────────────────────────────────────────────────────────────────┐
|
|
855
|
-
│ WASM SANDBOX LAYER │
|
|
856
|
-
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
|
857
|
-
│ │ wasmtime isolation │ │
|
|
858
|
-
│ │ • Isolated linear memory (no host access) │ │
|
|
859
|
-
│ │ • CPU fuel metering (10M ops max) │ │
|
|
860
|
-
│ │ • Capability-based security │ │
|
|
861
|
-
│ │ • NO filesystem, NO network │ │
|
|
862
|
-
│ └─────────────────────────────────────────────────────────────────┘ │
|
|
863
|
-
└───────────────────────────────────────────────────────────────────────┘
|
|
864
|
-
│
|
|
865
|
-
▼
|
|
866
|
-
┌───────────────────────────────────────────────────────────────────────┐
|
|
867
|
-
│ PROOF THEORY LAYER │
|
|
868
|
-
│ │
|
|
869
|
-
│ Every execution produces an ExecutionWitness: │
|
|
870
|
-
│ { tool, input, output, hash, timestamp, duration } │
|
|
871
|
-
│ │
|
|
872
|
-
│ Curry-Howard: Types ↔ Propositions, Programs ↔ Proofs │
|
|
873
|
-
│ Result: Full audit trail for SOX/GDPR/FDA compliance │
|
|
874
|
-
└───────────────────────────────────────────────────────────────────────┘
|
|
875
|
-
│
|
|
876
|
-
▼
|
|
877
|
-
┌───────────────────────────────────┐
|
|
878
|
-
│ Knowledge Graph Result │
|
|
879
|
-
│ 15 fraud patterns detected │
|
|
880
|
-
│ with complete audit trail │
|
|
881
|
-
└───────────────────────────────────┘
|
|
882
|
-
```
|
|
883
|
-
|
|
884
|
-
---
|
|
885
|
-
|
|
886
|
-
## HyperMind Architecture Deep Dive
|
|
887
|
-
|
|
888
|
-
For a complete walkthrough of the architecture, run:
|
|
889
|
-
```bash
|
|
890
|
-
node examples/hypermind-agent-architecture.js
|
|
891
|
-
```
|
|
892
|
-
|
|
893
|
-
### Full System Architecture
|
|
894
|
-
|
|
895
|
-
```
|
|
896
|
-
╔════════════════════════════════════════════════════════════════════════════════╗
|
|
897
|
-
║ HYPERMIND NEURO-SYMBOLIC ARCHITECTURE ║
|
|
898
|
-
╠════════════════════════════════════════════════════════════════════════════════╣
|
|
899
|
-
║ ║
|
|
900
|
-
║ ┌────────────────────────────────────────────────────────────────────────┐ ║
|
|
901
|
-
║ │ APPLICATION LAYER │ ║
|
|
902
|
-
║ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ ║
|
|
903
|
-
║ │ │ Fraud │ │ Underwriting│ │ Compliance │ │ Custom │ │ ║
|
|
904
|
-
║ │ │ Detection │ │ Agent │ │ Checker │ │ Agents │ │ ║
|
|
905
|
-
║ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ ║
|
|
906
|
-
║ └─────────┼────────────────┼────────────────┼────────────────┼───────────┘ ║
|
|
907
|
-
║ └────────────────┴────────┬───────┴────────────────┘ ║
|
|
908
|
-
║ │ ║
|
|
909
|
-
║ ┌───────────────────────────────────┼────────────────────────────────────┐ ║
|
|
910
|
-
║ │ HYPERMIND RUNTIME │ ║
|
|
911
|
-
║ │ ┌────────────────┐ ┌─────────┴─────────┐ ┌─────────────────┐ │ ║
|
|
912
|
-
║ │ │ LLM PLANNER │ │ PLAN EXECUTOR │ │ WASM SANDBOX │ │ ║
|
|
913
|
-
║ │ │ • Claude/GPT │───▶│ • Type validation │───▶│ • Capabilities │ │ ║
|
|
914
|
-
║ │ │ • Intent parse │ │ • Morphism compose│ │ • Fuel metering │ │ ║
|
|
915
|
-
║ │ │ • Tool select │ │ • Step execution │ │ • Memory limits │ │ ║
|
|
916
|
-
║ │ └────────────────┘ └───────────────────┘ └────────┬────────┘ │ ║
|
|
917
|
-
║ │ │ │ ║
|
|
918
|
-
║ │ ┌───────────────────────────────────────────────────────┼───────────┐ │ ║
|
|
919
|
-
║ │ │ OBJECT PROXY (gRPC-style) │ │ │ ║
|
|
920
|
-
║ │ │ proxy.call("kg.sparql.query", args) ────────────────┤ │ │ ║
|
|
921
|
-
║ │ │ proxy.call("kg.motif.find", args) ────────────────┤ │ │ ║
|
|
922
|
-
║ │ │ proxy.call("kg.datalog.infer", args) ────────────────┤ │ │ ║
|
|
923
|
-
║ │ └───────────────────────────────────────────────────────┼───────────┘ │ ║
|
|
924
|
-
║ └──────────────────────────────────────────────────────────┼─────────────┘ ║
|
|
925
|
-
║ │ ║
|
|
926
|
-
║ ┌──────────────────────────────────────────────────────────┼─────────────┐ ║
|
|
927
|
-
║ │ HYPERMIND TOOLS │ │ ║
|
|
928
|
-
║ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌───┴─────────┐ │ ║
|
|
929
|
-
║ │ │ SPARQL │ │ MOTIF │ │ DATALOG │ │ EMBEDDINGS │ │ ║
|
|
930
|
-
║ │ │ String → │ │ Pattern → │ │ Rules → │ │ Entity → │ │ ║
|
|
931
|
-
║ │ │ BindingSet │ │ List<Match> │ │ List<Fact> │ │ List<Sim> │ │ ║
|
|
932
|
-
║ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ ║
|
|
933
|
-
║ └────────────────────────────────────────────────────────────────────────┘ ║
|
|
934
|
-
║ ║
|
|
935
|
-
║ ┌────────────────────────────────────────────────────────────────────────┐ ║
|
|
936
|
-
║ │ rust-kgdb KNOWLEDGE GRAPH │ ║
|
|
937
|
-
║ │ RDF Triples | SPARQL 1.1 | GraphFrames | Embeddings | Datalog │ ║
|
|
938
|
-
║ │ 2.78µs lookups | 24 bytes/triple | 35x faster than RDFox │ ║
|
|
939
|
-
║ └────────────────────────────────────────────────────────────────────────┘ ║
|
|
940
|
-
╚════════════════════════════════════════════════════════════════════════════════╝
|
|
941
|
-
```
|
|
942
|
-
|
|
943
|
-
### Agent Execution Sequence
|
|
944
|
-
|
|
945
|
-
```
|
|
946
|
-
╔════════════════════════════════════════════════════════════════════════════════╗
|
|
947
|
-
║ HYPERMIND AGENT EXECUTION - SEQUENCE DIAGRAM ║
|
|
948
|
-
╠════════════════════════════════════════════════════════════════════════════════╣
|
|
949
|
-
║ ║
|
|
950
|
-
║ User SDK Planner Sandbox Proxy KG ║
|
|
951
|
-
║ │ │ │ │ │ │ ║
|
|
952
|
-
║ │ "Find suspicious claims" │ │ │ │ ║
|
|
953
|
-
║ │────────────▶│ │ │ │ │ ║
|
|
954
|
-
║ │ │ plan(prompt) │ │ │ │ ║
|
|
955
|
-
║ │ │─────────────▶│ │ │ │ ║
|
|
956
|
-
║ │ │ │ ┌──────────────────────────┐│ │ ║
|
|
957
|
-
║ │ │ │ │ LLM Reasoning: ││ │ ║
|
|
958
|
-
║ │ │ │ │ 1. Parse intent ││ │ ║
|
|
959
|
-
║ │ │ │ │ 2. Select tools ││ │ ║
|
|
960
|
-
║ │ │ │ │ 3. Validate types ││ │ ║
|
|
961
|
-
║ │ │ │ └──────────────────────────┘│ │ ║
|
|
962
|
-
║ │ │ Plan{steps, confidence} │ │ │ ║
|
|
963
|
-
║ │ │◀─────────────│ │ │ │ ║
|
|
964
|
-
║ │ │ execute(plan)│ │ │ │ ║
|
|
965
|
-
║ │ │─────────────────────────────▶ │ │ ║
|
|
966
|
-
║ │ │ │ ┌────────────────────────┐ │ │ ║
|
|
967
|
-
║ │ │ │ │ Sandbox Init: │ │ │ ║
|
|
968
|
-
║ │ │ │ │ • Capabilities: [Read] │ │ │ ║
|
|
969
|
-
║ │ │ │ │ • Fuel: 1,000,000 │ │ │ ║
|
|
970
|
-
║ │ │ │ └────────────────────────┘ │ │ ║
|
|
971
|
-
║ │ │ │ │ kg.sparql │ │ ║
|
|
972
|
-
║ │ │ │ │─────────────▶│───────────▶│ ║
|
|
973
|
-
║ │ │ │ │ │ BindingSet │ ║
|
|
974
|
-
║ │ │ │ │◀─────────────│◀───────────│ ║
|
|
975
|
-
║ │ │ │ │ kg.datalog │ │ ║
|
|
976
|
-
║ │ │ │ │─────────────▶│───────────▶│ ║
|
|
977
|
-
║ │ │ │ │ │ List<Fact> │ ║
|
|
978
|
-
║ │ │ │ │◀─────────────│◀───────────│ ║
|
|
979
|
-
║ │ │ ExecutionResult{findings, witness} │ │ ║
|
|
980
|
-
║ │ │◀───────────────────────────── │ │ ║
|
|
981
|
-
║ │ "Found 2 collusion patterns. Evidence: ..." │ │ ║
|
|
982
|
-
║ │◀────────────│ │ │ │ │ ║
|
|
983
|
-
╚════════════════════════════════════════════════════════════════════════════════╝
|
|
570
|
+
console.log('Triangles:', graph.triangleCount()) // 1
|
|
571
|
+
console.log('PageRank:', JSON.parse(graph.pageRank(0.15, 20)))
|
|
572
|
+
console.log('Components:', JSON.parse(graph.connectedComponents()))
|
|
984
573
|
```
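For intuition about what the ranking computes, here is a plain-JavaScript power-iteration sketch. It is not the Rust implementation, and it assumes that the two arguments in `graph.pageRank(0.15, 20)` are the reset probability and the iteration cap (the README does not state this).

```javascript
// Illustrative PageRank by power iteration over {src, dst} edges.
function pageRank(nodes, edges, resetProb = 0.15, maxIter = 20) {
  const n = nodes.length
  const outDegree = new Map(nodes.map(id => [id, 0]))
  for (const { src } of edges) outDegree.set(src, outDegree.get(src) + 1)

  // Start from the uniform distribution.
  let rank = new Map(nodes.map(id => [id, 1 / n]))

  for (let iter = 0; iter < maxIter; iter++) {
    // Rank held by nodes without out-edges is spread evenly over all nodes.
    let danglingMass = 0
    for (const id of nodes) {
      if (outDegree.get(id) === 0) danglingMass += rank.get(id)
    }
    const base = resetProb / n + ((1 - resetProb) * danglingMass) / n
    const next = new Map(nodes.map(id => [id, base]))

    // Each node passes its rank equally along its out-edges.
    for (const { src, dst } of edges) {
      const share = ((1 - resetProb) * rank.get(src)) / outDegree.get(src)
      next.set(dst, next.get(dst) + share)
    }
    rank = next
  }
  return Object.fromEntries(rank)
}
```

On a directed three-node cycle (alice → bob → charlie → alice) every node ends up with rank 1/3, since each node has exactly one in-edge and one out-edge.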
|
|
985
574
|
|
|
986
|
-
###
|
|
987
|
-
|
|
988
|
-
The TypeScript SDK exports production-ready HyperMind components. All execution flows through the **WASM sandbox** for complete security isolation:
|
|
575
|
+
### 3. Semantic Similarity
|
|
989
576
|
|
|
990
577
|
```javascript
|
|
991
|
-
const {
|
|
992
|
-
// Type System (Hindley-Milner style)
|
|
993
|
-
TypeId, // Base types + refinement types (RiskScore, PolicyNumber)
|
|
994
|
-
TOOL_REGISTRY, // Tools as typed morphisms (category theory)
|
|
995
|
-
|
|
996
|
-
// Runtime Components
|
|
997
|
-
LLMPlanner, // Natural language → typed tool pipelines
|
|
998
|
-
WasmSandbox, // Secure WASM isolation with capability-based security
|
|
999
|
-
AgentBuilder, // Fluent builder for agent composition
|
|
1000
|
-
ComposedAgent, // Executable agent with execution witness
|
|
1001
|
-
} = require('rust-kgdb/hypermind-agent')
|
|
1002
|
-
```
|
|
1003
|
-
|
|
1004
|
-
**Example: Build a Custom Agent**
|
|
1005
|
-
```javascript
|
|
1006
|
-
const { AgentBuilder, LLMPlanner, TypeId, TOOL_REGISTRY } = require('rust-kgdb/hypermind-agent')
|
|
1007
|
-
|
|
1008
|
-
// Compose an agent using the builder pattern
|
|
1009
|
-
const agent = new AgentBuilder('compliance-checker')
|
|
1010
|
-
.withTool('kg.sparql.query')
|
|
1011
|
-
.withTool('kg.datalog.infer')
|
|
1012
|
-
.withPlanner(new LLMPlanner('claude-sonnet-4', TOOL_REGISTRY))
|
|
1013
|
-
.withSandbox({
|
|
1014
|
-
capabilities: ['ReadKG', 'ExecuteTool'], // No WriteKG for safety
|
|
1015
|
-
fuelLimit: 1000000,
|
|
1016
|
-
maxMemory: 64 * 1024 * 1024 // 64MB
|
|
1017
|
-
})
|
|
1018
|
-
.withHook('afterExecute', (step, result) => {
|
|
1019
|
-
console.log(`Completed: ${step.tool} → ${result.length} results`)
|
|
1020
|
-
})
|
|
1021
|
-
.build()
|
|
1022
|
-
|
|
1023
|
-
// Execute with natural language
|
|
1024
|
-
const result = await agent.call("Check compliance status for all vendors")
|
|
1025
|
-
console.log(result.witness.proof_hash) // sha256:...
|
|
1026
|
-
```
|
|
1027
|
-
|
|
1028
|
-
---
|
|
578
|
+
const { EmbeddingService } = require('rust-kgdb')
|
|
1029
579
|
|
|
1030
|
-
|
|
580
|
+
const embeddings = new EmbeddingService()
|
|
1031
581
|
|
|
1032
|
-
|
|
582
|
+
// Store 384-dimension vectors
|
|
583
|
+
embeddings.storeVector('claim_001', new Array(384).fill(0.5))
|
|
584
|
+
embeddings.storeVector('claim_002', new Array(384).fill(0.6))
|
|
585
|
+
embeddings.rebuildIndex()
|
|
1033
586
|
|
|
587
|
+
// HNSW similarity search
|
|
588
|
+
const similar = JSON.parse(embeddings.findSimilar('claim_001', 5, 0.7))
|
|
589
|
+
console.log('Similar:', similar)
|
|
1034
590
|
```
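
To build intuition for the `0.7` passed to `findSimilar`, the sketch below computes cosine similarity in plain JavaScript. It assumes the third argument is a minimum-similarity threshold and that the metric is cosine similarity; this section confirms neither. `cosineSimilarity` is a hypothetical helper, not part of the rust-kgdb API.

```javascript
// Illustrative only: a plain cosine similarity, not rust-kgdb's HNSW implementation.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // Note: an all-zero vector would make this divide by zero and return NaN.
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// The same constant vectors used in the example above.
const v1 = new Array(384).fill(0.5)
const v2 = new Array(384).fill(0.6)
const score = cosineSimilarity(v1, v2)
console.log(score.toFixed(4)) // "1.0000"
```

Under the cosine assumption, the two sample vectors point in the same direction and so score as maximally similar. Real embeddings from a model would produce more varied scores.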
|
|
1035
|
-
┌───────────────────────┬──────────────────────┬──────────────────────────┐
|
|
1036
|
-
│ Feature │ MCP │ HyperMind Proxy │
|
|
1037
|
-
├───────────────────────┼──────────────────────┼──────────────────────────┤
|
|
1038
|
-
│ Type Safety │ ❌ String only │ ✅ Full type system │
|
|
1039
|
-
│ Domain Knowledge │ ❌ Generic │ ✅ Domain-enriched │
|
|
1040
|
-
│ Tool Composition │ ❌ Isolated │ ✅ Morphism composition │
|
|
1041
|
-
│ Validation │ ❌ Runtime │ ✅ Compile-time │
|
|
1042
|
-
│ Security │ ❌ None │ ✅ WASM sandbox │
|
|
1043
|
-
│ Audit Trail │ ❌ None │ ✅ Execution witness │
|
|
1044
|
-
│ LLM Context │ ❌ Generic schema │ ✅ Rich domain hints │
|
|
1045
|
-
│ Capability Control │ ❌ All or nothing │ ✅ Fine-grained caps │
|
|
1046
|
-
├───────────────────────┼──────────────────────┼──────────────────────────┤
|
|
1047
|
-
│ Result │ 60% accuracy │ 95%+ accuracy │
|
|
1048
|
-
│ │ "I think this might │ "Rule R1 matched facts │
|
|
1049
|
-
│ │ be suspicious..." │ F1,F2,F3. Proof: ..." │
|
|
1050
|
-
└───────────────────────┴──────────────────────┴──────────────────────────┘
|
|
1051
|
-
```
|
|
1052
|
-
|
|
1053
|
-
### The Key Insight
|
|
1054
591
|
|
|
1055
|
-
|
|
1056
|
-
**HyperMind**: LLM selects tools → type system validates → guaranteed correct
|
|
592
|
+
### 4. Rule-Based Reasoning
|
|
1057
593
|
|
|
1058
594
|
```javascript
|
|
1059
|
-
|
|
1060
|
-
// Tool: search_database(query: string)
|
|
1061
|
-
// LLM generates: "SELECT * FROM claims WHERE suspicious = true"
|
|
1062
|
-
// Result: ❌ SQL injection risk, "suspicious" column doesn't exist
|
|
1063
|
-
|
|
1064
|
-
// HYPERMIND APPROACH (Domain-enriched proxy)
|
|
1065
|
-
// Tool: kg.datalog.infer with NICB fraud rules
|
|
1066
|
-
const proxy = sandbox.createObjectProxy(tools)
|
|
1067
|
-
const result = await proxy['kg.datalog.infer']({
|
|
1068
|
-
rules: ['potential_collusion', 'staged_accident']
|
|
1069
|
-
})
|
|
1070
|
-
// Result: ✅ Type-safe, domain-aware, auditable
|
|
1071
|
-
```
|
|
1072
|
-
|
|
1073
|
-
**Why Domain Proxies Win:**
|
|
1074
|
-
1. LLM becomes **orchestrator**, not executor
|
|
1075
|
-
2. Domain knowledge **reduces hallucination**
|
|
1076
|
-
3. Composition **multiplies capability**
|
|
1077
|
-
4. Audit trail **enables compliance**
|
|
1078
|
-
5. Security **enables enterprise deployment**
|
|
1079
|
-
|
|
1080
|
-
---
|
|
1081
|
-
|
|
1082
|
-
## Why Vanilla LLMs Fail
|
|
1083
|
-
|
|
1084
|
-
When you ask an LLM to query a knowledge graph, it produces **broken SPARQL 85% of the time**:
|
|
1085
|
-
|
|
1086
|
-
```
|
|
1087
|
-
User: "Find all professors"
|
|
1088
|
-
|
|
1089
|
-
Vanilla LLM Output:
|
|
1090
|
-
┌───────────────────────────────────────────────────────────────────────┐
|
|
1091
|
-
│ ```sparql │
|
|
1092
|
-
│ PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#> │
|
|
1093
|
-
│ SELECT ?professor WHERE { │
|
|
1094
|
-
│ ?professor a ub:Faculty . ← WRONG! Schema has "Professor" │
|
|
1095
|
-
│ } │
|
|
1096
|
-
│ ``` ← Parser rejects markdown │
|
|
1097
|
-
│ │
|
|
1098
|
-
│ This query retrieves all faculty members from the LUBM dataset. │
|
|
1099
|
-
│ ↑ Explanation text breaks parsing │
|
|
1100
|
-
└───────────────────────────────────────────────────────────────────────┘
|
|
1101
|
-
Result: ❌ PARSER ERROR - Invalid SPARQL syntax
|
|
1102
|
-
```
|
|
595
|
+
const { DatalogProgram, evaluateDatalog, queryDatalog } = require('rust-kgdb')
|
|
1103
596
|
|
|
1104
|
-
|
|
1105
|
-
1. LLM wraps query in markdown code blocks → parser chokes
|
|
1106
|
-
2. LLM adds explanation text → mixed with query syntax
|
|
1107
|
-
3. LLM hallucinates class names → `ub:Faculty` doesn't exist (it's `ub:Professor`)
|
|
1108
|
-
4. LLM has no schema awareness → guesses predicates and classes
|
|
597
|
+
const program = new DatalogProgram()
|
|
1109
598
|
|
|
1110
|
-
|
|
599
|
+
program.addFact(JSON.stringify({predicate: 'parent', terms: ['alice', 'bob']}))
|
|
600
|
+
program.addFact(JSON.stringify({predicate: 'parent', terms: ['bob', 'charlie']}))
|
|
1111
601
|
|
|
1112
|
-
|
|
602
|
+
// grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
|
|
603
|
+
program.addRule(JSON.stringify({
|
|
604
|
+
head: {predicate: 'grandparent', terms: ['?X', '?Z']},
|
|
605
|
+
body: [
|
|
606
|
+
{predicate: 'parent', terms: ['?X', '?Y']},
|
|
607
|
+
{predicate: 'parent', terms: ['?Y', '?Z']}
|
|
608
|
+
]
|
|
609
|
+
}))
|
|
1113
610
|
|
|
1114
|
-
|
|
1115
|
-
User: "Find all professors"
|
|
1116
|
-
|
|
1117
|
-
HyperMind Output:
|
|
1118
|
-
┌───────────────────────────────────────────────────────────────────────┐
|
|
1119
|
-
│ PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#> │
|
|
1120
|
-
│ SELECT ?professor WHERE { │
|
|
1121
|
-
│ ?professor a ub:Professor . ← CORRECT! Schema-aware │
|
|
1122
|
-
│ } │
|
|
1123
|
-
└───────────────────────────────────────────────────────────────────────┘
|
|
1124
|
-
Result: ✅ 15 results returned in 2.3ms
|
|
611
|
+
console.log('Inferred:', JSON.parse(evaluateDatalog(program)))
|
|
1125
612
|
```
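
To see what the `grandparent` rule derives from the two `parent` facts, the same join can be written by hand in plain JavaScript. This is a sketch of the rule's logic only. `deriveGrandparents` is a hypothetical helper, and the exact JSON shape returned by `evaluateDatalog` is not shown in this section.

```javascript
// Hand-rolled join for: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
// Illustrative only; the real engine evaluates rules generically.
const parentFacts = [
  ['alice', 'bob'],
  ['bob', 'charlie']
]

function deriveGrandparents(parents) {
  const derived = []
  for (const [x, y1] of parents) {
    for (const [y2, z] of parents) {
      // The shared variable ?Y must bind to the same value in both body atoms.
      if (y1 === y2) derived.push([x, z])
    }
  }
  return derived
}

const grandparents = deriveGrandparents(parentFacts)
console.log(grandparents) // [ [ 'alice', 'charlie' ] ]
```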
|
|
1126
613
|
|
|
1127
|
-
|
|
1128
|
-
1. **Type-checked tools** - Query must be valid SPARQL (compile-time check)
|
|
1129
|
-
2. **Schema integration** - Tools know the ontology, not just the LLM
|
|
1130
|
-
3. **No text pollution** - Query output is typed `SPARQLQuery`, not `string`
|
|
1131
|
-
4. **Deterministic execution** - Same query, same result, always
|
|
614
|
+
### 5. HyperMind Agent (Complete Example)
|
|
1132
615
|
|
|
1133
|
-
|
|
616
|
+
```javascript
|
|
617
|
+
const {
|
|
618
|
+
GraphDB, EmbeddingService, HyperMindAgent,
|
|
619
|
+
MemoryManager, AgentScope, LLMPlanner,
|
|
620
|
+
createSchemaAwareGraphDB
|
|
621
|
+
} = require('rust-kgdb')
|
|
1134
622
|
|
|
1135
|
-
|
|
623
|
+
// Create schema-aware database (auto-extracts schema on load)
|
|
624
|
+
const db = createSchemaAwareGraphDB('http://insurance.org/')
|
|
625
|
+
db.loadTtl(`
|
|
626
|
+
@prefix : <http://insurance.org/> .
|
|
627
|
+
:CLM001 a :Claim ; :amount "50000" ; :provider :PROV001 .
|
|
628
|
+
:CLM002 a :Claim ; :amount "75000" ; :provider :PROV001 .
|
|
629
|
+
:PROV001 a :Provider ; :riskScore "0.87" ; :name "MedCorp" .
|
|
630
|
+
:PROV002 a :Provider ; :riskScore "0.35" ; :name "HealthCo" .
|
|
631
|
+
`, null)
|
|
1136
632
|
|
|
1137
|
-
|
|
633
|
+
// Full configuration showing all layers
|
|
634
|
+
const agent = new HyperMindAgent({
|
|
635
|
+
// === REQUIRED ===
|
|
636
|
+
kg: db,
|
|
637
|
+
|
|
638
|
+
// === LAYER 1: LLM Planner (Production Mode) ===
|
|
639
|
+
model: 'gpt-4o', // LLM model for intent + SPARQL
|
|
640
|
+
apiKey: process.env.OPENAI_API_KEY, // Required for LLM calls
|
|
641
|
+
|
|
642
|
+
// === LAYER 2: Memory ===
|
|
643
|
+
memory: new MemoryManager({
|
|
644
|
+
workingMemorySize: 10, // LRU cache for current session
|
|
645
|
+
episodicRetentionDays: 30, // How long to keep episodes
|
|
646
|
+
longTermGraph: 'http://memory.hypermind.ai/' // Persistent memory
|
|
647
|
+
}),
|
|
648
|
+
|
|
649
|
+
// === LAYER 3: Scope ===
|
|
650
|
+
scope: new AgentScope({
|
|
651
|
+
allowedGraphs: ['http://insurance.org/'], // Graphs agent can access
|
|
652
|
+
allowedPredicates: null, // null = all predicates
|
|
653
|
+
maxResultSize: 10000 // Limit result set size
|
|
654
|
+
}),
|
|
655
|
+
|
|
656
|
+
// === LAYER 4: Embeddings ===
|
|
657
|
+
embeddings: new EmbeddingService(), // For similarity search
|
|
658
|
+
|
|
659
|
+
// === LAYER 5: Security ===
|
|
660
|
+
sandbox: {
|
|
661
|
+
capabilities: ['ReadKG', 'ExecuteTool'], // No WriteKG = read-only
|
|
662
|
+
fuelLimit: 1_000_000 // CPU budget
|
|
663
|
+
},
|
|
664
|
+
|
|
665
|
+
// === LAYER 6: Identity ===
|
|
666
|
+
name: 'fraud-detector' // Persistent identity across sessions
|
|
667
|
+
})
|
|
1138
668
|
|
|
1139
|
-
|
|
669
|
+
// Wait for schema extraction to complete
|
|
670
|
+
await db.waitForSchema()
|
|
1140
671
|
|
|
1141
|
-
|
|
1142
|
-
|
|
1143
|
-
THE PROBLEM WITH AI AGENTS TODAY
|
|
1144
|
-
================================================================================
|
|
1145
|
-
|
|
1146
|
-
You ask ChatGPT: "Find suspicious insurance claims in our data"
|
|
1147
|
-
It replies: "Based on typical fraud patterns, you should look for..."
|
|
1148
|
-
|
|
1149
|
-
But wait -- it never SAW your data. It's guessing. Hallucinating.
|
|
1150
|
-
|
|
1151
|
-
HYPERMIND'S INSIGHT: Use LLMs for UNDERSTANDING, symbolic systems for REASONING.
|
|
1152
|
-
|
|
1153
|
-
================================================================================
|
|
1154
|
-
|
|
1155
|
-
+------------------------------------------------------------------------+
|
|
1156
|
-
| SECTION 4: DATALOG REASONING |
|
|
1157
|
-
| Rule-Based Inference Using NICB Fraud Detection Guidelines |
|
|
1158
|
-
+------------------------------------------------------------------------+
|
|
1159
|
-
|
|
1160
|
-
RULE 1: potential_collusion(?X, ?Y, ?P)
|
|
1161
|
-
IF claimant(?X) AND claimant(?Y) AND provider(?P)
|
|
1162
|
-
AND claims_with(?X, ?P) AND claims_with(?Y, ?P)
|
|
1163
|
-
AND knows(?X, ?Y)
|
|
1164
|
-
THEN potential_collusion(?X, ?Y, ?P)
|
|
1165
|
-
Source: NICB Ring Detection Guidelines
|
|
1166
|
-
|
|
1167
|
-
Running Datalog Inference Engine...
|
|
1168
|
-
|
|
1169
|
-
INFERRED FACTS:
|
|
1170
|
-
---------------
|
|
1171
|
-
[!] COLLUSION DETECTED: 1 pattern(s)
|
|
1172
|
-
P001 <-> P002 via PROV001
|
|
1173
|
-
[!] STAGED ACCIDENT INDICATORS: 3 pattern(s)
|
|
1174
|
-
P001 via PROV001
|
|
1175
|
-
P002 via PROV001
|
|
1176
|
-
P005 via PROV001
|
|
1177
|
-
|
|
1178
|
-
+------------------------------------------------------------------------+
|
|
1179
|
-
| SECTION 5: HYPERMIND AGENT INTERACTION |
|
|
1180
|
-
| Natural Language Interface - The Power of Neuro-Symbolic AI |
|
|
1181
|
-
+------------------------------------------------------------------------+
|
|
1182
|
-
|
|
1183
|
-
========================================================================
|
|
1184
|
-
USER PROMPT: "Which claims look suspicious and why should I investigate them?"
|
|
1185
|
-
========================================================================
|
|
1186
|
-
|
|
1187
|
-
Agent Reasoning:
|
|
1188
|
-
1. Decomposing query: "suspicious claims" -> need risk indicators
|
|
1189
|
-
2. Selecting tools: GraphFrame (network), Embeddings (similarity), Datalog (rules)
|
|
1190
|
-
3. Type checking: All tools compatible (Graph -> Analysis -> Inference)
|
|
1191
|
-
4. Executing pipeline...
|
|
1192
|
-
|
|
1193
|
-
========================================================================
|
|
1194
|
-
AGENT RESPONSE:
|
|
1195
|
-
========================================================================
|
|
1196
|
-
|
|
1197
|
-
I analyzed 5 claims across 3 providers and found 2 CRITICAL fraud indicators:
|
|
1198
|
-
|
|
1199
|
-
[CRITICAL] FINDING 1: COLLUSION RING (Confidence: 95%)
|
|
1200
|
-
--------------------------------------------------------
|
|
1201
|
-
Claimants: John Smith (P001) + Jane Doe (P002)
|
|
1202
|
-
Provider: Quick Care Rehabilitation Clinic
|
|
1203
|
-
Evidence:
|
|
1204
|
-
- Both filed soft tissue claims within 3 days of each other
|
|
1205
|
-
- Both live at 123 Main St (shared address)
|
|
1206
|
-
- Both have elevated risk scores (0.85, 0.72)
|
|
1207
|
-
|
|
1208
|
-
Rule Triggered: NICB Collusion Pattern
|
|
1209
|
-
potential_collusion(P001, P002, PROV001) <- knows AND same_provider AND same_address
|
|
1210
|
-
|
|
1211
|
-
Recommendation: IMMEDIATE SIU REFERRAL
|
|
1212
|
-
|
|
1213
|
-
EXECUTION WITNESS (Proof Theory - Audit Trail):
|
|
1214
|
-
------------------------------------------------
|
|
1215
|
-
Timestamp: 2024-12-14T23:15:00Z
|
|
1216
|
-
Agent: fraud-investigator
|
|
1217
|
-
Tools:
|
|
1218
|
-
- kg.graphframe.triangles: 1 triangle detected
|
|
1219
|
-
- kg.embeddings.search: 3 clusters found
|
|
1220
|
-
- kg.datalog.infer: 3 fraud patterns derived
|
|
1221
|
-
Proof Hash: sha256:a3f2b8c9d1e4...
|
|
1222
|
-
|
|
1223
|
-
========================================================================
|
|
1224
|
-
USER FOLLOW-UP: "Show me the rule that detected the collusion"
|
|
1225
|
-
========================================================================
|
|
1226
|
-
|
|
1227
|
-
Agent Response:
|
|
1228
|
-
|
|
1229
|
-
The collusion was detected by this NICB-derived Datalog rule:
|
|
1230
|
-
|
|
1231
|
-
+----------------------------------------------------------------------+
|
|
1232
|
-
| RULE: potential_collusion(?X, ?Y, ?P) |
|
|
1233
|
-
| |
|
|
1234
|
-
| IF: |
|
|
1235
|
-
| claimant(?X) <- John Smith is a claimant |
|
|
1236
|
-
| claimant(?Y) <- Jane Doe is a claimant |
|
|
1237
|
-
| provider(?P) <- Quick Care is a provider |
|
|
1238
|
-
| claims_with(?X, ?P) <- John filed claim with Quick Care |
|
|
1239
|
-
| claims_with(?Y, ?P) <- Jane filed claim with Quick Care |
|
|
1240
|
-
| knows(?X, ?Y) <- John and Jane know each other |
|
|
1241
|
-
| |
|
|
1242
|
-
| THEN: |
|
|
1243
|
-
| potential_collusion(P001, P002, PROV001) |
|
|
1244
|
-
| |
|
|
1245
|
-
| CONFIDENCE: 100% (all facts verified in knowledge graph) |
|
|
1246
|
-
+----------------------------------------------------------------------+
|
|
1247
|
-
|
|
1248
|
-
This derivation is 100% deterministic and auditable.
|
|
1249
|
-
A regulator can verify this finding by checking the rule against the facts.
|
|
1250
|
-
```
|
|
1251
|
-
|
|
1252
|
-
**The Key Difference:**
|
|
1253
|
-
- **Vanilla LLM**: "Some claims may be suspicious" (no data access, no proof)
|
|
1254
|
-
- **HyperMind**: Specific findings + rule derivations + cryptographic audit trail
|
|
672
|
+
// Natural language query - LLM uses schema for accurate SPARQL
|
|
673
|
+
const result = await agent.call('Find all high-risk claims')
|
|
1255
674
|
|
|
1256
|
-
|
|
1257
|
-
|
|
1258
|
-
|
|
1259
|
-
|
|
1260
|
-
node examples/underwriting-agent.js # Underwriting pipeline
|
|
675
|
+
console.log('Answer:', result.answer)
|
|
676
|
+
console.log('Tools Used:', result.explanation.tools_used)
|
|
677
|
+
console.log('SPARQL Generated:', result.explanation.sparql_queries)
|
|
678
|
+
console.log('Proof Hash:', result.proof?.hash)
|
|
1261
679
|
```
|
|
1262
680
|
|
|
1263
|
-
|
|
681
|
+
**Layer Defaults** (if not specified):
|
|
1264
682
|
|
|
1265
|
-
|
|
683
|
+
| Layer | Default Value |
|
|
684
|
+
|-------|---------------|
|
|
685
|
+
| Memory | Disabled (no session persistence) |
|
|
686
|
+
| Scope | Unrestricted (all graphs, all predicates) |
|
|
687
|
+
| Embeddings | Disabled (no similarity search) |
|
|
688
|
+
| Sandbox | `['ReadKG', 'ExecuteTool']`, fuel: 1M |
|
|
689
|
+
| LLM Model | None (demo mode with keyword matching) |
|
|
1266
690
|
|
|
1267
|
-
|
|
1268
|
-
|
|
1269
|
-
### Type Theory: Compile-Time Validation
|
|
1270
|
-
|
|
1271
|
-
```typescript
|
|
1272
|
-
// Refinement types catch errors BEFORE execution
|
|
1273
|
-
type RiskScore = number & { __refinement: '0 ≤ x ≤ 1' }
|
|
1274
|
-
type PolicyNumber = string & { __refinement: '/^POL-\\d{9}$/' }
|
|
1275
|
-
type CreditScore = number & { __refinement: '300 ≤ x ≤ 850' }
|
|
1276
|
-
|
|
1277
|
-
// Framework validates at construction, not runtime
|
|
1278
|
-
function assessRisk(score: RiskScore): Decision {
|
|
1279
|
-
// score is GUARANTEED to be 0.0-1.0
|
|
1280
|
-
// No defensive coding needed
|
|
1281
|
-
}
|
|
1282
|
-
```
|
|
691
|
+
### Schema-Aware Intent: Different Words → Same Result
|
|
1283
692
|
|
|
1284
|
-
|
|
693
|
+
The LLM planner, combined with schema injection, aims to produce consistent results regardless of phrasing:
|
|
1285
694
|
|
|
695
|
+
```javascript
|
|
696
|
+
// All these queries produce the SAME SPARQL because the LLM knows your schema
|
|
697
|
+
await agent.call('Find high-risk providers') // "high-risk"
|
|
698
|
+
await agent.call('Show me suspicious vendors') // "suspicious vendors"
|
|
699
|
+
await agent.call('Which suppliers have elevated risk?') // "elevated risk"
|
|
700
|
+
await agent.call('List providers with bad scores') // "bad scores"
|
|
701
|
+
|
|
702
|
+
// Generated SPARQL (same for all above):
|
|
703
|
+
// SELECT ?provider ?name ?score WHERE {
|
|
704
|
+
// ?provider a :Provider ; :name ?name ; :riskScore ?score .
|
|
705
|
+
// FILTER(?score > 0.7)
|
|
706
|
+
// }
|
|
1286
707
|
```
|
|
1287
|
-
Tools are morphisms (typed arrows):
|
|
1288
708
|
|
|
1289
|
-
|
|
1290
|
-
|
|
1291
|
-
|
|
1292
|
-
|
|
709
|
+
**How it works**:
|
|
710
|
+
1. LLM receives your schema: `{ classes: ['Claim', 'Provider'], predicates: ['riskScore', 'amount'] }`
|
|
711
|
+
2. LLM understands "vendors", "suppliers", "providers" all map to `:Provider`
|
|
712
|
+
3. LLM understands "high-risk", "suspicious", "bad" all map to `:riskScore > threshold`
|
|
713
|
+
4. Generated SPARQL uses YOUR actual predicates, not hallucinated ones
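
The steps above can be sketched as a prompt builder that places the extracted schema ahead of the user's question. This is a minimal illustration under stated assumptions: `buildSchemaPrompt` and the prompt wording are hypothetical, and the actual prompt used by `LLMPlanner` is not shown in this README.

```javascript
// Hypothetical helper: not part of the rust-kgdb API.
function buildSchemaPrompt(schema, question) {
  return [
    'Translate the question into a SPARQL query.',
    `Use only these classes: ${schema.classes.join(', ')}`,
    `Use only these predicates: ${schema.predicates.join(', ')}`,
    `Question: ${question}`
  ].join('\n')
}

// Schema shape taken from step 1 above.
const schema = {
  classes: ['Claim', 'Provider'],
  predicates: ['riskScore', 'amount']
}

const prompt = buildSchemaPrompt(schema, 'Show me suspicious vendors')
console.log(prompt)
```

Because the allowed classes and predicates appear in the prompt, the model has the vocabulary it needs to map "vendors" onto `Provider` rather than inventing a class name.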
|
|
1293
714
|
|
|
1294
|
-
|
|
715
|
+
### Mathematical Foundation: Predictable AI
|
|
1295
716
|
|
|
1296
|
-
|
|
1297
|
-
g: B → C
|
|
1298
|
-
g ∘ f: A → C (valid only if types align)
|
|
717
|
+
Unlike black-box LLMs, HyperMind produces **deterministic, verifiable results**:
|
|
1299
718
|
|
|
1300
|
-
Laws guaranteed:
|
|
1301
|
-
1. Identity: id ∘ f = f = f ∘ id
|
|
1302
|
-
2. Associativity: (h ∘ g) ∘ f = h ∘ (g ∘ f)
|
|
1303
719
|
```
|
|
1304
|
-
|
|
1305
|
-
|
|
1306
|
-
|
|
1307
|
-
|
|
1308
|
-
|
|
1309
|
-
|
|
1310
|
-
|
|
1311
|
-
|
|
1312
|
-
|
|
1313
|
-
|
|
1314
|
-
|
|
1315
|
-
|
|
1316
|
-
|
|
1317
|
-
|
|
1318
|
-
"hash": "sha256:a3f2c8d9..."
|
|
1319
|
-
}
|
|
1320
|
-
```
|
|
1321
|
-
|
|
1322
|
-
**Implication**: Full audit trail for SOX, GDPR, FDA 21 CFR Part 11 compliance.
|
|
1323
|
-
|
|
1324
|
-
---
|
|
1325
|
-
|
|
1326
|
-
## Ontology Engine
|
|
1327
|
-
|
|
1328
|
-
rust-kgdb includes a complete ontology engine based on W3C standards.
|
|
1329
|
-
|
|
1330
|
-
### RDFS Reasoning
|
|
1331
|
-
|
|
1332
|
-
```turtle
|
|
1333
|
-
# Schema
|
|
1334
|
-
:Employee rdfs:subClassOf :Person .
|
|
1335
|
-
:Manager rdfs:subClassOf :Employee .
|
|
1336
|
-
|
|
1337
|
-
# Data
|
|
1338
|
-
:alice a :Manager .
|
|
1339
|
-
|
|
1340
|
-
# Inferred (automatic)
|
|
1341
|
-
:alice a :Employee . # via subclass chain
|
|
1342
|
-
:alice a :Person . # via subclass chain
|
|
720
|
+
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
721
|
+
│ NEURO-SYMBOLIC ARCHITECTURE │
|
|
722
|
+
│ │
|
|
723
|
+
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
|
|
724
|
+
│ │ Neural │ │ Symbolic │ │ Output │ │
|
|
725
|
+
│ │ (LLM) │────→│ (SPARQL) │────→│ (Proof DAG) │ │
|
|
726
|
+
│ │ │ │ │ │ │ │
|
|
727
|
+
│ │ Intent classif │ │ Query execution│ │ Verifiable │ │
|
|
728
|
+
│ │ SPARQL gen │ │ Datalog rules │ │ Reproducible │ │
|
|
729
|
+
│ └────────────────┘ └────────────────┘ └────────────────┘ │
|
|
730
|
+
│ │
|
|
731
|
+
│ "Find fraud" → SELECT ?claim WHERE {...} → { hash: "0x8f3a...", │
|
|
732
|
+
│ derivation: [...] } │
|
|
733
|
+
└─────────────────────────────────────────────────────────────────────────────┘
|
|
1343
734
|
```
|
|
1344
735
|
|
|
1345
|
-
|
|
736
|
+
**Three Mathematical Pillars**:
|
|
1346
737
|
|
|
1347
|
-
|
|
|
1348
|
-
|
|
1349
|
-
|
|
|
1350
|
-
|
|
|
1351
|
-
|
|
|
1352
|
-
| `prp-trp` | Transitive property |
|
|
1353
|
-
| `cls-hv` | hasValue restriction |
|
|
1354
|
-
| `cls-svf` | someValuesFrom restriction |
|
|
1355
|
-
| `cax-sco` | Subclass transitivity |
|
|
738
|
+
| Pillar | Guarantee | Implementation |
|
|
739
|
+
|--------|-----------|----------------|
|
|
740
|
+
| **Type Theory** | Input/output contracts enforced | `kg.sparql.query: Query → BindingSet` |
|
|
741
|
+
| **Category Theory** | Safe tool composition | Morphisms compose: `A → B → C` |
|
|
742
|
+
| **Proof Theory** | Every answer has provenance | ProofDAG with Curry-Howard witness |
|
|
1356
743
|
|
|
1357
|
-
|
|
744
|
+
**Why This Matters**:
|
|
745
|
+
- **No Hallucination**: SPARQL results come from your actual data
|
|
746
|
+
- **Audit Trail**: Every conclusion traceable to source triples
|
|
747
|
+
- **Reproducibility**: Same query → same answer → same proof hash
|
|
748
|
+
- **Compliance Ready**: Full provenance for regulatory requirements
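
The reproducibility claim ("same query → same answer → same proof hash") depends on hashing a canonical form of the result. The sketch below shows one way to do that with Node's built-in `crypto` module. It assumes SHA-256 over key-sorted JSON; the package's real hashing scheme is not documented in this section, and `canonicalize` and `proofHash` are hypothetical names.

```javascript
const crypto = require('crypto')

// Serialize with sorted object keys so key order does not change the hash.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(',')}]`
  }
  if (value !== null && typeof value === 'object') {
    const body = Object.keys(value)
      .sort()
      .map(key => `${JSON.stringify(key)}:${canonicalize(value[key])}`)
      .join(',')
    return `{${body}}`
  }
  return JSON.stringify(value)
}

// Hypothetical proof hash: SHA-256 of the canonical serialization.
function proofHash(record) {
  const digest = crypto.createHash('sha256').update(canonicalize(record)).digest('hex')
  return `sha256:${digest}`
}

// The same record with keys in a different order hashes identically.
const hashA = proofHash({ query: 'SELECT ?claim', rows: [{ claim: 'CLM001' }] })
const hashB = proofHash({ rows: [{ claim: 'CLM001' }], query: 'SELECT ?claim' })
console.log(hashA === hashB) // true
```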
|
|
1358
749
|
|
|
1359
|
-
|
|
1360
|
-
:PersonShape a sh:NodeShape ;
|
|
1361
|
-
sh:targetClass :Person ;
|
|
1362
|
-
sh:property [
|
|
1363
|
-
sh:path :email ;
|
|
1364
|
-
sh:pattern "^[a-z]+@[a-z]+\\.[a-z]+$" ;
|
|
1365
|
-
sh:minCount 1 ;
|
|
1366
|
-
] .
|
|
1367
|
-
```
|
|
750
|
+
**How Intent Classification Works:**
|
|
1368
751
|
|
|
1369
|
-
|
|
752
|
+
For accurate natural language → SPARQL conversion, the agent needs:
|
|
1370
753
|
|
|
1371
|
-
|
|
754
|
+
1. **Schema Awareness** - Know actual predicates in your graph
|
|
755
|
+
2. **Semantic Understanding** - Map natural language to graph operations
|
|
756
|
+
3. **Dynamic Query Generation** - Build SPARQL for your specific schema
|
|
1372
757
|
|
|
1373
|
-
**
|
|
1374
|
-
- Staged accidents: 20% of insurance fraud
|
|
1375
|
-
- Provider collusion: 25% of fraud claims
|
|
1376
|
-
- Ring operations: 40% of organized fraud
|
|
758
|
+
**Two Modes of Operation:**
|
|
1377
759
|
|
|
1378
|
-
|
|
760
|
+
| Mode | Intent Classification | SPARQL Generation | Use Case |
|
|
761
|
+
|------|----------------------|-------------------|----------|
|
|
762
|
+
| **Demo Mode** (default) | Keyword patterns | Hardcoded templates | Quick testing, demos |
|
|
763
|
+
| **Production Mode** | LLM + Schema injection | LLM-generated | Accurate queries on real data |
|
|
1379
764
|
|
|
1380
|
-
###
|
|
765
|
+
### Demo Mode (Current Default)
|
|
1381
766
|
|
|
1382
|
-
|
|
767
|
+
Works with keyword matching and pre-built templates:
|
|
1383
768
|
|
|
1384
769
|
```javascript
|
|
1385
|
-
|
|
1386
|
-
// STEP 1: Environment Configuration
|
|
1387
|
-
// ============================================================
|
|
1388
|
-
const { GraphDB, GraphFrame, EmbeddingService, DatalogProgram, evaluateDatalog } = require('rust-kgdb')
|
|
1389
|
-
const { AgentBuilder, LLMPlanner, WasmSandbox, TOOL_REGISTRY } = require('rust-kgdb/hypermind-agent')
|
|
1390
|
-
|
|
1391
|
-
// Configure embedding provider (choose one)
|
|
1392
|
-
const EMBEDDING_PROVIDER = process.env.EMBEDDING_PROVIDER || 'mock'
|
|
1393
|
-
const OPENAI_API_KEY = process.env.OPENAI_API_KEY
|
|
1394
|
-
const VOYAGE_API_KEY = process.env.VOYAGE_API_KEY
|
|
1395
|
-
|
|
1396
|
-
// Embedding dimension must match provider output
|
|
1397
|
-
const EMBEDDING_DIM = 384
|
|
1398
|
-
|
|
1399
|
-
// ============================================================
|
|
1400
|
-
// STEP 2: Initialize Services
|
|
1401
|
-
// ============================================================
|
|
1402
|
-
const db = new GraphDB('http://insurance.org/fraud-kb')
|
|
1403
|
-
const embeddings = new EmbeddingService()
|
|
1404
|
-
|
|
1405
|
-
// ============================================================
|
|
1406
|
-
// STEP 3: Configure Embedding Provider (bring your own)
|
|
1407
|
-
// ============================================================
|
|
1408
|
-
async function getEmbedding(text) {
|
|
1409
|
-
switch (EMBEDDING_PROVIDER) {
|
|
1410
|
-
case 'openai':
|
|
1411
|
-
// Requires: npm install openai
|
|
1412
|
-
const { OpenAI } = require('openai')
|
|
1413
|
-
const openai = new OpenAI({ apiKey: OPENAI_API_KEY })
|
|
1414
|
-
const resp = await openai.embeddings.create({
|
|
1415
|
-
model: 'text-embedding-3-small',
|
|
1416
|
-
input: text,
|
|
1417
|
-
dimensions: EMBEDDING_DIM
|
|
1418
|
-
})
|
|
1419
|
-
return resp.data[0].embedding
|
|
1420
|
-
|
|
1421
|
-
case 'voyage':
|
|
1422
|
-
// Using fetch directly (no SDK required)
|
|
1423
|
-
const vResp = await fetch('https://api.voyageai.com/v1/embeddings', {
|
|
1424
|
-
method: 'POST',
|
|
1425
|
-
headers: {
|
|
1426
|
-
'Authorization': `Bearer ${VOYAGE_API_KEY}`,
|
|
1427
|
-
'Content-Type': 'application/json'
|
|
1428
|
-
},
|
|
1429
|
-
body: JSON.stringify({ input: text, model: 'voyage-2' })
|
|
1430
|
-
})
|
|
1431
|
-
const vData = await vResp.json()
|
|
1432
|
-
return vData.data[0].embedding.slice(0, EMBEDDING_DIM)
|
|
1433
|
-
|
|
1434
|
-
default: // Mock embeddings for testing (no external deps)
|
|
1435
|
-
return new Array(EMBEDDING_DIM).fill(0).map((_, i) =>
|
|
1436
|
-
Math.sin(text.charCodeAt(i % text.length) * 0.1) * 0.5 + 0.5
|
|
1437
|
-
)
|
|
1438
|
-
}
|
|
1439
|
-
}
|
|
770
|
+
const agent = new HyperMindAgent({ kg: db })
|
|
1440
771
|
|
|
1441
|
-
//
|
|
1442
|
-
|
|
1443
|
-
// ============================================================
|
|
1444
|
-
async function loadClaimsDataset() {
|
|
1445
|
-
// Load structured RDF data
|
|
1446
|
-
db.loadTtl(`
|
|
1447
|
-
@prefix : <http://insurance.org/> .
|
|
1448
|
-
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
|
|
1449
|
-
|
|
1450
|
-
# Claims
|
|
1451
|
-
:CLM001 a :Claim ;
|
|
1452
|
-
:amount "18500"^^xsd:decimal ;
|
|
1453
|
-
:description "Soft tissue injury from rear-end collision" ;
|
|
1454
|
-
:claimant :P001 ;
|
|
1455
|
-
:provider :PROV001 ;
|
|
1456
|
-
:filingDate "2024-11-15"^^xsd:date .
|
|
1457
|
-
|
|
1458
|
-
:CLM002 a :Claim ;
|
|
1459
|
-
:amount "22300"^^xsd:decimal ;
|
|
1460
|
-
:description "Whiplash injury from vehicle accident" ;
|
|
1461
|
-
:claimant :P002 ;
|
|
1462
|
-
:provider :PROV001 ;
|
|
1463
|
-
:filingDate "2024-11-18"^^xsd:date .
|
|
1464
|
-
|
|
1465
|
-
# Claimants
|
|
1466
|
-
:P001 a :Claimant ;
|
|
1467
|
-
:name "John Smith" ;
|
|
1468
|
-
:address "123 Main St, Miami, FL" ;
|
|
1469
|
-
:riskScore "0.85"^^xsd:decimal .
|
|
1470
|
-
|
|
1471
|
-
:P002 a :Claimant ;
|
|
1472
|
-
:name "Jane Doe" ;
|
|
1473
|
-
:address "123 Main St, Miami, FL" ; # Same address!
|
|
1474
|
-
:riskScore "0.72"^^xsd:decimal .
|
|
1475
|
-
|
|
1476
|
-
# Relationships (fraud indicators)
|
|
1477
|
-
:P001 :knows :P002 .
|
|
1478
|
-
:P001 :paidTo :P002 .
|
|
1479
|
-
:P002 :paidTo :P003 .
|
|
1480
|
-
:P003 :paidTo :P001 . # Circular payment!
|
|
1481
|
-
|
|
1482
|
-
# Provider
|
|
1483
|
-
:PROV001 a :Provider ;
|
|
1484
|
-
:name "Quick Care Rehabilitation Clinic" ;
|
|
1485
|
-
:flagCount "4"^^xsd:integer .
|
|
1486
|
-
`, null)
|
|
1487
|
-
|
|
1488
|
-
console.log(`[Dataset] Loaded ${db.countTriples()} triples`)
|
|
1489
|
-
|
|
1490
|
-
// Generate embeddings for claims (TRIGGER)
|
|
1491
|
-
const claims = ['CLM001', 'CLM002']
|
|
1492
|
-
for (const claimId of claims) {
|
|
1493
|
-
const desc = db.querySelect(`
|
|
1494
|
-
PREFIX : <http://insurance.org/>
|
|
1495
|
-
SELECT ?desc WHERE { :${claimId} :description ?desc }
|
|
1496
|
-
`)[0]?.bindings?.desc || claimId
|
|
1497
|
-
|
|
1498
|
-
const vector = await getEmbedding(desc)
|
|
1499
|
-
embeddings.storeVector(claimId, vector)
|
|
1500
|
-
console.log(`[Embedding] Stored ${claimId}: ${vector.slice(0, 3).map(v => v.toFixed(3)).join(', ')}...`)
|
|
1501
|
-
}
|
|
772
|
+
// Works: keyword "fraud" matches detect_fraud intent
|
|
773
|
+
await agent.call('Find fraud cases')
|
|
1502
774
|
|
|
1503
|
-
|
|
1504
|
-
|
|
1505
|
-
embeddings.onTripleInsert('CLM001', 'provider', 'PROV001', null)
|
|
1506
|
-
embeddings.onTripleInsert('CLM002', 'claimant', 'P002', null)
|
|
1507
|
-
embeddings.onTripleInsert('CLM002', 'provider', 'PROV001', null)
|
|
1508
|
-
embeddings.onTripleInsert('P001', 'knows', 'P002', null)
|
|
1509
|
-
console.log('[1-Hop Cache] Updated neighbor relationships')
|
|
1510
|
-
|
|
1511
|
-
// Rebuild HNSW index
|
|
1512
|
-
embeddings.rebuildIndex()
|
|
1513
|
-
console.log('[HNSW Index] Rebuilt for similarity search')
|
|
1514
|
-
}
|
|
1515
|
-
|
|
1516
|
-
// ============================================================
|
|
1517
|
-
// STEP 5: Run Fraud Detection Pipeline
|
|
1518
|
-
// ============================================================
|
|
1519
|
-
async function runFraudDetection() {
|
|
1520
|
-
await loadClaimsDataset()
|
|
1521
|
-
|
|
1522
|
-
// Graph network analysis
|
|
1523
|
-
const graph = new GraphFrame(
|
|
1524
|
-
JSON.stringify([{id:'P001'}, {id:'P002'}, {id:'P003'}]),
|
|
1525
|
-
JSON.stringify([
|
|
1526
|
-
{src:'P001', dst:'P002'},
|
|
1527
|
-
{src:'P002', dst:'P003'},
|
|
1528
|
-
{src:'P003', dst:'P001'}
|
|
1529
|
-
])
|
|
1530
|
-
)
|
|
1531
|
-
|
|
1532
|
-
const triangles = graph.triangleCount()
|
|
1533
|
-
console.log(`[GraphFrame] Fraud rings detected: ${triangles}`)
|
|
1534
|
-
|
|
1535
|
-
// Semantic similarity search
|
|
1536
|
-
const similarClaims = JSON.parse(embeddings.findSimilar('CLM001', 5, 0.7))
|
|
1537
|
-
console.log(`[Embeddings] Claims similar to CLM001:`, similarClaims)
|
|
1538
|
-
|
|
1539
|
-
// Datalog rule-based inference
|
|
1540
|
-
const datalog = new DatalogProgram()
|
|
1541
|
-
datalog.addFact(JSON.stringify({predicate:'claim', terms:['CLM001','P001','PROV001']}))
|
|
1542
|
-
datalog.addFact(JSON.stringify({predicate:'claim', terms:['CLM002','P002','PROV001']}))
|
|
1543
|
-
datalog.addFact(JSON.stringify({predicate:'related', terms:['P001','P002']}))
|
|
1544
|
-
|
|
1545
|
-
datalog.addRule(JSON.stringify({
|
|
1546
|
-
head: {predicate:'collusion', terms:['?P1','?P2','?Prov']},
|
|
1547
|
-
body: [
|
|
1548
|
-
{predicate:'claim', terms:['?C1','?P1','?Prov']},
|
|
1549
|
-
{predicate:'claim', terms:['?C2','?P2','?Prov']},
|
|
1550
|
-
{predicate:'related', terms:['?P1','?P2']}
|
|
1551
|
-
]
|
|
1552
|
-
}))
|
|
1553
|
-
|
|
1554
|
-
const result = JSON.parse(evaluateDatalog(datalog))
|
|
1555
|
-
console.log('[Datalog] Collusion detected:', result.collusion)
|
|
1556
|
-
// Output: [["P001","P002","PROV001"]]
|
|
1557
|
-
}
|
|
1558
|
-
|
|
1559
|
-
runFraudDetection()
|
|
775
|
+
// No match: "anomalous" is not a recognized keyword
|
|
776
|
+
await agent.call('Find anomalous billing patterns') // Falls back to generic query
|
|
1560
777
|
```
|
|
1561
778
|
|
|
1562
|
-
**
|
|
1563
|
-
|
|
1564
|
-
|
|
1565
|
-
|
|
779
|
+
**Limitations:**
|
|
780
|
+
- Only matches exact keywords: "fraud", "suspicious", "risk", "similar", etc.
|
|
781
|
+
- Uses hardcoded SPARQL templates that may not match your schema
|
|
782
|
+
- Suitable only for demos with the insurance or LUBM ontologies
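
The demo-mode behavior described above can be approximated with a small keyword table. This is a sketch, not the code in `hypermind-agent.js`: only the `detect_fraud` intent name appears in this README, and the other intent names, the keyword lists, and `classifyIntent` are assumptions for illustration.

```javascript
// Hypothetical keyword table; the real patterns are not shown in this section.
const INTENT_KEYWORDS = {
  detect_fraud: ['fraud', 'suspicious'],
  assess_risk: ['risk'],
  find_similar: ['similar']
}

function classifyIntent(prompt) {
  const text = prompt.toLowerCase()
  for (const [intent, keywords] of Object.entries(INTENT_KEYWORDS)) {
    // Plain substring matching: there is no synonym or semantic handling.
    if (keywords.some(keyword => text.includes(keyword))) return intent
  }
  return 'generic_query' // fallback when no keyword matches
}

console.log(classifyIntent('Find fraud cases')) // detect_fraud
console.log(classifyIntent('Find anomalous billing patterns')) // generic_query
```

Substring matching is why a phrase like "anomalous billing" falls through to the generic fallback: without an LLM, nothing links it to the fraud intent.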
|
|
1566
783
|
|
|
1567
|
-
|
|
1568
|
-
```
|
|
1569
|
-
======================================================================
|
|
1570
|
-
FRAUD DETECTION AGENT - Production Pipeline
|
|
1571
|
-
rust-kgdb v0.2.0 | Neuro-Symbolic AI Framework
|
|
1572
|
-
======================================================================
|
|
1573
|
-
|
|
1574
|
-
[PHASE 1] Knowledge Graph Initialization
|
|
1575
|
-
--------------------------------------------------
|
|
1576
|
-
Graph URI: http://insurance.org/fraud-kb
|
|
1577
|
-
Triples: 13
|
|
1578
|
-
|
|
1579
|
-
[PHASE 2] Graph Network Analysis
|
|
1580
|
-
--------------------------------------------------
|
|
1581
|
-
Vertices: 7
|
|
1582
|
-
Edges: 8
|
|
1583
|
-
Triangles: 1 (fraud ring indicator)
|
|
1584
|
-
PageRank (central actors):
|
|
1585
|
-
- PROV001: 0.2169
|
|
1586
|
-
- P001: 0.1418
|
|
1587
|
-
|
|
1588
|
-
[PHASE 3] Semantic Similarity Analysis
|
|
1589
|
-
--------------------------------------------------
|
|
1590
|
-
Embeddings stored: 5
|
|
1591
|
-
Vector dimension: 384
|
|
1592
|
-
|
|
1593
|
-
[PHASE 4] Datalog Rule-Based Inference
|
|
1594
|
-
--------------------------------------------------
|
|
1595
|
-
Facts: 6
|
|
1596
|
-
Rules: 2
|
|
1597
|
-
Inferred facts:
|
|
1598
|
-
- Collusion: [["P001","P002","PROV001"]]
|
|
1599
|
-
- Connected: [["P001","P003"]]
|
|
1600
|
-
|
|
1601
|
-
======================================================================
|
|
1602
|
-
FRAUD DETECTION REPORT - OVERALL RISK: HIGH
|
|
1603
|
-
======================================================================
|
|
1604
|
-
```
|
|
1605
|
-
|
|
1606
|
-
---
|
|
1607
|
-
|
|
1608
|
-
## Production Example: Underwriting
|
|
784
|
+
### Production Mode (Recommended)
|
|
1609
785
|
|
|
1610
|
-
|
|
1611
|
-
- NAICS codes: US Census Bureau industry classification
|
|
1612
|
-
- Territory modifiers: Based on catastrophe exposure (hurricane zones FL, earthquake CA)
|
|
1613
|
-
- Loss ratio thresholds: Industry standard 0.70 referral trigger
|
|
1614
|
-
- Experience modification: Standard 5/10 year breaks
|
|
1615
|
-
|
|
1616
|
-
**Premium Formula:** `Base Rate × Exposure × Territory Mod × Experience Mod × Loss Mod` - standard ISO methodology.
|
|
786
|
+
For accurate queries on real data, provide an LLM configuration:
|
|
1617
787
|
|
|
1618
788
|
```javascript
|
|
1619
|
-
const
|
|
1620
|
-
|
|
1621
|
-
//
|
|
1622
|
-
|
|
1623
|
-
|
|
1624
|
-
|
|
1625
|
-
:BUS001 :naics "332119" ; :lossRatio "0.45" ; :territory "FL" .
|
|
1626
|
-
:BUS002 :naics "541512" ; :lossRatio "0.00" ; :territory "CA" .
|
|
1627
|
-
:BUS003 :naics "484121" ; :lossRatio "0.72" ; :territory "TX" .
|
|
1628
|
-
`, null)
|
|
1629
|
-
|
|
1630
|
-
// Apply underwriting rules
|
|
1631
|
-
const datalog = new DatalogProgram()
|
|
1632
|
-
datalog.addFact(JSON.stringify({predicate:'business', terms:['BUS001','manufacturing','0.45']}))
|
|
1633
|
-
datalog.addFact(JSON.stringify({predicate:'business', terms:['BUS002','tech','0.00']}))
|
|
1634
|
-
datalog.addFact(JSON.stringify({predicate:'business', terms:['BUS003','transport','0.72']}))
|
|
1635
|
-
datalog.addFact(JSON.stringify({predicate:'highRiskClass', terms:['transport']}))
|
|
1636
|
-
|
|
1637
|
-
datalog.addRule(JSON.stringify({
|
|
1638
|
-
head: {predicate:'referToUW', terms:['?Bus']},
|
|
1639
|
-
body: [
|
|
1640
|
-
{predicate:'business', terms:['?Bus','?Class','?LR']},
|
|
1641
|
-
{predicate:'highRiskClass', terms:['?Class']}
|
|
1642
|
-
]
|
|
1643
|
-
}))
|
|
1644
|
-
|
|
1645
|
-
datalog.addRule(JSON.stringify({
|
|
1646
|
-
head: {predicate:'autoApprove', terms:['?Bus']},
|
|
1647
|
-
body: [{predicate:'business', terms:['?Bus','tech','?LR']}]
|
|
1648
|
-
}))
|
|
1649
|
-
|
|
1650
|
-
const decisions = JSON.parse(evaluateDatalog(datalog))
|
|
1651
|
-
console.log('Auto-approve:', decisions.autoApprove) // [["BUS002"]]
|
|
1652
|
-
console.log('Refer to UW:', decisions.referToUW) // [["BUS003"]]
|
|
1653
|
-
```
|
|
1654
|
-
|
|
1655
|
-
**Run it yourself:**
|
|
1656
|
-
```bash
|
|
1657
|
-
node examples/underwriting-agent.js
|
|
1658
|
-
```
|
|
1659
|
-
|
|
1660
|
-
**Actual Output:**
|
|
1661
|
-
```
|
|
1662
|
-
======================================================================
|
|
1663
|
-
INSURANCE UNDERWRITING AGENT - Production Pipeline
|
|
1664
|
-
rust-kgdb v0.2.0 | Neuro-Symbolic AI Framework
|
|
1665
|
-
======================================================================
|
|
1666
|
-
|
|
1667
|
-
[PHASE 2] Risk Factor Analysis
|
|
1668
|
-
--------------------------------------------------
|
|
1669
|
-
Risk network: 12 nodes, 10 edges
|
|
1670
|
-
Risk concentration (PageRank):
|
|
1671
|
-
- BUS001: 0.0561
|
|
1672
|
-
- BUS003: 0.0561
|
|
1673
|
-
|
|
1674
|
-
[PHASE 3] Similar Risk Profile Matching
|
|
1675
|
-
--------------------------------------------------
|
|
1676
|
-
Risk embeddings stored: 4
|
|
1677
|
-
Profiles similar to BUS003 (high-risk transportation):
|
|
1678
|
-
- BUS001: manufacturing, loss ratio 0.45
|
|
1679
|
-
- BUS004: hospitality, loss ratio 0.28
|
|
1680
|
-
|
|
1681
|
-
[PHASE 4] Underwriting Decision Rules
|
|
1682
|
-
--------------------------------------------------
|
|
1683
|
-
Facts loaded: 6
|
|
1684
|
-
Decision rules: 2
|
|
1685
|
-
Automated decisions:
|
|
1686
|
-
- BUS002: AUTO-APPROVE
|
|
1687
|
-
- BUS003: REFER TO UNDERWRITER
|
|
1688
|
-
|
|
1689
|
-
[PHASE 5] Premium Calculation
|
|
1690
|
-
--------------------------------------------------
|
|
1691
|
-
- BUS001: $1,339,537 (STANDARD)
|
|
1692
|
-
- BUS002: $74,155 (APPROVED)
|
|
1693
|
-
- BUS003: $1,125,778 (REFER)
|
|
1694
|
-
|
|
1695
|
-
======================================================================
|
|
1696
|
-
Applications processed: 4 | Auto-approved: 1 | Referred: 1
|
|
1697
|
-
======================================================================
|
|
1698
|
-
```
|
|
1699
|
-
|
|
1700
|
-
---
|
|
1701
|
-
|
|
1702
|
-
## HyperMind Agent Design: A Complete Guide
|
|
1703
|
-
|
|
1704
|
-
This section explains how to design production-grade AI agents using HyperMind's mathematical foundations. We'll walk through the complete architecture using our Fraud Detection and Underwriting agents as case studies.
|
|
1705
|
-
|
|
1706
|
-
### The HyperMind Architecture
|
|
789
|
+
const agent = new HyperMindAgent({
|
|
790
|
+
kg: db,
|
|
791
|
+
embeddings: new EmbeddingService(), // For semantic similarity
|
|
792
|
+
model: 'claude-sonnet-4', // LLM for intent + SPARQL generation
|
|
793
|
+
apiKey: process.env.ANTHROPIC_API_KEY // Required for LLM calls
|
|
794
|
+
})
|
|
1707
795
|
|
|
1708
|
-
|
|
1709
|
-
|
|
1710
|
-
│ HYPERMIND FRAMEWORK │
|
|
1711
|
-
│ │
|
|
1712
|
-
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
|
|
1713
|
-
│ │ TYPE THEORY │ │ CATEGORY │ │ PROOF │ │
|
|
1714
|
-
│ │ (Hindley- │ │ THEORY │ │ THEORY │ │
|
|
1715
|
-
│ │ Milner) │ │ (Morphisms) │ │ (Witnesses) │ │
|
|
1716
|
-
│ └───────┬───────┘ └───────┬───────┘ └───────┬───────┘ │
|
|
1717
|
-
│ │ │ │ │
|
|
1718
|
-
│ └─────────────┬─────┴───────────────────┘ │
|
|
1719
|
-
│ │ │
|
|
1720
|
-
│ ┌─────────────────────▼─────────────────────────────────────────┐ │
|
|
1721
|
-
│ │ TOOL REGISTRY │ │
|
|
1722
|
-
│ │ Every tool is a typed morphism: Input Type → Output Type │ │
|
|
1723
|
-
│ │ │ │
|
|
1724
|
-
│ │ kg.sparql.query : SPARQLQuery → BindingSet │ │
|
|
1725
|
-
│ │ kg.graphframe : Graph → AnalysisResult │ │
|
|
1726
|
-
│ │ kg.embeddings : EntityId → SimilarEntities │ │
|
|
1727
|
-
│ │ kg.datalog : DatalogProgram → InferredFacts │ │
|
|
1728
|
-
│ └───────────────────────────────────────────────────────────────┘ │
|
|
1729
|
-
│ │ │
|
|
1730
|
-
│ ┌─────────────────────▼─────────────────────────────────────────┐ │
|
|
1731
|
-
│ │ AGENT EXECUTOR │ │
|
|
1732
|
-
│ │ Composes tools safely • Produces execution witness │ │
|
|
1733
|
-
│ └───────────────────────────────────────────────────────────────┘ │
|
|
1734
|
-
└─────────────────────────────────────────────────────────────────────────────┘
|
|
796
|
+
// Now works: LLM understands semantics
|
|
797
|
+
await agent.call('Find anomalous billing patterns from last quarter')
|
|
1735
798
|
```
|
|
1736
799
|
|
|
1737
|
-
|
|
800
|
+
**How Production Mode Works:**
|
|
1738
801
|
|
|
1739
|
-
The knowledge graph is the foundation. It encodes domain expertise as structured data.
|
|
1740
|
-
|
|
1741
|
-
**Fraud Detection Domain Model:**
|
|
1742
|
-
```
|
|
1743
|
-
┌─────────────┐ paidTo ┌─────────────┐
|
|
1744
|
-
│ Claimant │ ───────────────▶│ Claimant │
|
|
1745
|
-
│ (P001) │ │ (P002) │
|
|
1746
|
-
└──────┬──────┘ └──────┬──────┘
|
|
1747
|
-
│ claimant │ claimant
|
|
1748
|
-
▼ ▼
|
|
1749
|
-
┌─────────────┐ ┌─────────────┐
|
|
1750
|
-
│ Claim │ provider │ Claim │
|
|
1751
|
-
│ (CLM001) │ ───────────────▶│ (CLM002) │
|
|
1752
|
-
└──────┬──────┘ ┌─────────┴─────────────┘
|
|
1753
|
-
│ │
|
|
1754
|
-
▼ ▼
|
|
1755
|
-
┌──────────────────────┐
|
|
1756
|
-
│ Provider │ ◀── High claim volume signals risk
|
|
1757
|
-
│ (PROV001) │
|
|
1758
|
-
└──────────────────────┘
|
|
1759
802
|
```
|
|
1760
|
-
|
|
1761
|
-
|
|
1762
|
-
|
|
1763
|
-
|
|
1764
|
-
|
|
1765
|
-
|
|
1766
|
-
|
|
1767
|
-
|
|
1768
|
-
|
|
1769
|
-
|
|
1770
|
-
|
|
1771
|
-
|
|
1772
|
-
|
|
1773
|
-
|
|
1774
|
-
|
|
1775
|
-
|
|
1776
|
-
|
|
1777
|
-
|
|
1778
|
-
|
|
1779
|
-
|
|
1780
|
-
|
|
1781
|
-
|
|
1782
|
-
|
|
1783
|
-
|
|
1784
|
-
|
|
1785
|
-
|
|
1786
|
-
ins:amount "18500"^^xsd:decimal .
|
|
1787
|
-
|
|
1788
|
-
# Fraud ring indicator: claimants know each other
|
|
1789
|
-
ins:P001 ins:knows ins:P002 .
|
|
1790
|
-
ins:P001 ins:sameAddress ins:P002 .
|
|
1791
|
-
`, 'http://insurance.org/fraud-kb')
|
|
1792
|
-
|
|
1793
|
-
console.log(`Knowledge Graph: ${db.countTriples()} triples`)
|
|
1794
|
-
```
|
|
1795
|
-
|
|
1796
|
-
### Step 2: Graph Analytics with GraphFrames
|
|
1797
|
-
|
|
1798
|
-
GraphFrames detect structural patterns that indicate fraud rings.
|
|
1799
|
-
|
|
1800
|
-
**Design Thinking:** Fraud rings create network triangles. If A→B→C→A, there's a closed loop of money flow - a classic fraud indicator.
|
|
1801
|
-
|
|
803
|
+
User Query: "Find anomalous billing patterns"
|
|
804
|
+
│
|
|
805
|
+
▼
|
|
806
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
807
|
+
│ 1. SCHEMA INJECTION │
|
|
808
|
+
│ Agent extracts predicates from KG: │
|
|
809
|
+
│ Classes: Claim, Provider, Claimant │
|
|
810
|
+
│ Predicates: submittedBy, amount, riskScore, filedDate │
|
|
811
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
812
|
+
│
|
|
813
|
+
▼
|
|
814
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
815
|
+
│ 2. LLM INTENT CLASSIFICATION │
|
|
816
|
+
│ Prompt: "Given schema {classes, predicates}, classify: │
|
|
817
|
+
│ 'Find anomalous billing patterns'" │
|
|
818
|
+
│ Response: { intent: 'detect_fraud', confidence: 0.92 } │
|
|
819
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
820
|
+
│
|
|
821
|
+
▼
|
|
822
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
823
|
+
│ 3. LLM SPARQL GENERATION │
|
|
824
|
+
│ Prompt: "Generate SPARQL for detect_fraud using: │
|
|
825
|
+
│ - Predicates: {submittedBy, amount, riskScore} │
|
|
826
|
+
│ - Type contracts: Output must be valid SPARQL 1.1" │
|
|
827
|
+
│ Response: Valid SPARQL matching YOUR schema │
|
|
828
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
1802
829
|
```
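
The schema-injection and prompt-building steps in the diagram can be sketched as follows. This assumes the schema is read from plain `{ subject, predicate, object }` objects; `extractSchema`, `buildSparqlPrompt`, and the sample IRIs are hypothetical names for illustration, not functions exported by rust-kgdb.

```javascript
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'

// Local name of an IRI: the part after the last '#' or '/'.
const localName = iri => iri.split(/[#/]/).pop()

// Step 1 (schema injection): collect class and predicate names
// from a list of plain triple objects.
function extractSchema(triples) {
  const classes = new Set()
  const predicates = new Set()
  for (const { predicate, object } of triples) {
    if (predicate === RDF_TYPE) classes.add(localName(object))
    else predicates.add(localName(predicate))
  }
  return { classes: [...classes].sort(), predicates: [...predicates].sort() }
}

// Step 3 (SPARQL generation prompt): embed the schema so the model
// only sees names that actually exist in the graph.
function buildSparqlPrompt(schema, intent, question) {
  return [
    `Generate SPARQL for ${intent} using:`,
    `- Classes: ${schema.classes.join(', ')}`,
    `- Predicates: ${schema.predicates.join(', ')}`,
    '- Output must be valid SPARQL 1.1, with no Markdown fences',
    `Question: ${question}`
  ].join('\n')
}

// Toy triples, for illustration only.
const triples = [
  { subject: 'http://insurance.org/CLM001', predicate: RDF_TYPE, object: 'http://insurance.org/Claim' },
  { subject: 'http://insurance.org/CLM001', predicate: 'http://insurance.org/amount', object: '18500' },
  { subject: 'http://insurance.org/CLM001', predicate: 'http://insurance.org/submittedBy', object: 'http://insurance.org/P001' }
]
const schema = extractSchema(triples)
const prompt = buildSparqlPrompt(schema, 'detect_fraud', 'Find anomalous billing patterns')
console.log(prompt)
```

Constraining the prompt to extracted names is what lets the generated query match the caller's schema instead of a hardcoded template.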
|
|
1803
|
-
Triangle Detection: PageRank Analysis:
|
|
1804
830
|
|
|
1805
|
-
|
|
1806
|
-
╱ ╲ P001: 0.1418 ← High influence
|
|
1807
|
-
╱ ╲ P002: 0.1312 ← Connected to ring
|
|
1808
|
-
▼ ▼
|
|
1809
|
-
P002 ────▶ P003 Interpretation: PROV001 is the hub
|
|
1810
|
-
↖____/ that connects multiple claimants.
|
|
831
|
+
**Why EmbeddingService?**
|
|
1811
832
|
|
|
1812
|
-
|
|
1813
|
-
|
|
833
|
+
EmbeddingService enables two features:
|
|
834
|
+
1. **Semantic Search Tool** - `find_similar` intent uses `kg.embeddings.search`
|
|
835
|
+
2. **Memory Retrieval** - Find similar past queries for context
|
|
1814
836
|
|
|
1815
|
-
**Code: Network Analysis**
|
|
1816
837
|
```javascript
|
|
1817
|
-
|
|
1818
|
-
|
|
1819
|
-
|
|
1820
|
-
|
|
1821
|
-
|
|
1822
|
-
|
|
1823
|
-
|
|
1824
|
-
|
|
1825
|
-
|
|
1826
|
-
|
|
1827
|
-
|
|
1828
|
-
{ src: 'P001', dst: 'P002', relationship: 'paidTo' },
|
|
1829
|
-
{ src: 'P002', dst: 'P003', relationship: 'paidTo' },
|
|
1830
|
-
{ src: 'P003', dst: 'P001', relationship: 'paidTo' }, // Closes the loop!
|
|
1831
|
-
{ src: 'P001', dst: 'PROV001', relationship: 'claimsWith' },
|
|
1832
|
-
{ src: 'P002', dst: 'PROV001', relationship: 'claimsWith' }
|
|
1833
|
-
]
|
|
1834
|
-
|
|
1835
|
-
// GraphFrame requires JSON strings
|
|
1836
|
-
const gf = new GraphFrame(JSON.stringify(vertices), JSON.stringify(edges))
|
|
1837
|
-
|
|
1838
|
-
// Detect triangles (fraud rings)
|
|
1839
|
-
const triangles = gf.triangleCount()
|
|
1840
|
-
console.log(`Fraud rings detected: ${triangles}`) // 1
|
|
1841
|
-
|
|
1842
|
-
// Find central actors with PageRank
|
|
1843
|
-
const pageRankJson = gf.pageRank(0.85, 20)
|
|
1844
|
-
const pageRank = JSON.parse(pageRankJson)
|
|
1845
|
-
console.log('Central actors:', pageRank.ranks)
|
|
838
|
+
// Without embeddings: only SPARQL + Datalog tools available
|
|
839
|
+
const basicAgent = new HyperMindAgent({ kg: db, model: 'claude-sonnet-4', apiKey })
|
|
840
|
+
|
|
841
|
+
// With embeddings: adds semantic search capability
|
|
842
|
+
const agent = new HyperMindAgent({
|
|
843
|
+
kg: db,
|
|
844
|
+
embeddings: new EmbeddingService(),
|
|
845
|
+
model: 'claude-sonnet-4',
|
|
846
|
+
apiKey
|
|
847
|
+
})
|
|
848
|
+
await agent.call('Find claims similar to CLM001') // Uses embeddings
|
|
1846
849
|
```
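
To make the similarity-search idea concrete, here is a minimal brute-force sketch using cosine similarity over an in-memory `Map`. It is an assumption about the general technique, not a description of how `EmbeddingService` is implemented; `cosine`, `findSimilarSketch`, and the three-dimensional toy vectors are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return normA === 0 || normB === 0 ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Brute-force top-k: score every other entity, drop scores below the
// threshold, and return the k best matches in descending order.
function findSimilarSketch(store, entityId, k, threshold) {
  const query = store.get(entityId)
  if (!query) return []
  return [...store.entries()]
    .filter(([id]) => id !== entityId)
    .map(([id, vec]) => ({ entity: id, score: cosine(query, vec) }))
    .filter(r => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}

// Toy vectors, for illustration only (real embeddings are much wider).
const store = new Map([
  ['CLM001', [1, 0, 0.37]],
  ['CLM002', [1, 0, 0.45]],
  ['CLM004', [0, 1, 0.06]]
])
const results = findSimilarSketch(store, 'CLM001', 5, 0.5)
console.log(results)
```

A production index would typically replace the linear scan with an approximate nearest-neighbour structure, but the scoring and threshold logic stay the same.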
|
|
1847
850
|
|
|
1848
|
-
|
|
1849
|
-
|
|
1850
|
-
Embeddings find claims with similar characteristics - useful for detecting patterns across different fraud schemes.
|
|
1851
|
-
|
|
1852
|
-
**Design Thinking:** Claims with similar profiles (same type, similar amounts, same provider type) cluster together in vector space.
|
|
851
|
+
**API Summary:**
|
|
1853
852
|
|
|
1854
|
-
```
|
|
1855
|
-
Vector Space Visualization:
|
|
1856
|
-
|
|
1857
|
-
High Amount
|
|
1858
|
-
│
|
|
1859
|
-
│ CLM001 (bodily injury, $18.5K)
|
|
1860
|
-
│ ●
|
|
1861
|
-
│ ╲ similarity: 0.815
|
|
1862
|
-
│ ╲
|
|
1863
|
-
│ ● CLM002 (bodily injury, $22.3K)
|
|
1864
|
-
│
|
|
1865
|
-
│ ● CLM003 (collision, $15.8K)
|
|
1866
|
-
Low Risk ─┼────────────────────────── High Risk
|
|
1867
|
-
│
|
|
1868
|
-
│ ● CLM005 (property, $3.2K)
|
|
1869
|
-
│
|
|
1870
|
-
Low Amount
|
|
1871
|
-
|
|
1872
|
-
Claims cluster by type + amount + risk.
|
|
1873
|
-
Similar claims = similar fraud patterns.
|
|
1874
|
-
```
|
|
1875
|
-
|
|
1876
|
-
**Code: Embedding Storage and Search**
|
|
1877
853
|
```javascript
|
|
1878
|
-
const
|
|
1879
|
-
|
|
1880
|
-
|
|
1881
|
-
|
|
1882
|
-
//
|
|
1883
|
-
|
|
1884
|
-
|
|
1885
|
-
const embedding = new Array(384).fill(0)
|
|
1886
|
-
|
|
1887
|
-
// Encode claim type (one-hot style in first dimensions)
|
|
1888
|
-
const typeIndex = { 'bodily_injury': 0, 'collision': 1, 'property': 2 }
|
|
1889
|
-
embedding[typeIndex[claimType] || 0] = 1.0
|
|
1890
|
-
|
|
1891
|
-
// Encode normalized values
|
|
1892
|
-
embedding[10] = amount / 50000 // Normalize amount
|
|
1893
|
-
embedding[11] = providerVolume / 1000 // Normalize provider volume
|
|
1894
|
-
embedding[12] = riskScore // Risk score (0-1)
|
|
1895
|
-
|
|
1896
|
-
// Add some variance for realistic embedding
|
|
1897
|
-
for (let i = 13; i < 384; i++) {
|
|
1898
|
-
embedding[i] = Math.sin(i * amount * 0.001) * 0.1
|
|
1899
|
-
}
|
|
1900
|
-
|
|
1901
|
-
return embedding
|
|
1902
|
-
}
|
|
1903
|
-
|
|
1904
|
-
// Store claim embeddings
|
|
1905
|
-
const claims = {
|
|
1906
|
-
'CLM001': { type: 'bodily_injury', amount: 18500, volume: 847, risk: 0.85 },
|
|
1907
|
-
'CLM002': { type: 'bodily_injury', amount: 22300, volume: 847, risk: 0.72 },
|
|
1908
|
-
'CLM003': { type: 'collision', amount: 15800, volume: 2341, risk: 0.45 },
|
|
1909
|
-
'CLM004': { type: 'property', amount: 3200, volume: 156, risk: 0.22 }
|
|
1910
|
-
}
|
|
1911
|
-
|
|
1912
|
-
Object.entries(claims).forEach(([id, profile]) => {
|
|
1913
|
-
const vec = generateClaimEmbedding(profile.type, profile.amount, profile.volume, profile.risk)
|
|
1914
|
-
embeddings.storeVector(id, vec)
|
|
854
|
+
const agent = new HyperMindAgent({
|
|
855
|
+
kg: db, // REQUIRED: Knowledge graph
|
|
856
|
+
embeddings: embSvc, // Optional: For similarity search + memory
|
|
857
|
+
model: 'claude-sonnet-4', // Optional: LLM for production accuracy
|
|
858
|
+
apiKey: 'sk-...', // Required if model is specified
|
|
859
|
+
name: 'fraud-detector', // Optional: Agent identity for memory
|
|
860
|
+
sandbox: { ... } // Optional: Security capabilities
|
|
1915
861
|
})
|
|
1916
|
-
|
|
1917
|
-
// Find claims similar to high-risk CLM001
|
|
1918
|
-
const similarJson = embeddings.findSimilar('CLM001', 5, 0.5)
|
|
1919
|
-
const similar = JSON.parse(similarJson)
|
|
1920
|
-
|
|
1921
|
-
similar.forEach(s => {
|
|
1922
|
-
if (s.entity !== 'CLM001') {
|
|
1923
|
-
console.log(`${s.entity}: similarity ${s.score.toFixed(3)}`)
|
|
1924
|
-
}
|
|
1925
|
-
})
|
|
1926
|
-
// CLM002: 0.815 (same type, similar amount)
|
|
1927
|
-
// CLM003: 0.679 (different type, but similar profile)
|
|
1928
862
|
```
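
The required/optional rules in the API summary comments can be checked before constructing an agent. The sketch below is a hypothetical helper (`validateAgentConfig` is not part of rust-kgdb) that encodes only the two constraints stated above: `kg` is required, and `apiKey` is required when `model` is set.

```javascript
// Hypothetical validator mirroring the comments in the API summary.
// Returns a list of human-readable problems; an empty list means OK.
function validateAgentConfig(config) {
  const errors = []
  if (!config || config.kg == null) {
    errors.push('kg is required')
  }
  if (config && config.model && !config.apiKey) {
    errors.push('apiKey is required when model is specified')
  }
  return errors
}

// A stand-in object is used for kg so the example runs without the
// native module installed.
const fakeDb = {}
console.log(validateAgentConfig({ kg: fakeDb }))
console.log(validateAgentConfig({ kg: fakeDb, model: 'claude-sonnet-4' }))
console.log(validateAgentConfig({ kg: fakeDb, model: 'claude-sonnet-4', apiKey: 'sk-...' }))
```

Running a check like this early surfaces a missing API key as a configuration error rather than as a failed LLM call at query time.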
|
|
1929
863
|
|
|
1930
|
-
|
|
864
|
+
---
|
|
1931
865
|
|
|
1932
|
-
|
|
866
|
+
## Benchmarks
|
|
1933
867
|
|
|
1934
|
-
|
|
868
|
+
### Test Environment
|
|
1935
869
|
|
|
1936
|
-
|
|
1937
|
-
NICB Fraud Detection Rules:
|
|
1938
|
-
|
|
1939
|
-
Rule 1: COLLUSION
|
|
1940
|
-
IF claimant(X) AND claimant(Y) AND
|
|
1941
|
-
provider(P) AND claims_with(X, P) AND
|
|
1942
|
-
claims_with(Y, P) AND knows(X, Y)
|
|
1943
|
-
THEN potential_collusion(X, Y, P)
|
|
1944
|
-
|
|
1945
|
-
Rule 2: ADDRESS FRAUD
|
|
1946
|
-
IF claimant(X) AND claimant(Y) AND
|
|
1947
|
-
same_address(X, Y) AND high_risk(X) AND high_risk(Y)
|
|
1948
|
-
THEN address_fraud_indicator(X, Y)
|
|
1949
|
-
|
|
1950
|
-
Inference Chain:
|
|
1951
|
-
claimant(P001) ┐
|
|
1952
|
-
claimant(P002) │
|
|
1953
|
-
provider(PROV001) │──▶ potential_collusion(P001, P002, PROV001)
|
|
1954
|
-
claims_with(P001,PROV001)│
|
|
1955
|
-
claims_with(P002,PROV001)│
|
|
1956
|
-
knows(P001, P002) ┘
|
|
1957
|
-
```
|
|
870
|
+
All benchmarks were run on **commodity hardware** (Intel Mac) using the InMemory storage backend.
|
|
1958
871
|
|
|
1959
|
-
|
|
1960
|
-
|
|
1961
|
-
|
|
1962
|
-
|
|
1963
|
-
|
|
1964
|
-
|
|
1965
|
-
|
|
1966
|
-
datalog.addFact(JSON.stringify({ predicate: 'claimant', terms: ['P001'] }))
|
|
1967
|
-
datalog.addFact(JSON.stringify({ predicate: 'claimant', terms: ['P002'] }))
|
|
1968
|
-
datalog.addFact(JSON.stringify({ predicate: 'provider', terms: ['PROV001'] }))
|
|
1969
|
-
datalog.addFact(JSON.stringify({ predicate: 'claims_with', terms: ['P001', 'PROV001'] }))
|
|
1970
|
-
datalog.addFact(JSON.stringify({ predicate: 'claims_with', terms: ['P002', 'PROV001'] }))
|
|
1971
|
-
datalog.addFact(JSON.stringify({ predicate: 'knows', terms: ['P001', 'P002'] }))
|
|
1972
|
-
datalog.addFact(JSON.stringify({ predicate: 'same_address', terms: ['P001', 'P002'] }))
|
|
1973
|
-
datalog.addFact(JSON.stringify({ predicate: 'high_risk', terms: ['P001'] }))
|
|
1974
|
-
datalog.addFact(JSON.stringify({ predicate: 'high_risk', terms: ['P002'] }))
|
|
1975
|
-
|
|
1976
|
-
// Add NICB-informed collusion rule
|
|
1977
|
-
datalog.addRule(JSON.stringify({
|
|
1978
|
-
head: { predicate: 'potential_collusion', terms: ['?X', '?Y', '?P'] },
|
|
1979
|
-
body: [
|
|
1980
|
-
{ predicate: 'claimant', terms: ['?X'] },
|
|
1981
|
-
{ predicate: 'claimant', terms: ['?Y'] },
|
|
1982
|
-
{ predicate: 'provider', terms: ['?P'] },
|
|
1983
|
-
{ predicate: 'claims_with', terms: ['?X', '?P'] },
|
|
1984
|
-
{ predicate: 'claims_with', terms: ['?Y', '?P'] },
|
|
1985
|
-
{ predicate: 'knows', terms: ['?X', '?Y'] }
|
|
1986
|
-
]
|
|
1987
|
-
}))
|
|
872
|
+
| Component | Specification |
|
|
873
|
+
|-----------|---------------|
|
|
874
|
+
| **Hardware** | Intel Mac (commodity laptop) |
|
|
875
|
+
| **Backend** | InMemoryBackend (zero-copy, no GC) |
|
|
876
|
+
| **Dataset** | [LUBM](http://swat.cse.lehigh.edu/projects/lubm/) (Lehigh University Benchmark) |
|
|
877
|
+
| **Triples** | 3,272 (LUBM-1 scale factor) |
|
|
878
|
+
| **Tool** | [Criterion.rs](https://github.com/bheisler/criterion.rs) statistical benchmarking |
|
|
1988
879
|
|
|
1989
|
-
|
|
1990
|
-
datalog.addRule(JSON.stringify({
|
|
1991
|
-
head: { predicate: 'address_fraud_indicator', terms: ['?X', '?Y'] },
|
|
1992
|
-
body: [
|
|
1993
|
-
{ predicate: 'claimant', terms: ['?X'] },
|
|
1994
|
-
{ predicate: 'claimant', terms: ['?Y'] },
|
|
1995
|
-
{ predicate: 'same_address', terms: ['?X', '?Y'] },
|
|
1996
|
-
{ predicate: 'high_risk', terms: ['?X'] },
|
|
1997
|
-
{ predicate: 'high_risk', terms: ['?Y'] }
|
|
1998
|
-
]
|
|
1999
|
-
}))
|
|
880
|
+
### Measured Performance (Our Benchmarks)
|
|
2000
881
|
|
|
2001
|
-
|
|
2002
|
-
|
|
2003
|
-
|
|
882
|
+
| Metric | Measured Value | Rate |
|
|
883
|
+
|--------|----------------|------|
|
|
884
|
+
| **Triple Lookup** | 2.78 µs | 359K lookups/sec |
|
|
885
|
+
| **Bulk Insert (100K)** | 682 ms | 146K triples/sec |
|
|
886
|
+
| **Dictionary Intern (new)** | 1.10 ms / 1K | 909K/sec |
|
|
887
|
+
| **Dictionary Lookup (cached)** | 60.4 µs / 100 | 1.65M/sec |
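
The Rate column follows from the measured times by dividing the operation count by the elapsed seconds. The sketch below reproduces that arithmetic from the figures in the table; the `measurements` array and `opsPerSecond` helper are illustrative names, and the observation that the published rates look truncated rather than rounded is an inference from the numbers.

```javascript
// Measured values copied from the table above.
const measurements = [
  { metric: 'Triple Lookup', ops: 1, seconds: 2.78e-6 },
  { metric: 'Bulk Insert (100K)', ops: 100000, seconds: 0.682 },
  { metric: 'Dictionary Intern (new)', ops: 1000, seconds: 1.10e-3 },
  { metric: 'Dictionary Lookup (cached)', ops: 100, seconds: 60.4e-6 }
]

// Throughput in operations per second.
const opsPerSecond = ({ ops, seconds }) => ops / seconds

// Whole thousands of operations per second, truncated (not rounded).
const rates = measurements.map(m => ({
  metric: m.metric,
  kOpsPerSec: Math.floor(opsPerSecond(m) / 1000)
}))

for (const r of rates) {
  console.log(`${r.metric}: ${r.kOpsPerSec}K ops/sec`)
}
```

The truncated values (359K, 146K, 909K, and 1,655K, i.e. about 1.65M) line up with the table, which suggests the published rates were derived this way.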
|
|
2004
888
|
|
|
2005
|
-
|
|
2006
|
-
// [["P001", "P002", "PROV001"]]
|
|
889
|
+
### Memory Efficiency
|
|
2007
890
|
|
|
2008
|
-
|
|
2009
|
-
|
|
2010
|
-
|
|
891
|
+
| Metric | Value | Calculation |
|
|
892
|
+
|--------|-------|-------------|
|
|
893
|
+
| **Bytes per Triple** | 24 bytes | 3 × 8-byte node references |
|
|
894
|
+
| **Index Overhead** | 4 indexes | SPOC, POCS, OCSP, CSPO |
|
|
2011
895
|
|
|
2012
|
-
###
|
|
896
|
+
### Comparison Context
|
|
2013
897
|
|
|
2014
|
-
|
|
898
|
+
RDFox numbers below are from [published academic papers](https://www.cs.ox.ac.uk/boris.motik/pubs/nmhdk17rdfox.pdf), not direct same-hardware benchmarks:
|
|
2015
899
|
|
|
2016
|
-
|
|
900
|
+
| Metric | rust-kgdb (measured) | RDFox (published) | Notes |
|
|
901
|
+
|--------|---------------------|-------------------|-------|
|
|
902
|
+
| **Lookup** | 2.78 µs | 100-500 µs | Different hardware/methodology |
|
|
903
|
+
| **Memory/Triple** | 24 bytes | 32 bytes | Structural comparison |
|
|
904
|
+
| **Bulk Insert** | 146K/sec | 200-300K/sec | RDFox faster on this metric |
|
|
2017
905
|
|
|
2018
|
-
|
|
2019
|
-
Agent Execution Flow:
|
|
906
|
+
**Honest assessment**: Our measured lookup latency is low, but RDFox has 15+ years of optimization and its figures come from different hardware and methodology. A direct comparison requires same-hardware benchmarks.
|
|
2020
907
|
|
|
2021
|
-
|
|
2022
|
-
│ HyperMindAgent.spawn() │
|
|
2023
|
-
│ │
|
|
2024
|
-
│ AgentSpec: { │
|
|
2025
|
-
│ name: "fraud-detector", │
|
|
2026
|
-
│ model: "claude-sonnet-4", │
|
|
2027
|
-
│ tools: [kg.sparql.query, kg.graphframe, kg.embeddings, │
|
|
2028
|
-
│ kg.datalog] │
|
|
2029
|
-
│ } │
|
|
2030
|
-
└─────────────────────┬───────────────────────────────────────────┘
|
|
2031
|
-
│
|
|
2032
|
-
▼
|
|
2033
|
-
┌─────────────────────────────────────────────────────────────────┐
|
|
2034
|
-
│ TOOL 1: kg.sparql.query │
|
|
2035
|
-
│ Type: SPARQLQuery → BindingSet │
|
|
2036
|
-
│ Input: "SELECT ?claimant WHERE { ?claimant :riskScore ?s . }" │
|
|
2037
|
-
│ Output: [{ claimant: "P001" }, { claimant: "P002" }] │
|
|
2038
|
-
└─────────────────────┬───────────────────────────────────────────┘
|
|
2039
|
-
│
|
|
2040
|
-
▼
|
|
2041
|
-
┌─────────────────────────────────────────────────────────────────┐
|
|
2042
|
-
│ TOOL 2: kg.graphframe.triangles │
|
|
2043
|
-
│ Type: Graph → TriangleCount │
|
|
2044
|
-
│ Input: 4 nodes, 5 edges │
|
|
2045
|
-
│ Output: 1 triangle (fraud ring indicator) │
|
|
2046
|
-
└─────────────────────┬───────────────────────────────────────────┘
|
|
2047
|
-
│
|
|
2048
|
-
▼
|
|
2049
|
-
┌─────────────────────────────────────────────────────────────────┐
|
|
2050
|
-
│ TOOL 3: kg.embeddings.search │
|
|
2051
|
-
│ Type: EntityId → List[SimilarEntity] │
|
|
2052
|
-
│ Input: "CLM001" │
|
|
2053
|
-
│ Output: [{entity:"CLM002", score:0.815}, ...] │
|
|
2054
|
-
└─────────────────────┬───────────────────────────────────────────┘
|
|
2055
|
-
│
|
|
2056
|
-
▼
|
|
2057
|
-
┌─────────────────────────────────────────────────────────────────┐
|
|
2058
|
-
│ TOOL 4: kg.datalog.infer │
|
|
2059
|
-
│ Type: DatalogProgram → InferredFacts │
|
|
2060
|
-
│ Input: 9 facts, 2 rules │
|
|
2061
|
-
│ Output: { collusion: [...], address_fraud: [...] } │
|
|
2062
|
-
└─────────────────────┬───────────────────────────────────────────┘
|
|
2063
|
-
│
|
|
2064
|
-
▼
|
|
2065
|
-
┌─────────────────────────────────────────────────────────────────┐
|
|
2066
|
-
│ EXECUTION WITNESS │
|
|
2067
|
-
│ │
|
|
2068
|
-
│ { │
|
|
2069
|
-
│ "agent": "fraud-detector", │
|
|
2070
|
-
│ "timestamp": "2024-12-14T22:41:34.077Z", │
|
|
2071
|
-
│ "tools_executed": 4, │
|
|
2072
|
-
│ "findings": { │
|
|
2073
|
-
│ "triangles": 1, │
|
|
2074
|
-
│ "collusions": 1, │
|
|
2075
|
-
│ "addressFraud": 1 │
|
|
2076
|
-
│ }, │
|
|
2077
|
-
│ "proof_hash": "sha256:000000005330d147" │
|
|
2078
|
-
│ } │
|
|
2079
|
-
└─────────────────────────────────────────────────────────────────┘
|
|
2080
|
-
```
|
|
908
|
+
### HyperMind Agent Accuracy
|
|
2081
909
|
|
|
2082
|
-
|
|
2083
|
-
```javascript
|
|
2084
|
-
const { HyperMindAgent } = require('rust-kgdb/hypermind-agent')
|
|
2085
|
-
const { GraphDB, GraphFrame, EmbeddingService, DatalogProgram, evaluateDatalog } = require('rust-kgdb')
|
|
2086
|
-
|
|
2087
|
-
async function runFraudDetectionAgent() {
|
|
2088
|
-
// Step 1: Initialize Knowledge Graph
|
|
2089
|
-
const db = new GraphDB('http://insurance.org/fraud-kb')
|
|
2090
|
-
db.loadTtl(FRAUD_ONTOLOGY, 'http://insurance.org/fraud-kb')
|
|
2091
|
-
|
|
2092
|
-
// Step 2: Spawn Agent
|
|
2093
|
-
const agent = await HyperMindAgent.spawn({
|
|
2094
|
-
name: 'fraud-detector',
|
|
2095
|
-
model: process.env.ANTHROPIC_API_KEY ? 'claude-sonnet-4' : 'mock',
|
|
2096
|
-
tools: ['kg.sparql.query', 'kg.graphframe', 'kg.embeddings.search', 'kg.datalog.apply'],
|
|
2097
|
-
tracing: true
|
|
2098
|
-
})
|
|
910
|
+
Tested on the LUBM dataset with 11 hard query scenarios:
|
|
2099
911
|
|
|
2100
|
-
|
|
2101
|
-
|
|
912
|
+
| Approach | Valid SPARQL Generated | Why |
|
|
913
|
+
|----------|------------------------|-----|
|
|
914
|
+
| **Vanilla LLM** | 0% | Markdown fences, hallucinated predicates |
|
|
915
|
+
| **HyperMind + Schema** | 86.4% avg | Schema injection, type contracts |
|
|
2102
916
|
|
|
2103
|
-
|
|
2104
|
-
const highRisk = db.querySelect(`
|
|
2105
|
-
SELECT ?claimant ?score WHERE {
|
|
2106
|
-
?claimant <http://insurance.org/riskScore> ?score .
|
|
2107
|
-
FILTER(?score > 0.7)
|
|
2108
|
-
}
|
|
2109
|
-
`)
|
|
2110
|
-
findings.highRiskClaimants = highRisk.length
|
|
2111
|
-
|
|
2112
|
-
// Tool 2: Detect fraud rings
|
|
2113
|
-
const gf = new GraphFrame(JSON.stringify(vertices), JSON.stringify(edges))
|
|
2114
|
-
findings.triangles = gf.triangleCount()
|
|
2115
|
-
|
|
2116
|
-
// Tool 3: Find similar claims
|
|
2117
|
-
const embeddings = new EmbeddingService()
|
|
2118
|
-
// ... store vectors ...
|
|
2119
|
-
const similar = JSON.parse(embeddings.findSimilar('CLM001', 5, 0.5))
|
|
2120
|
-
findings.similarClaims = similar.length
|
|
2121
|
-
|
|
2122
|
-
// Tool 4: Infer collusion patterns
|
|
2123
|
-
const datalog = new DatalogProgram()
|
|
2124
|
-
// ... add facts and rules ...
|
|
2125
|
-
const inferred = JSON.parse(evaluateDatalog(datalog))
|
|
2126
|
-
findings.collusions = (inferred.potential_collusion || []).length
|
|
2127
|
-
findings.addressFraud = (inferred.address_fraud_indicator || []).length
|
|
2128
|
-
|
|
2129
|
-
// Step 4: Generate Execution Witness
|
|
2130
|
-
const witness = {
|
|
2131
|
-
agent: agent.getName(),
|
|
2132
|
-
model: agent.getModel(),
|
|
2133
|
-
timestamp: new Date().toISOString(),
|
|
2134
|
-
findings,
|
|
2135
|
-
proof_hash: `sha256:${Date.now().toString(16)}`
|
|
2136
|
-
}
|
|
917
|
+
**Models tested**: Claude Sonnet 4 (90.9%), GPT-4o (81.8%)
|
|
2137
918
|
|
|
2138
|
-
|
|
2139
|
-
}
|
|
2140
|
-
```
|
|
919
|
+
**Methodology**: [HYPERMIND_BENCHMARK_REPORT.md](./HYPERMIND_BENCHMARK_REPORT.md)
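
The reported percentages are consistent with pass counts out of 11 scenarios. The per-model counts below (10 and 9) are an assumption inferred from 90.9% and 81.8%; the README states only the percentages, so see the benchmark report for the actual tallies.

```javascript
const SCENARIOS = 11

// Assumed pass counts, inferred from the reported percentages.
const passed = { 'Claude Sonnet 4': 10, 'GPT-4o': 9 }

// Percentage rounded to one decimal place.
const toPercent = fraction => Math.round(fraction * 1000) / 10

const perModel = Object.fromEntries(
  Object.entries(passed).map(([model, n]) => [model, toPercent(n / SCENARIOS)])
)

// Average of the unrounded per-model success rates.
const counts = Object.values(passed)
const average = toPercent(
  counts.reduce((sum, n) => sum + n / SCENARIOS, 0) / counts.length
)

console.log(perModel)
console.log(`Average: ${average}%`)
```

Under these assumed counts the two models come out at 90.9% and 81.8%, and their mean at 86.4%, matching the table.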
|
|
2141
920
|
|
|
2142
|
-
### Run
|
|
921
|
+
### Run Benchmarks Yourself
|
|
2143
922
|
|
|
2144
923
|
```bash
|
|
2145
|
-
#
|
|
2146
|
-
|
|
2147
|
-
|
|
2148
|
-
# Underwriting Agent (full pipeline)
|
|
2149
|
-
node examples/underwriting-agent.js
|
|
2150
|
-
|
|
2151
|
-
# With real LLM (Anthropic)
|
|
2152
|
-
ANTHROPIC_API_KEY=sk-ant-... node examples/fraud-detection-agent.js
|
|
924
|
+
# Database benchmarks (requires Rust)
|
|
925
|
+
cargo bench --package storage --bench triple_store_benchmark
|
|
2153
926
|
|
|
2154
|
-
#
|
|
2155
|
-
|
|
927
|
+
# HyperMind agent benchmarks
|
|
928
|
+
node hypermind-benchmark.js
|
|
2156
929
|
```
|
|
2157
930
|
|
|
2158
|
-
|
|
931
|
+
---
|
|
2159
932
|
|
|
2160
|
-
|
|
2161
|
-
|
|
2162
|
-
|
|
2163
|
-
|
|
2164
|
-
|
|
2165
|
-
|
|
2166
|
-
|
|
2167
|
-
|
|
2168
|
-
|
|
2169
|
-
│ ▼ │
|
|
2170
|
-
│ ┌─────────────────┐ │
|
|
2171
|
-
│ │ Knowledge Graph │ RDF/Turtle ontology with NICB patterns │
|
|
2172
|
-
│ │ (GraphDB) │ Claims, claimants, providers, relationships │
|
|
2173
|
-
│ └────────┬────────┘ │
|
|
2174
|
-
│ │ │
|
|
2175
|
-
│ ┌────────┴────────────────────────────────────────────┐ │
|
|
2176
|
-
│ │ │ │
|
|
2177
|
-
│ ▼ ▼ ▼ │
|
|
2178
|
-
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
|
|
2179
|
-
│ │ GraphFrame │ │ Embeddings │ │ Datalog │ │
|
|
2180
|
-
│ │ (Structure) │ │ (Semantics) │ │ (Rules) │ │
|
|
2181
|
-
│ │ │ │ │ │ │ │
|
|
2182
|
-
│ │ • Triangles │ │ • Similar │ │ • Collusion rule │ │
|
|
2183
|
-
│ │ • PageRank │ │ claims │ │ • Address fraud │ │
|
|
2184
|
-
│ │ • Components │ │ • Clustering │ │ • Custom rules │ │
|
|
2185
|
-
│ └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘ │
|
|
2186
|
-
│ │ │ │ │
|
|
2187
|
-
│ └──────────────────┼─────────────────────┘ │
|
|
2188
|
-
│ │ │
|
|
2189
|
-
│ ▼ │
|
|
2190
|
-
│ ┌─────────────────┐ │
|
|
2191
|
-
│ │ HyperMind Agent│ │
|
|
2192
|
-
│ │ Composition │ │
|
|
2193
|
-
│ │ │ │
|
|
2194
|
-
│ │ Type-safe tools │ │
|
|
2195
|
-
│ │ Execution proof │ │
|
|
2196
|
-
│ │ Audit trail │ │
|
|
2197
|
-
│ └────────┬────────┘ │
|
|
2198
|
-
│ │ │
|
|
2199
|
-
│ ▼ │
|
|
2200
|
-
│ ┌─────────────────┐ │
|
|
2201
|
-
│ │ ExecutionWitness│ │
|
|
2202
|
-
│ │ │ │
|
|
2203
|
-
│ │ • SHA-256 hash │ │
|
|
2204
|
-
│ │ • Timestamp │ │
|
|
2205
|
-
│ │ • Tool trace │ │
|
|
2206
|
-
│ │ • Findings │ │
|
|
2207
|
-
│ └─────────────────┘ │
|
|
2208
|
-
│ │
|
|
2209
|
-
│ RESULT: Auditable, provable, type-safe fraud detection │
|
|
2210
|
-
└──────────────────────────────────────────────────────────────────────────────┘
|
|
2211
|
-
```
|
|
933
|
+
## W3C Standards Compliance
|
|
934
|
+
|
|
935
|
+
| Standard | Status | Specification |
|
|
936
|
+
|----------|--------|---------------|
|
|
937
|
+
| **SPARQL 1.1 Query** | 100% | [W3C Rec](https://www.w3.org/TR/sparql11-query/) |
|
|
938
|
+
| **SPARQL 1.1 Update** | 100% | [W3C Rec](https://www.w3.org/TR/sparql11-update/) |
|
|
939
|
+
| **RDF 1.2** | 100% | [W3C Draft](https://www.w3.org/TR/rdf12-concepts/) |
|
|
940
|
+
| **Turtle** | 100% | [W3C Rec](https://www.w3.org/TR/turtle/) |
|
|
941
|
+
| **N-Triples** | 100% | [W3C Rec](https://www.w3.org/TR/n-triples/) |
|
|
2212
942
|
|
|
2213
|
-
|
|
943
|
+
**64 SPARQL Builtin Functions** implemented:
|
|
944
|
+
- String: `STR`, `CONCAT`, `SUBSTR`, `STRLEN`, `REGEX`, `REPLACE`, etc.
|
|
945
|
+
- Numeric: `ABS`, `ROUND`, `CEIL`, `FLOOR`, `RAND`
|
|
946
|
+
- Date/Time: `NOW`, `YEAR`, `MONTH`, `DAY`, `HOURS`, `MINUTES`, `SECONDS`
|
|
947
|
+
- Hash: `MD5`, `SHA1`, `SHA256`, `SHA384`, `SHA512`
|
|
948
|
+
- Aggregates: `COUNT`, `SUM`, `AVG`, `MIN`, `MAX`, `GROUP_CONCAT`
|
|
2214
949
|
|
|
2215
950
|
---
|
|
2216
951
|
|
|
@@ -2220,8 +955,8 @@ This is the power of HyperMind: **every step is typed, every execution is witnes
|
|
|
2220
955
|
|
|
2221
956
|
```typescript
|
|
2222
957
|
class GraphDB {
|
|
2223
|
-
constructor(
|
|
2224
|
-
loadTtl(
|
|
958
|
+
constructor(appGraphUri: string)
|
|
959
|
+
loadTtl(ttlContent: string, graphName: string | null): void
|
|
2225
960
|
querySelect(sparql: string): QueryResult[]
|
|
2226
961
|
query(sparql: string): TripleResult[]
|
|
2227
962
|
countTriples(): number
|
|
@@ -2235,15 +970,15 @@ class GraphDB {
|
|
|
2235
970
|
```typescript
|
|
2236
971
|
class GraphFrame {
|
|
2237
972
|
constructor(verticesJson: string, edgesJson: string)
|
|
2238
|
-
vertexCount(): number
|
|
2239
|
-
edgeCount(): number
|
|
2240
973
|
pageRank(resetProb: number, maxIter: number): string
|
|
2241
974
|
connectedComponents(): string
|
|
2242
975
|
shortestPaths(landmarks: string[]): string
|
|
2243
|
-
labelPropagation(maxIter: number): string
|
|
2244
976
|
triangleCount(): number
|
|
2245
|
-
find(pattern: string): string
|
|
977
|
+
find(pattern: string): string // Motif finding
|
|
2246
978
|
}
|
|
979
|
+
|
|
980
|
+
// Factory functions
|
|
981
|
+
friendsGraph(), chainGraph(n), starGraph(n), completeGraph(n), cycleGraph(n)
|
|
2247
982
|
```
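The `GraphFrame` constructor takes its vertices and edges as JSON strings. The sketch below builds those two strings in the `{id}` / `{src, dst}` shape used by the earlier README example, and adds a naive pure-JavaScript triangle counter to show what `triangleCount()` measures. `countTriangles` is a hypothetical helper written for this illustration (it treats edges as undirected); it is not the native implementation.

```javascript
// Vertex and edge payloads in the shape used by the earlier README example.
const vertices = [{ id: 'P001' }, { id: 'P002' }, { id: 'P003' }]
const edges = [
  { src: 'P001', dst: 'P002' },
  { src: 'P002', dst: 'P003' },
  { src: 'P003', dst: 'P001' }
]

// These strings are what would be handed to `new GraphFrame(verticesJson, edgesJson)`.
const verticesJson = JSON.stringify(vertices)
const edgesJson = JSON.stringify(edges)

// Hypothetical reference counter: checks every vertex triple, O(n^3).
// Edges are treated as undirected, which is an assumption of this sketch.
function countTriangles(vs, es) {
  const neighbours = new Map(vs.map(v => [v.id, new Set()]))
  for (const { src, dst } of es) {
    neighbours.get(src).add(dst)
    neighbours.get(dst).add(src)
  }
  const ids = vs.map(v => v.id)
  let count = 0
  for (let i = 0; i < ids.length; i++) {
    for (let j = i + 1; j < ids.length; j++) {
      if (!neighbours.get(ids[i]).has(ids[j])) continue
      for (let k = j + 1; k < ids.length; k++) {
        if (neighbours.get(ids[i]).has(ids[k]) && neighbours.get(ids[j]).has(ids[k])) {
          count++
        }
      }
    }
  }
  return count
}

console.log(countTriangles(vertices, edges)) // 1: the single payment cycle
```

A payment ring such as P001 → P002 → P003 → P001 is exactly the kind of structure a triangle count surfaces.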
|
|
2248
983
|
|
|
2249
984
|
### EmbeddingService
|
|
@@ -2251,13 +986,11 @@ class GraphFrame {
|
|
|
2251
986
|
```typescript
|
|
2252
987
|
class EmbeddingService {
|
|
2253
988
|
constructor()
|
|
2254
|
-
isEnabled(): boolean
|
|
2255
989
|
storeVector(entityId: string, vector: number[]): void
|
|
2256
990
|
getVector(entityId: string): number[] | null
|
|
2257
991
|
findSimilar(entityId: string, k: number, threshold: number): string
|
|
2258
992
|
rebuildIndex(): void
|
|
2259
|
-
|
|
2260
|
-
findSimilarComposite(entityId: string, k: number, threshold: number, strategy: string): string
|
|
993
|
+
onTripleInsert(subject: string, predicate: string, object: string, graph: string | null): void
|
|
2261
994
|
}
|
|
2262
995
|
```
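To make the `findSimilar(entityId, k, threshold)` parameters concrete, here is a small in-memory model. It assumes cosine similarity, treats `threshold` as a minimum score, and returns a JSON string of `{id, score}` hits; all three are assumptions of this sketch, since the signature above only tells us that a string comes back. The entity IDs are made up.

```javascript
// Hypothetical in-memory model of storeVector / findSimilar.
// The real EmbeddingService is native and may use a different metric or index.
const vectors = new Map()

function storeVector(entityId, vector) {
  vectors.set(entityId, vector)
}

function cosine(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Return up to k neighbours whose score is at least `threshold`, best first.
function findSimilar(entityId, k, threshold) {
  const query = vectors.get(entityId)
  if (!query) return JSON.stringify([])
  const hits = []
  for (const [id, vector] of vectors) {
    if (id === entityId) continue
    const score = cosine(query, vector)
    if (score >= threshold) hits.push({ id, score })
  }
  hits.sort((x, y) => y.score - x.score)
  return JSON.stringify(hits.slice(0, k))
}

storeVector('claim:1', [1, 0, 0])
storeVector('claim:2', [0.9, 0.1, 0])
storeVector('claim:3', [0, 1, 0])

const similar = JSON.parse(findSimilar('claim:1', 5, 0.5))
console.log(similar.map(hit => hit.id)) // [ 'claim:2' ]
```

`claim:3` is orthogonal to the query vector, so it falls below the threshold and is dropped even though `k` would have allowed it.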
|
|
2263
996
|
|
|
@@ -2268,365 +1001,204 @@ class DatalogProgram {
|
|
|
2268
1001
|
constructor()
|
|
2269
1002
|
addFact(factJson: string): void
|
|
2270
1003
|
addRule(ruleJson: string): void
|
|
2271
|
-
factCount(): number
|
|
2272
|
-
ruleCount(): number
|
|
2273
1004
|
}
|
|
2274
|
-
|
|
2275
1005
|
function evaluateDatalog(program: DatalogProgram): string
|
|
2276
1006
|
function queryDatalog(program: DatalogProgram, predicate: string): string
|
|
2277
1007
|
```
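`addFact` and `addRule` both take JSON strings. The sketch below builds them in the `{predicate, terms}` and `{head, body}` shape used by the earlier README example, with `?`-prefixed variables. The helper names `fact` and `rule` are hypothetical conveniences, and the block deliberately stops short of calling the native module so it runs without the package installed.

```javascript
// Hypothetical helpers; only the JSON shape comes from the earlier README example.
const fact = (predicate, ...terms) => JSON.stringify({ predicate, terms })
const rule = (head, body) => JSON.stringify({ head, body })

const facts = [
  fact('claim', 'CLM001', 'P001', 'PROV001'),
  fact('claim', 'CLM002', 'P002', 'PROV001'),
  fact('related', 'P001', 'P002')
]

// collusion(?P1, ?P2, ?Prov) :-
//   claim(?C1, ?P1, ?Prov), claim(?C2, ?P2, ?Prov), related(?P1, ?P2)
const collusionRule = rule(
  { predicate: 'collusion', terms: ['?P1', '?P2', '?Prov'] },
  [
    { predicate: 'claim', terms: ['?C1', '?P1', '?Prov'] },
    { predicate: 'claim', terms: ['?C2', '?P2', '?Prov'] },
    { predicate: 'related', terms: ['?P1', '?P2'] }
  ]
)

// With rust-kgdb installed, the strings would be passed on roughly like this:
//   const program = new DatalogProgram()
//   facts.forEach(f => program.addFact(f))
//   program.addRule(collusionRule)
//   const inferred = queryDatalog(program, 'collusion')

console.log(facts.length, JSON.parse(collusionRule).body.length) // 3 3
```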
|
|
2278
1008
|
|
|
2279
|
-
|
|
2280
|
-
|
|
2281
|
-
## Architecture
|
|
1009
|
+
### HyperMindAgent
|
|
2282
1010
|
|
|
2283
|
-
```
|
|
2284
|
-
|
|
2285
|
-
|
|
2286
|
-
|
|
2287
|
-
|
|
2288
|
-
|
|
2289
|
-
|
|
2290
|
-
|
|
2291
|
-
|
|
2292
|
-
|
|
2293
|
-
|
|
2294
|
-
|
|
2295
|
-
|
|
2296
|
-
|
|
2297
|
-
|
|
2298
|
-
│ InMemory │ RocksDB │ LMDB │ SPOC Indexes │ Dictionary │
|
|
2299
|
-
├──────────────────────────────────────────────────────────────────┤
|
|
2300
|
-
│ Distribution Layer │
|
|
2301
|
-
│ HDRF Partitioning │ Raft Consensus │ gRPC │ Kubernetes │
|
|
2302
|
-
└──────────────────────────────────────────────────────────────────┘
|
|
2303
|
-
```
|
|
2304
|
-
|
|
2305
|
-
---
|
|
1011
|
+
```typescript
|
|
1012
|
+
class HyperMindAgent {
|
|
1013
|
+
constructor(config: {
|
|
1014
|
+
kg: GraphDB | SchemaAwareGraphDB, // REQUIRED
|
|
1015
|
+
embeddings?: EmbeddingService, // Optional: for similarity search
|
|
1016
|
+
model?: string, // Optional: 'claude-sonnet-4', 'gpt-4o'
|
|
1017
|
+
apiKey?: string, // Required if model specified
|
|
1018
|
+
name?: string, // Default: 'hypermind-agent'
|
|
1019
|
+
memory?: MemoryManager, // Optional: session persistence
|
|
1020
|
+
scope?: AgentScope, // Optional: access control
|
|
1021
|
+
sandbox?: { // Default: secure (ReadKG, ExecuteTool)
|
|
1022
|
+
capabilities: string[], // 'ReadKG', 'WriteKG', 'ExecuteTool', 'SpawnAgent', 'HttpAccess'
|
|
1023
|
+
fuelLimit: number // CPU budget (default: 1_000_000)
|
|
1024
|
+
}
|
|
1025
|
+
})
|
|
2306
1026
|
|
|
2307
|
-
|
|
1027
|
+
call(prompt: string): Promise<{
|
|
1028
|
+
answer: string,
|
|
1029
|
+
explanation: { tools_used: string[], sparql_queries: string[] },
|
|
1030
|
+
proof: { hash: string, type: string, derivation: object[] }
|
|
1031
|
+
}>
|
|
2308
1032
|
|
|
2309
|
-
|
|
2310
|
-
|
|
2311
|
-
|
|
2312
|
-
║ "It works on my laptop" is not a deployment strategy. ║
|
|
2313
|
-
║ "The LLM usually gets it right" is not acceptable for compliance. ║
|
|
2314
|
-
║ "We'll fix it in production" is how companies get fined. ║
|
|
2315
|
-
║ ║
|
|
2316
|
-
╠═══════════════════════════════════════════════════════════════════════════════╣
|
|
2317
|
-
║ ║
|
|
2318
|
-
║ VIBE CODING (LangChain, AutoGPT, etc.): ║
|
|
2319
|
-
║ ║
|
|
2320
|
-
║ • "Let's just call the LLM and hope" → 0% SPARQL accuracy ║
|
|
2321
|
-
║ • "Tools are just functions" → Runtime type errors ║
|
|
2322
|
-
║ • "We'll add validation later" → Production failures ║
|
|
2323
|
-
║ • "The AI will figure it out" → Infinite loops ║
|
|
2324
|
-
║ • "We don't need proofs" → No audit trail ║
|
|
2325
|
-
║ ║
|
|
2326
|
-
║ Result: Fails FDA, SOX, GDPR audits. Gets you fired. ║
|
|
2327
|
-
║ ║
|
|
2328
|
-
╠═══════════════════════════════════════════════════════════════════════════════╣
|
|
2329
|
-
║ ║
|
|
2330
|
-
║ HYPERMIND (Mathematical Foundations): ║
|
|
2331
|
-
║ ║
|
|
2332
|
-
║ • Type Theory: Errors caught at compile-time → 86.4% SPARQL accuracy ║
|
|
2333
|
-
║ • Category Theory: Morphism composition → No runtime type errors ║
|
|
2334
|
-
║ • Proof Theory: ExecutionWitness for every call → Full audit trail ║
|
|
2335
|
-
║ • WASM Sandbox: Isolated execution → Zero attack surface ║
|
|
2336
|
-
║ • WCOJ Algorithm: Optimal joins → Predictable performance ║
|
|
2337
|
-
║ ║
|
|
2338
|
-
║ Result: Passes audits. Ships to production. Keeps your job. ║
|
|
2339
|
-
║ ║
|
|
2340
|
-
╚═══════════════════════════════════════════════════════════════════════════════╝
|
|
1033
|
+
addRule(name: string, rule: object): void
|
|
1034
|
+
getAuditLog(): object[]
|
|
1035
|
+
}
|
|
2341
1036
|
```
|
|
2342
1037
|
|
|
2343
|
-
|
|
2344
|
-
|
|
2345
|
-
## On AGI, Prompt Optimization, and Mathematical Foundations
|
|
1038
|
+
### SchemaAwareGraphDB
|
|
2346
1039
|
|
|
2347
|
-
|
|
1040
|
+
```typescript
|
|
1041
|
+
class SchemaAwareGraphDB {
|
|
1042
|
+
constructor(baseUriOrDb: string | GraphDB, options?: {
|
|
1043
|
+
autoExtract?: boolean, // Default: true - extract schema on load
|
|
1044
|
+
ontology?: string // Optional: TTL ontology to use
|
|
1045
|
+
})
|
|
2348
1046
|
|
|
2349
|
-
|
|
1047
|
+
// All GraphDB methods available (loadTtl, querySelect, etc.)
|
|
1048
|
+
loadTtl(data: string, graphUri: string | null): void
|
|
1049
|
+
querySelect(sparql: string): QueryResult[]
|
|
2350
1050
|
|
|
2351
|
-
|
|
1051
|
+
// Schema-specific methods
|
|
1052
|
+
waitForSchema(timeoutMs?: number): Promise<SchemaContext>
|
|
1053
|
+
getSchema(): SchemaContext | null
|
|
1054
|
+
refreshSchema(): Promise<void>
|
|
1055
|
+
}
|
|
2352
1056
|
|
|
2353
|
-
|
|
2354
|
-
|
|
2355
|
-
|
|
1057
|
+
// Factory functions
|
|
1058
|
+
function createSchemaAwareGraphDB(baseUri: string, options?: object): SchemaAwareGraphDB
|
|
1059
|
+
function wrapWithSchemaAwareness(db: GraphDB, options?: object): SchemaAwareGraphDB
|
|
2356
1060
|
```
|
|
2357
1061
|
|
|
2358
|
-
###
|
|
1062
|
+
### SchemaContext
|
|
2359
1063
|
|
|
2360
|
-
|
|
2361
|
-
|
|
2362
|
-
|
|
2363
|
-
|
|
2364
|
-
|
|
2365
|
-
|
|
2366
|
-
|
|
2367
|
-
|
|
2368
|
-
|
|
2369
|
-
|
|
2370
|
-
│ Problem: Cannot explain WHY it works │
|
|
2371
|
-
└─────────────────────────────────────────────────────────────┘
|
|
2372
|
-
|
|
2373
|
-
HyperMind Approach:
|
|
2374
|
-
┌─────────────────────────────────────────────────────────────┐
|
|
2375
|
-
│ Type signature → Morphism composition → Proven output │
|
|
2376
|
-
│ │
|
|
2377
|
-
│ Guarantee: Type A in → Type B out (always) │
|
|
2378
|
-
│ Guarantee: Composition laws hold (associativity, id) │
|
|
2379
|
-
│ Guarantee: Execution witness (proof of correctness) │
|
|
2380
|
-
│ Guarantee: Explainable via Curry-Howard correspondence │
|
|
2381
|
-
└─────────────────────────────────────────────────────────────┘
|
|
1064
|
+
```typescript
|
|
1065
|
+
class SchemaContext {
|
|
1066
|
+
objects: string[] // Classes (category objects)
|
|
1067
|
+
morphisms: string[] // Properties (category morphisms)
|
|
1068
|
+
examples: object[] // Sample triples for LLM context
|
|
1069
|
+
|
|
1070
|
+
static fromKG(db: GraphDB): SchemaContext
|
|
1071
|
+
static fromOntology(db: GraphDB, ontologyTtl: string): SchemaContext
|
|
1072
|
+
static merge(...contexts: SchemaContext[]): SchemaContext
|
|
1073
|
+
}
|
|
2382
1074
|
```
|
|
2383
1075
|
|
|
2384
|
-
###
|
|
1076
|
+
### LLMPlanner
|
|
2385
1077
|
|
|
2386
|
-
|
|
2387
|
-
|
|
2388
|
-
|
|
2389
|
-
|
|
2390
|
-
|
|
2391
|
-
|
|
2392
|
-
|
|
2393
|
-
**The hard truth:**
|
|
1078
|
+
```typescript
|
|
1079
|
+
class LLMPlanner {
|
|
1080
|
+
constructor(config: {
|
|
1081
|
+
kg: GraphDB,
|
|
1082
|
+
model?: string, // 'claude-sonnet-4', 'gpt-4o', etc.
|
|
1083
|
+
apiKey?: string
|
|
1084
|
+
})
|
|
2394
1085
|
|
|
2395
|
-
|
|
2396
|
-
|
|
2397
|
-
|
|
2398
|
-
|
|
2399
|
-
× That the result satisfies business constraints
|
|
2400
|
-
× That the execution is deterministic
|
|
2401
|
-
|
|
2402
|
-
HyperMind PROVES:
|
|
2403
|
-
✓ Tool chains form valid morphism compositions
|
|
2404
|
-
✓ Types are checked at compile-time (Hindley-Milner)
|
|
2405
|
-
✓ Business constraints are refinement types
|
|
2406
|
-
✓ Every execution has a cryptographic witness
|
|
1086
|
+
extractSchema(): { predicates: string[], classes: string[], examples: object[] }
|
|
1087
|
+
classify(prompt: string): Promise<{ intent: string, confidence: number }>
|
|
1088
|
+
generateSparql(prompt: string, intent: string): Promise<string>
|
|
1089
|
+
}
|
|
2407
1090
|
```
|
|
2408
1091
|
|
|
2409
|
-
###
|
|
1092
|
+
### MemoryManager
|
|
2410
1093
|
|
|
2411
|
-
|
|
2412
|
-
|
|
1094
|
+
```typescript
|
|
1095
|
+
class MemoryManager {
|
|
1096
|
+
constructor(config?: {
|
|
1097
|
+
workingMemorySize?: number, // Default: 10
|
|
1098
|
+
episodicRetentionDays?: number, // Default: 30
|
|
1099
|
+
longTermGraph?: string // Default: 'http://memory.hypermind.ai/'
|
|
1100
|
+
})
|
|
2413
1101
|
|
|
1102
|
+
storeEpisode(episode: object): void
|
|
1103
|
+
recall(query: string, limit?: number): object[]
|
|
1104
|
+
getWorkingMemory(): object[]
|
|
1105
|
+
clearWorkingMemory(): void
|
|
1106
|
+
}
|
|
2414
1107
|
```
|
|
2415
|
-
DSPy: P(correct | prompt, examples) ≈ 0.85 (probabilistic)
|
|
2416
|
-
HyperMind: ∀x:A. f(x):B (universal quantifier - ALWAYS)
|
|
2417
|
-
```
|
|
2418
|
-
|
|
2419
|
-
This isn't academic distinction. When your fraud detection system flags 15 suspicious patterns, the regulator asks: *"How do you know these are correct?"*
|
|
2420
1108
|
|
|
2421
|
-
|
|
2422
|
-
- **HyperMind answer**: "Here's the ExecutionWitness with SHA-256 hash, timestamp, and full type derivation"
|
|
1109
|
+
### AgentScope
|
|
2423
1110
|
|
|
2424
|
-
|
|
2425
|
-
|
|
2426
|
-
|
|
2427
|
-
|
|
2428
|
-
|
|
2429
|
-
|
|
2430
|
-
|
|
2431
|
-
|
|
2432
|
-
```python
|
|
2433
|
-
# DSPy: Statistically optimized prompt - NO guarantees
|
|
2434
|
-
|
|
2435
|
-
import dspy
|
|
2436
|
-
|
|
2437
|
-
class FraudDetector(dspy.Signature):
|
|
2438
|
-
"""Find fraud patterns in claims data."""
|
|
2439
|
-
claims_data = dspy.InputField()
|
|
2440
|
-
fraud_patterns = dspy.OutputField()
|
|
2441
|
-
|
|
2442
|
-
class FraudPipeline(dspy.Module):
|
|
2443
|
-
def __init__(self):
|
|
2444
|
-
self.detector = dspy.ChainOfThought(FraudDetector)
|
|
2445
|
-
|
|
2446
|
-
def forward(self, claims):
|
|
2447
|
-
return self.detector(claims_data=claims)
|
|
2448
|
-
|
|
2449
|
-
# "Optimize" via statistical fitting
|
|
2450
|
-
optimizer = dspy.BootstrapFewShot(metric=some_metric)
|
|
2451
|
-
optimized = optimizer.compile(FraudPipeline(), trainset=examples)
|
|
2452
|
-
|
|
2453
|
-
# Call and HOPE it works
|
|
2454
|
-
result = optimized(claims="[claim data here]")
|
|
1111
|
+
```typescript
|
|
1112
|
+
class AgentScope {
|
|
1113
|
+
constructor(config?: {
|
|
1114
|
+
allowedGraphs?: string[], // null = all graphs
|
|
1115
|
+
allowedPredicates?: string[], // null = all predicates
|
|
1116
|
+
maxResultSize?: number // Default: 10000
|
|
1117
|
+
})
|
|
2455
1118
|
|
|
2456
|
-
|
|
2457
|
-
|
|
2458
|
-
|
|
2459
|
-
# ❌ No audit trail - "it said fraud" is not compliance
|
|
1119
|
+
checkAccess(graph: string, predicate: string): boolean
|
|
1120
|
+
enforceLimit(results: any[]): any[]
|
|
1121
|
+
}
|
|
2460
1122
|
```
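The scope options read most easily with a concrete case. The class below is a stand-in that mirrors the documented option names and defaults (`null` meaning unrestricted, `maxResultSize` defaulting to 10000). `ScopeSketch` is a hypothetical name, the graph and predicate URIs are invented, and truncating oversized results (rather than rejecting them) is an assumption of this sketch, not confirmed behaviour of `AgentScope`.

```javascript
// Hypothetical stand-in for the documented AgentScope options.
class ScopeSketch {
  constructor({ allowedGraphs = null, allowedPredicates = null, maxResultSize = 10000 } = {}) {
    this.allowedGraphs = allowedGraphs
    this.allowedPredicates = allowedPredicates
    this.maxResultSize = maxResultSize
  }

  // null means "no restriction"; otherwise the value must be on the list.
  checkAccess(graph, predicate) {
    const graphOk = this.allowedGraphs === null || this.allowedGraphs.includes(graph)
    const predicateOk =
      this.allowedPredicates === null || this.allowedPredicates.includes(predicate)
    return graphOk && predicateOk
  }

  // Assumption: results beyond the limit are dropped, not treated as an error.
  enforceLimit(results) {
    return results.slice(0, this.maxResultSize)
  }
}

const scope = new ScopeSketch({
  allowedGraphs: ['http://insurance.org/claims'],
  maxResultSize: 2
})

const allowed = scope.checkAccess('http://insurance.org/claims', 'http://insurance.org/amount')
const denied = scope.checkAccess('http://insurance.org/hr', 'http://insurance.org/salary')
const limited = scope.enforceLimit(['row1', 'row2', 'row3'])

console.log(allowed, denied, limited.length) // true false 2
```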
|
|
2461
1123
|
|
|
2462
|
-
|
|
2463
|
-
|
|
2464
|
-
### HyperMind Approach (Mathematical Proof)
|
|
2465
|
-
|
|
2466
|
-
```javascript
|
|
2467
|
-
// HyperMind: Type-safe morphism composition - PROVEN correct
|
|
2468
|
-
|
|
2469
|
-
const { GraphDB, GraphFrame, DatalogProgram, evaluateDatalog } = require('rust-kgdb')
|
|
2470
|
-
|
|
2471
|
-
// Step 1: Load typed knowledge graph (Schema enforced)
|
|
2472
|
-
const db = new GraphDB('http://insurance.org/fraud-kb')
|
|
2473
|
-
db.loadTtl(`
|
|
2474
|
-
@prefix : <http://insurance.org/> .
|
|
2475
|
-
:CLM001 :amount "18500" ; :claimant :P001 ; :provider :PROV001 .
|
|
2476
|
-
:P001 :paidTo :P002 .
|
|
2477
|
-
:P002 :paidTo :P003 .
|
|
2478
|
-
:P003 :paidTo :P001 .
|
|
2479
|
-
`, null)
|
|
2480
|
-
|
|
2481
|
-
// Step 2: GraphFrame analysis (Morphism: Graph → TriangleCount)
|
|
2482
|
-
// Type signature: GraphFrame → number (guaranteed)
|
|
2483
|
-
const graph = new GraphFrame(
|
|
2484
|
-
JSON.stringify([{id:'P001'}, {id:'P002'}, {id:'P003'}]),
|
|
2485
|
-
JSON.stringify([
|
|
2486
|
-
{src:'P001', dst:'P002'},
|
|
2487
|
-
{src:'P002', dst:'P003'},
|
|
2488
|
-
{src:'P003', dst:'P001'}
|
|
2489
|
-
])
|
|
2490
|
-
)
|
|
2491
|
-
const triangles = graph.triangleCount() // Type: number (always)
|
|
2492
|
-
|
|
2493
|
-
// Step 3: Datalog inference (Morphism: Rules → Facts)
|
|
2494
|
-
// Type signature: DatalogProgram → InferredFacts (guaranteed)
|
|
2495
|
-
const datalog = new DatalogProgram()
|
|
2496
|
-
datalog.addFact(JSON.stringify({predicate:'claim', terms:['CLM001','P001','PROV001']}))
|
|
2497
|
-
datalog.addFact(JSON.stringify({predicate:'related', terms:['P001','P002']}))
|
|
2498
|
-
|
|
2499
|
-
datalog.addRule(JSON.stringify({
|
|
2500
|
-
head: {predicate:'collusion', terms:['?P1','?P2','?Prov']},
|
|
2501
|
-
body: [
|
|
2502
|
-
{predicate:'claim', terms:['?C1','?P1','?Prov']},
|
|
2503
|
-
{predicate:'claim', terms:['?C2','?P2','?Prov']},
|
|
2504
|
-
{predicate:'related', terms:['?P1','?P2']}
|
|
2505
|
-
]
|
|
2506
|
-
}))
|
|
1124
|
+
### WASM Sandbox & Fuel Metering
|
|
2507
1125
|
|
|
2508
|
-
|
|
1126
|
+
```typescript
|
|
1127
|
+
class WasmSandbox {
|
|
1128
|
+
constructor(config: {
|
|
1129
|
+
capabilities: string[], // Granted capabilities
|
|
1130
|
+
fuelLimit: number // CPU budget
|
|
1131
|
+
})
|
|
2509
1132
|
|
|
2510
|
-
|
|
2511
|
-
|
|
2512
|
-
|
|
2513
|
-
|
|
1133
|
+
execute(tool: string, args: object): Promise<object>
|
|
1134
|
+
getRemainingFuel(): number
|
|
1135
|
+
getExecutionTrace(): object[]
|
|
1136
|
+
}
|
|
2514
1137
|
```
|
|
2515
1138
|
|
|
2516
|
-
**
|
|
1139
|
+
**Fuel Concept** (CPU Budget):
|
|
2517
1140
|
|
|
2518
|
-
|
|
1141
|
+
Fuel metering prevents runaway computations and enables resource accounting:
|
|
2519
1142
|
|
|
2520
|
-
|
|
2521
|
-
|
|
2522
|
-
|
|
2523
|
-
|
|
2524
|
-
|
|
2525
|
-
|
|
2526
|
-
|
|
2527
|
-
**HyperMind Output:**
|
|
2528
|
-
```json
|
|
2529
|
-
{
|
|
2530
|
-
"triangles": 1,
|
|
2531
|
-
"collusion": [["P001", "P002", "PROV001"]],
|
|
2532
|
-
"executionWitness": {
|
|
2533
|
-
"tool": "datalog.evaluate",
|
|
2534
|
-
"input": "6 facts, 1 rule",
|
|
2535
|
-
"output": "collusion(P001,P002,PROV001)",
|
|
2536
|
-
"derivation": "claim(CLM001,P001,PROV001) ∧ claim(CLM002,P002,PROV001) ∧ related(P001,P002) → collusion(P001,P002,PROV001)",
|
|
2537
|
-
"timestamp": "2024-12-14T10:30:00Z",
|
|
2538
|
-
"semanticHash": "semhash:collusion-p001-p002-prov001"
|
|
1143
|
+
```javascript
|
|
1144
|
+
const agent = new HyperMindAgent({
|
|
1145
|
+
kg: db,
|
|
1146
|
+
sandbox: {
|
|
1147
|
+
capabilities: ['ReadKG', 'ExecuteTool'],
|
|
1148
|
+
fuelLimit: 1_000_000 // 1 million fuel units
|
|
2539
1149
|
}
|
|
2540
|
-
}
|
|
2541
|
-
```
|
|
2542
|
-
*Every result has a logical derivation and cryptographic proof.*
|
|
1150
|
+
})
|
|
2543
1151
|
|
|
2544
|
-
|
|
1152
|
+
// Each operation consumes fuel:
|
|
1153
|
+
// - SPARQL query: ~1000-10000 fuel (depends on complexity)
|
|
1154
|
+
// - Datalog evaluation: ~5000-50000 fuel
|
|
1155
|
+
// - Embedding search: ~500-2000 fuel
|
|
2545
1156
|
|
|
2546
|
-
|
|
1157
|
+
// If fuel exhausted, execution stops with error:
|
|
1158
|
+
// Error: FuelExhausted - agent exceeded CPU budget
|
|
2547
1159
|
|
|
2548
|
-
|
|
1160
|
+
// Check remaining fuel
|
|
1161
|
+
const remaining = agent.sandbox.getRemainingFuel()
|
|
1162
|
+
console.log(`Fuel remaining: ${remaining}`) // e.g., 985000
|
|
1163
|
+
```
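The budget behaviour described above can be modelled in a few lines. `FuelMeter`, `FuelExhaustedError`, and the per-operation costs are hypothetical, chosen from within the ranges quoted in the comments above; this is a sketch of the accounting idea, not the sandbox's actual implementation.

```javascript
// Hypothetical model of fuel metering; not the rust-kgdb sandbox itself.
class FuelExhaustedError extends Error {
  constructor(needed, remaining) {
    super(`FuelExhausted - needed ${needed}, only ${remaining} left`)
    this.name = 'FuelExhaustedError'
  }
}

class FuelMeter {
  constructor(fuelLimit) {
    this.remaining = fuelLimit
    this.trace = []
  }

  // Debit `cost` units for `operation`, or throw without debiting anything.
  consume(operation, cost) {
    if (cost > this.remaining) {
      throw new FuelExhaustedError(cost, this.remaining)
    }
    this.remaining -= cost
    this.trace.push({ operation, cost })
  }

  getRemainingFuel() {
    return this.remaining
  }
}

const meter = new FuelMeter(20_000)
meter.consume('sparql', 10_000)
meter.consume('embedding-search', 2_000)

let stopped = false
try {
  meter.consume('datalog', 50_000) // exceeds the 8,000 units left
} catch (err) {
  stopped = err instanceof FuelExhaustedError
}

console.log(meter.getRemainingFuel(), stopped) // 8000 true
```

Checking the cost before debiting means a rejected operation leaves the remaining budget and the trace untouched, which keeps the accounting easy to audit.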
|
|
2549
1164
|
|
|
2550
|
-
**
|
|
2551
|
-
1. `claim(CLM001, P001, PROV001)` - fact from data
|
|
2552
|
-
2. `claim(CLM002, P002, PROV001)` - fact from data
|
|
2553
|
-
3. `related(P001, P002)` - fact from data
|
|
2554
|
-
4. Rule: `collusion(?P1, ?P2, ?Prov) :- claim(?C1, ?P1, ?Prov), claim(?C2, ?P2, ?Prov), related(?P1, ?P2)`
|
|
2555
|
-
5. Unification: `?P1=P001, ?P2=P002, ?Prov=PROV001`
|
|
2556
|
-
6. Conclusion: `collusion(P001, P002, PROV001)` - QED
|
|
1165
|
+
**Fuel Limits by Use Case**:
|
|
2557
1166
|
|
|
2558
|
-
|
|
1167
|
+
| Use Case | Recommended Fuel | Rationale |
|
|
1168
|
+
|----------|------------------|-----------|
|
|
1169
|
+
| Simple queries | 100,000 | Single SPARQL + formatting |
|
|
1170
|
+
| Complex analysis | 1,000,000 | Multiple queries + Datalog |
|
|
1171
|
+
| Long-running agent | 10,000,000 | Extended conversation |
|
|
1172
|
+
| Batch processing | 100,000,000 | Many independent queries |
|
|
2559
1173
|
|
|
2560
|
-
|
|
1174
|
+
---
|
|
2561
1175
|
|
|
2562
|
-
|
|
1176
|
+
## Examples
|
|
2563
1177
|
|
|
2564
|
-
```
|
|
2565
|
-
|
|
2566
|
-
|
|
2567
|
-
│ HYPERMIND AGENT (this is what you build with) │
|
|
2568
|
-
│ ├── Natural language → structured queries │
|
|
2569
|
-
│ ├── 86.4% accuracy on complex SPARQL generation │
|
|
2570
|
-
│ └── Full provenance for every decision │
|
|
2571
|
-
│ │
|
|
2572
|
-
├───────────────────────────────────────────────────────────────────────────────┤
|
|
2573
|
-
│ │
|
|
2574
|
-
│ KNOWLEDGE GRAPH DATABASE (this is what powers it) │
|
|
2575
|
-
│ ├── 2.78 µs lookups (35x faster than RDFox) │
|
|
2576
|
-
│ ├── 24 bytes/triple (25% more efficient) │
|
|
2577
|
-
│ ├── W3C SPARQL 1.1 + RDF 1.2 (100% compliance) │
|
|
2578
|
-
│ ├── RDFS + OWL 2 RL reasoners (ontology inference) │
|
|
2579
|
-
│ ├── SHACL validation (schema enforcement) │
|
|
2580
|
-
│ └── WCOJ algorithm (worst-case optimal joins) │
|
|
2581
|
-
│ │
|
|
2582
|
-
├───────────────────────────────────────────────────────────────────────────────┤
|
|
2583
|
-
│ │
|
|
2584
|
-
│ DISTRIBUTION LAYER (this is how it scales) │
|
|
2585
|
-
│ ├── Mobile: iOS + Android with zero-copy FFI │
|
|
2586
|
-
│ ├── Standalone: Single node with RocksDB/LMDB │
|
|
2587
|
-
│ └── Clustered: Kubernetes with HDRF + Raft consensus │
|
|
2588
|
-
│ │
|
|
2589
|
-
└───────────────────────────────────────────────────────────────────────────────┘
|
|
2590
|
-
```
|
|
2591
|
-
|
|
2592
|
-
---
|
|
1178
|
+
```bash
|
|
1179
|
+
# Fraud detection agent
|
|
1180
|
+
node examples/fraud-detection-agent.js
|
|
2593
1181
|
|
|
2594
|
-
|
|
1182
|
+
# Underwriting agent
|
|
1183
|
+
node examples/underwriting-agent.js
|
|
2595
1184
|
|
|
2596
|
-
|
|
2597
|
-
|
|
2598
|
-
|
|
2599
|
-
├─────────────────────────────────────────────────────────────────┤
|
|
2600
|
-
│ │
|
|
2601
|
-
│ Apache Jena: Great features, but 150+ µs lookups │
|
|
2602
|
-
│ RDFox: Fast, but expensive and no mobile support │
|
|
2603
|
-
│ Neo4j: Popular, but no SPARQL/RDF standards │
|
|
2604
|
-
│ Amazon Neptune: Managed, but cloud-only vendor lock-in │
|
|
2605
|
-
│ LangChain: Vibe coding, fails compliance audits │
|
|
2606
|
-
│ │
|
|
2607
|
-
│ rust-kgdb: 2.78 µs lookups, mobile-native, open standards │
|
|
2608
|
-
│ Standalone → Clustered on same codebase │
|
|
2609
|
-
│ Mathematical foundations, audit-ready │
|
|
2610
|
-
│ │
|
|
2611
|
-
└─────────────────────────────────────────────────────────────────┘
|
|
1185
|
+
# Run tests
|
|
1186
|
+
npm test # 42 tests
|
|
1187
|
+
npm run test:jest # 217 tests
|
|
2612
1188
|
```
|
|
2613
1189
|
|
|
2614
1190
|
---
|
|
2615
1191
|
|
|
2616
|
-
##
|
|
2617
|
-
|
|
2618
|
-
**Email:** gonnect.uk@gmail.com
|
|
2619
|
-
|
|
2620
|
-
**GitHub:** [github.com/gonnect-uk/rust-kgdb](https://github.com/gonnect-uk/rust-kgdb)
|
|
1192
|
+
## Links
|
|
2621
1193
|
|
|
2622
|
-
**npm
|
|
1194
|
+
- **npm**: [rust-kgdb](https://www.npmjs.com/package/rust-kgdb)
|
|
1195
|
+
- **GitHub**: [gonnect-uk/rust-kgdb](https://github.com/gonnect-uk/rust-kgdb)
|
|
1196
|
+
- **Benchmark Report**: [HYPERMIND_BENCHMARK_REPORT.md](./HYPERMIND_BENCHMARK_REPORT.md)
|
|
1197
|
+
- **Changelog**: [CHANGELOG.md](./CHANGELOG.md)
|
|
1198
|
+
- **Archive**: [README.archive.md](./README.archive.md) - Previous comprehensive documentation
|
|
2623
1199
|
|
|
2624
1200
|
---
|
|
2625
1201
|
|
|
2626
1202
|
## License
|
|
2627
1203
|
|
|
2628
|
-
Apache
|
|
2629
|
-
|
|
2630
|
-
---
|
|
2631
|
-
|
|
2632
|
-
*Built with Rust. Grounded in mathematics. Ready for production.*
|
|
1204
|
+
Apache 2.0
|