rust-kgdb 0.6.30 → 0.6.32

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CLAUDE.md CHANGED
@@ -8,307 +8,6 @@ This is the **TypeScript/Node.js SDK** for `rust-kgdb`, a high-performance RDF/S
 
  **npm Package**: [`rust-kgdb`](https://www.npmjs.com/package/rust-kgdb)
 
- ## Benchmark Results
-
- **HyperMind achieves 86.4% accuracy on structured-query generation where vanilla LLMs achieve 0%.**
-
- | Metric | Vanilla LLM | HyperMind | Improvement |
- |--------|-------------|-----------|-------------|
- | **Accuracy (mean, both models)** | 0% | 86.4% | +86.4 pp |
- | **Accuracy (Claude Sonnet 4)** | 0% | 90.9% | +90.9 pp |
- | **Accuracy (GPT-4o)** | 0% | 81.8% | +81.8 pp |
- | **Hallucinations** | 100% | 0% | Eliminated |
- | **Audit Trail** | None | Complete | Full provenance |
- | **Reproducibility** | Random | Deterministic | Same hash |
-
- ### How We Calculated These Numbers
-
- ```
- BENCHMARK METHODOLOGY
-
- DATASET: LUBM (Lehigh University Benchmark)
-   • Industry-standard academic KG benchmark (since 2005)
-   • 3,272 triples (LUBM-1 scale)
-   • 30 OWL classes, 23 properties
-   • Used by Jena, RDFox, Stardog, GraphDB for comparison
-
- TEST PROTOCOL: 11 hard scenarios × 2 LLMs × 2 approaches
-   For each test query:
-   1. VANILLA:   send the query to the LLM with NO context
-   2. HYPERMIND: send the query with SchemaContext (Γ) injected
-   3. VALIDATE:  parse → type-check → execute → verify results
-
- ACCURACY FORMULA:
-   Accuracy = (queries that pass ALL 3 gates) / (total queries) × 100
-
-   Gate 1: Syntax Valid (no markdown, valid SPARQL)
-   Gate 2: Executable   (runs without error on rust-kgdb)
-   Gate 3: Type Safe    (uses ONLY predicates from SchemaContext)
-
- RESULTS:
-   Vanilla LLM: 0/11 passed   (0%)    - failed Gate 1 or Gate 3 every time
-   HyperMind:   9.5/11 passed (86.4%) - Claude: 10/11, GPT-4o: 9/11
- ```
-
- **Reproducibility**: Run `node vanilla-vs-hypermind-benchmark.js` to verify these numbers yourself.
-
- ### What Was Tested
-
- | Component | Specification |
- |-----------|---------------|
- | **Dataset** | LUBM (Lehigh University Benchmark) - standard academic KG benchmark |
- | **Triples** | 3,272 (LUBM-1 scale) |
- | **Schema** | 30 OWL classes, 23 properties |
- | **Deployment** | rust-kgdb Kubernetes cluster (3 executors, 1 coordinator) |
-
- ### Test Categories (11 Hard Scenarios)
-
- | Category | Count | What It Tests |
- |----------|-------|---------------|
- | **ambiguous** | 3 | Queries with multiple valid interpretations |
- | **multi_hop** | 2 | Requires JOIN reasoning across entities |
- | **syntax** | 2 | Catches markdown/formatting errors |
- | **edge_case** | 2 | Boundary conditions, empty results |
- | **type_mismatch** | 2 | Schema violation detection |
-
- ### How We Tested (Evaluation Protocol)
-
- ```javascript
- // VANILLA LLM: No context (baseline)
- const vanillaPrompt = `Generate SPARQL: ${query}`
- // Result: LLM guesses predicates, wraps in markdown, hallucinates
-
- // HYPERMIND: Schema injected into prompt
- // (spread the Set-backed fields before joining)
- const hypermindPrompt = `
- SCHEMA:
- Classes: ${[...schema.classes].join(', ')}       // From YOUR actual data
- Predicates: ${[...schema.predicates].join(', ')} // From YOUR actual data
-
- TYPE CONTRACT:
- - Input: natural language query
- - Output: raw SPARQL (NO markdown, NO code blocks)
- - Precondition: Query references ONLY schema predicates
- - Postcondition: Valid SPARQL 1.1 syntax
-
- Query: ${query}
- `
- // Result: LLM generates valid, type-safe queries
- ```
-
- ### Success Criteria (Three Gates)
-
- 1. **Syntax Valid**: Query parses without errors (no markdown wrapping)
- 2. **Executable**: Query runs against the database without exceptions
- 3. **Type Safe**: Uses ONLY predicates defined in the schema (no hallucination); a sketch of all three gates follows below
-
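As a rough illustration of how the gates compose, here is a minimal sketch in plain JavaScript. The helper logic is illustrative only; `db.querySelect()` is the one call documented elsewhere in this file, and the schema lookups assume the `classes: Set` / `properties: Map` shape described below.

```javascript
// Illustrative three-gate check (NOT the benchmark's actual code).
function passesAllGates(db, schema, sparql) {
  // Gate 1: Syntax Valid - no markdown fences, starts like real SPARQL
  const syntaxValid =
    !sparql.includes('```') &&
    /^\s*(PREFIX|SELECT|ASK|CONSTRUCT|DESCRIBE)/i.test(sparql)

  // Gate 3: Type Safe - every prefixed name must exist in the schema
  const mentioned = sparql.match(/\b\w+:\w+/g) || []
  const typeSafe = mentioned.every(
    (name) => schema.predicates.has(name) || schema.classes.has(name)
  )

  // Gate 2: Executable - the query must run without throwing
  let executable = false
  if (syntaxValid) {
    try { db.querySelect(sparql); executable = true } catch (_) {}
  }

  return syntaxValid && executable && typeSafe
}

// Accuracy = (queries passing ALL 3 gates) / (total queries) × 100
const accuracy = (flags) => (100 * flags.filter(Boolean).length) / flags.length
```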
- ### Why Vanilla LLMs Fail (100% Failure Rate)
-
- ```
- User: "Find all professors"
-
- Vanilla LLM output (verbatim):
-
-     ```sparql                        <- PROBLEM 1: Markdown wrapper
-     PREFIX ub: <http://...>
-     SELECT ?prof WHERE {
-       ?prof a ub:Faculty .           <- PROBLEM 2: Wrong class! (schema has "Professor")
-     }
-     ```
-     This query finds all faculty...  <- PROBLEM 3: Explanation text
-
- Result: ❌ Parser rejects (markdown), wrong class (hallucinated)
- ```
-
- ### Why HyperMind Succeeds (86.4% Success Rate)
-
- ```
- User: "Find all professors"
-
- HyperMind Output:
-   PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
-   SELECT ?prof WHERE {
-     ?prof a ub:Professor .   <- CORRECT: From injected schema
-   }
-
- Result: ✅ Parses, executes, returns 15 professors
- ```
-
- ### Calibration Against Industry Benchmarks
-
- Our methodology is calibrated against established AI benchmarks:
-
- | Benchmark | Organization | What It Measures | How We Applied It |
- |-----------|--------------|------------------|-------------------|
- | **GAIA** | Meta Research | Multi-step reasoning, tool use | Test categories (ambiguous, multi_hop) |
- | **SWE-bench** | Princeton NLP | Code generation accuracy | Success criteria (syntax, executable, type-safe) |
- | **LUBM** | Lehigh University | Knowledge graph query performance | Dataset (3,272 triples, 30 classes, 23 predicates) |
-
- **Calibration Process:**
- 1. **GAIA-inspired categories**: We adopted GAIA's multi-step reasoning tests for the `multi_hop` and `ambiguous` categories
- 2. **SWE-bench-inspired validation**: Like SWE-bench validates code patches via test suites, we validate queries via three gates (syntax → executable → type-safe)
- 3. **LUBM standard dataset**: An industry-standard academic benchmark ensures reproducibility across implementations
-
- ### Verification Method
-
- ```
- VERIFICATION PIPELINE
-
- 1. GENERATE    LLM produces SPARQL from natural language
- 2. PARSE       rust-kgdb SPARQL parser validates syntax
-                ✗ Markdown? → FAIL
-                ✗ Invalid syntax? → FAIL
- 3. TYPE-CHECK  QueryValidator checks against SchemaContext (Γ)
-                ✗ Unknown predicate? → FAIL (hallucination detected)
-                ✗ Wrong domain/range? → FAIL
- 4. EXECUTE     Query runs against LUBM dataset in rust-kgdb cluster
-                ✗ Runtime error? → FAIL
-                ✗ Empty when expecting results? → FAIL
- 5. VERIFY      Results compared against known LUBM answers
-                ✓ Matches expected? → PASS
-
- Each test must pass ALL 5 stages to count as SUCCESS
- ```
-
- ### Published Results
-
- | Artifact | Location | What It Contains |
- |----------|----------|------------------|
- | **Benchmark Report** | `HYPERMIND_BENCHMARK_REPORT.md` | Full methodology, per-test results, failure analysis |
- | **Benchmark Code** | `vanilla-vs-hypermind-benchmark.js` | Runnable benchmark comparing vanilla vs HyperMind |
- | **Example: Fraud** | `examples/fraud-detection-agent.js` | Real dataset (`FRAUD_ONTOLOGY`) loaded via `db.loadTtl()` |
- | **Example: Underwriting** | `examples/underwriting-agent.js` | Real dataset (`UNDERWRITING_KB`) loaded via `db.loadTtl()` |
- | **npm Package** | `rust-kgdb` | Published SDK with all benchmark code |
-
- ### Dataset Loading (Factually Verifiable)
-
- Both examples load real ontologies/knowledge bases via `loadTtl()`:
-
- ```javascript
- // examples/fraud-detection-agent.js (line 612)
- db.loadTtl(FRAUD_ONTOLOGY, CONFIG.kg.graphUri)
- // FRAUD_ONTOLOGY contains: ins:Claimant, ins:Provider, ins:Claim classes
- // with properties: claimant, provider, amount, address (for ring detection)
-
- // examples/underwriting-agent.js (line 766)
- db.loadTtl(UNDERWRITING_KB, 'http://underwriting.org/data')
- // UNDERWRITING_KB contains: uw:BusinessAccount, uw:Territory classes
- // with properties: naicsCode, revenue, territory, hurricaneExposure, earthquakeExposure
- ```
-
- **Verify in code**: Run `grep -n "loadTtl" examples/*.js` to see exact lines.
-
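For orientation, a hypothetical smoke test in the same style. The inline Turtle and graph URI are invented for illustration; only the `loadTtl()` and `querySelect()` calls shown above are assumed.

```javascript
// Hypothetical smoke test: load a tiny Turtle graph, then query it back.
// `db` is a rust-kgdb database handle as in the examples above.
const TINY_KB = `
@prefix ins: <http://insurance.example/onto#> .
ins:C1 a ins:Claim ; ins:amount 12000 .
`

db.loadTtl(TINY_KB, 'http://insurance.example/data')

const rows = db.querySelect(`
  PREFIX ins: <http://insurance.example/onto#>
  SELECT ?claim ?amt WHERE { ?claim a ins:Claim ; ins:amount ?amt }
`)
console.log(rows) // expect one binding: ins:C1 with amount 12000
```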
- ### End-to-End Architecture: HyperMind Deterministic Flow
-
- ```
- HYPERMIND: DETERMINISTIC SCHEMA-DRIVEN EXECUTION
- Powered by rust-kgdb GraphDB (LLM OPTIONAL)
-
- USER: "Find high-risk providers with claims over $10,000"
-   │
-   ▼
- 1. SCHEMA CONTEXT (Γ) - Object, NOT string
-      const schemaContext = await SchemaContext.fromKG(db)
-      // Returns OBJECT: { classes: Set, properties: Map, ... }
-   │
-   ▼
- 2. DETERMINISTIC INTENT ANALYSIS (NO LLM)
-      const intent = this._analyzeIntent(prompt)
-      // Keyword matching: "high-risk" → intent.risk = true
-      // "claims over" → intent.query = true, intent.filter
-      // DETERMINISTIC: same input → same intent
-   │
-   ▼
- 3. SCHEMA-DRIVEN QUERY GENERATION (NO LLM)
-      const sparql = this._generateSchemaSparql(intent, schema)
-      // Uses SchemaContext to find matching predicates:
-      //   - riskScore found in schema.predicates
-      //   - amount found in schema.predicates
-      // Generates: SELECT ?p ?score WHERE { ?p :riskScore ... }
-   │
-   ▼
- 4. VALIDATION + EXECUTION (rust-kgdb)
-      const validation = validateQuery(sparql, schemaContext)
-      // ✓ All predicates exist in SchemaContext
-      // ✓ Types match (domain/range)
-      const results = db.querySelect(sparql) // 2.78 µs
-   │
-   ▼
- 5. PROOF DAG (Audit Trail)
-      {
-        answer: "Provider P001, P003 are high-risk",
-        derivations: [{ tool: "kg.sparql.query", ... }],
-        hash: "sha256:8f3a2b1c...", // REPRODUCIBLE
-      }
-
- LLM OPTIONAL: If enabled, used ONLY for final summarization
- KEY: Same input + same schema = same query = same results = same hash
- ```
-
- **Code References** (verify in `hypermind-agent.js`; an illustrative sketch of the intent step follows below):
- - `_analyzeIntent()` line 2286: Deterministic keyword matching
- - `_generateSteps()` line 2297: Schema-driven step generation
- - `_generateSchemaSparql()` line 2368: Schema-aware SPARQL generation
- - `validateQuery()`: Type-checks against SchemaContext
-
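For intuition, here is an illustrative reconstruction of the intent step. This is NOT the `_analyzeIntent()` source; it only mirrors the keyword-matching idea described above, with invented field names.

```javascript
// Illustrative reconstruction of deterministic intent analysis.
function analyzeIntentSketch(prompt) {
  const p = prompt.toLowerCase()
  const intent = { query: false, risk: false, filter: null }
  if (/\b(find|list|show|which)\b/.test(p)) intent.query = true
  if (p.includes('high-risk') || p.includes('risk')) intent.risk = true
  const m = p.match(/over \$?([\d,]+)/) // "claims over $10,000" → threshold
  if (m) intent.filter = { op: '>', value: Number(m[1].replace(/,/g, '')) }
  return intent // pure function: same prompt always yields the same intent
}

analyzeIntentSketch('Find high-risk providers with claims over $10,000')
// → { query: true, risk: true, filter: { op: '>', value: 10000 } }
```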
- ### Run It Yourself
-
- ```bash
- # 1. Install the SDK
- npm install rust-kgdb
-
- # 2. Set API keys
- export OPENAI_API_KEY="sk-..."
- export ANTHROPIC_API_KEY="sk-ant-..."
-
- # 3. Run benchmark
- node vanilla-vs-hypermind-benchmark.js
-
- # 4. Run examples
- node examples/fraud-detection-agent.js
- node examples/underwriting-agent.js
- ```
-
- **All results are reproducible.** Same schema + same question = same answer = same hash.
-
  ## Commands
 
  ### Build Native Addon
@@ -332,8 +31,9 @@ npx jest tests/regression.test.ts --testNamePattern="SPARQL"
  ### Publishing
 
  ```bash
- npm publish           # Publish to npm
- npm view rust-kgdb    # View package info
+ npm version patch --no-git-tag-version  # Bump version
+ npm publish                             # Publish to npm
+ npm view rust-kgdb                      # View package info
  ```
 
  ## Architecture
@@ -365,190 +65,17 @@ npm view rust-kgdb # View package info
  1. **Native NAPI-RS** (`native/rust-kgdb-napi/src/lib.rs`): Rust bindings for GraphDB, GraphFrame, Embeddings, Datalog, Pregel
  2. **HyperMind Framework** (`hypermind-agent.js`): Pure JS AI agent framework with schema awareness, memory, sandboxing
 
- ## Our Approach vs Traditional (Why We Built This)
-
- ```
- APPROACH COMPARISON
-
- TRADITIONAL (LangChain, AutoGPT):
-   User → LLM → Tool Call
-   LLM DECIDES what to call; LLM GENERATES query text
-   PROS: flexible, easy setup, vague tasks OK
-   CONS: 20-40% success, hallucinates, no audit trail,
-         non-deterministic, expensive
-
- OUR APPROACH (HyperMind):
-   User → Deterministic Planner → Typed Steps
-   SCHEMA generates query; SCHEMA validates query
-   PROS: 86.4% success, zero hallucination, full audit trail,
-         reproducible, cheap at scale
-   CONS: needs a schema, structured data only
-
- WHY WE CHOSE DETERMINISTIC:
-   • Enterprise needs audit trails (compliance)
-   • 86.4% vs 20-40% is a category difference
-   • An LLM call per query is expensive at scale
- ```
-
- ## Domain-Enriched Proxy Architecture (Our Unique Approach)
-
- HyperMind uses a **schema-enriched deterministic planner**. Key difference: the LLM is OPTIONAL (used for summarization only).
-
- ```
- TRADITIONAL APPROACH (LangChain, AutoGPT, MCP)
-
- User Question ──► LLM (no domain knowledge) ──► LLM generates query ──► Tool Call ──► Results (often wrong)
-   ❌ Hallucinates predicates
-   ❌ No schema validation
-   ❌ 20-40% success
-
- OUR APPROACH (HyperMind) - Schema-Enriched Deterministic Planner
-
- Knowledge Graph ──► SchemaContext (Γ) - AS OBJECT (not string!)
-                     {
-                       classes: Set(['Claim', 'Provider']),
-                       properties: Map({ 'amount': {...} }),
-                       domains: Map({...}),
-                       ranges: Map({...})
-                     }
-
- User Question + SchemaContext ──► DETERMINISTIC PLANNER (no LLM!)
-   1. _analyzeIntent(prompt)                 // Keyword matching (deterministic)
-   2. _generateSteps(intent, schemaContext)  // From schema
-   3. _generateSchemaSparql(intent, schema)  // Schema-aware
-   4. validateQuery(sparql, schemaContext)   // Type-check
-
-   ──► Typed, validated execution plan
-   ──► rust-kgdb Execution (2.78 µs)
-   ──► ProofDAG (audit trail)
-   ──► Results (86.4% accuracy)
-
- LLM OPTIONAL: Only for summarization (not query generation)
- ```
-
- **Key Insight**: SchemaContext is an OBJECT passed to the deterministic planner, NOT a string injected into an LLM prompt. Query generation is deterministic, not LLM-dependent.
-
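A sketch of how such a SchemaContext object could be assembled from a live graph. The `SchemaContext.fromKG(db)` call appears above; the introspection queries and the binding-row shape here are assumptions.

```javascript
// Sketch only: the real SchemaContext class lives in hypermind-agent.js.
// Assumes querySelect() returns an array of binding objects keyed by variable.
function buildSchemaContextSketch(db) {
  const classes = new Set(
    db.querySelect('SELECT DISTINCT ?c WHERE { ?s a ?c }').map((r) => r.c)
  )
  const properties = new Map(
    db.querySelect('SELECT DISTINCT ?p WHERE { ?s ?p ?o }').map((r) => [r.p, {}])
  )
  // domains/ranges would come from rdfs:domain / rdfs:range triples
  return { classes, properties, domains: new Map(), ranges: new Map() }
}
```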
- ### Injection vs Proxy: The CMD Analogy
-
- Think of it like the evolution from DOS to modern shells:
-
- ```
- DOS/CMD ERA (Classification Approach)
-
- User:   "copy files"
- System: ❌ "Bad command or file name"
-
- You MUST know the exact syntax: COPY C:\src\*.txt D:\dst\
- No help, no context, no forgiveness.
-
- MODERN SHELL with AI (Proxy Approach)
-
- User:  "copy all text files from src to dst"
- Proxy: I see your filesystem has:
-          - /src/ with 47 .txt files
-          - /dst/ exists and is writable
- Proxy: Generating: cp /src/*.txt /dst/
- Proxy: ✅ Executed. 47 files copied.
-
- The PROXY knows your context and translates intent to exact commands.
- ```
-
- **HyperMind is the "modern shell" for knowledge graphs.** The SchemaContext is your "filesystem listing": injected so the LLM knows what actually exists before generating queries.
-
- ### The Beautiful Integration: Context Theory + Proof Theory
-
- HyperMind elegantly combines two mathematical foundations:
-
- ```
- CONTEXT THEORY (Spivak's Ologs) - "What CAN be said"
-
- Your Knowledge Graph as a Category:
-   OBJECTS (Classes)        MORPHISMS (Properties)
-   • Claim                  • Claim ──amount──► xsd:decimal
-   • Provider               • Claim ──provider──► Provider
-   • Policy                 • Provider ──riskScore──► xsd:float
-
-   SchemaContext Γ = (Classes, Properties, Domains, Ranges)
-
- Γ defines the "grammar" of valid statements. If it's not in Γ,
- it cannot be queried. Hallucination becomes IMPOSSIBLE.
-
-   │ Schema INJECTED into the query generator (LLM optional)
-   │ which emits a TYPED query
-   ▼
-
- PROOF THEORY (Curry-Howard) - "How it WAS derived"
-
- Every answer has a PROOF (ProofDAG):
-
-   CONCLUSION: "Provider P001 is high-risk"
-   ├── EVIDENCE: SPARQL returned riskScore = 0.87
-   │   └── DERIVATION: Γ ⊢ ?p :riskScore ?r (type-checked)
-   ├── EVIDENCE: Datalog rule matched "highRisk(?p)"
-   │   └── DERIVATION: highRisk(P) :- riskScore(P,R), R>0.8
-   └── HASH: sha256:8f3a2b1c... (reproducible)
-
- Proofs are PROGRAMS (Curry-Howard correspondence):
-   - Typing judgment Γ ⊢ e : τ = "expression e has type τ in context Γ"
-   - Valid query = Valid proof = Executable program
-   - Same input → Same proof → Same output (deterministic)
- ```
-
- **The Elegance**:
- 1. **Context Theory** ensures you can ONLY ask valid questions (schema-bounded)
- 2. **Proof Theory** ensures every answer has a verifiable derivation chain
- 3. **Together**: Questions are bounded by reality, answers are backed by proof
-
- This is why HyperMind achieves **86.4% accuracy** while vanilla LLMs achieve **0%** on structured data tasks: it is not prompt engineering, it is **mathematical guarantees**.
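As a sketch of the reproducibility claim, hashing a canonicalized derivation gives the "same proof → same hash" property. The real `ProofDAG` class is in `hypermind-agent.js` (~line 2411); the canonicalization below is an assumption, and the field names follow the example above.

```javascript
// Sketch: reproducible proof hash via canonical (key-sorted) JSON.
const { createHash } = require('node:crypto')

function canonicalize(v) {
  if (Array.isArray(v)) return v.map(canonicalize)
  if (v && typeof v === 'object') {
    return Object.fromEntries(
      Object.keys(v).sort().map((k) => [k, canonicalize(v[k])])
    )
  }
  return v
}

function proofHash(proof) {
  const bytes = JSON.stringify(canonicalize(proof))
  return 'sha256:' + createHash('sha256').update(bytes).digest('hex')
}

const proof = {
  answer: 'Provider P001 is high-risk',
  derivations: [{ tool: 'kg.sparql.query', evidence: 'riskScore = 0.87' }],
}
console.log(proofHash(proof)) // same derivation → same hash, every run
```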
+ ## Key Files
 
- **Why This Works**:
- - LLM can only reference predicates that exist in YOUR data
- - Type contracts validate query structure before execution
- - Same question + same schema = same answer (deterministic)
- - Every answer has a ProofDAG showing derivation chain
+ | File | Purpose |
+ |------|---------|
+ | `native/rust-kgdb-napi/src/lib.rs` | NAPI-RS Rust bindings (~700 lines) |
+ | `hypermind-agent.js` | HyperMind AI Framework (~4000 lines) |
+ | `index.js` | Platform loader + exports (~167 lines) |
+ | `index.d.ts` | TypeScript definitions (~425 lines) |
+ | `test-all-features.js` | 42 feature tests |
+ | `tests/*.test.ts` | Jest test suites (~170 tests) |
+ | `examples/` | Fraud detection, underwriting demos |
 
  ## Key APIs
 
@@ -563,6 +90,22 @@ This is why HyperMind achieves **86.4% accuracy** while vanilla LLMs achieve **0
  | **Pregel** | `pregelShortestPaths()` |
  | **Factories** | `friendsGraph()`, `chainGraph()`, `starGraph()`, `completeGraph()`, `cycleGraph()` |
 
+ ## HyperMind Key Methods (hypermind-agent.js)
+
+ When modifying the HyperMind framework, these are the critical methods:
+
+ | Method | Line | Purpose |
+ |--------|------|---------|
+ | `_analyzeIntent()` | ~2286 | Deterministic keyword matching (NO LLM) |
+ | `_generateSteps()` | ~2297 | Schema-driven step generation |
+ | `_generateSchemaSparql()` | ~2368 | Schema-aware SPARQL generation |
+ | `SchemaContext` class | ~699 | Object with `classes: Set`, `properties: Map` |
+ | `WasmSandbox` class | ~2612 | Capability-based execution with audit log |
+ | `TOOL_REGISTRY` | ~1687 | Typed morphisms `Query → BindingSet` |
+ | `ProofDAG` class | ~2411 | Derivation chain with hash |
+
+ **Key Design Point**: The LLM is OPTIONAL. It is used only for summarization, NOT query generation; query generation is deterministic from SchemaContext.
+
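To make the "typed morphisms" row concrete, here is a hypothetical registry entry. The `TOOL_REGISTRY` name and the `Query → BindingSet` typing come from the table above; the field layout is an assumption.

```javascript
// Hypothetical TOOL_REGISTRY entry; field names are illustrative.
const TOOL_REGISTRY_SKETCH = {
  'kg.sparql.query': {
    input: 'Query',       // type the planner must supply
    output: 'BindingSet', // type produced for downstream steps
    run: (db, sparql) => db.querySelect(sparql),
  },
}

// Steps compose like morphisms: a step is valid only when its input type
// matches the previous step's output type, which the planner checks.
```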
  ## Rust Workspace Dependencies
 
  Native addon depends on parent workspace crates:
@@ -580,18 +123,6 @@ Native addon depends on parent workspace crates:
  3. **Export**: Add to `module.exports` in `index.js`
  4. **Tests**: Add test in `test-all-features.js`
 
- ## Key Files
-
- | File | Purpose |
- |------|---------|
- | `native/rust-kgdb-napi/src/lib.rs` | NAPI-RS Rust bindings |
- | `hypermind-agent.js` | HyperMind AI Framework (~4000 lines) |
- | `index.js` | Platform loader + exports |
- | `index.d.ts` | TypeScript definitions |
- | `test-all-features.js` | 42 feature tests |
- | `tests/*.test.ts` | Jest test suites (~170 tests) |
- | `examples/` | Fraud detection, underwriting demos |
-
  ## Native Addon Files
 
  Built addons (platform-specific):
@@ -601,7 +132,7 @@ Built addons (platform-specific):
 
  ## Version Management
 
- 1. Update version in `package.json`
+ 1. Update version: `npm version patch --no-git-tag-version`
  2. Run tests: `npm test`
  3. Publish: `npm publish`
  4. Verify: `npm view rust-kgdb versions`
@@ -616,3 +147,23 @@ cd /path/to/rust-kgdb && cargo build --workspace --release
  ```
 
  **Platform error**: Supported: darwin/linux (x64/arm64), win32 (x64)
+
+ ## Benchmark Information
+
+ For benchmark methodology and results, see:
+ - `HYPERMIND_BENCHMARK_REPORT.md` - Full methodology, per-test results
+ - `vanilla-vs-hypermind-benchmark.js` - HyperMind vs Vanilla LLM (JavaScript)
+ - `benchmark-frameworks.py` - Compare Vanilla/LangChain/DSPy with/without schema (Python)
+ - `examples/fraud-detection-agent.js` - Real dataset example (line 612: `loadTtl`)
+ - `examples/underwriting-agent.js` - Real dataset example (line 766: `loadTtl`)
+
+ **Running Benchmarks**:
+ ```bash
+ # JavaScript benchmark (HyperMind vs Vanilla on LUBM)
+ ANTHROPIC_API_KEY=... OPENAI_API_KEY=... node vanilla-vs-hypermind-benchmark.js
+
+ # Python benchmark (Compare frameworks with/without schema)
+ OPENAI_API_KEY=... uv run --with openai --with langchain --with langchain-openai --with langchain-core --with dspy-ai python3 benchmark-frameworks.py
+ ```
+
+ **Key Result**: HyperMind achieves 86.4% accuracy on the LUBM benchmark (3,272 triples, 30 classes, 23 properties) where vanilla LLMs achieve 0%.