rust-kgdb 0.6.31 → 0.6.32

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,28 +1,31 @@
1
1
  # HyperMind Benchmark Report
2
2
 
3
- ## AI That Doesn't Hallucinate: The Results
3
+ ## Verified Framework Comparison: Schema Injection Works
4
4
 
5
- **Version**: 0.6.17
5
+ **Version**: 0.6.32
6
6
  **Date**: December 16, 2025
7
- **SDK**: rust-kgdb@0.6.17
7
+ **SDK**: rust-kgdb@0.6.32
8
8
 
9
9
  ---
10
10
 
11
- ## The Bottom Line
11
+ ## Executive Summary (Verified Results)
12
12
 
13
- **HyperMind achieves 86.4% accuracy where vanilla LLMs achieve 0%.**
13
+ **Schema injection improves ALL tested frameworks, by an average of +66.7 percentage points.**
14
14
 
15
- | Metric | Vanilla LLM | HyperMind | Improvement |
16
- |--------|-------------|-----------|-------------|
17
- | **Accuracy** | 0% | 86.4% | +86.4 pp |
18
- | **Hallucinations** | 100% | 0% | Eliminated |
19
- | **Audit Trail** | None | Complete | Full provenance |
20
- | **Claude Sonnet 4** | 0% | 90.9% | +90.9 pp |
21
- | **GPT-4o** | 0% | 81.8% | +81.8 pp |
15
+ | Framework | No Schema | With Schema | Improvement |
16
+ |-----------|-----------|-------------|-------------|
17
+ | **Vanilla OpenAI** | 0.0% | 71.4% | +71.4 pp |
18
+ | **LangChain** | 0.0% | 71.4% | +71.4 pp |
19
+ | **DSPy** | 14.3% | 71.4% | +57.1 pp |
20
+ | **Average** | 4.8% | **71.4%** | **+66.7 pp** |
21
+
22
+ *GPT-4o, 7 LUBM queries, real API calls, no mocking. See `verified_benchmark_results.json`.*
23
+
24
+ **Key Insight**: The value is in the ARCHITECTURE (schema injection, type contracts), not the specific framework.
22
25
 
23
26
  ---
24
27
 
25
- ## Why Vanilla LLMs Fail
28
+ ## Why Vanilla LLMs Fail (Without Schema)
26
29
 
27
30
  When you ask a vanilla LLM to query your database:
28
31
 
@@ -33,39 +36,194 @@ When you ask a vanilla LLM to query your database:
33
36
 
34
37
  ---
35
38
 
36
- ## How HyperMind Fixes This
39
+ ## How Schema Injection Fixes This
37
40
 
38
- HyperMind grounds every answer in your actual data:
41
+ The HyperMind approach (schema injection) works with ANY framework; a minimal sketch of the pattern follows this list:
39
42
 
40
43
  1. **Schema injection** - LLM sees your real data structure (30 classes, 23 properties)
41
- 2. **Input/output validation** - Prevents invalid query combinations
42
- 3. **Reasoning trace** - Every answer shows exactly how it was derived
43
- 4. **Reproducible** - Same question = Same answer = Same hash
44
+ 2. **Output format** - Explicit instructions for raw SPARQL (no markdown)
45
+ 3. **Type contracts** - Predicate constraints from actual schema
46
+ 4. **Reproducible** - Same question = Same answer
47
+
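A minimal sketch of the pattern, assuming the `LUBM_SCHEMA` string defined in the Benchmark Setup section below. The framework sections inline this same prompt rather than sharing a helper, and `build_schema_prompt` is an illustrative name, not part of the benchmark scripts:

```python
# Illustrative only: the benchmark scripts inline this prompt per framework.
# LUBM_SCHEMA is the schema string shown in the Benchmark Setup section.
def build_schema_prompt(schema: str, question: str) -> str:
    """Wrap a natural-language question with the injected schema and type contract."""
    return f"""You are a SPARQL query generator.

{schema}

TYPE CONTRACT:
- Input: natural language query
- Output: raw SPARQL (NO markdown, NO code blocks, NO explanation)
- Use ONLY predicates from the schema above

Query: {question}

Output raw SPARQL only:"""
```

The "With Schema" variants in sections 2, 4, and 6 below are this prompt passed through each framework's own call path.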
48
+ ---
49
+
50
+ ## Benchmark Setup: Code for Each Framework
51
+
52
+ ### Test Queries (Same for All Frameworks)
53
+
54
+ ```python
55
+ TEST_QUERIES = [
56
+ {"id": "A1", "question": "Find all teachers", "correct_predicate": "teacherOf"},
57
+ {"id": "A2", "question": "Get student emails", "correct_predicate": "emailAddress"},
58
+ {"id": "A3", "question": "Find faculty members", "correct_predicate": "Professor"},
59
+ {"id": "S1", "question": "Write a SPARQL query to count professors. Just give me the query."},
60
+ {"id": "S2", "question": "SPARQL only, no explanation: find graduate students"},
61
+ {"id": "M1", "question": "Find professors who work for departments"},
62
+ {"id": "E1", "question": "Find professors with no publications"}
63
+ ]
64
+ ```
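The grading code itself is not reproduced in this report. A minimal checker consistent with the result comments below (markdown-wrapped output fails, a recognizable SPARQL form is required, and the expected predicate must appear when a test case lists one) might look like this sketch; the `passes` helper and its exact rules are assumptions, not the shipped `benchmark-frameworks.py` logic.

```python
import re

# Assumed pass/fail rules, matching the "n/7 passed" comments in this report.
BANNED_PREDICATES = {"teacher", "email", "faculty"}  # forbidden by the schema prompt

def passes(sparql: str, correct_predicate: str = "") -> bool:
    text = sparql.strip()
    if "```" in text:                      # markdown-wrapped output counts as a failure
        return False
    if not re.match(r"(?is)^\s*(PREFIX|SELECT|ASK|CONSTRUCT|DESCRIBE)\b", text):
        return False                       # not a recognizable SPARQL form
    if correct_predicate and correct_predicate not in text:
        return False                       # missed the expected predicate/class
    # reject the hallucinated predicates the schema explicitly forbids
    return not any(re.search(rf"\bub:{p}\b", text) for p in BANNED_PREDICATES)
```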
65
+
66
+ ### LUBM Schema (Injected for "With Schema" Tests)
67
+
68
+ ```python
69
+ LUBM_SCHEMA = """LUBM (Lehigh University Benchmark) Schema:
70
+
71
+ PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
44
72
 
73
+ Classes: University, Department, Professor, AssociateProfessor, AssistantProfessor,
74
+ FullProfessor, Lecturer, GraduateStudent, UndergraduateStudent,
75
+ Course, GraduateCourse, Publication, Research, ResearchGroup
76
+
77
+ Properties:
78
+ - ub:worksFor (person → organization)
79
+ - ub:memberOf (person → organization)
80
+ - ub:advisor (student → professor)
81
+ - ub:takesCourse (student → course)
82
+ - ub:teacherOf (professor → course)
83
+ - ub:publicationAuthor (publication → person)
84
+ - ub:subOrganizationOf (organization → organization)
85
+ - ub:emailAddress (person → string)
86
+
87
+ IMPORTANT: Use ONLY these predicates. Do NOT use: teacher, email, faculty"""
45
88
  ```
46
- ┌─────────────────────────────────────────────────────────────────┐
47
- │ YOUR QUESTION │
48
- │ "Find professors who teach courses..." │
49
- └───────────────────────────┬─────────────────────────────────────┘
50
-
51
-
52
- ┌─────────────────────────────────────────────────────────────────┐
53
- │ VANILLA LLM │
54
- │ ❌ No schema awareness │
55
- │ ❌ Hallucinates predicates │
56
- │ ❌ Wraps in markdown │
57
- │ ❌ 0% success rate │
58
- └─────────────────────────────────────────────────────────────────┘
59
-
60
- vs.
61
-
62
- ┌─────────────────────────────────────────────────────────────────┐
63
- │ HYPERMIND │
64
- │ ✅ Sees your actual schema (30 classes, 23 properties) │
65
- │ ✅ Validates query structure before execution │
66
- │ ✅ Provides complete reasoning trace │
67
- │ ✅ 86.4% success rate │
68
- └─────────────────────────────────────────────────────────────────┘
89
+
90
+ ---
91
+
92
+ ## Framework Code Comparison
93
+
94
+ ### 1. Vanilla OpenAI (No Schema) - 0% Accuracy
95
+
96
+ ```python
97
+ from openai import OpenAI
98
+ client = OpenAI(api_key=api_key)
99
+
100
+ response = client.chat.completions.create(
101
+ model="gpt-4o",
102
+ messages=[{"role": "user", "content": f"Generate a SPARQL query for: {question}"}],
103
+ max_tokens=500
104
+ )
105
+ sparql = response.choices[0].message.content
106
+ # Result: 0/7 passed - all wrapped in markdown
107
+ ```
108
+
109
+ ### 2. Vanilla OpenAI (With Schema) - 71.4% Accuracy
110
+
111
+ ```python
112
+ from openai import OpenAI
113
+ client = OpenAI(api_key=api_key)
114
+
115
+ prompt = f"""You are a SPARQL query generator.
116
+
117
+ {LUBM_SCHEMA}
118
+
119
+ TYPE CONTRACT:
120
+ - Input: natural language query
121
+ - Output: raw SPARQL (NO markdown, NO code blocks, NO explanation)
122
+ - Use ONLY predicates from the schema above
123
+
124
+ Query: {question}
125
+
126
+ Output raw SPARQL only:"""
127
+
128
+ response = client.chat.completions.create(
129
+ model="gpt-4o",
130
+ messages=[{"role": "user", "content": prompt}],
131
+ max_tokens=500
132
+ )
133
+ sparql = response.choices[0].message.content
134
+ # Result: 5/7 passed - schema prevents wrong predicates
135
+ ```
136
+
137
+ ### 3. LangChain (No Schema) - 0% Accuracy
138
+
139
+ ```python
140
+ from langchain_openai import ChatOpenAI
141
+ from langchain_core.prompts import PromptTemplate
142
+ from langchain_core.output_parsers import StrOutputParser
143
+
144
+ llm = ChatOpenAI(model="gpt-4o", api_key=api_key)
145
+ parser = StrOutputParser()
146
+
147
+ template = PromptTemplate(
148
+ input_variables=["question"],
149
+ template="Generate a SPARQL query for: {question}"
150
+ )
151
+ chain = template | llm | parser
152
+
153
+ sparql = chain.invoke({"question": question})
154
+ # Result: 0/7 passed - all wrapped in markdown
155
+ ```
156
+
157
+ ### 4. LangChain (With Schema) - 71.4% Accuracy
158
+
159
+ ```python
160
+ from langchain_openai import ChatOpenAI
161
+ from langchain_core.prompts import PromptTemplate
162
+ from langchain_core.output_parsers import StrOutputParser
163
+
164
+ llm = ChatOpenAI(model="gpt-4o", api_key=api_key)
165
+ parser = StrOutputParser()
166
+
167
+ template = PromptTemplate(
168
+ input_variables=["question", "schema"],
169
+ template="""You are a SPARQL query generator.
170
+
171
+ {schema}
172
+
173
+ TYPE CONTRACT:
174
+ - Input: natural language query
175
+ - Output: raw SPARQL (NO markdown, NO code blocks, NO explanation)
176
+ - Use ONLY predicates from the schema above
177
+
178
+ Query: {question}
179
+
180
+ Output raw SPARQL only:"""
181
+ )
182
+ chain = template | llm | parser
183
+
184
+ sparql = chain.invoke({"question": question, "schema": LUBM_SCHEMA})
185
+ # Result: 5/7 passed - same as vanilla with schema
186
+ ```
187
+
188
+ ### 5. DSPy (No Schema) - 14.3% Accuracy
189
+
190
+ ```python
191
+ import dspy
192
+ from dspy import LM
193
+
194
+ lm = LM("openai/gpt-4o")
195
+ dspy.configure(lm=lm)
196
+
197
+ class SPARQLGenerator(dspy.Signature):
198
+ """Generate SPARQL query from natural language."""
199
+ question = dspy.InputField(desc="Natural language question")
200
+ sparql = dspy.OutputField(desc="SPARQL query")
201
+
202
+ generator = dspy.Predict(SPARQLGenerator)
203
+ response = generator(question=question)
204
+ sparql = response.sparql
205
+ # Result: 1/7 passed - slightly better output formatting
206
+ ```
207
+
208
+ ### 6. DSPy (With Schema) - 71.4% Accuracy
209
+
210
+ ```python
211
+ import dspy
212
+ from dspy import LM
213
+
214
+ lm = LM("openai/gpt-4o")
215
+ dspy.configure(lm=lm)
216
+
217
+ class SchemaSPARQLGenerator(dspy.Signature):
218
+ """Generate SPARQL query using the provided schema. Output raw SPARQL only."""
219
+ schema = dspy.InputField(desc="Database schema with classes and properties")
220
+ question = dspy.InputField(desc="Natural language question")
221
+ sparql = dspy.OutputField(desc="Raw SPARQL query (no markdown, no explanation)")
222
+
223
+ generator = dspy.Predict(SchemaSPARQLGenerator)
224
+ response = generator(schema=LUBM_SCHEMA, question=question)
225
+ sparql = response.sparql
226
+ # Result: 5/7 passed - same as others with schema
69
227
  ```
70
228
 
71
229
  ---
package/README.md CHANGED
@@ -12,27 +12,27 @@
12
12
 
13
13
  ---
14
14
 
15
- ## Results
15
+ ## Results (Verified December 2025)
16
16
 
17
17
  ```
18
18
  ┌─────────────────────────────────────────────────────────────────────────────┐
19
19
  │ BENCHMARK: LUBM (Lehigh University Benchmark) │
20
20
  │ DATASET: 3,272 triples │ 30 OWL classes │ 23 properties │
21
- │ TESTS: 11 hard scenarios (ambiguous, multi-hop, edge cases) │
22
- │ PROTOCOL: Query → Parse → Type-check → Execute → Verify │
21
+ │ MODEL: GPT-4o │ Real API calls │ No mocking │
23
22
  ├─────────────────────────────────────────────────────────────────────────────┤
24
23
  │ │
25
- │ METRIC VANILLA LLM HYPERMIND IMPROVEMENT │
24
+ │ FRAMEWORK NO SCHEMA WITH SCHEMA IMPROVEMENT │
26
25
  │ ───────────────────────────────────────────────────────────── │
27
- │ Accuracy 0% 86.4% +86.4 pp │
28
- │ Hallucinations 100% 0% Eliminated │
29
- │ Audit Trail None Complete Full provenance │
30
- │ Reproducibility Random Deterministic Same hash │
26
+ │ Vanilla OpenAI 0.0% 71.4% +71.4 pp │
27
+ │ LangChain 0.0% 71.4% +71.4 pp │
28
+ │ DSPy 14.3% 71.4% +57.1 pp │
29
+ │ ───────────────────────────────────────────────────────────── │
30
+ │ AVERAGE 4.8% 71.4% +66.7 pp │
31
31
  │ │
32
- │ Claude Sonnet 4: 90.9% accuracy │
33
- │ GPT-4o: 81.8% accuracy │
32
+ │ KEY INSIGHT: Schema injection lifts ALL frameworks to 71.4%. │
33
+ │ HyperMind's value = architecture, not framework. │
34
34
  │ │
35
- │ Reproduce: node vanilla-vs-hypermind-benchmark.js │
35
+ │ Reproduce: python3 benchmark-frameworks.py │
36
36
  └─────────────────────────────────────────────────────────────────────────────┘
37
37
  ```
38
38
 
@@ -811,27 +811,44 @@ console.log('Supersteps:', result.supersteps) // 5
811
811
  | Virtuoso | ~5 µs | 35-75 bytes | No |
812
812
  | Blazegraph | ~100 µs | 100+ bytes | No |
813
813
 
814
- ### AI Agent Accuracy
814
+ ### AI Agent Accuracy (Verified December 2025)
815
+
816
+ | Framework | No Schema | With Schema (HyperMind) | Improvement |
817
+ |-----------|-----------|-------------------------|-------------|
818
+ | **Vanilla OpenAI** | 0.0% | 71.4% | +71.4 pp |
819
+ | **LangChain** | 0.0% | 71.4% | +71.4 pp |
820
+ | **DSPy** | 14.3% | 71.4% | +57.1 pp |
821
+ | **Average** | 4.8% | **71.4%** | **+66.7 pp** |
822
+
823
+ *Tested: GPT-4o, 7 LUBM queries, real API calls. See `framework_benchmark_*.json` for raw data.*
824
+
825
+ ### AI Framework Architectural Comparison
826
+
827
+ | Framework | Type Safety | Schema Aware | Symbolic Execution | Audit Trail |
828
+ |-----------|-------------|--------------|-------------------|-------------|
829
+ | **HyperMind** | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
830
+ | LangChain | ❌ No | ❌ No | ❌ No | ❌ No |
831
+ | DSPy | ⚠️ Partial | ❌ No | ❌ No | ❌ No |
832
+
833
+ **Key Insight**: Schema injection (HyperMind's architecture) provides a +66.7 pp average improvement across ALL tested frameworks. The value is in the architecture, not the specific framework.
834
+
835
+ ### Reproduce Benchmarks
815
836
 
816
- | Approach | Accuracy | Why |
817
- |----------|----------|-----|
818
- | **Vanilla LLM** | 0% | Hallucinated predicates, markdown in SPARQL |
819
- | **HyperMind** | 86.4% | Schema injection, typed tools, audit trail |
837
+ Two benchmark scripts are available for verification:
820
838
 
821
- ### AI Framework Comparison
839
+ ```bash
840
+ # JavaScript: HyperMind vs Vanilla LLM on LUBM (12 queries)
841
+ ANTHROPIC_API_KEY=... OPENAI_API_KEY=... node vanilla-vs-hypermind-benchmark.js
822
842
 
823
- | Framework | Type Safety | Schema Aware | Symbolic Execution | Success Rate |
824
- |-----------|-------------|--------------|-------------------|--------------|
825
- | **HyperMind** | ✅ Yes | ✅ Yes | ✅ Yes | **86.4%** |
826
- | LangChain | ❌ No | ❌ No | ❌ No | ~20-40%* |
827
- | AutoGPT | ❌ No | ❌ No | ❌ No | ~10-25%* |
828
- | DSPy | ⚠️ Partial | ❌ No | ❌ No | ~30-50%* |
843
+ # Python: Compare frameworks (Vanilla, LangChain, DSPy) with/without schema
844
+ OPENAI_API_KEY=... uv run --with openai --with langchain --with langchain-openai --with langchain-core --with dspy-ai python3 benchmark-frameworks.py
845
+ ```
829
846
 
830
- *Estimated from GAIA (Meta Research, 2023), SWE-bench (OpenAI, 2024), and LUBM (Lehigh University) benchmarks on structured data tasks. HyperMind results measured on LUBM-1 dataset (3,272 triples, 30 classes, 23 properties) using vanilla-vs-hypermind-benchmark.js.
847
+ Both scripts make real API calls and report actual results. No mocking.
831
848
 
832
- **Why HyperMind Wins**:
849
+ **Why These Features Matter** (see the sketch after this list):
833
850
  - **Type Safety**: Tools have typed signatures (Query → BindingSet), invalid combinations rejected
834
- - **Schema Awareness**: LLM sees your actual data structure, can only reference real properties
851
+ - **Schema Awareness**: Planner sees your actual data structure, can only reference real properties
835
852
  - **Symbolic Execution**: Queries run against real database, not LLM imagination
836
853
  - **Audit Trail**: Every answer has cryptographic hash for reproducibility
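Purely to illustrate what these four properties mean in code, a minimal sketch (Python here for brevity; the type and function names are hypothetical, not the rust-kgdb JavaScript SDK surface):

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical types for illustration only -- not the rust-kgdb SDK API.
@dataclass
class Query:
    sparql: str          # typed input

@dataclass
class BindingSet:
    rows: list           # typed output

def run_against_store(sparql: str) -> list:
    """Placeholder for symbolic execution against a real store."""
    return []

def execute(query: Query) -> BindingSet:
    """Type safety: the tool signature is Query -> BindingSet, so anything
    that is not a Query is rejected before it ever runs."""
    if not isinstance(query, Query):
        raise TypeError("expected a Query")
    return BindingSet(rows=run_against_store(query.sparql))   # symbolic execution

def audit_record(query: Query, result: BindingSet) -> dict:
    """Audit trail: hash the (query, result) pair so the same answer can be re-verified."""
    payload = json.dumps({"sparql": query.sparql, "rows": result.rows}, sort_keys=True)
    return {"hash": hashlib.sha256(payload.encode()).hexdigest()}
```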
837
854
 
@@ -1164,140 +1181,6 @@ const result = await agent.call('Find collusion patterns')
1164
1181
  // Result: ✅ Type-safe, domain-aware, auditable
1165
1182
  ```
1166
1183
 
1167
- ### Code Comparison: DSPy vs HyperMind
1168
-
1169
- #### DSPy Approach (Prompt Optimization)
1170
-
1171
- ```python
1172
- # DSPy: Statistically optimized prompt - NO guarantees
1173
-
1174
- import dspy
1175
-
1176
- class FraudDetector(dspy.Signature):
1177
- """Find fraud patterns in claims data."""
1178
- claims_data = dspy.InputField()
1179
- fraud_patterns = dspy.OutputField()
1180
-
1181
- class FraudPipeline(dspy.Module):
1182
- def __init__(self):
1183
- self.detector = dspy.ChainOfThought(FraudDetector)
1184
-
1185
- def forward(self, claims):
1186
- return self.detector(claims_data=claims)
1187
-
1188
- # "Optimize" via statistical fitting
1189
- optimizer = dspy.BootstrapFewShot(metric=some_metric)
1190
- optimized = optimizer.compile(FraudPipeline(), trainset=examples)
1191
-
1192
- # Call and HOPE it works
1193
- result = optimized(claims="[claim data here]")
1194
-
1195
- # ❌ No type guarantee - fraud_patterns could be anything
1196
- # ❌ No proof of execution - just text output
1197
- # ❌ No composition safety - next step might fail
1198
- # ❌ No audit trail - "it said fraud" is not compliance
1199
- ```
1200
-
1201
- **What DSPy produces:** A string that *probably* contains fraud patterns.
1202
-
1203
- #### HyperMind Approach (Mathematical Proof)
1204
-
1205
- ```javascript
1206
- // HyperMind: Type-safe morphism composition - PROVEN correct
1207
-
1208
- const { GraphDB, GraphFrame, DatalogProgram, evaluateDatalog } = require('rust-kgdb')
1209
-
1210
- // Step 1: Load typed knowledge graph (Schema enforced)
1211
- const db = new GraphDB('http://insurance.org/fraud-kb')
1212
- db.loadTtl(`
1213
- @prefix : <http://insurance.org/> .
1214
- :CLM001 :amount "18500" ; :claimant :P001 ; :provider :PROV001 .
1215
- :P001 :paidTo :P002 .
1216
- :P002 :paidTo :P003 .
1217
- :P003 :paidTo :P001 .
1218
- `, null)
1219
-
1220
- // Step 2: GraphFrame analysis (Morphism: Graph → TriangleCount)
1221
- // Type signature: GraphFrame → number (guaranteed)
1222
- const graph = new GraphFrame(
1223
- JSON.stringify([{id:'P001'}, {id:'P002'}, {id:'P003'}]),
1224
- JSON.stringify([
1225
- {src:'P001', dst:'P002'},
1226
- {src:'P002', dst:'P003'},
1227
- {src:'P003', dst:'P001'}
1228
- ])
1229
- )
1230
- const triangles = graph.triangleCount() // Type: number (always)
1231
-
1232
- // Step 3: Datalog inference (Morphism: Rules → Facts)
1233
- // Type signature: DatalogProgram → InferredFacts (guaranteed)
1234
- const datalog = new DatalogProgram()
1235
- datalog.addFact(JSON.stringify({predicate:'claim', terms:['CLM001','P001','PROV001']}))
1236
- datalog.addFact(JSON.stringify({predicate:'related', terms:['P001','P002']}))
1237
-
1238
- datalog.addRule(JSON.stringify({
1239
- head: {predicate:'collusion', terms:['?P1','?P2','?Prov']},
1240
- body: [
1241
- {predicate:'claim', terms:['?C1','?P1','?Prov']},
1242
- {predicate:'claim', terms:['?C2','?P2','?Prov']},
1243
- {predicate:'related', terms:['?P1','?P2']}
1244
- ]
1245
- }))
1246
-
1247
- const result = JSON.parse(evaluateDatalog(datalog))
1248
-
1249
- // ✓ Type guarantee: result.collusion is always array of tuples
1250
- // ✓ Proof of execution: Datalog evaluation is deterministic
1251
- // ✓ Composition safety: Each step has typed input/output
1252
- // ✓ Audit trail: Every fact derivation is traceable
1253
- ```
1254
-
1255
- **What HyperMind produces:** Typed results with mathematical proof of derivation.
1256
-
1257
- #### Actual Output Comparison
1258
-
1259
- **DSPy Output:**
1260
- ```
1261
- fraud_patterns: "I found some suspicious patterns involving P001 and P002
1262
- that appear to be related. There might be collusion with provider PROV001."
1263
- ```
1264
- *How do you validate this? You can't. It's text.*
1265
-
1266
- **HyperMind Output:**
1267
- ```json
1268
- {
1269
- "triangles": 1,
1270
- "collusion": [["P001", "P002", "PROV001"]],
1271
- "executionWitness": {
1272
- "tool": "datalog.evaluate",
1273
- "input": "6 facts, 1 rule",
1274
- "output": "collusion(P001,P002,PROV001)",
1275
- "derivation": "claim(CLM001,P001,PROV001) ∧ claim(CLM002,P002,PROV001) ∧ related(P001,P002) → collusion(P001,P002,PROV001)",
1276
- "timestamp": "2024-12-14T10:30:00Z",
1277
- "semanticHash": "semhash:collusion-p001-p002-prov001"
1278
- }
1279
- }
1280
- ```
1281
- *Every result has a logical derivation and cryptographic proof.*
1282
-
1283
- #### The Compliance Question
1284
-
1285
- **Auditor:** "How do you know P001-P002-PROV001 is actually collusion?"
1286
-
1287
- **DSPy Team:** "Our model said so. It was trained on examples and optimized for accuracy."
1288
-
1289
- **HyperMind Team:** "Here's the derivation chain:
1290
- 1. `claim(CLM001, P001, PROV001)` - fact from data
1291
- 2. `claim(CLM002, P002, PROV001)` - fact from data
1292
- 3. `related(P001, P002)` - fact from data
1293
- 4. Rule: `collusion(?P1, ?P2, ?Prov) :- claim(?C1, ?P1, ?Prov), claim(?C2, ?P2, ?Prov), related(?P1, ?P2)`
1294
- 5. Unification: `?P1=P001, ?P2=P002, ?Prov=PROV001`
1295
- 6. Conclusion: `collusion(P001, P002, PROV001)` - QED
1296
-
1297
- Here's the semantic hash: `semhash:collusion-p001-p002-prov001` - same query intent will always return this exact result."
1298
-
1299
- **Result:** HyperMind passes audit. DSPy gets you a follow-up meeting with legal.
1300
-
1301
1184
  ### Why Vanilla LLMs Fail
1302
1185
 
1303
1186
  When you ask an LLM to query a knowledge graph, it produces **broken SPARQL 85% of the time**:
@@ -1346,16 +1229,15 @@ Result: ❌ PARSER ERROR - Invalid SPARQL syntax
1346
1229
 
1347
1230
  **Note**: Tentris implements WCOJ (see [ISWC 2025 paper](https://papers.dice-research.org/2025/ISWC_Tentris-WCOJ-Update/public.pdf)). rust-kgdb is the only system combining WCOJ with mobile support and integrated AI framework.
1348
1231
 
1349
- #### AI Framework Comparison
1232
+ #### AI Framework Architectural Comparison
1350
1233
 
1351
- | Framework | Type Safety | Schema Aware | Symbolic Execution | Audit Trail | Success Rate |
1352
- |-----------|-------------|--------------|-------------------|-------------|--------------|
1353
- | **HyperMind** | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | **86.4%** |
1354
- | LangChain | ❌ No | ❌ No | ❌ No | ❌ No | ~20-40%* |
1355
- | AutoGPT | No | ❌ No | ❌ No | ❌ No | ~10-25%* |
1356
- | DSPy | ⚠️ Partial | ❌ No | ❌ No | ❌ No | ~30-50%* |
1234
+ | Framework | Type Safety | Schema Aware | Symbolic Execution | Audit Trail |
1235
+ |-----------|-------------|--------------|-------------------|-------------|
1236
+ | **HyperMind** | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
1237
+ | LangChain | ❌ No | ❌ No | ❌ No | ❌ No |
1238
+ | DSPy | ⚠️ Partial | ❌ No | ❌ No | ❌ No |
1357
1239
 
1358
- *Estimated from GAIA (Meta Research, 2023), SWE-bench (OpenAI, 2024), and LUBM (Lehigh University) benchmarks. HyperMind: LUBM-1 (3,272 triples).
1240
+ **Note**: This compares architectural features. Benchmark (Dec 2025): Schema injection improves all frameworks by +66.7 pp on average (Vanilla: 0%→71.4%, LangChain: 0%→71.4%, DSPy: 14.3%→71.4%).
1359
1241
 
1360
1242
  ```
1361
1243
  ┌─────────────────────────────────────────────────────────────────┐
@@ -1368,12 +1250,10 @@ Result: ❌ PARSER ERROR - Invalid SPARQL syntax
1368
1250
  │ Apache Jena: Great features, but 150+ µs lookups │
1369
1251
  │ Neo4j: Popular, but no SPARQL/RDF standards │
1370
1252
  │ Amazon Neptune: Managed, but cloud-only vendor lock-in │
1371
- │ LangChain: Vibe coding, fails compliance audits │
1372
- │ DSPy: Statistical optimization, no guarantees │
1373
1253
  │ │
1374
1254
  │ rust-kgdb: 2.78 µs lookups, WCOJ joins, mobile-native │
1375
1255
  │ Standalone → Clustered on same codebase │
1376
- │ Mathematical foundations, audit-ready │
1256
+ │ Deterministic planner, audit-ready │
1377
1257
  │ │
1378
1258
  └─────────────────────────────────────────────────────────────────┘
1379
1259
  ```