rust-kgdb 0.6.45 → 0.6.47

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/CHANGELOG.md +51 -0
  2. package/README.md +62 -22
  3. package/package.json +1 -1
package/CHANGELOG.md CHANGED
@@ -2,6 +2,57 @@
 
 All notable changes to the rust-kgdb TypeScript SDK will be documented in this file.
 
+## [0.6.47] - 2025-12-17
+
+### Memory Retrieval Depth Benchmark Added
+
+#### New Benchmark Documentation
+Added the Memory Retrieval Depth Benchmark to the README, based on academic benchmarks:
+- MemQ (arXiv 2503.05193)
+- mKGQAgent (Text2SPARQL 2025)
+- MTEB (Massive Text Embedding Benchmark)
+
+**Results** (50 queries per depth, HNSW index):
+
+| Depth  | P50 Latency | Recall@5 | Recall@10 | MRR  |
+|--------|-------------|----------|-----------|------|
+| 10     | 0.06 ms     | 78%      | 100%      | 0.68 |
+| 100    | 0.50 ms     | 88%      | 98%       | 0.42 |
+| 1,000  | 1.59 ms     | 80%      | 94%       | 0.50 |
+| 10,000 | 16.71 ms    | 76%      | 94%       | 0.54 |
+
+**Key insight**: Even at 10K stored queries, Recall@10 stays at 94% with sub-17ms latency.
+
+Reproduce: `node memory-retrieval-benchmark.js`
+
+---
+
+## [0.6.46] - 2025-12-17
+
+### Honest Comparison Fix
+
+#### Fixed Misleading "Before & After" Section
+- **Old (misleading)**: Implied vanilla LLMs CAN'T use schema/context
+- **New (honest)**: Shows both approaches work; the difference is integration effort
+
+The "Before & After" section now honestly shows:
+- **Manual Approach**: Works (~71% accuracy), but requires 5-8 manual integration steps
+  - Write schema manually
+  - Pass to LLM
+  - Parse SPARQL from response
+  - Find external database
+  - Connect, execute, parse results
+  - Build audit trail yourself
+
+- **HyperMind Approach**: Same accuracy (~71%), but integrated
+  - Schema auto-extracted from your data
+  - Built-in database executes queries
+  - Audit trail included automatically
+
+**Key insight**: We don't claim better accuracy than the manual approach with a schema. We provide integration convenience.
+
+---
+
 ## [0.6.45] - 2025-12-17
 
 ### ARCADE Pipeline Documentation & Benchmark Methodology
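For reference, the Recall@k and MRR columns in the benchmark tables above are standard retrieval metrics. A minimal JavaScript sketch of how they are computed over ranked results — illustrative only, not the internals of `memory-retrieval-benchmark.js`, which this diff does not include:

```javascript
// Each entry pairs a ranked result list with the single relevant item
// expected for that query (an assumption for this sketch).

function recallAtK(results, k) {
  // Fraction of queries whose relevant item appears in the top k results.
  const hits = results.filter(({ ranked, relevant }) =>
    ranked.slice(0, k).includes(relevant)
  ).length;
  return hits / results.length;
}

function meanReciprocalRank(results) {
  // MRR: average of 1 / (1-based rank of the relevant item), 0 if absent.
  const sum = results.reduce((acc, { ranked, relevant }) => {
    const idx = ranked.indexOf(relevant);
    return acc + (idx === -1 ? 0 : 1 / (idx + 1));
  }, 0);
  return sum / results.length;
}

// Two toy queries: relevant items at rank 1 and rank 2 respectively.
const results = [
  { ranked: ['a', 'b', 'c'], relevant: 'a' },
  { ranked: ['y', 'x', 'z'], relevant: 'x' },
];
console.log(recallAtK(results, 1));        // 0.5
console.log(meanReciprocalRank(results));  // 0.75
```

This is why Recall@10 can exceed Recall@5 at the same depth: a relevant item ranked 6th through 10th counts for the former but not the latter.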
package/README.md CHANGED
@@ -135,6 +135,31 @@ db.loadTtl(':Provider123 :hasRiskScore "0.87" .', null)
 └─────────────────────────────────────────────────────────────────────────────┘
 ```
 
+### Memory Retrieval Depth Benchmark
+
+Based on academic benchmarks: MemQ (arXiv 2503.05193), mKGQAgent (Text2SPARQL 2025), MTEB.
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│ BENCHMARK: Memory Retrieval at Depth (50 queries per depth)                 │
+│ METHODOLOGY: LUBM schema-driven queries, HNSW index, random seed 42         │
+├─────────────────────────────────────────────────────────────────────────────┤
+│                                                                             │
+│ DEPTH   │ P50 LATENCY │ P95 LATENCY │ Recall@5 │ Recall@10 │ MRR           │
+│ ──────────────────────────────────────────────────────────────────────────  │
+│ 10      │ 0.06 ms     │ 0.26 ms     │ 78%      │ 100%      │ 0.68          │
+│ 100     │ 0.50 ms     │ 0.75 ms     │ 88%      │ 98%       │ 0.42          │
+│ 1,000   │ 1.59 ms     │ 5.03 ms     │ 80%      │ 94%       │ 0.50          │
+│ 10,000  │ 16.71 ms    │ 17.37 ms    │ 76%      │ 94%       │ 0.54          │
+│ ──────────────────────────────────────────────────────────────────────────  │
+│                                                                             │
+│ KEY INSIGHT: Even at 10,000 stored queries, Recall@10 stays at 94%          │
+│ Sub-17ms retrieval from a 10K query pool = practical for production use     │
+│                                                                             │
+│ Reproduce: node memory-retrieval-benchmark.js                               │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
 ### Where We Actually Outperform (Database Performance)
 
 ```
@@ -188,47 +213,60 @@ db.loadTtl(':Provider123 :hasRiskScore "0.87" .', null)
 
 ---
 
-## The Difference: Before & After
+## The Difference: Manual vs Integrated
 
-### Before: Vanilla LLM (Unreliable)
+### Manual Approach (Works, But Tedious)
 
 ```javascript
-// Ask LLM to query your database
+// STEP 1: Manually write your schema (takes hours for large ontologies)
+const LUBM_SCHEMA = `
+PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
+Classes: University, Department, Professor, Student, Course, Publication
+Properties: teacherOf(Faculty→Course), worksFor(Faculty→Department)
+`;
+
+// STEP 2: Pass schema to LLM
 const answer = await openai.chat.completions.create({
   model: 'gpt-4o',
-  messages: [{ role: 'user', content: 'Find suspicious providers in my database' }]
+  messages: [
+    { role: 'system', content: `${LUBM_SCHEMA}\nOutput raw SPARQL only.` },
+    { role: 'user', content: 'Find suspicious providers' }
+  ]
 });
 
-console.log(answer.choices[0].message.content);
-// "Based on my analysis, Provider P001 appears suspicious because..."
-//
-// PROBLEMS:
-// Did it actually query your database? No - it's guessing
-// Where's the evidence? None - it made up "Provider P001"
-// Will this answer be the same tomorrow? No - probabilistic
-// Can you audit this for regulators? No - black box
+// STEP 3: Parse out the SPARQL (handle markdown, explanations, etc.)
+const sparql = extractSPARQL(answer.choices[0].message.content);
+
+// STEP 4: Find a SPARQL database (Jena? RDFox? Virtuoso?)
+// STEP 5: Connect to the database
+// STEP 6: Execute the query
+// STEP 7: Parse the results
+// STEP 8: No audit trail - you'd have to build that yourself
+
+// RESULT: ~71% accuracy (same as HyperMind with schema)
+// BUT: 5-8 manual integration steps
 ```
 
-### After: HyperMind (Verifiable)
+### HyperMind Approach (Integrated)
 
 ```javascript
-// Ask HyperMind to query your database
+// ONE-TIME SETUP: Load your data
 const { HyperMindAgent, GraphDB } = require('rust-kgdb');
 
 const db = new GraphDB('http://insurance.org/');
-db.loadTtl(yourActualData, null); // Your real data
+db.loadTtl(yourActualData, null); // Schema auto-extracted from data
 
 const agent = new HyperMindAgent({ kg: db, model: 'gpt-4o' });
 const result = await agent.call('Find suspicious providers');
 
 console.log(result.answer);
 // "Provider PROV001 has risk score 0.87 with 47 claims over $50,000"
-//
-// VERIFIED:
-// ✅ Queried your actual database (SPARQL executed)
-// ✅ Evidence included (47 real claims found)
-// ✅ Reproducible (same hash every time)
-// ✅ Full audit trail for regulators
+
+// WHAT YOU GET (ALL AUTOMATIC):
+// ✅ Schema auto-extracted (no manual prompt engineering)
+// ✅ Query executed on built-in database (no external DB needed)
+// ✅ Full audit trail included
+// ✅ Reproducible hash for compliance
 
 console.log(result.reasoningTrace);
 // [
@@ -240,7 +278,9 @@ console.log(result.hash);
 // "sha256:8f3a2b1c..." - Same question = Same answer = Same hash
 ```
 
-**The key insight**: The LLM plans WHAT to look for. The database finds EXACTLY that. Every answer traces back to your actual data.
+**Honest comparison**: Both approaches achieve ~71% accuracy on the LUBM benchmark. The difference is integration effort:
+- **Manual**: Write schema, integrate database, build audit trail yourself
+- **HyperMind**: Database + schema extraction + audit trail built-in
 
 ---
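The manual-approach snippet above calls an `extractSPARQL` helper that the README leaves to the reader. As an illustration of what that step involves, a minimal sketch — the function name matches the README's usage, but this implementation is an assumption, handling only fenced or bare model output:

```javascript
// Hypothetical helper for STEP 3 of the manual approach: pull a SPARQL
// query out of an LLM reply that may wrap it in markdown fences or prose.
function extractSPARQL(text) {
  // Prefer a fenced code block, with or without a "sparql" language tag.
  const fenced = text.match(/`{3}(?:sparql)?\s*([\s\S]*?)`{3}/i);
  if (fenced) return fenced[1].trim();
  // Otherwise take everything from the first SPARQL keyword onward.
  const bare = text.match(/(?:PREFIX|SELECT|ASK|CONSTRUCT|DESCRIBE)[\s\S]*/i);
  if (bare) return bare[0].trim();
  throw new Error('No SPARQL query found in LLM response');
}

// Example: the model wrapped the query in a fenced block.
const fence = '`'.repeat(3);
const reply = `Here is the query:\n${fence}sparql\nSELECT ?p WHERE { ?p a ub:Professor }\n${fence}`;
console.log(extractSPARQL(reply)); // SELECT ?p WHERE { ?p a ub:Professor }
```

A production version would also need to validate the query before execution; this is exactly the glue code the "integrated" column claims to eliminate.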
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.6.45",
+  "version": "0.6.47",
   "description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
   "main": "index.js",
   "types": "index.d.ts",