rust-kgdb 0.6.45 → 0.6.47
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +51 -0
- package/README.md +62 -22
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED

@@ -2,6 +2,57 @@
 
 All notable changes to the rust-kgdb TypeScript SDK will be documented in this file.
 
+## [0.6.47] - 2025-12-17
+
+### Memory Retrieval Depth Benchmark Added
+
+#### New Benchmark Documentation
+Added Memory Retrieval Depth Benchmark to README, based on academic benchmarks:
+- MemQ (arXiv 2503.05193)
+- mKGQAgent (Text2SPARQL 2025)
+- MTEB (Massive Text Embedding Benchmark)
+
+**Results** (50 queries per depth, HNSW index):
+
+| Depth  | P50 Latency | Recall@5 | Recall@10 | MRR  |
+|--------|-------------|----------|-----------|------|
+| 10     | 0.06 ms     | 78%      | 100%      | 0.68 |
+| 100    | 0.50 ms     | 88%      | 98%       | 0.42 |
+| 1,000  | 1.59 ms     | 80%      | 94%       | 0.50 |
+| 10,000 | 16.71 ms    | 76%      | 94%       | 0.54 |
+
+**Key insight**: Even at 10K stored queries, Recall@10 stays at 94% with sub-17ms latency.
+
+Reproduce: `node memory-retrieval-benchmark.js`
+
+---
+
+## [0.6.46] - 2025-12-17
+
+### Honest Comparison Fix
+
+#### Fixed Misleading "Before & After" Section
+- **Old (misleading)**: Implied vanilla LLMs CAN'T use schema/context
+- **New (honest)**: Shows both approaches work; the difference is integration effort
+
+The "Before & After" section now honestly shows:
+- **Manual Approach**: Works (~71% accuracy), but requires 5-8 manual integration steps
+  - Write schema manually
+  - Pass to LLM
+  - Parse SPARQL from response
+  - Find external database
+  - Connect, execute, parse results
+  - Build audit trail yourself
+- **HyperMind Approach**: Same accuracy (~71%), but integrated
+  - Schema auto-extracted from your data
+  - Built-in database executes queries
+  - Audit trail included automatically
+
+**Key insight**: We don't claim better accuracy than the manual approach with schema. We provide integration convenience.
+
+---
+
 ## [0.6.45] - 2025-12-17
 
 ### ARCADE Pipeline Documentation & Benchmark Methodology
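The Recall@k and MRR columns in the benchmark table above are standard retrieval metrics. As an aside, they can be computed from the 1-based rank at which each query's relevant item is retrieved; a minimal sketch (the rank data below is illustrative, not taken from the benchmark):

```javascript
// Recall@k: fraction of queries whose relevant item appears in the top-k results.
// MRR: mean of 1/rank of the first relevant item (0 when it was not retrieved).
// Each entry is the 1-based rank of the relevant item, or null if absent.
function recallAtK(ranks, k) {
  const hits = ranks.filter((r) => r !== null && r <= k).length;
  return hits / ranks.length;
}

function meanReciprocalRank(ranks) {
  const sum = ranks.reduce((acc, r) => acc + (r === null ? 0 : 1 / r), 0);
  return sum / ranks.length;
}

// Illustrative ranks for 5 queries (hypothetical data):
const ranks = [1, 3, null, 2, 7];
console.log(recallAtK(ranks, 5));       // 0.6  (3 of 5 queries hit within top-5)
console.log(recallAtK(ranks, 10));      // 0.8
console.log(meanReciprocalRank(ranks)); // (1 + 1/3 + 0 + 1/2 + 1/7) / 5
```

This is why Recall@10 can stay high (94%) while MRR moves around: MRR is sensitive to the exact rank of the first hit, Recall@k only to whether a hit lands inside the cutoff.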
package/README.md
CHANGED

@@ -135,6 +135,31 @@ db.loadTtl(':Provider123 :hasRiskScore "0.87" .', null)
 └─────────────────────────────────────────────────────────────────────────────┘
 ```
 
+### Memory Retrieval Depth Benchmark
+
+Based on academic benchmarks: MemQ (arXiv 2503.05193), mKGQAgent (Text2SPARQL 2025), MTEB.
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│ BENCHMARK: Memory Retrieval at Depth (50 queries per depth)                 │
+│ METHODOLOGY: LUBM schema-driven queries, HNSW index, random seed 42         │
+├─────────────────────────────────────────────────────────────────────────────┤
+│                                                                             │
+│  DEPTH   │ P50 LATENCY │ P95 LATENCY │ Recall@5 │ Recall@10 │ MRR          │
+│  ──────────────────────────────────────────────────────────────────────────│
+│  10      │ 0.06 ms     │ 0.26 ms     │ 78%      │ 100%      │ 0.68         │
+│  100     │ 0.50 ms     │ 0.75 ms     │ 88%      │ 98%       │ 0.42         │
+│  1,000   │ 1.59 ms     │ 5.03 ms     │ 80%      │ 94%       │ 0.50         │
+│  10,000  │ 16.71 ms    │ 17.37 ms    │ 76%      │ 94%       │ 0.54         │
+│  ──────────────────────────────────────────────────────────────────────────│
+│                                                                             │
+│  KEY INSIGHT: Even at 10,000 stored queries, Recall@10 stays at 94%         │
+│  Sub-17ms retrieval from 10K query pool = practical for production use      │
+│                                                                             │
+│  Reproduce: node memory-retrieval-benchmark.js                              │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
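The P50/P95 columns above are order statistics over per-query wall-clock timings. A library-agnostic sketch of how such percentiles can be measured in Node (the `queryFn` workload is a placeholder, not the rust-kgdb API):

```javascript
// Nearest-rank percentile over an ascending-sorted array of timings.
function percentile(sorted, p) {
  const idx = Math.max(0, Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1));
  return sorted[idx];
}

// Run queryFn n times and report P50/P95 latency in milliseconds.
function benchLatency(queryFn, n = 50) {
  const times = [];
  for (let i = 0; i < n; i++) {
    const start = process.hrtime.bigint();
    queryFn(i);
    const end = process.hrtime.bigint();
    times.push(Number(end - start) / 1e6); // nanoseconds -> milliseconds
  }
  times.sort((a, b) => a - b);
  return { p50: percentile(times, 50), p95: percentile(times, 95) };
}

// Example with a dummy CPU-bound workload:
const { p50, p95 } = benchLatency(() => {
  let s = 0;
  for (let j = 0; j < 1e4; j++) s += j;
});
console.log(p50 <= p95); // true: percentiles of a sorted sample are monotone
```

Reporting P95 alongside P50, as the box does, matters because HNSW search latency has a tail: the 1,000-depth row shows P95 (5.03 ms) more than 3x its P50 (1.59 ms).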
138
163
|
### Where We Actually Outperform (Database Performance)
|
|
139
164
|
|
|
140
165
|
```
|
|
@@ -188,47 +213,60 @@ db.loadTtl(':Provider123 :hasRiskScore "0.87" .', null)
|
|
|
188
213
|
|
|
189
214
|
---
|
|
190
215
|
|
|
191
|
-
## The Difference:
|
|
216
|
+
## The Difference: Manual vs Integrated
|
|
192
217
|
|
|
193
|
-
###
|
|
218
|
+
### Manual Approach (Works, But Tedious)
|
|
194
219
|
|
|
195
220
|
```javascript
|
|
196
|
-
//
|
|
221
|
+
// STEP 1: Manually write your schema (takes hours for large ontologies)
|
|
222
|
+
const LUBM_SCHEMA = `
|
|
223
|
+
PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
|
|
224
|
+
Classes: University, Department, Professor, Student, Course, Publication
|
|
225
|
+
Properties: teacherOf(Faculty→Course), worksFor(Faculty→Department)
|
|
226
|
+
`;
|
|
227
|
+
|
|
228
|
+
// STEP 2: Pass schema to LLM
|
|
197
229
|
const answer = await openai.chat.completions.create({
|
|
198
230
|
model: 'gpt-4o',
|
|
199
|
-
messages: [
|
|
231
|
+
messages: [
|
|
232
|
+
{ role: 'system', content: `${LUBM_SCHEMA}\nOutput raw SPARQL only.` },
|
|
233
|
+
{ role: 'user', content: 'Find suspicious providers' }
|
|
234
|
+
]
|
|
200
235
|
});
|
|
201
236
|
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
//
|
|
206
|
-
//
|
|
207
|
-
//
|
|
208
|
-
//
|
|
209
|
-
//
|
|
237
|
+
// STEP 3: Parse out the SPARQL (handle markdown, explanations, etc.)
|
|
238
|
+
const sparql = extractSPARQL(answer.choices[0].message.content);
|
|
239
|
+
|
|
240
|
+
// STEP 4: Find a SPARQL database (Jena? RDFox? Virtuoso?)
|
|
241
|
+
// STEP 5: Connect to database
|
|
242
|
+
// STEP 6: Execute query
|
|
243
|
+
// STEP 7: Parse results
|
|
244
|
+
// STEP 8: No audit trail - you'd have to build that yourself
|
|
245
|
+
|
|
246
|
+
// RESULT: ~71% accuracy (same as HyperMind with schema)
|
|
247
|
+
// BUT: 5-8 manual integration steps
|
|
210
248
|
```
|
|
211
249
|
|
|
212
|
-
###
|
|
250
|
+
### HyperMind Approach (Integrated)
|
|
213
251
|
|
|
214
252
|
```javascript
|
|
215
|
-
//
|
|
253
|
+
// ONE-TIME SETUP: Load your data
|
|
216
254
|
const { HyperMindAgent, GraphDB } = require('rust-kgdb');
|
|
217
255
|
|
|
218
256
|
const db = new GraphDB('http://insurance.org/');
|
|
219
|
-
db.loadTtl(yourActualData, null); //
|
|
257
|
+
db.loadTtl(yourActualData, null); // Schema auto-extracted from data
|
|
220
258
|
|
|
221
259
|
const agent = new HyperMindAgent({ kg: db, model: 'gpt-4o' });
|
|
222
260
|
const result = await agent.call('Find suspicious providers');
|
|
223
261
|
|
|
224
262
|
console.log(result.answer);
|
|
225
263
|
// "Provider PROV001 has risk score 0.87 with 47 claims over $50,000"
|
|
226
|
-
|
|
227
|
-
//
|
|
228
|
-
// ✅
|
|
229
|
-
// ✅
|
|
230
|
-
// ✅
|
|
231
|
-
// ✅
|
|
264
|
+
|
|
265
|
+
// WHAT YOU GET (ALL AUTOMATIC):
|
|
266
|
+
// ✅ Schema auto-extracted (no manual prompt engineering)
|
|
267
|
+
// ✅ Query executed on built-in database (no external DB needed)
|
|
268
|
+
// ✅ Full audit trail included
|
|
269
|
+
// ✅ Reproducible hash for compliance
|
|
232
270
|
|
|
233
271
|
console.log(result.reasoningTrace);
|
|
234
272
|
// [
|
|
@@ -240,7 +278,9 @@ console.log(result.hash);
|
|
|
240
278
|
// "sha256:8f3a2b1c..." - Same question = Same answer = Same hash
|
|
241
279
|
```
|
|
242
280
|
|
|
243
|
-
**
|
|
281
|
+
**Honest comparison**: Both approaches achieve ~71% accuracy on LUBM benchmark. The difference is integration effort:
|
|
282
|
+
- **Manual**: Write schema, integrate database, build audit trail yourself
|
|
283
|
+
- **HyperMind**: Database + schema extraction + audit trail built-in
|
|
244
284
|
|
|
245
285
|
---
|
|
246
286
|
|
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.6.45",
+  "version": "0.6.47",
   "description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
   "main": "index.js",
   "types": "index.d.ts",