rust-kgdb 0.6.1 → 0.6.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +36 -17
- package/package.json +1 -1
package/README.md
CHANGED
@@ -12,16 +12,16 @@
 
 We asked GPT-4 to write a simple SPARQL query: *"Find all professors."*
 
-It returned this:
+It returned this broken output:
 
-```
-```sparql
-SELECT ?professor WHERE { ?professor a ub:Faculty . }
-```
-This query retrieves faculty members from the knowledge graph.
+```text
+```sparql
+SELECT ?professor WHERE { ?professor a ub:Faculty . }
+```
+This query retrieves faculty members from the knowledge graph.
 ```
 
-Three problems: markdown code fences break the parser, `ub:Faculty` doesn't exist in the schema (it's `ub:Professor`), and the explanation text is mixed with the query. **Result: Parser error. Zero results.**
+Three problems: (1) markdown code fences break the parser, (2) `ub:Faculty` doesn't exist in the schema (it's `ub:Professor`), and (3) the explanation text is mixed with the query. **Result: Parser error. Zero results.**
 
 This isn't a cherry-picked failure. When we ran the standard LUBM benchmark (14 queries, 3,272 triples), vanilla LLMs produced valid, correct SPARQL **0% of the time**.
 
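For contrast with the broken output above, here is a sketch of what a parseable query for this schema looks like. Only the `ub:Professor` class name comes from the README's own correction; the `PREFIX` URI and the `db.query()` call are illustrative assumptions, since this diff shows neither rust-kgdb's query API nor the prefix mapping:

```javascript
// Illustrative sketch only - db.query() and the PREFIX URI are assumptions
// not shown in this diff. The class fix (ub:Professor) is from the README.
const query = `
  PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
  SELECT ?professor WHERE { ?professor a ub:Professor . }
`
// No markdown fences, no prose mixed into the query string.
const professors = await db.query(query)
```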
@@ -261,23 +261,42 @@ Token limits are real. rust-kgdb uses a **rolling time window strategy** to find
 └─────────────────────────────────────────────────────────────────────────────────┘
 ```
 
-### Idempotent Responses via
+### Idempotent Responses via Semantic Hashing
 
-Same question = Same answer. Critical for compliance.
+Same question = Same answer. Even with **different wording**. Critical for compliance.
 
 ```javascript
-// First call: Compute answer, cache
+// First call: Compute answer, cache with semantic hash
 const result1 = await agent.call("Analyze claims from Provider P001")
-// Hash:
+// Semantic Hash: semhash:fraud-provider-p001-claims-analysis
+
+// Second call (different wording, same intent): Cache HIT!
+const result2 = await agent.call("Show me P001's claim patterns")
+// Cache HIT - same semantic hash: semhash:fraud-provider-p001-claims-analysis
 
-//
-const
-// Cache HIT - same hash:
+// Third call (exact same): Also cache hit
+const result3 = await agent.call("Analyze claims from Provider P001")
+// Cache HIT - same semantic hash: semhash:fraud-provider-p001-claims-analysis
 
 // Compliance officer: "Why are these identical?"
-// You: "
+// You: "Semantic hashing - same meaning, same output, regardless of phrasing."
 ```
 
+**How it works**: Query embeddings are hashed via **Locality-Sensitive Hashing (LSH)** with random hyperplane projections. Semantically similar queries map to the same bucket.
+
+**Research Foundation**:
+- **SimHash** (Charikar, 2002) - Random hyperplane projections for cosine similarity
+- **Semantic Hashing** (Salakhutdinov & Hinton, 2009) - Deep autoencoders for binary codes
+- **Learning to Hash** (Wang et al., 2018) - Survey of neural hashing methods
+
+**Implementation**: 384-dim embeddings → LSH with 64 hyperplanes → 64-bit semantic hash
+
+**Benefits**:
+- **Semantic deduplication** - "Find fraud" and "Detect fraudulent activity" hit same cache
+- **Cost reduction** - Avoid redundant LLM calls for paraphrased questions
+- **Consistency** - Same answer for same intent, audit-ready
+- **Sub-linear lookup** - O(1) hash lookup vs O(n) embedding comparison
+
 ---
 
 ## What This Is
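The new **How it works** text maps directly onto the SimHash construction it cites. Here is a minimal sketch of the 64-hyperplane scheme described above, assuming a hypothetical `embed()` that returns the 384-dim query embedding; function names are illustrative, not rust-kgdb's actual API:

```javascript
// Minimal SimHash-style sketch of the 64-hyperplane LSH described above.
// Helper names (randomHyperplanes, semanticHash, embed) are assumptions.
function randomHyperplanes(numPlanes, dim, rng = Math.random) {
  // One random Gaussian normal vector per hyperplane (Box-Muller transform).
  return Array.from({ length: numPlanes }, () =>
    Array.from({ length: dim }, () => {
      const u = 1 - rng() // (0, 1], avoids log(0)
      const v = rng()
      return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v)
    })
  )
}

function semanticHash(embedding, planes) {
  // SimHash: each hyperplane contributes one bit - which side of the
  // plane the embedding falls on. 64 planes -> 64-bit hash (BigInt).
  let hash = 0n
  for (const plane of planes) {
    const dot = plane.reduce((sum, w, i) => sum + w * embedding[i], 0)
    hash = (hash << 1n) | (dot >= 0 ? 1n : 0n)
  }
  return hash
}

// Usage sketch: paraphrases with nearby embeddings tend to share a hash,
// so the hash can serve directly as an O(1) cache key.
const planes = randomHyperplanes(64, 384)
// const h1 = semanticHash(embed("Analyze claims from Provider P001"), planes)
// const h2 = semanticHash(embed("Show me P001's claim patterns"), planes)
// h1 === h2  =>  cache HIT, return the stored answer
```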
@@ -2501,7 +2520,7 @@ that appear to be related. There might be collusion with provider PROV001."
     "output": "collusion(P001,P002,PROV001)",
     "derivation": "claim(CLM001,P001,PROV001) ∧ claim(CLM002,P002,PROV001) ∧ related(P001,P002) → collusion(P001,P002,PROV001)",
     "timestamp": "2024-12-14T10:30:00Z",
-    "
+    "semanticHash": "semhash:collusion-p001-p002-prov001"
   }
 }
 ```
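The `semanticHash` field added here is what makes the proof object replayable. A hypothetical usage sketch follows; the response shape (`.proof.semanticHash`) and the second phrasing are assumptions for illustration, not a documented API:

```javascript
// Hypothetical audit check - the field name follows the JSON above, but the
// exact response structure is an assumption.
const first = await agent.call("Did P001 and P002 collude via PROV001?")
const second = await agent.call("Check P001/P002 for collusion through PROV001")

console.assert(
  first.proof.semanticHash === second.proof.semanticHash,
  "same intent should map to the same semantic hash"
)
// Either phrasing can be replayed during an audit and checked against the
// recorded derivation: claim ∧ claim ∧ related → collusion
```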
@@ -2521,7 +2540,7 @@ that appear to be related. There might be collusion with provider PROV001."
 5. Unification: `?P1=P001, ?P2=P002, ?Prov=PROV001`
 6. Conclusion: `collusion(P001, P002, PROV001)` - QED
 
-Here's the
+Here's the semantic hash: `semhash:collusion-p001-p002-prov001` - same query intent will always return this exact result."
 
 **Result:** HyperMind passes audit. DSPy gets you a follow-up meeting with legal.
 
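Steps 5-6 of the proof chain are a single rule application. A self-contained sketch of that unification over the ground facts from the derivation (plain JavaScript for illustration, not rust-kgdb's reasoner API):

```javascript
// Ground facts from the derivation shown in the diff above.
const facts = [
  ["claim", "CLM001", "P001", "PROV001"],
  ["claim", "CLM002", "P002", "PROV001"],
  ["related", "P001", "P002"],
]

// Rule: claim(?C1,?P1,?Prov) ∧ claim(?C2,?P2,?Prov) ∧ related(?P1,?P2)
//       → collusion(?P1,?P2,?Prov)
function deriveCollusion(facts) {
  const claims = facts.filter(([p]) => p === "claim")
  const related = facts.filter(([p]) => p === "related")
  const out = []
  for (const [, , p1, prov1] of claims)
    for (const [, , p2, prov2] of claims)
      if (prov1 === prov2 && related.some(([, a, b]) => a === p1 && b === p2))
        out.push(["collusion", p1, p2, prov1]) // ?P1=P001, ?P2=P002, ?Prov=PROV001
  return out
}

console.log(deriveCollusion(facts)) // [["collusion","P001","P002","PROV001"]]
```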
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.6.1",
+  "version": "0.6.3",
   "description": "Production-grade Neuro-Symbolic AI Framework with Memory Hypergraph: +86.4% accuracy improvement over vanilla LLMs. High-performance knowledge graph (2.78µs lookups, 35x faster than RDFox). Features Memory Hypergraph (temporal scoring, rolling context window, idempotent responses), fraud detection, underwriting agents, WASM sandbox, type/category/proof theory, and W3C SPARQL 1.1 compliance.",
   "main": "index.js",
   "types": "index.d.ts",