npm - rust-kgdb - Versions diffs - 0.6.1 → 0.6.3 - Mend

rust-kgdb 0.6.1 → 0.6.3

Files changed (2)

package/README.md +36 -17
package/package.json +1 -1

package/README.md CHANGED

@@ -12,16 +12,16 @@
 We asked GPT-4 to write a simple SPARQL query: *"Find all professors."*
-It returned this:
+It returned this broken output:
-```sparql
-```sparql
-SELECT ?professor WHERE { ?professor a ub:Faculty . }
-```
-This query retrieves faculty members from the knowledge graph.
+```text
+    ```sparql
+    SELECT ?professor WHERE { ?professor a ub:Faculty . }
+    ```
+    This query retrieves faculty members from the knowledge graph.
 ```
-Three problems: markdown code fences break the parser, `ub:Faculty` doesn't exist in the schema (it's `ub:Professor`), and the explanation text is mixed with the query. **Result: Parser error. Zero results.**
+Three problems: (1) markdown code fences break the parser, (2) `ub:Faculty` doesn't exist in the schema (it's `ub:Professor`), and (3) the explanation text is mixed with the query. **Result: Parser error. Zero results.**
 This isn't a cherry-picked failure. When we ran the standard LUBM benchmark (14 queries, 3,272 triples), vanilla LLMs produced valid, correct SPARQL **0% of the time**.
@@ -261,23 +261,42 @@ Token limits are real. rust-kgdb uses a **rolling time window strategy** to find
 └─────────────────────────────────────────────────────────────────────────────────┘
 ```
-### Idempotent Responses via Query Cache
+### Idempotent Responses via Semantic Hashing
-Same question = Same answer. Critical for compliance.
+Same question = Same answer. Even with **different wording**. Critical for compliance.
 ```javascript
-// First call: Compute answer, cache result
+// First call: Compute answer, cache with semantic hash
 const result1 = await agent.call("Analyze claims from Provider P001")
-// Hash: sha256:9f86d081...b0f00a08
+// Semantic Hash: semhash:fraud-provider-p001-claims-analysis
+// Second call (different wording, same intent): Cache HIT!
+const result2 = await agent.call("Show me P001's claim patterns")
+// Cache HIT - same semantic hash: semhash:fraud-provider-p001-claims-analysis
-// Second call (10 minutes later): Return cached result
-const result2 = await agent.call("Analyze claims from Provider P001")
-// Cache HIT - same hash: sha256:9f86d081...b0f00a08
+// Third call (exact same): Also cache hit
+const result3 = await agent.call("Analyze claims from Provider P001")
+// Cache HIT - same semantic hash: semhash:fraud-provider-p001-claims-analysis
 // Compliance officer: "Why are these identical?"
-// You: "Idempotent responses - same input, same output, cryptographic proof."
+// You: "Semantic hashing - same meaning, same output, regardless of phrasing."
 ```
+**How it works**: Query embeddings are hashed via **Locality-Sensitive Hashing (LSH)** with random hyperplane projections. Semantically similar queries map to the same bucket.
+**Research Foundation**:
+- **SimHash** (Charikar, 2002) - Random hyperplane projections for cosine similarity
+- **Semantic Hashing** (Salakhutdinov & Hinton, 2009) - Deep autoencoders for binary codes
+- **Learning to Hash** (Wang et al., 2018) - Survey of neural hashing methods
+**Implementation**: 384-dim embeddings → LSH with 64 hyperplanes → 64-bit semantic hash
+**Benefits**:
+- **Semantic deduplication** - "Find fraud" and "Detect fraudulent activity" hit same cache
+- **Cost reduction** - Avoid redundant LLM calls for paraphrased questions
+- **Consistency** - Same answer for same intent, audit-ready
+- **Sub-linear lookup** - O(1) hash lookup vs O(n) embedding comparison
 ---
 ## What This Is
@@ -2501,7 +2520,7 @@ that appear to be related. There might be collusion with provider PROV001."
     "output": "collusion(P001,P002,PROV001)",
     "derivation": "claim(CLM001,P001,PROV001) ∧ claim(CLM002,P002,PROV001) ∧ related(P001,P002) → collusion(P001,P002,PROV001)",
     "timestamp": "2024-12-14T10:30:00Z",
-    "hash": "sha256:9f86d081...b0f00a08"
+    "semanticHash": "semhash:collusion-p001-p002-prov001"
   }
 }
 ```
@@ -2521,7 +2540,7 @@ that appear to be related. There might be collusion with provider PROV001."
 5. Unification: `?P1=P001, ?P2=P002, ?Prov=PROV001`
 6. Conclusion: `collusion(P001, P002, PROV001)` - QED
-Here's the SHA-256 hash of this execution: `sha256:9f86d081...b0f00a08`"
+Here's the semantic hash: `semhash:collusion-p001-p002-prov001` - same query intent will always return this exact result."
 **Result:** HyperMind passes audit. DSPy gets you a follow-up meeting with legal.
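
The "How it works" paragraph added in this README diff describes hashing 384-dim query embeddings with 64 random hyperplanes into a 64-bit value. The following is a minimal sketch of that general technique (random-hyperplane LSH in the style of SimHash), not rust-kgdb's actual implementation. Every function name, the seeded generator, and the hex output format are assumptions made for illustration, and the embedding is a random stand-in for the output of a real embedding model.

```javascript
// Sketch only: every name below is hypothetical, not rust-kgdb's API.

// Small seeded PRNG (mulberry32) so the hyperplanes are reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal sample via the Box-Muller transform.
function gaussian(rand) {
  const u1 = 1 - rand(); // in (0, 1], avoids log(0)
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Draw `bits` random hyperplanes (as normal vectors) in `dim` dimensions.
function makeHyperplanes(dim, bits, seed) {
  const rand = mulberry32(seed);
  return Array.from({ length: bits }, () =>
    Float64Array.from({ length: dim }, () => gaussian(rand))
  );
}

// Bit i is 1 when the embedding lies on the non-negative side of plane i.
// Returns the 64-bit result as a 16-character hex string.
function semanticHash(embedding, planes) {
  let hash = 0n;
  planes.forEach((plane, i) => {
    let dot = 0;
    for (let d = 0; d < plane.length; d++) dot += plane[d] * embedding[d];
    if (dot >= 0) hash |= 1n << BigInt(i);
  });
  return hash.toString(16).padStart(16, "0");
}

// Number of differing bits between two hex hashes.
function hammingDistance(hexA, hexB) {
  let x = BigInt("0x" + hexA) ^ BigInt("0x" + hexB);
  let count = 0;
  while (x > 0n) {
    count += Number(x & 1n);
    x >>= 1n;
  }
  return count;
}

// Usage: a random vector stands in for a real 384-dim query embedding.
const planes = makeHyperplanes(384, 64, 42);
const embeddingRand = mulberry32(7);
const embedding = Array.from({ length: 384 }, () => gaussian(embeddingRand));
const cacheKey = semanticHash(embedding, planes);
```

Because each bit depends only on the sign of a dot product, the hash ignores vector magnitude and tracks direction, which is why it approximates cosine similarity. Note that similar embeddings agree on most bits rather than necessarily all 64, so a cache built on this sketch would likely need a Hamming-distance threshold (or several shorter hash bands) rather than exact key equality to catch paraphrases reliably.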

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.6.1",
+  "version": "0.6.3",
   "description": "Production-grade Neuro-Symbolic AI Framework with Memory Hypergraph: +86.4% accuracy improvement over vanilla LLMs. High-performance knowledge graph (2.78µs lookups, 35x faster than RDFox). Features Memory Hypergraph (temporal scoring, rolling context window, idempotent responses), fraud detection, underwriting agents, WASM sandbox, type/category/proof theory, and W3C SPARQL 1.1 compliance.",
   "main": "index.js",
   "types": "index.d.ts",