rust-kgdb 0.6.82 → 0.6.84

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -4,7 +4,9 @@
  [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
  [![W3C](https://img.shields.io/badge/W3C-SPARQL%201.1%20%7C%20RDF%201.2-blue)](https://www.w3.org/TR/sparql11-query/)
 
- > **Enterprise Knowledge Graph with Native Graph Embeddings**: A production-grade RDF database featuring built-in RDF2Vec, multi-vector composite search, and distributed SPARQL execution—engineered for teams who need verifiable AI at scale.
+ > **Your knowledge is scattered. Your claims live in Snowflake. Your customer graph sits in Neo4j. Your risk models run on BigQuery. Your compliance docs are in SharePoint. And your AI? It hallucinates because it can't see the full picture.**
+ >
+ > rust-kgdb unifies scattered enterprise knowledge into a single queryable graph—with native embeddings, cross-database federation, and AI that generates queries instead of fabricating answers. No hallucinations. Full audit trails. One query across everything.
 
  ---
 
@@ -54,21 +56,35 @@ const { GraphDB, Rdf2VecEngine, EmbeddingService } = require('rust-kgdb')
 
  ## The Problem With AI Today
 
- Enterprise AI projects keep failing. Not because the technology is bad, but because organizations use it wrong.
+ **Here's what actually happens in every enterprise AI project:**
 
- A claims investigator asks ChatGPT: *"Has Provider #4521 shown suspicious billing patterns?"*
+ Your fraud analyst asks a simple question: *"Show me high-risk customers with large account balances who've had claims in the past 6 months."*
 
- The AI responds confidently: *"Yes, Provider #4521 has a history of duplicate billing and upcoding."*
+ Sounds simple. It's not.
 
- The investigator opens a case. Weeks later, legal discovers Provider #4521 has a perfect record. **The AI made it up.** Lawsuit incoming.
+ The **customer data** lives in Snowflake. The **risk scores** are computed in your knowledge graph. The **claims history** sits in BigQuery. The **policy details** are in a legacy Oracle database. And **nobody can write a query that spans all four**.
 
- This keeps happening:
+ So the analyst does what everyone does:
+ 1. Export customers from Snowflake to CSV
+ 2. Run a separate risk query in the graph database
+ 3. Pull claims from BigQuery into another spreadsheet
+ 4. Spend 3 hours in Excel doing VLOOKUP joins
+ 5. Present "findings" that are already 6 hours stale
+
+ **This is the reality of enterprise data in 2025.** Knowledge is scattered across dozens of systems. Every "simple" question requires a data engineering project. And when you finally get your answer, you can't trace how it was derived.
+
+ Now add AI to this mess.
+
+ Your analyst asks ChatGPT the same question. It responds confidently: *"Customer #4521 is high-risk with $847,000 in account balance and 3 recent claims."*
+
+ The analyst opens an investigation. Two weeks later, legal discovers Customer #4521 doesn't exist. **The AI made up everything—the customer ID, the balance, the claims.** The AI had no access to your data. It just generated plausible-sounding text.
 
- - A lawyer cites "Smith v. Johnson (2019)" in court. The judge is confused. **That case doesn't exist.**
- - A doctor avoids prescribing "Nexapril" due to cardiac interactions. **Nexapril isn't a real drug.**
+ This keeps happening:
+ - A lawyer cites "Smith v. Johnson (2019)" in court. **That case doesn't exist.**
+ - A doctor avoids prescribing "Nexapril" for cardiac patients. **Nexapril isn't a real drug.**
  - A fraud analyst flags Account #7842 for money laundering. **It belongs to a children's charity.**
 
- Every time, the same pattern: The AI sounds confident. The AI is wrong. People get hurt.
+ Every time, the same pattern: Data is scattered. AI can't see it. AI fabricates. People get hurt.
 
  ---
 
@@ -91,29 +107,46 @@ A real solution requires a different architecture. One built on solid engineerin
 
  ## The Solution: Query Generation, Not Answer Generation
 
- What if AI stopped providing answers and started **generating queries**?
+ What if we're thinking about AI wrong?
+
+ Every enterprise wants the same thing: ask a question in plain English, get an accurate answer from their data. But we've been trying to make the AI *know* the answer. That's backwards.
+
+ **The AI doesn't need to know anything. It just needs to know how to ask.**
+
+ Think about what's actually happening when a fraud analyst asks: *"Show me high-risk customers with large balances."*
 
- Think about it:
- - Your database knows the facts (claims, providers, transactions)
- - AI understands language (can parse "find suspicious patterns")
- - You need both working together
+ The analyst already has everything needed to answer this question:
+ - Customer data in Snowflake
+ - Risk scores in the knowledge graph
+ - Account balances in the core banking system
+ - Complete audit logs of every transaction
 
- **The AI translates intent into queries. The database finds facts. The AI never makes up data.**
+ The problem isn't missing data. It's that **no human can write a query that spans all these systems**. SQL doesn't work on graphs. SPARQL doesn't work on Snowflake. And nobody has 4 hours to manually join CSVs.
+
+ **The breakthrough**: What if AI generated the query instead of the answer?
 
  ```
- Before (Dangerous):
- Human: "Is Provider #4521 suspicious?"
- AI: "Yes, they have billing anomalies" <-- FABRICATED
+ The Old Way (Dangerous):
+ Human: "Show me high-risk customers with large balances"
+ AI: "Customer #4521 has $847K and high risk score" <-- FABRICATED
+
+ The New Way (Verifiable):
+ Human: "Show me high-risk customers with large balances"
+ AI: Understands intent → Generates federated SQL:
+
+ SELECT kg.customer, kg.risk_score, sf.balance
+ FROM graph_search('...risk assessment...') kg
+ JOIN snowflake.ACCOUNTS sf ON kg.customer_id = sf.id
+ WHERE kg.risk_score > 0.8 AND sf.balance > 100000
 
- After (Safe):
- Human: "Is Provider #4521 suspicious?"
- AI: Generates SPARQL query
- AI: Executes against YOUR database
- Database: Returns actual facts about Provider #4521
- Result: Real data with audit trail <-- VERIFIABLE
+ Database: Executes across KGDB + Snowflake + BigQuery
+ Result: Real customers. Real balances. Real risk scores.
+ With SHA-256 proof hash for audit trail. <-- VERIFIABLE
  ```
 
- rust-kgdb is a knowledge graph database with an AI layer that **cannot hallucinate** because it only returns data from your actual systems.
+ The AI never touches your data. It translates human language into precise queries. The database executes against real systems. Every answer traces back to actual records.
+
+ **rust-kgdb is not an AI that knows answers. It's an AI that knows how to ask the right questions—across every system where your knowledge lives.**
 
  ---
 
@@ -149,18 +182,29 @@ The math matters. When your fraud detection runs 35x faster, you catch fraud bef
 
  ## Why rust-kgdb and HyperMind?
 
- Most AI frameworks trust the LLM. We don't.
+ **The question isn't "Can AI answer my question?" It's "Can I trust the answer?"**
+
+ Every AI framework makes the same mistake: they treat the LLM as the source of truth. LangChain. LlamaIndex. AutoGPT. They all assume the model knows things. It doesn't. It generates plausible text. There's a difference.
+
+ We built rust-kgdb on a contrarian principle: **Never trust the AI. Verify everything.**
+
+ The LLM proposes a query. The type system validates it against your actual schema. The sandbox executes it in isolation. The database returns only facts that exist. The proof DAG creates a cryptographic audit trail.
+
+ At no point does the AI "know" anything. It's a translator—from human intent to precise queries—with four layers of verification before anything touches your data.
+
+ **This is the difference between an AI that sounds right and an AI that is right.**
 
- ### Core Capabilities
+ ### The Engineering Foundation
 
- | Layer | Feature | What It Does |
- |-------|---------|--------------|
- | **Database** | GraphDB | W3C SPARQL 1.1 compliant RDF store with 449ns lookups |
+ | Layer | Component | What It Does |
+ |-------|-----------|--------------|
+ | **Database** | GraphDB | W3C SPARQL 1.1 compliant RDF store, 449ns lookups, 35x faster than RDFox |
  | **Database** | Distributed SPARQL | HDRF partitioning across Kubernetes executors |
- | **Embeddings** | Rdf2VecEngine | Train 384-dim vectors from graph random walks |
+ | **Federation** | HyperFederate | Cross-database SQL: KGDB + Snowflake + BigQuery in single query |
+ | **Embeddings** | Rdf2VecEngine | Train 384-dim vectors from graph random walks, 68µs lookup |
  | **Embeddings** | EmbeddingService | Multi-provider composite vectors with RRF fusion |
  | **Embeddings** | HNSW Index | Approximate nearest neighbor search in 303µs |
- | **Analytics** | GraphFrames | PageRank, connected components, motif matching |
+ | **Analytics** | GraphFrames | PageRank, connected components, triangle count, motif matching |
  | **Analytics** | Pregel API | Bulk synchronous parallel graph algorithms |
  | **Reasoning** | Datalog Engine | Recursive rule evaluation with fixpoint semantics |
  | **AI Agent** | HyperMindAgent | Schema-aware SPARQL generation from natural language |
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "rust-kgdb",
- "version": "0.6.82",
+ "version": "0.6.84",
  "description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
  "main": "index.js",
  "types": "index.d.ts",