rust-kgdb 0.6.82 → 0.6.84
This diff shows the content of publicly available package versions as released to their public registries. It is provided for informational purposes only.
- package/README.md +76 -32
- package/package.json +1 -1
- package/README.archive.md +0 -2632
- package/README.archive.md.old +0 -1206
package/README.md
CHANGED
@@ -4,7 +4,9 @@
 [](https://opensource.org/licenses/Apache-2.0)
 [](https://www.w3.org/TR/sparql11-query/)
 
-> **
+> **Your knowledge is scattered. Your claims live in Snowflake. Your customer graph sits in Neo4j. Your risk models run on BigQuery. Your compliance docs are in SharePoint. And your AI? It hallucinates because it can't see the full picture.**
+>
+> rust-kgdb unifies scattered enterprise knowledge into a single queryable graph—with native embeddings, cross-database federation, and AI that generates queries instead of fabricating answers. No hallucinations. Full audit trails. One query across everything.
 
 ---
 
@@ -54,21 +56,35 @@ const { GraphDB, Rdf2VecEngine, EmbeddingService } = require('rust-kgdb')
 
 ## The Problem With AI Today
 
-
+**Here's what actually happens in every enterprise AI project:**
 
-
+Your fraud analyst asks a simple question: *"Show me high-risk customers with large account balances who've had claims in the past 6 months."*
 
-
+Sounds simple. It's not.
 
-The
+The **customer data** lives in Snowflake. The **risk scores** are computed in your knowledge graph. The **claims history** sits in BigQuery. The **policy details** are in a legacy Oracle database. And **nobody can write a query that spans all four**.
 
-
+So the analyst does what everyone does:
+1. Export customers from Snowflake to CSV
+2. Run a separate risk query in the graph database
+3. Pull claims from BigQuery into another spreadsheet
+4. Spend 3 hours in Excel doing VLOOKUP joins
+5. Present "findings" that are already 6 hours stale
+
+**This is the reality of enterprise data in 2025.** Knowledge is scattered across dozens of systems. Every "simple" question requires a data engineering project. And when you finally get your answer, you can't trace how it was derived.
+
+Now add AI to this mess.
+
+Your analyst asks ChatGPT the same question. It responds confidently: *"Customer #4521 is high-risk with $847,000 in account balance and 3 recent claims."*
+
+The analyst opens an investigation. Two weeks later, legal discovers Customer #4521 doesn't exist. **The AI made up everything—the customer ID, the balance, the claims.** The AI had no access to your data. It just generated plausible-sounding text.
 
-
-- A
+This keeps happening:
+- A lawyer cites "Smith v. Johnson (2019)" in court. **That case doesn't exist.**
+- A doctor avoids prescribing "Nexapril" for cardiac patients. **Nexapril isn't a real drug.**
 - A fraud analyst flags Account #7842 for money laundering. **It belongs to a children's charity.**
 
-Every time, the same pattern:
+Every time, the same pattern: Data is scattered. AI can't see it. AI fabricates. People get hurt.
 
 ---
 
@@ -91,29 +107,46 @@ A real solution requires a different architecture. One built on solid engineering
 
 ## The Solution: Query Generation, Not Answer Generation
 
-What if
+What if we're thinking about AI wrong?
+
+Every enterprise wants the same thing: ask a question in plain English, get an accurate answer from their data. But we've been trying to make the AI *know* the answer. That's backwards.
+
+**The AI doesn't need to know anything. It just needs to know how to ask.**
+
+Think about what's actually happening when a fraud analyst asks: *"Show me high-risk customers with large balances."*
 
-
--
--
--
+The analyst already has everything needed to answer this question:
+- Customer data in Snowflake
+- Risk scores in the knowledge graph
+- Account balances in the core banking system
+- Complete audit logs of every transaction
 
-
+The problem isn't missing data. It's that **no human can write a query that spans all these systems**. SQL doesn't work on graphs. SPARQL doesn't work on Snowflake. And nobody has 4 hours to manually join CSVs.
+
+**The breakthrough**: What if AI generated the query instead of the answer?
 
 ```
-
-Human: "
-AI: "
+The Old Way (Dangerous):
+Human: "Show me high-risk customers with large balances"
+AI: "Customer #4521 has $847K and high risk score" <-- FABRICATED
+
+The New Way (Verifiable):
+Human: "Show me high-risk customers with large balances"
+AI: Understands intent → Generates federated SQL:
+
+SELECT kg.customer, kg.risk_score, sf.balance
+FROM graph_search('...risk assessment...') kg
+JOIN snowflake.ACCOUNTS sf ON kg.customer_id = sf.id
+WHERE kg.risk_score > 0.8 AND sf.balance > 100000
 
-
-
-
-AI: Executes against YOUR database
-Database: Returns actual facts about Provider #4521
-Result: Real data with audit trail <-- VERIFIABLE
+Database: Executes across KGDB + Snowflake + BigQuery
+Result: Real customers. Real balances. Real risk scores.
+With SHA-256 proof hash for audit trail. <-- VERIFIABLE
 ```
 
-
+The AI never touches your data. It translates human language into precise queries. The database executes against real systems. Every answer traces back to actual records.
+
+**rust-kgdb is not an AI that knows answers. It's an AI that knows how to ask the right questions—across every system where your knowledge lives.**
 
 ---
 
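To make the new query-generation flow concrete from Node.js, here is a minimal sketch. The `GraphDB` and `HyperMindAgent` exports are real per the package metadata, but the constructor arguments, the `ask` method, and the `query`/`rows`/`proofHash` result fields are illustrative assumptions, not the package's documented API.

```js
// Minimal sketch of the query-generation flow described above.
// NOTE: `GraphDB` and `HyperMindAgent` are real exports per the package
// metadata; every method and field below is a hypothetical illustration.
const { GraphDB, HyperMindAgent } = require('rust-kgdb')

async function main() {
  const db = new GraphDB()                 // assumed constructor usage
  const agent = new HyperMindAgent({ db }) // assumed options shape

  // The agent translates intent into a query; the database supplies the facts.
  const result = await agent.ask('Show me high-risk customers with large balances')

  console.log(result.query)     // generated SPARQL/SQL, reviewable before trusting it
  console.log(result.rows)      // rows returned by the database, not by the LLM
  console.log(result.proofHash) // audit-trail hash, per the README's claims
}

main().catch(console.error)
```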
@@ -149,18 +182,29 @@ The math matters. When your fraud detection runs 35x faster, you catch fraud before
 
 ## Why rust-kgdb and HyperMind?
 
-
+**The question isn't "Can AI answer my question?" It's "Can I trust the answer?"**
+
+Every AI framework makes the same mistake: it treats the LLM as the source of truth. LangChain. LlamaIndex. AutoGPT. They all assume the model knows things. It doesn't. It generates plausible text. There's a difference.
+
+We built rust-kgdb on a contrarian principle: **Never trust the AI. Verify everything.**
+
+The LLM proposes a query. The type system validates it against your actual schema. The sandbox executes it in isolation. The database returns only facts that exist. The proof DAG creates a cryptographic audit trail.
+
+At no point does the AI "know" anything. It's a translator—from human intent to precise queries—with four layers of verification before anything touches your data.
+
+**This is the difference between an AI that sounds right and an AI that is right.**
 
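The four verification layers added above map onto a simple control flow. A minimal sketch, assuming hypothetical `llm`, `schema`, and `sandbox` objects standing in for rust-kgdb's internals; only Node's built-in `crypto` module is real here.

```js
// Sketch of the four-layer verification pipeline. The layering comes from
// the README text; the object and method names are hypothetical, not
// rust-kgdb's actual API.
const crypto = require('crypto')

async function verifiedAnswer(question, { llm, schema, sandbox }) {
  // Layer 1: the LLM proposes a query. Its output is untrusted input.
  const proposedQuery = await llm.generateQuery(question, schema)

  // Layer 2: the type system validates the proposal against the real schema.
  if (!schema.validates(proposedQuery)) {
    throw new Error('Proposed query references unknown tables or predicates')
  }

  // Layer 3: the sandbox executes in isolation; only stored facts come back.
  const rows = await sandbox.execute(proposedQuery)

  // Layer 4: hash query + results into a proof for the audit trail.
  const proofHash = crypto
    .createHash('sha256')
    .update(proposedQuery + JSON.stringify(rows))
    .digest('hex')

  return { query: proposedQuery, rows, proofHash }
}
```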
-###
+### The Engineering Foundation
 
-| Layer |
-
-| **Database** | GraphDB | W3C SPARQL 1.1 compliant RDF store
+| Layer | Component | What It Does |
+|-------|-----------|--------------|
+| **Database** | GraphDB | W3C SPARQL 1.1 compliant RDF store, 449ns lookups, 35x faster than RDFox |
 | **Database** | Distributed SPARQL | HDRF partitioning across Kubernetes executors |
-| **
+| **Federation** | HyperFederate | Cross-database SQL: KGDB + Snowflake + BigQuery in single query |
+| **Embeddings** | Rdf2VecEngine | Train 384-dim vectors from graph random walks, 68µs lookup |
 | **Embeddings** | EmbeddingService | Multi-provider composite vectors with RRF fusion |
 | **Embeddings** | HNSW Index | Approximate nearest neighbor search in 303µs |
-| **Analytics** | GraphFrames | PageRank, connected components, motif matching |
+| **Analytics** | GraphFrames | PageRank, connected components, triangle count, motif matching |
 | **Analytics** | Pregel API | Bulk synchronous parallel graph algorithms |
 | **Reasoning** | Datalog Engine | Recursive rule evaluation with fixpoint semantics |
 | **AI Agent** | HyperMindAgent | Schema-aware SPARQL generation from natural language |
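For orientation, the layers in this table compose roughly as sketched below. The exports come from the README's own require line; `loadTtl`, `query`, `train`, and `nearestNeighbors` (and their argument shapes) are assumptions made for illustration.

```js
// Hypothetical composition of the database and embeddings layers from the
// table above; method names are illustrative, not the documented API.
const { GraphDB, Rdf2VecEngine } = require('rust-kgdb')

async function main() {
  const db = new GraphDB()

  // Database layer: load triples, then run a SPARQL 1.1 query.
  await db.loadTtl('customers.ttl') // assumed loader
  const highRisk = await db.query(`
    SELECT ?customer ?score
    WHERE { ?customer <http://example.org/riskScore> ?score .
            FILTER(?score > 0.8) }
  `) // assumed query method

  // Embeddings layer: 384-dim vectors from random walks (per the table),
  // then approximate nearest-neighbor lookup via the HNSW index.
  const rdf2vec = new Rdf2VecEngine({ dimensions: 384 }) // assumed options
  await rdf2vec.train(db)
  const similar = await rdf2vec.nearestNeighbors('http://example.org/customer/42', 10)

  console.log(highRisk, similar)
}

main().catch(console.error)
```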
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.6.82",
+  "version": "0.6.84",
   "description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
   "main": "index.js",
   "types": "index.d.ts",