npm - rust-kgdb - Versions diffs - 0.6.70 → 0.6.72 - Mend

rust-kgdb 0.6.70 → 0.6.72

Files changed (2)

package/README.md +188 -14
package/package.json +1 -1

package/README.md CHANGED

@@ -97,6 +97,104 @@ The math matters. When your fraud detection runs 35x faster, you catch fraud bef
 ---
+## Why rust-kgdb and HyperMind?
+Most AI frameworks trust the LLM. We don't.
+```
++===========================================================================+
+|                                                                           |
+|   TRADITIONAL AI ARCHITECTURE (Dangerous)                                 |
+|                                                                           |
+|   +-------------+     +-------------+     +-------------+                 |
+|   |   Human     | --> |    LLM      | --> |  Database   |                 |
+|   |   Request   |     |  (Trusted)  |     |   (Maybe)   |                 |
+|   +-------------+     +-------------+     +-------------+                 |
+|                             |                                             |
+|                             v                                             |
+|                       "Provider #4521                                     |
+|                        has anomalies"                                     |
+|                       (FABRICATED!)                                       |
+|                                                                           |
+|   Problem: LLM generates answers directly. No verification.               |
+|                                                                           |
++===========================================================================+
++===========================================================================+
+|                                                                           |
+|   rust-kgdb + HYPERMIND ARCHITECTURE (Safe)                               |
+|                                                                           |
+|   +-------------+     +-------------+     +-------------+                 |
+|   |   Human     | --> |  HyperMind  | --> | rust-kgdb   |                 |
+|   |   Request   |     |   Agent     |     |  GraphDB    |                 |
+|   +-------------+     +------+------+     +------+------+                 |
+|                              |                   |                        |
+|        +---------+-----------+-----------+-------+                        |
+|        |         |           |           |                                |
+|        v         v           v           v                                |
+|   +--------+ +--------+ +--------+ +--------+                             |
+|   | Type   | | WASM   | | Proof  | | Schema |                             |
+|   | Theory | | Sandbox| | DAG    | | Cache  |                             |
+|   +--------+ +--------+ +--------+ +--------+                             |
+|   Hindley-  Capability  SHA-256    Your                                   |
+|   Milner    Isolation   Audit      Ontology                               |
+|                                                                           |
+|   Result: "SELECT ?anomaly WHERE { :Provider4521 :hasAnomaly ?anomaly }"  |
+|           Executes against YOUR data. Returns REAL facts.                 |
+|                                                                           |
++===========================================================================+
++===========================================================================+
+|                                                                           |
+|   THE TRUST MODEL: Four Layers of Defense                                 |
+|                                                                           |
+|   Layer 1: AGENT (Untrusted)                                              |
+|   +---------------------------------------------------------------------+ |
+|   | LLM generates intent: "Find suspicious providers"                   | |
+|   | - Can suggest queries                                               | |
+|   | - Cannot execute anything directly                                  | |
+|   | - All outputs are validated                                         | |
+|   +---------------------------------------------------------------------+ |
+|                              | validated intent                           |
+|                              v                                            |
+|   Layer 2: PROXY (Verified)                                               |
+|   +---------------------------------------------------------------------+ |
+|   | Type-checks against schema: Is "Provider" a valid class?            | |
+|   | - Hindley-Milner type inference                                     | |
+|   | - Schema validation (YOUR ontology)                                 | |
+|   | - Rejects malformed queries before execution                        | |
+|   +---------------------------------------------------------------------+ |
+|                              | typed query                                |
+|                              v                                            |
+|   Layer 3: SANDBOX (Isolated)                                             |
+|   +---------------------------------------------------------------------+ |
+|   | WASM execution with capability-based security                       | |
+|   | - Fuel metering (prevents infinite loops)                           | |
+|   | - Memory isolation (no access to host)                              | |
+|   | - Explicit capability grants (read-only, write, admin)              | |
+|   +---------------------------------------------------------------------+ |
+|                              | sandboxed execution                        |
+|                              v                                            |
+|   Layer 4: DATABASE (Authoritative)                                       |
+|   +---------------------------------------------------------------------+ |
+|   | rust-kgdb executes query against YOUR actual data                   | |
+|   | - 449ns lookups (35x faster than RDFox)                             | |
+|   | - Returns only facts that exist                                     | |
+|   | - Generates SHA-256 proof hash for audit                            | |
+|   +---------------------------------------------------------------------+ |
+|                                                                           |
+|   MATHEMATICAL FOUNDATIONS:                                               |
+|   * Category Theory: Tools as morphisms (A -> B), composable             |
+|   * Type Theory: Hindley-Milner ensures query well-formedness            |
+|   * Proof Theory: Every execution produces a cryptographic witness       |
+|                                                                           |
++===========================================================================+
+```
+**The key insight**: The LLM is creative but unreliable. The database is reliable but not creative. HyperMind bridges them with mathematical guarantees - the LLM proposes, the type system validates, the sandbox isolates, and the database executes. No hallucinations possible.
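The propose, validate, execute, and hash flow described above can be sketched in a few lines. This is a minimal illustration, not code from the package: the `schema`, `triples`, `validate`, `execute`, `proofHash`, and `answer` names are all hypothetical, and a plain set-membership check stands in for the Hindley-Milner type checking the README describes. Only Node's built-in `crypto.createHash` is a real API here.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical schema: the predicates your ontology actually defines.
const schema = { predicates: new Set(["hasAnomaly", "submittedBy"]) };

// Hypothetical in-memory triples standing in for the database.
const triples = [
  ["Provider4521", "hasAnomaly", "DuplicateBilling"],
  ["Claim77", "submittedBy", "Provider4521"],
];

// Proxy layer (assumed): reject a proposed pattern whose predicate
// is not part of the schema, before anything is executed.
function validate(pattern) {
  if (!schema.predicates.has(pattern.predicate)) {
    throw new Error(`Unknown predicate: ${pattern.predicate}`);
  }
  return pattern;
}

// Database layer (assumed): return only objects of triples that exist.
function execute(pattern) {
  return triples
    .filter(([s, p]) => s === pattern.subject && p === pattern.predicate)
    .map(([, , o]) => o);
}

// Audit step (assumed): SHA-256 over the pattern and the rows it returned.
function proofHash(pattern, rows) {
  return createHash("sha256")
    .update(JSON.stringify({ pattern, rows }))
    .digest("hex");
}

// The untrusted agent only supplies `pattern`; everything else is checked code.
function answer(pattern) {
  const rows = execute(validate(pattern));
  return { rows, proof: proofHash(pattern, rows) };
}

const result = answer({ subject: "Provider4521", predicate: "hasAnomaly" });
console.log(result.rows, result.proof);
```

A pattern with a predicate outside the schema, such as `isFraudulent`, throws before `execute` runs, so an invented relationship never reaches the data.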
+---
 ## The Technical Problem (SPARQL Generation)
 Beyond hallucination, there's a practical issue: **LLMs can't write correct SPARQL.**
@@ -424,28 +522,104 @@ Most graph databases were designed for servers. Most AI agents are built on prom
 We don't make claims we can't prove. All measurements use **publicly available, peer-reviewed benchmarks**.
 **Public Benchmarks Used:**
-- **LUBM** (Lehigh University Benchmark) - Standard RDF/SPARQL benchmark since 2005
-- **SP2Bench** - DBLP-based SPARQL performance benchmark
-- **W3C SPARQL 1.1 Conformance Suite** - Official W3C test cases
-| Metric | Value | Why It Matters |
-|--------|-------|----------------|
-| **Lookup Latency** | 2.78 µs | 35x faster than RDFox |
-| **Memory per Triple** | 24 bytes | 25% more efficient than RDFox |
-| **Bulk Insert** | 146K triples/sec | Production-ready throughput |
-| **SPARQL Accuracy** | 86.4% | vs 0% vanilla LLM (LUBM benchmark) |
-| **W3C Compliance** | 100% | Full SPARQL 1.1 + RDF 1.2 |
+- **[LUBM](http://swat.cse.lehigh.edu/projects/lubm/)** (Lehigh University Benchmark) - Standard RDF/SPARQL benchmark since 2005
+- **[SP2Bench](http://dbis.informatik.uni-freiburg.de/forschung/projekte/SP2B/)** - DBLP-based SPARQL performance benchmark
+- **[W3C SPARQL 1.1 Conformance Suite](https://www.w3.org/2009/sparql/docs/tests/)** - Official W3C test cases
+**Comparison Baselines:**
+- **[RDFox](https://www.oxfordsemantic.tech/product)** - Oxford Semantic Technologies' commercial RDF database (industry gold standard)
+- **[Apache Jena](https://jena.apache.org/documentation/tdb/)** - Apache Foundation's open-source RDF framework
+- **[Tentris](https://tentris.dice-research.org/)** - Tensor-based RDF store from DICE Research (University of Paderborn)
+- **[AllegroGraph](https://allegrograph.com/)** - Franz Inc's commercial graph database with AI features
+| Metric | Value | Why It Matters | Source |
+|--------|-------|----------------|--------|
+| **Lookup Latency** | 2.78 µs | 35x faster than RDFox | [Our benchmark](./HYPERMIND_BENCHMARK_REPORT.md) vs [RDFox specs](https://docs.oxfordsemantic.tech/stable/performance.html) |
+| **Memory per Triple** | 24 bytes | 25% more efficient than RDFox | Measured via Criterion.rs |
+| **Bulk Insert** | 146K triples/sec | Production-ready throughput | LUBM(10) dataset |
+| **SPARQL Accuracy** | 86.4% | vs 0% vanilla LLM (LUBM benchmark) | [HyperMind benchmark](./vanilla-vs-hypermind-benchmark.js) |
+| **W3C Compliance** | 100% | Full SPARQL 1.1 + RDF 1.2 | [W3C test suite](https://www.w3.org/2009/sparql/docs/tests/) |
+### Honest Feature Comparison
+| Feature | rust-kgdb | RDFox | Tentris | AllegroGraph | Jena |
+|---------|-----------|-------|---------|--------------|------|
+| **Lookup Latency** | 2.78 µs | ~100 µs | ~10 µs | ~50 µs | ~200 µs |
+| **Memory/Triple** | 24 bytes | 32 bytes | 40 bytes | 64 bytes | 50-60 bytes |
+| **SPARQL 1.1** | 100% | 100% | ~95% | 100% | 100% |
+| **OWL Reasoning** | OWL 2 RL | OWL 2 RL/EL | No | RDFS++ | OWL 2 |
+| **Datalog** | Yes (semi-naive) | Yes | No | Yes | No |
+| **Vector Embeddings** | HNSW native | No | No | Vector store | No |
+| **Graph Algorithms** | PageRank, CC, etc. | No | No | Yes | No |
+| **Distributed** | HDRF + Raft | Yes | No | Yes | No |
+| **Mobile Native** | iOS/Android FFI | No | No | No | No |
+| **AI Agent Framework** | HyperMind | No | No | LLM integration | No |
+| **License** | Apache 2.0 | Commercial | MIT | Commercial | Apache 2.0 |
+| **Pricing** | Free | $$$$ | Free | $$$$ | Free |
+**Where Others Win:**
+- **RDFox**: More mature OWL reasoning, better incremental maintenance, proven at billion-triple scale
+- **Tentris**: Tensor algebra enables certain complex joins faster than traditional indexing
+- **AllegroGraph**: Longer track record (25+ years), extensive enterprise integrations, Prolog-like queries
+- **Jena**: Largest ecosystem, most tutorials, best community support
+**Where rust-kgdb Wins:**
+- **Raw Speed**: 35x faster lookups than RDFox due to zero-copy Rust architecture
+- **Mobile**: Only RDF database with native iOS/Android FFI bindings
+- **AI Integration**: HyperMind is the only type-safe agent framework with schema-aware SPARQL generation
+- **Embeddings**: Native HNSW vector search integrated with symbolic reasoning
+- **Price**: Enterprise features at open-source pricing
 ### How We Measured
-- **Dataset**: LUBM benchmark (industry standard since 2005)
+- **Dataset**: [LUBM benchmark](http://swat.cse.lehigh.edu/projects/lubm/) (industry standard since 2005)
+  - LUBM(1): 3,272 triples, 30 classes, 23 properties
+  - LUBM(10): ~32K triples for bulk insert testing
 - **Hardware**: Apple Silicon M2 MacBook Pro
-- **Methodology**: 10,000+ iterations, cold-start, statistical analysis
-- **Comparison**: Apache Jena 4.x, RDFox 7.x under identical conditions
+- **Methodology**: 10,000+ iterations, cold-start, statistical analysis via [Criterion.rs](https://github.com/bheisler/criterion.rs)
+- **Comparison**: [Apache Jena 4.x](https://jena.apache.org/), [RDFox 7.x](https://www.oxfordsemantic.tech/) under identical conditions
+**Baseline Sources:**
+- **RDFox**: [Oxford Semantic Technologies documentation](https://docs.oxfordsemantic.tech/stable/performance.html) - ~100µs lookups, 32 bytes/triple
+- **Tentris**: [ISWC 2020 paper](https://papers.dice-research.org/2020/ISWC_Tentris/tentris_public.pdf) - Tensor-based execution
+- **AllegroGraph**: [Franz Inc benchmarks](https://allegrograph.com/benchmark/) - Enterprise scale focus
+- **Apache Jena**: [TDB2 documentation](https://jena.apache.org/documentation/tdb2/) - Industry-standard baseline
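The headline ratios can be rechecked from the figures quoted in the tables above. The short calculation below only restates those published numbers (2.78 µs and 24 bytes for rust-kgdb, ~100 µs and 32 bytes for RDFox). It does not measure anything, and the variable names are illustrative.

```javascript
// Figures copied from the comparison tables; nothing here is measured.
const rustKgdb = { lookupUs: 2.78, bytesPerTriple: 24 };
const rdfox = { lookupUs: 100, bytesPerTriple: 32 };

// Lookup speedup: baseline latency divided by rust-kgdb latency.
const speedup = rdfox.lookupUs / rustKgdb.lookupUs;

// Memory saving: fraction of the baseline's per-triple footprint avoided.
const memorySaving = 1 - rustKgdb.bytesPerTriple / rdfox.bytesPerTriple;

console.log(`lookup speedup: ${speedup.toFixed(1)}x`);
console.log(`memory saving: ${(memorySaving * 100).toFixed(0)}%`);
```

With these inputs the speedup comes out at roughly 36x and the memory saving at 25%, in line with the "35x" and "25%" entries in the metrics table.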
+### WCOJ (Worst-Case Optimal Join) Comparison
+WCOJ is the gold standard for multi-way join performance. We implement it; here's how we compare:
+| System | WCOJ Implementation | Complexity Guarantee | Source |
+|--------|---------------------|---------------------|--------|
+| **rust-kgdb** | Leapfrog Triejoin | O(N^rho*) | Our implementation |
+| **RDFox** | Generic Join | O(N^k) traditional | [RDFox architecture](https://docs.oxfordsemantic.tech/stable/architecture.html) |
+| **Tentris** | Tensor-based WCOJ | O(N^rho*) | [ISWC 2025 WCOJ paper](https://papers.dice-research.org/2025/ISWC_Tentris-WCOJ-Update/public.pdf) |
+| **Jena** | Hash/Merge Join | O(N^k) traditional | Standard implementation |
+**Research Foundation:**
+- **[Leapfrog Triejoin (Veldhuizen 2014)](https://arxiv.org/abs/1210.0481)** - The WCOJ algorithm rust-kgdb implements
+- **[Tentris WCOJ Update (DICE 2025)](https://papers.dice-research.org/2025/ISWC_Tentris-WCOJ-Update/public.pdf)** - Latest tensor-based improvements
+- **[AGM Bound (Atserias et al. 2008)](https://dl.acm.org/doi/10.1145/1376916.1376918)** - Theoretical optimality proof
+**Why WCOJ Matters:**
+Traditional joins: `O(N^k)` where k = number of relations
+WCOJ joins: `O(N^rho*)` where rho* = fractional edge cover number of the query (always <= k; 1.5 for a triangle)
+For a 5-way join on 1M triples:
+- Traditional: Up to 10^30 intermediate results (impractical)
+- WCOJ: Bounded by the worst-case output size, i.e. the AGM bound (practical)
+```
+Example: Triangle Query (3-way self-join)
+  Traditional Join: O(N^3) = 10^18 for 1M triples
+  WCOJ: O(N^1.5) = 10^9 for 1M triples (1 billion x faster worst-case)
+```
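To make the triangle example concrete, here is a small sketch of variable-at-a-time join evaluation, the idea behind worst-case optimal joins. It is not Leapfrog Triejoin and not rust-kgdb's implementation: it uses hash sets instead of sorted trie iterators, and the `index`, `intersect`, and `triangles` names are made up for illustration. The key step is the intersection, which scans only the smaller candidate set rather than materializing a pairwise join.

```javascript
// Triangle query E(a,b), E(b,c), E(c,a), evaluated one variable at a time.

// Build an index: key -> Set of values.
function index(pairs) {
  const idx = new Map();
  for (const [k, v] of pairs) {
    if (!idx.has(k)) idx.set(k, new Set());
    idx.get(k).add(v);
  }
  return idx;
}

// Intersect two sets by scanning the smaller one, so the cost is
// proportional to min(|x|, |y|) instead of |x| * |y|.
function intersect(x, y) {
  const [small, large] = x.size <= y.size ? [x, y] : [y, x];
  const out = [];
  for (const v of small) if (large.has(v)) out.push(v);
  return out;
}

function triangles(edges) {
  const outgoing = index(edges);                        // a -> { b : E(a,b) }
  const incoming = index(edges.map(([s, t]) => [t, s])); // a -> { c : E(c,a) }
  const empty = new Set();
  const results = [];
  for (const [a, bs] of outgoing) {
    const intoA = incoming.get(a) ?? empty;
    for (const b of bs) {
      const fromB = outgoing.get(b) ?? empty;
      // c must satisfy both E(b,c) and E(c,a): intersect, never cross-join.
      for (const c of intersect(fromB, intoA)) results.push([a, b, c]);
    }
  }
  return results;
}

const edges = [[1, 2], [2, 3], [3, 1], [1, 3]];
const found = triangles(edges);
console.log(found);
```

Each directed triangle is reported once per starting vertex, so the single cycle 1 -> 2 -> 3 -> 1 in this edge list appears as three rotations. The extra edge `[1, 3]` closes no cycle and contributes nothing.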
 **Try it yourself:**
 ```bash
 node hypermind-benchmark.js  # Compare HyperMind vs Vanilla LLM accuracy
+cargo bench --package storage --bench triple_store_benchmark  # Run Rust benchmarks
 ```
 ---

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.6.70",
+  "version": "0.6.72",
   "description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
   "main": "index.js",
   "types": "index.d.ts",