rust-kgdb 0.6.84 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +65 -23
  2. package/package.json +2 -2
package/README.md CHANGED
@@ -159,13 +159,13 @@ The AI never touches your data. It translates human language into precise querie
  - **Instant deployment** - `npm install` and you're running
 
  **For Engineering Teams:**
- - **449ns lookups** - 35x faster than RDFox, the previous gold standard
+ - **449ns lookups** - 5-11x faster than RDFox (2.5-5µs), measured on commodity hardware
  - **24 bytes per triple** - 25% more memory efficient than competitors
  - **132K writes/sec** - Handle enterprise transaction volumes
  - **94% recall** on memory retrieval - Agent remembers past queries accurately
 
  **For AI/ML Teams:**
- - **91.67% SPARQL accuracy** - vs 0% with vanilla LLMs (Claude Sonnet 4 + HyperMind)
+ - **85.7% SPARQL accuracy** - vs 0% with vanilla LLMs (GPT-4o + HyperMind schema injection)
  - **16ms similarity search** - Find related entities across 10K vectors
  - **Recursive reasoning** - Datalog rules cascade automatically (fraud rings, compliance chains)
  - **Schema-aware generation** - AI uses YOUR ontology, not guessed class names
@@ -176,7 +176,7 @@ The AI never touches your data. It translates human language into precise querie
  - **Composite multi-vector** - RRF fusion of RDF2Vec + OpenAI with -2% overhead at scale
  - **Automatic triggers** - Vectors generated on graph upsert, no batch pipelines
 
- The math matters. When your fraud detection runs 35x faster, you catch fraud before payments clear. When your agent remembers with 94% accuracy, analysts don't repeat work. When every decision has a proof hash, you pass audits.
+ The math matters. When your fraud detection runs 5-11x faster, you catch fraud before payments clear. When your agent remembers with 94% accuracy, analysts don't repeat work. When every decision has a proof hash, you pass audits.
 
  ---
 
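The "RRF fusion" feature in the hunk above refers to Reciprocal Rank Fusion. As a rough sketch of the general technique (illustrative JavaScript only; the function names here are not the rust-kgdb API), fusing two ranked candidate lists looks like this:

```javascript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank + 1) per
// candidate; summed scores decide the fused order. Illustrative sketch only.
function rrfFuse(rankings, k = 60) {
  const scores = new Map()
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // k damps the advantage of topping any single list
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1))
    })
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id)
}

// Hypothetical example: fuse an RDF2Vec ranking with an OpenAI-embedding ranking
const fused = rrfFuse([
  ['e1', 'e2', 'e3'], // hypothetical RDF2Vec neighbours, best first
  ['e1', 'e3', 'e4'], // hypothetical OpenAI-embedding neighbours
])
console.log(fused[0]) // 'e1' ranks first: it tops both lists
```

The appeal of RRF is that it needs no score normalisation across heterogeneous embedding spaces, which is presumably why it suits combining RDF2Vec with OpenAI vectors.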
@@ -198,7 +198,7 @@ At no point does the AI "know" anything. It's a translator—from human intent t
 
  | Layer | Component | What It Does |
  |-------|-----------|--------------|
- | **Database** | GraphDB | W3C SPARQL 1.1 compliant RDF store, 449ns lookups, 35x faster than RDFox |
+ | **Database** | GraphDB | W3C SPARQL 1.1 compliant RDF store, 449ns lookups, 5-11x faster than RDFox |
  | **Database** | Distributed SPARQL | HDRF partitioning across Kubernetes executors |
  | **Federation** | HyperFederate | Cross-database SQL: KGDB + Snowflake + BigQuery in single query |
  | **Embeddings** | Rdf2VecEngine | Train 384-dim vectors from graph random walks, 68µs lookup |
@@ -292,7 +292,7 @@ At no point does the AI "know" anything. It's a translator—from human intent t
  | Layer 4: DATABASE (Authoritative) |
  | +---------------------------------------------------------------------+ |
  | | rust-kgdb executes query against YOUR actual data | |
- | | - 449ns lookups (35x faster than RDFox) | |
+ | | - 449ns lookups (5-11x faster than RDFox) | |
  | | - Returns only facts that exist | |
  | | - Generates SHA-256 proof hash for audit | |
  | +---------------------------------------------------------------------+ |
@@ -358,7 +358,7 @@ We built rust-kgdb to fix this.
  +------------------------------------v--------------------------------------------+
  | RUST CORE ENGINE (Native Performance) |
  | +----------------------------------------------------------------------------+ |
- | | GraphDB | RDF/SPARQL quad store | 2.78µs lookups, 24 bytes/triple|
+ | | GraphDB | RDF/SPARQL quad store | 449ns lookups, 24 bytes/triple |
  | | GraphFrame | Graph algorithms | WCOJ optimal joins, PageRank |
  | | EmbeddingService | Vector similarity | HNSW index, 1-hop ARCADE cache|
  | | DatalogProgram | Rule-based reasoning | Semi-naive evaluation |
@@ -371,7 +371,7 @@ We built rust-kgdb to fix this.
  +----------------------------------------------------------------------------------+
  ```
 
- **Key Insight**: The Rust core provides raw performance (2.78µs lookups). The HyperMind framework adds mathematical guarantees (type safety, composition laws, proof generation) without sacrificing speed.
+ **Key Insight**: The Rust core provides raw performance (449ns lookups). The HyperMind framework adds mathematical guarantees (type safety, composition laws, proof generation) without sacrificing speed.
 
  ### What's Rust Core vs SDK Layer?
 
@@ -379,7 +379,7 @@ All major capabilities are implemented in **Rust** via the HyperMind SDK crates
 
  | Component | Implementation | Performance | Notes |
  |-----------|---------------|-------------|-------|
- | **GraphDB** | Rust via NAPI-RS | 2.78µs lookups | Zero-copy RDF quad store |
+ | **GraphDB** | Rust via NAPI-RS | 449ns lookups | Zero-copy RDF quad store |
  | **GraphFrame** | Rust via NAPI-RS | WCOJ optimal | PageRank, triangles, components |
  | **EmbeddingService** | Rust via NAPI-RS | Sub-ms search | HNSW index + 1-hop cache |
  | **DatalogProgram** | Rust via NAPI-RS | Semi-naive eval | Rule-based reasoning |
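The DatalogProgram row above cites semi-naive evaluation. A toy JavaScript sketch for a single transitive rule shows the core idea; the Rust engine generalises this to arbitrary rule sets, so treat this as an illustration of the technique only:

```javascript
// Semi-naive evaluation of: reach(x, z) :- edge(x, y), reach(y, z).
// Each round joins only the facts derived in the previous round (the delta)
// against the base relation, instead of re-joining everything.
function transitiveClosure(edges) {
  const facts = new Set(edges.map(([a, b]) => `${a},${b}`))
  let delta = [...edges] // round 0: every base edge is "new"
  while (delta.length > 0) {
    const next = []
    for (const [y, z] of delta) {
      for (const [a, b] of edges) {
        if (b === y) {
          const key = `${a},${z}`
          if (!facts.has(key)) {
            facts.add(key)
            next.push([a, z])
          }
        }
      }
    }
    delta = next // semi-naive: only the newest facts drive the next round
  }
  return facts
}

const reach = transitiveClosure([['a', 'b'], ['b', 'c'], ['c', 'd']])
console.log(reach.has('a,d')) // true: a → b → c → d
```

The delta trick is what makes recursive rules (fraud rings, compliance chains) terminate without redundant re-derivation.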
@@ -739,17 +739,17 @@ We don't make claims we can't prove. All measurements use **publicly available,
 
  | Metric | Value | Why It Matters | Source |
  |--------|-------|----------------|--------|
- | **Lookup Latency** | 2.78 µs | 35x faster than RDFox | [Our benchmark](./HYPERMIND_BENCHMARK_REPORT.md) vs [RDFox specs](https://docs.oxfordsemantic.tech/stable/performance.html) |
+ | **Lookup Latency** | 449 ns | 5-11x faster than RDFox (2.5-5µs) | [Criterion.rs benchmark](./CONCURRENT_BENCHMARK_RESULTS.md) |
  | **Memory per Triple** | 24 bytes | 25% more efficient than RDFox | Measured via Criterion.rs |
- | **Bulk Insert** | 146K triples/sec | Production-ready throughput | LUBM(10) dataset |
- | **SPARQL Accuracy** | 86.4% | vs 0% vanilla LLM (LUBM benchmark) | [HyperMind benchmark](./vanilla-vs-hypermind-benchmark.js) |
+ | **Bulk Insert** | 156K quads/sec | Production-ready throughput | Concurrent benchmark |
+ | **SPARQL Accuracy** | 85.7% | vs 0% vanilla LLM (LUBM benchmark) | [HyperMind benchmark](./HYPERMIND_BENCHMARK_REPORT.md) |
  | **W3C Compliance** | 100% | Full SPARQL 1.1 + RDF 1.2 | [W3C test suite](https://www.w3.org/2009/sparql/docs/tests/) |
 
  ### Honest Feature Comparison
 
  | Feature | rust-kgdb | RDFox | Tentris | AllegroGraph | Jena |
  |---------|-----------|-------|---------|--------------|------|
- | **Lookup Latency** | 2.78 µs | ~100 µs | ~10 µs | ~50 µs | ~200 µs |
+ | **Lookup Latency** | 449 ns | 2.5-5 µs | ~10 µs | ~50 µs | ~200 µs |
  | **Memory/Triple** | 24 bytes | 32 bytes | 40 bytes | 64 bytes | 50-60 bytes |
  | **SPARQL 1.1** | 100% | 100% | ~95% | 100% | 100% |
  | **OWL Reasoning** | OWL 2 RL | OWL 2 RL/EL | No | RDFS++ | OWL 2 |
@@ -769,7 +769,7 @@ We don't make claims we can't prove. All measurements use **publicly available,
  - **Jena**: Largest ecosystem, most tutorials, best community support
 
  **Where rust-kgdb Wins:**
- - **Raw Speed**: 35x faster lookups than RDFox due to zero-copy Rust architecture
+ - **Raw Speed**: 5-11x faster lookups than RDFox due to zero-copy Rust architecture
  - **Mobile**: Only RDF database with native iOS/Android FFI bindings
  - **AI Integration**: HyperMind is the only type-safe agent framework with schema-aware SPARQL generation
  - **Embeddings**: Native HNSW vector search integrated with symbolic reasoning
@@ -780,12 +780,13 @@ We don't make claims we can't prove. All measurements use **publicly available,
  - **Dataset**: [LUBM benchmark](http://swat.cse.lehigh.edu/projects/lubm/) (industry standard since 2005)
    - LUBM(1): 3,272 triples, 30 classes, 23 properties
    - LUBM(10): ~32K triples for bulk insert testing
- - **Hardware**: Apple Silicon M2 MacBook Pro
+ - **Hardware**: MacBook Pro 16,1 (2019) - Intel Core i9-9980HK @ 2.40GHz, 8 cores/16 threads, 64GB DDR4
+   - *Note: This is commodity developer hardware. Production servers will see improved numbers.*
  - **Methodology**: 10,000+ iterations, cold-start, statistical analysis via [Criterion.rs](https://github.com/bheisler/criterion.rs)
  - **Comparison**: [Apache Jena 4.x](https://jena.apache.org/), [RDFox 7.x](https://www.oxfordsemantic.tech/) under identical conditions
 
  **Baseline Sources:**
- - **RDFox**: [Oxford Semantic Technologies documentation](https://docs.oxfordsemantic.tech/stable/performance.html) - ~100µs lookups, 32 bytes/triple
+ - **RDFox**: [Oxford Semantic Technologies documentation](https://docs.oxfordsemantic.tech/stable/performance.html) - 2.5-5µs lookups, 32 bytes/triple
  - **Tentris**: [ISWC 2020 paper](https://papers.dice-research.org/2020/ISWC_Tentris/tentris_public.pdf) - Tensor-based execution
  - **AllegroGraph**: [Franz Inc benchmarks](https://allegrograph.com/benchmark/) - Enterprise scale focus
  - **Apache Jena**: [TDB2 documentation](https://jena.apache.org/documentation/tdb2/) - Industry-standard baseline
@@ -1275,12 +1276,13 @@ const similar = rdf2vec.findSimilar('http://person/1', candidates, 5)
 
  ### Agentic Framework Accuracy (LLM WITH vs WITHOUT HyperMind)
 
- | Model | Without HyperMind | With HyperMind | Improvement |
- |-------|-------------------|----------------|-------------|
- | **Claude Sonnet 4** | 0.0% | **91.67%** | **+91.67 pp** |
- | **GPT-4o** | 0.0%* | **66.67%** | **+66.67 pp** |
+ | Model | Without Schema | With Schema | With HyperMind |
+ |-------|----------------|-------------|----------------|
+ | **Vanilla OpenAI (GPT-4o)** | 0.0% | 71.4% | **85.7%** |
+ | **LangChain** | 0.0% | 71.4% | **85.7%** |
+ | **DSPy** | 14.3% | 71.4% | **85.7%** |
 
- *0% because raw LLM outputs markdown-wrapped SPARQL that fails parsing.
+ *7 LUBM queries, real API calls. 0% without schema because raw LLM outputs markdown-wrapped SPARQL that fails parsing. See [HYPERMIND_BENCHMARK_REPORT.md](./HYPERMIND_BENCHMARK_REPORT.md).*
 
  **Key finding**: Same LLM, same questions - HyperMind's type contracts and schema injection transform unreliable LLM outputs into production-ready queries.
 
@@ -1747,6 +1749,46 @@ The TypeScript SDK is intentionally thin. A thin RPC proxy. All the hard work ha
  | **Data Catalog** | ✅ DCAT DPROD ontology | ❌ Proprietary |
  | **Proof/Lineage** | ✅ Full provenance (W3C PROV) | ❌ None |
 
+ ### HyperFederate SQL Benchmarks
+
+ Performance measured on MacBook Pro 16,1 (2019) - Intel Core i9-9980HK @ 2.40GHz, 64GB DDR4.
+ *Commodity developer hardware. Production servers will see improved numbers.*
+
+ | Query Type | Sources | Latency | Notes |
+ |------------|---------|---------|-------|
+ | **KGDB graph_search** | KGDB only | 12-25 ms | SPARQL → SQL bridge |
+ | **KGDB + Snowflake** | 2 sources | 234-456 ms | TPC-H customer join |
+ | **Snowflake + BigQuery** | 2 sources | 450-680 ms | Cross-cloud join |
+ | **Three-Way (KG+SF+BQ)** | 3 sources | **890 ms** | Full federated pipeline |
+ | **graph_search + vector_search** | KGDB | 45-80 ms | Hybrid semantic/graph |
+ | **pagerank() + Snowflake** | 2 sources | 320-550 ms | Graph analytics + SQL |
+
+ **Semantic UDFs (7 functions):**
+
+ | UDF | Description | Latency |
+ |-----|-------------|---------|
+ | `similar_to(entity, threshold)` | RDF2Vec similarity | 68 µs |
+ | `text_search(query, limit)` | Semantic text search | 12-25 ms |
+ | `neighbors(entity, hops)` | N-hop graph traversal | 5-15 ms |
+ | `graph_pattern(s, p, o)` | Triple pattern matching | 2-8 ms |
+ | `sparql_query(sparql)` | Inline SPARQL execution | 10-30 ms |
+ | `entity_type(entity)` | Get RDF types | <1 ms |
+ | `entity_properties(entity)` | Get all properties | 1-5 ms |
+
+ **Table Functions (9 analytics):**
+
+ | Function | Description | Latency (1K nodes) |
+ |----------|-------------|-------------------|
+ | `graph_search(sparql)` | SPARQL → SQL bridge | 12-25 ms |
+ | `vector_search(text, k, threshold)` | Semantic similarity | 16-44 ms |
+ | `pagerank(sparql, damping, iterations)` | PageRank centrality | 45-120 ms |
+ | `connected_components(sparql)` | Community detection | 30-80 ms |
+ | `shortest_paths(src, dst, max_hops)` | Path finding | 15-50 ms |
+ | `triangle_count(sparql)` | Graph density | 25-60 ms |
+ | `label_propagation(sparql, iterations)` | Community detection | 40-100 ms |
+ | `datalog_reason(rules)` | Datalog inference | 20-80 ms |
+ | `motif_search(pattern)` | Pattern matching | 35-90 ms |
+
  ### Using RpcFederationProxy
 
  ```javascript
@@ -2101,7 +2143,7 @@ node examples/hypermind-agent-architecture.js
  | +------------------------------------------------------------------------+ |
  | | rust-kgdb KNOWLEDGE GRAPH | |
  | | RDF Triples | SPARQL 1.1 | GraphFrames | Embeddings | Datalog | |
- | | 2.78µs lookups | 24 bytes/triple | 35x faster than RDFox | |
+ | | 449ns lookups | 24 bytes/triple | 5-11x faster than RDFox | |
  | +------------------------------------------------------------------------+ |
  +================================================================================+
  ```
@@ -3738,7 +3780,7 @@ Here's the semantic hash: `semhash:collusion-p001-p002-prov001` - same query int
  +-------------------------------------------------------------------------------+
  | |
  | KNOWLEDGE GRAPH DATABASE (this is what powers it) |
- | +-- 2.78 µs lookups (35x faster than RDFox) |
+ | +-- 449 ns lookups (5-11x faster than RDFox) |
  | +-- 24 bytes/triple (25% more efficient) |
  | +-- W3C SPARQL 1.1 + RDF 1.2 (100% compliance) |
  | +-- RDFS + OWL 2 RL reasoners (ontology inference) |
@@ -3770,7 +3812,7 @@ Here's the semantic hash: `semhash:collusion-p001-p002-prov001` - same query int
  | Amazon Neptune: Managed, but cloud-only vendor lock-in |
  | LangChain: Vibe coding, fails compliance audits |
  | |
- | rust-kgdb: 2.78 µs lookups, mobile-native, open standards |
+ | rust-kgdb: 449 ns lookups, mobile-native, open standards |
  | Standalone -> Clustered on same codebase |
  | Mathematical foundations, audit-ready |
  | |
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "rust-kgdb",
- "version": "0.6.84",
- "description": "High-performance RDF/SPARQL database with AI agent framework. GraphDB (449ns lookups, 35x faster than RDFox), GraphFrames analytics (PageRank, motifs), Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
+ "version": "0.7.0",
+ "description": "High-performance RDF/SPARQL database with AI agent framework and cross-database federation. GraphDB (449ns lookups, 5-11x faster than RDFox), HyperFederate (KGDB + Snowflake + BigQuery), GraphFrames analytics, Datalog reasoning, HNSW vector embeddings. HyperMindAgent for schema-aware query generation with audit trails. W3C SPARQL 1.1 compliant. Native performance via Rust + NAPI-RS.",
  "main": "index.js",
  "types": "index.d.ts",
  "napi": {