rust-kgdb 0.5.10 → 0.5.12

This diff shows the changes between publicly released package versions as they appear in the public registry, and is provided for informational purposes only.
Files changed (3)
  1. package/CHANGELOG.md +19 -0
  2. package/README.md +58 -77
  3. package/package.json +1 -1
package/CHANGELOG.md CHANGED
@@ -2,6 +2,25 @@
 
 All notable changes to the rust-kgdb TypeScript SDK will be documented in this file.
 
+## [0.5.12] - 2025-12-15
+
+### Benchmark Section Cleanup
+
+- Removed internal Cargo/Rust implementation details from benchmark documentation
+- Simplified to focus on WHAT (metrics), WHY (value), and HOW (user-facing commands)
+- Kept key numbers: 2.78µs lookups, 24 bytes/triple, 86.4% accuracy
+- Removed: rustc commands, cargo bench paths, crate paths
+- User-facing: `node hypermind-benchmark.js` for accuracy comparison
+
+## [0.5.11] - 2025-12-15
+
+### Documentation Clarification
+
+- Clarified that embedding providers (OpenAI, Voyage AI) are third-party libraries, not built into rust-kgdb
+- Updated examples to show `fetch` API for Voyage AI instead of non-existent SDK
+- Added "bring your own embeddings" messaging to make provider abstraction clear
+- rust-kgdb's EmbeddingService stores/searches vectors; users provide embeddings from their preferred provider
+
 ## [0.5.10] - 2025-12-15
 
 ### Documentation Cleanup
package/README.md CHANGED
@@ -131,68 +131,25 @@ We don't make claims we can't prove. All measurements use **publicly available,
 - **SP2Bench** - DBLP-based SPARQL performance benchmark
 - **W3C SPARQL 1.1 Conformance Suite** - Official W3C test cases
 
-**Test Environment:**
-- Hardware: Apple Silicon M-series (ARM64), Intel x64
-- Dataset: LUBM(1) - 3,272 triples, LUBM(10) - 32K triples, LUBM(100) - 327K triples
-- Tool: Criterion.rs statistical benchmarking (10,000+ iterations per measurement)
-- Comparison: Apache Jena 4.x, RDFox 7.x under identical conditions
-
-| Metric | Value | Context |
-|--------|-------|---------|
+| Metric | Value | Why It Matters |
+|--------|-------|----------------|
 | **Lookup Latency** | 2.78 µs | 35x faster than RDFox |
 | **Memory per Triple** | 24 bytes | 25% more efficient than RDFox |
-| **Bulk Insert** | 146K triples/sec | Competitive with commercial systems |
-| **SPARQL Accuracy** | 86.4% | vs 0% vanilla LLM (LUBM Q1-Q14) |
+| **Bulk Insert** | 146K triples/sec | Production-ready throughput |
+| **SPARQL Accuracy** | 86.4% | vs 0% vanilla LLM (LUBM benchmark) |
 | **W3C Compliance** | 100% | Full SPARQL 1.1 + RDF 1.2 |
-| **SIMD Speedup** | 44.5% avg | Range: 9-77% depending on query |
-| **WCOJ Joins** | O(N^(ρ/2)) | Worst-case optimal guaranteed |
-| **Ontology Support** | RDFS + OWL 2 RL | Full reasoning engine |
-| **Test Coverage** | 945+ tests | Production certified |
-
-**Reproducibility:** All benchmarks at `crates/storage/benches/` and `crates/hypergraph/benches/`. Run with `cargo bench --workspace`.
-
-### Benchmark Methodology
-
-**How we measure performance:**
-
-1. **LUBM Data Generation**
-```bash
-# Generate test data (matches official Java UBA generator)
-rustc tools/lubm_generator.rs -O -o tools/lubm_generator
-./tools/lubm_generator 1 /tmp/lubm_1.nt # 3,272 triples
-./tools/lubm_generator 10 /tmp/lubm_10.nt # ~32K triples
-```
-
-2. **Storage Benchmarks**
-```bash
-# Run Criterion benchmarks (statistical analysis, 10K+ samples)
-cargo bench --package storage --bench triple_store_benchmark
-
-# Results include:
-# - Mean, median, standard deviation
-# - Outlier detection
-# - Comparison vs baseline
-```
-
-3. **HyperMind Agent Accuracy**
-```bash
-# Run LUBM benchmark comparing Vanilla LLM vs HyperMind
-node hypermind-benchmark.js
-
-# Tests 12 queries (Easy: 3, Medium: 5, Hard: 4)
-# Measures: Syntax validity, execution success, latency
-```
-
-4. **Hardware Requirements**
-- Minimum: 4GB RAM, any x64/ARM64 CPU
-- Recommended: 8GB+ RAM, Apple Silicon or modern x64
-- Benchmarks run on: M2 MacBook Pro (baseline measurements)
-
-5. **Fair Comparison Conditions**
-- All systems tested with identical LUBM datasets
-- Same SPARQL queries across all systems
-- Cold-start measurements (no warm cache)
-- 10,000+ iterations per measurement for statistical significance
+
+### How We Measured
+
+- **Dataset**: LUBM benchmark (industry standard since 2005)
+- **Hardware**: Apple Silicon M2 MacBook Pro
+- **Methodology**: 10,000+ iterations, cold-start, statistical analysis
+- **Comparison**: Apache Jena 4.x, RDFox 7.x under identical conditions
+
+**Try it yourself:**
+```bash
+node hypermind-benchmark.js # Compare HyperMind vs Vanilla LLM accuracy
+```
 
 ---
 
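The "10,000+ iterations, cold-start, statistical analysis" methodology in this hunk boils down to collecting many timing samples and summarizing them. A minimal Node.js sketch of that summary step — `summarize` and `benchmark` are hypothetical illustration helpers, not the project's actual Criterion.rs harness:

```javascript
// Summarize a set of latency samples: mean, median, standard deviation.
function summarize(samples) {
  const n = samples.length
  const sorted = [...samples].sort((a, b) => a - b)
  const mean = samples.reduce((s, x) => s + x, 0) / n
  const median = n % 2
    ? sorted[(n - 1) / 2]
    : (sorted[n / 2 - 1] + sorted[n / 2]) / 2
  const variance = samples.reduce((s, x) => s + (x - mean) ** 2, 0) / n
  return { mean, median, stddev: Math.sqrt(variance) }
}

// Time a function over many iterations using Node's monotonic clock.
function benchmark(fn, iterations = 10_000) {
  const samples = []
  for (let i = 0; i < iterations; i++) {
    const t0 = process.hrtime.bigint()
    fn()
    const t1 = process.hrtime.bigint()
    samples.push(Number(t1 - t0)) // nanoseconds per call
  }
  return summarize(samples)
}

// e.g. summarize five lookup latencies in microseconds:
summarize([2.5, 2.7, 2.8, 2.9, 3.1])
```

The stddev relative to the mean is what tells you whether a single headline number like "2.78 µs" is stable or noisy.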
@@ -309,11 +266,13 @@ const voteResults = service.findSimilarComposite('CLM001', 10, 0.7, 'voting') //
 
 ### Provider Configuration
 
-Configure your embedding providers with API keys:
+rust-kgdb's `EmbeddingService` stores and searches vectors - you bring your own embeddings from any provider. Here are examples using popular third-party libraries:
 
 ```javascript
-// Example: Using OpenAI embeddings
-const { OpenAI } = require('openai')
+// ============================================================
+// EXAMPLE: Using OpenAI embeddings (requires: npm install openai)
+// ============================================================
+const { OpenAI } = require('openai') // Third-party library
 const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })
 
 async function getOpenAIEmbedding(text) {
@@ -325,17 +284,31 @@ async function getOpenAIEmbedding(text) {
   return response.data[0].embedding
 }
 
-// Example: Using Anthropic (via their embedding partner)
-// Note: Anthropic doesn't provide embeddings directly; use Voyage AI
-const { VoyageAIClient } = require('voyageai')
-const voyage = new VoyageAIClient({ apiKey: process.env.VOYAGE_API_KEY })
-
+// ============================================================
+// EXAMPLE: Using the Voyage AI REST API (no SDK required)
+// Note: Anthropic recommends Voyage AI for embeddings
+// ============================================================
 async function getVoyageEmbedding(text) {
-  const response = await voyage.embed({
-    input: text,
-    model: 'voyage-2'
+  // Using fetch directly (no SDK required)
+  const response = await fetch('https://api.voyageai.com/v1/embeddings', {
+    method: 'POST',
+    headers: {
+      'Authorization': `Bearer ${process.env.VOYAGE_API_KEY}`,
+      'Content-Type': 'application/json'
+    },
+    body: JSON.stringify({ input: text, model: 'voyage-2' })
   })
-  return response.embeddings[0].slice(0, 384) // Truncate to 384-dim
+  const data = await response.json()
+  return data.data[0].embedding.slice(0, 384) // Truncate to 384-dim
+}
+
+// ============================================================
+// EXAMPLE: Mock embeddings for testing (no external deps)
+// ============================================================
+function getMockEmbedding(text) {
+  return new Array(384).fill(0).map((_, i) =>
+    Math.sin(text.charCodeAt(i % text.length) * 0.1) * 0.5 + 0.5
+  )
 }
 ```
 
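Once embeddings are stored, a vector store ranks them by a similarity metric; cosine similarity is the usual choice. A self-contained sketch pairing that metric with the diff's deterministic mock generator — `cosineSimilarity` is a hypothetical illustration helper, not an rust-kgdb API, and rust-kgdb's internal search implementation is not shown here:

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|), for equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// The mock generator from the diff is deterministic: identical text
// always yields the identical 384-dim vector.
function getMockEmbedding(text) {
  return new Array(384).fill(0).map((_, i) =>
    Math.sin(text.charCodeAt(i % text.length) * 0.1) * 0.5 + 0.5
  )
}

const v1 = getMockEmbedding('fraud claim CLM001')
const v2 = getMockEmbedding('fraud claim CLM001')
const selfSim = cosineSimilarity(v1, v2) // ≈ 1.0 for identical vectors
```

Determinism is what makes the mock provider usable in tests: similarity scores are reproducible without any API key.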
@@ -1224,11 +1197,12 @@ const db = new GraphDB('http://insurance.org/fraud-kb')
 const embeddings = new EmbeddingService()
 
 // ============================================================
-// STEP 3: Configure Embedding Provider
+// STEP 3: Configure Embedding Provider (bring your own)
 // ============================================================
 async function getEmbedding(text) {
   switch (EMBEDDING_PROVIDER) {
     case 'openai':
+      // Requires: npm install openai
       const { OpenAI } = require('openai')
       const openai = new OpenAI({ apiKey: OPENAI_API_KEY })
       const resp = await openai.embeddings.create({
@@ -1239,12 +1213,19 @@ async function getEmbedding(text) {
       return resp.data[0].embedding
 
     case 'voyage':
-      const { VoyageAIClient } = require('voyageai')
-      const voyage = new VoyageAIClient({ apiKey: VOYAGE_API_KEY })
-      const vResp = await voyage.embed({ input: text, model: 'voyage-2' })
-      return vResp.embeddings[0].slice(0, EMBEDDING_DIM)
+      // Using fetch directly (no SDK required)
+      const vResp = await fetch('https://api.voyageai.com/v1/embeddings', {
+        method: 'POST',
+        headers: {
+          'Authorization': `Bearer ${VOYAGE_API_KEY}`,
+          'Content-Type': 'application/json'
+        },
+        body: JSON.stringify({ input: text, model: 'voyage-2' })
+      })
+      const vData = await vResp.json()
+      return vData.data[0].embedding.slice(0, EMBEDDING_DIM)
 
-    default: // Mock embeddings for testing
+    default: // Mock embeddings for testing (no external deps)
       return new Array(EMBEDDING_DIM).fill(0).map((_, i) =>
         Math.sin(text.charCodeAt(i % text.length) * 0.1) * 0.5 + 0.5
       )
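The `slice(0, EMBEDDING_DIM)` in this hunk exists because providers return different native dimensionalities (voyage-2, for instance, emits 1024-dim vectors), while one store needs one fixed size. A hypothetical helper sketching both truncation and zero-padding — illustrative only, not part of rust-kgdb:

```javascript
// Hypothetical helper (not an rust-kgdb API): coerce a provider's embedding
// to a fixed dimension so all stored vectors are comparable.
// Truncation discards information and padding adds none, so mixing
// embeddings from different providers in one space is generally unsound.
function toFixedDim(vector, dim) {
  if (vector.length >= dim) return vector.slice(0, dim)            // truncate
  return [...vector, ...new Array(dim - vector.length).fill(0)]    // zero-pad
}

toFixedDim([0.1, 0.2, 0.3, 0.4], 2) // → [0.1, 0.2]
toFixedDim([0.5], 3)                // → [0.5, 0, 0]
```

Preferring a provider/model whose native dimension already matches the store avoids the information loss entirely.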
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "rust-kgdb",
-  "version": "0.5.10",
+  "version": "0.5.12",
   "description": "Production-grade Neuro-Symbolic AI Framework: +86.4% accuracy improvement over vanilla LLMs. High-performance knowledge graph (2.78µs lookups, 35x faster than RDFox). Features fraud detection, underwriting agents, WASM sandbox, type/category/proof theory, and W3C SPARQL 1.1 compliance.",
   "main": "index.js",
   "types": "index.d.ts",