@soulcraft/brainy 0.46.0 → 0.48.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/OFFLINE_MODELS.md +56 -0
- package/README.md +46 -1
- package/dist/brainyData.js +7 -9
- package/dist/brainyData.js.map +1 -1
- package/dist/demo.js +2 -2
- package/dist/demo.js.map +1 -1
- package/dist/hnsw/hnswIndex.d.ts +1 -1
- package/dist/hnsw/hnswIndex.js +4 -4
- package/dist/hnsw/hnswIndex.js.map +1 -1
- package/dist/index.d.ts +2 -3
- package/dist/index.js +3 -9
- package/dist/index.js.map +1 -1
- package/dist/setup.d.ts +3 -3
- package/dist/setup.js +6 -6
- package/dist/setup.js.map +1 -1
- package/dist/utils/distance.d.ts +4 -4
- package/dist/utils/distance.js +67 -140
- package/dist/utils/distance.js.map +1 -1
- package/dist/utils/embedding.d.ts +58 -84
- package/dist/utils/embedding.js +250 -594
- package/dist/utils/embedding.js.map +1 -1
- package/dist/utils/robustModelLoader.d.ts +4 -0
- package/dist/utils/robustModelLoader.js +58 -7
- package/dist/utils/robustModelLoader.js.map +1 -1
- package/dist/utils/textEncoding.d.ts +2 -3
- package/dist/utils/textEncoding.js +31 -274
- package/dist/utils/textEncoding.js.map +1 -1
- package/package.json +10 -19
- package/scripts/download-models.cjs +190 -0
package/OFFLINE_MODELS.md
ADDED
@@ -0,0 +1,56 @@
+# Offline Models
+
+Brainy uses Transformers.js with ONNX Runtime for **true offline operation** - no more TensorFlow.js dependency hell!
+
+## How it works
+
+Brainy automatically figures out the best approach:
+
+1. **First use**: Downloads models once (~87 MB) to local cache
+2. **Subsequent use**: Loads from cache (completely offline, zero network calls)
+3. **Smart detection**: Automatically finds models in cache, bundled, or downloads as needed
+
+## Standard usage
+
+```bash
+npm install @soulcraft/brainy
+# Use immediately - models download automatically on first use
+```
+
+## Docker with production egress restrictions
+
+For environments where production has no internet but build does:
+
+```dockerfile
+FROM node:24-slim
+WORKDIR /app
+COPY package*.json ./
+RUN npm install @soulcraft/brainy
+RUN npm run download-models # Download during build (when internet available)
+COPY . .
+# Production container now works completely offline
+```
+
+## Development with immediate offline
+
+If you want models available immediately for development:
+
+```bash
+npm install @soulcraft/brainy
+npm run download-models # Optional: download now instead of on first use
+```
+
+## Key benefits vs TensorFlow.js
+
+- ✅ **95% smaller package** - 643 kB vs 12.5 MB
+- ✅ **84% smaller models** - 87 MB vs 525 MB
+- ✅ **True offline** - Zero network calls after initial download
+- ✅ **No dependency issues** - 5 deps vs 47+, no more --legacy-peer-deps
+- ✅ **Better performance** - ONNX Runtime beats TensorFlow.js
+- ✅ **Same API** - Drop-in replacement
+
+## Philosophy
+
+**Install and use. Brainy handles the rest.**
+
+No configuration files, no environment variables, no complex setup. Brainy detects your environment and does the right thing automatically.
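The "How it works" flow above (download once on first use, then serve everything from the local cache) is easiest to see from application code. A minimal sketch, assuming the `BrainyData` constructor shown in the README hunk below; the `add` and `search` calls and their signatures are illustrative assumptions, not confirmed by this diff:

```javascript
import { BrainyData } from '@soulcraft/brainy'

// Uses the all-MiniLM-L6-v2 model via Transformers.js / ONNX Runtime.
const db = new BrainyData({
  embedding: { type: 'transformer' }
})

// The first call that needs an embedding triggers the one-time ~87 MB model download
// (skipped entirely if `npm run download-models` already populated the cache).
await db.add('The quick brown fox jumps over the lazy dog', { source: 'demo' }) // hypothetical signature

// Later calls load the model from the local cache - zero network calls.
const results = await db.search('fast animals', 5) // hypothetical signature
```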
package/README.md
CHANGED
@@ -11,6 +11,51 @@

 </div>

+## 🔥 MAJOR UPDATE: TensorFlow.js → Transformers.js Migration (v0.46+)
+
+**We've completely replaced TensorFlow.js with Transformers.js for better performance and true offline operation!**
+
+### Why We Made This Change
+
+**The Honest Truth About TensorFlow.js:**
+
+- 📦 **Massive Package Size**: 12.5MB+ packages with complex dependency trees
+- 🌐 **Hidden Network Calls**: Even "local" models triggered fetch() calls internally
+- 🐛 **Dependency Hell**: Constant `--legacy-peer-deps` issues with Node.js updates
+- 🔧 **Maintenance Burden**: 47+ dependencies to keep compatible across environments
+- 💾 **Huge Models**: 525MB Universal Sentence Encoder models
+
+### What You Get Now
+
+- ✅ **95% Smaller Package**: 643 kB vs 12.5 MB (and it actually works better!)
+- ✅ **84% Smaller Models**: 87 MB vs 525 MB all-MiniLM-L6-v2 vs USE
+- ✅ **True Offline Operation**: Zero network calls after initial model download
+- ✅ **5x Fewer Dependencies**: Clean dependency tree, no more peer dep issues
+- ✅ **Same API**: Drop-in replacement - your existing code just works
+- ✅ **Better Performance**: ONNX Runtime is faster than TensorFlow.js in most cases
+
+### Migration (It's Automatic!)
+
+```javascript
+// Your existing code works unchanged!
+import { BrainyData } from '@soulcraft/brainy'
+
+const db = new BrainyData({
+  embedding: { type: 'transformer' } // Now uses Transformers.js automatically
+})
+
+// Dimensions changed from 512 → 384 (handled automatically)
+```
+
+**For Docker/Production or No Egress:**
+
+```dockerfile
+RUN npm install @soulcraft/brainy
+RUN npm run download-models # Download during build for offline production
+```
+
+---
+
 ## ✨ What is Brainy?

 Imagine a database that thinks like you do - connecting ideas, finding patterns, and getting smarter over time. Brainy
@@ -33,7 +78,7 @@ easy-to-use package.
   environment and optimizes itself
 - **🌍 True Write-Once, Run-Anywhere** - Same code runs in Angular, React, Vue, Node.js, Deno, Bun, serverless, edge
   workers, and web workers with automatic environment detection
-- **⚡ Scary Fast** - Handles millions of vectors with sub-millisecond search.
+- **⚡ Scary Fast** - Handles millions of vectors with sub-millisecond search. GPU acceleration for embeddings, optimized CPU for distance calculations
 - **🎯 Self-Learning** - Like having a database that goes to the gym. Gets faster and smarter the more you use it
 - **🔮 AI-First Design** - Built for the age of embeddings, RAG, and semantic search. Your LLMs will thank you
 - **🎮 Actually Fun to Use** - Clean API, great DX, and it does the heavy lifting so you can build cool stuff
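The migration note above calls out the embedding dimension change from 512 (Universal Sentence Encoder) to 384 (all-MiniLM-L6-v2). A quick way to check that after upgrading, assuming `defaultEmbeddingFunction` is reachable from the package root - the diff only shows it exported from the internal `./utils/index.js` module, so this import path is an assumption:

```javascript
// Sanity check for the 512 → 384 dimension change described in the README hunk above.
// Assumption: defaultEmbeddingFunction is re-exported from the package root.
import { defaultEmbeddingFunction } from '@soulcraft/brainy'

const vector = await defaultEmbeddingFunction('hello world')
console.log(vector.length) // expected: 384 with all-MiniLM-L6-v2 (was 512 with USE)
```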
package/dist/brainyData.js
CHANGED
@@ -5,7 +5,7 @@
 import { v4 as uuidv4 } from 'uuid';
 import { HNSWIndexOptimized } from './hnsw/hnswIndexOptimized.js';
 import { createStorage } from './storage/storageFactory.js';
-import { cosineDistance,
+import { cosineDistance, defaultEmbeddingFunction, cleanupWorkerPools, batchEmbed } from './utils/index.js';
 import { getAugmentationVersion } from './utils/version.js';
 import { NounType, VerbType } from './types/graphTypes.js';
 import { createServerSearchAugmentations } from './augmentations/serverSearchAugmentations.js';
@@ -73,8 +73,8 @@ export class BrainyData {
         this.healthMonitor = null;
         // Statistics collector
         this.statisticsCollector = new StatisticsCollector();
-        // Set dimensions to fixed value of
-        this._dimensions =
+        // Set dimensions to fixed value of 384 (all-MiniLM-L6-v2 dimension)
+        this._dimensions = 384;
         // Set distance function
         this.distanceFunction = config.distanceFunction || cosineDistance;
         // Always use the optimized HNSW index implementation
@@ -99,9 +99,7 @@ export class BrainyData {
             this.embeddingFunction = config.embeddingFunction;
         }
         else {
-            this.embeddingFunction =
-                verbose: this.loggingConfig?.verbose
-            });
+            this.embeddingFunction = defaultEmbeddingFunction;
         }
         // Set persistent storage request flag
         this.requestPersistentStorage =
@@ -554,8 +552,8 @@ export class BrainyData {
                 await new Promise((resolve) => setTimeout(resolve, 1000));
                 // Try again with a different approach - use the non-threaded version
                 // This is a fallback in case the threaded version fails
-                const {
-                const fallbackEmbeddingFunction =
+                const { createEmbeddingFunction } = await import('./utils/embedding.js');
+                const fallbackEmbeddingFunction = createEmbeddingFunction();
                 // Test the fallback embedding function
                 await fallbackEmbeddingFunction('');
                 // If successful, replace the embedding function
@@ -1181,7 +1179,7 @@ export class BrainyData {
             // Extract just the text for batch embedding
             const texts = textItems.map((item) => item.text);
             // Perform batch embedding
-            const embeddings = await
+            const embeddings = await batchEmbed(texts);
             // Add each item with its embedding
             textPromises = textItems.map((item, i) => this.add(embeddings[i], item.metadata, {
                 ...options,
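The last hunk above routes text batches through `batchEmbed(texts)` before handing each vector to `add`. A condensed sketch of that pattern from the caller's side, assuming `batchEmbed` is re-exported from the package root (the diff only shows it imported from the internal `./utils/index.js`) and that `add` accepts a precomputed vector plus metadata, as the hunk suggests:

```javascript
import { BrainyData, batchEmbed } from '@soulcraft/brainy' // root re-export of batchEmbed is an assumption

const db = new BrainyData({ embedding: { type: 'transformer' } })

const items = [
  { text: 'first document', metadata: { id: 1 } },
  { text: 'second document', metadata: { id: 2 } }
]

// Embed all texts in one call, mirroring the batch path in the hunk around line 1179.
const embeddings = await batchEmbed(items.map((item) => item.text))

// Add each precomputed vector with its metadata, as BrainyData does internally.
await Promise.all(items.map((item, i) => db.add(embeddings[i], item.metadata)))
```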