@soulcraft/brainy 1.5.0 → 2.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +188 -0
- package/LICENSE +2 -2
- package/README.md +200 -595
- package/bin/brainy-interactive.js +564 -0
- package/bin/brainy-ts.js +18 -0
- package/bin/brainy.js +672 -81
- package/dist/augmentationPipeline.d.ts +48 -220
- package/dist/augmentationPipeline.js +60 -508
- package/dist/augmentationRegistry.d.ts +22 -31
- package/dist/augmentationRegistry.js +28 -79
- package/dist/augmentations/apiServerAugmentation.d.ts +108 -0
- package/dist/augmentations/apiServerAugmentation.js +502 -0
- package/dist/augmentations/batchProcessingAugmentation.d.ts +95 -0
- package/dist/augmentations/batchProcessingAugmentation.js +567 -0
- package/dist/augmentations/brainyAugmentation.d.ts +153 -0
- package/dist/augmentations/brainyAugmentation.js +145 -0
- package/dist/augmentations/cacheAugmentation.d.ts +105 -0
- package/dist/augmentations/cacheAugmentation.js +238 -0
- package/dist/augmentations/conduitAugmentations.d.ts +54 -156
- package/dist/augmentations/conduitAugmentations.js +156 -1082
- package/dist/augmentations/connectionPoolAugmentation.d.ts +62 -0
- package/dist/augmentations/connectionPoolAugmentation.js +316 -0
- package/dist/augmentations/defaultAugmentations.d.ts +53 -0
- package/dist/augmentations/defaultAugmentations.js +88 -0
- package/dist/augmentations/entityRegistryAugmentation.d.ts +126 -0
- package/dist/augmentations/entityRegistryAugmentation.js +386 -0
- package/dist/augmentations/indexAugmentation.d.ts +117 -0
- package/dist/augmentations/indexAugmentation.js +284 -0
- package/dist/augmentations/intelligentVerbScoringAugmentation.d.ts +152 -0
- package/dist/augmentations/intelligentVerbScoringAugmentation.js +554 -0
- package/dist/augmentations/metricsAugmentation.d.ts +202 -0
- package/dist/augmentations/metricsAugmentation.js +291 -0
- package/dist/augmentations/monitoringAugmentation.d.ts +94 -0
- package/dist/augmentations/monitoringAugmentation.js +227 -0
- package/dist/augmentations/neuralImport.d.ts +50 -117
- package/dist/augmentations/neuralImport.js +255 -629
- package/dist/augmentations/requestDeduplicatorAugmentation.d.ts +52 -0
- package/dist/augmentations/requestDeduplicatorAugmentation.js +162 -0
- package/dist/augmentations/serverSearchAugmentations.d.ts +43 -22
- package/dist/augmentations/serverSearchAugmentations.js +125 -72
- package/dist/augmentations/storageAugmentation.d.ts +54 -0
- package/dist/augmentations/storageAugmentation.js +93 -0
- package/dist/augmentations/storageAugmentations.d.ts +96 -0
- package/dist/augmentations/storageAugmentations.js +182 -0
- package/dist/augmentations/synapseAugmentation.d.ts +156 -0
- package/dist/augmentations/synapseAugmentation.js +312 -0
- package/dist/augmentations/walAugmentation.d.ts +108 -0
- package/dist/augmentations/walAugmentation.js +515 -0
- package/dist/brainyData.d.ts +404 -130
- package/dist/brainyData.js +1331 -853
- package/dist/chat/BrainyChat.d.ts +16 -8
- package/dist/chat/BrainyChat.js +60 -32
- package/dist/chat/ChatCLI.d.ts +1 -1
- package/dist/chat/ChatCLI.js +6 -6
- package/dist/cli/catalog.d.ts +3 -3
- package/dist/cli/catalog.js +116 -70
- package/dist/cli/commands/core.d.ts +61 -0
- package/dist/cli/commands/core.js +348 -0
- package/dist/cli/commands/neural.d.ts +25 -0
- package/dist/cli/commands/neural.js +508 -0
- package/dist/cli/commands/utility.d.ts +37 -0
- package/dist/cli/commands/utility.js +276 -0
- package/dist/cli/index.d.ts +7 -0
- package/dist/cli/index.js +167 -0
- package/dist/cli/interactive.d.ts +164 -0
- package/dist/cli/interactive.js +542 -0
- package/dist/cortex/neuralImport.js +5 -5
- package/dist/critical/model-guardian.js +11 -4
- package/dist/embeddings/lightweight-embedder.d.ts +23 -0
- package/dist/embeddings/lightweight-embedder.js +136 -0
- package/dist/embeddings/universal-memory-manager.d.ts +38 -0
- package/dist/embeddings/universal-memory-manager.js +206 -0
- package/dist/embeddings/worker-embedding.d.ts +7 -0
- package/dist/embeddings/worker-embedding.js +77 -0
- package/dist/embeddings/worker-manager.d.ts +28 -0
- package/dist/embeddings/worker-manager.js +162 -0
- package/dist/examples/basicUsage.js +7 -7
- package/dist/graph/pathfinding.d.ts +78 -0
- package/dist/graph/pathfinding.js +393 -0
- package/dist/hnsw/hnswIndex.d.ts +13 -0
- package/dist/hnsw/hnswIndex.js +35 -0
- package/dist/hnsw/hnswIndexOptimized.d.ts +1 -0
- package/dist/hnsw/hnswIndexOptimized.js +3 -0
- package/dist/index.d.ts +9 -11
- package/dist/index.js +21 -11
- package/dist/indices/fieldIndex.d.ts +76 -0
- package/dist/indices/fieldIndex.js +357 -0
- package/dist/mcp/brainyMCPAdapter.js +3 -2
- package/dist/mcp/mcpAugmentationToolset.js +11 -17
- package/dist/neural/embeddedPatterns.d.ts +41 -0
- package/dist/neural/embeddedPatterns.js +4044 -0
- package/dist/neural/naturalLanguageProcessor.d.ts +94 -0
- package/dist/neural/naturalLanguageProcessor.js +317 -0
- package/dist/neural/naturalLanguageProcessorStatic.d.ts +64 -0
- package/dist/neural/naturalLanguageProcessorStatic.js +151 -0
- package/dist/neural/neuralAPI.d.ts +255 -0
- package/dist/neural/neuralAPI.js +612 -0
- package/dist/neural/patternLibrary.d.ts +101 -0
- package/dist/neural/patternLibrary.js +313 -0
- package/dist/neural/patterns.d.ts +27 -0
- package/dist/neural/patterns.js +68 -0
- package/dist/neural/staticPatternMatcher.d.ts +35 -0
- package/dist/neural/staticPatternMatcher.js +153 -0
- package/dist/scripts/precomputePatternEmbeddings.d.ts +19 -0
- package/dist/scripts/precomputePatternEmbeddings.js +100 -0
- package/dist/storage/adapters/fileSystemStorage.d.ts +5 -0
- package/dist/storage/adapters/fileSystemStorage.js +20 -0
- package/dist/storage/adapters/s3CompatibleStorage.d.ts +5 -0
- package/dist/storage/adapters/s3CompatibleStorage.js +16 -0
- package/dist/storage/enhancedClearOperations.d.ts +83 -0
- package/dist/storage/enhancedClearOperations.js +345 -0
- package/dist/storage/storageFactory.js +31 -27
- package/dist/triple/TripleIntelligence.d.ts +134 -0
- package/dist/triple/TripleIntelligence.js +548 -0
- package/dist/types/augmentations.d.ts +45 -344
- package/dist/types/augmentations.js +5 -2
- package/dist/types/brainyDataInterface.d.ts +20 -10
- package/dist/types/graphTypes.d.ts +46 -0
- package/dist/types/graphTypes.js +16 -2
- package/dist/utils/BoundedRegistry.d.ts +29 -0
- package/dist/utils/BoundedRegistry.js +54 -0
- package/dist/utils/embedding.js +20 -3
- package/dist/utils/hybridModelManager.js +10 -5
- package/dist/utils/metadataFilter.d.ts +33 -19
- package/dist/utils/metadataFilter.js +58 -23
- package/dist/utils/metadataIndex.d.ts +37 -6
- package/dist/utils/metadataIndex.js +427 -64
- package/dist/utils/requestDeduplicator.d.ts +10 -0
- package/dist/utils/requestDeduplicator.js +24 -0
- package/dist/utils/unifiedCache.d.ts +103 -0
- package/dist/utils/unifiedCache.js +311 -0
- package/package.json +40 -125
- package/scripts/ensure-models.js +108 -0
- package/scripts/prepare-models.js +387 -0
- package/OFFLINE_MODELS.md +0 -56
- package/dist/intelligence/neuralEngine.d.ts +0 -207
- package/dist/intelligence/neuralEngine.js +0 -706
- package/dist/utils/modelLoader.d.ts +0 -32
- package/dist/utils/modelLoader.js +0 -219
- package/dist/utils/modelManager.d.ts +0 -77
- package/dist/utils/modelManager.js +0 -219
package/scripts/ensure-models.js
@@ -0,0 +1,108 @@
+#!/usr/bin/env node
+/**
+ * Ensures transformer models are available for production
+ * This script handles model availability in multiple ways:
+ * 1. Check if models exist locally
+ * 2. Download from CDN if needed
+ * 3. Verify model integrity
+ */
+
+import { existsSync } from 'fs'
+import { readFile, mkdir, writeFile } from 'fs/promises'
+import { join, dirname } from 'path'
+import { createHash } from 'crypto'
+import { fileURLToPath } from 'url'
+
+const __dirname = dirname(fileURLToPath(import.meta.url))
+const PROJECT_ROOT = join(__dirname, '..')
+
+// Model configuration
+const MODEL_CONFIG = {
+  name: 'Xenova/all-MiniLM-L6-v2',
+  files: {
+    'onnx/model.onnx': {
+      size: 90555481, // 86.3 MB
+      sha256: 'expected_hash_here' // We'd compute this from actual model
+    },
+    'tokenizer.json': {
+      size: 711661,
+      sha256: 'expected_hash_here'
+    },
+    'tokenizer_config.json': {
+      size: 366,
+      sha256: 'expected_hash_here'
+    },
+    'config.json': {
+      size: 650,
+      sha256: 'expected_hash_here'
+    }
+  }
+}
+
+// CDN URLs for model files (would be your own CDN in production)
+const CDN_BASE = 'https://cdn.soulcraft.com/models'
+
+async function ensureModels() {
+  const modelsDir = join(PROJECT_ROOT, 'models', 'Xenova', 'all-MiniLM-L6-v2')
+
+  console.log('🔍 Checking for transformer models...')
+
+  // Check if all model files exist
+  let missingFiles = []
+  for (const [filePath, info] of Object.entries(MODEL_CONFIG.files)) {
+    const fullPath = join(modelsDir, filePath)
+    if (!existsSync(fullPath)) {
+      missingFiles.push(filePath)
+    }
+  }
+
+  if (missingFiles.length === 0) {
+    console.log('✅ All model files present')
+
+    // Optionally verify integrity
+    if (process.env.VERIFY_MODELS === 'true') {
+      console.log('🔐 Verifying model integrity...')
+      // Add hash verification here
+    }
+
+    return true
+  }
+
+  console.log(`⚠️ Missing ${missingFiles.length} model files`)
+
+  // In production, models should be pre-bundled
+  if (process.env.NODE_ENV === 'production' && !process.env.ALLOW_MODEL_DOWNLOAD) {
+    throw new Error(
+      'Critical: Transformer models not found in production. ' +
+      'Run "npm run download-models" during build stage.'
+    )
+  }
+
+  // Development: offer to download
+  if (process.env.CI !== 'true') {
+    console.log('📥 Would download models from CDN in development')
+    console.log('   Run: npm run download-models')
+  }
+
+  return false
+}
+
+// Export for use in main code
+export async function verifyModelsAvailable() {
+  try {
+    return await ensureModels()
+  } catch (error) {
+    console.error('❌ Model verification failed:', error.message)
+    return false
+  }
+}
+
+// Run if called directly
+if (import.meta.url === `file://${process.argv[1]}`) {
+  ensureModels()
+    .then(success => process.exit(success ? 0 : 1))
+    .catch(error => {
+      console.error(error)
+      process.exit(1)
+    })
+}
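The new `scripts/ensure-models.js` exports `verifyModelsAvailable()` "for use in main code" but leaves integrity checking as a placeholder (`// Add hash verification here`). A minimal sketch of how both might be used is shown below; it is not part of the package, and the import path, the `EXPECTED_TOKENIZER_SHA256` variable, and the idea of recording real digests (instead of the shipped `'expected_hash_here'` placeholders) are assumptions.

```js
// Sketch only (not in the package): one way to fill the "Add hash verification here"
// placeholder, reusing the createHash/readFile imports ensure-models.js already declares.
import { createHash } from 'crypto'
import { readFile } from 'fs/promises'
import { join } from 'path'
import { verifyModelsAvailable } from './scripts/ensure-models.js' // hypothetical path

// Compute a file's sha256 and compare it with a digest recorded at build time.
async function verifyFileHash(modelsDir, relativePath, expectedSha256) {
  const data = await readFile(join(modelsDir, relativePath))
  return createHash('sha256').update(data).digest('hex') === expectedSha256
}

// Example startup check: warn rather than crash, so the first-use download can still run.
if (!(await verifyModelsAvailable())) {
  console.warn('Transformer models not present yet; they will be fetched on first use')
}

const tokenizerOk = await verifyFileHash(
  'models/Xenova/all-MiniLM-L6-v2',
  'tokenizer.json',
  process.env.EXPECTED_TOKENIZER_SHA256 // hypothetical digest supplied by the build
)
console.log('tokenizer.json hash ok:', tokenizerOk)
```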
package/scripts/prepare-models.js
@@ -0,0 +1,387 @@
+#!/usr/bin/env node
+/**
+ * Prepare Models Script
+ *
+ * Intelligently handles model preparation for different deployment scenarios:
+ * 1. Development: Models download automatically on first use
+ * 2. Docker/CI: Pre-download during build stage
+ * 3. Serverless: Bundle with deployment package
+ * 4. Production: Verify models exist, fail fast if missing
+ */
+
+import { existsSync } from 'fs'
+import { readFile, mkdir, writeFile, stat, unlink } from 'fs/promises'
+import { join, dirname } from 'path'
+import { fileURLToPath } from 'url'
+import { pipeline, env } from '@huggingface/transformers'
+import { execSync } from 'child_process'
+import https from 'https'
+import { createWriteStream } from 'fs'
+import { promisify } from 'util'
+import { finished } from 'stream'
+
+const streamFinished = promisify(finished)
+const __dirname = dirname(fileURLToPath(import.meta.url))
+
+// Model configuration
+const MODEL_CONFIG = {
+  name: 'Xenova/all-MiniLM-L6-v2',
+  expectedFiles: [
+    'config.json',
+    'tokenizer.json',
+    'tokenizer_config.json',
+    'onnx/model.onnx'
+  ],
+  fallbackUrls: {
+    // GitHub Releases (our backup)
+    github: 'https://github.com/soulcraftlabs/brainy-models/releases/download/v1.0/all-MiniLM-L6-v2.tar.gz',
+    // Future CDN
+    cdn: 'https://models.soulcraft.com/brainy/all-MiniLM-L6-v2.tar.gz'
+  }
+}
+
+class ModelPreparer {
+  constructor() {
+    this.modelsDir = join(__dirname, '..', 'models')
+    this.modelPath = join(this.modelsDir, ...MODEL_CONFIG.name.split('/'))
+  }
+
+  /**
+   * Main entry point - intelligently prepares models based on context
+   */
+  async prepare() {
+    console.log('🧠 Brainy Model Preparation')
+    console.log('===========================')
+
+    // Detect deployment context
+    const context = this.detectContext()
+    console.log(`📍 Context: ${context}`)
+
+    switch (context) {
+      case 'production':
+        return await this.prepareProduction()
+      case 'docker':
+        return await this.prepareDocker()
+      case 'ci':
+        return await this.prepareCI()
+      case 'development':
+        return await this.prepareDevelopment()
+      default:
+        return await this.prepareDefault()
+    }
+  }
+
+  /**
+   * Detect the deployment context
+   */
+  detectContext() {
+    // Check environment variables
+    if (process.env.NODE_ENV === 'production') return 'production'
+    if (process.env.DOCKER_BUILD === 'true') return 'docker'
+    if (process.env.CI === 'true') return 'ci'
+    if (process.env.NODE_ENV === 'development') return 'development'
+
+    // Check for Docker build context
+    if (existsSync('/.dockerenv')) return 'docker'
+
+    // Check for common CI indicators
+    if (process.env.GITHUB_ACTIONS || process.env.GITLAB_CI) return 'ci'
+
+    // Default to development
+    return 'development'
+  }
+
+  /**
+   * Production: Models MUST exist, fail fast if not
+   */
+  async prepareProduction() {
+    console.log('🏭 Production mode - verifying models...')
+
+    const modelExists = await this.verifyModels()
+
+    if (!modelExists) {
+      console.error('❌ CRITICAL: Models not found in production!')
+      console.error('   Models must be pre-downloaded during build stage.')
+      console.error('   Run: npm run download-models')
+      process.exit(1)
+    }
+
+    console.log('✅ Models verified for production')
+    return true
+  }
+
+  /**
+   * Docker: Download models during build stage
+   */
+  async prepareDocker() {
+    console.log('🐳 Docker build - downloading models...')
+
+    // Check if already exists
+    if (await this.verifyModels()) {
+      console.log('✅ Models already present')
+      return true
+    }
+
+    // Download models
+    return await this.downloadModels()
+  }
+
+  /**
+   * CI: Download models for testing
+   */
+  async prepareCI() {
+    console.log('🔧 CI environment - downloading models for tests...')
+
+    // Check cache first
+    if (await this.checkCICache()) {
+      console.log('✅ Using cached models')
+      return true
+    }
+
+    // Download and cache
+    const success = await this.downloadModels()
+    if (success) {
+      await this.saveCICache()
+    }
+    return success
+  }
+
+  /**
+   * Development: Optional download, will auto-download on first use
+   */
+  async prepareDevelopment() {
+    console.log('💻 Development mode')
+
+    if (await this.verifyModels()) {
+      console.log('✅ Models already downloaded')
+      return true
+    }
+
+    console.log('ℹ️ Models will download automatically on first use')
+    console.log('   To pre-download now: npm run download-models')
+
+    // Ask if they want to download now
+    if (process.stdout.isTTY && !process.env.SKIP_PROMPT) {
+      const readline = await import('readline')
+      const rl = readline.createInterface({
+        input: process.stdin,
+        output: process.stdout
+      })
+
+      return new Promise((resolve) => {
+        rl.question('Download models now? (y/N): ', async (answer) => {
+          rl.close()
+          if (answer.toLowerCase() === 'y') {
+            resolve(await this.downloadModels())
+          } else {
+            resolve(true)
+          }
+        })
+      })
+    }
+
+    return true
+  }
+
+  /**
+   * Default: Try to be smart about it
+   */
+  async prepareDefault() {
+    console.log('🤖 Auto-detecting best approach...')
+
+    if (await this.verifyModels()) {
+      console.log('✅ Models found')
+      return true
+    }
+
+    // If running as part of install, don't download
+    if (process.env.npm_lifecycle_event === 'postinstall') {
+      console.log('ℹ️ Skipping download during install (will download on first use)')
+      return true
+    }
+
+    // Otherwise download
+    return await this.downloadModels()
+  }
+
+  /**
+   * Verify all required model files exist
+   */
+  async verifyModels() {
+    for (const file of MODEL_CONFIG.expectedFiles) {
+      const filePath = join(this.modelPath, file)
+      if (!existsSync(filePath)) {
+        return false
+      }
+    }
+
+    // Verify model.onnx size (should be ~87MB)
+    const modelOnnxPath = join(this.modelPath, 'onnx', 'model.onnx')
+    if (existsSync(modelOnnxPath)) {
+      const stats = await stat(modelOnnxPath)
+      const sizeMB = Math.round(stats.size / (1024 * 1024))
+      if (sizeMB < 80 || sizeMB > 100) {
+        console.warn(`⚠️ Model size unexpected: ${sizeMB}MB (expected ~87MB)`)
+        return false
+      }
+    }
+
+    return true
+  }
+
+  /**
+   * Download models with fallback sources
+   */
+  async downloadModels() {
+    console.log('📥 Downloading transformer models...')
+
+    // Try transformers.js first (Hugging Face)
+    try {
+      await this.downloadFromTransformers()
+      console.log('✅ Downloaded from Hugging Face')
+      return true
+    } catch (error) {
+      console.warn('⚠️ Hugging Face download failed:', error.message)
+    }
+
+    // Try GitHub releases
+    try {
+      await this.downloadFromGitHub()
+      console.log('✅ Downloaded from GitHub')
+      return true
+    } catch (error) {
+      console.warn('⚠️ GitHub download failed:', error.message)
+    }
+
+    // Try CDN
+    try {
+      await this.downloadFromCDN()
+      console.log('✅ Downloaded from CDN')
+      return true
+    } catch (error) {
+      console.warn('⚠️ CDN download failed:', error.message)
+    }
+
+    console.error('❌ All download sources failed')
+    return false
+  }
+
+  /**
+   * Download using transformers.js (official Hugging Face)
+   */
+  async downloadFromTransformers() {
+    env.cacheDir = this.modelsDir
+    env.allowRemoteModels = true
+
+    console.log('   Source: Hugging Face')
+    console.log('   Model:', MODEL_CONFIG.name)
+
+    // Load pipeline to trigger download
+    const extractor = await pipeline('feature-extraction', MODEL_CONFIG.name)
+
+    // Test it works
+    const test = await extractor('test', { pooling: 'mean', normalize: true })
+    console.log(`   ✓ Model test passed (dims: ${test.data.length})`)
+
+    return true
+  }
+
+  /**
+   * Download from GitHub releases (our backup)
+   */
+  async downloadFromGitHub() {
+    const url = MODEL_CONFIG.fallbackUrls.github
+    console.log('   Source: GitHub Releases')
+
+    // Download tar.gz
+    const tempFile = join(this.modelsDir, 'temp-model.tar.gz')
+    await this.downloadFile(url, tempFile)
+
+    // Extract
+    await mkdir(this.modelPath, { recursive: true })
+    execSync(`tar -xzf ${tempFile} -C ${this.modelPath}`, { stdio: 'inherit' })
+
+    // Cleanup
+    await unlink(tempFile)
+
+    return true
+  }
+
+  /**
+   * Download from CDN (future)
+   */
+  async downloadFromCDN() {
+    const url = MODEL_CONFIG.fallbackUrls.cdn
+    console.log('   Source: Soulcraft CDN')
+
+    // Similar to GitHub approach
+    throw new Error('CDN not yet available')
+  }
+
+  /**
+   * Download a file from URL
+   */
+  async downloadFile(url, destination) {
+    await mkdir(dirname(destination), { recursive: true })
+
+    return new Promise((resolve, reject) => {
+      const file = createWriteStream(destination)
+
+      https.get(url, (response) => {
+        if (response.statusCode !== 200) {
+          reject(new Error(`HTTP ${response.statusCode}`))
+          return
+        }
+
+        response.pipe(file)
+
+        file.on('finish', () => {
+          file.close()
+          resolve()
+        })
+      }).on('error', reject)
+    })
+  }
+
+  /**
+   * Check CI cache for models
+   */
+  async checkCICache() {
+    // GitHub Actions cache
+    if (process.env.GITHUB_ACTIONS) {
+      const cachePath = process.env.RUNNER_TEMP + '/brainy-models'
+      if (existsSync(cachePath)) {
+        // Copy from cache
+        execSync(`cp -r ${cachePath}/* ${this.modelsDir}/`, { stdio: 'inherit' })
+        return true
+      }
+    }
+
+    return false
+  }
+
+  /**
+   * Save models to CI cache
+   */
+  async saveCICache() {
+    // GitHub Actions cache
+    if (process.env.GITHUB_ACTIONS) {
+      const cachePath = process.env.RUNNER_TEMP + '/brainy-models'
+      await mkdir(cachePath, { recursive: true })
+      execSync(`cp -r ${this.modelsDir}/* ${cachePath}/`, { stdio: 'inherit' })
+    }
+  }
+}
+
+// Run the preparer
+const preparer = new ModelPreparer()
+preparer.prepare()
+  .then(success => {
+    if (!success) {
+      process.exit(1)
+    }
+  })
+  .catch(error => {
+    console.error('❌ Fatal error:', error)
+    process.exit(1)
+  })
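`scripts/prepare-models.js` downloads through @huggingface/transformers by setting `env.cacheDir` and `env.allowRemoteModels = true`. A minimal sketch of the consuming side is shown below; only the `env` fields and the pipeline call are taken from the script above, while the cache location, the `allowRemoteModels = false` choice, and the snippet as a whole are assumptions rather than code from the package.

```js
// Rough sketch (not in the package): after prepare-models.js has populated ./models,
// point transformers.js at that directory and forbid remote fetches so the pipeline
// loads fully offline.
import { pipeline, env } from '@huggingface/transformers'
import { fileURLToPath } from 'url'

env.cacheDir = fileURLToPath(new URL('./models', import.meta.url)) // assumed cache root
env.allowRemoteModels = false // fail fast instead of silently re-downloading

const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')
const embedding = await extractor('hello world', { pooling: 'mean', normalize: true })
console.log('dims:', embedding.data.length) // all-MiniLM-L6-v2 produces 384-dim vectors
```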
package/OFFLINE_MODELS.md
DELETED
@@ -1,56 +0,0 @@
-# Offline Models
-
-Brainy uses Transformers.js with ONNX Runtime for **true offline operation** - no more TensorFlow.js dependency hell!
-
-## How it works
-
-Brainy automatically figures out the best approach:
-
-1. **First use**: Downloads models once (~87 MB) to local cache
-2. **Subsequent use**: Loads from cache (completely offline, zero network calls)
-3. **Smart detection**: Automatically finds models in cache, bundled, or downloads as needed
-
-## Standard usage
-
-```bash
-npm install @soulcraft/brainy
-# Use immediately - models download automatically on first use
-```
-
-## Docker with production egress restrictions
-
-For environments where production has no internet but build does:
-
-```dockerfile
-FROM node:24-slim
-WORKDIR /app
-COPY package*.json ./
-RUN npm install @soulcraft/brainy
-RUN npm run download-models  # Download during build (when internet available)
-COPY . .
-# Production container now works completely offline
-```
-
-## Development with immediate offline
-
-If you want models available immediately for development:
-
-```bash
-npm install @soulcraft/brainy
-npm run download-models  # Optional: download now instead of on first use
-```
-
-## Key benefits vs TensorFlow.js
-
-- ✅ **95% smaller package** - 643 kB vs 12.5 MB
-- ✅ **84% smaller models** - 87 MB vs 525 MB
-- ✅ **True offline** - Zero network calls after initial download
-- ✅ **No dependency issues** - 5 deps vs 47+, no more --legacy-peer-deps
-- ✅ **Better performance** - ONNX Runtime beats TensorFlow.js
-- ✅ **Same API** - Drop-in replacement
-
-## Philosophy
-
-**Install and use. Brainy handles the rest.**
-
-No configuration files, no environment variables, no complex setup. Brainy detects your environment and does the right thing automatically.