llm-checker 3.6.1 → 3.7.4
- package/README.md +45 -8
- package/bin/enhanced_cli.js +407 -5
- package/bin/mcp-server.mjs +5 -0
- package/package.json +7 -2
- package/src/data/model-database.js +452 -0
- package/src/data/registry-ingestors.js +765 -0
- package/src/data/registry-recommender.js +632 -0
- package/src/data/seed/README.md +11 -3
- package/src/data/seed/models.db +0 -0
- package/src/index.js +68 -4
- package/src/models/deterministic-selector.js +85 -39
- package/src/models/moe-assumptions.js +11 -0
package/README.md
CHANGED
|
@@ -5,7 +5,7 @@
|
|
|
5
5
|
**Intelligent Ollama Model Selector**
|
|
6
6
|
|
|
7
7
|
AI-powered CLI that analyzes your hardware and recommends optimal LLM models.
|
|
8
|
-
Deterministic scoring across **
|
|
8
|
+
Deterministic scoring across a packaged **multi-source registry** (Hugging Face + Ollama + GPT4All, **33k+ exact artifacts**) and the Ollama catalog, with live sync, runtime targeting, and hardware-calibrated memory estimation.
|
|
9
9
|
|
|
10
10
|
[](https://www.npmjs.com/package/llm-checker)
|
|
11
11
|
[](https://www.npmjs.com/package/llm-checker)
|
|
@@ -39,6 +39,7 @@ Choosing the right LLM for your hardware is complex. With thousands of model var
|
|
|
39
39
|
| | Feature | Description |
|
|
40
40
|
|:---:|---|---|
|
|
41
41
|
| **200+** | Packaged Model Catalog | Ships with a synced Ollama SQLite catalog and can refresh from Ollama on demand |
|
|
42
|
+
| **33k+** | Multi-Source Registry | Exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All with per-source commands and runtime targeting |
|
|
42
43
|
| **4D** | Scoring Engine | Quality, Speed, Fit, Context — weighted by use case |
|
|
43
44
|
| **Multi-GPU** | Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU, integrated/dedicated inventory visibility |
|
|
44
45
|
| **Calibrated** | Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |
|
|
@@ -151,6 +152,14 @@ hash -r
|
|
|
151
152
|
llm-checker --version
|
|
152
153
|
```
|
|
153
154
|
|
|
155
|
+
### v3.7.0 Highlights
|
|
156
|
+
|
|
157
|
+
- New **multi-source model registry**: a packaged snapshot of ~33,700 exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All, with per-source commands (`hf download ...`, `ollama pull ...`).
|
|
158
|
+
- `recommend` and `check` now draw candidates from the registry through one canonical deterministic scoring core, with `--runtime auto/ollama/vllm/mlx/llama.cpp/transformers` targeting; they fall back to the Ollama catalog when the registry is unavailable.
|
|
159
|
+
- New `registry-sync`, `registry-search`, and `registry-recommend` commands.
|
|
160
|
+
- Mixture-of-Experts models are sized by their **total** parameter count (all experts stay resident under Ollama/Metal/vLLM), so a large MoE can no longer falsely "fit" small hardware.
|
|
161
|
+
- Carries the 3.6.1 batch: unified scoring across `check`/`recommend`/`smart-recommend` (#88), high-end/multi-GPU VRAM detection (#95), MCP server hardening (#97), and the Windows interactive-panel fixes (#86).
|
|
162
|
+
|
|
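The MoE bullet in the v3.7.0 highlights can be illustrated with a small sketch. Everything here is an assumption for illustration: the bytes-per-parameter constant, the function names, and the model object are hypothetical and are not llm-checker's calibrated estimator. The Mixtral 8x7B figures (about 46.7B total, about 12.9B active per token) are approximate published numbers.

```javascript
// Hypothetical sketch: NOT llm-checker's actual memory estimator.
// BYTES_PER_PARAM_Q4 is an illustrative assumption for a ~4-bit quantization.
const BYTES_PER_PARAM_Q4 = 0.6;

// Rough weight footprint in GB for a parameter count given in billions.
function estimateWeightsGB(paramsB, bytesPerParam = BYTES_PER_PARAM_Q4) {
    return paramsB * bytesPerParam;
}

// A MoE model is sized by its TOTAL parameters, because all experts stay resident.
function fitsInMemory(model, availableGB) {
    const sizingParamsB = model.isMoE ? model.totalParamsB : model.paramsB;
    return estimateWeightsGB(sizingParamsB) <= availableGB;
}

// Approximate figures for a Mixtral-8x7B-class model.
const mixtral = { isMoE: true, totalParamsB: 46.7, activeParamsB: 12.9 };

// Sizing by active parameters would suggest the model fits a 16 GB machine...
console.log(estimateWeightsGB(mixtral.activeParamsB) <= 16); // true
// ...but sizing by total parameters shows it does not.
console.log(fitsInMemory(mixtral, 16)); // false
```

Under these assumed constants the active-parameter estimate is roughly 7.7 GB while the total-parameter estimate is roughly 28 GB, which is the false "fit" the changelog entry describes.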
154
163
|
### v3.5.13 Highlights
|
|
155
164
|
|
|
156
165
|
- Ships npm packages with a ready-to-use SQLite model catalog:
|
|
@@ -389,6 +398,27 @@ llm-checker search "qwen coder" --json
|
|
|
389
398
|
| `search <query>` | Search the synced catalog with filters and intelligent scoring |
|
|
390
399
|
| `smart-recommend` | Advanced recommendations using the full scoring engine |
|
|
391
400
|
|
|
401
|
+
### Model Registry Commands (v3.7.0+)
|
|
402
|
+
|
|
403
|
+
Exact installable/downloadable artifacts from a packaged multi-source registry (Hugging Face + Ollama + GPT4All).
|
|
404
|
+
|
|
405
|
+
| Command | Description |
|
|
406
|
+
|---------|-------------|
|
|
407
|
+
| `registry-sync` | Sync the multi-source registry (Hugging Face, Ollama, GPT4All) |
|
|
408
|
+
| `registry-search [query]` | Search exact artifacts with `--source`, `--format`, `--runtime`, `--quant`, `--max-size`, `--min-params`/`--max-params` filters |
|
|
409
|
+
| `registry-recommend [query]` | Recommend the best exact artifacts for your hardware, with `--runtime auto/ollama/vllm/mlx/llama.cpp/transformers` targeting and `--category`/`--optimize` |
|
|
410
|
+
|
|
411
|
+
```bash
|
|
412
|
+
# Best coding artifacts across all sources, auto runtime
|
|
413
|
+
llm-checker registry-recommend --category coding
|
|
414
|
+
|
|
415
|
+
# Only Apple-native MLX artifacts
|
|
416
|
+
llm-checker registry-recommend --category coding --runtime mlx
|
|
417
|
+
|
|
418
|
+
# Search Hugging Face for vLLM-ready Qwen models under 24B parameters
|
|
419
|
+
llm-checker registry-search qwen --source huggingface --runtime vllm --max-params 24
|
|
420
|
+
```
|
|
421
|
+
|
|
392
422
|
### Enterprise Policy Commands
|
|
393
423
|
|
|
394
424
|
| Command | Description |
|
|
@@ -641,30 +671,36 @@ llm-checker search qwen --quant Q4_K_M --max-size 8
|
|
|
641
671
|
|
|
642
672
|
## Model Catalog
|
|
643
673
|
|
|
644
|
-
LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
|
|
674
|
+
LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog plus a multi-source registry of exact downloadable/installable model artifacts. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
|
|
645
675
|
|
|
646
676
|
The packaged snapshot currently includes:
|
|
647
677
|
|
|
648
678
|
- 229 Ollama models
|
|
649
679
|
- 7176 variants
|
|
680
|
+
- 3259 multi-source registry repositories
|
|
681
|
+
- 33729 exact model artifacts from Hugging Face, Ollama, and GPT4All
|
|
682
|
+
- Hugging Face top 3000 repositories by downloads, fetched with API pagination
|
|
650
683
|
- pull counts
|
|
651
684
|
- tag counts
|
|
652
685
|
- last-updated metadata
|
|
653
|
-
- variant params, quantization, size, context,
|
|
686
|
+
- variant params, quantization, size, context, runtime, install commands, download URLs, license/gated flags, tasks, and modalities when available
|
|
654
687
|
|
|
655
688
|
Refresh it any time:
|
|
656
689
|
|
|
657
690
|
```bash
|
|
658
691
|
llm-checker sync
|
|
692
|
+
llm-checker registry-sync --sources ollama,huggingface,gpt4all
|
|
693
|
+
llm-checker registry-search qwen --runtime auto --max-size 8
|
|
694
|
+
llm-checker registry-recommend --category coding --runtime auto --max-size 8
|
|
659
695
|
```
|
|
660
696
|
|
|
661
|
-
For release maintainers, the packaged seed can be regenerated from the synced local DB:
|
|
697
|
+
For release maintainers, the packaged seed can be regenerated from the synced local DB and registry APIs:
|
|
662
698
|
|
|
663
699
|
```bash
|
|
664
700
|
npm run sync:seed
|
|
665
701
|
```
|
|
666
702
|
|
|
667
|
-
`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
|
|
703
|
+
`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. `registry-search` queries exact artifacts across sources, and `registry-recommend` ranks exact artifacts from the registry with the deterministic hardware-aware selector. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
|
|
668
704
|
|
|
669
705
|
The curated fallback catalog includes 35+ models from the most popular Ollama families:
|
|
670
706
|
|
|
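The fallback order described in the README hunk above (synced SQLite catalog, then scraped cache, then curated catalog) could be sketched as follows. The loader names and the rule that an empty result also triggers the fallback are assumptions made for illustration; the diff does not show how llm-checker implements this.

```javascript
// Hypothetical sketch of the documented fallback order. The loader
// functions are placeholders, not llm-checker's real API.
function loadCatalog(loaders) {
    for (const { name, load } of loaders) {
        try {
            const models = load();
            // Assumption: an empty catalog counts as "unavailable".
            if (Array.isArray(models) && models.length > 0) {
                return { source: name, models };
            }
        } catch (error) {
            // Source unavailable: fall through to the next one.
        }
    }
    return { source: 'none', models: [] };
}

const result = loadCatalog([
    { name: 'sqlite', load: () => { throw new Error('models.db missing'); } },
    { name: 'scraped-cache', load: () => [] },
    { name: 'curated', load: () => [{ name: 'example-model' }] }
]);

console.log(result.source); // prints "curated"
```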
@@ -836,7 +872,7 @@ LLM Checker uses a deterministic pipeline so the same inputs produce the same ra
|
|
|
836
872
|
flowchart LR
|
|
837
873
|
subgraph Inputs
|
|
838
874
|
HW["Hardware detector<br/>CPU/GPU/RAM/backend"]
|
|
839
|
-
REG["Synced SQLite
|
|
875
|
+
REG["Synced SQLite model catalog<br/>(Ollama seed + multi-source registry)"]
|
|
840
876
|
LOCAL["Installed local models"]
|
|
841
877
|
FLAGS["CLI options<br/>use-case/runtime/limits/policy"]
|
|
842
878
|
end
|
|
@@ -952,8 +988,9 @@ src/
|
|
|
952
988
|
detector.js # Hardware detection
|
|
953
989
|
unified-detector.js # Cross-platform detection
|
|
954
990
|
data/
|
|
955
|
-
model-database.js # SQLite storage and packaged seed loading
|
|
956
|
-
|
|
991
|
+
model-database.js # SQLite storage, registry tables, and packaged seed loading
|
|
992
|
+
registry-ingestors.js # Ollama/Hugging Face/GPT4All artifact normalization
|
|
993
|
+
seed/models.db # npm-packaged Ollama + multi-source registry snapshot
|
|
957
994
|
sync-manager.js # Database sync from Ollama registry
|
|
958
995
|
bin/
|
|
959
996
|
enhanced_cli.js # CLI entry point
|
package/bin/enhanced_cli.js
CHANGED
|
@@ -61,6 +61,9 @@ const COMMAND_HEADER_LABELS = {
|
|
|
61
61
|
'smart-recommend': 'Smart Recommend (Experimental)',
|
|
62
62
|
search: 'Model Search',
|
|
63
63
|
sync: 'Database Sync',
|
|
64
|
+
'registry-sync': 'Model Registry Sync',
|
|
65
|
+
'registry-search': 'Model Registry Search',
|
|
66
|
+
'registry-recommend': 'Registry Recommendations',
|
|
64
67
|
'mcp-setup': 'Claude MCP Setup',
|
|
65
68
|
check: 'Compatibility Check',
|
|
66
69
|
installed: 'Installed Models',
|
|
@@ -401,6 +404,65 @@ function getRealSizeFromOllamaCache(model) {
|
|
|
401
404
|
}
|
|
402
405
|
}
|
|
403
406
|
|
|
407
|
+
function parsePositiveNumberOption(value, fallback = null) {
|
|
408
|
+
if (value === undefined || value === null || value === '') return fallback;
|
|
409
|
+
const parsed = Number(value);
|
|
410
|
+
return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
|
|
411
|
+
}
|
|
412
|
+
|
|
413
|
+
// Allowed enum values for the registry commands. Invalid values must be rejected
|
|
414
|
+
// with a clear error instead of silently returning "no results" or falling back
|
|
415
|
+
// to the built-in catalog.
|
|
416
|
+
const REGISTRY_SOURCES = ['ollama', 'huggingface', 'gpt4all'];
|
|
417
|
+
const REGISTRY_FORMATS = ['gguf', 'safetensors', 'mlx', 'ollama', 'pytorch', 'pytorch_bin', 'ggml'];
|
|
418
|
+
const REGISTRY_RUNTIMES = ['auto', 'all', '*', 'ollama', 'llama.cpp', 'transformers', 'vllm', 'mlx'];
|
|
419
|
+
const REGISTRY_OPTIMIZE = ['balanced', 'speed', 'quality', 'context', 'coding'];
|
|
420
|
+
|
|
421
|
+
function assertRegistryEnum(label, value, allowed) {
|
|
422
|
+
if (value === undefined || value === null || value === '') return;
|
|
423
|
+
if (!allowed.includes(String(value).toLowerCase())) {
|
|
424
|
+
const shown = allowed.filter((v) => !['all', '*'].includes(v)).join(', ');
|
|
425
|
+
throw new Error(`Invalid --${label} "${value}". Allowed: ${shown}`);
|
|
426
|
+
}
|
|
427
|
+
}
|
|
428
|
+
|
|
429
|
+
// Throws on the first invalid registry enum option. Returns nothing on success.
|
|
430
|
+
function validateRegistryFilters(options = {}) {
|
|
431
|
+
assertRegistryEnum('source', options.source, REGISTRY_SOURCES);
|
|
432
|
+
assertRegistryEnum('format', options.format, REGISTRY_FORMATS);
|
|
433
|
+
assertRegistryEnum('runtime', options.runtime, REGISTRY_RUNTIMES);
|
|
434
|
+
assertRegistryEnum('optimize', options.optimize, REGISTRY_OPTIMIZE);
|
|
435
|
+
}
|
|
436
|
+
|
|
437
|
+
function truncateMiddle(value, maxLength = 48) {
|
|
438
|
+
const text = String(value || '');
|
|
439
|
+
if (text.length <= maxLength) return text;
|
|
440
|
+
if (maxLength <= 4) return text.slice(0, maxLength);
|
|
441
|
+
const head = Math.ceil((maxLength - 3) / 2);
|
|
442
|
+
const tail = Math.floor((maxLength - 3) / 2);
|
|
443
|
+
return `${text.slice(0, head)}...${text.slice(text.length - tail)}`;
|
|
444
|
+
}
|
|
445
|
+
|
|
446
|
+
function formatRegistryNumber(value, suffix = '') {
|
|
447
|
+
const parsed = Number(value);
|
|
448
|
+
if (!Number.isFinite(parsed) || parsed <= 0) return '?';
|
|
449
|
+
const rounded = parsed >= 100 ? Math.round(parsed) : Math.round(parsed * 10) / 10;
|
|
450
|
+
return `${rounded}${suffix}`;
|
|
451
|
+
}
|
|
452
|
+
|
|
453
|
+
function formatRegistrySize(value) {
|
|
454
|
+
const parsed = Number(value);
|
|
455
|
+
if (!Number.isFinite(parsed) || parsed <= 0) return '?';
|
|
456
|
+
return `${Math.round(parsed * 100) / 100}GB`;
|
|
457
|
+
}
|
|
458
|
+
|
|
459
|
+
function formatRegistryList(value, maxItems = 3) {
|
|
460
|
+
const items = Array.isArray(value) ? value : [];
|
|
461
|
+
if (items.length === 0) return '-';
|
|
462
|
+
const shown = items.slice(0, maxItems).join(', ');
|
|
463
|
+
return items.length > maxItems ? `${shown}, +${items.length - maxItems}` : shown;
|
|
464
|
+
}
|
|
465
|
+
|
|
404
466
|
const program = new Command();
|
|
405
467
|
|
|
406
468
|
program
|
|
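The option-parsing and formatting helpers added in this hunk are easiest to read next to a few sample calls. The function bodies below are copied from the hunk so the block runs on its own; the sample inputs are made up for illustration.

```javascript
// Copied from the hunk above so this example is self-contained.
function parsePositiveNumberOption(value, fallback = null) {
    if (value === undefined || value === null || value === '') return fallback;
    const parsed = Number(value);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function truncateMiddle(value, maxLength = 48) {
    const text = String(value || '');
    if (text.length <= maxLength) return text;
    if (maxLength <= 4) return text.slice(0, maxLength);
    const head = Math.ceil((maxLength - 3) / 2);
    const tail = Math.floor((maxLength - 3) / 2);
    return `${text.slice(0, head)}...${text.slice(text.length - tail)}`;
}

function formatRegistryNumber(value, suffix = '') {
    const parsed = Number(value);
    if (!Number.isFinite(parsed) || parsed <= 0) return '?';
    const rounded = parsed >= 100 ? Math.round(parsed) : Math.round(parsed * 10) / 10;
    return `${rounded}${suffix}`;
}

function formatRegistryList(value, maxItems = 3) {
    const items = Array.isArray(value) ? value : [];
    if (items.length === 0) return '-';
    const shown = items.slice(0, maxItems).join(', ');
    return items.length > maxItems ? `${shown}, +${items.length - maxItems}` : shown;
}

// Sample calls (inputs are illustrative).
console.log(parsePositiveNumberOption('24'));       // 24
console.log(parsePositiveNumberOption('abc', 20));  // 20 (fallback)
console.log(parsePositiveNumberOption('8GB'));      // null: units are not parsed
console.log(truncateMiddle('abcdefghijklmnopqrstuvwxyz', 10)); // abcd...xyz
console.log(formatRegistryNumber(123.4, 'B'));      // 123B
console.log(formatRegistryNumber(0, 'B'));          // ?
console.log(formatRegistryList(['ollama', 'llama.cpp', 'vllm'], 2)); // ollama, llama.cpp, +1
```

One consequence worth noting: because `parsePositiveNumberOption` relies on `Number()`, a value with a unit suffix such as `8GB` resolves to the fallback rather than to 8, so registry options like `--max-size` appear to expect a bare number.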
@@ -1285,12 +1347,17 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
|
|
|
1285
1347
|
const { summary, recommendations } = intelligentData;
|
|
1286
1348
|
const tier = summary.hardware_tier.replace('_', ' ').toUpperCase();
|
|
1287
1349
|
const optimizeProfile = (summary.optimize_for || intelligentData.optimizeFor || 'balanced').toUpperCase();
|
|
1350
|
+
const runtimeLabel = (intelligentData.runtime || summary.best_overall?.runtime || 'auto').toUpperCase();
|
|
1351
|
+
const sourceLabel = intelligentData.recommendationSource === 'registry'
|
|
1352
|
+
? 'Multi-source registry'
|
|
1353
|
+
: 'Ollama catalog';
|
|
1288
1354
|
const tierColor = tier.includes('HIGH') ? chalk.green : tier.includes('MEDIUM') ? chalk.yellow : chalk.red;
|
|
1289
1355
|
|
|
1290
1356
|
console.log('\n' + chalk.bgRed.white.bold(' INTELLIGENT RECOMMENDATIONS BY CATEGORY '));
|
|
1291
1357
|
console.log(chalk.red('╭' + '─'.repeat(65)));
|
|
1292
1358
|
console.log(chalk.red('│') + ` Hardware Tier: ${tierColor.bold(tier)} | Models Analyzed: ${chalk.cyan.bold(intelligentData.totalModelsAnalyzed)}`);
|
|
1293
|
-
console.log(chalk.red('│') + ` Optimization: ${chalk.magenta.bold(optimizeProfile)}`);
|
|
1359
|
+
console.log(chalk.red('│') + ` Optimization: ${chalk.magenta.bold(optimizeProfile)} | Runtime: ${chalk.cyan.bold(runtimeLabel)}`);
|
|
1360
|
+
console.log(chalk.red('│') + ` Source: ${chalk.white.bold(sourceLabel)}`);
|
|
1294
1361
|
console.log(chalk.red('│'));
|
|
1295
1362
|
|
|
1296
1363
|
// Show the best overall model
|
|
@@ -1301,6 +1368,7 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
|
|
|
1301
1368
|
console.log(chalk.red('│') + ` Command: ${chalk.cyan.bold(best.command)}`);
|
|
1302
1369
|
console.log(chalk.red('│') + ` Score: ${chalk.yellow.bold(best.score)}/100 | Category: ${chalk.magenta(best.category)}`);
|
|
1303
1370
|
console.log(chalk.red('│') + ` Quantization: ${chalk.white.bold(best.quantization || 'Q4_K_M')}`);
|
|
1371
|
+
console.log(chalk.red('│') + ` Runtime: ${chalk.cyan.bold(best.runtime || intelligentData.runtime || 'ollama')} | Source: ${chalk.gray(best.source || 'unknown')}`);
|
|
1304
1372
|
console.log(chalk.red('│') + ` Fine-tuning: ${chalk.blue.bold(bestFineTuning.shortLabel)}`);
|
|
1305
1373
|
console.log(chalk.red('│'));
|
|
1306
1374
|
}
|
|
@@ -1326,6 +1394,7 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
|
|
|
1326
1394
|
console.log(chalk.red('│') + ` ${chalk.green(model.name)} (${model.size})`);
|
|
1327
1395
|
console.log(chalk.red('│') + ` Score: ${scoreColor.bold(model.score)}/100 | Pulls: ${chalk.gray(model.pulls?.toLocaleString() || 'N/A')}`);
|
|
1328
1396
|
console.log(chalk.red('│') + ` Quantization: ${chalk.white.bold(model.quantization || 'Q4_K_M')}`);
|
|
1397
|
+
console.log(chalk.red('│') + ` Runtime: ${chalk.cyan(model.runtime || intelligentData.runtime || 'ollama')} | Source: ${chalk.gray(model.source || 'unknown')}`);
|
|
1329
1398
|
console.log(chalk.red('│') + ` Fine-tuning: ${chalk.blue.bold(fineTuningSupport.shortLabel)}`);
|
|
1330
1399
|
console.log(chalk.red('│') + ` Command: ${chalk.cyan.bold(model.command)}`);
|
|
1331
1400
|
console.log(chalk.red('│'));
|
|
@@ -3017,7 +3086,7 @@ auditCommand
|
|
|
3017
3086
|
.option('-u, --use-case <case>', 'Use case when --command check is selected', 'general')
|
|
3018
3087
|
.option('-c, --category <category>', 'Category hint when --command recommend is selected')
|
|
3019
3088
|
.option('--optimize <profile>', 'Optimization profile for recommend mode (balanced|speed|quality|context|coding)', 'balanced')
|
|
3020
|
-
.option('--runtime <runtime>',
|
|
3089
|
+
.option('--runtime <runtime>', 'Runtime for check/recommend mode (auto|ollama|vllm|mlx|llama.cpp|transformers)', 'auto')
|
|
3021
3090
|
.option('--include-cloud', 'Include cloud models in check-mode analysis')
|
|
3022
3091
|
.option('--max-size <size>', 'Maximum model size for check mode (e.g., "24B" or "12GB")')
|
|
3023
3092
|
.option('--min-size <size>', 'Minimum model size for check mode (e.g., "3B" or "2GB")')
|
|
@@ -3071,13 +3140,14 @@ auditCommand
|
|
|
3071
3140
|
policyCandidates = collectCandidatesFromAnalysis(analysisResult);
|
|
3072
3141
|
} else {
|
|
3073
3142
|
recommendationResult = await checker.generateIntelligentRecommendations(hardware, {
|
|
3074
|
-
optimizeFor: options.optimize
|
|
3143
|
+
optimizeFor: options.optimize,
|
|
3144
|
+
runtime: options.runtime
|
|
3075
3145
|
});
|
|
3076
3146
|
if (!recommendationResult) {
|
|
3077
3147
|
throw new Error('Unable to generate recommendation data for policy audit export.');
|
|
3078
3148
|
}
|
|
3079
3149
|
|
|
3080
|
-
runtimeBackend =
|
|
3150
|
+
runtimeBackend = recommendationResult.runtime || options.runtime || 'auto';
|
|
3081
3151
|
policyCandidates = collectCandidatesFromRecommendationData(recommendationResult);
|
|
3082
3152
|
}
|
|
3083
3153
|
|
|
@@ -3844,6 +3914,8 @@ program
|
|
|
3844
3914
|
.description('Get intelligent model recommendations for your hardware')
|
|
3845
3915
|
.option('-c, --category <category>', 'Get recommendations for specific category (coding, talking, reading, etc.)')
|
|
3846
3916
|
.option('--optimize <profile>', 'Optimization profile (balanced|speed|quality|context|coding)', 'balanced')
|
|
3917
|
+
.option('--runtime <runtime>', 'Runtime target for registry recommendations (auto|ollama|vllm|mlx|llama.cpp|transformers)', 'auto')
|
|
3918
|
+
.option('--no-registry', 'Use the legacy Ollama catalog recommendation path')
|
|
3847
3919
|
.option('--no-verbose', 'Disable step-by-step progress display')
|
|
3848
3920
|
.option('--policy <file>', 'Evaluate recommendations against a policy file')
|
|
3849
3921
|
.option('--simulate <profile>', 'Simulate a hardware profile instead of detecting real hardware (use "list" to see profiles)')
|
|
@@ -3868,6 +3940,11 @@ Hardware simulation:
|
|
|
3868
3940
|
$ llm-checker recommend --simulate m4pro24 --category coding
|
|
3869
3941
|
$ llm-checker recommend --gpu "RTX 5060" --ram 32 --cpu "AMD Ryzen 7 5700X"
|
|
3870
3942
|
|
|
3943
|
+
Registry/runtime examples:
|
|
3944
|
+
$ llm-checker recommend --runtime auto --category coding
|
|
3945
|
+
$ llm-checker recommend --runtime vllm --category coding
|
|
3946
|
+
$ llm-checker recommend --runtime mlx --category general
|
|
3947
|
+
|
|
3871
3948
|
Calibrated routing examples:
|
|
3872
3949
|
$ llm-checker recommend --calibrated --category coding
|
|
3873
3950
|
$ llm-checker recommend --calibrated ./calibration-policy.yaml --category reasoning
|
|
@@ -3945,7 +4022,9 @@ Calibrated routing examples:
|
|
|
3945
4022
|
|
|
3946
4023
|
const hardware = await checker.getSystemInfo();
|
|
3947
4024
|
const intelligentRecommendations = await checker.generateIntelligentRecommendations(hardware, {
|
|
3948
|
-
optimizeFor: options.optimize
|
|
4025
|
+
optimizeFor: options.optimize,
|
|
4026
|
+
runtime: options.runtime,
|
|
4027
|
+
registry: options.registry
|
|
3949
4028
|
});
|
|
3950
4029
|
|
|
3951
4030
|
if (!intelligentRecommendations) {
|
|
@@ -4729,6 +4808,329 @@ program
|
|
|
4729
4808
|
}
|
|
4730
4809
|
});
|
|
4731
4810
|
|
|
4811
|
+
program
|
|
4812
|
+
.command('registry-sync')
|
|
4813
|
+
.description('Sync the multi-source model registry (Ollama, Hugging Face, GPT4All)')
|
|
4814
|
+
.option('-s, --sources <list>', 'Comma-separated sources: ollama,huggingface,gpt4all', 'ollama,huggingface,gpt4all')
|
|
4815
|
+
.option('-l, --limit <n>', 'Fallback maximum records per source')
|
|
4816
|
+
.option('--hf-limit <n>', 'Maximum Hugging Face repos to ingest', '3000')
|
|
4817
|
+
.option('--ollama-limit <n>', 'Maximum Ollama artifacts to ingest', '10000')
|
|
4818
|
+
.option('--gpt4all-limit <n>', 'Maximum GPT4All entries to ingest', '1000')
|
|
4819
|
+
.option('--query <text>', 'Hugging Face search query')
|
|
4820
|
+
.option('--task <task>', 'Hugging Face task/filter, for example text-generation or text-embeddings-inference')
|
|
4821
|
+
.option('--dry-run', 'Fetch and normalize without writing to the database')
|
|
4822
|
+
.option('-q, --quiet', 'Suppress progress output')
|
|
4823
|
+
.option('-j, --json', 'Output as JSON')
|
|
4824
|
+
.action(async (options) => {
|
|
4825
|
+
const quiet = Boolean(options.quiet || options.json);
|
|
4826
|
+
if (!quiet) showAsciiArt('registry-sync');
|
|
4827
|
+
|
|
4828
|
+
const ModelDatabase = require('../src/data/model-database');
|
|
4829
|
+
const { RegistryIngestor } = require('../src/data/registry-ingestors');
|
|
4830
|
+
const database = new ModelDatabase();
|
|
4831
|
+
const spinner = quiet ? null : ora('Preparing model registry sync...').start();
|
|
4832
|
+
|
|
4833
|
+
try {
|
|
4834
|
+
await database.initialize();
|
|
4835
|
+
|
|
4836
|
+
const ingestor = new RegistryIngestor({
|
|
4837
|
+
database,
|
|
4838
|
+
onProgress: (info) => {
|
|
4839
|
+
if (spinner && info.message) {
|
|
4840
|
+
spinner.text = info.message;
|
|
4841
|
+
}
|
|
4842
|
+
}
|
|
4843
|
+
});
|
|
4844
|
+
|
|
4845
|
+
const summary = await ingestor.ingest({
|
|
4846
|
+
sources: options.sources,
|
|
4847
|
+
limit: parsePositiveNumberOption(options.limit),
|
|
4848
|
+
hfLimit: parsePositiveNumberOption(options.hfLimit, 3000),
|
|
4849
|
+
ollamaLimit: parsePositiveNumberOption(options.ollamaLimit, 10000),
|
|
4850
|
+
gpt4allLimit: parsePositiveNumberOption(options.gpt4allLimit, 1000),
|
|
4851
|
+
query: options.query,
|
|
4852
|
+
task: options.task,
|
|
4853
|
+
dryRun: Boolean(options.dryRun)
|
|
4854
|
+
});
|
|
4855
|
+
const stats = options.dryRun ? null : database.getRegistryStats();
|
|
4856
|
+
|
|
4857
|
+
if (options.json) {
|
|
4858
|
+
console.log(JSON.stringify({ summary, stats }, null, 2));
|
|
4859
|
+
return;
|
|
4860
|
+
}
|
|
4861
|
+
|
|
4862
|
+
if (spinner) {
|
|
4863
|
+
const action = options.dryRun ? 'normalized' : 'synced';
|
|
4864
|
+
spinner.succeed(`Registry ${action}: ${summary.repos} repos, ${summary.artifacts} artifacts`);
|
|
4865
|
+
}
|
|
4866
|
+
|
|
4867
|
+
console.log(chalk.green('\n[OK] Registry sync complete'));
|
|
4868
|
+
console.log(chalk.gray(` Sources touched: ${summary.sources}`));
|
|
4869
|
+
console.log(chalk.gray(` Collections: ${summary.collections}`));
|
|
4870
|
+
console.log(chalk.gray(` Repositories: ${summary.repos}`));
|
|
4871
|
+
console.log(chalk.gray(` Artifacts: ${summary.artifacts}`));
|
|
4872
|
+
|
|
4873
|
+
if (stats) {
|
|
4874
|
+
console.log(chalk.blue.bold('\nRegistry totals:'));
|
|
4875
|
+
console.log(chalk.gray(` Sources: ${stats.sources}`));
|
|
4876
|
+
console.log(chalk.gray(` Repositories: ${stats.repos}`));
|
|
4877
|
+
console.log(chalk.gray(` Artifacts: ${stats.artifacts}`));
|
|
4878
|
+
|
|
4879
|
+
if (stats.bySource.length > 0) {
|
|
4880
|
+
const rows = [['Source', 'Artifacts']];
|
|
4881
|
+
for (const item of stats.bySource) {
|
|
4882
|
+
rows.push([item.source_id, String(item.artifact_count)]);
|
|
4883
|
+
}
|
|
4884
|
+
console.log('\n' + table(rows));
|
|
4885
|
+
}
|
|
4886
|
+
}
|
|
4887
|
+
|
|
4888
|
+
console.log(chalk.cyan('Try: llm-checker registry-search llama --runtime auto --limit 10'));
|
|
4889
|
+
} catch (error) {
|
|
4890
|
+
if (spinner) spinner.fail('Registry sync failed');
|
|
4891
|
+
console.error(chalk.red('Error:'), error.message);
|
|
4892
|
+
if (process.env.DEBUG) console.error(error.stack);
|
|
4893
|
+
process.exitCode = 1;
|
|
4894
|
+
} finally {
|
|
4895
|
+
database.close();
|
|
4896
|
+
}
|
|
4897
|
+
});
|
|
4898
|
+
|
|
4899
|
+
program
|
|
4900
|
+
.command('registry-search [query]')
|
|
4901
|
+
.description('Search exact downloadable/installable artifacts in the multi-source model registry')
|
|
4902
|
+
.option('-s, --source <source>', 'Filter by source: ollama, huggingface, gpt4all')
|
|
4903
|
+
.option('--format <format>', 'Filter by artifact format: gguf, safetensors, mlx, ollama')
|
|
4904
|
+
.option('--runtime <runtime>', 'Filter by runtime support: auto, ollama, llama.cpp, transformers, vllm, mlx')
|
|
4905
|
+
.option('--quant <type>', 'Filter by quantization, for example Q4_K_M or Q8_0')
|
|
4906
|
+
.option('--max-size <gb>', 'Maximum artifact size in GB')
|
|
4907
|
+
.option('--min-params <billion>', 'Minimum parameter count in billions')
|
|
4908
|
+
.option('--max-params <billion>', 'Maximum parameter count in billions')
|
|
4909
|
+
.option('--local-only', 'Exclude gated/auth-required artifacts')
|
|
4910
|
+
.option('-l, --limit <n>', 'Maximum number of results', '20')
|
|
4911
|
+
.option('-j, --json', 'Output as JSON')
|
|
4912
|
+
.action(async (query = '', options) => {
|
|
4913
|
+
try {
|
|
4914
|
+
validateRegistryFilters(options);
|
|
4915
|
+
} catch (validationError) {
|
|
4916
|
+
if (options.json) {
|
|
4917
|
+
console.log(JSON.stringify({ error: validationError.message }, null, 2));
|
|
4918
|
+
} else {
|
|
4919
|
+
console.error(chalk.red(`✗ ${validationError.message}`));
|
|
4920
|
+
}
|
|
4921
|
+
process.exitCode = 1;
|
|
4922
|
+
return;
|
|
4923
|
+
}
|
|
4924
|
+
if (!options.json) showAsciiArt('registry-search');
|
|
4925
|
+
|
|
4926
|
+
const ModelDatabase = require('../src/data/model-database');
|
|
4927
|
+
const database = new ModelDatabase();
|
|
4928
|
+
|
|
4929
|
+
try {
|
|
4930
|
+
await database.initialize();
|
|
4931
|
+
|
|
4932
|
+
const filters = {
|
|
4933
|
+
source: options.source,
|
|
4934
|
+
format: options.format ? String(options.format).toLowerCase() : undefined,
|
|
4935
|
+
runtime: options.runtime,
|
|
4936
|
+
quantization: options.quant,
|
|
4937
|
+
maxSizeGB: parsePositiveNumberOption(options.maxSize),
|
|
4938
|
+
minParamsB: parsePositiveNumberOption(options.minParams),
|
|
4939
|
+
maxParamsB: parsePositiveNumberOption(options.maxParams),
|
|
4940
|
+
localOnly: Boolean(options.localOnly),
|
|
4941
|
+
limit: parsePositiveNumberOption(options.limit, 20)
|
|
4942
|
+
};
|
|
4943
|
+
const results = database.searchModelArtifacts(query, filters);
|
|
4944
|
+
const stats = database.getRegistryStats();
|
|
4945
|
+
|
|
4946
|
+
if (options.json) {
|
|
4947
|
+
console.log(JSON.stringify({
|
|
4948
|
+
query,
|
|
4949
|
+
filters,
|
|
4950
|
+
count: results.length,
|
|
4951
|
+
stats,
|
|
4952
|
+
results
|
|
4953
|
+
}, null, 2));
|
|
4954
|
+
return;
|
|
4955
|
+
}
|
|
4956
|
+
|
|
4957
|
+
if (results.length === 0) {
|
|
4958
|
+
console.log(chalk.yellow('No registry artifacts found.'));
|
|
4959
|
+
if (stats.artifacts === 0) {
|
|
4960
|
+
console.log(chalk.gray('Populate the registry first with: llm-checker registry-sync'));
|
|
4961
|
+
}
|
|
4962
|
+
return;
|
|
4963
|
+
}
|
|
4964
|
+
|
|
4965
|
+
console.log(chalk.blue.bold('\nRegistry Results'));
|
|
4966
|
+
console.log(chalk.gray(`Stored registry: ${stats.artifacts} artifacts across ${stats.sources} sources`));
|
|
4967
|
+
console.log('');
|
|
4968
|
+
|
|
4969
|
+
const rows = [[
|
|
4970
|
+
'Source',
|
|
4971
|
+
'Model',
|
|
4972
|
+
'Artifact',
|
|
4973
|
+
'Params',
|
|
4974
|
+
'Size',
|
|
4975
|
+
'Format',
|
|
4976
|
+
'Runtime',
|
|
4977
|
+
'Install'
|
|
4978
|
+
]];
|
|
4979
|
+
|
|
4980
|
+
for (const item of results) {
|
|
4981
|
+
rows.push([
|
|
4982
|
+
item.source_id,
|
|
4983
|
+
truncateMiddle(item.canonical_model_id, 34),
|
|
4984
|
+
truncateMiddle(item.artifact_name || item.filename, 34),
|
|
4985
|
+
formatRegistryNumber(item.parameter_count_b, 'B'),
|
|
4986
|
+
formatRegistrySize(item.size_gb),
|
|
4987
|
+
item.quantization ? `${item.format}/${item.quantization}` : item.format,
|
|
4988
|
+
formatRegistryList(item.runtime_support, 2),
|
|
4989
|
+
truncateMiddle(item.install_command || item.download_url, 46)
|
|
4990
|
+
]);
|
|
4991
|
+
}
|
|
4992
|
+
|
|
4993
|
+
console.log(table(rows));
|
|
4994
|
+
|
|
4995
|
+
const links = results
|
|
4996
|
+
.filter((item) => item.download_url)
|
|
4997
|
+
.slice(0, 5);
|
|
4998
|
+
if (links.length > 0) {
|
|
4999
|
+
console.log(chalk.blue.bold('Exact download links:'));
|
|
5000
|
+
links.forEach((item, index) => {
|
|
5001
|
+
console.log(chalk.gray(` ${index + 1}. ${item.canonical_model_id} -> ${item.download_url}`));
|
|
5002
|
+
});
|
|
5003
|
+
}
|
|
5004
|
+
} catch (error) {
|
|
5005
|
+
console.error(chalk.red('Error:'), error.message);
|
|
5006
|
+
if (process.env.DEBUG) console.error(error.stack);
|
|
5007
|
+
process.exitCode = 1;
|
|
5008
|
+
} finally {
|
|
5009
|
+
database.close();
|
|
5010
|
+
}
|
|
5011
|
+
});
|
|
5012
|
+
|
|
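The enum validation that `registry-search` runs before touching the database can be exercised in isolation. `REGISTRY_RUNTIMES` and `assertRegistryEnum` are copied from the earlier hunk; `tryValidate` is a hypothetical wrapper added only for this example, and `tensorrt` is an arbitrary invalid value.

```javascript
// Copied from the helper hunk so this example is self-contained.
const REGISTRY_RUNTIMES = ['auto', 'all', '*', 'ollama', 'llama.cpp', 'transformers', 'vllm', 'mlx'];

function assertRegistryEnum(label, value, allowed) {
    if (value === undefined || value === null || value === '') return;
    if (!allowed.includes(String(value).toLowerCase())) {
        const shown = allowed.filter((v) => !['all', '*'].includes(v)).join(', ');
        throw new Error(`Invalid --${label} "${value}". Allowed: ${shown}`);
    }
}

// Hypothetical wrapper: returns the error message, or null when the value is accepted.
function tryValidate(label, value, allowed) {
    try {
        assertRegistryEnum(label, value, allowed);
        return null;
    } catch (error) {
        return error.message;
    }
}

const okMessage = tryValidate('runtime', 'MLX', REGISTRY_RUNTIMES);
const badMessage = tryValidate('runtime', 'tensorrt', REGISTRY_RUNTIMES);

console.log(okMessage);  // null (comparison is case-insensitive)
console.log(badMessage); // Invalid --runtime "tensorrt". Allowed: auto, ollama, llama.cpp, transformers, vllm, mlx
```

The `all` and `*` aliases are accepted but deliberately left out of the error message. In the command itself, the failure is reported as `{ "error": ... }` under `--json` and as a red line on stderr otherwise, with `process.exitCode` set to 1 in both cases.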
5013
|
+
program
|
|
5014
|
+
.command('registry-recommend [query]')
|
|
5015
|
+
.description('Recommend the best exact model artifacts from the multi-source registry for this hardware')
|
|
5016
|
+
.option('-c, --category <category>', 'Task category (general, coding, reasoning, embeddings, multimodal)', 'general')
|
|
5017
|
+
.option('--optimize <profile>', 'Optimization profile (balanced|speed|quality|context|coding)', 'balanced')
|
|
5018
|
+
.option('--runtime <runtime>', 'Runtime target: auto, ollama, llama.cpp, vllm, mlx, transformers', 'auto')
|
|
5019
|
+
.option('-s, --source <source>', 'Filter by source: ollama, huggingface, gpt4all')
|
|
5020
|
+
.option('--format <format>', 'Filter by artifact format: gguf, safetensors, mlx, ollama')
|
|
5021
|
+
.option('--quant <type>', 'Filter by quantization, for example Q4_K_M or Q8_0')
|
|
5022
|
+
.option('--max-size <gb>', 'Maximum artifact size in GB')
|
|
5023
|
+
.option('--min-params <billion>', 'Minimum parameter count in billions')
|
|
5024
|
+
.option('--max-params <billion>', 'Maximum parameter count in billions')
|
|
5025
|
+
.option('--target-context <tokens>', 'Target context window for scoring')
|
|
5026
|
+
.option('--include-gated', 'Include gated/auth-required artifacts')
|
|
5027
|
+
.option('--pool-limit <n>', 'Maximum registry artifacts to score before ranking', '20000')
|
|
5028
|
+
.option('-l, --limit <n>', 'Maximum number of recommendations', '10')
|
|
5029
|
+
.option('-j, --json', 'Output as JSON')
|
|
5030
|
+
.action(async (query = '', options) => {
|
|
5031
|
+
try {
|
|
5032
|
+
validateRegistryFilters(options);
|
|
5033
|
+
} catch (validationError) {
|
|
5034
|
+
if (options.json) {
|
|
5035
|
+
console.log(JSON.stringify({ error: validationError.message }, null, 2));
|
|
5036
|
+
} else {
|
|
5037
|
+
console.error(chalk.red(`✗ ${validationError.message}`));
|
|
5038
|
+
}
|
|
5039
|
+
process.exitCode = 1;
|
|
5040
|
+
return;
|
|
5041
|
+
}
|
|
5042
|
+
if (!options.json) showAsciiArt('registry-recommend');
|
|
5043
|
+
|
|
5044
|
+
const UnifiedDetector = require('../src/hardware/unified-detector');
|
|
5045
|
+
const { RegistryRecommender } = require('../src/data/registry-recommender');
|
|
5046
|
+
const recommender = new RegistryRecommender();
|
|
5047
|
+
const spinner = options.json ? null : ora('Scoring registry artifacts...').start();
|
|
5048
|
+
|
|
5049
|
+
try {
|
|
5050
|
+
await recommender.initialize();
|
|
5051
|
+
|
|
5052
|
+
const detector = new UnifiedDetector();
|
|
5053
|
+
const hardware = await detector.detect();
|
|
5054
|
+
const category = normalizeTaskName(options.category || 'general');
|
|
5055
|
+
const result = await recommender.recommend({
|
|
5056
|
+
query,
|
|
5057
|
+
category,
|
|
5058
|
+
optimizeFor: options.optimize,
|
|
5059
|
+
runtime: options.runtime,
|
|
5060
|
+
source: options.source,
|
|
5061
|
+
format: options.format ? String(options.format).toLowerCase() : undefined,
|
|
5062
|
+
quantization: options.quant,
|
|
5063
|
+
maxSizeGB: parsePositiveNumberOption(options.maxSize),
|
|
5064
|
+
minParamsB: parsePositiveNumberOption(options.minParams),
|
|
5065
|
+
maxParamsB: parsePositiveNumberOption(options.maxParams),
|
|
5066
|
+
targetContext: parsePositiveNumberOption(options.targetContext),
|
|
5067
|
+
localOnly: !options.includeGated,
|
|
5068
|
+
poolLimit: parsePositiveNumberOption(options.poolLimit, 20000),
|
|
5069
|
+
limit: parsePositiveNumberOption(options.limit, 10),
|
|
5070
|
+
hardware
|
|
5071
|
+
});
|
|
5072
|
+
|
|
5073
|
+
if (options.json) {
|
|
5074
|
+
console.log(JSON.stringify({
|
|
5075
|
+
query,
|
|
5076
|
+
hardware: hardware.summary || hardware,
|
|
5077
|
+
...result
|
|
5078
|
+
}, null, 2));
|
|
5079
|
+
return;
|
|
5080
|
+
}
|
|
5081
|
+
|
|
5082
|
+
if (spinner) {
|
|
5083
|
+
spinner.succeed(
|
|
5084
|
+
`Scored ${result.total_evaluated} candidates from ${result.total_artifacts} registry artifacts`
|
|
5085
|
+
);
|
|
5086
|
+
}
|
|
5087
|
+
|
|
5088
|
+
if (result.recommendations.length === 0) {
|
|
5089
|
+
console.log(chalk.yellow('No registry recommendations found for those filters.'));
|
|
5090
|
+
if (result.registry.artifacts === 0) {
|
|
5091
|
+
console.log(chalk.gray('Populate the registry first with: llm-checker registry-sync'));
|
|
5092
|
+
}
|
|
5093
|
+
return;
|
|
5094
|
+
}
|
|
5095
|
+
|
|
5096
|
+
console.log(chalk.blue.bold('\nRegistry Recommendations'));
|
|
5097
|
+
console.log(chalk.gray(`Registry: ${result.registry.repos} repos, ${result.registry.artifacts} artifacts`));
|
|
5098
|
+
console.log(chalk.gray(`Runtime: ${result.runtime} | Category: ${result.category} | Optimize: ${result.optimizeFor}`));
|
|
5099
|
+
console.log('');
|
|
5100
|
+
|
|
5101
|
+
const rows = [['#', 'Score', 'Source', 'Model', 'Artifact', 'Params', 'Size', 'Install']];
|
|
5102
|
+
result.recommendations.forEach((item, index) => {
|
|
5103
|
+
rows.push([
|
|
5104
|
+
String(index + 1),
|
|
5105
|
+
String(item.score),
|
|
5106
|
+
item.source,
|
|
5107
|
+
truncateMiddle(item.model, 30),
|
|
5108
|
+
truncateMiddle(item.artifact, 32),
|
|
5109
|
+
formatRegistryNumber(item.params_b, 'B'),
|
|
5110
|
+
formatRegistrySize(item.size_gb),
|
|
5111
|
+
truncateMiddle(item.install_command || item.download_url, 44)
|
|
5112
|
+
]);
|
|
5113
|
+
});
|
|
5114
|
+
|
|
5115
|
+
console.log(table(rows));
|
|
5116
|
+
|
|
5117
|
+
console.log(chalk.blue.bold('Top pick:'));
|
|
5118
|
+
const best = result.recommendations[0];
|
|
5119
|
+
console.log(chalk.white.bold(` ${best.model}`));
|
|
5120
|
+
console.log(chalk.gray(` Artifact: ${best.artifact}`));
|
|
5121
|
+
console.log(chalk.gray(` Why: ${best.rationale}`));
|
|
5122
|
+
if (best.install_command) console.log(chalk.cyan(` ${best.install_command}`));
|
|
5123
|
+
if (best.download_url) console.log(chalk.gray(` ${best.download_url}`));
|
|
5124
|
+
} catch (error) {
|
|
5125
|
+
if (spinner) spinner.fail('Registry recommendation failed');
|
|
5126
|
+
console.error(chalk.red('Error:'), error.message);
|
|
5127
|
+
if (process.env.DEBUG) console.error(error.stack);
|
|
5128
|
+
process.exitCode = 1;
|
|
5129
|
+
} finally {
|
|
5130
|
+
recommender.close();
|
|
5131
|
+
}
|
|
5132
|
+
});
|
|
5133
|
+
|
|
4732
5134
|
program
|
|
4733
5135
|
.command('search <query>')
|
|
4734
5136
|
.description('Search models in the database with intelligent scoring')
|
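The `--json` branch of the handler added above prints one of two shapes: `{ error }` when filter validation fails (with exit code 1), or an object that spreads the recommender result next to `query` and `hardware`. A minimal sketch of a consumer for that output follows, assuming the payload has already been captured as a string. The `summarizeRegistryOutput` helper and all sample values are hypothetical; only the field names (`error`, `recommendations`, `total_evaluated`, `model`, `install_command`, `download_url`) are taken from the diff.

```javascript
// Hypothetical consumer for the --json output of the registry recommendation command.
// Field names mirror the handler in the diff; the sample values below are invented.
function summarizeRegistryOutput(jsonText) {
    const payload = JSON.parse(jsonText);

    // Validation failures are printed as { error: "<message>" }.
    if (payload.error) {
        return { ok: false, message: payload.error };
    }

    const recommendations = Array.isArray(payload.recommendations)
        ? payload.recommendations
        : [];
    const evaluated = payload.total_evaluated ?? 0;

    if (recommendations.length === 0) {
        return { ok: true, topPick: null, howToGet: null, evaluated };
    }

    // The CLI treats the first entry as the top pick.
    const best = recommendations[0];
    return {
        ok: true,
        topPick: best.model,
        // Same fallback the CLI table uses: install command first, then download URL.
        howToGet: best.install_command || best.download_url || null,
        evaluated
    };
}

// Invented sample payloads, for illustration only.
const successSample = JSON.stringify({
    query: 'coder',
    total_evaluated: 2,
    total_artifacts: 5,
    recommendations: [
        { model: 'example/model-a', score: 91, install_command: 'ollama pull example-a' },
        { model: 'example/model-b', score: 84, download_url: 'https://example.invalid/b.gguf' }
    ]
});
const errorSample = JSON.stringify({ error: 'invalid filter' });

console.log(summarizeRegistryOutput(successSample));
console.log(summarizeRegistryOutput(errorSample));
```

Checking for `error` before reading `recommendations` matters because the validation path returns before any registry fields exist.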
package/bin/mcp-server.mjs
CHANGED
|
@@ -290,9 +290,14 @@ const ALLOWED_CLI_COMMANDS = new Set([
|
|
|
290
290
|
"sync",
|
|
291
291
|
"search",
|
|
292
292
|
"smart-recommend",
|
|
293
|
+
"registry-sync",
|
|
294
|
+
"registry-search",
|
|
295
|
+
"registry-recommend",
|
|
293
296
|
"hw-detect",
|
|
294
297
|
]);
|
|
295
298
|
|
|
299
|
+
export { ALLOWED_CLI_COMMANDS };
|
|
300
|
+
|
|
296
301
|
// ============================================================================
|
|
297
302
|
// MCP SERVER
|
|
298
303
|
// ============================================================================
|
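The hunk above adds three registry commands to `ALLOWED_CLI_COMMANDS` and exports the set, which makes it importable from outside the module (for example, from a test). A minimal sketch of the kind of guard such an allow-list supports follows, assuming the first argument is the subcommand. The `isAllowedInvocation` helper is hypothetical, and the set below is a partial copy containing only the entries visible in this hunk; the real set has further entries above line 290.

```javascript
// Partial copy of the allow-list: only the entries visible in this hunk.
const ALLOWED_CLI_COMMANDS = new Set([
    "sync",
    "search",
    "smart-recommend",
    "registry-sync",
    "registry-search",
    "registry-recommend",
    "hw-detect",
]);

// Hypothetical guard: accept an argument array only if its first element is allow-listed.
function isAllowedInvocation(args) {
    if (!Array.isArray(args) || args.length === 0) return false;
    return ALLOWED_CLI_COMMANDS.has(String(args[0]));
}

console.log(isAllowedInvocation(["registry-recommend", "--json"])); // true
console.log(isAllowedInvocation(["rm", "-rf", "/"]));               // false
```

A `Set` gives exact-match membership, so a near miss such as `registry-recommend ` with a trailing space or a different case is rejected, not fuzzily accepted.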
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "llm-checker",
|
|
3
|
-
"version": "3.
|
|
3
|
+
"version": "3.7.4",
|
|
4
4
|
"description": "Intelligent CLI tool with AI-powered model selection that analyzes your hardware and recommends optimal LLM models for your system",
|
|
5
5
|
"bin": {
|
|
6
6
|
"llm-checker": "bin/cli.js",
|
|
@@ -16,6 +16,10 @@
|
|
|
16
16
|
"test:ui": "node tests/ui-cli-smoke.test.js",
|
|
17
17
|
"test:runtime": "node tests/runtime-specdec-tests.js",
|
|
18
18
|
"test:deterministic-pool": "node tests/deterministic-model-pool-check.js",
|
|
19
|
+
"test:registry": "node tests/model-registry-ingestors.test.js",
|
|
20
|
+
"test:registry-main": "node tests/model-registry-main-flow.test.js",
|
|
21
|
+
"test:registry-recommender": "node tests/model-registry-recommender.test.js",
|
|
22
|
+
"test:registry-seed": "node tests/model-registry-seed.test.js",
|
|
19
23
|
"test:policy": "node tests/policy-commands.test.js",
|
|
20
24
|
"test:policy-cli": "node tests/policy-cli-enforcement.js",
|
|
21
25
|
"test:policy-engine": "node tests/policy-engine.test.js",
|
|
@@ -36,7 +40,8 @@
|
|
|
36
40
|
"list-models": "node bin/enhanced_cli.js list-models",
|
|
37
41
|
"ai-check": "node bin/enhanced_cli.js ai-check",
|
|
38
42
|
"ai-run": "node bin/enhanced_cli.js ai-run",
|
|
39
|
-
"sync:seed": "node bin/enhanced_cli.js sync --force --quiet && node scripts/update-seed-db.js",
|
|
43
|
+
"sync:seed": "node bin/enhanced_cli.js sync --force --quiet && node scripts/update-seed-db.js && node scripts/update-registry-seed.js",
|
|
44
|
+
"sync:registry-seed": "node scripts/update-registry-seed.js",
|
|
40
45
|
"benchmark": "cd ml-model && python python/benchmark_collector.py",
|
|
41
46
|
"train-ai": "cd ml-model && python python/train_model.py",
|
|
42
47
|
"postinstall": "echo 'LLM Checker installed. Run: llm-checker hw-detect'"
|