llm-checker 3.6.1 → 3.7.4

package/README.md CHANGED
@@ -5,7 +5,7 @@
  **Intelligent Ollama Model Selector**
 
  AI-powered CLI that analyzes your hardware and recommends optimal LLM models.
- Deterministic scoring across **200+ Ollama models** and **7k+ variants** with a packaged SQLite catalog, live sync, and hardware-calibrated memory estimation.
+ Deterministic scoring across a packaged **multi-source registry** (Hugging Face + Ollama + GPT4All, **33k+ exact artifacts**) and the Ollama catalog, with live sync, runtime targeting, and hardware-calibrated memory estimation.
 
  [![npm version](https://img.shields.io/npm/v/llm-checker?style=flat-square&color=0066FF)](https://www.npmjs.com/package/llm-checker)
  [![npm downloads](https://img.shields.io/npm/dm/llm-checker?style=flat-square&color=0066FF)](https://www.npmjs.com/package/llm-checker)
@@ -39,6 +39,7 @@ Choosing the right LLM for your hardware is complex. With thousands of model var
  | | Feature | Description |
  |:---:|---|---|
  | **200+** | Packaged Model Catalog | Ships with a synced Ollama SQLite catalog and can refresh from Ollama on demand |
+ | **33k+** | Multi-Source Registry | Exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All with per-source commands and runtime targeting |
  | **4D** | Scoring Engine | Quality, Speed, Fit, Context — weighted by use case |
  | **Multi-GPU** | Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU, integrated/dedicated inventory visibility |
  | **Calibrated** | Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |
@@ -151,6 +152,14 @@ hash -r
  llm-checker --version
  ```
 
+ ### v3.7.0 Highlights
+
+ - New **multi-source model registry**: a packaged snapshot of ~33,700 exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All, with per-source commands (`hf download ...`, `ollama pull ...`).
+ - `recommend` and `check` now draw candidates from the registry through one canonical deterministic scoring core, with `--runtime auto/ollama/vllm/mlx/llama.cpp/transformers` targeting; they fall back to the Ollama catalog when the registry is unavailable.
+ - New `registry-sync`, `registry-search`, and `registry-recommend` commands.
+ - Mixture-of-Experts models are sized by their **total** parameter count (all experts stay resident under Ollama/Metal/vLLM), so a large MoE can no longer falsely "fit" small hardware.
+ - Carries the 3.6.1 batch: unified scoring across `check`/`recommend`/`smart-recommend` (#88), high-end/multi-GPU VRAM detection (#95), MCP server hardening (#97), and the Windows interactive-panel fixes (#86).
+
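The MoE sizing rule in the highlights above can be illustrated with a little arithmetic. This is a sketch only, not llm-checker's estimator: the `estimateWeightsGB` helper, the 0.56 bytes-per-parameter figure (roughly a 4-bit quantization), the 16 GB budget, and the 47B-total / 13B-active model are all assumptions chosen for illustration.

```javascript
// Hypothetical helper: weights-only footprint in GB.
// bytesPerParam = 0.56 is an assumed value for a ~4-bit quantization,
// not a figure taken from llm-checker.
function estimateWeightsGB(paramsB, bytesPerParam = 0.56) {
    // paramsB is in billions, so billions * bytes/param gives gigabytes directly.
    return paramsB * bytesPerParam;
}

// Hypothetical MoE model: 47B parameters in total, 13B active per token.
const moe = { totalParamsB: 47, activeParamsB: 13 };
const availableGB = 16; // assumed memory budget

const byActive = estimateWeightsGB(moe.activeParamsB);
const byTotal = estimateWeightsGB(moe.totalParamsB);

// Sizing by active parameters would wrongly report a fit...
const fitsByActive = byActive <= availableGB;
// ...while sizing by total parameters (all experts resident) does not.
const fitsByTotal = byTotal <= availableGB;

console.log({ byActive, byTotal, fitsByActive, fitsByTotal });
```

Under these assumed numbers the active-parameter estimate stays below the budget while the total-parameter estimate exceeds it, which is the false "fit" the changelog entry describes.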
  ### v3.5.13 Highlights
 
  - Ships npm packages with a ready-to-use SQLite model catalog:
@@ -389,6 +398,27 @@ llm-checker search "qwen coder" --json
  | `search <query>` | Search the synced catalog with filters and intelligent scoring |
  | `smart-recommend` | Advanced recommendations using the full scoring engine |
 
+ ### Model Registry Commands (v3.7.0+)
+
+ Exact installable/downloadable artifacts from a packaged multi-source registry (Hugging Face + Ollama + GPT4All).
+
+ | Command | Description |
+ |---------|-------------|
+ | `registry-sync` | Sync the multi-source registry (Hugging Face, Ollama, GPT4All) |
+ | `registry-search [query]` | Search exact artifacts with `--source`, `--format`, `--runtime`, `--quant`, `--max-size`, `--min-params`/`--max-params` filters |
+ | `registry-recommend [query]` | Recommend the best exact artifacts for your hardware, with `--runtime auto/ollama/vllm/mlx/llama.cpp/transformers` targeting and `--category`/`--optimize` |
+
+ ```bash
+ # Best coding artifacts across all sources, auto runtime
+ llm-checker registry-recommend --category coding
+
+ # Only Apple-native MLX artifacts
+ llm-checker registry-recommend --category coding --runtime mlx
+
+ # Search Hugging Face for vLLM-ready reasoning models under 24B
+ llm-checker registry-search qwen --source huggingface --runtime vllm --max-params 24
+ ```
+
  ### Enterprise Policy Commands
 
  | Command | Description |
@@ -641,30 +671,36 @@ llm-checker search qwen --quant Q4_K_M --max-size 8
 
  ## Model Catalog
 
- LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
+ LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog plus a multi-source registry of exact downloadable/installable model artifacts. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
 
  The packaged snapshot currently includes:
 
  - 229 Ollama models
  - 7176 variants
+ - 3259 multi-source registry repositories
+ - 33729 exact model artifacts from Hugging Face, Ollama, and GPT4All
+ - Hugging Face top 3000 repositories by downloads, fetched with API pagination
  - pull counts
  - tag counts
  - last-updated metadata
- - variant params, quantization, size, context, and input type fields when available
+ - variant params, quantization, size, context, runtime, install commands, download URLs, license/gated flags, tasks, and modalities when available
 
  Refresh it any time:
 
  ```bash
  llm-checker sync
+ llm-checker registry-sync --sources ollama,huggingface,gpt4all
+ llm-checker registry-search qwen --runtime auto --max-size 8
+ llm-checker registry-recommend --category coding --runtime auto --max-size 8
  ```
 
- For release maintainers, the packaged seed can be regenerated from the synced local DB:
+ For release maintainers, the packaged seed can be regenerated from the synced local DB and registry APIs:
 
  ```bash
  npm run sync:seed
  ```
 
- `recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
+ `recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. `registry-search` queries exact artifacts across sources, and `registry-recommend` ranks exact artifacts from the registry with the deterministic hardware-aware selector. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
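The fallback order described in that paragraph (synced SQLite catalog, then scraped cache, then curated catalog) can be sketched as a first-available lookup. The loader names and return shapes below are hypothetical stand-ins rather than llm-checker's internal API, and treating an empty list as "unavailable" is an assumption of this sketch.

```javascript
// Try each catalog source in priority order and return the first one
// that yields a non-empty model list. A loader that throws, or returns
// an empty list, is treated as unavailable (an assumption of this sketch).
function loadFirstAvailable(loaders) {
    for (const { name, load } of loaders) {
        try {
            const models = load();
            if (Array.isArray(models) && models.length > 0) {
                return { source: name, models };
            }
        } catch (error) {
            // Source unavailable; fall through to the next one.
        }
    }
    return { source: null, models: [] };
}

// Hypothetical loaders, in the priority order the README describes.
const loaders = [
    { name: 'sqlite', load: () => { throw new Error('models.db missing'); } },
    { name: 'scraped-cache', load: () => [] },
    { name: 'curated', load: () => [{ id: 'example-model:7b' }] }
];

const picked = loadFirstAvailable(loaders);
console.log(picked.source);
```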
 
  The curated fallback catalog includes 35+ models from the most popular Ollama families:
 
@@ -836,7 +872,7 @@ LLM Checker uses a deterministic pipeline so the same inputs produce the same ra
  flowchart LR
      subgraph Inputs
          HW["Hardware detector<br/>CPU/GPU/RAM/backend"]
-         REG["Synced SQLite Ollama catalog<br/>(packaged seed + live sync)"]
+         REG["Synced SQLite model catalog<br/>(Ollama seed + multi-source registry)"]
          LOCAL["Installed local models"]
          FLAGS["CLI options<br/>use-case/runtime/limits/policy"]
      end
@@ -952,8 +988,9 @@ src/
      detector.js  # Hardware detection
      unified-detector.js  # Cross-platform detection
    data/
-     model-database.js  # SQLite storage and packaged seed loading
-     seed/models.db  # npm-packaged Ollama catalog snapshot
+     model-database.js  # SQLite storage, registry tables, and packaged seed loading
+     registry-ingestors.js  # Ollama/Hugging Face/GPT4All artifact normalization
+     seed/models.db  # npm-packaged Ollama + multi-source registry snapshot
      sync-manager.js  # Database sync from Ollama registry
  bin/
    enhanced_cli.js  # CLI entry point
@@ -61,6 +61,9 @@ const COMMAND_HEADER_LABELS = {
      'smart-recommend': 'Smart Recommend (Experimental)',
      search: 'Model Search',
      sync: 'Database Sync',
+     'registry-sync': 'Model Registry Sync',
+     'registry-search': 'Model Registry Search',
+     'registry-recommend': 'Registry Recommendations',
      'mcp-setup': 'Claude MCP Setup',
      check: 'Compatibility Check',
      installed: 'Installed Models',
@@ -401,6 +404,65 @@ function getRealSizeFromOllamaCache(model) {
      }
  }
 
+ function parsePositiveNumberOption(value, fallback = null) {
+     if (value === undefined || value === null || value === '') return fallback;
+     const parsed = Number(value);
+     return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
+ }
+
+ // Allowed enum values for the registry commands. Invalid values must be rejected
+ // with a clear error instead of silently returning "no results" or falling back
+ // to the built-in catalog.
+ const REGISTRY_SOURCES = ['ollama', 'huggingface', 'gpt4all'];
+ const REGISTRY_FORMATS = ['gguf', 'safetensors', 'mlx', 'ollama', 'pytorch', 'pytorch_bin', 'ggml'];
+ const REGISTRY_RUNTIMES = ['auto', 'all', '*', 'ollama', 'llama.cpp', 'transformers', 'vllm', 'mlx'];
+ const REGISTRY_OPTIMIZE = ['balanced', 'speed', 'quality', 'context', 'coding'];
+
+ function assertRegistryEnum(label, value, allowed) {
+     if (value === undefined || value === null || value === '') return;
+     if (!allowed.includes(String(value).toLowerCase())) {
+         const shown = allowed.filter((v) => !['all', '*'].includes(v)).join(', ');
+         throw new Error(`Invalid --${label} "${value}". Allowed: ${shown}`);
+     }
+ }
+
+ // Throws on the first invalid registry enum option. Returns nothing on success.
+ function validateRegistryFilters(options = {}) {
+     assertRegistryEnum('source', options.source, REGISTRY_SOURCES);
+     assertRegistryEnum('format', options.format, REGISTRY_FORMATS);
+     assertRegistryEnum('runtime', options.runtime, REGISTRY_RUNTIMES);
+     assertRegistryEnum('optimize', options.optimize, REGISTRY_OPTIMIZE);
+ }
+
+ function truncateMiddle(value, maxLength = 48) {
+     const text = String(value || '');
+     if (text.length <= maxLength) return text;
+     if (maxLength <= 4) return text.slice(0, maxLength);
+     const head = Math.ceil((maxLength - 3) / 2);
+     const tail = Math.floor((maxLength - 3) / 2);
+     return `${text.slice(0, head)}...${text.slice(text.length - tail)}`;
+ }
+
+ function formatRegistryNumber(value, suffix = '') {
+     const parsed = Number(value);
+     if (!Number.isFinite(parsed) || parsed <= 0) return '?';
+     const rounded = parsed >= 100 ? Math.round(parsed) : Math.round(parsed * 10) / 10;
+     return `${rounded}${suffix}`;
+ }
+
+ function formatRegistrySize(value) {
+     const parsed = Number(value);
+     if (!Number.isFinite(parsed) || parsed <= 0) return '?';
+     return `${Math.round(parsed * 100) / 100}GB`;
+ }
+
+ function formatRegistryList(value, maxItems = 3) {
+     const items = Array.isArray(value) ? value : [];
+     if (items.length === 0) return '-';
+     const shown = items.slice(0, maxItems).join(', ');
+     return items.length > maxItems ? `${shown}, +${items.length - maxItems}` : shown;
+ }
+
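To make the behaviour of the option helpers above concrete, here is a small standalone sketch. `parsePositiveNumberOption`, `assertRegistryEnum`, and `REGISTRY_RUNTIMES` are reproduced from the hunk so the snippet runs on its own; the sample inputs (including the `tensorrt` runtime) are made up for illustration.

```javascript
// Reproduced from the hunk so the snippet is self-contained.
function parsePositiveNumberOption(value, fallback = null) {
    if (value === undefined || value === null || value === '') return fallback;
    const parsed = Number(value);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

const REGISTRY_RUNTIMES = ['auto', 'all', '*', 'ollama', 'llama.cpp', 'transformers', 'vllm', 'mlx'];

function assertRegistryEnum(label, value, allowed) {
    if (value === undefined || value === null || value === '') return;
    if (!allowed.includes(String(value).toLowerCase())) {
        const shown = allowed.filter((v) => !['all', '*'].includes(v)).join(', ');
        throw new Error(`Invalid --${label} "${value}". Allowed: ${shown}`);
    }
}

// Numeric options: anything non-numeric, zero, or negative yields the fallback.
const limits = [
    parsePositiveNumberOption('24'),       // parsed as the number 24
    parsePositiveNumberOption('abc', 20),  // not a number, so the fallback 20
    parsePositiveNumberOption('-5'),       // negative, so the default fallback null
    parsePositiveNumberOption('0', 10)     // zero is not positive, so the fallback 10
];

// Enum options: matching is case-insensitive, and "all" / "*" are accepted
// but left out of the error message.
assertRegistryEnum('runtime', 'MLX', REGISTRY_RUNTIMES); // passes silently

let message = null;
try {
    assertRegistryEnum('runtime', 'tensorrt', REGISTRY_RUNTIMES); // made-up invalid value
} catch (error) {
    message = error.message;
}
console.log(limits, message);
```

Note that validation lowercases the value only for the membership check; the handlers later in the diff pass `options.source` and `options.runtime` through unchanged.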
  const program = new Command();
 
  program
@@ -1285,12 +1347,17 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
      const { summary, recommendations } = intelligentData;
      const tier = summary.hardware_tier.replace('_', ' ').toUpperCase();
      const optimizeProfile = (summary.optimize_for || intelligentData.optimizeFor || 'balanced').toUpperCase();
+     const runtimeLabel = (intelligentData.runtime || summary.best_overall?.runtime || 'auto').toUpperCase();
+     const sourceLabel = intelligentData.recommendationSource === 'registry'
+         ? 'Multi-source registry'
+         : 'Ollama catalog';
      const tierColor = tier.includes('HIGH') ? chalk.green : tier.includes('MEDIUM') ? chalk.yellow : chalk.red;
 
      console.log('\n' + chalk.bgRed.white.bold(' INTELLIGENT RECOMMENDATIONS BY CATEGORY '));
      console.log(chalk.red('╭' + '─'.repeat(65)));
      console.log(chalk.red('│') + ` Hardware Tier: ${tierColor.bold(tier)} | Models Analyzed: ${chalk.cyan.bold(intelligentData.totalModelsAnalyzed)}`);
-     console.log(chalk.red('│') + ` Optimization: ${chalk.magenta.bold(optimizeProfile)}`);
+     console.log(chalk.red('│') + ` Optimization: ${chalk.magenta.bold(optimizeProfile)} | Runtime: ${chalk.cyan.bold(runtimeLabel)}`);
+     console.log(chalk.red('│') + ` Source: ${chalk.white.bold(sourceLabel)}`);
      console.log(chalk.red('│'));
 
      // Show best overall model
@@ -1301,6 +1368,7 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
          console.log(chalk.red('│') + ` Command: ${chalk.cyan.bold(best.command)}`);
          console.log(chalk.red('│') + ` Score: ${chalk.yellow.bold(best.score)}/100 | Category: ${chalk.magenta(best.category)}`);
          console.log(chalk.red('│') + ` Quantization: ${chalk.white.bold(best.quantization || 'Q4_K_M')}`);
+         console.log(chalk.red('│') + ` Runtime: ${chalk.cyan.bold(best.runtime || intelligentData.runtime || 'ollama')} | Source: ${chalk.gray(best.source || 'unknown')}`);
          console.log(chalk.red('│') + ` Fine-tuning: ${chalk.blue.bold(bestFineTuning.shortLabel)}`);
          console.log(chalk.red('│'));
      }
@@ -1326,6 +1394,7 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
              console.log(chalk.red('│') + ` ${chalk.green(model.name)} (${model.size})`);
              console.log(chalk.red('│') + ` Score: ${scoreColor.bold(model.score)}/100 | Pulls: ${chalk.gray(model.pulls?.toLocaleString() || 'N/A')}`);
              console.log(chalk.red('│') + ` Quantization: ${chalk.white.bold(model.quantization || 'Q4_K_M')}`);
+             console.log(chalk.red('│') + ` Runtime: ${chalk.cyan(model.runtime || intelligentData.runtime || 'ollama')} | Source: ${chalk.gray(model.source || 'unknown')}`);
              console.log(chalk.red('│') + ` Fine-tuning: ${chalk.blue.bold(fineTuningSupport.shortLabel)}`);
              console.log(chalk.red('│') + ` Command: ${chalk.cyan.bold(model.command)}`);
              console.log(chalk.red('│'));
@@ -3017,7 +3086,7 @@ auditCommand
      .option('-u, --use-case <case>', 'Use case when --command check is selected', 'general')
      .option('-c, --category <category>', 'Category hint when --command recommend is selected')
      .option('--optimize <profile>', 'Optimization profile for recommend mode (balanced|speed|quality|context|coding)', 'balanced')
-     .option('--runtime <runtime>', `Runtime for check mode (${SUPPORTED_RUNTIMES.join('|')})`, 'ollama')
+     .option('--runtime <runtime>', 'Runtime for check/recommend mode (auto|ollama|vllm|mlx|llama.cpp|transformers)', 'auto')
      .option('--include-cloud', 'Include cloud models in check-mode analysis')
      .option('--max-size <size>', 'Maximum model size for check mode (e.g., "24B" or "12GB")')
      .option('--min-size <size>', 'Minimum model size for check mode (e.g., "3B" or "2GB")')
@@ -3071,13 +3140,14 @@ auditCommand
                  policyCandidates = collectCandidatesFromAnalysis(analysisResult);
              } else {
                  recommendationResult = await checker.generateIntelligentRecommendations(hardware, {
-                     optimizeFor: options.optimize
+                     optimizeFor: options.optimize,
+                     runtime: options.runtime
                  });
                  if (!recommendationResult) {
                      throw new Error('Unable to generate recommendation data for policy audit export.');
                  }
 
-                 runtimeBackend = normalizeRuntime(options.runtime || 'ollama');
+                 runtimeBackend = recommendationResult.runtime || options.runtime || 'auto';
                  policyCandidates = collectCandidatesFromRecommendationData(recommendationResult);
              }
 
@@ -3844,6 +3914,8 @@ program
      .description('Get intelligent model recommendations for your hardware')
      .option('-c, --category <category>', 'Get recommendations for specific category (coding, talking, reading, etc.)')
      .option('--optimize <profile>', 'Optimization profile (balanced|speed|quality|context|coding)', 'balanced')
+     .option('--runtime <runtime>', 'Runtime target for registry recommendations (auto|ollama|vllm|mlx|llama.cpp|transformers)', 'auto')
+     .option('--no-registry', 'Use the legacy Ollama catalog recommendation path')
      .option('--no-verbose', 'Disable step-by-step progress display')
      .option('--policy <file>', 'Evaluate recommendations against a policy file')
      .option('--simulate <profile>', 'Simulate a hardware profile instead of detecting real hardware (use "list" to see profiles)')
@@ -3868,6 +3940,11 @@ Hardware simulation:
    $ llm-checker recommend --simulate m4pro24 --category coding
    $ llm-checker recommend --gpu "RTX 5060" --ram 32 --cpu "AMD Ryzen 7 5700X"
 
+ Registry/runtime examples:
+   $ llm-checker recommend --runtime auto --category coding
+   $ llm-checker recommend --runtime vllm --category coding
+   $ llm-checker recommend --runtime mlx --category general
+
  Calibrated routing examples:
    $ llm-checker recommend --calibrated --category coding
    $ llm-checker recommend --calibrated ./calibration-policy.yaml --category reasoning
@@ -3945,7 +4022,9 @@ Calibrated routing examples:
 
              const hardware = await checker.getSystemInfo();
              const intelligentRecommendations = await checker.generateIntelligentRecommendations(hardware, {
-                 optimizeFor: options.optimize
+                 optimizeFor: options.optimize,
+                 runtime: options.runtime,
+                 registry: options.registry
              });
 
              if (!intelligentRecommendations) {
@@ -4729,6 +4808,329 @@ program
          }
      });
 
+ program
+     .command('registry-sync')
+     .description('Sync the multi-source model registry (Ollama, Hugging Face, GPT4All)')
+     .option('-s, --sources <list>', 'Comma-separated sources: ollama,huggingface,gpt4all', 'ollama,huggingface,gpt4all')
+     .option('-l, --limit <n>', 'Fallback maximum records per source')
+     .option('--hf-limit <n>', 'Maximum Hugging Face repos to ingest', '3000')
+     .option('--ollama-limit <n>', 'Maximum Ollama artifacts to ingest', '10000')
+     .option('--gpt4all-limit <n>', 'Maximum GPT4All entries to ingest', '1000')
+     .option('--query <text>', 'Hugging Face search query')
+     .option('--task <task>', 'Hugging Face task/filter, for example text-generation or text-embeddings-inference')
+     .option('--dry-run', 'Fetch and normalize without writing to the database')
+     .option('-q, --quiet', 'Suppress progress output')
+     .option('-j, --json', 'Output as JSON')
+     .action(async (options) => {
+         const quiet = Boolean(options.quiet || options.json);
+         if (!quiet) showAsciiArt('registry-sync');
+
+         const ModelDatabase = require('../src/data/model-database');
+         const { RegistryIngestor } = require('../src/data/registry-ingestors');
+         const database = new ModelDatabase();
+         const spinner = quiet ? null : ora('Preparing model registry sync...').start();
+
+         try {
+             await database.initialize();
+
+             const ingestor = new RegistryIngestor({
+                 database,
+                 onProgress: (info) => {
+                     if (spinner && info.message) {
+                         spinner.text = info.message;
+                     }
+                 }
+             });
+
+             const summary = await ingestor.ingest({
+                 sources: options.sources,
+                 limit: parsePositiveNumberOption(options.limit),
+                 hfLimit: parsePositiveNumberOption(options.hfLimit, 3000),
+                 ollamaLimit: parsePositiveNumberOption(options.ollamaLimit, 10000),
+                 gpt4allLimit: parsePositiveNumberOption(options.gpt4allLimit, 1000),
+                 query: options.query,
+                 task: options.task,
+                 dryRun: Boolean(options.dryRun)
+             });
+             const stats = options.dryRun ? null : database.getRegistryStats();
+
+             if (options.json) {
+                 console.log(JSON.stringify({ summary, stats }, null, 2));
+                 return;
+             }
+
+             if (spinner) {
+                 const action = options.dryRun ? 'normalized' : 'synced';
+                 spinner.succeed(`Registry ${action}: ${summary.repos} repos, ${summary.artifacts} artifacts`);
+             }
+
+             console.log(chalk.green('\n[OK] Registry sync complete'));
+             console.log(chalk.gray(` Sources touched: ${summary.sources}`));
+             console.log(chalk.gray(` Collections: ${summary.collections}`));
+             console.log(chalk.gray(` Repositories: ${summary.repos}`));
+             console.log(chalk.gray(` Artifacts: ${summary.artifacts}`));
+
+             if (stats) {
+                 console.log(chalk.blue.bold('\nRegistry totals:'));
+                 console.log(chalk.gray(` Sources: ${stats.sources}`));
+                 console.log(chalk.gray(` Repositories: ${stats.repos}`));
+                 console.log(chalk.gray(` Artifacts: ${stats.artifacts}`));
+
+                 if (stats.bySource.length > 0) {
+                     const rows = [['Source', 'Artifacts']];
+                     for (const item of stats.bySource) {
+                         rows.push([item.source_id, String(item.artifact_count)]);
+                     }
+                     console.log('\n' + table(rows));
+                 }
+             }
+
+             console.log(chalk.cyan('Try: llm-checker registry-search llama --runtime auto --limit 10'));
+         } catch (error) {
+             if (spinner) spinner.fail('Registry sync failed');
+             console.error(chalk.red('Error:'), error.message);
+             if (process.env.DEBUG) console.error(error.stack);
+             process.exitCode = 1;
+         } finally {
+             database.close();
+         }
+     });
+
+ program
+     .command('registry-search [query]')
+     .description('Search exact downloadable/installable artifacts in the multi-source model registry')
+     .option('-s, --source <source>', 'Filter by source: ollama, huggingface, gpt4all')
+     .option('--format <format>', 'Filter by artifact format: gguf, safetensors, mlx, ollama')
+     .option('--runtime <runtime>', 'Filter by runtime support: auto, ollama, llama.cpp, transformers, vllm, mlx')
+     .option('--quant <type>', 'Filter by quantization, for example Q4_K_M or Q8_0')
+     .option('--max-size <gb>', 'Maximum artifact size in GB')
+     .option('--min-params <billion>', 'Minimum parameter count in billions')
+     .option('--max-params <billion>', 'Maximum parameter count in billions')
+     .option('--local-only', 'Exclude gated/auth-required artifacts')
+     .option('-l, --limit <n>', 'Maximum number of results', '20')
+     .option('-j, --json', 'Output as JSON')
+     .action(async (query = '', options) => {
+         try {
+             validateRegistryFilters(options);
+         } catch (validationError) {
+             if (options.json) {
+                 console.log(JSON.stringify({ error: validationError.message }, null, 2));
+             } else {
+                 console.error(chalk.red(`✗ ${validationError.message}`));
+             }
+             process.exitCode = 1;
+             return;
+         }
+         if (!options.json) showAsciiArt('registry-search');
+
+         const ModelDatabase = require('../src/data/model-database');
+         const database = new ModelDatabase();
+
+         try {
+             await database.initialize();
+
+             const filters = {
+                 source: options.source,
+                 format: options.format ? String(options.format).toLowerCase() : undefined,
+                 runtime: options.runtime,
+                 quantization: options.quant,
+                 maxSizeGB: parsePositiveNumberOption(options.maxSize),
+                 minParamsB: parsePositiveNumberOption(options.minParams),
+                 maxParamsB: parsePositiveNumberOption(options.maxParams),
+                 localOnly: Boolean(options.localOnly),
+                 limit: parsePositiveNumberOption(options.limit, 20)
+             };
+             const results = database.searchModelArtifacts(query, filters);
+             const stats = database.getRegistryStats();
+
+             if (options.json) {
+                 console.log(JSON.stringify({
+                     query,
+                     filters,
+                     count: results.length,
+                     stats,
+                     results
+                 }, null, 2));
+                 return;
+             }
+
+             if (results.length === 0) {
+                 console.log(chalk.yellow('No registry artifacts found.'));
+                 if (stats.artifacts === 0) {
+                     console.log(chalk.gray('Populate the registry first with: llm-checker registry-sync'));
+                 }
+                 return;
+             }
+
+             console.log(chalk.blue.bold('\nRegistry Results'));
+             console.log(chalk.gray(`Stored registry: ${stats.artifacts} artifacts across ${stats.sources} sources`));
+             console.log('');
+
+             const rows = [[
+                 'Source',
+                 'Model',
+                 'Artifact',
+                 'Params',
+                 'Size',
+                 'Format',
+                 'Runtime',
+                 'Install'
+             ]];
+
+             for (const item of results) {
+                 rows.push([
+                     item.source_id,
+                     truncateMiddle(item.canonical_model_id, 34),
+                     truncateMiddle(item.artifact_name || item.filename, 34),
+                     formatRegistryNumber(item.parameter_count_b, 'B'),
+                     formatRegistrySize(item.size_gb),
+                     item.quantization ? `${item.format}/${item.quantization}` : item.format,
+                     formatRegistryList(item.runtime_support, 2),
+                     truncateMiddle(item.install_command || item.download_url, 46)
+                 ]);
+             }
+
+             console.log(table(rows));
+
+             const links = results
+                 .filter((item) => item.download_url)
+                 .slice(0, 5);
+             if (links.length > 0) {
+                 console.log(chalk.blue.bold('Exact download links:'));
+                 links.forEach((item, index) => {
+                     console.log(chalk.gray(` ${index + 1}. ${item.canonical_model_id} -> ${item.download_url}`));
+                 });
+             }
+         } catch (error) {
+             console.error(chalk.red('Error:'), error.message);
+             if (process.env.DEBUG) console.error(error.stack);
+             process.exitCode = 1;
+         } finally {
+             database.close();
+         }
+     });
+
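A script that consumes `registry-search --json` might reduce the payload to one install hint per result. The field names used here (`results`, `canonical_model_id`, `install_command`, `download_url`, `error`) come from the handler above; the `summarizeSearchOutput` helper and the sample payload are made up for illustration.

```javascript
// Hypothetical helper: turn the JSON printed by `registry-search --json`
// into a list of { model, how } entries, or throw on a validation error.
function summarizeSearchOutput(jsonText) {
    const payload = JSON.parse(jsonText);
    // Invalid filters are reported as { "error": "..." } with exit code 1.
    if (payload.error) throw new Error(payload.error);
    return (payload.results || []).map((item) => ({
        model: item.canonical_model_id,
        // Same preference order as the table's "Install" column.
        how: item.install_command || item.download_url || null
    }));
}

// Made-up sample payload with the shape produced by the handler.
const sample = JSON.stringify({
    query: 'qwen',
    filters: { runtime: 'auto', limit: 20 },
    count: 2,
    stats: { artifacts: 2, sources: 2 },
    results: [
        { source_id: 'ollama', canonical_model_id: 'example/model-a', install_command: 'ollama pull example-model-a' },
        { source_id: 'huggingface', canonical_model_id: 'example/model-b', download_url: 'https://example.com/model-b.gguf' }
    ]
});

const entries = summarizeSearchOutput(sample);
console.log(entries);
```

Checking for the `error` key first matters because the handler prints the error object to stdout in JSON mode and only signals failure through the exit code.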
+ program
+     .command('registry-recommend [query]')
+     .description('Recommend the best exact model artifacts from the multi-source registry for this hardware')
+     .option('-c, --category <category>', 'Task category (general, coding, reasoning, embeddings, multimodal)', 'general')
+     .option('--optimize <profile>', 'Optimization profile (balanced|speed|quality|context|coding)', 'balanced')
+     .option('--runtime <runtime>', 'Runtime target: auto, ollama, llama.cpp, vllm, mlx, transformers', 'auto')
+     .option('-s, --source <source>', 'Filter by source: ollama, huggingface, gpt4all')
+     .option('--format <format>', 'Filter by artifact format: gguf, safetensors, mlx, ollama')
+     .option('--quant <type>', 'Filter by quantization, for example Q4_K_M or Q8_0')
+     .option('--max-size <gb>', 'Maximum artifact size in GB')
+     .option('--min-params <billion>', 'Minimum parameter count in billions')
+     .option('--max-params <billion>', 'Maximum parameter count in billions')
+     .option('--target-context <tokens>', 'Target context window for scoring')
+     .option('--include-gated', 'Include gated/auth-required artifacts')
+     .option('--pool-limit <n>', 'Maximum registry artifacts to score before ranking', '20000')
+     .option('-l, --limit <n>', 'Maximum number of recommendations', '10')
+     .option('-j, --json', 'Output as JSON')
+     .action(async (query = '', options) => {
+         try {
+             validateRegistryFilters(options);
+         } catch (validationError) {
+             if (options.json) {
+                 console.log(JSON.stringify({ error: validationError.message }, null, 2));
+             } else {
+                 console.error(chalk.red(`✗ ${validationError.message}`));
+             }
+             process.exitCode = 1;
+             return;
+         }
+         if (!options.json) showAsciiArt('registry-recommend');
+
+         const UnifiedDetector = require('../src/hardware/unified-detector');
+         const { RegistryRecommender } = require('../src/data/registry-recommender');
+         const recommender = new RegistryRecommender();
+         const spinner = options.json ? null : ora('Scoring registry artifacts...').start();
+
+         try {
+             await recommender.initialize();
+
+             const detector = new UnifiedDetector();
+             const hardware = await detector.detect();
+             const category = normalizeTaskName(options.category || 'general');
+             const result = await recommender.recommend({
+                 query,
+                 category,
+                 optimizeFor: options.optimize,
+                 runtime: options.runtime,
+                 source: options.source,
+                 format: options.format ? String(options.format).toLowerCase() : undefined,
+                 quantization: options.quant,
+                 maxSizeGB: parsePositiveNumberOption(options.maxSize),
+                 minParamsB: parsePositiveNumberOption(options.minParams),
+                 maxParamsB: parsePositiveNumberOption(options.maxParams),
+                 targetContext: parsePositiveNumberOption(options.targetContext),
+                 localOnly: !options.includeGated,
+                 poolLimit: parsePositiveNumberOption(options.poolLimit, 20000),
+                 limit: parsePositiveNumberOption(options.limit, 10),
+                 hardware
+             });
+
+             if (options.json) {
+                 console.log(JSON.stringify({
+                     query,
+                     hardware: hardware.summary || hardware,
+                     ...result
+                 }, null, 2));
+                 return;
+             }
+
+             if (spinner) {
+                 spinner.succeed(
+                     `Scored ${result.total_evaluated} candidates from ${result.total_artifacts} registry artifacts`
+                 );
+             }
+
+             if (result.recommendations.length === 0) {
+                 console.log(chalk.yellow('No registry recommendations found for those filters.'));
+                 if (result.registry.artifacts === 0) {
+                     console.log(chalk.gray('Populate the registry first with: llm-checker registry-sync'));
+                 }
+                 return;
+             }
+
+             console.log(chalk.blue.bold('\nRegistry Recommendations'));
+             console.log(chalk.gray(`Registry: ${result.registry.repos} repos, ${result.registry.artifacts} artifacts`));
+             console.log(chalk.gray(`Runtime: ${result.runtime} | Category: ${result.category} | Optimize: ${result.optimizeFor}`));
+             console.log('');
+
+             const rows = [['#', 'Score', 'Source', 'Model', 'Artifact', 'Params', 'Size', 'Install']];
+             result.recommendations.forEach((item, index) => {
+                 rows.push([
+                     String(index + 1),
+                     String(item.score),
+                     item.source,
+                     truncateMiddle(item.model, 30),
+                     truncateMiddle(item.artifact, 32),
+                     formatRegistryNumber(item.params_b, 'B'),
+                     formatRegistrySize(item.size_gb),
+                     truncateMiddle(item.install_command || item.download_url, 44)
+                 ]);
+             });
+
+             console.log(table(rows));
+
+             console.log(chalk.blue.bold('Top pick:'));
+             const best = result.recommendations[0];
+             console.log(chalk.white.bold(` ${best.model}`));
+             console.log(chalk.gray(` Artifact: ${best.artifact}`));
+             console.log(chalk.gray(` Why: ${best.rationale}`));
+             if (best.install_command) console.log(chalk.cyan(` ${best.install_command}`));
+             if (best.download_url) console.log(chalk.gray(` ${best.download_url}`));
+         } catch (error) {
+             if (spinner) spinner.fail('Registry recommendation failed');
+             console.error(chalk.red('Error:'), error.message);
+             if (process.env.DEBUG) console.error(error.stack);
+             process.exitCode = 1;
+         } finally {
+             recommender.close();
+         }
+     });
+
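One detail that is easy to miss when comparing the two handlers is that they default differently for gated artifacts: `registry-search` passes `localOnly: Boolean(options.localOnly)`, while `registry-recommend` passes `localOnly: !options.includeGated`. The sketch below isolates just those two expressions; the wrapper function names are hypothetical.

```javascript
// The expression used by registry-search: gated artifacts are included
// unless --local-only is passed.
function searchLocalOnly(options = {}) {
    return Boolean(options.localOnly);
}

// The expression used by registry-recommend: gated artifacts are excluded
// unless --include-gated is passed.
function recommendLocalOnly(options = {}) {
    return !options.includeGated;
}

// With no flags, search keeps gated artifacts and recommend filters them out.
console.log(searchLocalOnly({}), recommendLocalOnly({}));
```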
  program
      .command('search <query>')
      .description('Search models in the database with intelligent scoring')
@@ -290,9 +290,14 @@ const ALLOWED_CLI_COMMANDS = new Set([
    "sync",
    "search",
    "smart-recommend",
+   "registry-sync",
+   "registry-search",
+   "registry-recommend",
    "hw-detect",
  ]);
 
+ export { ALLOWED_CLI_COMMANDS };
+
  // ============================================================================
  // MCP SERVER
  // ============================================================================
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "llm-checker",
-   "version": "3.6.1",
+   "version": "3.7.4",
    "description": "Intelligent CLI tool with AI-powered model selection that analyzes your hardware and recommends optimal LLM models for your system",
    "bin": {
      "llm-checker": "bin/cli.js",
@@ -16,6 +16,10 @@
      "test:ui": "node tests/ui-cli-smoke.test.js",
      "test:runtime": "node tests/runtime-specdec-tests.js",
      "test:deterministic-pool": "node tests/deterministic-model-pool-check.js",
+     "test:registry": "node tests/model-registry-ingestors.test.js",
+     "test:registry-main": "node tests/model-registry-main-flow.test.js",
+     "test:registry-recommender": "node tests/model-registry-recommender.test.js",
+     "test:registry-seed": "node tests/model-registry-seed.test.js",
      "test:policy": "node tests/policy-commands.test.js",
      "test:policy-cli": "node tests/policy-cli-enforcement.js",
      "test:policy-engine": "node tests/policy-engine.test.js",
@@ -36,7 +40,8 @@
      "list-models": "node bin/enhanced_cli.js list-models",
      "ai-check": "node bin/enhanced_cli.js ai-check",
      "ai-run": "node bin/enhanced_cli.js ai-run",
-     "sync:seed": "node bin/enhanced_cli.js sync --force --quiet && node scripts/update-seed-db.js",
+     "sync:seed": "node bin/enhanced_cli.js sync --force --quiet && node scripts/update-seed-db.js && node scripts/update-registry-seed.js",
+     "sync:registry-seed": "node scripts/update-registry-seed.js",
      "benchmark": "cd ml-model && python python/benchmark_collector.py",
      "train-ai": "cd ml-model && python python/train_model.py",
      "postinstall": "echo 'LLM Checker installed. Run: llm-checker hw-detect'"