
llm-checker 3.6.1 → 3.7.4


Files changed (12)

package/README.md +45 -8
package/bin/enhanced_cli.js +407 -5
package/bin/mcp-server.mjs +5 -0
package/package.json +7 -2
package/src/data/model-database.js +452 -0
package/src/data/registry-ingestors.js +765 -0
package/src/data/registry-recommender.js +632 -0
package/src/data/seed/README.md +11 -3
package/src/data/seed/models.db +0 -0
package/src/index.js +68 -4
package/src/models/deterministic-selector.js +85 -39
package/src/models/moe-assumptions.js +11 -0

package/README.md

@@ -5,7 +5,7 @@
 **Intelligent Ollama Model Selector**
 AI-powered CLI that analyzes your hardware and recommends optimal LLM models.
-Deterministic scoring across **200+ Ollama models** and **7k+ variants** with a packaged SQLite catalog, live sync, and hardware-calibrated memory estimation.
+Deterministic scoring across a packaged **multi-source registry** (Hugging Face + Ollama + GPT4All, **33k+ exact artifacts**) and the Ollama catalog, with live sync, runtime targeting, and hardware-calibrated memory estimation.
 [![npm version](https://img.shields.io/npm/v/llm-checker?style=flat-square&color=0066FF)](https://www.npmjs.com/package/llm-checker)
 [![npm downloads](https://img.shields.io/npm/dm/llm-checker?style=flat-square&color=0066FF)](https://www.npmjs.com/package/llm-checker)
@@ -39,6 +39,7 @@ Choosing the right LLM for your hardware is complex. With thousands of model var
 | | Feature | Description |
 |:---:|---|---|
 | **200+** | Packaged Model Catalog | Ships with a synced Ollama SQLite catalog and can refresh from Ollama on demand |
+| **33k+** | Multi-Source Registry | Exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All with per-source commands and runtime targeting |
 | **4D** | Scoring Engine | Quality, Speed, Fit, Context &mdash; weighted by use case |
 | **Multi-GPU** | Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU, integrated/dedicated inventory visibility |
 | **Calibrated** | Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |
@@ -151,6 +152,14 @@ hash -r
 llm-checker --version
 ```
+### v3.7.0 Highlights
+- New **multi-source model registry**: a packaged snapshot of ~33,700 exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All, with per-source commands (`hf download ...`, `ollama pull ...`).
+- `recommend` and `check` now draw candidates from the registry through one canonical deterministic scoring core, with `--runtime auto/ollama/vllm/mlx/llama.cpp/transformers` targeting; they fall back to the Ollama catalog when the registry is unavailable.
+- New `registry-sync`, `registry-search`, and `registry-recommend` commands.
+- Mixture-of-Experts models are sized by their **total** parameter count (all experts stay resident under Ollama/Metal/vLLM), so a large MoE can no longer falsely "fit" small hardware.
+- Carries the 3.6.1 batch: unified scoring across `check`/`recommend`/`smart-recommend` (#88), high-end/multi-GPU VRAM detection (#95), MCP server hardening (#97), and the Windows interactive-panel fixes (#86).
 ### v3.5.13 Highlights
 - Ships npm packages with a ready-to-use SQLite model catalog:
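
Reviewer note on the Mixture-of-Experts bullet in the hunk above: the sketch below illustrates why sizing by total parameters changes the fit decision. It is not the package's estimator. The function name `estimateResidentGB`, the 0.5 bytes-per-parameter figure (a rough stand-in for 4-bit quantization), and the parameter counts are all illustrative assumptions, and overhead such as the KV cache is ignored.

```javascript
// Hypothetical sketch; not the code in src/models/moe-assumptions.js.
// Assumption: weights cost roughly `bytesPerParam` bytes each.
function estimateResidentGB(paramsB, bytesPerParam = 0.5) {
    // paramsB is in billions of parameters, so the product is already in GB.
    return paramsB * bytesPerParam;
}

// Made-up round numbers for an MoE model: 47B total, 13B active per token.
const moe = { totalParamsB: 47, activeParamsB: 13 };
const availableGB = 16;

const byActive = estimateResidentGB(moe.activeParamsB); // 6.5
const byTotal = estimateResidentGB(moe.totalParamsB);   // 23.5

console.log(`active-only estimate: ${byActive} GB, fits: ${byActive <= availableGB}`);
console.log(`total-params estimate: ${byTotal} GB, fits: ${byTotal <= availableGB}`);
```

Under these assumed numbers the active-only estimate would report a fit on a 16 GB machine, while the total-parameter estimate would not, which matches the behavior the changelog describes.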
@@ -389,6 +398,27 @@ llm-checker search "qwen coder" --json
 | `search <query>` | Search the synced catalog with filters and intelligent scoring |
 | `smart-recommend` | Advanced recommendations using the full scoring engine |
+### Model Registry Commands (v3.7.0+)
+Exact installable/downloadable artifacts from a packaged multi-source registry (Hugging Face + Ollama + GPT4All).
+| Command | Description |
+|---------|-------------|
+| `registry-sync` | Sync the multi-source registry (Hugging Face, Ollama, GPT4All) |
+| `registry-search [query]` | Search exact artifacts with `--source`, `--format`, `--runtime`, `--quant`, `--max-size`, `--min-params`/`--max-params` filters |
+| `registry-recommend [query]` | Recommend the best exact artifacts for your hardware, with `--runtime auto/ollama/vllm/mlx/llama.cpp/transformers` targeting and `--category`/`--optimize` |
+```bash
+# Best coding artifacts across all sources, auto runtime
+llm-checker registry-recommend --category coding
+# Only Apple-native MLX artifacts
+llm-checker registry-recommend --category coding --runtime mlx
+# Search Hugging Face for vLLM-ready reasoning models under 24B
+llm-checker registry-search qwen --source huggingface --runtime vllm --max-params 24
+```
 ### Enterprise Policy Commands
 | Command | Description |
@@ -641,30 +671,36 @@ llm-checker search qwen --quant Q4_K_M --max-size 8
 ## Model Catalog
-LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
+LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog plus a multi-source registry of exact downloadable/installable model artifacts. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
 The packaged snapshot currently includes:
 - 229 Ollama models
 - 7176 variants
+- 3259 multi-source registry repositories
+- 33729 exact model artifacts from Hugging Face, Ollama, and GPT4All
+- Hugging Face top 3000 repositories by downloads, fetched with API pagination
 - pull counts
 - tag counts
 - last-updated metadata
-- variant params, quantization, size, context, and input type fields when available
+- variant params, quantization, size, context, runtime, install commands, download URLs, license/gated flags, tasks, and modalities when available
 Refresh it any time:
 ```bash
 llm-checker sync
+llm-checker registry-sync --sources ollama,huggingface,gpt4all
+llm-checker registry-search qwen --runtime auto --max-size 8
+llm-checker registry-recommend --category coding --runtime auto --max-size 8
 ```
-For release maintainers, the packaged seed can be regenerated from the synced local DB:
+For release maintainers, the packaged seed can be regenerated from the synced local DB and registry APIs:
 ```bash
 npm run sync:seed
 ```
-`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
+`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. `registry-search` queries exact artifacts across sources, and `registry-recommend` ranks exact artifacts from the registry with the deterministic hardware-aware selector. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
 The curated fallback catalog includes 35+ models from the most popular Ollama families:
@@ -836,7 +872,7 @@ LLM Checker uses a deterministic pipeline so the same inputs produce the same ra
 flowchart LR
   subgraph Inputs
     HW["Hardware detector<br/>CPU/GPU/RAM/backend"]
-    REG["Synced SQLite Ollama catalog<br/>(packaged seed + live sync)"]
+    REG["Synced SQLite model catalog<br/>(Ollama seed + multi-source registry)"]
     LOCAL["Installed local models"]
     FLAGS["CLI options<br/>use-case/runtime/limits/policy"]
   end
@@ -952,8 +988,9 @@ src/
     detector.js                # Hardware detection
     unified-detector.js        # Cross-platform detection
   data/
-    model-database.js          # SQLite storage and packaged seed loading
-    seed/models.db             # npm-packaged Ollama catalog snapshot
+    model-database.js          # SQLite storage, registry tables, and packaged seed loading
+    registry-ingestors.js      # Ollama/Hugging Face/GPT4All artifact normalization
+    seed/models.db             # npm-packaged Ollama + multi-source registry snapshot
     sync-manager.js            # Database sync from Ollama registry
 bin/
   enhanced_cli.js              # CLI entry point
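
Reviewer note on the Model Catalog text in this README diff: it describes a fallback order (synced SQLite catalog, then scraped cache, then curated catalog). A minimal sketch of that "first available source wins" pattern follows. The `pickCatalog` function, the loader names, and the return shape are assumptions made for illustration and are not taken from the package.

```javascript
// Hypothetical sketch of an ordered fallback lookup; not the package's API.
function pickCatalog(loaders) {
    for (const { name, load } of loaders) {
        try {
            const models = load();
            if (Array.isArray(models) && models.length > 0) {
                return { source: name, models };
            }
        } catch (error) {
            // An unavailable source is skipped and the next one is tried.
        }
    }
    return { source: 'none', models: [] };
}

const result = pickCatalog([
    { name: 'sqlite', load: () => { throw new Error('models.db missing'); } },
    { name: 'scraped-cache', load: () => [] },
    { name: 'curated', load: () => [{ id: 'example-model' }] }
]);

console.log(result.source); // curated
```

Treating both a thrown error and an empty result as "unavailable" is a design choice of this sketch; the diff does not show how the real code decides that a source cannot be used.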

package/bin/enhanced_cli.js

@@ -61,6 +61,9 @@ const COMMAND_HEADER_LABELS = {
     'smart-recommend': 'Smart Recommend (Experimental)',
     search: 'Model Search',
     sync: 'Database Sync',
+    'registry-sync': 'Model Registry Sync',
+    'registry-search': 'Model Registry Search',
+    'registry-recommend': 'Registry Recommendations',
     'mcp-setup': 'Claude MCP Setup',
     check: 'Compatibility Check',
     installed: 'Installed Models',
@@ -401,6 +404,65 @@ function getRealSizeFromOllamaCache(model) {
     }
 }
+function parsePositiveNumberOption(value, fallback = null) {
+    if (value === undefined || value === null || value === '') return fallback;
+    const parsed = Number(value);
+    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
+}
+// Allowed enum values for the registry commands. Invalid values must be rejected
+// with a clear error instead of silently returning "no results" or falling back
+// to the built-in catalog.
+const REGISTRY_SOURCES = ['ollama', 'huggingface', 'gpt4all'];
+const REGISTRY_FORMATS = ['gguf', 'safetensors', 'mlx', 'ollama', 'pytorch', 'pytorch_bin', 'ggml'];
+const REGISTRY_RUNTIMES = ['auto', 'all', '*', 'ollama', 'llama.cpp', 'transformers', 'vllm', 'mlx'];
+const REGISTRY_OPTIMIZE = ['balanced', 'speed', 'quality', 'context', 'coding'];
+function assertRegistryEnum(label, value, allowed) {
+    if (value === undefined || value === null || value === '') return;
+    if (!allowed.includes(String(value).toLowerCase())) {
+        const shown = allowed.filter((v) => !['all', '*'].includes(v)).join(', ');
+        throw new Error(`Invalid --${label} "${value}". Allowed: ${shown}`);
+    }
+}
+// Throws on the first invalid registry enum option. Returns nothing on success.
+function validateRegistryFilters(options = {}) {
+    assertRegistryEnum('source', options.source, REGISTRY_SOURCES);
+    assertRegistryEnum('format', options.format, REGISTRY_FORMATS);
+    assertRegistryEnum('runtime', options.runtime, REGISTRY_RUNTIMES);
+    assertRegistryEnum('optimize', options.optimize, REGISTRY_OPTIMIZE);
+}
+function truncateMiddle(value, maxLength = 48) {
+    const text = String(value || '');
+    if (text.length <= maxLength) return text;
+    if (maxLength <= 4) return text.slice(0, maxLength);
+    const head = Math.ceil((maxLength - 3) / 2);
+    const tail = Math.floor((maxLength - 3) / 2);
+    return `${text.slice(0, head)}...${text.slice(text.length - tail)}`;
+}
+function formatRegistryNumber(value, suffix = '') {
+    const parsed = Number(value);
+    if (!Number.isFinite(parsed) || parsed <= 0) return '?';
+    const rounded = parsed >= 100 ? Math.round(parsed) : Math.round(parsed * 10) / 10;
+    return `${rounded}${suffix}`;
+}
+function formatRegistrySize(value) {
+    const parsed = Number(value);
+    if (!Number.isFinite(parsed) || parsed <= 0) return '?';
+    return `${Math.round(parsed * 100) / 100}GB`;
+}
+function formatRegistryList(value, maxItems = 3) {
+    const items = Array.isArray(value) ? value : [];
+    if (items.length === 0) return '-';
+    const shown = items.slice(0, maxItems).join(', ');
+    return items.length > maxItems ? `${shown}, +${items.length - maxItems}` : shown;
+}
 const program = new Command();
 program
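
Reviewer note on the option helpers added in the hunk above: the snippet below copies `parsePositiveNumberOption`, `REGISTRY_RUNTIMES`, and `assertRegistryEnum` verbatim from the diff so it runs on its own, then exercises them with sample inputs. The sample values (`'abc'`, `'-5'`, `'tensorrt'`) are made up for the demonstration.

```javascript
// Copied from the hunk above so the example is self-contained.
function parsePositiveNumberOption(value, fallback = null) {
    if (value === undefined || value === null || value === '') return fallback;
    const parsed = Number(value);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

const REGISTRY_RUNTIMES = ['auto', 'all', '*', 'ollama', 'llama.cpp', 'transformers', 'vllm', 'mlx'];

function assertRegistryEnum(label, value, allowed) {
    if (value === undefined || value === null || value === '') return;
    if (!allowed.includes(String(value).toLowerCase())) {
        const shown = allowed.filter((v) => !['all', '*'].includes(v)).join(', ');
        throw new Error(`Invalid --${label} "${value}". Allowed: ${shown}`);
    }
}

// Non-numeric and non-positive inputs resolve to the fallback (null by default).
const limit = parsePositiveNumberOption('abc', 20); // 20
const maxSize = parsePositiveNumberOption('-5');    // null
const params = parsePositiveNumberOption('24');     // 24

// Matching is case-insensitive, so 'MLX' passes without throwing.
assertRegistryEnum('runtime', 'MLX', REGISTRY_RUNTIMES);

// An unknown value throws; 'all' and '*' are accepted but hidden from the message.
let message = null;
try {
    assertRegistryEnum('runtime', 'tensorrt', REGISTRY_RUNTIMES);
} catch (error) {
    message = error.message;
}
console.log(message);
```

One consequence worth noting: an invalid numeric option such as `--limit abc` is not rejected; it falls back to the default without a message, unlike the enum options, which fail loudly.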
@@ -1285,12 +1347,17 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
     const { summary, recommendations } = intelligentData;
     const tier = summary.hardware_tier.replace('_', ' ').toUpperCase();
     const optimizeProfile = (summary.optimize_for || intelligentData.optimizeFor || 'balanced').toUpperCase();
+    const runtimeLabel = (intelligentData.runtime || summary.best_overall?.runtime || 'auto').toUpperCase();
+    const sourceLabel = intelligentData.recommendationSource === 'registry'
+        ? 'Multi-source registry'
+        : 'Ollama catalog';
     const tierColor = tier.includes('HIGH') ? chalk.green : tier.includes('MEDIUM') ? chalk.yellow : chalk.red;
     console.log('\n' + chalk.bgRed.white.bold(' INTELLIGENT RECOMMENDATIONS BY CATEGORY '));
     console.log(chalk.red('╭' + '─'.repeat(65)));
     console.log(chalk.red('│') + ` Hardware Tier: ${tierColor.bold(tier)} | Models Analyzed: ${chalk.cyan.bold(intelligentData.totalModelsAnalyzed)}`);
-    console.log(chalk.red('│') + ` Optimization: ${chalk.magenta.bold(optimizeProfile)}`);
+    console.log(chalk.red('│') + ` Optimization: ${chalk.magenta.bold(optimizeProfile)} | Runtime: ${chalk.cyan.bold(runtimeLabel)}`);
+    console.log(chalk.red('│') + ` Source: ${chalk.white.bold(sourceLabel)}`);
     console.log(chalk.red('│'));
     // Show best overall model
@@ -1301,6 +1368,7 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
         console.log(chalk.red('│') + `    Command: ${chalk.cyan.bold(best.command)}`);
         console.log(chalk.red('│') + `    Score: ${chalk.yellow.bold(best.score)}/100 | Category: ${chalk.magenta(best.category)}`);
         console.log(chalk.red('│') + `    Quantization: ${chalk.white.bold(best.quantization || 'Q4_K_M')}`);
+        console.log(chalk.red('│') + `    Runtime: ${chalk.cyan.bold(best.runtime || intelligentData.runtime || 'ollama')} | Source: ${chalk.gray(best.source || 'unknown')}`);
         console.log(chalk.red('│') + `    Fine-tuning: ${chalk.blue.bold(bestFineTuning.shortLabel)}`);
         console.log(chalk.red('│'));
     }
@@ -1326,6 +1394,7 @@ function displayIntelligentRecommendations(intelligentData, hardware = null) {
         console.log(chalk.red('│') + `    ${chalk.green(model.name)} (${model.size})`);
         console.log(chalk.red('│') + `    Score: ${scoreColor.bold(model.score)}/100 | Pulls: ${chalk.gray(model.pulls?.toLocaleString() || 'N/A')}`);
         console.log(chalk.red('│') + `    Quantization: ${chalk.white.bold(model.quantization || 'Q4_K_M')}`);
+        console.log(chalk.red('│') + `    Runtime: ${chalk.cyan(model.runtime || intelligentData.runtime || 'ollama')} | Source: ${chalk.gray(model.source || 'unknown')}`);
         console.log(chalk.red('│') + `    Fine-tuning: ${chalk.blue.bold(fineTuningSupport.shortLabel)}`);
         console.log(chalk.red('│') + `    Command: ${chalk.cyan.bold(model.command)}`);
         console.log(chalk.red('│'));
@@ -3017,7 +3086,7 @@ auditCommand
     .option('-u, --use-case <case>', 'Use case when --command check is selected', 'general')
     .option('-c, --category <category>', 'Category hint when --command recommend is selected')
     .option('--optimize <profile>', 'Optimization profile for recommend mode (balanced|speed|quality|context|coding)', 'balanced')
-    .option('--runtime <runtime>', `Runtime for check mode (${SUPPORTED_RUNTIMES.join('|')})`, 'ollama')
+    .option('--runtime <runtime>', 'Runtime for check/recommend mode (auto|ollama|vllm|mlx|llama.cpp|transformers)', 'auto')
     .option('--include-cloud', 'Include cloud models in check-mode analysis')
     .option('--max-size <size>', 'Maximum model size for check mode (e.g., "24B" or "12GB")')
     .option('--min-size <size>', 'Minimum model size for check mode (e.g., "3B" or "2GB")')
@@ -3071,13 +3140,14 @@ auditCommand
                 policyCandidates = collectCandidatesFromAnalysis(analysisResult);
             } else {
                 recommendationResult = await checker.generateIntelligentRecommendations(hardware, {
-                    optimizeFor: options.optimize
+                    optimizeFor: options.optimize,
+                    runtime: options.runtime
                 });
                 if (!recommendationResult) {
                     throw new Error('Unable to generate recommendation data for policy audit export.');
                 }
-                runtimeBackend = normalizeRuntime(options.runtime || 'ollama');
+                runtimeBackend = recommendationResult.runtime || options.runtime || 'auto';
                 policyCandidates = collectCandidatesFromRecommendationData(recommendationResult);
             }
@@ -3844,6 +3914,8 @@ program
     .description('Get intelligent model recommendations for your hardware')
     .option('-c, --category <category>', 'Get recommendations for specific category (coding, talking, reading, etc.)')
     .option('--optimize <profile>', 'Optimization profile (balanced|speed|quality|context|coding)', 'balanced')
+    .option('--runtime <runtime>', 'Runtime target for registry recommendations (auto|ollama|vllm|mlx|llama.cpp|transformers)', 'auto')
+    .option('--no-registry', 'Use the legacy Ollama catalog recommendation path')
     .option('--no-verbose', 'Disable step-by-step progress display')
     .option('--policy <file>', 'Evaluate recommendations against a policy file')
     .option('--simulate <profile>', 'Simulate a hardware profile instead of detecting real hardware (use "list" to see profiles)')
@@ -3868,6 +3940,11 @@ Hardware simulation:
   $ llm-checker recommend --simulate m4pro24 --category coding
   $ llm-checker recommend --gpu "RTX 5060" --ram 32 --cpu "AMD Ryzen 7 5700X"
+Registry/runtime examples:
+  $ llm-checker recommend --runtime auto --category coding
+  $ llm-checker recommend --runtime vllm --category coding
+  $ llm-checker recommend --runtime mlx --category general
 Calibrated routing examples:
   $ llm-checker recommend --calibrated --category coding
   $ llm-checker recommend --calibrated ./calibration-policy.yaml --category reasoning
@@ -3945,7 +4022,9 @@ Calibrated routing examples:
             const hardware = await checker.getSystemInfo();
             const intelligentRecommendations = await checker.generateIntelligentRecommendations(hardware, {
-                optimizeFor: options.optimize
+                optimizeFor: options.optimize,
+                runtime: options.runtime,
+                registry: options.registry
             });
             if (!intelligentRecommendations) {
@@ -4729,6 +4808,329 @@ program
         }
     });
+program
+    .command('registry-sync')
+    .description('Sync the multi-source model registry (Ollama, Hugging Face, GPT4All)')
+    .option('-s, --sources <list>', 'Comma-separated sources: ollama,huggingface,gpt4all', 'ollama,huggingface,gpt4all')
+    .option('-l, --limit <n>', 'Fallback maximum records per source')
+    .option('--hf-limit <n>', 'Maximum Hugging Face repos to ingest', '3000')
+    .option('--ollama-limit <n>', 'Maximum Ollama artifacts to ingest', '10000')
+    .option('--gpt4all-limit <n>', 'Maximum GPT4All entries to ingest', '1000')
+    .option('--query <text>', 'Hugging Face search query')
+    .option('--task <task>', 'Hugging Face task/filter, for example text-generation or text-embeddings-inference')
+    .option('--dry-run', 'Fetch and normalize without writing to the database')
+    .option('-q, --quiet', 'Suppress progress output')
+    .option('-j, --json', 'Output as JSON')
+    .action(async (options) => {
+        const quiet = Boolean(options.quiet || options.json);
+        if (!quiet) showAsciiArt('registry-sync');
+        const ModelDatabase = require('../src/data/model-database');
+        const { RegistryIngestor } = require('../src/data/registry-ingestors');
+        const database = new ModelDatabase();
+        const spinner = quiet ? null : ora('Preparing model registry sync...').start();
+        try {
+            await database.initialize();
+            const ingestor = new RegistryIngestor({
+                database,
+                onProgress: (info) => {
+                    if (spinner && info.message) {
+                        spinner.text = info.message;
+                    }
+                }
+            });
+            const summary = await ingestor.ingest({
+                sources: options.sources,
+                limit: parsePositiveNumberOption(options.limit),
+                hfLimit: parsePositiveNumberOption(options.hfLimit, 3000),
+                ollamaLimit: parsePositiveNumberOption(options.ollamaLimit, 10000),
+                gpt4allLimit: parsePositiveNumberOption(options.gpt4allLimit, 1000),
+                query: options.query,
+                task: options.task,
+                dryRun: Boolean(options.dryRun)
+            });
+            const stats = options.dryRun ? null : database.getRegistryStats();
+            if (options.json) {
+                console.log(JSON.stringify({ summary, stats }, null, 2));
+                return;
+            }
+            if (spinner) {
+                const action = options.dryRun ? 'normalized' : 'synced';
+                spinner.succeed(`Registry ${action}: ${summary.repos} repos, ${summary.artifacts} artifacts`);
+            }
+            console.log(chalk.green('\n[OK] Registry sync complete'));
+            console.log(chalk.gray(`  Sources touched: ${summary.sources}`));
+            console.log(chalk.gray(`  Collections: ${summary.collections}`));
+            console.log(chalk.gray(`  Repositories: ${summary.repos}`));
+            console.log(chalk.gray(`  Artifacts: ${summary.artifacts}`));
+            if (stats) {
+                console.log(chalk.blue.bold('\nRegistry totals:'));
+                console.log(chalk.gray(`  Sources: ${stats.sources}`));
+                console.log(chalk.gray(`  Repositories: ${stats.repos}`));
+                console.log(chalk.gray(`  Artifacts: ${stats.artifacts}`));
+                if (stats.bySource.length > 0) {
+                    const rows = [['Source', 'Artifacts']];
+                    for (const item of stats.bySource) {
+                        rows.push([item.source_id, String(item.artifact_count)]);
+                    }
+                    console.log('\n' + table(rows));
+                }
+            }
+            console.log(chalk.cyan('Try: llm-checker registry-search llama --runtime auto --limit 10'));
+        } catch (error) {
+            if (spinner) spinner.fail('Registry sync failed');
+            console.error(chalk.red('Error:'), error.message);
+            if (process.env.DEBUG) console.error(error.stack);
+            process.exitCode = 1;
+        } finally {
+            database.close();
+        }
+    });
+program
+    .command('registry-search [query]')
+    .description('Search exact downloadable/installable artifacts in the multi-source model registry')
+    .option('-s, --source <source>', 'Filter by source: ollama, huggingface, gpt4all')
+    .option('--format <format>', 'Filter by artifact format: gguf, safetensors, mlx, ollama')
+    .option('--runtime <runtime>', 'Filter by runtime support: auto, ollama, llama.cpp, transformers, vllm, mlx')
+    .option('--quant <type>', 'Filter by quantization, for example Q4_K_M or Q8_0')
+    .option('--max-size <gb>', 'Maximum artifact size in GB')
+    .option('--min-params <billion>', 'Minimum parameter count in billions')
+    .option('--max-params <billion>', 'Maximum parameter count in billions')
+    .option('--local-only', 'Exclude gated/auth-required artifacts')
+    .option('-l, --limit <n>', 'Maximum number of results', '20')
+    .option('-j, --json', 'Output as JSON')
+    .action(async (query = '', options) => {
+        try {
+            validateRegistryFilters(options);
+        } catch (validationError) {
+            if (options.json) {
+                console.log(JSON.stringify({ error: validationError.message }, null, 2));
+            } else {
+                console.error(chalk.red(`✗ ${validationError.message}`));
+            }
+            process.exitCode = 1;
+            return;
+        }
+        if (!options.json) showAsciiArt('registry-search');
+        const ModelDatabase = require('../src/data/model-database');
+        const database = new ModelDatabase();
+        try {
+            await database.initialize();
+            const filters = {
+                source: options.source,
+                format: options.format ? String(options.format).toLowerCase() : undefined,
+                runtime: options.runtime,
+                quantization: options.quant,
+                maxSizeGB: parsePositiveNumberOption(options.maxSize),
+                minParamsB: parsePositiveNumberOption(options.minParams),
+                maxParamsB: parsePositiveNumberOption(options.maxParams),
+                localOnly: Boolean(options.localOnly),
+                limit: parsePositiveNumberOption(options.limit, 20)
+            };
+            const results = database.searchModelArtifacts(query, filters);
+            const stats = database.getRegistryStats();
+            if (options.json) {
+                console.log(JSON.stringify({
+                    query,
+                    filters,
+                    count: results.length,
+                    stats,
+                    results
+                }, null, 2));
+                return;
+            }
+            if (results.length === 0) {
+                console.log(chalk.yellow('No registry artifacts found.'));
+                if (stats.artifacts === 0) {
+                    console.log(chalk.gray('Populate the registry first with: llm-checker registry-sync'));
+                }
+                return;
+            }
+            console.log(chalk.blue.bold('\nRegistry Results'));
+            console.log(chalk.gray(`Stored registry: ${stats.artifacts} artifacts across ${stats.sources} sources`));
+            console.log('');
+            const rows = [[
+                'Source',
+                'Model',
+                'Artifact',
+                'Params',
+                'Size',
+                'Format',
+                'Runtime',
+                'Install'
+            ]];
+            for (const item of results) {
+                rows.push([
+                    item.source_id,
+                    truncateMiddle(item.canonical_model_id, 34),
+                    truncateMiddle(item.artifact_name || item.filename, 34),
+                    formatRegistryNumber(item.parameter_count_b, 'B'),
+                    formatRegistrySize(item.size_gb),
+                    item.quantization ? `${item.format}/${item.quantization}` : item.format,
+                    formatRegistryList(item.runtime_support, 2),
+                    truncateMiddle(item.install_command || item.download_url, 46)
+                ]);
+            }
+            console.log(table(rows));
+            const links = results
+                .filter((item) => item.download_url)
+                .slice(0, 5);
+            if (links.length > 0) {
+                console.log(chalk.blue.bold('Exact download links:'));
+                links.forEach((item, index) => {
+                    console.log(chalk.gray(`  ${index + 1}. ${item.canonical_model_id} -> ${item.download_url}`));
+                });
+            }
+        } catch (error) {
+            console.error(chalk.red('Error:'), error.message);
+            if (process.env.DEBUG) console.error(error.stack);
+            process.exitCode = 1;
+        } finally {
+            database.close();
+        }
+    });
+program
+    .command('registry-recommend [query]')
+    .description('Recommend the best exact model artifacts from the multi-source registry for this hardware')
+    .option('-c, --category <category>', 'Task category (general, coding, reasoning, embeddings, multimodal)', 'general')
+    .option('--optimize <profile>', 'Optimization profile (balanced|speed|quality|context|coding)', 'balanced')
+    .option('--runtime <runtime>', 'Runtime target: auto, ollama, llama.cpp, vllm, mlx, transformers', 'auto')
+    .option('-s, --source <source>', 'Filter by source: ollama, huggingface, gpt4all')
+    .option('--format <format>', 'Filter by artifact format: gguf, safetensors, mlx, ollama')
+    .option('--quant <type>', 'Filter by quantization, for example Q4_K_M or Q8_0')
+    .option('--max-size <gb>', 'Maximum artifact size in GB')
+    .option('--min-params <billion>', 'Minimum parameter count in billions')
+    .option('--max-params <billion>', 'Maximum parameter count in billions')
+    .option('--target-context <tokens>', 'Target context window for scoring')
+    .option('--include-gated', 'Include gated/auth-required artifacts')
+    .option('--pool-limit <n>', 'Maximum registry artifacts to score before ranking', '20000')
+    .option('-l, --limit <n>', 'Maximum number of recommendations', '10')
+    .option('-j, --json', 'Output as JSON')
+    .action(async (query = '', options) => {
+        try {
+            validateRegistryFilters(options);
+        } catch (validationError) {
+            if (options.json) {
+                console.log(JSON.stringify({ error: validationError.message }, null, 2));
+            } else {
+                console.error(chalk.red(`✗ ${validationError.message}`));
+            }
+            process.exitCode = 1;
+            return;
+        }
+        if (!options.json) showAsciiArt('registry-recommend');
+        const UnifiedDetector = require('../src/hardware/unified-detector');
+        const { RegistryRecommender } = require('../src/data/registry-recommender');
+        const recommender = new RegistryRecommender();
+        const spinner = options.json ? null : ora('Scoring registry artifacts...').start();
+        try {
+            await recommender.initialize();
+            const detector = new UnifiedDetector();
+            const hardware = await detector.detect();
+            const category = normalizeTaskName(options.category || 'general');
+            const result = await recommender.recommend({
+                query,
+                category,
+                optimizeFor: options.optimize,
+                runtime: options.runtime,
+                source: options.source,
+                format: options.format ? String(options.format).toLowerCase() : undefined,
+                quantization: options.quant,
+                maxSizeGB: parsePositiveNumberOption(options.maxSize),
+                minParamsB: parsePositiveNumberOption(options.minParams),
+                maxParamsB: parsePositiveNumberOption(options.maxParams),
+                targetContext: parsePositiveNumberOption(options.targetContext),
+                localOnly: !options.includeGated,
+                poolLimit: parsePositiveNumberOption(options.poolLimit, 20000),
+                limit: parsePositiveNumberOption(options.limit, 10),
+                hardware
+            });
+            if (options.json) {
+                console.log(JSON.stringify({
+                    query,
+                    hardware: hardware.summary || hardware,
+                    ...result
+                }, null, 2));
+                return;
+            }
+            if (spinner) {
+                spinner.succeed(
+                    `Scored ${result.total_evaluated} candidates from ${result.total_artifacts} registry artifacts`
+                );
+            }
+            if (result.recommendations.length === 0) {
+                console.log(chalk.yellow('No registry recommendations found for those filters.'));
+                if (result.registry.artifacts === 0) {
+                    console.log(chalk.gray('Populate the registry first with: llm-checker registry-sync'));
+                }
+                return;
+            }
+            console.log(chalk.blue.bold('\nRegistry Recommendations'));
+            console.log(chalk.gray(`Registry: ${result.registry.repos} repos, ${result.registry.artifacts} artifacts`));
+            console.log(chalk.gray(`Runtime: ${result.runtime} | Category: ${result.category} | Optimize: ${result.optimizeFor}`));
+            console.log('');
+            const rows = [['#', 'Score', 'Source', 'Model', 'Artifact', 'Params', 'Size', 'Install']];
+            result.recommendations.forEach((item, index) => {
+                rows.push([
+                    String(index + 1),
+                    String(item.score),
+                    item.source,
+                    truncateMiddle(item.model, 30),
+                    truncateMiddle(item.artifact, 32),
+                    formatRegistryNumber(item.params_b, 'B'),
+                    formatRegistrySize(item.size_gb),
+                    truncateMiddle(item.install_command || item.download_url, 44)
+                ]);
+            });
+            console.log(table(rows));
+            console.log(chalk.blue.bold('Top pick:'));
+            const best = result.recommendations[0];
+            console.log(chalk.white.bold(`  ${best.model}`));
+            console.log(chalk.gray(`  Artifact: ${best.artifact}`));
+            console.log(chalk.gray(`  Why: ${best.rationale}`));
+            if (best.install_command) console.log(chalk.cyan(`  ${best.install_command}`));
+            if (best.download_url) console.log(chalk.gray(`  ${best.download_url}`));
+        } catch (error) {
+            if (spinner) spinner.fail('Registry recommendation failed');
+            console.error(chalk.red('Error:'), error.message);
+            if (process.env.DEBUG) console.error(error.stack);
+            process.exitCode = 1;
+        } finally {
+            recommender.close();
+        }
+    });
 program
     .command('search <query>')
     .description('Search models in the database with intelligent scoring')
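
Reviewer note on the table formatting used by `registry-search` and `registry-recommend`: the snippet below copies `truncateMiddle`, `formatRegistrySize`, and `formatRegistryList` verbatim from the earlier hunk in this file and shows what they produce for a few made-up inputs. The input strings and numbers are illustrative only.

```javascript
// Copied from the diff so the example is self-contained.
function truncateMiddle(value, maxLength = 48) {
    const text = String(value || '');
    if (text.length <= maxLength) return text;
    if (maxLength <= 4) return text.slice(0, maxLength);
    const head = Math.ceil((maxLength - 3) / 2);
    const tail = Math.floor((maxLength - 3) / 2);
    return `${text.slice(0, head)}...${text.slice(text.length - tail)}`;
}

function formatRegistrySize(value) {
    const parsed = Number(value);
    if (!Number.isFinite(parsed) || parsed <= 0) return '?';
    return `${Math.round(parsed * 100) / 100}GB`;
}

function formatRegistryList(value, maxItems = 3) {
    const items = Array.isArray(value) ? value : [];
    if (items.length === 0) return '-';
    const shown = items.slice(0, maxItems).join(', ');
    return items.length > maxItems ? `${shown}, +${items.length - maxItems}` : shown;
}

// Long identifiers keep their start and end, with "..." in the middle.
const shortened = truncateMiddle('abcdefghijklmnopqrstuvwxyz', 10);
console.log(shortened); // abcd...xyz

// Sizes are rounded to two decimals; missing values render as "?".
const size = formatRegistrySize(4.683);
const unknownSize = formatRegistrySize(undefined);
console.log(size, unknownSize); // 4.68GB ?

// Lists show the first N items and a count of the rest.
const runtimes = formatRegistryList(['ollama', 'llama.cpp', 'vllm'], 2);
console.log(runtimes); // ollama, llama.cpp, +1
```

Because the Install column is truncated with `truncateMiddle`, a long install command or URL in the table may not be copy-pasteable; `registry-search` compensates by printing up to five full download links below the table, and `registry-recommend` prints the full command for the top pick.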

package/bin/mcp-server.mjs

@@ -290,9 +290,14 @@ const ALLOWED_CLI_COMMANDS = new Set([
   "sync",
   "search",
   "smart-recommend",
+  "registry-sync",
+  "registry-search",
+  "registry-recommend",
   "hw-detect",
 ]);
+export { ALLOWED_CLI_COMMANDS };
 // ============================================================================
 // MCP SERVER
 // ============================================================================
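
Reviewer note on the allowlist change above: the three registry commands are added to `ALLOWED_CLI_COMMANDS`, and the set is now exported. The sketch below shows how such a set is typically used as a guard before a command is dispatched. The `assertAllowedCommand` function is hypothetical, and the set here lists only the entries visible in this hunk; the server's actual dispatch code is not shown in the diff.

```javascript
// Entries visible in the hunk above; the real set may contain more.
const ALLOWED_CLI_COMMANDS = new Set([
    'sync',
    'search',
    'smart-recommend',
    'registry-sync',
    'registry-search',
    'registry-recommend',
    'hw-detect'
]);

// Hypothetical guard: reject any argv whose first element is not allowlisted.
function assertAllowedCommand(argv) {
    const [command] = argv;
    if (!ALLOWED_CLI_COMMANDS.has(command)) {
        throw new Error(`Command not allowed: ${command}`);
    }
    return argv;
}

const accepted = assertAllowedCommand(['registry-search', 'qwen', '--runtime', 'vllm']);

let rejected = null;
try {
    assertAllowedCommand(['rm', '-rf', '/']);
} catch (error) {
    rejected = error.message;
}
console.log(accepted[0], '|', rejected);
```

Exporting the set presumably lets tests assert its contents directly, although the diff does not show which module imports it.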

package/package.json

@@ -1,6 +1,6 @@
 {
   "name": "llm-checker",
-  "version": "3.6.1",
+  "version": "3.7.4",
   "description": "Intelligent CLI tool with AI-powered model selection that analyzes your hardware and recommends optimal LLM models for your system",
   "bin": {
     "llm-checker": "bin/cli.js",
@@ -16,6 +16,10 @@
     "test:ui": "node tests/ui-cli-smoke.test.js",
     "test:runtime": "node tests/runtime-specdec-tests.js",
     "test:deterministic-pool": "node tests/deterministic-model-pool-check.js",
+    "test:registry": "node tests/model-registry-ingestors.test.js",
+    "test:registry-main": "node tests/model-registry-main-flow.test.js",
+    "test:registry-recommender": "node tests/model-registry-recommender.test.js",
+    "test:registry-seed": "node tests/model-registry-seed.test.js",
     "test:policy": "node tests/policy-commands.test.js",
     "test:policy-cli": "node tests/policy-cli-enforcement.js",
     "test:policy-engine": "node tests/policy-engine.test.js",
@@ -36,7 +40,8 @@
     "list-models": "node bin/enhanced_cli.js list-models",
     "ai-check": "node bin/enhanced_cli.js ai-check",
     "ai-run": "node bin/enhanced_cli.js ai-run",
-    "sync:seed": "node bin/enhanced_cli.js sync --force --quiet && node scripts/update-seed-db.js",
+    "sync:seed": "node bin/enhanced_cli.js sync --force --quiet && node scripts/update-seed-db.js && node scripts/update-registry-seed.js",
+    "sync:registry-seed": "node scripts/update-registry-seed.js",
     "benchmark": "cd ml-model && python python/benchmark_collector.py",
     "train-ai": "cd ml-model && python python/train_model.py",
     "postinstall": "echo 'LLM Checker installed. Run: llm-checker hw-detect'"