npm - llm-checker - Versions diffs - 3.5.15 → 3.7.0 - Mend

llm-checker 3.5.15 → 3.7.0

This diff shows the contents of publicly available package versions as they appear in the supported public registries. It is provided for informational purposes only.

Files changed (39)

package/README.md +28 -8
package/analyzer/compatibility.js +5 -0
package/analyzer/performance.js +5 -4
package/bin/cli.js +5 -39
package/bin/enhanced_cli.js +449 -24
package/bin/mcp-server.mjs +266 -101
package/package.json +13 -8
package/src/ai/multi-objective-selector.js +118 -11
package/src/calibration/calibration-manager.js +4 -1
package/src/data/model-database.js +489 -5
package/src/data/registry-ingestors.js +751 -0
package/src/data/registry-recommender.js +514 -0
package/src/data/seed/README.md +11 -3
package/src/data/seed/models.db +0 -0
package/src/data/sync-manager.js +32 -18
package/src/hardware/backends/apple-silicon.js +5 -1
package/src/hardware/backends/cuda-detector.js +47 -19
package/src/hardware/backends/intel-detector.js +6 -2
package/src/hardware/backends/rocm-detector.js +6 -2
package/src/hardware/detector.js +57 -30
package/src/hardware/unified-detector.js +129 -25
package/src/index.js +68 -4
package/src/models/ai-check-selector.js +36 -5
package/src/models/deterministic-selector.js +179 -18
package/src/models/expanded_database.js +9 -5
package/src/models/intelligent-selector.js +87 -1
package/src/models/moe-assumptions.js +11 -0
package/src/models/requirements.js +16 -11
package/src/models/scoring-core.js +341 -0
package/src/models/scoring-engine.js +9 -2
package/src/ollama/capacity-planner.js +15 -2
package/src/ollama/client.js +70 -30
package/src/ollama/enhanced-client.js +20 -2
package/src/ollama/manager.js +14 -2
package/src/policy/cli-policy.js +8 -2
package/src/policy/policy-engine.js +2 -1
package/src/provenance/model-provenance.js +4 -1
package/src/ui/cli-theme.js +47 -7
package/src/ui/interactive-panel.js +162 -24

package/README.md CHANGED

@@ -573,6 +573,19 @@ This makes integrated GPUs visible even when the selected runtime backend is sti
 llm-checker recommend
 ```
+As of the scoring unification (#96), `check`, `recommend`, and `smart-recommend`
+all derive their ranking from **one canonical scoring core**
+(`DeterministicModelSelector` via `src/models/scoring-core.js`), so identical
+`(model, hardware)` inputs score identically across all three and the
+high-capacity right-sizing floor applies everywhere. They differ only in their
+model **source** and **presentation**, not in how a given model is ranked:
+| Command | Role | Ranking core |
+|---------|------|--------------|
+| `recommend` | Canonical model recommendations by category | Shared core (reference output) |
+| `check` | Full hardware-compatibility report with a recommendation card | Shared core (consistent ranking, fit-oriented report) |
+| `smart-recommend` | Catalog/DB-backed recommendations with a detailed score breakdown | Shared core (same ordering + scores) |
 Use optimization profiles to steer ranking by intent:
 ```bash
@@ -628,30 +641,36 @@ llm-checker search qwen --quant Q4_K_M --max-size 8
 ## Model Catalog
-LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
+LLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog plus a multi-source registry of exact downloadable/installable model artifacts. On first run, that snapshot is copied to `~/.llm-checker/models.db`, so recommendations and catalog search work immediately after npm install.
 The packaged snapshot currently includes:
 - 229 Ollama models
 - 7176 variants
+- 3259 multi-source registry repositories
+- 33729 exact model artifacts from Hugging Face, Ollama, and GPT4All
+- Hugging Face top 3000 repositories by downloads, fetched with API pagination
 - pull counts
 - tag counts
 - last-updated metadata
-- variant params, quantization, size, context, and input type fields when available
+- variant params, quantization, size, context, runtime, install commands, download URLs, license/gated flags, tasks, and modalities when available
 Refresh it any time:
 ```bash
 llm-checker sync
+llm-checker registry-sync --sources ollama,huggingface,gpt4all
+llm-checker registry-search qwen --runtime auto --max-size 8
+llm-checker registry-recommend --category coding --runtime auto --max-size 8
 ```
-For release maintainers, the packaged seed can be regenerated from the synced local DB:
+For release maintainers, the packaged seed can be regenerated from the synced local DB and registry APIs:
 ```bash
 npm run sync:seed
 ```
-`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
+`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. `registry-search` queries exact artifacts across sources, and `registry-recommend` ranks exact artifacts from the registry with the deterministic hardware-aware selector. If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.
 The curated fallback catalog includes 35+ models from the most popular Ollama families:
@@ -695,7 +714,7 @@ Three scoring systems are available, each optimized for different workflows:
 | `reasoning` | 60% | 10% | 20% | 10% |
 | `multimodal` | 50% | 15% | 20% | 15% |
-**Scoring Engine** (used by `smart-recommend` and `search`):
+**Scoring Engine** (used by `search` for catalog scoring; `smart-recommend`'s final ranking is produced by the shared scoring core &mdash; see #96):
 | Use Case | Quality | Speed | Fit | Context |
 |----------|:-------:|:-----:|:---:|:-------:|
@@ -823,7 +842,7 @@ LLM Checker uses a deterministic pipeline so the same inputs produce the same ra
 flowchart LR
   subgraph Inputs
     HW["Hardware detector<br/>CPU/GPU/RAM/backend"]
-    REG["Synced SQLite Ollama catalog<br/>(packaged seed + live sync)"]
+    REG["Synced SQLite model catalog<br/>(Ollama seed + multi-source registry)"]
     LOCAL["Installed local models"]
     FLAGS["CLI options<br/>use-case/runtime/limits/policy"]
   end
@@ -939,8 +958,9 @@ src/
     detector.js                # Hardware detection
     unified-detector.js        # Cross-platform detection
   data/
-    model-database.js          # SQLite storage and packaged seed loading
-    seed/models.db             # npm-packaged Ollama catalog snapshot
+    model-database.js          # SQLite storage, registry tables, and packaged seed loading
+    registry-ingestors.js      # Ollama/Hugging Face/GPT4All artifact normalization
+    seed/models.db             # npm-packaged Ollama + multi-source registry snapshot
     sync-manager.js            # Database sync from Ollama registry
 bin/
   enhanced_cli.js              # CLI entry point
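
The README text in this diff keeps the catalog lookup order unchanged: the synced SQLite catalog first, then the scraped cache, then the curated catalog. A minimal sketch of that fallback order follows. The helper `firstAvailableCatalog` and the three loader stubs are hypothetical names used only for illustration, and "unavailable" is assumed to mean that a loader throws or returns nothing usable; the package's real loading code is not shown in this diff.

```javascript
// Hypothetical helper, not part of llm-checker: try each catalog source in
// order and return the first one that yields a non-empty model list.
function firstAvailableCatalog(loaders) {
    for (const { name, load } of loaders) {
        try {
            const models = load();
            if (Array.isArray(models) && models.length > 0) {
                return { source: name, models };
            }
        } catch (error) {
            // Assumption: a throwing loader means "source unavailable"; try the next one.
        }
    }
    return { source: 'none', models: [] };
}

// Stub loaders standing in for the three sources named in the README.
const result = firstAvailableCatalog([
    { name: 'sqlite', load: () => { throw new Error('models.db missing'); } },
    { name: 'scraped-cache', load: () => null },
    { name: 'curated', load: () => [{ id: 'example-model' }] }
]);

console.log(result.source); // prints "curated"
```

Keeping the order in a single list makes the precedence explicit and easy to extend if another source is added later.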

package/analyzer/compatibility.js CHANGED

@@ -428,6 +428,11 @@ class CompatibilityAnalyzer {
     }
     parseModelSize(sizeString) {
+        // Guard non-string input (undefined / a numeric size) — this runs for every
+        // model in calculateModelCompatibility, so one bad entry must not crash the
+        // whole analysis. Matches the guard in analyzer/performance.js.
+        if (typeof sizeString !== 'string' || !sizeString.trim()) return 1;
         const match = sizeString.match(/(\d+\.?\d*)[BM]/i);
         if (!match) return 1;
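
To see what the new guard changes, here is a standalone sketch of the function. The guard, the regex, and the `return 1` default are taken from the hunk above; everything after the match (treating `M` as millions and converting to billions) is an assumption, since the rest of the method is not shown in this diff.

```javascript
// Standalone sketch of the guarded parser from the hunk above.
function parseModelSize(sizeString) {
    // From the diff: non-string or blank input falls back to the default of 1.
    if (typeof sizeString !== 'string' || !sizeString.trim()) return 1;

    const match = sizeString.match(/(\d+\.?\d*)[BM]/i);
    if (!match) return 1;

    // Assumption (not shown in the diff): normalize to billions of parameters.
    const value = parseFloat(match[1]);
    const unit = match[0].slice(-1).toUpperCase();
    return unit === 'M' ? value / 1000 : value;
}

console.log(parseModelSize(undefined)); // 1 (would throw a TypeError without the guard)
console.log(parseModelSize(7));         // 1 (numbers have no .match method)
console.log(parseModelSize('   '));     // 1
console.log(parseModelSize('7B'));      // 7
```

Without the guard, the first two calls would throw on `sizeString.match`, which is the crash the added comment describes.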

package/analyzer/performance.js CHANGED

@@ -363,12 +363,13 @@ class PerformanceAnalyzer {
     }
     estimateLoadTime(model, hardware) {
+        // ~2 GB per 1B params (fp16-ish) on-disk approximation.
         const modelSizeGB = this.parseModelSize(model.size) * 2;
-        let loadTimeSeconds = modelSizeGB * 2;
-        loadTimeSeconds *= 0.7;
+        // Fold the previous `* 2` then `* 0.7` two-step (a leftover from an
+        // incomplete edit, with a dead blank line) into one documented factor:
+        // ~1.4 s of load time per GB before hardware adjustments.
+        let loadTimeSeconds = modelSizeGB * 1.4;
         const cpuSpeedFactor = Math.max(0.5, Math.min(1.5, (hardware.cpu.speed || 2.5) / 2.5));
         loadTimeSeconds /= cpuSpeedFactor;
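
The arithmetic in this hunk can be checked in isolation. The sketch below reproduces only the steps visible above (size approximation, the folded 1.4 factor, and the clamped CPU speed factor). The function name `estimateBaseLoadSeconds` and its flat parameters are hypothetical, and any later hardware adjustments in the real method are omitted because the diff does not show them.

```javascript
// Sketch of the visible part of estimateLoadTime, with plain numeric inputs.
function estimateBaseLoadSeconds(paramsInBillions, cpuSpeedGHz) {
    // ~2 GB per 1B params, as in the diff comment.
    const modelSizeGB = paramsInBillions * 2;

    // New single factor: ~1.4 s per GB (previously `* 2` followed by `* 0.7`).
    const loadTimeSeconds = modelSizeGB * 1.4;

    // CPU speed relative to a 2.5 GHz baseline, clamped to [0.5, 1.5].
    const cpuSpeedFactor = Math.max(0.5, Math.min(1.5, (cpuSpeedGHz || 2.5) / 2.5));
    return loadTimeSeconds / cpuSpeedFactor;
}

// The old two-step form, kept here only to show that the refactor is behavior-preserving.
function oldBaseLoadSeconds(paramsInBillions) {
    let loadTimeSeconds = paramsInBillions * 2 * 2;
    loadTimeSeconds *= 0.7;
    return loadTimeSeconds;
}

console.log(estimateBaseLoadSeconds(7, 2.5).toFixed(1)); // baseline CPU
console.log(estimateBaseLoadSeconds(7, 5.0).toFixed(1)); // fast CPU, factor capped at 1.5
console.log(estimateBaseLoadSeconds(7, 1.0).toFixed(1)); // slow CPU, factor floored at 0.5
console.log(oldBaseLoadSeconds(7).toFixed(1));           // matches the baseline case
```

For a 7B model this gives roughly 19.6 s at the baseline, and the clamp bounds the result between about two thirds and twice that value regardless of the reported clock speed.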

package/bin/cli.js CHANGED

@@ -11,44 +11,10 @@ if (!Number.isFinite(majorNodeVersion) || majorNodeVersion < 16) {
     process.exit(1);
 }
-function preprocessAiCheckModelsArg(argv) {
-    const normalizedArgs = [];
-    let modelsFilter = null;
-    let sawAiCheck = false;
-    for (let index = 0; index < argv.length; index += 1) {
-        const token = argv[index];
-        if (token === 'ai-check') {
-            sawAiCheck = true;
-        }
-        if (sawAiCheck && token === '--models') {
-            const nextToken = argv[index + 1];
-            if (nextToken && !nextToken.startsWith('-')) {
-                modelsFilter = nextToken;
-                index += 1;
-            }
-            continue;
-        }
-        if (sawAiCheck && token.startsWith('--models=')) {
-            modelsFilter = token.slice('--models='.length);
-            continue;
-        }
-        normalizedArgs.push(token);
-    }
-    return { args: normalizedArgs, modelsFilter };
-}
-const preprocessedArgs = preprocessAiCheckModelsArg(process.argv.slice(2));
-if (typeof preprocessedArgs.modelsFilter === 'string' && preprocessedArgs.modelsFilter.trim()) {
-    process.env.LLM_CHECKER_AI_CHECK_MODELS = preprocessedArgs.modelsFilter.trim();
-}
-process.argv = [process.argv[0], process.argv[1], ...preprocessedArgs.args];
+// `ai-check --models <list>` is now a real commander option handled in
+// enhanced_cli.js (and the AICheckSelector applies it as a candidate filter), so
+// the previous argv-rewriting shim — which stripped the flag and stashed it in an
+// env var that nothing read — is gone. LLM_CHECKER_AI_CHECK_MODELS still works as
+// an explicit fallback for the same filter.
 require('./enhanced_cli');
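
The new comment says the `--models` value is handled as a regular option and that `LLM_CHECKER_AI_CHECK_MODELS` remains a fallback for the same filter. One way that resolution could look is sketched below. The helper name `resolveModelsFilter`, the comma-separated list format, and the precedence (option before environment variable) are all assumptions; the actual handling lives in `enhanced_cli.js` and `AICheckSelector`, which are not shown in this part of the diff.

```javascript
// Hypothetical helper, not part of llm-checker: pick the models filter from
// the CLI option if present, otherwise from the environment variable.
function resolveModelsFilter(optionValue, env = process.env) {
    const hasOption = typeof optionValue === 'string' && optionValue.trim() !== '';
    const raw = hasOption ? optionValue : env.LLM_CHECKER_AI_CHECK_MODELS;

    if (typeof raw !== 'string' || !raw.trim()) return [];

    // Assumption: the list is comma-separated; blank entries are dropped.
    return raw
        .split(',')
        .map((name) => name.trim())
        .filter(Boolean);
}

console.log(resolveModelsFilter('qwen2.5, llama3', {}));
console.log(resolveModelsFilter(undefined, { LLM_CHECKER_AI_CHECK_MODELS: 'mistral' }));
console.log(resolveModelsFilter(undefined, {}));
```

Passing `env` as a parameter keeps the helper testable without mutating `process.env`, which the removed shim did as a side effect.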