llm-checker 3.2.3 → 3.2.5

package/README.md CHANGED
@@ -93,14 +93,14 @@ npm install sql.js
 
  LLM Checker is published in all primary channels:
 
- - npm (latest): [`llm-checker@3.2.1`](https://www.npmjs.com/package/llm-checker)
- - GitHub Release: [`v3.2.1` (2026-02-17)](https://github.com/Pavelevich/llm-checker/releases/tag/v3.2.1)
+ - npm (latest): [`llm-checker@3.2.4`](https://www.npmjs.com/package/llm-checker)
+ - GitHub Release: [`v3.2.4`](https://github.com/Pavelevich/llm-checker/releases/tag/v3.2.4)
  - GitHub Packages: [`@pavelevich/llm-checker`](https://github.com/users/Pavelevich/packages/npm/package/llm-checker)
 
- ### v3.2.1 Highlights
+ ### v3.2.4 Highlights
 
- - Added vLLM/MLX runtime support and speculative decoding estimation.
- - Improved GPU detection, added DGX Spark/GB10 support, strengthened Node runtime guards, and updated tooling comparison notes.
+ - Fixed `recommend` hardware-profile handling so discrete VRAM limits are honored consistently.
+ - Added deterministic selector regression coverage for 24GB VRAM fit behavior.
 
  ### Optional: Install from GitHub Packages
 
@@ -110,7 +110,7 @@ echo "@pavelevich:registry=https://npm.pkg.github.com" >> ~/.npmrc
  echo "//npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}" >> ~/.npmrc
 
  # 2) Install
- npm install -g @pavelevich/llm-checker@3.2.1
+ npm install -g @pavelevich/llm-checker@3.2.4
  ```
 
  ---
@@ -266,9 +266,47 @@ llm-checker audit export --policy ./policy.yaml --command check --format all --o
  - `--format all` honors `reporting.formats` in your policy (falls back to `json,csv,sarif`).
  - In `enforce` mode with blocking violations, reports are still written before non-zero exit.
 
+ ### Integration Examples (SIEM / CI Artifacts)
+
+ ```bash
+ # CI artifact (JSON) for post-processing in pipeline jobs
+ llm-checker audit export --policy ./policy.yaml --command check --format json --out ./reports/policy-report.json
+
+ # Flat CSV for SIEM ingestion (Splunk/ELK/DataDog pipelines)
+ llm-checker audit export --policy ./policy.yaml --command check --format csv --out ./reports/policy-report.csv
+
+ # SARIF for security/code-scanning tooling integrations
+ llm-checker audit export --policy ./policy.yaml --command check --format sarif --out ./reports/policy-report.sarif
+ ```
+
+ ### GitHub Actions Policy Gate (Copy-Paste)
+
+ ```yaml
+ name: Policy Gate
+ on: [pull_request]
+
+ jobs:
+   policy-gate:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-node@v4
+         with:
+           node-version: 20
+       - run: npm ci
+       - run: node bin/enhanced_cli.js check --policy ./policy.yaml --runtime ollama --no-verbose
+       - if: always()
+         run: node bin/enhanced_cli.js audit export --policy ./policy.yaml --command check --format all --runtime ollama --no-verbose --out-dir ./policy-reports
+       - if: always()
+         uses: actions/upload-artifact@v4
+         with:
+           name: policy-audit-reports
+           path: ./policy-reports
+ ```
+
  ### Provenance Fields in Reports
 
- Each finding includes normalized model provenance fields:
+ `check`, `recommend`, and `audit export` outputs include normalized model provenance fields:
 
  - `source`
  - `registry`
@@ -276,7 +314,8 @@ Each finding includes normalized model provenance fields:
  - `license`
  - `digest`
 
- If a field is unavailable from model metadata, reports use `"unknown"` instead of omitting the field. This keeps downstream parsers deterministic.
+ If a field is unavailable from model metadata, outputs use `"unknown"` instead of omitting the field. This keeps downstream parsers deterministic.
+ License values are canonicalized for policy checks (for example `MIT License` -> `mit`, `Apache 2.0` -> `apache-2.0`).
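
Review note: the `"unknown"` rule added above (a missing provenance field is filled in rather than omitted) can be illustrated with a minimal JavaScript sketch. `withProvenanceDefaults` and `PROVENANCE_FIELDS` are hypothetical names, not the package's API, and the field list covers only the names visible in this diff.

```javascript
// Hypothetical helper for illustration; not part of llm-checker's published API.
const UNKNOWN_VALUE = 'unknown';

// Only the field names visible in this diff; the README list may contain more.
const PROVENANCE_FIELDS = ['source', 'registry', 'license', 'digest'];

function withProvenanceDefaults(metadata = {}) {
    const provenance = {};
    for (const field of PROVENANCE_FIELDS) {
        const value = metadata?.[field];
        const present = typeof value === 'string' && value.trim() !== '';
        // Always emit the key so downstream parsers see a fixed shape.
        provenance[field] = present ? value.trim() : UNKNOWN_VALUE;
    }
    return provenance;
}

const report = withProvenanceDefaults({ source: 'ollama', license: 'mit' });
console.log(report.registry); // unknown
console.log(report.digest);   // unknown
```

Because every key is always present, a CSV or SARIF consumer can rely on a stable column set instead of checking for missing properties.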
 
  ### AI Commands
 
@@ -359,7 +398,7 @@ llm-checker search qwen --quant Q4_K_M --max-size 8
 
  LLM Checker prioritizes the full scraped Ollama model cache (all families/sizes/variants) and falls back to a built-in curated catalog when cache is unavailable.
 
- The curated fallback catalog includes 35+ models from the most popular Ollama families:
+ The curated fallback catalog includes 35+ models from the most popular Ollama families (used only when the dynamic scraped pool is unavailable):
 
  | Family | Models | Best For |
  |--------|--------|----------|
@@ -481,30 +520,66 @@ The selector automatically picks the best quantization that fits your available
 
  ## Architecture
 
+ LLM Checker uses a deterministic pipeline so the same inputs produce the same ranked output, with explicit policy outcomes for governance workflows.
+
+ ```mermaid
+ flowchart LR
+     subgraph Inputs
+         HW["Hardware detector<br/>CPU/GPU/RAM/backend"]
+         REG["Dynamic Ollama catalog<br/>(curated fallback if unavailable)"]
+         LOCAL["Installed local models"]
+         FLAGS["CLI options<br/>use-case/runtime/limits/policy"]
+     end
+
+     subgraph Pipeline["Selection Pipeline"]
+         NORMALIZE["Normalize and deduplicate model pool"]
+         PROFILE["Hardware profile and memory budget"]
+         FILTER["Use-case/category filtering"]
+         QUANT["Quantization fit selection"]
+         SCORE["Deterministic 4D scoring<br/>Q/S/F/C"]
+         POLICY["Policy evaluation (optional)<br/>audit or enforce"]
+         RANK["Rank and explain candidates"]
+     end
+
+     subgraph Outputs
+         REC["check / recommend output"]
+         AUDIT["audit export<br/>JSON / CSV / SARIF"]
+         RUN["pull/run-ready commands"]
+     end
+
+     REG --> NORMALIZE
+     LOCAL --> NORMALIZE
+     HW --> PROFILE
+     FLAGS --> FILTER
+     FLAGS --> POLICY
+     NORMALIZE --> FILTER
+     PROFILE --> QUANT
+     FILTER --> QUANT
+     QUANT --> SCORE
+     SCORE --> POLICY
+     SCORE --> RANK
+     POLICY --> RANK
+     RANK --> REC
+     POLICY --> AUDIT
+     RANK --> RUN
  ```
- ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
- │    Hardware     │────>│     Model       │────>│  Deterministic  │
- │    Detection    │     │  Catalog (35+)  │     │    Selector     │
- └─────────────────┘     └─────────────────┘     └─────────────────┘
-          │                       │                       │
-    Detects GPU/CPU         JSON catalog +           4D scoring
-    Memory / Backend        Installed models         Per-category weights
-    Usable memory calc      Auto-dedup               Memory calibration
-
-                                                          v
-                                                 ┌─────────────────┐
-                                                 │     Ranked      │
-                                                 │  Recommendations│
-                                                 └─────────────────┘
- ```
 
- **Selector Pipeline:**
- 1. **Hardware profiling** &mdash; CPU, GPU, RAM, acceleration backend
- 2. **Model pool** &mdash; Merge full Ollama scraped pool (or curated fallback) + installed models (deduped)
- 3. **Category filter** &mdash; Keep models relevant to the use case
- 4. **Quantization selection** &mdash; Best quant that fits in memory budget
- 5. **4D scoring** &mdash; Q, S, F, C with category-specific weights
- 6. **Ranking** &mdash; Top N candidates returned
+ ### Component Responsibilities
+
+ - **Input layer**: Collects runtime constraints from hardware detection, local inventory, dynamic registry data, and CLI flags.
+ - **Normalization layer**: Deduplicates identifiers/tags and builds a canonical candidate set.
+ - **Selection layer**: Filters by use case, selects the best fitting quantization, and computes deterministic Q/S/F/C scores.
+ - **Governance layer**: Applies policy rules in `audit` or `enforce` mode and records explicit violation metadata.
+ - **Output layer**: Returns ranked recommendations plus machine-readable compliance artifacts when requested.
+
+ ### Execution Stages
+
+ 1. **Hardware profiling**: Detect CPU/GPU/RAM and effective backend capabilities.
+ 2. **Model pool assembly**: Merge dynamic scraped catalog (or curated fallback) with locally installed models.
+ 3. **Candidate filtering**: Keep only relevant models for the requested use case.
+ 4. **Fit selection**: Choose the best quantization for available memory budget.
+ 5. **Deterministic scoring**: Score each candidate across quality, speed, fit, and context.
+ 6. **Policy + ranking**: Apply optional policy checks, then rank and return actionable commands.
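
Review note: stages 5 and 6 above (deterministic scoring, then ranking) can be sketched roughly as follows. The weights, the 0-100 score scale, the candidate shape, and the name-based tie-breaker are all assumptions made for illustration; the package keeps its real weights in `scoring-config.js`, which this diff does not show.

```javascript
// Hypothetical integer weights (percent) over the four dimensions: Q/S/F/C.
const WEIGHTS = { quality: 40, speed: 30, fit: 20, context: 10 };

// Weighted average of the four component scores, each assumed to be 0-100.
function scoreCandidate(candidate, weights = WEIGHTS) {
    const total =
        weights.quality * candidate.quality +
        weights.speed * candidate.speed +
        weights.fit * candidate.fit +
        weights.context * candidate.context;
    return total / 100;
}

// Sort by score (descending), then by name, so ties never depend on input order.
function rankCandidates(candidates, topN = 3) {
    return candidates
        .map((candidate) => ({ ...candidate, score: scoreCandidate(candidate) }))
        .sort((a, b) => {
            if (b.score !== a.score) return b.score - a.score;
            return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
        })
        .slice(0, topN);
}

const pool = [
    { name: 'b-model', quality: 80, speed: 60, fit: 90, context: 50 },
    { name: 'c-model', quality: 90, speed: 70, fit: 90, context: 50 },
    { name: 'a-model', quality: 80, speed: 60, fit: 90, context: 50 },
];
console.log(rankCandidates(pool).map((c) => `${c.name}:${c.score}`).join(', '));
// c-model:80, a-model:73, b-model:73
```

The explicit tie-breaker is the point of the sketch: two candidates with identical scores still come out in a fixed order, which is one way to get "same inputs, same ranked output" regardless of how the pool was assembled.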
 
  ---
 
@@ -559,7 +634,7 @@ src/
  deterministic-selector.js # Primary selection algorithm
  scoring-config.js # Centralized scoring weights
  scoring-engine.js # Advanced scoring (smart-recommend)
- catalog.json # Curated fallback catalog (35+ models)
+ catalog.json # Curated fallback catalog (35+ models, only if dynamic pool unavailable)
  ai/
  multi-objective-selector.js # Multi-objective optimization
  ai-check-selector.js # LLM-based evaluation
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "llm-checker",
-   "version": "3.2.3",
+   "version": "3.2.5",
    "description": "Intelligent CLI tool with AI-powered model selection that analyzes your hardware and recommends optimal LLM models for your system",
    "bin": {
      "llm-checker": "bin/cli.js",
@@ -10,14 +10,15 @@
    "main": "src/index.js",
    "scripts": {
      "test": "node tests/run-all-tests.js",
-     "test:gpu": "node tests/gpu-detection/multi-gpu.test.js",
-     "test:platform": "node tests/platform-tests/cross-platform.test.js",
-     "test:ui": "node tests/ui-tests/interface.test.js",
+     "test:gpu": "node tests/amd-gpu-detection.test.js",
+     "test:platform": "node tests/hardware-simulation-tests.js",
+     "test:ui": "node tests/ui-cli-smoke.test.js",
      "test:runtime": "node tests/runtime-specdec-tests.js",
      "test:deterministic-pool": "node tests/deterministic-model-pool-check.js",
      "test:policy": "node tests/policy-commands.test.js",
      "test:policy-cli": "node tests/policy-cli-enforcement.js",
      "test:policy-engine": "node tests/policy-engine.test.js",
+     "test:policy-audit": "node tests/policy-audit-reporter.test.js",
      "test:policy-e2e": "node tests/policy-e2e-integration.test.js",
      "test:hardware-detector": "node tests/hardware-detector-regression.js",
      "test:all": "node tests/run-all-tests.js",
@@ -128,6 +128,95 @@ class DeterministicModelSelector {
          return hardware;
      }
 
+     /**
+      * Normalize hardware shape coming from different detectors/callers.
+      * Ensures deterministic selector always has:
+      * - memory.totalGB
+      * - gpu.vramGB
+      * - acceleration.supports_*
+      */
+     normalizeHardwareProfile(input = {}) {
+         const toNumber = (value) => {
+             if (typeof value === 'number' && Number.isFinite(value)) return value;
+             if (typeof value === 'string' && value.trim() !== '' && Number.isFinite(Number(value))) {
+                 return Number(value);
+             }
+             return null;
+         };
+
+         const cpu = input.cpu || {};
+         const gpu = input.gpu || {};
+         const memory = input.memory || {};
+         const acceleration = input.acceleration || {};
+
+         const totalMemGB =
+             toNumber(memory.totalGB) ??
+             toNumber(memory.total) ??
+             toNumber(input.total_ram_gb) ??
+             toNumber(input.memoryGB) ??
+             8;
+
+         const usableMemGB =
+             toNumber(input.usableMemGB) ??
+             Math.max(1, Math.min(0.8 * totalMemGB, totalMemGB - 2));
+
+         const vramGB =
+             toNumber(gpu.vramGB) ??
+             toNumber(gpu.vram) ??
+             toNumber(gpu.totalVRAM) ??
+             toNumber(gpu.vramPerGPU) ??
+             0;
+
+         const modelHints = `${gpu.model || ''} ${gpu.vendor || ''} ${gpu.type || ''}`.toLowerCase();
+         const inferredUnified =
+             Boolean(gpu.unified) ||
+             /apple|m1|m2|m3|m4|unified/.test(modelHints);
+
+         let gpuType = gpu.type;
+         if (!gpuType) {
+             if (inferredUnified) gpuType = 'apple_silicon';
+             else if (/nvidia|rtx|gtx|tesla|quadro/.test(modelHints)) gpuType = 'nvidia';
+             else if (/amd|radeon|rx |instinct/.test(modelHints)) gpuType = 'amd';
+             else gpuType = 'cpu_only';
+         }
+
+         const normalizedAcceleration = {
+             supports_metal:
+                 typeof acceleration.supports_metal === 'boolean'
+                     ? acceleration.supports_metal
+                     : gpuType === 'apple_silicon',
+             supports_cuda:
+                 typeof acceleration.supports_cuda === 'boolean'
+                     ? acceleration.supports_cuda
+                     : gpuType === 'nvidia',
+             supports_rocm:
+                 typeof acceleration.supports_rocm === 'boolean'
+                     ? acceleration.supports_rocm
+                     : gpuType === 'amd'
+         };
+
+         return {
+             ...input,
+             cpu: {
+                 ...cpu,
+                 architecture: cpu.architecture || cpu.arch || process.arch || 'x86_64',
+                 cores: toNumber(cpu.cores) ?? toNumber(cpu.physicalCores) ?? 4
+             },
+             gpu: {
+                 ...gpu,
+                 type: gpuType,
+                 vramGB,
+                 unified: inferredUnified
+             },
+             memory: {
+                 ...memory,
+                 totalGB: totalMemGB
+             },
+             acceleration: normalizedAcceleration,
+             usableMemGB
+         };
+     }
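
Review note: a worked example of the defaults in `normalizeHardwareProfile`, written as a standalone JavaScript sketch. `toNumber` is copied from the added method; `defaultUsableMemGB` is a hypothetical name given here to the inline `usableMemGB` fallback expression.

```javascript
// Copy of the coercion helper from the added method.
const toNumber = (value) => {
    if (typeof value === 'number' && Number.isFinite(value)) return value;
    if (typeof value === 'string' && value.trim() !== '' && Number.isFinite(Number(value))) {
        return Number(value);
    }
    return null;
};

// Same expression as the usableMemGB fallback: the smaller of 80% of RAM
// and (RAM - 2 GB), but never less than 1 GB.
const defaultUsableMemGB = (totalMemGB) =>
    Math.max(1, Math.min(0.8 * totalMemGB, totalMemGB - 2));

for (const total of [2, 8, 10, 32]) {
    console.log(`${total} GB total -> ${defaultUsableMemGB(total)} GB usable`);
}
// 2 -> 1, 8 -> 6, 10 -> 8, 32 -> 25.6

// Numeric strings are coerced; non-numeric strings yield null, so `??`
// moves on to the next candidate field or the final default.
console.log(toNumber('24') ?? 0);  // 24
console.log(toNumber('n/a') ?? 0); // 0
```

The two limits cross at 10 GB: below that, the 2 GB reserve is the binding constraint; above it, the 80% cap is.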
+
      async getCPUInfo() {
          const os = require('os');
          return {
@@ -810,7 +899,8 @@ class DeterministicModelSelector {
          }
 
          // Phase 0: Gather data
-         const hardware = providedHardware || await this.getHardware();
+         const detectedHardware = providedHardware || await this.getHardware();
+         const hardware = this.normalizeHardwareProfile(detectedHardware);
          const installed = Array.isArray(installedModels) ? installedModels : await this.getInstalledModels();
          const externalPool = Array.isArray(modelPool) && modelPool.length > 0
              ? this.normalizeExternalModels(modelPool)
@@ -1299,22 +1389,22 @@ class DeterministicModelSelector {
          };
      }
 
-     mapHardwareTier(hardware) {
+     mapHardwareTier(hardware = {}) {
          let ram, cores;
 
-         if (hardware.memory && hardware.memory.totalGB) {
+         if (hardware?.memory?.totalGB) {
              ram = hardware.memory.totalGB;
-         } else if (hardware.memory && hardware.memory.total) {
+         } else if (hardware?.memory?.total) {
              ram = hardware.memory.total;
-         } else if (hardware.total_ram_gb) {
+         } else if (hardware?.total_ram_gb) {
              ram = hardware.total_ram_gb;
          } else {
              ram = 8;
          }
 
-         if (hardware.cpu && hardware.cpu.cores) {
+         if (hardware?.cpu?.cores) {
              cores = hardware.cpu.cores;
-         } else if (hardware.cpu_cores) {
+         } else if (hardware?.cpu_cores) {
              cores = hardware.cpu_cores;
          } else {
              cores = 4;
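
Review note: the default parameter and the optional chaining in this hunk cover different cases, which the sketch below tries to show. `resolveRamGB` and `legacyRamGB` are hypothetical standalone functions that mirror the new and old guard styles; they are not the package's methods.

```javascript
// New style: `= {}` handles a missing argument, `?.` also handles null.
function resolveRamGB(hardware = {}) {
    if (hardware?.memory?.totalGB) return hardware.memory.totalGB;
    if (hardware?.memory?.total) return hardware.memory.total;
    if (hardware?.total_ram_gb) return hardware.total_ram_gb;
    return 8; // same fallback as the diff
}

// Old style: property access on the argument itself is unguarded.
function legacyRamGB(hardware) {
    if (hardware.memory && hardware.memory.totalGB) return hardware.memory.totalGB;
    return 8;
}

console.log(resolveRamGB());                          // 8 (default parameter applies)
console.log(resolveRamGB(null));                      // 8 (default does not apply, ?. does)
console.log(resolveRamGB({ memory: { total: 16 } })); // 16

try {
    legacyRamGB(null);
} catch (error) {
    console.log(error instanceof TypeError);          // true
}
```

A default parameter only replaces `undefined`, so a caller passing `null` would still have crashed the old guards; the optional chaining is what makes that path safe.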
@@ -1366,6 +1456,7 @@ class DeterministicModelSelector {
          const recommendations = {};
          const normalizedPool = this.normalizeExternalModels(Array.isArray(allModels) ? allModels : []);
          const installedModels = await this.getInstalledModels();
+         const normalizedHardware = this.normalizeHardwareProfile(hardware || await this.getHardware());
 
          for (const category of categories) {
              try {
@@ -1373,19 +1464,20 @@ class DeterministicModelSelector {
                      topN: 3,
                      enableProbe: false,
                      silent: true,
+                     hardware: normalizedHardware,
                      installedModels,
                      modelPool: normalizedPool
                  });
 
                  recommendations[category] = {
-                     tier: this.mapHardwareTier(hardware),
+                     tier: this.mapHardwareTier(normalizedHardware),
                      bestModels: result.candidates.map(candidate => this.mapCandidateToLegacyFormat(candidate)),
                      totalEvaluated: result.total_evaluated,
                      category: this.getCategoryInfo(category)
                  };
              } catch (error) {
                  recommendations[category] = {
-                     tier: this.mapHardwareTier(hardware),
+                     tier: this.mapHardwareTier(normalizedHardware),
                      bestModels: [],
                      totalEvaluated: 0,
                      category: this.getCategoryInfo(category)
@@ -1,3 +1,5 @@
+ const { normalizeLicense, UNKNOWN_VALUE } = require('../provenance/model-provenance');
+
  const NOOP_POLICY = {
      version: 1,
      org: 'default',
@@ -271,8 +273,8 @@ class PolicyEngine {
          if (!isPlainObject(complianceRules)) return;
 
          const approvedLicenses = asArray(complianceRules.approved_licenses)
-             .map((license) => toLowerString(license))
-             .filter(Boolean);
+             .map((license) => normalizeLicense(license))
+             .filter((license) => license && license !== UNKNOWN_VALUE);
 
          if (approvedLicenses.length === 0) return;
 
@@ -410,11 +412,17 @@ class PolicyEngine {
 
      getModelLicense(model) {
          const raw =
-             toLowerString(model.license) ||
-             toLowerString(model.license_id) ||
-             toLowerString(model.licenseId);
+             model?.provenance?.license ??
+             model?.license ??
+             model?.license_id ??
+             model?.licenseId;
+         const normalized = normalizeLicense(raw);
+
+         if (!normalized || normalized === UNKNOWN_VALUE) {
+             return null;
+         }
 
-         return raw;
+         return normalized;
      }
 
      resolveBackend(model, context) {
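
Review note: the two policy-engine hunks above normalize both the policy's `approved_licenses` list and the model's license before comparing them. The sketch below shows why that matters. `canonicalizeLicense`, its alias table, and `isLicenseApproved` are hypothetical stand-ins: the real `normalizeLicense` in `../provenance/model-provenance` is not part of this diff, and treating an unknown license as "not approved" is an assumption.

```javascript
// Hypothetical stand-in for normalizeLicense; covers only the README examples.
const UNKNOWN_VALUE = 'unknown';
const ALIASES = new Map([
    ['mit license', 'mit'],
    ['apache 2.0', 'apache-2.0'],
]);

function canonicalizeLicense(raw) {
    if (typeof raw !== 'string') return UNKNOWN_VALUE;
    const key = raw.trim().toLowerCase().replace(/\s+/g, ' ');
    if (key === '') return UNKNOWN_VALUE;
    return ALIASES.get(key) ?? key;
}

// Normalize both sides, drop unusable policy entries, then compare.
function isLicenseApproved(modelLicense, approvedLicenses) {
    const approved = new Set(
        approvedLicenses
            .map((license) => canonicalizeLicense(license))
            .filter((license) => license !== UNKNOWN_VALUE)
    );
    const license = canonicalizeLicense(modelLicense);
    // Assumption: a model whose license cannot be determined never matches.
    return license !== UNKNOWN_VALUE && approved.has(license);
}

console.log(isLicenseApproved('Apache 2.0', ['apache-2.0', 'MIT License'])); // true
console.log(isLicenseApproved('MIT', ['MIT License']));                      // true
console.log(isLicenseApproved(undefined, ['mit']));                          // false
```

With plain lowercasing, as in the removed `toLowerString` calls, `Apache 2.0` in model metadata would become `apache 2.0` and fail to match a policy entry written as `apache-2.0`.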