llm-checker 3.2.8 → 3.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +119 -17
- package/bin/enhanced_cli.js +516 -3
- package/package.json +1 -1
- package/src/calibration/calibration-manager.js +798 -0
- package/src/calibration/policy-routing.js +376 -0
- package/src/calibration/schemas.js +212 -0
- package/src/hardware/backends/cuda-detector.js +355 -5
- package/src/ollama/capacity-planner.js +399 -0
package/README.md (CHANGED)
````diff
@@ -22,8 +22,11 @@
 </p>
 
 <p align="center">
+<a href="#start-here-2-minutes">Start Here</a> •
 <a href="#installation">Installation</a> •
 <a href="#quick-start">Quick Start</a> •
+<a href="#calibration-quick-start-10-minutes">Calibration Quick Start</a> •
+<a href="https://github.com/Pavelevich/llm-checker/tree/main/docs">Docs</a> •
 <a href="#claude-code-mcp">Claude MCP</a> •
 <a href="#commands">Commands</a> •
 <a href="#scoring-system">Scoring</a> •
````
````diff
@@ -54,6 +57,17 @@ Choosing the right LLM for your hardware is complex. With thousands of model var
 
 ---
 
+## Documentation
+
+- [Docs Hub](https://github.com/Pavelevich/llm-checker/tree/main/docs)
+- [Usage Guide](https://github.com/Pavelevich/llm-checker/blob/main/docs/guides/usage-guide.md)
+- [Advanced Usage](https://github.com/Pavelevich/llm-checker/blob/main/docs/guides/advanced-usage.md)
+- [Technical Reference](https://github.com/Pavelevich/llm-checker/blob/main/docs/reference/technical-docs.md)
+- [Changelog](https://github.com/Pavelevich/llm-checker/blob/main/docs/reference/changelog.md)
+- [Calibration Fixtures](https://github.com/Pavelevich/llm-checker/tree/main/docs/fixtures/calibration)
+
+---
+
 ## Comparison with Other Tooling (e.g. `llmfit`)
 
 LLM Checker and `llmfit` solve related but different problems:
````
````diff
@@ -89,6 +103,32 @@ npm install sql.js
 
 ---
 
+## Start Here (2 Minutes)
+
+If you are new, use this exact flow:
+
+```bash
+# 1) Install
+npm install -g llm-checker
+
+# 2) Detect your hardware
+llm-checker hw-detect
+
+# 3) Get recommendations by category
+llm-checker recommend --category coding
+
+# 4) Run with auto-selection
+llm-checker ai-run --category coding --prompt "Write a hello world in Python"
+```
+
+If you already calibrated routing:
+
+```bash
+llm-checker ai-run --calibrated --category coding --prompt "Refactor this function"
+```
+
+---
+
 ## Distribution
 
 LLM Checker is published in all primary channels:
````
````diff
@@ -97,23 +137,16 @@ LLM Checker is published in all primary channels:
 - GitHub Releases: [Release history](https://github.com/Pavelevich/llm-checker/releases)
 - GitHub Packages: [`@pavelevich/llm-checker`](https://github.com/users/Pavelevich/packages/npm/package/llm-checker)
 
-### v3.
-
-- Fixed multimodal recommendation false positives from noisy metadata.
-- Coding-only models with incidental `input_types: image` flags are no longer treated as vision models.
-- Added regression tests to keep multimodal category picks aligned with true vision-capable models.
-
-### v3.2.7 Highlights
+### v3.3.0 Highlights
 
--
-
--
--
--
--
-
-
-- Expanded deterministic and hardware regression coverage for multi-GPU and unified-memory edge cases.
+- Calibrated routing is now first-class in `recommend` and `ai-run`:
+  - `--calibrated [file]` support with default discovery path.
+  - clear precedence: `--policy` > `--calibrated` > deterministic fallback.
+  - routing provenance output (source, route, selected model).
+- New calibration fixtures and end-to-end tests for:
+  - `calibrate --policy-out ...` → `recommend --calibrated ...`
+- Hardened Jetson CUDA detection to avoid false CPU-only fallback.
+- Documentation reorganized under `docs/` with clearer onboarding paths.
 
 ### Optional: Install from GitHub Packages
 
````
````diff
@@ -147,6 +180,51 @@ llm-checker search qwen --use-case coding
 
 ---
 
+## Calibration Quick Start (10 Minutes)
+
+This path produces both calibration artifacts and verifies calibrated routing in one pass.
+
+### 1) Use the sample prompt suite
+
+```bash
+cp ./docs/fixtures/calibration/sample-suite.jsonl ./sample-suite.jsonl
+```
+
+### 2) Generate calibration artifacts (dry-run)
+
+```bash
+mkdir -p ./artifacts
+llm-checker calibrate \
+  --suite ./sample-suite.jsonl \
+  --models qwen2.5-coder:7b llama3.2:3b \
+  --runtime ollama \
+  --objective balanced \
+  --dry-run \
+  --output ./artifacts/calibration-result.json \
+  --policy-out ./artifacts/calibration-policy.yaml
+```
+
+Artifacts created:
+
+- `./artifacts/calibration-result.json` (calibration contract)
+- `./artifacts/calibration-policy.yaml` (routing policy for runtime commands)
+
+### 3) Apply calibrated routing
+
+```bash
+llm-checker recommend --calibrated ./artifacts/calibration-policy.yaml --category coding
+llm-checker ai-run --calibrated ./artifacts/calibration-policy.yaml --category coding --prompt "Refactor this function"
+```
+
+Notes:
+
+- `--policy <file>` has precedence over `--calibrated [file]`.
+- If `--calibrated` has no path, discovery uses `~/.llm-checker/calibration-policy.{yaml,yml,json}`.
+- `--mode full` currently requires `--runtime ollama`.
+- `./docs/fixtures/calibration/sample-generated-policy.yaml` shows the expected policy structure.
+
+---
+
 ## Claude Code MCP
 
 LLM Checker includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) server, allowing **Claude Code** and other MCP-compatible AI assistants to analyze your hardware and manage local models directly.
````
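The calibration flow above consumes a JSONL prompt suite (one JSON object per line). As a rough illustration of how such a suite could be parsed, assuming hypothetical `id`/`category`/`prompt` fields (the real contract lives in the package's `src/calibration/schemas.js`, not here):

```javascript
// Hypothetical JSONL suite reader. Field names (id, category, prompt) are
// illustrative assumptions; see src/calibration/schemas.js for the real schema.
function parseSuite(jsonlText) {
  return jsonlText
    .split("\n")
    .filter((line) => line.trim().length > 0) // skip blank lines
    .map((line, i) => {
      const record = JSON.parse(line);
      if (typeof record.prompt !== "string") {
        throw new Error(`line ${i + 1}: missing "prompt" field`);
      }
      return record;
    });
}

// Example suite: two prompts, one per line, as in sample-suite.jsonl's format.
const sample = [
  '{"id":"c1","category":"coding","prompt":"Write a hello world in Python"}',
  '{"id":"r1","category":"reasoning","prompt":"Summarize this report"}',
].join("\n");
```

Calling `parseSuite(sample)` yields two prompt records that a calibration run could iterate over per model.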
````diff
@@ -229,7 +307,9 @@ Claude will automatically call the right tools and give you actionable results.
 | `hw-detect` | Detect GPU/CPU capabilities, memory, backends |
 | `check` | Full system analysis with compatible models and recommendations |
 | `recommend` | Intelligent recommendations by category (coding, reasoning, multimodal, etc.) |
+| `calibrate` | Generate calibration result + routing policy artifacts from a JSONL prompt suite |
 | `installed` | Rank your installed Ollama models by compatibility |
+| `ollama-plan` | Compute safe Ollama runtime env vars (`NUM_CTX`, `NUM_PARALLEL`, `MAX_LOADED_MODELS`) for selected local models |
 
 ### Advanced Commands (require `sql.js`)
 
````
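The new `ollama-plan` command computes safe runtime values from available memory and the selected models. A toy heuristic sketch only, not the package's actual `src/ollama/capacity-planner.js` logic; the 20% KV-cache headroom and the per-parallel-slot figure are invented for illustration:

```javascript
// Toy capacity heuristic: how many models fit in free memory, and how much
// parallelism is left over. Real logic lives in src/ollama/capacity-planner.js;
// the 1.2x headroom and 2 GB-per-slot constants are illustrative assumptions.
function planOllamaEnv(freeMemGB, modelSizesGB, ctx = 4096) {
  const sorted = [...modelSizesGB].sort((a, b) => a - b); // fit small models first
  let used = 0;
  let loaded = 0;
  for (const size of sorted) {
    if (used + size * 1.2 > freeMemGB) break; // ~20% headroom for KV cache
    used += size * 1.2;
    loaded += 1;
  }
  // Spend leftover memory on parallel request slots, capped at 4.
  const parallel = Math.max(1, Math.floor((freeMemGB - used) / 2));
  return {
    NUM_CTX: ctx,
    MAX_LOADED_MODELS: Math.max(1, loaded),
    NUM_PARALLEL: Math.min(parallel, 4),
  };
}
```

For example, 16 GB free with a 4 GB and an 8 GB model loads both and leaves room for a single parallel slot.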
````diff
@@ -263,6 +343,28 @@ llm-checker check --policy ./policy.yaml --use-case coding --runtime vllm
 llm-checker recommend --policy ./policy.yaml --category coding
 ```
 
+### Calibrated Routing in `recommend` and `ai-run`
+
+`recommend` and `ai-run` now support calibration routing policies generated by `calibrate --policy-out`.
+
+- `--calibrated [file]`:
+  - If `file` is omitted, discovery defaults to `~/.llm-checker/calibration-policy.{yaml,yml,json}`.
+  - `--policy <file>` takes precedence over `--calibrated` for routing resolution.
+- Resolution precedence:
+  - `--policy` (explicit)
+  - `--calibrated` (explicit file or default discovery)
+  - deterministic selector fallback
+- CLI output includes routing provenance (`--policy`, `--calibrated`, or default discovery) and the selected route/model.
+
+Examples:
+
+```bash
+llm-checker recommend --calibrated --category coding
+llm-checker recommend --calibrated ./calibration-policy.yaml --category reasoning
+llm-checker ai-run --calibrated --category coding --prompt "Refactor this function"
+llm-checker ai-run --policy ./calibration-policy.yaml --prompt "Summarize this report"
+```
+
 ### Policy Audit Export
 
 Use `audit export` when you need machine-readable compliance evidence for CI/CD gates, governance reviews, or security tooling.
````
````diff
@@ -722,7 +824,7 @@ LLM Checker is licensed under **NPDL-1.0** (No Paid Distribution License).
 - Free use, modification, and redistribution are allowed.
 - Selling the software or offering it as a paid hosted/API service is not allowed without a separate commercial license.
 
-See [LICENSE](LICENSE) for full terms.
+See [LICENSE](https://github.com/Pavelevich/llm-checker/blob/main/LICENSE) for full terms.
 
 ---
 
````