pi-free 2.0.1 → 2.0.2

package/CHANGELOG.md CHANGED
@@ -7,9 +7,87 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [2.0.2] - 2026-04-26
11
+
12
+ ### Added
13
+
14
+ - **Model matching debug logging** — Added `~/.pi/modelmatch.log` to diagnose which models get Coding Index scores and which don't:
15
+ - Logs every matching attempt with provider, model ID, normalization strategy, and result
16
+ - CSV-like format: `timestamp|provider|modelId|modelName|action|strategy|normalizedId|matchKey|codingIndex|details`
17
+ - Provider-specific normalizers for better matching:
18
+ - **NVIDIA**: Strips vendor prefixes (`meta/`, `mistralai/`, `microsoft/`, `qwen/`, etc.)
19
+ - **Cloudflare**: Strips `@cf/namespace/` prefixes
20
+ - **Groq**: Removes `-versatile` and numeric context suffixes (`-32768`)
21
+ - **Cerebras**: Normalizes `llama3.1` → `llama-3.1`, auto-adds `instruct` suffix
22
+ - **Mistral**: Strips `-latest` suffix
23
+ - **Ollama**: Converts `model:tag` → `model-tag`
24
+ - Common suffix stripping: `:free`, date codes (`-20250514`), versions (`-v1.1`), `-it`, `-fp8`/`-bf16`
25
+
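The normalization rules listed above can be sketched as a small lookup of per-provider functions followed by a shared suffix pass. This is a minimal illustration, not the package's implementation: `normalizeModelId`, the regular expressions, and the order of the two passes are assumptions. The Cerebras rule is left out because the changelog does not spell out when the `instruct` suffix is added.

```typescript
type Normalizer = (id: string) => string;

// Hypothetical per-provider rules, written from the changelog's descriptions.
const providerNormalizers = new Map<string, Normalizer>([
  // NVIDIA: drop a leading vendor prefix such as "meta/" or "mistralai/".
  ["nvidia", (id) => id.replace(/^[^/]+\//, "")],
  // Cloudflare: drop "@cf/<namespace>/".
  ["cloudflare", (id) => id.replace(/^@cf\/[^/]+\//, "")],
  // Groq: drop "-versatile" and a trailing numeric context size like "-32768".
  ["groq", (id) => id.replace(/-versatile$/, "").replace(/-\d{4,}$/, "")],
  // Mistral: drop "-latest".
  ["mistral", (id) => id.replace(/-latest$/, "")],
  // Ollama: "model:tag" becomes "model-tag".
  ["ollama", (id) => id.replace(/:/g, "-")],
]);

// Common suffixes: ":free", date codes, versions, "-it", "-fp8" / "-bf16".
const commonSuffixes: RegExp[] = [
  /:free$/,
  /-\d{8}$/,
  /-v\d+(\.\d+)*$/,
  /-it$/,
  /-(fp8|bf16)$/,
];

function stripCommonSuffixes(id: string): string {
  let current = id;
  let changed = true;
  // Repeat until stable so stacked suffixes (e.g. "-it-fp8") are all removed.
  while (changed) {
    changed = false;
    for (const re of commonSuffixes) {
      const next = current.replace(re, "");
      if (next !== current) {
        current = next;
        changed = true;
      }
    }
  }
  return current;
}

function normalizeModelId(provider: string, modelId: string): string {
  const lower = modelId.toLowerCase();
  const specific = providerNormalizers.get(provider);
  return stripCommonSuffixes(specific ? specific(lower) : lower);
}
```

Running the provider pass first means an Ollama ID ending in `:free` would become `-free` and survive the suffix pass; a real implementation would need to pick an order deliberately.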
26
+ - **Enhanced benchmark lookup** — `enhanceModelNameWithCodingIndex()` now accepts optional `provider` parameter for provider-aware normalization
27
+
28
+ - **Static 404 model blocklist for NVIDIA** — Probed all 136 models from `integrate.api.nvidia.com/v1/models` and identified 57 that return 404 "Function not found" on `/v1/chat/completions`. These are now hard-filtered so they never appear in the model selector:
29
+ - Covers discontinued models (`databricks/dbrx-instruct`, `meta/codellama-70b`, `meta/llama2-70b`, `ibm/granite-*`, etc.)
30
+ - Covers embedding-only models listed as chat-capable (`nvidia/nv-embed-v1`, `nvidia/nv-embedqa-*`, `snowflake/arctic-embed-l`, etc.)
31
+ - Covers stale API catalog entries (`mistralai/mistral-large`, `mistralai/mistral-large-2-instruct`, `writer/palmyra-*`, etc.)
32
+ - Full list in `NVIDIA_KNOWN_404_MODELS` in `providers/nvidia/nvidia.ts`
33
+
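A static blocklist of this kind usually reduces to a set-membership filter over the fetched model list. The sketch below assumes exact-ID matching and a hypothetical `filterKnown404` helper; the IDs are a few of those named above, and the real `NVIDIA_KNOWN_404_MODELS` list is longer.

```typescript
// A handful of IDs taken from the changelog entry; the full list lives in the package.
const KNOWN_404_MODELS: ReadonlySet<string> = new Set([
  "databricks/dbrx-instruct",
  "meta/codellama-70b",
  "meta/llama2-70b",
  "nvidia/nv-embed-v1",
  "snowflake/arctic-embed-l",
  "mistralai/mistral-large",
]);

interface ModelEntry {
  id: string;
}

// Drop every model whose ID is on the blocklist, preserving order.
function filterKnown404<T extends ModelEntry>(models: readonly T[]): T[] {
  return models.filter((model) => !KNOWN_404_MODELS.has(model.id));
}
```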
34
+ - **`/probe-nvidia` command** — On-demand model health check. Tests every registered NVIDIA model with a minimal `max_tokens: 1` request, auto-hides any new 404s in `~/.pi/free.json`, and re-registers the provider immediately.
35
+
36
+ - **`scripts/probe-nvidia.mjs`** — Standalone Node.js script to reproduce the probe. Reads `~/.pi/free.json` for the API key, batches 20 requests at a time with 10s timeout, and prints all broken model IDs for adding to the blocklist.
37
+
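The batching described for the probe script can be approximated as follows. This is a sketch under assumptions: `findBrokenModels` and `ProbeFn` are hypothetical names, the probe is injected so the example runs without network access, and a probe that rejects (for example on timeout) is treated as inconclusive rather than broken.

```typescript
// Resolves to the HTTP status returned for a minimal chat request to the model.
type ProbeFn = (modelId: string) => Promise<number>;

async function findBrokenModels(
  modelIds: readonly string[],
  probe: ProbeFn,
  batchSize = 20,
  brokenStatus = 404,
): Promise<string[]> {
  const broken: string[] = [];
  for (let i = 0; i < modelIds.length; i += batchSize) {
    const batch = modelIds.slice(i, i + batchSize);
    // Run one batch concurrently; map failures to -1 so one bad probe
    // does not reject the whole batch.
    const statuses = await Promise.all(
      batch.map((id) => probe(id).catch(() => -1)),
    );
    batch.forEach((id, j) => {
      if (statuses[j] === brokenStatus) broken.push(id);
    });
  }
  return broken;
}
```

For the Ollama Cloud variant, the same helper would be called with `brokenStatus` set to 403.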
38
+ - **Ollama Cloud 403 handling** — Same pattern as NVIDIA 404s for Ollama Cloud:
39
+ - `OLLAMA_KNOWN_403_MODELS` blocklist for models that return 403 "access denied"
40
+ - `/probe-ollama` command to test all models on-demand, auto-hide broken ones, and re-register
41
+ - `scripts/probe-ollama.mjs` standalone script for blocklist maintenance
42
+
43
+ - **Provider-scoped hidden models** — Hidden models are now provider-specific:
44
+ - Format: `"provider/model-id"` (e.g., `"ollama/kimi-k2.6"`, `"nvidia/broken-model"`)
45
+ - A model hidden from one provider doesn't hide it from other providers
46
+ - Backward compatible with old global `"model-id"` format
47
+ - All providers updated: NVIDIA, Ollama, Cloudflare, Cline, Kilo, Modal
48
+
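One way to read the scoping rule: an entry hides a model if it equals either the scoped `"provider/model-id"` string or the bare legacy `"model-id"`. The helper below is a hypothetical sketch of that check, not the package's code.

```typescript
// Returns true when the hidden list contains the provider-scoped entry
// or the legacy global entry for this model.
function isModelHidden(
  provider: string,
  modelId: string,
  hidden: readonly string[],
): boolean {
  const scoped = `${provider}/${modelId}`;
  return hidden.some((entry) => entry === scoped || entry === modelId);
}
```

Because some model IDs already contain a slash (NVIDIA's `meta/llama2-70b`, for instance), a legacy entry can look like a scoped one; exact matching against both forms sidesteps parsing the entry, at the cost of that ambiguity.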
49
+ ### Fixed
50
+
51
+ - **Probe commands timeout handling** — Added `fetchWithTimeout` with 10-second timeout to `/probe-nvidia` and `/probe-ollama` commands. Prevents the coding harness from freezing when individual model probe requests hang indefinitely.
52
+
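A timeout wrapper of this kind is commonly built on `AbortController`. The signature below is an assumption: the package's `fetchWithTimeout` presumably wraps `fetch` directly, whereas this sketch takes the fetch function as a parameter so it can be exercised without a network.

```typescript
// Minimal shape of a fetch-like function; the global fetch is compatible.
type FetchLike = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<{ status: number }>;

async function fetchWithTimeout(
  fetchImpl: FetchLike,
  url: string,
  timeoutMs = 10_000,
): Promise<{ status: number }> {
  const controller = new AbortController();
  // Abort the request if it has not settled within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so a fast response does not leave it pending.
    clearTimeout(timer);
  }
}
```

With the global `fetch` (Node 18+), a hung probe rejects with an abort error after the timeout instead of blocking the caller indefinitely.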
53
+ - **NVIDIA provider now sends `authHeader: true`** — Explicitly enables `Authorization: Bearer` header injection. Previously relied on pi's implicit behavior which could fail in some configurations.
54
+
55
+ ### Removed
56
+
57
+ - **NVIDIA 404 model warning log** — Removed the `console.warn("[nvidia] Skipping known 404 model: ...")` output when filtering out known broken models. The filter still works silently; use `/probe-nvidia` to identify new 404s if needed.
58
+
59
+ ### Changed
60
+
61
+ - **Cloudflare provider now fetches models dynamically** — Replaced static 19-model hardcoded list with live API fetch from `api.cloudflare.com/client/v4/accounts/{account_id}/ai/models`:
62
+ - Automatically discovers all 30+ text generation models (the static list covered only 19 and was maintained by hand)
63
+ - Smart filtering excludes embeddings, image generation, speech, translation, and vision-only models via regex patterns
64
+ - Metadata inference from model IDs: detects vision (`vision`/`multimodal`), reasoning (`r1`/`thinking`/`qwq`), context windows, and estimated costs
65
+ - Fixed Mistral Small ID: changed from incorrect `@cf/mistralai/...` to correct `@cf/mistral/...`
66
+ - Added new fallback models: Kimi K2.6, OpenAI GPT-OSS 120B/20B, Qwen 2.5 Coder 32B, QwQ 32B, Llama 3.2 11B Vision
67
+ - Graceful fallback to expanded 18-model hardcoded list if API fetch fails
68
+
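The ID-based inference can be illustrated with two regular expressions built from the keywords listed above. `inferCapabilities` is a hypothetical name, and the word-boundary handling around `r1` is an assumption added to avoid matching unrelated substrings.

```typescript
interface InferredCapabilities {
  vision: boolean;
  reasoning: boolean;
}

function inferCapabilities(modelId: string): InferredCapabilities {
  const id = modelId.toLowerCase();
  return {
    // "vision" or "multimodal" anywhere in the ID.
    vision: /vision|multimodal/.test(id),
    // "r1" as a standalone token, or "thinking" / "qwq" anywhere.
    reasoning: /(^|[^a-z0-9])r1([^a-z0-9]|$)|thinking|qwq/.test(id),
  };
}
```

Heuristics like this misclassify models whose names do not advertise their capabilities, which is presumably why the changelog describes the result as inferred metadata.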
69
+ - **NVIDIA provider now queries NVIDIA's API directly** — Source of truth switched from `models.dev` curated JSON to `https://integrate.api.nvidia.com/v1/models`:
70
+ - Eliminates 57 missing models and 25 stale entries from the old third-party source
71
+ - Models not in `models.dev` get inferred metadata (128k context, 4k output, vision/reasoning heuristics)
72
+ - Added regex-based non-chat model filtering for unknown models (embeddings, whisper, reward models, safety guards, parsers, detectors, etc.)
73
+ - Graceful fallback to `models.dev` if NVIDIA API is unreachable
74
+ - Removed paid/free toggle filtering — NVIDIA is freemium (all models use free credits)
75
+
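Combining the live model list with curated metadata might look like the sketch below. The names (`buildModelList`, `ModelMeta`) are hypothetical, and the exact numbers behind "128k context, 4k output" are assumed to be 128,000 and 4,096 tokens.

```typescript
interface ModelMeta {
  contextWindow: number;
  maxTokens: number;
}

// Assumed defaults for models the curated source does not know about.
const INFERRED_DEFAULTS: ModelMeta = { contextWindow: 128_000, maxTokens: 4_096 };

// Attach curated metadata where available, inferred defaults otherwise.
function buildModelList(
  apiModelIds: readonly string[],
  curated: ReadonlyMap<string, ModelMeta>,
): Array<ModelMeta & { id: string }> {
  return apiModelIds.map((id) => ({
    id,
    ...(curated.get(id) ?? INFERRED_DEFAULTS),
  }));
}
```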
76
+ ## [2.0.2] - 2026-04-24
77
+
78
+ ### Fixed
79
+
80
+ - **Provider toggle state now persists reliably** — Follow-up fixes to the new `toggle-{provider}` flow ensure saved free-vs-all preferences are restored consistently across sessions for built-in and extension-managed providers.
81
+ - **Config parse errors are now logged** — Invalid `~/.pi/free.json` content is no longer ignored silently; startup parse failures are written to `~/.pi/free.log` to make misconfiguration easier to diagnose.
82
+
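Surfacing parse failures instead of swallowing them can be sketched as below. `parseConfig` and the injected `logError` callback are hypothetical; in the package the message is written to `~/.pi/free.log`, which is replaced here by a callback so the example has no file-system side effects.

```typescript
// Parse raw config text; on failure, report the reason and fall back to {}.
function parseConfig(
  raw: string,
  logError: (message: string) => void,
): Record<string, unknown> {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
    logError("config root must be a JSON object");
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    logError(`failed to parse config: ${reason}`);
  }
  return {};
}
```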
83
+ ### Changed
84
+
85
+ - **README refreshed** — Clarified that provider toggle changes apply immediately, persist across restarts, and that malformed config is surfaced in the extension log.
86
+
10
87
  ## [2.0.1] - 2026-04-24
11
88
 
12
89
  ### Added
90
+
13
91
  - **Built-in provider toggle support** (`lib/built-in-toggle.ts`) — Enables free/paid filtering for Pi's built-in providers that expose per-model pricing:
14
92
  - **OpenCode (`/toggle-opencode`)** — Captures built-in OpenCode models on session start and filters to free-only by default
15
93
  - **OpenRouter (`/toggle-openrouter`)** — Now uses the built-in toggle system for consistency
@@ -17,6 +95,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
17
95
  - Persisted via `opencode_show_paid` and `openrouter_show_paid` in `~/.pi/free.json`
18
96
 
19
97
  ### Changed
98
+
20
99
  - **OpenRouter moved to built-in toggle system** — OpenRouter is now handled by `lib/built-in-toggle.ts` alongside OpenCode for a unified approach:
21
100
  - Removed from `providers/dynamic-built-in/index.ts`
22
101
  - Eliminated duplicate toggle command registration logic
@@ -35,16 +114,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
35
114
  - `/toggle-opencode` (new)
36
115
 
37
116
  ### Fixed
117
+
38
118
  - **Ollama Cloud model fetching endpoint** — Corrected the `/v1/models` → `/models` endpoint path in `providers/ollama/ollama.ts`:
39
119
  - The previous fix (2.0.0) incorrectly requested `/v1/models`; with `BASE_URL_OLLAMA` already ending in `/v1`, the path for listing models is `/models`
40
120
  - This ensures model fetching works correctly with the OpenAI-compatible API
41
121
 
42
122
  ### Removed
123
+
43
124
  - **Global `/free` command** — Removed the global free-only toggle. Per-provider toggles (`/toggle-{provider}`) are now the only way to switch between free and paid models. The `/free-providers` status command remains.
44
125
 
45
126
  ## [2.0.0] - 2026-04-23
46
127
 
47
128
  ### Breaking Changes
129
+
48
130
  - **Removed Fireworks provider** — Fireworks is now a built-in Pi provider (added in pi 0.68.1), so the extension's Fireworks provider has been removed to avoid conflicts:
49
131
  - Deleted `providers/fireworks/fireworks.ts` and `tests/fireworks.test.ts`
50
132
  - Removed all Fireworks configuration options from `config.ts` (`fireworks_api_key`, `fireworks_show_paid`)
@@ -55,15 +137,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
55
137
  - All log messages and documentation now reference "Ollama Cloud"
56
138
 
57
139
  ### Removed
140
+
58
141
  - **Dropped `@sinclair/typebox` peer dependency** — Pi 0.69.0 migrated from `@sinclair/typebox` to `typebox` 1.x. The extension didn't directly import this package, so it was removed from `peerDependencies` to avoid potential conflicts.
59
142
 
60
143
  ### Fixed
144
+
61
145
  - **Ollama Cloud API endpoint** — Fixed broken Ollama Cloud integration:
62
146
  - Changed `BASE_URL_OLLAMA` from `https://ollama.com` to `https://ollama.com/v1` — the OpenAI-compatible API endpoint
63
147
  - Fixed model fetching to use `/v1/models` instead of `/api/tags` — ensures model IDs work with chat completions endpoint
64
148
  - Previously, calls went to the HTML homepage instead of the API endpoints, causing 404 errors
65
149
 
66
150
  ### Removed
151
+
67
152
  - **Removed paid model warning on selection** — Deleted the `model_select` event handler that showed:
68
153
  - `⚠️ Paid model selected (${model.id}). Use "/free off" to enable paid models.`
69
154
  - This warning was redundant since the global `/free` toggle and provider toggles already control model visibility
@@ -73,12 +158,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
73
158
  - Modal provider now sets `skipToggle: true` to prevent toggle command creation
74
159
 
75
160
  ### Changed
161
+
76
162
  - **Marked Qwen provider as fully deprecated** — Updated messaging to clarify the provider is broken:
77
163
  - Changed model name from `"Qwen Coder — Free 1k/day"` to `"Qwen Coder — DEPRECATED (free tier discontinued)"`
78
164
  - Updated all JSDoc comments to clearly state auth is broken and free tier is no longer available
79
165
  - Provider remains for backward compatibility but should not be used
80
166
 
81
167
  ### Added
168
+
82
169
  - **Cloudflare Workers AI provider** — New provider for Cloudflare's serverless GPU platform:
83
170
  - 50+ open-source models: Llama 4, Mistral Small 3.1, Qwen 2.5/3, DeepSeek R1, Gemma 4, Kimi K2.5/2.6, and more
84
171
  - **10,000 Neurons/day FREE tier** (resets daily at 00:00 UTC)
@@ -103,12 +190,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
103
190
  - Providers register via `registerWithGlobalToggle()` for unified filtering
104
191
 
105
192
  ### Fixed
193
+
106
194
  - **Toggle commands now actually filter models from UI** — Previously, toggle commands only showed notifications but didn't remove paid models from the model picker:
107
195
  - **OpenRouter (`/openrouter-toggle`)**: Now uses `registerProvider`/`unregisterProvider` to actually filter models from the picker UI
108
196
  - **NVIDIA (`/nvidia-toggle`)**: Added dynamic `showPaid` parameter to `fetchNvidiaModels()` so toggle properly switches between free and paid model sets
109
197
  - **Fireworks**: Removed broken toggle command — all models are paid with no free tier, so there was nothing to toggle
110
198
 
111
199
  ### Added
200
+
112
201
  - **OpenRouter per-provider free model toggle** — Added `/openrouter-toggle` command for the built-in OpenRouter provider:
113
202
  - `/openrouter-toggle` — Switch between showing only free models vs all models (including paid)
114
203
  - New config flag `openrouter_show_paid` in `~/.pi/free.json` (default: `false`)
@@ -116,6 +205,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
116
205
  - This brings OpenRouter (a built-in pi provider) in line with extension providers that have per-provider toggles
117
206
 
118
207
  ### Deprecated
208
+
119
209
  - **Qwen provider** — The 1,000 requests/day free tier is no longer available from Qwen/DashScope. The provider code remains for backward compatibility but is now deprecated:
120
210
  - Added `@deprecated` JSDoc tags to all Qwen-related exports
121
211
  - Added deprecation warning when Qwen provider loads
@@ -123,11 +213,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
123
213
  - Consider migrating to other free providers: Kilo, Cline, NVIDIA, or Modal
124
214
 
125
215
  ### Added
216
+
126
217
  - **Go provider** — OpenCode Go subscription gateway (⚠️ paid only — $5 first month, then $10/month, no free tier) with models: GLM-5, Kimi K2.5, MiMo-V2-Pro, MiMo-V2-Omni, MiniMax M2.7, MiniMax M2.5
127
218
  - Set `OPENCODE_GO_API_KEY` or `opencode_go_api_key` in `~/.pi/free.json`
128
219
  - Toggle with `/go-toggle`
129
220
 
130
221
  ### Fixed
222
+
131
223
  - **All providers now show Coding Index scores in model selector** — Added `enhanceWithCI()` to factory-based providers (nvidia, fireworks, mistral, modal, ollama) and cline. Now all providers display CI scores in `/models` command (pi-models extension).
132
224
 
133
225
  - **All providers now show in `--list-models`** — Providers (zen, openrouter, go) that registered models only in `session_start` were missing from `pi --list-models` which runs before session starts. Added immediate registration for these providers:
@@ -137,22 +229,26 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
137
229
  - All 11 providers now visible in `--list-models`
138
230
 
139
231
  ### Changed
232
+
140
233
  - Updated README with clear free vs paid provider distinction (9 free + 2 paid-only: Go, Fireworks)
141
234
  - Added Go and Fireworks provider documentation under new "💳 Paid-Only Providers" section
142
235
  - Added `opencode_go_api_key` to config file template
143
236
  - Updated package.json description and keywords to include all 11 providers
144
237
 
145
238
  ### Added
239
+
146
240
  - **Provider model cache** (`lib/provider-cache.ts`) — New utility for caching provider model lists to `~/.pi/provider-cache.json`. Used by zen provider for faster startup and offline access after first successful fetch.
147
241
 
148
242
  ## [1.0.9] - 2026-04-14
149
243
 
150
244
  ### Fixed
245
+
151
246
  - **Qwen OAuth breaks other OAuth providers** — `modifyModels` receives all models across every registered provider, not just Qwen's. The previous `map()` stamped the Qwen dashscope `baseUrl` onto every model, causing other OAuth providers (Kilo, OpenRouter, etc.) to return 404 after a `/login qwen` flow. Now only models with `provider === PROVIDER_QWEN` are patched; others pass through unchanged.
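The shape of this fix can be shown with a small sketch. The `Model` interface, the `patchQwenModels` name, and the value of `PROVIDER_QWEN` are assumptions for illustration; only the guard on `provider === PROVIDER_QWEN` comes from the entry above.

```typescript
interface Model {
  id: string;
  provider: string;
  baseUrl: string;
}

// Assumed value; the changelog only names the constant.
const PROVIDER_QWEN = "qwen";

// Patch the base URL on Qwen models only; every other model passes through unchanged.
function patchQwenModels(models: readonly Model[], qwenBaseUrl: string): Model[] {
  return models.map((model) =>
    model.provider === PROVIDER_QWEN ? { ...model, baseUrl: qwenBaseUrl } : model,
  );
}
```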
152
247
 
153
248
  ## [1.0.8] - 2026-04-13
154
249
 
155
250
  ### Added
251
+
156
252
  - **Modal provider** — Free access to GLM-5.1 FP8 (128k context, 16k max output) during promotional period (free until April 30, 2026)
157
253
  - Requires a free Modal API key (`MODAL_API_KEY` or `modal_api_key` in `~/.pi/free.json`)
158
254
  - Model: `zai-org/GLM-5.1-FP8` — 128k context window, 16k max output tokens
@@ -162,6 +258,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
162
258
  - 131k context window, 16k max output tokens, zero cost
163
259
 
164
260
  ### Fixed
261
+
165
262
  - **Qwen OAuth browser launch on Windows** — URLs with `&` query params were truncated by `cmd.exe`'s `&` command separator; switched to `powershell.exe Start-Process` which passes the URL as a literal string
166
263
  - **Qwen API endpoint** — Replicates qwen-code's `getCurrentEndpoint()` logic: uses `resource_url` from OAuth token response (`dashscope.aliyuncs.com` for Chinese accounts, `portal.qwen.ai` for international), with fallback to `dashscope.aliyuncs.com/compatible-mode/v1`
167
264
  - **Qwen DashScope headers** — Added all headers required by DashScope's OpenAI-compatible API: `X-DashScope-AuthType: qwen-oauth`, `X-DashScope-CacheControl: enable`, `X-DashScope-UserAgent`, `Client-Code: QwenCode`
@@ -170,6 +267,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
170
267
  ## [1.0.5] - 2025-04-03
171
268
 
172
269
  ### Fixed
270
+
173
271
  - **NVIDIA provider non-chat model filtering** (comment/implementation mismatch)
174
272
  - Added modalities-based filtering to exclude embedding, speech-to-text, OCR, and image-gen models
175
273
  - Filters models where `output` is not `["text"]` (e.g., image generation like `black-forest-labs/flux.1-dev`)
@@ -180,6 +278,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
180
278
  ## [1.0.4] - 2025-04-03
181
279
 
182
280
  ### Fixed
281
+
183
282
  - **All tests now passing** (127/127)
184
283
  - Fixed mock paths in kilo.test.ts, zen.test.ts, ollama.test.ts
185
284
  - Fixed createCtxReRegister mocks in zen.test.ts and openrouter.test.ts
@@ -187,6 +286,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
187
286
  - Added missing DEFAULT_MIN_SIZE_B constant to openrouter mock
188
287
 
189
288
  ### Changed
289
+
190
290
  - **Code quality improvements**
191
291
  - Refactored usage modules to break circular dependency (limits.ts ↔ formatters.ts)
192
292
  - Created usage/types.ts with shared interfaces (FreeTierLimit, FreeTierUsage)
@@ -195,12 +295,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
195
295
  ## [1.0.3] - 2025-04-03
196
296
 
197
297
  ### Changed
298
+
198
299
  - Updated package.json metadata (name, description, keywords, repository URL)
199
300
  - Updated .npmignore for cleaner publishes
200
301
 
201
302
  ## [1.0.0] - 2024-03-28
202
303
 
203
304
  ### Added
305
+
204
306
  - Initial release with 6 providers: Kilo, Zen, OpenRouter, NVIDIA, Cline, Fireworks
205
307
  - Free tier usage tracking across all sessions
206
308
  - Provider failover with model hopping
@@ -210,12 +312,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
210
312
  - Hardcoded benchmark data from Artificial Analysis
211
313
 
212
314
  ### Changed
213
- - **Major refactoring**: Split free-tier-limits.ts into usage/* modules
315
+
316
+ - **Major refactoring**: Split free-tier-limits.ts into usage/\* modules
214
317
  - usage/tracking.ts - runtime session tracking
215
318
  - usage/cumulative.ts - persistent storage
216
319
  - usage/formatters.ts - display formatting
217
320
  - 77% line reduction (741 → 166 lines)
218
- - **Major refactoring**: Split usage-widget.ts into widget/* modules
321
+ - **Major refactoring**: Split usage-widget.ts into widget/\* modules
219
322
  - widget/data.ts - data collection
220
323
  - widget/format.ts - formatting utilities
221
324
  - widget/render.ts - HTML generation
@@ -240,6 +343,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
240
343
  - Added resetUsageStats() for test isolation
241
344
 
242
345
  ### Fixed
346
+
243
347
  - fetchWithRetry() now properly throws after exhausting retries
244
348
  - Auth error pattern matching now handles more message variants
245
349
  - Test isolation for free-tier-limits tests
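The retry fix in the first item amounts to rethrowing the last error once the attempt budget is spent, rather than returning silently. The generic helper below is a hypothetical sketch of that behavior, not the package's `fetchWithRetry`; it omits backoff delays for brevity.

```typescript
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Remember the failure and try again while attempts remain.
      lastError = err;
    }
  }
  // All attempts failed: surface the final error to the caller.
  throw lastError;
}
```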