pi-free 2.0.0 → 2.0.2

package/CHANGELOG.md CHANGED
@@ -5,9 +5,128 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [Unreleased]
9
+
10
+ ## [2.0.2] - 2026-04-26
11
+
12
+ ### Added
13
+
14
+ - **Model matching debug logging** — Added `~/.pi/modelmatch.log` to diagnose which models get Coding Index scores and which don't:
15
+ - Logs every matching attempt with provider, model ID, normalization strategy, and result
16
+ - CSV-like format: `timestamp|provider|modelId|modelName|action|strategy|normalizedId|matchKey|codingIndex|details`
17
+ - Provider-specific normalizers for better matching:
18
+ - **NVIDIA**: Strips vendor prefixes (`meta/`, `mistralai/`, `microsoft/`, `qwen/`, etc.)
19
+ - **Cloudflare**: Strips `@cf/namespace/` prefixes
20
+ - **Groq**: Removes `-versatile` and numeric context suffixes (`-32768`)
21
+ - **Cerebras**: Normalizes `llama3.1` → `llama-3.1`, auto-adds `instruct` suffix
22
+ - **Mistral**: Strips `-latest` suffix
23
+ - **Ollama**: Converts `model:tag` → `model-tag`
24
+ - Common suffix stripping: `:free`, date codes (`-20250514`), versions (`-v1.1`), `-it`, `-fp8`/`-bf16`
25
+
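The normalizers listed above can be pictured as a per-provider lookup followed by a shared suffix pass. The sketch below is illustrative only: `providerNormalizers`, `stripCommonSuffixes`, and `normalizeModelId` are hypothetical names, the regular expressions are assumptions derived from the bullet list rather than the package's actual patterns, and the Cerebras rule is omitted because the changelog does not say when `instruct` is appended.

```typescript
// Hypothetical sketch of provider-aware model ID normalization.
// Names and regexes are assumptions, not the package's real code.
type Normalizer = (id: string) => string;

const providerNormalizers: Record<string, Normalizer> = {
  // NVIDIA: strip a leading vendor prefix such as "meta/" or "mistralai/".
  nvidia: (id) => id.replace(/^(meta|mistralai|microsoft|qwen)\//, ""),
  // Cloudflare: strip "@cf/<namespace>/".
  cloudflare: (id) => id.replace(/^@cf\/[^/]+\//, ""),
  // Groq: drop "-versatile", then a trailing numeric context suffix like "-32768".
  groq: (id) => id.replace(/-versatile$/, "").replace(/-\d{4,}$/, ""),
  // Mistral: drop "-latest".
  mistral: (id) => id.replace(/-latest$/, ""),
  // Ollama: "model:tag" becomes "model-tag" (first colon only).
  ollama: (id) => id.replace(":", "-"),
};

// Shared pass: ":free", 8-digit date codes, "-v1.1"-style versions,
// then "-it" / "-fp8" / "-bf16". Each pattern is applied once, in this order.
function stripCommonSuffixes(id: string): string {
  return id
    .replace(/:free$/, "")
    .replace(/-\d{8}$/, "")
    .replace(/-v\d+(\.\d+)*$/, "")
    .replace(/-(it|fp8|bf16)$/, "");
}

function normalizeModelId(provider: string, modelId: string): string {
  const lower = modelId.toLowerCase();
  const normalize = providerNormalizers[provider];
  const specific = normalize ? normalize(lower) : lower;
  return stripCommonSuffixes(specific);
}
```

Because each suffix pattern runs once in a fixed order, an ID that stacks suffixes in a different order (for example a version followed by `-fp8`) would be only partially stripped; a real implementation may loop until the ID stops changing.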
26
+ - **Enhanced benchmark lookup** — `enhanceModelNameWithCodingIndex()` now accepts optional `provider` parameter for provider-aware normalization
27
+
28
+ - **Static 404 model blocklist for NVIDIA** — Probed all 136 models from `integrate.api.nvidia.com/v1/models` and identified 57 that return 404 "Function not found" on `/v1/chat/completions`. These are now hard-filtered so they never appear in the model selector:
29
+ - Covers discontinued models (`databricks/dbrx-instruct`, `meta/codellama-70b`, `meta/llama2-70b`, `ibm/granite-*`, etc.)
30
+ - Covers embedding-only models listed as chat-capable (`nvidia/nv-embed-v1`, `nvidia/nv-embedqa-*`, `snowflake/arctic-embed-l`, etc.)
31
+ - Covers stale API catalog entries (`mistralai/mistral-large`, `mistralai/mistral-large-2-instruct`, `writer/palmyra-*`, etc.)
32
+ - Full list in `NVIDIA_KNOWN_404_MODELS` in `providers/nvidia/nvidia.ts`
33
+
34
+ - **`/probe-nvidia` command** — On-demand model health check. Tests every registered NVIDIA model with a minimal `max_tokens: 1` request, auto-hides any new 404s in `~/.pi/free.json`, and re-registers the provider immediately.
35
+
36
+ - **`scripts/probe-nvidia.mjs`** — Standalone Node.js script to reproduce the probe. Reads `~/.pi/free.json` for the API key, batches 20 requests at a time with 10s timeout, and prints all broken model IDs for adding to the blocklist.
37
+
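The probe described above (batches of 20, a 10-second timeout, `max_tokens: 1`) could be structured roughly as follows. This is a sketch under stated assumptions, not the contents of `scripts/probe-nvidia.mjs`: `chunk`, `probeModel`, and `findBroken` are hypothetical names, the message payload is a guess, and the caller is assumed to supply the base URL (for example `https://integrate.api.nvidia.com/v1`) and API key.

```typescript
// Hypothetical probe sketch; function names and payload shape are assumptions.
type ProbeResult = { id: string; status: number | "timeout" | "error" };

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be >= 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Send one minimal chat completion request and abort it after `timeoutMs`.
async function probeModel(
  baseUrl: string,
  apiKey: string,
  id: string,
  timeoutMs = 10_000,
): Promise<ProbeResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(`${baseUrl}/chat/completions`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: id,
        max_tokens: 1,
        messages: [{ role: "user", content: "hi" }],
      }),
      signal: controller.signal,
    });
    return { id, status: res.status };
  } catch {
    // An aborted request is reported as a timeout; anything else as an error.
    return { id, status: controller.signal.aborted ? "timeout" : "error" };
  } finally {
    clearTimeout(timer);
  }
}

// Probe all IDs batch by batch and collect those that answered 404.
async function findBroken(
  ids: readonly string[],
  probe: (id: string) => Promise<ProbeResult>,
  batchSize = 20,
): Promise<string[]> {
  const broken: string[] = [];
  for (const batch of chunk(ids, batchSize)) {
    const results = await Promise.all(batch.map((id) => probe(id)));
    for (const r of results) {
      if (r.status === 404) broken.push(r.id);
    }
  }
  return broken;
}
```

Passing the probe as a function keeps `findBroken` testable with a stub, and treating timeouts separately from 404s avoids blocklisting a model that was merely slow to respond.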
38
+ - **Ollama Cloud 403 handling** — Same pattern as NVIDIA 404s for Ollama Cloud:
39
+ - `OLLAMA_KNOWN_403_MODELS` blocklist for models that return 403 "access denied"
40
+ - `/probe-ollama` command to test all models on-demand, auto-hide broken ones, and re-register
41
+ - `scripts/probe-ollama.mjs` standalone script for blocklist maintenance
42
+
43
+ - **Provider-scoped hidden models** — Hidden models are now provider-specific:
44
+ - Format: `"provider/model-id"` (e.g., `"ollama/kimi-k2.6"`, `"nvidia/broken-model"`)
45
+ - A model hidden from one provider doesn't hide it from other providers
46
+ - Backward compatible with old global `"model-id"` format
47
+ - All providers updated: NVIDIA, Ollama, Cloudflare, Cline, Kilo, Modal
48
+
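A lookup that honors both the new `"provider/model-id"` entries and the old global `"model-id"` entries might look like the sketch below. `isModelHidden` is a hypothetical helper, and the hidden list is assumed to be a plain array of strings.

```typescript
// Hypothetical helper: checks a hidden-models list that may mix
// provider-scoped entries ("ollama/kimi-k2.6") with legacy global ones ("kimi-k2.6").
function isModelHidden(
  hidden: readonly string[],
  provider: string,
  modelId: string,
): boolean {
  const scoped = `${provider}/${modelId}`;
  return hidden.some((entry) => entry === scoped || entry === modelId);
}

// Scoped entry hides the model only for its own provider.
console.log(isModelHidden(["ollama/kimi-k2.6"], "ollama", "kimi-k2.6"));
console.log(isModelHidden(["ollama/kimi-k2.6"], "nvidia", "kimi-k2.6"));
```

One caveat of this simple form: model IDs that themselves contain a slash (as NVIDIA's vendor-prefixed IDs do) can be ambiguous with scoped entries, so a real implementation may need to check the prefix against the set of known provider names.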
49
+ ### Fixed
50
+
51
+ - **Probe commands timeout handling** — Added `fetchWithTimeout` with 10-second timeout to `/probe-nvidia` and `/probe-ollama` commands. Prevents the coding harness from freezing when individual model probe requests hang indefinitely.
52
+
53
+ - **NVIDIA provider now sends `authHeader: true`** — Explicitly enables `Authorization: Bearer` header injection. Previously, the provider relied on pi's implicit behavior, which could fail in some configurations.
54
+
55
+ ### Removed
56
+
57
+ - **NVIDIA 404 model warning log** — Removed the `console.warn("[nvidia] Skipping known 404 model: ...")` output when filtering out known broken models. The filter still works silently; use `/probe-nvidia` to identify new 404s if needed.
58
+
59
+ ### Changed
60
+
61
+ - **Cloudflare provider now fetches models dynamically** — Replaced static 19-model hardcoded list with live API fetch from `api.cloudflare.com/client/v4/accounts/{account_id}/ai/models`:
62
+ - Automatically discovers all 30+ text generation models (was manually maintaining 19)
63
+ - Smart filtering excludes embeddings, image generation, speech, translation, and vision-only models via regex patterns
64
+ - Metadata inference from model IDs: detects vision (`vision`/`multimodal`), reasoning (`r1`/`thinking`/`qwq`), context windows, and estimated costs
65
+ - Fixed Mistral Small ID: changed from incorrect `@cf/mistralai/...` to correct `@cf/mistral/...`
66
+ - Added new fallback models: Kimi K2.6, OpenAI GPT-OSS 120B/20B, Qwen 2.5 Coder 32B, QwQ 32B, Llama 3.2 11B Vision
67
+ - Graceful fallback to expanded 18-model hardcoded list if API fetch fails
68
+
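The filtering and metadata inference described above can be approximated with a few regular expressions over the model ID. The sketch below is an assumption-laden illustration: `isTextGenerationModel` and `inferMeta` are hypothetical names, the non-chat pattern is a guess at typical embedding, speech, image, and translation IDs, and context-window and cost inference are left out because the changelog gives no rules for them.

```typescript
// Hypothetical sketch of ID-based filtering and capability inference.
interface InferredMeta {
  vision: boolean;
  reasoning: boolean;
}

// Assumed patterns for non-chat models (embeddings, speech, image gen, translation).
const NON_CHAT_PATTERN = /embed|bge-|whisper|tts|flux|stable-diffusion|m2m100/i;

function isTextGenerationModel(id: string): boolean {
  return !NON_CHAT_PATTERN.test(id);
}

function inferMeta(id: string): InferredMeta {
  return {
    // "vision" or "multimodal" anywhere in the ID.
    vision: /vision|multimodal/i.test(id),
    // "r1" or "qwq" as a whole ID segment, or "thinking" anywhere.
    reasoning: /(^|[-/])(r1|qwq)([-/]|$)|thinking/i.test(id),
  };
}

console.log(inferMeta("@cf/deepseek-ai/deepseek-r1-distill-qwen-32b"));
console.log(isTextGenerationModel("@cf/baai/bge-base-en-v1.5"));
```

Anchoring `r1` and `qwq` to segment boundaries is a deliberate choice here, since a bare substring match on `r1` could fire on unrelated IDs.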
69
+ - **NVIDIA provider now queries NVIDIA's API directly** — Source of truth switched from `models.dev` curated JSON to `https://integrate.api.nvidia.com/v1/models`:
70
+ - Eliminates 57 missing models and 25 stale entries from the old third-party source
71
+ - Models not in `models.dev` get inferred metadata (128k context, 4k output, vision/reasoning heuristics)
72
+ - Added regex-based non-chat model filtering for unknown models (embeddings, whisper, reward models, safety guards, parsers, detectors, etc.)
73
+ - Graceful fallback to `models.dev` if NVIDIA API is unreachable
74
+ - Removed paid/free toggle filtering — NVIDIA is freemium (all models use free credits)
75
+
76
+ ## [2.0.2] - 2026-04-24
77
+
78
+ ### Fixed
79
+
80
+ - **Provider toggle state now persists reliably** — Follow-up fixes to the new `toggle-{provider}` flow ensure saved free-vs-all preferences are restored consistently across sessions for built-in and extension-managed providers.
81
+ - **Config parse errors are now logged** — Invalid `~/.pi/free.json` content is no longer ignored silently; startup parse failures are written to `~/.pi/free.log` to make misconfiguration easier to diagnose.
82
+
83
+ ### Changed
84
+
85
+ - **README refreshed** — Clarified that provider toggle changes apply immediately, persist across restarts, and that malformed config is surfaced in the extension log.
86
+
87
+ ## [2.0.1] - 2026-04-24
88
+
89
+ ### Added
90
+
91
+ - **Built-in provider toggle support** (`lib/built-in-toggle.ts`) — Enables free/paid filtering for Pi's built-in providers that expose per-model pricing:
92
+ - **OpenCode (`/toggle-opencode`)** — Captures built-in OpenCode models on session start and filters to free-only by default
93
+ - **OpenRouter (`/toggle-openrouter`)** — Now uses the built-in toggle system for consistency
94
+ - Toggle works in the current session (no restart needed)
95
+ - Persisted via `opencode_show_paid` and `openrouter_show_paid` in `~/.pi/free.json`
96
+
97
+ ### Changed
98
+
99
+ - **OpenRouter moved to built-in toggle system** — OpenRouter is now handled by `lib/built-in-toggle.ts` alongside OpenCode for a unified approach:
100
+ - Removed from `providers/dynamic-built-in/index.ts`
101
+ - Eliminated duplicate toggle command registration logic
102
+ - Consolidated toggle persistence with other built-in providers
103
+
104
+ - **Standardized all toggle commands to `toggle-{provider}`** — Renamed from `{provider}-toggle` for consistency:
105
+ - `/kilo-toggle` → `/toggle-kilo`
106
+ - `/cline-toggle` → `/toggle-cline`
107
+ - `/openrouter-toggle` → `/toggle-openrouter`
108
+ - `/nvidia-toggle` → `/toggle-nvidia`
109
+ - `/cloudflare-toggle` → `/toggle-cloudflare`
110
+ - `/ollama-toggle` → `/toggle-ollama`
111
+ - `/mistral-toggle` → `/toggle-mistral`
112
+ - `/groq-toggle` → `/toggle-groq`
113
+ - `/cerebras-toggle` → `/toggle-cerebras`
114
+ - `/toggle-opencode` (new)
115
+
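For scripts or notes that still use the old command names, the rename above follows a mechanical pattern. The helper below is a hypothetical convenience for migrating such references, not something the extension ships.

```typescript
// Hypothetical migration helper: "/{provider}-toggle" -> "/toggle-{provider}".
// Commands that do not match the old pattern are returned unchanged.
function toNewToggleCommand(command: string): string {
  const match = /^\/([a-z0-9]+)-toggle$/.exec(command);
  return match ? `/toggle-${match[1]}` : command;
}

console.log(toNewToggleCommand("/kilo-toggle"));
console.log(toNewToggleCommand("/toggle-opencode"));
```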
116
+ ### Fixed
117
+
118
+ - **Ollama Cloud model fetching endpoint** — Corrected the `/v1/models` → `/models` endpoint path in `providers/ollama/ollama.ts`:
119
+ - The previous fix (2.0.0) incorrectly used `/v1/models` for listing; Ollama Cloud serves chat completions under `/v1`, but the model list is fetched from `/models`
120
+ - This ensures model fetching works correctly with the OpenAI-compatible API
121
+
122
+ ### Removed
123
+
124
+ - **Global `/free` command** — Removed the global free-only toggle. Per-provider toggles (`/toggle-{provider}`) are now the only way to switch between free and paid models. The `/free-providers` status command remains.
125
+
8
126
  ## [2.0.0] - 2026-04-23
9
127
 
10
128
  ### Breaking Changes
129
+
11
130
  - **Removed Fireworks provider** — Fireworks is now a built-in Pi provider (added in pi 0.68.1), so the extension's Fireworks provider has been removed to avoid conflicts:
12
131
  - Deleted `providers/fireworks/fireworks.ts` and `tests/fireworks.test.ts`
13
132
  - Removed all Fireworks configuration options from `config.ts` (`fireworks_api_key`, `fireworks_show_paid`)
@@ -18,15 +137,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
18
137
  - All log messages and documentation now reference "Ollama Cloud"
19
138
 
20
139
  ### Removed
140
+
21
141
  - **Dropped `@sinclair/typebox` peer dependency** — Pi 0.69.0 migrated from `@sinclair/typebox` to `typebox` 1.x. The extension didn't directly import this package, so it was removed from `peerDependencies` to avoid potential conflicts.
22
142
 
23
143
  ### Fixed
144
+
24
145
  - **Ollama Cloud API endpoint** — Fixed broken Ollama Cloud integration:
25
146
  - Changed `BASE_URL_OLLAMA` from `https://ollama.com` to `https://ollama.com/v1` — the OpenAI-compatible API endpoint
26
147
  - Fixed model fetching to use `/v1/models` instead of `/api/tags` — ensures model IDs work with chat completions endpoint
27
148
  - Previously, calls went to the HTML homepage instead of the API endpoints, causing 404 errors
28
149
 
29
150
  ### Removed
151
+
30
152
  - **Removed paid model warning on selection** — Deleted the `model_select` event handler that showed:
31
153
  - `⚠️ Paid model selected (${model.id}). Use "/free off" to enable paid models.`
32
154
  - This warning was redundant since the global `/free` toggle and provider toggles already control model visibility
@@ -36,12 +158,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
36
158
  - Modal provider now sets `skipToggle: true` to prevent toggle command creation
37
159
 
38
160
  ### Changed
161
+
39
162
  - **Marked Qwen provider as fully deprecated** — Updated messaging to clarify the provider is broken:
40
163
  - Changed model name from `"Qwen Coder — Free 1k/day"` to `"Qwen Coder — DEPRECATED (free tier discontinued)"`
41
164
  - Updated all JSDoc comments to clearly state auth is broken and free tier is no longer available
42
165
  - Provider remains for backward compatibility but should not be used
43
166
 
44
167
  ### Added
168
+
45
169
  - **Cloudflare Workers AI provider** — New provider for Cloudflare's serverless GPU platform:
46
170
  - 50+ open-source models: Llama 4, Mistral Small 3.1, Qwen 2.5/3, DeepSeek R1, Gemma 4, Kimi K2.5/2.6, and more
47
171
  - **10,000 Neurons/day FREE tier** (resets daily at 00:00 UTC)
@@ -66,12 +190,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
66
190
  - Providers register via `registerWithGlobalToggle()` for unified filtering
67
191
 
68
192
  ### Fixed
193
+
69
194
  - **Toggle commands now actually filter models from UI** — Previously, toggle commands only showed notifications but didn't remove paid models from the model picker:
70
195
  - **OpenRouter (`/openrouter-toggle`)**: Now uses `registerProvider`/`unregisterProvider` to actually filter models from the picker UI
71
196
  - **NVIDIA (`/nvidia-toggle`)**: Added dynamic `showPaid` parameter to `fetchNvidiaModels()` so toggle properly switches between free and paid model sets
72
197
  - **Fireworks**: Removed broken toggle command — all models are paid with no free tier, so there was nothing to toggle
73
198
 
74
199
  ### Added
200
+
75
201
  - **OpenRouter per-provider free model toggle** — Added `/openrouter-toggle` command for the built-in OpenRouter provider:
76
202
  - `/openrouter-toggle` — Switch between showing only free models vs all models (including paid)
77
203
  - New config flag `openrouter_show_paid` in `~/.pi/free.json` (default: `false`)
@@ -79,6 +205,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
79
205
  - This brings OpenRouter (a built-in pi provider) in line with extension providers that have per-provider toggles
80
206
 
81
207
  ### Deprecated
208
+
82
209
  - **Qwen provider** — The 1,000 requests/day free tier is no longer available from Qwen/DashScope. The provider code remains for backward compatibility but is now deprecated:
83
210
  - Added `@deprecated` JSDoc tags to all Qwen-related exports
84
211
  - Added deprecation warning when Qwen provider loads
@@ -86,11 +213,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
86
213
  - Consider migrating to other free providers: Kilo, Cline, NVIDIA, or Modal
87
214
 
88
215
  ### Added
216
+
89
217
  - **Go provider** — OpenCode Go subscription gateway (⚠️ paid only — $5 first month, then $10/month, no free tier) with models: GLM-5, Kimi K2.5, MiMo-V2-Pro, MiMo-V2-Omni, MiniMax M2.7, MiniMax M2.5
90
218
  - Set `OPENCODE_GO_API_KEY` or `opencode_go_api_key` in `~/.pi/free.json`
91
219
  - Toggle with `/go-toggle`
92
220
 
93
221
  ### Fixed
222
+
94
223
  - **All providers now show Coding Index scores in model selector** — Added `enhanceWithCI()` to factory-based providers (nvidia, fireworks, mistral, modal, ollama) and cline. Now all providers display CI scores in `/models` command (pi-models extension).
95
224
 
96
225
  - **All providers now show in `--list-models`** — Providers (zen, openrouter, go) that registered models only in `session_start` were missing from `pi --list-models` which runs before session starts. Added immediate registration for these providers:
@@ -100,22 +229,26 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
100
229
  - All 11 providers now visible in `--list-models`
101
230
 
102
231
  ### Changed
232
+
103
233
  - Updated README with clear free vs paid provider distinction (9 free + 2 paid-only: Go, Fireworks)
104
234
  - Added Go and Fireworks provider documentation under new "💳 Paid-Only Providers" section
105
235
  - Added `opencode_go_api_key` to config file template
106
236
  - Updated package.json description and keywords to include all 11 providers
107
237
 
108
238
  ### Added
239
+
109
240
  - **Provider model cache** (`lib/provider-cache.ts`) — New utility for caching provider model lists to `~/.pi/provider-cache.json`. Used by zen provider for faster startup and offline access after first successful fetch.
110
241
 
111
242
  ## [1.0.9] - 2026-04-14
112
243
 
113
244
  ### Fixed
245
+
114
246
  - **Qwen OAuth breaks other OAuth providers** — `modifyModels` receives all models across every registered provider, not just Qwen's. The previous `map()` stamped the Qwen dashscope `baseUrl` onto every model, causing other OAuth providers (Kilo, OpenRouter, etc.) to return 404 after a `/login qwen` flow. Now only models with `provider === PROVIDER_QWEN` are patched; others pass through unchanged.
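The shape of that fix can be sketched as a `map()` that rewrites only Qwen's entries. The `Model` interface, the value of `PROVIDER_QWEN`, and the `patchQwenModels` name are assumptions for illustration; only the `provider === PROVIDER_QWEN` guard comes from the entry above.

```typescript
// Hypothetical sketch of the guarded patch; types and names are assumptions.
interface Model {
  id: string;
  provider: string;
  baseUrl: string;
}

const PROVIDER_QWEN = "qwen"; // assumed value

// Stamp the Qwen base URL onto Qwen models only; all others pass through unchanged.
function patchQwenModels(models: readonly Model[], qwenBaseUrl: string): Model[] {
  return models.map((model) =>
    model.provider === PROVIDER_QWEN ? { ...model, baseUrl: qwenBaseUrl } : model,
  );
}
```

Returning the original object for non-Qwen models, rather than a copy, makes it easy to verify that other providers' entries were not touched.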
115
247
 
116
248
  ## [1.0.8] - 2026-04-13
117
249
 
118
250
  ### Added
251
+
119
252
  - **Modal provider** — Free access to GLM-5.1 FP8 (128k context, 16k max output) during promotional period (free until April 30, 2026)
120
253
  - Requires a free Modal API key (`MODAL_API_KEY` or `modal_api_key` in `~/.pi/free.json`)
121
254
  - Model: `zai-org/GLM-5.1-FP8` — 128k context window, 16k max output tokens
@@ -125,6 +258,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
125
258
  - 131k context window, 16k max output tokens, zero cost
126
259
 
127
260
  ### Fixed
261
+
128
262
  - **Qwen OAuth browser launch on Windows** — URLs with `&` query params were truncated by `cmd.exe`'s `&` command separator; switched to `powershell.exe Start-Process` which passes the URL as a literal string
129
263
  - **Qwen API endpoint** — Replicates qwen-code's `getCurrentEndpoint()` logic: uses `resource_url` from OAuth token response (`dashscope.aliyuncs.com` for Chinese accounts, `portal.qwen.ai` for international), with fallback to `dashscope.aliyuncs.com/compatible-mode/v1`
130
264
  - **Qwen DashScope headers** — Added all headers required by DashScope's OpenAI-compatible API: `X-DashScope-AuthType: qwen-oauth`, `X-DashScope-CacheControl: enable`, `X-DashScope-UserAgent`, `Client-Code: QwenCode`
@@ -133,6 +267,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
133
267
  ## [1.0.5] - 2025-04-03
134
268
 
135
269
  ### Fixed
270
+
136
271
  - **NVIDIA provider non-chat model filtering** (comment/implementation mismatch)
137
272
  - Added modalities-based filtering to exclude embedding, speech-to-text, OCR, and image-gen models
138
273
  - Filters models where `output` is not `["text"]` (e.g., image generation like `black-forest-labs/flux.1-dev`)
@@ -143,6 +278,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
143
278
  ## [1.0.4] - 2025-04-03
144
279
 
145
280
  ### Fixed
281
+
146
282
  - **All tests now passing** (127/127)
147
283
  - Fixed mock paths in kilo.test.ts, zen.test.ts, ollama.test.ts
148
284
  - Fixed createCtxReRegister mocks in zen.test.ts and openrouter.test.ts
@@ -150,6 +286,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
150
286
  - Added missing DEFAULT_MIN_SIZE_B constant to openrouter mock
151
287
 
152
288
  ### Changed
289
+
153
290
  - **Code quality improvements**
154
291
  - Refactored usage modules to break circular dependency (limits.ts ↔ formatters.ts)
155
292
  - Created usage/types.ts with shared interfaces (FreeTierLimit, FreeTierUsage)
@@ -158,12 +295,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
158
295
  ## [1.0.3] - 2025-04-03
159
296
 
160
297
  ### Changed
298
+
161
299
  - Updated package.json metadata (name, description, keywords, repository URL)
162
300
  - Updated .npmignore for cleaner publishes
163
301
 
164
302
  ## [1.0.0] - 2024-03-28
165
303
 
166
304
  ### Added
305
+
167
306
  - Initial release with 6 providers: Kilo, Zen, OpenRouter, NVIDIA, Cline, Fireworks
168
307
  - Free tier usage tracking across all sessions
169
308
  - Provider failover with model hopping
@@ -173,12 +312,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
173
312
  - Hardcoded benchmark data from Artificial Analysis
174
313
 
175
314
  ### Changed
176
- - **Major refactoring**: Split free-tier-limits.ts into usage/* modules
315
+
316
+ - **Major refactoring**: Split free-tier-limits.ts into usage/\* modules
177
317
  - usage/tracking.ts - runtime session tracking
178
318
  - usage/cumulative.ts - persistent storage
179
319
  - usage/formatters.ts - display formatting
180
320
  - 77% line reduction (741 → 166 lines)
181
- - **Major refactoring**: Split usage-widget.ts into widget/* modules
321
+ - **Major refactoring**: Split usage-widget.ts into widget/\* modules
182
322
  - widget/data.ts - data collection
183
323
  - widget/format.ts - formatting utilities
184
324
  - widget/render.ts - HTML generation
@@ -203,6 +343,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
203
343
  - Added resetUsageStats() for test isolation
204
344
 
205
345
  ### Fixed
346
+
206
347
  - fetchWithRetry() now properly throws after exhausting retries
207
348
  - Auth error pattern matching now handles more message variants
208
349
  - Test isolation for free-tier-limits tests