free-coding-models 0.1.66 → 0.1.68

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -24,7 +24,7 @@
 
  <p align="center">
  <strong>Find the fastest coding LLM models in seconds</strong><br>
- <sub>Ping free coding models from 17 providers in real-time — pick the best one for OpenCode, OpenClaw, or any AI coding assistant</sub>
+ <sub>Ping free coding models from 18 providers in real-time — pick the best one for OpenCode, OpenClaw, or any AI coding assistant</sub>
  </p>
 
  <p align="center">
@@ -36,7 +36,9 @@
  <a href="#-requirements">Requirements</a> •
  <a href="#-installation">Installation</a> •
  <a href="#-usage">Usage</a> •
- <a href="#-models">Models</a> •
+ <a href="#-tui-columns">Columns</a> •
+ <a href="#-stability-score">Stability</a> •
+ <a href="#-coding-models">Models</a> •
  <a href="#-opencode-integration">OpenCode</a> •
  <a href="#-openclaw-integration">OpenClaw</a> •
  <a href="#-how-it-works">How it works</a>
@@ -47,14 +49,15 @@
  ## ✨ Features
 
  - **🎯 Coding-focused** — Only LLM models optimized for code generation, not chat or vision
- - **🌐 Multi-provider** — 134 models from NVIDIA NIM, Groq, Cerebras, SambaNova, OpenRouter, Hugging Face Inference, Replicate, DeepInfra, Fireworks AI, Codestral, Hyperbolic, Scaleway, Google AI, SiliconFlow, Together AI, Cloudflare Workers AI, and Perplexity API
+ - **🌐 Multi-provider** — Models from NVIDIA NIM, Groq, Cerebras, SambaNova, OpenRouter, Hugging Face Inference, Replicate, DeepInfra, Fireworks AI, Codestral, Hyperbolic, Scaleway, Google AI, SiliconFlow, Together AI, Cloudflare Workers AI, Perplexity API, and ZAI
  - **⚙️ Settings screen** — Press `P` to manage provider API keys, enable/disable providers, test keys live, and manually check/install updates
  - **🚀 Parallel pings** — All models tested simultaneously via native `fetch`
  - **📊 Real-time animation** — Watch latency appear live in alternate screen buffer
  - **🏆 Smart ranking** — Top 3 fastest models highlighted with medals 🥇🥈🥉
- - **⏱ Continuous monitoring** — Pings all models every 2 seconds forever, never stops
+ - **⏱ Continuous monitoring** — Pings all models every 60 seconds forever, never stops
  - **📈 Rolling averages** — Avg calculated from ALL successful pings since start
  - **📊 Uptime tracking** — Percentage of successful pings shown in real-time
+ - **📐 Stability score** — Composite 0–100 score measuring consistency (p95, jitter, spikes, uptime) — a model with a 400ms avg and stable responses beats a 250ms-avg model that randomly spikes to 6s
  - **🔄 Auto-retry** — Timeout models keep getting retried, nothing is ever "given up on"
  - **🎮 Interactive selection** — Navigate with arrow keys directly in the table, press Enter to act
  - **🔀 Startup mode menu** — Choose between OpenCode and OpenClaw before the TUI launches
@@ -92,6 +95,7 @@ Before using `free-coding-models`, make sure you have:
  - **Together AI** — [api.together.ai/settings/api-keys](https://api.together.ai/settings/api-keys) → API Keys (credits/promotions vary)
  - **Cloudflare Workers AI** — [dash.cloudflare.com](https://dash.cloudflare.com) → Create API token + set `CLOUDFLARE_ACCOUNT_ID` (Free: 10k neurons/day)
  - **Perplexity API** — [perplexity.ai/settings/api](https://www.perplexity.ai/settings/api) → API Key (tiered limits by spend)
+ - **ZAI** — [z.ai](https://z.ai) → Get API key (Coding Plan subscription)
  3. **OpenCode** *(optional)* — [Install OpenCode](https://github.com/opencode-ai/opencode) to use the OpenCode integration
  4. **OpenClaw** *(optional)* — [Install OpenClaw](https://openclaw.ai) to use the OpenClaw integration
 
@@ -176,13 +180,13 @@ When you run `free-coding-models` without `--opencode` or `--openclaw`, you get
  Use `↑↓` arrows to select, `Enter` to confirm. Then the TUI launches with your chosen mode shown in the header badge.
 
  **How it works:**
- 1. **Ping phase** — All enabled models are pinged in parallel (up to 134 across 17 providers)
- 2. **Continuous monitoring** — Models are re-pinged every 2 seconds forever
+ 1. **Ping phase** — All enabled models are pinged in parallel (up to 139 across 18 providers)
+ 2. **Continuous monitoring** — Models are re-pinged every 60 seconds forever
  3. **Real-time updates** — Watch "Latest", "Avg", and "Up%" columns update live
  4. **Select anytime** — Use ↑↓ arrows to navigate, press Enter on a model to act
  5. **Smart detection** — Automatically detects if NVIDIA NIM is configured in OpenCode or OpenClaw
 
- Setup wizard (first run — walks through all 17 providers):
+ Setup wizard (first run — walks through all 18 providers):
 
  ```
  🔑 First-time setup — API keys
@@ -265,6 +269,7 @@ SILICONFLOW_API_KEY=sk_xxx free-coding-models
  TOGETHER_API_KEY=together_xxx free-coding-models
  CLOUDFLARE_API_TOKEN=cf_xxx CLOUDFLARE_ACCOUNT_ID=your_account_id free-coding-models
  PERPLEXITY_API_KEY=pplx_xxx free-coding-models
+ ZAI_API_KEY=zai-xxx free-coding-models
  FREE_CODING_MODELS_TELEMETRY=0 free-coding-models
  ```
 
@@ -347,13 +352,24 @@ When enabled, telemetry events include: event name, app version, selected mode,
  1. Sign up at [perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)
  2. Create API key (`PERPLEXITY_API_KEY`)
 
- > 💡 **Free tiers** each provider exposes a dev/free tier with its own quotas.
+ **ZAI** (5 models, GLM family):
+ 1. Sign up at [z.ai](https://z.ai)
+ 2. Subscribe to Coding Plan
+ 3. Get API key from dashboard
+
+ > 💡 **Free tiers** — each provider exposes a dev/free tier with its own quotas. ZAI requires a Coding Plan subscription.
 
  ---
 
  ## 🤖 Coding Models
 
- **134 coding models** across 17 providers and 8 tiers, ranked by [SWE-bench Verified](https://www.swebench.com) — the industry-standard benchmark measuring real GitHub issue resolution. Scores are self-reported by providers unless noted.
+ **139 coding models** across 18 providers and 8 tiers, ranked by [SWE-bench Verified](https://www.swebench.com) — the industry-standard benchmark measuring real GitHub issue resolution. Scores are self-reported by providers unless noted.
+
+ ### ZAI Coding Plan (5 models)
+
+ | Tier | SWE-bench | Model |
+ |------|-----------|-------|
+ | **S+** ≥70% | 70.0–77.8% | GLM-5 (77.8%), GLM-4.5 (75.0%), GLM-4.7 (73.8%), GLM-4.5-Air (72.0%), GLM-4.6 (70.0%) |
 
  ### NVIDIA NIM (44 models)
 
@@ -414,6 +430,92 @@ Current tier filter is shown in the header badge (e.g., `[Tier S]`)
 
  ---
 
+ ## 📊 TUI Columns
+
+ The main table displays one row per model with the following columns:
+
+ | Column | Sort key | Description |
+ |--------|----------|-------------|
+ | **Rank** | `R` | Position based on current sort order (medals for top 3: 🥇🥈🥉) |
+ | **Tier** | `Y` | SWE-bench tier (S+, S, A+, A, A-, B+, B, C) |
+ | **SWE%** | `S` | SWE-bench Verified score — the industry-standard benchmark for real GitHub issue resolution |
+ | **CTX** | `C` | Context window size in thousands of tokens (e.g. `128k`) |
+ | **Model** | `M` | Model display name (favorites show ⭐ prefix) |
+ | **Origin** | `N` | Provider name (NIM, Groq, Cerebras, etc.) — press `N` to cycle origin filter |
+ | **Latest Ping** | `L` | Most recent round-trip latency in milliseconds |
+ | **Avg Ping** | `A` | Rolling average of ALL successful pings since launch |
+ | **Health** | `H` | Current status: UP ✅, NO KEY 🔑, Timeout ⏳, Overloaded 🔥, Not Found 🚫 |
+ | **Verdict** | `V` | Health verdict based on avg latency + stability analysis (see below) |
+ | **Stability** | `B` | Composite 0–100 consistency score (see [Stability Score](#-stability-score)) |
+ | **Up%** | `U` | Uptime — percentage of successful pings out of total attempts |
+
+ ### Verdict values
+
+ The Verdict column combines average latency with stability analysis:
+
+ | Verdict | Meaning |
+ |---------|---------|
+ | **Perfect** | Avg < 400ms with stable p95/jitter |
+ | **Normal** | Avg < 1000ms, consistent responses |
+ | **Slow** | Avg 1000–2000ms |
+ | **Spiky** | Good avg but erratic tail latency (p95 >> avg) |
+ | **Very Slow** | Avg 2000–5000ms |
+ | **Overloaded** | Server returned 429/503 (rate limited or capacity hit) |
+ | **Unstable** | Was previously up but now timing out, or avg > 5000ms |
+ | **Not Active** | No successful pings yet |
+ | **Pending** | First ping still in flight |
+
+ ---
+
+ ## 📐 Stability Score
+
+ The **Stability** column (sort with `B` key) shows a composite 0–100 score that answers: *"How consistent and predictable is this model?"*
+
+ Average latency alone is misleading — a model averaging 250ms that randomly spikes to 6 seconds *feels* slower in practice than a steady 400ms model. The stability score captures this.
+
+ ### Formula
+
+ Four signals are normalized to 0–100 each, then combined with weights:
+
+ ```
+ Stability = 0.30 × p95_score
+           + 0.30 × jitter_score
+           + 0.20 × spike_score
+           + 0.20 × reliability_score
+ ```
+
+ | Component | Weight | What it measures | How it's normalized |
+ |-----------|--------|------------------|---------------------|
+ | **p95 latency** | 30% | Tail-latency spikes — the worst 5% of response times | `100 × (1 - p95 / 5000)`, clamped to 0–100 |
+ | **Jitter (σ)** | 30% | Erratic response times — standard deviation of ping times | `100 × (1 - jitter / 2000)`, clamped to 0–100 |
+ | **Spike rate** | 20% | Fraction of pings above 3000ms | `100 × (1 - spikes / total_pings)` |
+ | **Reliability** | 20% | Uptime — fraction of successful HTTP 200 pings | Direct uptime percentage (0–100) |
+
+ ### Color coding
+
+ | Score | Color | Interpretation |
+ |-------|-------|----------------|
+ | **80–100** | Green | Rock-solid — very consistent, safe to rely on |
+ | **60–79** | Cyan | Good — occasional variance but generally stable |
+ | **40–59** | Yellow | Shaky — noticeable inconsistency |
+ | **< 40** | Red | Unreliable — frequent spikes or failures |
+ | **—** | Dim | No data yet (no successful pings) |
+
+ ### Example
+
+ Two models with similar average latency, very different real-world experience:
+
+ ```
+ Model A: avg 250ms, p95 6000ms, jitter 1800ms → Stability ~30 (red)
+ Model B: avg 400ms, p95 650ms,  jitter 120ms  → Stability ~85 (green)
+ ```
+
+ Model B is the better choice despite its higher average — it won't randomly stall your coding workflow.
+
+ > 💡 **Tip:** Sort by Stability (`B` key) after a few minutes of monitoring to find the models that deliver the most predictable performance.
+
+ ---
+
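The weights and normalizations documented in the hunk above can be sketched in TypeScript. This is an illustration of the documented math, not the package's actual code; in particular, how failed pings enter the spike-rate denominator is an assumption here.

```typescript
// Illustrative sketch of the documented stability formula.
// `pings` holds successful latencies in ms; `attempts` counts every ping,
// successful or not (assumption: spikes are measured against successes).
function clamp(x: number, lo = 0, hi = 100): number {
  return Math.max(lo, Math.min(hi, x));
}

function stabilityScore(pings: number[], attempts: number): number | null {
  if (pings.length === 0) return null; // no data yet → shown as "—" in the TUI

  // p95 via nearest-rank on a sorted copy
  const sorted = [...pings].sort((a, b) => a - b);
  const p95 = sorted[Math.min(sorted.length - 1, Math.ceil(0.95 * sorted.length) - 1)];

  // jitter = standard deviation of successful ping times
  const mean = pings.reduce((s, v) => s + v, 0) / pings.length;
  const jitter = Math.sqrt(pings.reduce((s, v) => s + (v - mean) ** 2, 0) / pings.length);

  const spikes = pings.filter((v) => v > 3000).length;

  const p95Score = clamp(100 * (1 - p95 / 5000));
  const jitterScore = clamp(100 * (1 - jitter / 2000));
  const spikeScore = 100 * (1 - spikes / pings.length);
  const reliabilityScore = 100 * (pings.length / attempts); // uptime %

  return 0.3 * p95Score + 0.3 * jitterScore + 0.2 * spikeScore + 0.2 * reliabilityScore;
}
```

Under this sketch a steady ~400ms model lands in the green band, while a 250ms-average model with 6s spikes falls into the red band, matching the README's Model A/Model B example.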
  ## 🔌 OpenCode Integration
 
  **The easiest way** — let `free-coding-models` do everything:
@@ -439,6 +541,18 @@ You can force a specific port:
  OPENCODE_PORT=4098 free-coding-models --opencode
  ```
 
+ ### ZAI provider proxy
+
+ OpenCode doesn't natively support ZAI's API path format (`/api/coding/paas/v4/*`). When you select a ZAI model, `free-coding-models` automatically starts a local reverse proxy that translates OpenCode's standard `/v1/*` requests to ZAI's API. This is fully transparent: just select a ZAI model and press Enter.
+
+ **How it works:**
+ 1. A localhost HTTP proxy starts on a random available port
+ 2. OpenCode is configured with a `zai` provider pointing at `http://localhost:<port>/v1`
+ 3. The proxy rewrites `/v1/models` to `/api/coding/paas/v4/models` and `/v1/chat/completions` to `/api/coding/paas/v4/chat/completions`
+ 4. When OpenCode exits, the proxy shuts down automatically
+
+ No manual configuration needed: the proxy lifecycle is managed entirely by `free-coding-models`.
+
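The path translation and random-port listen described in the steps above can be sketched as follows. This is an illustration only: the upstream base URL, header handling, and body streaming are simplified assumptions, not the package's actual proxy code.

```typescript
import { createServer, type Server } from "node:http";

// Sketch of the documented /v1/* → /api/coding/paas/v4/* rewrite.
const ZAI_PREFIX = "/api/coding/paas/v4";

function rewritePath(path: string): string {
  // OpenCode issues standard OpenAI-style /v1/* requests.
  return path.startsWith("/v1/") ? ZAI_PREFIX + path.slice("/v1".length) : path;
}

// Minimal forwarding proxy (illustration; upstreamBase is an assumption —
// the real tool derives it from provider configuration).
function startZaiProxy(upstreamBase: string): Server {
  const server = createServer(async (req, res) => {
    const target = upstreamBase + rewritePath(req.url ?? "/");
    const upstream = await fetch(target, {
      method: req.method,
      headers: { authorization: req.headers.authorization ?? "" },
      // NOTE: request/response body streaming is omitted in this sketch.
    });
    res.writeHead(upstream.status);
    res.end(await upstream.text());
  });
  server.listen(0); // port 0 → OS picks a random available port, as the README describes
  return server;
}
```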
  ### Manual OpenCode Setup (Optional)
 
  Create or edit `~/.config/opencode/opencode.json`:
@@ -589,19 +703,19 @@ This script:
  ## ⚙️ How it works
 
  ```
- ┌─────────────────────────────────────────────────────────────┐
- │ 1. Enter alternate screen buffer (like vim/htop/less)
- │ 2. Ping ALL models in parallel
- │ 3. Display real-time table with Latest/Avg/Up% columns
- │ 4. Re-ping ALL models every 2 seconds (forever)
- │ 5. Update rolling averages from ALL successful pings
- │ 6. User can navigate with ↑↓ and select with Enter
- │ 7. On Enter (OpenCode): set model, launch OpenCode
- │ 8. On Enter (OpenClaw): update ~/.openclaw/openclaw.json
- └─────────────────────────────────────────────────────────────┘
+ ┌──────────────────────────────────────────────────────────────────┐
+ │ 1. Enter alternate screen buffer (like vim/htop/less)
+ │ 2. Ping ALL models in parallel
+ │ 3. Display real-time table with Latest/Avg/Stability/Up%
+ │ 4. Re-ping ALL models every 60 seconds (forever)
+ │ 5. Update rolling averages + stability scores per model
+ │ 6. User can navigate with ↑↓ and select with Enter
+ │ 7. On Enter (OpenCode): set model, launch OpenCode
+ │ 8. On Enter (OpenClaw): update ~/.openclaw/openclaw.json
+ └──────────────────────────────────────────────────────────────────┘
  ```
 
- **Result:** Continuous monitoring interface that stays open until you select a model or press Ctrl+C. Rolling averages give you accurate long-term latency data, uptime percentage tracks reliability, and you can configure your tool of choice with your chosen model in one keystroke.
+ **Result:** Continuous monitoring interface that stays open until you select a model or press Ctrl+C. Rolling averages give you accurate long-term latency data, the stability score reveals which models are truly consistent vs. deceptively spiky, and you can configure your tool of choice with one keystroke.
 
  ---
 
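The loop in the box above boils down to two pieces: parallel pings via native `fetch` and an incremental mean over all successful pings. A minimal sketch (endpoint handling and retry policy are assumptions, but the running average uses the standard incremental-mean update):

```typescript
// Sketch of the monitoring loop's bookkeeping — illustration, not the
// package's actual implementation.
type ModelStats = { attempts: number; successes: number; avgMs: number; latestMs: number | null };

function record(stats: ModelStats, latencyMs: number | null): ModelStats {
  const attempts = stats.attempts + 1;
  if (latencyMs === null) {
    // Failed ping: only Up% moves; the average stays put and the model is retried.
    return { ...stats, attempts, latestMs: null };
  }
  const successes = stats.successes + 1;
  // Incremental mean over ALL successful pings: avg' = avg + (x - avg) / n
  const avgMs = stats.avgMs + (latencyMs - stats.avgMs) / successes;
  return { attempts, successes, avgMs, latestMs: latencyMs };
}

async function pingOnce(url: string, timeoutMs = 15_000): Promise<number | null> {
  const t0 = Date.now();
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok ? Date.now() - t0 : null;
  } catch {
    return null; // timeout or network error → retried next cycle
  }
}

async function pingAll(urls: string[]): Promise<(number | null)[]> {
  return Promise.all(urls.map((u) => pingOnce(u))); // all models in parallel
}
```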
@@ -628,6 +742,7 @@ This script:
  | `CLOUDFLARE_API_TOKEN` / `CLOUDFLARE_API_KEY` | Cloudflare Workers AI token/key |
  | `CLOUDFLARE_ACCOUNT_ID` | Cloudflare account ID (required for Workers AI endpoint URL) |
  | `PERPLEXITY_API_KEY` / `PPLX_API_KEY` | Perplexity API key |
+ | `ZAI_API_KEY` | ZAI API key |
  | `FREE_CODING_MODELS_TELEMETRY` | `0` disables analytics, `1` enables analytics |
  | `FREE_CODING_MODELS_POSTHOG_KEY` | PostHog project API key used for anonymous event capture |
  | `FREE_CODING_MODELS_POSTHOG_HOST` | Optional PostHog ingest host (`https://eu.i.posthog.com` default) |
@@ -647,7 +762,8 @@ This script:
  "siliconflow": "sk_xxx",
  "together": "together_xxx",
  "cloudflare": "cf_xxx",
- "perplexity": "pplx_xxx"
+ "perplexity": "pplx_xxx",
+ "zai": "zai-xxx"
  },
  "providers": {
  "nvidia": { "enabled": true },
@@ -660,7 +776,8 @@ This script:
  "siliconflow": { "enabled": true },
  "together": { "enabled": true },
  "cloudflare": { "enabled": true },
- "perplexity": { "enabled": true }
+ "perplexity": { "enabled": true },
+ "zai": { "enabled": true }
  },
  "favorites": [
  "nvidia/deepseek-ai/deepseek-v3.2"
@@ -675,7 +792,7 @@ This script:
 
  **Configuration:**
  - **Ping timeout**: 15 seconds per attempt (slow models get more time)
- - **Ping interval**: 2 seconds between complete re-pings of all models (adjustable with W/X keys)
+ - **Ping interval**: 60 seconds between complete re-pings of all models (adjustable with W/X keys)
  - **Monitor mode**: Interface stays open forever, press Ctrl+C to exit
 
  **Flags:**
@@ -693,15 +810,22 @@ This script:
  | `--tier A` | Show only A+, A, A- tier models |
  | `--tier B` | Show only B+, B tier models |
  | `--tier C` | Show only C tier models |
+ | `--profile <name>` | Load a saved config profile on startup |
+ | `--recommend` | Auto-open Smart Recommend overlay on start |
 
  **Keyboard shortcuts (main TUI):**
  - **↑↓** — Navigate models
  - **Enter** — Select model (launches OpenCode or sets OpenClaw default, depending on mode)
- - **R/Y/O/M/L/A/S/N/H/V/U** — Sort by Rank/Tier/Origin/Model/LatestPing/Avg/SWE/Ctx/Health/Verdict/Uptime
+ - **R/Y/O/M/L/A/S/N/H/V/B/U** — Sort by Rank/Tier/Origin/Model/LatestPing/Avg/SWE/Ctx/Health/Verdict/Stability/Uptime
  - **F** — Toggle favorite on selected model (⭐ in Model column, pinned at top)
  - **T** — Cycle tier filter (All → S+ → S → A+ → A → A- → B+ → B → C → All)
  - **Z** — Cycle mode (OpenCode CLI → OpenCode Desktop → OpenClaw)
- - **P** — Open Settings (manage API keys, provider toggles, analytics toggle, manual update)
+ - **P** — Open Settings (manage API keys, provider toggles, analytics toggle, manual update, profiles)
+ - **Shift+P** — Cycle through saved profiles (switches live TUI settings)
+ - **Shift+S** — Save current TUI settings as a named profile (inline prompt)
+ - **Q** — Open Smart Recommend overlay (find the best model for your task)
+ - **E** — Elevate tier filter (show higher tiers)
+ - **D** — Descend tier filter (show lower tiers)
  - **W** — Decrease ping interval (faster pings)
  - **X** — Increase ping interval (slower pings)
  - **K** / **Esc** — Show/hide help overlay
@@ -710,15 +834,46 @@
  Pressing **K** now shows a full in-app reference: main hotkeys, settings hotkeys, and CLI flags with usage examples.
 
  **Keyboard shortcuts (Settings screen — `P` key):**
- - **↑↓** — Navigate providers, analytics row, and maintenance row
- - **Enter** — Edit API key inline, toggle analytics on analytics row, or check/install update on maintenance row
- - **Space** — Toggle provider enabled/disabled, or toggle analytics on analytics row
+ - **↑↓** — Navigate providers, analytics row, maintenance row, and profile rows
+ - **Enter** — Edit API key inline, toggle analytics, check/install update, or load a profile
+ - **Space** — Toggle provider enabled/disabled, or toggle analytics
  - **T** — Test current provider's API key (fires a live ping)
  - **U** — Check for updates manually from settings
+ - **Backspace** — Delete the selected profile (only on profile rows)
  - **Esc** — Close settings and return to main TUI
 
  ---
 
+ ### 📋 Config Profiles
+
+ Profiles let you save and restore different TUI configurations — useful if you switch between work/personal setups, different tier preferences, or want to keep separate favorites lists.
+
+ **What's stored in a profile:**
+ - Favorites (starred models)
+ - Sort column and direction
+ - Tier filter
+ - Ping interval
+ - API keys
+
+ **Saving a profile:**
+ 1. Configure the TUI the way you want (favorites, sort, tier, etc.)
+ 2. Press **Shift+S** — an inline prompt appears at the bottom
+ 3. Type a name (e.g. `work`, `fast-only`, `presentation`) and press **Enter**
+ 4. The profile is saved and becomes the active profile (shown as a purple badge in the header)
+
+ **Switching profiles:**
+ - **Shift+P** in the main table — cycles through saved profiles (or back to raw config)
+ - **`--profile <name>`** — load a specific profile on startup
+
+ **Managing profiles:**
+ - Open Settings (**P** key) — scroll down to the **Profiles** section
+ - **Enter** on a profile row to load it
+ - **Backspace** on a profile row to delete it
+
+ Profiles are stored inside `~/.free-coding-models.json` under the `profiles` key.
+
+ ---
+
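For illustration, a saved profile in `~/.free-coding-models.json` might look like the fragment below. Only the top-level `profiles` key is documented; every field name inside a profile entry is a hypothetical sketch of the stored settings the README lists (favorites, sort, tier filter, ping interval, API keys), not the tool's actual schema.

```json
{
  "profiles": {
    "work": {
      "favorites": ["nvidia/deepseek-ai/deepseek-v3.2"],
      "sort": { "column": "stability", "direction": "desc" },
      "tierFilter": "S",
      "pingIntervalMs": 60000,
      "apiKeys": { "groq": "gsk_xxx" }
    }
  }
}
```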
  ## 🔧 Development
 
  ```bash
@@ -772,5 +927,3 @@ We welcome contributions! Feel free to open issues, submit pull requests, or get
  For questions or issues, open a [GitHub issue](https://github.com/vava-nessa/free-coding-models/issues).
 
  💬 Let's talk about the project on Discord: https://discord.gg/5MbTnDC3Md
-
- > ⚠️ **free-coding-models is a BETA TUI** — it might crash or have problems. Use at your own risk and feel free to report issues!