free-coding-models 0.1.66 → 0.1.68

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -24,7 +24,7 @@
 
  <p align="center">
  <strong>Find the fastest coding LLM models in seconds</strong><br>
- <sub>Ping free coding models from 17 providers in real-time — pick the best one for OpenCode, OpenClaw, or any AI coding assistant</sub>
+ <sub>Ping free coding models from 18 providers in real-time — pick the best one for OpenCode, OpenClaw, or any AI coding assistant</sub>
  </p>
 
  <p align="center">
@@ -36,7 +36,9 @@
  <a href="#-requirements">Requirements</a> •
  <a href="#-installation">Installation</a> •
  <a href="#-usage">Usage</a> •
- <a href="#-models">Models</a> •
+ <a href="#-tui-columns">Columns</a> •
+ <a href="#-stability-score">Stability</a> •
+ <a href="#-coding-models">Models</a> •
  <a href="#-opencode-integration">OpenCode</a> •
  <a href="#-openclaw-integration">OpenClaw</a> •
  <a href="#-how-it-works">How it works</a>
@@ -47,14 +49,15 @@
  ## ✨ Features
 
  - **🎯 Coding-focused** — Only LLM models optimized for code generation, not chat or vision
- - **🌐 Multi-provider** — 134 models from NVIDIA NIM, Groq, Cerebras, SambaNova, OpenRouter, Hugging Face Inference, Replicate, DeepInfra, Fireworks AI, Codestral, Hyperbolic, Scaleway, Google AI, SiliconFlow, Together AI, Cloudflare Workers AI, and Perplexity API
+ - **🌐 Multi-provider** — Models from NVIDIA NIM, Groq, Cerebras, SambaNova, OpenRouter, Hugging Face Inference, Replicate, DeepInfra, Fireworks AI, Codestral, Hyperbolic, Scaleway, Google AI, SiliconFlow, Together AI, Cloudflare Workers AI, Perplexity API, and ZAI
  - **⚙️ Settings screen** — Press `P` to manage provider API keys, enable/disable providers, test keys live, and manually check/install updates
  - **🚀 Parallel pings** — All models tested simultaneously via native `fetch`
  - **📊 Real-time animation** — Watch latency appear live in alternate screen buffer
  - **🏆 Smart ranking** — Top 3 fastest models highlighted with medals 🥇🥈🥉
- - **⏱ Continuous monitoring** — Pings all models every 2 seconds forever, never stops
+ - **⏱ Continuous monitoring** — Pings all models every 60 seconds forever, never stops
  - **📈 Rolling averages** — Avg calculated from ALL successful pings since start
  - **📊 Uptime tracking** — Percentage of successful pings shown in real-time
+ - **📐 Stability score** — Composite 0–100 score measuring consistency (p95, jitter, spikes, uptime) — a model with a 400ms avg and stable responses beats a 250ms-avg model that randomly spikes to 6s
  - **🔄 Auto-retry** — Timeout models keep getting retried, nothing is ever "given up on"
  - **🎮 Interactive selection** — Navigate with arrow keys directly in the table, press Enter to act
  - **🔀 Startup mode menu** — Choose between OpenCode and OpenClaw before the TUI launches
@@ -92,6 +95,7 @@ Before using `free-coding-models`, make sure you have:
  - **Together AI** — [api.together.ai/settings/api-keys](https://api.together.ai/settings/api-keys) → API Keys (credits/promotions vary)
  - **Cloudflare Workers AI** — [dash.cloudflare.com](https://dash.cloudflare.com) → Create API token + set `CLOUDFLARE_ACCOUNT_ID` (Free: 10k neurons/day)
  - **Perplexity API** — [perplexity.ai/settings/api](https://www.perplexity.ai/settings/api) → API Key (tiered limits by spend)
+ - **ZAI** — [z.ai](https://z.ai) → Get API key (Coding Plan subscription)
  3. **OpenCode** *(optional)* — [Install OpenCode](https://github.com/opencode-ai/opencode) to use the OpenCode integration
  4. **OpenClaw** *(optional)* — [Install OpenClaw](https://openclaw.ai) to use the OpenClaw integration
 
@@ -176,13 +180,13 @@ When you run `free-coding-models` without `--opencode` or `--openclaw`, you get
  Use `↑↓` arrows to select, `Enter` to confirm. Then the TUI launches with your chosen mode shown in the header badge.
 
  **How it works:**
- 1. **Ping phase** — All enabled models are pinged in parallel (up to 134 across 17 providers)
- 2. **Continuous monitoring** — Models are re-pinged every 2 seconds forever
+ 1. **Ping phase** — All enabled models are pinged in parallel (up to 139 across 18 providers)
+ 2. **Continuous monitoring** — Models are re-pinged every 60 seconds forever
  3. **Real-time updates** — Watch "Latest", "Avg", and "Up%" columns update live
  4. **Select anytime** — Use ↑↓ arrows to navigate, press Enter on a model to act
  5. **Smart detection** — Automatically detects if NVIDIA NIM is configured in OpenCode or OpenClaw
 
- Setup wizard (first run — walks through all 17 providers):
+ Setup wizard (first run — walks through all 18 providers):
 
  ```
  🔑 First-time setup — API keys
@@ -265,6 +269,7 @@ SILICONFLOW_API_KEY=sk_xxx free-coding-models
  TOGETHER_API_KEY=together_xxx free-coding-models
  CLOUDFLARE_API_TOKEN=cf_xxx CLOUDFLARE_ACCOUNT_ID=your_account_id free-coding-models
  PERPLEXITY_API_KEY=pplx_xxx free-coding-models
+ ZAI_API_KEY=zai-xxx free-coding-models
  FREE_CODING_MODELS_TELEMETRY=0 free-coding-models
  ```
 
@@ -347,13 +352,24 @@ When enabled, telemetry events include: event name, app version, selected mode,
  1. Sign up at [perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)
  2. Create API key (`PERPLEXITY_API_KEY`)
 
- > 💡 **Free tiers** each provider exposes a dev/free tier with its own quotas.
+ **ZAI** (5 models, GLM family):
+ 1. Sign up at [z.ai](https://z.ai)
+ 2. Subscribe to Coding Plan
+ 3. Get API key from dashboard
+
+ > 💡 **Free tiers** — each provider exposes a dev/free tier with its own quotas. ZAI requires a Coding Plan subscription.
 
  ---
 
  ## 🤖 Coding Models
 
- **134 coding models** across 17 providers and 8 tiers, ranked by [SWE-bench Verified](https://www.swebench.com) — the industry-standard benchmark measuring real GitHub issue resolution. Scores are self-reported by providers unless noted.
+ **139 coding models** across 18 providers and 8 tiers, ranked by [SWE-bench Verified](https://www.swebench.com) — the industry-standard benchmark measuring real GitHub issue resolution. Scores are self-reported by providers unless noted.
+
+ ### ZAI Coding Plan (5 models)
+
+ | Tier | SWE-bench | Model |
+ |------|-----------|-------|
+ | **S+** ≥70% | 70.0–77.8% | GLM-5 (77.8%), GLM-4.5 (75.0%), GLM-4.7 (73.8%), GLM-4.5-Air (72.0%), GLM-4.6 (70.0%) |
 
  ### NVIDIA NIM (44 models)
 
@@ -414,6 +430,92 @@ Current tier filter is shown in the header badge (e.g., `[Tier S]`)
 
  ---
 
+ ## 📊 TUI Columns
+
+ The main table displays one row per model with the following columns:
+
+ | Column | Sort key | Description |
+ |--------|----------|-------------|
+ | **Rank** | `R` | Position based on current sort order (medals for top 3: 🥇🥈🥉) |
+ | **Tier** | `Y` | SWE-bench tier (S+, S, A+, A, A-, B+, B, C) |
+ | **SWE%** | `S` | SWE-bench Verified score — the industry-standard benchmark for real GitHub issue resolution |
+ | **CTX** | `C` | Context window size in thousands of tokens (e.g. `128k`) |
+ | **Model** | `M` | Model display name (favorites show ⭐ prefix) |
+ | **Origin** | `N` | Provider name (NIM, Groq, Cerebras, etc.) — press `N` to cycle origin filter |
+ | **Latest Ping** | `L` | Most recent round-trip latency in milliseconds |
+ | **Avg Ping** | `A` | Rolling average of ALL successful pings since launch |
+ | **Health** | `H` | Current status: UP ✅, NO KEY 🔑, Timeout ⏳, Overloaded 🔥, Not Found 🚫 |
+ | **Verdict** | `V` | Health verdict based on avg latency + stability analysis (see below) |
+ | **Stability** | `B` | Composite 0–100 consistency score (see [Stability Score](#-stability-score)) |
+ | **Up%** | `U` | Uptime — percentage of successful pings out of total attempts |
+
+ ### Verdict values
+
+ The Verdict column combines average latency with stability analysis:
+
+ | Verdict | Meaning |
+ |---------|---------|
+ | **Perfect** | Avg < 400ms with stable p95/jitter |
+ | **Normal** | Avg < 1000ms, consistent responses |
+ | **Slow** | Avg 1000–2000ms |
+ | **Spiky** | Good avg but erratic tail latency (p95 >> avg) |
+ | **Very Slow** | Avg 2000–5000ms |
+ | **Overloaded** | Server returned 429/503 (rate limited or capacity hit) |
+ | **Unstable** | Was previously up but now timing out, or avg > 5000ms |
+ | **Not Active** | No successful pings yet |
+ | **Pending** | First ping still in flight |
+
+ ---
+
+ ## 📐 Stability Score
+
+ The **Stability** column (sort with `B` key) shows a composite 0–100 score that answers: *"How consistent and predictable is this model?"*
+
+ Average latency alone is misleading — a model averaging 250ms that randomly spikes to 6 seconds *feels* slower in practice than a steady 400ms model. The stability score captures this.
+
+ ### Formula
+
+ Four signals are normalized to 0–100 each, then combined with weights:
+
+ ```
+ Stability = 0.30 × p95_score
+           + 0.30 × jitter_score
+           + 0.20 × spike_score
+           + 0.20 × reliability_score
+ ```
+
+ | Component | Weight | What it measures | How it's normalized |
+ |-----------|--------|------------------|---------------------|
+ | **p95 latency** | 30% | Tail-latency spikes — the worst 5% of response times | `100 × (1 - p95 / 5000)`, clamped to 0–100 |
+ | **Jitter (σ)** | 30% | Erratic response times — standard deviation of ping times | `100 × (1 - jitter / 2000)`, clamped to 0–100 |
+ | **Spike rate** | 20% | Fraction of pings above 3000ms | `100 × (1 - spikes / total_pings)` |
+ | **Reliability** | 20% | Uptime — fraction of successful HTTP 200 pings | Direct uptime percentage (0–100) |
+
+ ### Color coding
+
+ | Score | Color | Interpretation |
+ |-------|-------|----------------|
+ | **80–100** | Green | Rock-solid — very consistent, safe to rely on |
+ | **60–79** | Cyan | Good — occasional variance but generally stable |
+ | **40–59** | Yellow | Shaky — noticeable inconsistency |
+ | **< 40** | Red | Unreliable — frequent spikes or failures |
+ | **—** | Dim | No data yet (no successful pings) |
+
+ ### Example
+
+ Two models with similar average latency, very different real-world experience:
+
+ ```
+ Model A: avg 250ms, p95 6000ms, jitter 1800ms → Stability ~30 (red)
+ Model B: avg 400ms, p95 650ms,  jitter 120ms  → Stability ~85 (green)
+ ```
+
+ Model B is the better choice despite its higher average — it won't randomly stall your coding workflow.
+
+ > 💡 **Tip:** Sort by Stability (`B` key) after a few minutes of monitoring to find the models that deliver the most predictable performance.
+
+ ---
+
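The weights and normalizations documented in the hunk above can be sketched in TypeScript. This is an illustration of the documented math, not the package's actual code; in particular, how failed pings enter the spike-rate denominator is an assumption here.

```typescript
// Illustrative sketch of the documented stability formula.
// `pings` holds successful latencies in ms; `attempts` counts every ping,
// successful or not (assumption: spikes are measured against successes).
function clamp(x: number, lo = 0, hi = 100): number {
  return Math.max(lo, Math.min(hi, x));
}

function stabilityScore(pings: number[], attempts: number): number | null {
  if (pings.length === 0) return null; // no data yet → shown as "—" in the TUI

  // p95 via nearest-rank on a sorted copy
  const sorted = [...pings].sort((a, b) => a - b);
  const p95 = sorted[Math.min(sorted.length - 1, Math.ceil(0.95 * sorted.length) - 1)];

  // jitter = standard deviation of successful ping times
  const mean = pings.reduce((s, v) => s + v, 0) / pings.length;
  const jitter = Math.sqrt(pings.reduce((s, v) => s + (v - mean) ** 2, 0) / pings.length);

  const spikes = pings.filter((v) => v > 3000).length;

  const p95Score = clamp(100 * (1 - p95 / 5000));
  const jitterScore = clamp(100 * (1 - jitter / 2000));
  const spikeScore = 100 * (1 - spikes / pings.length);
  const reliabilityScore = 100 * (pings.length / attempts); // uptime %

  return 0.3 * p95Score + 0.3 * jitterScore + 0.2 * spikeScore + 0.2 * reliabilityScore;
}
```

Under this sketch a steady ~400ms model lands in the green band, while a 250ms-average model with 6s spikes falls into the red band, matching the README's Model A/Model B example.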
  ## 🔌 OpenCode Integration
 
  **The easiest way** — let `free-coding-models` do everything:
@@ -439,6 +541,18 @@ You can force a specific port:
  OPENCODE_PORT=4098 free-coding-models --opencode
  ```
 
+ ### ZAI provider proxy
+
+ OpenCode doesn't natively support ZAI's API path format (`/api/coding/paas/v4/*`). When you select a ZAI model, `free-coding-models` automatically starts a local reverse proxy that translates OpenCode's standard `/v1/*` requests to ZAI's API. This is fully transparent: just select a ZAI model and press Enter.
+
+ **How it works:**
+ 1. A localhost HTTP proxy starts on a random available port
+ 2. OpenCode is configured with a `zai` provider pointing at `http://localhost:<port>/v1`
+ 3. The proxy rewrites `/v1/models` to `/api/coding/paas/v4/models` and `/v1/chat/completions` to `/api/coding/paas/v4/chat/completions`
+ 4. When OpenCode exits, the proxy shuts down automatically
+
+ No manual configuration needed: the proxy lifecycle is managed entirely by `free-coding-models`.
+
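The path translation and random-port listen described in the steps above can be sketched as follows. This is an illustration only: the upstream base URL, header handling, and body streaming are simplified assumptions, not the package's actual proxy code.

```typescript
import { createServer, type Server } from "node:http";

// Sketch of the documented /v1/* → /api/coding/paas/v4/* rewrite.
const ZAI_PREFIX = "/api/coding/paas/v4";

function rewritePath(path: string): string {
  // OpenCode issues standard OpenAI-style /v1/* requests.
  return path.startsWith("/v1/") ? ZAI_PREFIX + path.slice("/v1".length) : path;
}

// Minimal forwarding proxy (illustration; upstreamBase is an assumption —
// the real tool derives it from provider configuration).
function startZaiProxy(upstreamBase: string): Server {
  const server = createServer(async (req, res) => {
    const target = upstreamBase + rewritePath(req.url ?? "/");
    const upstream = await fetch(target, {
      method: req.method,
      headers: { authorization: req.headers.authorization ?? "" },
      // NOTE: request/response body streaming is omitted in this sketch.
    });
    res.writeHead(upstream.status);
    res.end(await upstream.text());
  });
  server.listen(0); // port 0 → OS picks a random available port, as the README describes
  return server;
}
```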
  ### Manual OpenCode Setup (Optional)
 
  Create or edit `~/.config/opencode/opencode.json`:
@@ -589,19 +703,19 @@ This script:
  ## ⚙️ How it works
 
  ```
- ┌─────────────────────────────────────────────────────────────┐
- │ 1. Enter alternate screen buffer (like vim/htop/less)
- │ 2. Ping ALL models in parallel
- │ 3. Display real-time table with Latest/Avg/Up% columns
- │ 4. Re-ping ALL models every 2 seconds (forever)
- │ 5. Update rolling averages from ALL successful pings
- │ 6. User can navigate with ↑↓ and select with Enter
- │ 7. On Enter (OpenCode): set model, launch OpenCode
- │ 8. On Enter (OpenClaw): update ~/.openclaw/openclaw.json
- └─────────────────────────────────────────────────────────────┘
+ ┌──────────────────────────────────────────────────────────────────┐
+ │ 1. Enter alternate screen buffer (like vim/htop/less)
+ │ 2. Ping ALL models in parallel
+ │ 3. Display real-time table with Latest/Avg/Stability/Up%
+ │ 4. Re-ping ALL models every 60 seconds (forever)
+ │ 5. Update rolling averages + stability scores per model
+ │ 6. User can navigate with ↑↓ and select with Enter
+ │ 7. On Enter (OpenCode): set model, launch OpenCode
+ │ 8. On Enter (OpenClaw): update ~/.openclaw/openclaw.json
+ └──────────────────────────────────────────────────────────────────┘
  ```
 
- **Result:** Continuous monitoring interface that stays open until you select a model or press Ctrl+C. Rolling averages give you accurate long-term latency data, uptime percentage tracks reliability, and you can configure your tool of choice with your chosen model in one keystroke.
+ **Result:** Continuous monitoring interface that stays open until you select a model or press Ctrl+C. Rolling averages give you accurate long-term latency data, the stability score reveals which models are truly consistent vs. deceptively spiky, and you can configure your tool of choice with one keystroke.
 
  ---
 
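The loop in the box above boils down to two pieces: parallel pings via native `fetch` and an incremental mean over all successful pings. A minimal sketch (endpoint handling and retry policy are assumptions, but the running average uses the standard incremental-mean update):

```typescript
// Sketch of the monitoring loop's bookkeeping — illustration, not the
// package's actual implementation.
type ModelStats = { attempts: number; successes: number; avgMs: number; latestMs: number | null };

function record(stats: ModelStats, latencyMs: number | null): ModelStats {
  const attempts = stats.attempts + 1;
  if (latencyMs === null) {
    // Failed ping: only Up% moves; the average stays put and the model is retried.
    return { ...stats, attempts, latestMs: null };
  }
  const successes = stats.successes + 1;
  // Incremental mean over ALL successful pings: avg' = avg + (x - avg) / n
  const avgMs = stats.avgMs + (latencyMs - stats.avgMs) / successes;
  return { attempts, successes, avgMs, latestMs: latencyMs };
}

async function pingOnce(url: string, timeoutMs = 15_000): Promise<number | null> {
  const t0 = Date.now();
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok ? Date.now() - t0 : null;
  } catch {
    return null; // timeout or network error → retried next cycle
  }
}

async function pingAll(urls: string[]): Promise<(number | null)[]> {
  return Promise.all(urls.map((u) => pingOnce(u))); // all models in parallel
}
```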
@@ -628,6 +742,7 @@ This script:
  | `CLOUDFLARE_API_TOKEN` / `CLOUDFLARE_API_KEY` | Cloudflare Workers AI token/key |
  | `CLOUDFLARE_ACCOUNT_ID` | Cloudflare account ID (required for Workers AI endpoint URL) |
  | `PERPLEXITY_API_KEY` / `PPLX_API_KEY` | Perplexity API key |
+ | `ZAI_API_KEY` | ZAI API key |
  | `FREE_CODING_MODELS_TELEMETRY` | `0` disables analytics, `1` enables analytics |
  | `FREE_CODING_MODELS_POSTHOG_KEY` | PostHog project API key used for anonymous event capture |
  | `FREE_CODING_MODELS_POSTHOG_HOST` | Optional PostHog ingest host (`https://eu.i.posthog.com` default) |
@@ -647,7 +762,8 @@ This script:
  "siliconflow": "sk_xxx",
  "together": "together_xxx",
  "cloudflare": "cf_xxx",
- "perplexity": "pplx_xxx"
+ "perplexity": "pplx_xxx",
+ "zai": "zai-xxx"
  },
  "providers": {
  "nvidia": { "enabled": true },
@@ -660,7 +776,8 @@ This script:
  "siliconflow": { "enabled": true },
  "together": { "enabled": true },
  "cloudflare": { "enabled": true },
- "perplexity": { "enabled": true }
+ "perplexity": { "enabled": true },
+ "zai": { "enabled": true }
  },
  "favorites": [
  "nvidia/deepseek-ai/deepseek-v3.2"
@@ -675,7 +792,7 @@ This script:
 
  **Configuration:**
  - **Ping timeout**: 15 seconds per attempt (slow models get more time)
- - **Ping interval**: 2 seconds between complete re-pings of all models (adjustable with W/X keys)
+ - **Ping interval**: 60 seconds between complete re-pings of all models (adjustable with W/X keys)
  - **Monitor mode**: Interface stays open forever, press Ctrl+C to exit
 
  **Flags:**
@@ -693,15 +810,22 @@ This script:
  | `--tier A` | Show only A+, A, A- tier models |
  | `--tier B` | Show only B+, B tier models |
  | `--tier C` | Show only C tier models |
+ | `--profile <name>` | Load a saved config profile on startup |
+ | `--recommend` | Auto-open Smart Recommend overlay on start |
 
  **Keyboard shortcuts (main TUI):**
  - **↑↓** — Navigate models
  - **Enter** — Select model (launches OpenCode or sets OpenClaw default, depending on mode)
- - **R/Y/O/M/L/A/S/N/H/V/U** — Sort by Rank/Tier/Origin/Model/LatestPing/Avg/SWE/Ctx/Health/Verdict/Uptime
+ - **R/Y/O/M/L/A/S/N/H/V/B/U** — Sort by Rank/Tier/Origin/Model/LatestPing/Avg/SWE/Ctx/Health/Verdict/Stability/Uptime
  - **F** — Toggle favorite on selected model (⭐ in Model column, pinned at top)
  - **T** — Cycle tier filter (All → S+ → S → A+ → A → A- → B+ → B → C → All)
  - **Z** — Cycle mode (OpenCode CLI → OpenCode Desktop → OpenClaw)
- - **P** — Open Settings (manage API keys, provider toggles, analytics toggle, manual update)
+ - **P** — Open Settings (manage API keys, provider toggles, analytics toggle, manual update, profiles)
+ - **Shift+P** — Cycle through saved profiles (switches live TUI settings)
+ - **Shift+S** — Save current TUI settings as a named profile (inline prompt)
+ - **Q** — Open Smart Recommend overlay (find the best model for your task)
+ - **E** — Elevate tier filter (show higher tiers)
+ - **D** — Descend tier filter (show lower tiers)
  - **W** — Decrease ping interval (faster pings)
  - **X** — Increase ping interval (slower pings)
  - **K** / **Esc** — Show/hide help overlay
@@ -710,15 +834,46 @@
  Pressing **K** now shows a full in-app reference: main hotkeys, settings hotkeys, and CLI flags with usage examples.
 
  **Keyboard shortcuts (Settings screen — `P` key):**
- - **↑↓** — Navigate providers, analytics row, and maintenance row
- - **Enter** — Edit API key inline, toggle analytics on analytics row, or check/install update on maintenance row
- - **Space** — Toggle provider enabled/disabled, or toggle analytics on analytics row
+ - **↑↓** — Navigate providers, analytics row, maintenance row, and profile rows
+ - **Enter** — Edit API key inline, toggle analytics, check/install update, or load a profile
+ - **Space** — Toggle provider enabled/disabled, or toggle analytics
  - **T** — Test current provider's API key (fires a live ping)
  - **U** — Check for updates manually from settings
+ - **Backspace** — Delete the selected profile (only on profile rows)
  - **Esc** — Close settings and return to main TUI
 
  ---
 
+ ### 📋 Config Profiles
+
+ Profiles let you save and restore different TUI configurations — useful if you switch between work/personal setups, different tier preferences, or want to keep separate favorites lists.
+
+ **What's stored in a profile:**
+ - Favorites (starred models)
+ - Sort column and direction
+ - Tier filter
+ - Ping interval
+ - API keys
+
+ **Saving a profile:**
+ 1. Configure the TUI the way you want (favorites, sort, tier, etc.)
+ 2. Press **Shift+S** — an inline prompt appears at the bottom
+ 3. Type a name (e.g. `work`, `fast-only`, `presentation`) and press **Enter**
+ 4. The profile is saved and becomes the active profile (shown as a purple badge in the header)
+
+ **Switching profiles:**
+ - **Shift+P** in the main table — cycles through saved profiles (or back to raw config)
+ - **`--profile <name>`** — load a specific profile on startup
+
+ **Managing profiles:**
+ - Open Settings (**P** key) — scroll down to the **Profiles** section
+ - **Enter** on a profile row to load it
+ - **Backspace** on a profile row to delete it
+
+ Profiles are stored inside `~/.free-coding-models.json` under the `profiles` key.
+
+ ---
+
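For illustration, a saved profile in `~/.free-coding-models.json` might look like the fragment below. Only the top-level `profiles` key is documented; every field name inside a profile entry is a hypothetical sketch of the stored settings the README lists (favorites, sort, tier filter, ping interval, API keys), not the tool's actual schema.

```json
{
  "profiles": {
    "work": {
      "favorites": ["nvidia/deepseek-ai/deepseek-v3.2"],
      "sort": { "column": "stability", "direction": "desc" },
      "tierFilter": "S",
      "pingIntervalMs": 60000,
      "apiKeys": { "groq": "gsk_xxx" }
    }
  }
}
```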
  ## 🔧 Development
 
  ```bash
@@ -772,5 +927,3 @@ We welcome contributions! Feel free to open issues, submit pull requests, or get
  For questions or issues, open a [GitHub issue](https://github.com/vava-nessa/free-coding-models/issues).
 
  💬 Let's talk about the project on Discord: https://discord.gg/5MbTnDC3Md
-
- > ⚠️ **free-coding-models is a BETA TUI** — it might crash or have problems. Use at your own risk and feel free to report issues!