better-ccflare 3.4.27 → 3.5.0

package/README.md CHANGED
@@ -185,6 +185,9 @@ RETRY_ATTEMPTS=3 # Number of retry attempts
  RETRY_DELAY_MS=1000 # Initial retry delay in milliseconds
  RETRY_BACKOFF=2 # Retry backoff multiplier
 
+ # Health endpoint
+ HEALTH_DETAIL_ENABLED=false # Enable ?detail=1 on /health to expose per-account status (default: off, set true for internal monitoring)
+
  # Storage
  STORE_PAYLOADS=false # Disable storing request/response bodies (reduces DB size and memory usage)
  # Token counts, costs, model, status and timing are still recorded
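The retry settings in this hunk can be read as an exponential backoff schedule. The sketch below is an illustration only: `retryConfigFromEnv` and `retryDelays` are hypothetical helpers, not part of better-ccflare, and both the fallback values (taken from the example values above) and the formula `RETRY_DELAY_MS * RETRY_BACKOFF^n` are assumptions about how the proxy interprets these variables.

```typescript
// Hypothetical helpers for illustration; not better-ccflare's own code.
interface RetryConfig {
  attempts: number; // RETRY_ATTEMPTS
  delayMs: number; // RETRY_DELAY_MS
  backoff: number; // RETRY_BACKOFF
}

// Parse a numeric setting, falling back when it is unset, blank, or not a number.
function numberOr(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Assumption: the fallbacks mirror the example values shown in the README hunk.
function retryConfigFromEnv(env: Record<string, string | undefined>): RetryConfig {
  return {
    attempts: numberOr(env.RETRY_ATTEMPTS, 3),
    delayMs: numberOr(env.RETRY_DELAY_MS, 1000),
    backoff: numberOr(env.RETRY_BACKOFF, 2),
  };
}

// Assumed schedule: the n-th retry (0-based) waits delayMs * backoff^n milliseconds.
function retryDelays(cfg: RetryConfig): number[] {
  return Array.from({ length: cfg.attempts }, (_, n) => cfg.delayMs * cfg.backoff ** n);
}
```

Under these assumptions, the example values (3 attempts, 1000 ms, multiplier 2) give waits of 1000, 2000, and 4000 ms.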
@@ -636,6 +639,9 @@ We recommend using one of the workarounds above until the npm bug is fixed.
  - **Auto-fallback** - Automatically switch back to higher priority Claude OAuth accounts when their usage windows reset
  - **Auto-refresh** - Automatically start new usage windows when they reset
  - **Usage Window Alignment** - Sessions automatically align with Claude OAuth usage window resets for optimal resource utilization
+ - **Usage Throttling** - Configurable monthly token/cost limits per account with peak-hours auto-pause for Zai accounts
+ - **503 on Pool Exhaustion** - Returns HTTP 503 when all accounts are rate-limited or paused, enabling client-side retry logic
+ - **Rate Limit Audit Trail** - Tracks when and why each account became rate-limited (`rate_limited_reason`, `rate_limited_at`)
 
  ### 🔗 Combos — Cross-Provider Fallback Chains
  - **Named Combos** - Create named fallback chains with ordered (account, model) slots
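The "503 on Pool Exhaustion" entry added above mentions client-side retry logic. A minimal sketch of what such a client could look like follows. `retryOn503` is a hypothetical helper, the doubling delay is an arbitrary choice, and nothing here is prescribed by better-ccflare. The request and the sleep function are injected, so the sketch makes no assumption about the HTTP client in use.

```typescript
// Hypothetical client-side helper; the README only states that 503 is returned
// when all accounts are rate-limited or paused.
interface MinimalResponse {
  status: number;
}

interface RetryOptions {
  maxAttempts: number; // total tries, including the first request
  baseDelayMs: number; // wait before the first retry; doubles on each later retry
  sleep: (ms: number) => Promise<void>; // injected so callers and tests control timing
}

async function retryOn503<T extends MinimalResponse>(
  send: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let response = await send();
  for (let attempt = 1; attempt < options.maxAttempts && response.status === 503; attempt++) {
    // Back off before retrying: baseDelayMs, then 2x, then 4x, and so on.
    await options.sleep(options.baseDelayMs * 2 ** (attempt - 1));
    response = await send();
  }
  // Either a non-503 response, or the last 503 once attempts are used up.
  return response;
}
```

A caller would typically pass `(ms) => new Promise((resolve) => setTimeout(resolve, ms))` as `sleep` and wrap its own request in `send`. Any status other than 503 is returned immediately, so other errors are not retried by this sketch.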
@@ -660,6 +666,9 @@ We recommend using one of the workarounds above until the npm bug is fixed.
  - Web dashboard (`http://localhost:8080/dashboard`)
  - CLI for account management
  - REST API for automation
+ - `--doctor` command for database integrity checks and telemetry
+ - Reasoning effort compatibility layer for OpenAI/Codex routes (downgrade mapping, `count_tokens` support)
+ - `/health` endpoint with three-state pool status (`healthy`/`degraded`/`unhealthy`), 503 on degraded/unhealthy, optional `?detail=1` behind `HEALTH_DETAIL_ENABLED`
 
  ### 🔒 Production Ready
  - Automatic failover between accounts
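The `/health` entry added above describes three pool states and a 503 response for the two non-healthy ones. The sketch below shows one way a monitoring script might consume it. The response shape is an assumption: the README names the states but not the body format, so the JSON `status` field, `parsePoolStatus`, `healthUrl`, and `checkHealth` are all hypothetical.

```typescript
type PoolStatus = "healthy" | "degraded" | "unhealthy";

// Assumption: the body is JSON with a top-level `status` field holding one of the three states.
function parsePoolStatus(body: unknown): PoolStatus | undefined {
  if (typeof body !== "object" || body === null) return undefined;
  const status = (body as { status?: unknown }).status;
  return status === "healthy" || status === "degraded" || status === "unhealthy"
    ? status
    : undefined;
}

// Build the probe URL; `?detail=1` is only honored when HEALTH_DETAIL_ENABLED=true on the server.
function healthUrl(baseUrl: string, detail = false): string {
  return `${baseUrl.replace(/\/+$/, "")}/health${detail ? "?detail=1" : ""}`;
}

// Minimal fetch-like signature so the sketch does not depend on a particular HTTP client.
type FetchLike = (url: string) => Promise<{ status: number; json(): Promise<unknown> }>;

async function checkHealth(
  baseUrl: string,
  fetchLike: FetchLike,
): Promise<{ available: boolean; pool?: PoolStatus }> {
  const res = await fetchLike(healthUrl(baseUrl));
  let pool: PoolStatus | undefined;
  try {
    pool = parsePoolStatus(await res.json());
  } catch {
    pool = undefined; // body was not JSON; rely on the HTTP status alone
  }
  // Per the README entry, degraded and unhealthy pools answer with 503.
  return { available: res.status !== 503, pool };
}
```

With the dashboard address from the hunk above, a probe would be `checkHealth("http://localhost:8080", fetch)` in a runtime that provides a global `fetch`. The HTTP status is treated as the source of truth, and the parsed state is informational.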
@@ -770,11 +779,11 @@ Inspired by [snipeship/ccflare](https://github.com/snipeship/ccflare) - thanks f
  - [@anonym-uz](https://github.com/anonym-uz) - Critical auto-pause bug fix, analytics performance optimizations, request body truncation, and incremental vacuum implementation
  - [@makhweeb](https://github.com/makhweeb) - Enhanced request handling and analytics improvements
  - [@jw409](https://github.com/jw409) - Fixed OAuth account addition in WSL2 and compiled binaries by replacing unreliable prompt() with readline; systemd deployment guide, BUN_JSC_* crash loop analysis, and preflight environment validator (PR #106)
- - [@materemias](https://github.com/materemias) - Testing and validation of Vertex AI provider implementation, thorough debugging of OAuth API key authentication (issue #54), requesting and validating AWS Bedrock support (issue #49), and extensive testing of new releases and features
+ - [@materemias](https://github.com/materemias) - Testing and validation of Vertex AI provider implementation, thorough debugging of OAuth API key authentication (issue #54), requesting and validating AWS Bedrock support (issue #49), extensive testing of new releases and features; fix request details modal hydration race where payload row committed after parent request row, adding lazy re-fetch via `/api/requests/payload/:id` with 404 fallback (PR #186)
  - [@tqtensor](https://github.com/tqtensor) - Comprehensive memory leak fix preventing OOM kills with smart chunk capping, memory monitoring, and optimized cleanup (PR #67)
  - [@lunetics](https://github.com/lunetics) - Force-reset rate limit feature allowing manual clearing of stale rate-limit locks via API, CLI, and dashboard with immediate usage polling (PR #68), OOM kill prevention with periodic data retention cleanup, 3-day default retention, and time-scoped stats queries (PR #70), model registry sync removing retired models and adding sonnet-4.6 CLI shortcut (PR #71)
  - [@troykelly](https://github.com/troykelly) - Comprehensive PostgreSQL compatibility fixes including boolean type handling, identifier case preservation, BIGINT string coercion, UNION ALL type alignment, HAVING clause compatibility, parameter ordering corrections, worker initialization, and connection pooling (issue #81); detailed bug report and root cause analysis for `/api/accounts` Invalid Date error affecting PostgreSQL BIGINT columns (issue #88)
- - [@cowwoc](https://github.com/cowwoc) - Compact reliability fixes, space breakdown after cleanup, requests tab without payloads (PR #149); clear stale rate_limited_until when usage API shows capacity restored (PR #150); prevent manually-paused accounts from being selected as auto-fallback candidates (PR #151); balance new sessions by utilization within same-priority accounts using water-filling algorithm (PR #152); fix ModuleNotFound crash in compiled binary when using Compact Database by embedding vacuum-worker at build time (PR #155); expected usage position indicator on rate limit bars showing projected pacing vs. reset window (PR #156); show explicit rate-limited state when usage data unavailable on startup (PR #161); deduplicate concurrent fetchAndCache calls per account to prevent redundant Anthropic requests (PR #159); make GET /api/accounts cache-only and await usage fetch in refresh endpoint, eliminating blind 5s timeout (PR #162); mark account rate-limited when all models exhausted to prevent stale-state retry loops (PR #163); use retry-after header for dynamic model-exhaustion cooldown instead of hardcoded 1 hour (PR #164); remove implicit sonnet catch-all in getModelList preventing silent unexpected model remaps (PR #165); reduce log noise by aggregating auto-unpause skip messages and suppressing identity model mapping logs (PR #167)
+ - [@cowwoc](https://github.com/cowwoc) - Compact reliability fixes, space breakdown after cleanup, requests tab without payloads (PR #149); clear stale rate_limited_until when usage API shows capacity restored (PR #150); prevent manually-paused accounts from being selected as auto-fallback candidates (PR #151); balance new sessions by utilization within same-priority accounts using water-filling algorithm (PR #152); fix ModuleNotFound crash in compiled binary when using Compact Database by embedding vacuum-worker at build time (PR #155); expected usage position indicator on rate limit bars showing projected pacing vs. reset window (PR #156); show explicit rate-limited state when usage data unavailable on startup (PR #161); deduplicate concurrent fetchAndCache calls per account to prevent redundant Anthropic requests (PR #159); make GET /api/accounts cache-only and await usage fetch in refresh endpoint, eliminating blind 5s timeout (PR #162); mark account rate-limited when all models exhausted to prevent stale-state retry loops (PR #163); use retry-after header for dynamic model-exhaustion cooldown instead of hardcoded 1 hour (PR #164); remove implicit sonnet catch-all in getModelList preventing silent unexpected model remaps (PR #165); reduce log noise by aggregating auto-unpause skip messages and suppressing identity model mapping logs (PR #167); reasoning effort compatibility layer for OpenAI/Codex routes with deterministic downgrade mapping and count_tokens path support (PR #172, implemented in PR #188)
 
  ## Contributing
 
Binary file
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "better-ccflare",
- "version": "3.4.27",
+ "version": "3.5.0",
  "description": "Load balancer proxy for Claude API with intelligent distribution across multiple OAuth accounts to avoid rate limiting",
  "license": "MIT",
  "repository": {