npm - better-ccflare - Versions diffs - 3.4.27 → 3.5.0 - Mend

better-ccflare 3.4.27 → 3.5.0

This diff compares the contents of two publicly released versions of the package as they appear in the public registry. It is provided for informational purposes only.

Files changed (3)

package/README.md CHANGED

@@ -185,6 +185,9 @@ RETRY_ATTEMPTS=3                       # Number of retry attempts
 RETRY_DELAY_MS=1000                   # Initial retry delay in milliseconds
 RETRY_BACKOFF=2                        # Retry backoff multiplier
+# Health endpoint
+HEALTH_DETAIL_ENABLED=false            # Enable ?detail=1 on /health to expose per-account status (default: off, set true for internal monitoring)
 # Storage
 STORE_PAYLOADS=false                   # Disable storing request/response bodies (reduces DB size and memory usage)
                                        # Token counts, costs, model, status and timing are still recorded
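Note on the hunk above: the new `HEALTH_DETAIL_ENABLED` flag gates the `?detail=1` query parameter on `/health`. The diff does not show how the flag is parsed, so the following is only a minimal sketch of that kind of gate. The helper name `isDetailRequested` is hypothetical, and treating only the literal string `true` as "on" is an assumption.

```typescript
// Hypothetical helper, not taken from better-ccflare's source.
// Assumption: only the literal value "true" (case-insensitive) enables detail output.
function isDetailRequested(
  envValue: string | undefined,
  requestUrl: string,
): boolean {
  // The flag defaults to off when the variable is unset, matching the README default.
  const enabled = (envValue ?? "false").trim().toLowerCase() === "true";
  // Detail is only requested when the query string carries detail=1.
  const wantsDetail = new URL(requestUrl).searchParams.get("detail") === "1";
  return enabled && wantsDetail;
}
```

With this shape, a request for `/health?detail=1` gets the plain response unless the operator has opted in, which keeps per-account status private by default.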
@@ -636,6 +639,9 @@ We recommend using one of the workarounds above until the npm bug is fixed.
 - **Auto-fallback** - Automatically switch back to higher priority Claude OAuth accounts when their usage windows reset
 - **Auto-refresh** - Automatically start new usage windows when they reset
 - **Usage Window Alignment** - Sessions automatically align with Claude OAuth usage window resets for optimal resource utilization
+- **Usage Throttling** - Configurable monthly token/cost limits per account with peak-hours auto-pause for Zai accounts
+- **503 on Pool Exhaustion** - Returns HTTP 503 when all accounts are rate-limited or paused, enabling client-side retry logic
+- **Rate Limit Audit Trail** - Tracks when and why each account became rate-limited (`rate_limited_reason`, `rate_limited_at`)
 ### 🔗 Combos — Cross-Provider Fallback Chains
 - **Named Combos** - Create named fallback chains with ordered (account, model) slots
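Note on the hunk above: the "503 on Pool Exhaustion" entry says the 503 response is meant to enable client-side retry logic. A client could react to it roughly as sketched below. This is an illustration, not code from the package: `retryDelays` and `requestWithRetry` are hypothetical names, and the defaults simply mirror the `RETRY_ATTEMPTS`, `RETRY_DELAY_MS`, and `RETRY_BACKOFF` values shown in the README's configuration excerpt.

```typescript
// Hypothetical client-side sketch, not part of better-ccflare.
// Computes the wait (in ms) before each retry: initial, initial*backoff, ...
function retryDelays(attempts: number, initialMs: number, backoff: number): number[] {
  const delays: number[] = [];
  let current = initialMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(current);
    current *= backoff;
  }
  return delays;
}

type MinimalResponse = { status: number };

// Re-sends the request while the proxy answers 503, up to `attempts` retries.
// `send` is supplied by the caller (for example a wrapper around fetch).
async function requestWithRetry<T extends MinimalResponse>(
  send: () => Promise<T>,
  attempts = 3,
  initialMs = 1000,
  backoff = 2,
): Promise<T> {
  let response = await send();
  for (const delay of retryDelays(attempts, initialMs, backoff)) {
    if (response.status !== 503) break; // success or a non-retryable error
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    response = await send();
  }
  return response;
}
```

For example, `retryDelays(3, 1000, 2)` yields `[1000, 2000, 4000]`, so a client using these defaults gives up after about seven seconds of waiting if every account stays rate-limited or paused.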
@@ -660,6 +666,9 @@ We recommend using one of the workarounds above until the npm bug is fixed.
 - Web dashboard (`http://localhost:8080/dashboard`)
 - CLI for account management
 - REST API for automation
+- `--doctor` command for database integrity checks and telemetry
+- Reasoning effort compatibility layer for OpenAI/Codex routes (downgrade mapping, `count_tokens` support)
+- `/health` endpoint with three-state pool status (`healthy`/`degraded`/`unhealthy`), 503 on degraded/unhealthy, optional `?detail=1` behind `HEALTH_DETAIL_ENABLED`
 ### 🔒 Production Ready
 - Automatic failover between accounts
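Note on the hunk above: the `/health` entry describes three pool states and a 503 for `degraded` and `unhealthy`. The diff does not say how the state is derived from the account pool, so the classification rule below is an assumption made for illustration, as is the 200 status for `healthy`. `classifyPool` and `healthStatusCode` are hypothetical names.

```typescript
type PoolStatus = "healthy" | "degraded" | "unhealthy";

// Assumed rule (not confirmed by the diff): every account usable means healthy,
// some usable means degraded, none usable means unhealthy.
function classifyPool(usableAccounts: number, totalAccounts: number): PoolStatus {
  if (totalAccounts > 0 && usableAccounts === totalAccounts) return "healthy";
  if (usableAccounts > 0) return "degraded";
  return "unhealthy";
}

// Per the README entry, degraded and unhealthy both map to HTTP 503.
// Returning 200 for healthy is an assumption.
function healthStatusCode(status: PoolStatus): number {
  return status === "healthy" ? 200 : 503;
}
```

A monitor that only looks at the status code would therefore treat `degraded` and `unhealthy` the same way, and would need the response body to tell them apart.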
@@ -770,11 +779,11 @@ Inspired by [snipeship/ccflare](https://github.com/snipeship/ccflare) - thanks f
 - [@anonym-uz](https://github.com/anonym-uz) - Critical auto-pause bug fix, analytics performance optimizations, request body truncation, and incremental vacuum implementation
 - [@makhweeb](https://github.com/makhweeb) - Enhanced request handling and analytics improvements
 - [@jw409](https://github.com/jw409) - Fixed OAuth account addition in WSL2 and compiled binaries by replacing unreliable prompt() with readline; systemd deployment guide, BUN_JSC_* crash loop analysis, and preflight environment validator (PR #106)
-- [@materemias](https://github.com/materemias) - Testing and validation of Vertex AI provider implementation, thorough debugging of OAuth API key authentication (issue #54), requesting and validating AWS Bedrock support (issue #49), and extensive testing of new releases and features
+- [@materemias](https://github.com/materemias) - Testing and validation of Vertex AI provider implementation, thorough debugging of OAuth API key authentication (issue #54), requesting and validating AWS Bedrock support (issue #49), extensive testing of new releases and features; fix request details modal hydration race where payload row committed after parent request row, adding lazy re-fetch via `/api/requests/payload/:id` with 404 fallback (PR #186)
 - [@tqtensor](https://github.com/tqtensor) - Comprehensive memory leak fix preventing OOM kills with smart chunk capping, memory monitoring, and optimized cleanup (PR #67)
 - [@lunetics](https://github.com/lunetics) - Force-reset rate limit feature allowing manual clearing of stale rate-limit locks via API, CLI, and dashboard with immediate usage polling (PR #68), OOM kill prevention with periodic data retention cleanup, 3-day default retention, and time-scoped stats queries (PR #70), model registry sync removing retired models and adding sonnet-4.6 CLI shortcut (PR #71)
 - [@troykelly](https://github.com/troykelly) - Comprehensive PostgreSQL compatibility fixes including boolean type handling, identifier case preservation, BIGINT string coercion, UNION ALL type alignment, HAVING clause compatibility, parameter ordering corrections, worker initialization, and connection pooling (issue #81); detailed bug report and root cause analysis for `/api/accounts` Invalid Date error affecting PostgreSQL BIGINT columns (issue #88)
-- [@cowwoc](https://github.com/cowwoc) - Compact reliability fixes, space breakdown after cleanup, requests tab without payloads (PR #149); clear stale rate_limited_until when usage API shows capacity restored (PR #150); prevent manually-paused accounts from being selected as auto-fallback candidates (PR #151); balance new sessions by utilization within same-priority accounts using water-filling algorithm (PR #152); fix ModuleNotFound crash in compiled binary when using Compact Database by embedding vacuum-worker at build time (PR #155); expected usage position indicator on rate limit bars showing projected pacing vs. reset window (PR #156); show explicit rate-limited state when usage data unavailable on startup (PR #161); deduplicate concurrent fetchAndCache calls per account to prevent redundant Anthropic requests (PR #159); make GET /api/accounts cache-only and await usage fetch in refresh endpoint, eliminating blind 5s timeout (PR #162); mark account rate-limited when all models exhausted to prevent stale-state retry loops (PR #163); use retry-after header for dynamic model-exhaustion cooldown instead of hardcoded 1 hour (PR #164); remove implicit sonnet catch-all in getModelList preventing silent unexpected model remaps (PR #165); reduce log noise by aggregating auto-unpause skip messages and suppressing identity model mapping logs (PR #167)
+- [@cowwoc](https://github.com/cowwoc) - Compact reliability fixes, space breakdown after cleanup, requests tab without payloads (PR #149); clear stale rate_limited_until when usage API shows capacity restored (PR #150); prevent manually-paused accounts from being selected as auto-fallback candidates (PR #151); balance new sessions by utilization within same-priority accounts using water-filling algorithm (PR #152); fix ModuleNotFound crash in compiled binary when using Compact Database by embedding vacuum-worker at build time (PR #155); expected usage position indicator on rate limit bars showing projected pacing vs. reset window (PR #156); show explicit rate-limited state when usage data unavailable on startup (PR #161); deduplicate concurrent fetchAndCache calls per account to prevent redundant Anthropic requests (PR #159); make GET /api/accounts cache-only and await usage fetch in refresh endpoint, eliminating blind 5s timeout (PR #162); mark account rate-limited when all models exhausted to prevent stale-state retry loops (PR #163); use retry-after header for dynamic model-exhaustion cooldown instead of hardcoded 1 hour (PR #164); remove implicit sonnet catch-all in getModelList preventing silent unexpected model remaps (PR #165); reduce log noise by aggregating auto-unpause skip messages and suppressing identity model mapping logs (PR #167); reasoning effort compatibility layer for OpenAI/Codex routes with deterministic downgrade mapping and count_tokens path support (PR #172, implemented in PR #188)
 ## Contributing

package/dist/better-ccflare CHANGED

Binary file

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "better-ccflare",
-	"version": "3.4.27",
+	"version": "3.5.0",
 	"description": "Load balancer proxy for Claude API with intelligent distribution across multiple OAuth accounts to avoid rate limiting",
 	"license": "MIT",
 	"repository": {