npm - claude-code-router-config - Versions diffs - 1.0.1 → 1.1.0 - Mend

claude-code-router-config 1.0.1 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21) hide show

package/README.md +169 -8
package/cli/analytics.js +509 -0
package/cli/benchmark.js +342 -0
package/cli/commands.js +300 -0
package/config/smart-intent-router.js +543 -0
package/docs/v1.1.0-FEATURES.md +752 -0
package/logging/enhanced-logger.js +410 -0
package/logging/health-monitor.js +472 -0
package/logging/middleware.js +384 -0
package/package.json +29 -6
package/plugins/plugin-manager.js +607 -0
package/templates/README.md +161 -0
package/templates/balanced.json +111 -0
package/templates/cost-optimized.json +96 -0
package/templates/development.json +104 -0
package/templates/performance-optimized.json +88 -0
package/templates/quality-focused.json +105 -0
package/web-dashboard/public/css/dashboard.css +575 -0
package/web-dashboard/public/index.html +308 -0
package/web-dashboard/public/js/dashboard.js +512 -0
package/web-dashboard/server.js +352 -0

package/templates/README.md ADDED Viewed

@@ -0,0 +1,161 @@
+# Configuration Templates
+This directory contains pre-configured templates optimized for different use cases. Each template provides a complete configuration with optimized provider selection, routing rules, and settings.
+## Available Templates
+### 1. 🚀 `performance-optimized.json`
+**Best for**: Real-time applications, chatbots, time-sensitive tasks
+- **Priority**: Speed and low latency
+- **Expected Latency**: 500-2000ms
+- **Cost Profile**: Low to Medium
+- **Primary Provider**: Gemini Flash (fastest)
+- **Fallback**: OpenAI Mini, Qwen Turbo
+### 2. 💰 `cost-optimized.json`
+**Best for**: Budget-conscious users, bulk processing, non-critical tasks
+- **Priority**: Maximum cost savings
+- **Expected Savings**: 70-90% vs premium models
+- **Cost Profile**: Very Low
+- **Primary Provider**: Qwen Turbo (cheapest)
+- **Features**: Budget tracking and alerts
+### 3. 🎯 `quality-focused.json`
+**Best for**: Critical applications, complex reasoning, professional work
+- **Priority**: Highest quality and accuracy
+- **Expected Latency**: 3-10 seconds
+- **Cost Profile**: High
+- **Primary Provider**: Claude Sonnet 4 (best reasoning)
+- **Features**: Enhanced reasoning, chain-of-thought
+### 4. 💻 `development.json`
+**Best for**: Developers, coding assistance, software engineering
+- **Priority**: Code generation and debugging
+- **Specialized**: Coding models and frameworks
+- **Cost Profile**: Medium
+- **Primary Provider**: OpenAI GPT-4o Mini (best for coding)
+- **Features**: Code context awareness, project intelligence
+### 5. ⚖️ `balanced.json` **(Default)**
+**Best for**: General use, mixed workloads, everyday tasks
+- **Priority**: Balanced cost-performance-quality
+- **Expected Latency**: 2-4 seconds
+- **Cost Profile**: Medium
+- **Primary Provider**: OpenAI GPT-4o (versatile)
+- **Features**: Adaptive routing, auto-optimization
+## Usage
+### Apply a Template:
+```bash
+# Using CLI (recommended)
+ccr config template performance-optimized
+ccr config template cost-optimized
+ccr config template quality-focused
+ccr config template development
+ccr config template balanced
+# Manual copy
+cp templates/performance-optimized.json ~/.claude-code-router/config.json
+```
+### Compare Templates:
+```bash
+# See quick comparison
+ccr config compare --all
+# Compare specific templates
+ccr config compare performance-optimized cost-optimized
+```
+## Template Features
+### Performance Optimization
+- Request timeout settings
+- Retry policies
+- Connection pooling
+- Caching strategies
+### Cost Management
+- Budget limits and alerts
+- Cost tracking
+- Provider cost analysis
+- Usage optimization
+### Quality Enhancement
+- Temperature settings
+- Model-specific parameters
+- Prompt optimization
+- Response validation
+### Development Tools
+- Code context awareness
+- Language-specific optimization
+- Debug mode
+- Testing configurations
+## Customization
+Templates can be customized by:
+1. Copying a template as a base
+2. Modifying providers and models
+3. Adjusting routing rules
+4. Fine-tuning parameters
+5. Adding custom features
+Example customization:
+```json
+{
+  "customSetting": {
+    "myPreference": true,
+    "specificModel": "claude-sonnet-4-latest"
+  }
+}
+```
+## Switching Templates
+To switch between templates:
+```bash
+# Backup current config
+ccr config backup
+# Apply new template
+ccr config template quality-focused
+# Verify configuration
+ccr config validate
+# Test providers
+ccr benchmark full
+```
+## Recommendations
+| Use Case | Recommended Template |
+|----------|---------------------|
+| **Chat Applications** | performance-optimized |
+| **Data Processing** | cost-optimized |
+| **Research & Analysis** | quality-focused |
+| **Software Development** | development |
+| **General Purpose** | balanced |
+## Tips
+1. **Start with balanced** if unsure - it works well for most cases
+2. **Monitor costs** when using quality-focused templates
+3. **Test latency** with performance-optimized for your specific use case
+4. **Customize** templates based on your actual usage patterns
+5. **Backup** configurations before switching templates
+## Template Comparison
+| Feature | Performance | Cost | Quality | Development | Balanced |
+|---------|-------------|------|---------|--------------|----------|
+| **Speed** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
+| **Cost** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐ | ⭐⭐⭐ |
+| **Quality** | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
+| **Coding** | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
+| **Best For** | Speed | Budget | Accuracy | Development | Everything |
+Choose the template that best matches your priorities and use case!

package/templates/balanced.json ADDED Viewed

@@ -0,0 +1,111 @@
+{
+  "LOG": true,
+  "LOG_LEVEL": "info",
+  "API_TIMEOUT_MS": 30000,
+  "CUSTOM_ROUTER_PATH": "$HOME/.claude-code-router/intent-router.js",
+  "Providers": [
+    {
+      "name": "openai",
+      "api_base_url": "https://api.openai.com/v1/chat/completions",
+      "api_key": "$OPENAI_API_KEY",
+      "models": ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 1,
+      "description": "Versatile performer for most tasks",
+      "strengths": ["coding", "reasoning", "general"]
+    },
+    {
+      "name": "anthropic",
+      "api_base_url": "https://api.anthropic.com/v1/messages",
+      "api_key": "$ANTHROPIC_API_KEY",
+      "models": ["claude-3-5-sonnet-latest", "claude-sonnet-4-latest"],
+      "transformer": {
+        "use": ["Anthropic"]
+      },
+      "priority": 2,
+      "description": "Excellent for analysis and complex reasoning",
+      "strengths": ["analysis", "writing", "reasoning"]
+    },
+    {
+      "name": "gemini",
+      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
+      "api_key": "$GEMINI_API_KEY",
+      "models": ["gemini-2.5-flash", "gemini-2.5-pro"],
+      "transformer": {
+        "use": ["gemini"]
+      },
+      "priority": 3,
+      "description": "Fast with large context and good quality",
+      "strengths": ["speed", "context", "multilingual"]
+    },
+    {
+      "name": "qwen",
+      "api_base_url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
+      "api_key": "$QWEN_API_KEY",
+      "models": ["qwen-plus", "qwen-max"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 4,
+      "description": "Cost-effective with good performance",
+      "strengths": ["cost", "multilingual", "coding"]
+    },
+    {
+      "name": "glm",
+      "api_base_url": "https://api.z.ai/api/paas/v4/chat/completions",
+      "api_key": "$GLM_API_KEY",
+      "models": ["glm-4.6", "glm-4.5"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 5,
+      "description": "Specialized in Chinese and multilingual tasks",
+      "strengths": ["multilingual", "translation", "chinese"]
+    }
+  ],
+  "Router": {
+    "default": "openai,gpt-4o",
+    "background": "gemini,gemini-2.5-flash",
+    "think": "anthropic,claude-3-5-sonnet-latest",
+    "longContext": "gemini,gemini-2.5-pro",
+    "longContextThreshold": 100000,
+    "Balanced": {
+      "enabled": true,
+      "adaptiveRouting": true,
+      "costAware": true,
+      "performanceAware": true
+    }
+  },
+  "BalancedOptimization": {
+    "costQualityRatio": 0.7,
+    "performanceWeight": 0.6,
+    "fallbackStrategy": "cost_first",
+    "adaptiveThreshold": true
+  },
+  "AutoOptimization": {
+    "enabled": true,
+    "learningPeriod": 7,
+    "performanceTracking": true,
+    "costMonitoring": true,
+    "automaticTuning": false
+  },
+  "description": "Balanced configuration providing good mix of cost, speed, and quality",
+  "useCase": "Best for general use cases, balanced workloads, and users who want it all",
+  "optimizationTarget": "Balanced cost-performance-quality ratio",
+  "costProfile": "Medium",
+  "expectedLatency": "2-4 seconds",
+  "qualityLevel": "High",
+  "recommendedFor": [
+    "General chat",
+    "Mixed workloads",
+    "Everyday tasks",
+    "Balanced requirements"
+  ]
+}

package/templates/cost-optimized.json ADDED Viewed

@@ -0,0 +1,96 @@
+{
+  "LOG": true,
+  "LOG_LEVEL": "warn",
+  "API_TIMEOUT_MS": 30000,
+  "CUSTOM_ROUTER_PATH": "$HOME/.claude-code-router/intent-router.js",
+  "Providers": [
+    {
+      "name": "qwen",
+      "api_base_url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
+      "api_key": "$QWEN_API_KEY",
+      "models": ["qwen-turbo", "qwen-plus"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 1,
+      "description": "Most cost-effective for general tasks",
+      "estimatedCost": "$0.0003 per 1K tokens"
+    },
+    {
+      "name": "gemini",
+      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
+      "api_key": "$GEMINI_API_KEY",
+      "models": ["gemini-2.5-flash"],
+      "transformer": {
+        "use": ["gemini"]
+      },
+      "priority": 2,
+      "description": "Very affordable with good quality",
+      "estimatedCost": "$0.000075 per 1K tokens"
+    },
+    {
+      "name": "glm",
+      "api_base_url": "https://api.z.ai/api/paas/v4/chat/completions",
+      "api_key": "$GLM_API_KEY",
+      "models": ["glm-4.5", "glm-4.6"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 3,
+      "description": "Cost-effective for multilingual tasks",
+      "estimatedCost": "$0.0005 per 1K tokens"
+    },
+    {
+      "name": "openai",
+      "api_base_url": "https://api.openai.com/v1/chat/completions",
+      "api_key": "$OPENAI_API_KEY",
+      "models": ["gpt-4o-mini"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 4,
+      "description": "Fallback for when quality is critical",
+      "estimatedCost": "$0.00015 per 1K tokens"
+    }
+  ],
+  "Router": {
+    "default": "qwen,qwen-turbo",
+    "background": "qwen,qwen-turbo",
+    "think": "gemini,gemini-2.5-flash",
+    "longContext": "gemini,gemini-2.5-flash",
+    "longContextThreshold": 100000,
+    "CostOptimization": {
+      "enabled": true,
+      "budgetAlerts": true,
+      "dailyBudget": 10.0,
+      "preferCheaper": true,
+      "costTracking": true
+    }
+  },
+  "BudgetManagement": {
+    "dailyLimit": 10.0,
+    "monthlyLimit": 100.0,
+    "alertThreshold": 0.8,
+    "trackByProvider": true,
+    "cheapestFirstFallback": true
+  },
+  "CostTracking": {
+    "enabled": true,
+    "granularTracking": true,
+    "reporting": "daily",
+    "alerts": {
+      "daily": true,
+      "threshold": 5.0
+    }
+  },
+  "description": "Cost-optimized configuration minimizing expenses while maintaining good quality",
+  "useCase": "Best for budget-conscious users, bulk processing, and non-critical tasks",
+  "expectedSavings": "70-90% compared to premium models",
+  "costProfile": "Very Low",
+  "qualityLevel": "Good for most tasks"
+}

package/templates/development.json ADDED Viewed

@@ -0,0 +1,104 @@
+{
+  "LOG": true,
+  "LOG_LEVEL": "debug",
+  "API_TIMEOUT_MS": 15000,
+  "CUSTOM_ROUTER_PATH": "$HOME/.claude-code-router/intent-router.js",
+  "Providers": [
+    {
+      "name": "openai",
+      "api_base_url": "https://api.openai.com/v1/chat/completions",
+      "api_key": "$OPENAI_API_KEY",
+      "models": ["gpt-4o-mini", "gpt-4o"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 1,
+      "description": "Excellent for coding and development tasks"
+    },
+    {
+      "name": "anthropic",
+      "api_base_url": "https://api.anthropic.com/v1/messages",
+      "api_key": "$ANTHROPIC_API_KEY",
+      "models": ["claude-3-5-sonnet-latest"],
+      "transformer": {
+        "use": ["Anthropic"]
+      },
+      "priority": 2,
+      "description": "Great for debugging and architecture"
+    },
+    {
+      "name": "qwen",
+      "api_base_url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
+      "api_key": "$QWEN_API_KEY",
+      "models": ["qwen3-coder-plus", "qwen-plus"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 3,
+      "description": "Specialized for coding tasks"
+    },
+    {
+      "name": "gemini",
+      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
+      "api_key": "$GEMINI_API_KEY",
+      "models": ["gemini-2.5-flash"],
+      "transformer": {
+        "use": ["gemini"]
+      },
+      "priority": 4,
+      "description": "Quick for documentation and simple queries"
+    }
+  ],
+  "Router": {
+    "default": "openai,gpt-4o-mini",
+    "background": "gemini,gemini-2.5-flash",
+    "think": "anthropic,claude-3-5-sonnet-latest",
+    "longContext": "gemini,gemini-2.5-flash",
+    "longContextThreshold": 50000,
+    "Development": {
+      "enabled": true,
+      "preferCodingModels": true,
+      "codeContextAware": true
+    }
+  },
+  "DevelopmentFeatures": {
+    "codeContext": true,
+    "projectAwareness": true,
+    "debugMode": true,
+    "verboseLogging": true,
+    "testMode": false
+  },
+  "CodeAssistance": {
+    "languages": ["javascript", "python", "typescript", "java", "go", "rust", "cpp"],
+    "frameworks": ["react", "vue", "angular", "nodejs", "django", "flask", "spring"],
+    "specializedModels": {
+      "coding": "qwen3-coder-plus",
+      "debugging": "anthropic,claude-3-5-sonnet-latest",
+      "architecture": "anthropic,claude-sonnet-4-latest",
+      "documentation": "openai,gpt-4o"
+    }
+  },
+  "Testing": {
+    "mockMode": false,
+    "localTesting": true,
+    "apiValidation": true,
+    "responseCaching": true
+  },
+  "description": "Development-optimized configuration for coding, debugging, and software engineering tasks",
+  "useCase": "Best for developers, coding assistance, debugging, and software development workflows",
+  "optimizedFor": [
+    "Code generation",
+    "Debugging",
+    "Code review",
+    "Documentation",
+    "Architecture design"
+  ],
+  "costProfile": "Medium",
+  "expectedLatency": "2-5 seconds"
+}

package/templates/performance-optimized.json ADDED Viewed

@@ -0,0 +1,88 @@
+{
+  "LOG": true,
+  "LOG_LEVEL": "info",
+  "API_TIMEOUT_MS": 20000,
+  "CUSTOM_ROUTER_PATH": "$HOME/.claude-code-router/intent-router.js",
+  "Providers": [
+    {
+      "name": "gemini",
+      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
+      "api_key": "$GEMINI_API_KEY",
+      "models": ["gemini-2.5-flash"],
+      "transformer": {
+        "use": ["gemini"]
+      },
+      "priority": 1,
+      "description": "Fastest responses for time-critical tasks"
+    },
+    {
+      "name": "openai",
+      "api_base_url": "https://api.openai.com/v1/chat/completions",
+      "api_key": "$OPENAI_API_KEY",
+      "models": ["gpt-4o-mini", "gpt-4o"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 2,
+      "description": "Balance of speed and capability"
+    },
+    {
+      "name": "qwen",
+      "api_base_url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
+      "api_key": "$QWEN_API_KEY",
+      "models": ["qwen-turbo", "qwen-plus"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 3,
+      "description": "Fast for simple tasks"
+    },
+    {
+      "name": "anthropic",
+      "api_base_url": "https://api.anthropic.com/v1/messages",
+      "api_key": "$ANTHROPIC_API_KEY",
+      "models": ["claude-3-5-haiku-latest", "claude-sonnet-4-latest"],
+      "transformer": {
+        "use": ["Anthropic"]
+      },
+      "priority": 4,
+      "description": "High quality when needed"
+    }
+  ],
+  "Router": {
+    "default": "gemini,gemini-2.5-flash",
+    "background": "qwen,qwen-turbo",
+    "think": "openai,gpt-4o-mini",
+    "longContext": "gemini,gemini-2.5-flash",
+    "longContextThreshold": 100000,
+    "Performance": {
+      "enabled": true,
+      "timeoutMs": 15000,
+      "retryAttempts": 2,
+      "preferLowLatency": true
+    }
+  },
+  "PerformanceOptimization": {
+    "enableCaching": true,
+    "cacheTTL": 300,
+    "maxConcurrentRequests": 5,
+    "compressionEnabled": true,
+    "smartTimeout": true
+  },
+  "HealthChecks": {
+    "enabled": true,
+    "interval": 30000,
+    "timeout": 5000,
+    "failureThreshold": 3,
+    "recoveryTimeout": 60000
+  },
+  "description": "Performance-optimized configuration prioritizing speed and low latency",
+  "useCase": "Best for real-time applications, chatbots, and time-sensitive tasks",
+  "expectedLatency": "500-2000ms",
+  "costProfile": "Low to Medium"
+}

package/templates/quality-focused.json ADDED Viewed

@@ -0,0 +1,105 @@
+{
+  "LOG": true,
+  "LOG_LEVEL": "debug",
+  "API_TIMEOUT_MS": 60000,
+  "CUSTOM_ROUTER_PATH": "$HOME/.claude-code-router/intent-router.js",
+  "Providers": [
+    {
+      "name": "anthropic",
+      "api_base_url": "https://api.anthropic.com/v1/messages",
+      "api_key": "$ANTHROPIC_API_KEY",
+      "models": ["claude-sonnet-4-latest", "claude-3-5-sonnet-latest"],
+      "transformer": {
+        "use": ["Anthropic"]
+      },
+      "priority": 1,
+      "description": "Highest reasoning and analytical capability",
+      "estimatedCost": "$15.00 per 1M input tokens"
+    },
+    {
+      "name": "openai",
+      "api_base_url": "https://api.openai.com/v1/chat/completions",
+      "api_key": "$OPENAI_API_KEY",
+      "models": ["gpt-4o", "o1", "gpt-4-turbo"],
+      "transformer": {
+        "use": []
+      },
+      "priority": 2,
+      "description": "Strong coding and problem-solving abilities",
+      "estimatedCost": "$5.00 per 1M input tokens"
+    },
+    {
+      "name": "gemini",
+      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions",
+      "api_key": "$GEMINI_API_KEY",
+      "models": ["gemini-2.5-pro", "gemini-2.5-flash"],
+      "transformer": {
+        "use": ["gemini"]
+      },
+      "priority": 3,
+      "description": "Good balance of quality and speed",
+      "estimatedCost": "$1.25 per 1M input tokens"
+    },
+    {
+      "name": "openrouter",
+      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
+      "api_key": "$OPENROUTER_API_KEY",
+      "models": [
+        "anthropic/claude-3.5-sonnet",
+        "openai/gpt-4-turbo",
+        "google/gemini-pro-1.5"
+      ],
+      "transformer": {
+        "use": ["openrouter"]
+      },
+      "priority": 4,
+      "description": "Access to specialized high-quality models",
+      "estimatedCost": "Variable"
+    }
+  ],
+  "Router": {
+    "default": "anthropic,claude-sonnet-4-latest",
+    "background": "gemini,gemini-2.5-pro",
+    "think": "openai,o1",
+    "longContext": "gemini,gemini-2.5-pro",
+    "longContextThreshold": 200000,
+    "QualityOptimization": {
+      "enabled": true,
+      "preferAccuracy": true,
+      "maxRetries": 3,
+      "temperature": 0.3,
+      "topP": 0.9
+    }
+  },
+  "QualitySettings": {
+    "temperature": 0.3,
+    "maxTokens": 4096,
+    "topP": 0.9,
+    "frequencyPenalty": 0.1,
+    "presencePenalty": 0.1,
+    "responseFormat": "auto"
+  },
+  "EnhancedReasoning": {
+    "enabled": true,
+    "chainOfThought": true,
+    "selfCritique": true,
+    "multiplePerspectives": true
+  },
+  "description": "Quality-focused configuration prioritizing accuracy and capabilities over cost/speed",
+  "useCase": "Best for critical applications, complex reasoning, research, and professional work",
+  "expectedQuality": "Highest available",
+  "costProfile": "High",
+  "expectedLatency": "3-10 seconds",
+  "recommendedFor": [
+    "Complex analysis",
+    "Creative writing",
+    "Code review",
+    "Strategic planning",
+    "Research tasks"
+  ]
+}