agentic-flow 1.2.0 → 1.2.2
This diff shows the contents of publicly released package versions as they appear in their respective public registries, and is provided for informational purposes only.
- package/README.md +25 -3
- package/dist/agents/claudeAgent.js +7 -5
- package/dist/cli-proxy.js +74 -5
- package/dist/proxy/anthropic-to-onnx.js +213 -0
- package/dist/utils/.claude-flow/metrics/agent-metrics.json +1 -0
- package/dist/utils/.claude-flow/metrics/performance.json +9 -0
- package/dist/utils/.claude-flow/metrics/task-metrics.json +10 -0
- package/dist/utils/cli.js +9 -1
- package/dist/utils/modelOptimizer.js +18 -2
- package/docs/.claude-flow/metrics/performance.json +1 -1
- package/docs/.claude-flow/metrics/task-metrics.json +3 -3
- package/docs/INDEX.md +44 -7
- package/docs/ONNX-PROXY-IMPLEMENTATION.md +254 -0
- package/docs/guides/PROXY-ARCHITECTURE-AND-EXTENSION.md +708 -0
- package/docs/mcp-validation/README.md +43 -0
- package/docs/releases/HOTFIX-v1.2.1.md +315 -0
- package/docs/releases/PUBLISH-COMPLETE-v1.2.0.md +308 -0
- package/docs/releases/README.md +18 -0
- package/docs/testing/README.md +46 -0
- package/package.json +2 -2
- /package/docs/{RELEASE-SUMMARY-v1.1.14-beta.1.md → archived/RELEASE-SUMMARY-v1.1.14-beta.1.md} +0 -0
- /package/docs/{V1.1.14-BETA-READY.md → archived/V1.1.14-BETA-READY.md} +0 -0
- /package/docs/{NPM-PUBLISH-GUIDE-v1.2.0.md → releases/NPM-PUBLISH-GUIDE-v1.2.0.md} +0 -0
- /package/docs/{RELEASE-v1.2.0.md → releases/RELEASE-v1.2.0.md} +0 -0
- /package/docs/{AGENT-SYSTEM-VALIDATION.md → testing/AGENT-SYSTEM-VALIDATION.md} +0 -0
- /package/docs/{FINAL-TESTING-SUMMARY.md → testing/FINAL-TESTING-SUMMARY.md} +0 -0
- /package/docs/{REGRESSION-TEST-RESULTS.md → testing/REGRESSION-TEST-RESULTS.md} +0 -0
- /package/docs/{STREAMING-AND-MCP-VALIDATION.md → testing/STREAMING-AND-MCP-VALIDATION.md} +0 -0
package/README.md
CHANGED
@@ -12,7 +12,15 @@
 
 ## 📖 Introduction
 
-Agentic Flow
+I built Agentic Flow to easily switch between alternative low-cost AI models in Claude Code/Agent SDK. For those comfortable using Claude agents and commands, it lets you take what you've created and deploy fully hosted agents for real business purposes. Use Claude Code to get the agent working, then deploy it in your favorite cloud.
+
+Agentic Flow runs Claude Code agents at near zero cost without rewriting a thing. The built-in model optimizer automatically routes every task to the cheapest option that meets your quality requirements—free local models for privacy, OpenRouter for 99% cost savings, Gemini for speed, or Anthropic when quality matters most. It analyzes each task and selects the optimal model from 27+ options with a single flag, reducing API costs dramatically compared to using Claude exclusively.
+
+The system spawns specialized agents on demand through Claude Code's Task tool and MCP coordination. It orchestrates swarms of 66+ pre-built agents (researchers, coders, reviewers, testers, architects) that work in parallel, coordinate through shared memory, and auto-scale based on workload. Transparent OpenRouter and Gemini proxies translate Anthropic API calls automatically—no code changes needed. Local models run direct without proxies for maximum privacy. Switch providers with environment variables, not refactoring.
+
+Extending agent capabilities is effortless. Add custom tools and integrations through the CLI—weather data, databases, search engines, or any external service—without touching config files. Your agents instantly gain new abilities across all projects. Every tool you add becomes available to the entire agent ecosystem automatically, and all operations are logged with full traceability for auditing, debugging, and compliance. This means your agents can connect to proprietary systems, third-party APIs, or internal tools in seconds, not hours.
+
+Define routing rules through flexible policy modes: Strict mode keeps sensitive data offline, Economy mode prefers free models (99% savings), Premium mode uses Anthropic for highest quality, or create custom cost/quality thresholds. The policy defines the rules; the swarm enforces them automatically. Runs local for development, Docker for CI/CD, or Flow Nexus cloud for production scale. Agentic Flow is the framework for autonomous efficiency—one unified runner for every Claude Code agent, self-tuning, self-routing, and built for real-world deployment.
 
 **Key Capabilities:**
 - ✅ **66 Specialized Agents** - Pre-built experts for coding, research, review, testing, DevOps
@@ -370,9 +378,9 @@ node dist/mcp/fastmcp/servers/http-sse.js
 - **stdio**: Claude Desktop, Cursor IDE, command-line tools
 - **HTTP/SSE**: Web apps, browser extensions, REST APIs, mobile apps
 
-### Add Custom MCP Servers (No Code Required)
+### Add Custom MCP Servers (No Code Required) ✨ NEW in v1.2.1
 
-Add your own MCP servers via CLI without editing code:
+Add your own MCP servers via CLI without editing code—extends agent capabilities in seconds:
 
 ```bash
 # Add MCP server (Claude Desktop style JSON config)
@@ -393,6 +401,13 @@ npx agentic-flow mcp disable weather
 
 # Remove server
 npx agentic-flow mcp remove weather
+
+# Test server configuration
+npx agentic-flow mcp test weather
+
+# Export/import configurations
+npx agentic-flow mcp export ./mcp-backup.json
+npx agentic-flow mcp import ./mcp-backup.json
 ```
 
 **Configuration stored in:** `~/.agentic-flow/mcp-config.json`
@@ -411,6 +426,13 @@ npx agentic-flow --agent researcher --task "Get weather forecast for Tokyo"
 - `weather-mcp` - Weather data
 - `database-mcp` - Database operations
 
+**v1.2.1 Improvements:**
+- ✅ CLI routing fixed - `mcp add/list/remove` commands now work correctly
+- ✅ Model optimizer filters models without tool support automatically
+- ✅ Full compatibility with Claude Desktop config format
+- ✅ Test command for validating server configurations
+- ✅ Export/import for backing up and sharing configurations
+
 **Documentation:** See [docs/guides/ADDING-MCP-SERVERS-CLI.md](docs/guides/ADDING-MCP-SERVERS-CLI.md) for complete guide.
 
 ### Using MCP Tools in Agents
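The `mcp add` flow above stores servers in `~/.agentic-flow/mcp-config.json` using the Claude Desktop JSON convention. The exact stored schema is not visible in this diff, so the following is a sketch modeled on that convention, with each server keyed under `mcpServers` and described by `command`, `args`, and `env` (the `weather-mcp` name comes from the README's own examples):

```js
// Sketch of a plausible ~/.agentic-flow/mcp-config.json entry, modeled on the
// Claude Desktop convention; the actual stored schema is not shown in this diff.
const mcpConfig = {
    mcpServers: {
        weather: {
            command: 'npx',
            args: ['-y', 'weather-mcp'],       // package name taken from the README's examples
            env: { WEATHER_API_KEY: '<key>' }  // hypothetical env var, for illustration only
        }
    }
};
```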
package/dist/agents/claudeAgent.js
CHANGED
@@ -85,11 +85,13 @@ export async function claudeAgent(agent, input, onStream, modelOverride) {
         });
     }
     else if (provider === 'onnx') {
-        // For ONNX:
-        envOverrides.ANTHROPIC_API_KEY = 'local';
-
-
-
+        // For ONNX: Use ANTHROPIC_BASE_URL if already set by CLI (proxy mode)
+        envOverrides.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY || 'sk-ant-onnx-local-key';
+        envOverrides.ANTHROPIC_BASE_URL = process.env.ANTHROPIC_BASE_URL || process.env.ONNX_PROXY_URL || 'http://localhost:3001';
+        logger.info('Using ONNX local proxy', {
+            proxyUrl: envOverrides.ANTHROPIC_BASE_URL,
+            model: finalModel
+        });
     }
     // For Anthropic provider, use existing ANTHROPIC_API_KEY (no proxy needed)
     logger.info('Multi-provider configuration', {
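The new fallback chain means the agent first honors a base URL injected by the CLI's integrated proxy, then a user-supplied `ONNX_PROXY_URL` (for example, pointing at a standalone proxy started with `node dist/proxy/anthropic-to-onnx.js`), and only then the default port. A minimal sketch of that resolution order, mirroring the diff above:

```js
// Resolution order for the ONNX base URL, as implemented in claudeAgent.js above.
const onnxBaseUrl = process.env.ANTHROPIC_BASE_URL  // set by cli-proxy.js in integrated proxy mode
    || process.env.ONNX_PROXY_URL                   // user-supplied standalone proxy
    || 'http://localhost:3001';                     // default ONNX proxy port
console.log(`ONNX-bound requests resolve to ${onnxBaseUrl}`);
```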
package/dist/cli-proxy.js
CHANGED
@@ -65,6 +65,26 @@ class AgenticFlowCLI {
             await handleAgentCommand(agentArgs);
             process.exit(0);
         }
+        if (options.mode === 'mcp-manager') {
+            // Handle MCP manager commands (add, list, remove, etc.)
+            const { spawn } = await import('child_process');
+            const { resolve, dirname } = await import('path');
+            const { fileURLToPath } = await import('url');
+            const __filename = fileURLToPath(import.meta.url);
+            const __dirname = dirname(__filename);
+            const mcpManagerPath = resolve(__dirname, './cli/mcp-manager.js');
+            // Pass all args after 'mcp' to mcp-manager
+            const mcpArgs = process.argv.slice(3); // Skip 'node', 'cli-proxy.js', 'mcp'
+            const proc = spawn('node', [mcpManagerPath, ...mcpArgs], {
+                stdio: 'inherit'
+            });
+            proc.on('exit', (code) => {
+                process.exit(code || 0);
+            });
+            process.on('SIGINT', () => proc.kill('SIGINT'));
+            process.on('SIGTERM', () => proc.kill('SIGTERM'));
+            return;
+        }
         if (options.mode === 'proxy') {
             // Run standalone proxy server for Claude Code/Cursor
             await this.runStandaloneProxy();
@@ -110,7 +130,8 @@ class AgenticFlowCLI {
             agent: options.agent,
             task: options.task,
             priority: options.optimizePriority || 'balanced',
-            maxCostPerTask: options.maxCost
+            maxCostPerTask: options.maxCost,
+            requiresTools: true // Agents have MCP tools available, so require tool support
         });
         // Display recommendation
         ModelOptimizer.displayRecommendation(recommendation);
@@ -124,6 +145,7 @@ class AgenticFlowCLI {
             console.log(`✅ Using optimized model: ${recommendation.modelName}\n`);
         }
         // Determine which provider to use
+        const useONNX = this.shouldUseONNX(options);
         const useOpenRouter = this.shouldUseOpenRouter(options);
         const useGemini = this.shouldUseGemini(options);
         // Debug output for provider selection
@@ -131,6 +153,7 @@ class AgenticFlowCLI {
             console.log('\n🔍 Provider Selection Debug:');
             console.log(` Provider flag: ${options.provider || 'not set'}`);
             console.log(` Model: ${options.model || 'default'}`);
+            console.log(` Use ONNX: ${useONNX}`);
             console.log(` Use OpenRouter: ${useOpenRouter}`);
             console.log(` Use Gemini: ${useGemini}`);
             console.log(` OPENROUTER_API_KEY: ${process.env.OPENROUTER_API_KEY ? '✓ set' : '✗ not set'}`);
@@ -138,8 +161,12 @@ class AgenticFlowCLI {
             console.log(` ANTHROPIC_API_KEY: ${process.env.ANTHROPIC_API_KEY ? '✓ set' : '✗ not set'}\n`);
         }
         try {
-            // Start proxy if needed (OpenRouter or Gemini)
-            if (useOpenRouter) {
+            // Start proxy if needed (ONNX, OpenRouter, or Gemini)
+            if (useONNX) {
+                console.log('🚀 Initializing ONNX local inference proxy...');
+                await this.startONNXProxy(options.model);
+            }
+            else if (useOpenRouter) {
                 console.log('🚀 Initializing OpenRouter proxy...');
                 await this.startOpenRouterProxy(options.model);
             }
@@ -151,7 +178,7 @@ class AgenticFlowCLI {
                 console.log('🚀 Using direct Anthropic API...\n');
             }
             // Run agent
-            await this.runAgent(options, useOpenRouter, useGemini);
+            await this.runAgent(options, useOpenRouter, useGemini, useONNX);
             logger.info('Execution completed successfully');
             process.exit(0);
         }
@@ -161,6 +188,19 @@ class AgenticFlowCLI {
             process.exit(1);
         }
     }
+    shouldUseONNX(options) {
+        // Use ONNX if:
+        // 1. Provider is explicitly set to onnx
+        // 2. PROVIDER env var is set to onnx
+        // 3. USE_ONNX env var is set
+        if (options.provider === 'onnx' || process.env.PROVIDER === 'onnx') {
+            return true;
+        }
+        if (process.env.USE_ONNX === 'true') {
+            return true;
+        }
+        return false;
+    }
     shouldUseGemini(options) {
         // Use Gemini if:
         // 1. Provider is explicitly set to gemini
@@ -272,6 +312,35 @@ class AgenticFlowCLI {
         // Wait for proxy to be ready
         await new Promise(resolve => setTimeout(resolve, 1500));
     }
+    async startONNXProxy(modelOverride) {
+        logger.info('Starting integrated ONNX local inference proxy');
+        console.log('🔧 Provider: ONNX Local (Phi-4-mini)');
+        console.log('💾 Free local inference - no API costs\n');
+        // Import ONNX proxy
+        const { AnthropicToONNXProxy } = await import('./proxy/anthropic-to-onnx.js');
+        // Use a different port for ONNX to avoid conflicts
+        const onnxProxyPort = parseInt(process.env.ONNX_PROXY_PORT || '3001');
+        const proxy = new AnthropicToONNXProxy({
+            port: onnxProxyPort,
+            modelPath: process.env.ONNX_MODEL_PATH,
+            executionProviders: process.env.ONNX_EXECUTION_PROVIDERS?.split(',') || ['cpu']
+        });
+        // Start proxy in background
+        await proxy.start();
+        this.proxyServer = proxy;
+        // Set ANTHROPIC_BASE_URL to ONNX proxy
+        process.env.ANTHROPIC_BASE_URL = `http://localhost:${onnxProxyPort}`;
+        // Set dummy ANTHROPIC_API_KEY for proxy (local inference doesn't need key)
+        if (!process.env.ANTHROPIC_API_KEY) {
+            process.env.ANTHROPIC_API_KEY = 'sk-ant-onnx-local-key';
+        }
+        console.log(`🔗 Proxy Mode: ONNX Local Inference`);
+        console.log(`🔧 Proxy URL: http://localhost:${onnxProxyPort}`);
+        console.log(`🤖 Model: Phi-4-mini-instruct (ONNX)\n`);
+        // Wait for proxy to be ready and model to load
+        console.log('⏳ Loading ONNX model... (this may take a moment)\n');
+        await new Promise(resolve => setTimeout(resolve, 2000));
+    }
     async runStandaloneProxy() {
         const args = process.argv.slice(3); // Skip 'node', 'cli-proxy.js', 'proxy'
         // Parse proxy arguments
@@ -411,7 +480,7 @@ EXAMPLES:
   claude
 `);
     }
-    async runAgent(options, useOpenRouter, useGemini) {
+    async runAgent(options, useOpenRouter, useGemini, useONNX = false) {
        const agentName = options.agent || process.env.AGENT || '';
        const task = options.task || process.env.TASK || '';
        if (!agentName) {
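One sharp edge worth noting: `startONNXProxy` waits a fixed 2000 ms for the model to load, but load time varies with hardware and model path. Since the proxy exposes a `/health` endpoint that reports `initialized` (see the route below), a readiness poll is a natural alternative. This is an editorial sketch, not code from the package; `waitForOnnxReady` is a hypothetical helper, and it relies only on Node 18+'s built-in `fetch`:

```js
// Sketch: poll the ONNX proxy's /health endpoint until the model reports
// initialized, instead of sleeping a fixed 2000 ms.
async function waitForOnnxReady(baseUrl, timeoutMs = 60000, intervalMs = 500) {
    const deadline = Date.now() + timeoutMs;
    while (Date.now() < deadline) {
        try {
            const res = await fetch(`${baseUrl}/health`);
            const body = await res.json();
            if (body.onnx?.initialized) return; // model loaded, proxy ready
        } catch {
            // proxy not listening yet; fall through and retry
        }
        await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
    throw new Error(`ONNX proxy at ${baseUrl} not ready after ${timeoutMs} ms`);
}
// Usage: await waitForOnnxReady('http://localhost:3001');
```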
package/dist/proxy/anthropic-to-onnx.js
ADDED
@@ -0,0 +1,213 @@
+// Anthropic to ONNX Local Proxy Server
+// Converts Anthropic API format to ONNX Runtime local inference
+import express from 'express';
+import { logger } from '../utils/logger.js';
+import { ONNXLocalProvider } from '../router/providers/onnx-local.js';
+export class AnthropicToONNXProxy {
+    app;
+    onnxProvider;
+    port;
+    server;
+    constructor(config = {}) {
+        this.app = express();
+        this.port = config.port || 3001;
+        // Initialize ONNX provider with configuration
+        this.onnxProvider = new ONNXLocalProvider({
+            modelPath: config.modelPath,
+            executionProviders: config.executionProviders || ['cpu'],
+            maxTokens: 512,
+            temperature: 0.7
+        });
+        this.setupMiddleware();
+        this.setupRoutes();
+    }
+    setupMiddleware() {
+        // Parse JSON bodies
+        this.app.use(express.json({ limit: '50mb' }));
+        // Logging middleware
+        this.app.use((req, res, next) => {
+            logger.debug('ONNX proxy request', {
+                method: req.method,
+                path: req.path,
+                headers: Object.keys(req.headers)
+            });
+            next();
+        });
+    }
+    setupRoutes() {
+        // Health check
+        this.app.get('/health', (req, res) => {
+            const modelInfo = this.onnxProvider.getModelInfo();
+            res.json({
+                status: 'ok',
+                service: 'anthropic-to-onnx-proxy',
+                onnx: {
+                    initialized: modelInfo.initialized,
+                    tokenizerLoaded: modelInfo.tokenizerLoaded,
+                    executionProviders: modelInfo.executionProviders
+                }
+            });
+        });
+        // Anthropic Messages API → ONNX Local Inference
+        this.app.post('/v1/messages', async (req, res) => {
+            try {
+                const anthropicReq = req.body;
+                // Extract system prompt
+                let systemPrompt = '';
+                if (typeof anthropicReq.system === 'string') {
+                    systemPrompt = anthropicReq.system;
+                }
+                else if (Array.isArray(anthropicReq.system)) {
+                    systemPrompt = anthropicReq.system
+                        .filter((block) => block.type === 'text')
+                        .map((block) => block.text)
+                        .join('\n');
+                }
+                logger.info('Converting Anthropic request to ONNX', {
+                    anthropicModel: anthropicReq.model,
+                    onnxModel: 'Phi-4-mini-instruct',
+                    messageCount: anthropicReq.messages.length,
+                    systemPromptLength: systemPrompt.length,
+                    maxTokens: anthropicReq.max_tokens,
+                    temperature: anthropicReq.temperature
+                });
+                // Convert Anthropic messages to internal format
+                const messages = [];
+                // Add system message if present
+                if (systemPrompt) {
+                    messages.push({
+                        role: 'system',
+                        content: systemPrompt
+                    });
+                }
+                // Add user/assistant messages
+                for (const msg of anthropicReq.messages) {
+                    let content;
+                    if (typeof msg.content === 'string') {
+                        content = msg.content;
+                    }
+                    else {
+                        content = msg.content
+                            .filter((block) => block.type === 'text')
+                            .map((block) => block.text || '')
+                            .join('\n');
+                    }
+                    messages.push({
+                        role: msg.role,
+                        content
+                    });
+                }
+                // Streaming not supported by ONNX provider yet
+                if (anthropicReq.stream) {
+                    logger.warn('Streaming requested but not supported by ONNX provider, falling back to non-streaming');
+                }
+                // Run ONNX inference
+                const result = await this.onnxProvider.chat({
+                    model: 'phi-4-mini-instruct',
+                    messages,
+                    maxTokens: anthropicReq.max_tokens || 512,
+                    temperature: anthropicReq.temperature || 0.7
+                });
+                // Convert ONNX response to Anthropic format
+                const anthropicResponse = {
+                    id: result.id,
+                    type: 'message',
+                    role: 'assistant',
+                    content: result.content.map(block => ({
+                        type: 'text',
+                        text: block.text || ''
+                    })),
+                    model: 'onnx-local/phi-4-mini-instruct',
+                    stop_reason: result.stopReason || 'end_turn',
+                    usage: {
+                        input_tokens: result.usage?.inputTokens || 0,
+                        output_tokens: result.usage?.outputTokens || 0
+                    }
+                };
+                logger.info('ONNX inference completed', {
+                    inputTokens: result.usage?.inputTokens || 0,
+                    outputTokens: result.usage?.outputTokens || 0,
+                    latency: result.metadata?.latency,
+                    tokensPerSecond: result.metadata?.tokensPerSecond
+                });
+                res.json(anthropicResponse);
+            }
+            catch (error) {
+                logger.error('ONNX proxy error', {
+                    error: error.message,
+                    provider: error.provider,
+                    retryable: error.retryable
+                });
+                res.status(500).json({
+                    error: {
+                        type: 'api_error',
+                        message: `ONNX inference failed: ${error.message}`
+                    }
+                });
+            }
+        });
+        // 404 handler
+        this.app.use((req, res) => {
+            res.status(404).json({
+                error: {
+                    type: 'not_found',
+                    message: `Route not found: ${req.method} ${req.path}`
+                }
+            });
+        });
+    }
+    start() {
+        return new Promise((resolve) => {
+            this.server = this.app.listen(this.port, () => {
+                logger.info('ONNX proxy server started', {
+                    port: this.port,
+                    endpoint: `http://localhost:${this.port}`,
+                    healthCheck: `http://localhost:${this.port}/health`,
+                    messagesEndpoint: `http://localhost:${this.port}/v1/messages`
+                });
+                console.log(`\n🚀 ONNX Proxy Server running on http://localhost:${this.port}`);
+                console.log(` 📋 Messages API: POST http://localhost:${this.port}/v1/messages`);
+                console.log(` ❤️ Health check: GET http://localhost:${this.port}/health\n`);
+                resolve();
+            });
+        });
+    }
+    stop() {
+        return new Promise((resolve) => {
+            if (this.server) {
+                this.server.close(() => {
+                    logger.info('ONNX proxy server stopped');
+                    resolve();
+                });
+            }
+            else {
+                resolve();
+            }
+        });
+    }
+    async dispose() {
+        await this.stop();
+        await this.onnxProvider.dispose();
+    }
+}
+// CLI entry point
+if (import.meta.url === `file://${process.argv[1]}`) {
+    const proxy = new AnthropicToONNXProxy({
+        port: parseInt(process.env.ONNX_PROXY_PORT || '3001')
+    });
+    proxy.start().catch(error => {
+        console.error('Failed to start ONNX proxy:', error);
+        process.exit(1);
+    });
+    // Graceful shutdown
+    process.on('SIGINT', async () => {
+        console.log('\n🛑 Shutting down ONNX proxy...');
+        await proxy.dispose();
+        process.exit(0);
+    });
+    process.on('SIGTERM', async () => {
+        console.log('\n🛑 Shutting down ONNX proxy...');
+        await proxy.dispose();
+        process.exit(0);
+    });
+}
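Since the proxy accepts the Anthropic Messages request shape and returns an Anthropic-shaped response, it can be smoke-tested directly once running. A minimal check against the default port, using Node 18+'s built-in `fetch` and run as an ES module; the model id in the request body is illustrative, since the proxy runs Phi-4-mini regardless:

```js
// Minimal smoke test against the proxy's /v1/messages route (default port 3001).
// No real API key is needed for local inference.
const response = await fetch('http://localhost:3001/v1/messages', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({
        model: 'claude-3-5-sonnet-20241022', // ignored; the proxy always runs Phi-4-mini
        max_tokens: 128,
        messages: [{ role: 'user', content: 'Say hello in five words.' }]
    })
});
const message = await response.json();
console.log(message.content[0].text, message.usage);
```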
package/dist/utils/.claude-flow/metrics/agent-metrics.json
ADDED
@@ -0,0 +1 @@
+{}
package/dist/utils/cli.js
CHANGED
@@ -16,8 +16,16 @@ export function parseArgs() {
     }
     // Check for MCP command
     if (args[0] === 'mcp') {
+        const mcpSubcommand = args[1];
+        // MCP Manager commands (CLI configuration)
+        const managerCommands = ['add', 'list', 'remove', 'enable', 'disable', 'update', 'test', 'info', 'export', 'import'];
+        if (managerCommands.includes(mcpSubcommand)) {
+            options.mode = 'mcp-manager';
+            return options;
+        }
+        // MCP Server commands (start/stop server)
         options.mode = 'mcp';
-        options.mcpCommand = args[1] || 'start'; // default to start
+        options.mcpCommand = mcpSubcommand || 'start'; // default to start
         options.mcpServer = args[2] || 'all'; // default to all servers
         return options;
     }
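The practical effect is that manager subcommands now short-circuit into `mcp-manager.js` before the server-start path is considered. A condensed illustration of the new classification (not code from the package; `classifyMcpArgs` is a hypothetical helper):

```js
// Manager subcommands route to mcp-manager.js; anything else starts MCP servers.
const managerCommands = ['add', 'list', 'remove', 'enable', 'disable',
    'update', 'test', 'info', 'export', 'import'];
function classifyMcpArgs(args) {
    if (args[0] !== 'mcp') return 'agent';
    return managerCommands.includes(args[1]) ? 'mcp-manager' : 'mcp';
}
console.log(classifyMcpArgs(['mcp', 'add', 'weather'])); // 'mcp-manager'
console.log(classifyMcpArgs(['mcp', 'start']));          // 'mcp' (server mode)
console.log(classifyMcpArgs(['mcp']));                   // 'mcp' (defaults to start)
```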
package/dist/utils/modelOptimizer.js
CHANGED
@@ -16,6 +16,7 @@ const MODEL_DATABASE = {
         speed_score: 85,
         cost_score: 20,
         tier: 'flagship',
+        supports_tools: true,
         strengths: ['reasoning', 'coding', 'analysis', 'production'],
         weaknesses: ['cost'],
         bestFor: ['coder', 'reviewer', 'architecture', 'planner', 'production-validator']
@@ -30,6 +31,7 @@ const MODEL_DATABASE = {
         speed_score: 90,
         cost_score: 30,
         tier: 'flagship',
+        supports_tools: true,
         strengths: ['multimodal', 'speed', 'general-purpose', 'vision'],
         weaknesses: ['cost'],
         bestFor: ['researcher', 'analyst', 'multimodal-tasks']
@@ -44,6 +46,7 @@ const MODEL_DATABASE = {
         speed_score: 75,
         cost_score: 50,
         tier: 'flagship',
+        supports_tools: true,
         strengths: ['reasoning', 'large-context', 'math', 'analysis'],
         weaknesses: ['speed'],
         bestFor: ['planner', 'architecture', 'researcher', 'code-analyzer']
@@ -59,8 +62,9 @@ const MODEL_DATABASE = {
         speed_score: 80,
         cost_score: 100,
         tier: 'cost-effective',
+        supports_tools: false, // DeepSeek R1 does NOT support tool/function calling
         strengths: ['reasoning', 'coding', 'math', 'value', 'free'],
-        weaknesses: ['newer-model'],
+        weaknesses: ['newer-model', 'no-tool-use'],
         bestFor: ['coder', 'pseudocode', 'specification', 'refinement', 'tester']
     },
     'deepseek-chat-v3': {
@@ -73,6 +77,7 @@ const MODEL_DATABASE = {
         speed_score: 90,
         cost_score: 100,
         tier: 'cost-effective',
+        supports_tools: true,
         strengths: ['cost', 'speed', 'coding', 'development', 'free'],
         weaknesses: ['complex-reasoning'],
         bestFor: ['coder', 'reviewer', 'tester', 'backend-dev', 'cicd-engineer']
@@ -88,6 +93,7 @@ const MODEL_DATABASE = {
         speed_score: 98,
         cost_score: 98,
         tier: 'balanced',
+        supports_tools: true,
         strengths: ['speed', 'cost', 'interactive'],
         weaknesses: ['quality'],
         bestFor: ['researcher', 'planner', 'smart-agent']
@@ -102,6 +108,7 @@ const MODEL_DATABASE = {
         speed_score: 95,
         cost_score: 100,
         tier: 'balanced',
+        supports_tools: true,
         strengths: ['open-source', 'versatile', 'coding', 'free', 'fast'],
         weaknesses: ['smaller-model'],
         bestFor: ['coder', 'reviewer', 'base-template-generator', 'tester']
@@ -116,6 +123,7 @@ const MODEL_DATABASE = {
         speed_score: 85,
         cost_score: 90,
         tier: 'balanced',
+        supports_tools: true,
         strengths: ['multilingual', 'coding', 'reasoning'],
         weaknesses: ['english-optimized'],
         bestFor: ['researcher', 'coder', 'multilingual-tasks']
@@ -131,6 +139,7 @@ const MODEL_DATABASE = {
         speed_score: 95,
         cost_score: 99,
         tier: 'budget',
+        supports_tools: true,
         strengths: ['ultra-low-cost', 'speed'],
         weaknesses: ['quality', 'complex-tasks'],
         bestFor: ['simple-tasks', 'testing']
@@ -146,6 +155,7 @@ const MODEL_DATABASE = {
         speed_score: 30,
         cost_score: 100,
         tier: 'local',
+        supports_tools: false,
         strengths: ['privacy', 'offline', 'zero-cost'],
         weaknesses: ['quality', 'speed'],
         bestFor: ['privacy-tasks', 'offline-tasks']
@@ -197,8 +207,14 @@ export class ModelOptimizer {
         const taskComplexity = criteria.taskComplexity || this.inferComplexity(criteria.task);
         // Set default priority to balanced if not specified
         const priority = criteria.priority || 'balanced';
+        // Filter models that support tools if required
+        let availableModels = Object.entries(MODEL_DATABASE);
+        if (criteria.requiresTools) {
+            availableModels = availableModels.filter(([key, model]) => model.supports_tools !== false);
+            logger.info(`Filtered to ${availableModels.length} models with tool support`);
+        }
         // Score all models
-        const scoredModels = Object.entries(MODEL_DATABASE).map(([key, model]) => {
+        const scoredModels = availableModels.map(([key, model]) => {
             // Calculate overall score based on priority
             let overall_score;
             switch (priority) {
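Given the `supports_tools` flags above, a recommendation request with `requiresTools: true` (as cli-proxy.js now passes) drops `deepseek-r1` and the local ONNX entry before scoring. A condensed sketch of the filter, with database entries abbreviated to the relevant flag and the keys shortened for illustration:

```js
// Condensed sketch of the tool-support filter added above; real MODEL_DATABASE
// entries also carry quality/speed/cost scores, tiers, strengths, etc.
const MODEL_DATABASE = {
    'claude-sonnet': { supports_tools: true },
    'deepseek-r1': { supports_tools: false },   // no tool/function calling
    'deepseek-chat-v3': { supports_tools: true },
    'onnx-local': { supports_tools: false }
};
function candidateModels(requiresTools) {
    let available = Object.entries(MODEL_DATABASE);
    if (requiresTools) {
        // supports_tools !== false also keeps entries missing the flag,
        // matching the permissive check in modelOptimizer.js
        available = available.filter(([, model]) => model.supports_tools !== false);
    }
    return available.map(([key]) => key);
}
console.log(candidateModels(true)); // ['claude-sonnet', 'deepseek-chat-v3']
```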
package/docs/.claude-flow/metrics/task-metrics.json
CHANGED
@@ -1,10 +1,10 @@
 [
   {
-    "id": "cmd-hooks-
+    "id": "cmd-hooks-1759768557202",
     "type": "hooks",
     "success": true,
-    "duration":
-    "timestamp":
+    "duration": 9.268956000000003,
+    "timestamp": 1759768557212,
     "metadata": {}
   }
 ]