agentic-flow 1.1.0 → 1.1.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +305 -158
- package/dist/cli-proxy.js +94 -17
- package/dist/proxy/anthropic-to-openrouter.js +5 -1
- package/dist/router/providers/gemini.js +126 -0
- package/dist/router/router.js +12 -0
- package/dist/utils/modelOptimizer.js +22 -22
- package/package.json +2 -1
package/README.md
CHANGED
|
@@ -1,31 +1,58 @@
|
|
|
1
1
|
# 🤖 Agentic Flow
|
|
2
2
|
|
|
3
|
-
**
|
|
3
|
+
**Production-Ready AI Agent Orchestration with Multi-Model Router, OpenRouter Integration & Free Local Inference**
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Agentic Flow works with any agent or command built or used in Claude Code. It automatically runs through the Claude Agent SDK, forming swarms of intelligent, cost and performance-optimized agents that decide how to execute each task. Built for business, government, and commercial use where cost, traceability, and reliability matter.
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
Agentic Flow runs Claude Code agents at near zero cost without rewriting a thing. It routes every task to the cheapest lane that still meets the bar. Local ONNX when privacy or price wins. OpenRouter for breadth. Gemini for speed. Anthropic when quality matters most. One agent. Any model. Lowest viable cost.
|
|
8
|
+
|
|
9
|
+
The system takes the Claude SDK's logic and merges it with Claude Flow memory to give every agent a durable brain. Each run logs inputs, outputs, and route decisions with artifacts, manifests, and checksums for proof and reproducibility. It self-optimizes in real time, balancing price, latency, and accuracy through a simple policy file.
|
|
10
|
+
|
|
11
|
+
Strict mode keeps sensitive data offline. Economy mode prefers ONNX or OpenRouter. Premium mode goes Anthropic first. The policy defines the rules, and the swarm enforces them automatically.
|
|
12
|
+
|
|
13
|
+
It runs anywhere: local for dev, Docker for CI, or Flow Nexus for scale. With project-scoped settings, explicit tool allowlists, and an offline privacy lane, it stays secure by default.
|
|
14
|
+
|
|
15
|
+
**Agentic Flow is the framework for autonomous efficiency—one unified runner for every Claude Code agent, self-tuning, self-routing, and built for real-world deployment.**
|
|
16
|
+
|
|
17
|
+
Built on **[Claude Agent SDK](https://docs.claude.com/en/api/agent-sdk)** by Anthropic, powered by **[Claude Flow](https://github.com/ruvnet/claude-flow)** (101 MCP tools), **[Flow Nexus](https://github.com/ruvnet/flow-nexus)** (96 cloud tools), **[OpenRouter](https://openrouter.ai)** (100+ LLM models), **Google Gemini** (fast, cost-effective inference), **[Agentic Payments](https://www.npmjs.com/package/agentic-payments)** (payment authorization), and **ONNX Runtime** (free local CPU or GPU inference).
|
|
8
18
|
|
|
9
19
|
[](https://www.npmjs.com/package/agentic-flow)
|
|
20
|
+
[](https://www.npmjs.com/package/agentic-flow)
|
|
21
|
+
[](https://www.npmjs.com/package/agentic-flow)
|
|
10
22
|
[](https://opensource.org/licenses/MIT)
|
|
11
23
|
[](https://nodejs.org/)
|
|
24
|
+
[](https://github.com/ruvnet/)
|
|
25
|
+
[](https://github.com/ruvnet/agentic-flow#-agent-types)
|
|
12
26
|
|
|
13
27
|
---
|
|
14
28
|
|
|
15
29
|
## Why Agentic Flow?
|
|
16
30
|
|
|
17
|
-
|
|
31
|
+
**The Problem:** You need agents that actually complete tasks, not chatbots that need constant supervision. Long-running workflows - migrating codebases, generating documentation, analyzing datasets - shouldn't require you to sit there clicking "continue."
|
|
32
|
+
|
|
33
|
+
**What True Agentic Systems Need:**
|
|
34
|
+
- **Autonomy** - Agents that plan, execute, and recover from errors without hand-holding
|
|
35
|
+
- **Persistence** - Tasks that run for hours, even when you're offline
|
|
36
|
+
- **Collaboration** - Multiple agents coordinating on complex work
|
|
37
|
+
- **Tool Access** - Real capabilities: file systems, APIs, databases, not just text generation
|
|
38
|
+
- **Cost Control** - Run cheap models for grunt work, expensive ones only when needed
|
|
39
|
+
|
|
40
|
+
**What You Get:**
|
|
18
41
|
|
|
19
|
-
- **
|
|
20
|
-
- **
|
|
21
|
-
- **
|
|
22
|
-
- **
|
|
23
|
-
- **Auto
|
|
24
|
-
- **
|
|
25
|
-
- **Production-Ready** - Built on battle-tested Claude Agent SDK v0.1.5
|
|
26
|
-
- **Model Flexibility** - Use Claude, OpenRouter, or free local ONNX models
|
|
42
|
+
- **150+ Specialized Agents** - Researcher, coder, reviewer, tester, architect - each with domain expertise and tool access
|
|
43
|
+
- **Multi-Agent Swarms** - Deploy 3, 10, or 100 agents that collaborate via shared memory to complete complex projects
|
|
44
|
+
- **Long-Running Tasks** - Agents persist through hours-long operations: full codebase refactors, comprehensive audits, dataset processing
|
|
45
|
+
- **213 MCP Tools** - Agents have real capabilities: GitHub operations, neural network training, workflow automation, memory persistence
|
|
46
|
+
- **Auto Model Optimization** - `--optimize` flag intelligently selects best model for each task. DeepSeek R1 costs 85% less than Claude with similar quality. Save $2,400/month on 100 daily reviews.
|
|
47
|
+
- **Deploy Anywhere** - Same agentic capabilities locally, in Docker/Kubernetes, or cloud sandboxes
|
|
27
48
|
|
|
28
|
-
|
|
49
|
+
**Real Agentic Use Cases:**
|
|
50
|
+
- **Overnight Code Migration** - Deploy a swarm to migrate a 50K line codebase from JavaScript to TypeScript while you sleep
|
|
51
|
+
- **Continuous Security Audits** - Agents monitor repos, analyze PRs, and flag vulnerabilities 24/7
|
|
52
|
+
- **Automated API Development** - One agent designs schema, another implements endpoints, a third writes tests - all coordinated
|
|
53
|
+
- **Data Pipeline Processing** - Agents process TBs of data across distributed sandboxes, checkpoint progress, and recover from failures
|
|
54
|
+
|
|
55
|
+
> **True autonomy at commodity prices.** Your agents work independently on long-running tasks, coordinate when needed, and cost pennies per hour instead of dollars.
|
|
29
56
|
|
|
30
57
|
### Built on Industry Standards
|
|
31
58
|
|
|
@@ -40,7 +67,7 @@ Traditional AI frameworks require persistent infrastructure and complex orchestr
|
|
|
40
67
|
|
|
41
68
|
## 🚀 Quick Start
|
|
42
69
|
|
|
43
|
-
### Installation
|
|
70
|
+
### Local Installation (Recommended for Development)
|
|
44
71
|
|
|
45
72
|
```bash
|
|
46
73
|
# Global installation
|
|
@@ -48,29 +75,11 @@ npm install -g agentic-flow
|
|
|
48
75
|
|
|
49
76
|
# Or use directly with npx (no installation)
|
|
50
77
|
npx agentic-flow --help
|
|
51
|
-
```
|
|
52
78
|
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
```bash
|
|
56
|
-
# Launch interactive configuration wizard
|
|
57
|
-
npx agentic-flow config
|
|
58
|
-
|
|
59
|
-
# Or use direct commands
|
|
60
|
-
npx agentic-flow config set ANTHROPIC_API_KEY sk-ant-xxxxx
|
|
61
|
-
npx agentic-flow config set PROVIDER anthropic
|
|
62
|
-
npx agentic-flow config list
|
|
79
|
+
# Set your API key
|
|
80
|
+
export ANTHROPIC_API_KEY=sk-ant-...
|
|
63
81
|
```
|
|
64
82
|
|
|
65
|
-
The wizard helps you configure:
|
|
66
|
-
- **API Keys** - Anthropic, OpenRouter with validation
|
|
67
|
-
- **Provider Settings** - Choose default provider (anthropic/openrouter/onnx)
|
|
68
|
-
- **Model Selection** - Set default models
|
|
69
|
-
- **Custom Paths** - Configure agents directory
|
|
70
|
-
- **Advanced Options** - Proxy port, feature flags
|
|
71
|
-
|
|
72
|
-
All configuration is saved to `.env` with helpful comments.
|
|
73
|
-
|
|
74
83
|
### Your First Agent (Local Execution)
|
|
75
84
|
|
|
76
85
|
```bash
|
|
@@ -180,7 +189,7 @@ docker run --rm \
|
|
|
180
189
|
- **Pay-Per-Use** - Only pay for actual sandbox runtime (≈$1/hour)
|
|
181
190
|
|
|
182
191
|
### 🤖 Intelligent Agents
|
|
183
|
-
- **
|
|
192
|
+
- **150+ Pre-Built Specialists** - Researchers, coders, testers, reviewers, architects
|
|
184
193
|
- **Swarm Coordination** - Agents collaborate via shared memory
|
|
185
194
|
- **Tool Access** - 200+ MCP tools for GitHub, neural networks, workflows
|
|
186
195
|
- **Custom Agents** - Define your own in YAML with system prompts
|
|
@@ -351,35 +360,34 @@ spec:
|
|
|
351
360
|
}
|
|
352
361
|
```
|
|
353
362
|
|
|
354
|
-
###
|
|
355
|
-
```javascript
|
|
356
|
-
// Lambda limitations: No MCP subprocesses, only 6 in-SDK tools
|
|
357
|
-
exports.handler = async (event) => {
|
|
358
|
-
// ❌ claude-flow MCP server won't work (subprocess not allowed)
|
|
359
|
-
// ❌ flow-nexus MCP server won't work (subprocess not allowed)
|
|
360
|
-
// ✅ Only claude-flow-sdk in-SDK tools available (6 tools)
|
|
361
|
-
|
|
362
|
-
const result = await query({
|
|
363
|
-
prompt: event.query,
|
|
364
|
-
options: {
|
|
365
|
-
mcpServers: {
|
|
366
|
-
'claude-flow-sdk': claudeFlowSdkServer // Only 6 tools work
|
|
367
|
-
// 'claude-flow': subprocess blocked by Lambda
|
|
368
|
-
// 'flow-nexus': subprocess blocked by Lambda
|
|
369
|
-
}
|
|
370
|
-
}
|
|
371
|
-
});
|
|
363
|
+
### 🔓 ONNX Local Inference (Free Offline AI)
|
|
372
364
|
|
|
373
|
-
|
|
374
|
-
|
|
365
|
+
**Run agents completely offline with zero API costs:**
|
|
366
|
+
|
|
367
|
+
```bash
|
|
368
|
+
# Auto-downloads Phi-4 model (~4.9GB one-time download)
|
|
369
|
+
npx agentic-flow \
|
|
370
|
+
--agent coder \
|
|
371
|
+
--task "Build a REST API" \
|
|
372
|
+
--provider onnx
|
|
373
|
+
|
|
374
|
+
# Router auto-selects ONNX for privacy-sensitive tasks
|
|
375
|
+
npx agentic-flow \
|
|
376
|
+
--agent researcher \
|
|
377
|
+
--task "Analyze confidential medical records" \
|
|
378
|
+
--privacy high \
|
|
379
|
+
--local-only
|
|
375
380
|
```
|
|
376
381
|
|
|
377
|
-
**
|
|
378
|
-
-
|
|
379
|
-
-
|
|
380
|
-
-
|
|
381
|
-
-
|
|
382
|
-
- ✅
|
|
382
|
+
**ONNX Capabilities:**
|
|
383
|
+
- ✅ 100% free local inference (Microsoft Phi-4 model)
|
|
384
|
+
- ✅ Privacy: All processing stays on your machine
|
|
385
|
+
- ✅ Offline: No internet required after model download
|
|
386
|
+
- ✅ Performance: ~6 tokens/sec CPU, 60-300 tokens/sec GPU
|
|
387
|
+
- ✅ Auto-download: Model fetches automatically on first use
|
|
388
|
+
- ✅ Quantized: INT4 optimization for efficiency (~4.9GB total)
|
|
389
|
+
- ⚠️ Limited to 6 in-SDK tools (no subprocess MCP servers)
|
|
390
|
+
- 📚 See [docs](docs/ONNX_INTEGRATION.md) for full capabilities
|
|
383
391
|
|
|
384
392
|
---
|
|
385
393
|
|
|
@@ -441,50 +449,174 @@ Docker: Infrastructure costs (AWS/GCP/Azure) + Claude API costs.*
|
|
|
441
449
|
- **`production-validator`** - Deployment readiness checks
|
|
442
450
|
- **`tdd-london-swarm`** - Test-driven development
|
|
443
451
|
|
|
444
|
-
*Use `npx agentic-flow --list` to see all
|
|
452
|
+
*Use `npx agentic-flow --list` to see all 150+ agents*
|
|
453
|
+
|
|
454
|
+
---
|
|
455
|
+
|
|
456
|
+
## 🎯 Model Optimization (NEW!)
|
|
457
|
+
|
|
458
|
+
**Automatically select the optimal model for any agent and task**, balancing quality, cost, and speed based on your priorities.
|
|
459
|
+
|
|
460
|
+
### Why Model Optimization?
|
|
461
|
+
|
|
462
|
+
Different tasks need different models:
|
|
463
|
+
- **Production code** → Claude Sonnet 4.5 (highest quality)
|
|
464
|
+
- **Code reviews** → DeepSeek R1 (85% cheaper, nearly same quality)
|
|
465
|
+
- **Simple functions** → Llama 3.1 8B (99% cheaper)
|
|
466
|
+
- **Privacy-critical** → ONNX Phi-4 (free, local, offline)
|
|
467
|
+
|
|
468
|
+
**The optimizer analyzes your agent type + task complexity and recommends the best model automatically.**
|
|
469
|
+
|
|
470
|
+
### Quick Examples
|
|
471
|
+
|
|
472
|
+
```bash
|
|
473
|
+
# Let the optimizer choose (balanced quality vs cost)
|
|
474
|
+
npx agentic-flow --agent coder --task "Build REST API" --optimize
|
|
475
|
+
|
|
476
|
+
# Optimize for lowest cost
|
|
477
|
+
npx agentic-flow --agent coder --task "Simple function" --optimize --priority cost
|
|
478
|
+
|
|
479
|
+
# Optimize for highest quality
|
|
480
|
+
npx agentic-flow --agent reviewer --task "Security audit" --optimize --priority quality
|
|
481
|
+
|
|
482
|
+
# Optimize for speed
|
|
483
|
+
npx agentic-flow --agent researcher --task "Quick analysis" --optimize --priority speed
|
|
484
|
+
|
|
485
|
+
# Set maximum budget ($0.001 per task)
|
|
486
|
+
npx agentic-flow --agent coder --task "Code cleanup" --optimize --max-cost 0.001
|
|
487
|
+
```
|
|
488
|
+
|
|
489
|
+
### Optimization Priorities
|
|
490
|
+
|
|
491
|
+
- **`quality`** (70% quality, 20% speed, 10% cost) - Best results, production code
|
|
492
|
+
- **`balanced`** (40% quality, 40% cost, 20% speed) - Default, good mix
|
|
493
|
+
- **`cost`** (70% cost, 20% quality, 10% speed) - Cheapest, development/testing
|
|
494
|
+
- **`speed`** (70% speed, 20% quality, 10% cost) - Fastest responses
|
|
495
|
+
- **`privacy`** - Local-only models (ONNX), zero cloud API calls
|
|
496
|
+
|
|
497
|
+
### Model Tier Examples
|
|
498
|
+
|
|
499
|
+
The optimizer chooses from 10+ models across 5 tiers:
|
|
500
|
+
|
|
501
|
+
**Tier 1: Flagship** (premium quality)
|
|
502
|
+
- Claude Sonnet 4.5 - $3/$15 per 1M tokens
|
|
503
|
+
- GPT-4o - $2.50/$10 per 1M tokens
|
|
504
|
+
- Gemini 2.5 Pro - $0.00/$2.00 per 1M tokens
|
|
505
|
+
|
|
506
|
+
**Tier 2: Cost-Effective** (2025 breakthrough models)
|
|
507
|
+
- **DeepSeek R1** - $0.55/$2.19 per 1M tokens (85% cheaper, flagship quality)
|
|
508
|
+
- **DeepSeek Chat V3** - $0.14/$0.28 per 1M tokens (98% cheaper)
|
|
509
|
+
|
|
510
|
+
**Tier 3: Balanced**
|
|
511
|
+
- Gemini 2.5 Flash - $0.07/$0.30 per 1M tokens (fastest)
|
|
512
|
+
- Llama 3.3 70B - $0.30/$0.30 per 1M tokens (open-source)
|
|
513
|
+
|
|
514
|
+
**Tier 4: Budget**
|
|
515
|
+
- Llama 3.1 8B - $0.055/$0.055 per 1M tokens (ultra-low cost)
|
|
516
|
+
|
|
517
|
+
**Tier 5: Local/Privacy**
|
|
518
|
+
- **ONNX Phi-4** - FREE (offline, private, no API)
|
|
519
|
+
|
|
520
|
+
### Agent-Specific Recommendations
|
|
521
|
+
|
|
522
|
+
The optimizer knows what each agent needs:
|
|
523
|
+
|
|
524
|
+
```bash
|
|
525
|
+
# Coder agent → prefers high quality (min 85/100)
|
|
526
|
+
npx agentic-flow --agent coder --task "Production API" --optimize
|
|
527
|
+
# → Selects: DeepSeek R1 (quality 90, cost 85)
|
|
528
|
+
|
|
529
|
+
# Researcher agent → flexible, can use cheaper models
|
|
530
|
+
npx agentic-flow --agent researcher --task "Trend analysis" --optimize --priority cost
|
|
531
|
+
# → Selects: Gemini 2.5 Flash (quality 78, cost 98)
|
|
532
|
+
|
|
533
|
+
# Reviewer agent → needs reasoning (min 85/100)
|
|
534
|
+
npx agentic-flow --agent reviewer --task "Security review" --optimize
|
|
535
|
+
# → Selects: DeepSeek R1 (quality 90, reasoning-optimized)
|
|
536
|
+
|
|
537
|
+
# Tester agent → simple tasks, use budget models
|
|
538
|
+
npx agentic-flow --agent tester --task "Unit tests" --optimize --priority cost
|
|
539
|
+
# → Selects: Llama 3.1 8B (cost 95)
|
|
540
|
+
```
|
|
541
|
+
|
|
542
|
+
### Cost Savings Examples
|
|
543
|
+
|
|
544
|
+
**Without Optimization** (always using Claude Sonnet 4.5):
|
|
545
|
+
- 100 code reviews/day × $0.08 each = **$8/day = $240/month**
|
|
546
|
+
|
|
547
|
+
**With Optimization** (DeepSeek R1 for reviews):
|
|
548
|
+
- 100 code reviews/day × $0.012 each = **$1.20/day = $36/month**
|
|
549
|
+
- **Savings: $204/month (85% reduction)**
|
|
550
|
+
|
|
551
|
+
### Comprehensive Model Guide
|
|
552
|
+
|
|
553
|
+
For detailed analysis of all 10 models, see:
|
|
554
|
+
📖 **[Model Capabilities Guide](docs/agentic-flow/benchmarks/MODEL_CAPABILITIES.md)**
|
|
555
|
+
|
|
556
|
+
Includes:
|
|
557
|
+
- Full benchmark results across 6 task types
|
|
558
|
+
- Cost comparison tables
|
|
559
|
+
- Use case decision matrices
|
|
560
|
+
- Performance characteristics
|
|
561
|
+
- Best practices by model
|
|
562
|
+
|
|
563
|
+
### MCP Tool for Optimization
|
|
564
|
+
|
|
565
|
+
```javascript
|
|
566
|
+
// Get model recommendation via MCP tool
|
|
567
|
+
await query({
|
|
568
|
+
mcp: {
|
|
569
|
+
server: 'agentic-flow',
|
|
570
|
+
tool: 'agentic_flow_optimize_model',
|
|
571
|
+
params: {
|
|
572
|
+
agent: 'coder',
|
|
573
|
+
task: 'Build REST API with auth',
|
|
574
|
+
priority: 'balanced', // quality | balanced | cost | speed | privacy
|
|
575
|
+
max_cost: 0.01 // optional budget cap in dollars
|
|
576
|
+
}
|
|
577
|
+
}
|
|
578
|
+
});
|
|
579
|
+
```
|
|
580
|
+
|
|
581
|
+
**Learn More:**
|
|
582
|
+
- See [benchmarks/README.md](docs/agentic-flow/benchmarks/README.md) for quick results
|
|
583
|
+
- Run your own tests: `cd docs/agentic-flow/benchmarks && ./quick-benchmark.sh`
|
|
445
584
|
|
|
446
585
|
---
|
|
447
586
|
|
|
448
587
|
## 📋 Commands
|
|
449
588
|
|
|
450
|
-
###
|
|
589
|
+
### MCP Server Management (Direct Tool Access)
|
|
451
590
|
|
|
452
591
|
```bash
|
|
453
|
-
#
|
|
454
|
-
npx agentic-flow
|
|
592
|
+
# Start all MCP servers (213 tools)
|
|
593
|
+
npx agentic-flow mcp start
|
|
455
594
|
|
|
456
|
-
#
|
|
457
|
-
npx agentic-flow
|
|
458
|
-
npx agentic-flow
|
|
459
|
-
npx agentic-flow
|
|
460
|
-
npx agentic-flow config set COMPLETION_MODEL meta-llama/llama-3.1-8b-instruct
|
|
595
|
+
# Start specific MCP server
|
|
596
|
+
npx agentic-flow mcp start claude-flow # 101 tools
|
|
597
|
+
npx agentic-flow mcp start flow-nexus # 96 cloud tools
|
|
598
|
+
npx agentic-flow mcp start agentic-payments # Payment tools
|
|
461
599
|
|
|
462
|
-
#
|
|
463
|
-
npx agentic-flow
|
|
464
|
-
npx agentic-flow config get PROVIDER
|
|
600
|
+
# List all available MCP tools (213 total)
|
|
601
|
+
npx agentic-flow mcp list
|
|
465
602
|
|
|
466
|
-
#
|
|
467
|
-
npx agentic-flow
|
|
468
|
-
npx agentic-flow config reset
|
|
603
|
+
# Check MCP server status
|
|
604
|
+
npx agentic-flow mcp status
|
|
469
605
|
|
|
470
|
-
#
|
|
471
|
-
npx agentic-flow
|
|
606
|
+
# Stop MCP servers
|
|
607
|
+
npx agentic-flow mcp stop [server]
|
|
472
608
|
```
|
|
473
609
|
|
|
474
|
-
**
|
|
475
|
-
-
|
|
476
|
-
-
|
|
477
|
-
-
|
|
478
|
-
-
|
|
479
|
-
- `AGENTS_DIR` - Custom agents directory path
|
|
480
|
-
- `PROXY_PORT` - Proxy server port (default: 3000)
|
|
481
|
-
- `USE_OPENROUTER` - Force OpenRouter usage (true/false)
|
|
482
|
-
- `USE_ONNX` - Use ONNX local inference (true/false)
|
|
610
|
+
**MCP Servers Available:**
|
|
611
|
+
- **claude-flow** (101 tools): Neural networks, GitHub integration, workflows, DAA, performance
|
|
612
|
+
- **flow-nexus** (96 tools): E2B sandboxes, distributed swarms, templates, cloud storage
|
|
613
|
+
- **agentic-payments** (10 tools): Payment authorization, Ed25519 signatures, consensus
|
|
614
|
+
- **claude-flow-sdk** (6 tools): In-process memory and swarm coordination
|
|
483
615
|
|
|
484
616
|
### Basic Operations (Works Locally, Docker, Cloud)
|
|
485
617
|
|
|
486
618
|
```bash
|
|
487
|
-
# List all available agents (
|
|
619
|
+
# List all available agents (150+ total)
|
|
488
620
|
npx agentic-flow --list
|
|
489
621
|
|
|
490
622
|
# Run specific agent (local execution)
|
|
@@ -497,15 +629,12 @@ npx agentic-flow --agent coder --task "Build API" --stream
|
|
|
497
629
|
npx agentic-flow # Requires TOPIC, DIFF, DATASET env vars
|
|
498
630
|
```
|
|
499
631
|
|
|
500
|
-
### Environment Configuration
|
|
632
|
+
### Environment Configuration
|
|
501
633
|
|
|
502
634
|
```bash
|
|
503
|
-
# Required
|
|
635
|
+
# Required
|
|
504
636
|
export ANTHROPIC_API_KEY=sk-ant-...
|
|
505
637
|
|
|
506
|
-
# Or use OpenRouter
|
|
507
|
-
export OPENROUTER_API_KEY=sk-or-v1-...
|
|
508
|
-
|
|
509
638
|
# Agent mode (optional)
|
|
510
639
|
export AGENT=researcher
|
|
511
640
|
export TASK="Your task description"
|
|
@@ -775,9 +904,34 @@ npx agentic-flow \
|
|
|
775
904
|
|
|
776
905
|
---
|
|
777
906
|
|
|
778
|
-
## 🔧 MCP Tools (
|
|
907
|
+
## 🔧 MCP Tools (213 Total)
|
|
779
908
|
|
|
780
|
-
Agentic Flow integrates with **four MCP servers** providing
|
|
909
|
+
Agentic Flow integrates with **four MCP servers** providing 213 tools total:
|
|
910
|
+
|
|
911
|
+
### Direct MCP Access
|
|
912
|
+
|
|
913
|
+
You can now directly manage MCP servers via the CLI:
|
|
914
|
+
|
|
915
|
+
```bash
|
|
916
|
+
# Start all MCP servers
|
|
917
|
+
npx agentic-flow mcp start
|
|
918
|
+
|
|
919
|
+
# List all 213 available tools
|
|
920
|
+
npx agentic-flow mcp list
|
|
921
|
+
|
|
922
|
+
# Check server status
|
|
923
|
+
npx agentic-flow mcp status
|
|
924
|
+
|
|
925
|
+
# Start specific server
|
|
926
|
+
npx agentic-flow mcp start claude-flow
|
|
927
|
+
```
|
|
928
|
+
|
|
929
|
+
**How It Works:**
|
|
930
|
+
1. **Automatic** (Recommended): Agents automatically access all 213 tools when you run tasks
|
|
931
|
+
2. **Manual**: Use `npx agentic-flow mcp <command>` for direct server management
|
|
932
|
+
3. **Integrated**: All tools work seamlessly whether accessed automatically or manually
|
|
933
|
+
|
|
934
|
+
### Tool Breakdown
|
|
781
935
|
|
|
782
936
|
### Core Orchestration (claude-flow - 101 tools)
|
|
783
937
|
|
|
@@ -894,15 +1048,16 @@ Add to your MCP config (`~/.config/claude/mcp.json`):
|
|
|
894
1048
|
|
|
895
1049
|
## 🔍 Deployment Comparison
|
|
896
1050
|
|
|
897
|
-
| Feature | Local | Docker | Flow Nexus Sandboxes |
|
|
1051
|
+
| Feature | Local | Docker | Flow Nexus Sandboxes | ONNX Local |
|
|
898
1052
|
|---------|-------|--------|----------------------|------------|
|
|
899
1053
|
| **MCP Tools Available** | 203 (100%) | 203 (100%) | 203 (100%) | 6 (3%) |
|
|
900
|
-
| **Setup Complexity** | Low | Medium | Medium |
|
|
901
|
-
| **Cold Start Time** | <500ms | <2s | <2s |
|
|
902
|
-
| **Cost (Development)** | Free* | Free* | $1/hour | $0
|
|
903
|
-
| **Cost (Production)** | Free* | Infra costs | $1/hour |
|
|
904
|
-
| **
|
|
905
|
-
| **
|
|
1054
|
+
| **Setup Complexity** | Low | Medium | Medium | Low |
|
|
1055
|
+
| **Cold Start Time** | <500ms | <2s | <2s | ~2s (first load) |
|
|
1056
|
+
| **Cost (Development)** | Free* | Free* | $1/hour | $0 (100% free) |
|
|
1057
|
+
| **Cost (Production)** | Free* | Infra costs | $1/hour | $0 (100% free) |
|
|
1058
|
+
| **Privacy** | Local | Local | Cloud | 100% Offline |
|
|
1059
|
+
| **Scaling** | Manual | Orchestrator | Automatic | Manual |
|
|
1060
|
+
| **Best For** | Dev/Testing | CI/CD/Prod | Cloud-Scale | Privacy/Offline |
|
|
906
1061
|
|
|
907
1062
|
*Free infrastructure, Claude API costs only
|
|
908
1063
|
|
|
@@ -1033,63 +1188,55 @@ spec:
|
|
|
1033
1188
|
- Implement PodDisruptionBudgets
|
|
1034
1189
|
- All 203 MCP tools available
|
|
1035
1190
|
|
|
1036
|
-
###
|
|
1191
|
+
### 💡 ONNX Local Inference - Extended Configuration
|
|
1037
1192
|
|
|
1038
|
-
|
|
1193
|
+
**Advanced ONNX setup with router integration:**
|
|
1039
1194
|
|
|
1040
1195
|
```javascript
|
|
1041
|
-
//
|
|
1042
|
-
|
|
1043
|
-
|
|
1044
|
-
|
|
1045
|
-
|
|
1046
|
-
|
|
1047
|
-
|
|
1048
|
-
|
|
1049
|
-
|
|
1050
|
-
|
|
1051
|
-
|
|
1052
|
-
mcpServers: {
|
|
1053
|
-
// ✅ Works: In-SDK server (6 tools)
|
|
1054
|
-
'claude-flow-sdk': claudeFlowSdkServer,
|
|
1055
|
-
|
|
1056
|
-
// ❌ Blocked: Cannot spawn subprocess
|
|
1057
|
-
// 'claude-flow': { command: 'npx', args: [...] },
|
|
1058
|
-
|
|
1059
|
-
// ❌ Blocked: Cannot spawn subprocess
|
|
1060
|
-
// 'flow-nexus': { command: 'npx', args: [...] }
|
|
1196
|
+
// router.config.json - Auto-route privacy tasks to ONNX
|
|
1197
|
+
{
|
|
1198
|
+
"routing": {
|
|
1199
|
+
"rules": [
|
|
1200
|
+
{
|
|
1201
|
+
"condition": { "privacy": "high", "localOnly": true },
|
|
1202
|
+
"action": { "provider": "onnx" }
|
|
1203
|
+
},
|
|
1204
|
+
{
|
|
1205
|
+
"condition": { "cost": "free" },
|
|
1206
|
+
"action": { "provider": "onnx" }
|
|
1061
1207
|
}
|
|
1208
|
+
]
|
|
1209
|
+
},
|
|
1210
|
+
"providers": {
|
|
1211
|
+
"onnx": {
|
|
1212
|
+
"modelPath": "./models/phi-4/model.onnx",
|
|
1213
|
+
"maxTokens": 2048,
|
|
1214
|
+
"temperature": 0.7
|
|
1062
1215
|
}
|
|
1063
|
-
}
|
|
1064
|
-
|
|
1065
|
-
return { statusCode: 200, body: JSON.stringify(result) };
|
|
1066
|
-
};
|
|
1216
|
+
}
|
|
1217
|
+
}
|
|
1067
1218
|
```
|
|
1068
1219
|
|
|
1069
|
-
**
|
|
1070
|
-
|
|
|
1071
|
-
|
|
1072
|
-
|
|
|
1073
|
-
|
|
|
1074
|
-
|
|
|
1075
|
-
|
|
|
1076
|
-
|
|
|
1077
|
-
| Total Tools | 6/203 | Only 3% of tools work |
|
|
1220
|
+
**Performance Benchmarks:**
|
|
1221
|
+
| Metric | CPU (Intel i7) | GPU (NVIDIA RTX 3060) |
|
|
1222
|
+
|--------|---------------|----------------------|
|
|
1223
|
+
| Tokens/sec | ~6 | 60-300 |
|
|
1224
|
+
| First Token | ~2s | ~500ms |
|
|
1225
|
+
| Model Load | ~3s | ~2s |
|
|
1226
|
+
| Memory Usage | ~2GB | ~3GB |
|
|
1227
|
+
| Cost | $0 | $0 |
|
|
1078
1228
|
|
|
1079
|
-
**
|
|
1080
|
-
|
|
1081
|
-
|
|
1082
|
-
|
|
1083
|
-
|
|
1084
|
-
|
|
1085
|
-
**Solution: Use Flow Nexus sandboxes instead** - Full 203 tool support with Lambda-triggered sandbox execution:
|
|
1086
|
-
|
|
1087
|
-
```javascript
|
|
1088
|
-
// ✅ RECOMMENDED: Lambda triggers Flow Nexus sandbox
|
|
1089
|
-
import { flowNexus } from 'flow-nexus';
|
|
1229
|
+
**Use Cases:**
|
|
1230
|
+
- ✅ Privacy-sensitive data processing
|
|
1231
|
+
- ✅ Offline/air-gapped environments
|
|
1232
|
+
- ✅ Cost-conscious development
|
|
1233
|
+
- ✅ Compliance requirements (HIPAA, GDPR)
|
|
1234
|
+
- ✅ Prototype/testing without API costs
|
|
1090
1235
|
|
|
1091
|
-
|
|
1092
|
-
|
|
1236
|
+
**Documentation:**
|
|
1237
|
+
- [ONNX Integration Guide](docs/ONNX_INTEGRATION.md)
|
|
1238
|
+
- [ONNX CLI Usage](docs/ONNX_CLI_USAGE.md)
|
|
1239
|
+
- [ONNX vs Claude Quality Analysis](docs/ONNX_VS_CLAUDE_QUALITY.md)
|
|
1093
1240
|
const sandbox = await flowNexus.sandboxCreate({
|
|
1094
1241
|
template: 'node',
|
|
1095
1242
|
env_vars: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY }
|
|
@@ -1225,16 +1372,16 @@ npx agentic-flow --agent flow-nexus-sandbox \
|
|
|
1225
1372
|
| **Concurrent Agents** | 10+ on t3.small, 100+ on c6a.xlarge |
|
|
1226
1373
|
| **Token Efficiency** | 32% reduction via swarm coordination |
|
|
1227
1374
|
|
|
1228
|
-
### Cost Analysis
|
|
1375
|
+
### Cost Analysis - ONNX vs Cloud APIs
|
|
1229
1376
|
|
|
1230
|
-
|
|
|
1231
|
-
|
|
1232
|
-
|
|
|
1233
|
-
|
|
|
1234
|
-
|
|
|
1235
|
-
|
|
|
1377
|
+
| Provider | Model | Tokens/sec | Cost per 1M tokens | Monthly (100K tasks) |
|
|
1378
|
+
|----------|-------|------------|-------------------|---------------------|
|
|
1379
|
+
| ONNX Local | Phi-4 | 6-300 | $0 | $0 |
|
|
1380
|
+
| OpenRouter | Llama 3.1 8B | API | $0.06 | $6 |
|
|
1381
|
+
| OpenRouter | DeepSeek | API | $0.14 | $14 |
|
|
1382
|
+
| Claude | Sonnet 3.5 | API | $3.00 | $300 |
|
|
1236
1383
|
|
|
1237
|
-
|
|
1384
|
+
**ONNX Savings:** Up to $3,600/year for typical development workloads
|
|
1238
1385
|
|
|
1239
1386
|
---
|
|
1240
1387
|
|
package/dist/cli-proxy.js
CHANGED
|
@@ -82,15 +82,16 @@ class AgenticFlowCLI {
|
|
|
82
82
|
}
|
|
83
83
|
console.log(`✅ Using optimized model: ${recommendation.modelName}\n`);
|
|
84
84
|
}
|
|
85
|
-
// Determine
|
|
85
|
+
// Determine which provider to use
|
|
86
86
|
const useOpenRouter = this.shouldUseOpenRouter(options);
|
|
87
|
+
const useGemini = this.shouldUseGemini(options);
|
|
87
88
|
try {
|
|
88
|
-
// Start proxy if needed
|
|
89
|
+
// Start proxy if needed (OpenRouter only)
|
|
89
90
|
if (useOpenRouter) {
|
|
90
|
-
await this.startProxy();
|
|
91
|
+
await this.startProxy(options.model);
|
|
91
92
|
}
|
|
92
93
|
// Run agent
|
|
93
|
-
await this.runAgent(options, useOpenRouter);
|
|
94
|
+
await this.runAgent(options, useOpenRouter, useGemini);
|
|
94
95
|
logger.info('Execution completed successfully');
|
|
95
96
|
process.exit(0);
|
|
96
97
|
}
|
|
@@ -100,11 +101,34 @@ class AgenticFlowCLI {
|
|
|
100
101
|
process.exit(1);
|
|
101
102
|
}
|
|
102
103
|
}
|
|
104
|
+
shouldUseGemini(options) {
|
|
105
|
+
// Use Gemini if:
|
|
106
|
+
// 1. Provider is explicitly set to gemini
|
|
107
|
+
// 2. PROVIDER env var is set to gemini
|
|
108
|
+
// 3. USE_GEMINI env var is set
|
|
109
|
+
// 4. GOOGLE_GEMINI_API_KEY is set and no other provider is specified
|
|
110
|
+
if (options.provider === 'gemini' || process.env.PROVIDER === 'gemini') {
|
|
111
|
+
return true;
|
|
112
|
+
}
|
|
113
|
+
if (process.env.USE_GEMINI === 'true') {
|
|
114
|
+
return true;
|
|
115
|
+
}
|
|
116
|
+
if (process.env.GOOGLE_GEMINI_API_KEY &&
|
|
117
|
+
!process.env.ANTHROPIC_API_KEY &&
|
|
118
|
+
!process.env.OPENROUTER_API_KEY &&
|
|
119
|
+
options.provider !== 'onnx') {
|
|
120
|
+
return true;
|
|
121
|
+
}
|
|
122
|
+
return false;
|
|
123
|
+
}
|
|
103
124
|
shouldUseOpenRouter(options) {
|
|
104
|
-
// Don't use OpenRouter if ONNX is explicitly requested
|
|
125
|
+
// Don't use OpenRouter if ONNX or Gemini is explicitly requested
|
|
105
126
|
if (options.provider === 'onnx' || process.env.USE_ONNX === 'true' || process.env.PROVIDER === 'onnx') {
|
|
106
127
|
return false;
|
|
107
128
|
}
|
|
129
|
+
if (options.provider === 'gemini' || process.env.PROVIDER === 'gemini') {
|
|
130
|
+
return false;
|
|
131
|
+
}
|
|
108
132
|
// Use OpenRouter if:
|
|
109
133
|
// 1. Provider is explicitly set to openrouter
|
|
110
134
|
// 2. Model parameter contains "/" (e.g., "meta-llama/llama-3.1-8b-instruct")
|
|
@@ -119,12 +143,12 @@ class AgenticFlowCLI {
|
|
|
119
143
|
if (process.env.USE_OPENROUTER === 'true') {
|
|
120
144
|
return true;
|
|
121
145
|
}
|
|
122
|
-
if (process.env.OPENROUTER_API_KEY && !process.env.ANTHROPIC_API_KEY) {
|
|
146
|
+
if (process.env.OPENROUTER_API_KEY && !process.env.ANTHROPIC_API_KEY && !process.env.GOOGLE_GEMINI_API_KEY) {
|
|
123
147
|
return true;
|
|
124
148
|
}
|
|
125
149
|
return false;
|
|
126
150
|
}
|
|
127
|
-
async startProxy() {
|
|
151
|
+
async startProxy(modelOverride) {
|
|
128
152
|
const openrouterKey = process.env.OPENROUTER_API_KEY;
|
|
129
153
|
if (!openrouterKey) {
|
|
130
154
|
console.error('❌ Error: OPENROUTER_API_KEY required for OpenRouter models');
|
|
@@ -132,7 +156,8 @@ class AgenticFlowCLI {
|
|
|
132
156
|
process.exit(1);
|
|
133
157
|
}
|
|
134
158
|
logger.info('Starting integrated OpenRouter proxy');
|
|
135
|
-
const defaultModel =
|
|
159
|
+
const defaultModel = modelOverride ||
|
|
160
|
+
process.env.COMPLETION_MODEL ||
|
|
136
161
|
process.env.REASONING_MODEL ||
|
|
137
162
|
'meta-llama/llama-3.1-8b-instruct';
|
|
138
163
|
const proxy = new AnthropicToOpenRouterProxy({
|
|
@@ -145,13 +170,17 @@ class AgenticFlowCLI {
|
|
|
145
170
|
this.proxyServer = proxy;
|
|
146
171
|
// Set ANTHROPIC_BASE_URL to proxy
|
|
147
172
|
process.env.ANTHROPIC_BASE_URL = `http://localhost:${this.proxyPort}`;
|
|
173
|
+
// Set dummy ANTHROPIC_API_KEY for proxy (actual auth uses OPENROUTER_API_KEY)
|
|
174
|
+
if (!process.env.ANTHROPIC_API_KEY) {
|
|
175
|
+
process.env.ANTHROPIC_API_KEY = 'sk-ant-proxy-dummy-key';
|
|
176
|
+
}
|
|
148
177
|
console.log(`🔗 Proxy Mode: OpenRouter`);
|
|
149
178
|
console.log(`🔧 Proxy URL: http://localhost:${this.proxyPort}`);
|
|
150
179
|
console.log(`🤖 Default Model: ${defaultModel}\n`);
|
|
151
180
|
// Wait for proxy to be ready
|
|
152
181
|
await new Promise(resolve => setTimeout(resolve, 1500));
|
|
153
182
|
}
|
|
154
|
-
async runAgent(options, useOpenRouter) {
|
|
183
|
+
async runAgent(options, useOpenRouter, useGemini) {
|
|
155
184
|
const agentName = options.agent || process.env.AGENT || '';
|
|
156
185
|
const task = options.task || process.env.TASK || '';
|
|
157
186
|
if (!agentName) {
|
|
@@ -166,12 +195,13 @@ class AgenticFlowCLI {
|
|
|
166
195
|
}
|
|
167
196
|
// Check for API key (unless using ONNX)
|
|
168
197
|
const isOnnx = options.provider === 'onnx' || process.env.USE_ONNX === 'true' || process.env.PROVIDER === 'onnx';
|
|
169
|
-
if (!isOnnx && !useOpenRouter && !process.env.ANTHROPIC_API_KEY) {
|
|
198
|
+
if (!isOnnx && !useOpenRouter && !useGemini && !process.env.ANTHROPIC_API_KEY) {
|
|
170
199
|
console.error('\n❌ Error: ANTHROPIC_API_KEY is required\n');
|
|
171
200
|
console.error('Please set your API key:');
|
|
172
201
|
console.error(' export ANTHROPIC_API_KEY=sk-ant-xxxxx\n');
|
|
173
202
|
console.error('Or use alternative providers:');
|
|
174
203
|
console.error(' --provider openrouter (requires OPENROUTER_API_KEY)');
|
|
204
|
+
console.error(' --provider gemini (requires GOOGLE_GEMINI_API_KEY)');
|
|
175
205
|
console.error(' --provider onnx (free local inference)\n');
|
|
176
206
|
process.exit(1);
|
|
177
207
|
}
|
|
@@ -181,9 +211,20 @@ class AgenticFlowCLI {
|
|
|
181
211
|
console.error(' export OPENROUTER_API_KEY=sk-or-v1-xxxxx\n');
|
|
182
212
|
console.error('Or use alternative providers:');
|
|
183
213
|
console.error(' --provider anthropic (requires ANTHROPIC_API_KEY)');
|
|
214
|
+
console.error(' --provider gemini (requires GOOGLE_GEMINI_API_KEY)');
|
|
184
215
|
console.error(' --provider onnx (free local inference)\n');
|
|
185
216
|
process.exit(1);
|
|
186
217
|
}
|
|
218
|
+
if (!isOnnx && useGemini && !process.env.GOOGLE_GEMINI_API_KEY) {
|
|
219
|
+
console.error('\n❌ Error: GOOGLE_GEMINI_API_KEY is required for Gemini\n');
|
|
220
|
+
console.error('Please set your API key:');
|
|
221
|
+
console.error(' export GOOGLE_GEMINI_API_KEY=xxxxx\n');
|
|
222
|
+
console.error('Or use alternative providers:');
|
|
223
|
+
console.error(' --provider anthropic (requires ANTHROPIC_API_KEY)');
|
|
224
|
+
console.error(' --provider openrouter (requires OPENROUTER_API_KEY)');
|
|
225
|
+
console.error(' --provider onnx (free local inference)\n');
|
|
226
|
+
process.exit(1);
|
|
227
|
+
}
|
|
187
228
|
const agent = getAgent(agentName);
|
|
188
229
|
if (!agent) {
|
|
189
230
|
const available = listAgents();
|
|
@@ -205,6 +246,11 @@ class AgenticFlowCLI {
|
|
|
205
246
|
console.log(`🔧 Provider: OpenRouter (via proxy)`);
|
|
206
247
|
console.log(`🔧 Model: ${model}\n`);
|
|
207
248
|
}
|
|
249
|
+
else if (useGemini) {
|
|
250
|
+
const model = options.model || 'gemini-2.0-flash-exp';
|
|
251
|
+
console.log(`🔧 Provider: Google Gemini`);
|
|
252
|
+
console.log(`🔧 Model: ${model}\n`);
|
|
253
|
+
}
|
|
208
254
|
else if (options.provider === 'onnx' || process.env.USE_ONNX === 'true' || process.env.PROVIDER === 'onnx') {
|
|
209
255
|
console.log(`🔧 Provider: ONNX Local (Phi-4-mini)`);
|
|
210
256
|
console.log(`💾 Free local inference - no API costs`);
|
|
@@ -226,7 +272,7 @@ class AgenticFlowCLI {
|
|
|
226
272
|
logger.info('Agent completed', {
|
|
227
273
|
agent: agentName,
|
|
228
274
|
outputLength: result.output.length,
|
|
229
|
-
provider: useOpenRouter ? 'openrouter' : 'anthropic'
|
|
275
|
+
provider: useOpenRouter ? 'openrouter' : useGemini ? 'gemini' : 'anthropic'
|
|
230
276
|
});
|
|
231
277
|
}
|
|
232
278
|
listAgents() {
|
|
@@ -291,13 +337,14 @@ AGENT COMMANDS:
|
|
|
291
337
|
OPTIONS:
|
|
292
338
|
--task, -t <task> Task description for agent mode
|
|
293
339
|
--model, -m <model> Model to use (triggers OpenRouter if contains "/")
|
|
294
|
-
--provider, -p <name> Provider to use (anthropic, openrouter, onnx)
|
|
340
|
+
--provider, -p <name> Provider to use (anthropic, openrouter, gemini, onnx)
|
|
295
341
|
--stream, -s Enable real-time streaming output
|
|
296
342
|
--help, -h Show this help message
|
|
297
343
|
|
|
298
344
|
API CONFIGURATION:
|
|
299
345
|
--anthropic-key <key> Override ANTHROPIC_API_KEY environment variable
|
|
300
346
|
--openrouter-key <key> Override OPENROUTER_API_KEY environment variable
|
|
347
|
+
--gemini-key <key> Override GOOGLE_GEMINI_API_KEY environment variable
|
|
301
348
|
|
|
302
349
|
AGENT BEHAVIOR:
|
|
303
350
|
--temperature <0.0-1.0> Sampling temperature (creativity control)
|
|
@@ -350,16 +397,21 @@ EXAMPLES:
|
|
|
350
397
|
ENVIRONMENT VARIABLES:
|
|
351
398
|
ANTHROPIC_API_KEY Anthropic API key (for Claude models)
|
|
352
399
|
OPENROUTER_API_KEY OpenRouter API key (for alternative models)
|
|
400
|
+
GOOGLE_GEMINI_API_KEY Google Gemini API key (for Gemini models)
|
|
353
401
|
USE_OPENROUTER Set to 'true' to force OpenRouter usage
|
|
402
|
+
USE_GEMINI Set to 'true' to force Gemini usage
|
|
354
403
|
COMPLETION_MODEL Default model for OpenRouter
|
|
355
404
|
AGENTS_DIR Path to agents directory
|
|
356
405
|
PROXY_PORT Proxy server port (default: 3000)
|
|
357
406
|
|
|
358
|
-
OPENROUTER MODELS:
|
|
359
|
-
|
|
360
|
-
|
|
361
|
-
-
|
|
362
|
-
-
|
|
407
|
+
OPENROUTER MODELS (Best Free Tested):
|
|
408
|
+
✅ deepseek/deepseek-r1-0528:free (reasoning, 95s/task, RFC validation)
|
|
409
|
+
✅ deepseek/deepseek-chat-v3.1:free (coding, 21-103s/task, enterprise-grade)
|
|
410
|
+
✅ meta-llama/llama-3.3-8b-instruct:free (versatile, 4.4s/task, fast coding)
|
|
411
|
+
✅ openai/gpt-4-turbo (premium, 10.7s/task, no :free needed)
|
|
412
|
+
|
|
413
|
+
All models above support OpenRouter leaderboard tracking via HTTP-Referer headers.
|
|
414
|
+
See https://openrouter.ai/models for full model catalog.
|
|
363
415
|
|
|
364
416
|
MCP TOOLS (213+ available):
|
|
365
417
|
• agentic-flow: 7 tools (agent execution, creation, management, model optimization)
|
|
@@ -373,6 +425,31 @@ OPTIMIZATION BENEFITS:
|
|
|
373
425
|
📊 10+ Models: Claude, GPT-4o, Gemini, DeepSeek, Llama, ONNX local
|
|
374
426
|
⚡ Zero Overhead: <5ms decision time, no API calls during optimization
|
|
375
427
|
|
|
428
|
+
PROXY MODE (Claude Code CLI Integration):
|
|
429
|
+
The OpenRouter proxy allows Claude Code to use alternative models via API translation.
|
|
430
|
+
|
|
431
|
+
Terminal 1 - Start Proxy Server:
|
|
432
|
+
npx agentic-flow proxy
|
|
433
|
+
# Or with custom port: PROXY_PORT=8080 npx agentic-flow proxy
|
|
434
|
+
# Proxy runs at http://localhost:3000 by default
|
|
435
|
+
|
|
436
|
+
Terminal 2 - Use with Claude Code:
|
|
437
|
+
export ANTHROPIC_BASE_URL="http://localhost:3000"
|
|
438
|
+
export ANTHROPIC_API_KEY="sk-ant-proxy-dummy-key"
|
|
439
|
+
export OPENROUTER_API_KEY="sk-or-v1-xxxxx"
|
|
440
|
+
|
|
441
|
+
# Now Claude Code will route through OpenRouter proxy
|
|
442
|
+
claude-code --agent coder --task "Create API"
|
|
443
|
+
|
|
444
|
+
Proxy automatically translates Anthropic API calls to OpenRouter format.
|
|
445
|
+
Model override happens automatically: Claude requests → OpenRouter models.
|
|
446
|
+
|
|
447
|
+
Benefits for Claude Code users:
|
|
448
|
+
• 85-99% cost savings vs Claude Sonnet 4.5
|
|
449
|
+
• Access to 100+ models (DeepSeek, Llama, Gemini, etc.)
|
|
450
|
+
• Leaderboard tracking on OpenRouter
|
|
451
|
+
• No code changes to Claude Code itself
|
|
452
|
+
|
|
376
453
|
For more information: https://github.com/ruvnet/agentic-flow
|
|
377
454
|
`);
|
|
378
455
|
}
|
|
@@ -127,6 +127,10 @@ export class AnthropicToOpenRouterProxy {
|
|
|
127
127
|
content: anthropicReq.system
|
|
128
128
|
});
|
|
129
129
|
}
|
|
130
|
+
// Override model - if request has a Claude model, use defaultModel instead
|
|
131
|
+
const requestedModel = anthropicReq.model || '';
|
|
132
|
+
const shouldOverrideModel = requestedModel.startsWith('claude-') || !requestedModel;
|
|
133
|
+
const finalModel = shouldOverrideModel ? this.defaultModel : requestedModel;
|
|
130
134
|
// Convert Anthropic messages to OpenAI format
|
|
131
135
|
for (const msg of anthropicReq.messages) {
|
|
132
136
|
let content;
|
|
@@ -149,7 +153,7 @@ export class AnthropicToOpenRouterProxy {
|
|
|
149
153
|
});
|
|
150
154
|
}
|
|
151
155
|
return {
|
|
152
|
-
model:
|
|
156
|
+
model: finalModel,
|
|
153
157
|
messages,
|
|
154
158
|
max_tokens: anthropicReq.max_tokens,
|
|
155
159
|
temperature: anthropicReq.temperature,
|
|
@@ -0,0 +1,126 @@
|
|
|
1
|
+
// Google Gemini provider implementation
|
|
2
|
+
import { GoogleGenAI } from '@google/genai';
|
|
3
|
+
export class GeminiProvider {
|
|
4
|
+
name = 'gemini';
|
|
5
|
+
type = 'gemini';
|
|
6
|
+
supportsStreaming = true;
|
|
7
|
+
supportsTools = false; // Will add function calling support later
|
|
8
|
+
supportsMCP = false;
|
|
9
|
+
client; // GoogleGenAI instance
|
|
10
|
+
config;
|
|
11
|
+
constructor(config) {
|
|
12
|
+
this.config = config;
|
|
13
|
+
if (!config.apiKey) {
|
|
14
|
+
throw new Error('Google Gemini API key is required');
|
|
15
|
+
}
|
|
16
|
+
this.client = new GoogleGenAI({
|
|
17
|
+
apiKey: config.apiKey
|
|
18
|
+
});
|
|
19
|
+
}
|
|
20
|
+
validateCapabilities(features) {
|
|
21
|
+
const supported = ['chat', 'streaming'];
|
|
22
|
+
return features.every(f => supported.includes(f));
|
|
23
|
+
}
|
|
24
|
+
async chat(params) {
|
|
25
|
+
try {
|
|
26
|
+
const startTime = Date.now();
|
|
27
|
+
// Convert messages format
|
|
28
|
+
const contents = this.convertMessages(params.messages);
|
|
29
|
+
const response = await this.client.models.generateContent({
|
|
30
|
+
model: params.model || 'gemini-2.0-flash-exp',
|
|
31
|
+
contents,
|
|
32
|
+
generationConfig: {
|
|
33
|
+
temperature: params.temperature,
|
|
34
|
+
maxOutputTokens: params.maxTokens || 4096
|
|
35
|
+
}
|
|
36
|
+
});
|
|
37
|
+
const latency = Date.now() - startTime;
|
|
38
|
+
// Extract text from response
|
|
39
|
+
const text = response.text || '';
|
|
40
|
+
const usage = {
|
|
41
|
+
inputTokens: response.usageMetadata?.promptTokenCount || 0,
|
|
42
|
+
outputTokens: response.usageMetadata?.candidatesTokenCount || 0
|
|
43
|
+
};
|
|
44
|
+
return {
|
|
45
|
+
id: `gemini-${Date.now()}`,
|
|
46
|
+
model: params.model || 'gemini-2.0-flash-exp',
|
|
47
|
+
content: [{
|
|
48
|
+
type: 'text',
|
|
49
|
+
text
|
|
50
|
+
}],
|
|
51
|
+
stopReason: response.candidates?.[0]?.finishReason === 'STOP' ? 'end_turn' : 'max_tokens',
|
|
52
|
+
usage,
|
|
53
|
+
metadata: {
|
|
54
|
+
provider: 'gemini',
|
|
55
|
+
cost: this.calculateCost(usage),
|
|
56
|
+
latency
|
|
57
|
+
}
|
|
58
|
+
};
|
|
59
|
+
}
|
|
60
|
+
catch (error) {
|
|
61
|
+
throw this.handleError(error);
|
|
62
|
+
}
|
|
63
|
+
}
|
|
64
|
+
async *stream(params) {
|
|
65
|
+
try {
|
|
66
|
+
// Convert messages format
|
|
67
|
+
const contents = this.convertMessages(params.messages);
|
|
68
|
+
const response = await this.client.models.generateContentStream({
|
|
69
|
+
model: params.model || 'gemini-2.0-flash-exp',
|
|
70
|
+
contents,
|
|
71
|
+
generationConfig: {
|
|
72
|
+
temperature: params.temperature,
|
|
73
|
+
maxOutputTokens: params.maxTokens || 4096
|
|
74
|
+
}
|
|
75
|
+
});
|
|
76
|
+
for await (const chunk of response) {
|
|
77
|
+
const text = chunk.text || '';
|
|
78
|
+
if (text) {
|
|
79
|
+
yield {
|
|
80
|
+
type: 'content_block_delta',
|
|
81
|
+
delta: {
|
|
82
|
+
type: 'text_delta',
|
|
83
|
+
text
|
|
84
|
+
}
|
|
85
|
+
};
|
|
86
|
+
}
|
|
87
|
+
}
|
|
88
|
+
}
|
|
89
|
+
catch (error) {
|
|
90
|
+
throw this.handleError(error);
|
|
91
|
+
}
|
|
92
|
+
}
|
|
93
|
+
convertMessages(messages) {
|
|
94
|
+
// Gemini expects a single prompt string for simple use cases
|
|
95
|
+
// For more complex scenarios, we'd use the chat history format
|
|
96
|
+
return messages
|
|
97
|
+
.map(msg => {
|
|
98
|
+
if (typeof msg.content === 'string') {
|
|
99
|
+
return `${msg.role === 'user' ? 'User' : 'Assistant'}: ${msg.content}`;
|
|
100
|
+
}
|
|
101
|
+
else if (Array.isArray(msg.content)) {
|
|
102
|
+
const texts = msg.content
|
|
103
|
+
.filter((block) => block.type === 'text')
|
|
104
|
+
.map((block) => block.text)
|
|
105
|
+
.join('\n');
|
|
106
|
+
return `${msg.role === 'user' ? 'User' : 'Assistant'}: ${texts}`;
|
|
107
|
+
}
|
|
108
|
+
return '';
|
|
109
|
+
})
|
|
110
|
+
.filter(Boolean)
|
|
111
|
+
.join('\n\n');
|
|
112
|
+
}
|
|
113
|
+
calculateCost(usage) {
|
|
114
|
+
// Gemini 2.0 Flash pricing: Free up to rate limits, then ~$0.075/MTok input, $0.30/MTok output
|
|
115
|
+
const inputCost = (usage.inputTokens / 1_000_000) * 0.075;
|
|
116
|
+
const outputCost = (usage.outputTokens / 1_000_000) * 0.30;
|
|
117
|
+
return inputCost + outputCost;
|
|
118
|
+
}
|
|
119
|
+
handleError(error) {
|
|
120
|
+
const providerError = new Error(error.message || 'Gemini request failed');
|
|
121
|
+
providerError.provider = 'gemini';
|
|
122
|
+
providerError.statusCode = error.status || 500;
|
|
123
|
+
providerError.retryable = error.status >= 500 || error.status === 429;
|
|
124
|
+
return providerError;
|
|
125
|
+
}
|
|
126
|
+
}
|
package/dist/router/router.js
CHANGED
|
@@ -5,6 +5,7 @@ import { join } from 'path';
|
|
|
5
5
|
import { OpenRouterProvider } from './providers/openrouter.js';
|
|
6
6
|
import { AnthropicProvider } from './providers/anthropic.js';
|
|
7
7
|
import { ONNXLocalProvider } from './providers/onnx-local.js';
|
|
8
|
+
import { GeminiProvider } from './providers/gemini.js';
|
|
8
9
|
export class ModelRouter {
|
|
9
10
|
config;
|
|
10
11
|
providers = new Map();
|
|
@@ -91,6 +92,17 @@ export class ModelRouter {
|
|
|
91
92
|
console.error('❌ Failed to initialize ONNX:', error);
|
|
92
93
|
}
|
|
93
94
|
}
|
|
95
|
+
// Initialize Gemini
|
|
96
|
+
if (this.config.providers.gemini) {
|
|
97
|
+
try {
|
|
98
|
+
const provider = new GeminiProvider(this.config.providers.gemini);
|
|
99
|
+
this.providers.set('gemini', provider);
|
|
100
|
+
console.log('✅ Gemini provider initialized');
|
|
101
|
+
}
|
|
102
|
+
catch (error) {
|
|
103
|
+
console.error('❌ Failed to initialize Gemini:', error);
|
|
104
|
+
}
|
|
105
|
+
}
|
|
94
106
|
// TODO: Initialize other providers (OpenAI, Ollama, LiteLLM)
|
|
95
107
|
// Will be implemented in Phase 1
|
|
96
108
|
}
|
|
@@ -51,29 +51,29 @@ const MODEL_DATABASE = {
|
|
|
51
51
|
// Tier 2: Cost-Effective Champions
|
|
52
52
|
'deepseek-r1': {
|
|
53
53
|
provider: 'openrouter',
|
|
54
|
-
model: 'deepseek/deepseek-r1',
|
|
54
|
+
model: 'deepseek/deepseek-r1-0528:free',
|
|
55
55
|
modelName: 'DeepSeek R1',
|
|
56
|
-
cost_per_1m_input: 0.
|
|
57
|
-
cost_per_1m_output:
|
|
56
|
+
cost_per_1m_input: 0.00,
|
|
57
|
+
cost_per_1m_output: 0.00,
|
|
58
58
|
quality_score: 90,
|
|
59
59
|
speed_score: 80,
|
|
60
|
-
cost_score:
|
|
60
|
+
cost_score: 100,
|
|
61
61
|
tier: 'cost-effective',
|
|
62
|
-
strengths: ['reasoning', 'coding', 'math', 'value'],
|
|
62
|
+
strengths: ['reasoning', 'coding', 'math', 'value', 'free'],
|
|
63
63
|
weaknesses: ['newer-model'],
|
|
64
64
|
bestFor: ['coder', 'pseudocode', 'specification', 'refinement', 'tester']
|
|
65
65
|
},
|
|
66
66
|
'deepseek-chat-v3': {
|
|
67
67
|
provider: 'openrouter',
|
|
68
|
-
model: 'deepseek/deepseek-chat',
|
|
69
|
-
modelName: 'DeepSeek Chat V3',
|
|
70
|
-
cost_per_1m_input: 0.
|
|
71
|
-
cost_per_1m_output: 0.
|
|
68
|
+
model: 'deepseek/deepseek-chat-v3.1:free',
|
|
69
|
+
modelName: 'DeepSeek Chat V3.1',
|
|
70
|
+
cost_per_1m_input: 0.00,
|
|
71
|
+
cost_per_1m_output: 0.00,
|
|
72
72
|
quality_score: 82,
|
|
73
73
|
speed_score: 90,
|
|
74
|
-
cost_score:
|
|
74
|
+
cost_score: 100,
|
|
75
75
|
tier: 'cost-effective',
|
|
76
|
-
strengths: ['cost', 'speed', 'coding', 'development'],
|
|
76
|
+
strengths: ['cost', 'speed', 'coding', 'development', 'free'],
|
|
77
77
|
weaknesses: ['complex-reasoning'],
|
|
78
78
|
bestFor: ['coder', 'reviewer', 'tester', 'backend-dev', 'cicd-engineer']
|
|
79
79
|
},
|
|
@@ -92,19 +92,19 @@ const MODEL_DATABASE = {
|
|
|
92
92
|
weaknesses: ['quality'],
|
|
93
93
|
bestFor: ['researcher', 'planner', 'smart-agent']
|
|
94
94
|
},
|
|
95
|
-
'llama-3-3-
|
|
95
|
+
'llama-3-3-8b': {
|
|
96
96
|
provider: 'openrouter',
|
|
97
|
-
model: 'meta-llama/llama-3.3-
|
|
98
|
-
modelName: 'Llama 3.3
|
|
99
|
-
cost_per_1m_input: 0.
|
|
100
|
-
cost_per_1m_output: 0.
|
|
101
|
-
quality_score:
|
|
102
|
-
speed_score:
|
|
103
|
-
cost_score:
|
|
97
|
+
model: 'meta-llama/llama-3.3-8b-instruct:free',
|
|
98
|
+
modelName: 'Llama 3.3 8B',
|
|
99
|
+
cost_per_1m_input: 0.00,
|
|
100
|
+
cost_per_1m_output: 0.00,
|
|
101
|
+
quality_score: 72,
|
|
102
|
+
speed_score: 95,
|
|
103
|
+
cost_score: 100,
|
|
104
104
|
tier: 'balanced',
|
|
105
|
-
strengths: ['open-source', 'versatile', 'coding'],
|
|
106
|
-
weaknesses: ['
|
|
107
|
-
bestFor: ['coder', 'reviewer', 'base-template-generator']
|
|
105
|
+
strengths: ['open-source', 'versatile', 'coding', 'free', 'fast'],
|
|
106
|
+
weaknesses: ['smaller-model'],
|
|
107
|
+
bestFor: ['coder', 'reviewer', 'base-template-generator', 'tester']
|
|
108
108
|
},
|
|
109
109
|
'qwen-2-5-72b': {
|
|
110
110
|
provider: 'openrouter',
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "agentic-flow",
|
|
3
|
-
"version": "1.1.
|
|
3
|
+
"version": "1.1.2",
|
|
4
4
|
"description": "Production-ready AI agent orchestration platform with 66 specialized agents, 111 MCP tools, and autonomous multi-agent swarms. Built by @ruvnet with Claude Agent SDK, neural networks, memory persistence, GitHub integration, and distributed consensus protocols.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "dist/index.js",
|
|
@@ -110,6 +110,7 @@
|
|
|
110
110
|
"dependencies": {
|
|
111
111
|
"@anthropic-ai/claude-agent-sdk": "^0.1.5",
|
|
112
112
|
"@anthropic-ai/sdk": "^0.65.0",
|
|
113
|
+
"@google/genai": "^1.22.0",
|
|
113
114
|
"agentic-payments": "^0.1.3",
|
|
114
115
|
"axios": "^1.12.2",
|
|
115
116
|
"claude-flow": "^2.0.0",
|