tachibot-mcp 2.0.6 → 2.1.0
- package/.env.example +13 -3
- package/README.md +88 -44
- package/dist/src/config/model-constants.js +121 -91
- package/dist/src/config/model-defaults.js +35 -21
- package/dist/src/config/model-preferences.js +5 -4
- package/dist/src/config.js +2 -1
- package/dist/src/mcp-client.js +3 -3
- package/dist/src/modes/scout.js +2 -1
- package/dist/src/optimization/model-router.js +19 -16
- package/dist/src/orchestrator-instructions.js +1 -1
- package/dist/src/orchestrator-lite.js +1 -1
- package/dist/src/orchestrator.js +1 -1
- package/dist/src/profiles/balanced.js +1 -2
- package/dist/src/profiles/code_focus.js +1 -2
- package/dist/src/profiles/full.js +1 -2
- package/dist/src/profiles/minimal.js +1 -2
- package/dist/src/profiles/research_power.js +1 -2
- package/dist/src/server.js +13 -12
- package/dist/src/tools/gemini-tools.js +32 -16
- package/dist/src/tools/grok-enhanced.js +18 -17
- package/dist/src/tools/grok-tools.js +34 -20
- package/dist/src/tools/openai-tools.js +52 -61
- package/dist/src/tools/tool-router.js +53 -52
- package/dist/src/tools/unified-ai-provider.js +90 -9
- package/dist/src/tools/workflow-runner.js +16 -0
- package/dist/src/tools/workflow-validator-tool.js +1 -1
- package/dist/src/utils/api-keys.js +20 -0
- package/dist/src/utils/openrouter-gateway.js +117 -0
- package/dist/src/validators/interpolation-validator.js +4 -0
- package/dist/src/validators/tool-registry-validator.js +1 -1
- package/dist/src/validators/tool-types.js +0 -1
- package/dist/src/workflows/custom-workflows.js +4 -3
- package/dist/src/workflows/engine/VariableInterpolator.js +30 -3
- package/dist/src/workflows/engine/WorkflowExecutionEngine.js +2 -2
- package/dist/src/workflows/engine/WorkflowOutputFormatter.js +27 -4
- package/dist/src/workflows/fallback-strategies.js +2 -2
- package/dist/src/workflows/model-router.js +20 -11
- package/dist/src/workflows/tool-mapper.js +51 -24
- package/docs/API_KEYS.md +52 -18
- package/docs/CONFIGURATION.md +25 -8
- package/docs/TOOLS_REFERENCE.md +12 -48
- package/docs/TOOL_PARAMETERS.md +19 -16
- package/docs/WORKFLOWS.md +7 -7
- package/package.json +1 -1
- package/profiles/balanced.json +1 -2
- package/profiles/code_focus.json +1 -2
- package/profiles/debug_intensive.json +0 -1
- package/profiles/full.json +2 -3
- package/profiles/minimal.json +1 -2
- package/profiles/research_power.json +1 -2
- package/profiles/workflow_builder.json +1 -2
- package/tools.config.json +15 -3
- package/workflows/code-architecture-review.yaml +5 -3
- package/workflows/creative-brainstorm-yaml.yaml +1 -1
- package/workflows/pingpong.yaml +5 -3
- package/workflows/system/README.md +1 -1
- package/workflows/system/verifier.yaml +8 -5
- package/workflows/ultra-creative-brainstorm.yaml +3 -3
package/docs/API_KEYS.md
CHANGED
|
@@ -48,7 +48,7 @@ TachiBot MCP works with multiple AI providers to offer diverse capabilities. You
|
|
|
48
48
|
|----------|--------------|----------------|
|
|
49
49
|
| **Perplexity** | Research, web search | `perplexity_ask`, `perplexity_research`, `perplexity_reason`, `scout` (default) |
|
|
50
50
|
| **Grok/xAI** | Live search, reasoning | `grok_search`, `grok_reason`, `grok_code`, `grok_debug`, `grok_architect`, `grok_brainstorm`, `scout` (with grok) |
|
|
51
|
-
| **OpenAI** | GPT-5 models | `openai_brainstorm`, `
|
|
51
|
+
| **OpenAI** | GPT-5 models | `openai_brainstorm`, `openai_reason`, `openai_code_review`, `openai_explain`, `focus` (some modes), `verifier`, `challenger` |
|
|
52
52
|
| **Google** | Gemini models | `gemini_brainstorm`, `gemini_analyze_code`, `gemini_analyze_text`, `verifier`, `scout` |
|
|
53
53
|
| **OpenRouter** | Qwen models | `qwen_coder`, `qwen_competitive` |
|
|
54
54
|
|
|
@@ -116,11 +116,11 @@ Grok (by xAI) provides live web search, reasoning, and code analysis.
|
|
|
116
116
|
|
|
117
117
|
#### Models Available
|
|
118
118
|
|
|
119
|
-
- **grok-4
|
|
120
|
-
- **grok-4
|
|
121
|
-
- **grok-4** - Previous reasoning model
|
|
122
|
-
- **grok-4-0709** -
|
|
123
|
-
- **grok-
|
|
119
|
+
- **grok-4-1-fast-reasoning** - Latest (Nov 2025): Enhanced reasoning, creativity & emotional intelligence (2M context)
|
|
120
|
+
- **grok-4-1-fast-non-reasoning** - Tool-calling optimized: Fast inference, agentic workflows (2M context)
|
|
121
|
+
- **grok-4-fast-reasoning** - Previous reasoning model
|
|
122
|
+
- **grok-4-0709** - Heavy model (expensive, use sparingly)
|
|
123
|
+
- **grok-code-fast-1** - Coding specialist
|
|
124
124
|
|
|
125
125
|
#### Pricing
|
|
126
126
|
|
|
@@ -175,18 +175,21 @@ OpenAI provides GPT-5 models for brainstorming, comparison, and reasoning.
|
|
|
175
175
|
|
|
176
176
|
#### Models Available
|
|
177
177
|
|
|
178
|
-
- **gpt-5** -
|
|
179
|
-
- **gpt-5-mini** -
|
|
180
|
-
- **gpt-5-
|
|
181
|
-
- **
|
|
178
|
+
- **gpt-5.1** - Flagship model with deep reasoning (2M context)
|
|
179
|
+
- **gpt-5.1-codex-mini** - Fast, cheap workhorse for code tasks (256K context)
|
|
180
|
+
- **gpt-5.1-codex** - Power model for complex code (1M context)
|
|
181
|
+
- **gpt-5-pro** - Premium for complex orchestration (4M context)
|
|
182
182
|
|
|
183
183
|
#### Pricing
|
|
184
184
|
|
|
185
|
+
> **Note:** Prices are approximate and may be outdated. Check [OpenAI Pricing](https://openai.com/pricing) for current rates.
|
|
186
|
+
|
|
185
187
|
| Model | Input | Output | Notes |
|
|
186
188
|
|-------|-------|--------|-------|
|
|
187
|
-
|
|
|
188
|
-
|
|
|
189
|
-
|
|
|
189
|
+
| gpt-5.1 | ~$10 / 1M tokens | ~$30 / 1M tokens | Flagship reasoning |
|
|
190
|
+
| gpt-5.1-codex-mini | ~$2 / 1M tokens | ~$6 / 1M tokens | Best value for code |
|
|
191
|
+
| gpt-5.1-codex | ~$15 / 1M tokens | ~$45 / 1M tokens | Complex code tasks |
|
|
192
|
+
| gpt-5-pro | ~$20 / 1M tokens | ~$60 / 1M tokens | Premium orchestration |
|
|
190
193
|
|
|
191
194
|
**Warning:** GPT-5 models may generate invisible reasoning tokens that increase costs. Monitor usage carefully.
|
|
192
195
|
|
|
@@ -205,11 +208,11 @@ OpenAI provides GPT-5 models for brainstorming, comparison, and reasoning.
|
|
|
205
208
|
|
|
206
209
|
#### Cost Estimation
|
|
207
210
|
|
|
208
|
-
- Single `openai_brainstorm` (gpt-5-mini): ~$0.01 - $0.03
|
|
211
|
+
- Single `openai_brainstorm` (gpt-5.1-codex-mini): ~$0.01 - $0.03
|
|
209
212
|
- Single `openai_brainstorm` (gpt-5): ~$0.15 - $0.40
|
|
210
|
-
- Single `
|
|
213
|
+
- Single `openai_code_review`: ~$0.02 - $0.05
|
|
211
214
|
|
|
212
|
-
**Tip:** Use `model: "gpt-5-mini"` by default, only use `gpt-5` for complex tasks.
|
|
215
|
+
**Tip:** Use `model: "gpt-5.1-codex-mini"` by default, only use `gpt-5` for complex tasks.
|
|
213
216
|
|
|
214
217
|
#### Add to .env
|
|
215
218
|
|
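The approximate per-million-token prices in the updated OpenAI pricing table can be turned into a rough per-call estimate. The sketch below is illustrative only: `estimateCostUsd` and `APPROX_PRICES` are hypothetical names, not part of tachibot-mcp. The prices are the document's own approximate figures, and billing hidden reasoning tokens at the output rate is an assumption.

```typescript
// Approximate USD prices per 1M tokens, copied from the pricing table in this diff.
// These are the document's rough figures, not authoritative rates.
type Price = { inputPerM: number; outputPerM: number };

const APPROX_PRICES = new Map<string, Price>([
  ["gpt-5.1", { inputPerM: 10, outputPerM: 30 }],
  ["gpt-5.1-codex-mini", { inputPerM: 2, outputPerM: 6 }],
  ["gpt-5.1-codex", { inputPerM: 15, outputPerM: 45 }],
  ["gpt-5-pro", { inputPerM: 20, outputPerM: 60 }],
]);

// Hypothetical helper: estimates the cost of a single call in USD.
// Assumption: invisible reasoning tokens are billed at the output rate.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
  reasoningTokens: number = 0
): number {
  const price = APPROX_PRICES.get(model);
  if (price === undefined) {
    throw new Error(`No price listed for model: ${model}`);
  }
  const billedOutput = outputTokens + reasoningTokens;
  return (inputTokens * price.inputPerM + billedOutput * price.outputPerM) / 1e6;
}
```

Under these assumptions, a `gpt-5.1-codex-mini` call with 1,000 input tokens and 2,000 output tokens comes to about $0.014, which sits inside the "~$0.01 - $0.03" range quoted for a single `openai_brainstorm`.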
|
@@ -313,6 +316,37 @@ Varies by model, generally:
|
|
|
313
316
|
OPENROUTER_API_KEY=sk-or-v1-abc123...
|
|
314
317
|
```
|
|
315
318
|
|
|
319
|
+
#### OpenRouter Gateway Mode (Optional)
|
|
320
|
+
|
|
321
|
+
OpenRouter can act as a **unified gateway** for all providers (OpenAI, Gemini, Grok) with a single API key:
|
|
322
|
+
|
|
323
|
+
```bash
|
|
324
|
+
# Enable gateway mode - routes all providers through OpenRouter
|
|
325
|
+
USE_OPENROUTER_GATEWAY=true
|
|
326
|
+
OPENROUTER_API_KEY=sk-or-v1-abc123...
|
|
327
|
+
```
|
|
328
|
+
|
|
329
|
+
**How it works:**
|
|
330
|
+
| Provider | Default Mode | Gateway Mode |
|
|
331
|
+
|----------|--------------|--------------|
|
|
332
|
+
| Kimi/Qwen | OpenRouter | OpenRouter (no change) |
|
|
333
|
+
| OpenAI | Direct API | → OpenRouter |
|
|
334
|
+
| Gemini | Direct API | → OpenRouter |
|
|
335
|
+
| Grok | Direct API | → OpenRouter |
|
|
336
|
+
| Perplexity | Direct API | Direct API (always) |
|
|
337
|
+
|
|
338
|
+
**Benefits:**
|
|
339
|
+
- ✅ Single API key for most providers
|
|
340
|
+
- ✅ Unified billing dashboard
|
|
341
|
+
- ✅ Automatic fallback/load balancing
|
|
342
|
+
|
|
343
|
+
**Limitations:**
|
|
344
|
+
- ⚠️ Perplexity still requires direct API (not on OpenRouter)
|
|
345
|
+
- ⚠️ Some provider-specific features may not work (e.g., `reasoning_effort`)
|
|
346
|
+
- ⚠️ Slight latency overhead (proxy)
|
|
347
|
+
|
|
348
|
+
**Note:** Gateway mode is validated by Andrej Karpathy's [llm-council](https://github.com/karpathy/llm-council) project.
|
|
349
|
+
|
|
316
350
|
---
|
|
317
351
|
|
|
318
352
|
## Cost Comparison
|
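The routing table added in this hunk can be read as a small decision function. The following is a minimal sketch of that table, not the package's actual implementation: `resolveRoute`, the `Provider` union, and the exact `"true"` string comparison are assumptions made for illustration. What happens when gateway mode is enabled without `OPENROUTER_API_KEY` is not covered by the diff and is left out.

```typescript
type Provider = "openai" | "gemini" | "grok" | "perplexity" | "kimi" | "qwen";
type Route = "direct" | "openrouter";

// Hypothetical helper mirroring the "How it works" table; names are illustrative.
function resolveRoute(
  provider: Provider,
  env: Record<string, string | undefined>
): Route {
  // Kimi and Qwen always go through OpenRouter, with or without gateway mode.
  if (provider === "kimi" || provider === "qwen") return "openrouter";
  // Perplexity is not on OpenRouter, so it always uses its direct API.
  if (provider === "perplexity") return "direct";
  // OpenAI, Gemini, and Grok switch to OpenRouter only when gateway mode is on.
  // Assumption: the flag is enabled by the literal string "true".
  const gatewayEnabled = env["USE_OPENROUTER_GATEWAY"] === "true";
  return gatewayEnabled ? "openrouter" : "direct";
}
```

Passing the environment in as a plain object (for example `resolveRoute("openai", process.env)` in Node) keeps the function easy to test.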
|
@@ -434,7 +468,7 @@ See [TOOL_PROFILES.md](TOOL_PROFILES.md) for details.
|
|
|
434
468
|
- **Deep research:** `perplexity_research` (expensive, use sparingly)
|
|
435
469
|
- **Live data:** `grok_search` with low `maxSearchSources` (10-20)
|
|
436
470
|
- **Code tasks:** `gemini_analyze_code` or `qwen_coder` (cost-effective)
|
|
437
|
-
- **Brainstorming:** `gemini_brainstorm` or `openai_brainstorm` with `model: "gpt-5-mini"`
|
|
471
|
+
- **Brainstorming:** `gemini_brainstorm` or `openai_brainstorm` with `model: "gpt-5.1-codex-mini"`
|
|
438
472
|
|
|
439
473
|
### 4. Monitor Regularly
|
|
440
474
|
|
|
@@ -509,7 +543,7 @@ TACHI_CACHE_TTL=3600 # 1 hour
|
|
|
509
543
|
2. Review which tools are being used
|
|
510
544
|
3. Switch to `minimal` or `balanced` profile
|
|
511
545
|
4. Avoid `grok_search` with high `maxSearchSources`
|
|
512
|
-
5. Use `gpt-5-mini` instead of `gpt-5`
|
|
546
|
+
5. Use `gpt-5.1-codex-mini` instead of `gpt-5`
|
|
513
547
|
|
|
514
548
|
### API Key Not Working After Setup
|
|
515
549
|
|
package/docs/CONFIGURATION.md
CHANGED
|
@@ -167,6 +167,23 @@ ANTHROPIC_API_KEY=sk-ant-...
|
|
|
167
167
|
QWEN_API_KEY=...
|
|
168
168
|
```
|
|
169
169
|
|
|
170
|
+
### OpenRouter Gateway Mode
|
|
171
|
+
|
|
172
|
+
Route all providers (OpenAI, Gemini, Grok) through OpenRouter with a single API key:
|
|
173
|
+
|
|
174
|
+
```bash
|
|
175
|
+
# Enable gateway mode
|
|
176
|
+
USE_OPENROUTER_GATEWAY=true
|
|
177
|
+
OPENROUTER_API_KEY=sk-or-...
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
**Routing behavior:**
|
|
181
|
+
- **Kimi/Qwen** → Always OpenRouter (native)
|
|
182
|
+
- **OpenAI/Gemini/Grok** → Direct API (default) or OpenRouter (when gateway enabled)
|
|
183
|
+
- **Perplexity** → Always direct API (not on OpenRouter)
|
|
184
|
+
|
|
185
|
+
See [API_KEYS.md](API_KEYS.md#openrouter-gateway-mode-optional) for details.
|
|
186
|
+
|
|
170
187
|
### Search Configuration
|
|
171
188
|
|
|
172
189
|
```bash
|
|
@@ -249,28 +266,28 @@ Configure which models are used for Scout, Challenger, and Verifier tools. These
|
|
|
249
266
|
|
|
250
267
|
```bash
|
|
251
268
|
# Scout model configuration
|
|
252
|
-
SCOUT_QUICK_MODELS=qwen/qwen3-coder-plus,gemini-2.5-flash,gpt-5-mini
|
|
253
|
-
SCOUT_RESEARCH_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5-mini
|
|
269
|
+
SCOUT_QUICK_MODELS=qwen/qwen3-coder-plus,gemini-2.5-flash,gpt-5.1-codex-mini
|
|
270
|
+
SCOUT_RESEARCH_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5.1-codex-mini
|
|
254
271
|
|
|
255
272
|
# Challenger model configuration
|
|
256
|
-
CHALLENGER_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5-mini
|
|
273
|
+
CHALLENGER_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5.1-codex-mini
|
|
257
274
|
|
|
258
275
|
# Verifier model configuration
|
|
259
|
-
VERIFIER_QUICK_MODELS=qwen/qwen3-coder-plus,gemini-2.5-flash,gpt-5-mini
|
|
260
|
-
VERIFIER_STANDARD_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5-mini
|
|
276
|
+
VERIFIER_QUICK_MODELS=qwen/qwen3-coder-plus,gemini-2.5-flash,gpt-5.1-codex-mini
|
|
277
|
+
VERIFIER_STANDARD_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5.1-codex-mini
|
|
261
278
|
VERIFIER_DEEP_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5
|
|
262
279
|
|
|
263
280
|
# Default models for fallback
|
|
264
|
-
DEFAULT_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5-mini
|
|
281
|
+
DEFAULT_MODELS=qwen/qwen3-coder-plus,gemini-2.5-pro,gpt-5.1-codex-mini
|
|
265
282
|
```
|
|
266
283
|
|
|
267
284
|
**Cost Optimization:**
|
|
268
|
-
- **gpt-5-mini**: 60% cheaper (~$0.50/$1.00 per 1M tokens), faster, good for most tasks
|
|
285
|
+
- **gpt-5.1-codex-mini**: 60% cheaper (~$0.50/$1.00 per 1M tokens), faster, good for most tasks
|
|
269
286
|
- **gpt-5**: Full quality (~$1.25/$2.50 per 1M tokens), best for critical decisions
|
|
270
287
|
- **gemini-2.5-flash**: Faster, cheaper, good for quick checks
|
|
271
288
|
- **gemini-2.5-pro**: Better reasoning/accuracy, recommended for verification
|
|
272
289
|
|
|
273
|
-
**Recommendation:** Use defaults (gpt-5-mini) for 60% cost savings. Upgrade to gpt-5 only for `deep_verify` or critical workflows.
|
|
290
|
+
**Recommendation:** Use defaults (gpt-5.1-codex-mini) for 60% cost savings. Upgrade to gpt-5 only for `deep_verify` or critical workflows.
|
|
274
291
|
|
|
275
292
|
### Model-Specific Settings
|
|
276
293
|
|
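The `*_MODELS` variables in this hunk are comma-separated lists. A minimal sketch of how such a value might be parsed is shown below. `parseModelList` is a hypothetical helper, and the trimming and fallback behavior are assumptions rather than the package's documented behavior.

```typescript
// Hypothetical helper: reads a comma-separated model list from an env-like object.
// Falls back to the supplied defaults when the variable is unset or empty.
function parseModelList(
  env: Record<string, string | undefined>,
  name: string,
  fallback: string[]
): string[] {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const models = raw
    .split(",")
    .map((m) => m.trim())        // tolerate spaces around commas
    .filter((m) => m.length > 0); // drop empty entries such as a trailing comma
  return models.length > 0 ? models : fallback;
}
```

For example, `parseModelList(process.env, "SCOUT_QUICK_MODELS", ["gpt-5.1-codex-mini"])` would return the configured list, or the single default when the variable is missing.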
package/docs/TOOLS_REFERENCE.md
CHANGED
|
@@ -22,9 +22,8 @@
|
|
|
22
22
|
- [grok_architect](#grok_architect)
|
|
23
23
|
- [grok_brainstorm](#grok_brainstorm)
|
|
24
24
|
- [OpenAI Suite](#openai-suite)
|
|
25
|
-
- [
|
|
25
|
+
- [openai_reason](#openai_reason)
|
|
26
26
|
- [openai_brainstorm](#openai_brainstorm)
|
|
27
|
-
- [openai_compare](#openai_compare)
|
|
28
27
|
- [openai_code_review](#openai_code_review)
|
|
29
28
|
- [openai_explain](#openai_explain)
|
|
30
29
|
- [Gemini Suite](#gemini-suite)
|
|
@@ -837,7 +836,7 @@ Creative brainstorming using GPT-5 suite with advanced controls.
|
|
|
837
836
|
```typescript
|
|
838
837
|
{
|
|
839
838
|
problem: string; // REQUIRED
|
|
840
|
-
model?: "gpt-5" | "gpt-5-mini" | "gpt-5-
|
|
839
|
+
model?: "gpt-5.1" | "gpt-5.1-codex-mini" | "gpt-5.1-codex"; // Default: "gpt-5.1-codex-mini"
|
|
841
840
|
quantity?: number; // Default: 5
|
|
842
841
|
style?: "innovative" | "practical" | "wild" | "systematic";
|
|
843
842
|
constraints?: string;
|
|
@@ -852,7 +851,7 @@ Creative brainstorming using GPT-5 suite with advanced controls.
|
|
|
852
851
|
| Parameter | Type | Required | Default | Description |
|
|
853
852
|
|-----------|------|----------|---------|-------------|
|
|
854
853
|
| `problem` | `string` | ✅ Yes | - | Problem to brainstorm |
|
|
855
|
-
| `model` | `string` | No | `"gpt-5-mini"` | GPT-5 model variant |
|
|
854
|
+
| `model` | `string` | No | `"gpt-5.1-codex-mini"` | GPT-5 model variant |
|
|
856
855
|
| `quantity` | `number` | No | `5` | Number of ideas to generate |
|
|
857
856
|
| `style` | `string` | No | `"innovative"` | Brainstorming style |
|
|
858
857
|
| `constraints` | `string` | No | - | Additional constraints |
|
|
@@ -864,9 +863,9 @@ Creative brainstorming using GPT-5 suite with advanced controls.
|
|
|
864
863
|
|
|
865
864
|
| Model | Speed | Cost | Best For |
|
|
866
865
|
|-------|-------|------|----------|
|
|
867
|
-
| `gpt-5-
|
|
868
|
-
| `gpt-5-
|
|
869
|
-
| `gpt-5` | Slow | $$$$ |
|
|
866
|
+
| `gpt-5.1-codex-mini` | Fast | $$ | Most tasks (default) |
|
|
867
|
+
| `gpt-5.1-codex` | Medium | $$$ | Complex code tasks |
|
|
868
|
+
| `gpt-5.1` | Slow | $$$$ | Deep reasoning problems |
|
|
870
869
|
|
|
871
870
|
#### Reasoning Effort (GPT-5 only)
|
|
872
871
|
|
|
@@ -909,7 +908,7 @@ openai_brainstorm({
|
|
|
909
908
|
```typescript
|
|
910
909
|
openai_brainstorm({
|
|
911
910
|
problem: "Reduce app cold start time",
|
|
912
|
-
model: "gpt-5-
|
|
911
|
+
model: "gpt-5.1-codex-mini",
|
|
913
912
|
quantity: 5,
|
|
914
913
|
style: "practical",
|
|
915
914
|
constraints: "Must work on mobile devices"
|
|
@@ -928,41 +927,6 @@ openai_brainstorm({
|
|
|
928
927
|
|
|
929
928
|
---
|
|
930
929
|
|
|
931
|
-
### openai_compare
|
|
932
|
-
|
|
933
|
-
Multi-option consensus analysis with GPT-5.
|
|
934
|
-
|
|
935
|
-
#### Schema
|
|
936
|
-
|
|
937
|
-
```typescript
|
|
938
|
-
{
|
|
939
|
-
topic: string; // REQUIRED
|
|
940
|
-
options: string[]; // REQUIRED - Options to compare
|
|
941
|
-
criteria?: string[]; // Evaluation criteria
|
|
942
|
-
includeRecommendation?: boolean; // Default: true
|
|
943
|
-
}
|
|
944
|
-
```
|
|
945
|
-
|
|
946
|
-
#### Example Calls
|
|
947
|
-
|
|
948
|
-
**Compare frameworks:**
|
|
949
|
-
```typescript
|
|
950
|
-
openai_compare({
|
|
951
|
-
topic: "JavaScript framework selection",
|
|
952
|
-
options: ["React", "Vue", "Svelte", "Angular"],
|
|
953
|
-
criteria: [
|
|
954
|
-
"Learning curve",
|
|
955
|
-
"Performance",
|
|
956
|
-
"Community support",
|
|
957
|
-
"Ecosystem maturity",
|
|
958
|
-
"Job market demand"
|
|
959
|
-
],
|
|
960
|
-
includeRecommendation: true
|
|
961
|
-
})
|
|
962
|
-
```
|
|
963
|
-
|
|
964
|
-
---
|
|
965
|
-
|
|
966
930
|
## Gemini Suite
|
|
967
931
|
|
|
968
932
|
### gemini_brainstorm
|
|
@@ -1155,7 +1119,7 @@ Multi-model parallel verification with consensus analysis.
|
|
|
1155
1119
|
#### Variants
|
|
1156
1120
|
|
|
1157
1121
|
**quick_verify** (Default)
|
|
1158
|
-
- Models: `gpt-5-mini`, `gemini-2.5-flash`, `gpt-5`
|
|
1122
|
+
- Models: `gpt-5.1-codex-mini`, `gemini-2.5-flash`, `gpt-5`
|
|
1159
1123
|
- Tokens: 2000
|
|
1160
1124
|
- Timeout: 10s
|
|
1161
1125
|
- Use: Fast verification
|
|
@@ -1167,7 +1131,7 @@ Multi-model parallel verification with consensus analysis.
|
|
|
1167
1131
|
- Use: Complex reasoning
|
|
1168
1132
|
|
|
1169
1133
|
**fact_check**
|
|
1170
|
-
- Models: `gpt-5`, `gemini-2.5-pro`, `gpt-5-mini`
|
|
1134
|
+
- Models: `gpt-5`, `gemini-2.5-pro`, `gpt-5.1-codex-mini`
|
|
1171
1135
|
- Tokens: 3000
|
|
1172
1136
|
- Timeout: 15s
|
|
1173
1137
|
- Sources: Enabled by default
|
|
@@ -1339,7 +1303,7 @@ Critical thinking and echo chamber prevention by generating counter-arguments.
|
|
|
1339
1303
|
```typescript
|
|
1340
1304
|
{
|
|
1341
1305
|
context: string | object | array; // REQUIRED
|
|
1342
|
-
model?: string; // Default: "gpt-5-mini"
|
|
1306
|
+
model?: string; // Default: "gpt-5.1-codex-mini"
|
|
1343
1307
|
maxTokens?: number; // Default: 2000
|
|
1344
1308
|
temperature?: number; // 0-1, Default: 0.9
|
|
1345
1309
|
}
|
|
@@ -1350,13 +1314,13 @@ Critical thinking and echo chamber prevention by generating counter-arguments.
|
|
|
1350
1314
|
| Parameter | Type | Required | Default | Description |
|
|
1351
1315
|
|-----------|------|----------|---------|-------------|
|
|
1352
1316
|
| `context` | `string \| object \| array` | ✅ Yes | - | Claims to challenge |
|
|
1353
|
-
| `model` | `string` | No | `"gpt-5-mini"` | AI model to use |
|
|
1317
|
+
| `model` | `string` | No | `"gpt-5.1-codex-mini"` | AI model to use |
|
|
1354
1318
|
| `maxTokens` | `number` | No | `2000` | Max tokens per call |
|
|
1355
1319
|
| `temperature` | `number` | No | `0.9` | Creativity (0-1) |
|
|
1356
1320
|
|
|
1357
1321
|
#### Supported Models
|
|
1358
1322
|
|
|
1359
|
-
- `gpt-5-mini`, `gpt-5`, `qwq-32b`, `qwen3-30b`, `qwen3-coder-480b`
|
|
1323
|
+
- `gpt-5.1-codex-mini`, `gpt-5`, `qwq-32b`, `qwen3-30b`, `qwen3-coder-480b`
|
|
1360
1324
|
- `gemini-2.5-flash`, `gemini-2.5-pro`
|
|
1361
1325
|
- `grok-4`, `grok-4-0709`
|
|
1362
1326
|
- `sonar-pro`, `perplexity-sonar-pro`
|
package/docs/TOOL_PARAMETERS.md
CHANGED
|
@@ -20,7 +20,7 @@ The Challenger tool provides critical thinking and echo chamber prevention by ge
|
|
|
20
20
|
| Parameter | Type | Required | Default | Description |
|
|
21
21
|
|-----------|------|----------|---------|-------------|
|
|
22
22
|
| `context` | `string \| object \| array` | ✅ Yes | - | The claims, statements, or context to challenge. Can be a string, object with `query`/`text`/`content`, or array of contexts |
|
|
23
|
-
| `model` | `string` | No | `'gpt-5-mini'` | AI model to use for generating challenges.
|
|
23
|
+
| `model` | `string` | No | `'gpt-5.1-codex-mini'` | AI model to use for generating challenges. See [Supported Models](#supported-models) section |
|
|
24
24
|
| `maxTokens` | `number` | No | `2000` | Maximum tokens per API call |
|
|
25
25
|
| `temperature` | `number` | No | `0.9` | Temperature for response generation (0-1). Higher = more creative challenges |
|
|
26
26
|
|
|
@@ -89,17 +89,20 @@ interface Challenge {
|
|
|
89
89
|
|
|
90
90
|
### Supported Models
|
|
91
91
|
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
92
|
+
| Provider | Models | Notes |
|
|
93
|
+
|----------|--------|-------|
|
|
94
|
+
| **Google Gemini** | `gemini-3-pro-preview`, `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-lite` | Gemini 3 Pro is latest (Nov 2025) |
|
|
95
|
+
| **OpenAI** | `gpt-5.1`, `gpt-5.1-codex-mini`, `gpt-5.1-codex`, `gpt-5-pro` | Codex models use /v1/responses endpoint |
|
|
96
|
+
| **xAI (Grok)** | `grok-4-1-fast-reasoning`, `grok-4-1-fast-non-reasoning`, `grok-code-fast-1`, `grok-4-0709` | Grok 4.1 is latest (Nov 2025) |
|
|
97
|
+
| **Perplexity** | `sonar-pro`, `sonar-reasoning-pro` | Web search enabled |
|
|
98
|
+
| **OpenRouter** | `qwen/qwen3-coder-plus`, `moonshotai/kimi-k2-thinking` | Requires OPENROUTER_API_KEY |
|
|
96
99
|
|
|
97
100
|
### Notes
|
|
98
101
|
|
|
99
102
|
- Higher `temperature` values produce more diverse and creative challenges
|
|
100
103
|
- The tool automatically detects claim types (fact, opinion, assumption, conclusion)
|
|
101
104
|
- Groupthink detection works best with array contexts containing multiple similar statements
|
|
102
|
-
- Default model (`gpt-5-mini`) balances cost and quality
|
|
105
|
+
- Default model (`gpt-5.1-codex-mini`) balances cost and quality
|
|
103
106
|
|
|
104
107
|
---
|
|
105
108
|
|
|
@@ -123,32 +126,32 @@ The Verifier tool provides multi-model parallel verification with consensus anal
|
|
|
123
126
|
Each variant uses different models and settings optimized for specific use cases:
|
|
124
127
|
|
|
125
128
|
#### `quick_verify` (Default)
|
|
126
|
-
- **Models**: `
|
|
129
|
+
- **Models**: `qwen/qwen3-coder-plus`, `gemini-3-pro-preview`, `gpt-5.1-codex-mini`
|
|
127
130
|
- **Tokens**: 2000
|
|
128
131
|
- **Timeout**: 10000ms
|
|
129
132
|
- **Use case**: Fast verification of simple statements
|
|
130
133
|
|
|
131
134
|
#### `deep_verify`
|
|
132
|
-
- **Models**: `
|
|
135
|
+
- **Models**: `qwen/qwen3-coder-plus`, `gemini-3-pro-preview`, `gpt-5.1`
|
|
133
136
|
- **Tokens**: 6000
|
|
134
137
|
- **Timeout**: 30000ms
|
|
135
138
|
- **Use case**: Complex reasoning and analysis
|
|
136
139
|
|
|
137
140
|
#### `fact_check`
|
|
138
|
-
- **Models**: `
|
|
141
|
+
- **Models**: `qwen/qwen3-coder-plus`, `gemini-3-pro-preview`, `gpt-5.1-codex-mini`
|
|
139
142
|
- **Tokens**: 3000
|
|
140
143
|
- **Timeout**: 15000ms
|
|
141
144
|
- **Include Sources**: Yes (default)
|
|
142
145
|
- **Use case**: Factual verification with citations
|
|
143
146
|
|
|
144
147
|
#### `code_verify`
|
|
145
|
-
- **Models**: `qwen3-coder-
|
|
148
|
+
- **Models**: `qwen/qwen3-coder-plus`, `gemini-3-pro-preview`, `gpt-5.1-codex-mini`
|
|
146
149
|
- **Tokens**: 4000
|
|
147
150
|
- **Timeout**: 20000ms
|
|
148
151
|
- **Use case**: Code correctness verification
|
|
149
152
|
|
|
150
153
|
#### `security_verify`
|
|
151
|
-
- **Models**: `
|
|
154
|
+
- **Models**: `qwen/qwen3-coder-plus`, `gemini-3-pro-preview`, `gpt-5.1-codex-mini`
|
|
152
155
|
- **Tokens**: 4000
|
|
153
156
|
- **Timeout**: 20000ms
|
|
154
157
|
- **Use case**: Security vulnerability detection
|
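The five Verifier variants listed in this hunk differ only in their model list, token budget, and timeout, so they can be summarized as a table of presets. The sketch below restates the documented values as data. `VariantPreset`, `PRESETS`, and `resolveVerifierOptions` are hypothetical names, and the override merge is an assumed behavior, not the package's API.

```typescript
type VerifierVariant =
  | "quick_verify"
  | "deep_verify"
  | "fact_check"
  | "code_verify"
  | "security_verify";

interface VariantPreset {
  models: string[];
  maxTokens: number;
  timeoutMs: number;
  includeSources?: boolean;
}

// The first two models are shared by every variant in the updated docs.
const BASE_MODELS: string[] = ["qwen/qwen3-coder-plus", "gemini-3-pro-preview"];

// Values copied from the variant descriptions in TOOL_PARAMETERS.md.
const PRESETS: Record<VerifierVariant, VariantPreset> = {
  quick_verify: { models: [...BASE_MODELS, "gpt-5.1-codex-mini"], maxTokens: 2000, timeoutMs: 10000 },
  deep_verify: { models: [...BASE_MODELS, "gpt-5.1"], maxTokens: 6000, timeoutMs: 30000 },
  fact_check: { models: [...BASE_MODELS, "gpt-5.1-codex-mini"], maxTokens: 3000, timeoutMs: 15000, includeSources: true },
  code_verify: { models: [...BASE_MODELS, "gpt-5.1-codex-mini"], maxTokens: 4000, timeoutMs: 20000 },
  security_verify: { models: [...BASE_MODELS, "gpt-5.1-codex-mini"], maxTokens: 4000, timeoutMs: 20000 },
};

// Hypothetical helper: picks a preset and applies caller overrides on top of it.
function resolveVerifierOptions(
  variant: VerifierVariant = "quick_verify",
  overrides: Partial<VariantPreset> = {}
): VariantPreset {
  return { ...PRESETS[variant], ...overrides };
}
```

Seen this way, `deep_verify` is the only variant that swaps in `gpt-5.1`. The others share the cheaper `gpt-5.1-codex-mini`.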
|
@@ -175,7 +178,7 @@ const result2 = await verifier.verify(
|
|
|
175
178
|
const result3 = await verifier.verify(
|
|
176
179
|
'Is this code safe?',
|
|
177
180
|
{
|
|
178
|
-
model: ['gpt-5', 'gemini-2.5-pro'],
|
|
181
|
+
model: ['gpt-5.1', 'gemini-2.5-pro'],
|
|
179
182
|
maxTokens: 3000
|
|
180
183
|
}
|
|
181
184
|
);
|
|
@@ -248,7 +251,7 @@ The Scout tool provides conditional hybrid intelligence gathering, using Perplex
|
|
|
248
251
|
#### `research_scout` (Default)
|
|
249
252
|
- **Flow**: `perplexity-first-always`
|
|
250
253
|
- **Perplexity Timeout**: 500ms
|
|
251
|
-
- **Parallel Models**: `gemini-
|
|
254
|
+
- **Parallel Models**: `gemini-3-pro-preview`, `gpt-5.1-codex-mini`
|
|
252
255
|
- **Tokens**: 2500
|
|
253
256
|
- **Max Sources**: 100
|
|
254
257
|
- **Use case**: Comprehensive research with current facts
|
|
@@ -271,7 +274,7 @@ The Scout tool provides conditional hybrid intelligence gathering, using Perplex
|
|
|
271
274
|
#### `quick_scout`
|
|
272
275
|
- **Flow**: `conditional-hybrid`
|
|
273
276
|
- **Perplexity Timeout**: 250ms
|
|
274
|
-
- **Parallel Models**: `gemini-
|
|
277
|
+
- **Parallel Models**: `gemini-3-pro-preview`, `gpt-5.1-codex-mini`
|
|
275
278
|
- **Tokens**: 1000
|
|
276
279
|
- **Max Sources**: 50
|
|
277
280
|
- **Use case**: Fast information gathering
|
|
@@ -451,7 +454,7 @@ See test files for more usage examples:
|
|
|
451
454
|
1. **Use appropriate variants**: Don't use `deep_verify` when `quick_verify` suffices
|
|
452
455
|
2. **Set token limits**: Lower `maxTokens` for simple queries
|
|
453
456
|
3. **Control timeouts**: Shorter timeouts for time-sensitive operations
|
|
454
|
-
4. **Choose models wisely**: `gpt-5-mini` and `gemini-2.5-flash` are fast and cheap
|
|
457
|
+
4. **Choose models wisely**: `gpt-5.1-codex-mini` and `gemini-2.5-flash` are fast and cheap
|
|
455
458
|
5. **Limit Grok sources**: Keep `maxSearchSources` low unless needed
|
|
456
459
|
6. **Use `quick_scout`**: For simple lookups instead of full research
|
|
457
460
|
|
|
@@ -467,7 +470,7 @@ If migrating from old tool structure:
|
|
|
467
470
|
await thinkTool.challenge(context);
|
|
468
471
|
|
|
469
472
|
// New
|
|
470
|
-
await challenger.challenge(context, { model: 'gpt-5-mini' });
|
|
473
|
+
await challenger.challenge(context, { model: 'gpt-5.1-codex-mini' });
|
|
471
474
|
```
|
|
472
475
|
|
|
473
476
|
### Verifier (formerly consensus tools)
|
package/docs/WORKFLOWS.md
CHANGED
|
@@ -561,7 +561,7 @@ steps:
|
|
|
561
561
|
problem: "GraphQL architecture patterns"
|
|
562
562
|
style: "systematic"
|
|
563
563
|
quantity: 5
|
|
564
|
-
model: "gpt-5-mini"
|
|
564
|
+
model: "gpt-5.1-codex-mini"
|
|
565
565
|
output:
|
|
566
566
|
variable: openai_patterns
|
|
567
567
|
|
|
@@ -587,7 +587,7 @@ steps:
|
|
|
587
587
|
problem: "Design a distributed caching system"
|
|
588
588
|
style: "innovative"
|
|
589
589
|
quantity: 3
|
|
590
|
-
model: "gpt-5-mini"
|
|
590
|
+
model: "gpt-5.1-codex-mini"
|
|
591
591
|
output: initial_design
|
|
592
592
|
|
|
593
593
|
# Step 2: Challenge the design
|
|
@@ -603,7 +603,7 @@ steps:
|
|
|
603
603
|
query: "Improve the design considering these challenges: ${challenges}"
|
|
604
604
|
mode: "architecture-debate"
|
|
605
605
|
domain: "backend"
|
|
606
|
-
models: ["gpt-5-mini", "gemini-2.5-pro", "grok-4"]
|
|
606
|
+
models: ["gpt-5.1-codex-mini", "gemini-2.5-pro", "grok-4"]
|
|
607
607
|
rounds: 5
|
|
608
608
|
pingPongStyle: "debate"
|
|
609
609
|
temperature: 0.7
|
|
@@ -614,7 +614,7 @@ steps:
|
|
|
614
614
|
params:
|
|
615
615
|
query: "Verify this refined design addresses the challenges: ${refined_design}"
|
|
616
616
|
variant: "code_verify"
|
|
617
|
-
models: ["gpt-5-mini", "gemini-2.5-pro"]
|
|
617
|
+
models: ["gpt-5.1-codex-mini", "gemini-2.5-pro"]
|
|
618
618
|
output: final_verdict
|
|
619
619
|
```
|
|
620
620
|
|
|
@@ -690,7 +690,7 @@ steps:
|
|
|
690
690
|
problem: "${query}"
|
|
691
691
|
style: "systematic"
|
|
692
692
|
quantity: 3
|
|
693
|
-
model: "gpt-5-mini"
|
|
693
|
+
model: "gpt-5.1-codex-mini"
|
|
694
694
|
output: gpt_perspective
|
|
695
695
|
|
|
696
696
|
- tool: perplexity_ask
|
|
@@ -723,7 +723,7 @@ steps:
|
|
|
723
723
|
params:
|
|
724
724
|
query: "Based on synchronized analysis (${sync_analysis}) and latest data (${latest_data}), provide comprehensive answer to: ${query}"
|
|
725
725
|
mode: "deep-reasoning"
|
|
726
|
-
models: ["gpt-5-mini", "gemini-2.5-pro", "perplexity"]
|
|
726
|
+
models: ["gpt-5.1-codex-mini", "gemini-2.5-pro", "perplexity"]
|
|
727
727
|
rounds: 5
|
|
728
728
|
pingPongStyle: "collaborative"
|
|
729
729
|
output: deep_analysis
|
|
@@ -733,7 +733,7 @@ steps:
|
|
|
733
733
|
params:
|
|
734
734
|
query: "Challenge and improve this analysis: ${deep_analysis}"
|
|
735
735
|
mode: "architecture-debate"
|
|
736
|
-
models: ["gpt-5-mini", "gemini-2.5-pro", "grok-4"]
|
|
736
|
+
models: ["gpt-5.1-codex-mini", "gemini-2.5-pro", "grok-4"]
|
|
737
737
|
rounds: 3
|
|
738
738
|
pingPongStyle: "debate"
|
|
739
739
|
output: challenged_analysis
|
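Every WORKFLOWS.md change in this file is the same substitution: `gpt-5-mini` becomes `gpt-5.1-codex-mini` in `model` and `models` parameters. For custom workflows that have already been loaded into plain objects, the same rename could be applied mechanically. This is a hedged sketch: `renameModels` and `MODEL_RENAMES` are hypothetical, and the diff does not say whether the old identifier is still accepted, so whether such a migration is required is an assumption.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Rename observed in this diff; extend the map if other ids need updating.
const MODEL_RENAMES = new Map<string, string>([
  ["gpt-5-mini", "gpt-5.1-codex-mini"],
]);

// Hypothetical helper: walks a parsed workflow and replaces any string that is
// exactly an old model id. Strings that merely contain the id are left alone.
function renameModels(value: Json, renames: Map<string, string> = MODEL_RENAMES): Json {
  if (typeof value === "string") {
    return renames.get(value) ?? value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => renameModels(item, renames));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = renameModels(child, renames);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

Because the match is on whole string values, free-text fields such as `query` are untouched unless they consist of nothing but the old model id.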
package/package.json
CHANGED
package/profiles/balanced.json
CHANGED
package/profiles/code_focus.json
CHANGED
|
@@ -13,8 +13,7 @@
|
|
|
13
13
|
"grok_architect": false,
|
|
14
14
|
"grok_brainstorm": false,
|
|
15
15
|
"grok_search": false,
|
|
16
|
-
"
|
|
17
|
-
"openai_compare": false,
|
|
16
|
+
"openai_reason": false,
|
|
18
17
|
"openai_brainstorm": false,
|
|
19
18
|
"openai_code_review": true,
|
|
20
19
|
"openai_explain": false,
|
package/profiles/full.json
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
{
|
|
2
|
-
"description": "All tools enabled for maximum capability (~Xk tokens,
|
|
2
|
+
"description": "All tools enabled for maximum capability (~Xk tokens, 31 tools)",
|
|
3
3
|
"tools": {
|
|
4
4
|
"think": true,
|
|
5
5
|
"focus": true,
|
|
@@ -13,8 +13,7 @@
|
|
|
13
13
|
"grok_architect": true,
|
|
14
14
|
"grok_brainstorm": true,
|
|
15
15
|
"grok_search": true,
|
|
16
|
-
"
|
|
17
|
-
"openai_compare": true,
|
|
16
|
+
"openai_reason": true,
|
|
18
17
|
"openai_brainstorm": true,
|
|
19
18
|
"openai_code_review": true,
|
|
20
19
|
"openai_explain": true,
|
package/profiles/minimal.json
CHANGED
|
@@ -13,8 +13,7 @@
|
|
|
13
13
|
"grok_architect": false,
|
|
14
14
|
"grok_brainstorm": false,
|
|
15
15
|
"grok_search": false,
|
|
16
|
-
"
|
|
17
|
-
"openai_compare": false,
|
|
16
|
+
"openai_reason": false,
|
|
18
17
|
"openai_brainstorm": false,
|
|
19
18
|
"openai_code_review": false,
|
|
20
19
|
"openai_explain": false,
|
|
@@ -13,8 +13,7 @@
|
|
|
13
13
|
"grok_architect": false,
|
|
14
14
|
"grok_brainstorm": false,
|
|
15
15
|
"grok_search": false,
|
|
16
|
-
"
|
|
17
|
-
"openai_compare": false,
|
|
16
|
+
"openai_reason": false,
|
|
18
17
|
"openai_brainstorm": false,
|
|
19
18
|
"openai_code_review": false,
|
|
20
19
|
"openai_explain": false,
|
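The profile hunks above all drop the `openai_compare` entry and add `openai_reason`. A custom profile written against 2.0.6 could be brought in line with a small transform. This is a sketch under assumptions: `migrateProfile` and `ToolProfile` are hypothetical, defaulting `openai_reason` to `false` is a conservative choice made here, and the second removed key is truncated in the diff, so it is deliberately left unhandled.

```typescript
type ToolProfile = {
  description?: string;
  tools: Record<string, boolean>;
};

// Hypothetical helper: returns a copy of the profile without `openai_compare`
// and with an explicit `openai_reason` entry.
function migrateProfile(profile: ToolProfile): ToolProfile {
  const tools: Record<string, boolean> = { ...profile.tools };
  // `openai_compare` no longer appears in any bundled profile in 2.1.0.
  delete tools["openai_compare"];
  // Assumption: keep the new tool disabled unless the profile already sets it.
  if (!Object.prototype.hasOwnProperty.call(tools, "openai_reason")) {
    tools["openai_reason"] = false;
  }
  return { ...profile, tools };
}
```

The input profile is not mutated, so the original can be kept as a backup.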
package/tools.config.json
CHANGED
|
@@ -25,7 +25,7 @@
|
|
|
25
25
|
"grok_brainstorm",
|
|
26
26
|
"grok_search"
|
|
27
27
|
],
|
|
28
|
-
"openai": ["
|
|
28
|
+
"openai": ["openai_brainstorm", "openai_reason", "openai_code_review", "openai_explain"],
|
|
29
29
|
"gemini": [
|
|
30
30
|
"gemini_brainstorm",
|
|
31
31
|
"gemini_analyze_code",
|
|
@@ -38,7 +38,12 @@
|
|
|
38
38
|
"workflow",
|
|
39
39
|
"list_workflows",
|
|
40
40
|
"create_workflow",
|
|
41
|
-
"visualize_workflow"
|
|
41
|
+
"visualize_workflow",
|
|
42
|
+
"workflow_start",
|
|
43
|
+
"continue_workflow",
|
|
44
|
+
"workflow_status",
|
|
45
|
+
"validate_workflow",
|
|
46
|
+
"validate_workflow_file"
|
|
42
47
|
],
|
|
43
48
|
"collaborative": ["pingpong", "qwen_competitive"]
|
|
44
49
|
},
|
|
@@ -59,8 +64,10 @@
|
|
|
59
64
|
"grok_architect": true,
|
|
60
65
|
"grok_brainstorm": true,
|
|
61
66
|
"grok_search": true,
|
|
62
|
-
"openai_compare": true,
|
|
63
67
|
"openai_brainstorm": true,
|
|
68
|
+
"openai_reason": true,
|
|
69
|
+
"openai_code_review": true,
|
|
70
|
+
"openai_explain": true,
|
|
64
71
|
"gemini_brainstorm": true,
|
|
65
72
|
"gemini_analyze_code": true,
|
|
66
73
|
"gemini_analyze_text": true,
|
|
@@ -74,6 +81,11 @@
|
|
|
74
81
|
"list_workflows": true,
|
|
75
82
|
"create_workflow": true,
|
|
76
83
|
"visualize_workflow": true,
|
|
84
|
+
"workflow_start": true,
|
|
85
|
+
"continue_workflow": true,
|
|
86
|
+
"workflow_status": true,
|
|
87
|
+
"validate_workflow": true,
|
|
88
|
+
"validate_workflow_file": true,
|
|
77
89
|
"pingpong": true,
|
|
78
90
|
"qwen_competitive": false
|
|
79
91
|
}
|
|
@@ -176,10 +176,12 @@ steps:
|
|
|
176
176
|
# ═══════════════════════════════════════════════════════════════════════════
|
|
177
177
|
|
|
178
178
|
- name: consensus
|
|
179
|
-
tool:
|
|
179
|
+
tool: openai_brainstorm
|
|
180
180
|
input:
|
|
181
|
-
|
|
182
|
-
|
|
181
|
+
problem: |
|
|
182
|
+
Synthesize final architecture recommendations for: ${query}
|
|
183
|
+
|
|
184
|
+
Combine these expert analyses into actionable recommendations:
|
|
183
185
|
- "Grok's SOLID analysis: ${grok-solid-analysis.output}"
|
|
184
186
|
- "Gemini's pattern analysis: ${gemini-pattern-analysis.output}"
|
|
185
187
|
- "Qwen's CQRS evaluation: ${qwen-cqrs-evaluation.output}"
|
|
@@ -15,7 +15,7 @@ variables:
|
|
|
15
15
|
steps:
|
|
16
16
|
# Step 1: Claude Thinking - Problem Framing
|
|
17
17
|
- name: claude-thinking
|
|
18
|
-
tool:
|
|
18
|
+
tool: openai_reason # Using GPT-5 Mini for structured thinking
|
|
19
19
|
input:
|
|
20
20
|
query: |
|
|
21
21
|
Analyze and structure the brainstorming request: ${query}
|
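The workflow steps above rely on two placeholder forms: plain variables such as `${query}` and step outputs such as `${grok-solid-analysis.output}`. The diff lists changes to `VariableInterpolator.js` but does not show them, so the following is only an illustrative sketch of how such placeholders could be resolved. `interpolate` is a hypothetical function, and leaving unresolved placeholders untouched is an assumption, not the engine's confirmed behavior.

```typescript
// Hypothetical sketch of `${name}` and `${step.output}` substitution.
function interpolate(
  template: string,
  variables: Map<string, string>,
  stepOutputs: Map<string, string>
): string {
  const suffix = ".output";
  return template.replace(/\$\{([^}]+)\}/g, (match: string, expr: string): string => {
    const key = expr.trim();
    if (key.endsWith(suffix)) {
      // `${step-name.output}` refers to the output of an earlier step.
      const stepName = key.slice(0, -suffix.length);
      return stepOutputs.get(stepName) ?? match;
    }
    // Anything else is treated as a workflow variable such as `${query}`.
    return variables.get(key) ?? match;
  });
}
```

Using a replacer function rather than a replacement string means that `$` sequences inside step outputs are inserted literally instead of being interpreted as replacement patterns.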