@matheuskrumenauer/tanya 0.10.0-beta.0 → 0.11.0-beta.0
- package/README.md +49 -0
- package/dist/cli.js +1478 -299
- package/dist/cli.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -267,6 +267,55 @@ schema-invalid tool responses before they reach model history.
 See [docs/mcp.md](./docs/mcp.md) for the full schema, transports, server tools,
 and security model.
 
+## Multi-model routing
+
+Tanya can route each agent step to a different provider/model. Planning and
+simple tool-call turns can use cheap chat models, while synthesis,
+verification, and reasoning turns can use stronger models only when needed.
+
+Default route profile:
+
+| Step | Route | Fallback |
+| --- | --- | --- |
+| `planning` | `deepseek/deepseek-chat` | `qwen/qwen3-coder-plus` |
+| `tool_call` | `deepseek/deepseek-chat` | `groq/llama-3.3-70b-versatile` |
+| `synthesis` | `deepseek/deepseek-reasoner` | `openai/gpt-4.1-mini` |
+| `verification` | `deepseek/deepseek-reasoner` | `openai/gpt-4.1-mini` |
+| `reasoning` | `deepseek/deepseek-reasoner` | `qwen/qwen3-coder-plus` |
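The default route profile can be pictured as a typed lookup table. The sketch below is illustrative only: the `StepType`, `RouteRule`, `parseRoute`, and `effectiveRoute` names are hypothetical and are not taken from Tanya's source or from `docs/routing.md`. Only the step names and `<provider>/<model>` values come from the table. Treating a session patch as an overlay on the defaults is also an assumption.

```typescript
// Hypothetical types for illustration; the real schema lives in docs/routing.md.
type StepType = "planning" | "tool_call" | "synthesis" | "verification" | "reasoning";

interface RouteRule {
  route: string;    // "<provider>/<model>"
  fallback: string; // "<provider>/<model>"
}

// Values copied from the default route profile table.
const defaultRoutes: Record<StepType, RouteRule> = {
  planning:     { route: "deepseek/deepseek-chat",     fallback: "qwen/qwen3-coder-plus" },
  tool_call:    { route: "deepseek/deepseek-chat",     fallback: "groq/llama-3.3-70b-versatile" },
  synthesis:    { route: "deepseek/deepseek-reasoner", fallback: "openai/gpt-4.1-mini" },
  verification: { route: "deepseek/deepseek-reasoner", fallback: "openai/gpt-4.1-mini" },
  reasoning:    { route: "deepseek/deepseek-reasoner", fallback: "qwen/qwen3-coder-plus" },
};

// Split "<provider>/<model>" at the first slash.
function parseRoute(spec: string): { provider: string; model: string } {
  const slash = spec.indexOf("/");
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(`expected <provider>/<model>, got "${spec}"`);
  }
  return { provider: spec.slice(0, slash), model: spec.slice(slash + 1) };
}

// Assumption: a session patch shadows the default route for that step only.
function effectiveRoute(
  step: StepType,
  sessionPatches: Partial<Record<StepType, string>> = {},
): string {
  return sessionPatches[step] ?? defaultRoutes[step].route;
}
```

Under these assumptions, `effectiveRoute("planning")` returns the table default, and passing a patch object for one step changes only that step.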
+
+Project routes live in `.tania/routes.json`; user-global routes live in
+`~/.tanya/routes.json` with a legacy read fallback from `~/.tania/routes.json`.
+Use `/route` in the REPL to inspect the effective table, `/route show
+<stepType>` to inspect one step, `/route set <stepType> <provider>/<model>` for
+a session-only patch, and `/route reset` to clear session patches.
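The user-global lookup with its legacy read fallback might look roughly like the following. This is a sketch under stated assumptions: `resolveUserRoutesPath` and its injected `exists` predicate are hypothetical, and only the two file locations come from the README text.

```typescript
// Hypothetical helper: only the two file locations come from the README text.
// `exists` is injected so the sketch stays independent of any filesystem API.
function resolveUserRoutesPath(
  homeDir: string,
  exists: (path: string) => boolean,
): string | null {
  const current = `${homeDir}/.tanya/routes.json`;
  const legacy = `${homeDir}/.tania/routes.json`;
  if (exists(current)) return current; // preferred location
  if (exists(legacy)) return legacy;   // legacy location, read-only fallback
  return null;                         // no user-global routes file found
}
```

Passing the existence check as a parameter keeps the lookup order easy to test without touching a real home directory.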
+
+Escalations are visible: if a cheap route exhausts the malformed tool-call
+repair budget, Tanya emits `escalation_event` and uses the route fallback once,
+up to `TANYA_ESCALATION_CAP` per session.
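The escalation rule can be sketched as a small decision function. Everything here except the `escalation_event` name and the idea of a per-session cap is hypothetical: the `chooseModel` signature, the `EscalationState` shape, and in particular what happens once the cap is reached (this sketch simply stays on the primary route) are assumptions, not documented behavior.

```typescript
// Hypothetical sketch: only `escalation_event` and the per-session cap
// (TANYA_ESCALATION_CAP) come from the README text.
interface EscalationState {
  escalationsUsed: number; // escalations so far in this session
  cap: number;             // e.g. parsed from TANYA_ESCALATION_CAP
}

interface Decision {
  model: string;    // "<provider>/<model>" to use for the next attempt
  events: string[]; // event names emitted by this decision
}

function chooseModel(
  rule: { route: string; fallback: string },
  repairAttempts: number,
  repairBudget: number,
  state: EscalationState,
): Decision {
  // Repair budget not yet exhausted: keep using the cheap primary route.
  if (repairAttempts < repairBudget) {
    return { model: rule.route, events: [] };
  }
  // Assumption: once the session cap is reached, no further escalation happens.
  if (state.escalationsUsed >= state.cap) {
    return { model: rule.route, events: [] };
  }
  // Escalate once to the fallback and record it against the session cap.
  state.escalationsUsed += 1;
  return { model: rule.fallback, events: ["escalation_event"] };
}
```

Keeping the counter in an explicit state object makes the per-session cap visible at the call site instead of hiding it in a global.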
+
+See [docs/routing.md](./docs/routing.md) for schema, examples, context-window
+guards, per-tool model overrides, and sub-agent model pins.
+
+## Reasoning models
+
+Reasoning routes such as `deepseek-reasoner`, `qwen3-thinking-*`, and
+`grok-3-reasoning` are handled as a separate stream. Tanya archives reasoning to
+`.tania/runs/<runId>/reasoning.jsonl`, emits `reasoning_chunk` events, and keeps
+assistant history reasoning-free so replay and verifier inputs stay stable.
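One way to picture the separate reasoning stream is a function that splits incoming chunks into an archive and a reasoning-free history message. The `Chunk` shape, the `splitStream` name, and the JSONL record fields are invented for this sketch; only the `reasoning_chunk` event name and the JSONL archive format come from the README text.

```typescript
// Hypothetical chunk shape; real provider streams differ.
type Chunk = { kind: "reasoning" | "content"; text: string };

interface SplitResult {
  archiveJsonl: string; // one JSON object per line, as in a .jsonl file
  historyMessage: { role: "assistant"; content: string }; // no reasoning text
}

function splitStream(runId: string, chunks: Chunk[]): SplitResult {
  const lines: string[] = [];
  let content = "";
  for (const chunk of chunks) {
    if (chunk.kind === "reasoning") {
      // Assumed record layout: the README does not specify the fields.
      lines.push(JSON.stringify({ runId, type: "reasoning_chunk", text: chunk.text }));
    } else {
      content += chunk.text;
    }
  }
  return {
    archiveJsonl: lines.join("\n"),
    historyMessage: { role: "assistant", content },
  };
}
```

Because the history message is built only from content chunks, replaying it gives the same input whether or not reasoning was captured.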
+
+Reasoning tokens appear separately in `/cost` and `/budget`. Route rules can set
+`reasoningCap.maxTokens`; built-in defaults are 2k for planning-like turns and
+8k for synthesis/verification/reasoning turns. If the cap is exceeded, Tanya
+emits `reasoning_truncated` and asks the model to finish.
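The cap resolution described above can be sketched as follows. Several details are assumptions and are marked as such in the code: that "2k" and "8k" mean 2000 and 8000 tokens, that `tool_call` counts as a planning-like turn, and the helper names `reasoningCapFor` and `checkReasoning`. Only `reasoningCap.maxTokens` and `reasoning_truncated` come from the README text.

```typescript
type StepType = "planning" | "tool_call" | "synthesis" | "verification" | "reasoning";

// Assumptions: 2k = 2000 and 8k = 8000 tokens; `tool_call` is "planning-like".
const DEFAULT_REASONING_CAP: Record<StepType, number> = {
  planning: 2000,
  tool_call: 2000,
  synthesis: 8000,
  verification: 8000,
  reasoning: 8000,
};

// A route rule may override the built-in default via reasoningCap.maxTokens.
function reasoningCapFor(
  step: StepType,
  rule?: { reasoningCap?: { maxTokens?: number } },
): number {
  return rule?.reasoningCap?.maxTokens ?? DEFAULT_REASONING_CAP[step];
}

// Returns the event to emit once the cap is exceeded, otherwise null.
function checkReasoning(tokensSoFar: number, cap: number): "reasoning_truncated" | null {
  return tokensSoFar > cap ? "reasoning_truncated" : null;
}
```

In this sketch a count exactly equal to the cap does not trigger truncation, matching the README's wording that the cap must be exceeded.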
+
+Use `/memory --reasoning <runId>` to inspect archived reasoning. Use
+`TANYA_HIDE_REASONING=1` to hide reasoning from the human UI while preserving
+JSONL/Cosmo events. Verifier reasoning annotations are off by default; enable
+them with `--verbose-verifier` or `TANYA_VERIFIER_INCLUDE_REASONING=1`.
+
+See [docs/reasoning.md](./docs/reasoning.md) for provider notes, billing math,
+budget defaults, and UX modes.
+
 `--verify` adds required verification commands to the run context. Tanya must run and report each exact command before finishing the coding task.
 
 `tanya benchmark run --all` currently exercises 27 executable low-to-medium regression fixtures: targeted edits, new files, dependency/lockfile updates, framework-style migrations, failing-test repair, frontend smoke checks, artifact/context reuse, streaming long-tool execution, compaction-boundary recovery, run-history logging, dirty worktrees, report repair, and the CosmoHQ mobile/backend smoke profiles.