@ast-ai-model-router/cli 0.1.1 → 1.1.0
- package/.claude-plugin/plugin.json +6 -3
- package/.codex-plugin/plugin.json +8 -5
- package/README.md +169 -26
- package/bin/ast-ai-model-router.js +162 -20
- package/lib/adapters.js +51 -0
- package/lib/analyzer.js +3 -0
- package/lib/config.js +46 -0
- package/lib/cost.js +47 -0
- package/lib/decision.js +68 -0
- package/lib/gateway.js +143 -0
- package/lib/policy.js +35 -0
- package/package.json +13 -4
- package/skills/model-router/SKILL.md +33 -3
package/.claude-plugin/plugin.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "ast-ai-model-router",
-  "version": "
-  "description": "AST-
+  "version": "1.0.0",
+  "description": "AST-based Claude Code and Codex model routing with token-cost estimates and CI policy checks.",
   "author": {
     "name": "Faraazuddin Mohammed",
     "email": "opensource@faraa2m.dev",
@@ -11,11 +11,14 @@
   "repository": "https://github.com/faraa2m/ast-ai-model-router",
   "license": "MIT",
   "keywords": [
+    "ai-coding-agent",
     "token-economics",
+    "llm-cost-optimization",
     "model-router",
     "claude-code",
     "codex",
-    "ast"
+    "ast",
+    "ci"
   ],
   "skills": "./skills/"
 }
package/.codex-plugin/plugin.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "ast-ai-model-router",
-  "version": "
-  "description": "AST-
+  "version": "1.0.0",
+  "description": "AST-based Claude Code and Codex model routing with token-cost estimates and CI policy checks.",
   "author": {
     "name": "Faraazuddin Mohammed",
     "email": "opensource@faraa2m.dev",
@@ -11,17 +11,20 @@
   "repository": "https://github.com/faraa2m/ast-ai-model-router",
   "license": "MIT",
   "keywords": [
+    "ai-coding-agent",
     "token-economics",
+    "llm-cost-optimization",
     "model-router",
     "claude-code",
     "codex",
-    "ast"
+    "ast",
+    "ci"
   ],
   "skills": "./skills/",
   "interface": {
     "displayName": "AST AI Model Router",
-    "shortDescription": "Pick Claude or Codex models from AST and
-    "longDescription": "AST AI Model Router analyzes JavaScript, TypeScript, and Python project structure plus the current task to choose an appropriate Claude or Codex model before launch. It is part of
+    "shortDescription": "Pick Claude Code or Codex models from AST, task, cost, and policy signals.",
+    "longDescription": "AST AI Model Router analyzes JavaScript, TypeScript, and Python project structure plus the current task to choose an appropriate Claude Code or Codex model before launch. It adds Tokenometer-backed token-cost estimates, explainable decision records, and CI policy checks for teams. It is part of the faraa2m token-economics toolkit for measuring, reducing, and routing LLM spend.",
     "developerName": "Faraazuddin Mohammed",
     "category": "Developer Tools",
     "capabilities": [
package/README.md
CHANGED
@@ -1,14 +1,23 @@
 # AST AI Model Router

-
+[](https://www.npmjs.com/package/@ast-ai-model-router/cli)
+[](https://github.com/faraa2m/ast-ai-model-router/actions/workflows/ci.yml)
+[](https://github.com/faraa2m/ast-ai-model-router/actions/workflows/npm-smoke.yml)
+[](LICENSE)
+[](https://github.com/faraa2m/ast-ai-model-router/stargazers)

-
+AST-based Claude Code and Codex model router for developers who want explainable AI coding-agent model selection, token-cost visibility, and CI policy checks.

--
-
-
-
--
+`ast-ai-model-router` inspects the current task, JavaScript/TypeScript ASTs, Python ASTs, and repo shape, then recommends or launches Claude Code / Codex with the right model tier. It is deterministic, local-first, and designed for both personal coding workflows and production team guardrails.
+
+## Why Use It
+
+- Avoid defaulting every coding-agent task to the strongest model.
+- Keep simple documentation and explanation tasks on cheaper/faster models.
+- Escalate refactors, migrations, security, auth, database, and architecture work.
+- Get a readable rationale for every model decision.
+- Add CI checks for max model tier and estimated prompt cost.
+- Connect local coding-agent routing to the same token-economics stack as Tokenometer and RouterLab.

 ## Install

@@ -16,39 +25,114 @@ This project is part of the [`faraa2m`](https://github.com/faraa2m) token econom
 npm install -g @ast-ai-model-router/cli
 ```

-
+Run without installing:

 ```bash
-
-npm link
+npx --yes --package @ast-ai-model-router/cli ast-ai-model-router --help
 ```

-##
+## Quick Start

-
+Initialize config in a repo:

 ```bash
-ast-ai-model-router
+ast-ai-model-router init
+```
+
+Analyze a Claude Code task:
+
+```bash
+ast-ai-model-router analyze --agent claude --task "write docs for the parser"
+```
+
+Explain the decision:
+
+```bash
+ast-ai-model-router explain --agent codex --task "refactor auth middleware and add regression tests"
+```
+
+Preview a launch command without starting an agent:
+
+```bash
+ast-ai-model-router run codex --task "fix failing Python AST tests" --dry-run -- --cd .
 ```

 Launch Codex with the selected model:

 ```bash
-ast-ai-model-router run codex --task "fix
+ast-ai-model-router run codex --task "fix failing Python AST tests" -- --cd .
 ```

 Launch Claude Code with the selected alias:

 ```bash
-ast-ai-model-router run claude --task "plan a cross-module migration" -- --permission-mode plan
+ast-ai-model-router run claude --task "plan a cross-module database migration" -- --permission-mode plan
 ```

-
+Route each prompt through a gateway session:

 ```bash
-ast-ai-model-router
+ast-ai-model-router gateway codex -- --sandbox workspace-write
 ```

+Preview one gateway turn without launching an agent:
+
+```bash
+ast-ai-model-router gateway claude --once --task "write docs for this repo" --dry-run -- --permission-mode plan
+```
+
+## CI And Team Policy
+
+Fail if a task would exceed the allowed tier:
+
+```bash
+ast-ai-model-router ci \
+  --agent claude \
+  --task "plan a production database migration" \
+  --max-tier complex
+```
+
+Fail if Tokenometer can estimate the task prompt above your budget:
+
+```bash
+ast-ai-model-router ci \
+  --agent codex \
+  --task "review this large auth refactor" \
+  --max-cost-usd 0.001
+```
+
+Machine-readable decision output:
+
+```bash
+ast-ai-model-router analyze --agent codex --task "write tests" --json
+```
+
+The JSON includes `selectedModel`, `tier`, `confidence`, `signals`, `rationale`, `warnings`, `costEstimate`, `policy`, and `commandPreview`.
+
+## Per-Turn Gateway
+
+`run` chooses one model before launching a Claude Code or Codex session. `gateway` is different: it keeps a small router prompt open, scores every message you type, then invokes the selected agent model for that turn.
+
+```bash
+ast-ai-model-router gateway claude -- --permission-mode plan
+ast-ai-model-router gateway codex -- --sandbox workspace-write
+```
+
+Inside the gateway, type a prompt and press Enter. Use `/exit` or `/quit` to stop.
+
+For single-turn automation or CI smoke tests:
+
+```bash
+ast-ai-model-router gateway codex --once --task "add regression tests for parser errors" --dry-run
+```
+
+The gateway uses non-interactive agent execution:
+
+- Claude Code: `claude --print --model <selected-model> ... <prompt>`
+- Codex: `codex exec --model <selected-model> ... <prompt>`
+
+This is not an invisible hook inside an already-running Claude Code or Codex TUI. To route every turn, enter prompts through `ast-ai-model-router gateway ...`.
+
 ## How Routing Works

 The router scores four groups of signals:
@@ -56,20 +140,38 @@ The router scores four groups of signals:
 - Prompt intent: docs, tests, debugging, refactors, architecture, security, migrations.
 - Repo shape: file count, AST file count, package/build/config files.
 - AST complexity: functions, classes, branches, imports, and language mix.
-- Agent model catalog: Codex models are discovered through `codex debug models`; Claude uses dynamic aliases.
+- Agent model catalog: Codex models are discovered through `codex debug models`; Claude Code uses dynamic aliases.

-Claude targets are
+Claude Code targets are aliases, not dated model names:

 - `simple` -> `haiku`
 - `balanced` -> `sonnet`
 - `complex` -> `opus`
 - `planning` -> `opusplan`

-Codex targets are selected from the installed Codex model catalog. If discovery fails, the router falls back to
+Codex targets are selected from the installed Codex model catalog. If discovery fails, the router falls back to configured defaults.
+
+## Token-Cost Estimates
+
+Cost estimates use [`@tokenometer/core`](https://github.com/faraa2m/tokenometer) when the selected model maps to a known provider model.
+
+Examples:
+
+- Claude alias `haiku` maps to `claude-haiku-4-5`.
+- Claude alias `sonnet` maps to `claude-sonnet-4-6`.
+- Codex `gpt-5.4-mini` maps to Tokenometer model `gpt-5-mini`.
+
+If a model cannot be mapped, routing still works and the decision includes a warning:
+
+```text
+Cost estimate unavailable: No Tokenometer model mapping for codex model "..."
+```
+
+Cost estimates are for the task prompt text, not source-file contents. This keeps the tool privacy-preserving and fast by default.

 ## Configuration

-
+`ast-ai-model-router init` writes `model-router.config.json`:

 ```json
 {
@@ -86,25 +188,66 @@ Add `model-router.config.json` to a project root:
     }
   },
   "codex": {
-    "discoveryCommand": "codex debug models"
+    "discoveryCommand": "codex debug models",
+    "fallbackModels": {
+      "simple": "gpt-5.4-mini",
+      "balanced": "gpt-5.4",
+      "complex": "gpt-5.5",
+      "planning": "gpt-5.5"
+    }
+  },
+  "policy": {
+    "maxTier": "planning",
+    "maxCostUsd": null
+  },
+  "logging": {
+    "enabled": false,
+    "path": ".model-router/decisions.jsonl"
   }
 }
 ```

-
+Decision logging is disabled by default. When enabled with config or `--log`, logs store model decisions and scores, not source code.
+
+## Exit Codes

-
+- `0`: success
+- `1`: runtime failure
+- `2`: invalid input or config
+- `3`: policy failure

-
+## Plugin Support
+
+This repo includes:
+
+- `.codex-plugin/plugin.json`
+- `.claude-plugin/plugin.json`
+- `skills/model-router/SKILL.md`
+
+Use the plugin locally:

 ```bash
 claude --plugin-dir .
 codex plugin marketplace add .
 ```

+## Token Economics Stack
+
+This project is part of the [`faraa2m`](https://github.com/faraa2m) token-economics ecosystem:
+
+- [`tokenometer`](https://github.com/faraa2m/tokenometer): token counts, USD cost, latency benchmarks, and CI prompt-cost guardrails.
+- [`llm-tokens-atlas`](https://github.com/faraa2m/llm-tokens-atlas): empirical tokenizer calibration dataset.
+- [`routerlab`](https://github.com/faraa2m/routerlab): cost-quality routing frontiers for LLM APIs.
+- [`promptc`](https://github.com/faraa2m/promptc): deterministic prompt compiler for cost reduction.
+- `ast-ai-model-router`: model routing for local coding agents.
+
 ## Privacy

-The router reads local source files to compute AST complexity and launches the local `claude` or `codex` CLI. It does not
+The router reads local source files to compute AST complexity and launches the local `claude` or `codex` CLI. It does not run a separate network service and does not upload source code. Any model traffic comes from the Claude Code or Codex CLI you choose to run.
+
+## Status
+
+This is an explainable heuristic router. It is production-usable for policy and workflow guardrails, but it does not claim empirically proven model-quality optimization yet. Future releases can add outcome logging and calibration against real task success.

 ## License

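The exit codes listed in the README diff above can be consumed by a small lookup in CI tooling. The following is a minimal sketch, not code from the package: `describeExit` and `isPolicyFailure` are hypothetical helper names, and the only facts taken from the README are the four code meanings.

```javascript
// Meanings copied from the README's "Exit Codes" section.
const EXIT_MEANINGS = {
  0: "success",
  1: "runtime failure",
  2: "invalid input or config",
  3: "policy failure"
};

// Hypothetical helper: turn a numeric exit code into a readable label.
function describeExit(code) {
  return EXIT_MEANINGS[code] ?? `unknown exit code ${code}`;
}

// Hypothetical helper: only code 3 is a deliberate policy verdict;
// other non-zero codes point at a broken invocation or environment.
function isPolicyFailure(code) {
  return code === 3;
}

console.log(describeExit(3)); // policy failure
console.log(isPolicyFailure(1)); // false
```

A CI wrapper could use the distinction to report "task exceeds the allowed tier or budget" separately from "the router could not run".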
package/bin/ast-ai-model-router.js
CHANGED
@@ -1,25 +1,40 @@
 #!/usr/bin/env node
 import { spawn } from "node:child_process";
+import { access, writeFile } from "node:fs/promises";
+import path from "node:path";
 import { parseArgs } from "node:util";
-import {
-import {
-import {
+import { CONFIG_TEMPLATE, loadConfig, validateTier } from "../lib/config.js";
+import { createDecision, maybeLogDecision } from "../lib/decision.js";
+import { runGateway } from "../lib/gateway.js";
+import { formatUsd } from "../lib/policy.js";

 const HELP = `ast-ai-model-router

 Usage:
   ast-ai-model-router analyze --agent claude|codex --task "fix auth bug" [--json]
+  ast-ai-model-router explain --agent claude|codex --task "plan migration" [--json]
+  ast-ai-model-router ci --agent claude|codex --task "deploy change" [--max-tier complex]
   ast-ai-model-router run claude --task "refactor parser" -- [extra claude args]
   ast-ai-model-router run codex --task "write tests" -- [extra codex args]
+  ast-ai-model-router gateway claude|codex [--once --task "write docs"] -- [extra agent args]
+  ast-ai-model-router init [--cwd <path>] [--force]

 Options:
   --agent <agent>     claude or codex for analyze
   --task <text>       Current task description
   --cwd <path>        Workspace to inspect, defaults to current directory
   --json              Emit machine-readable JSON
+  --max-tier <tier>   Policy ceiling: simple, balanced, complex, planning
+  --max-cost-usd <n>  Policy ceiling when cost estimate is available
+  --log               Append a local JSONL decision record
+  --dry-run           For run: print the command instead of launching
+  --once              For gateway: route one --task prompt and exit
   --refresh-models    Refresh Codex model catalog cache
 `;

+const EXIT_INVALID = 2;
+const EXIT_POLICY = 3;
+
 async function main() {
   const [command, maybeAgent, ...rest] = process.argv.slice(2);
   if (!command || command === "--help" || command === "-h") {
@@ -27,7 +42,30 @@ async function main() {
     return;
   }

-  if (command === "
+  if (command === "init") {
+    const { values } = parseArgs({
+      args: [maybeAgent, ...rest].filter(Boolean),
+      options: {
+        cwd: { type: "string" },
+        force: { type: "boolean" }
+      }
+    });
+    const cwd = path.resolve(values.cwd ?? process.cwd());
+    const configPath = path.join(cwd, "model-router.config.json");
+    if (!values.force) {
+      try {
+        await access(configPath);
+        throw new Error(`model-router.config.json already exists at ${configPath}. Use --force to replace it.`);
+      } catch (error) {
+        if (error.code !== "ENOENT") throw error;
+      }
+    }
+    await writeFile(configPath, `${JSON.stringify(CONFIG_TEMPLATE, null, 2)}\n`, "utf8");
+    process.stdout.write(`Wrote ${configPath}\n`);
+    return;
+  }
+
+  if (command === "analyze" || command === "explain" || command === "ci") {
     const { values } = parseArgs({
       args: [maybeAgent, ...rest].filter(Boolean),
       options: {
@@ -35,11 +73,24 @@ async function main() {
         task: { type: "string" },
         cwd: { type: "string" },
         json: { type: "boolean" },
+        log: { type: "boolean" },
+        "max-tier": { type: "string" },
+        "max-cost-usd": { type: "string" },
         "refresh-models": { type: "boolean" }
       }
     });
     const agent = assertAgent(values.agent);
-    const result = await
+    const result = await routeFromValues({ agent, values });
+    if (values.log) await maybeLogDecision(result, await loadConfig(values.cwd ?? process.cwd()), true);
+    if (command === "explain") {
+      printExplanation(result, Boolean(values.json));
+      return;
+    }
+    if (command === "ci") {
+      printCi(result, Boolean(values.json));
+      if (!result.policy.passed) process.exit(EXIT_POLICY);
+      return;
+    }
     printResult(result, Boolean(values.json));
     return;
   }
@@ -54,12 +105,25 @@ async function main() {
       options: {
         task: { type: "string" },
         cwd: { type: "string" },
+        log: { type: "boolean" },
+        "max-tier": { type: "string" },
+        "max-cost-usd": { type: "string" },
+        "dry-run": { type: "boolean" },
         "refresh-models": { type: "boolean" }
       }
     });
-    const result = await
+    const result = await routeFromValues({ agent, values });
+    if (values.log) await maybeLogDecision(result, await loadConfig(values.cwd ?? process.cwd()), true);
+    if (!result.policy.passed) {
+      process.stderr.write(`[model-router] policy failed: ${result.policy.failures.join("; ")}\n`);
+      process.exit(EXIT_POLICY);
+    }
     const executable = agent === "claude" ? "claude" : "codex";
     const args = ["--model", result.selectedModel, ...passthrough];
+    if (values["dry-run"]) {
+      process.stdout.write(`${executable} ${args.map(shellQuote).join(" ")}\n`);
+      return;
+    }
     process.stderr.write(`[model-router] ${agent}: ${result.selectedModel} (${result.tier}, confidence ${result.confidence.toFixed(2)})\n`);
     const child = spawn(executable, args, { cwd: result.cwd, stdio: "inherit" });
     child.on("exit", (code, signal) => {
@@ -76,23 +140,55 @@ async function main() {
     return;
   }

+  if (command === "gateway" || command === "intercept") {
+    const agent = assertAgent(maybeAgent);
+    const split = rest.indexOf("--");
+    const optionArgs = split === -1 ? rest : rest.slice(0, split);
+    const passthrough = split === -1 ? [] : rest.slice(split + 1);
+    const { values } = parseArgs({
+      args: optionArgs,
+      options: {
+        task: { type: "string" },
+        cwd: { type: "string" },
+        log: { type: "boolean" },
+        once: { type: "boolean" },
+        "max-tier": { type: "string" },
+        "max-cost-usd": { type: "string" },
+        "dry-run": { type: "boolean" },
+        "refresh-models": { type: "boolean" }
+      }
+    });
+    if (values.once && !values.task) throw new Error("--task is required with --once.");
+    const exitCode = await runGateway({
+      agent,
+      cwd: values.cwd ?? process.cwd(),
+      log: Boolean(values.log),
+      dryRun: Boolean(values["dry-run"]),
+      refreshModels: Boolean(values["refresh-models"]),
+      maxTier: values["max-tier"] ? validateTier(values["max-tier"], "--max-tier") : undefined,
+      maxCostUsd: parseOptionalNumber(values["max-cost-usd"], "--max-cost-usd"),
+      passthrough,
+      onceTask: values.once ? values.task : undefined
+    });
+    process.exit(exitCode);
+  }
+
   throw new Error(`Unknown command: ${command}`);
 }

-async function
-  const
-  const
-  const
-  return {
+async function routeFromValues({ agent, values }) {
+  const maxTier = values["max-tier"] ? validateTier(values["max-tier"], "--max-tier") : undefined;
+  const maxCostUsd = parseOptionalNumber(values["max-cost-usd"], "--max-cost-usd");
+  const config = await loadConfig(values.cwd ?? process.cwd());
+  return createDecision({
     agent,
-
-
-
-
-
-
-
-  };
+    task: values.task ?? "",
+    cwd: values.cwd ?? process.cwd(),
+    config,
+    refreshModels: Boolean(values["refresh-models"]),
+    maxTier,
+    maxCostUsd
+  });
 }

 function assertAgent(agent) {
@@ -110,10 +206,56 @@ function printResult(result, asJson) {
   process.stdout.write(`Tier: ${result.tier}\n`);
   process.stdout.write(`Confidence: ${result.confidence.toFixed(2)}\n`);
   process.stdout.write(`Signals: ${result.signals.map((signal) => `${signal.name}=${signal.value}`).join(", ")}\n`);
+  process.stdout.write(`Cost: ${formatCost(result.costEstimate)}\n`);
+  if (result.warnings.length) process.stdout.write(`Warnings: ${result.warnings.join(" ")}\n`);
+  process.stdout.write(`Why: ${result.rationale[0]}\n`);
   process.stdout.write(`Run: ${result.commandPreview}\n`);
 }

+function printExplanation(result, asJson) {
+  if (asJson) {
+    process.stdout.write(`${JSON.stringify({ ...result, explanation: result.rationale }, null, 2)}\n`);
+    return;
+  }
+  printResult(result, false);
+  process.stdout.write("\nExplanation:\n");
+  for (const line of result.rationale) process.stdout.write(`- ${line}\n`);
+  process.stdout.write("\nSignal details:\n");
+  for (const signal of result.signals) {
+    process.stdout.write(`- ${signal.name}: ${signal.value} (${signal.detail})\n`);
+  }
+}
+
+function printCi(result, asJson) {
+  if (asJson) {
+    process.stdout.write(`${JSON.stringify(result, null, 2)}\n`);
+    return;
+  }
+  process.stdout.write(result.policy.passed ? "model-router ci: passed\n" : "model-router ci: failed\n");
+  printResult(result, false);
+  if (!result.policy.passed) {
+    for (const failure of result.policy.failures) process.stdout.write(`Policy failure: ${failure}\n`);
+  }
+}
+
+function formatCost(costEstimate) {
+  if (!costEstimate?.available) return `unavailable (${costEstimate?.reason ?? "unknown"})`;
+  return `${costEstimate.inputTokens} input tokens, ${formatUsd(costEstimate.inputCostUsd)} (${costEstimate.model})`;
+}
+
+function parseOptionalNumber(raw, field) {
+  if (raw === undefined || raw === null) return undefined;
+  const value = Number(raw);
+  if (!Number.isFinite(value) || value < 0) throw new Error(`${field} must be a non-negative number`);
+  return value;
+}
+
+function shellQuote(value) {
+  if (/^[A-Za-z0-9_./:=@-]+$/.test(value)) return value;
+  return `'${value.replaceAll("'", "'\\''")}'`;
+}
+
 main().catch((error) => {
   process.stderr.write(`[model-router] ${error.message}\n`);
-  process.exit(1);
+  process.exit(error.message.includes("must be") || error.message.includes("Expected") ? EXIT_INVALID : 1);
 });
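The new catch handler in the diff above picks the exit code by matching substrings of the error message. The sketch below isolates that expression so its effect on a few messages can be checked; `exitCodeFor` is a hypothetical name introduced here, and the sample messages are the ones visible in this diff.

```javascript
const EXIT_INVALID = 2;

// Same test as the catch handler: "must be" or "Expected" means invalid input,
// anything else is treated as a runtime failure.
function exitCodeFor(message) {
  return message.includes("must be") || message.includes("Expected") ? EXIT_INVALID : 1;
}

// Message thrown by validateTier for a bad --max-tier value.
console.log(exitCodeFor("--max-tier must be one of: simple, balanced, complex, planning")); // 2
// Message thrown when the agent is neither claude nor codex.
console.log(exitCodeFor("Expected agent to be 'claude' or 'codex'.")); // 2
// Message thrown when gateway --once is used without --task.
console.log(exitCodeFor("--task is required with --once.")); // 1
```

One consequence, at least as far as the visible code goes: some input mistakes, such as a missing `--task` with `--once` or an unknown command, exit with `1` rather than `2` because their messages contain neither substring.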
package/lib/adapters.js
ADDED
@@ -0,0 +1,51 @@
+import { spawn } from "node:child_process";
+
+export function buildAgentCommand({ agent, model, prompt, passthrough = [] }) {
+  if (agent === "claude") {
+    return {
+      command: "claude",
+      args: ["--print", "--model", model, ...passthrough, prompt]
+    };
+  }
+  if (agent === "codex") {
+    return {
+      command: "codex",
+      args: ["exec", "--model", model, ...passthrough, prompt]
+    };
+  }
+  throw new Error("Expected agent to be 'claude' or 'codex'.");
+}
+
+export function formatAgentCommand(request) {
+  const { command, args } = buildAgentCommand(request);
+  return `${command} ${args.map(shellQuote).join(" ")}`;
+}
+
+export async function executeAgent({ agent, model, prompt, cwd, passthrough = [] }) {
+  const { command, args } = buildAgentCommand({ agent, model, prompt, passthrough });
+  return new Promise((resolve) => {
+    const child = spawn(command, args, { cwd, stdio: ["ignore", "pipe", "pipe"] });
+    let stdout = "";
+    let stderr = "";
+
+    child.stdout?.setEncoding("utf8");
+    child.stderr?.setEncoding("utf8");
+    child.stdout?.on("data", (chunk) => {
+      stdout += chunk;
+    });
+    child.stderr?.on("data", (chunk) => {
+      stderr += chunk;
+    });
+    child.on("error", (error) => {
+      resolve({ exitCode: 1, stdout, stderr, error });
+    });
+    child.on("exit", (code, signal) => {
+      resolve({ exitCode: code ?? 1, signal, stdout, stderr });
+    });
+  });
+}
+
+function shellQuote(value) {
+  if (/^[A-Za-z0-9_./:=@-]+$/.test(value)) return value;
+  return `'${value.replaceAll("'", "'\\''")}'`;
+}
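To see what the gateway's command preview looks like, the two pure helpers from the file above can be run on their own. This sketch copies `buildAgentCommand`, `formatAgentCommand`, and `shellQuote` from the diff (without the `export` keywords or the `spawn` import); the model names and prompts are sample inputs chosen for illustration.

```javascript
// Quote an argument for display only when it contains characters outside a safe set.
function shellQuote(value) {
  if (/^[A-Za-z0-9_./:=@-]+$/.test(value)) return value;
  return `'${value.replaceAll("'", "'\\''")}'`;
}

// Build the non-interactive invocation for each agent, with the prompt last.
function buildAgentCommand({ agent, model, prompt, passthrough = [] }) {
  if (agent === "claude") {
    return { command: "claude", args: ["--print", "--model", model, ...passthrough, prompt] };
  }
  if (agent === "codex") {
    return { command: "codex", args: ["exec", "--model", model, ...passthrough, prompt] };
  }
  throw new Error("Expected agent to be 'claude' or 'codex'.");
}

function formatAgentCommand(request) {
  const { command, args } = buildAgentCommand(request);
  return `${command} ${args.map(shellQuote).join(" ")}`;
}

console.log(formatAgentCommand({
  agent: "claude",
  model: "haiku",
  prompt: "write docs for this repo",
  passthrough: ["--permission-mode", "plan"]
}));
// claude --print --model haiku --permission-mode plan 'write docs for this repo'

console.log(formatAgentCommand({
  agent: "codex",
  model: "gpt-5.4-mini",
  prompt: "fix the parser's error path"
}));
// codex exec --model gpt-5.4-mini 'fix the parser'\''s error path'
```

The quoting only affects the printed preview. `executeAgent` passes the argument array straight to `spawn`, so no shell parses the prompt.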
package/lib/analyzer.js
CHANGED
@@ -100,6 +100,9 @@ function scoreRepo(files, astFiles) {
 }

 function tierFor(score, task, config) {
+  if (/explain|summarize|readme|docs|comment/i.test(task) && !/architecture|migration|security|auth|database|billing|payment|production|deploy/i.test(task) && score <= config.thresholds.balancedMax) {
+    return "simple";
+  }
   if (/plan|architecture|migration|strategy/i.test(task) && score >= config.thresholds.simpleMax) return "planning";
   if (score <= config.thresholds.simpleMax) return "simple";
   if (score <= config.thresholds.balancedMax) return "balanced";
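The effect of the new short-circuit in `tierFor` is easiest to see on a few sample tasks. The sketch below restates the visible branches; the threshold values are hypothetical (the real defaults are not shown in this diff), and the final `return "complex"` is an assumption about the part of the function that lies outside the hunk.

```javascript
// Hypothetical thresholds, chosen only to exercise each branch.
const config = { thresholds: { simpleMax: 3, balancedMax: 7 } };

function tierFor(score, task, config) {
  const docsLike = /explain|summarize|readme|docs|comment/i.test(task);
  const risky = /architecture|migration|security|auth|database|billing|payment|production|deploy/i.test(task);
  // New in this release: docs-style tasks stay on the simple tier unless a risky keyword appears.
  if (docsLike && !risky && score <= config.thresholds.balancedMax) return "simple";
  if (/plan|architecture|migration|strategy/i.test(task) && score >= config.thresholds.simpleMax) return "planning";
  if (score <= config.thresholds.simpleMax) return "simple";
  if (score <= config.thresholds.balancedMax) return "balanced";
  return "complex"; // Assumption: the fallback is not visible in the hunk.
}

console.log(tierFor(5, "write docs for the parser", config)); // simple
console.log(tierFor(5, "write docs for the auth middleware", config)); // balanced
console.log(tierFor(5, "plan a cross-module database migration", config)); // planning
```

Because the patterns are unanchored substring matches, a word such as "author" also trips the `auth` guard and disables the short-circuit.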
package/lib/config.js
CHANGED
@@ -23,6 +23,41 @@ export const DEFAULT_CONFIG = {
       planning: "gpt-5.5"
     }
   },
+  modelMappings: {
+    claude: {
+      haiku: "claude-haiku-4-5",
+      sonnet: "claude-sonnet-4-6",
+      opus: "claude-opus-4-7",
+      opusplan: "claude-opus-4-7"
+    },
+    codex: {
+      "gpt-5.4-mini": "gpt-5-mini",
+      "gpt-5.4": "gpt-5",
+      "gpt-5.5": "gpt-5",
+      "codex-mini-latest": "codex-mini-latest"
+    }
+  },
+  policy: {
+    maxTier: "planning",
+    maxCostUsd: null
+  },
+  logging: {
+    enabled: false,
+    path: ".model-router/decisions.jsonl"
+  },
+  overrides: []
+};
+
+export const CONFIG_TEMPLATE = {
+  thresholds: DEFAULT_CONFIG.thresholds,
+  claude: DEFAULT_CONFIG.claude,
+  codex: DEFAULT_CONFIG.codex,
+  modelMappings: DEFAULT_CONFIG.modelMappings,
+  policy: {
+    maxTier: "planning",
+    maxCostUsd: null
+  },
+  logging: DEFAULT_CONFIG.logging,
   overrides: []
 };

@@ -52,6 +87,17 @@ function mergeConfig(base, override) {
       ...override.codex,
       fallbackModels: { ...base.codex.fallbackModels, ...override.codex?.fallbackModels }
     },
+    modelMappings: {
+      claude: { ...base.modelMappings.claude, ...override.modelMappings?.claude },
+      codex: { ...base.modelMappings.codex, ...override.modelMappings?.codex }
+    },
+    policy: { ...base.policy, ...override.policy },
+    logging: { ...base.logging, ...override.logging },
     overrides: override.overrides ?? base.overrides
   };
 }
+
+export function validateTier(tier, field = "tier") {
+  if (["simple", "balanced", "complex", "planning"].includes(tier)) return tier;
+  throw new Error(`${field} must be one of: simple, balanced, complex, planning`);
+}
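The merge lines added above mean a project config only needs to name the policy or logging keys it changes. A minimal sketch of that behaviour follows; `mergePolicyAndLogging` is a hypothetical cut-down stand-in for the real `mergeConfig`, covering only the two sections added in this hunk, while `validateTier` is copied from the diff.

```javascript
function validateTier(tier, field = "tier") {
  if (["simple", "balanced", "complex", "planning"].includes(tier)) return tier;
  throw new Error(`${field} must be one of: simple, balanced, complex, planning`);
}

// Hypothetical reduced merge: same spread pattern as the diff, fewer keys.
function mergePolicyAndLogging(base, override) {
  return {
    policy: { ...base.policy, ...override.policy },
    logging: { ...base.logging, ...override.logging }
  };
}

// Defaults as they appear in DEFAULT_CONFIG.
const defaults = {
  policy: { maxTier: "planning", maxCostUsd: null },
  logging: { enabled: false, path: ".model-router/decisions.jsonl" }
};

// A project file that only lowers the tier ceiling.
const merged = mergePolicyAndLogging(defaults, { policy: { maxTier: validateTier("complex") } });

console.log(merged.policy); // { maxTier: 'complex', maxCostUsd: null }
console.log(merged.logging.enabled); // false
```

Spreading an absent section (`override.logging` is `undefined` here) is a no-op, so the defaults pass through untouched.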
package/lib/cost.js
ADDED
@@ -0,0 +1,47 @@
+import { tokenize } from "@tokenometer/core";
+
+export function estimateCost({ agent, selectedModel, task, config }) {
+  const mappedModel = config.modelMappings?.[agent]?.[selectedModel];
+  if (!task?.trim()) {
+    return unavailable("No task text was provided, so there is no prompt to estimate.");
+  }
+  if (!mappedModel) {
+    return unavailable(`No Tokenometer model mapping for ${agent} model "${selectedModel}".`);
+  }
+  try {
+    const result = tokenize({
+      format: "text",
+      modelId: mappedModel,
+      prompt: task
+    });
+    return {
+      available: true,
+      scope: "task",
+      source: "tokenometer",
+      model: mappedModel,
+      inputTokens: result.inputTokens,
+      inputCostUsd: result.inputCost,
+      approximate: result.approximate,
+      tokenizer: result.tokenizer,
+      reason: result.approximate
+        ? "Estimated from Tokenometer offline/proxy tokenizer."
+        : "Estimated from Tokenometer exact offline tokenizer."
+    };
+  } catch (error) {
+    return unavailable(error instanceof Error ? error.message : String(error), mappedModel);
+  }
+}
+
+function unavailable(reason, model = null) {
+  return {
+    available: false,
+    scope: "task",
+    source: "tokenometer",
+    model,
+    inputTokens: null,
+    inputCostUsd: null,
+    approximate: null,
+    tokenizer: null,
+    reason
+  };
+}
|
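`estimateCost` has three ways to return an "unavailable" result (empty task, no model mapping, tokenizer error) and one success path. The sketch below walks through those branches under stated assumptions: `fakeTokenize` is a made-up stand-in for `tokenize` from `@tokenometer/core` (its word count and price are invented), `estimateWith` is a hypothetical condensed variant that takes the tokenizer as an argument, and the model names are placeholders.

```javascript
// Stand-in for @tokenometer/core's tokenize(); the word-count "tokenizer" and the price are invented.
function fakeTokenize({ modelId, prompt }) {
  if (modelId === "broken-model") throw new Error("unknown model");
  const inputTokens = prompt.trim().split(/\s+/).length;
  return { inputTokens, inputCost: inputTokens * 0.000001, approximate: true, tokenizer: "word-count" };
}

// Condensed version of the estimateCost control flow, with the tokenizer injected.
function estimateWith(tokenize, { agent, selectedModel, task, config }) {
  const mappedModel = config.modelMappings?.[agent]?.[selectedModel];
  const unavailable = (reason, model = null) => ({
    available: false, model, inputTokens: null, inputCostUsd: null, reason
  });
  if (!task?.trim()) return unavailable("No task text was provided.");
  if (!mappedModel) return unavailable(`No model mapping for ${agent} model "${selectedModel}".`);
  try {
    const result = tokenize({ format: "text", modelId: mappedModel, prompt: task });
    return {
      available: true, model: mappedModel,
      inputTokens: result.inputTokens, inputCostUsd: result.inputCost, reason: null
    };
  } catch (error) {
    // A tokenizer failure still reports which mapped model was attempted.
    return unavailable(error instanceof Error ? error.message : String(error), mappedModel);
  }
}

const config = { modelMappings: { claude: { sonnet: "example-model", opus: "broken-model" } } };
const task = "rename one variable";
const ok = estimateWith(fakeTokenize, { agent: "claude", selectedModel: "sonnet", task, config });
const unmapped = estimateWith(fakeTokenize, { agent: "claude", selectedModel: "haiku", task, config });
const failed = estimateWith(fakeTokenize, { agent: "claude", selectedModel: "opus", task, config });
console.log(ok.inputTokens, unmapped.available, failed.reason);
```

Returning a uniform result object instead of throwing lets callers such as `createDecision` turn a missing estimate into a warning rather than aborting the routing decision.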
package/lib/decision.js
ADDED
@@ -0,0 +1,68 @@
+import { appendFile, mkdir } from "node:fs/promises";
+import path from "node:path";
+import { analyzeTask } from "./analyzer.js";
+import { estimateCost } from "./cost.js";
+import { chooseModel } from "./models.js";
+import { evaluatePolicy } from "./policy.js";
+
+export async function createDecision({ agent, task, cwd, config, refreshModels = false, maxTier, maxCostUsd }) {
+  const analysis = await analyzeTask({ cwd, task, config });
+  const model = await chooseModel({ agent, tier: analysis.tier, task, config, refreshModels });
+  const costEstimate = estimateCost({ agent, selectedModel: model.model, task, config });
+  const policy = evaluatePolicy({ tier: analysis.tier, costEstimate, config, maxTier, maxCostUsd });
+  const warnings = [];
+  if (model.source === "fallback") warnings.push("Codex model discovery failed; using configured fallback model.");
+  if (!costEstimate.available) warnings.push(`Cost estimate unavailable: ${costEstimate.reason}`);
+  if (!policy.passed) warnings.push(`Policy failed: ${policy.failures.join("; ")}`);
+  const rationale = buildRationale({ analysis, model, costEstimate, policy });
+  return {
+    agent,
+    cwd: analysis.cwd,
+    selectedModel: model.model,
+    tier: analysis.tier,
+    confidence: analysis.confidence,
+    score: analysis.score,
+    signals: analysis.signals,
+    rationale,
+    warnings,
+    costEstimate,
+    policy,
+    modelSource: model.source,
+    commandPreview: `${agent} --model ${model.model}`
+  };
+}
+
+export async function maybeLogDecision(decision, config, explicitLog = false) {
+  if (!explicitLog && !config.logging?.enabled) return;
+  const logPath = path.resolve(decision.cwd, config.logging?.path ?? ".model-router/decisions.jsonl");
+  await mkdir(path.dirname(logPath), { recursive: true });
+  const record = {
+    timestamp: new Date().toISOString(),
+    agent: decision.agent,
+    selectedModel: decision.selectedModel,
+    tier: decision.tier,
+    confidence: decision.confidence,
+    score: decision.score,
+    costEstimate: decision.costEstimate,
+    policy: decision.policy,
+    signals: decision.signals
+  };
+  await appendFile(logPath, `${JSON.stringify(record)}\n`, "utf8");
+}
+
+function buildRationale({ analysis, model, costEstimate, policy }) {
+  const sorted = [...analysis.signals].sort((a, b) => Number(b.value) - Number(a.value));
+  const top = sorted.slice(0, 3).map((signal) => `${signal.name}=${signal.value} (${signal.detail})`);
+  const lines = [
+    `Selected ${model.model} because the task scored as ${analysis.tier} complexity.`,
+    `Top signals: ${top.join("; ")}.`,
+    `Model source: ${model.source}.`
+  ];
+  if (costEstimate.available) {
+    lines.push(`Estimated task prompt cost: ${costEstimate.inputTokens} input tokens, ${costEstimate.inputCostUsd.toFixed(6)} USD (${costEstimate.model}).`);
+  } else {
+    lines.push(`Cost estimate unavailable: ${costEstimate.reason}`);
+  }
+  lines.push(policy.passed ? "Policy: passed." : `Policy: failed (${policy.failures.join("; ")}).`);
+  return lines;
+}
|
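`maybeLogDecision` appends one JSON object per line (JSONL), so the log can be read back by splitting on newlines and parsing each line independently. This is a rough sketch of a reader, not something the package ships: `parseDecisionLog` and `summarize` are hypothetical helpers, and the two sample records use invented values with only a subset of the fields the real record contains.

```javascript
// Two sample log lines in the record shape written by maybeLogDecision; all values are invented.
const sampleLog = [
  JSON.stringify({
    timestamp: "2025-01-01T00:00:00.000Z", agent: "codex", selectedModel: "model-a",
    tier: "simple", policy: { passed: true, failures: [] }
  }),
  JSON.stringify({
    timestamp: "2025-01-01T00:05:00.000Z", agent: "codex", selectedModel: "model-b",
    tier: "planning", policy: { passed: false, failures: ["tier planning exceeds max tier complex"] }
  })
].join("\n") + "\n";

// Parse JSONL text: one JSON object per non-empty line.
function parseDecisionLog(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Count decisions per tier and collect every policy failure message.
function summarize(records) {
  const byTier = {};
  const failures = [];
  for (const record of records) {
    byTier[record.tier] = (byTier[record.tier] ?? 0) + 1;
    if (!record.policy.passed) failures.push(...record.policy.failures);
  }
  return { total: records.length, byTier, failures };
}

const summary = summarize(parseDecisionLog(sampleLog));
console.log(summary);
```

In practice the text would come from reading the file at `config.logging.path`, which defaults to `.model-router/decisions.jsonl` under the decision's `cwd` in the code above.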
package/lib/gateway.js
ADDED
@@ -0,0 +1,143 @@
+import { createInterface } from "node:readline/promises";
+import { stdin as input, stdout as output } from "node:process";
+import { loadConfig } from "./config.js";
+import { createDecision, maybeLogDecision } from "./decision.js";
+import { executeAgent as defaultExecuteAgent, formatAgentCommand } from "./adapters.js";
+
+export async function runGatewayTurn({
+  agent,
+  prompt,
+  cwd,
+  config,
+  refreshModels = false,
+  maxTier,
+  maxCostUsd,
+  log = false,
+  dryRun = false,
+  passthrough = [],
+  executeAgent = defaultExecuteAgent
+}) {
+  const decision = await createDecision({
+    agent,
+    task: prompt,
+    cwd,
+    config,
+    refreshModels,
+    maxTier,
+    maxCostUsd
+  });
+  await maybeLogDecision(decision, config, log);
+  if (!decision.policy.passed) {
+    return {
+      decision,
+      output: "",
+      exitCode: 3,
+      error: new Error(`Policy failed: ${decision.policy.failures.join("; ")}`)
+    };
+  }
+  if (dryRun) {
+    const commandPreview = formatAgentCommand({
+      agent,
+      model: decision.selectedModel,
+      prompt,
+      passthrough
+    });
+    return {
+      decision,
+      output: `${commandPreview}\n`,
+      exitCode: 0,
+      dryRun: true
+    };
+  }
+
+  const result = await executeAgent({
+    agent,
+    model: decision.selectedModel,
+    prompt,
+    cwd: decision.cwd,
+    passthrough
+  });
+  return {
+    decision,
+    output: result.stdout ?? "",
+    stderr: result.stderr ?? "",
+    exitCode: result.exitCode ?? 1,
+    signal: result.signal,
+    error: result.error
+  };
+}
+
+export async function runGateway({
+  agent,
+  cwd = process.cwd(),
+  log = false,
+  refreshModels = false,
+  maxTier,
+  maxCostUsd,
+  dryRun = false,
+  passthrough = [],
+  onceTask
+}) {
+  const config = await loadConfig(cwd);
+  if (onceTask !== undefined) {
+    const result = await runGatewayTurn({
+      agent,
+      prompt: onceTask,
+      cwd,
+      config,
+      log,
+      dryRun,
+      refreshModels,
+      maxTier,
+      maxCostUsd,
+      passthrough
+    });
+    printGatewayTurn(result);
+    return result.exitCode;
+  }
+
+  const rl = createInterface({ input, output });
+  process.stdout.write(`[model-router] gateway for ${agent}. Type /exit or /quit to stop.\n`);
+  try {
+    for (;;) {
+      const prompt = await rl.question("> ");
+      const trimmed = prompt.trim();
+      if (!trimmed) continue;
+      if (trimmed === "/exit" || trimmed === "/quit") return 0;
+      const result = await runGatewayTurn({
+        agent,
+        prompt,
+        cwd,
+        config,
+        log,
+        dryRun,
+        refreshModels,
+        maxTier,
+        maxCostUsd,
+        passthrough
+      });
+      printGatewayTurn(result);
+    }
+  } catch (error) {
+    if (error?.code === "ERR_USE_AFTER_CLOSE") return 0;
+    throw error;
+  } finally {
+    rl.close();
+  }
+}
+
+export function printGatewayTurn(result) {
+  const { decision } = result;
+  process.stderr.write(`[model-router] ${decision.agent}: ${decision.selectedModel} (${decision.tier}, confidence ${decision.confidence.toFixed(2)})\n`);
+  if (!decision.policy.passed) {
+    process.stderr.write(`[model-router] policy failed: ${decision.policy.failures.join("; ")}\n`);
+    return;
+  }
+  if (result.stderr) process.stderr.write(result.stderr);
+  if (result.error) {
+    process.stderr.write(`[model-router] failed to launch ${decision.agent}: ${result.error.message}\n`);
+    return;
+  }
+  if (result.output) process.stdout.write(result.output);
+  if (result.exitCode !== 0) process.stderr.write(`[model-router] ${decision.agent} exited with code ${result.exitCode}\n`);
+}
|
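`runGatewayTurn` resolves each prompt to one of three outcomes: a policy failure (exit code 3, agent never launched), a dry run (exit code 0, command preview only), or a real launch whose exit code falls back to 1 when the agent reports none. The sketch below restates that ordering in a condensed, synchronous form for illustration only. `resolveTurn`, the stub agents, and the hand-built `decision` objects are hypothetical, and the dry-run output here reuses `decision.commandPreview` as an assumption, whereas the real code calls `formatAgentCommand` from `adapters.js`, which is not shown in this section.

```javascript
// Condensed, synchronous sketch of how a gateway turn resolves to an exit code.
// The real runGatewayTurn is async and builds `decision` with createDecision.
function resolveTurn({ decision, dryRun = false, executeAgent }) {
  // The policy gate runs first: a failing decision never reaches the agent.
  if (!decision.policy.passed) return { exitCode: 3, output: "", ran: false };
  // A dry run only prints the command that would have been launched.
  if (dryRun) return { exitCode: 0, output: `${decision.commandPreview}\n`, ran: false };
  const result = executeAgent({ agent: decision.agent, model: decision.selectedModel });
  // A missing exit code (null or undefined) is treated as failure.
  return { exitCode: result.exitCode ?? 1, output: result.stdout ?? "", ran: true };
}

// Hypothetical stub agents standing in for the injected executeAgent.
const stubAgent = () => ({ stdout: "done\n", exitCode: 0 });
const crashedAgent = () => ({ stdout: "", exitCode: null });

const allowed = {
  agent: "codex",
  selectedModel: "model-a",
  commandPreview: "codex --model model-a",
  policy: { passed: true, failures: [] }
};
const blocked = { ...allowed, policy: { passed: false, failures: ["tier planning exceeds max tier complex"] } };

const outcomes = {
  blocked: resolveTurn({ decision: blocked, executeAgent: stubAgent }).exitCode,
  dryRun: resolveTurn({ decision: allowed, dryRun: true, executeAgent: stubAgent }).output,
  ran: resolveTurn({ decision: allowed, executeAgent: stubAgent }).exitCode,
  noCode: resolveTurn({ decision: allowed, executeAgent: crashedAgent }).exitCode
};
console.log(outcomes);
```

The `executeAgent` parameter in the real function defaults to the adapter implementation, which is what makes this kind of stubbing possible without spawning Claude Code or Codex.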
package/lib/policy.js
ADDED
@@ -0,0 +1,35 @@
+export const TIER_ORDER = {
+  simple: 0,
+  balanced: 1,
+  complex: 2,
+  planning: 3
+};
+
+export function evaluatePolicy({ tier, costEstimate, config, maxTier, maxCostUsd }) {
+  const effectiveMaxTier = maxTier ?? config.policy?.maxTier ?? "planning";
+  const effectiveMaxCostUsd = maxCostUsd ?? config.policy?.maxCostUsd ?? null;
+  const failures = [];
+  if (TIER_ORDER[tier] > TIER_ORDER[effectiveMaxTier]) {
+    failures.push(`tier ${tier} exceeds max tier ${effectiveMaxTier}`);
+  }
+  if (
+    effectiveMaxCostUsd !== null &&
+    effectiveMaxCostUsd !== undefined &&
+    costEstimate?.available &&
+    costEstimate.inputCostUsd > Number(effectiveMaxCostUsd)
+  ) {
+    failures.push(`estimated cost ${formatUsd(costEstimate.inputCostUsd)} exceeds max cost ${formatUsd(Number(effectiveMaxCostUsd))}`);
+  }
+  return {
+    passed: failures.length === 0,
+    failures,
+    maxTier: effectiveMaxTier,
+    maxCostUsd: effectiveMaxCostUsd
+  };
+}
+
+export function formatUsd(value) {
+  if (!Number.isFinite(value)) return "unavailable";
+  if (value === 0) return "$0.000000";
+  return `$${value.toFixed(6)}`;
+}
|
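Two properties of `evaluatePolicy` are easy to miss when reading the diff: explicit arguments take precedence over `config.policy`, and the cost cap is only checked when a cost estimate is available, so a task with no estimate passes the cost check by default. The sketch below exercises both, assuming a hypothetical `checkPolicy` that restates the same comparisons in condensed form with invented limits and costs.

```javascript
const TIER_ORDER = { simple: 0, balanced: 1, complex: 2, planning: 3 };

// Condensed restatement of the policy check for experimentation.
// Explicit maxTier/maxCostUsd arguments win over config.policy values.
function checkPolicy({ tier, costEstimate, config = {}, maxTier, maxCostUsd }) {
  const effectiveMaxTier = maxTier ?? config.policy?.maxTier ?? "planning";
  const effectiveMaxCostUsd = maxCostUsd ?? config.policy?.maxCostUsd ?? null;
  const failures = [];
  if (TIER_ORDER[tier] > TIER_ORDER[effectiveMaxTier]) {
    failures.push(`tier ${tier} exceeds max tier ${effectiveMaxTier}`);
  }
  // The cost cap applies only when an estimate exists.
  if (
    effectiveMaxCostUsd !== null &&
    costEstimate?.available &&
    costEstimate.inputCostUsd > Number(effectiveMaxCostUsd)
  ) {
    failures.push(
      `estimated cost $${costEstimate.inputCostUsd.toFixed(6)} exceeds max cost $${Number(effectiveMaxCostUsd).toFixed(6)}`
    );
  }
  return { passed: failures.length === 0, failures };
}

// Invented limits for illustration.
const config = { policy: { maxTier: "complex", maxCostUsd: 0.001 } };
const overTier = checkPolicy({ tier: "planning", costEstimate: { available: false }, config });
const overCost = checkPolicy({ tier: "simple", costEstimate: { available: true, inputCostUsd: 0.002 }, config });
const noEstimate = checkPolicy({ tier: "simple", costEstimate: { available: false, inputCostUsd: null }, config });
const argOverride = checkPolicy({ tier: "planning", costEstimate: { available: false }, config, maxTier: "planning" });
console.log(overTier.failures, overCost.failures, noEstimate.passed, argOverride.passed);
```

If a hard cost ceiling matters in CI, it may be worth also checking the decision's `warnings` for an unavailable estimate, since the policy result alone does not distinguish "under budget" from "not estimated".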
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "@ast-ai-model-router/cli",
-  "version": "
-  "description": "AST-
+  "version": "1.1.0",
+  "description": "AST-based Claude Code and Codex model router with token-cost estimates, CI policy checks, and explainable AI coding-agent model selection.",
   "type": "module",
   "bin": {
     "ast-ai-model-router": "bin/ast-ai-model-router.js",
@@ -28,12 +28,20 @@
   },
   "keywords": [
     "llm",
+    "ai-coding-agent",
     "token-economics",
+    "llm-cost-optimization",
     "model-router",
+    "model-selection",
     "claude-code",
+    "claude-code-model-router",
     "codex",
+    "codex-cli",
+    "codex-model-router",
     "ast",
-    "prompt-cost"
+    "prompt-cost",
+    "developer-tools",
+    "ci"
   ],
   "author": {
     "name": "Faraazuddin Mohammed",
@@ -51,7 +59,8 @@
     "provenance": true
   },
   "dependencies": {
-    "@babel/parser": "^7.28.5"
+    "@babel/parser": "^7.28.5",
+    "@tokenometer/core": "^1.1.0"
   },
   "devDependencies": {
     "@changesets/cli": "^2.31.0"
package/skills/model-router/SKILL.md
CHANGED
@@ -1,6 +1,6 @@
 ---
 name: model-router
-description: Use AST AI Model Router to choose an appropriate Claude or Codex model from task
+description: Use AST AI Model Router to choose an appropriate Claude Code or Codex model from task, code complexity, token-cost, and policy signals before launching or recommending an agent command.
 ---
 
 # Model Router
@@ -21,7 +21,19 @@ or:
 ast-ai-model-router analyze --agent claude --task "<task>"
 ```
 
-2.
+2. When the user needs rationale, use:
+
+```bash
+ast-ai-model-router explain --agent codex --task "<task>"
+```
+
+3. For CI or production policy checks, use:
+
+```bash
+ast-ai-model-router ci --agent codex --task "<task>" --max-tier complex
+```
+
+4. If the user wants execution, launch through the wrapper:
 
 ```bash
 ast-ai-model-router run codex --task "<task>" -- <codex args>
@@ -31,10 +43,28 @@ ast-ai-model-router run codex --task "<task>" -- <codex args>
 ast-ai-model-router run claude --task "<task>" -- <claude args>
 ```
 
-
+5. If the user wants every new prompt routed independently, use the gateway:
+
+```bash
+ast-ai-model-router gateway codex -- <codex args>
+```
+
+```bash
+ast-ai-model-router gateway claude -- <claude args>
+```
+
+For a single routed turn:
+
+```bash
+ast-ai-model-router gateway codex --once --task "<task>" -- <codex args>
+```
+
+6. Explain the selected model in token-economics terms: simple tasks should use faster/cheaper models; complex migrations, security-sensitive work, and architecture planning should use stronger models.
 
 ## Notes
 
 - Codex model names are discovered dynamically from `codex debug models`.
 - Claude model names use aliases: `haiku`, `sonnet`, `opus`, and `opusplan`.
+- Token-cost estimates use Tokenometer when a selected model maps cleanly to a known provider model.
 - The router analyzes JavaScript/TypeScript with Babel ASTs and Python with stdlib `ast`.
+- Gateway mode routes prompts entered through `ast-ai-model-router gateway`; it does not invisibly intercept an already-running Claude Code or Codex TUI.