aiforcecli 0.3.0 → 0.4.0
- package/README.md +12 -10
- package/aiforcecli.config.example.json +3 -1
- package/dist/cli.js +1 -1
- package/dist/index.js +1 -1
- package/package.json +4 -1
package/README.md
CHANGED
@@ -2,9 +2,10 @@
 
 **One interface over every coding agent — that you can trust and afford.**
 
-`aiforcecli` wraps multiple coding agents — **Claude Code**, **OpenAI Codex**,
-
-
+`aiforcecli` wraps multiple coding agents — **Claude Code**, **OpenAI Codex**, **Aider**
+(which itself routes to DeepSeek, Mistral/Codestral, Qwen, Ollama, OpenRouter, and more), and
+**Antigravity** (Google's `agy`, for Gemini models) — behind a single normalized interface, and
+adds the things a single-agent CLI structurally can't:
 
 - 🧭 **Advise** — instantly recommend the best agent+model for a task (public benchmarks blended with *your* results), before you spend a cent.
 - 🏁 **Race** — run several agents in parallel in isolated git worktrees, **verify each against your tests**, and apply the winner.
@@ -44,8 +45,8 @@ That's it — you're productive on day one.
 
 | Command | What it does |
 | --- | --- |
-| `aiforcecli run "<task>"` | Run a task with an agent. With no `--agent`, `aiforcecli` **auto-routes** to the best agent+model for the task within budget. Add `--heal` to **verify and self-heal** (see below). Flags: `--agent claude-code\|codex`, `--model <m>`, `--cwd <path>`, `--budget <usd>`, `--explain`, `--resume <sessionId>`, `--json`, `--heal`, `--max-attempts <n>`, `--verify <cmd>`, `--skip-verify`. |
-| `aiforcecli advise "<task>"` | **Recommend** which agent + model to use — instantly, without running anything. Blends a public benchmark scorecard with your private outcomes. `--cwd <path>`, `--json`. |
+| `aiforcecli run "<task>"` | Run a task with an agent. With no `--agent`, `aiforcecli` **auto-routes** to the best agent+model for the task within budget. Add `--heal` to **verify and self-heal** (see below). Flags: `--agent claude-code\|codex\|aider\|antigravity`, `--model <m>`, `--cwd <path>`, `--budget <usd>`, `--route deterministic\|bayesian`, `--explore`/`--no-explore` (bayesian), `--explain`, `--resume <sessionId>`, `--json`, `--heal`, `--max-attempts <n>`, `--verify <cmd>`, `--skip-verify`. |
+| `aiforcecli advise "<task>"` | **Recommend** which agent + model to use — instantly, without running anything. Blends a public benchmark scorecard with your private outcomes. `--cwd <path>`, `--budget <usd>` (preview under a cap), `--explore`, `--json`. |
 | `aiforcecli race "<task>"` | Run the task on **several agents in parallel** (isolated git worktrees), verify each, and apply the winner. Flags: `--agents claude-code,codex`, `--budget <usd>` (total, split across agents), `--select cheapest\|fastest\|first-pass`, `--verify <cmd>`, `--keep`, `--json`. |
 | `aiforcecli eval` | Run the **private eval suite** to calibrate `advise` on your codebase. `--dir <path>`, `--agents <ids>`, `--json`. |
 | `aiforcecli bench` | **Leaderboard** from your local history: which agent/model wins which task class, per dollar. `--since 7d\|24h\|<ISO>`, `--by-task`, `--json`. |
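The `advise` row says public benchmark scores are blended with private outcomes, and the configuration table in this README lists a `privatePseudocount` (default `5`) described as prior strength. The diff does not show the scoring code, so the following is only a minimal sketch of one common way such a blend can work: the public pass rate is treated as `pseudocount` imaginary runs. The names `PrivateOutcomes` and `blendedPassRate` are hypothetical and are not part of aiforcecli's API.

```typescript
// Hypothetical sketch: NOT aiforcecli's actual scoring code.
interface PrivateOutcomes {
  passes: number; // locally recorded runs that passed verification
  runs: number;   // all locally recorded runs for this agent+model
}

/**
 * Blend a public benchmark pass rate with local results.
 * The public rate counts as `pseudocount` imaginary runs, so it dominates
 * while local data is sparse and fades as local runs accumulate.
 */
function blendedPassRate(
  publicRate: number,
  local: PrivateOutcomes,
  pseudocount: number = 5, // assumed to mirror the documented default
): number {
  if (publicRate < 0 || publicRate > 1) {
    throw new RangeError("publicRate must be in [0, 1]");
  }
  if (pseudocount <= 0) {
    throw new RangeError("pseudocount must be positive");
  }
  if (local.passes < 0 || local.passes > local.runs) {
    throw new RangeError("passes must be between 0 and runs");
  }
  return (pseudocount * publicRate + local.passes) / (pseudocount + local.runs);
}
```

With no local history the result equals the public rate; after many local runs it approaches the locally observed pass rate.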
@@ -233,8 +234,9 @@ leaderboard (off by default; never sends prompts, paths, or output).
 (Claude Code reports `total_cost_usd`). When a CLI reports tokens only (Codex), cost is
 **computed** from a local pricing table — `costSource` records which, so reports are
 never silently wrong.
-- **
-meets
+- **Window caps (`dailyCapUsd`, `weeklyCapUsd`, `monthlyCapUsd`):** checked *before* a run
+starts. If spend in any window already meets its cap (today / this calendar week from Monday /
+this calendar month), the run is refused (exit code `2`). They also tighten the routing budget.
 - **Per-run cap (`maxCostPerRunUsd`, or `--budget`):** enforced *during* the run. As usage
 events arrive, if cumulative cost crosses the cap the subprocess is aborted
 (SIGTERM → SIGKILL).
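The window-cap rule added in this hunk (refuse to start once spend in the current day, calendar week from Monday, or calendar month meets its cap) can be illustrated with a short sketch. This is not the package's implementation: `SpendRecord`, `windowStarts`, and `exhaustedCap` are hypothetical names, and computing window boundaries in UTC is an assumption, since the README text does not say which time zone applies.

```typescript
// Hypothetical sketch: NOT aiforcecli's actual budget code.
interface SpendRecord { at: Date; costUsd: number; }
interface WindowCaps {
  dailyCapUsd?: number;
  weeklyCapUsd?: number;
  monthlyCapUsd?: number;
}

// Start of today, of this calendar week (Monday), and of this calendar month.
// Assumption: boundaries are computed in UTC.
function windowStarts(now: Date): { day: Date; week: Date; month: Date } {
  const y = now.getUTCFullYear();
  const m = now.getUTCMonth();
  const d = now.getUTCDate();
  const daysSinceMonday = (now.getUTCDay() + 6) % 7; // getUTCDay(): 0 = Sunday
  return {
    day: new Date(Date.UTC(y, m, d)),
    week: new Date(Date.UTC(y, m, d - daysSinceMonday)), // Date.UTC normalizes day <= 0
    month: new Date(Date.UTC(y, m, 1)),
  };
}

// Total recorded cost at or after `start`.
function spentSince(history: SpendRecord[], start: Date): number {
  return history
    .filter((r) => r.at.getTime() >= start.getTime())
    .reduce((sum, r) => sum + r.costUsd, 0);
}

// Returns the first cap whose window spend already meets it, or null if the
// run may start. A caller could map a non-null result to exit code 2.
function exhaustedCap(
  history: SpendRecord[],
  caps: WindowCaps,
  now: Date,
): keyof WindowCaps | null {
  const s = windowStarts(now);
  const checks: Array<[keyof WindowCaps, Date]> = [
    ["dailyCapUsd", s.day],
    ["weeklyCapUsd", s.week],
    ["monthlyCapUsd", s.month],
  ];
  for (const [key, start] of checks) {
    const cap = caps[key];
    if (cap !== undefined && spentSince(history, start) >= cap) return key;
  }
  return null;
}
```

The check uses `>=` because the README says a run is refused when spend "meets" the cap, not only when it exceeds it. Caps that are left unset are skipped.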
@@ -261,16 +263,16 @@ is separate from config — installing it once is enough; settings come from the
 
 | Block | Key | Default | Purpose |
 | --- | --- | --- | --- |
-| `agents.<id>` | `model`, `bin`, `defaultFlags`, `allowedTools` |
+| `agents.<id>` | `enabled`, `model`, `bin`, `defaultFlags`, `allowedTools` | `enabled:true` | Per-agent settings. `enabled:false` removes the agent everywhere. Plus binary path, forced model, default flags, tool allow-list. |
 | `defaultAgent` | | `claude-code` | Agent used by `run` when routing is off and no `--agent`. |
-| `routing` | `enabled`, `prefer`, `only`, `models` | `enabled:true` |
+| `routing` | `enabled`, `strategy`, `prefer`, `only`, `models` | `enabled:true`, `strategy:deterministic` | Auto-routing. `strategy`: `deterministic` (budget-aware tier router) or `bayesian` (learned engine; honors `advise.explore`). `only` restricts the catalog, `models` adds entries. Per-run override: `run --route`. |
 | `verify` | `enabled`, `command`, `timeoutMs` | `enabled:true`, 300 s | How results are verified; `command` overrides auto-detection. |
 | `heal` | `enabled`, `maxAttempts`, `escalate` | `false`, `3`, `true` | Self-healing loop behavior for `run --heal`. |
 | `race` | `agents`, `select`, `keepWorktrees`, `linkPaths` | `select:cheapest` | Defaults for `race`; `linkPaths` symlinks extra dep dirs into worktrees. |
 | `advise` | `weights`, `privatePseudocount`, `explore` | `0.7/0.2/0.1`, `5`, `false` | Recommendation scoring weights, prior strength, and Thompson-Sampling exploration. |
 | `eval` | `dir`, `models` | `.aiforcecli/evals` | Private eval suite location and which catalog models to evaluate. |
 | `telemetry` | `enabled`, `endpoint` | `false` | Opt-in anonymized outcome upload for the shared leaderboard. |
-| `budgets` | `maxCostPerRunUsd`, `dailyCapUsd` | — | Per-run
+| `budgets` | `maxCostPerRunUsd`, `dailyCapUsd`, `weeklyCapUsd`, `monthlyCapUsd` | — | Per-run cap + daily/weekly/monthly spend caps (refuse to start once a window cap is hit). |
 | top-level | `timeoutMs`, `inactivityTimeoutMs` | 600 s, 120 s | Wall-clock and inactivity watchdogs for every run. |
 
 ## Development
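To show how the keys added in 0.4.0 (`agents.<id>.enabled`, `routing.strategy`, `budgets.weeklyCapUsd`, `budgets.monthlyCapUsd`) might sit together, here is a rough typed sketch. The key names come from the table above; the type `AiforceConfigSketch`, the helper `enabledAgents`, and every value in `example` are illustrative assumptions. The real config file's exact shape is not shown in this diff.

```typescript
// Hypothetical sketch of a config subset; key names follow the README table,
// everything else (type names, values) is assumed for illustration.
type RoutingStrategy = "deterministic" | "bayesian";

interface AgentSettings {
  enabled?: boolean; // documented default: true
  model?: string;
  bin?: string;
}

interface AiforceConfigSketch {
  agents?: Record<string, AgentSettings>;
  routing?: { enabled?: boolean; strategy?: RoutingStrategy };
  advise?: { explore?: boolean };
  budgets?: {
    maxCostPerRunUsd?: number;
    dailyCapUsd?: number;
    weeklyCapUsd?: number;
    monthlyCapUsd?: number;
  };
}

const example: AiforceConfigSketch = {
  agents: {
    "claude-code": {},                 // enabled by default
    aider: { enabled: true },
    antigravity: { enabled: false },   // removed from routing, race, etc.
  },
  routing: { enabled: true, strategy: "bayesian" },
  advise: { explore: true },           // honored by the bayesian strategy
  budgets: { maxCostPerRunUsd: 1, dailyCapUsd: 5, weeklyCapUsd: 20, monthlyCapUsd: 60 },
};

// Agents listed in the config that are not explicitly disabled.
function enabledAgents(cfg: AiforceConfigSketch): string[] {
  return Object.entries(cfg.agents ?? {})
    .filter(([, settings]) => settings.enabled !== false)
    .map(([id]) => id)
    .sort();
}
```

Treating a missing `enabled` as `true` matches the documented default, so only an explicit `enabled: false` drops an agent.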