cascade-ai 0.4.0 → 0.9.7

package/README.md CHANGED
@@ -1,17 +1,44 @@
1
1
  # ◈ Cascade AI
2
2
 
3
- > Multi-tier AI orchestration CLI built for developers who think in systems.
3
+ > **One prompt, an organization of AI agents that plan, delegate, and execute in parallel.**
4
+ > Auto-routed to the cheapest model that's best at each step. **Up to 90% cheaper** than running everything on one frontier model.
4
5
 
5
- Cascade is an open-source CLI tool that runs your prompts through a hierarchical three-tier agent system (T1 → T2 → T3), automatically routing work across the best available models, executing tools, and compiling a single coherent result. Inspired by Claude Code, Gemini CLI, and GitHub Copilot CLI — but uniquely structured around orchestration.
6
+ [![npm](https://img.shields.io/npm/v/cascade-ai?color=aaff00&label=npm)](https://www.npmjs.com/package/cascade-ai)
7
+ [![license](https://img.shields.io/badge/license-MIT-aaff00.svg)](LICENSE)
8
+ [![node](https://img.shields.io/badge/node-%E2%89%A520-5AB4E8.svg)](#installation)
9
+ [![providers](https://img.shields.io/badge/providers-6-a78bff.svg)](#ai-providers)
10
+ [![PRs welcome](https://img.shields.io/badge/PRs-welcome-f5a623.svg)](CONTRIBUTING.md)
11
+
12
+ Cascade is an open-source CLI that runs your prompt through a hierarchical three-tier agent system — **T1 plans → T2 manages → T3 executes** — auto-routing each step to the best-value model, running tools, and compiling one coherent result. Think Claude Code / Gemini CLI / Copilot CLI, but uniquely built around **orchestration**.
6
13
 
7
14
  ```
8
15
  cascade "Refactor the auth module to use JWT, add tests, and open a PR"
9
16
  ```
10
17
 
18
+ ## ✨ Highlights
19
+
20
+ - 🧠 **Live benchmark Auto-routing** — set a tier to `Auto` and Cascade fuses *live* public benchmark scores with *live* pricing to pick the best-**value** model for each task.
21
+ - 🤖 **Autonomous mode** (`/auto`) — hands-off runs: safe tools run silently, dangerous ones still ask, budget caps stay the hard stop.
22
+ - 📋 **Boardroom plan review** — pause to review, **edit**, or steer T1's plan (with an AI reviewer's critique) before any worker spawns.
23
+ - ⏯️ **Run resumability** (`/continue`) — hit the budget cap on a big task? Resume from the partial state instead of redoing it.
24
+ - 👥 **Workers recruit help** — a worker can ask its manager to spawn bounded sibling workers when the work fans out — dynamic parallelism, no rigid plan.
25
+ - 💸 **Delegation savings** — every run shows what the hierarchy saved you (`saved $5.63 — 90% vs. all-T1`); no flat-agent tool can show this number.
26
+ - 🛡️ **Safe by default** — permission escalation (T3→T2→T1→you), SSRF-guarded fetch, loopback-only dashboard, and a budget kill-switch.
27
+
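As a rough illustration of what "best-value" could mean in the Auto-routing highlight, the sketch below ranks candidate models by benchmark score per dollar, subject to a minimum quality floor. The README does not document the actual scoring formula, so `pickBestValue`, the `score` and `pricePerMTok` fields, and the `minScore` floor are all hypothetical.

```javascript
// Hypothetical sketch: rank candidates by benchmark score per dollar.
// Field names and the minScore floor are assumptions, not Cascade's real API.
function pickBestValue(candidates, minScore = 0) {
  let best = null;
  for (const model of candidates) {
    if (model.score < minScore) continue; // too weak for this step
    // Free (e.g. local) models get infinite value; ties fall back to raw score.
    const value = model.pricePerMTok > 0 ? model.score / model.pricePerMTok : Infinity;
    if (
      best === null ||
      value > best.value ||
      (value === best.value && model.score > best.model.score)
    ) {
      best = { model, value };
    }
  }
  return best ? best.model : null;
}

// Made-up numbers, purely for illustration.
const candidates = [
  { id: 'frontier-large', score: 90, pricePerMTok: 15 },
  { id: 'mid-tier', score: 80, pricePerMTok: 1 },
  { id: 'tiny', score: 40, pricePerMTok: 0.1 },
];

console.log(pickBestValue(candidates, 70).id);
```

A quality floor matters here: without it, a pure score-per-dollar ranking always drifts toward the cheapest model, however weak it is.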
28
+ ## Why Cascade is one of a kind
29
+
30
+ Other AI CLIs run a single agent. Cascade runs a visible **organization** — and the terminal shows you the org at work:
31
+
32
+ - **Delegation savings** — the status bar and every run receipt show what the hierarchy saved you (`$0.031 · saved $0.094 — 75% vs. all-T1`), because cheap local T3 workers do the heavy lifting while a premium T1 model only administrates. No flat-agent tool can show this number.
33
+ - **Agent comms feed** (`/comms`) — live radio chatter between workers: peer messages, broadcasts, file locks, barrier syncs. No other CLI has agent-to-agent communication at all, let alone on screen.
34
+ - **`/why`** — every run can explain itself: the complexity verdict and the classifier's reasoning, which model served each tier, failovers, and escalations.
35
+ - **The boardroom** (`planApproval: "always"`) — Complex runs pause so you can approve T1's proposed org chart and budget ("3 managers · 7 workers · est. $0.40") before anything spawns. You sit above T1.
36
+
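The receipt quoted above (`$0.031 · saved $0.094 — 75% vs. all-T1`) is consistent with a simple ratio: savings divided by what the run would have cost on T1 alone. The sketch below reproduces that arithmetic. How Cascade estimates the all-T1 cost is not stated in the README, so here it is simply an input, and `delegationSavings` is a hypothetical name.

```javascript
// Savings relative to a hypothetical run where every step went through T1.
// Both arguments are in dollars; allT1Cost is assumed to be known.
function delegationSavings(actualCost, allT1Cost) {
  const saved = allT1Cost - actualCost;
  const percent = allT1Cost > 0 ? Math.round((saved / allT1Cost) * 100) : 0;
  return { saved, percent };
}

// $0.031 actual + $0.094 saved implies an all-T1 cost of $0.125.
const receipt = delegationSavings(0.031, 0.125);
console.log(`$0.031 · saved $${receipt.saved.toFixed(3)} (${receipt.percent}%) vs. all-T1`);
```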
11
37
  ---
12
38
 
13
39
  ## Table of Contents
14
40
 
41
+ - [What's New](#whats-new)
15
42
  - [How It Works](#how-it-works)
16
43
  - [Features](#features)
17
44
  - [Installation](#installation)
@@ -34,6 +61,66 @@ cascade "Refactor the auth module to use JWT, add tests, and open a PR"
34
61
 
35
62
  ---
36
63
 
64
+ ## What's New
65
+
66
+ ### v0.6 → v0.9.1 — the agentic releases
67
+ - **v0.9.1 — Workers recruit help.** A T3 worker that discovers its task should fan out can call `request_workers` to have its T2 spawn bounded sibling workers (no recursive 4th tier; depth-capped + budget-bounded).
68
+ - **v0.9.0 — Resumability, reflection, smarter local exec.** `/continue` resumes a budget-capped task from its partial state; opt-in **reflection** revises a worker's output against the goal; `t3Execution: auto` runs T3 waves sequentially on local/Ollama tiers and parallel on cloud.
69
+ - **v0.8.0 — Autonomous mode + smarter re-planning.** `/auto` for hands-off runs (safe tools auto-approve, dangerous still gated); T1's reviewer **stops early** when a corrective pass isn't converging; new `/plan` (preview a decomposition) and `/replan`.
70
+ - **v0.7.0 — Boardroom plan review.** Iterative revision (steer → re-plan → re-ask), an AI **plan reviewer**, inline **editable** plans, and a wider gate that can pause Moderate runs too.
71
+ - **v0.6.0 — Live benchmark Auto-routing + fixes.** `Auto` picks the best-value model per task from live public benchmarks + live OpenRouter pricing, with live provider model discovery. Plus the Gemini stale-id 404 self-heal, the Ink-6 paste fix, and run-hang timeouts.
72
+
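The `t3Execution: auto` behavior described in the v0.9.0 entry (sequential waves on local tiers, parallel on cloud) can be pictured with the sketch below. Only the setting name and the local-versus-cloud split come from the changelog; the function names, the `isLocal` flag, and the `'sequential'` / `'parallel'` values are assumptions.

```javascript
// Hypothetical sketch: resolve `t3Execution` to a concrete mode, then run a wave.
function resolveT3Execution(setting, isLocal) {
  if (setting === 'auto') return isLocal ? 'sequential' : 'parallel';
  return setting; // an explicit setting is passed through unchanged
}

// tasks is an array of functions that each return a promise.
async function runWave(tasks, mode) {
  if (mode === 'parallel') {
    return Promise.all(tasks.map((task) => task()));
  }
  const results = [];
  for (const task of tasks) {
    results.push(await task()); // one worker at a time, e.g. a single local GPU
  }
  return results;
}
```

Running sequentially on a local model avoids several workers contending for the same hardware, which is presumably why the default differs by tier.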
73
+ <details><summary>Earlier — the visible organization + a flicker-free TUI (v0.5.x)</summary>
74
+
75
+ ### The visible organization + a flicker-free TUI
76
+ - **Delegation savings counter** — live `saved $X (Y%) vs. all-T1` in the StatusBar and `/cost`, plus a one-line receipt after every run (duration · managers · workers · cost · savings).
77
+ - **Agent comms feed** — `/comms` toggles a live ticker of PeerBus traffic (peer messages, broadcasts, file locks, barriers). The events always existed for the web dashboard; the terminal now shows them too.
78
+ - **`/why`** — prints the decision trail for the last run: complexity verdict with the classifier's reason (or which heuristic short-circuited), models per tier, Cascade Auto picks, provider failovers, and escalations.
79
+ - **Boardroom plan approval** — with `planApproval: "always"`, Complex runs pause after T1 plans so you can approve the org chart + estimated cost before any T2 spawns. SDK/headless auto-approve, so default behavior is unchanged.
80
+ - **Flicker fix** — the live area now always fits the viewport (per-panel row budgets, terminal-resize handling, capped panels), which stops Ink's full-screen redraw fallback — the root cause of flicker in long sessions on small/maximized terminals.
81
+ - **Native mouse selection works** — idle repaints no longer wipe an in-progress drag-select; the completed agent tree collapses on your next keystroke instead of an 8s timer. `/copy [n]` copies a response via native clipboard tools with an OSC 52 escape fallback (works over SSH).
82
+ - **`--alt-screen`** — opt-in vim-style alternate-screen mode: flicker-proof by construction, shell restored on exit (even on crashes); history scrolls in-app with PgUp/PgDn.
83
+ - **Ink 6.8 + React 19** — renderer upgrade; Node.js floor rises to **20** (18 is EOL).
84
+
85
+ ### v0.5.7 — Security hardening pass
86
+ A focused security review of the tool and dashboard surface. All changes are covered by tests (`tsc --noEmit` clean, full suite green).
87
+ - **Dashboard binds to loopback by default** — the server previously listened on all interfaces (`0.0.0.0`), exposing `POST /api/run` (which executes a prompt through the full shell/file/code-interpreter tool set) to the local network. It now binds to `127.0.0.1` via the new `dashboard.host` config field; binding to a public interface requires opting in and prints a warning (louder still if `dashboard.auth` is off).
88
+ - **SSRF protection for `web_fetch`** — agent-supplied URLs are validated against a new SSRF-safe fetch helper: http/https only, hostnames resolved and rejected if they map to loopback / link-local (cloud metadata `169.254.169.254`) / private / CGNAT ranges, and every redirect hop re-validated. Set `CASCADE_ALLOW_LOCAL_FETCH=1` to fetch local URLs. The runtime tool-creator sandbox's `fetch` uses the same guard.
89
+ - **`file_edit` and `git` now require approval** — approval is gated by an allowlist that previously omitted both, so in-place file edits and `git commit`/`checkout`/`push` ran with no prompt while `file_write`/`file_delete` were gated. Both are now in the default approval set.
90
+ - **Code interpreter argument injection fixed** — `run_code` now executes via `execFile` with an argv array instead of interpolating arguments into a shell string, so a crafted `args` value can no longer break out into a second command. Temp scripts are written under the workspace root.
91
+ - **Dashboard JWT pinned to HS256** on both sign and verify (defense-in-depth against algorithm-confusion).
92
+ - **Broadened shell dangerous-command patterns** — the built-in blocklist now tolerates flag reordering / extra whitespace (`rm -fr /`, `rm -rf /`) and catches a fork-bomb form. This is defense-in-depth; the approval prompt remains the real gate.
93
+
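To make the SSRF entry above more concrete, here is a minimal sketch of the address check it describes: allow only http/https, and reject IPv4 addresses in loopback, link-local, private, or CGNAT ranges. It assumes the hostname has already been resolved to an IPv4 address, and it leaves out IPv6 and the per-redirect re-validation. The function names are hypothetical and this is not Cascade's actual helper.

```javascript
// Blocked IPv4 ranges as [base address, prefix length].
const BLOCKED_V4_RANGES = [
  ['10.0.0.0', 8],     // private (RFC 1918)
  ['100.64.0.0', 10],  // CGNAT (RFC 6598)
  ['127.0.0.0', 8],    // loopback
  ['169.254.0.0', 16], // link-local, includes cloud metadata 169.254.169.254
  ['172.16.0.0', 12],  // private (RFC 1918)
  ['192.168.0.0', 16], // private (RFC 1918)
];

function ipv4ToInt(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new TypeError(`not an IPv4 address: ${ip}`);
  }
  const [a, b, c, d] = parts.map(Number);
  // >>> 0 keeps the result an unsigned 32-bit integer.
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

function isBlockedIPv4(ip) {
  const addr = ipv4ToInt(ip);
  return BLOCKED_V4_RANGES.some(([base, bits]) => {
    const mask = (0xffffffff << (32 - bits)) >>> 0;
    return ((addr & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
  });
}

function isAllowedScheme(url) {
  const { protocol } = new URL(url);
  return protocol === 'http:' || protocol === 'https:';
}
```

The check has to run on the resolved address rather than the hostname, and again on every redirect hop, because a public hostname can resolve or redirect to an internal address.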
94
+ ### v0.5.6 — Wizard scrollable model list + chat scrollback + slash panel fix
95
+ - **Init wizard tier-model picker** — added `limit={8}` to the `SelectInput` so long model lists scroll with ↑/↓ indicators instead of overflowing off-screen.
96
+ - **Chat scrolling restored** — the REPL was still enabling mouse reporting on mount, which captured wheel events and broke the terminal's native scrollback (where Ink `<Static>` messages live since v0.5.4). The on-mount sequence now actively disables mouse reporting, so mouse-wheel-up scrolls the terminal scrollback as expected.
97
+ - **Slash-command suggestion panel** — long descriptions were wrapping to a second line and squishing two entries onto one row. Added `wrap="truncate"` on the description text and bumped the fixed panel height by one row to fit the worst-case content (header + 8 entries + both ↑/↓ indicators).
98
+
99
+ ### v0.5.4 — Maximized-terminal flicker fix + orchestrator resilience
100
+ - **Static-based conversation rendering** — completed messages now go to the terminal's native scrollback via Ink `<Static>`; only the live area (status bar, streaming tail, agent tree, input) re-renders per batch. Effectively eliminates the maximized-window flicker on cmd / PowerShell.
101
+ - **`tier:status` throttle** (100 ms) + `React.memo` on AgentTree / StatusBar / HintBar to cut per-event re-render churn.
102
+ - **Auto-clear agent tree** — the tree auto-hides 8 s after a task completes (preserves conversation and cost data); cancelled if a new task starts.
103
+ - **T3 critical-error detection** — rate-limit / auth / forbidden errors now short-circuit the agent loop via a typed `CriticalToolError`; the worker no longer loops 15× on a 429.
104
+ - **T3 stall preserves partial output** via a typed `WorkerStallError` instead of throwing a bare `Error`.
105
+ - **T1 failure summary** — when all sections fail, the user sees the actual root cause (e.g. `[CRITICAL_TOOL_ERROR] grep: 429 Rate limit reached for gpt-5.4-mini`) instead of a generic "all sections encountered errors".
106
+
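The T3 critical-error entry above describes a typed error that stops a worker from retrying a hopeless call. A minimal sketch of that pattern follows. Only the class name `CriticalToolError` and the 15-iteration figure come from the changelog; the status list, the `runWithRetries` helper, and the synchronous tool call are assumptions made to keep the example short.

```javascript
// Typed error that callers can recognize with instanceof.
class CriticalToolError extends Error {
  constructor(message, status) {
    super(message);
    this.name = 'CriticalToolError';
    this.status = status;
  }
}

// Assumed mapping: auth (401), forbidden (403), rate limit (429).
const CRITICAL_STATUSES = new Set([401, 403, 429]);

function runWithRetries(callTool, maxAttempts = 15) {
  let attempts = 0;
  let lastError = null;
  while (attempts < maxAttempts) {
    attempts += 1;
    try {
      return { ok: true, value: callTool(), attempts };
    } catch (err) {
      // Retrying cannot fix these, so escape the loop immediately.
      if (err && CRITICAL_STATUSES.has(err.status)) {
        throw new CriticalToolError(err.message, err.status);
      }
      lastError = err;
    }
  }
  return { ok: false, error: lastError, attempts };
}
```

Because the error is typed, an upper tier can surface the real root cause (for example a 429) instead of a generic failure summary.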
107
+ ### v0.5.3 — Headless mode and audit fixes
108
+ - **Headless `cascade run` / `-p`** — works in non-TTY contexts (CI, pipes, scripts). Progress → stderr, final answer → stdout. Tool approvals are auto-granted in headless mode.
109
+ - **`cascade models` columns** — long model IDs no longer collide with the provider column.
110
+ - **`/clear` resets cost breakdowns** — per-provider / per-tier maps are reset, not just the totals.
111
+ - **`/config`** — richer output (theme, providers, per-tier models, dashboard port, cascade-auto), guarded against an undefined `config.dashboard`.
112
+ - **Type cleanup** — removed vestigial `ReplMessage` / `ToolCallBlock` interfaces.
113
+
114
+ ### v0.5.2 — Setup wizard redesign + new tools
115
+ - **First-run setup wizard redesigned** to match the Cascade-AI TUI design — themed welcome header, phased step tabs (API Keys → Models → Complete), field boxes, tier cards, and a proper completion screen. All provider/model functionality preserved.
116
+ - **New tools** — `glob`, `grep`, and `web-fetch` available to T3 workers.
117
+ - **Model-performance tracker** — records per-model success/cost stats for scored selection when `cascadeAuto: true`.
118
+ - **Fixes** — removed an accidental `cascade-ai` self-dependency in `package.json`; corrected misleading `/tree` and `/sessions` slash-command descriptions; fixed stale T2/T3 test mocks.
119
+
120
+ </details>
121
+
122
+ ---
123
+
37
124
  ## How It Works
38
125
 
39
126
  Every task runs through three agent tiers:
@@ -118,10 +205,12 @@ User prompt
118
205
 
119
206
  ### Web Dashboard
120
207
  - Real-time agent execution graph (ReactFlow)
208
+ - **Peer communication edges** — animated dashed lines between agents as they exchange messages
209
+ - **Agent Inspector** — click any node to see live output stream and peer communications
121
210
  - Session browser with cost/token stats
122
211
  - Config viewer
123
212
  - JWT auth (password-protected)
124
- - Single-tenant and multi-tenant team modes
213
+ - URL hash routing (`#topology`, `#sessions`, `#logs`, `#settings`)
125
214
  - WebSocket live updates
126
215
 
127
216
  ---
@@ -132,7 +221,7 @@ User prompt
132
221
  npm install -g cascade-ai
133
222
  ```
134
223
 
135
- > Requires **Node.js ≥ 18**.
224
+ > Requires **Node.js ≥ 20**.
136
225
 
137
226
  ---
138
227
 
@@ -191,15 +280,22 @@ Cascade loads config from `.cascade/config.json` in your project directory.
191
280
      "browserEnabled": false
192
281
    },
193
282
    "dashboard": {
283
+     "host": "127.0.0.1",
194
284
      "port": 4891,
195
285
      "auth": true,
196
286
      "teamMode": "single"
197
287
    },
198
288
    "theme": "cascade",
199
-   "telemetry": { "enabled": false }
289
+   "telemetry": { "enabled": false },
290
+   "plugins": ["./plugins/my-tool.js"],
291
+   "planApproval": "never",
292
+   "altScreen": false
200
293
  }
201
294
  ```
202
295
 
296
+ - `planApproval: "always"` pauses Complex runs in the **boardroom**: approve T1's proposed sections, worker counts, and estimated cost before any T2 manager spawns. Headless/SDK runs auto-approve.
297
+ - `altScreen: true` (or the `--alt-screen` flag) renders the TUI in the terminal's alternate screen buffer — vim-style, flicker-proof, shell restored on exit. History scrolls in-app with PgUp/PgDn since the alt screen has no native scrollback.
298
+
203
299
  API keys are also read from environment variables:
204
300
 
205
301
  | Provider | Environment Variable |
@@ -209,6 +305,27 @@ API keys are also read from environment variables:
209
305
  | Gemini | `GOOGLE_API_KEY` |
210
306
  | Azure | `AZURE_OPENAI_KEY` |
211
307
 
308
+ ### Linking credentials from other AI CLIs
309
+
310
+ If you already use **Claude Code**, **OpenAI Codex**, **Gemini CLI**, or **GitHub Copilot CLI**, Cascade can reuse the credentials they store on your machine instead of asking you to paste keys again:
311
+
312
+ ```bash
313
+ cascade link # list detected credentials
314
+ cascade link anthropic # adopt an API key for a provider
315
+ cascade link anthropic --accept-risk # adopt a Claude Code subscription token
316
+ ```
317
+
318
+ `cascade doctor` also reports what's linkable. How each credential is treated:
319
+
320
+ | Source | Stored as | Reusable? |
321
+ |--------|-----------|-----------|
322
+ | `ANTHROPIC_API_KEY` / `OPENAI_API_KEY` / `GEMINI_API_KEY` env | API key | ✅ directly |
323
+ | Codex `~/.codex/auth.json` (API-key mode) | API key | ✅ directly |
324
+ | Claude Code `~/.claude/.credentials.json` | OAuth token | ⚠️ as an Anthropic bearer token (needs `--accept-risk`) |
325
+ | Codex ChatGPT login · Gemini CLI · Copilot CLI | vendor OAuth | ❌ detected only — locked to that vendor's backend |
326
+
327
+ > ⚠️ **Terms of service:** reusing a *subscription* OAuth token (Claude Code, ChatGPT, Copilot) outside its own CLI may violate the vendor's terms and can get your account flagged. Cascade only ever reads **your own** local files, never adopts an OAuth token without `--accept-risk`, and never transmits a credential anywhere except to that credential's own provider. Use API keys where you can.
328
+
212
329
  ### CASCADE.md
213
330
 
214
331
  Create a `CASCADE.md` in your project root to give agents project-specific instructions — just like `CLAUDE.md`. Run `cascade init` to generate a template.
@@ -312,6 +429,7 @@ cascade [options] Start interactive REPL
312
429
  cascade run <prompt> Run a single prompt and exit
313
430
  cascade init [path] Initialize Cascade in a directory
314
431
  cascade doctor Diagnose API keys, Ollama, config
432
+ cascade link [provider] Reuse credentials from Claude Code / Codex / Gemini / Copilot
315
433
  cascade update Update to the latest version
316
434
  cascade dashboard Launch the web dashboard
317
435
  ```
@@ -323,6 +441,7 @@ cascade dashboard Launch the web dashboard
323
441
  -t, --theme <name> Color theme (cascade|dark|light|dracula|nord|solarized)
324
442
  -w, --workspace <path> Workspace path (default: cwd)
325
443
  -v, --version Show version
444
+ --alt-screen Vim-style alternate screen (flicker-proof; PgUp/PgDn history)
326
445
  --no-color Disable colors
327
446
  ```
328
447
 
@@ -341,7 +460,10 @@ Type any of these inside the REPL:
341
460
  | `/model` | Interactive picker — choose provider → tier → model (or Auto) |
342
461
  | `/model-info`| Show active models per tier |
343
462
  | `/models` | Browse available models grouped by provider |
344
- | `/cost` | Toggle session cost / token usage panel |
463
+ | `/cost` | Show session cost, token usage, and delegation savings |
464
+ | `/why` | Explain how the last run was routed (complexity, models, failovers) |
465
+ | `/comms` | Toggle the live agent-to-agent comms feed |
466
+ | `/copy [n]` | Copy the last (or nth-last) response to the clipboard |
345
467
  | `/export [markdown\|json]` | Export session to file |
346
468
  | `/rollback` | Undo all file changes made in this session |
347
469
  | `/branch` | Fork the session into parallel branches |
@@ -350,6 +472,8 @@ Type any of these inside the REPL:
350
472
  | `/sessions` | List and resume past sessions |
351
473
  | `/status` | Show live agent tree status |
352
474
 
475
+ > **Selection & copy:** mouse capture stays off, so native drag-select and right-click copy work in your terminal. When idle, the screen never repaints under you; `/copy` covers the one case selection can't — grabbing text while output is still streaming (with an OSC 52 fallback that works over SSH).
476
+
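For readers unfamiliar with the OSC 52 fallback mentioned above: it is a terminal escape sequence that asks the terminal emulator itself to set the clipboard, which is why it works over SSH. The sketch below builds such a sequence. It is a minimal illustration, not Cascade's implementation, and `osc52` is a hypothetical name.

```javascript
// Build an OSC 52 "set clipboard" sequence: ESC ] 52 ; c ; <base64> BEL.
// "c" selects the system clipboard; the payload must be base64-encoded.
function osc52(text) {
  const payload = Buffer.from(text, 'utf8').toString('base64');
  return `\x1b]52;c;${payload}\x07`;
}

// Usage: write the sequence to the terminal.
// process.stdout.write(osc52('hello'));
```

Support varies: some terminals disable OSC 52 by default or cap the payload size, so it makes sense as a fallback behind native clipboard tools rather than the primary path.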
353
477
  ---
354
478
 
355
479
  ## Themes
@@ -635,6 +759,11 @@ web/
635
759
  | ✓ | Hooks system |
636
760
  | ✓ | Scheduler + notifications |
637
761
  | ✓ | SDK |
762
+ | ✓ | Plugin loading from config |
763
+ | ✓ | Auto model specialization discovery |
764
+ | ✓ | T3 text-tool fallback (Ollama support) |
765
+ | ✓ | Peer communication visualization in dashboard |
766
+ | ✓ | Conversational fast-path (bypass T1 for simple prompts) |
638
767
  | 🔜 | VSCode extension (`cascade-vscode`) |
639
768
  | 🔜 | JetBrains extension (`cascade-jetbrains`) |
640
769
  | 🔜 | Cascade Cloud (hosted dashboard) |
@@ -658,7 +787,30 @@ web/
658
787
  ```bash
659
788
  git clone https://github.com/Varun-SV/Cascade-AI.git
660
789
  cd Cascade-AI
661
- npm install # installs root + web via npm workspaces
790
+ npm install # CLI dependencies (uses the committed package-lock.json)
791
+ npm --prefix web install # web dashboard dependencies (needed by `npm run build`)
792
+ npm run build
793
+ ```
794
+
795
+ ### Upgrading an existing checkout (v0.5.7+: Ink 6 / React 19)
796
+
797
+ v0.5.7 moved from Ink 5 / React 18 to **Ink 6.8 / React 19** and raised the
798
+ Node.js floor to **20**. The repo now commits `package-lock.json`, so after a
799
+ pull a plain `npm install` upgrades even a stale `node_modules` in place —
800
+ then rebuild with `npm run build` so `dist/` matches the source (the CLI warns
801
+ on startup when it detects a stale build).
802
+
803
+ If `git pull` refuses because your old untracked `package-lock.json` would be
804
+ overwritten, or `npm install` still reports `ERESOLVE` (this happens on
805
+ checkouts that predate the committed lockfile — npm keeps the installed
806
+ `react@18` in place while `ink@6` needs `react>=19`), do a clean install:
807
+
808
+ ```bash
809
+ rm -rf node_modules web/node_modules package-lock.json web/package-lock.json
810
+ git pull
811
+ npm install
812
+ npm --prefix web install
813
+ npm run build
662
814
  ```
663
815
 
664
816
  ### Development commands
package/bin/cascade.js CHANGED
@@ -1,3 +1,16 @@
1
1
  #!/usr/bin/env node
2
2
  // Cascade AI — Entry point
3
- import '../dist/cli.js';
3
+ try {
4
+   await import('../dist/cli.js');
5
+ } catch (err) {
6
+   if (err?.code === 'ERR_MODULE_NOT_FOUND') {
7
+     const missingDist = /dist[\\/]cli\.js/.test(String(err?.message ?? ''));
8
+     console.error(
9
+       missingDist
10
+         ? 'Cascade build output is missing (dist/cli.js).\nRun: npm install && npm run build'
11
+         : `A dependency failed to load — node_modules may be missing or out of date.\nRun: npm install && npm run build\n\n${err.message}`,
12
+     );
13
+     process.exit(1);
14
+   }
15
+   throw err;
16
+ }
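One thing worth noting about the new entry point: Node's `ERR_MODULE_NOT_FOUND` messages usually name both the missing module and the importing file ("Cannot find package 'x' imported from /path/dist/cli.js"). The regex in the hunk tests the whole message, so a dependency that is missing but imported directly from `dist/cli.js` would likely be reported as a missing build. The sketch below is a stricter variant that only inspects the part of the message before " imported from ". It is not the shipped code; `classifyLoadFailure` and its return values are hypothetical.

```javascript
// Classify a failed dynamic import of the CLI bundle.
// Returns 'missing-build', 'missing-dependency', or null (caller should rethrow).
function classifyLoadFailure(err) {
  if (err?.code !== 'ERR_MODULE_NOT_FOUND') return null;
  const message = String(err?.message ?? '');
  // Keep only the missing-module part, dropping the importer's path.
  const target = message.split(' imported from ')[0];
  return /dist[\\/]cli\.js/.test(target) ? 'missing-build' : 'missing-dependency';
}
```

Returning a label instead of calling `process.exit` keeps the branching testable; the entry point can then map each label to the same messages it prints today.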