npm - cascade-ai - Versions diffs - 0.4.0 → 0.9.7 - Mend

cascade-ai 0.4.0 → 0.9.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24)

package/README.md +159 -7
package/bin/cascade.js +14 -1
package/dist/cli.cjs +5390 -1298
package/dist/cli.cjs.map +1 -1
package/dist/cli.js +5317 -1225
package/dist/cli.js.map +1 -1
package/dist/index.cjs +3431 -550
package/dist/index.cjs.map +1 -1
package/dist/index.d.cts +713 -118
package/dist/index.d.ts +713 -118
package/dist/index.js +3425 -545
package/dist/index.js.map +1 -1
package/dist/keytar-VMICNFEJ.node +0 -0
package/package.json +11 -12
package/web/dist/assets/index-C6Nd1mOj.css +1 -0
package/web/dist/assets/index-DFRrUnoJ.js +246 -0
package/web/dist/assets/react-BP1N17hq.js +1 -0
package/web/dist/assets/reactflow-Clz8xC7C.js +33 -0
package/web/dist/index.html +3 -3
package/dist/keytar-F4YAPN53.node +0 -0
package/web/dist/assets/index-BvxaBI9b.js +0 -216
package/web/dist/assets/index-DO_ICahS.css +0 -1
package/web/dist/assets/react-Cpp6qqoq.js +0 -1
package/web/dist/assets/reactflow-B1e2RnXD.js +0 -48

package/README.md CHANGED

@@ -1,17 +1,44 @@
 # ◈ Cascade AI
-> Multi-tier AI orchestration CLI — built for developers who think in systems.
+> **One prompt → an organization of AI agents that plan, delegate, and execute in parallel.**
+> Auto-routed to the cheapest model that's best at each step. **Up to 90% cheaper** than running everything on one frontier model.
-Cascade is an open-source CLI tool that runs your prompts through a hierarchical three-tier agent system (T1 → T2 → T3), automatically routing work across the best available models, executing tools, and compiling a single coherent result. Inspired by Claude Code, Gemini CLI, and GitHub Copilot CLI — but uniquely structured around orchestration.
+[![npm](https://img.shields.io/npm/v/cascade-ai?color=aaff00&label=npm)](https://www.npmjs.com/package/cascade-ai)
+[![license](https://img.shields.io/badge/license-MIT-aaff00.svg)](LICENSE)
+[![node](https://img.shields.io/badge/node-%E2%89%A520-5AB4E8.svg)](#installation)
+[![providers](https://img.shields.io/badge/providers-6-a78bff.svg)](#ai-providers)
+[![PRs welcome](https://img.shields.io/badge/PRs-welcome-f5a623.svg)](CONTRIBUTING.md)
+Cascade is an open-source CLI that runs your prompt through a hierarchical three-tier agent system — **T1 plans → T2 manages → T3 executes** — auto-routing each step to the best-value model, running tools, and compiling one coherent result. Think Claude Code / Gemini CLI / Copilot CLI, but uniquely built around **orchestration**.
 ```
 cascade "Refactor the auth module to use JWT, add tests, and open a PR"
 ```
+## ✨ Highlights
+- 🧠 **Live benchmark Auto-routing** — set a tier to `Auto` and Cascade fuses *live* public benchmark scores with *live* pricing to pick the best-**value** model for each task.
+- 🤖 **Autonomous mode** (`/auto`) — hands-off runs: safe tools run silently, dangerous ones still ask, budget caps stay the hard stop.
+- 📋 **Boardroom plan review** — pause to review, **edit**, or steer T1's plan (with an AI reviewer's critique) before any worker spawns.
+- ⏯️ **Run resumability** (`/continue`) — hit the budget cap on a big task? Resume from the partial state instead of redoing it.
+- 👥 **Workers recruit help** — a worker can ask its manager to spawn bounded sibling workers when the work fans out — dynamic parallelism, no rigid plan.
+- 💸 **Delegation savings** — every run shows what the hierarchy saved you (`saved $5.63 — 90% vs. all-T1`); no flat-agent tool can show this number.
+- 🛡️ **Safe by default** — permission escalation (T3→T2→T1→you), SSRF-guarded fetch, loopback-only dashboard, and a budget kill-switch.
+## Why Cascade is one of a kind
+Other AI CLIs run a single agent. Cascade runs a visible **organization** — and the terminal shows you the org at work:
+- **Delegation savings** — the status bar and every run receipt show what the hierarchy saved you (`$0.031 · saved $0.094 — 75% vs. all-T1`), because cheap local T3 workers do the heavy lifting while a premium T1 model only administrates. No flat-agent tool can show this number.
+- **Agent comms feed** (`/comms`) — live radio chatter between workers: peer messages, broadcasts, file locks, barrier syncs. No other CLI has agent-to-agent communication at all, let alone on screen.
+- **`/why`** — every run can explain itself: the complexity verdict and the classifier's reasoning, which model served each tier, failovers, and escalations.
+- **The boardroom** (`planApproval: "always"`) — Complex runs pause so you can approve T1's proposed org chart and budget ("3 managers · 7 workers · est. $0.40") before anything spawns. You sit above T1.
 ---
 ## Table of Contents
+- [What's New](#whats-new)
 - [How It Works](#how-it-works)
 - [Features](#features)
 - [Installation](#installation)
@@ -34,6 +61,66 @@ cascade "Refactor the auth module to use JWT, add tests, and open a PR"
 ---
+## What's New
+### v0.6 → v0.9.1 — the agentic releases
+- **v0.9.1 — Workers recruit help.** A T3 worker that discovers its task should fan out can call `request_workers` to have its T2 spawn bounded sibling workers (no recursive 4th tier; depth-capped + budget-bounded).
+- **v0.9.0 — Resumability, reflection, smarter local exec.** `/continue` resumes a budget-capped task from its partial state; opt-in **reflection** revises a worker's output against the goal; `t3Execution: auto` runs T3 waves sequentially on local/Ollama tiers and parallel on cloud.
+- **v0.8.0 — Autonomous mode + smarter re-planning.** `/auto` for hands-off runs (safe tools auto-approve, dangerous still gated); T1's reviewer **stops early** when a corrective pass isn't converging; new `/plan` (preview a decomposition) and `/replan`.
+- **v0.7.0 — Boardroom plan review.** Iterative revision (steer → re-plan → re-ask), an AI **plan reviewer**, inline **editable** plans, and a wider gate that can pause Moderate runs too.
+- **v0.6.0 — Live benchmark Auto-routing + fixes.** `Auto` picks the best-value model per task from live public benchmarks + live OpenRouter pricing, with live provider model discovery. Plus the Gemini stale-id 404 self-heal, the Ink-6 paste fix, and run-hang timeouts.
+<details><summary>Earlier — the visible organization + a flicker-free TUI (v0.5.x)</summary>
+### The visible organization + a flicker-free TUI
+- **Delegation savings counter** — live `saved $X (Y%) vs. all-T1` in the StatusBar and `/cost`, plus a one-line receipt after every run (duration · managers · workers · cost · savings).
+- **Agent comms feed** — `/comms` toggles a live ticker of PeerBus traffic (peer messages, broadcasts, file locks, barriers). The events always existed for the web dashboard; the terminal now shows them too.
+- **`/why`** — prints the decision trail for the last run: complexity verdict with the classifier's reason (or which heuristic short-circuited), models per tier, Cascade Auto picks, provider failovers, and escalations.
+- **Boardroom plan approval** — with `planApproval: "always"`, Complex runs pause after T1 plans so you can approve the org chart + estimated cost before any T2 spawns. SDK/headless auto-approve, so default behavior is unchanged.
+- **Flicker fix** — the live area now always fits the viewport (per-panel row budgets, terminal-resize handling, capped panels), which stops Ink's full-screen redraw fallback — the root cause of flicker in long sessions on small/maximized terminals.
+- **Native mouse selection works** — idle repaints no longer wipe an in-progress drag-select; the completed agent tree collapses on your next keystroke instead of an 8s timer. `/copy [n]` copies a response via native clipboard tools with an OSC 52 escape fallback (works over SSH).
+- **`--alt-screen`** — opt-in vim-style alternate-screen mode: flicker-proof by construction, shell restored on exit (even on crashes); history scrolls in-app with PgUp/PgDn.
+- **Ink 6.8 + React 19** — renderer upgrade; Node.js floor rises to **20** (18 is EOL).
+### v0.5.7 — Security hardening pass
+A focused security review of the tool and dashboard surface. All changes are covered by tests (`tsc --noEmit` clean, full suite green).
+- **Dashboard binds to loopback by default** — the server previously listened on all interfaces (`0.0.0.0`), exposing `POST /api/run` (which executes a prompt through the full shell/file/code-interpreter tool set) to the local network. It now binds to `127.0.0.1` via the new `dashboard.host` config field; binding to a public interface requires opting in and prints a warning (louder still if `dashboard.auth` is off).
+- **SSRF protection for `web_fetch`** — agent-supplied URLs are validated against a new SSRF-safe fetch helper: http/https only, hostnames resolved and rejected if they map to loopback / link-local (cloud metadata `169.254.169.254`) / private / CGNAT ranges, and every redirect hop re-validated. Set `CASCADE_ALLOW_LOCAL_FETCH=1` to fetch local URLs. The runtime tool-creator sandbox's `fetch` uses the same guard.
+- **`file_edit` and `git` now require approval** — approval is gated by an allowlist that previously omitted both, so in-place file edits and `git commit`/`checkout`/`push` ran with no prompt while `file_write`/`file_delete` were gated. Both are now in the default approval set.
+- **Code interpreter argument injection fixed** — `run_code` now executes via `execFile` with an argv array instead of interpolating arguments into a shell string, so a crafted `args` value can no longer break out into a second command. Temp scripts are written under the workspace root.
+- **Dashboard JWT pinned to HS256** on both sign and verify (defense-in-depth against algorithm-confusion).
+- **Broadened shell dangerous-command patterns** — the built-in blocklist now tolerates flag reordering / extra whitespace (`rm -fr /`, `rm  -rf  /`) and catches a fork-bomb form. This is defense-in-depth; the approval prompt remains the real gate.
+### v0.5.6 — Wizard scrollable model list + chat scrollback + slash panel fix
+- **Init wizard tier-model picker** — added `limit={8}` to the `SelectInput` so long model lists scroll with ↑/↓ indicators instead of overflowing off-screen.
+- **Chat scrolling restored** — the REPL was still enabling mouse-reporting on mount, which captured wheel events and broke the terminal's native scrollback (where Ink `<Static>` messages live since v0.5.4). Flipped the on-mount sequence to actively disable. Mouse-wheel-up now scrolls the terminal scrollback as expected.
+- **Slash-command suggestion panel** — long descriptions were wrapping to a second line and squishing two entries onto one row. Added `wrap="truncate"` on the description text and bumped the fixed panel height by one row to fit the worst-case content (header + 8 entries + both ↑/↓ indicators).
+### v0.5.4 — Maximized-terminal flicker fix + orchestrator resilience
+- **Static-based conversation rendering** — completed messages now go to the terminal's native scrollback via Ink `<Static>`; only the live area (status bar, streaming tail, agent tree, input) re-renders per batch. Effectively eliminates the maximized-window flicker on cmd / PowerShell.
+- **`tier:status` throttle** (100 ms) + `React.memo` on AgentTree / StatusBar / HintBar to cut per-event re-render churn.
+- **Auto-clear agent tree** — the tree auto-hides 8 s after a task completes (preserves conversation and cost data); cancelled if a new task starts.
+- **T3 critical-error detection** — rate-limit / auth / forbidden errors now short-circuit the agent loop via a typed `CriticalToolError`; the worker no longer loops 15× on a 429.
+- **T3 stall preserves partial output** via a typed `WorkerStallError` instead of throwing a bare `Error`.
+- **T1 failure summary** — when all sections fail, the user sees the actual root cause (e.g. `[CRITICAL_TOOL_ERROR] grep: 429 Rate limit reached for gpt-5.4-mini`) instead of a generic "all sections encountered errors".
+### v0.5.3 — Headless mode and audit fixes
+- **Headless `cascade run` / `-p`** — works in non-TTY contexts (CI, pipes, scripts). Progress → stderr, final answer → stdout. Tool approvals are auto-granted in headless mode.
+- **`cascade models` columns** — long model IDs no longer collide with the provider column.
+- **`/clear` resets cost breakdowns** — per-provider / per-tier maps are reset, not just the totals.
+- **`/config`** — richer output (theme, providers, per-tier models, dashboard port, cascade-auto), guarded against an undefined `config.dashboard`.
+- **Type cleanup** — removed vestigial `ReplMessage` / `ToolCallBlock` interfaces.
+### v0.5.2 — Setup wizard redesign + new tools
+- **First-run setup wizard redesigned** to match the Cascade-AI TUI design — themed welcome header, phased step tabs (API Keys → Models → Complete), field boxes, tier cards, and a proper completion screen. All provider/model functionality preserved.
+- **New tools** — `glob`, `grep`, and `web-fetch` available to T3 workers.
+- **Model-performance tracker** — records per-model success/cost stats for scored selection when `cascadeAuto: true`.
+- **Fixes** — removed an accidental `cascade-ai` self-dependency in `package.json`; corrected misleading `/tree` and `/sessions` slash-command descriptions; fixed stale T2/T3 test mocks.
+</details>
+---
 ## How It Works
 Every task runs through three agent tiers:
@@ -118,10 +205,12 @@ User prompt
 ### Web Dashboard
 - Real-time agent execution graph (ReactFlow)
+- **Peer communication edges** — animated dashed lines between agents as they exchange messages
+- **Agent Inspector** — click any node to see live output stream and peer communications
 - Session browser with cost/token stats
 - Config viewer
 - JWT auth (password-protected)
-- Single-tenant and multi-tenant team modes
+- URL hash routing (`#topology`, `#sessions`, `#logs`, `#settings`)
 - WebSocket live updates
 ---
@@ -132,7 +221,7 @@ User prompt
 npm install -g cascade-ai
 ```
-> Requires **Node.js ≥ 18**.
+> Requires **Node.js ≥ 20**.
 ---
@@ -191,15 +280,22 @@ Cascade loads config from `.cascade/config.json` in your project directory.
     "browserEnabled":     false
   },
   "dashboard": {
+    "host":     "127.0.0.1",
     "port":     4891,
     "auth":     true,
     "teamMode": "single"
   },
   "theme":  "cascade",
-  "telemetry": { "enabled": false }
+  "telemetry": { "enabled": false },
+  "plugins": ["./plugins/my-tool.js"],
+  "planApproval": "never",
+  "altScreen": false
 }
 ```
+- `planApproval: "always"` pauses Complex runs in the **boardroom**: approve T1's proposed sections, worker counts, and estimated cost before any T2 manager spawns. Headless/SDK runs auto-approve.
+- `altScreen: true` (or the `--alt-screen` flag) renders the TUI in the terminal's alternate screen buffer — vim-style, flicker-proof, shell restored on exit. History scrolls in-app with PgUp/PgDn since the alt screen has no native scrollback.
 API keys are also read from environment variables:
 | Provider | Environment Variable  |
@@ -209,6 +305,27 @@ API keys are also read from environment variables:
 | Gemini    | `GOOGLE_API_KEY`     |
 | Azure     | `AZURE_OPENAI_KEY`   |
+### Linking credentials from other AI CLIs
+If you already use **Claude Code**, **OpenAI Codex**, **Gemini CLI**, or **GitHub Copilot CLI**, Cascade can reuse the credentials they store on your machine instead of asking you to paste keys again:
+```bash
+cascade link                      # list detected credentials
+cascade link anthropic            # adopt an API key for a provider
+cascade link anthropic --accept-risk   # adopt a Claude Code subscription token
+```
+`cascade doctor` also reports what's linkable. How each credential is treated:
+| Source | Stored as | Reusable? |
+|--------|-----------|-----------|
+| `ANTHROPIC_API_KEY` / `OPENAI_API_KEY` / `GEMINI_API_KEY` env | API key | ✅ directly |
+| Codex `~/.codex/auth.json` (API-key mode) | API key | ✅ directly |
+| Claude Code `~/.claude/.credentials.json` | OAuth token | ⚠️ as an Anthropic bearer token (needs `--accept-risk`) |
+| Codex ChatGPT login · Gemini CLI · Copilot CLI | vendor OAuth | ❌ detected only — locked to that vendor's backend |
+> ⚠️ **Terms of service:** reusing a *subscription* OAuth token (Claude Code, ChatGPT, Copilot) outside its own CLI may violate the vendor's terms and can get your account flagged. Cascade only ever reads **your own** local files, never adopts an OAuth token without `--accept-risk`, and never transmits a credential anywhere except to that credential's own provider. Use API keys where you can.
 ### CASCADE.md
 Create a `CASCADE.md` in your project root to give agents project-specific instructions — just like `CLAUDE.md`. Run `cascade init` to generate a template.
@@ -312,6 +429,7 @@ cascade [options]               Start interactive REPL
 cascade run <prompt>            Run a single prompt and exit
 cascade init [path]             Initialize Cascade in a directory
 cascade doctor                  Diagnose API keys, Ollama, config
+cascade link [provider]         Reuse credentials from Claude Code / Codex / Gemini / Copilot
 cascade update                  Update to the latest version
 cascade dashboard               Launch the web dashboard
 ```
@@ -323,6 +441,7 @@ cascade dashboard               Launch the web dashboard
 -t, --theme  <name>    Color theme (cascade|dark|light|dracula|nord|solarized)
 -w, --workspace <path> Workspace path (default: cwd)
 -v, --version          Show version
+    --alt-screen       Vim-style alternate screen (flicker-proof; PgUp/PgDn history)
     --no-color         Disable colors
 ```
@@ -341,7 +460,10 @@ Type any of these inside the REPL:
 | `/model`     | Interactive picker — choose provider → tier → model (or Auto) |
 | `/model-info`| Show active models per tier                   |
 | `/models`    | Browse available models grouped by provider   |
-| `/cost`      | Toggle session cost / token usage panel       |
+| `/cost`      | Show session cost, token usage, and delegation savings |
+| `/why`       | Explain how the last run was routed (complexity, models, failovers) |
+| `/comms`     | Toggle the live agent-to-agent comms feed     |
+| `/copy [n]`  | Copy the last (or nth-last) response to the clipboard |
 | `/export [markdown\|json]` | Export session to file             |
 | `/rollback`  | Undo all file changes made in this session    |
 | `/branch`    | Fork the session into parallel branches       |
@@ -350,6 +472,8 @@ Type any of these inside the REPL:
 | `/sessions`  | List and resume past sessions                 |
 | `/status`    | Show live agent tree status                   |
+> **Selection & copy:** mouse capture stays off, so native drag-select and right-click copy work in your terminal. When idle, the screen never repaints under you; `/copy` covers the one case selection can't — grabbing text while output is still streaming (with an OSC 52 fallback that works over SSH).
 ---
 ## Themes
@@ -635,6 +759,11 @@ web/
 | ✓ | Hooks system |
 | ✓ | Scheduler + notifications |
 | ✓ | SDK |
+| ✓ | Plugin loading from config |
+| ✓ | Auto model specialization discovery |
+| ✓ | T3 text-tool fallback (Ollama support) |
+| ✓ | Peer communication visualization in dashboard |
+| ✓ | Conversational fast-path (bypass T1 for simple prompts) |
 | 🔜 | VSCode extension (`cascade-vscode`) |
 | 🔜 | JetBrains extension (`cascade-jetbrains`) |
 | 🔜 | Cascade Cloud (hosted dashboard) |
@@ -658,7 +787,30 @@ web/
 ```bash
 git clone https://github.com/Varun-SV/Cascade-AI.git
 cd Cascade-AI
-npm install   # installs root + web via npm workspaces
+npm install               # CLI dependencies (uses the committed package-lock.json)
+npm --prefix web install  # web dashboard dependencies (needed by `npm run build`)
+npm run build
+```
+### Upgrading an existing checkout (v0.5.7+: Ink 6 / React 19)
+v0.5.7 moved from Ink 5 / React 18 to **Ink 6.8 / React 19** and raised the
+Node.js floor to **20**. The repo now commits `package-lock.json`, so after a
+pull a plain `npm install` upgrades even a stale `node_modules` in place —
+then rebuild with `npm run build` so `dist/` matches the source (the CLI warns
+on startup when it detects a stale build).
+If `git pull` refuses because your old untracked `package-lock.json` would be
+overwritten, or `npm install` still reports `ERESOLVE` (this happens on
+checkouts that predate the committed lockfile — npm keeps the installed
+`react@18` in place while `ink@6` needs `react>=19`), do a clean install:
+```bash
+rm -rf node_modules web/node_modules package-lock.json web/package-lock.json
+git pull
+npm install
+npm --prefix web install
+npm run build
 ```
 ### Development commands

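Note on the README's v0.5.7 entry for `web_fetch`: it describes rejecting non-http(s) URLs and any host that resolves to a loopback, link-local, private, or CGNAT address. The check can be pictured with the minimal sketch below. It is not Cascade's implementation: the names `BLOCKED_RANGES`, `ipv4ToInt`, `isBlockedIPv4`, and `hasAllowedScheme` are hypothetical, it covers IPv4 only, and it skips DNS resolution and redirect re-validation, which the README says the real helper performs.

```javascript
// Hypothetical sketch of an SSRF address check; not taken from the cascade-ai source.
// Each entry is [network base address, prefix length].
const BLOCKED_RANGES = [
  ['127.0.0.0', 8],    // loopback
  ['169.254.0.0', 16], // link-local, includes cloud metadata 169.254.169.254
  ['10.0.0.0', 8],     // private
  ['172.16.0.0', 12],  // private
  ['192.168.0.0', 16], // private
  ['100.64.0.0', 10],  // CGNAT
];

// Convert dotted-quad text to an unsigned 32-bit number, or null if malformed.
function ipv4ToInt(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet; // plain arithmetic avoids signed 32-bit bit-ops
  }
  return value;
}

// True when the address falls in a blocked range. Unparseable input is
// treated as blocked so the check fails closed.
function isBlockedIPv4(ip) {
  const addr = ipv4ToInt(ip);
  if (addr === null) return true;
  return BLOCKED_RANGES.some(([base, bits]) => {
    const start = ipv4ToInt(base);
    const size = 2 ** (32 - bits);
    return addr >= start && addr < start + size;
  });
}

// Only http and https URLs pass; anything that fails to parse is rejected.
function hasAllowedScheme(rawUrl) {
  try {
    const { protocol } = new URL(rawUrl);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false;
  }
}
```

Under these assumptions a caller would run `hasAllowedScheme` on the URL first, resolve the hostname, and apply `isBlockedIPv4` to every resolved address, repeating the whole check on each redirect hop.
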
package/bin/cascade.js CHANGED

@@ -1,3 +1,16 @@
 #!/usr/bin/env node
 // Cascade AI — Entry point
-import '../dist/cli.js';
+try {
+  await import('../dist/cli.js');
+} catch (err) {
+  if (err?.code === 'ERR_MODULE_NOT_FOUND') {
+    const missingDist = /dist[\\/]cli\.js/.test(String(err?.message ?? ''));
+    console.error(
+      missingDist
+        ? 'Cascade build output is missing (dist/cli.js).\nRun: npm install && npm run build'
+        : `A dependency failed to load — node_modules may be missing or out of date.\nRun: npm install && npm run build\n\n${err.message}`,
+    );
+    process.exit(1);
+  }
+  throw err;
+}
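
Note on the hunk above: the new entry point swaps a static `import` for a dynamic `await import(...)` so that a load failure can be caught and turned into an actionable hint. The decision logic can be isolated as a pure function, as in the sketch below. `describeLoadFailure` is a hypothetical name (the shipped file inlines this logic), and the hint strings are abbreviated versions of the ones in the diff.

```javascript
// Hypothetical refactor for illustration only; bin/cascade.js inlines this.
// Returns a hint string for module-resolution failures, or null when the
// error is something else and the caller should rethrow it unchanged.
function describeLoadFailure(err) {
  if (err?.code !== 'ERR_MODULE_NOT_FOUND') return null;
  // Same pattern as the diff: matches dist/cli.js with either path separator.
  const missingDist = /dist[\\/]cli\.js/.test(String(err?.message ?? ''));
  return missingDist
    ? 'Build output is missing (dist/cli.js).\nRun: npm install && npm run build'
    : `A dependency failed to load.\nRun: npm install && npm run build\n\n${err.message}`;
}

// Usage mirroring the entry point's control flow, with a hand-built error
// standing in for a real failed import.
const simulated = Object.assign(
  new Error("Cannot find module '/app/dist/cli.js' imported from /app/bin/cascade.js"),
  { code: 'ERR_MODULE_NOT_FOUND' },
);
const hint = describeLoadFailure(simulated);
if (hint !== null) {
  console.error(hint);
}
```

One thing that may be worth checking: Node's message for a missing package usually names the importing file as well, so a dependency imported directly from `dist/cli.js` could match the `dist/cli.js` pattern and be reported as missing build output. Both hints recommend the same command, so the practical impact looks small.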