npm - ada-agent - Versions diffs - 0.1.0 → 0.2.0 - Mend

ada-agent 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/README.md +14 -7
package/bench/README.md +88 -88
package/bench/swebench.mjs +242 -242
package/docs/architecture.md +163 -139
package/docs/architecture.svg +73 -73
package/docs/cloudflare.md +81 -0
package/docs/connectors.md +49 -48
package/docs/integrations.md +62 -59
package/package.json +65 -64
package/src/client/catalog.json +1 -0
package/src/client/cli.ts +1262 -1253
package/src/client/models-dev.ts +106 -52
package/src/selfcheck.ts +26 -0
package/src/server/config.ts +65 -58
package/src/server/providers/openai-compat.ts +78 -76
package/src/server/providers/registry.ts +32 -31
package/src/server/router.ts +33 -29
package/src/shared/types.ts +21 -20

package/docs/architecture.md CHANGED

@@ -1,139 +1,163 @@
-# Architecture
-ada is two programs in one repo: a thin **client** (the coding agent) and a **backend** (the
-router that holds provider keys). They communicate in one wire format — OpenAI Chat Completions.
-![ada architecture](architecture.svg)
-```
- ada (client)                    ada backend                         upstream providers
- ────────────                    ───────────                         ──────────────────
- agentic loop  ──── HTTP  ───▶   auth (client key)
- tools                           router: model id → provider
- sessions                        adapter: provider → wire format ──▶  Anthropic / OpenAI / …
- approval/TUI   ◀── SSE  ────    normalize back to OpenAI SSE
-```
-Why split it: the backend is the **one control point**. Provider keys, auth, rate limits, and
-billing all belong in one place; the client carries only an ada client key. Same shape as Cursor.
-## Request flow
-1. The client sends an OpenAI-format chat request (model, messages, tools) to `ADA_BACKEND_URL`
-   with its client key as the bearer.
-2. The backend authenticates the key (`ADA_CLIENT_KEYS`, or open in dev mode), then `router.ts`
-   maps the model id → a provider.
-3. The matching **adapter** calls the upstream with the server-held provider key — a pass-through
-   for OpenAI-compatible providers, a translation for Anthropic.
-4. The backend streams normalized OpenAI SSE chunks back; the client renders text and runs tool
-   calls, appending one `{role:"tool", tool_call_id, content}` per call and looping.
-## The agent loop
-![ada agent loop](agent-loop.svg)
-Each turn streams the model's reply; if it contains tool calls, gated ones prompt for approval,
-the tools run, and one `{role:"tool", tool_call_id, content}` per call is appended before control
-returns to the model — looping until the model stops calling tools.
-## Sign in (device flow)
-![ada login device flow](login-flow.svg)
-GitHub/Google login uses the OAuth 2.0 device authorization grant (RFC 8628) — no password ever
-reaches ada. The token is stored locally and sent as the bearer; the backend verifies identity in
-`server/identity.ts`. The GitHub `client_id` is baked in (public, like `gh`), so the client needs
-zero config.
-## One adapter per wire format
-The key design decision: adapters are keyed by **wire format**, not by provider or model.
-- Most providers speak the OpenAI format and share **`openai-compat.ts`** (a pass-through that just
-  swaps in the right base URL + key).
-- Only a divergent format gets its own adapter — **`anthropic.ts`** translates OpenAI ⇄ Anthropic
-  Messages and re-emits Anthropic events as OpenAI SSE.
-Consequences:
-| Change | Cost |
-|---|---|
-| A new model | **0 code** (routing is by id) |
-| A new OpenAI-compatible provider | **2 lines** in `config.ts` (base URL + key env) |
-| A brand-new wire format | **1 adapter** + one line in `registry.ts` |
-Vendor SDKs load **lazily** (pi-style): a `type`-only import plus a dynamic `import()`, so e.g.
-`@anthropic-ai/sdk` never loads unless a Claude request actually arrives.
-## Routing
-`router.ts` maps a model id to a provider:
-- a model id containing `:` (e.g. `qwen2.5-coder:latest`) → local **Ollama**;
-- otherwise by prefix (`gpt*`/`o*` → openai, `claude*` → anthropic, `gemini*` → google,
-  `mistral*` → mistral, `grok*` → xai, …);
-- an explicit `provider` field on the request always wins;
-- anything unmatched falls through to **OpenRouter**.
-## Context compaction
-The client estimates context size (≈ chars / 4) and, when it crosses `ADA_COMPACT_AT` (default
-100k) or a request overflows, summarizes older turns into one compact summary and keeps the recent
-ones. `/compact` forces it; `/context` shows the current estimate.
-## Tool-call recovery
-Some providers (notably **Ollama over a streaming connection**) fail to parse a model's tool call
-into the structured `tool_calls` field and leak it into the text as raw JSON. The client detects a
-reply that *is* a JSON tool call (plain, ```` ```json ```` fenced, or `<tool_call>`-wrapped) for a
-real tool and runs it instead of printing the JSON. Hallucinated tools (no such tool) are left as
-text. See `parseTextToolCalls` in `client/agent.ts`.
-## File layout
-```
-bin/
-  ada.mjs             launcher: register tsx loader → run client/cli.ts
-  ada-server.mjs      launcher: register tsx loader → run server/index.ts
-src/
-  shared/
-    types.ts          provider/model types shared by client and server
-  server/             the routing backend                 (ada-server | npm run server)
-    index.ts          HTTP entry: auth → route → dispatch to an adapter (+ /v1/models, /v1/whoami)
-    config.ts         providers, base URLs, key env vars, port, client-key auth
-    router.ts         model id → provider
-    sse.ts            Server-Sent Events helpers
-    identity.ts       verify GitHub/Google tokens; allowlist
-    oauth.ts          RFC 8628 device-flow login (built-in GitHub client id)
-    credentials.ts    local credential store
-    providers/
-      adapter.ts      the Adapter interface               ← one adapter per WIRE FORMAT
-      registry.ts     provider → adapter map
-      openai-compat.ts pass-through OpenAI-compatible adapter
-      anthropic.ts    native Anthropic adapter (lazy @anthropic-ai/sdk)
-  client/             the terminal agent                  (ada | npm start)
-    cli.ts            REPL: flags, model picker, slash commands, approval prompt
-    agent.ts          the agentic loop (stream → tool calls → feed back → repeat)
-    tools.ts          read/write/edit/bash/ls/grep/glob; protected paths; destructive detection
-    tui.ts            inline TUI engine (composer, spinner, user bar)
-    tui-mode.ts       the TUI loop
-    session.ts        append-only JSONL session store (.cos0/sessions/)
-    compaction.ts     context summarization
-    checkpoint.ts     undo: snapshot files before edits, restore on /undo
-    todos.ts          task tracking + render
-    hooks.ts          extension hooks (before/after tool, input transform)
-    extensions.ts     load extensions (tools + hooks + commands)
-    skills.ts · mcp.ts · prompts.ts   skills, MCP servers, prompt templates
-    settings.ts · platform.ts · render.ts · image.ts · telemetry.ts · pkg.ts
-  selfcheck.ts        offline checks (tools, sessions, routing, parsers, TUI)
-```
-## No build step
-Everything runs through `tsx` — TypeScript with no compile. The `bin/*.mjs` launchers register the
-tsx ESM loader in-process, then import the relevant `.ts` entrypoint (which self-runs). `tsx` is a
-runtime dependency so the global `ada` command works after `npm link` / `npm install -g`.
+# Architecture
+ada is two programs in one repo: a thin **client** (the coding agent) and a **backend** (the
+router that holds provider keys). They communicate in one wire format — OpenAI Chat Completions.
+![ada architecture](architecture.svg)
+```
+ ada (client)                    ada backend                         upstream providers
+ ────────────                    ───────────                         ──────────────────
+ agentic loop  ──── HTTP  ───▶   auth (client key)
+ tools                           router: model id → provider
+ sessions                        adapter: provider → wire format ──▶  Anthropic / OpenAI / …
+ approval/TUI   ◀── SSE  ────    normalize back to OpenAI SSE
+```
+Why split it: the backend is the **one control point**. Provider keys, auth, rate limits, and
+billing all belong in one place; the client carries only an ada client key. Same shape as Cursor.
+## Request flow
+1. The client sends an OpenAI-format chat request (model, messages, tools) to `ADA_BACKEND_URL`
+   with its client key as the bearer.
+2. The backend authenticates the key (`ADA_CLIENT_KEYS`, or open in dev mode), then `router.ts`
+   maps the model id → a provider.
+3. The matching **adapter** calls the upstream with the server-held provider key — a pass-through
+   for OpenAI-compatible providers, a translation for Anthropic.
+4. The backend streams normalized OpenAI SSE chunks back; the client renders text and runs tool
+   calls, appending one `{role:"tool", tool_call_id, content}` per call and looping.
+## The agent loop
+![ada agent loop](agent-loop.svg)
+Each turn streams the model's reply; if it contains tool calls, gated ones go through the
+**permission mode**, the tools run, and one `{role:"tool", tool_call_id, content}` per call is
+appended before control returns to the model — looping until the model stops calling tools.
+**Permission modes** (`/ask` · `/plan` · `/auto`, or `/mode` to cycle; shown in the prompt):
+- **ask** (default) — each gated tool shows a plain-words prompt ("ada wants to run a shell command…")
+  and one key: `[y]es` · `[a]uto` · `[p]lan` · `[n]o`. Destructive `bash` always confirms.
+- **plan** — read-only: ada plans but won't edit; `/run` approves and executes.
+- **auto** — runs tools without asking (still confirms destructive `bash`). `--yolo` starts here.
+**Skills.** ~285 bundled `SKILL.md` instructions load only on demand. A lexical router
+(`client/skill-router.ts`) ranks every request; on a confident, name-exact match ada **auto-applies**
+the skill (injects its procedure, announced `↳ skill: <name>`), otherwise it suggests them. The model
+can also `list_skills` / `find_skill` / `use_skill`. See [orchestration.md](orchestration.md) for the
+strategies (`react`/`plan`/`multi`/`toolsmith`) layered on the same loop.
+**Programmatic surfaces.** Beyond the REPL/TUI, the same agent drives an HTTP API (`ada serve`), a
+typed SDK, an ACP editor bridge (`ada acp`), and read-only session sharing (`ada share`) — see
+[integrations.md](integrations.md). And it can run **SWE-bench Verified** via [bench/](../bench/).
+## Sign in (device flow)
+![ada login device flow](login-flow.svg)
+GitHub/Google login uses the OAuth 2.0 device authorization grant (RFC 8628) — no password ever
+reaches ada. The token is stored locally and sent as the bearer; the backend verifies identity in
+`server/identity.ts`. The GitHub `client_id` is baked in (public, like `gh`), so the client needs
+zero config.
+## One adapter per wire format
+The key design decision: adapters are keyed by **wire format**, not by provider or model.
+- Most providers speak the OpenAI format and share **`openai-compat.ts`** (a pass-through that just
+  swaps in the right base URL + key).
+- Only a divergent format gets its own adapter — **`anthropic.ts`** translates OpenAI ⇄ Anthropic
+  Messages and re-emits Anthropic events as OpenAI SSE.
+Consequences:
+| Change | Cost |
+|---|---|
+| A new model | **0 code** (routing is by id) |
+| A new OpenAI-compatible provider | **2 lines** in `config.ts` (base URL + key env) |
+| A brand-new wire format | **1 adapter** + one line in `registry.ts` |
+Vendor SDKs load **lazily** (pi-style): a `type`-only import plus a dynamic `import()`, so e.g.
+`@anthropic-ai/sdk` never loads unless a Claude request actually arrives.
+## Routing
+`router.ts` maps a model id to a provider:
+- a model id containing `:` (e.g. `qwen2.5-coder:latest`) → local **Ollama**;
+- otherwise by prefix (`gpt*`/`o*` → openai, `claude*` → anthropic, `gemini*` → google,
+  `mistral*` → mistral, `grok*` → xai, …);
+- an explicit `provider` field on the request always wins;
+- anything unmatched falls through to **OpenRouter**.
+## Context compaction
+The client estimates context size (≈ chars / 4) and, when it crosses `ADA_COMPACT_AT` (default
+100k) or a request overflows, summarizes older turns into one compact summary and keeps the recent
+ones. `/compact` forces it; `/context` shows the current estimate.
+## Tool-call recovery
+Some providers (notably **Ollama over a streaming connection**) fail to parse a model's tool call
+into the structured `tool_calls` field and leak it into the text as raw JSON. The client detects a
+reply that *is* a JSON tool call (plain, ```` ```json ```` fenced, or `<tool_call>`-wrapped) for a
+real tool and runs it instead of printing the JSON. Hallucinated tools (no such tool) are left as
+text. See `parseTextToolCalls` in `client/agent.ts`.
+## File layout
+```
+bin/
+  ada.mjs             launcher: register tsx loader → run client/cli.ts
+  ada-server.mjs      launcher: register tsx loader → run server/index.ts
+src/
+  shared/
+    types.ts          provider/model types shared by client and server
+  server/             the routing backend                 (ada-server | npm run server)
+    index.ts          HTTP entry: auth → route → dispatch to an adapter (+ /v1/models, /v1/whoami)
+    config.ts         providers, base URLs, key env vars, port, client-key auth
+    router.ts         model id → provider
+    sse.ts            Server-Sent Events helpers
+    identity.ts       verify GitHub/Google tokens; allowlist
+    oauth.ts          RFC 8628 device-flow login (built-in GitHub client id)
+    credentials.ts    local credential store
+    providers/
+      adapter.ts      the Adapter interface               ← one adapter per WIRE FORMAT
+      registry.ts     provider → adapter map
+      openai-compat.ts pass-through OpenAI-compatible adapter
+      anthropic.ts    native Anthropic adapter (lazy @anthropic-ai/sdk)
+  client/             the terminal agent                  (ada | npm start)
+    cli.ts            REPL: flags, model picker, slash commands, ask/plan/auto modes + approval
+    agent.ts          the agentic loop (stream → tool calls → feed back → repeat) + orchestrators
+    tools.ts          read_file/write_file/edit_file · apply_patch · bash (PTY) · ls/glob/grep (rg)
+                      · web_fetch/web_search · lsp_diagnostics · ask_user; protected paths;
+                      destructive detection; trust-gated auto-format
+    tui.ts            inline TUI engine (composer, spinner, user bar)
+    tui-mode.ts       the TUI loop
+    session.ts        append-only JSONL session store (.ada/sessions/)
+    compaction.ts     context summarization
+    checkpoint.ts · snapshot.ts   undo (revert edits) · whole-tree git snapshot/restore
+    skills.ts · skill-router.ts   skills + the relevance router (auto-apply)
+    mcp.ts · prompts.ts · background.ts · models-dev.ts · lsp.ts   connectors, templates,
+                      background jobs, models.dev catalog, LSP client
+    todos.ts · hooks.ts · extensions.ts   tasks; extension hooks + tools + commands
+    settings.ts · platform.ts · render.ts · image.ts · telemetry.ts · pkg.ts
+  sdk/index.ts        typed client for the HTTP API (`ada serve`)
+  selfcheck.ts        offline checks (tools, sessions, routing, parsers, TUI, classifiers)
+bench/
+  swebench.mjs        SWE-bench Verified prediction generator (scored by the official harness)
+```
+## No build step
+Everything runs through `tsx` — TypeScript with no compile. The `bin/*.mjs` launchers register the
+tsx ESM loader in-process, then import the relevant `.ts` entrypoint (which self-runs). `tsx` is a
+runtime dependency so the global `ada` command works after `npx ada-agent`, `npm install -g ada-agent`,
+or `npm link` from a clone. (`node-pty` is the one native dep, so a C toolchain is needed at install.)
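
Reviewer note (not part of the package): the "Routing" rules in the new `architecture.md`, together with the `@cf/*` rule mentioned in `cloudflare.md`, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the contents of `router.ts`: the names `Provider`, `RouteRequest`, and `routeModel` are hypothetical, the order of the `@cf/` and `:` checks is a guess, and the documented `o*` pattern is read here as `o` followed by a digit (for ids such as `o1` or `o3`).

```typescript
// Hypothetical sketch of the documented routing rules; names are illustrative only.
type Provider =
  | "openai" | "anthropic" | "google" | "mistral" | "xai"
  | "ollama" | "cloudflare" | "openrouter";

interface RouteRequest {
  model: string;
  provider?: Provider; // explicit override on the request
}

// Prefix rules listed in the doc. The doc's list ends with an ellipsis, so the
// real table is longer. Assumption: "o*" means "o" followed by a digit.
const PREFIX_RULES: Array<[RegExp, Provider]> = [
  [/^gpt/, "openai"],
  [/^o\d/, "openai"],
  [/^claude/, "anthropic"],
  [/^gemini/, "google"],
  [/^mistral/, "mistral"],
  [/^grok/, "xai"],
];

function routeModel(req: RouteRequest): Provider {
  // 1. An explicit provider field always wins.
  if (req.provider) return req.provider;
  // 2. "@cf/..." ids go to Cloudflare (check order is an assumption).
  if (req.model.startsWith("@cf/")) return "cloudflare";
  // 3. A model id containing ":" is treated as a local Ollama model.
  if (req.model.includes(":")) return "ollama";
  // 4. Otherwise match by prefix.
  for (const [pattern, provider] of PREFIX_RULES) {
    if (pattern.test(req.model)) return provider;
  }
  // 5. Anything unmatched falls through to OpenRouter.
  return "openrouter";
}
```

Under these assumptions, `routeModel({ model: "qwen2.5-coder:latest" })` yields `"ollama"`, and adding a new model id needs no code change because only the prefix table decides the provider.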

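Reviewer note (not part of the package): the "Context compaction" paragraph describes a size estimate of roughly chars / 4 compared against `ADA_COMPACT_AT` (default 100k). A minimal sketch of that check follows. It assumes the threshold is expressed in estimated tokens and that reaching the threshold counts as crossing it; `Message`, `estimateTokens`, and `shouldCompact` are hypothetical names, not exports of `compaction.ts`.

```typescript
// Hypothetical sketch of the documented compaction trigger; names are illustrative only.
interface Message {
  role: string;
  content: string;
}

// Documented default for ADA_COMPACT_AT, read here as estimated tokens (assumption).
const DEFAULT_COMPACT_AT = 100_000;

// Rough estimate from the doc: about one token per four characters.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((total, m) => total + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// True once the estimate reaches the threshold (">=" is an assumption).
function shouldCompact(messages: Message[], compactAt: number = DEFAULT_COMPACT_AT): boolean {
  return estimateTokens(messages) >= compactAt;
}
```

The doc also says compaction runs when a request overflows; that path depends on provider error handling and is not modeled here.
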
package/docs/architecture.svg CHANGED

@@ -1,73 +1,73 @@
-<svg viewBox="0 0 920 470" xmlns="http://www.w3.org/2000/svg" font-family="ui-sans-serif, system-ui, sans-serif" role="img" aria-labelledby="t d">
-  <title id="t">ada architecture</title>
-  <desc id="d">The ada terminal client sends OpenAI Chat Completions over HTTP to the ada backend, which authenticates the client key, routes by model id, adapts to each provider's wire format, and streams normalized SSE back. The backend holds every provider key and reaches Anthropic via a native adapter and all other providers via a shared OpenAI-compatible adapter.</desc>
-  <defs>
-    <marker id="fwd" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#ffaf00"/></marker>
-    <marker id="back" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#3fb950"/></marker>
-  </defs>
-  <!-- panel -->
-  <rect x="6" y="6" width="908" height="458" rx="16" fill="#0d0f12" stroke="#262b33"/>
-  <rect x="34" y="34" width="14" height="14" rx="4" transform="rotate(45 41 41)" fill="#ffaf00"/>
-  <text x="60" y="40" fill="#ffaf00" font-size="17" font-weight="700">ada · architecture</text>
-  <text x="60" y="60" fill="#9aa3af" font-size="12">terminal client → routing backend → providers · one wire format throughout</text>
-  <!-- request / response lanes -->
-  <g font-family="ui-monospace, monospace" font-size="10.5">
-    <!-- top: request (gold, →) -->
-    <line x1="226" y1="200" x2="340" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
-    <text x="283" y="190" fill="#c5cdd6" text-anchor="middle">OpenAI Chat</text>
-    <text x="283" y="178" fill="#9aa3af" text-anchor="middle" font-size="9">Completions · HTTP</text>
-    <line x1="652" y1="200" x2="710" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
-    <text x="683" y="190" fill="#c5cdd6" text-anchor="middle">+ key</text>
-    <!-- bottom: response (green, ←) -->
-    <line x1="710" y1="320" x2="652" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
-    <text x="683" y="338" fill="#7ee08a" text-anchor="middle">SSE</text>
-    <line x1="340" y1="320" x2="226" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
-    <text x="283" y="338" fill="#7ee08a" text-anchor="middle">normalized OpenAI SSE</text>
-  </g>
-  <!-- client card -->
-  <rect x="42" y="138" width="184" height="212" rx="12" fill="#14171c" stroke="#262b33"/>
-  <text x="134" y="166" fill="#ffaf00" font-size="16" font-weight="700" text-anchor="middle">ada client</text>
-  <text x="134" y="183" fill="#9aa3af" font-size="10.5" text-anchor="middle">the terminal</text>
-  <g font-family="ui-monospace, monospace" font-size="11" fill="#c5cdd6" text-anchor="middle">
-    <text x="134" y="207">agentic loop · tools</text>
-    <text x="134" y="226">285 skills · MCP</text>
-    <text x="134" y="245">ask · plan · auto</text>
-    <text x="134" y="264">REPL · TUI · sessions</text>
-    <text x="134" y="283" fill="#6b7480">serve · SDK · ACP</text>
-  </g>
-  <rect x="80" y="303" width="108" height="22" rx="11" fill="#0d0f12" stroke="#262b33"/>
-  <text x="134" y="318" fill="#6b7480" font-size="10" text-anchor="middle" font-family="ui-monospace, monospace">holds no keys</text>
-  <!-- backend card -->
-  <rect x="346" y="118" width="306" height="244" rx="12" fill="#101318" stroke="#262b33"/>
-  <text x="499" y="146" fill="#ffaf00" font-size="15" font-weight="700" text-anchor="middle">ada backend</text>
-  <text x="499" y="163" fill="#9aa3af" font-size="10" text-anchor="middle">the one control point — holds every key</text>
-  <g font-family="ui-monospace, monospace" font-size="11.5">
-    <g><rect x="366" y="176" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="195" fill="#e6e9ee"><tspan fill="#ffaf00">1</tspan>  auth · client key</text></g>
-    <g><rect x="366" y="214" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="233" fill="#e6e9ee"><tspan fill="#ffaf00">2</tspan>  route · model id → provider</text></g>
-    <g><rect x="366" y="252" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="271" fill="#e6e9ee"><tspan fill="#ffaf00">3</tspan>  adapt · one per wire format</text></g>
-    <g><rect x="366" y="290" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="309" fill="#e6e9ee"><tspan fill="#ffaf00">4</tspan>  normalize → OpenAI SSE</text></g>
-  </g>
-  <!-- providers card -->
-  <rect x="716" y="150" width="162" height="180" rx="12" fill="#14171c" stroke="#262b33"/>
-  <text x="797" y="180" fill="#e6e9ee" font-size="13" font-weight="700" text-anchor="middle">providers</text>
-  <text x="797" y="202" fill="#ff7a59" font-size="12" font-weight="700" text-anchor="middle" font-family="ui-monospace, monospace">Anthropic <tspan fill="#6b7480" font-weight="400" font-size="9">native</tspan></text>
-  <line x1="732" y1="214" x2="862" y2="214" stroke="#262b33"/>
-  <g font-family="ui-monospace, monospace" font-size="10.5" fill="#c5cdd6" text-anchor="middle">
-    <text x="797" y="234">OpenAI · Gemini · Groq</text>
-    <text x="797" y="252">Mistral · DeepSeek · xAI</text>
-    <text x="797" y="270">Together · DashScope</text>
-    <text x="797" y="288">Ollama · OpenRouter</text>
-  </g>
-  <text x="797" y="312" fill="#ffaf00" font-size="9.5" text-anchor="middle" font-family="ui-monospace, monospace">all but Anthropic: openai-compat</text>
-  <!-- footer note -->
-  <text x="460" y="406" fill="#9aa3af" font-size="11" text-anchor="middle" font-family="ui-monospace, monospace">a new model = 0 code   ·   a new OpenAI-compatible provider = 2 lines   ·   a new wire format = 1 adapter</text>
-  <text x="460" y="430" fill="#6b7480" font-size="10" text-anchor="middle">vendor SDKs load lazily — the Anthropic SDK never loads unless a Claude request arrives</text>
-</svg>
+<svg viewBox="0 0 920 470" xmlns="http://www.w3.org/2000/svg" font-family="ui-sans-serif, system-ui, sans-serif" role="img" aria-labelledby="t d">
+  <title id="t">ada architecture</title>
+  <desc id="d">The ada terminal client sends OpenAI Chat Completions over HTTP to the ada backend, which authenticates the client key, routes by model id, adapts to each provider's wire format, and streams normalized SSE back. The backend holds every provider key and reaches Anthropic via a native adapter and all other providers via a shared OpenAI-compatible adapter.</desc>
+  <defs>
+    <marker id="fwd" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#ffaf00"/></marker>
+    <marker id="back" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#3fb950"/></marker>
+  </defs>
+  <!-- panel -->
+  <rect x="6" y="6" width="908" height="458" rx="16" fill="#0d0f12" stroke="#262b33"/>
+  <rect x="34" y="34" width="14" height="14" rx="4" transform="rotate(45 41 41)" fill="#ffaf00"/>
+  <text x="60" y="40" fill="#ffaf00" font-size="17" font-weight="700">ada · architecture</text>
+  <text x="60" y="60" fill="#9aa3af" font-size="12">terminal client → routing backend → providers · one wire format throughout</text>
+  <!-- request / response lanes -->
+  <g font-family="ui-monospace, monospace" font-size="10.5">
+    <!-- top: request (gold, →) -->
+    <line x1="226" y1="200" x2="340" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
+    <text x="283" y="190" fill="#c5cdd6" text-anchor="middle">OpenAI Chat</text>
+    <text x="283" y="178" fill="#9aa3af" text-anchor="middle" font-size="9">Completions · HTTP</text>
+    <line x1="652" y1="200" x2="710" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
+    <text x="683" y="190" fill="#c5cdd6" text-anchor="middle">+ key</text>
+    <!-- bottom: response (green, ←) -->
+    <line x1="710" y1="320" x2="652" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
+    <text x="683" y="338" fill="#7ee08a" text-anchor="middle">SSE</text>
+    <line x1="340" y1="320" x2="226" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
+    <text x="283" y="338" fill="#7ee08a" text-anchor="middle">normalized OpenAI SSE</text>
+  </g>
+  <!-- client card -->
+  <rect x="42" y="138" width="184" height="212" rx="12" fill="#14171c" stroke="#262b33"/>
+  <text x="134" y="166" fill="#ffaf00" font-size="16" font-weight="700" text-anchor="middle">ada client</text>
+  <text x="134" y="183" fill="#9aa3af" font-size="10.5" text-anchor="middle">the terminal</text>
+  <g font-family="ui-monospace, monospace" font-size="11" fill="#c5cdd6" text-anchor="middle">
+    <text x="134" y="207">agentic loop · tools</text>
+    <text x="134" y="226">285 skills · MCP</text>
+    <text x="134" y="245">ask · plan · auto</text>
+    <text x="134" y="264">REPL · TUI · sessions</text>
+    <text x="134" y="283" fill="#6b7480">serve · SDK · ACP</text>
+  </g>
+  <rect x="80" y="303" width="108" height="22" rx="11" fill="#0d0f12" stroke="#262b33"/>
+  <text x="134" y="318" fill="#6b7480" font-size="10" text-anchor="middle" font-family="ui-monospace, monospace">holds no keys</text>
+  <!-- backend card -->
+  <rect x="346" y="118" width="306" height="244" rx="12" fill="#101318" stroke="#262b33"/>
+  <text x="499" y="146" fill="#ffaf00" font-size="15" font-weight="700" text-anchor="middle">ada backend</text>
+  <text x="499" y="163" fill="#9aa3af" font-size="10" text-anchor="middle">the one control point — holds every key</text>
+  <g font-family="ui-monospace, monospace" font-size="11.5">
+    <g><rect x="366" y="176" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="195" fill="#e6e9ee"><tspan fill="#ffaf00">1</tspan>  auth · client key</text></g>
+    <g><rect x="366" y="214" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="233" fill="#e6e9ee"><tspan fill="#ffaf00">2</tspan>  route · model id → provider</text></g>
+    <g><rect x="366" y="252" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="271" fill="#e6e9ee"><tspan fill="#ffaf00">3</tspan>  adapt · one per wire format</text></g>
+    <g><rect x="366" y="290" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="309" fill="#e6e9ee"><tspan fill="#ffaf00">4</tspan>  normalize → OpenAI SSE</text></g>
+  </g>
+  <!-- providers card -->
+  <rect x="716" y="150" width="162" height="180" rx="12" fill="#14171c" stroke="#262b33"/>
+  <text x="797" y="180" fill="#e6e9ee" font-size="13" font-weight="700" text-anchor="middle">providers</text>
+  <text x="797" y="202" fill="#ff7a59" font-size="12" font-weight="700" text-anchor="middle" font-family="ui-monospace, monospace">Anthropic <tspan fill="#6b7480" font-weight="400" font-size="9">native</tspan></text>
+  <line x1="732" y1="214" x2="862" y2="214" stroke="#262b33"/>
+  <g font-family="ui-monospace, monospace" font-size="10.5" fill="#c5cdd6" text-anchor="middle">
+    <text x="797" y="234">OpenAI · Gemini · Groq</text>
+    <text x="797" y="252">Mistral · DeepSeek · xAI</text>
+    <text x="797" y="270">Together · DashScope</text>
+    <text x="797" y="288">Ollama · OpenRouter</text>
+  </g>
+  <text x="797" y="312" fill="#ffaf00" font-size="9.5" text-anchor="middle" font-family="ui-monospace, monospace">all but Anthropic: openai-compat</text>
+  <!-- footer note -->
+  <text x="460" y="406" fill="#9aa3af" font-size="11" text-anchor="middle" font-family="ui-monospace, monospace">a new model = 0 code   ·   a new OpenAI-compatible provider = 2 lines   ·   a new wire format = 1 adapter</text>
+  <text x="460" y="430" fill="#6b7480" font-size="10" text-anchor="middle">vendor SDKs load lazily — the Anthropic SDK never loads unless a Claude request arrives</text>
+</svg>

package/docs/cloudflare.md ADDED

@@ -0,0 +1,81 @@
+# Using Cloudflare models with ada
+Cloudflare gives you two OpenAI-compatible endpoints, and ada speaks OpenAI — so both are just the
+`cloudflare` provider with the right env vars. Pick the one you have:
+- **Workers AI** — Cloudflare *hosts* the model (Llama, Qwen, Gemma, Kimi, …). Simplest.
+- **AI Gateway** — Cloudflare *proxies* other providers (OpenAI/Anthropic/Workers AI/…) through one
+  endpoint, with caching + analytics + optional unified billing.
+Browse what's available and its pricing any time, offline:
+```bash
+ada catalog cloudflare          # Workers AI + AI Gateway models, context + $/1M
+```
+---
+## Workers AI (recommended start)
+1. **Cloudflare dashboard → AI → Workers AI → "Use REST API".** Copy your **Account ID** and
+   **create an API token** (Workers AI scope).
+2. Set the env vars for the backend:
+   ```bash
+   export CLOUDFLARE_ACCOUNT_ID=your-32-char-account-id
+   export CLOUDFLARE_API_TOKEN=your-workers-ai-token
+   ```
+3. Start the backend and run ada with a `@cf/…` model id:
+   ```bash
+   ada-server
+   ada --model "@cf/moonshotai/kimi-k2.7-code"     # or any id from `ada catalog cloudflare`
+   ```
+That's it. ada routes `@cf/*` to Cloudflare automatically, sends the full id through, and `/cost`
+already knows the price from the catalog.
+> The default endpoint ada builds is
+> `https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1` — Workers AI's
+> OpenAI-compatible base. No code change needed.
+---
+## AI Gateway
+1. **Cloudflare dashboard → AI → AI Gateway → create a gateway.** Note your **Account ID** and
+   **Gateway ID**, and grab the gateway's **OpenAI-compatible endpoint URL** (the "compat" base).
+2. Point ada at that URL and supply the token it expects (your gateway token, or the upstream
+   provider's key, depending on how the gateway is configured):
+   ```bash
+   export CLOUDFLARE_BASE_URL="https://gateway.ai.cloudflare.com/v1/<account>/<gateway>/compat"
+   export CLOUDFLARE_API_TOKEN=your-gateway-or-provider-key
+   ```
+   (`CLOUDFLARE_BASE_URL` overrides the Workers AI default, so `CLOUDFLARE_ACCOUNT_ID` isn't needed.)
+3. Use the model id format your gateway expects (often `provider/model`, e.g. `openai/gpt-4o`), and
+   route it explicitly to the `cloudflare` provider — easiest is the `--provider` field or an
+   `@cf/`-style id; otherwise send `provider: "cloudflare"` on the request.
+> Copy the exact base URL from your AI Gateway page — Cloudflare shows the OpenAI-compatible endpoint
+> there. ada just proxies to whatever you set.
+---
+## How it works (why it's only ~2 lines in ada)
+ada keys providers by **wire format**, not by vendor. Cloudflare's Workers AI and AI Gateway both
+emit the OpenAI Chat Completions format, so they reuse the shared `openai-compat.ts` adapter — no
+Cloudflare-specific SDK or adapter. The whole integration is:
+- one `PROVIDERS` entry in [`src/server/config.ts`](../src/server/config.ts) (base URL + key env),
+- one router line in [`src/server/router.ts`](../src/server/router.ts) (`@cf/*` → cloudflare).
+(Contrast: opencode pulls in dedicated `workers-ai-provider` / `ai-gateway-provider` packages + a
+custom loader, because it's built on the Vercel AI SDK's per-provider abstraction. ada doesn't need
+that for an OpenAI-shaped endpoint.)
+## Troubleshooting
+- **401 / 403** — wrong token or scope. Workers AI needs a Workers-AI-scoped token; the Account ID
+  must match the token's account.
+- **404 on the model** — the `@cf/…` id isn't hosted; check `ada catalog cloudflare` or the Workers
+  AI catalog in the dashboard.
+- **`/cost` says "no price table"** — the model isn't in the baked catalog; run `npm run catalog:refresh`.
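
Reviewer note (not part of the package): the new `cloudflare.md` describes how the endpoint is chosen. `CLOUDFLARE_BASE_URL` overrides the default, and otherwise the Workers AI base is built from `CLOUDFLARE_ACCOUNT_ID`. A minimal sketch of that resolution is below. The environment variable names and the default URL come from the doc; `CloudflareEnv`, `resolveCloudflare`, the error messages, and the trailing-slash trimming are assumptions, not the code in `config.ts`.

```typescript
// Hypothetical sketch of the documented Cloudflare endpoint resolution.
interface CloudflareEnv {
  CLOUDFLARE_BASE_URL?: string;   // AI Gateway "compat" base, if set
  CLOUDFLARE_ACCOUNT_ID?: string; // needed only for the Workers AI default
  CLOUDFLARE_API_TOKEN?: string;  // Workers AI token, gateway token, or provider key
}

interface ResolvedEndpoint {
  baseUrl: string;
  apiKey: string;
}

function resolveCloudflare(env: CloudflareEnv): ResolvedEndpoint {
  const apiKey = env.CLOUDFLARE_API_TOKEN;
  if (!apiKey) throw new Error("CLOUDFLARE_API_TOKEN is not set");

  // An explicit base URL (AI Gateway) overrides the Workers AI default.
  if (env.CLOUDFLARE_BASE_URL) {
    // Trimming trailing slashes is an assumption made for tidy URL joining.
    return { baseUrl: env.CLOUDFLARE_BASE_URL.replace(/\/+$/, ""), apiKey };
  }

  if (!env.CLOUDFLARE_ACCOUNT_ID) {
    throw new Error("Set CLOUDFLARE_ACCOUNT_ID or CLOUDFLARE_BASE_URL");
  }
  // Workers AI's OpenAI-compatible base, as quoted in the doc.
  const account = env.CLOUDFLARE_ACCOUNT_ID;
  return {
    baseUrl: `https://api.cloudflare.com/client/v4/accounts/${account}/ai/v1`,
    apiKey,
  };
}
```

Passing the environment in as a plain object, rather than reading `process.env` inside the function, keeps the sketch testable without touching real credentials.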