ada-agent 0.1.0 → 0.2.0

@@ -1,139 +1,163 @@
1
- # Architecture
2
-
3
- ada is two programs in one repo: a thin **client** (the coding agent) and a **backend** (the
4
- router that holds provider keys). They communicate in one wire format — OpenAI Chat Completions.
5
-
6
- ![ada architecture](architecture.svg)
7
-
8
- ```
9
- ada (client) ada backend upstream providers
10
- ──────────── ─────────── ──────────────────
11
- agentic loop ──── HTTP ───▶ auth (client key)
12
- tools router: model id → provider
13
- sessions adapter: provider → wire format ──▶ Anthropic / OpenAI / …
14
- approval/TUI ◀── SSE ──── normalize back to OpenAI SSE
15
- ```
16
-
17
- Why split it: the backend is the **one control point**. Provider keys, auth, rate limits, and
18
- billing all belong in one place; the client carries only an ada client key. Same shape as Cursor.
19
-
20
- ## Request flow
21
-
22
- 1. The client sends an OpenAI-format chat request (model, messages, tools) to `ADA_BACKEND_URL`
23
- with its client key as the bearer.
24
- 2. The backend authenticates the key (`ADA_CLIENT_KEYS`, or open in dev mode), then `router.ts`
25
- maps the model id → a provider.
26
- 3. The matching **adapter** calls the upstream with the server-held provider key — a pass-through
27
- for OpenAI-compatible providers, a translation for Anthropic.
28
- 4. The backend streams normalized OpenAI SSE chunks back; the client renders text and runs tool
29
- calls, appending one `{role:"tool", tool_call_id, content}` per call and looping.
30
-
31
- ## The agent loop
32
-
33
- ![ada agent loop](agent-loop.svg)
34
-
35
- Each turn streams the model's reply; if it contains tool calls, gated ones prompt for approval,
36
- the tools run, and one `{role:"tool", tool_call_id, content}` per call is appended before control
37
- returns to the model — looping until the model stops calling tools.
38
-
39
- ## Sign in (device flow)
40
-
41
- ![ada login device flow](login-flow.svg)
42
-
43
- GitHub/Google login uses the OAuth 2.0 device authorization grant (RFC 8628): no password ever
44
- reaches ada. The token is stored locally and sent as the bearer; the backend verifies identity in
45
- `server/identity.ts`. The GitHub `client_id` is baked in (public, like `gh`), so the client needs
46
- zero config.
47
-
48
- ## One adapter per wire format
49
-
50
- The key design decision: adapters are keyed by **wire format**, not by provider or model.
51
-
52
- - Most providers speak the OpenAI format and share **`openai-compat.ts`** (a pass-through that just
53
- swaps in the right base URL + key).
54
- - Only a divergent format gets its own adapter — **`anthropic.ts`** translates OpenAI ⇄ Anthropic
55
- Messages and re-emits Anthropic events as OpenAI SSE.
56
-
57
- Consequences:
58
-
59
- | Change | Cost |
60
- |---|---|
61
- | A new model | **0 code** (routing is by id) |
62
- | A new OpenAI-compatible provider | **2 lines** in `config.ts` (base URL + key env) |
63
- | A brand-new wire format | **1 adapter** + one line in `registry.ts` |
64
-
65
- Vendor SDKs load **lazily** (pi-style): a `type`-only import plus a dynamic `import()`, so e.g.
66
- `@anthropic-ai/sdk` never loads unless a Claude request actually arrives.
67
-
68
- ## Routing
69
-
70
- `router.ts` maps a model id to a provider:
71
-
72
- - a model id containing `:` (e.g. `qwen2.5-coder:latest`) → local **Ollama**;
73
- - otherwise by prefix (`gpt*`/`o*` → openai, `claude*` → anthropic, `gemini*` → google,
74
- `mistral*` → mistral, `grok*` → xai, …);
75
- - an explicit `provider` field on the request always wins;
76
- - anything unmatched falls through to **OpenRouter**.
77
-
78
- ## Context compaction
79
-
80
- The client estimates context size (≈ chars / 4) and, when it crosses `ADA_COMPACT_AT` (default
81
- 100k) or a request overflows, summarizes older turns into one compact summary and keeps the recent
82
- ones. `/compact` forces it; `/context` shows the current estimate.
83
-
84
- ## Tool-call recovery
85
-
86
- Some providers (notably **Ollama over a streaming connection**) fail to parse a model's tool call
87
- into the structured `tool_calls` field and leak it into the text as raw JSON. The client detects a
88
- reply that *is* a JSON tool call (plain, ```` ```json ```` fenced, or `<tool_call>`-wrapped) for a
89
- real tool and runs it instead of printing the JSON. Hallucinated tools (no such tool) are left as
90
- text. See `parseTextToolCalls` in `client/agent.ts`.
91
-
92
- ## File layout
93
-
94
- ```
95
- bin/
96
- ada.mjs launcher: register tsx loader → run client/cli.ts
97
- ada-server.mjs launcher: register tsx loader → run server/index.ts
98
-
99
- src/
100
- shared/
101
- types.ts provider/model types shared by client and server
102
-
103
- server/ the routing backend (ada-server | npm run server)
104
- index.ts HTTP entry: auth → route → dispatch to an adapter (+ /v1/models, /v1/whoami)
105
- config.ts providers, base URLs, key env vars, port, client-key auth
106
- router.ts model id → provider
107
- sse.ts Server-Sent Events helpers
108
- identity.ts verify GitHub/Google tokens; allowlist
109
- oauth.ts RFC 8628 device-flow login (built-in GitHub client id)
110
- credentials.ts local credential store
111
- providers/
112
- adapter.ts the Adapter interface ← one adapter per WIRE FORMAT
113
- registry.ts provider → adapter map
114
- openai-compat.ts pass-through OpenAI-compatible adapter
115
- anthropic.ts native Anthropic adapter (lazy @anthropic-ai/sdk)
116
-
117
- client/ the terminal agent (ada | npm start)
118
- cli.ts REPL: flags, model picker, slash commands, approval prompt
119
- agent.ts the agentic loop (stream → tool calls → feed back → repeat)
120
- tools.ts read/write/edit/bash/ls/grep/glob; protected paths; destructive detection
121
- tui.ts inline TUI engine (composer, spinner, user bar)
122
- tui-mode.ts the TUI loop
123
- session.ts append-only JSONL session store (.cos0/sessions/)
124
- compaction.ts context summarization
125
- checkpoint.ts undo: snapshot files before edits, restore on /undo
126
- todos.ts task tracking + render
127
- hooks.ts extension hooks (before/after tool, input transform)
128
- extensions.ts load extensions (tools + hooks + commands)
129
- skills.ts · mcp.ts · prompts.ts skills, MCP servers, prompt templates
130
- settings.ts · platform.ts · render.ts · image.ts · telemetry.ts · pkg.ts
131
-
132
- selfcheck.ts offline checks (tools, sessions, routing, parsers, TUI)
133
- ```
134
-
135
- ## No build step
136
-
137
- Everything runs through `tsx`: TypeScript with no compile. The `bin/*.mjs` launchers register the
138
- tsx ESM loader in-process, then import the relevant `.ts` entrypoint (which self-runs). `tsx` is a
139
- runtime dependency so the global `ada` command works after `npm link` / `npm install -g`.
1
+ # Architecture
2
+
3
+ ada is two programs in one repo: a thin **client** (the coding agent) and a **backend** (the
4
+ router that holds provider keys). They communicate in one wire format — OpenAI Chat Completions.
5
+
6
+ ![ada architecture](architecture.svg)
7
+
8
+ ```
9
+ ada (client) ada backend upstream providers
10
+ ──────────── ─────────── ──────────────────
11
+ agentic loop ──── HTTP ───▶ auth (client key)
12
+ tools router: model id → provider
13
+ sessions adapter: provider → wire format ──▶ Anthropic / OpenAI / …
14
+ approval/TUI ◀── SSE ──── normalize back to OpenAI SSE
15
+ ```
16
+
17
+ Why split it: the backend is the **one control point**. Provider keys, auth, rate limits, and
18
+ billing all belong in one place; the client carries only an ada client key. Same shape as Cursor.
19
+
20
+ ## Request flow
21
+
22
+ 1. The client sends an OpenAI-format chat request (model, messages, tools) to `ADA_BACKEND_URL`
23
+ with its client key as the bearer.
24
+ 2. The backend authenticates the key (`ADA_CLIENT_KEYS`, or open in dev mode), then `router.ts`
25
+ maps the model id → a provider.
26
+ 3. The matching **adapter** calls the upstream with the server-held provider key — a pass-through
27
+ for OpenAI-compatible providers, a translation for Anthropic.
28
+ 4. The backend streams normalized OpenAI SSE chunks back; the client renders text and runs tool
29
+ calls, appending one `{role:"tool", tool_call_id, content}` per call and looping.
30
+
31
+ ## The agent loop
32
+
33
+ ![ada agent loop](agent-loop.svg)
34
+
35
+ Each turn streams the model's reply; if it contains tool calls, gated ones go through the
36
+ **permission mode**, the tools run, and one `{role:"tool", tool_call_id, content}` per call is
37
+ appended before control returns to the model — looping until the model stops calling tools.
38
+
39
+ **Permission modes** (`/ask` · `/plan` · `/auto`, or `/mode` to cycle; shown in the prompt):
40
+
41
+ - **ask** (default) — each gated tool shows a plain-words prompt ("ada wants to run a shell command…")
42
+ and one key: `[y]es` · `[a]uto` · `[p]lan` · `[n]o`. Destructive `bash` always confirms.
43
+ - **plan**: read-only. ada plans but won't edit; `/run` approves and executes.
44
+ - **auto**: runs tools without asking (still confirms destructive `bash`). `--yolo` starts here.
45
+
46
+ **Skills.** ~285 bundled `SKILL.md` instructions load only on demand. A lexical router
47
+ (`client/skill-router.ts`) ranks every request; on a confident, name-exact match ada **auto-applies**
48
+ the skill (injects its procedure, announced `↳ skill: <name>`), otherwise it suggests them. The model
49
+ can also `list_skills` / `find_skill` / `use_skill`. See [orchestration.md](orchestration.md) for the
50
+ strategies (`react`/`plan`/`multi`/`toolsmith`) layered on the same loop.
51
+
52
+ **Programmatic surfaces.** Beyond the REPL/TUI, the same agent drives an HTTP API (`ada serve`), a
53
+ typed SDK, an ACP editor bridge (`ada acp`), and read-only session sharing (`ada share`) — see
54
+ [integrations.md](integrations.md). And it can run **SWE-bench Verified** via [bench/](../bench/).
55
+
56
+ ## Sign in (device flow)
57
+
58
+ ![ada login device flow](login-flow.svg)
59
+
60
+ GitHub/Google login uses the OAuth 2.0 device authorization grant (RFC 8628) — no password ever
61
+ reaches ada. The token is stored locally and sent as the bearer; the backend verifies identity in
62
+ `server/identity.ts`. The GitHub `client_id` is baked in (public, like `gh`), so the client needs
63
+ zero config.
64
+
65
+ ## One adapter per wire format
66
+
67
+ The key design decision: adapters are keyed by **wire format**, not by provider or model.
68
+
69
+ - Most providers speak the OpenAI format and share **`openai-compat.ts`** (a pass-through that just
70
+ swaps in the right base URL + key).
71
+ - Only a divergent format gets its own adapter — **`anthropic.ts`** translates OpenAI ⇄ Anthropic
72
+ Messages and re-emits Anthropic events as OpenAI SSE.
73
+
74
+ Consequences:
75
+
76
+ | Change | Cost |
77
+ |---|---|
78
+ | A new model | **0 code** (routing is by id) |
79
+ | A new OpenAI-compatible provider | **2 lines** in `config.ts` (base URL + key env) |
80
+ | A brand-new wire format | **1 adapter** + one line in `registry.ts` |
81
+
82
+ Vendor SDKs load **lazily** (pi-style): a `type`-only import plus a dynamic `import()`, so e.g.
83
+ `@anthropic-ai/sdk` never loads unless a Claude request actually arrives.
84
+
85
+ ## Routing
86
+
87
+ `router.ts` maps a model id to a provider:
88
+
89
+ - a model id containing `:` (e.g. `qwen2.5-coder:latest`) → local **Ollama**;
90
+ - otherwise by prefix (`gpt*`/`o*` → openai, `claude*` → anthropic, `gemini*` → google,
91
+ `mistral*` → mistral, `grok*` → xai, …);
92
+ - an explicit `provider` field on the request always wins;
93
+ - anything unmatched falls through to **OpenRouter**.
94
+
95
+ ## Context compaction
96
+
97
+ The client estimates context size (≈ chars / 4) and, when it crosses `ADA_COMPACT_AT` (default
98
+ 100k) or a request overflows, summarizes older turns into one compact summary and keeps the recent
99
+ ones. `/compact` forces it; `/context` shows the current estimate.
100
+
101
+ ## Tool-call recovery
102
+
103
+ Some providers (notably **Ollama over a streaming connection**) fail to parse a model's tool call
104
+ into the structured `tool_calls` field and leak it into the text as raw JSON. The client detects a
105
+ reply that *is* a JSON tool call (plain, ```` ```json ```` fenced, or `<tool_call>`-wrapped) for a
106
+ real tool and runs it instead of printing the JSON. Hallucinated tools (no such tool) are left as
107
+ text. See `parseTextToolCalls` in `client/agent.ts`.
108
+
109
+ ## File layout
110
+
111
+ ```
112
+ bin/
113
+ ada.mjs launcher: register tsx loader → run client/cli.ts
114
+ ada-server.mjs launcher: register tsx loader → run server/index.ts
115
+
116
+ src/
117
+ shared/
118
+ types.ts provider/model types shared by client and server
119
+
120
+ server/ the routing backend (ada-server | npm run server)
121
+ index.ts HTTP entry: auth → route → dispatch to an adapter (+ /v1/models, /v1/whoami)
122
+ config.ts providers, base URLs, key env vars, port, client-key auth
123
+ router.ts model id → provider
124
+ sse.ts Server-Sent Events helpers
125
+ identity.ts verify GitHub/Google tokens; allowlist
126
+ oauth.ts RFC 8628 device-flow login (built-in GitHub client id)
127
+ credentials.ts local credential store
128
+ providers/
129
+ adapter.ts the Adapter interface ← one adapter per WIRE FORMAT
130
+ registry.ts provider → adapter map
131
+ openai-compat.ts pass-through OpenAI-compatible adapter
132
+ anthropic.ts native Anthropic adapter (lazy @anthropic-ai/sdk)
133
+
134
+ client/ the terminal agent (ada | npm start)
135
+ cli.ts REPL: flags, model picker, slash commands, ask/plan/auto modes + approval
136
+ agent.ts the agentic loop (stream → tool calls → feed back → repeat) + orchestrators
137
+ tools.ts read_file/write_file/edit_file · apply_patch · bash (PTY) · ls/glob/grep (rg)
138
+ · web_fetch/web_search · lsp_diagnostics · ask_user; protected paths;
139
+ destructive detection; trust-gated auto-format
140
+ tui.ts inline TUI engine (composer, spinner, user bar)
141
+ tui-mode.ts the TUI loop
142
+ session.ts append-only JSONL session store (.ada/sessions/)
143
+ compaction.ts context summarization
144
+ checkpoint.ts · snapshot.ts undo (revert edits) · whole-tree git snapshot/restore
145
+ skills.ts · skill-router.ts skills + the relevance router (auto-apply)
146
+ mcp.ts · prompts.ts · background.ts · models-dev.ts · lsp.ts connectors, templates,
147
+ background jobs, models.dev catalog, LSP client
148
+ todos.ts · hooks.ts · extensions.ts tasks; extension hooks + tools + commands
149
+ settings.ts · platform.ts · render.ts · image.ts · telemetry.ts · pkg.ts
150
+
151
+ sdk/index.ts typed client for the HTTP API (`ada serve`)
152
+ selfcheck.ts offline checks (tools, sessions, routing, parsers, TUI, classifiers)
153
+
154
+ bench/
155
+ swebench.mjs SWE-bench Verified prediction generator (scored by the official harness)
156
+ ```
157
+
158
+ ## No build step
159
+
160
+ Everything runs through `tsx` — TypeScript with no compile. The `bin/*.mjs` launchers register the
161
+ tsx ESM loader in-process, then import the relevant `.ts` entrypoint (which self-runs). `tsx` is a
162
+ runtime dependency so the global `ada` command works after `npx ada-agent`, `npm install -g ada-agent`,
163
+ or `npm link` from a clone. (`node-pty` is the one native dep, so a C toolchain is needed at install.)
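
The routing rules listed under "Routing" in the hunk above can be sketched as a small pure function. This is only an illustration under stated assumptions: `Provider`, `PREFIXES`, and `routeModel` are hypothetical names, not taken from ada's `router.ts`, and the `@cf/` entry follows the Cloudflare doc added later in this diff. The literal `o` prefix mirrors the documented `o*` rule and is deliberately broad here; the real router may well be narrower.

```typescript
// Hypothetical sketch of the documented routing order; not ada's actual code.
type Provider =
  | "ollama" | "openai" | "anthropic" | "google"
  | "mistral" | "xai" | "cloudflare" | "openrouter";

// Prefix table in the order the docs list it (assumed, illustrative only).
const PREFIXES: Array<[string, Provider]> = [
  ["@cf/", "cloudflare"],
  ["gpt", "openai"],
  ["o", "openai"],
  ["claude", "anthropic"],
  ["gemini", "google"],
  ["mistral", "mistral"],
  ["grok", "xai"],
];

function routeModel(model: string, explicit?: Provider): Provider {
  // 1. An explicit `provider` field on the request always wins.
  if (explicit) return explicit;
  // 2. A model id containing ":" (e.g. "qwen2.5-coder:latest") goes to local Ollama.
  if (model.includes(":")) return "ollama";
  // 3. Otherwise match by prefix.
  for (const [prefix, provider] of PREFIXES) {
    if (model.startsWith(prefix)) return provider;
  }
  // 4. Anything unmatched falls through to OpenRouter.
  return "openrouter";
}
```

Because routing depends only on the id string, adding a model needs no code, which matches the "0 code" row in the consequences table.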
@@ -1,73 +1,73 @@
1
- <svg viewBox="0 0 920 470" xmlns="http://www.w3.org/2000/svg" font-family="ui-sans-serif, system-ui, sans-serif" role="img" aria-labelledby="t d">
2
- <title id="t">ada architecture</title>
3
- <desc id="d">The ada terminal client sends OpenAI Chat Completions over HTTP to the ada backend, which authenticates the client key, routes by model id, adapts to each provider's wire format, and streams normalized SSE back. The backend holds every provider key and reaches Anthropic via a native adapter and all other providers via a shared OpenAI-compatible adapter.</desc>
4
-
5
- <defs>
6
- <marker id="fwd" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#ffaf00"/></marker>
7
- <marker id="back" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#3fb950"/></marker>
8
- </defs>
9
-
10
- <!-- panel -->
11
- <rect x="6" y="6" width="908" height="458" rx="16" fill="#0d0f12" stroke="#262b33"/>
12
- <rect x="34" y="34" width="14" height="14" rx="4" transform="rotate(45 41 41)" fill="#ffaf00"/>
13
- <text x="60" y="40" fill="#ffaf00" font-size="17" font-weight="700">ada · architecture</text>
14
- <text x="60" y="60" fill="#9aa3af" font-size="12">terminal client → routing backend → providers · one wire format throughout</text>
15
-
16
- <!-- request / response lanes -->
17
- <g font-family="ui-monospace, monospace" font-size="10.5">
18
- <!-- top: request (gold, →) -->
19
- <line x1="226" y1="200" x2="340" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
20
- <text x="283" y="190" fill="#c5cdd6" text-anchor="middle">OpenAI Chat</text>
21
- <text x="283" y="178" fill="#9aa3af" text-anchor="middle" font-size="9">Completions · HTTP</text>
22
- <line x1="652" y1="200" x2="710" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
23
- <text x="683" y="190" fill="#c5cdd6" text-anchor="middle">+ key</text>
24
-
25
- <!-- bottom: response (green, ←) -->
26
- <line x1="710" y1="320" x2="652" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
27
- <text x="683" y="338" fill="#7ee08a" text-anchor="middle">SSE</text>
28
- <line x1="340" y1="320" x2="226" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
29
- <text x="283" y="338" fill="#7ee08a" text-anchor="middle">normalized OpenAI SSE</text>
30
- </g>
31
-
32
- <!-- client card -->
33
- <rect x="42" y="138" width="184" height="212" rx="12" fill="#14171c" stroke="#262b33"/>
34
- <text x="134" y="166" fill="#ffaf00" font-size="16" font-weight="700" text-anchor="middle">ada client</text>
35
- <text x="134" y="183" fill="#9aa3af" font-size="10.5" text-anchor="middle">the terminal</text>
36
- <g font-family="ui-monospace, monospace" font-size="11" fill="#c5cdd6" text-anchor="middle">
37
- <text x="134" y="207">agentic loop · tools</text>
38
- <text x="134" y="226">285 skills · MCP</text>
39
- <text x="134" y="245">ask · plan · auto</text>
40
- <text x="134" y="264">REPL · TUI · sessions</text>
41
- <text x="134" y="283" fill="#6b7480">serve · SDK · ACP</text>
42
- </g>
43
- <rect x="80" y="303" width="108" height="22" rx="11" fill="#0d0f12" stroke="#262b33"/>
44
- <text x="134" y="318" fill="#6b7480" font-size="10" text-anchor="middle" font-family="ui-monospace, monospace">holds no keys</text>
45
-
46
- <!-- backend card -->
47
- <rect x="346" y="118" width="306" height="244" rx="12" fill="#101318" stroke="#262b33"/>
48
- <text x="499" y="146" fill="#ffaf00" font-size="15" font-weight="700" text-anchor="middle">ada backend</text>
49
- <text x="499" y="163" fill="#9aa3af" font-size="10" text-anchor="middle">the one control point — holds every key</text>
50
- <g font-family="ui-monospace, monospace" font-size="11.5">
51
- <g><rect x="366" y="176" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="195" fill="#e6e9ee"><tspan fill="#ffaf00">1</tspan> auth · client key</text></g>
52
- <g><rect x="366" y="214" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="233" fill="#e6e9ee"><tspan fill="#ffaf00">2</tspan> route · model id → provider</text></g>
53
- <g><rect x="366" y="252" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="271" fill="#e6e9ee"><tspan fill="#ffaf00">3</tspan> adapt · one per wire format</text></g>
54
- <g><rect x="366" y="290" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="309" fill="#e6e9ee"><tspan fill="#ffaf00">4</tspan> normalize → OpenAI SSE</text></g>
55
- </g>
56
-
57
- <!-- providers card -->
58
- <rect x="716" y="150" width="162" height="180" rx="12" fill="#14171c" stroke="#262b33"/>
59
- <text x="797" y="180" fill="#e6e9ee" font-size="13" font-weight="700" text-anchor="middle">providers</text>
60
- <text x="797" y="202" fill="#ff7a59" font-size="12" font-weight="700" text-anchor="middle" font-family="ui-monospace, monospace">Anthropic <tspan fill="#6b7480" font-weight="400" font-size="9">native</tspan></text>
61
- <line x1="732" y1="214" x2="862" y2="214" stroke="#262b33"/>
62
- <g font-family="ui-monospace, monospace" font-size="10.5" fill="#c5cdd6" text-anchor="middle">
63
- <text x="797" y="234">OpenAI · Gemini · Groq</text>
64
- <text x="797" y="252">Mistral · DeepSeek · xAI</text>
65
- <text x="797" y="270">Together · DashScope</text>
66
- <text x="797" y="288">Ollama · OpenRouter</text>
67
- </g>
68
- <text x="797" y="312" fill="#ffaf00" font-size="9.5" text-anchor="middle" font-family="ui-monospace, monospace">all but Anthropic: openai-compat</text>
69
-
70
- <!-- footer note -->
71
- <text x="460" y="406" fill="#9aa3af" font-size="11" text-anchor="middle" font-family="ui-monospace, monospace">a new model = 0 code · a new OpenAI-compatible provider = 2 lines · a new wire format = 1 adapter</text>
72
- <text x="460" y="430" fill="#6b7480" font-size="10" text-anchor="middle">vendor SDKs load lazily — the Anthropic SDK never loads unless a Claude request arrives</text>
73
- </svg>
1
+ <svg viewBox="0 0 920 470" xmlns="http://www.w3.org/2000/svg" font-family="ui-sans-serif, system-ui, sans-serif" role="img" aria-labelledby="t d">
2
+ <title id="t">ada architecture</title>
3
+ <desc id="d">The ada terminal client sends OpenAI Chat Completions over HTTP to the ada backend, which authenticates the client key, routes by model id, adapts to each provider's wire format, and streams normalized SSE back. The backend holds every provider key and reaches Anthropic via a native adapter and all other providers via a shared OpenAI-compatible adapter.</desc>
4
+
5
+ <defs>
6
+ <marker id="fwd" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#ffaf00"/></marker>
7
+ <marker id="back" markerWidth="9" markerHeight="9" refX="7" refY="4.5" orient="auto"><path d="M0,0 L9,4.5 L0,9 z" fill="#3fb950"/></marker>
8
+ </defs>
9
+
10
+ <!-- panel -->
11
+ <rect x="6" y="6" width="908" height="458" rx="16" fill="#0d0f12" stroke="#262b33"/>
12
+ <rect x="34" y="34" width="14" height="14" rx="4" transform="rotate(45 41 41)" fill="#ffaf00"/>
13
+ <text x="60" y="40" fill="#ffaf00" font-size="17" font-weight="700">ada · architecture</text>
14
+ <text x="60" y="60" fill="#9aa3af" font-size="12">terminal client → routing backend → providers · one wire format throughout</text>
15
+
16
+ <!-- request / response lanes -->
17
+ <g font-family="ui-monospace, monospace" font-size="10.5">
18
+ <!-- top: request (gold, →) -->
19
+ <line x1="226" y1="200" x2="340" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
20
+ <text x="283" y="190" fill="#c5cdd6" text-anchor="middle">OpenAI Chat</text>
21
+ <text x="283" y="178" fill="#9aa3af" text-anchor="middle" font-size="9">Completions · HTTP</text>
22
+ <line x1="652" y1="200" x2="710" y2="200" stroke="#ffaf00" stroke-width="1.6" marker-end="url(#fwd)"/>
23
+ <text x="683" y="190" fill="#c5cdd6" text-anchor="middle">+ key</text>
24
+
25
+ <!-- bottom: response (green, ←) -->
26
+ <line x1="710" y1="320" x2="652" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
27
+ <text x="683" y="338" fill="#7ee08a" text-anchor="middle">SSE</text>
28
+ <line x1="340" y1="320" x2="226" y2="320" stroke="#3fb950" stroke-width="1.6" marker-end="url(#back)"/>
29
+ <text x="283" y="338" fill="#7ee08a" text-anchor="middle">normalized OpenAI SSE</text>
30
+ </g>
31
+
32
+ <!-- client card -->
33
+ <rect x="42" y="138" width="184" height="212" rx="12" fill="#14171c" stroke="#262b33"/>
34
+ <text x="134" y="166" fill="#ffaf00" font-size="16" font-weight="700" text-anchor="middle">ada client</text>
35
+ <text x="134" y="183" fill="#9aa3af" font-size="10.5" text-anchor="middle">the terminal</text>
36
+ <g font-family="ui-monospace, monospace" font-size="11" fill="#c5cdd6" text-anchor="middle">
37
+ <text x="134" y="207">agentic loop · tools</text>
38
+ <text x="134" y="226">285 skills · MCP</text>
39
+ <text x="134" y="245">ask · plan · auto</text>
40
+ <text x="134" y="264">REPL · TUI · sessions</text>
41
+ <text x="134" y="283" fill="#6b7480">serve · SDK · ACP</text>
42
+ </g>
43
+ <rect x="80" y="303" width="108" height="22" rx="11" fill="#0d0f12" stroke="#262b33"/>
44
+ <text x="134" y="318" fill="#6b7480" font-size="10" text-anchor="middle" font-family="ui-monospace, monospace">holds no keys</text>
45
+
46
+ <!-- backend card -->
47
+ <rect x="346" y="118" width="306" height="244" rx="12" fill="#101318" stroke="#262b33"/>
48
+ <text x="499" y="146" fill="#ffaf00" font-size="15" font-weight="700" text-anchor="middle">ada backend</text>
49
+ <text x="499" y="163" fill="#9aa3af" font-size="10" text-anchor="middle">the one control point — holds every key</text>
50
+ <g font-family="ui-monospace, monospace" font-size="11.5">
51
+ <g><rect x="366" y="176" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="195" fill="#e6e9ee"><tspan fill="#ffaf00">1</tspan> auth · client key</text></g>
52
+ <g><rect x="366" y="214" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="233" fill="#e6e9ee"><tspan fill="#ffaf00">2</tspan> route · model id → provider</text></g>
53
+ <g><rect x="366" y="252" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="271" fill="#e6e9ee"><tspan fill="#ffaf00">3</tspan> adapt · one per wire format</text></g>
54
+ <g><rect x="366" y="290" width="266" height="30" rx="8" fill="#14171c" stroke="#262b33"/><text x="380" y="309" fill="#e6e9ee"><tspan fill="#ffaf00">4</tspan> normalize → OpenAI SSE</text></g>
55
+ </g>
56
+
57
+ <!-- providers card -->
58
+ <rect x="716" y="150" width="162" height="180" rx="12" fill="#14171c" stroke="#262b33"/>
59
+ <text x="797" y="180" fill="#e6e9ee" font-size="13" font-weight="700" text-anchor="middle">providers</text>
60
+ <text x="797" y="202" fill="#ff7a59" font-size="12" font-weight="700" text-anchor="middle" font-family="ui-monospace, monospace">Anthropic <tspan fill="#6b7480" font-weight="400" font-size="9">native</tspan></text>
61
+ <line x1="732" y1="214" x2="862" y2="214" stroke="#262b33"/>
62
+ <g font-family="ui-monospace, monospace" font-size="10.5" fill="#c5cdd6" text-anchor="middle">
63
+ <text x="797" y="234">OpenAI · Gemini · Groq</text>
64
+ <text x="797" y="252">Mistral · DeepSeek · xAI</text>
65
+ <text x="797" y="270">Together · DashScope</text>
66
+ <text x="797" y="288">Ollama · OpenRouter</text>
67
+ </g>
68
+ <text x="797" y="312" fill="#ffaf00" font-size="9.5" text-anchor="middle" font-family="ui-monospace, monospace">all but Anthropic: openai-compat</text>
69
+
70
+ <!-- footer note -->
71
+ <text x="460" y="406" fill="#9aa3af" font-size="11" text-anchor="middle" font-family="ui-monospace, monospace">a new model = 0 code · a new OpenAI-compatible provider = 2 lines · a new wire format = 1 adapter</text>
72
+ <text x="460" y="430" fill="#6b7480" font-size="10" text-anchor="middle">vendor SDKs load lazily — the Anthropic SDK never loads unless a Claude request arrives</text>
73
+ </svg>
@@ -0,0 +1,81 @@
1
+ # Using Cloudflare models with ada
2
+
3
+ Cloudflare gives you two OpenAI-compatible endpoints, and ada speaks OpenAI — so both are just the
4
+ `cloudflare` provider with the right env vars. Pick the one you have:
5
+
6
+ - **Workers AI** — Cloudflare *hosts* the model (Llama, Qwen, Gemma, Kimi, …). Simplest.
7
+ - **AI Gateway** — Cloudflare *proxies* other providers (OpenAI/Anthropic/Workers AI/…) through one
8
+ endpoint, with caching + analytics + optional unified billing.
9
+
10
+ Browse what's available and its pricing any time, offline:
11
+
12
+ ```bash
13
+ ada catalog cloudflare # Workers AI + AI Gateway models, context + $/1M
14
+ ```
15
+
16
+ ---
17
+
18
+ ## Workers AI (recommended start)
19
+
20
+ 1. **Cloudflare dashboard → AI → Workers AI → "Use REST API".** Copy your **Account ID** and
21
+ **create an API token** (Workers AI scope).
22
+ 2. Set the env vars for the backend:
23
+ ```bash
24
+ export CLOUDFLARE_ACCOUNT_ID=your-32-char-account-id
25
+ export CLOUDFLARE_API_TOKEN=your-workers-ai-token
26
+ ```
27
+ 3. Start the backend and run ada with a `@cf/…` model id:
28
+ ```bash
29
+ ada-server
30
+ ada --model "@cf/moonshotai/kimi-k2.7-code" # or any id from `ada catalog cloudflare`
31
+ ```
32
+
33
+ That's it. ada routes `@cf/*` to Cloudflare automatically, sends the full id through, and `/cost`
34
+ already knows the price from the catalog.
35
+
36
+ > The default endpoint ada builds is
37
+ > `https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1` — Workers AI's
38
+ > OpenAI-compatible base. No code change needed.
39
+
40
+ ---
41
+
42
+ ## AI Gateway
43
+
44
+ 1. **Cloudflare dashboard → AI → AI Gateway → create a gateway.** Note your **Account ID** and
45
+ **Gateway ID**, and grab the gateway's **OpenAI-compatible endpoint URL** (the "compat" base).
46
+ 2. Point ada at that URL and supply the token it expects (your gateway token, or the upstream
47
+ provider's key, depending on how the gateway is configured):
48
+ ```bash
49
+ export CLOUDFLARE_BASE_URL="https://gateway.ai.cloudflare.com/v1/<account>/<gateway>/compat"
50
+ export CLOUDFLARE_API_TOKEN=your-gateway-or-provider-key
51
+ ```
52
+ (`CLOUDFLARE_BASE_URL` overrides the Workers AI default, so `CLOUDFLARE_ACCOUNT_ID` isn't needed.)
53
+ 3. Use the model id format your gateway expects (often `provider/model`, e.g. `openai/gpt-4o`), and
54
+ route it explicitly to the `cloudflare` provider — easiest is the `--provider` field or an
55
+ `@cf/`-style id; otherwise send `provider: "cloudflare"` on the request.
56
+
57
+ > Copy the exact base URL from your AI Gateway page — Cloudflare shows the OpenAI-compatible endpoint
58
+ > there. ada just proxies to whatever you set.
59
+
60
+ ---
61
+
62
+ ## How it works (why it's only ~2 lines in ada)
63
+
64
+ ada keys providers by **wire format**, not by vendor. Cloudflare's Workers AI and AI Gateway both
65
+ emit the OpenAI Chat Completions format, so they reuse the shared `openai-compat.ts` adapter — no
66
+ Cloudflare-specific SDK or adapter. The whole integration is:
67
+
68
+ - one `PROVIDERS` entry in [`src/server/config.ts`](../src/server/config.ts) (base URL + key env),
69
+ - one router line in [`src/server/router.ts`](../src/server/router.ts) (`@cf/*` → cloudflare).
70
+
71
+ (Contrast: opencode pulls in dedicated `workers-ai-provider` / `ai-gateway-provider` packages + a
72
+ custom loader, because it's built on the Vercel AI SDK's per-provider abstraction. ada doesn't need
73
+ that for an OpenAI-shaped endpoint.)
74
+
75
+ ## Troubleshooting
76
+
77
+ - **401 / 403** — wrong token or scope. Workers AI needs a Workers-AI-scoped token; the Account ID
78
+ must match the token's account.
79
+ - **404 on the model** — the `@cf/…` id isn't hosted; check `ada catalog cloudflare` or the Workers
80
+ AI catalog in the dashboard.
81
+ - **`/cost` says "no price table"** — the model isn't in the baked catalog; run `npm run catalog:refresh`.
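
The endpoint selection described in the Cloudflare doc above (a `CLOUDFLARE_BASE_URL` override, else the Workers AI default built from `CLOUDFLARE_ACCOUNT_ID`) might look roughly like this. It is a minimal sketch: the function name `cloudflareBaseUrl`, the trailing-slash trimming, and the error message are assumptions for illustration, and ada's `config.ts` may resolve the URL differently.

```typescript
// Hypothetical helper; only the env var names and the default URL come from the doc.
function cloudflareBaseUrl(env: Record<string, string | undefined>): string {
  // An explicit base URL (e.g. an AI Gateway "compat" endpoint) overrides the default.
  const override = env.CLOUDFLARE_BASE_URL;
  if (override) return override.replace(/\/+$/, ""); // trim trailing slashes (assumed)

  // Otherwise build the Workers AI OpenAI-compatible base from the account id.
  const account = env.CLOUDFLARE_ACCOUNT_ID;
  if (!account) {
    throw new Error("Set CLOUDFLARE_ACCOUNT_ID or CLOUDFLARE_BASE_URL");
  }
  return `https://api.cloudflare.com/client/v4/accounts/${account}/ai/v1`;
}
```

In a Node process this would be called as `cloudflareBaseUrl(process.env)`; taking the environment as a parameter keeps the sketch testable without touching real settings.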