@vedmalex/ai-connect 0.2.0

Files changed (69)
  1. package/LICENSE +21 -0
  2. package/README.md +1266 -0
  3. package/dist/browser/index.js +6690 -0
  4. package/dist/browser/index.js.map +7 -0
  5. package/dist/bun/index.js +10073 -0
  6. package/dist/bun/index.js.map +7 -0
  7. package/dist/bun/local.js +9753 -0
  8. package/dist/bun/local.js.map +7 -0
  9. package/dist/node/index.js +10073 -0
  10. package/dist/node/index.js.map +7 -0
  11. package/dist/node/local.js +9753 -0
  12. package/dist/node/local.js.map +7 -0
  13. package/dist/types/acp-presets.d.ts +2 -0
  14. package/dist/types/acp-presets.d.ts.map +1 -0
  15. package/dist/types/acp.d.ts +39 -0
  16. package/dist/types/acp.d.ts.map +1 -0
  17. package/dist/types/browser.d.ts +5 -0
  18. package/dist/types/browser.d.ts.map +1 -0
  19. package/dist/types/bun.d.ts +2 -0
  20. package/dist/types/bun.d.ts.map +1 -0
  21. package/dist/types/catalog.d.ts +33 -0
  22. package/dist/types/catalog.d.ts.map +1 -0
  23. package/dist/types/cli-presets.d.ts +12 -0
  24. package/dist/types/cli-presets.d.ts.map +1 -0
  25. package/dist/types/cli.d.ts +20 -0
  26. package/dist/types/cli.d.ts.map +1 -0
  27. package/dist/types/client.d.ts +3 -0
  28. package/dist/types/client.d.ts.map +1 -0
  29. package/dist/types/config.d.ts +30 -0
  30. package/dist/types/config.d.ts.map +1 -0
  31. package/dist/types/default-handlers.d.ts +5 -0
  32. package/dist/types/default-handlers.d.ts.map +1 -0
  33. package/dist/types/errors.d.ts +15 -0
  34. package/dist/types/errors.d.ts.map +1 -0
  35. package/dist/types/fanout.d.ts +20 -0
  36. package/dist/types/fanout.d.ts.map +1 -0
  37. package/dist/types/files.d.ts +42 -0
  38. package/dist/types/files.d.ts.map +1 -0
  39. package/dist/types/image.d.ts +40 -0
  40. package/dist/types/image.d.ts.map +1 -0
  41. package/dist/types/index.browser.d.ts +18 -0
  42. package/dist/types/index.browser.d.ts.map +1 -0
  43. package/dist/types/index.d.ts +19 -0
  44. package/dist/types/index.d.ts.map +1 -0
  45. package/dist/types/local-handlers.d.ts +12 -0
  46. package/dist/types/local-handlers.d.ts.map +1 -0
  47. package/dist/types/local.d.ts +6 -0
  48. package/dist/types/local.d.ts.map +1 -0
  49. package/dist/types/logging.d.ts +8 -0
  50. package/dist/types/logging.d.ts.map +1 -0
  51. package/dist/types/mock-gateway.d.ts +97 -0
  52. package/dist/types/mock-gateway.d.ts.map +1 -0
  53. package/dist/types/model-reference.d.ts +85 -0
  54. package/dist/types/model-reference.d.ts.map +1 -0
  55. package/dist/types/node.d.ts +2 -0
  56. package/dist/types/node.d.ts.map +1 -0
  57. package/dist/types/probe.d.ts +71 -0
  58. package/dist/types/probe.d.ts.map +1 -0
  59. package/dist/types/router.d.ts +41 -0
  60. package/dist/types/router.d.ts.map +1 -0
  61. package/dist/types/runtime.d.ts +6 -0
  62. package/dist/types/runtime.d.ts.map +1 -0
  63. package/dist/types/server-presets.d.ts +9 -0
  64. package/dist/types/server-presets.d.ts.map +1 -0
  65. package/dist/types/server.d.ts +12 -0
  66. package/dist/types/server.d.ts.map +1 -0
  67. package/dist/types/types.d.ts +1346 -0
  68. package/dist/types/types.d.ts.map +1 -0
  69. package/package.json +94 -0
package/README.md ADDED
@@ -0,0 +1,1266 @@
1
+ # ai-connect
2
+
3
+ `ai-connect` is a Bun-first TypeScript library for unified access to AI providers from browser and local runtimes.
4
+
5
+ It models routes as `provider + transport + account + credential + model`, so one client can combine:
6
+
7
+ - direct APIs for `OpenAI`, `Anthropic`, and `Gemini`
8
+ - local-only ACP harness routes for `Claude Code`, `Codex`, and `Gemini`
9
+ - key rotation, account rotation, cooldowns, retries, and fallback chains
10
+ - portable file, PDF document, and image inputs across direct API and ACP paths
11
+ - cooperative cancellation, pause-with-partial, and per-operation timeouts
12
+ - incremental streaming deltas, live health checks, and read-only model probes
13
+ - context-window resolution, client-safe route projection, and fan-out throttling
14
+
15
+ ## Status
16
+
17
+ Implemented today:
18
+
19
+ - `OpenAI API`, `Anthropic API`, `Gemini API`
20
+ - `Claude Code ACP`, `Codex ACP`, `Gemini ACP`
21
+ - `Gemini CLI`, `Qwen CLI`, `Claude/OpenClaude CLI`, `Codex CLI`
22
+ - `OpenCode Server`
23
+ - browser and local client factories
24
+ - env-backed key pools with delimiter-based rotation
25
+ - image generation helpers for `OpenAI` and `Gemini`
26
+ - text, image, and PDF document attachment inlining for direct API prompts
27
+ - portable file normalization for paths, `File`, `Blob`, data URLs, and remote URIs
28
+ - cancellation, pause, and per-operation timeouts on `generate()` and `stream()`
29
+ - incremental streaming deltas for OpenAI SSE and ACP routes
30
+ - live two-stage health checks and read-only broken-model probes
31
+ - context-window resolution with a browser-safe curated model reference
32
+ - client-safe route/candidate projection for untrusted UI/agent surfaces
33
+ - per-route model allowlist modes, unknown-selector degrade policy, and a pluggable model selector
34
+ - client-side fan-out throttling (concurrency, rate, lifetime call ceiling)
35
+
36
+ ## Built-In Provider Scope
37
+
38
+ Built-in HTTP handlers exist for:
39
+
40
+ - `openai`
41
+ - `anthropic`
42
+ - `gemini`
43
+
44
+ `gemini` is the canonical provider id for the Google Gemini stack.
45
+
46
+ `google` is not a supported provider id. Use `gemini`.
47
+
48
+ Custom provider ids are accepted by config normalization, but they are not automatically backed by built-in HTTP handlers.
49
+
50
+ Use these rules:
51
+
52
+ - for OpenAI-compatible APIs such as `OpenRouter`, keep `provider: "openai"` and override `transport.baseUrl`
53
+ - for Anthropic-compatible APIs, keep `provider: "anthropic"` and override `transport.baseUrl`
54
+ - for Gemini-compatible APIs, keep `provider: "gemini"` and override `transport.baseUrl`
55
+ - use a truly custom provider id only when you also supply a custom handler, or when the route uses `cli`, `acp`, or `server`
56
+
57
+ ## Install
58
+
59
+ This package is published to the npm registry as the public scoped package `@vedmalex/ai-connect`.
60
+
61
+ ```bash
62
+ npm install @vedmalex/ai-connect
63
+ ```
64
+
65
+ With Bun:
66
+
67
+ ```bash
68
+ bun add @vedmalex/ai-connect
69
+ ```
70
+
71
+ You can still consume it directly from GitHub if you need a specific unpublished commit:
72
+
73
+ ```json
74
+ {
75
+ "dependencies": {
76
+ "@vedmalex/ai-connect": "git+ssh://git@github.com/vedmalex/ai-connect.git#<commit-sha>"
77
+ }
78
+ }
79
+ ```
80
+
81
+ If your consumer uses Bun against the GitHub form, also add:
82
+
83
+ ```json
84
+ {
85
+ "trustedDependencies": [
86
+ "@vedmalex/ai-connect"
87
+ ]
88
+ }
89
+ ```
90
+
91
+ Full integration notes:
92
+
93
+ - [docs/integration-via-git.md](./docs/integration-via-git.md)
94
+ - [docs/reference-demos.md](./docs/reference-demos.md)
95
+ - [docs/usage-spec.md](./docs/usage-spec.md)
96
+
97
+ To build the workspace locally from source:
98
+
99
+ ```bash
100
+ bun install
101
+ ```
102
+
103
+ ## Reference Demos
104
+
105
+ This repository includes two full reference applications in monorepo form:
106
+
107
+ - [apps/web-demo](./apps/web-demo)
108
+ - [apps/local-demo](./apps/local-demo)
109
+
110
+ They are intended as copyable blueprints for real products.
111
+ The web demo exposes settings through explicit windows and form controls.
112
+ The local demo exposes settings through JSONC and a TUI workflow.
113
+
114
+ Run them from the repository root:
115
+
116
+ ```bash
117
+ bun run dev:web-demo
118
+ bun run dev:local-demo
119
+ ```
120
+
121
+ The shared contract used by both demos lives in:
122
+
123
+ - [packages/reference-demo-core](./packages/reference-demo-core)
124
+
125
+ Full demo guide:
126
+
127
+ - [reference-demos.md](./docs/reference-demos.md)
128
+
129
+ ## Quick Start
130
+
131
+ ```ts
132
+ import { createBrowserClient, defineConfig } from "@vedmalex/ai-connect/browser";
133
+
134
+ const client = createBrowserClient(
135
+ defineConfig({
136
+ providers: {
137
+ openai: {
138
+ accounts: [
139
+ {
140
+ id: "main",
141
+ transport: "api",
142
+ models: ["gpt-4.1"],
143
+ credentials: [{ apiKeyEnv: "OPENAI_API_KEY" }],
144
+ },
145
+ ],
146
+ },
147
+ },
148
+ }),
149
+ {
150
+ runtime: {
151
+ getEnv: (name) => import.meta.env[name],
152
+ },
153
+ },
154
+ );
155
+
156
+ const result = await client.generate({
157
+ messages: [{ role: "user", content: "Summarize this design brief." }],
158
+ });
159
+
160
+ console.log(result.text);
161
+ ```
162
+
163
+ For built-in API providers, `transport.baseUrl` is optional. If you omit it, `ai-connect` uses the official upstream defaults:
164
+
165
+ - `openai` -> `https://api.openai.com/v1/...`
166
+ - `anthropic` -> `https://api.anthropic.com/v1/...`
167
+ - `gemini` -> `https://generativelanguage.googleapis.com/v1beta/...`
168
+
169
+ ## Local ACP Example
170
+
171
+ ```ts
172
+ import { createLocalClient, defineConfig } from "@vedmalex/ai-connect";
173
+
174
+ const client = createLocalClient(
175
+ defineConfig({
176
+ providers: {
177
+ anthropic: {
178
+ accounts: [
179
+ {
180
+ id: "subscription",
181
+ transport: {
182
+ kind: "acp",
183
+ id: "claude-code-acp",
184
+ },
185
+ models: ["claude-sonnet-4"],
186
+ },
187
+ ],
188
+ },
189
+ },
190
+ }),
191
+ {
192
+ acp: {
193
+ permissionMode: "approve-reads",
194
+ commands: {
195
+ "anthropic:claude-code-acp":
196
+ "npx -y @agentclientprotocol/claude-agent-acp@^0.25.0",
197
+ },
198
+ },
199
+ },
200
+ );
201
+
202
+ const result = await client.generate({
203
+ messages: [{ role: "user", content: "Review this repository layout." }],
204
+ });
205
+
206
+ console.log(result.text);
207
+ ```
208
+
209
+ If the host application is running in one folder but the inference should use another project folder as local context, pass `workingDirectory` per request:
210
+
211
+ ```ts
212
+ const result = await client.generate({
213
+ workingDirectory: "/Users/vedmalex/work/scancheck-target",
214
+ messages: [{ role: "user", content: "Review this repository layout." }],
215
+ });
216
+ ```
217
+
218
+ Dedicated provider-specific ACP examples:
219
+
220
+ - [examples/acp-claude.ts](./examples/acp-claude.ts)
221
+ - [examples/acp-codex.ts](./examples/acp-codex.ts)
222
+ - [examples/acp-gemini.ts](./examples/acp-gemini.ts)
223
+
224
+ Dedicated `clientTools` examples:
225
+
226
+ - [examples/browser-client-tools.ts](./examples/browser-client-tools.ts)
227
+ - [examples/local-client-tools.ts](./examples/local-client-tools.ts)
228
+
229
+ For Gemini ACP you can choose the harness auth mode per connection by creating separate accounts:
230
+
231
+ ```ts
232
+ const client = createLocalClient(
233
+ defineConfig({
234
+ providers: {
235
+ gemini: {
236
+ accounts: [
237
+ {
238
+ id: "gemini-default",
239
+ transport: { kind: "acp", id: "gemini-acp" },
240
+ models: ["auto-gemini-3"],
241
+ },
242
+ {
243
+ id: "gemini-oauth",
244
+ transport: {
245
+ kind: "acp",
246
+ id: "gemini-acp",
247
+ auth: {
248
+ methodId: "oauth-personal",
249
+ },
250
+ },
251
+ models: ["auto-gemini-3"],
252
+ },
253
+ ],
254
+ },
255
+ },
256
+ routing: {
257
+ operations: {
258
+ text: ["gemini:gemini-acp:gemini-oauth:auto-gemini-3"],
259
+ },
260
+ },
261
+ }),
262
+ );
263
+ ```
264
+
265
+ This keeps ACP harness auth explicit at connection selection time without injecting provider keys or `baseUrl` from `ai-connect`.
266
+
267
+ ## Local CLI And Server Presets
268
+
269
+ Built-in local transport presets are available both as catalog entries and as exported preset metadata:
270
+
271
+ ```ts
272
+ import {
273
+ AI_CONNECT_DEFAULT_CLI_PRESETS,
274
+ AI_CONNECT_DEFAULT_SERVER_PRESETS,
275
+ getTextTransportPresetById,
276
+ listTextProviderCatalog,
277
+ } from "@vedmalex/ai-connect";
278
+
279
+ const localCatalog = listTextProviderCatalog({ runtime: "local" });
280
+ const codexCli = getTextTransportPresetById("openai", "codex-cli");
281
+ const opencodeServer = AI_CONNECT_DEFAULT_SERVER_PRESETS.opencode;
282
+ const geminiCli = AI_CONNECT_DEFAULT_CLI_PRESETS.gemini;
283
+ ```
284
+
285
+ For built-in CLI routes the shortest form is still the route `id`:
286
+
287
+ ```ts
288
+ transport: {
289
+ kind: "cli",
290
+ id: "gemini-cli",
291
+ }
292
+ ```
293
+
294
+ If you want a custom route id but still want the built-in argv/parser/command defaults, set `transport.cli.preset` explicitly:
295
+
296
+ ```ts
297
+ transport: {
298
+ kind: "cli",
299
+ id: "my-gemini-wrapper",
300
+ cli: {
301
+ preset: "gemini",
302
+ },
303
+ }
304
+ ```
305
+
306
+ CLI command resolution order is:
307
+
308
+ 1. `transport.command`
309
+ 2. `createLocalClient(..., { cli: { commands } })`
310
+ 3. `transport.cli.preset`
311
+ 4. built-in command mapped from `provider + transport.id`
312
+
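The resolution order above reads as a first-match-wins lookup. The following is a minimal sketch of that idea, not the library's implementation: `resolveCliCommand`, the `CommandSources` shape, the sample command strings, and the assumption that client-level `commands` are keyed by `provider:transportId` (by analogy with the ACP example) are all hypothetical.

```typescript
// Hypothetical shapes for illustration; these are not the library's real types.
type CommandSources = {
  transportCommand?: string; // 1. transport.command
  clientCommands?: Record<string, string>; // 2. createLocalClient(..., { cli: { commands } })
  presetCommand?: string; // 3. command implied by transport.cli.preset
  builtInCommands?: Record<string, string>; // 4. built-in map keyed by "provider:transportId"
};

// Returns the first defined command in the documented precedence order.
function resolveCliCommand(
  provider: string,
  transportId: string,
  sources: CommandSources,
): string | undefined {
  const key = `${provider}:${transportId}`;
  return (
    sources.transportCommand ??
    sources.clientCommands?.[key] ??
    sources.presetCommand ??
    sources.builtInCommands?.[key]
  );
}

// A custom route id with an explicit preset falls through to step 3.
const fromPreset = resolveCliCommand("gemini", "my-gemini-wrapper", {
  presetCommand: "gemini",
  builtInCommands: { "gemini:gemini-cli": "gemini" },
});

// An explicit transport.command wins over everything else.
const fromTransport = resolveCliCommand("gemini", "gemini-cli", {
  transportCommand: "/opt/tools/gemini",
  clientCommands: { "gemini:gemini-cli": "gemini-nightly" },
});
```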
313
+ Known local presets now include:
314
+
315
+ - `openai:codex-cli`
316
+ - `anthropic:claude-cli`
317
+ - `openclaude:openclaude-cli`
318
+ - `anthropic:claude-code-acp`
319
+ - `gemini:gemini-cli`
320
+ - `gemini:gemini-acp`
321
+ - `qwen:qwen-cli`
322
+ - `qwen:qwen-acp`
323
+ - `opencode:opencode-server`
324
+ - `opencode:opencode-acp`
325
+
326
+ ## Custom CLI Providers
327
+
328
+ Custom CLI providers can be connected by describing the argv template and the parser:
329
+
330
+ ```ts
331
+ import { createLocalClient, defineConfig } from "@vedmalex/ai-connect";
332
+
333
+ const client = createLocalClient(
334
+ defineConfig({
335
+ providers: {
336
+ customcli: {
337
+ accounts: [
338
+ {
339
+ id: "my-cli",
340
+ transport: {
341
+ kind: "cli",
342
+ id: "my-company-cli",
343
+ command: "my-agent",
344
+ cli: {
345
+ argsTemplate: [
346
+ "run",
347
+ "--prompt",
348
+ "{prompt}",
349
+ "--model",
350
+ "{model}",
351
+ "--format",
352
+ "json",
353
+ ],
354
+ parser: {
355
+ kind: "json",
356
+ textPath: "payload.message",
357
+ usagePath: "metrics",
358
+ errorPath: "error.message",
359
+ },
360
+ },
361
+ },
362
+ models: ["my-model-v1"],
363
+ },
364
+ ],
365
+ },
366
+ },
367
+ }),
368
+ );
369
+ ```
370
+
371
+ Supported placeholders in `argsTemplate`:
372
+
373
+ - `{prompt}`
374
+ - `{model}`
375
+ - `{output_file}`
376
+
377
+ `{output_file}` is useful for CLIs like `codex exec` that stream JSONL to stdout but write the final assistant message to a file.
378
+
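To make the template and parser fields concrete, here is a small sketch of how placeholder expansion and dotted-path extraction could work. It is an illustration under assumptions: `expandArgsTemplate` and `readPath` are hypothetical helpers, and treating `textPath` as a dot-separated property path is a guess based on the `payload.message` example.

```typescript
// Hypothetical helpers; the library's own expansion and parsing may differ.
type TemplateValues = { prompt: string; model: string; outputFile?: string };

// Single-pass substitution: inserted values are never re-scanned for placeholders.
function expandArgsTemplate(template: readonly string[], values: TemplateValues): string[] {
  const lookup: Record<string, string | undefined> = {
    prompt: values.prompt,
    model: values.model,
    output_file: values.outputFile,
  };
  return template.map((arg) =>
    arg.replace(
      /\{(prompt|model|output_file)\}/g,
      (match: string, key: string) => lookup[key] ?? match,
    ),
  );
}

// Walks a dot-separated path such as "payload.message" through parsed JSON.
function readPath(value: unknown, path: string): unknown {
  let current: unknown = value;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const expandedArgs = expandArgsTemplate(
  ["run", "--prompt", "{prompt}", "--model", "{model}", "--format", "json"],
  { prompt: "Say {model} twice", model: "my-model-v1" },
);

const sampleStdout = '{"payload":{"message":"hello"},"metrics":{"totalTokens":3}}';
const parsedText = readPath(JSON.parse(sampleStdout), "payload.message");
```

Because each template entry stays a single argv element, a prompt containing spaces or quotes needs no shell escaping in this sketch.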
379
+ Current local transport scope:
380
+
381
+ - `cli`: text generation, plus optional model discovery via an ACP sidecar
382
+ - `server`: text generation plus provider-native model discovery
383
+
384
+ ## Mock Gateway
385
+
386
+ For API-level debugging you can run a local mock backend that behaves like a small OpenAI/Anthropic/Gemini proxy and captures the real finalized wire payloads:
387
+
388
+ ```bash
389
+ bun run mock-gateway
390
+ ```
391
+
392
+ It prints base URLs for:
393
+
394
+ - `OpenAI`: `http://127.0.0.1:8046/v1`
395
+ - `Anthropic`: `http://127.0.0.1:8046/v1/messages`
396
+ - `Gemini`: `http://127.0.0.1:8046/v1beta/models`
397
+
398
+ The mock backend accepts any API key value and logs each captured request after `ai-connect` has already normalized it. Set `MOCK_GATEWAY_VERBOSE=1` to print full request snapshots instead of only summaries.
399
+
400
+ To run it as a transparent MITM in front of a real upstream proxy:
401
+
402
+ ```bash
403
+ MITM_UPSTREAM_ORIGIN=http://127.0.0.1:8045 bun run mock-gateway
404
+ ```
405
+
406
+ In that mode it keeps the same local URLs, forwards requests upstream, and logs:
407
+
408
+ - the finalized request payload
409
+ - the upstream response payload
410
+ - per-request total latency and upstream latency
411
+
412
+ This is useful both for direct API routes and for ACP harnesses that support gateway-style HTTP upstream configuration, because the harness can point at the MITM URL while `ai-connect` stays attached to the same local endpoint.
413
+
414
+ ## Rotation and Fallback
415
+
416
+ ```ts
417
+ import { createLocalClient, defineConfig } from "@vedmalex/ai-connect";
418
+
419
+ const client = createLocalClient(
420
+ defineConfig({
421
+ providers: {
422
+ openai: {
423
+ accounts: [
424
+ {
425
+ id: "main",
426
+ transport: "api",
427
+ models: ["gpt-4.1"],
428
+ credentials: [
429
+ {
430
+ id: "pool",
431
+ apiKeyEnv: "OPENAI_API_KEYS",
432
+ apiKeyDelimiter: ",",
433
+ },
434
+ ],
435
+ },
436
+ ],
437
+ },
438
+ anthropic: {
439
+ accounts: [
440
+ {
441
+ id: "subscription",
442
+ transport: { kind: "acp", id: "claude-code-acp" },
443
+ models: ["claude-sonnet-4"],
444
+ },
445
+ ],
446
+ },
447
+ },
448
+ routing: {
449
+ strategy: "round-robin",
450
+ shuffleOnInit: true,
451
+ fallback: {
452
+ on: {
453
+ rate_limit: [
454
+ "rotate-credential",
455
+ "rotate-account",
456
+ "fallback-transport",
457
+ "fallback-provider",
458
+ ],
459
+ },
460
+ },
461
+ },
462
+ }),
463
+ );
464
+ ```
465
+
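The `apiKeyEnv` plus `apiKeyDelimiter` pair in the config above describes a key pool held in one environment variable. A rough sketch of the idea follows; `parseKeyPool` and `createRotator` are hypothetical names, and trimming whitespace and skipping empty entries are assumptions rather than documented behavior.

```typescript
// Hypothetical key pool: splits one env value into keys and rotates round-robin.
function parseKeyPool(raw: string | undefined, delimiter = ","): string[] {
  if (!raw) return [];
  return raw
    .split(delimiter)
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
}

// Returns a function that hands out items in order, wrapping at the end.
function createRotator<T>(items: readonly T[]): () => T {
  if (items.length === 0) throw new Error("key pool is empty");
  let index = 0;
  return () => {
    const item = items[index % items.length] as T;
    index += 1;
    return item;
  };
}

// Stand-in for the value of OPENAI_API_KEYS.
const poolKeys = parseKeyPool("sk-a, sk-b,,sk-c", ",");
const nextKey = createRotator(poolKeys);
const picked = [nextKey(), nextKey(), nextKey(), nextKey()];
```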
466
+ Route pools accept several selector forms, but the safest form is:
467
+
468
+ - `provider:transport:account:model`
469
+ - or the full concrete `route.id`
470
+
471
+ Shorter selectors such as `provider:account:model` are convenience aliases. If the same account+model exists on multiple transports, the shorter form can match more than one route.
472
+
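The ambiguity of the shorter form is easy to see with a toy matcher. This is a sketch only: `RouteSketch` and `matchSelector` are hypothetical, the sample routes are invented, and it assumes none of the selector parts contains a `:`.

```typescript
// Hypothetical route shape and matcher, for illustration only.
type RouteSketch = { provider: string; transport: string; account: string; model: string };

// 4 parts = provider:transport:account:model, 3 parts = provider:account:model.
function matchSelector(selector: string, routes: readonly RouteSketch[]): RouteSketch[] {
  const parts = selector.split(":");
  if (parts.length === 4) {
    const [provider, transport, account, model] = parts;
    return routes.filter(
      (r) =>
        r.provider === provider &&
        r.transport === transport &&
        r.account === account &&
        r.model === model,
    );
  }
  if (parts.length === 3) {
    const [provider, account, model] = parts;
    return routes.filter(
      (r) => r.provider === provider && r.account === account && r.model === model,
    );
  }
  return [];
}

// The same account and model exposed over two transports.
const sampleRoutes: RouteSketch[] = [
  { provider: "anthropic", transport: "api", account: "main", model: "claude-sonnet-4" },
  { provider: "anthropic", transport: "claude-code-acp", account: "main", model: "claude-sonnet-4" },
];

const shortMatches = matchSelector("anthropic:main:claude-sonnet-4", sampleRoutes);
const fullMatches = matchSelector("anthropic:api:main:claude-sonnet-4", sampleRoutes);
```

Here the three-part selector matches both routes, while the four-part selector pins exactly one.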
473
+ Three error codes are intentionally **hard-terminal**: they never rotate, retry, or fall back, and they never pollute route health:
474
+
475
+ - `aborted` — the caller cancelled the operation
476
+ - `timeout` — an operation deadline elapsed
477
+ - `fanout_limit` — a client-side fan-out ceiling was exhausted
478
+
479
+ All other normalized error codes (`rate_limit`, `quota_exhausted`, `temporary_unavailable`, etc.) remain eligible for the rotation/retry/fallback chain you configure under `routing.fallback`.
480
+
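One way to picture the hard-terminal rule is as a guard that runs before the configured fallback table is consulted. The sketch below assumes that reading; `nextActions` and `HARD_TERMINAL_CODES` are hypothetical names, not library exports.

```typescript
// Hypothetical guard: hard-terminal codes never reach the fallback table.
const HARD_TERMINAL_CODES: ReadonlySet<string> = new Set(["aborted", "timeout", "fanout_limit"]);

type FallbackAction =
  | "rotate-credential"
  | "rotate-account"
  | "fallback-transport"
  | "fallback-provider";

// Returns the ordered actions to try for an error code, or none if it is terminal.
function nextActions(
  code: string,
  fallbackOn: Record<string, FallbackAction[]>,
): FallbackAction[] {
  if (HARD_TERMINAL_CODES.has(code)) return [];
  return fallbackOn[code] ?? [];
}

const fallbackTable: Record<string, FallbackAction[]> = {
  rate_limit: ["rotate-credential", "rotate-account", "fallback-transport", "fallback-provider"],
  // Even if someone configures actions for a terminal code, the guard ignores them.
  timeout: ["rotate-credential"],
};

const onRateLimit = nextActions("rate_limit", fallbackTable);
const onTimeout = nextActions("timeout", fallbackTable);
```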
481
+ ## Cancellation, Pause, and Timeouts
482
+
483
+ `generate(request, opts?)` and `stream(request, opts?)` accept an optional second argument:
484
+
485
+ ```ts
486
+ type GenerateCallOptions = {
487
+ signal?: AbortSignal;
488
+ pauseSignal?: AbortSignal;
489
+ timeoutMs?: number;
490
+ };
491
+ ```
492
+
493
+ Cancellation with an `AbortSignal` discards any in-flight partial and throws an `AiConnectError` with code `aborted`:
494
+
495
+ ```ts
496
+ const controller = new AbortController();
497
+ setTimeout(() => controller.abort(), 5_000);
498
+
499
+ try {
500
+ const result = await client.generate(
501
+ { messages: [{ role: "user", content: "Long task..." }] },
502
+ { signal: controller.signal },
503
+ );
504
+ console.log(result.text);
505
+ } catch (error) {
506
+ if (error instanceof AiConnectError && error.code === "aborted") {
507
+ console.log("cancelled");
508
+ }
509
+ }
510
+ ```
511
+
512
+ `pauseSignal` is a separate, cooperative signal. In `stream()`, firing it stops reading and yields a terminal `{ type: "paused", result }` event that **keeps** everything accumulated so far:
513
+
514
+ ```ts
515
+ const pause = new AbortController();
516
+
517
+ for await (const event of client.stream(
518
+ { messages: [{ role: "user", content: "Stream a draft." }] },
519
+ { pauseSignal: pause.signal },
520
+ )) {
521
+ if (event.type === "delta") {
522
+ process.stdout.write(event.text);
523
+ } else if (event.type === "paused") {
524
+ console.log("\npaused with partial:", event.result.text);
525
+ } else if (event.type === "result") {
526
+ console.log("\ndone:", event.result.text);
527
+ }
528
+ }
529
+ ```
530
+
531
+ In `generate()` a mid-call pause degenerates to `aborted`, because a non-streamed call cannot retain a partial. Abort always throws and discards; pause in `stream()` is the only way to keep a partial.
532
+
533
+ `timeoutMs` overrides the per-operation timeout tier for a single call. A value `<= 0` or `Infinity` disables the timer. A fired timeout throws `AiConnectError` with code `timeout`. You can also set client-wide tier defaults:
534
+
535
+ ```ts
536
+ const client = createLocalClient(config, {
537
+ timeouts: {
538
+ generateMs: 120_000, // generate / stream (default 120000)
539
+ probeMs: 12_000, // verify / discover* / checkHealth / probeModels (default 12000)
540
+ },
541
+ });
542
+ ```
543
+
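The interaction between a per-call `timeoutMs` and the tier default can be sketched as a small resolution function. This is an assumption-laden illustration: `effectiveTimeoutMs` is a hypothetical helper, and treating `NaN` or a non-positive tier default as "disabled" is a guess that goes beyond what the text states.

```typescript
// Hypothetical helper: returns the timer duration in ms, or undefined when disabled.
function effectiveTimeoutMs(
  perCallMs: number | undefined,
  tierDefaultMs: number,
): number | undefined {
  // A per-call value, when present, replaces the tier default for this call only.
  const chosen = perCallMs ?? tierDefaultMs;
  // Non-finite (Infinity, and in this sketch NaN) or non-positive values disable the timer.
  if (!Number.isFinite(chosen) || chosen <= 0) return undefined;
  return chosen;
}

const usesTierDefault = effectiveTimeoutMs(undefined, 120_000);
const usesOverride = effectiveTimeoutMs(5_000, 120_000);
const disabledByZero = effectiveTimeoutMs(0, 120_000);
const disabledByInfinity = effectiveTimeoutMs(Infinity, 120_000);
```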
544
+ `verify()`, `discoverModels()`, and `discoverAcpModels()` accept the `signal`/`timeoutMs` subset of these options as their own second argument.
545
+
546
+ ## Files and Images
547
+
548
+ The unified request format supports:
549
+
550
+ - `attachments` for text, image, and PDF document prompt inputs
551
+ - `image.size` and `image.rawPrompt` for image generation routes
552
+ - portable file inputs:
553
+   - absolute local paths
554
+   - browser `File`
555
+   - browser `Blob`
556
+   - `data:` URLs
557
+   - remote file references with `uri` or a provider `providerFileId`
558
+
559
+ Example:
560
+
561
+ ```ts
562
+ const result = await client.generate({
563
+ operation: "image",
564
+ messages: [{ role: "user", content: "Create a lotus architecture diagram" }],
565
+ attachments: [
566
+ new File(["project outline"], "brief.md", { type: "text/markdown" }),
567
+ ],
568
+ image: {
569
+ size: "1280x720",
570
+ },
571
+ });
572
+
573
+ console.log(result.attachments);
574
+ ```
575
+
576
+ ### PDF and Document Input
577
+
578
+ PDF attachments (`application/pdf`) now route across the `api` transport family, not just ACP. Each built-in API handler maps a document attachment to its provider-native content block:
579
+
580
+ - `anthropic` — a `document` block (base64 inline, Files-API `file_id`, or url)
581
+ - `openai` — a file content block (inline file data or an uploaded Files-API `file_id`) alongside `image_url` for images
582
+ - `gemini` — `inlineData` for inline bytes or `fileData` for an uploaded file URI
583
+
584
+ Oversize PDFs are uploaded to the provider's Files API and referenced by id (`providerFileId`); if that upload fails the handler falls back to the inline base64 path and records a warning. A route that cannot carry a document at all fails with a clean `AiConnectError` whose code is `unsupported_capability`.
585
+
586
+ ```ts
587
+ const result = await client.generate({
588
+ messages: [{ role: "user", content: "Summarize the attached report." }],
589
+ attachments: ["/Users/vedmalex/work/reports/q3.pdf"],
590
+ });
591
+
592
+ console.log(result.text);
593
+ ```
594
+
595
+ A previously-uploaded document can be referenced directly by its provider file id, skipping re-upload:
596
+
597
+ ```ts
598
+ const result = await client.generate({
599
+ messages: [{ role: "user", content: "What changed since the last revision?" }],
600
+ attachments: [{ providerFileId: "file_abc123", mimeType: "application/pdf", name: "spec.pdf" }],
601
+ });
602
+ ```
603
+
604
+ The portable-file primitives used for this are exported and browser-safe where the source allows it:
605
+
606
+ - `SUPPORTED_DOCUMENT_MIME_TYPES` — the set of MIME types treated as documents (currently `application/pdf`)
607
+ - `portableFileCategory(file)` — coarse `"image" | "document" | "text" | "other"` classification
608
+ - `isPortableDocumentFile(file)` — convenience predicate for the `document` category
609
+ - `materializePortableFile(file)` — one decode pass producing a `PortableFilePayload` (`base64`, `dataUrl`, `uri`, `text`, `providerFileId` carriers)
610
+ - `portableFileToBase64(file)` — raw base64 of the file bytes (no `data:` prefix)
611
+
612
+ Path-based file access requires a local runtime; in a browser bundle use `File`, `Blob`, `data:` URLs, or remote references.
613
+
614
+ ## Wide Event Logging
615
+
616
+ The client supports opt-in structured logging in the "log once per request lifecycle" style described at [loggingsucks.com](https://loggingsucks.com/).
617
+
618
+ ```ts
619
+ import {
620
+ createConsoleWideEventLogger,
621
+ createLocalClient,
622
+ defineConfig,
623
+ } from "@vedmalex/ai-connect";
624
+
625
+ const client = createLocalClient(
626
+ defineConfig({
627
+ providers: {
628
+ openai: {
629
+ accounts: [
630
+ {
631
+ id: "main",
632
+ transport: "api",
633
+ models: ["gpt-4.1"],
634
+ credentials: [{ apiKeyEnv: "OPENAI_API_KEY" }],
635
+ },
636
+ ],
637
+ },
638
+ },
639
+ }),
640
+ {
641
+ logging: {
642
+ logger: createConsoleWideEventLogger(),
643
+ sampling: {
644
+ sampleRate: 0.1,
645
+ slowOperationMs: 2_000,
646
+ keepErrors: true,
647
+ keepWarnings: true,
648
+ },
649
+ baseContext: {
650
+ service_name: "my-app",
651
+ environment: "production",
652
+ },
653
+ },
654
+ },
655
+ );
656
+
657
+ await client.generate({
658
+ messages: [{ role: "user", content: "Summarize this request." }],
659
+ logContext: {
660
+ request_id: "req-123",
661
+ tenant_id: "acme",
662
+ user_id: "u-42",
663
+ },
664
+ });
665
+ ```
666
+
667
+ What gets logged:
668
+
669
+ - one canonical event per `generate`, `stream`, `verify`, `discoverModels`, `discoverAcpModels`, `checkHealth`, or `probeModels` call
670
+ - request shape summary, not raw prompt content
671
+ - selected route plus full fallback/retry attempt chain
672
+ - duration, usage (including `usage.calls`), warnings, and verification issue codes
673
+ - per-operation summaries: `verification`, `modelDiscovery`, `health`, and `probe`
674
+ - caller-provided `logContext` for business identifiers
675
+
676
+ Helpers:
677
+
678
+ - `createConsoleWideEventLogger()`
679
+ - `shouldEmitWideEvent()`
680
+
681
+ ## Streaming Deltas
682
+
683
+ `stream()` yields a `GenerateStreamEvent` union:
684
+
685
+ ```ts
686
+ type GenerateStreamEvent =
687
+ | { type: "delta"; text: string }
688
+ | { type: "result"; result: GenerateResult }
689
+ | { type: "paused"; result: GenerateResult };
690
+ ```
691
+
692
+ For routes with a real incremental producer (the OpenAI SSE handler and the ACP delta producer), `stream()` emits `{ type: "delta", text }` tokens as they arrive and then a terminal `{ type: "result", result }`. Routes without an incremental producer still yield a single terminal `result`. A cooperative `pauseSignal` ends the stream with a terminal `{ type: "paused", result }` that keeps the accumulated partial (see [Cancellation, Pause, and Timeouts](#cancellation-pause-and-timeouts)).
693
+
694
+ ```ts
695
+ for await (const event of client.stream({
696
+ messages: [{ role: "user", content: "Write a haiku." }],
697
+ })) {
698
+ if (event.type === "delta") {
699
+ process.stdout.write(event.text);
700
+ } else if (event.type === "result") {
701
+ console.log("\n", event.result.usage);
702
+ }
703
+ }
704
+ ```
705
+
706
+ Zero or more `delta` events precede exactly one terminal event; `paused` and `result` are mutually exclusive terminals. Abort, by contrast, throws and discards partials; it never yields `paused`.
707
+
708
+ ## Health Checks and Model Probes
709
+
710
+ Two read-only diagnostics complement `verify()` and `discoverModels()`. Neither mutates router health (no `recordFailure`/`recordSuccess`).
711
+
712
+ `checkHealth(target?)` runs a live two-stage check per route:
713
+
714
+ 1. endpoint reachability (api `GET /models` via discovery; acp/cli/server session via `verify`)
715
+ 2. a minimal bounded chat ping (max one token) that captures `latencyMs`
716
+
717
+ A Stage-1 failure short-circuits Stage-2 with detail `"skipped: endpoint unreachable"`. Pass `reachabilityOnly: true` for the cheap Stage-1-only check on hot paths.
718
+
719
+ ```ts
720
+ const report = await client.checkHealth({ transports: ["api"] });
721
+
722
+ for (const route of report.routes) {
723
+ console.log(route.routeId, route.ok, route.model.latencyMs);
724
+ }
725
+ ```
726
+
727
+ `probeModels(target?, opts?)` classifies each `route::model` tuple as `broken` vs transient. For api transports `broken` is HTTP-status-driven (`400 <= status < 500` and `status !== 429`); 429, 5xx, and status-less transport errors are transient (`broken: false`). Results are served from a per-route TTL cache (default 5 minutes), with bounded concurrency (default 4), a per-probe timeout (default 8s), and `opts.signal` support to stop a fan-out mid-flight. `probeModelsStream(target?, opts?)` yields each `ProbeModelResult` as it settles.
728
+
729
+ ```ts
730
+ const results = await client.probeModels(
731
+ { transports: ["api"] },
732
+ { concurrency: 6, timeoutMs: 5_000, forceRefresh: false },
733
+ );
734
+
735
+ const broken = results.filter((r) => r.broken);
736
+ ```
737
+
738
+ The classification primitive is exported as `classifyProbeOutcome`, with the defaults `PROBE_DEFAULT_CONCURRENCY`, `PROBE_DEFAULT_TIMEOUT_MS`, and `PROBE_DEFAULT_TTL_MS`.
739
+
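The status rule for api transports is compact enough to restate as code. The sketch below only mirrors the rule as described; `sketchClassify` and `ProbeVerdict` are hypothetical, and the real `classifyProbeOutcome` signature is not shown in this README.

```typescript
// Hypothetical restatement of the documented rule, not the exported primitive.
type ProbeVerdict = { broken: boolean; reason: string };

function sketchClassify(status: number | undefined): ProbeVerdict {
  // Status-less transport errors are treated as transient.
  if (status === undefined) return { broken: false, reason: "transport error without status" };
  // 429 is excluded from the 4xx "broken" range.
  if (status === 429) return { broken: false, reason: "rate limited" };
  // Any other 4xx means the route::model tuple itself is rejected.
  if (status >= 400 && status < 500) return { broken: true, reason: "client error" };
  return { broken: false, reason: status >= 500 ? "server error" : "ok" };
}

const verdicts = [404, 429, 503, undefined, 200].map((s) => sketchClassify(s).broken);
```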
740
+ ## Context Window and Model Discovery
741
+
742
+ `resolveModelContext(input, options?)` returns the effective context window for a model (synchronous, no I/O), with a clear precedence: `discovered` > `reference` (curated table) > `configured` (per-model/route config) > `default` (8192). Results are cached per `(baseUrl|transportId)::model`; a cache hit returns the same value and source and ignores `options.discovered`.
743
+
744
+ ```ts
745
+ const ctx = client.resolveModelContext(
746
+ { provider: "openai", model: "gpt-4.1" },
747
+ { discovered: 1_047_576 },
748
+ );
749
+
750
+ console.log(ctx.contextWindow, ctx.source, ctx.cached);
751
+ ```
752
+
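The precedence and cache behavior described above can be sketched as follows. This is an illustration, not the library code: `pickContextWindow`, `resolveCached`, and the "positive number" validity check are hypothetical, while the precedence order, the 8192 default, and the `scope::model` cache key follow the description.

```typescript
// Hypothetical sketch of the documented precedence and cache behavior.
type ContextSource = "discovered" | "reference" | "configured" | "default";
type ContextInputs = { discovered?: number; reference?: number; configured?: number };
type ResolvedContext = { contextWindow: number; source: ContextSource };

const FALLBACK_CONTEXT_WINDOW = 8192;

// First usable value wins: discovered > reference > configured > default.
function pickContextWindow(inputs: ContextInputs): ResolvedContext {
  for (const source of ["discovered", "reference", "configured"] as const) {
    const value = inputs[source];
    if (typeof value === "number" && value > 0) return { contextWindow: value, source };
  }
  return { contextWindow: FALLBACK_CONTEXT_WINDOW, source: "default" };
}

const contextCache = new Map<string, ResolvedContext>();

// A cache hit returns the stored value and source, ignoring any new inputs.
function resolveCached(
  scope: string, // baseUrl or transport id
  model: string,
  inputs: ContextInputs,
): ResolvedContext & { cached: boolean } {
  const key = `${scope}::${model}`;
  const hit = contextCache.get(key);
  if (hit) return { ...hit, cached: true };
  const fresh = pickContextWindow(inputs);
  contextCache.set(key, fresh);
  return { ...fresh, cached: false };
}

const firstLookup = resolveCached("api", "gpt-4.1", { configured: 128_000 });
const secondLookup = resolveCached("api", "gpt-4.1", { discovered: 1_047_576 });
```

In this sketch the second lookup still reports the configured value, which matches the note that a cache hit ignores `options.discovered`.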
753
+ Configure a per-model context window in account config either at the account level (inherited by string-form `models`) or per model:
754
+
755
+ ```ts
756
+ {
757
+ id: "main",
758
+ transport: "api",
759
+ contextWindow: 128_000,
760
+ models: ["gpt-4o", { id: "gpt-4.1", contextWindow: 1_047_576 }],
761
+ credentials: [{ apiKeyEnv: "OPENAI_API_KEY" }],
762
+ }
763
+ ```
764
+
765
+ Model discovery now also surfaces typed `contextLength`, `free`, and `pricing` fields on each discovered `ModelInfo`. The curated reference table and its parsers are browser-safe exports:
766
+
767
+ - `MODEL_REFERENCE` and `lookupModelRef(model)`
768
+ - `resolveModelContextWindow({ discovered?, reference?, configured?, defaultContextWindow? })`
769
+ - `extractModelContextLength(rawModelRecord)`
770
+ - `detectModelFree(modelId, pricing?, rawModelRecord?)`
771
+ - `parseModelPricing(rawModelRecord)`
772
+ - `DEFAULT_CONTEXT_WINDOW` (8192), `normalizeModelKey`, `modelContextCacheKey`
773
+
774
+ ## Client-Safe Projection and Flexible Routing
775
+
776
+ For untrusted UI or agent-discovery surfaces, project routes without ever exposing credentials or `baseUrl`:
777
+
778
+ ```ts
779
+ const publicRoutes = client.listPublicRoutes({ operation: "text" });
780
+ const candidates = client.listCandidateModels({ provider: "openai" });
781
+ ```
782
+
783
+ `listPublicRoutes()` returns `PublicRoute` DTOs (built by explicit construction, never by spreading an internal route), and `listCandidateModels()` returns the same secret-free `CandidateModel` list offered to a model selector.
784
+
785
+ Per-route routing flexibility is configured on the account:
786
+
787
+ - `modelAllowlistMode: "strict" | "shortlist"` — `strict` (default) drops an undeclared `routeHints.model`; `shortlist` passes a verbatim requested model through on a synthetic copy that preserves the route id (never fragments health)
788
+ - `defaultResponseFormat` — injected only when the caller did not supply `parameters.responseFormat`
789
+ - `systemPrompt` — injected as a leading system message only when the caller authored no system message
790
+ - `contextMode: "workspace" | "clean"` — execution-context mode (see [Clean Context Mode](#clean-context-mode))
791
+
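The "only when the caller did not supply one" rule for `systemPrompt` and `defaultResponseFormat` can be pictured like this. The types and `applyAccountDefaults` are hypothetical simplifications of the request shape, used purely for illustration.

```typescript
// Hypothetical, simplified request types for illustration.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type ResponseFormat = { type: string };
type SketchRequest = {
  messages: ChatMessage[];
  parameters?: { responseFormat?: ResponseFormat };
};
type AccountDefaults = { systemPrompt?: string; defaultResponseFormat?: ResponseFormat };

// Applies account-level defaults without overriding anything the caller authored.
function applyAccountDefaults(request: SketchRequest, defaults: AccountDefaults): SketchRequest {
  const hasSystem = request.messages.some((m) => m.role === "system");
  const messages: ChatMessage[] =
    !hasSystem && defaults.systemPrompt
      ? [{ role: "system", content: defaults.systemPrompt }, ...request.messages]
      : request.messages;

  const result: SketchRequest = { ...request, messages };
  const responseFormat = request.parameters?.responseFormat ?? defaults.defaultResponseFormat;
  if (responseFormat) result.parameters = { ...request.parameters, responseFormat };
  return result;
}

const accountDefaults: AccountDefaults = {
  systemPrompt: "You are a concise assistant.",
  defaultResponseFormat: { type: "json_object" },
};

// No system message and no format from the caller: both defaults are injected.
const injected = applyAccountDefaults(
  { messages: [{ role: "user", content: "Hi" }] },
  accountDefaults,
);

// The caller authored both: nothing is overridden.
const untouched = applyAccountDefaults(
  {
    messages: [
      { role: "system", content: "Be verbose." },
      { role: "user", content: "Hi" },
    ],
    parameters: { responseFormat: { type: "text" } },
  },
  accountDefaults,
);
```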
792
+ Unmatched route selectors are governed by `routing.resolution.unknownSelector`:
793
+
794
+ - `"error"` (default) — throw on an unmatched selector
795
+ - `"default"` — substitute the configured `defaultRouteId` for each unmatched selector
796
+ - `"off"` — silently drop the unmatched selector (degrade)
797
+
798
+ ```ts
799
+ defineConfig({
800
+ providers: {
801
+ openai: {
802
+ accounts: [
803
+ {
804
+ id: "main",
805
+ transport: "api",
806
+ models: ["gpt-4.1"],
807
+ modelAllowlistMode: "shortlist",
808
+ defaultResponseFormat: { type: "json_object" },
809
+ systemPrompt: "You are a concise assistant.",
810
+ credentials: [{ apiKeyEnv: "OPENAI_API_KEY" }],
811
+ },
812
+ ],
813
+ },
814
+ },
815
+ routing: {
816
+ resolution: {
817
+ unknownSelector: "default",
818
+ defaultRouteId: "openai:api:main:gpt-4.1",
819
+ },
820
+ },
821
+ });
822
+ ```
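The three `unknownSelector` modes can be sketched as follows. This is a simplified illustration, not the library's resolver: `resolveSelectors` is a hypothetical helper, selectors are assumed to be exact route ids, and dropping a selector when `"default"` is set without a `defaultRouteId` is also an assumption.

```ts
type UnknownSelectorMode = "error" | "default" | "off";

// Hypothetical helper: resolve selectors against the set of known route ids.
function resolveSelectors(
  selectors: string[],
  knownRouteIds: Set<string>,
  mode: UnknownSelectorMode = "error",
  defaultRouteId?: string,
): string[] {
  const resolved: string[] = [];
  for (const selector of selectors) {
    if (knownRouteIds.has(selector)) {
      resolved.push(selector);
    } else if (mode === "error") {
      // "error": fail loudly on the first unmatched selector.
      throw new Error(`Unmatched route selector: ${selector}`);
    } else if (mode === "default" && defaultRouteId !== undefined) {
      // "default": substitute the configured default route.
      resolved.push(defaultRouteId);
    }
    // "off" (or "default" with no defaultRouteId, by assumption): drop silently.
  }
  return resolved;
}
```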
823
+
824
+ ## Fan-Out Throttling
825
+
826
+ Client-side fan-out throttling bounds how aggressively a client issues calls. It is configured at the client level and can be overridden per request:
827
+
828
+ ```ts
829
+ type FanoutPolicy = {
830
+ maxConcurrency?: number; // simultaneous in-flight calls (semaphore + FIFO fairness)
831
+ requestsPerSecond?: number; // deterministic token bucket driven by runtime.now()
832
+ maxCalls?: number; // hard LIFETIME ceiling
833
+ };
834
+ ```
835
+
836
+ Any unset field is unbounded. Exhausting `maxCalls` throws `AiConnectError` with code `fanout_limit` **before** route selection, so it never pollutes route health.
837
+
838
+ ```ts
839
+ const client = createLocalClient(config, {
840
+ fanout: { maxConcurrency: 4, requestsPerSecond: 10 },
841
+ });
842
+
843
+ await client.generate({
844
+ messages: [{ role: "user", content: "..." }],
845
+ fanout: { maxCalls: 100 }, // request-scoped, merged per-field over the client default
846
+ });
847
+ ```
848
+
849
+ A per-request `fanout` is merged per-field over the client default to build a request-scoped limiter; the shared client limiter is never mutated. The standalone limiter primitive is exported as `createFanoutLimiter(policy, runtime)`. Normalize a raw policy with `normalizeFanoutPolicy()` first, and use `mergeFanoutPolicy()` to combine a base policy with an override.
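To make the per-field merge and the clock-driven token bucket concrete, here is a minimal self-contained sketch. It does not use the library's exports: `mergePolicySketch` and `createTokenBucketSketch` are hypothetical stand-ins, and the choices that the bucket starts full and has a burst capacity equal to `requestsPerSecond` are assumptions.

```ts
type FanoutPolicy = {
  maxConcurrency?: number;
  requestsPerSecond?: number;
  maxCalls?: number;
};

// Per-field merge: each override field wins only when it is set.
// Returns a new object, so the base (client-level) policy is never mutated.
function mergePolicySketch(base: FanoutPolicy, override: FanoutPolicy = {}): FanoutPolicy {
  return {
    maxConcurrency: override.maxConcurrency ?? base.maxConcurrency,
    requestsPerSecond: override.requestsPerSecond ?? base.requestsPerSecond,
    maxCalls: override.maxCalls ?? base.maxCalls,
  };
}

// Deterministic token bucket: time comes from an injected clock (milliseconds),
// so tests can advance it manually instead of sleeping.
function createTokenBucketSketch(requestsPerSecond: number, now: () => number) {
  let tokens = requestsPerSecond; // assumption: the bucket starts full
  let last = now();
  return {
    tryAcquire(): boolean {
      const t = now();
      // Refill proportionally to elapsed time, capped at the bucket capacity.
      tokens = Math.min(requestsPerSecond, tokens + ((t - last) / 1000) * requestsPerSecond);
      last = t;
      if (tokens >= 1) {
        tokens -= 1;
        return true;
      }
      return false;
    },
  };
}
```

Injecting the clock is what makes the limiter deterministic: the same sequence of clock readings always yields the same sequence of grants and refusals.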
850
+
851
+ ## Model Selector Hook
852
+
853
+ A consumer-supplied `modelSelector` runs before normal routing and picks a model from the eligible candidates:
854
+
855
+ ```ts
856
+ const client = createLocalClient(config, {
857
+ routeHints: {
858
+ modelSelector: (question, candidateModels) => {
859
+ // question carries text/messages/operation/routeHints (no secrets);
860
+ // candidateModels is the secret-free CandidateModel list.
861
+ if (question.text.length > 4_000) {
862
+ return candidateModels.find((c) => c.model.includes("4.1"))?.model;
863
+ }
864
+ return undefined; // defer to normal routing
865
+ },
866
+ failOpen: false,
867
+ },
868
+ });
869
+ ```
870
+
871
+ Returning `undefined` defers to normal routing. An explicit `routeHints.model` always takes precedence over the selector (the hook is not even invoked). A selector that throws or rejects fails closed with `validation_error` by default; set `failOpen: true` to ignore the failure and fall through to normal routing instead. The selector may be async and LLM-backed.
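The precedence rules can be summarized in a small sketch. This is an illustration only: `pickModel`, `Candidate`, and `Selector` are hypothetical names, and a plain `Error` stands in for the `AiConnectError` with code `validation_error` that the library is described as raising.

```ts
type Candidate = { model: string };
type Question = { text: string };
type Selector = (
  question: Question,
  candidates: Candidate[],
) => string | undefined | Promise<string | undefined>;

// Returns the chosen model, or undefined to defer to normal routing.
async function pickModel(
  explicitModel: string | undefined,
  selector: Selector | undefined,
  question: Question,
  candidates: Candidate[],
  failOpen = false,
): Promise<string | undefined> {
  // An explicit model wins outright; the selector is not invoked at all.
  if (explicitModel !== undefined) return explicitModel;
  if (selector === undefined) return undefined;
  try {
    // Works for both sync and async selectors.
    return await selector(question, candidates);
  } catch {
    // failOpen: ignore the failure and defer to normal routing.
    if (failOpen) return undefined;
    // Default: fail closed (plain Error used here as a stand-in).
    throw new Error("validation_error: model selector failed");
  }
}
```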
872
+
873
+ ## Clean Context Mode
874
+
875
+ `contextMode` now applies to all transports (it was previously ACP-only) and is set per account or per ACP launch:
876
+
877
+ - `"workspace"` (default) — `ai-connect` may inject its ambient launch context (cwd/skills/rules for ACP)
878
+ - `"clean"` — `ai-connect` injects nothing ambient; only the consumer messages/attachments plus explicit route config (`systemPrompt`, `defaultResponseFormat`) reach the wire
879
+
880
+ Clean mode suppresses ambient context, not explicit configuration: a route's `systemPrompt` and `defaultResponseFormat` are still applied in clean mode.
881
+
882
+ ## Usage Accounting
883
+
884
+ `result.usage.calls` counts the successful, usage-bearing model calls behind a result. It is seeded as `+1` per reporting call (only when usage is actually reported) and summed across usage merges, so a result assembled from multiple rounds or a fallback chain reports the true call count. It is never fabricated — a route that reports no usage contributes no `calls`.
885
+
886
+ ```ts
887
+ const result = await client.generate({
888
+ messages: [{ role: "user", content: "Multi-round task." }],
889
+ });
890
+
891
+ console.log(result.usage?.calls, result.usage?.totalTokens);
892
+ ```
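The summing rule for `calls` can be sketched as a merge over optional usage records. This is a hedged illustration rather than the library's code: `mergeUsageSketch` is a hypothetical helper, and the `UsageSketch` shape includes only the two fields named in this section.

```ts
// Minimal usage shape for illustration; the real UsageInfo has more fields.
type UsageSketch = { totalTokens?: number; calls?: number };

// Sum two optional numbers, staying undefined when neither side reported a value.
function sumOptional(x?: number, y?: number): number | undefined {
  return x === undefined && y === undefined ? undefined : (x ?? 0) + (y ?? 0);
}

// Merge usage from two rounds; a round that reported nothing contributes nothing.
function mergeUsageSketch(
  a: UsageSketch | undefined,
  b: UsageSketch | undefined,
): UsageSketch | undefined {
  if (a === undefined) return b;
  if (b === undefined) return a;
  return {
    totalTokens: sumOptional(a.totalTokens, b.totalTokens),
    calls: sumOptional(a.calls, b.calls),
  };
}

// Three rounds: two report usage (each seeded with calls: 1), one reports none.
const rounds: Array<UsageSketch | undefined> = [
  { totalTokens: 120, calls: 1 },
  undefined,
  { totalTokens: 80, calls: 1 },
];
const total = rounds.reduce(mergeUsageSketch, undefined);
```

Here `total` ends up with `calls: 2` and `totalTokens: 200`: the silent round adds no call, which matches the rule that `calls` is never fabricated.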
893
+
894
+ ## Robustness
895
+
896
+ Two robustness behaviors apply on the API path:
897
+
898
+ - **Strict structured output** — `parameters.responseFormat` of `{ type: "json_schema", strict: true, ... }` requests strict schema enforcement. If the upstream rejects the request with a `400` specifically because of `response_format`, the handler performs a one-shot graceful retry with the format dropped, records a warning, and continues.
899
+ - **Deep error unwrapping** — upstream error payloads are unwrapped up to three levels deep (cycle-safe, JSON-decoding stringified `.error`/`.message` payloads along the way) so the surfaced `AiConnectError` message is the real provider message, not an opaque envelope.
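The deep-unwrapping behavior can be sketched as a bounded, cycle-safe recursion. This is an approximation under stated assumptions, not the handler's actual code: `unwrapErrorMessage` and its helpers are hypothetical, the `.error`-before-`.message` lookup order is an assumption, and "three levels" is interpreted here as three object descents.

```ts
// Decode a stringified JSON object; return undefined for plain text.
function tryParseJsonObject(value: string): unknown {
  const trimmed = value.trim();
  if (!trimmed.startsWith("{")) return undefined;
  try {
    return JSON.parse(trimmed);
  } catch {
    return undefined;
  }
}

function unwrapInner(value: unknown, depth: number, seen: Set<object>): string | undefined {
  if (typeof value === "string") {
    const parsed = tryParseJsonObject(value);
    // Plain text is the message itself; stringified JSON is decoded and unwrapped,
    // falling back to the raw string if nothing usable is found inside.
    return parsed === undefined ? value : (unwrapInner(parsed, depth, seen) ?? value);
  }
  if (typeof value !== "object" || value === null || seen.has(value)) return undefined;
  if (depth <= 0) return undefined; // depth budget exhausted
  seen.add(value); // cycle guard
  const record = value as Record<string, unknown>;
  return (
    unwrapInner(record.error, depth - 1, seen) ??
    unwrapInner(record.message, depth - 1, seen)
  );
}

// Extract the innermost provider message from a nested error payload.
function unwrapErrorMessage(payload: unknown, maxDepth = 3): string | undefined {
  return unwrapInner(payload, maxDepth, new Set<object>());
}
```

For example, both `{ error: { message: "Invalid API key" } }` and `{ error: '{"message":"Quota exceeded"}' }` resolve to their inner message strings, while a self-referential payload resolves to `undefined` instead of looping.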
900
+
901
+ ## Cross-Project Reuse
902
+
903
+ Several primitives are intentionally provider-agnostic, client-free where possible, and free of `node:*` imports so they ship cleanly in browser bundles:
904
+
905
+ - **Model reference** — `MODEL_REFERENCE`, `lookupModelRef`, `resolveModelContextWindow`, `extractModelContextLength`, `detectModelFree`, `parseModelPricing` (pure data + functions, no client instance)
906
+ - **Probe classification** — `classifyProbeOutcome` plus the `PROBE_DEFAULT_*` constants (HTTP-status-driven, provider-blind; the cache is owned and passed in by the caller)
907
+ - **Fan-out limiter** — `createFanoutLimiter(policy, runtime)` (a deterministic token bucket + semaphore driven by `runtime.now()`, standalone with no client)
908
+ - **Abort context** — the `AbortContext`/`GenerateCallOptions` contract and `mapAbortError(reason)` for deterministic `aborted`/`timeout` mapping
909
+ - **Usage accounting** — the `UsageInfo.calls` summing rule (carried on the flat `UsageInfo` shape; any new transport adds `calls: 1` in its usage guard and aggregation is automatic)
910
+
911
+ These are exported from both the default and `@vedmalex/ai-connect/browser` entry points (everything except `createLocalClient`).
912
+
913
+ ### bs-search Migration
914
+
915
+ When consuming `ai-connect` from `bs-search`:
916
+
917
+ - depend on the published package `@vedmalex/ai-connect`, or link a local checkout via `file:../ai-connect` for local development
918
+ - the cancellation contract mirrors the engine's existing convention: a caller `signal` aborts (discarding partials → `aborted`), while a separate `pauseSignal` cooperatively pauses a stream and keeps the partial — the same split as the engine's `signal` vs `_pauseSignal`
919
+ - prefer the provider-agnostic primitives above (model reference, probe, fan-out, usage accounting) over re-implementing them, since they are client-free and browser-safe
920
+
921
+ ## Browser vs Local
922
+
923
+ | Capability | Browser API routes | Local API routes | Local ACP routes |
924
+ | --- | --- | --- | --- |
925
+ | Text generation | yes | yes | yes |
926
+ | Image generation | yes | yes | depends on harness |
927
+ | Text attachments | yes | yes | yes |
928
+ | Image attachments | yes | yes | yes |
929
+ | PDF document attachments | yes (data URL / remote ref) | yes | depends on harness |
930
+ | Streaming deltas | yes (OpenAI SSE) | yes (OpenAI SSE) | yes (ACP delta producer) |
931
+ | Cancellation / pause / timeout | yes | yes | yes |
932
+ | Health check / model probe | yes | yes | yes |
933
+ | Context-window resolution | yes | yes | yes |
934
+ | Local file paths | no | yes | yes |
935
+ | Local command/session verification | no | yes | yes |
936
+ | Claude/Codex/Gemini ACP | no | yes | yes |
937
+
938
+ ## Runtime Entry Points
939
+
940
+ Use explicit runtime entry points when you know the target in advance:
941
+
942
+ - `@vedmalex/ai-connect/browser`
943
+ - `@vedmalex/ai-connect/node`
944
+ - `@vedmalex/ai-connect/bun`
945
+ - `@vedmalex/ai-connect/local`
946
+
947
+ Notes:
948
+
949
+ - `@vedmalex/ai-connect/browser` is the browser-safe bundle.
950
+ - `@vedmalex/ai-connect` defaults to the full Node/Bun-oriented entry.
951
+ - `@vedmalex/ai-connect/local` is the focused local runtime entry with ACP support.
952
+
953
+ ## Public API
954
+
955
+ Main exports:
956
+
957
+ - `defineConfig`
958
+ - `createClient`
959
+ - `createBrowserClient`
960
+ - `createLocalClient`
961
+ - `preparePortableFile`
962
+ - `buildImagePromptBundle`
963
+ - `IMAGE_SIZE_PRESETS`
964
+ - `AiConnectError`, `isAiConnectError`, `toAiConnectError`, `mapAbortError`
965
+ - `createConsoleWideEventLogger`
966
+ - `shouldEmitWideEvent`
967
+
968
+ File primitives:
969
+
970
+ - `SUPPORTED_DOCUMENT_MIME_TYPES`
971
+ - `portableFileCategory`
972
+ - `isPortableDocumentFile`
973
+ - `materializePortableFile`
974
+ - `portableFileToBase64`
975
+
976
+ Model-reference primitives (browser-safe):
977
+
978
+ - `MODEL_REFERENCE`, `lookupModelRef`
979
+ - `resolveModelContextWindow`, `extractModelContextLength`
980
+ - `detectModelFree`, `parseModelPricing`
981
+ - `DEFAULT_CONTEXT_WINDOW`, `normalizeModelKey`, `modelContextCacheKey`
982
+
983
+ Probe + fan-out primitives:
984
+
985
+ - `classifyProbeOutcome`
986
+ - `PROBE_DEFAULT_CONCURRENCY`, `PROBE_DEFAULT_TIMEOUT_MS`, `PROBE_DEFAULT_TTL_MS`
987
+ - `createFanoutLimiter`, `normalizeFanoutPolicy`, `mergeFanoutPolicy`
988
+
989
+ Client methods:
990
+
991
+ - `generate(request, opts?)`
992
+ - `stream(request, opts?)`
993
+ - `verify(target?, opts?)`
994
+ - `discoverModels(target?, opts?)`
995
+ - `discoverAcpModels(target?, opts?)`
996
+ - `checkHealth(target?)`
997
+ - `probeModels(target?, opts?)`
998
+ - `probeModelsStream(target?, opts?)`
999
+ - `resolveModelContext(input, options?)`
1000
+ - `prepareFile(input)`
1001
+ - `listRoutes(filter?)`
1002
+ - `listPublicRoutes(filter?)`
1003
+ - `listCandidateModels(filter?)`
1004
+
1005
+ `generate()` and `stream()` accept `GenerateCallOptions { signal?, pauseSignal?, timeoutMs? }`; `verify()`/`discoverModels()`/`discoverAcpModels()` accept the `{ signal?, timeoutMs? }` subset. See [Cancellation, Pause, and Timeouts](#cancellation-pause-and-timeouts), [Health Checks and Model Probes](#health-checks-and-model-probes), and [Client-Safe Projection and Flexible Routing](#client-safe-projection-and-flexible-routing).
1006
+
1007
+ `discoverAcpModels()` opens the configured ACP route, runs the ACP handshake up to `session/new`, and returns the advertised `availableModels` and `currentModelId` per route.
1008
+
1009
+ `discoverModels()` is the unified catalog API for HTTP API, ACP, local server routes, and CLI routes that delegate discovery to an ACP sidecar. Use `target.transports` when you want only one transport family.
1010
+
1011
+ Current discovery support matrix:
1012
+
1013
+ - `api`: supported
1014
+ - `acp`: supported
1015
+ - `server`: supported
1016
+ - `cli`: supported when the route config enables `transport.cli.discovery`, or when a built-in CLI preset maps discovery to ACP
1017
+
1018
+ Built-in CLI discovery defaults:
1019
+
1020
+ - `gemini-cli` -> `gemini-acp`
1021
+ - `qwen-cli` -> `qwen-acp`
1022
+ - `claude-cli` -> `claude-code-acp`
1023
+ - `codex-cli` -> `codex-acp`
1024
+ - `openclaude-cli` -> no default discovery bridge
1025
+
1026
+ CLI discovery through ACP adds ACP-side prerequisites:
1027
+
1028
+ - the ACP executable must exist
1029
+ - the ACP harness must be authenticated if that provider requires auth
1030
+ - `verify()` checks route plausibility and handler presence, but it does not perform a live discovery/auth handshake up front
1031
+
1032
+ For custom CLI wrappers, you can keep the public API uniform by delegating discovery to ACP:
1033
+
1034
+ ```ts
1035
+ transport: {
1036
+ kind: "cli",
1037
+ id: "my-gemini-wrapper",
1038
+ command: "/opt/bin/gemini-wrapper",
1039
+ cli: {
1040
+ discovery: {
1041
+ via: "acp",
1042
+ acp: {
1043
+ providerId: "gemini",
1044
+ transportId: "gemini-acp",
1045
+ auth: {
1046
+ methodId: "oauth-personal",
1047
+ },
1048
+ },
1049
+ },
1050
+ },
1051
+ }
1052
+ ```
1053
+
1054
+ ACP routes are treated as harness-owned connections:
1055
+
1056
+ - `ai-connect` does not inject `baseUrl`
1057
+ - `ai-connect` does not inject provider API keys into ACP
1058
+ - the local ACP tool is responsible for its own auth/session and upstream routing
1059
+
1060
+ Tool semantics are intentionally split:
1061
+
1062
+ - `api` routes support tool schema passthrough via `parameters.tools`
1063
+ - `api` routes also support client-managed tools through `clientTools`
1064
+ - `clientTools` are executed locally by `ai-connect` after the provider returns tool calls
1065
+ - `parameters.tools` remains the right path for upstream-managed tool schemas that are not executed by `ai-connect`
1066
+ - `acp` routes support harness-owned tool execution
1067
+ - `acp` routes do not currently forward request-defined tool schema from `parameters.tools`
1068
+ - `cli` and current built-in `server` routes do not support tool schema passthrough or tool execution
1069
+
1070
+ That distinction is also reflected in route capabilities:
1071
+
1072
+ - `supportsToolSchema`
1073
+ - `supportsToolExecution`
1074
+ - `supportsClientToolExecution`
1075
+
1076
+ Client-managed tools can be registered on the client and then selected per request:
1077
+
1078
+ ```ts
1079
+ const client = createBrowserClient(config, {
1080
+ clientTools: [
1081
+ {
1082
+ type: "function",
1083
+ function: {
1084
+ name: "lookup_weather",
1085
+ description: "Return current weather for a city",
1086
+ parameters: {
1087
+ type: "object",
1088
+ properties: {
1089
+ city: { type: "string" },
1090
+ },
1091
+ required: ["city"],
1092
+ },
1093
+ },
1094
+ async execute(args, context) {
1095
+ return {
1096
+ data: {
1097
+ city: String(args.city),
1098
+ source: "local-cache",
1099
+ workingDirectory: context.workingDirectory,
1100
+ },
1101
+ };
1102
+ },
1103
+ },
1104
+ ],
1105
+ });
1106
+
1107
+ const result = await client.generate({
1108
+ messages: [{ role: "user", content: "Check the weather in Moscow." }],
1109
+ clientTools: ["lookup_weather"],
1110
+ });
1111
+ ```
1112
+
1113
+ Current limits:
1114
+
1115
+ - `clientTools` are supported only for `generate()`
1116
+ - `clientTools` are currently supported only for text requests without attachments/image options
1117
+ - local execution of `clientTools` is implemented only for the built-in API handlers: `openai`, `anthropic`, `gemini`
1118
+
1119
+ ### Context and MCP Semantics
1120
+
1121
+ `ai-connect` separates transport routing from harness-owned context loading.
1122
+
1123
+ ACP routes:
1124
+
1125
+ - default to `workspace` context mode and `default` skills mode
1126
+ - launch from the configured local cwd, or from the current process cwd when no override is provided
1127
+ - a request-level `workingDirectory` overrides that cwd for the current inference call
1128
+ - can therefore pick up project-local context files, rules, and skills that the harness itself knows how to load
1129
+ - do not automatically inherit MCP servers from the host agent or from the current Codex session
1130
+
1131
+ Important ACP boundary:
1132
+
1133
+ - `ai-connect` currently sends `mcpServers: []` in ACP `session/new`
1134
+ - this means host-agent MCP integrations are not forwarded into the ACP harness automatically
1135
+ - if an ACP harness needs tools, skills, or MCP-style integrations, they must come from that harness's own configuration/environment
1136
+
1137
+ ACP clean mode:
1138
+
1139
+ - `transport.launch.contextMode: "clean"` asks `ai-connect` to isolate cwd/home/config best-effort for supported harnesses
1140
+ - `transport.launch.skillsMode: "disabled"` asks `ai-connect` to suppress harness-owned skills/rules where supported
1141
+ - this is strongest for harnesses where `ai-connect` has provider-specific launch isolation; for others it is best-effort
1142
+
1143
+ CLI routes:
1144
+
1145
+ - run as one-shot commands from `cli.cwd ?? process.cwd()`
1146
+ - a request-level `workingDirectory` overrides that cwd for the current inference call
1147
+ - can therefore use the current project folder as context if the underlying CLI tool inspects cwd
1148
+ - do not have a first-class `workspace` vs `clean` launch mode today
1149
+ - if you need a clean CLI run, use an isolated `cli.cwd`, custom `cli.env`, or a wrapper command
1150
+
1151
+ Server routes:
1152
+
1153
+ - use whatever context model the local HTTP server implements
1154
+ - spawned local server processes use `workingDirectory ?? server.cwd ?? process.cwd()`
1155
+ - `ai-connect` does not define project-context semantics for the server process beyond launch cwd/env overrides
1156
+
1157
+ ACP usage statistics are exposed on `result.usage` when the harness provides them. `ai-connect` currently normalizes:
1158
+
1159
+ - Gemini ACP `_meta.quota.token_count` and `_meta.quota.model_usage`
1160
+ - OpenCode ACP `usage_update` (`used`, `size`, `cost`)
1161
+ - Qwen ACP `_meta.usage` (`inputTokens`, `outputTokens`, `totalTokens`, `thoughtTokens`, `cachedReadTokens`)
1162
+
1163
+ ## Examples
1164
+
1165
+ See:
1166
+
1167
+ - [examples/acp-claude.ts](examples/acp-claude.ts)
1168
+ - [examples/acp-codex.ts](examples/acp-codex.ts)
1169
+ - [examples/acp-gemini.ts](examples/acp-gemini.ts)
1170
+ - [examples/browser-basic.ts](examples/browser-basic.ts)
1171
+ - [examples/local-acp.ts](examples/local-acp.ts)
1172
+ - [examples/local-test-server.ts](examples/local-test-server.ts)
1173
+ - [examples/rotation-fallback.ts](examples/rotation-fallback.ts)
1174
+ - [examples/image-test.ts](examples/image-test.ts)
1175
+ - [examples/image-edit-test.ts](examples/image-edit-test.ts)
1176
+ - [examples/image-workflow.ts](examples/image-workflow.ts)
1177
+ - [examples/wide-event-logging.ts](examples/wide-event-logging.ts)
1178
+
1179
+ Example execution notes:
1180
+
1181
+ - local examples run `verify()` first and print missing prerequisites clearly
1182
+ - ACP and other live network examples only execute the real prompt when `AI_CONNECT_RUN_EXAMPLE=1` is set
1183
+ - browser examples should be run in an actual browser runtime, not from Bun/Node CLI
1184
+
1185
+ ## Local Test Server Preset
1186
+
1187
+ If you are targeting the local gateway at `127.0.0.1:8045`, configure direct API routes with `transport.baseUrl`.
1188
+
1189
+ ```ts
1190
+ import { createLocalClient, defineConfig } from "@vedmalex/ai-connect";
1191
+
1192
+ const LOCAL_TEST_API_KEY = "sk-8181e6a4a59b4ec5a9931f3ae0f359c4";
1193
+
1194
+ const client = createLocalClient(
1195
+ defineConfig({
1196
+ providers: {
1197
+ openai: {
1198
+ accounts: [
1199
+ {
1200
+ id: "local-openai",
1201
+ transport: {
1202
+ kind: "api",
1203
+ baseUrl: "http://127.0.0.1:8045/v1",
1204
+ },
1205
+ models: ["gpt-oss-120b-medium"],
1206
+ credentials: [{ apiKey: LOCAL_TEST_API_KEY }],
1207
+ },
1208
+ ],
1209
+ },
1210
+ anthropic: {
1211
+ accounts: [
1212
+ {
1213
+ id: "local-anthropic",
1214
+ transport: {
1215
+ kind: "api",
1216
+ baseUrl: "http://127.0.0.1:8045/v1/messages",
1217
+ },
1218
+ models: ["claude-sonnet-4-6"],
1219
+ credentials: [{ apiKey: LOCAL_TEST_API_KEY }],
1220
+ },
1221
+ ],
1222
+ },
1223
+ gemini: {
1224
+ accounts: [
1225
+ {
1226
+ id: "local-gemini",
1227
+ transport: {
1228
+ kind: "api",
1229
+ baseUrl: "http://127.0.0.1:8045/v1beta/models",
1230
+ },
1231
+ models: ["gemini-3.1-flash-lite", "gemini-3.1-flash-image"],
1232
+ credentials: [{ apiKey: LOCAL_TEST_API_KEY }],
1233
+ },
1234
+ ],
1235
+ },
1236
+ },
1237
+ }),
1238
+ );
1239
+
1240
+ const catalog = await client.discoverModels({
1241
+ transports: ["api"],
1242
+ });
1243
+
1244
+ console.log(
1245
+ catalog.routes.flatMap((route) => route.availableModels.map((model) => model.modelId)),
1246
+ );
1247
+ ```
1248
+
1249
+ ## Publishing
1250
+
1251
+ `@vedmalex/ai-connect` ships as a public scoped npm package. The package metadata enforces the publish boundary:
1252
+
1253
+ - `publishConfig.access` is `public`, which is required for a scoped name to publish without an explicit `--access public` flag.
1254
+ - `prepublishOnly` runs `bun run check && bun run build` before any publish.
1255
+ - `bun run check` runs `bun run typecheck` followed by the full test suite, including every Deterministic Simulation Testing (DST) scenario (DST scenarios are plain `bun:test` cases, so `bun test` exercises them transitively).
1256
+ - `bun run build` compiles the runtime bundles and the type declarations.
1257
+
1258
+ Because `prepublishOnly` gates on the entire `check` + `build` pipeline, a publish is only possible when the whole suite is green and the distributable output is freshly built.
1259
+
1260
+ Only the built output ships. `files` is restricted to `dist`, so `src`, `tests`, and workspace tooling are excluded from the published tarball. You can confirm the contents before publishing:
1261
+
1262
+ ```bash
1263
+ npm pack --dry-run
1264
+ ```
1265
+
1266
+ Both `npm publish` and `bun publish` honor the `prepublishOnly` gate.