@aexhq/sdk 0.34.0 → 0.36.0

Files changed (63)
  1. package/README.md +16 -15
  2. package/dist/_contracts/index.d.ts +3 -4
  3. package/dist/_contracts/index.js +1 -4
  4. package/dist/_contracts/operations.d.ts +2 -1
  5. package/dist/_contracts/operations.js +10 -0
  6. package/dist/_contracts/run-config.d.ts +1 -3
  7. package/dist/_contracts/run-config.js +2 -7
  8. package/dist/_contracts/run-trace.d.ts +0 -86
  9. package/dist/_contracts/run-trace.js +1 -184
  10. package/dist/_contracts/run-unit.d.ts +2 -25
  11. package/dist/_contracts/run-unit.js +1 -2
  12. package/dist/_contracts/runtime-manifest.d.ts +1 -1
  13. package/dist/_contracts/runtime-security-profile.d.ts +0 -2
  14. package/dist/_contracts/runtime-security-profile.js +0 -9
  15. package/dist/_contracts/runtime-types.d.ts +25 -4
  16. package/dist/_contracts/stable.d.ts +1 -1
  17. package/dist/_contracts/stable.js +1 -1
  18. package/dist/_contracts/submission.d.ts +62 -95
  19. package/dist/_contracts/submission.js +59 -482
  20. package/dist/cli.mjs +99 -442
  21. package/dist/cli.mjs.sha256 +1 -1
  22. package/dist/client.d.ts +49 -25
  23. package/dist/client.js +341 -70
  24. package/dist/client.js.map +1 -1
  25. package/dist/index.d.ts +9 -15
  26. package/dist/index.js +11 -17
  27. package/dist/index.js.map +1 -1
  28. package/dist/retry.d.ts +162 -0
  29. package/dist/retry.js +320 -0
  30. package/dist/retry.js.map +1 -0
  31. package/dist/secret.d.ts +2 -2
  32. package/dist/secret.js +1 -1
  33. package/dist/version.d.ts +1 -1
  34. package/dist/version.js +1 -1
  35. package/docs/concepts/composition.md +8 -14
  36. package/docs/credentials.md +59 -101
  37. package/docs/defaults.md +0 -8
  38. package/docs/events.md +8 -9
  39. package/docs/limits-and-quotas.md +1 -4
  40. package/docs/limits.md +2 -6
  41. package/docs/mcp.md +4 -5
  42. package/docs/networking.md +6 -16
  43. package/docs/outputs.md +0 -4
  44. package/docs/public-surface.json +3 -3
  45. package/docs/quickstart.md +3 -7
  46. package/docs/retries.md +129 -0
  47. package/docs/run-config.md +6 -3
  48. package/docs/secrets.md +1 -1
  49. package/docs/skills.md +3 -3
  50. package/docs/vision-skills.md +52 -101
  51. package/examples/feature-tour.ts +284 -0
  52. package/package.json +1 -1
  53. package/dist/_contracts/proxy-protocol.d.ts +0 -305
  54. package/dist/_contracts/proxy-protocol.js +0 -297
  55. package/dist/_contracts/proxy-validation.d.ts +0 -19
  56. package/dist/_contracts/proxy-validation.js +0 -51
  57. package/dist/data-tools.d.ts +0 -82
  58. package/dist/data-tools.js +0 -251
  59. package/dist/data-tools.js.map +0 -1
  60. package/dist/proxy-endpoint.d.ts +0 -131
  61. package/dist/proxy-endpoint.js +0 -144
  62. package/dist/proxy-endpoint.js.map +0 -1
  63. package/examples/chat-corpus.ts +0 -84
package/docs/events.md CHANGED
@@ -53,9 +53,9 @@ for the string:
53
53
  const lastText = (await session.messages().last())?.text;
54
54
  ```
55
55
 
56
- `decodeAssistantText`, `textOf`, and `summarizeRunTrace` remain exported as the
57
- power-user escape hatch over a raw `RunEvent` list, but "get the last message"
58
- is now `await session.messages().last()`.
56
+ Prefer `session.messages().list()` or the collected `result.messages` /
57
+ `result.text` fields for assistant text. Low-level event helpers remain exported
58
+ for callers that build custom collectors.
59
59
 
60
60
  The CLI mirrors the same surface:
61
61
 
@@ -162,7 +162,7 @@ const jsonl = await response.text();
162
162
 
163
163
  ## Event shape
164
164
 
165
- Events are typed as the discriminated `RunEvent` union for compatibility and as the versioned coordinator envelope for live consumers. aex records raw runtime/provider payloads **after** secret redaction and structural sanitization, so the bytes you see never contain the provider key, MCP credentials, or proxy bearer that were supplied when the session was opened.
165
+ Events are typed as the discriminated `RunEvent` union for compatibility and as the versioned coordinator envelope for live consumers. aex records raw runtime/provider payloads **after** secret redaction and structural sanitization, so the bytes you see never contain provider keys, MCP credentials, or runtime secrets supplied when the session was opened.
166
166
 
167
167
  ## Typed helpers
168
168
 
@@ -180,8 +180,7 @@ import {
180
180
  isToolCallResult,
181
181
  isCustom,
182
182
  isLog,
183
- isEventChannel,
184
- textOf
183
+ isEventChannel
185
184
  } from "@aexhq/sdk";
186
185
  ```
187
186
 
@@ -191,6 +190,6 @@ All guards test the `type` discriminant at runtime. `isTextMessage`,
191
190
  `event.data` to the fields that event type carries — e.g. inside
192
191
  `if (isTextMessage(e))`, `e.data.text` is typed `string`. The lifecycle/channel
193
192
  guards (`isRunStarted`, `isRunError`, `isCustom`, `isLog`, …) operate on the
194
- coordinator envelope and narrow only the discriminant. `textOf(events)` returns
195
- the run's final assistant text concatenated from the `TEXT_MESSAGE_CONTENT`
196
- blocks.
193
+ coordinator envelope and narrow only the discriminant. Use `result.text` or
194
+ `session.messages.all()` when you need assistant text without inspecting the
195
+ event stream directly.
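For callers who do build a custom collector over the event stream, the narrowing behaviour described above can be sketched as follows. This is a minimal illustration with local stand-in types: `SketchEvent`, `isTextMessageSketch`, and `collectText` are hypothetical names, the `TOOL_CALL_RESULT` member and its fields are assumed, and the real `RunEvent` union in `@aexhq/sdk` carries more members than shown here.

```ts
// Local stand-in types for illustration only (not the SDK's RunEvent union).
type TextMessageEvent = { type: "TEXT_MESSAGE_CONTENT"; data: { text: string } };
type ToolCallResultEvent = { type: "TOOL_CALL_RESULT"; data: { toolCallId: string } }; // assumed shape
type SketchEvent = TextMessageEvent | ToolCallResultEvent;

// A user-defined type guard: it checks the `type` discriminant at runtime
// and tells the compiler which union member `e` is inside the `if` branch.
function isTextMessageSketch(e: SketchEvent): e is TextMessageEvent {
  return e.type === "TEXT_MESSAGE_CONTENT";
}

// Concatenate the text blocks in order, skipping every other event type.
function collectText(events: readonly SketchEvent[]): string {
  let out = "";
  for (const e of events) {
    if (isTextMessageSketch(e)) {
      out += e.data.text; // `e.data.text` is typed `string` here
    }
  }
  return out;
}
```

For most callers the collected `result.text` field is the simpler route; a collector like this is only worth writing when the events need custom filtering or ordering.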
package/docs/limits-and-quotas.md CHANGED
@@ -96,12 +96,9 @@ Default values; each is overridable per-plane via the matching
96
96
  | API token create | 10 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
97
97
  | API token delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
98
98
 
99
- ## Request scope (proxy and egress)
99
+ ## Request Scope
100
100
 
101
101
  | Limit | Value | Source | Raisable? | Constant |
102
102
  | --- | --- | --- | --- | --- |
103
- | Proxy request body | 10 MiB | aex policy | Per-endpoint via `maxRequestBytes` | `REQUEST_PROXY_DEFAULT_MAX_REQUEST_BYTES` |
104
- | Proxy response body | `0` = unlimited (streamed unbuffered) | aex policy | Per-endpoint via `maxResponseBytes` | `REQUEST_PROXY_DEFAULT_MAX_RESPONSE_BYTES` |
105
- | Proxy upstream timeout | 5 minutes | aex policy | Per-endpoint via `timeoutMs` | `REQUEST_PROXY_DEFAULT_TIMEOUT_MS` |
106
103
  | Signed output URL TTL | 300 seconds | aex policy | Per-call via `expiresSeconds` | `REQUEST_PRESIGN_URL_DEFAULT_TTL_SECONDS` |
107
104
  | Event-stream connection ticket TTL | 60 seconds | aex policy | Per-mint via `ttlMs` | `REQUEST_TICKET_DEFAULT_TTL_MS` |
package/docs/limits.md CHANGED
@@ -17,9 +17,6 @@ For the current provider/model set, see the generated
17
17
  | Area | Default |
18
18
  | --- | --- |
19
19
  | Workspace storage | 50 GiB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
20
- | Proxy request body | 10 MiB per proxy endpoint unless the endpoint declares a different `maxRequestBytes`. |
21
- | Proxy timeout | 5 minutes per proxy endpoint unless the endpoint declares a different `timeoutMs`. |
22
- | Proxy telemetry | Proxy calls emit report-only usage telemetry for call count, failed calls, request bytes, response bytes when known, and duration. Public proxy pricing is not shipped unless documented later. |
23
20
 
24
21
  ## Product Boundaries
25
22
 
@@ -27,14 +24,13 @@ For the current provider/model set, see the generated
27
24
  | --- | --- |
28
25
  | Runtime | New submissions run on the managed runtime. There is no public runtime selector. |
29
26
  | Provider policy | Provider retention, training exclusion, HIPAA/BAA, data residency, abuse policy, and pricing belong to the selected provider account, endpoint, and contract. |
30
- | Secrets | Provider keys, MCP credentials, proxy auth, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
27
+ | Secrets | Provider keys, MCP credentials, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
31
28
  | MCP servers | Remote MCP servers are customer-trusted systems. aex validates declarations and routes credentials; it does not make an untrusted MCP server safe. |
32
- | Proxy endpoints | The proxy enforces declared host/path/method/auth policy for calls routed through it. Upstream side effects and data handling remain with the upstream service and customer. |
33
29
  | Outputs | Captured outputs, events, and metadata are stored under the run record and downloaded through auth-gated routes. Output content is customer content. |
34
30
  | Human review | Runs execute after submission. Cancellation is available, but aex does not pause a run for platform-mediated approval or interactive clarification. |
35
31
  | Sessions | The durable product primitive is the session/run record. Sessions can be resumed by id and auto-suspend after the configured idle window; persistent named agent profiles and saved agent definitions are out of scope. |
36
32
  | Deployment | The supported product is the hosted aex service plus the SDK and CLI. Alternate `baseUrl` values are for local, staging, or hosted aex API planes, not a self-host product promise. |
37
- | Cost | BYOK provider-token charges accrue to the customer's provider account. aex records report-only telemetry for runtime, storage, and proxy usage; free trials, billing-grade invoices, and public pricing documents are not shipped unless documented later. |
33
+ | Cost | BYOK provider-token charges accrue to the customer's provider account. aex records report-only telemetry for runtime and storage usage; free trials, billing-grade invoices, and public pricing documents are not shipped unless documented later. |
38
34
 
39
35
  ## Provider Policy Links
40
36
 
package/docs/mcp.md CHANGED
@@ -25,14 +25,13 @@ server, so we cannot elide MCP responses or write them to the session
25
25
  filesystem on the user's behalf. Anything an MCP tool returns lands
26
26
  directly in the model's context.
27
27
 
28
- For ingestion-style tools that return large JSON blobs (search results,
29
- catalogue dumps, bulk reads), use the **CLI-as-skill + managed proxy**
30
- pattern instead of MCP:
28
+ For ingestion-style MCP servers that return large JSON blobs (search results,
29
+ catalogue dumps, bulk reads), prefer a skill that writes files instead of
30
+ putting the whole response in model context:
31
31
 
32
32
  1. Package the upstream as a skill-tool (`Tools.fromSkillDir` /
33
33
  `Tools.fromSkillUrl`) — a CLI binary the agent invokes with its bash tool.
34
- 2. Route every upstream HTTPS call through a per-run `ProxyEndpoint`
35
- (audit, byte caps, budget enforcement).
34
+ 2. Keep any upstream HTTPS credentials in `environment.secrets`.
36
35
  3. Have the CLI write the full payload to the session filesystem. By default,
37
36
  files it creates or modifies are captured automatically; pass
38
37
  `outputs.allowedDirs` only when you want to narrow capture to specific roots.
package/docs/networking.md CHANGED
@@ -26,7 +26,6 @@ These reach the network over managed paths and are **not** subject to
26
26
  - The model / provider call for the run (and its subagents).
27
27
  - The built-in `web_search` and `web_fetch` tools (still SSRF-guarded).
28
28
  - Any remote MCP servers you declare in `mcpServers` — see [MCP](mcp.md).
29
- - Any `proxyEndpoints` you declare — see [Credentials](credentials.md).
30
29
  - The package registries for any `environment.packages` you declare (pip → PyPI,
31
30
  apt → the distribution mirrors). Declaring a package implicitly allows the
32
31
  registry it installs from.
@@ -70,17 +69,8 @@ non-default port when you need one (`api.example.com:8443`); a bare host name
70
69
  covers HTTPS on 443. Matching is exact per host — it is not a wildcard or suffix
71
70
  match, so list each host you need.
72
71
 
73
- To validate your allowlist before submitting, `buildPlatformAllowedHosts` returns
74
- the host set the platform will enforce given a base URL plus your extra hosts:
75
-
76
- ```ts
77
- import { buildPlatformAllowedHosts } from "@aexhq/sdk";
78
-
79
- const allowedHosts = buildPlatformAllowedHosts({
80
- baseUrl: "https://api.aex.dev",
81
- extraHosts: ["api.example.com"]
82
- });
83
- ```
72
+ Keep the allowlist in your session options so the submitted network policy is
73
+ visible at the same call site as the code that needs it.
84
74
 
85
75
  ## Open mode
86
76
 
@@ -135,7 +125,7 @@ your client succeeds without extra setup.
135
125
  - **`allowedHosts` only applies in `limited` mode.** It is ignored in `open`
136
126
  mode, where the SSRF deny-list is the only gate.
137
127
 
138
- For routing credentialed HTTP calls through the managed proxy without putting the
139
- secret in the container, use proxy endpoints; see
140
- [Credentials](credentials.md). For remote tool servers, see [MCP](mcp.md). For
141
- the full set of run-config fields, see [Run configuration](run-config.md).
128
+ For credentialed HTTP calls, pass the credential as an `environment.secrets`
129
+ entry and let your code use its normal HTTP client. For remote tool servers, see
130
+ [MCP](mcp.md). For the full set of run-config fields, see
131
+ [Run configuration](run-config.md).
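With `buildPlatformAllowedHosts` no longer documented, the `allowedHosts` matching rule quoted above (exact per host, bare host name covers HTTPS on 443, explicit `host:port` for anything else) can be checked locally with a small helper. The sketch below is an assumption-laden illustration, not an SDK export: `normalizeEntry` and `isHostAllowed` are hypothetical names, the case-insensitive comparison is assumed, and IPv6 literals are not handled.

```ts
// Hypothetical helpers illustrating exact-host matching; not part of @aexhq/sdk.

// A bare host name is treated as "host:443"; "host:port" is kept as written.
function normalizeEntry(entry: string): string {
  const lower = entry.trim().toLowerCase();
  return lower.includes(":") ? lower : `${lower}:443`;
}

// Returns true only when the URL's host and port match an entry exactly.
// There is no wildcard or suffix matching, so subdomains need their own entry.
function isHostAllowed(targetUrl: string, allowedHosts: readonly string[]): boolean {
  const url = new URL(targetUrl);
  // URL.port is "" when the port is the scheme default, so fill it in.
  const defaultPort = url.protocol === "https:" ? "443" : "80";
  const port = url.port === "" ? defaultPort : url.port;
  const target = `${url.hostname.toLowerCase()}:${port}`;
  return allowedHosts.some((entry) => normalizeEntry(entry) === target);
}
```

Running a check like this in a unit test before submitting catches the common mistake of listing a parent domain and expecting its subdomains to pass. The platform's own enforcement remains the source of truth.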
package/docs/outputs.md CHANGED
@@ -100,10 +100,6 @@ if (truncated) {
100
100
 
101
101
  Check `truncated` before treating `text` as complete. Pass `options.grep` (a substring or `RegExp`) to keep only matching lines of the capped text. The returned `output` is the matched `Output` record, and `totalBytes` is the file's full size when the server reports it.
102
102
 
103
- ### Chatting over a workspace's outputs
104
-
105
- `createDataTools(client)` packages the read surface (`sessions.list` + `sessions.outputs(id).list` + `sessions.outputs(id).read`) as a vendor-neutral LLM tool set (`{ tools, instructions, execute }`) so you can build a search-then-fetch chat over your sessions and their outputs in a few lines on top of the public SDK. The `tools` are plain JSON-Schema definitions (the shape every major LLM tool API accepts); `execute(name, input)` dispatches a tool call against the workspace-scoped client. See the runnable `examples/data-chat/` example.
106
-
107
103
  ## Finding outputs
108
104
 
109
105
  `session.outputs().list(query?)` can filter the captured output list client-side. Use `session.outputs().find(query)` when you want discovery to be explicit, or `session.outputs().findOne(query)` when exactly one file is expected:
package/docs/public-surface.json CHANGED
@@ -2,7 +2,7 @@
2
2
  "brand": "aex",
3
3
  "productName": "Agent Executor",
4
4
  "oneLine": "aex is an agent execution platform for launching autonomous agents from a simple TypeScript SDK and CLI.",
5
- "description": "Open durable agent sessions, send turns, stream events, capture outputs, and compose agents with skills, files, MCP, proxy endpoints, and subagents across the managed runtime.",
5
+ "description": "Open durable agent sessions, send turns, stream events, capture outputs, and compose agents with skills, files, MCP, secrets, networking controls, and subagents across the managed runtime.",
6
6
  "alpha": {
7
7
  "label": "Alpha testing",
8
8
  "description": "Access is limited to invited testers while we harden the hosted runtime, dashboard, and SDK workflows."
@@ -61,7 +61,7 @@
61
61
  "slug": "agent-composition",
62
62
  "href": "/docs/features/#agent-composition",
63
63
  "title": "Agent composition",
64
- "description": "Skills, files, AGENTS.md, remote MCP servers, proxy endpoints, environment variables, packages, and networking controls."
64
+ "description": "Skills, files, AGENTS.md, remote MCP servers, environment variables, packages, secrets, and networking controls."
65
65
  },
66
66
  {
67
67
  "slug": "subagents",
@@ -79,7 +79,7 @@
79
79
  "slug": "typed-control-surface",
80
80
  "href": "/docs/features/#typed-control-surface",
81
81
  "title": "Typed control surface",
82
- "description": "Strongly typed SDK inputs, CLI parity, BYOK secrets, scoped proxy auth, redaction, and output modes."
82
+ "description": "Strongly typed SDK inputs, CLI parity, BYOK provider keys, workspace secrets, redaction, and output modes."
83
83
  }
84
84
  ]
85
85
  }
package/docs/quickstart.md CHANGED
@@ -83,11 +83,8 @@ for await (const event of turn) {
83
83
  }
84
84
  await turn.done();
85
85
 
86
- // Reads/streams/downloads are grouped into accessor sub-resources:
87
- // session.messages() / events() / outputs() / webhooks(). Grab the last
88
- // assistant message (an AssistantTextEntry; use ?.text for the string).
89
- const lastText = (await session.messages().last())?.text;
90
- console.log(lastText);
86
+ const messages = await session.messages().list();
87
+ console.log(messages.at(-1)?.text);
91
88
 
92
89
  // Poll the record until the session parks (idle / suspended / error).
93
90
  const record = await session.wait();
@@ -110,8 +107,7 @@ aex run \
110
107
 
111
108
  ## Add capabilities
112
109
 
113
- - Add files, skills, AGENTS.md, MCP servers, proxy endpoints, packages, and networking controls with [Composition](concepts/composition.md).
114
- - Inspect runtime tools with [Agent tools](concepts/agent-tools.md).
110
+ - Add files, skills, AGENTS.md, MCP servers, packages, and networking controls with [Composition](concepts/composition.md).
115
111
  - Use parent/child run delegation from the [Features](https://aex.dev/docs/features/#subagents) page.
116
112
  - Narrow output capture or download individual files with [Outputs](outputs.md).
117
113
  - Check supported providers and models in the [provider/runtime capability matrix](provider-runtime-capabilities.md).
package/docs/retries.md ADDED
@@ -0,0 +1,129 @@
1
+ ---
2
+ title: Retries and throttling
3
+ ---
4
+
5
+ # Retries and throttling
6
+
7
+ The SDK ships with built-in transport resilience. Every request it makes to the
8
+ aex API is automatically retried on **transient** failures with bounded
9
+ exponential backoff and jitter, honoring the server's `Retry-After` header. You
10
+ get this by default — no wrapper code — and it is safe to leave on because the
11
+ billable submits carry a stable idempotency key, so a retry never creates a
12
+ duplicate run.
13
+
14
+ ## What gets retried
15
+
16
+ Retried automatically:
17
+
18
+ - HTTP `429` (rate limited)
19
+ - HTTP `500`, `502`, `503`, `504` (server hiccups)
20
+ - HTTP `529` (upstream provider overloaded)
21
+ - Network errors (connection reset, DNS failure, timeout)
22
+
23
+ Never retried — these fail fast so you see the real problem immediately:
24
+
25
+ - `400` / `422` (bad request), `401` / `403` (auth), `404` (not found),
26
+ `409` (conflict), and every other non-transient `4xx`.
27
+ - A request you aborted yourself (via an `AbortSignal`).
28
+
29
+ ## Tuning or disabling
30
+
31
+ Pass a `retry` option when you construct the client:
32
+
33
+ ```ts
34
+ import { Aex } from "@aexhq/sdk";
35
+
36
+ const aex = new Aex({
37
+ apiToken: process.env.AEX_API_TOKEN!,
38
+ retry: {
39
+ maxAttempts: 4, // total tries incl. the first (default 4)
40
+ initialDelayMs: 500, // base backoff, doubles per retry (default 500)
41
+ maxDelayMs: 20_000, // cap on any single wait (default 20s)
42
+ maxElapsedMs: 120_000 // overall wall-clock budget (default 2m)
43
+ }
44
+ });
45
+ ```
46
+
47
+ Turn it off entirely with `retry: false`, or make a single attempt with
48
+ `retry: { maxAttempts: 1 }`.
49
+
50
+ ## Idempotent by construction
51
+
52
+ Retries — whether the built-in transport retry or your own re-invocation of
53
+ `run(...)` — never double-bill. The one-shot `run(...)` and `sessions.run(...)`
54
+ derive the turn's idempotency key from the session-create key, so re-invoking
55
+ either with the same `idempotencyKey` de-duplicates **both** the session create
56
+ and the billable turn server-side:
57
+
58
+ ```ts
59
+ // A retried call with the same idempotencyKey resolves to the same run,
60
+ // not a second billable one.
61
+ const result = await aex.run({
62
+ model: "claude-haiku-4-5",
63
+ message: "Write a short report and save it as a file.",
64
+ apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! },
65
+ idempotencyKey: "report-2026-07-01"
66
+ });
67
+ ```
68
+
69
+ ## Replaying a throttled turn
70
+
71
+ When a turn on a live session is interrupted by a throttle, replay the last
72
+ message with `session.replayLast()`. It reuses the previous message's idempotency
73
+ key by default, so if the original turn actually landed it de-duplicates instead
74
+ of billing twice:
75
+
76
+ ```ts
77
+ const session = await aex.openSession({
78
+ model: "claude-haiku-4-5",
79
+ apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
80
+ });
81
+
82
+ try {
83
+ await session.send("Summarize the attached dataset.").done();
84
+ } catch (err) {
85
+ const { isRateLimited } = await import("@aexhq/sdk");
86
+ if (isRateLimited(err)) {
87
+ // Wait out the throttle, then replay the same message.
88
+ await new Promise((r) => setTimeout(r, err.retryAfterMs ?? 2_000));
89
+ await session.replayLast().done();
90
+ } else {
91
+ throw err;
92
+ }
93
+ }
94
+ ```
95
+
96
+ Pass a fresh key (`session.replayLast({ idempotencyKey: "..." })`) when you
97
+ deliberately want a brand-new turn instead of a de-duplicated replay.
98
+
99
+ ## The throttle error
100
+
101
+ When retries are exhausted on a rate-limit / overloaded status, the SDK throws an
102
+ `AexRateLimitError`. It extends `AexApiError`, so existing `catch` sites keep
103
+ working, and it carries structured, non-leaky detail:
104
+
105
+ ```ts
106
+ import { isRateLimited } from "@aexhq/sdk";
107
+
108
+ try {
109
+ await aex.run({ /* … */ });
110
+ } catch (err) {
111
+ if (isRateLimited(err)) {
112
+ err.status; // 429 | 503 | 529
113
+ err.attempts; // how many tries were made
114
+ err.retryAfterMs; // suggested wait, when the server supplied one
115
+ err.source; // "api" (aex plane) or "provider" (upstream model)
116
+ err.providerFault; // upstream fault detail, when the model provider throttled
117
+ }
118
+ }
119
+ ```
120
+
121
+ The `message` is a fixed summary (e.g. `aex API rate limit reached (HTTP 429)
122
+ after 4 attempts; retry after ~2s`) — it never echoes the raw response body,
123
+ which stays available, redacted, on `err.body`.
124
+
125
+ When the throttle originated at the upstream model provider (rather than the aex
126
+ API plane), `err.source` is `"provider"` and `err.providerFault` describes it:
127
+ its `kind` (`rate_limit` / `overloaded` / `quota_exceeded` / `provider_error`),
128
+ the upstream `status`, and a suggested `retryAfterMs`. Use `parseProviderFault`
129
+ to read the same shape off a raw fault value yourself.
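The retry policy described in the added retries document (transient statuses only, doubling backoff with jitter, a per-wait cap, and `Retry-After` honored) can be sketched roughly as below. This is not the contents of `dist/retry.js`: `RetryPolicySketch`, `isRetryableStatus`, `parseRetryAfterMs`, and `backoffDelayMs` are hypothetical names, the "full jitter" strategy is an assumption, and so is the choice to cap a server hint at `maxDelayMs`. Only the default numbers and the status list come from the document.

```ts
// Hypothetical sketch of the documented policy; not the SDK's implementation.
interface RetryPolicySketch {
  maxAttempts: number;    // total tries including the first
  initialDelayMs: number; // base backoff, doubles per retry
  maxDelayMs: number;     // cap on any single wait
}

// Defaults as quoted in the document.
const DEFAULT_POLICY: RetryPolicySketch = {
  maxAttempts: 4,
  initialDelayMs: 500,
  maxDelayMs: 20000,
};

// Statuses the document lists as transient; every other 4xx fails fast.
const RETRYABLE_STATUSES = new Set<number>([429, 500, 502, 503, 504, 529]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Retry-After is either delta-seconds or an HTTP-date (RFC 9110).
// Returns milliseconds to wait, or undefined when absent or unparseable.
function parseRetryAfterMs(header: string | null, nowMs: number = Date.now()): number | undefined {
  if (header === null) return undefined;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}

// retryIndex 0 is the wait before the first retry.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  retryIndex: number,
  policy: RetryPolicySketch = DEFAULT_POLICY,
  retryAfterMs?: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(policy.maxDelayMs, policy.initialDelayMs * Math.pow(2, retryIndex));
  const jittered = random() * ceiling;   // assumed: full jitter in [0, ceiling)
  const hinted = retryAfterMs ?? 0;      // server hint acts as a lower bound
  return Math.min(policy.maxDelayMs, Math.max(jittered, hinted));
}
```

Jitter spreads retries from many clients so they do not all hit the API at the same instant after a throttle. Treating `Retry-After` as a floor rather than an exact value keeps that spread while still respecting the server's request.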
package/docs/run-config.md CHANGED
@@ -13,13 +13,16 @@ Allowed fields:
13
13
  - `mcpServers` - array of `McpServerRef`; headers are split into the vaulted secrets channel server-side.
14
14
  - `environment` - `{ networking?, packages?, variables? }`. Networking is open by default; set `networking.mode` to `limited` only when you want an allowlist. `variables` are merged into the in-container `RUNTIME.env` / `RUNTIME.json` mounts. (Run secrets go in `environment.secrets`, which carries live `Secret` instances and is not part of a shareable config.)
15
15
  - `runtime` - optional managed-runtime preset. Prefer `Sizes` in TypeScript.
16
- - `proxyEndpoints` - array of `ProxyEndpoint` instances; endpoint-level `retry` is allowed here and remains declaration-based.
17
16
  - `metadata` - non-secret structured metadata.
18
17
  - `overrides` - `{ idleTtl?, timeout?, maxSpendUsd? }`. `timeout` is an optional session deadline (e.g. `"30m"`, `"2h"`); `maxSpendUsd` stops the session once its spend would exceed the cap (see [Limits & quotas](limits-and-quotas.md)).
19
18
 
20
19
  `message` (the one-shot `run` input), `agentsMd`, `files`, `outputs`, `tools`, `includeBuiltinTools`, and `outputMode` are `openSession` / `run` options, not reusable run-config fields. They carry the turn input, bytes, capture behavior, or agent tool/output controls that belong on a concrete call. Skill bundles are `tools` entries built with `Tools.fromSkillDir(...)` / `Tools.fromSkillUrl(...)`, so they too are SDK-code options rather than config fields. Subagents run in-process; there is no `limits` / `parentRunId` option.
21
20
 
22
- Secrets never live in run config. Pass provider keys through the top-level `apiKeys` map (and run secrets through `environment.secrets`) in the SDK, or the equivalent host-mode flags (`--anthropic-api-key`, `--mcp-auth`, `--proxy-auth`) in the CLI. See [Secrets](secrets.md) for secret lifecycles and [Credentials](credentials.md) for the proxy endpoint policy/auth split and retry fields.
21
+ Secrets never live in run config. Pass provider keys through the top-level
22
+ `apiKeys` map and runtime secrets through `environment.secrets` in the SDK, or
23
+ the equivalent host-mode flags (`--anthropic-api-key`, `--mcp-auth`) in the CLI.
24
+ See [Secrets](secrets.md) for secret lifecycles and [Credentials](credentials.md)
25
+ for credential handling.
23
26
 
24
27
  ## Reuse in code
25
28
 
@@ -52,4 +55,4 @@ aex run --config ./run.json \
52
55
  --anthropic-api-key "$ANTHROPIC_API_KEY"
53
56
  ```
54
57
 
55
- ...or as explicit flags (`--model`, `--system`, `--prompt`, `--mcp`, `--mcp-auth`, `--runtime-size`, `--run-timeout`, `--proxy-endpoint`, `--proxy-auth`, `--metadata`). The two modes are mutually exclusive.
58
+ ...or as explicit flags (`--model`, `--system`, `--prompt`, `--mcp`, `--mcp-auth`, `--runtime-size`, `--run-timeout`, `--metadata`). The two modes are mutually exclusive.
package/docs/secrets.md CHANGED
@@ -111,5 +111,5 @@ await aex.run({
111
111
  await aex.secrets.delete("serper-api-key");
112
112
  ```
113
113
 
114
- The CLI supports per-run provider, MCP, and proxy credentials. Workspace secret
114
+ The CLI supports per-run provider and MCP credentials. Workspace secret
115
115
  administration is exposed through the SDK.
package/docs/skills.md CHANGED
@@ -72,9 +72,9 @@ files into the workspace under `/workspace/skills/<name>/`. So the `SKILL.md` bo
72
72
  and every supporting file are on disk from the first turn; the load-tool call is
73
73
  how that body enters the model's context, not how the files get written.
74
74
 
75
- The platform also mounts the `aex` CLI and a per-run manifest into every run.
76
- Skills call managed HTTP proxy endpoints through the mounted CLI
77
- (`aex proxy ...`); see [Credentials](credentials.md) for the policy and auth model.
75
+ Skills that call external HTTP APIs should read credentials from
76
+ `environment.secrets` and use the normal client for that service. See
77
+ [Credentials](credentials.md) for the secret model.
78
78
 
79
79
  Run-scoped asset copies are part of the run record and are removed by run deletion
80
80
  or retention cleanup.
package/docs/vision-skills.md CHANGED
@@ -1,73 +1,57 @@
1
1
  ---
2
- title: Call a vision (or any model) API from a skill
2
+ title: Call a vision API from a skill
3
3
  ---
4
4
 
5
- # Call a vision (or any model) API from a skill
5
+ # Call a vision API from a skill
6
6
 
7
- aex has no built-in vision tool. The agent's `provider`/`model` selects the
8
- *reasoning* model; it is not an endpoint a skill can POST an image to mid-run.
9
- To give a run image understanding (or to call any other model/HTTP API), ship a
10
- **skill** that POSTs to the provider's OpenAI-compatible endpoint **through the
11
- managed proxy**, with the key supplied on a `ProxyEndpoint.bearer(...)` instance.
12
- The raw key never enters the container.
7
+ aex has no built-in vision tool. The agent's `provider` / `model` selects the
8
+ reasoning model for the run; if a skill needs image understanding mid-run, ship a
9
+ skill that calls the vision provider with normal HTTP and pass that provider key
10
+ as a runtime secret.
13
11
 
14
- This is the same proxy described in `credentials.md` — this page is the worked
15
- recipe for the model-API case, which has two wrinkles a plain JSON call does not:
16
- the image rides as a **base64 data URL** in the request body, and that body is
17
- large enough to need a raised `maxRequestBytes`.
12
+ The runnable example lives at [`examples/vision-skill/`](../../../examples/vision-skill).
13
+ It captions a frame with ByteDance Doubao Seed Vision (Ark) and returns a
14
+ per-noun "does the frame depict X?" verdict.
18
15
 
19
- The canonical, runnable example lives in the repo at
20
- [`examples/vision-skill/`](../../../examples/vision-skill) (`SKILL.md`,
21
- `caption_frame.py`, `verify_frame.py`, `run_with_vision_skill.mjs`). It
22
- captions a frame with ByteDance Doubao Seed Vision (Ark) and returns a per-noun
23
- "does the frame depict X?" verdict. Everything below is taken from it.
24
-
25
- ## 1. Declare the model endpoint as a proxy endpoint
26
-
27
- The vision provider's API is just an HTTPS host. Declare it with
28
- `ProxyEndpoint.bearer(...)`, which carries the key on the instance. The two
29
- model-specific settings are `responseMode: "full"` (so the skill gets the upstream
30
- JSON back) and a raised `maxRequestBytes` (so the base64 image fits):
16
+ ## Submit the run
31
17
 
32
18
  ```ts
33
- import { Aex, Models, Tools, ProxyEndpoint } from "@aexhq/sdk";
19
+ import { Aex, Models, Secret, Tools } from "@aexhq/sdk";
34
20
 
35
21
  const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! });
36
22
 
37
- const doubaoArk = ProxyEndpoint.bearer({
38
- name: "doubao-ark",
39
- baseUrl: "https://ark.ap-southeast.bytepluses.com", // intl BytePlus gateway
40
- token: process.env.DOUBAO_API_KEY!,
41
- allowMethods: ["POST"],
42
- allowPathPrefixes: ["/api/v3/chat/completions"],
43
- maxRequestBytes: 2_000_000, // base64 image POSTs — see note below
44
- responseMode: "full",
45
- timeoutMs: 60_000
46
- });
47
-
48
- await aex.run({
23
+ const result = await aex.run({
49
24
  model: Models.CLAUDE_HAIKU_4_5,
50
- message: "…read skills/frame-vision-gate/SKILL.md, then caption + verify the frame",
25
+ message: "Read skills/frame-vision-gate/SKILL.md, then caption and verify the frame.",
51
26
  tools: [await Tools.fromSkillDir("./vision-skill", { name: "frame-vision-gate" })],
52
- proxyEndpoints: [doubaoArk],
27
+ environment: {
28
+ secrets: {
29
+ DOUBAO_API_KEY: Secret.value(process.env.DOUBAO_API_KEY!)
30
+ },
31
+ networking: {
32
+ mode: "limited",
33
+ allowedHosts: ["ark.ap-southeast.bytepluses.com"]
34
+ }
35
+ },
53
36
  apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
54
37
  });
38
+
39
+ console.log(result.runId, result.text);
55
40
  ```
 
- `Tools.fromSkillDir("./vision-skill", )` is resolved relative to the process CWD, so
- run the script from the directory that *contains* `vision-skill/` (in the
- repo, that is `examples/`). The same pattern works for OpenAI, Gemini's
- OpenAI-compatible endpoint, or any other OpenAI-chat-shaped vision API — only
- `baseUrl` and the path prefix change.
+ `Tools.fromSkillDir("./vision-skill", ...)` is resolved relative to the process
+ CWD. Run the script from the directory that contains `vision-skill/` (in this
+ repo, `examples/`).
 
- ## 2. POST the image as a base64 data URL through the proxy
+ ## Call the provider from the skill
 
- Inside the run, the skill builds the OpenAI-compatible chat-completions body. The
- image is **base64-inlined as a data URL** in an `image_url` content part — it is
- not uploaded:
+ Inside the run, the skill reads `DOUBAO_API_KEY` and makes an
+ OpenAI-compatible chat-completions request with Python's standard HTTP client.
+ The image is base64-inlined as a data URL in the request body:
 
  ```python
- import base64, json
+ import base64, json, os, urllib.request
+
  b64 = base64.b64encode(open("/workspace/files/frame.jpg", "rb").read()).decode()
  request_body = {
  "model": "doubao-seed-1-6-vision-250815",
@@ -81,63 +65,30 @@ request_body = {
  ]}
  ]
  }
- ```
-
- Write the body to a file and hand it to the mounted CLI with `--data @<file>`
- (the mount has no execute bit, so invoke through `bun`; see `credentials.md`):
 
- ```python
- import subprocess
- body_path = "/workspace/.aex/_ark_request.json"
- open(body_path, "w").write(json.dumps(request_body))
-
- result = subprocess.run(
- ["bun", "/mnt/session/uploads/aex/aex", "proxy", "doubao-ark",
- "--method", "POST",
- "--path", "/api/v3/chat/completions",
- "--header", "content-type=application/json",
- "--data", f"@{body_path}",
- "--response-mode", "full"],
- capture_output=True, text=True, timeout=90,
+ req = urllib.request.Request(
+ "https://ark.ap-southeast.bytepluses.com/api/v3/chat/completions",
+ data=json.dumps(request_body).encode("utf-8"),
+ headers={
+ "Authorization": f"Bearer {os.environ['DOUBAO_API_KEY']}",
+ "Content-Type": "application/json"
+ },
+ method="POST",
  )
  ```
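The new snippet in this hunk stops once `req` is built. A minimal sketch of how the skill might send it and read the answer back could look like the following. It assumes the provider returns the usual OpenAI-compatible shape (`choices[0].message.content`), as the removed proxy example did; the helper names `extract_content` and `send_chat_request` and the 90-second timeout are illustrative, not part of the SDK or the documented skill.

```python
import json
import urllib.error
import urllib.request


def extract_content(response_json: dict) -> str:
    """Return the first choice's message text from an OpenAI-style reply."""
    return response_json["choices"][0]["message"]["content"]


def send_chat_request(req: urllib.request.Request, timeout: float = 90.0) -> str:
    """Send a prepared chat-completions request and return the model's answer.

    The timeout value is an assumption; tune it for the provider in use.
    """
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Surface the provider's error body instead of a bare status code.
        detail = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"provider returned HTTP {err.code}: {detail}") from err
    return extract_content(payload)
```

With the `req` object from the snippet above, `send_chat_request(req)` would return the model's answer as a string, assuming the host is reachable under the run's networking policy.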
 
- In `--response-mode full` the CLI prints a `ProxyResponseEnvelope` on stdout. The
- upstream JSON is **base64-encoded** in `upstreamBodyBase64`; an error instead
- carries an `error` field. Unwrap it:
-
- ```python
- envelope = json.loads(result.stdout)
- if "error" in envelope:
- raise RuntimeError(f"proxy error: {envelope['error']}: {envelope['message']}")
- upstream = json.loads(base64.b64decode(envelope["upstreamBodyBase64"]).decode())
- content = upstream["choices"][0]["message"]["content"] # the model's JSON answer
- ```
-
- The key is injected by the hosted proxy on the outbound call; it never appears on disk in
- the container or in the model's context.
-
- ## `maxRequestBytes` and timeout defaults
+ The same pattern works for OpenAI, Gemini's OpenAI-compatible endpoint, or any
+ other HTTPS model API. Put the key in `environment.secrets`, allow-list the host
+ when using limited networking, and use the provider's normal SDK or HTTP API.
 
- The per-endpoint `maxRequestBytes` default is **10 MiB** and the default timeout
- is **5 minutes**. That fits typical base64 image/model POSTs without extra
- configuration. If a body does exceed the cap, the proxy rejects it before any
- upstream call with an explicit error naming the observed size, the configured
- cap, and how to raise it:
+ ## Payload size
 
- > request body is 2400000 bytes, which exceeds this endpoint's maxRequestBytes
- > (10485760). Raise the per-endpoint maxRequestBytes in the proxy endpoint policy …
+ Base64 images are larger than their source files. Scale frames before captioning
+ when possible, for example:
 
- Two ways to stay under the cap: raise `maxRequestBytes`, and/or scale frames
- before captioning (`ffmpeg -i source.mp4 -vf fps=1,scale=960:-1 frame_%03d.jpg`)
- so full-res frames do not add payload and model cost without useful signal.
-
- ## Notes
+ ```bash
+ ffmpeg -i source.mp4 -vf fps=1,scale=960:-1 frame_%03d.jpg
+ ```
 
- - **Host selection.** Use the provider endpoint that matches your account and
- declare it as the proxy endpoint `baseUrl`.
- - **Keyless model hosts.** If the upstream takes no credential, declare the
- endpoint with `ProxyEndpoint.none(...)` (see `credentials.md`).
- - **Response size.** `responseMode: "full"` is required to read the model's reply
- back. Leave `maxResponseBytes` at its default (`0` = unlimited, streamed) unless
- you want a truncation cap.
+ This keeps upload size and model cost bounded without losing the signal most
+ vision models need.
+ vision models need.