@aexhq/sdk 0.34.0 → 0.36.0
- package/README.md +16 -15
- package/dist/_contracts/index.d.ts +3 -4
- package/dist/_contracts/index.js +1 -4
- package/dist/_contracts/operations.d.ts +2 -1
- package/dist/_contracts/operations.js +10 -0
- package/dist/_contracts/run-config.d.ts +1 -3
- package/dist/_contracts/run-config.js +2 -7
- package/dist/_contracts/run-trace.d.ts +0 -86
- package/dist/_contracts/run-trace.js +1 -184
- package/dist/_contracts/run-unit.d.ts +2 -25
- package/dist/_contracts/run-unit.js +1 -2
- package/dist/_contracts/runtime-manifest.d.ts +1 -1
- package/dist/_contracts/runtime-security-profile.d.ts +0 -2
- package/dist/_contracts/runtime-security-profile.js +0 -9
- package/dist/_contracts/runtime-types.d.ts +25 -4
- package/dist/_contracts/stable.d.ts +1 -1
- package/dist/_contracts/stable.js +1 -1
- package/dist/_contracts/submission.d.ts +62 -95
- package/dist/_contracts/submission.js +59 -482
- package/dist/cli.mjs +99 -442
- package/dist/cli.mjs.sha256 +1 -1
- package/dist/client.d.ts +49 -25
- package/dist/client.js +341 -70
- package/dist/client.js.map +1 -1
- package/dist/index.d.ts +9 -15
- package/dist/index.js +11 -17
- package/dist/index.js.map +1 -1
- package/dist/retry.d.ts +162 -0
- package/dist/retry.js +320 -0
- package/dist/retry.js.map +1 -0
- package/dist/secret.d.ts +2 -2
- package/dist/secret.js +1 -1
- package/dist/version.d.ts +1 -1
- package/dist/version.js +1 -1
- package/docs/concepts/composition.md +8 -14
- package/docs/credentials.md +59 -101
- package/docs/defaults.md +0 -8
- package/docs/events.md +8 -9
- package/docs/limits-and-quotas.md +1 -4
- package/docs/limits.md +2 -6
- package/docs/mcp.md +4 -5
- package/docs/networking.md +6 -16
- package/docs/outputs.md +0 -4
- package/docs/public-surface.json +3 -3
- package/docs/quickstart.md +3 -7
- package/docs/retries.md +129 -0
- package/docs/run-config.md +6 -3
- package/docs/secrets.md +1 -1
- package/docs/skills.md +3 -3
- package/docs/vision-skills.md +52 -101
- package/examples/feature-tour.ts +284 -0
- package/package.json +1 -1
- package/dist/_contracts/proxy-protocol.d.ts +0 -305
- package/dist/_contracts/proxy-protocol.js +0 -297
- package/dist/_contracts/proxy-validation.d.ts +0 -19
- package/dist/_contracts/proxy-validation.js +0 -51
- package/dist/data-tools.d.ts +0 -82
- package/dist/data-tools.js +0 -251
- package/dist/data-tools.js.map +0 -1
- package/dist/proxy-endpoint.d.ts +0 -131
- package/dist/proxy-endpoint.js +0 -144
- package/dist/proxy-endpoint.js.map +0 -1
- package/examples/chat-corpus.ts +0 -84
package/docs/events.md
CHANGED
|
@@ -53,9 +53,9 @@ for the string:
|
|
|
53
53
|
const lastText = (await session.messages().last())?.text;
|
|
54
54
|
```
|
|
55
55
|
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
56
|
+
Prefer `session.messages().list()` or the collected `result.messages` /
|
|
57
|
+
`result.text` fields for assistant text. Low-level event helpers remain exported
|
|
58
|
+
for callers that build custom collectors.
|
|
59
59
|
|
|
60
60
|
The CLI mirrors the same surface:
|
|
61
61
|
|
|
@@ -162,7 +162,7 @@ const jsonl = await response.text();
|
|
|
162
162
|
|
|
163
163
|
## Event shape
|
|
164
164
|
|
|
165
|
-
Events are typed as the discriminated `RunEvent` union for compatibility and as the versioned coordinator envelope for live consumers. aex records raw runtime/provider payloads **after** secret redaction and structural sanitization, so the bytes you see never contain
|
|
165
|
+
Events are typed as the discriminated `RunEvent` union for compatibility and as the versioned coordinator envelope for live consumers. aex records raw runtime/provider payloads **after** secret redaction and structural sanitization, so the bytes you see never contain provider keys, MCP credentials, or runtime secrets supplied when the session was opened.
|
|
166
166
|
|
|
167
167
|
## Typed helpers
|
|
168
168
|
|
|
@@ -180,8 +180,7 @@ import {
|
|
|
180
180
|
isToolCallResult,
|
|
181
181
|
isCustom,
|
|
182
182
|
isLog,
|
|
183
|
-
isEventChannel
|
|
184
|
-
textOf
|
|
183
|
+
isEventChannel
|
|
185
184
|
} from "@aexhq/sdk";
|
|
186
185
|
```
|
|
187
186
|
|
|
@@ -191,6 +190,6 @@ All guards test the `type` discriminant at runtime. `isTextMessage`,
|
|
|
191
190
|
`event.data` to the fields that event type carries — e.g. inside
|
|
192
191
|
`if (isTextMessage(e))`, `e.data.text` is typed `string`. The lifecycle/channel
|
|
193
192
|
guards (`isRunStarted`, `isRunError`, `isCustom`, `isLog`, …) operate on the
|
|
194
|
-
coordinator envelope and narrow only the discriminant. `
|
|
195
|
-
|
|
196
|
-
|
|
193
|
+
coordinator envelope and narrow only the discriminant. Use `result.text` or
|
|
194
|
+
`session.messages.all()` when you need assistant text without inspecting the
|
|
195
|
+
event stream directly.
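Reviewer note on the events.md hunk above: the text says the guards test the `type` discriminant at runtime and narrow `event.data`. A minimal sketch of that narrowing pattern follows. The event shapes, the `"text_message"` and `"log"` discriminant strings, and the `isTextMessageSketch` / `collectText` names are assumptions made for illustration; they are not the SDK's `RunEvent` union or its exported guards.

```typescript
// Hypothetical event shapes; the real RunEvent union in @aexhq/sdk has more
// members and may use different discriminant strings.
type TextMessageEvent = { type: "text_message"; data: { text: string } };
type LogEvent = { type: "log"; data: { level: string; message: string } };
type SketchEvent = TextMessageEvent | LogEvent;

// A user-defined type guard: checks the discriminant at runtime and tells the
// compiler which member of the union the value is.
function isTextMessageSketch(e: SketchEvent): e is TextMessageEvent {
  return e.type === "text_message";
}

// A tiny custom collector: concatenates the text of every text-message event.
function collectText(events: readonly SketchEvent[]): string {
  const parts: string[] = [];
  for (const e of events) {
    if (isTextMessageSketch(e)) {
      // Inside this branch e.data.text is typed as string.
      parts.push(e.data.text);
    }
  }
  return parts.join("");
}
```

A collector like this is only needed for custom handling; the updated docs point to `result.text` for the common case.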
|
package/docs/limits-and-quotas.md
CHANGED
|
@@ -96,12 +96,9 @@ Default values; each is overridable per-plane via the matching
|
|
|
96
96
|
| API token create | 10 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
|
|
97
97
|
| API token delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
|
|
98
98
|
|
|
99
|
-
## Request
|
|
99
|
+
## Request Scope
|
|
100
100
|
|
|
101
101
|
| Limit | Value | Source | Raisable? | Constant |
|
|
102
102
|
| --- | --- | --- | --- | --- |
|
|
103
|
-
| Proxy request body | 10 MiB | aex policy | Per-endpoint via `maxRequestBytes` | `REQUEST_PROXY_DEFAULT_MAX_REQUEST_BYTES` |
|
|
104
|
-
| Proxy response body | `0` = unlimited (streamed unbuffered) | aex policy | Per-endpoint via `maxResponseBytes` | `REQUEST_PROXY_DEFAULT_MAX_RESPONSE_BYTES` |
|
|
105
|
-
| Proxy upstream timeout | 5 minutes | aex policy | Per-endpoint via `timeoutMs` | `REQUEST_PROXY_DEFAULT_TIMEOUT_MS` |
|
|
106
103
|
| Signed output URL TTL | 300 seconds | aex policy | Per-call via `expiresSeconds` | `REQUEST_PRESIGN_URL_DEFAULT_TTL_SECONDS` |
|
|
107
104
|
| Event-stream connection ticket TTL | 60 seconds | aex policy | Per-mint via `ttlMs` | `REQUEST_TICKET_DEFAULT_TTL_MS` |
|
package/docs/limits.md
CHANGED
|
@@ -17,9 +17,6 @@ For the current provider/model set, see the generated
|
|
|
17
17
|
| Area | Default |
|
|
18
18
|
| --- | --- |
|
|
19
19
|
| Workspace storage | 50 GiB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
|
|
20
|
-
| Proxy request body | 10 MiB per proxy endpoint unless the endpoint declares a different `maxRequestBytes`. |
|
|
21
|
-
| Proxy timeout | 5 minutes per proxy endpoint unless the endpoint declares a different `timeoutMs`. |
|
|
22
|
-
| Proxy telemetry | Proxy calls emit report-only usage telemetry for call count, failed calls, request bytes, response bytes when known, and duration. Public proxy pricing is not shipped unless documented later. |
|
|
23
20
|
|
|
24
21
|
## Product Boundaries
|
|
25
22
|
|
|
@@ -27,14 +24,13 @@ For the current provider/model set, see the generated
|
|
|
27
24
|
| --- | --- |
|
|
28
25
|
| Runtime | New submissions run on the managed runtime. There is no public runtime selector. |
|
|
29
26
|
| Provider policy | Provider retention, training exclusion, HIPAA/BAA, data residency, abuse policy, and pricing belong to the selected provider account, endpoint, and contract. |
|
|
30
|
-
| Secrets | Provider keys, MCP credentials,
|
|
27
|
+
| Secrets | Provider keys, MCP credentials, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
|
|
31
28
|
| MCP servers | Remote MCP servers are customer-trusted systems. aex validates declarations and routes credentials; it does not make an untrusted MCP server safe. |
|
|
32
|
-
| Proxy endpoints | The proxy enforces declared host/path/method/auth policy for calls routed through it. Upstream side effects and data handling remain with the upstream service and customer. |
|
|
33
29
|
| Outputs | Captured outputs, events, and metadata are stored under the run record and downloaded through auth-gated routes. Output content is customer content. |
|
|
34
30
|
| Human review | Runs execute after submission. Cancellation is available, but aex does not pause a run for platform-mediated approval or interactive clarification. |
|
|
35
31
|
| Sessions | The durable product primitive is the session/run record. Sessions can be resumed by id and auto-suspend after the configured idle window; persistent named agent profiles and saved agent definitions are out of scope. |
|
|
36
32
|
| Deployment | The supported product is the hosted aex service plus the SDK and CLI. Alternate `baseUrl` values are for local, staging, or hosted aex API planes, not a self-host product promise. |
|
|
37
|
-
| Cost | BYOK provider-token charges accrue to the customer's provider account. aex records report-only telemetry for runtime
|
|
33
|
+
| Cost | BYOK provider-token charges accrue to the customer's provider account. aex records report-only telemetry for runtime and storage usage; free trials, billing-grade invoices, and public pricing documents are not shipped unless documented later. |
|
|
38
34
|
|
|
39
35
|
## Provider Policy Links
|
|
40
36
|
|
package/docs/mcp.md
CHANGED
|
@@ -25,14 +25,13 @@ server, so we cannot elide MCP responses or write them to the session
|
|
|
25
25
|
filesystem on the user's behalf. Anything an MCP tool returns lands
|
|
26
26
|
directly in the model's context.
|
|
27
27
|
|
|
28
|
-
For ingestion-style
|
|
29
|
-
catalogue dumps, bulk reads),
|
|
30
|
-
|
|
28
|
+
For ingestion-style MCP servers that return large JSON blobs (search results,
|
|
29
|
+
catalogue dumps, bulk reads), prefer a skill that writes files instead of
|
|
30
|
+
putting the whole response in model context:
|
|
31
31
|
|
|
32
32
|
1. Package the upstream as a skill-tool (`Tools.fromSkillDir` /
|
|
33
33
|
`Tools.fromSkillUrl`) — a CLI binary the agent invokes with its bash tool.
|
|
34
|
-
2.
|
|
35
|
-
(audit, byte caps, budget enforcement).
|
|
34
|
+
2. Keep any upstream HTTPS credentials in `environment.secrets`.
|
|
36
35
|
3. Have the CLI write the full payload to the session filesystem. By default,
|
|
37
36
|
files it creates or modifies are captured automatically; pass
|
|
38
37
|
`outputs.allowedDirs` only when you want to narrow capture to specific roots.
|
package/docs/networking.md
CHANGED
|
@@ -26,7 +26,6 @@ These reach the network over managed paths and are **not** subject to
|
|
|
26
26
|
- The model / provider call for the run (and its subagents).
|
|
27
27
|
- The built-in `web_search` and `web_fetch` tools (still SSRF-guarded).
|
|
28
28
|
- Any remote MCP servers you declare in `mcpServers` — see [MCP](mcp.md).
|
|
29
|
-
- Any `proxyEndpoints` you declare — see [Credentials](credentials.md).
|
|
30
29
|
- The package registries for any `environment.packages` you declare (pip → PyPI,
|
|
31
30
|
apt → the distribution mirrors). Declaring a package implicitly allows the
|
|
32
31
|
registry it installs from.
|
|
@@ -70,17 +69,8 @@ non-default port when you need one (`api.example.com:8443`); a bare host name
|
|
|
70
69
|
covers HTTPS on 443. Matching is exact per host — it is not a wildcard or suffix
|
|
71
70
|
match, so list each host you need.
|
|
72
71
|
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
```ts
|
|
77
|
-
import { buildPlatformAllowedHosts } from "@aexhq/sdk";
|
|
78
|
-
|
|
79
|
-
const allowedHosts = buildPlatformAllowedHosts({
|
|
80
|
-
baseUrl: "https://api.aex.dev",
|
|
81
|
-
extraHosts: ["api.example.com"]
|
|
82
|
-
});
|
|
83
|
-
```
|
|
72
|
+
Keep the allowlist in your session options so the submitted network policy is
|
|
73
|
+
visible at the same call site as the code that needs it.
|
|
84
74
|
|
|
85
75
|
## Open mode
|
|
86
76
|
|
|
@@ -135,7 +125,7 @@ your client succeeds without extra setup.
|
|
|
135
125
|
- **`allowedHosts` only applies in `limited` mode.** It is ignored in `open`
|
|
136
126
|
mode, where the SSRF deny-list is the only gate.
|
|
137
127
|
|
|
138
|
-
For
|
|
139
|
-
|
|
140
|
-
[
|
|
141
|
-
|
|
128
|
+
For credentialed HTTP calls, pass the credential as an `environment.secrets`
|
|
129
|
+
entry and let your code use its normal HTTP client. For remote tool servers, see
|
|
130
|
+
[MCP](mcp.md). For the full set of run-config fields, see
|
|
131
|
+
[Run configuration](run-config.md).
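Reviewer note on the networking.md hunks above: the surrounding text states that `allowedHosts` matching is exact per host, that a bare host name covers HTTPS on 443, and that a non-default port must be listed explicitly. The sketch below illustrates that rule. `hostKey` and `isHostAllowed` are hypothetical helpers, not SDK exports, and rejecting non-HTTPS URLs is an assumption the docs do not spell out.

```typescript
// Builds the allowlist key for a URL: "host" on the default port,
// "host:port" otherwise. URL.port is "" when the scheme's default port is used.
function hostKey(url: URL): string {
  return url.port === "" ? url.hostname : `${url.hostname}:${url.port}`;
}

// Exact match only: no wildcard and no suffix matching.
function isHostAllowed(target: string, allowedHosts: readonly string[]): boolean {
  const url = new URL(target);
  // Assumption: only HTTPS targets are considered in this sketch.
  if (url.protocol !== "https:") return false;
  const key = hostKey(url);
  return allowedHosts.some((entry) => entry === key);
}
```

Under this rule, listing `api.example.com` does not admit `sub.api.example.com`, which is why the docs ask for every host to be listed.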
|
package/docs/outputs.md
CHANGED
|
@@ -100,10 +100,6 @@ if (truncated) {
|
|
|
100
100
|
|
|
101
101
|
Check `truncated` before treating `text` as complete. Pass `options.grep` (a substring or `RegExp`) to keep only matching lines of the capped text. The returned `output` is the matched `Output` record, and `totalBytes` is the file's full size when the server reports it.
|
|
102
102
|
|
|
103
|
-
### Chatting over a workspace's outputs
|
|
104
|
-
|
|
105
|
-
`createDataTools(client)` packages the read surface (`sessions.list` + `sessions.outputs(id).list` + `sessions.outputs(id).read`) as a vendor-neutral LLM tool set (`{ tools, instructions, execute }`) so you can build a search-then-fetch chat over your sessions and their outputs in a few lines on top of the public SDK. The `tools` are plain JSON-Schema definitions (the shape every major LLM tool API accepts); `execute(name, input)` dispatches a tool call against the workspace-scoped client. See the runnable `examples/data-chat/` example.
|
|
106
|
-
|
|
107
103
|
## Finding outputs
|
|
108
104
|
|
|
109
105
|
`session.outputs().list(query?)` can filter the captured output list client-side. Use `session.outputs().find(query)` when you want discovery to be explicit, or `session.outputs().findOne(query)` when exactly one file is expected:
|
package/docs/public-surface.json
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
"brand": "aex",
|
|
3
3
|
"productName": "Agent Executor",
|
|
4
4
|
"oneLine": "aex is an agent execution platform for launching autonomous agents from a simple TypeScript SDK and CLI.",
|
|
5
|
-
"description": "Open durable agent sessions, send turns, stream events, capture outputs, and compose agents with skills, files, MCP,
|
|
5
|
+
"description": "Open durable agent sessions, send turns, stream events, capture outputs, and compose agents with skills, files, MCP, secrets, networking controls, and subagents across the managed runtime.",
|
|
6
6
|
"alpha": {
|
|
7
7
|
"label": "Alpha testing",
|
|
8
8
|
"description": "Access is limited to invited testers while we harden the hosted runtime, dashboard, and SDK workflows."
|
|
@@ -61,7 +61,7 @@
|
|
|
61
61
|
"slug": "agent-composition",
|
|
62
62
|
"href": "/docs/features/#agent-composition",
|
|
63
63
|
"title": "Agent composition",
|
|
64
|
-
"description": "Skills, files, AGENTS.md, remote MCP servers,
|
|
64
|
+
"description": "Skills, files, AGENTS.md, remote MCP servers, environment variables, packages, secrets, and networking controls."
|
|
65
65
|
},
|
|
66
66
|
{
|
|
67
67
|
"slug": "subagents",
|
|
@@ -79,7 +79,7 @@
|
|
|
79
79
|
"slug": "typed-control-surface",
|
|
80
80
|
"href": "/docs/features/#typed-control-surface",
|
|
81
81
|
"title": "Typed control surface",
|
|
82
|
-
"description": "Strongly typed SDK inputs, CLI parity, BYOK
|
|
82
|
+
"description": "Strongly typed SDK inputs, CLI parity, BYOK provider keys, workspace secrets, redaction, and output modes."
|
|
83
83
|
}
|
|
84
84
|
]
|
|
85
85
|
}
|
package/docs/quickstart.md
CHANGED
|
@@ -83,11 +83,8 @@ for await (const event of turn) {
|
|
|
83
83
|
}
|
|
84
84
|
await turn.done();
|
|
85
85
|
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
// assistant message (an AssistantTextEntry; use ?.text for the string).
|
|
89
|
-
const lastText = (await session.messages().last())?.text;
|
|
90
|
-
console.log(lastText);
|
|
86
|
+
const messages = await session.messages().list();
|
|
87
|
+
console.log(messages.at(-1)?.text);
|
|
91
88
|
|
|
92
89
|
// Poll the record until the session parks (idle / suspended / error).
|
|
93
90
|
const record = await session.wait();
|
|
@@ -110,8 +107,7 @@ aex run \
|
|
|
110
107
|
|
|
111
108
|
## Add capabilities
|
|
112
109
|
|
|
113
|
-
- Add files, skills, AGENTS.md, MCP servers,
|
|
114
|
-
- Inspect runtime tools with [Agent tools](concepts/agent-tools.md).
|
|
110
|
+
- Add files, skills, AGENTS.md, MCP servers, packages, and networking controls with [Composition](concepts/composition.md).
|
|
115
111
|
- Use parent/child run delegation from the [Features](https://aex.dev/docs/features/#subagents) page.
|
|
116
112
|
- Narrow output capture or download individual files with [Outputs](outputs.md).
|
|
117
113
|
- Check supported providers and models in the [provider/runtime capability matrix](provider-runtime-capabilities.md).
|
package/docs/retries.md
ADDED
|
@@ -0,0 +1,129 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Retries and throttling
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# Retries and throttling
|
|
6
|
+
|
|
7
|
+
The SDK ships with built-in transport resilience. Every request it makes to the
|
|
8
|
+
aex API is automatically retried on **transient** failures with bounded
|
|
9
|
+
exponential backoff and jitter, honoring the server's `Retry-After` header. You
|
|
10
|
+
get this by default — no wrapper code — and it is safe to leave on because the
|
|
11
|
+
billable submits carry a stable idempotency key, so a retry never creates a
|
|
12
|
+
duplicate run.
|
|
13
|
+
|
|
14
|
+
## What gets retried
|
|
15
|
+
|
|
16
|
+
Retried automatically:
|
|
17
|
+
|
|
18
|
+
- HTTP `429` (rate limited)
|
|
19
|
+
- HTTP `500`, `502`, `503`, `504` (server hiccups)
|
|
20
|
+
- HTTP `529` (upstream provider overloaded)
|
|
21
|
+
- Network errors (connection reset, DNS failure, timeout)
|
|
22
|
+
|
|
23
|
+
Never retried — these fail fast so you see the real problem immediately:
|
|
24
|
+
|
|
25
|
+
- `400` / `422` (bad request), `401` / `403` (auth), `404` (not found),
|
|
26
|
+
`409` (conflict), and every other non-transient `4xx`.
|
|
27
|
+
- A request you aborted yourself (via an `AbortSignal`).
|
|
28
|
+
|
|
29
|
+
## Tuning or disabling
|
|
30
|
+
|
|
31
|
+
Pass a `retry` option when you construct the client:
|
|
32
|
+
|
|
33
|
+
```ts
|
|
34
|
+
import { Aex } from "@aexhq/sdk";
|
|
35
|
+
|
|
36
|
+
const aex = new Aex({
|
|
37
|
+
apiToken: process.env.AEX_API_TOKEN!,
|
|
38
|
+
retry: {
|
|
39
|
+
maxAttempts: 4, // total tries incl. the first (default 4)
|
|
40
|
+
initialDelayMs: 500, // base backoff, doubles per retry (default 500)
|
|
41
|
+
maxDelayMs: 20_000, // cap on any single wait (default 20s)
|
|
42
|
+
maxElapsedMs: 120_000 // overall wall-clock budget (default 2m)
|
|
43
|
+
}
|
|
44
|
+
});
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
Turn it off entirely with `retry: false`, or make a single attempt with
|
|
48
|
+
`retry: { maxAttempts: 1 }`.
|
|
49
|
+
|
|
50
|
+
## Idempotent by construction
|
|
51
|
+
|
|
52
|
+
Retries — whether the built-in transport retry or your own re-invocation of
|
|
53
|
+
`run(...)` — never double-bill. The one-shot `run(...)` and `sessions.run(...)`
|
|
54
|
+
derive the turn's idempotency key from the session-create key, so re-invoking
|
|
55
|
+
either with the same `idempotencyKey` de-duplicates **both** the session create
|
|
56
|
+
and the billable turn server-side:
|
|
57
|
+
|
|
58
|
+
```ts
|
|
59
|
+
// A retried call with the same idempotencyKey resolves to the same run,
|
|
60
|
+
// not a second billable one.
|
|
61
|
+
const result = await aex.run({
|
|
62
|
+
model: "claude-haiku-4-5",
|
|
63
|
+
message: "Write a short report and save it as a file.",
|
|
64
|
+
apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! },
|
|
65
|
+
idempotencyKey: "report-2026-07-01"
|
|
66
|
+
});
|
|
67
|
+
```
|
|
68
|
+
|
|
69
|
+
## Replaying a throttled turn
|
|
70
|
+
|
|
71
|
+
When a turn on a live session is interrupted by a throttle, replay the last
|
|
72
|
+
message with `session.replayLast()`. It reuses the previous message's idempotency
|
|
73
|
+
key by default, so if the original turn actually landed it de-duplicates instead
|
|
74
|
+
of billing twice:
|
|
75
|
+
|
|
76
|
+
```ts
|
|
77
|
+
const session = await aex.openSession({
|
|
78
|
+
model: "claude-haiku-4-5",
|
|
79
|
+
apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
|
|
80
|
+
});
|
|
81
|
+
|
|
82
|
+
try {
|
|
83
|
+
await session.send("Summarize the attached dataset.").done();
|
|
84
|
+
} catch (err) {
|
|
85
|
+
const { isRateLimited } = await import("@aexhq/sdk");
|
|
86
|
+
if (isRateLimited(err)) {
|
|
87
|
+
// Wait out the throttle, then replay the same message.
|
|
88
|
+
await new Promise((r) => setTimeout(r, err.retryAfterMs ?? 2_000));
|
|
89
|
+
await session.replayLast().done();
|
|
90
|
+
} else {
|
|
91
|
+
throw err;
|
|
92
|
+
}
|
|
93
|
+
}
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
Pass a fresh key (`session.replayLast({ idempotencyKey: "..." })`) when you
|
|
97
|
+
deliberately want a brand-new turn instead of a de-duplicated replay.
|
|
98
|
+
|
|
99
|
+
## The throttle error
|
|
100
|
+
|
|
101
|
+
When retries are exhausted on a rate-limit / overloaded status, the SDK throws an
|
|
102
|
+
`AexRateLimitError`. It extends `AexApiError`, so existing `catch` sites keep
|
|
103
|
+
working, and it carries structured, non-leaky detail:
|
|
104
|
+
|
|
105
|
+
```ts
|
|
106
|
+
import { isRateLimited } from "@aexhq/sdk";
|
|
107
|
+
|
|
108
|
+
try {
|
|
109
|
+
await aex.run({ /* … */ });
|
|
110
|
+
} catch (err) {
|
|
111
|
+
if (isRateLimited(err)) {
|
|
112
|
+
err.status; // 429 | 503 | 529
|
|
113
|
+
err.attempts; // how many tries were made
|
|
114
|
+
err.retryAfterMs; // suggested wait, when the server supplied one
|
|
115
|
+
err.source; // "api" (aex plane) or "provider" (upstream model)
|
|
116
|
+
err.providerFault; // upstream fault detail, when the model provider throttled
|
|
117
|
+
}
|
|
118
|
+
}
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
The `message` is a fixed summary (e.g. `aex API rate limit reached (HTTP 429)
|
|
122
|
+
after 4 attempts; retry after ~2s`) — it never echoes the raw response body,
|
|
123
|
+
which stays available, redacted, on `err.body`.
|
|
124
|
+
|
|
125
|
+
When the throttle originated at the upstream model provider (rather than the aex
|
|
126
|
+
API plane), `err.source` is `"provider"` and `err.providerFault` describes it:
|
|
127
|
+
its `kind` (`rate_limit` / `overloaded` / `quota_exceeded` / `provider_error`),
|
|
128
|
+
the upstream `status`, and a suggested `retryAfterMs`. Use `parseProviderFault`
|
|
129
|
+
to read the same shape off a raw fault value yourself.
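Reviewer note on the new retries.md above: the doc describes bounded exponential backoff with jitter, a `Retry-After` override, and a fixed set of transient statuses. The sketch below shows one way those pieces can fit together, using the documented defaults (`maxAttempts: 4`, `initialDelayMs: 500`, `maxDelayMs: 20_000`). It is not the contents of `dist/retry.js`: the "full jitter" strategy, capping `Retry-After` at `maxDelayMs`, and the names `RetryPolicy`, `isRetryableStatus`, and `backoffDelayMs` are all assumptions. The `maxElapsedMs` budget is left out for brevity.

```typescript
interface RetryPolicy {
  maxAttempts: number;    // total tries including the first
  initialDelayMs: number; // base backoff, doubled per retry
  maxDelayMs: number;     // cap on any single wait
}

const DEFAULT_POLICY: RetryPolicy = {
  maxAttempts: 4,
  initialDelayMs: 500,
  maxDelayMs: 20_000,
};

// Statuses the doc lists as retried automatically.
const RETRYABLE_STATUSES = new Set<number>([429, 500, 502, 503, 504, 529]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Wait before retry number `retry` (1 = first retry).
// `random` is injectable so the jitter can be made deterministic in tests.
function backoffDelayMs(
  retry: number,
  policy: RetryPolicy = DEFAULT_POLICY,
  retryAfterMs?: number,
  random: () => number = Math.random,
): number {
  // Assumption: a server-supplied Retry-After replaces the computed delay,
  // but is still bounded by the per-wait cap.
  if (retryAfterMs !== undefined) {
    return Math.min(retryAfterMs, policy.maxDelayMs);
  }
  const exponential = policy.initialDelayMs * 2 ** (retry - 1);
  const capped = Math.min(exponential, policy.maxDelayMs);
  // Assumed "full jitter": uniform in [0, capped).
  return Math.floor(random() * capped);
}
```

Jitter spreads out clients that were throttled at the same moment, so they do not all retry in the same instant.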
|
package/docs/run-config.md
CHANGED
|
@@ -13,13 +13,16 @@ Allowed fields:
|
|
|
13
13
|
- `mcpServers` - array of `McpServerRef`; headers are split into the vaulted secrets channel server-side.
|
|
14
14
|
- `environment` - `{ networking?, packages?, variables? }`. Networking is open by default; set `networking.mode` to `limited` only when you want an allowlist. `variables` are merged into the in-container `RUNTIME.env` / `RUNTIME.json` mounts. (Run secrets go in `environment.secrets`, which carries live `Secret` instances and is not part of a shareable config.)
|
|
15
15
|
- `runtime` - optional managed-runtime preset. Prefer `Sizes` in TypeScript.
|
|
16
|
-
- `proxyEndpoints` - array of `ProxyEndpoint` instances; endpoint-level `retry` is allowed here and remains declaration-based.
|
|
17
16
|
- `metadata` - non-secret structured metadata.
|
|
18
17
|
- `overrides` - `{ idleTtl?, timeout?, maxSpendUsd? }`. `timeout` is an optional session deadline (e.g. `"30m"`, `"2h"`); `maxSpendUsd` stops the session once its spend would exceed the cap (see [Limits & quotas](limits-and-quotas.md)).
|
|
19
18
|
|
|
20
19
|
`message` (the one-shot `run` input), `agentsMd`, `files`, `outputs`, `tools`, `includeBuiltinTools`, and `outputMode` are `openSession` / `run` options, not reusable run-config fields. They carry the turn input, bytes, capture behavior, or agent tool/output controls that belong on a concrete call. Skill bundles are `tools` entries built with `Tools.fromSkillDir(...)` / `Tools.fromSkillUrl(...)`, so they too are SDK-code options rather than config fields. Subagents run in-process; there is no `limits` / `parentRunId` option.
|
|
21
20
|
|
|
22
|
-
Secrets never live in run config. Pass provider keys through the top-level
|
|
21
|
+
Secrets never live in run config. Pass provider keys through the top-level
|
|
22
|
+
`apiKeys` map and runtime secrets through `environment.secrets` in the SDK, or
|
|
23
|
+
the equivalent host-mode flags (`--anthropic-api-key`, `--mcp-auth`) in the CLI.
|
|
24
|
+
See [Secrets](secrets.md) for secret lifecycles and [Credentials](credentials.md)
|
|
25
|
+
for credential handling.
|
|
23
26
|
|
|
24
27
|
## Reuse in code
|
|
25
28
|
|
|
@@ -52,4 +55,4 @@ aex run --config ./run.json \
|
|
|
52
55
|
--anthropic-api-key "$ANTHROPIC_API_KEY"
|
|
53
56
|
```
|
|
54
57
|
|
|
55
|
-
...or as explicit flags (`--model`, `--system`, `--prompt`, `--mcp`, `--mcp-auth`, `--runtime-size`, `--run-timeout`, `--
|
|
58
|
+
...or as explicit flags (`--model`, `--system`, `--prompt`, `--mcp`, `--mcp-auth`, `--runtime-size`, `--run-timeout`, `--metadata`). The two modes are mutually exclusive.
|
package/docs/secrets.md
CHANGED
|
@@ -111,5 +111,5 @@ await aex.run({
|
|
|
111
111
|
await aex.secrets.delete("serper-api-key");
|
|
112
112
|
```
|
|
113
113
|
|
|
114
|
-
The CLI supports per-run provider
|
|
114
|
+
The CLI supports per-run provider and MCP credentials. Workspace secret
|
|
115
115
|
administration is exposed through the SDK.
|
package/docs/skills.md
CHANGED
|
@@ -72,9 +72,9 @@ files into the workspace under `/workspace/skills/<name>/`. So the `SKILL.md` bo
|
|
|
72
72
|
and every supporting file are on disk from the first turn; the load-tool call is
|
|
73
73
|
how that body enters the model's context, not how the files get written.
|
|
74
74
|
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
75
|
+
Skills that call external HTTP APIs should read credentials from
|
|
76
|
+
`environment.secrets` and use the normal client for that service. See
|
|
77
|
+
[Credentials](credentials.md) for the secret model.
|
|
78
78
|
|
|
79
79
|
Run-scoped asset copies are part of the run record and are removed by run deletion
|
|
80
80
|
or retention cleanup.
|
package/docs/vision-skills.md
CHANGED
|
@@ -1,73 +1,57 @@
|
|
|
1
1
|
---
|
|
2
|
-
title: Call a vision
|
|
2
|
+
title: Call a vision API from a skill
|
|
3
3
|
---
|
|
4
4
|
|
|
5
|
-
# Call a vision
|
|
5
|
+
# Call a vision API from a skill
|
|
6
6
|
|
|
7
|
-
aex has no built-in vision tool. The agent's `provider
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
managed proxy**, with the key supplied on a `ProxyEndpoint.bearer(...)` instance.
|
|
12
|
-
The raw key never enters the container.
|
|
7
|
+
aex has no built-in vision tool. The agent's `provider` / `model` selects the
|
|
8
|
+
reasoning model for the run; if a skill needs image understanding mid-run, ship a
|
|
9
|
+
skill that calls the vision provider with normal HTTP and pass that provider key
|
|
10
|
+
as a runtime secret.
|
|
13
11
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
large enough to need a raised `maxRequestBytes`.
|
|
12
|
+
The runnable example lives at [`examples/vision-skill/`](../../../examples/vision-skill).
|
|
13
|
+
It captions a frame with ByteDance Doubao Seed Vision (Ark) and returns a
|
|
14
|
+
per-noun "does the frame depict X?" verdict.
|
|
18
15
|
|
|
19
|
-
|
|
20
|
-
[`examples/vision-skill/`](../../../examples/vision-skill) (`SKILL.md`,
|
|
21
|
-
`caption_frame.py`, `verify_frame.py`, `run_with_vision_skill.mjs`). It
|
|
22
|
-
captions a frame with ByteDance Doubao Seed Vision (Ark) and returns a per-noun
|
|
23
|
-
"does the frame depict X?" verdict. Everything below is taken from it.
|
|
24
|
-
|
|
25
|
-
## 1. Declare the model endpoint as a proxy endpoint
|
|
26
|
-
|
|
27
|
-
The vision provider's API is just an HTTPS host. Declare it with
|
|
28
|
-
`ProxyEndpoint.bearer(...)`, which carries the key on the instance. The two
|
|
29
|
-
model-specific settings are `responseMode: "full"` (so the skill gets the upstream
|
|
30
|
-
JSON back) and a raised `maxRequestBytes` (so the base64 image fits):
|
|
16
|
+
## Submit the run
|
|
31
17
|
|
|
32
18
|
```ts
|
|
33
|
-
import { Aex, Models,
|
|
19
|
+
import { Aex, Models, Secret, Tools } from "@aexhq/sdk";
|
|
34
20
|
|
|
35
21
|
const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! });
|
|
36
22
|
|
|
37
|
-
const
|
|
38
|
-
name: "doubao-ark",
|
|
39
|
-
baseUrl: "https://ark.ap-southeast.bytepluses.com", // intl BytePlus gateway
|
|
40
|
-
token: process.env.DOUBAO_API_KEY!,
|
|
41
|
-
allowMethods: ["POST"],
|
|
42
|
-
allowPathPrefixes: ["/api/v3/chat/completions"],
|
|
43
|
-
maxRequestBytes: 2_000_000, // base64 image POSTs — see note below
|
|
44
|
-
responseMode: "full",
|
|
45
|
-
timeoutMs: 60_000
|
|
46
|
-
});
|
|
47
|
-
|
|
48
|
-
await aex.run({
|
|
23
|
+
const result = await aex.run({
|
|
49
24
|
model: Models.CLAUDE_HAIKU_4_5,
|
|
50
|
-
message: "
|
|
25
|
+
message: "Read skills/frame-vision-gate/SKILL.md, then caption and verify the frame.",
|
|
51
26
|
tools: [await Tools.fromSkillDir("./vision-skill", { name: "frame-vision-gate" })],
|
|
52
|
-
|
|
27
|
+
environment: {
|
|
28
|
+
secrets: {
|
|
29
|
+
DOUBAO_API_KEY: Secret.value(process.env.DOUBAO_API_KEY!)
|
|
30
|
+
},
|
|
31
|
+
networking: {
|
|
32
|
+
mode: "limited",
|
|
33
|
+
allowedHosts: ["ark.ap-southeast.bytepluses.com"]
|
|
34
|
+
}
|
|
35
|
+
},
|
|
53
36
|
apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
|
|
54
37
|
});
|
|
38
|
+
|
|
39
|
+
console.log(result.runId, result.text);
|
|
55
40
|
```
|
|
56
41
|
|
|
57
|
-
`Tools.fromSkillDir("./vision-skill",
|
|
58
|
-
|
|
59
|
-
repo,
|
|
60
|
-
OpenAI-compatible endpoint, or any other OpenAI-chat-shaped vision API — only
|
|
61
|
-
`baseUrl` and the path prefix change.
|
|
42
|
+
`Tools.fromSkillDir("./vision-skill", ...)` is resolved relative to the process
|
|
43
|
+
CWD. Run the script from the directory that contains `vision-skill/` (in this
|
|
44
|
+
repo, `examples/`).
|
|
62
45
|
|
|
63
|
-
##
|
|
46
|
+
## Call the provider from the skill
|
|
64
47
|
|
|
65
|
-
Inside the run, the skill
|
|
66
|
-
|
|
67
|
-
|
|
48
|
+
Inside the run, the skill reads `DOUBAO_API_KEY` and makes an
|
|
49
|
+
OpenAI-compatible chat-completions request with Python's standard HTTP client.
|
|
50
|
+
The image is base64-inlined as a data URL in the request body:
|
|
68
51
|
|
|
69
52
|
```python
|
|
70
|
-
import base64, json
|
|
53
|
+
import base64, json, os, urllib.request
|
|
54
|
+
|
|
71
55
|
b64 = base64.b64encode(open("/workspace/files/frame.jpg", "rb").read()).decode()
|
|
72
56
|
request_body = {
|
|
73
57
|
"model": "doubao-seed-1-6-vision-250815",
|
|
@@ -81,63 +65,30 @@ request_body = {
|
|
|
81
65
|
]}
|
|
82
66
|
]
|
|
83
67
|
}
|
|
84
|
-
```
|
|
85
|
-
|
|
86
|
-
Write the body to a file and hand it to the mounted CLI with `--data @<file>`
|
|
87
|
-
(the mount has no execute bit, so invoke through `bun`; see `credentials.md`):
|
|
88
68
|
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
"--path", "/api/v3/chat/completions",
|
|
98
|
-
"--header", "content-type=application/json",
|
|
99
|
-
"--data", f"@{body_path}",
|
|
100
|
-
"--response-mode", "full"],
|
|
101
|
-
capture_output=True, text=True, timeout=90,
|
|
69
|
+
req = urllib.request.Request(
|
|
70
|
+
"https://ark.ap-southeast.bytepluses.com/api/v3/chat/completions",
|
|
71
|
+
data=json.dumps(request_body).encode("utf-8"),
|
|
72
|
+
headers={
|
|
73
|
+
"Authorization": f"Bearer {os.environ['DOUBAO_API_KEY']}",
|
|
74
|
+
"Content-Type": "application/json"
|
|
75
|
+
},
|
|
76
|
+
method="POST",
|
|
102
77
|
)
|
|
103
78
|
```
|
|
104
79
|
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
```python
|
|
110
|
-
envelope = json.loads(result.stdout)
|
|
111
|
-
if "error" in envelope:
|
|
112
|
-
raise RuntimeError(f"proxy error: {envelope['error']}: {envelope['message']}")
|
|
113
|
-
upstream = json.loads(base64.b64decode(envelope["upstreamBodyBase64"]).decode())
|
|
114
|
-
content = upstream["choices"][0]["message"]["content"] # the model's JSON answer
|
|
115
|
-
```
|
|
116
|
-
|
|
117
|
-
The key is injected by the hosted proxy on the outbound call; it never appears on disk in
|
|
118
|
-
the container or in the model's context.
|
|
119
|
-
|
|
120
|
-
## `maxRequestBytes` and timeout defaults
|
|
80
|
+
The same pattern works for OpenAI, Gemini's OpenAI-compatible endpoint, or any
|
|
81
|
+
other HTTPS model API. Put the key in `environment.secrets`, allow-list the host
|
|
82
|
+
when using limited networking, and use the provider's normal SDK or HTTP API.
|
|
121
83
|
|
|
122
|
-
|
|
123
|
-
is **5 minutes**. That fits typical base64 image/model POSTs without extra
|
|
124
|
-
configuration. If a body does exceed the cap, the proxy rejects it before any
|
|
125
|
-
upstream call with an explicit error naming the observed size, the configured
|
|
126
|
-
cap, and how to raise it:
|
|
84
|
+
## Payload size
|
|
127
85
|
|
|
128
|
-
|
|
129
|
-
|
|
86
|
+
Base64 images are larger than their source files. Scale frames before captioning
|
|
87
|
+
when possible, for example:
|
|
130
88
|
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
## Notes
|
|
89
|
+
```bash
|
|
90
|
+
ffmpeg -i source.mp4 -vf fps=1,scale=960:-1 frame_%03d.jpg
|
|
91
|
+
```
|
|
136
92
|
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
- **Keyless model hosts.** If the upstream takes no credential, declare the
|
|
140
|
-
endpoint with `ProxyEndpoint.none(...)` (see `credentials.md`).
|
|
141
|
-
- **Response size.** `responseMode: "full"` is required to read the model's reply
|
|
142
|
-
back. Leave `maxResponseBytes` at its default (`0` = unlimited, streamed) unless
|
|
143
|
-
you want a truncation cap.
|
|
93
|
+
This keeps upload size and model cost bounded without losing the signal most
|
|
94
|
+
vision models need.
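Reviewer note on the "Payload size" section above: the claim that base64 images are larger than their source files can be made concrete. Padded base64 emits 4 output characters for every 3 input bytes, so the encoded size is roughly 4/3 of the raw size. The helper names below are hypothetical and only illustrate the arithmetic for the data-URL form the skill uses.

```typescript
// Length of the padded base64 encoding of `rawBytes` bytes:
// every group of up to 3 input bytes becomes 4 output characters.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Length of a data URL carrying that payload, including the
// "data:<mime>;base64," prefix.
function dataUrlLength(rawBytes: number, mimeType: string = "image/jpeg"): number {
  const prefix = `data:${mimeType};base64,`;
  return prefix.length + base64Length(rawBytes);
}

// Example: estimate the request-body cost of a 300,000-byte JPEG frame.
const frameBytes = 300_000;
console.log(`raw=${frameBytes} base64=${base64Length(frameBytes)} dataUrl=${dataUrlLength(frameBytes)}`);
```

Estimating the encoded size before sending makes it easier to decide whether a frame should be scaled down first.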
|