@aexhq/sdk 0.36.0 → 0.37.0

Files changed (43)
  1. package/README.md +1 -1
  2. package/dist/_contracts/event-envelope.d.ts +22 -1
  3. package/dist/_contracts/event-envelope.js +26 -2
  4. package/dist/_contracts/event-stream-client.js +7 -1
  5. package/dist/_contracts/operations.d.ts +30 -1
  6. package/dist/_contracts/operations.js +54 -1
  7. package/dist/_contracts/run-config.d.ts +1 -1
  8. package/dist/_contracts/run-unit.d.ts +12 -0
  9. package/dist/_contracts/run-unit.js +55 -0
  10. package/dist/_contracts/runtime-sizes.d.ts +2 -2
  11. package/dist/_contracts/runtime-sizes.js +5 -5
  12. package/dist/_contracts/runtime-types.d.ts +98 -0
  13. package/dist/_contracts/submission.d.ts +4 -4
  14. package/dist/cli.mjs +554 -69
  15. package/dist/cli.mjs.sha256 +1 -1
  16. package/dist/client.d.ts +40 -1
  17. package/dist/client.js +90 -5
  18. package/dist/client.js.map +1 -1
  19. package/dist/index.d.ts +1 -1
  20. package/dist/index.js.map +1 -1
  21. package/dist/version.d.ts +1 -1
  22. package/dist/version.js +1 -1
  23. package/docs/authentication.md +92 -0
  24. package/docs/billing.md +112 -0
  25. package/docs/concepts/agent-tools.md +4 -4
  26. package/docs/concepts/providers-and-runtimes.md +4 -1
  27. package/docs/concepts/runs.md +2 -1
  28. package/docs/concepts/subagents.md +85 -0
  29. package/docs/credentials.md +27 -3
  30. package/docs/defaults.md +9 -7
  31. package/docs/errors.md +132 -0
  32. package/docs/events.md +36 -23
  33. package/docs/limits-and-quotas.md +29 -13
  34. package/docs/limits.md +2 -2
  35. package/docs/mcp.md +1 -1
  36. package/docs/networking.md +68 -42
  37. package/docs/outputs.md +4 -3
  38. package/docs/public-surface.json +1 -1
  39. package/docs/quickstart.md +9 -6
  40. package/docs/run-config.md +1 -1
  41. package/docs/secrets.md +5 -0
  42. package/docs/webhooks.md +132 -0
  43. package/package.json +1 -1
@@ -0,0 +1,112 @@
1
+ ---
2
+ title: Billing & webhook signing secret
3
+ ---
4
+
5
+ # Billing & webhook signing secret
6
+
7
+ Workspace-level billing, subscription, and webhook verification calls are
8
+ token-scoped like every other client call — the workspace is derived
9
+ server-side from the API token.
10
+
11
+ ## Read the billing summary
12
+
13
+ `aex.billing()` returns the workspace's prepaid balance, current-month spend,
14
+ and the spend cap enforced on new runs, plus plan fields:
15
+
16
+ ```ts
17
+ import { Aex } from "@aexhq/sdk";
18
+
19
+ const aex = new Aex(process.env.AEX_API_TOKEN!);
20
+ const billing = await aex.billing();
21
+ console.log(billing.balanceUsd, billing.monthSpendUsd, billing.spendCapUsd);
22
+ ```
23
+
24
+ The returned `BillingSummary` is additive-tolerant: fields a newer deployment
25
+ reports that this SDK version does not know yet pass through on the object
26
+ instead of being rejected.
27
+
28
+ CLI equivalent:
29
+
30
+ ```bash
31
+ aex billing # human-readable balance / month spend / spend cap
32
+ aex billing --json # the raw wire body for scripting
33
+ ```
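
To show how the three printed fields relate, here is a minimal sketch that computes the remaining monthly headroom. It assumes `spendCapUsd: 0` means "unlimited", as the defaults page states for the workspace spend cap. `BillingFields` and `monthlyHeadroomUsd` are hypothetical local names, not part of the SDK.

```typescript
// Hypothetical subset of BillingSummary: only the three fields printed above.
interface BillingFields {
  balanceUsd: number;
  monthSpendUsd: number;
  spendCapUsd: number;
}

// Returns the USD left before the monthly cap is reached, or null when the
// cap is unlimited (assumed here to be encoded as 0).
function monthlyHeadroomUsd(b: BillingFields): number | null {
  if (b.spendCapUsd === 0) return null;
  return Math.max(0, b.spendCapUsd - b.monthSpendUsd);
}

console.log(
  monthlyHeadroomUsd({ balanceUsd: 40, monthSpendUsd: 100, spendCapUsd: 250 })
); // 150
```

A caller could run a check like this on the result of `aex.billing()` before queuing a large batch, so it does not have to wait for a `402` at submit time.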
34
+
35
+ ## Manage the subscription
36
+
37
+ `aex.billingCheckout({ planKey })` creates a hosted checkout session for a paid
38
+ plan. Open the returned URL in a browser; the workspace plan changes after
39
+ checkout completes and the hosted API confirms the subscription.
40
+
41
+ ```ts
42
+ const { url } = await aex.billingCheckout({
43
+ planKey: "pro",
44
+ idempotencyKey: crypto.randomUUID()
45
+ });
46
+ console.log(url);
47
+ ```
48
+
49
+ `aex.billingPortal()` creates a hosted billing portal session for the workspace:
50
+
51
+ ```ts
52
+ const { url } = await aex.billingPortal({ returnUrl: "https://aex.dev/billing" });
53
+ console.log(url);
54
+ ```
55
+
56
+ CLI equivalents:
57
+
58
+ ```bash
59
+ aex billing upgrade pro
60
+ aex billing portal
61
+ ```
62
+
63
+ ## Read the credit ledger
64
+
65
+ `aex.billingLedger({ limit })` returns recent credit-ledger rows, newest first —
66
+ allowance grants, adjustments, and run charges with signed `amountUsd` (credits
67
+ positive, charges negative). `limit` is clamped server-side to [1, 100] (default
68
+ 25); the read is not cursor-paged.
69
+
70
+ ```ts
71
+ const { entries } = await aex.billingLedger({ limit: 50 });
72
+ for (const entry of entries) {
73
+ console.log(entry.createdAt, entry.entryType, entry.amountUsd);
74
+ }
75
+ ```
76
+
77
+ CLI equivalent:
78
+
79
+ ```bash
80
+ aex billing ledger --limit 50 # JSON rows, newest first
81
+ ```
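
The clamp and the sign convention described above can be sketched locally as follows. This is an illustration under stated assumptions, not the server's implementation: `LedgerEntryLike`, `clampLedgerLimit`, and `summarizeLedger` are hypothetical names, and truncating a fractional `limit` is a guess at how such input would be treated.

```typescript
// Hypothetical local shape; the real ledger rows may carry more fields.
interface LedgerEntryLike {
  createdAt: string;
  entryType: string;
  amountUsd: number; // credits positive, charges negative
}

// Mirrors the documented clamp: default 25, bounded to [1, 100].
function clampLedgerLimit(limit?: number): number {
  if (limit === undefined || Number.isNaN(limit)) return 25;
  return Math.min(100, Math.max(1, Math.trunc(limit)));
}

// Splits signed amounts into total credits, total charges, and the net change.
function summarizeLedger(entries: LedgerEntryLike[]): {
  credits: number;
  charges: number;
  net: number;
} {
  let credits = 0;
  let charges = 0;
  for (const entry of entries) {
    if (entry.amountUsd >= 0) credits += entry.amountUsd;
    else charges += entry.amountUsd;
  }
  return { credits, charges, net: credits + charges };
}
```

Because the read is not cursor-paged, a summary built this way covers at most the newest 100 rows.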
82
+
83
+ ## Reveal the webhook signing secret
84
+
85
+ Run webhooks are signed Standard-Webhooks style with a per-workspace secret.
86
+ `aex.webhookSigningSecret()` reveals it (creating one on first use) as the
87
+ `whsec_<base64>` string that `verifyAexWebhook` takes as `secret`:
88
+
89
+ ```ts
90
+ import { Aex, verifyAexWebhook } from "@aexhq/sdk";
91
+
92
+ const aex = new Aex(process.env.AEX_API_TOKEN!);
93
+ const { whsec } = await aex.webhookSigningSecret();
94
+
95
+ // In your webhook receiver:
96
+ const verified = await verifyAexWebhook({
97
+ rawBody, // the exact request body bytes as a string
98
+ headers, // the inbound request headers
99
+ secret: whsec
100
+ });
101
+ ```
102
+
103
+ Repeat calls return the SAME value — the hosted API does not rotate the
104
+ signing secret. Treat the reveal as sensitive: store it in your secret manager
105
+ and never log it.
106
+
107
+ CLI equivalent (prints the bare `whsec_...` string, pipeable into a secret
108
+ store; the reveal never goes to stderr or debug traces):
109
+
110
+ ```bash
111
+ aex webhooks secret
112
+ ```
@@ -12,7 +12,7 @@ Managed runs inject the complete builtin tool set into the agent by default:
12
12
  - `head`, `tail` — read bounded file slices
13
13
  - `web_fetch`, `web_search` — fetch a URL / managed web search
14
14
  - `todo_write` — maintain a todo list
15
- - `subagent`, `subagent_result` — delegate to and read back from child runs
15
+ - `subagent`, `subagent_result` — delegate to and read back from child runs (see [Subagents](subagents.md))
16
16
  - `bash_output`, `bash_kill` — manage background bash jobs
17
17
  - `wait`, `git` — bounded idle-yield and first-class git
18
18
 
@@ -32,13 +32,13 @@ to pick a narrow subset alongside `includeBuiltinTools: false`.
32
32
  The final tool list is ordered: resolved builtin tools, then custom tools, then
33
33
  MCP tools.
34
34
 
35
- Networking is open by default: the agent may reach any public host, subject to a
36
- fixed SSRF deny-list. `web_fetch` and `web_search` reach the network over a
35
+ Networking is open by default within the platform's managed egress ceiling and
36
+ a fixed SSRF deny-list. `web_fetch` and `web_search` reach the network over a
37
37
  managed, SSRF-guarded path that is **not** governed by `environment.networking`,
38
38
  so their hosts never need to be listed in a `limited` allowlist. Setting
39
39
  `environment.networking.mode` to `limited` restricts only the agent's own
40
40
  arbitrary egress (e.g. a `curl` in `bash`); the built-in web tools keep working.
41
- See [Networking](../networking.md).
41
+ See [Networking](../networking.md) for the full two-layer model.
42
42
 
43
43
  ## Disable builtins
44
44
 
@@ -17,7 +17,10 @@ aex exposes one submission shape across supported providers:
17
17
  | Doubao | `Providers.DOUBAO` |
18
18
  | Doubao China | `Providers.DOUBAO_CN` |
19
19
 
20
- All submissions run on the managed runtime. There is no public runtime selector; omit `runtime`.
20
+ All submissions run on the managed runtime. The optional `runtime` option picks
21
+ a managed machine-size preset — use `Sizes.*` in TypeScript (e.g.
22
+ `runtime: Sizes.SHARED_0_25X_1GB`) or `--runtime-size` in the CLI. Omit it for
23
+ the default size; there is no alternative runtime backend to select.
21
24
 
22
25
  ## Selection
23
26
 
@@ -38,7 +38,8 @@ The same durable record backs SDK and CLI reads. From the handle use `refresh`,
38
38
  and `events().stream()` / `events().streamEnvelopes()` / `outputs().read(...)` for
39
39
  streaming and byte-capped reads) — to inspect the session live or after it parks;
40
40
  from the client, `aex.sessions.list()` / `aex.sessions.get(id)` read across the
41
- workspace.
41
+ workspace (CLI mirrors: `aex sessions` and `aex runs` list the workspace's
42
+ sessions/runs newest-first).
42
43
 
43
44
  Use `idempotencyKey` when retrying `openSession` or `send` from your own
44
45
  workflow. aex hashes the normalized non-secret submission, so a retry with the
@@ -0,0 +1,85 @@
1
+ ---
2
+ title: Subagents
3
+ description: Delegate bounded sub-tasks to child agent runs with the subagent tool.
4
+ icon: GitFork
5
+ ---
6
+
7
+ A run can delegate bounded sub-tasks to **child agent runs** with the built-in
8
+ `subagent` tool. Delegation is agent-driven: the model decides to fan work out,
9
+ each child is a real run with its own record, and the parent collects results
10
+ as the children finish. There is no client-side parent/child API — lineage is
11
+ session-internal.
12
+
13
+ ## How the tool works
14
+
15
+ The agent calls `subagent` with a `prompt` and a `model` (both required), plus
16
+ optional `system`, `provider`, `runtimeSize`, `timeout`,
17
+ `includeBuiltinTools`, `tools` (builtin-tool names for the child), `skills`,
18
+ and `files`. The call is **always async**: on a successful spawn it returns
19
+ immediately with the child run id, and the parent keeps working while the child
20
+ runs. When a child settles, the parent is notified in its loop, and it reads
21
+ the child's status and captured outputs on demand with the companion
22
+ `subagent_result` tool.
23
+
24
+ Children inherit the parent's vaulted BYOK provider keys server-side — the
25
+ child submission carries no secrets. Include a provider key for every provider
26
+ your subagents may use when you open the parent session (a parent holding no
27
+ key for the child's provider gets a clear `parent_missing_provider_key` tool
28
+ error). See [Credentials](../credentials.md).
29
+
30
+ ## Depth and breadth limits
31
+
32
+ Delegation is bounded by two server-enforced lineage limits:
33
+
34
+ | Limit | Value | Behavior at the limit |
35
+ | --- | --- | --- |
36
+ | Max depth | **5** — the root run is depth 0 and may spawn down to depth 5; a depth-5 run may not spawn further | The spawn is rejected with a `depth_exceeded` tool error (the parent keeps running). |
37
+ | Concurrent children per lineage root | **1000** live (non-terminal) descendants by default; hard platform ceiling **4096** | Further spawns are refused until a child settles. |
38
+
39
+ The whole descendant subtree of one root shares a single depth and breadth
40
+ budget, enforced server-side at every level — a grandchild spawn counts against
41
+ the same root budget as a direct child. Values are mirrored in
42
+ [Limits & quotas](../limits-and-quotas.md).
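
The two lineage limits can be read as a simple admission check. The sketch below is illustrative only: `checkSpawn` is a hypothetical helper, `"breadth_refused"` is a made-up label (the table names no error code for the breadth case), and checking depth before breadth is an assumption.

```typescript
const MAX_DEPTH = 5; // root is depth 0; a depth-5 run may not spawn
const DEFAULT_MAX_LIVE_DESCENDANTS = 1000; // per lineage root, by default

// "depth_exceeded" is the documented tool error; "breadth_refused" is a
// placeholder label for the documented "refused until a child settles" case.
type SpawnDecision = "ok" | "depth_exceeded" | "breadth_refused";

function checkSpawn(
  parentDepth: number,
  liveDescendants: number,
  maxLive: number = DEFAULT_MAX_LIVE_DESCENDANTS
): SpawnDecision {
  if (parentDepth >= MAX_DEPTH) return "depth_exceeded";
  if (liveDescendants >= maxLive) return "breadth_refused";
  return "ok";
}

console.log(checkSpawn(4, 10)); // "ok": the child would sit at depth 5
console.log(checkSpawn(5, 10)); // "depth_exceeded"
```

`liveDescendants` counts the whole subtree under the root, so a grandchild spawn consumes the same budget as a direct child.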
43
+
44
+ ## Where children run: `in-process` vs `container`
45
+
46
+ By default a child runs **in-process**: it executes as a sibling agent process
47
+ inside the parent's own machine, sharing the parent's CPU, memory, and
48
+ lifetime. This is the platform default shipped today.
49
+
50
+ - **No extra runtime cost.** The parent's machine is the billable unit, so
51
+ in-process children bill **$0 of additional runtime** — fan-out is priced by
52
+ the parent box, however many children it hosts. (Model-token spend is still
53
+ whatever each child's provider calls cost on your BYOK key.)
54
+ - **Shared capacity.** N children share the parent's fixed CPU/memory. For
55
+ large fan-outs, size the parent up (`runtime`) rather than assuming each
56
+ child gets its own machine.
57
+ - **Joined lifecycle.** The parent's terminal waits for its in-process children,
58
+ and their results are folded into the parent's per-child accounting. Platform
59
+ recovery re-spawns in-process children exactly once if the parent's machine
60
+ is replaced mid-run — settled children are never re-run.
61
+
62
+ The escape valve is `host: "container"`: the child is dispatched to its **own
63
+ isolated machine** with its own runtime size and its own runtime billing, and
64
+ the parent does not host it. Use it when a child needs guaranteed capacity,
65
+ isolation from the parent's filesystem/CPU, or a different machine size than
66
+ the parent can share.
67
+
68
+ ## Lineage and observability
69
+
70
+ Every child — in-process or container — is a first-class run record:
71
+
72
+ - The parent's transcript logs each spawn with the child's run id.
73
+ - Each child has its own status, typed event timeline, and captured outputs,
74
+ readable by id like any other run (`aex.sessions.get(id)`, or the CLI's
75
+ `aex status` / `aex events` / `aex outputs` / `aex download`).
76
+ - The child's outputs are handed back to the parent via `subagent_result`, and
77
+ they remain independently downloadable after the lineage finishes.
78
+
79
+ ## Bounding delegation
80
+
81
+ - Turn delegation off for a run by cherry-picking builtins without `subagent`
82
+ (see [Agent tools](agent-tools.md)) or setting `includeBuiltinTools: false`.
83
+ - A per-session spend cap (`overrides.maxSpendUsd`) bounds the parent's spend.
84
+ - The depth/breadth limits above are platform defaults and are not settable
85
+ per-session today.
@@ -13,6 +13,24 @@ aex uses explicit, per-session credentials:
13
13
 
14
14
  Secrets never belong in reusable run config, files, prompts, or examples.
15
15
 
16
+ ## The client credential
17
+
18
+ Pass your aex API token directly to the constructor — `new Aex(apiKey)` — or as
19
+ the `apiKey` option. The older `apiToken` option remains accepted as a
20
+ compatibility alias, so existing code keeps working:
21
+
22
+ ```ts
23
+ import { Aex } from "@aexhq/sdk";
24
+
25
+ const aex = new Aex(process.env.AEX_API_TOKEN!); // preferred shorthand
26
+ // equivalently:
27
+ // const aex = new Aex({ apiKey: process.env.AEX_API_TOKEN! });
28
+ // const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! }); // alias
29
+ ```
30
+
31
+ See [Authentication](authentication.md) for how tokens are scoped, rotated, and
32
+ issued during the beta.
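
One way to picture the three accepted call forms is a small normalizer. This is a sketch, not the SDK's constructor: `AexClientInit` and `resolveApiKey` are hypothetical names, and preferring `apiKey` over `apiToken` when both are present is an assumption.

```typescript
// Hypothetical union of the three call forms shown above.
type AexClientInit = string | { apiKey?: string; apiToken?: string };

// Resolves whichever form was used to a single token string.
function resolveApiKey(init: AexClientInit): string {
  const key =
    typeof init === "string" ? init : (init.apiKey ?? init.apiToken);
  if (!key) throw new Error("missing aex API token");
  return key;
}

console.log(resolveApiKey({ apiToken: "legacy-token" })); // "legacy-token"
```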
33
+
16
34
  ## Provider keys
17
35
 
18
36
  A session selects one upstream provider and must carry a BYOK key for it. Include
@@ -68,6 +86,11 @@ curl -sS \
68
86
 
69
87
  ## Workspace secrets
70
88
 
89
+ > **Availability note:** workspace-secret `Secret.ref(...)` injection requires
90
+ > the next platform deploy — on the current hosted plane the referenced
91
+ > variable can resolve empty inside the run. Per-run `Secret.value(...)`
92
+ > secrets are unaffected.
93
+
71
94
  Store reusable values once, then reference them by name:
72
95
 
73
96
  ```ts
@@ -92,9 +115,10 @@ Secret reads return metadata only; they never return the stored value.
92
115
 
93
116
  ## Networking
94
117
 
95
- Networking is open by default. Use `environment.networking.mode: "limited"` with
96
- `allowedHosts` when you want a run to reach only named public hosts. See
97
- [Networking](networking.md).
118
+ Networking is open by default within the platform's managed egress ceiling. Use
119
+ `environment.networking.mode: "limited"` with `allowedHosts` when you want a
120
+ run's own code to reach only named hosts. See [Networking](networking.md) for
121
+ the two-layer enforcement model.
98
122
 
99
123
  ## Explicit call-site rule
100
124
 
package/docs/defaults.md CHANGED
@@ -5,10 +5,9 @@ title: Defaults
5
5
  # Defaults
6
6
 
7
7
  These are the values aex applies when you **omit** the corresponding option on a
8
- run. Every value is mirrored from a single source-of-truth constant; the
9
- constant file is authoritative and this page is generated documentation, not a
10
- second source of truth. If a value here ever disagrees with that constant,
11
- the constant wins.
8
+ run. Every value is mirrored from a single source-of-truth constant in the
9
+ platform's limits module; this page is hand-maintained against those constants.
10
+ If a value here ever disagrees with the constant, the constant wins.
12
11
 
13
12
  Each value below is named by its source-of-truth constant. The runtime-size
14
13
  presets are defined in the public
@@ -21,7 +20,7 @@ For the hard ceilings and who can raise them, see
21
20
 
22
21
  | Option | Default | How to override | Source |
23
22
  | --- | --- | --- | --- |
24
- | `timeout` (run deadline) | 1 hour | Per-session via `overrides.timeout` (e.g. `"30m"`, `"2h"`), clamped to the run-timeout floor/ceiling. | `RUN_DEFAULT_TIMEOUT_MS` |
23
+ | `timeout` (run deadline) | 8 hours (also the ceiling) | Per-session via `overrides.timeout` (e.g. `"30m"`, `"2h"`), clamped to the run-timeout floor/ceiling. | `RUN_DEFAULT_TIMEOUT_MS` |
25
24
  | `runtime` (machine size) | `shared-0.25x-1gb` — 0.25 vCPU, 1 GB | Per-session via `runtime` (use `Sizes.*` in TypeScript). | `RUN_DEFAULT_RUNTIME_SIZE` |
26
25
  | `overrides.maxSpendUsd` (per-session spend cap) | None — no spend cap (the session is still bounded by its `timeout` and any workspace-level cap) | Per-session via `overrides.maxSpendUsd` (a positive USD amount); the session is stopped once its spend would exceed the cap. | — |
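
The clamping of `overrides.timeout` can be sketched as below, using the 1-minute floor and 8-hour ceiling from the limits page. The duration grammar is an assumption: only the `"30m"` and `"2h"` forms appear in the docs, and `resolveTimeoutMs` is a hypothetical helper that also accepts seconds.

```typescript
const RUN_MIN_TIMEOUT_MS = 60_000; // 1 minute floor
const RUN_MAX_TIMEOUT_MS = 8 * 3_600_000; // 8 hour ceiling, also the default

// Parses "<integer><s|m|h>" and clamps the result to the floor/ceiling.
function resolveTimeoutMs(timeout?: string): number {
  if (timeout === undefined) return RUN_MAX_TIMEOUT_MS;
  const match = /^(\d+)([smh])$/.exec(timeout);
  if (!match) throw new Error(`unsupported duration: ${timeout}`);
  const [, amount = "0", unit = "s"] = match;
  const unitMs = unit === "h" ? 3_600_000 : unit === "m" ? 60_000 : 1_000;
  const ms = Number(amount) * unitMs;
  return Math.min(RUN_MAX_TIMEOUT_MS, Math.max(RUN_MIN_TIMEOUT_MS, ms));
}

console.log(resolveTimeoutMs("30m")); // 1800000
console.log(resolveTimeoutMs("12h")); // 28800000 (clamped to the ceiling)
```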
27
26
 
@@ -57,5 +56,8 @@ For the hard ceilings and who can raise them, see
57
56
 
58
57
  | Option | Default | How to override | Source |
59
58
  | --- | --- | --- | --- |
60
- | Per-workspace mutation rate limits (per minute) | run submit 60, run cancel 30, run delete 30, signed link 120, API token create 10, API token delete 30 | Per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var. | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
61
- | Workspace storage cap | 50 GiB | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES`; admin workspaces are uncapped (not a customer entitlement). | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
59
+ | Run submit rate (per minute) | 120 (`0` = disabled); past it `POST /runs` fails with `429 workspace_submit_rate_exceeded` | Per-plane via env `AEX_WORKSPACE_SUBMIT_RATE_PER_MIN`; per-workspace via support. | |
60
+ | Max concurrent runs | 50 live root runs (hard ceiling 200) | Per-workspace override via support, clamped to the ceiling. | `WORKSPACE_DEFAULT_MAX_CONCURRENT_RUNS` |
61
+ | Monthly spend cap | $250 per UTC calendar month (`0` = unlimited) | Per-workspace override via support. | `WORKSPACE_DEFAULT_SPEND_CAP_USD` |
62
+ | Per-workspace mutation rate limits (per minute) | run cancel 30, run delete 30, signed link 120, API token create 10, API token delete 30 | Per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var. | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
63
+ | Workspace storage cap | 500 GB (decimal) | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES`; admin workspaces are uncapped (not a customer entitlement). | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
package/docs/errors.md ADDED
@@ -0,0 +1,132 @@
1
+ ---
2
+ title: Errors
3
+ ---
4
+
5
+ # Errors
6
+
7
+ Every API error is a JSON body with a machine-readable `error` code; most also
8
+ carry a human `message` and the self-describing fields named below. The SDK
9
+ surfaces non-2xx responses as `AexApiError` (with the parsed body attached) and
10
+ throttling as `AexRateLimitError`.
11
+
12
+ ## 401 — authentication
13
+
14
+ | Code | Meaning |
15
+ | --- | --- |
16
+ | `unauthorized` | Missing, invalid, or revoked bearer token. |
17
+
18
+ Check the token value and that it has not been deleted. `aex whoami` is the
19
+ cheapest way to validate a credential. See [Authentication](authentication.md).
20
+
21
+ ## 403 — authorization
22
+
23
+ | Code | Meaning |
24
+ | --- | --- |
25
+ | `insufficient_scope` | The token is valid but lacks the route's required scope. The body's `requiredScope` field names the missing scope. |
26
+ | `unknown_workspace` | The token does not route to a known workspace. |
27
+ | `forbidden` | The authenticated workspace does not own the addressed resource. |
28
+
29
+ ```json
30
+ { "error": "insufficient_scope", "requiredScope": "runs:write" }
31
+ ```
32
+
33
+ ## 400 — validation
34
+
35
+ | Code | Meaning |
36
+ | --- | --- |
37
+ | `bad_request` | Missing or unparseable request body. |
38
+ | `invalid_submission` | The submission failed shape validation; `message` names the offending field. |
39
+ | `missing_provider_key` | The submission names a provider but carries no BYOK key for it (`apiKeys[provider]`). |
40
+ | `malformed_token` | The bearer value is not a structurally valid aex token. |
41
+
42
+ 400s are permanent for that request — fix the input rather than retrying.
43
+ The SDK's client-side validation (`RunConfigValidationError`) catches most of
44
+ these before the request is sent.
45
+
46
+ ## 402 — payment required
47
+
48
+ Two distinct submit gates return 402; both bodies are self-describing.
49
+
50
+ **`insufficient_balance`** — the workspace prepaid balance is at or below the
51
+ effective submit floor. Top up the balance or bind a payment method.
52
+
53
+ ```json
54
+ {
55
+ "error": "insufficient_balance",
56
+ "message": "Workspace balance is depleted; top up your prepaid balance or bind a payment method to submit runs.",
57
+ "balanceUsd": 0,
58
+ "balanceGraceFloorUsd": 0,
59
+ "paymentMethodStatus": "none",
60
+ "planKey": "default"
61
+ }
62
+ ```
63
+
64
+ `balanceGraceFloorUsd` is the payment-method-aware floor the gate compared
65
+ against (`paymentMethodStatus: "active"` folds a bounded card overdraft into
66
+ it, so the floor can be negative).
67
+
68
+ **`workspace_spend_cap_exceeded`** — the workspace's monthly spend cap is
69
+ reached. The cap resets at the start of the next UTC month; contact support to
70
+ raise it.
71
+
72
+ ```json
73
+ {
74
+ "error": "workspace_spend_cap_exceeded",
75
+ "message": "Monthly spend cap of $250 reached ($251.13 accrued this month). The cap resets at the start of the next UTC month; contact support to raise it.",
76
+ "capUsd": 250,
77
+ "accruedUsd": 251.13
78
+ }
79
+ ```
80
+
81
+ ## 429 — rate limits
82
+
83
+ **`workspace_concurrency_exceeded`** — admitting one more live run would exceed
84
+ the workspace's concurrent-run cap. Wait for a run to finish, or contact
85
+ support to raise the cap.
86
+
87
+ ```json
88
+ {
89
+ "error": "workspace_concurrency_exceeded",
90
+ "message": "Workspace concurrency limit reached: 50 live runs at the cap of 50. Wait for a run to finish, or contact support to raise your workspace limit.",
91
+ "cap": 50,
92
+ "observed": 50
93
+ }
94
+ ```
95
+
96
+ **`workspace_submit_rate_exceeded`** — too many submits in the current
97
+ one-minute window. Retry shortly.
98
+
99
+ ```json
100
+ {
101
+ "error": "workspace_submit_rate_exceeded",
102
+ "message": "Submit rate limit of 120/minute exceeded. Retry shortly, or contact support to raise your workspace limit.",
103
+ "perMin": 120,
104
+ "observed": 121
105
+ }
106
+ ```
107
+
108
+ The `limit`-naming fields (`cap`/`perMin`) and the `observed` window value make
109
+ each deny self-describing, so a client can back off proportionally. To
110
+ anticipate both 429s and both 402s *before* submitting, read the effective caps
111
+ from `aex.whoami().limits` — the values come from the same resolution code the
112
+ gates enforce. See [Limits & quotas](limits-and-quotas.md).
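
As an illustration of what "back off proportionally" could look like, the sketch below derives a wait from the self-describing 429 bodies shown above. The formula is a heuristic chosen for this example, not SDK behavior. `ThrottleBody` and `suggestedWaitMs` are hypothetical names, and the 5-second poll for the concurrency case is arbitrary.

```typescript
// Discriminated union over the two documented 429 bodies (message omitted).
type ThrottleBody =
  | { error: "workspace_concurrency_exceeded"; cap: number; observed: number }
  | { error: "workspace_submit_rate_exceeded"; perMin: number; observed: number };

function suggestedWaitMs(body: ThrottleBody): number {
  if (body.error === "workspace_submit_rate_exceeded") {
    // Heuristic: wait out the share of the one-minute window that the
    // overshoot represents, capped at the full window.
    const overshoot = Math.max(1, body.observed - body.perMin);
    return Math.min(60_000, Math.ceil((overshoot * 60_000) / body.perMin));
  }
  // Concurrency has no time window, so poll at a fixed interval instead.
  return 5_000;
}

console.log(
  suggestedWaitMs({ error: "workspace_submit_rate_exceeded", perMin: 120, observed: 121 })
); // 500
```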
113
+
114
+ ## 404 — not found
115
+
116
+ `not_found`: the id does not exist **or** belongs to another workspace (aex
117
+ does not distinguish the two).
118
+
119
+ ## 5xx — server errors
120
+
121
+ | Code | Meaning |
122
+ | --- | --- |
123
+ | `internal_error` (500) | Unexpected server fault. Retry with backoff; report persistent cases. |
124
+ | `db_resuming` (503) | The database tier is resuming from idle. Transient — retry. |
125
+
126
+ The SDK retries transient failures automatically: HTTP `429`, `5xx`, `529`, and
127
+ network errors get bounded exponential backoff with full jitter, honoring any
128
+ `Retry-After` header. Tune or disable this with the client `retry` option; use
129
+ `isRateLimited(err)` / `AexRateLimitError` to handle persistent throttling
130
+ without parsing raw bodies. Idempotent submit retries are safe — the SDK
131
+ attaches a stable idempotency key to billable session create/send requests, so
132
+ a retried request never double-submits.
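
For readers unfamiliar with the term, "bounded exponential backoff with full jitter" can be sketched as follows. This is a generic rendering of the technique, not the SDK's code: the base delay, the cap, the option names, and the choice to use `Retry-After` verbatim are all assumptions.

```typescript
interface BackoffOptions {
  baseMs?: number; // assumed base delay
  capMs?: number; // assumed per-attempt upper bound
  retryAfterSec?: number; // parsed Retry-After header, when present
  random?: () => number; // injectable for deterministic tests
}

// The transient classes named above: 429 and any 5xx (which includes 529).
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Full jitter: a uniform draw from [0, min(cap, base * 2^attempt)).
function backoffDelayMs(attempt: number, opts: BackoffOptions = {}): number {
  const { baseMs = 250, capMs = 30_000, retryAfterSec, random = Math.random } = opts;
  if (retryAfterSec !== undefined) return retryAfterSec * 1_000;
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Deterministic example: attempt 2 with a 100 ms base gives a 400 ms ceiling.
console.log(backoffDelayMs(2, { baseMs: 100, capMs: 10_000, random: () => 0.5 })); // 200
```

Drawing from the whole interval, rather than adding a small jitter to a fixed delay, spreads retries from many clients more evenly after a shared failure.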
package/docs/events.md CHANGED
@@ -110,22 +110,32 @@ collected session turn. The returned `runId` is the session id.
110
110
 
111
111
  ## Terminal events vs. the run record
112
112
 
113
- A session turn emits a terminal **event** `RUN_FINISHED`
114
- (success) or `RUN_ERROR` when
115
- the agent's stream ends. This is an AG-UI *render-complete* signal: the runner
116
- emits it **before** aex commits the authoritative session record, so an
117
- `aex.sessions.get(id)` issued the instant you observe `RUN_FINISHED` can still
118
- read a non-parked status for a moment. Treat the terminal event as the
119
- lowest-latency "stop the spinner" signal, **not** a read-consistency barrier.
120
-
121
- Two facts make this easy to work with:
122
-
123
- - **Outputs are already durable at the terminal event.** The runner uploads every
124
- output before it emits the terminal event, and `session.outputs().list()` / downloads
125
- read object storage directly, so the moment you see `RUN_FINISHED` the outputs
126
- are complete and readable.
127
- - **The session _record_ settles a beat later.** To read the authoritative status
128
- consistently, don't key off the terminal event — use one of:
113
+ Two families of events can end a turn's stream, and which one you see depends
114
+ on how the turn ends:
115
+
116
+ - **AG-UI terminals** `RUN_FINISHED` / `RUN_ERROR`. These are *render-complete*
117
+ signals emitted by the agent stream itself. On the managed plane a normal
118
+ session turn usually does **not** emit `RUN_FINISHED`: the session *parks*
119
+ instead (see below). Expect `RUN_ERROR` on stream-level failures, and treat
120
+ `RUN_FINISHED` — when it does appear — as a low-latency "stop the spinner"
121
+ hint, not a read-consistency barrier.
122
+ - **`aex.session.*` park terminals** — `CUSTOM` events named `aex.session.idle`,
123
+ `aex.session.suspended`, or `aex.session.error`. On the managed plane these
124
+ are what actually end a turn: the session parks with the matching status, and
125
+ by the time the park event is broadcast the session record has already
126
+ reached that status. This is the terminal you should expect from a managed
127
+ run's event stream.
128
+
129
+ The SDK's helpers cover both families so you never have to switch on the plane:
130
+
131
+ - `isRunTerminal(event)` — true for the AG-UI `RUN_FINISHED` / `RUN_ERROR` pair.
132
+ - `isRunSettled(event)` — true for the `aex.run.settled` settle barrier **and**
133
+ for any `aex.session.*` park terminal. The managed plane does not broadcast a
134
+ separate `aex.run.settled` barrier — the park event plays that role — so
135
+ `isRunSettled` is the one guard that reliably means "this stream is done and
136
+ the record is authoritative".
137
+
138
+ To read the authoritative status consistently, use one of:
129
139
 
130
140
  ```ts
131
141
  // Session record path: send a turn, then wait for the session to park.
@@ -135,18 +145,21 @@ const record = await session.wait(); // the parked session record
135
145
  ```
136
146
 
137
147
  ```ts
138
- // Live events AND a settle-consistent end: the iterator keeps reading past
139
- // RUN_FINISHED until the post-mirror barrier, so the record is terminal when it ends.
148
+ // Live events AND a settle-consistent end: the iterator ends on the settle
149
+ // barrier OR the aex.session.* park terminal, whichever the plane emits,
150
+ // so when it ends, the session record is already parked/terminal.
140
151
  for await (const event of session.events().streamEnvelopes({ settleConsistent: true })) {
141
152
  // render events live…
142
153
  }
143
- const settled = await aex.sessions.get(session.id); // guaranteed terminal here
154
+ const settled = await aex.sessions.get(session.id); // parked/terminal here
144
155
  ```
145
156
 
146
- Under the hood the coordinator broadcasts one `aex.run.settled` CUSTOM event as a
147
- run's last stream event, immediately after the durable record commits.
148
- `settleConsistent` ends the stream on it; on a raw stream, detect it with
149
- `isRunSettled(event)`.
157
+ `settleConsistent: true` makes the iterator end exactly when `isRunSettled(event)`
158
+ first fires; on a raw stream, apply `isRunSettled(event)` yourself. What it
159
+ guarantees: when the stream ends, a subsequent `aex.sessions.get(id)` reads a
160
+ parked/terminal status and `session.outputs().list()` is complete. Outputs are
161
+ uploaded before the terminal is broadcast, so they are readable the moment the
162
+ stream ends.
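
The difference between the two guards can be made concrete with local stand-ins. The sketch below is not the SDK's implementation: `StreamEventLike` is a hypothetical minimal shape (an AG-UI `type`, plus a `name` on `CUSTOM` events), and the `*Sketch` functions only mirror the behavior described above.

```typescript
// Hypothetical minimal event shape for illustration.
interface StreamEventLike {
  type: string;
  name?: string;
}

const PARK_TERMINALS = new Set([
  "aex.session.idle",
  "aex.session.suspended",
  "aex.session.error",
]);

// Render-complete signal only: the AG-UI terminal pair.
function isRunTerminalSketch(e: StreamEventLike): boolean {
  return e.type === "RUN_FINISHED" || e.type === "RUN_ERROR";
}

// Record-authoritative signal: the settle barrier or any park terminal.
function isRunSettledSketch(e: StreamEventLike): boolean {
  if (e.type !== "CUSTOM" || e.name === undefined) return false;
  return e.name === "aex.run.settled" || PARK_TERMINALS.has(e.name);
}

const park: StreamEventLike = { type: "CUSTOM", name: "aex.session.idle" };
console.log(isRunTerminalSketch(park), isRunSettledSketch(park)); // false true
```

On a raw stream, a consumer that stops on the terminal guard alone would keep waiting on the managed plane, which is why the settled guard is the one to key off.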
150
163
 
151
164
  ## Temporary event archive links
152
165
 
@@ -5,10 +5,9 @@ title: Limits & quotas
5
5
  # Limits & quotas
6
6
 
7
7
  These are the hard ceilings and caps that bound a run, a workspace, and a single
8
- request. Every value is mirrored from a single source-of-truth constant; the
9
- constant file is authoritative and this page is generated documentation, not a
10
- second source of truth. If a value here ever disagrees with that constant,
11
- the constant wins.
8
+ request. Every value is mirrored from a single source-of-truth constant in the
9
+ platform's limits module; this page is hand-maintained against those constants.
10
+ If a value here ever disagrees with the constant, the constant wins.
12
11
 
13
12
  Each row is named by its source-of-truth constant. For the values that apply
14
13
  when you omit an option, see
@@ -26,7 +25,7 @@ And whether you can **raise** it: per-run option, per-plan, or no.
26
25
 
27
26
  | Limit | Value | Source | Raisable? | Constant |
28
27
  | --- | --- | --- | --- | --- |
29
- | Maximum run timeout | 6 hours | aex policy | Per plan (billing-driven) | `RUN_MAX_TIMEOUT_MS` |
28
+ | Maximum run timeout | 8 hours (also the default when `timeout` is omitted) | aex policy | Per plan (billing-driven) | `RUN_MAX_TIMEOUT_MS` |
30
29
  | Minimum run timeout | 1 minute | aex policy | No (floor) | `RUN_MIN_TIMEOUT_MS` |
31
30
  | Per-call exec timeout (default) | 30 minutes | aex policy | Per-call via the tool call's `timeoutMs` | `RUN_DEFAULT_EXEC_TIMEOUT_MS` |
32
31
  | MCP connect timeout (default) | 30 seconds | aex policy | Per-port via `connectTimeoutMs` | `RUN_DEFAULT_MCP_CONNECT_TIMEOUT_MS` |
@@ -43,8 +42,8 @@ silently lost.
43
42
  | --- | --- | --- | --- | --- |
44
43
  | Capture wall-clock budget | 1 hour | aex policy | No (hard ceiling) | `RUN_CAPTURE_DEFAULT_TIMEOUT_MS` |
45
44
  | Max files captured | 50,000 | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_FILES` |
46
- | Max bytes per captured file | 1 TB | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_FILE_BYTES` |
47
- | Max total captured bytes | 1 TB | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_TOTAL_BYTES` |
45
+ | Max bytes per captured file | 500 GB (decimal) | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_FILE_BYTES` |
46
+ | Max total captured bytes | 500 GB (decimal) | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_TOTAL_BYTES` |
48
47
 
49
48
  ### Tool output caps (per run)
50
49
 
@@ -74,9 +73,11 @@ silently lost.
74
73
 
75
74
  | Limit | Value | Source | Raisable? | Constant |
76
75
  | --- | --- | --- | --- | --- |
77
- | Workspace storage cap | 50 GiB (admins uncapped — not a customer entitlement) | Workspace default | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES` | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
78
- | Max concurrent runs per workspace | Advisory: there is no hard per-workspace concurrent-run cap constant; concurrency is bounded by plan, the subagent child-run cap, and provider/platform throughput rather than a fixed number. | aex policy | n/a | |
- | Skill bundle max compressed size (`.zip`) | 100 GB | Workspace default | Per-workspace (plan/env) | `WORKSPACE_SKILL_BUNDLE_MAX_COMPRESSED_BYTES` |
+ | Workspace storage cap | 500 GB (decimal; admins uncapped — not a customer entitlement) | Workspace default | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES` | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
+ | Max concurrent runs per workspace | **50** live (non-terminal) root runs by default; hard platform ceiling **200**. One more submit past the cap fails with `429 workspace_concurrency_exceeded` (see [Errors](errors.md)). Subagent children are governed separately by the per-lineage caps below. | Workspace default | Per-workspace override (contact support), clamped to the 200 ceiling | `WORKSPACE_DEFAULT_MAX_CONCURRENT_RUNS` / `WORKSPACE_MAX_CONCURRENT_RUNS_CEILING` |
+ | Monthly workspace spend cap | **$250** per rolling UTC calendar month by default; `0` = unlimited. A submit past the cap fails with `402 workspace_spend_cap_exceeded` (see [Errors](errors.md)). | Workspace default | Per-workspace override (contact support) | `WORKSPACE_DEFAULT_SPEND_CAP_USD` |
+ | Skill bundle max compressed size (`.zip`) | 10 GiB (enforced at upload by the SDK and re-enforced server-side) | aex policy | No (hard ceiling) | `SKILL_BUNDLE_LIMITS.maxCompressedBytes` |
+ | Skill bundle max decompressed size (sum of uncompressed file sizes) | 50 MB | aex policy | No (hard ceiling) | `SKILL_BUNDLE_LIMITS.maxDecompressedBytes` |
  | Skill bundle max file entries | 1,000 | Workspace default | Per-workspace (plan/env) | `WORKSPACE_SKILL_BUNDLE_MAX_FILES` |
  | Skill bundle max directory depth (`a/b/c/d` = 4) | 16 | Workspace default | Per-workspace (plan/env) | `WORKSPACE_SKILL_BUNDLE_MAX_DEPTH` |
  | Skill bundle max entry path length | 512 characters | Workspace default | No (hard ceiling) | `WORKSPACE_SKILL_BUNDLE_MAX_PATH_LENGTH` |
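
The size limits in these tables mix decimal units (GB, TB) with binary units (GiB), so the same-looking number can mean a different byte count. The following is a minimal sketch of the conversion, not SDK code: `toBytes` and `UNIT_BYTES` are hypothetical helper names, and only the unit definitions themselves (powers of 1000 versus powers of 1024) are standard.

```typescript
// Hypothetical helper for comparing limits stated in mixed units.
// Decimal prefixes are powers of 1000; binary prefixes are powers of 1024.
type Unit = "MB" | "GB" | "TB" | "MiB" | "GiB" | "TiB";

const UNIT_BYTES: Record<Unit, number> = {
  MB: 1000 ** 2,
  GB: 1000 ** 3,
  TB: 1000 ** 4,
  MiB: 1024 ** 2,
  GiB: 1024 ** 3,
  TiB: 1024 ** 4,
};

/** Convert a value in the given unit to a byte count. */
function toBytes(value: number, unit: Unit): number {
  return value * UNIT_BYTES[unit];
}

// The workspace storage cap moved from 50 GiB to 500 GB (decimal).
const oldCapBytes = toBytes(50, "GiB"); // 53,687,091,200 bytes
const newCapBytes = toBytes(500, "GB"); // 500,000,000,000 bytes
console.log(newCapBytes / oldCapBytes);
```

Converting both sides to bytes before comparing avoids treating "500 GB" and "500 GiB" as interchangeable; the two differ by roughly 7%.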
@@ -84,18 +85,33 @@ silently lost.
 
  ### Rate limits (per workspace, per minute)
 
- Default values; each is overridable per-plane via the matching
- `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var.
+ Run submission has its own platform-enforced velocity cap: **120 submits per
+ minute** per workspace by default (`0` = disabled). Past it, `POST /runs` fails
+ with `429 workspace_submit_rate_exceeded` (see [Errors](errors.md)). It is
+ overridable per-plane via `AEX_WORKSPACE_SUBMIT_RATE_PER_MIN` or per-workspace
+ via support.
+
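
A caller that submits in bursts may prefer to pace itself rather than hit `429 workspace_submit_rate_exceeded`. The sketch below is a client-side sliding-window limiter under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, not part of the SDK, and the server's actual windowing algorithm is not described here, so a sliding 60-second window is only a conservative guess. A limit of `0` is treated as disabled, mirroring the default described above.

```typescript
// Hypothetical client-side pacing helper; the platform's own window
// semantics may differ from this sliding-window approximation.
class SlidingWindowLimiter {
  private stamps: number[] = [];

  constructor(
    private readonly maxPerWindow: number,
    private readonly windowMs: number = 60_000,
  ) {}

  /**
   * Returns 0 and records the submit if one is allowed at `nowMs`;
   * otherwise returns the number of milliseconds to wait.
   */
  tryAcquire(nowMs: number = Date.now()): number {
    if (this.maxPerWindow <= 0) return 0; // 0 = limiter disabled

    // Keep only submits that are still inside the window.
    const cutoff = nowMs - this.windowMs;
    this.stamps = this.stamps.filter((t) => t > cutoff);

    if (this.stamps.length < this.maxPerWindow) {
      this.stamps.push(nowMs);
      return 0;
    }
    // Window is full: wait until the oldest recorded submit ages out.
    const oldest = Math.min(...this.stamps);
    return oldest + this.windowMs - nowMs;
  }
}

// Usage: stay under the default of 120 submits per minute.
const submitLimiter = new SlidingWindowLimiter(120);
const waitMs = submitLimiter.tryAcquire();
if (waitMs > 0) {
  console.log(`delay the next submit by ${waitMs} ms`);
}
```

Passing the clock value as a parameter keeps the limiter deterministic in tests; production callers can rely on the `Date.now()` default.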
+ The dashboard mutation actions below default as listed; each is overridable
+ per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var.
 
  | Action | Default per minute | Source | Constant |
  | --- | --- | --- | --- |
- | Run submit | 60 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
  | Run cancel | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
  | Run delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
  | Signed output link | 120 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
  | API token create | 10 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
  | API token delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 
+ ### Introspecting your effective caps
+
+ `aex.whoami()` (CLI: `aex whoami`) returns a `limits` object carrying the
+ workspace's *effective* values for the caps above — `maxConcurrentRuns`,
+ `submitRatePerMinute`, `spendCapUsd`, plus the live `monthSpendUsd`,
+ `balanceUsd`, `balanceGraceFloorUsd`, and `paymentMethodStatus` — resolved by
+ the same code the admission gates use, so you can anticipate a `429`/`402`
+ before submitting. See [Authentication](authentication.md) and
+ [Errors](errors.md).
+
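
As an illustration of anticipating a `429`/`402` from the `limits` object, here is a minimal sketch. It is not the SDK's admission logic: the field names come from the paragraph above, but their types, the `preflight` function, the `liveRootRuns` input (a count the caller must obtain separately), and the exact boundary comparisons are all assumptions.

```typescript
// Field names follow the `limits` object described above; the numeric
// types are assumed, and only the fields this sketch needs are declared.
interface EffectiveLimits {
  maxConcurrentRuns: number;
  spendCapUsd: number; // 0 = unlimited, per the spend-cap row above
  monthSpendUsd: number;
}

type Preflight =
  | { ok: true }
  | { ok: false; status: 402 | 429; code: string };

/**
 * Hypothetical local check mirroring two admission gates.
 * `liveRootRuns` is the caller's own count of non-terminal root runs.
 */
function preflight(limits: EffectiveLimits, liveRootRuns: number): Preflight {
  // Assumption: spend at or above the cap blocks the next submit.
  if (limits.spendCapUsd > 0 && limits.monthSpendUsd >= limits.spendCapUsd) {
    return { ok: false, status: 402, code: "workspace_spend_cap_exceeded" };
  }
  // With the cap already reached, one more submit would exceed it.
  if (liveRootRuns >= limits.maxConcurrentRuns) {
    return { ok: false, status: 429, code: "workspace_concurrency_exceeded" };
  }
  return { ok: true };
}

// Usage with the documented defaults (50 concurrent runs, $250 cap).
const verdict = preflight(
  { maxConcurrentRuns: 50, spendCapUsd: 250, monthSpendUsd: 42 },
  50,
);
console.log(verdict);
```

A local check like this is advisory only: the server remains the source of truth, so callers should still handle the error responses documented in [Errors](errors.md).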
  ## Request Scope
 
  | Limit | Value | Source | Raisable? | Constant |
package/docs/limits.md CHANGED
@@ -16,13 +16,13 @@ For the current provider/model set, see the generated
 
  | Area | Default |
  | --- | --- |
- | Workspace storage | 50 GiB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
+ | Workspace storage | 500 GB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
 
  ## Product Boundaries
 
  | Area | Boundary |
  | --- | --- |
- | Runtime | New submissions run on the managed runtime. There is no public runtime selector. |
+ | Runtime | New submissions run on the managed runtime. The `runtime` option selects a managed machine-size preset (`Sizes.*`); there is no alternative runtime backend. |
  | Provider policy | Provider retention, training exclusion, HIPAA/BAA, data residency, abuse policy, and pricing belong to the selected provider account, endpoint, and contract. |
  | Secrets | Provider keys, MCP credentials, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
  | MCP servers | Remote MCP servers are customer-trusted systems. aex validates declarations and routes credentials; it does not make an untrusted MCP server safe. |