npm - @aexhq/sdk - Versions diffs - 0.36.0 → 0.37.0 - Mend

Files changed (43)

package/README.md +1 -1
package/dist/_contracts/event-envelope.d.ts +22 -1
package/dist/_contracts/event-envelope.js +26 -2
package/dist/_contracts/event-stream-client.js +7 -1
package/dist/_contracts/operations.d.ts +30 -1
package/dist/_contracts/operations.js +54 -1
package/dist/_contracts/run-config.d.ts +1 -1
package/dist/_contracts/run-unit.d.ts +12 -0
package/dist/_contracts/run-unit.js +55 -0
package/dist/_contracts/runtime-sizes.d.ts +2 -2
package/dist/_contracts/runtime-sizes.js +5 -5
package/dist/_contracts/runtime-types.d.ts +98 -0
package/dist/_contracts/submission.d.ts +4 -4
package/dist/cli.mjs +554 -69
package/dist/cli.mjs.sha256 +1 -1
package/dist/client.d.ts +40 -1
package/dist/client.js +90 -5
package/dist/client.js.map +1 -1
package/dist/index.d.ts +1 -1
package/dist/index.js.map +1 -1
package/dist/version.d.ts +1 -1
package/dist/version.js +1 -1
package/docs/authentication.md +92 -0
package/docs/billing.md +112 -0
package/docs/concepts/agent-tools.md +4 -4
package/docs/concepts/providers-and-runtimes.md +4 -1
package/docs/concepts/runs.md +2 -1
package/docs/concepts/subagents.md +85 -0
package/docs/credentials.md +27 -3
package/docs/defaults.md +9 -7
package/docs/errors.md +132 -0
package/docs/events.md +36 -23
package/docs/limits-and-quotas.md +29 -13
package/docs/limits.md +2 -2
package/docs/mcp.md +1 -1
package/docs/networking.md +68 -42
package/docs/outputs.md +4 -3
package/docs/public-surface.json +1 -1
package/docs/quickstart.md +9 -6
package/docs/run-config.md +1 -1
package/docs/secrets.md +5 -0
package/docs/webhooks.md +132 -0
package/package.json +1 -1

package/docs/billing.md ADDED

@@ -0,0 +1,112 @@
+---
+title: Billing & webhook signing secret
+---
+# Billing & webhook signing secret
+Workspace-level billing, subscription, and webhook verification calls are
+token-scoped like every other client call — the workspace is derived
+server-side from the API token.
+## Read the billing summary
+`aex.billing()` returns the workspace's prepaid balance, current-month spend,
+and the spend cap enforced on new runs, plus plan fields:
+```ts
+import { Aex } from "@aexhq/sdk";
+const aex = new Aex(process.env.AEX_API_TOKEN!);
+const billing = await aex.billing();
+console.log(billing.balanceUsd, billing.monthSpendUsd, billing.spendCapUsd);
+```
+The returned `BillingSummary` is additive-tolerant: fields a newer deployment
+reports that this SDK version does not know yet pass through on the object
+instead of being rejected.
+CLI equivalent:
+```bash
+aex billing            # human-readable balance / month spend / spend cap
+aex billing --json     # the raw wire body for scripting
+```
+## Manage the subscription
+`aex.billingCheckout({ planKey })` creates a hosted checkout session for a paid
+plan. Open the returned URL in a browser; the workspace plan changes after
+checkout completes and the hosted API confirms the subscription.
+```ts
+const { url } = await aex.billingCheckout({
+  planKey: "pro",
+  idempotencyKey: crypto.randomUUID()
+});
+console.log(url);
+```
+`aex.billingPortal()` creates a hosted billing portal session for the workspace:
+```ts
+const { url } = await aex.billingPortal({ returnUrl: "https://aex.dev/billing" });
+console.log(url);
+```
+CLI equivalents:
+```bash
+aex billing upgrade pro
+aex billing portal
+```
+## Read the credit ledger
+`aex.billingLedger({ limit })` returns recent credit-ledger rows, newest first —
+allowance grants, adjustments, and run charges with signed `amountUsd` (credits
+positive, charges negative). `limit` is clamped server-side to [1, 100] (default
+25); the read is not cursor-paged.
+```ts
+const { entries } = await aex.billingLedger({ limit: 50 });
+for (const entry of entries) {
+  console.log(entry.createdAt, entry.entryType, entry.amountUsd);
+}
+```
+CLI equivalent:
+```bash
+aex billing ledger --limit 50   # JSON rows, newest first
+```
+## Reveal the webhook signing secret
+Run webhooks are signed Standard-Webhooks style with a per-workspace secret.
+`aex.webhookSigningSecret()` reveals it (creating one on first use) as the
+`whsec_<base64>` string that `verifyAexWebhook` takes as `secret`:
+```ts
+import { Aex, verifyAexWebhook } from "@aexhq/sdk";
+const aex = new Aex(process.env.AEX_API_TOKEN!);
+const { whsec } = await aex.webhookSigningSecret();
+// In your webhook receiver:
+const verified = await verifyAexWebhook({
+  rawBody,          // the exact request body bytes as a string
+  headers,          // the inbound request headers
+  secret: whsec
+});
+```
+Repeat calls return the SAME value — the hosted API does not rotate the
+signing secret. Treat the reveal as sensitive: store it in your secret manager
+and never log it.
+CLI equivalent (prints the bare `whsec_...` string, pipeable into a secret
+store; the reveal never goes to stderr or debug traces):
+```bash
+aex webhooks secret
+```
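
Note on the credit-ledger section of `billing.md` above: because `amountUsd` is signed (credits positive, charges negative), a caller can total the rows without switching on `entryType`. The following is a minimal sketch, not SDK code. The `LedgerEntry` interface is a hypothetical shape covering only the three fields the doc's own example reads, and the `entryType` strings in the sample are invented placeholders.

```typescript
// Hypothetical row shape: only the fields the billing.md example reads.
interface LedgerEntry {
  createdAt: string;
  entryType: string; // real values are not listed in the diff
  amountUsd: number; // signed: credits positive, charges negative
}

// Split signed ledger rows into total credits, total charges, and the net.
function summarizeLedger(entries: LedgerEntry[]): {
  creditsUsd: number;
  chargesUsd: number;
  netUsd: number;
} {
  let creditsUsd = 0;
  let chargesUsd = 0;
  for (const entry of entries) {
    if (entry.amountUsd >= 0) {
      creditsUsd += entry.amountUsd;
    } else {
      chargesUsd += -entry.amountUsd; // store charges as a positive total
    }
  }
  return { creditsUsd, chargesUsd, netUsd: creditsUsd - chargesUsd };
}

// Placeholder rows, newest first, as billingLedger is described to return them.
const sampleEntries: LedgerEntry[] = [
  { createdAt: "2025-01-03T00:00:00Z", entryType: "example_charge", amountUsd: -1.25 },
  { createdAt: "2025-01-02T00:00:00Z", entryType: "example_charge", amountUsd: -2.5 },
  { createdAt: "2025-01-01T00:00:00Z", entryType: "example_grant", amountUsd: 10 },
];

console.log(summarizeLedger(sampleEntries));
```

Since the read is capped at 100 rows and is not cursor-paged, a total computed this way covers only the most recent rows, not the workspace's full history.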

package/docs/concepts/agent-tools.md CHANGED

@@ -12,7 +12,7 @@ Managed runs inject the complete builtin tool set into the agent by default:
 - `head`, `tail` — read bounded file slices
 - `web_fetch`, `web_search` — fetch a URL / managed web search
 - `todo_write` — maintain a todo list
-- `subagent`, `subagent_result` — delegate to and read back from child runs
+- `subagent`, `subagent_result` — delegate to and read back from child runs (see [Subagents](subagents.md))
 - `bash_output`, `bash_kill` — manage background bash jobs
 - `wait`, `git` — bounded idle-yield and first-class git
@@ -32,13 +32,13 @@ to pick a narrow subset alongside `includeBuiltinTools: false`.
 The final tool list is ordered: resolved builtin tools, then custom tools, then
 MCP tools.
-Networking is open by default: the agent may reach any public host, subject to a
-fixed SSRF deny-list. `web_fetch` and `web_search` reach the network over a
+Networking is open by default within the platform's managed egress ceiling and
+a fixed SSRF deny-list. `web_fetch` and `web_search` reach the network over a
 managed, SSRF-guarded path that is **not** governed by `environment.networking`,
 so their hosts never need to be listed in a `limited` allowlist. Setting
 `environment.networking.mode` to `limited` restricts only the agent's own
 arbitrary egress (e.g. a `curl` in `bash`); the built-in web tools keep working.
-See [Networking](../networking.md).
+See [Networking](../networking.md) for the full two-layer model.
 ## Disable builtins

package/docs/concepts/providers-and-runtimes.md CHANGED

@@ -17,7 +17,10 @@ aex exposes one submission shape across supported providers:
 | Doubao | `Providers.DOUBAO` |
 | Doubao China | `Providers.DOUBAO_CN` |
-All submissions run on the managed runtime. There is no public runtime selector; omit `runtime`.
+All submissions run on the managed runtime. The optional `runtime` option picks
+a managed machine-size preset — use `Sizes.*` in TypeScript (e.g.
+`runtime: Sizes.SHARED_0_25X_1GB`) or `--runtime-size` in the CLI. Omit it for
+the default size; there is no alternative runtime backend to select.
 ## Selection

package/docs/concepts/runs.md CHANGED

@@ -38,7 +38,8 @@ The same durable record backs SDK and CLI reads. From the handle use `refresh`,
 and `events().stream()` / `events().streamEnvelopes()` / `outputs().read(...)` for
 streaming and byte-capped reads) — to inspect the session live or after it parks;
 from the client, `aex.sessions.list()` / `aex.sessions.get(id)` read across the
-workspace.
+workspace (CLI mirrors: `aex sessions` and `aex runs` list the workspace's
+sessions/runs newest-first).
 Use `idempotencyKey` when retrying `openSession` or `send` from your own
 workflow. aex hashes the normalized non-secret submission, so a retry with the

package/docs/concepts/subagents.md ADDED

@@ -0,0 +1,85 @@
+---
+title: Subagents
+description: Delegate bounded sub-tasks to child agent runs with the subagent tool.
+icon: GitFork
+---
+A run can delegate bounded sub-tasks to **child agent runs** with the built-in
+`subagent` tool. Delegation is agent-driven: the model decides to fan work out,
+each child is a real run with its own record, and the parent collects results
+as the children finish. There is no client-side parent/child API — lineage is
+session-internal.
+## How the tool works
+The agent calls `subagent` with a `prompt` and a `model` (both required), plus
+optional `system`, `provider`, `runtimeSize`, `timeout`,
+`includeBuiltinTools`, `tools` (builtin-tool names for the child), `skills`,
+and `files`. The call is **always async**: on a successful spawn it returns
+immediately with the child run id, and the parent keeps working while the child
+runs. When a child settles, the parent is notified in its loop, and it reads
+the child's status and captured outputs on demand with the companion
+`subagent_result` tool.
+Children inherit the parent's vaulted BYOK provider keys server-side — the
+child submission carries no secrets. Include a provider key for every provider
+your subagents may use when you open the parent session (a parent holding no
+key for the child's provider gets a clear `parent_missing_provider_key` tool
+error). See [Credentials](../credentials.md).
+## Depth and breadth limits
+Delegation is bounded by two server-enforced lineage limits:
+| Limit | Value | Behavior at the limit |
+| --- | --- | --- |
+| Max depth | **5** — the root run is depth 0 and may spawn down to depth 5; a depth-5 run may not spawn further | The spawn is rejected with a `depth_exceeded` tool error (the parent keeps running). |
+| Concurrent children per lineage root | **1000** live (non-terminal) descendants by default; hard platform ceiling **4096** | Further spawns are refused until a child settles. |
+The whole descendant subtree of one root shares a single depth and breadth
+budget, enforced server-side at every level — a grandchild spawn counts against
+the same root budget as a direct child. Values are mirrored in
+[Limits & quotas](../limits-and-quotas.md).
+## Where children run: `in-process` vs `container`
+By default a child runs **in-process**: it executes as a sibling agent process
+inside the parent's own machine, sharing the parent's CPU, memory, and
+lifetime. This is the platform default shipped today.
+- **No extra runtime cost.** The parent's machine is the billable unit, so
+  in-process children bill **$0 of additional runtime** — fan-out is priced by
+  the parent box, however many children it hosts. (Model-token spend is still
+  whatever each child's provider calls cost on your BYOK key.)
+- **Shared capacity.** N children share the parent's fixed CPU/memory. For
+  large fan-outs, size the parent up (`runtime`) rather than assuming each
+  child gets its own machine.
+- **Joined lifecycle.** The parent's terminal waits for its in-process children,
+  and their results are folded into the parent's per-child accounting. Platform
+  recovery re-spawns in-process children exactly once if the parent's machine
+  is replaced mid-run — settled children are never re-run.
+The escape valve is `host: "container"`: the child is dispatched to its **own
+isolated machine** with its own runtime size and its own runtime billing, and
+the parent does not host it. Use it when a child needs guaranteed capacity,
+isolation from the parent's filesystem/CPU, or a different machine size than
+the parent can share.
+## Lineage and observability
+Every child — in-process or container — is a first-class run record:
+- The parent's transcript logs each spawn with the child's run id.
+- Each child has its own status, typed event timeline, and captured outputs,
+  readable by id like any other run (`aex.sessions.get(id)`, or the CLI's
+  `aex status` / `aex events` / `aex outputs` / `aex download`).
+- The child's outputs are handed back to the parent via `subagent_result`, and
+  they remain independently downloadable after the lineage finishes.
+## Bounding delegation
+- Turn delegation off for a run by cherry-picking builtins without `subagent`
+  (see [Agent tools](agent-tools.md)) or setting `includeBuiltinTools: false`.
+- A per-session spend cap (`overrides.maxSpendUsd`) bounds the parent's spend.
+- The depth/breadth limits above are platform defaults and are not settable
+  per-session today.

package/docs/credentials.md CHANGED

@@ -13,6 +13,24 @@ aex uses explicit, per-session credentials:
 Secrets never belong in reusable run config, files, prompts, or examples.
+## The client credential
+Pass your aex API token directly to the constructor — `new Aex(apiKey)` — or as
+the `apiKey` option. The older `apiToken` option remains accepted as a
+compatibility alias, so existing code keeps working:
+```ts
+import { Aex } from "@aexhq/sdk";
+const aex = new Aex(process.env.AEX_API_TOKEN!);          // preferred shorthand
+// equivalently:
+// const aex = new Aex({ apiKey: process.env.AEX_API_TOKEN! });
+// const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! }); // alias
+```
+See [Authentication](authentication.md) for how tokens are scoped, rotated, and
+issued during the beta.
 ## Provider keys
 A session selects one upstream provider and must carry a BYOK key for it. Include
@@ -68,6 +86,11 @@ curl -sS \
 ## Workspace secrets
+> **Availability note:** workspace-secret `Secret.ref(...)` injection requires
+> the next platform deploy — on the current hosted plane the referenced
+> variable can resolve empty inside the run. Per-run `Secret.value(...)`
+> secrets are unaffected.
 Store reusable values once, then reference them by name:
 ```ts
@@ -92,9 +115,10 @@ Secret reads return metadata only; they never return the stored value.
 ## Networking
-Networking is open by default. Use `environment.networking.mode: "limited"` with
-`allowedHosts` when you want a run to reach only named public hosts. See
-[Networking](networking.md).
+Networking is open by default within the platform's managed egress ceiling. Use
+`environment.networking.mode: "limited"` with `allowedHosts` when you want a
+run's own code to reach only named hosts. See [Networking](networking.md) for
+the two-layer enforcement model.
 ## Explicit call-site rule

package/docs/defaults.md CHANGED

@@ -5,10 +5,9 @@ title: Defaults
 # Defaults
 These are the values aex applies when you **omit** the corresponding option on a
-run. Every value is mirrored from a single source-of-truth constant; the
-constant file is authoritative and this page is generated documentation, not a
-second source of truth. If a value here ever disagrees with that constant,
-the constant wins.
+run. Every value is mirrored from a single source-of-truth constant in the
+platform's limits module; this page is hand-maintained against those constants.
+If a value here ever disagrees with the constant, the constant wins.
 Each value below is named by its source-of-truth constant. The runtime-size
 presets are defined in the public
@@ -21,7 +20,7 @@ For the hard ceilings and who can raise them, see
 | Option | Default | How to override | Source |
 | --- | --- | --- | --- |
-| `timeout` (run deadline) | 1 hour | Per-session via `overrides.timeout` (e.g. `"30m"`, `"2h"`), clamped to the run-timeout floor/ceiling. | `RUN_DEFAULT_TIMEOUT_MS` |
+| `timeout` (run deadline) | 8 hours (also the ceiling) | Per-session via `overrides.timeout` (e.g. `"30m"`, `"2h"`), clamped to the run-timeout floor/ceiling. | `RUN_DEFAULT_TIMEOUT_MS` |
 | `runtime` (machine size) | `shared-0.25x-1gb` — 0.25 vCPU, 1 GB | Per-session via `runtime` (use `Sizes.*` in TypeScript). | `RUN_DEFAULT_RUNTIME_SIZE` |
 | `overrides.maxSpendUsd` (per-session spend cap) | None — no spend cap (the session is still bounded by its `timeout` and any workspace-level cap) | Per-session via `overrides.maxSpendUsd` (a positive USD amount); the session is stopped once its spend would exceed the cap. | — |
@@ -57,5 +56,8 @@ For the hard ceilings and who can raise them, see
 | Option | Default | How to override | Source |
 | --- | --- | --- | --- |
-| Per-workspace mutation rate limits (per minute) | run submit 60, run cancel 30, run delete 30, signed link 120, API token create 10, API token delete 30 | Per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var. | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
-| Workspace storage cap | 50 GiB | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES`; admin workspaces are uncapped (not a customer entitlement). | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
+| Run submit rate (per minute) | 120 (`0` = disabled); past it `POST /runs` fails with `429 workspace_submit_rate_exceeded` | Per-plane via env `AEX_WORKSPACE_SUBMIT_RATE_PER_MIN`; per-workspace via support. | — |
+| Max concurrent runs | 50 live root runs (hard ceiling 200) | Per-workspace override via support, clamped to the ceiling. | `WORKSPACE_DEFAULT_MAX_CONCURRENT_RUNS` |
+| Monthly spend cap | $250 per UTC calendar month (`0` = unlimited) | Per-workspace override via support. | `WORKSPACE_DEFAULT_SPEND_CAP_USD` |
+| Per-workspace mutation rate limits (per minute) | run cancel 30, run delete 30, signed link 120, API token create 10, API token delete 30 | Per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var. | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
+| Workspace storage cap | 500 GB (decimal) | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES`; admin workspaces are uncapped (not a customer entitlement). | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
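
Note on the `timeout` row in `defaults.md` above: the new default of 8 hours equals the ceiling, and explicit values are clamped between the floor and the ceiling. The sketch below illustrates that clamp under stated assumptions. The constant names come from the docs' tables, but the values are hard-coded here from the prose (1 minute and 8 hours), the parser accepts only the `"30m"` / `"2h"` forms shown in the table, and `effectiveTimeoutMs` is a hypothetical helper, not an SDK export.

```typescript
// Values as stated in the docs; the real constants live in the platform.
const RUN_MIN_TIMEOUT_MS = 60_000; // 1 minute floor
const RUN_MAX_TIMEOUT_MS = 8 * 60 * 60 * 1000; // 8 hour ceiling (also the default)

// Parse the two duration forms the docs show: "<n>m" and "<n>h".
// Other forms the real SDK may accept are deliberately not guessed at.
function parseDurationMs(text: string): number {
  const match = /^(\d+)([mh])$/.exec(text.trim());
  if (!match) {
    throw new Error(`unsupported duration: ${text}`);
  }
  const [, amount = "", unit = ""] = match;
  const unitMs = unit === "h" ? 3_600_000 : 60_000;
  return Number(amount) * unitMs;
}

// Omitted timeout falls back to the default; explicit values are clamped.
function effectiveTimeoutMs(timeout?: string): number {
  if (timeout === undefined) {
    return RUN_MAX_TIMEOUT_MS;
  }
  const requested = parseDurationMs(timeout);
  return Math.min(RUN_MAX_TIMEOUT_MS, Math.max(RUN_MIN_TIMEOUT_MS, requested));
}

console.log(effectiveTimeoutMs("30m"), effectiveTimeoutMs("12h"), effectiveTimeoutMs());
```

One consequence of the 0.37.0 values is that omitting `timeout` now yields the longest deadline available, so callers who relied on the old 1-hour default may want to set `overrides.timeout` explicitly.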

package/docs/errors.md ADDED

@@ -0,0 +1,132 @@
+---
+title: Errors
+---
+# Errors
+Every API error is a JSON body with a machine-readable `error` code; most also
+carry a human `message` and the self-describing fields named below. The SDK
+surfaces non-2xx responses as `AexApiError` (with the parsed body attached) and
+throttling as `AexRateLimitError`.
+## 401 — authentication
+| Code | Meaning |
+| --- | --- |
+| `unauthorized` | Missing, invalid, or revoked bearer token. |
+Check the token value and that it has not been deleted. `aex whoami` is the
+cheapest way to validate a credential. See [Authentication](authentication.md).
+## 403 — authorization
+| Code | Meaning |
+| --- | --- |
+| `insufficient_scope` | The token is valid but lacks the route's required scope. The body's `requiredScope` field names the missing scope. |
+| `unknown_workspace` | The token does not route to a known workspace. |
+| `forbidden` | The authenticated workspace does not own the addressed resource. |
+```json
+{ "error": "insufficient_scope", "requiredScope": "runs:write" }
+```
+## 400 — validation
+| Code | Meaning |
+| --- | --- |
+| `bad_request` | Missing or unparseable request body. |
+| `invalid_submission` | The submission failed shape validation; `message` names the offending field. |
+| `missing_provider_key` | The submission names a provider but carries no BYOK key for it (`apiKeys[provider]`). |
+| `malformed_token` | The bearer value is not a structurally valid aex token. |
+400s are permanent for that request — fix the input rather than retrying.
+The SDK's client-side validation (`RunConfigValidationError`) catches most of
+these before the request is sent.
+## 402 — payment required
+Two distinct submit gates return 402; both bodies are self-describing.
+**`insufficient_balance`** — the workspace prepaid balance is at or below the
+effective submit floor. Top up the balance or bind a payment method.
+```json
+{
+  "error": "insufficient_balance",
+  "message": "Workspace balance is depleted; top up your prepaid balance or bind a payment method to submit runs.",
+  "balanceUsd": 0,
+  "balanceGraceFloorUsd": 0,
+  "paymentMethodStatus": "none",
+  "planKey": "default"
+}
+```
+`balanceGraceFloorUsd` is the payment-method-aware floor the gate compared
+against (`paymentMethodStatus: "active"` folds a bounded card overdraft into
+it, so the floor can be negative).
+**`workspace_spend_cap_exceeded`** — the workspace's monthly spend cap is
+reached. The cap resets at the start of the next UTC month; contact support to
+raise it.
+```json
+{
+  "error": "workspace_spend_cap_exceeded",
+  "message": "Monthly spend cap of $250 reached ($251.13 accrued this month). The cap resets at the start of the next UTC month; contact support to raise it.",
+  "capUsd": 250,
+  "accruedUsd": 251.13
+}
+```
+## 429 — rate limits
+**`workspace_concurrency_exceeded`** — admitting one more live run would exceed
+the workspace's concurrent-run cap. Wait for a run to finish, or contact
+support to raise the cap.
+```json
+{
+  "error": "workspace_concurrency_exceeded",
+  "message": "Workspace concurrency limit reached: 50 live runs at the cap of 50. Wait for a run to finish, or contact support to raise your workspace limit.",
+  "cap": 50,
+  "observed": 50
+}
+```
+**`workspace_submit_rate_exceeded`** — too many submits in the current
+one-minute window. Retry shortly.
+```json
+{
+  "error": "workspace_submit_rate_exceeded",
+  "message": "Submit rate limit of 120/minute exceeded. Retry shortly, or contact support to raise your workspace limit.",
+  "perMin": 120,
+  "observed": 121
+}
+```
+The `limit`-naming fields (`cap`/`perMin`) and the `observed` window value make
+each deny self-describing, so a client can back off proportionally. To
+anticipate both 429s and both 402s *before* submitting, read the effective caps
+from `aex.whoami().limits` — the values come from the same resolution code the
+gates enforce. See [Limits & quotas](limits-and-quotas.md).
+## 404 — not found
+`not_found`: the id does not exist **or** belongs to another workspace (aex
+does not distinguish the two).
+## 5xx — server errors
+| Code | Meaning |
+| --- | --- |
+| `internal_error` (500) | Unexpected server fault. Retry with backoff; report persistent cases. |
+| `db_resuming` (503) | The database tier is resuming from idle. Transient — retry. |
+The SDK retries transient failures automatically: HTTP `429`, `5xx`, `529`, and
+network errors get bounded exponential backoff with full jitter, honoring any
+`Retry-After` header. Tune or disable this with the client `retry` option; use
+`isRateLimited(err)` / `AexRateLimitError` to handle persistent throttling
+without parsing raw bodies. Idempotent submit retries are safe — the SDK
+attaches a stable idempotency key to billable session create/send requests, so
+a retried request never double-submits.

package/docs/events.md CHANGED

@@ -110,22 +110,32 @@ collected session turn. The returned `runId` is the session id.
 ## Terminal events vs. the run record
-A session turn emits a terminal **event** — `RUN_FINISHED`
-(success) or `RUN_ERROR` — when
-the agent's stream ends. This is an AG-UI *render-complete* signal: the runner
-emits it **before** aex commits the authoritative session record, so an
-`aex.sessions.get(id)` issued the instant you observe `RUN_FINISHED` can still
-read a non-parked status for a moment. Treat the terminal event as the
-lowest-latency "stop the spinner" signal — **not** a read-consistency barrier.
-Two facts make this easy to work with:
-- **Outputs are already durable at the terminal event.** The runner uploads every
-  output before it emits the terminal event, and `session.outputs().list()` / downloads
-  read object storage directly — so the moment you see `RUN_FINISHED` the outputs
-  are complete and readable.
-- **The session _record_ settles a beat later.** To read the authoritative status
-  consistently, don't key off the terminal event — use one of:
+Two families of events can end a turn's stream, and which one you see depends
+on how the turn ends:
+- **AG-UI terminals** — `RUN_FINISHED` / `RUN_ERROR`. These are *render-complete*
+  signals emitted by the agent stream itself. On the managed plane a normal
+  session turn usually does **not** emit `RUN_FINISHED`: the session *parks*
+  instead (see below). Expect `RUN_ERROR` on stream-level failures, and treat
+  `RUN_FINISHED` — when it does appear — as a low-latency "stop the spinner"
+  hint, not a read-consistency barrier.
+- **`aex.session.*` park terminals** — `CUSTOM` events named `aex.session.idle`,
+  `aex.session.suspended`, or `aex.session.error`. On the managed plane these
+  are what actually end a turn: the session parks with the matching status, and
+  by the time the park event is broadcast the session record has already
+  reached that status. This is the terminal you should expect from a managed
+  run's event stream.
+The SDK's helpers cover both families so you never have to switch on the plane:
+- `isRunTerminal(event)` — true for the AG-UI `RUN_FINISHED` / `RUN_ERROR` pair.
+- `isRunSettled(event)` — true for the `aex.run.settled` settle barrier **and**
+  for any `aex.session.*` park terminal. The managed plane does not broadcast a
+  separate `aex.run.settled` barrier — the park event plays that role — so
+  `isRunSettled` is the one guard that reliably means "this stream is done and
+  the record is authoritative".
+To read the authoritative status consistently, use one of:
 ```ts
 // Session record path: send a turn, then wait for the session to park.
@@ -135,18 +145,21 @@ const record = await session.wait(); // the parked session record
 ```
 ```ts
-// Live events AND a settle-consistent end: the iterator keeps reading past
-// RUN_FINISHED until the post-mirror barrier, so the record is terminal when it ends.
+// Live events AND a settle-consistent end: the iterator ends on the settle
+// barrier OR the aex.session.* park terminal, whichever the plane emits —
+// so when it ends, the session record is already parked/terminal.
 for await (const event of session.events().streamEnvelopes({ settleConsistent: true })) {
   // render events live…
 }
-const settled = await aex.sessions.get(session.id); // guaranteed terminal here
+const settled = await aex.sessions.get(session.id); // parked/terminal here
 ```
-Under the hood the coordinator broadcasts one `aex.run.settled` CUSTOM event as a
-run's last stream event, immediately after the durable record commits.
-`settleConsistent` ends the stream on it; on a raw stream, detect it with
-`isRunSettled(event)`.
+`settleConsistent: true` makes the iterator end exactly when `isRunSettled(event)`
+first fires; on a raw stream, apply `isRunSettled(event)` yourself. What it
+guarantees: when the stream ends, a subsequent `aex.sessions.get(id)` reads a
+parked/terminal status and `session.outputs().list()` is complete. Outputs are
+uploaded before the terminal is broadcast, so they are readable the moment the
+stream ends.
 ## Temporary event archive links
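
Note on the terminal-event rewrite in `events.md` above: the new text distinguishes two guards, one for the AG-UI `RUN_FINISHED` / `RUN_ERROR` pair and one for the settle barrier plus the `aex.session.*` park terminals. The sketch below restates that distinction as two predicates. It is only a reading of the prose, not the SDK's `isRunTerminal` / `isRunSettled`. The `StreamEvent` shape (a `type` plus an optional `name` on `CUSTOM` events) is an assumption, and the function names are deliberately different from the SDK's.

```typescript
// Assumed minimal event shape; the SDK's real envelope carries more fields.
interface StreamEvent {
  type: string;
  name?: string; // present on CUSTOM events
}

// Park terminals named in events.md.
const PARK_TERMINAL_NAMES = new Set([
  "aex.session.idle",
  "aex.session.suspended",
  "aex.session.error",
]);

// Render-complete hint: fine for stopping a spinner, not a consistency barrier.
function looksRunTerminal(event: StreamEvent): boolean {
  return event.type === "RUN_FINISHED" || event.type === "RUN_ERROR";
}

// Settle signal: the aex.run.settled barrier or any aex.session.* park terminal.
function looksRunSettled(event: StreamEvent): boolean {
  if (event.type !== "CUSTOM" || event.name === undefined) {
    return false;
  }
  return event.name === "aex.run.settled" || PARK_TERMINAL_NAMES.has(event.name);
}

const exampleEvents: StreamEvent[] = [
  { type: "TEXT_MESSAGE_CONTENT" },
  { type: "RUN_FINISHED" },
  { type: "CUSTOM", name: "aex.session.idle" },
];

for (const event of exampleEvents) {
  console.log(event.type, event.name ?? "", looksRunTerminal(event), looksRunSettled(event));
}
```

Code that previously stopped reading on `RUN_FINISHED` alone is the case this change affects most: on the managed plane the doc now says a normal turn usually ends with a park event instead.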

package/docs/limits-and-quotas.md CHANGED

@@ -5,10 +5,9 @@ title: Limits & quotas
 # Limits & quotas
 These are the hard ceilings and caps that bound a run, a workspace, and a single
-request. Every value is mirrored from a single source-of-truth constant; the
-constant file is authoritative and this page is generated documentation, not a
-second source of truth. If a value here ever disagrees with that constant,
-the constant wins.
+request. Every value is mirrored from a single source-of-truth constant in the
+platform's limits module; this page is hand-maintained against those constants.
+If a value here ever disagrees with the constant, the constant wins.
 Each row is named by its source-of-truth constant. For the values that apply
 when you omit an option, see
@@ -26,7 +25,7 @@ And whether you can **raise** it: per-run option, per-plan, or no.
 | Limit | Value | Source | Raisable? | Constant |
 | --- | --- | --- | --- | --- |
-| Maximum run timeout | 6 hours | aex policy | Per plan (billing-driven) | `RUN_MAX_TIMEOUT_MS` |
+| Maximum run timeout | 8 hours (also the default when `timeout` is omitted) | aex policy | Per plan (billing-driven) | `RUN_MAX_TIMEOUT_MS` |
 | Minimum run timeout | 1 minute | aex policy | No (floor) | `RUN_MIN_TIMEOUT_MS` |
 | Per-call exec timeout (default) | 30 minutes | aex policy | Per-call via the tool call's `timeoutMs` | `RUN_DEFAULT_EXEC_TIMEOUT_MS` |
 | MCP connect timeout (default) | 30 seconds | aex policy | Per-port via `connectTimeoutMs` | `RUN_DEFAULT_MCP_CONNECT_TIMEOUT_MS` |
@@ -43,8 +42,8 @@ silently lost.
 | --- | --- | --- | --- | --- |
 | Capture wall-clock budget | 1 hour | aex policy | No (hard ceiling) | `RUN_CAPTURE_DEFAULT_TIMEOUT_MS` |
 | Max files captured | 50,000 | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_FILES` |
-| Max bytes per captured file | 1 TB | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_FILE_BYTES` |
-| Max total captured bytes | 1 TB | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_TOTAL_BYTES` |
+| Max bytes per captured file | 500 GB (decimal) | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_FILE_BYTES` |
+| Max total captured bytes | 500 GB (decimal) | aex policy | No (hard ceiling) | `RUN_CAPTURE_MAX_TOTAL_BYTES` |
 ### Tool output caps (per run)
@@ -74,9 +73,11 @@ silently lost.
 | Limit | Value | Source | Raisable? | Constant |
 | --- | --- | --- | --- | --- |
-| Workspace storage cap | 50 GiB (admins uncapped — not a customer entitlement) | Workspace default | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES` | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
-| Max concurrent runs per workspace | Advisory — there is no hard per-workspace concurrent-run cap constant; concurrency is bounded by plan, the subagent child-run cap, and provider/platform throughput rather than a fixed number. | aex policy | n/a | — |
-| Skill bundle max compressed size (`.zip`) | 100 GB | Workspace default | Per-workspace (plan/env) | `WORKSPACE_SKILL_BUNDLE_MAX_COMPRESSED_BYTES` |
+| Workspace storage cap | 500 GB (decimal; admins uncapped — not a customer entitlement) | Workspace default | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES` | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
+| Max concurrent runs per workspace | **50** live (non-terminal) root runs by default; hard platform ceiling **200**. One more submit past the cap fails with `429 workspace_concurrency_exceeded` (see [Errors](errors.md)). Subagent children are governed separately by the per-lineage caps below. | Workspace default | Per-workspace override (contact support), clamped to the 200 ceiling | `WORKSPACE_DEFAULT_MAX_CONCURRENT_RUNS` / `WORKSPACE_MAX_CONCURRENT_RUNS_CEILING` |
+| Monthly workspace spend cap | **$250** per UTC calendar month by default; `0` = unlimited. A submit past the cap fails with `402 workspace_spend_cap_exceeded` (see [Errors](errors.md)). | Workspace default | Per-workspace override (contact support) | `WORKSPACE_DEFAULT_SPEND_CAP_USD` |
+| Skill bundle max compressed size (`.zip`) | 10 GiB (enforced at upload by the SDK and re-enforced server-side) | aex policy | No (hard ceiling) | `SKILL_BUNDLE_LIMITS.maxCompressedBytes` |
+| Skill bundle max decompressed size (sum of uncompressed file sizes) | 50 MB | aex policy | No (hard ceiling) | `SKILL_BUNDLE_LIMITS.maxDecompressedBytes` |
 | Skill bundle max file entries | 1,000 | Workspace default | Per-workspace (plan/env) | `WORKSPACE_SKILL_BUNDLE_MAX_FILES` |
 | Skill bundle max directory depth (`a/b/c/d` = 4) | 16 | Workspace default | Per-workspace (plan/env) | `WORKSPACE_SKILL_BUNDLE_MAX_DEPTH` |
 | Skill bundle max entry path length | 512 characters | Workspace default | No (hard ceiling) | `WORKSPACE_SKILL_BUNDLE_MAX_PATH_LENGTH` |
@@ -84,18 +85,33 @@ silently lost.
 ### Rate limits (per workspace, per minute)
-Default values; each is overridable per-plane via the matching
-`AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var.
+Run submission has its own platform-enforced velocity cap: **120 submits per
+minute** per workspace by default (`0` = disabled). Past it, `POST /runs` fails
+with `429 workspace_submit_rate_exceeded` (see [Errors](errors.md)). It is
+overridable per-plane via `AEX_WORKSPACE_SUBMIT_RATE_PER_MIN` or per-workspace
+via support.
+The dashboard mutation actions below default as listed; each is overridable
+per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var.
 | Action | Default per minute | Source | Constant |
 | --- | --- | --- | --- |
-| Run submit | 60 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 | Run cancel | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 | Run delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 | Signed output link | 120 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 | API token create | 10 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 | API token delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
+### Introspecting your effective caps
+`aex.whoami()` (CLI: `aex whoami`) returns a `limits` object carrying the
+workspace's *effective* values for the caps above — `maxConcurrentRuns`,
+`submitRatePerMinute`, `spendCapUsd`, plus the live `monthSpendUsd`,
+`balanceUsd`, `balanceGraceFloorUsd`, and `paymentMethodStatus` — resolved by
+the same code the admission gates use, so you can anticipate a `429`/`402`
+before submitting. See [Authentication](authentication.md) and
+[Errors](errors.md).
 ## Request Scope
 | Limit | Value | Source | Raisable? | Constant |

package/docs/limits.md CHANGED

@@ -16,13 +16,13 @@ For the current provider/model set, see the generated
 | Area | Default |
 | --- | --- |
-| Workspace storage | 50 GiB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
+| Workspace storage | 500 GB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
 ## Product Boundaries
 | Area | Boundary |
 | --- | --- |
-| Runtime | New submissions run on the managed runtime. There is no public runtime selector. |
+| Runtime | New submissions run on the managed runtime. The `runtime` option selects a managed machine-size preset (`Sizes.*`); there is no alternative runtime backend. |
 | Provider policy | Provider retention, training exclusion, HIPAA/BAA, data residency, abuse policy, and pricing belong to the selected provider account, endpoint, and contract. |
 | Secrets | Provider keys, MCP credentials, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
 | MCP servers | Remote MCP servers are customer-trusted systems. aex validates declarations and routes credentials; it does not make an untrusted MCP server safe. |