npm - @aexhq/sdk - Versions diffs - 0.35.0 → 0.37.0 - Mend

@aexhq/sdk 0.35.0 → 0.37.0

Files changed (72)

package/README.md +17 -16
package/dist/_contracts/event-envelope.d.ts +22 -1
package/dist/_contracts/event-envelope.js +26 -2
package/dist/_contracts/event-stream-client.js +7 -1
package/dist/_contracts/index.d.ts +3 -4
package/dist/_contracts/index.js +1 -4
package/dist/_contracts/operations.d.ts +31 -1
package/dist/_contracts/operations.js +64 -1
package/dist/_contracts/run-config.d.ts +2 -4
package/dist/_contracts/run-config.js +2 -7
package/dist/_contracts/run-trace.d.ts +0 -86
package/dist/_contracts/run-trace.js +1 -184
package/dist/_contracts/run-unit.d.ts +14 -25
package/dist/_contracts/run-unit.js +56 -2
package/dist/_contracts/runtime-manifest.d.ts +1 -1
package/dist/_contracts/runtime-security-profile.d.ts +0 -2
package/dist/_contracts/runtime-security-profile.js +0 -9
package/dist/_contracts/runtime-sizes.d.ts +2 -2
package/dist/_contracts/runtime-sizes.js +5 -5
package/dist/_contracts/runtime-types.d.ts +123 -4
package/dist/_contracts/stable.d.ts +1 -1
package/dist/_contracts/stable.js +1 -1
package/dist/_contracts/submission.d.ts +8 -76
package/dist/_contracts/submission.js +5 -472
package/dist/cli.mjs +574 -511
package/dist/cli.mjs.sha256 +1 -1
package/dist/client.d.ts +69 -25
package/dist/client.js +338 -68
package/dist/client.js.map +1 -1
package/dist/index.d.ts +8 -16
package/dist/index.js +5 -17
package/dist/index.js.map +1 -1
package/dist/secret.d.ts +2 -2
package/dist/secret.js +1 -1
package/dist/version.d.ts +1 -1
package/dist/version.js +1 -1
package/docs/authentication.md +92 -0
package/docs/billing.md +112 -0
package/docs/concepts/agent-tools.md +4 -4
package/docs/concepts/composition.md +8 -14
package/docs/concepts/providers-and-runtimes.md +4 -1
package/docs/concepts/runs.md +2 -1
package/docs/concepts/subagents.md +85 -0
package/docs/credentials.md +78 -96
package/docs/defaults.md +9 -15
package/docs/errors.md +132 -0
package/docs/events.md +44 -32
package/docs/limits-and-quotas.md +30 -17
package/docs/limits.md +4 -8
package/docs/mcp.md +5 -6
package/docs/networking.md +75 -59
package/docs/outputs.md +4 -7
package/docs/public-surface.json +4 -4
package/docs/quickstart.md +12 -13
package/docs/run-config.md +7 -4
package/docs/secrets.md +6 -1
package/docs/skills.md +3 -3
package/docs/vision-skills.md +52 -101
package/docs/webhooks.md +132 -0
package/examples/feature-tour.ts +4 -21
package/package.json +1 -1
package/dist/_contracts/proxy-protocol.d.ts +0 -305
package/dist/_contracts/proxy-protocol.js +0 -297
package/dist/_contracts/proxy-validation.d.ts +0 -19
package/dist/_contracts/proxy-validation.js +0 -51
package/dist/data-tools.d.ts +0 -82
package/dist/data-tools.js +0 -251
package/dist/data-tools.js.map +0 -1
package/dist/proxy-endpoint.d.ts +0 -131
package/dist/proxy-endpoint.js +0 -144
package/dist/proxy-endpoint.js.map +0 -1
package/examples/chat-corpus.ts +0 -84

package/docs/concepts/agent-tools.md CHANGED

@@ -12,7 +12,7 @@ Managed runs inject the complete builtin tool set into the agent by default:
 - `head`, `tail` — read bounded file slices
 - `web_fetch`, `web_search` — fetch a URL / managed web search
 - `todo_write` — maintain a todo list
-- `subagent`, `subagent_result` — delegate to and read back from child runs
+- `subagent`, `subagent_result` — delegate to and read back from child runs (see [Subagents](subagents.md))
 - `bash_output`, `bash_kill` — manage background bash jobs
 - `wait`, `git` — bounded idle-yield and first-class git
@@ -32,13 +32,13 @@ to pick a narrow subset alongside `includeBuiltinTools: false`.
 The final tool list is ordered: resolved builtin tools, then custom tools, then
 MCP tools.
-Networking is open by default: the agent may reach any public host, subject to a
-fixed SSRF deny-list. `web_fetch` and `web_search` reach the network over a
+Networking is open by default within the platform's managed egress ceiling and
+a fixed SSRF deny-list. `web_fetch` and `web_search` reach the network over a
 managed, SSRF-guarded path that is **not** governed by `environment.networking`,
 so their hosts never need to be listed in a `limited` allowlist. Setting
 `environment.networking.mode` to `limited` restricts only the agent's own
 arbitrary egress (e.g. a `curl` in `bash`); the built-in web tools keep working.
-See [Networking](../networking.md).
+See [Networking](../networking.md) for the full two-layer model.
 ## Disable builtins

package/docs/concepts/composition.md CHANGED

@@ -14,11 +14,11 @@ runtime before the first agent turn.
 | Agent instructions | `AgentsMd.fromPath`, `AgentsMd.fromContent` |
 | Reference files and folders | `File.fromPath`, `File.fromBytes` |
 | Remote tools | `McpServer.remote`, `McpServer.fromId` |
-| Credentialed HTTP APIs | `ProxyEndpoint.none`, `bearer`, `basic`, `header`, `query` |
 | Non-secret runtime settings | `environment.variables`, `environment.packages`, `environment.networking` |
+| Runtime secrets for your code | `Secret.value`, `Secret.ref`, `environment.secrets` |
 ```ts
-import { AgentsMd, File, McpServer, Models, ProxyEndpoint, Tools } from "@aexhq/sdk";
+import { AgentsMd, File, McpServer, Models, Secret, Tools } from "@aexhq/sdk";
 await aex.run({
   model: Models.CLAUDE_HAIKU_4_5,
@@ -27,20 +27,14 @@ await aex.run({
   files: [await File.fromPath("./input")],
   tools: [await Tools.fromSkillDir("./skills/report-writer", { name: "report-writer" })],
   mcpServers: [McpServer.remote({ name: "github", url: "https://example.com/mcp" })],
-  proxyEndpoints: [
-    ProxyEndpoint.bearer({
-      name: "internal-api",
-      baseUrl: "https://api.example.com",
-      token: process.env.INTERNAL_API_TOKEN!,
-      allowMethods: ["GET"],
-      allowPathPrefixes: ["/v1/"]
-    })
-  ],
+  environment: {
+    secrets: { INTERNAL_API_TOKEN: Secret.value(process.env.INTERNAL_API_TOKEN!) },
+    networking: { mode: "limited", allowedHosts: ["api.example.com"] }
+  },
   apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
 });
 ```
 Secrets stay out of reusable configs. Provider keys go in the top-level `apiKeys`
-map; MCP auth rides on each `McpServer` instance and proxy auth on each
-`ProxyEndpoint` instance — the SDK splits them into the vaulted secrets channel
-server-side, so they never live in a shareable config object.
+map; reusable or per-run values for your own code go in `environment.secrets`.
+Your code then makes normal HTTP calls with the standard client for that service.

package/docs/concepts/providers-and-runtimes.md CHANGED

@@ -17,7 +17,10 @@ aex exposes one submission shape across supported providers:
 | Doubao | `Providers.DOUBAO` |
 | Doubao China | `Providers.DOUBAO_CN` |
-All submissions run on the managed runtime. There is no public runtime selector; omit `runtime`.
+All submissions run on the managed runtime. The optional `runtime` option picks
+a managed machine-size preset — use `Sizes.*` in TypeScript (e.g.
+`runtime: Sizes.SHARED_0_25X_1GB`) or `--runtime-size` in the CLI. Omit it for
+the default size; there is no alternative runtime backend to select.
 ## Selection

package/docs/concepts/runs.md CHANGED

@@ -38,7 +38,8 @@ The same durable record backs SDK and CLI reads. From the handle use `refresh`,
 and `events().stream()` / `events().streamEnvelopes()` / `outputs().read(...)` for
 streaming and byte-capped reads) — to inspect the session live or after it parks;
 from the client, `aex.sessions.list()` / `aex.sessions.get(id)` read across the
-workspace.
+workspace (CLI mirrors: `aex sessions` and `aex runs` list the workspace's
+sessions/runs newest-first).
 Use `idempotencyKey` when retrying `openSession` or `send` from your own
 workflow. aex hashes the normalized non-secret submission, so a retry with the

package/docs/concepts/subagents.md ADDED

@@ -0,0 +1,85 @@
+---
+title: Subagents
+description: Delegate bounded sub-tasks to child agent runs with the subagent tool.
+icon: GitFork
+---
+A run can delegate bounded sub-tasks to **child agent runs** with the built-in
+`subagent` tool. Delegation is agent-driven: the model decides to fan work out,
+each child is a real run with its own record, and the parent collects results
+as the children finish. There is no client-side parent/child API — lineage is
+session-internal.
+## How the tool works
+The agent calls `subagent` with a `prompt` and a `model` (both required), plus
+optional `system`, `provider`, `runtimeSize`, `timeout`,
+`includeBuiltinTools`, `tools` (builtin-tool names for the child), `skills`,
+and `files`. The call is **always async**: on a successful spawn it returns
+immediately with the child run id, and the parent keeps working while the child
+runs. When a child settles, the parent is notified in its loop, and it reads
+the child's status and captured outputs on demand with the companion
+`subagent_result` tool.
+Children inherit the parent's vaulted BYOK provider keys server-side — the
+child submission carries no secrets. Include a provider key for every provider
+your subagents may use when you open the parent session (a parent holding no
+key for the child's provider gets a clear `parent_missing_provider_key` tool
+error). See [Credentials](../credentials.md).
+## Depth and breadth limits
+Delegation is bounded by two server-enforced lineage limits:
+| Limit | Value | Behavior at the limit |
+| --- | --- | --- |
+| Max depth | **5** — the root run is depth 0 and may spawn down to depth 5; a depth-5 run may not spawn further | The spawn is rejected with a `depth_exceeded` tool error (the parent keeps running). |
+| Concurrent children per lineage root | **1000** live (non-terminal) descendants by default; hard platform ceiling **4096** | Further spawns are refused until a child settles. |
+The whole descendant subtree of one root shares a single depth and breadth
+budget, enforced server-side at every level — a grandchild spawn counts against
+the same root budget as a direct child. Values are mirrored in
+[Limits & quotas](../limits-and-quotas.md).
+## Where children run: `in-process` vs `container`
+By default a child runs **in-process**: it executes as a sibling agent process
+inside the parent's own machine, sharing the parent's CPU, memory, and
+lifetime. This is the platform default shipped today.
+- **No extra runtime cost.** The parent's machine is the billable unit, so
+  in-process children bill **$0 of additional runtime** — fan-out is priced by
+  the parent box, however many children it hosts. (Model-token spend is still
+  whatever each child's provider calls cost on your BYOK key.)
+- **Shared capacity.** N children share the parent's fixed CPU/memory. For
+  large fan-outs, size the parent up (`runtime`) rather than assuming each
+  child gets its own machine.
+- **Joined lifecycle.** The parent's terminal waits for its in-process children,
+  and their results are folded into the parent's per-child accounting. Platform
+  recovery re-spawns in-process children exactly once if the parent's machine
+  is replaced mid-run — settled children are never re-run.
+The escape valve is `host: "container"`: the child is dispatched to its **own
+isolated machine** with its own runtime size and its own runtime billing, and
+the parent does not host it. Use it when a child needs guaranteed capacity,
+isolation from the parent's filesystem/CPU, or a different machine size than
+the parent can share.
+## Lineage and observability
+Every child — in-process or container — is a first-class run record:
+- The parent's transcript logs each spawn with the child's run id.
+- Each child has its own status, typed event timeline, and captured outputs,
+  readable by id like any other run (`aex.sessions.get(id)`, or the CLI's
+  `aex status` / `aex events` / `aex outputs` / `aex download`).
+- The child's outputs are handed back to the parent via `subagent_result`, and
+  they remain independently downloadable after the lineage finishes.
+## Bounding delegation
+- Turn delegation off for a run by cherry-picking builtins without `subagent`
+  (see [Agent tools](agent-tools.md)) or setting `includeBuiltinTools: false`.
+- A per-session spend cap (`overrides.maxSpendUsd`) bounds the parent's spend.
+- The depth/breadth limits above are platform defaults and are not settable
+  per-session today.
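
Reviewer sketch (not part of the package): the depth and breadth rules in the added `subagents.md` can be modeled as a small admission check. This is a minimal illustration under stated assumptions. `canSpawn`, `SpawnDecision`, and the `breadth_limit` label are hypothetical names, and the order in which the two limits are checked is a guess; only `depth_exceeded` and the values 5 and 1000 come from the doc.

```typescript
// Illustrative model of the documented lineage limits; not SDK or platform code.
const MAX_DEPTH = 5; // doc: root is depth 0, a depth-5 run may not spawn further
const DEFAULT_MAX_LIVE = 1000; // doc: default live descendants per lineage root

type SpawnDecision =
  | { ok: true; childDepth: number }
  // "breadth_limit" is a hypothetical label; the doc names no code for this case.
  | { ok: false; reason: "depth_exceeded" | "breadth_limit" };

// parentDepth: depth of the run asking to spawn (root = 0).
// liveDescendants: non-terminal descendants already counted against the root.
function canSpawn(
  parentDepth: number,
  liveDescendants: number,
  maxLive: number = DEFAULT_MAX_LIVE
): SpawnDecision {
  // Assumption: depth is checked before breadth.
  if (parentDepth >= MAX_DEPTH) {
    return { ok: false, reason: "depth_exceeded" };
  }
  if (liveDescendants >= maxLive) {
    return { ok: false, reason: "breadth_limit" };
  }
  return { ok: true, childDepth: parentDepth + 1 };
}
```

Because the budget is shared per lineage root, a grandchild spawn would pass the same `liveDescendants` count as a direct child of the root.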

package/docs/credentials.md CHANGED

@@ -4,142 +4,124 @@ title: Credentials
 # Credentials
-aex treats provider keys, MCP credentials, and proxy endpoint auth as per-run
-credentials. Reusable env secrets are documented separately in
-[Secrets](secrets.md).
+aex uses explicit, per-session credentials:
-The caller passes a workspace-scoped SDK token and the provider key inline on every `openSession` / `run` call. aex holds the bundle in run-scoped custody for the session lifecycle and attempts terminal cleanup/revocation for the aex-controlled references. MCP credentials and proxy endpoint auth values travel the same way.
+- `AEX_API_TOKEN` authenticates the SDK or CLI to aex.
+- `apiKeys` carries BYOK provider keys for the model provider.
+- `McpServer.remote(..., { headers })` carries MCP auth when a remote MCP server needs it.
+- `environment.secrets` carries runtime secrets for your own code.
-A session selects one upstream `provider` (default `anthropic`) and must carry a BYOK
-key for it. Keys are supplied per-provider so a session can also hold keys for the
-**other** providers its subagents may use:
+Secrets never belong in reusable run config, files, prompts, or examples.
-| Field | Required secret |
-| --- | --- |
-| Provider API keys | `apiKeys` (top-level, keyed by provider) |
+## The client credential
-```ts
-// The session's own provider key, plus extra keys its subagents can use.
-apiKeys: {
-  anthropic: process.env.ANTHROPIC_API_KEY!, // the session's provider
-  deepseek: process.env.DEEPSEEK_API_KEY!     // for a cross-provider subagent
-}
-```
+Pass your aex API token directly to the constructor — `new Aex(apiKey)` — or as
+the `apiKey` option. The older `apiToken` option remains accepted as a
+compatibility alias, so existing code keeps working:
-A `subagent` spawned with a different-family model **inherits the parent's keys
-server-side** from the session's vaulted bundle — the keys never transit the
-container. If the parent holds no key for the child's provider, the child is
-rejected with `parent_missing_provider_key`.
+```ts
+import { Aex } from "@aexhq/sdk";
-MCP credential types:
+const aex = new Aex(process.env.AEX_API_TOKEN!);          // preferred shorthand
+// equivalently:
+// const aex = new Aex({ apiKey: process.env.AEX_API_TOKEN! });
+// const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! }); // alias
+```
-- `static_bearer`;
-- `oauth_access_token`.
+See [Authentication](authentication.md) for how tokens are scoped, rotated, and
+issued during the beta.
-Unsupported:
+## Provider keys
-- arbitrary headers;
-- OAuth refresh;
-- persisted aex vault.
+A session selects one upstream provider and must carry a BYOK key for it. Include
+additional provider keys only when subagents may use those providers.
-For managed-runtime runs, aex injects the matching BYOK provider key at the hosted provider-proxy. Provider-side sessions and data remain subject to the selected provider account's retention and deletion policies.
+```ts
+const result = await aex.run({
+  model: Models.CLAUDE_HAIKU_4_5,
+  message: "Write a short report and save it as a file.",
+  apiKeys: {
+    anthropic: process.env.ANTHROPIC_API_KEY!
+  }
+});
+```
-## Proxy endpoints (per-run custom HTTP credentials)
+Provider keys are used by the managed runtime for model calls. They are not saved
+as client defaults.
-Some skills need to call non-MCP HTTP services (e.g. Stripe, internal APIs). Embedding the credential in the skill content puts the raw secret on disk in the agent container and in the model's context — both prompt-injection-readable.
+## Runtime secrets
-The platform's managed HTTP proxy is the agent-first alternative. Declare each endpoint with a `ProxyEndpoint.*` constructor: the instance carries the non-secret **policy** (hashed for idempotency) and its **auth token** together at the call site. The SDK splits the token into the vaulted secrets channel server-side (not hashed, so key rotation does not collapse onto a stale run), and the raw credential value never enters the container.
+Use `environment.secrets` for credentials your code needs at runtime. The value
+can be ephemeral with `Secret.value(...)` or a workspace secret reference with
+`Secret.ref(...)`.
 ```ts
-import { Aex, Models, ProxyEndpoint } from "@aexhq/sdk";
+import { Aex, Models, Secret } from "@aexhq/sdk";
-const aex = new Aex({
-  apiToken: "ant_..."
-});
-const stripe = ProxyEndpoint.bearer({
-  name: "stripe",
-  baseUrl: "https://api.stripe.com",
-  token: process.env.STRIPE_API_KEY!,
-  allowMethods: ["GET", "POST"],
-  allowPathPrefixes: ["/v1/charges", "/v1/refunds"],
-  maxRequestBytes: 65_536,
-  maxResponseBytes: 65_536,
-  timeoutMs: 10_000,
-  responseMode: "headers_only",
-  retry: {
-    maxAttempts: 3,
-    initialDelayMs: 250,
-    maxDelayMs: 5000,
-    jitter: "full",
-    retryOnStatuses: [408, 425, 429, 500, 502, 503, 504],
-    retryOnMethods: ["GET", "HEAD"],
-    respectRetryAfter: true
-  }
-});
+const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! });
 await aex.run({
   model: Models.CLAUDE_HAIKU_4_5,
-  message: "…",
-  proxyEndpoints: [stripe],
+  message: "Call https://api.example.com/v1/status with INTERNAL_API_TOKEN and summarize it.",
+  environment: {
+    secrets: {
+      INTERNAL_API_TOKEN: Secret.value(process.env.INTERNAL_API_TOKEN!)
+    },
+    networking: {
+      mode: "limited",
+      allowedHosts: ["api.example.com"]
+    }
+  },
   apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
 });
 ```
-The five constructors — `ProxyEndpoint.none` / `bearer` / `header` / `basic` / `query` — put the auth secret on the same call as the policy, so any drift (wrong `responseMode`, misnamed auth field) is a TypeScript error at the call site instead of an HTTP 400 a round-trip later.
-Inside the run container, every session has the platform CLI mounted at `/mnt/session/uploads/aex/aex` (a Bun-compatible ESM bundle) and a manifest at `/mnt/session/uploads/aex/index.json` describing the declared endpoints. The skill invokes the CLI through `bun` (the mount has no execute permission so direct invocation fails with `bad interpreter: Permission denied`):
+Inside the run, use normal HTTP code for the service:
 ```bash
-bun /mnt/session/uploads/aex/aex proxy stripe \
-  --method GET \
-  --path /v1/charges/ch_123 \
-  --response-mode headers_only
+curl -sS \
+  -H "Authorization: Bearer $INTERNAL_API_TOKEN" \
+  https://api.example.com/v1/status
 ```
-The CLI reads the per-run bearer from `/mnt/session/uploads/aex/run-token`, attaches the `X-Aex-Proxy-Protocol` header, and the hosted proxy injects the bearer/header/query/basic credential before dispatching the outbound call. Only the response (subject to `responseMode` and `maxResponseBytes`) reaches the container. `--response-mode` can only narrow below the policy ceiling.
+## Workspace secrets
-Retries are declaration-based. Add `retry` to the endpoint policy when safe for that upstream; runs without `retry` keep single-attempt behavior. `maxAttempts` counts the initial request, and defaults apply only when `retry` is present: `maxAttempts: 3`, `initialDelayMs: 250`, `maxDelayMs: 5000`, `jitter: "full"`, `retryOnStatuses: [408, 425, 429, 500, 502, 503, 504]`, `retryOnMethods: ["GET", "HEAD"]`, and `respectRetryAfter: true`. There are no per-call `aex proxy` retry flags.
+> **Availability note:** workspace-secret `Secret.ref(...)` injection requires
+> the next platform deploy — on the current hosted plane the referenced
+> variable can resolve empty inside the run. Per-run `Secret.value(...)`
+> secrets are unaffected.
-#### Keyless upstreams (`authShape: { type: "none" }`)
-For public APIs that take no credential (Wikimedia Commons, Internet Archive, Library of Congress, NASA Images, NARA, GDELT, etc.), declare the endpoint with `ProxyEndpoint.none(...)` — it produces only a declaration, no auth token:
+Store reusable values once, then reference them by name:
 ```ts
-import { ProxyEndpoint } from "@aexhq/sdk";
-const wikimedia = ProxyEndpoint.none({
-  name: "wikimedia",
-  baseUrl: "https://commons.wikimedia.org",
-  allowMethods: ["GET"],
-  allowPathPrefixes: ["/wiki/", "/w/api.php"]
+await aex.secrets.set({
+  name: "internal-api-token",
+  value: process.env.INTERNAL_API_TOKEN!
 });
 await aex.run({
   model: Models.CLAUDE_HAIKU_4_5,
-  message: "…",
-  proxyEndpoints: [wikimedia],
+  message: "Use INTERNAL_API_TOKEN for the status request.",
+  environment: {
+    secrets: {
+      INTERNAL_API_TOKEN: Secret.ref("internal-api-token")
+    }
+  },
   apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
 });
 ```
-The keyless endpoint still routes through the aex managed proxy: every call is allow-listed, audited, and redacted. The hosted proxy injects no `Authorization` header and no query-string credential.
+Secret reads return metadata only; they never return the stored value.
-`bun /mnt/session/uploads/aex/aex --help` reads endpoint details from `/mnt/session/uploads/aex/index.json`. Runs that do not declare any `proxyEndpoints` still have the CLI and an empty manifest mounted, so agents never need to introspect whether the surface exists.
+## Networking
-### Networking
-Networking is open by default. When a run explicitly uses `limited` networking,
-the platform host must appear in `allowed_hosts`. aex injects it
-automatically; for advance validation use:
-```ts
-const allowedHosts = buildPlatformAllowedHosts({
-  baseUrl: "https://api.aex.dev",
-  extraHosts: ["api.stripe.com"]
-});
-```
+Networking is open by default within the platform's managed egress ceiling. Use
+`environment.networking.mode: "limited"` with `allowedHosts` when you want a
+run's own code to reach only named hosts. See [Networking](networking.md) for
+the two-layer enforcement model.
-### Secrets are always explicit at the call site
+## Explicit call-site rule
-There is no `defaultSecrets` and no client-held secret state. Every `openSession` / `run` call carries its own credentials at the call site: the top-level `apiKeys` map (one provider key, plus any subagent keys), MCP auth on each `McpServer` instance, and proxy auth on each `ProxyEndpoint` instance. This is the agent-first invariant: the credentials being used on any given call are visible in the same code that opens the session.
+There is no `defaultSecrets` and no client-held secret state. Each
+`openSession(...)` or `run(...)` call should show the provider keys, MCP auth, and
+runtime secrets needed for that call.
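
Reviewer sketch (not part of the package): the new `credentials.md` shows the in-run call with `curl`. A TypeScript equivalent might look like the following. `buildStatusRequest`, `Env`, and `StatusRequest` are hypothetical names introduced for illustration, and it assumes the secret is exposed to run code as an environment variable, as the `curl` example implies. The empty-value guard reflects the availability note that a `Secret.ref(...)` variable can resolve empty on the current hosted plane.

```typescript
// Hypothetical helper for code running inside a session; not an SDK API.
type Env = Record<string, string | undefined>;

interface StatusRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds the request described in the doc's curl example from the run's environment.
function buildStatusRequest(env: Env): StatusRequest {
  const token = env["INTERNAL_API_TOKEN"];
  // Fail loudly instead of sending "Bearer " with an empty credential.
  if (!token) {
    throw new Error("INTERNAL_API_TOKEN is missing or empty inside the run");
  }
  return {
    url: "https://api.example.com/v1/status",
    headers: { Authorization: `Bearer ${token}` }
  };
}

// Usage inside the run (not executed here):
// const { url, headers } = buildStatusRequest(process.env);
// const res = await fetch(url, { headers });
```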

package/docs/defaults.md CHANGED

@@ -5,10 +5,9 @@ title: Defaults
 # Defaults
 These are the values aex applies when you **omit** the corresponding option on a
-run. Every value is mirrored from a single source-of-truth constant; the
-constant file is authoritative and this page is generated documentation, not a
-second source of truth. If a value here ever disagrees with that constant,
-the constant wins.
+run. Every value is mirrored from a single source-of-truth constant in the
+platform's limits module; this page is hand-maintained against those constants.
+If a value here ever disagrees with the constant, the constant wins.
 Each value below is named by its source-of-truth constant. The runtime-size
 presets are defined in the public
@@ -21,7 +20,7 @@ For the hard ceilings and who can raise them, see
 | Option | Default | How to override | Source |
 | --- | --- | --- | --- |
-| `timeout` (run deadline) | 1 hour | Per-session via `overrides.timeout` (e.g. `"30m"`, `"2h"`), clamped to the run-timeout floor/ceiling. | `RUN_DEFAULT_TIMEOUT_MS` |
+| `timeout` (run deadline) | 8 hours (also the ceiling) | Per-session via `overrides.timeout` (e.g. `"30m"`, `"2h"`), clamped to the run-timeout floor/ceiling. | `RUN_DEFAULT_TIMEOUT_MS` |
 | `runtime` (machine size) | `shared-0.25x-1gb` — 0.25 vCPU, 1 GB | Per-session via `runtime` (use `Sizes.*` in TypeScript). | `RUN_DEFAULT_RUNTIME_SIZE` |
 | `overrides.maxSpendUsd` (per-session spend cap) | None — no spend cap (the session is still bounded by its `timeout` and any workspace-level cap) | Per-session via `overrides.maxSpendUsd` (a positive USD amount); the session is stopped once its spend would exceed the cap. | — |
@@ -39,14 +38,6 @@ For the hard ceilings and who can raise them, see
 | MCP connect timeout (register + initialize + discover) | 30 seconds | Per-port via `connectTimeoutMs`. | `RUN_DEFAULT_MCP_CONNECT_TIMEOUT_MS` |
 | MCP `tools/call` timeout | 30 minutes | Per-port via `callTimeoutMs`. | `RUN_DEFAULT_MCP_CALL_TIMEOUT_MS` |
-## Proxy endpoints
-| Option | Default | How to override | Source |
-| --- | --- | --- | --- |
-| `maxRequestBytes` | 10 MiB | Per-endpoint via the endpoint's `maxRequestBytes`. | `REQUEST_PROXY_DEFAULT_MAX_REQUEST_BYTES` |
-| `maxResponseBytes` | `0` (unlimited — the response is streamed unbuffered) | Per-endpoint via the endpoint's `maxResponseBytes`. | `REQUEST_PROXY_DEFAULT_MAX_RESPONSE_BYTES` |
-| `timeoutMs` (upstream) | 5 minutes | Per-endpoint via the endpoint's `timeoutMs`. | `REQUEST_PROXY_DEFAULT_TIMEOUT_MS` |
 ## Links (signed URLs and tickets)
 | Option | Default | How to override | Source |
@@ -65,5 +56,8 @@ For the hard ceilings and who can raise them, see
 | Option | Default | How to override | Source |
 | --- | --- | --- | --- |
-| Per-workspace mutation rate limits (per minute) | run submit 60, run cancel 30, run delete 30, signed link 120, API token create 10, API token delete 30 | Per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var. | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
-| Workspace storage cap | 50 GiB | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES`; admin workspaces are uncapped (not a customer entitlement). | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |
+| Run submit rate (per minute) | 120 (`0` = disabled); past it `POST /runs` fails with `429 workspace_submit_rate_exceeded` | Per-plane via env `AEX_WORKSPACE_SUBMIT_RATE_PER_MIN`; per-workspace via support. | — |
+| Max concurrent runs | 50 live root runs (hard ceiling 200) | Per-workspace override via support, clamped to the ceiling. | `WORKSPACE_DEFAULT_MAX_CONCURRENT_RUNS` |
+| Monthly spend cap | $250 per UTC calendar month (`0` = unlimited) | Per-workspace override via support. | `WORKSPACE_DEFAULT_SPEND_CAP_USD` |
+| Per-workspace mutation rate limits (per minute) | run cancel 30, run delete 30, signed link 120, API token create 10, API token delete 30 | Per-plane via the matching `AEX_RATE_LIMIT_<ACTION>_PER_MINUTE` env var. | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
+| Workspace storage cap | 500 GB (decimal) | Per-plane via env `AEX_WORKSPACE_STORAGE_CAP_BYTES`; admin workspaces are uncapped (not a customer entitlement). | `WORKSPACE_DEFAULT_STORAGE_CAP_BYTES` |

package/docs/errors.md ADDED

@@ -0,0 +1,132 @@
+---
+title: Errors
+---
+# Errors
+Every API error is a JSON body with a machine-readable `error` code; most also
+carry a human `message` and the self-describing fields named below. The SDK
+surfaces non-2xx responses as `AexApiError` (with the parsed body attached) and
+throttling as `AexRateLimitError`.
+## 401 — authentication
+| Code | Meaning |
+| --- | --- |
+| `unauthorized` | Missing, invalid, or revoked bearer token. |
+Check the token value and that it has not been deleted. `aex whoami` is the
+cheapest way to validate a credential. See [Authentication](authentication.md).
+## 403 — authorization
+| Code | Meaning |
+| --- | --- |
+| `insufficient_scope` | The token is valid but lacks the route's required scope. The body's `requiredScope` field names the missing scope. |
+| `unknown_workspace` | The token does not route to a known workspace. |
+| `forbidden` | The authenticated workspace does not own the addressed resource. |
+```json
+{ "error": "insufficient_scope", "requiredScope": "runs:write" }
+```
+## 400 — validation
+| Code | Meaning |
+| --- | --- |
+| `bad_request` | Missing or unparseable request body. |
+| `invalid_submission` | The submission failed shape validation; `message` names the offending field. |
+| `missing_provider_key` | The submission names a provider but carries no BYOK key for it (`apiKeys[provider]`). |
+| `malformed_token` | The bearer value is not a structurally valid aex token. |
+400s are permanent for that request — fix the input rather than retrying.
+The SDK's client-side validation (`RunConfigValidationError`) catches most of
+these before the request is sent.
+## 402 — payment required
+Two distinct submit gates return 402; both bodies are self-describing.
+**`insufficient_balance`** — the workspace prepaid balance is at or below the
+effective submit floor. Top up the balance or bind a payment method.
+```json
+{
+  "error": "insufficient_balance",
+  "message": "Workspace balance is depleted; top up your prepaid balance or bind a payment method to submit runs.",
+  "balanceUsd": 0,
+  "balanceGraceFloorUsd": 0,
+  "paymentMethodStatus": "none",
+  "planKey": "default"
+}
+```
+`balanceGraceFloorUsd` is the payment-method-aware floor the gate compared
+against (`paymentMethodStatus: "active"` folds a bounded card overdraft into
+it, so the floor can be negative).
+**`workspace_spend_cap_exceeded`** — the workspace's monthly spend cap is
+reached. The cap resets at the start of the next UTC month; contact support to
+raise it.
+```json
+{
+  "error": "workspace_spend_cap_exceeded",
+  "message": "Monthly spend cap of $250 reached ($251.13 accrued this month). The cap resets at the start of the next UTC month; contact support to raise it.",
+  "capUsd": 250,
+  "accruedUsd": 251.13
+}
+```
+## 429 — rate limits
+**`workspace_concurrency_exceeded`** — admitting one more live run would exceed
+the workspace's concurrent-run cap. Wait for a run to finish, or contact
+support to raise the cap.
+```json
+{
+  "error": "workspace_concurrency_exceeded",
+  "message": "Workspace concurrency limit reached: 50 live runs at the cap of 50. Wait for a run to finish, or contact support to raise your workspace limit.",
+  "cap": 50,
+  "observed": 50
+}
+```
+**`workspace_submit_rate_exceeded`** — too many submits in the current
+one-minute window. Retry shortly.
+```json
+{
+  "error": "workspace_submit_rate_exceeded",
+  "message": "Submit rate limit of 120/minute exceeded. Retry shortly, or contact support to raise your workspace limit.",
+  "perMin": 120,
+  "observed": 121
+}
+```
+The `limit`-naming fields (`cap`/`perMin`) and the `observed` window value make
+each deny self-describing, so a client can back off proportionally. To
+anticipate both 429s and both 402s *before* submitting, read the effective caps
+from `aex.whoami().limits` — the values come from the same resolution code the
+gates enforce. See [Limits & quotas](limits-and-quotas.md).
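
Reviewer sketch (not part of the package): since the four submit denials above are self-describing, a client could branch on the `error` code with a discriminated union. The field names below are copied from the sample bodies in this hunk; the `DenyBody` and `Action` types, the action labels, and `nextAction` are hypothetical, and the real SDK may expose these bodies differently.

```typescript
// Shapes taken from the sample JSON bodies in the doc; only a subset of fields is modeled.
type DenyBody =
  | { error: "insufficient_balance"; balanceUsd: number; balanceGraceFloorUsd: number }
  | { error: "workspace_spend_cap_exceeded"; capUsd: number; accruedUsd: number }
  | { error: "workspace_concurrency_exceeded"; cap: number; observed: number }
  | { error: "workspace_submit_rate_exceeded"; perMin: number; observed: number };

// Hypothetical labels for what a caller might do next.
type Action = "top_up" | "wait_for_next_month" | "wait_for_run" | "retry_shortly";

function nextAction(body: DenyBody): Action {
  switch (body.error) {
    case "insufficient_balance":
      return "top_up"; // 402: top up or bind a payment method
    case "workspace_spend_cap_exceeded":
      return "wait_for_next_month"; // 402: cap resets at the next UTC month
    case "workspace_concurrency_exceeded":
      return "wait_for_run"; // 429: wait for a live run to finish
    case "workspace_submit_rate_exceeded":
      return "retry_shortly"; // 429: one-minute window
  }
}
```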
+## 404 — not found
+`not_found`: the id does not exist **or** belongs to another workspace (aex
+does not distinguish the two).
+## 5xx — server errors
+| Code | Meaning |
+| --- | --- |
+| `internal_error` (500) | Unexpected server fault. Retry with backoff; report persistent cases. |
+| `db_resuming` (503) | The database tier is resuming from idle. Transient — retry. |
+The SDK retries transient failures automatically: HTTP `429`, `5xx`, `529`, and
+network errors get bounded exponential backoff with full jitter, honoring any
+`Retry-After` header. Tune or disable this with the client `retry` option; use
+`isRateLimited(err)` / `AexRateLimitError` to handle persistent throttling
+without parsing raw bodies. Idempotent submit retries are safe — the SDK
+attaches a stable idempotency key to billable session create/send requests, so
+a retried request never double-submits.
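
Reviewer sketch (not part of the package): the closing paragraph of `errors.md` describes bounded exponential backoff with full jitter that honors `Retry-After`. One common way to compute such a delay is shown below. This is an illustration of the general technique, not the SDK's implementation. `backoffDelayMs` and `BackoffOptions` are hypothetical names, the option values in the usage note are illustrative rather than the SDK's defaults, and returning the `Retry-After` value unmodified is an assumption.

```typescript
interface BackoffOptions {
  initialDelayMs: number; // base delay before the first retry
  maxDelayMs: number; // upper bound on the exponential ceiling
}

// attempt: 1 for the first retry, 2 for the second, and so on.
// retryAfterSeconds: parsed Retry-After header value, if the server sent one.
// random: injectable source of numbers in [0, 1) so the function is testable.
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  retryAfterSeconds?: number,
  random: () => number = Math.random
): number {
  // Assumption: a server-provided Retry-After takes precedence over computed backoff.
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  // Exponential ceiling: initial * 2^(attempt - 1), bounded by maxDelayMs.
  const ceiling = Math.min(opts.maxDelayMs, opts.initialDelayMs * 2 ** (attempt - 1));
  // Full jitter: pick uniformly between 0 and the ceiling.
  return Math.floor(random() * ceiling);
}
```

With `{ initialDelayMs: 250, maxDelayMs: 5000 }`, the ceiling for the third retry is 1000 ms, and from the sixth retry onward it stays at 5000 ms.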