npm - @aexhq/sdk - Versions diffs - 0.34.0 → 0.36.0 - Mend

@aexhq/sdk 0.34.0 → 0.36.0

Files changed (63)

package/README.md +16 -15
package/dist/_contracts/index.d.ts +3 -4
package/dist/_contracts/index.js +1 -4
package/dist/_contracts/operations.d.ts +2 -1
package/dist/_contracts/operations.js +10 -0
package/dist/_contracts/run-config.d.ts +1 -3
package/dist/_contracts/run-config.js +2 -7
package/dist/_contracts/run-trace.d.ts +0 -86
package/dist/_contracts/run-trace.js +1 -184
package/dist/_contracts/run-unit.d.ts +2 -25
package/dist/_contracts/run-unit.js +1 -2
package/dist/_contracts/runtime-manifest.d.ts +1 -1
package/dist/_contracts/runtime-security-profile.d.ts +0 -2
package/dist/_contracts/runtime-security-profile.js +0 -9
package/dist/_contracts/runtime-types.d.ts +25 -4
package/dist/_contracts/stable.d.ts +1 -1
package/dist/_contracts/stable.js +1 -1
package/dist/_contracts/submission.d.ts +62 -95
package/dist/_contracts/submission.js +59 -482
package/dist/cli.mjs +99 -442
package/dist/cli.mjs.sha256 +1 -1
package/dist/client.d.ts +49 -25
package/dist/client.js +341 -70
package/dist/client.js.map +1 -1
package/dist/index.d.ts +9 -15
package/dist/index.js +11 -17
package/dist/index.js.map +1 -1
package/dist/retry.d.ts +162 -0
package/dist/retry.js +320 -0
package/dist/retry.js.map +1 -0
package/dist/secret.d.ts +2 -2
package/dist/secret.js +1 -1
package/dist/version.d.ts +1 -1
package/dist/version.js +1 -1
package/docs/concepts/composition.md +8 -14
package/docs/credentials.md +59 -101
package/docs/defaults.md +0 -8
package/docs/events.md +8 -9
package/docs/limits-and-quotas.md +1 -4
package/docs/limits.md +2 -6
package/docs/mcp.md +4 -5
package/docs/networking.md +6 -16
package/docs/outputs.md +0 -4
package/docs/public-surface.json +3 -3
package/docs/quickstart.md +3 -7
package/docs/retries.md +129 -0
package/docs/run-config.md +6 -3
package/docs/secrets.md +1 -1
package/docs/skills.md +3 -3
package/docs/vision-skills.md +52 -101
package/examples/feature-tour.ts +284 -0
package/package.json +1 -1
package/dist/_contracts/proxy-protocol.d.ts +0 -305
package/dist/_contracts/proxy-protocol.js +0 -297
package/dist/_contracts/proxy-validation.d.ts +0 -19
package/dist/_contracts/proxy-validation.js +0 -51
package/dist/data-tools.d.ts +0 -82
package/dist/data-tools.js +0 -251
package/dist/data-tools.js.map +0 -1
package/dist/proxy-endpoint.d.ts +0 -131
package/dist/proxy-endpoint.js +0 -144
package/dist/proxy-endpoint.js.map +0 -1
package/examples/chat-corpus.ts +0 -84

package/docs/events.md CHANGED

@@ -53,9 +53,9 @@ for the string:
 const lastText = (await session.messages().last())?.text;
 ```
-`decodeAssistantText`, `textOf`, and `summarizeRunTrace` remain exported as the
-power-user escape hatch over a raw `RunEvent` list, but "get the last message"
-is now `await session.messages().last()`.
+Prefer `session.messages().list()` or the collected `result.messages` /
+`result.text` fields for assistant text. Low-level event helpers remain exported
+for callers that build custom collectors.
 The CLI mirrors the same surface:
@@ -162,7 +162,7 @@ const jsonl = await response.text();
 ## Event shape
-Events are typed as the discriminated `RunEvent` union for compatibility and as the versioned coordinator envelope for live consumers. aex records raw runtime/provider payloads **after** secret redaction and structural sanitization, so the bytes you see never contain the provider key, MCP credentials, or proxy bearer that were supplied when the session was opened.
+Events are typed as the discriminated `RunEvent` union for compatibility and as the versioned coordinator envelope for live consumers. aex records raw runtime/provider payloads **after** secret redaction and structural sanitization, so the bytes you see never contain provider keys, MCP credentials, or runtime secrets supplied when the session was opened.
 ## Typed helpers
@@ -180,8 +180,7 @@ import {
   isToolCallResult,
   isCustom,
   isLog,
-  isEventChannel,
-  textOf
+  isEventChannel
 } from "@aexhq/sdk";
 ```
@@ -191,6 +190,6 @@ All guards test the `type` discriminant at runtime. `isTextMessage`,
 `event.data` to the fields that event type carries — e.g. inside
 `if (isTextMessage(e))`, `e.data.text` is typed `string`. The lifecycle/channel
 guards (`isRunStarted`, `isRunError`, `isCustom`, `isLog`, …) operate on the
-coordinator envelope and narrow only the discriminant. `textOf(events)` returns
-the run's final assistant text concatenated from the `TEXT_MESSAGE_CONTENT`
-blocks.
+coordinator envelope and narrow only the discriminant. Use `result.text` or
+`session.messages.all()` when you need assistant text without inspecting the
+event stream directly.
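
Review note: the guard behaviour described in this hunk (test the `type` discriminant at runtime, narrow `event.data` inside the branch) can be sketched as below. This is a minimal illustration, not the SDK's implementation. The event shapes, `isTextMessageSketch`, and `collectText` are hypothetical names; only the `TEXT_MESSAGE_CONTENT` discriminant is taken from the removed text, and the real `RunEvent` union may differ.

```ts
// Hypothetical event shapes for illustration only; the real RunEvent union
// exported by @aexhq/sdk may carry different fields.
type TextMessageEvent = { type: "TEXT_MESSAGE_CONTENT"; data: { text: string } };
type RunStartedEvent = { type: "RUN_STARTED" }; // assumed lifecycle event name
type SketchEvent = TextMessageEvent | RunStartedEvent;

// User-defined type guard: checks the discriminant at runtime and narrows
// the static type to the variant that carries `data.text`.
function isTextMessageSketch(e: SketchEvent): e is TextMessageEvent {
  return e.type === "TEXT_MESSAGE_CONTENT";
}

// A small custom collector: concatenates the text of every text event.
function collectText(events: SketchEvent[]): string {
  let out = "";
  for (const e of events) {
    if (isTextMessageSketch(e)) {
      out += e.data.text; // `e.data.text` is typed `string` inside this branch
    }
  }
  return out;
}
```

A collector like this is only needed when building custom event handling; for plain assistant text the changed docs point to `result.text` instead.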

package/docs/limits-and-quotas.md CHANGED

@@ -96,12 +96,9 @@ Default values; each is overridable per-plane via the matching
 | API token create | 10 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
 | API token delete | 30 | Workspace default | `WORKSPACE_RATE_LIMIT_DEFAULTS` |
-## Request scope (proxy and egress)
+## Request Scope
 | Limit | Value | Source | Raisable? | Constant |
 | --- | --- | --- | --- | --- |
-| Proxy request body | 10 MiB | aex policy | Per-endpoint via `maxRequestBytes` | `REQUEST_PROXY_DEFAULT_MAX_REQUEST_BYTES` |
-| Proxy response body | `0` = unlimited (streamed unbuffered) | aex policy | Per-endpoint via `maxResponseBytes` | `REQUEST_PROXY_DEFAULT_MAX_RESPONSE_BYTES` |
-| Proxy upstream timeout | 5 minutes | aex policy | Per-endpoint via `timeoutMs` | `REQUEST_PROXY_DEFAULT_TIMEOUT_MS` |
 | Signed output URL TTL | 300 seconds | aex policy | Per-call via `expiresSeconds` | `REQUEST_PRESIGN_URL_DEFAULT_TTL_SECONDS` |
 | Event-stream connection ticket TTL | 60 seconds | aex policy | Per-mint via `ttlMs` | `REQUEST_TICKET_DEFAULT_TTL_MS` |

package/docs/limits.md CHANGED

@@ -17,9 +17,6 @@ For the current provider/model set, see the generated
 | Area | Default |
 | --- | --- |
 | Workspace storage | 50 GiB per workspace for captured outputs and workspace artifacts. aex-maintainer admin workspaces may be unlimited for internal dogfooding; this is not a customer entitlement. |
-| Proxy request body | 10 MiB per proxy endpoint unless the endpoint declares a different `maxRequestBytes`. |
-| Proxy timeout | 5 minutes per proxy endpoint unless the endpoint declares a different `timeoutMs`. |
-| Proxy telemetry | Proxy calls emit report-only usage telemetry for call count, failed calls, request bytes, response bytes when known, and duration. Public proxy pricing is not shipped unless documented later. |
 ## Product Boundaries
@@ -27,14 +24,13 @@ For the current provider/model set, see the generated
 | --- | --- |
 | Runtime | New submissions run on the managed runtime. There is no public runtime selector. |
 | Provider policy | Provider retention, training exclusion, HIPAA/BAA, data residency, abuse policy, and pricing belong to the selected provider account, endpoint, and contract. |
-| Secrets | Provider keys, MCP credentials, proxy auth, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
+| Secrets | Provider keys, MCP credentials, and env secrets are caller-owned. aex excludes secret values from idempotency and uses the explicit secret surfaces described in [Secrets](secrets.md). |
 | MCP servers | Remote MCP servers are customer-trusted systems. aex validates declarations and routes credentials; it does not make an untrusted MCP server safe. |
-| Proxy endpoints | The proxy enforces declared host/path/method/auth policy for calls routed through it. Upstream side effects and data handling remain with the upstream service and customer. |
 | Outputs | Captured outputs, events, and metadata are stored under the run record and downloaded through auth-gated routes. Output content is customer content. |
 | Human review | Runs execute after submission. Cancellation is available, but aex does not pause a run for platform-mediated approval or interactive clarification. |
 | Sessions | The durable product primitive is the session/run record. Sessions can be resumed by id and auto-suspend after the configured idle window; persistent named agent profiles and saved agent definitions are out of scope. |
 | Deployment | The supported product is the hosted aex service plus the SDK and CLI. Alternate `baseUrl` values are for local, staging, or hosted aex API planes, not a self-host product promise. |
-| Cost | BYOK provider-token charges accrue to the customer's provider account. aex records report-only telemetry for runtime, storage, and proxy usage; free trials, billing-grade invoices, and public pricing documents are not shipped unless documented later. |
+| Cost | BYOK provider-token charges accrue to the customer's provider account. aex records report-only telemetry for runtime and storage usage; free trials, billing-grade invoices, and public pricing documents are not shipped unless documented later. |
 ## Provider Policy Links

package/docs/mcp.md CHANGED

@@ -25,14 +25,13 @@ server, so we cannot elide MCP responses or write them to the session
 filesystem on the user's behalf. Anything an MCP tool returns lands
 directly in the model's context.
-For ingestion-style tools that return large JSON blobs (search results,
-catalogue dumps, bulk reads), use the **CLI-as-skill + managed proxy**
-pattern instead of MCP:
+For ingestion-style MCP servers that return large JSON blobs (search results,
+catalogue dumps, bulk reads), prefer a skill that writes files instead of
+putting the whole response in model context:
 1. Package the upstream as a skill-tool (`Tools.fromSkillDir` /
    `Tools.fromSkillUrl`) — a CLI binary the agent invokes with its bash tool.
-2. Route every upstream HTTPS call through a per-run `ProxyEndpoint`
-   (audit, byte caps, budget enforcement).
+2. Keep any upstream HTTPS credentials in `environment.secrets`.
 3. Have the CLI write the full payload to the session filesystem. By default,
    files it creates or modifies are captured automatically; pass
    `outputs.allowedDirs` only when you want to narrow capture to specific roots.

package/docs/networking.md CHANGED

@@ -26,7 +26,6 @@ These reach the network over managed paths and are **not** subject to
 - The model / provider call for the run (and its subagents).
 - The built-in `web_search` and `web_fetch` tools (still SSRF-guarded).
 - Any remote MCP servers you declare in `mcpServers` — see [MCP](mcp.md).
-- Any `proxyEndpoints` you declare — see [Credentials](credentials.md).
 - The package registries for any `environment.packages` you declare (pip → PyPI,
   apt → the distribution mirrors). Declaring a package implicitly allows the
   registry it installs from.
@@ -70,17 +69,8 @@ non-default port when you need one (`api.example.com:8443`); a bare host name
 covers HTTPS on 443. Matching is exact per host — it is not a wildcard or suffix
 match, so list each host you need.
-To validate your allowlist before submitting, `buildPlatformAllowedHosts` returns
-the host set the platform will enforce given a base URL plus your extra hosts:
-```ts
-import { buildPlatformAllowedHosts } from "@aexhq/sdk";
-const allowedHosts = buildPlatformAllowedHosts({
-  baseUrl: "https://api.aex.dev",
-  extraHosts: ["api.example.com"]
-});
-```
+Keep the allowlist in your session options so the submitted network policy is
+visible at the same call site as the code that needs it.
 ## Open mode
@@ -135,7 +125,7 @@ your client succeeds without extra setup.
 - **`allowedHosts` only applies in `limited` mode.** It is ignored in `open`
   mode, where the SSRF deny-list is the only gate.
-For routing credentialed HTTP calls through the managed proxy without putting the
-secret in the container, use proxy endpoints — see
-[Credentials](credentials.md). For remote tool servers, see [MCP](mcp.md). For
-the full set of run-config fields, see [Run configuration](run-config.md).
+For credentialed HTTP calls, pass the credential as an `environment.secrets`
+entry and let your code use its normal HTTP client. For remote tool servers, see
+[MCP](mcp.md). For the full set of run-config fields, see
+[Run configuration](run-config.md).
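
Review note: with `buildPlatformAllowedHosts` removed from the docs, the matching rule stated in the unchanged context (exact per host, a bare host name covers HTTPS on 443, `host:port` for a non-default port) can be checked locally with a sketch like the one below. It is an approximation under stated assumptions, not the platform's enforcement code: `normalizeHostEntry` and `isAllowedSketch` are hypothetical helpers, rejecting non-HTTPS URLs is an assumption, and IPv6 literals are not handled.

```ts
// Normalize an allowlist entry to "host:port". A bare host defaults to 443.
// Assumption: entries are plain host names or "host:port" (no IPv6 literals).
function normalizeHostEntry(entry: string): string {
  const trimmed = entry.trim().toLowerCase();
  return trimmed.indexOf(":") === -1 ? trimmed + ":443" : trimmed;
}

// Exact match per host and port: no wildcard and no suffix matching.
// Assumption: only https:// URLs are considered by this sketch.
function isAllowedSketch(url: string, allowedHosts: string[]): boolean {
  const parsed = new URL(url);
  if (parsed.protocol !== "https:") return false;
  // URL.port is "" when the URL uses the scheme's default port.
  const target = parsed.hostname + ":" + (parsed.port || "443");
  return allowedHosts.some((entry) => normalizeHostEntry(entry) === target);
}
```

Under these assumptions, `api.example.com` in the list does not cover `sub.api.example.com` or `api.example.com:8443`; each needs its own entry.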

package/docs/outputs.md CHANGED

@@ -100,10 +100,6 @@ if (truncated) {
 Check `truncated` before treating `text` as complete. Pass `options.grep` (a substring or `RegExp`) to keep only matching lines of the capped text. The returned `output` is the matched `Output` record, and `totalBytes` is the file's full size when the server reports it.
-### Chatting over a workspace's outputs
-`createDataTools(client)` packages the read surface (`sessions.list` + `sessions.outputs(id).list` + `sessions.outputs(id).read`) as a vendor-neutral LLM tool set (`{ tools, instructions, execute }`) so you can build a search-then-fetch chat over your sessions and their outputs in a few lines on top of the public SDK. The `tools` are plain JSON-Schema definitions (the shape every major LLM tool API accepts); `execute(name, input)` dispatches a tool call against the workspace-scoped client. See the runnable `examples/data-chat/` example.
 ## Finding outputs
 `session.outputs().list(query?)` can filter the captured output list client-side. Use `session.outputs().find(query)` when you want discovery to be explicit, or `session.outputs().findOne(query)` when exactly one file is expected:

package/docs/public-surface.json CHANGED

@@ -2,7 +2,7 @@
   "brand": "aex",
   "productName": "Agent Executor",
   "oneLine": "aex is an agent execution platform for launching autonomous agents from a simple TypeScript SDK and CLI.",
-  "description": "Open durable agent sessions, send turns, stream events, capture outputs, and compose agents with skills, files, MCP, proxy endpoints, and subagents across the managed runtime.",
+  "description": "Open durable agent sessions, send turns, stream events, capture outputs, and compose agents with skills, files, MCP, secrets, networking controls, and subagents across the managed runtime.",
   "alpha": {
     "label": "Alpha testing",
     "description": "Access is limited to invited testers while we harden the hosted runtime, dashboard, and SDK workflows."
@@ -61,7 +61,7 @@
       "slug": "agent-composition",
       "href": "/docs/features/#agent-composition",
       "title": "Agent composition",
-      "description": "Skills, files, AGENTS.md, remote MCP servers, proxy endpoints, environment variables, packages, and networking controls."
+      "description": "Skills, files, AGENTS.md, remote MCP servers, environment variables, packages, secrets, and networking controls."
     },
     {
       "slug": "subagents",
@@ -79,7 +79,7 @@
       "slug": "typed-control-surface",
       "href": "/docs/features/#typed-control-surface",
       "title": "Typed control surface",
-      "description": "Strongly typed SDK inputs, CLI parity, BYOK secrets, scoped proxy auth, redaction, and output modes."
+      "description": "Strongly typed SDK inputs, CLI parity, BYOK provider keys, workspace secrets, redaction, and output modes."
     }
   ]
 }

package/docs/quickstart.md CHANGED

@@ -83,11 +83,8 @@ for await (const event of turn) {
 }
 await turn.done();
-// Reads/streams/downloads are grouped into accessor sub-resources:
-// session.messages() / events() / outputs() / webhooks(). Grab the last
-// assistant message (an AssistantTextEntry; use ?.text for the string).
-const lastText = (await session.messages().last())?.text;
-console.log(lastText);
+const messages = await session.messages().list();
+console.log(messages.at(-1)?.text);
 // Poll the record until the session parks (idle / suspended / error).
 const record = await session.wait();
@@ -110,8 +107,7 @@ aex run \
 ## Add capabilities
-- Add files, skills, AGENTS.md, MCP servers, proxy endpoints, packages, and networking controls with [Composition](concepts/composition.md).
-- Inspect runtime tools with [Agent tools](concepts/agent-tools.md).
+- Add files, skills, AGENTS.md, MCP servers, packages, and networking controls with [Composition](concepts/composition.md).
 - Use parent/child run delegation from the [Features](https://aex.dev/docs/features/#subagents) page.
 - Narrow output capture or download individual files with [Outputs](outputs.md).
 - Check supported providers and models in the [provider/runtime capability matrix](provider-runtime-capabilities.md).

package/docs/retries.md ADDED

@@ -0,0 +1,129 @@
+---
+title: Retries and throttling
+---
+# Retries and throttling
+The SDK ships with built-in transport resilience. Every request it makes to the
+aex API is automatically retried on **transient** failures with bounded
+exponential backoff and jitter, honoring the server's `Retry-After` header. You
+get this by default — no wrapper code — and it is safe to leave on because the
+billable submits carry a stable idempotency key, so a retry never creates a
+duplicate run.
+## What gets retried
+Retried automatically:
+- HTTP `429` (rate limited)
+- HTTP `500`, `502`, `503`, `504` (server hiccups)
+- HTTP `529` (upstream provider overloaded)
+- Network errors (connection reset, DNS failure, timeout)
+Never retried — these fail fast so you see the real problem immediately:
+- `400` / `422` (bad request), `401` / `403` (auth), `404` (not found),
+  `409` (conflict), and every other non-transient `4xx`.
+- A request you aborted yourself (via an `AbortSignal`).
+## Tuning or disabling
+Pass a `retry` option when you construct the client:
+```ts
+import { Aex } from "@aexhq/sdk";
+const aex = new Aex({
+  apiToken: process.env.AEX_API_TOKEN!,
+  retry: {
+    maxAttempts: 4,        // total tries incl. the first (default 4)
+    initialDelayMs: 500,   // base backoff, doubles per retry (default 500)
+    maxDelayMs: 20_000,    // cap on any single wait (default 20s)
+    maxElapsedMs: 120_000  // overall wall-clock budget (default 2m)
+  }
+});
+```
+Turn it off entirely with `retry: false`, or make a single attempt with
+`retry: { maxAttempts: 1 }`.
+## Idempotent by construction
+Retries — whether the built-in transport retry or your own re-invocation of
+`run(...)` — never double-bill. The one-shot `run(...)` and `sessions.run(...)`
+derive the turn's idempotency key from the session-create key, so re-invoking
+either with the same `idempotencyKey` de-duplicates **both** the session create
+and the billable turn server-side:
+```ts
+// A retried call with the same idempotencyKey resolves to the same run,
+// not a second billable one.
+const result = await aex.run({
+  model: "claude-haiku-4-5",
+  message: "Write a short report and save it as a file.",
+  apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! },
+  idempotencyKey: "report-2026-07-01"
+});
+```
+## Replaying a throttled turn
+When a turn on a live session is interrupted by a throttle, replay the last
+message with `session.replayLast()`. It reuses the previous message's idempotency
+key by default, so if the original turn actually landed it de-duplicates instead
+of billing twice:
+```ts
+const session = await aex.openSession({
+  model: "claude-haiku-4-5",
+  apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
+});
+try {
+  await session.send("Summarize the attached dataset.").done();
+} catch (err) {
+  const { isRateLimited } = await import("@aexhq/sdk");
+  if (isRateLimited(err)) {
+    // Wait out the throttle, then replay the same message.
+    await new Promise((r) => setTimeout(r, err.retryAfterMs ?? 2_000));
+    await session.replayLast().done();
+  } else {
+    throw err;
+  }
+}
+```
+Pass a fresh key (`session.replayLast({ idempotencyKey: "..." })`) when you
+deliberately want a brand-new turn instead of a de-duplicated replay.
+## The throttle error
+When retries are exhausted on a rate-limit / overloaded status, the SDK throws an
+`AexRateLimitError`. It extends `AexApiError`, so existing `catch` sites keep
+working, and it carries structured, non-leaky detail:
+```ts
+import { isRateLimited } from "@aexhq/sdk";
+try {
+  await aex.run({ /* … */ });
+} catch (err) {
+  if (isRateLimited(err)) {
+    err.status;         // 429 | 503 | 529
+    err.attempts;       // how many tries were made
+    err.retryAfterMs;   // suggested wait, when the server supplied one
+    err.source;         // "api" (aex plane) or "provider" (upstream model)
+    err.providerFault;  // upstream fault detail, when the model provider throttled
+  }
+}
+```
+The `message` is a fixed summary (e.g. `aex API rate limit reached (HTTP 429)
+after 4 attempts; retry after ~2s`) — it never echoes the raw response body,
+which stays available, redacted, on `err.body`.
+When the throttle originated at the upstream model provider (rather than the aex
+API plane), `err.source` is `"provider"` and `err.providerFault` describes it:
+its `kind` (`rate_limit` / `overloaded` / `quota_exceeded` / `provider_error`),
+the upstream `status`, and a suggested `retryAfterMs`. Use `parseProviderFault`
+to read the same shape off a raw fault value yourself.
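
Review note: the retry policy this new page describes (retry only transient statuses, exponential backoff with jitter, a cap on any single wait, and honoring `Retry-After`) can be approximated by the sketch below. It is not the contents of `dist/retry.js`. `isTransientStatus` and `backoffDelayMs` are hypothetical names, the "full jitter" strategy is an assumption, and so is capping a server-supplied `Retry-After` at `maxDelayMs`. The status list and default values are copied from the added docs.

```ts
// Statuses the added docs list as retried automatically.
const TRANSIENT_STATUSES = [429, 500, 502, 503, 504, 529];

function isTransientStatus(status: number): boolean {
  return TRANSIENT_STATUSES.indexOf(status) !== -1;
}

interface BackoffSketchOptions {
  initialDelayMs: number; // base backoff, doubles per retry
  maxDelayMs: number;     // cap on any single wait
}

// retryIndex is 0 for the first retry, 1 for the second, and so on.
// `random` is injectable so the jitter can be tested deterministically.
function backoffDelayMs(
  retryIndex: number,
  opts: BackoffSketchOptions,
  retryAfterMs?: number,
  random: () => number = Math.random
): number {
  // Assumption: a server-supplied Retry-After wins, but is still capped.
  if (retryAfterMs !== undefined) {
    return Math.min(retryAfterMs, opts.maxDelayMs);
  }
  // Exponential ceiling: initialDelayMs * 2^retryIndex, capped at maxDelayMs.
  const ceiling = Math.min(opts.initialDelayMs * Math.pow(2, retryIndex), opts.maxDelayMs);
  // Assumed "full jitter": pick a uniform delay in [0, ceiling).
  return Math.floor(random() * ceiling);
}
```

With the documented defaults (`initialDelayMs: 500`, `maxDelayMs: 20_000`), the ceilings for successive retries in this sketch are 500 ms, 1 s, 2 s, and so on, until they reach the 20 s cap.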

package/docs/run-config.md CHANGED

@@ -13,13 +13,16 @@ Allowed fields:
 - `mcpServers` - array of `McpServerRef`; headers are split into the vaulted secrets channel server-side.
 - `environment` - `{ networking?, packages?, variables? }`. Networking is open by default; set `networking.mode` to `limited` only when you want an allowlist. `variables` are merged into the in-container `RUNTIME.env` / `RUNTIME.json` mounts. (Run secrets go in `environment.secrets`, which carries live `Secret` instances and is not part of a shareable config.)
 - `runtime` - optional managed-runtime preset. Prefer `Sizes` in TypeScript.
-- `proxyEndpoints` - array of `ProxyEndpoint` instances; endpoint-level `retry` is allowed here and remains declaration-based.
 - `metadata` - non-secret structured metadata.
 - `overrides` - `{ idleTtl?, timeout?, maxSpendUsd? }`. `timeout` is an optional session deadline (e.g. `"30m"`, `"2h"`); `maxSpendUsd` stops the session once its spend would exceed the cap (see [Limits & quotas](limits-and-quotas.md)).
 `message` (the one-shot `run` input), `agentsMd`, `files`, `outputs`, `tools`, `includeBuiltinTools`, and `outputMode` are `openSession` / `run` options, not reusable run-config fields. They carry the turn input, bytes, capture behavior, or agent tool/output controls that belong on a concrete call. Skill bundles are `tools` entries built with `Tools.fromSkillDir(...)` / `Tools.fromSkillUrl(...)`, so they too are SDK-code options rather than config fields. Subagents run in-process; there is no `limits` / `parentRunId` option.
-Secrets never live in run config. Pass provider keys through the top-level `apiKeys` map (and run secrets through `environment.secrets`) in the SDK, or the equivalent host-mode flags (`--anthropic-api-key`, `--mcp-auth`, `--proxy-auth`) in the CLI. See [Secrets](secrets.md) for secret lifecycles and [Credentials](credentials.md) for the proxy endpoint policy/auth split and retry fields.
+Secrets never live in run config. Pass provider keys through the top-level
+`apiKeys` map and runtime secrets through `environment.secrets` in the SDK, or
+the equivalent host-mode flags (`--anthropic-api-key`, `--mcp-auth`) in the CLI.
+See [Secrets](secrets.md) for secret lifecycles and [Credentials](credentials.md)
+for credential handling.
 ## Reuse in code
@@ -52,4 +55,4 @@ aex run --config ./run.json \
   --anthropic-api-key "$ANTHROPIC_API_KEY"
 ```
-...or as explicit flags (`--model`, `--system`, `--prompt`, `--mcp`, `--mcp-auth`, `--runtime-size`, `--run-timeout`, `--proxy-endpoint`, `--proxy-auth`, `--metadata`). The two modes are mutually exclusive.
+...or as explicit flags (`--model`, `--system`, `--prompt`, `--mcp`, `--mcp-auth`, `--runtime-size`, `--run-timeout`, `--metadata`). The two modes are mutually exclusive.

package/docs/secrets.md CHANGED

@@ -111,5 +111,5 @@ await aex.run({
 await aex.secrets.delete("serper-api-key");
 ```
-The CLI supports per-run provider, MCP, and proxy credentials. Workspace secret
+The CLI supports per-run provider and MCP credentials. Workspace secret
 administration is exposed through the SDK.

package/docs/skills.md CHANGED

@@ -72,9 +72,9 @@ files into the workspace under `/workspace/skills/<name>/`. So the `SKILL.md` bo
 and every supporting file are on disk from the first turn; the load-tool call is
 how that body enters the model's context, not how the files get written.
-The platform also mounts the `aex` CLI and a per-run manifest into every run.
-Skills call managed HTTP proxy endpoints through the mounted CLI
-(`aex proxy ...`); see [Credentials](credentials.md) for the policy and auth model.
+Skills that call external HTTP APIs should read credentials from
+`environment.secrets` and use the normal client for that service. See
+[Credentials](credentials.md) for the secret model.
 Run-scoped asset copies are part of the run record and are removed by run deletion
 or retention cleanup.

package/docs/vision-skills.md CHANGED

@@ -1,73 +1,57 @@
 ---
-title: Call a vision (or any model) API from a skill
+title: Call a vision API from a skill
 ---
-# Call a vision (or any model) API from a skill
+# Call a vision API from a skill
-aex has no built-in vision tool. The agent's `provider`/`model` selects the
-*reasoning* model — it is not an endpoint a skill can POST an image to mid-run.
-To give a run image understanding (or to call any other model/HTTP API), ship a
-**skill** that POSTs to the provider's OpenAI-compatible endpoint **through the
-managed proxy**, with the key supplied on a `ProxyEndpoint.bearer(...)` instance.
-The raw key never enters the container.
+aex has no built-in vision tool. The agent's `provider` / `model` selects the
+reasoning model for the run; if a skill needs image understanding mid-run, ship a
+skill that calls the vision provider with normal HTTP and pass that provider key
+as a runtime secret.
-This is the same proxy described in `credentials.md` — this page is the worked
-recipe for the model-API case, which has two wrinkles a plain JSON call does not:
-the image rides as a **base64 data URL** in the request body, and that body is
-large enough to need a raised `maxRequestBytes`.
+The runnable example lives at [`examples/vision-skill/`](../../../examples/vision-skill).
+It captions a frame with ByteDance Doubao Seed Vision (Ark) and returns a
+per-noun "does the frame depict X?" verdict.
-The canonical, runnable example lives in the repo at
-[`examples/vision-skill/`](../../../examples/vision-skill) (`SKILL.md`,
-`caption_frame.py`, `verify_frame.py`, `run_with_vision_skill.mjs`). It
-captions a frame with ByteDance Doubao Seed Vision (Ark) and returns a per-noun
-"does the frame depict X?" verdict. Everything below is taken from it.
-## 1. Declare the model endpoint as a proxy endpoint
-The vision provider's API is just an HTTPS host. Declare it with
-`ProxyEndpoint.bearer(...)`, which carries the key on the instance. The two
-model-specific settings are `responseMode: "full"` (so the skill gets the upstream
-JSON back) and a raised `maxRequestBytes` (so the base64 image fits):
+## Submit the run
 ```ts
-import { Aex, Models, Tools, ProxyEndpoint } from "@aexhq/sdk";
+import { Aex, Models, Secret, Tools } from "@aexhq/sdk";
 const aex = new Aex({ apiToken: process.env.AEX_API_TOKEN! });
-const doubaoArk = ProxyEndpoint.bearer({
-  name: "doubao-ark",
-  baseUrl: "https://ark.ap-southeast.bytepluses.com", // intl BytePlus gateway
-  token: process.env.DOUBAO_API_KEY!,
-  allowMethods: ["POST"],
-  allowPathPrefixes: ["/api/v3/chat/completions"],
-  maxRequestBytes: 2_000_000, // base64 image POSTs — see note below
-  responseMode: "full",
-  timeoutMs: 60_000
-});
-await aex.run({
+const result = await aex.run({
   model: Models.CLAUDE_HAIKU_4_5,
-  message: "…read skills/frame-vision-gate/SKILL.md, then caption + verify the frame…",
+  message: "Read skills/frame-vision-gate/SKILL.md, then caption and verify the frame.",
   tools: [await Tools.fromSkillDir("./vision-skill", { name: "frame-vision-gate" })],
-  proxyEndpoints: [doubaoArk],
+  environment: {
+    secrets: {
+      DOUBAO_API_KEY: Secret.value(process.env.DOUBAO_API_KEY!)
+    },
+    networking: {
+      mode: "limited",
+      allowedHosts: ["ark.ap-southeast.bytepluses.com"]
+    }
+  },
   apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY! }
 });
+console.log(result.runId, result.text);
 ```
-`Tools.fromSkillDir("./vision-skill", …)` is resolved relative to the process CWD, so
-run the script from the directory that *contains* `vision-skill/` (in the
-repo, that is `examples/`). The same pattern works for OpenAI, Gemini's
-OpenAI-compatible endpoint, or any other OpenAI-chat-shaped vision API — only
-`baseUrl` and the path prefix change.
+`Tools.fromSkillDir("./vision-skill", ...)` is resolved relative to the process
+CWD. Run the script from the directory that contains `vision-skill/` (in this
+repo, `examples/`).
-## 2. POST the image as a base64 data URL through the proxy
+## Call the provider from the skill
-Inside the run, the skill builds the OpenAI-compatible chat-completions body. The
-image is **base64-inlined as a data URL** in an `image_url` content part — it is
-not uploaded:
+Inside the run, the skill reads `DOUBAO_API_KEY` and makes an
+OpenAI-compatible chat-completions request with Python's standard HTTP client.
+The image is base64-inlined as a data URL in the request body:
 ```python
-import base64, json
+import base64, json, os, urllib.request
 b64 = base64.b64encode(open("/workspace/files/frame.jpg", "rb").read()).decode()
 request_body = {
     "model": "doubao-seed-1-6-vision-250815",
@@ -81,63 +65,30 @@ request_body = {
         ]}
     ]
 }
-```
-Write the body to a file and hand it to the mounted CLI with `--data @<file>`
-(the mount has no execute bit, so invoke through `bun`; see `credentials.md`):
-```python
-import subprocess
-body_path = "/workspace/.aex/_ark_request.json"
-open(body_path, "w").write(json.dumps(request_body))
-result = subprocess.run(
-    ["bun", "/mnt/session/uploads/aex/aex", "proxy", "doubao-ark",
-     "--method", "POST",
-     "--path", "/api/v3/chat/completions",
-     "--header", "content-type=application/json",
-     "--data", f"@{body_path}",
-     "--response-mode", "full"],
-    capture_output=True, text=True, timeout=90,
+req = urllib.request.Request(
+    "https://ark.ap-southeast.bytepluses.com/api/v3/chat/completions",
+    data=json.dumps(request_body).encode("utf-8"),
+    headers={
+        "Authorization": f"Bearer {os.environ['DOUBAO_API_KEY']}",
+        "Content-Type": "application/json"
+    },
+    method="POST",
 )
 ```
-In `--response-mode full` the CLI prints a `ProxyResponseEnvelope` on stdout. The
-upstream JSON is **base64-encoded** in `upstreamBodyBase64`; an error instead
-carries an `error` field. Unwrap it:
-```python
-envelope = json.loads(result.stdout)
-if "error" in envelope:
-    raise RuntimeError(f"proxy error: {envelope['error']}: {envelope['message']}")
-upstream = json.loads(base64.b64decode(envelope["upstreamBodyBase64"]).decode())
-content = upstream["choices"][0]["message"]["content"]  # the model's JSON answer
-```
-The key is injected by the hosted proxy on the outbound call; it never appears on disk in
-the container or in the model's context.
-## `maxRequestBytes` and timeout defaults
+The same pattern works for OpenAI, Gemini's OpenAI-compatible endpoint, or any
+other HTTPS model API. Put the key in `environment.secrets`, allow-list the host
+when using limited networking, and use the provider's normal SDK or HTTP API.
-The per-endpoint `maxRequestBytes` default is **10 MiB** and the default timeout
-is **5 minutes**. That fits typical base64 image/model POSTs without extra
-configuration. If a body does exceed the cap, the proxy rejects it before any
-upstream call with an explicit error naming the observed size, the configured
-cap, and how to raise it:
+## Payload size
-> request body is 2400000 bytes, which exceeds this endpoint's maxRequestBytes
-> (10485760). Raise the per-endpoint maxRequestBytes in the proxy endpoint policy …
+Base64 images are larger than their source files. Scale frames before captioning
+when possible, for example:
-Two ways to stay under the cap: raise `maxRequestBytes`, and/or scale frames
-before captioning (`ffmpeg -i source.mp4 -vf fps=1,scale=960:-1 frame_%03d.jpg`)
-so full-res frames do not add payload and model cost without useful signal.
-## Notes
+```bash
+ffmpeg -i source.mp4 -vf fps=1,scale=960:-1 frame_%03d.jpg
+```
-- **Host selection.** Use the provider endpoint that matches your account and
-  declare it as the proxy endpoint `baseUrl`.
-- **Keyless model hosts.** If the upstream takes no credential, declare the
-  endpoint with `ProxyEndpoint.none(...)` (see `credentials.md`).
-- **Response size.** `responseMode: "full"` is required to read the model's reply
-  back. Leave `maxResponseBytes` at its default (`0` = unlimited, streamed) unless
-  you want a truncation cap.
+This keeps upload size and model cost bounded without losing the signal most
+vision models need.
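
Review note: the new "Payload size" section says base64 images are larger than their source files without giving a figure. Standard padded base64 encodes every 3 input bytes as 4 output characters, so the overhead is roughly one third. The helper below is a hypothetical illustration of that arithmetic (`base64Length` is not an SDK function), and it ignores the short `data:image/jpeg;base64,` prefix that a data URL adds.

```ts
// Length in characters of standard padded base64 for `byteLength` input bytes:
// each group of up to 3 bytes becomes exactly 4 characters.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Example: a 1,500,000-byte JPEG frame becomes a 2,000,000-character payload
// before the JSON envelope and data-URL prefix are added.
const encodedSize = base64Length(1500000);
```

This is why scaling frames before encoding, as the added `ffmpeg` line does, reduces the request body by more than the raw file-size difference alone would suggest.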