@tangle-network/sandbox 0.1.2 → 0.3.0

Files changed (37)
  1. package/README.md +561 -2
  2. package/dist/agent/index.d.ts +435 -0
  3. package/dist/agent/index.js +1 -0
  4. package/dist/auth/index.d.ts +2 -2
  5. package/dist/auth/index.js +1 -1
  6. package/dist/client-BuPZLOxS.d.ts +1050 -0
  7. package/dist/client-BwRV2Zun.js +1 -0
  8. package/dist/collaboration/index.d.ts +1 -1
  9. package/dist/collaboration/index.js +1 -1
  10. package/dist/collaboration-CRyb5e8F.js +1 -0
  11. package/dist/core.d.ts +4 -3
  12. package/dist/core.js +1 -1
  13. package/dist/errors-1Se5ATyZ.d.ts +128 -0
  14. package/dist/errors-CljiGR__.js +1 -0
  15. package/dist/{index-t7xkzv0U.d.ts → index-2gFsmmQs.d.ts} +3 -3
  16. package/dist/{index-gA-oRjOi.d.ts → index-D-2pH_70.d.ts} +35 -4
  17. package/dist/{index-BuS8nl3b.d.ts → index-D7bwmNs8.d.ts} +6 -1
  18. package/dist/index.d.ts +110 -62
  19. package/dist/index.js +1 -1
  20. package/dist/openai/index.d.ts +641 -0
  21. package/dist/openai/index.js +1 -0
  22. package/dist/platform-integrations.d.ts +2 -0
  23. package/dist/platform-integrations.js +1 -0
  24. package/dist/{sandbox-BvZ0-Iv7.d.ts → sandbox-CpK8etqP.d.ts} +1735 -41
  25. package/dist/sandbox-DTup2jzz.js +1 -0
  26. package/dist/session-gateway/index.js +1 -1
  27. package/dist/tangle/index.d.ts +1 -1
  28. package/dist/tangle/index.js +1 -1
  29. package/dist/tangle-CnYnTRi6.js +1 -0
  30. package/package.json +114 -34
  31. package/LICENSE +0 -11
  32. package/dist/client-CcRvqt85.js +0 -1
  33. package/dist/collaboration-CVvhPU8M.js +0 -1
  34. package/dist/errors-AIT8qikt.d.ts +0 -491
  35. package/dist/errors-CdMTv7uG.js +0 -1
  36. package/dist/sandbox-D1JnQIJx.js +0 -1
  37. package/dist/tangle-CSb9rjAh.js +0 -1
package/README.md CHANGED
@@ -43,6 +43,77 @@ console.log(task.response);
43
43
  await box.delete();
44
44
  ```
45
45
 
46
+ ## Stream durability is platform-managed — do not build your own
47
+
48
+ > **If you are about to add a Cloudflare Durable Object, a KV bucket, an in-Worker
49
+ > ring buffer, or any other state store to "buffer agent stream events so the user
50
+ > survives a reload" — stop.** The Tangle orchestrator already buffers every event
51
+ > for every session to a Redis sorted-set keyed by `sessionId` with TTL, and the
52
+ > SDK ships a browser/Worker-safe client that reconnects, replays missed events,
53
+ > and persists `lastEventId` across tab reloads. The work is done. Use it.
54
+
55
+ The decision tree:
56
+
57
+ | You need… | Use |
58
+ |---|---|
59
+ | Fire-and-forget streaming from a server you control (CLI, cron, batch worker) | `box.streamPrompt()` / `box.streamTask()` — internal auto-reconnect handles transient drops within the same call |
60
+ | Survive **client process death** (Worker isolate eviction, laptop crash, deploy) and resume later | `box.dispatchPrompt(msg, { sessionId })` then `box.session(sessionId).events({ since })` / `.result()` from a fresh process |
61
+ | Survive **browser** disconnects (wifi flap, tab reload, mobile background) with `Last-Event-ID` replay | `SessionGatewayClient` from `@tangle-network/sandbox/session-gateway` |
62
+ | Retry a payment-triggered or webhook-triggered run safely | `box.dispatchPrompt(msg, { sessionId: deterministicKeyFromRequest })` — same `sessionId` is idempotent: a duplicate dispatch returns the in-flight or completed session, never re-executes |
63
+ | Inspect what happened to a turn that died mid-stream | `box.session(id).status()` (terminal state), `box.session(id).events({ since })` (replay buffered events) |
64
+
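The idempotent-retry row above only works if the same request always maps to the same `sessionId`. A minimal sketch of one way to derive such a key, assuming the upstream provider sends a stable event id. `deterministicSessionId`, the `source` label, and the `turn_` prefix are hypothetical names for illustration, not part of the SDK:

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not an SDK API): maps a stable upstream identifier,
// such as a payment or webhook event id, to a deterministic session id.
function deterministicSessionId(source: string, eventId: string): string {
  const digest = createHash("sha256")
    .update(`${source}:${eventId}`)
    .digest("hex");
  // Keep the first 32 hex characters (128 bits) so the id stays short.
  return `turn_${digest.slice(0, 32)}`;
}

// The same inputs always yield the same id, so a redelivered webhook would
// reuse the existing session instead of starting a new run.
const firstDelivery = deterministicSessionId("payments", "evt_123");
const redelivery = deterministicSessionId("payments", "evt_123");
console.log(firstDelivery === redelivery); // true
```

The resulting string would be passed as `sessionId` to `box.dispatchPrompt`. Hashing rather than using the raw event id keeps provider identifiers out of session ids, which is a design choice and not an SDK requirement.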
65
+ ### Dispatch + reconnect (the "Worker restart" pattern)
66
+
67
+ ```typescript
68
+ import { Sandbox } from "@tangle-network/sandbox";
69
+
70
+ const client = new Sandbox({ apiKey: process.env.TANGLE_API_KEY! });
71
+
72
+ // In your /chat handler (`box` is a SandboxInstance created or retrieved earlier via `client`):
73
+ const sessionId = req.headers.get("x-turn-id") ?? crypto.randomUUID();
74
+ const { sessionId: id, alreadyExisted } = await box.dispatchPrompt(prompt, {
75
+ sessionId, // idempotent: a retry with the same id is a lookup, not a re-execute
76
+ });
77
+
78
+ // Stream to the browser; if the client comes back later with the same sessionId
79
+ // and a Last-Event-ID, hand them the replay path below.
80
+ for await (const event of box.session(id).events({
81
+ since: req.headers.get("last-event-id") ?? undefined,
82
+ })) {
83
+ res.write(`id: ${event.id}\ndata: ${JSON.stringify(event)}\n\n`);
84
+ }
85
+ ```
86
+
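The handler above writes each event as an `id:` line followed by a `data:` line. For a non-browser consumer that has to track the last event id itself, a minimal parsing sketch follows. It assumes exactly the frame shape written above, not the full Server-Sent Events grammar, and `parseSseFrames` is a hypothetical helper rather than an SDK export:

```typescript
interface SseFrame {
  id: string | undefined;
  data: string;
}

// Hypothetical helper: splits buffered SSE text into frames. It only
// understands "id: " and "data: " fields separated by a blank line.
function parseSseFrames(text: string): SseFrame[] {
  const frames: SseFrame[] = [];
  for (const block of text.split("\n\n")) {
    if (block.trim() === "") continue; // skip the trailing empty block
    let id: string | undefined;
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("id: ")) id = line.slice(4);
      else if (line.startsWith("data: ")) dataLines.push(line.slice(6));
    }
    frames.push({ id, data: dataLines.join("\n") });
  }
  return frames;
}

const wire = 'id: 41\ndata: {"type":"token"}\n\nid: 42\ndata: {"type":"done"}\n\n';
const frames = parseSseFrames(wire);
// The id of the last complete frame is what a reconnecting client would send
// back as Last-Event-ID (or pass as `since`).
const lastEventId = frames[frames.length - 1]?.id;
console.log(lastEventId); // prints 42
```

In a browser, `EventSource` tracks the last event id on its own, so a parser like this is only relevant for custom fetch-based readers.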
87
+ ### Browser-direct streaming (without proxying tokens through your server)
88
+
89
+ ```typescript
90
+ import { SessionGatewayClient } from "@tangle-network/sandbox/session-gateway";
91
+
92
+ const client = new SessionGatewayClient({
93
+ url: "wss://your-sandbox-api.example.com/session",
94
+ token: await fetchScopedToken(), // mint via box.mintScopedToken({ scope: 'session', sessionId })
95
+ sessionId,
96
+ autoReconnect: true,
97
+ enableReplayPersistence: true, // remembers lastEventId across reloads
98
+ replayStorage: { /* localStorage adapter, see session-gateway docs */ },
99
+ handlers: {
100
+ onMessage: (event) => render(event),
101
+ onReplayStart: ({ since }) => showSpinner(`replaying from ${since}`),
102
+ onReplayComplete: () => hideSpinner(),
103
+ onBackpressureWarning: ({ droppedCount, suggestReplay }) =>
104
+ suggestReplay && client.replay(client.stats.replay.lastEventId),
105
+ },
106
+ });
107
+ client.connect();
108
+ ```
109
+
110
+ `SessionGatewayClient` handles auto-reconnect with exponential backoff, sequence-gap
111
+ detection, replay-on-reconnect, and `lastEventId` persistence. None of this requires a
112
+ Durable Object.
113
+
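To make two of the behaviours named above concrete, here is a small sketch of capped exponential backoff and sequence-gap detection. The function names, the 500 ms base, and the 30 s cap are assumptions chosen for illustration; the README does not document the schedule `SessionGatewayClient` actually uses:

```typescript
// Hypothetical illustration of capped exponential backoff.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  // attempt 0 -> baseMs, attempt 1 -> 2 * baseMs, and so on, capped at maxMs.
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical gap check: if events carry increasing sequence numbers, a jump
// of more than one means at least one event was missed and a replay is needed.
function hasSequenceGap(lastSeen: number, incoming: number): boolean {
  return incoming > lastSeen + 1;
}

console.log([0, 1, 2, 3, 10].map((attempt) => backoffDelayMs(attempt)));
// [ 500, 1000, 2000, 4000, 30000 ]
console.log(hasSequenceGap(7, 8)); // false
console.log(hasSequenceGap(7, 10)); // true
```

Production reconnect loops usually add random jitter to the delay so that many clients do not reconnect at the same instant; it is left out here to keep the output deterministic.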
114
+ See `examples/cf-worker-chat.ts`, `examples/browser-streaming-resume.ts`, and
115
+ `examples/reconnect-from-last-event-id.ts` for end-to-end patterns.
116
+
46
117
  ## Features
47
118
 
48
119
  - **Sandbox Management** - Create, list, stop, resume, and delete sandboxes
@@ -50,9 +121,11 @@ await box.delete();
50
121
  - **AI Agent Tasks** - Multi-turn agent execution with automatic tool use
51
122
  - **Snapshots** - Save and restore sandbox state
52
123
  - **BYOS3** - Bring your own S3 storage for snapshots
53
- - **Batch Execution** - Run tasks across multiple sandboxes in parallel
124
+ - **Fleets** - Coordinated multi-machine workloads with policy, workspace snapshots, and parallel dispatch
125
+ - **Intelligence Reports** - Deterministic or agentic post-hoc analysis of sandbox or fleet evidence (fleet reports can refine to a single dispatch via `subject.dispatchId`)
54
126
  - **Event Streaming** - Real-time SSE streams for agent events
55
127
  - **Collaboration Foundations** - Token issuance and document identity helpers for collaborative editing
128
+ - **Trace Intelligence** - Export raw sandbox and fleet traces, embedded intelligence envelope, timing metrics, and OTEL JSON to customer-owned observability systems
56
129
 
57
130
  ## Collaboration Foundations
58
131
 
@@ -110,6 +183,64 @@ const bootstrap = await collab.bootstrap({
110
183
 
111
184
  The bridge and client are SDK-side primitives. Product/backend endpoints and CRDT runtime integration still need to be wired by the application.
112
185
 
186
+ ## Trace Intelligence
187
+
188
+ The platform exposes two distinct intelligence primitives. Use the right one for the job.
189
+
190
+ | Primitive | What you get | Billable | API call |
191
+ |---|---|---|---|
192
+ | **Embedded envelope** | Inline summary in a trace/dispatch response: signals, recommended actions, fanout timings, dispatch failure classes | No (`billing.billable: false`) | `trace({ includeIntelligence: true })`, `intelligence()`, dispatch responses |
193
+ | **Intelligence Report** | A pollable job over sandbox or fleet evidence; fleet reports can refine to a single dispatch via `subject.dispatchId`. Runs in `deterministic` or `agentic` mode against a budget. | `deterministic`: free. `agentic`: billed against `budget.maxUsd` | `intelligence.createReport(...)`, see [Intelligence Reports](#intelligence-reports) |
194
+
195
+ ### Embedded envelope (free)
196
+
197
+ The embedded envelope is opt-in on `trace()` calls (`includeIntelligence: true`) and always included on dispatch responses (because dispatch already pays the analysis cost as part of producing the result).
198
+
199
+ ```typescript
200
+ const boxBundle = await box.trace(); // { trace } only
201
+ const boxInsights = await box.intelligence(); // envelope only
202
+ const boxBundleWithInsights = await box.trace({ includeIntelligence: true });
203
+
204
+ const run = await fleet.dispatchExecDetailed("pytest -q", {
205
+ machines: ["worker-1", "worker-2"],
206
+ });
207
+ console.log(run.results);
208
+ console.log(run.intelligence.signals); // always present on dispatch
209
+ console.log(run.intelligence.recommendedActions);
210
+
211
+ const fleetBundle = await fleet.trace(); // { trace } only
212
+ const fleetInsights = await fleet.intelligence(); // envelope only
213
+ const fleetBundleWithInsights = await fleet.trace({ includeIntelligence: true });
214
+ ```
215
+
216
+ Sandbox traces cover lifecycle, runtime, usage, timing, and current health snapshots. Fleet traces add machine lifecycle, workspace state, dispatch results, fanout timings, and critical path. The embedded envelope tells you what to inspect next: reliability, parallelism efficiency for fleets, dispatch failure classes, resource attribution, timing bottlenecks, and recommended actions.
217
+
218
+ The embedded envelope is deterministic platform analysis. It reports `billing.billable: false` and `billing.costUsd: 0`; customers are not charged for generating these envelopes.
219
+
220
+ ### Exporting traces to your observability stack
221
+
222
+ Use `format: "tangle"` to preserve the native envelope, or `format: "otel-json"` for OpenTelemetry-style collectors and platforms that accept OTLP JSON.
223
+
224
+ ```typescript
225
+ await box.exportTrace({
226
+ url: "https://collector.example.com/traces",
227
+ headers: { Authorization: `Bearer ${process.env.OBSERVABILITY_TOKEN}` },
228
+ format: "otel-json",
229
+ serviceName: "research-agent",
230
+ });
231
+
232
+ await fleet.exportTrace({
233
+ url: "https://collector.example.com/traces",
234
+ headers: { Authorization: `Bearer ${process.env.OBSERVABILITY_TOKEN}` },
235
+ format: "otel-json",
236
+ serviceName: "research-agent",
237
+ });
238
+ ```
239
+
240
+ For Braintrust, Lemma, Raindrop, Langfuse, Datadog, or a custom warehouse, keep this as a customer-owned sink or webhook. Tangle does not need their vendor credentials: fetch `box.trace()` or `fleet.trace()` and send the raw bundle through their SDK/API, or point `exportTrace()` at a small ingest endpoint that transforms it into the vendor's preferred schema.
241
+
242
+ Agent tools should expose both `trace` and `intelligence` actions on `manageSandboxes`. `trace` returns the full raw bundle for downstream analysis; `intelligence` returns the compact agent-readable next-step summary.
243
+
113
244
  ## Core Concepts
114
245
 
115
246
  ### Sandboxes
@@ -229,7 +360,7 @@ console.log(`Compute: ${usage.computeMinutes} minutes`);
229
360
 
230
361
  #### `client.runBatch(tasks, options?)`
231
362
 
232
- Run tasks across multiple sandboxes in parallel.
363
+ Run ad-hoc tasks across freshly-provisioned sandboxes in parallel. Use this when the work is one-shot and the sandboxes do not need to share a workspace or be addressable by stable `machineId`. For coordinated multi-machine workloads with policy enforcement, workspace sharing, dispatch buffering, and intelligence reports, see [Fleets](#fleets).
233
364
 
234
365
  ```typescript
235
366
  const result = await client.runBatch([
@@ -244,6 +375,14 @@ const result = await client.runBatch([
244
375
  console.log(`Success rate: ${result.successRate}%`);
245
376
  ```
246
377
 
378
+ #### `client.fleets`
379
+
380
+ Fleet client — see [Fleets](#fleets) for the full surface (`create`, `createWithCoordinator`, `list`, `delete`, `estimateCost`, `capabilities`, `operations`, `reconcile`, `reapExpired`).
381
+
382
+ #### `client.intelligence`
383
+
384
+ Intelligence report client. See [Intelligence Reports](#intelligence-reports) for the full surface (`createReport`, `createAgenticReport`, `estimateReport`, `getReport`, `listReports`, `waitForReport`).
385
+
247
386
  ### Sandbox Instance
248
387
 
249
388
  After creating or retrieving a sandbox, you get a `SandboxInstance` with these methods:
@@ -288,6 +427,7 @@ Each sandbox runs one AI backend. Pass `backend.type` to choose it:
288
427
  | `opencode` | [OpenCode](https://github.com/anomalyco/opencode) | Default. Multi-provider, profile system, MCP support |
289
428
  | `claude-code` | [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | Anthropic-native. Needs ANTHROPIC_API_KEY |
290
429
  | `codex` | [Codex CLI](https://github.com/openai/codex) | OpenAI-native. Needs OPENAI_API_KEY |
430
+ | `cursor` | [Cursor Agent SDK](https://cursor.com/changelog/sdk-release) | Cursor-native local/cloud agent. Needs CURSOR_API_KEY |
291
431
  | `amp` | [AMP](https://sourcegraph.com/amp) | Sourcegraph AMP agent |
292
432
  | `factory-droids` | [Factory](https://factory.ai) | Factory Droid agent |
293
433
 
@@ -302,6 +442,52 @@ await box.prompt("Audit this repo", {
302
442
  backend: { type: "codex", profile: "browser-codex-fast" },
303
443
  });
304
444
 
445
+ // Use Cursor Agent SDK
446
+ await box.prompt("Implement this change", {
447
+ backend: {
448
+ type: "cursor",
449
+ model: {
450
+ model: "composer-2",
451
+ apiKey: process.env.CURSOR_API_KEY,
452
+ },
453
+ profile: {
454
+ name: "cursor-release-agent",
455
+ prompt: {
456
+ systemPrompt:
457
+ "Use repo rules, configured MCP servers, skills, and subagents when relevant.",
458
+ },
459
+ mcp: {
460
+ docs: { transport: "sse", url: "https://docs.example.com/sse" },
461
+ },
462
+ subagents: {
463
+ reviewer: {
464
+ description: "Reviews changes for correctness and missing tests.",
465
+ prompt: "Review the current diff. Return only blocking findings.",
466
+ model: "composer-2",
467
+ },
468
+ },
469
+ resources: {
470
+ instructions: "Run focused tests before reporting completion.",
471
+ skills: [
472
+ {
473
+ kind: "inline",
474
+ name: "release-check",
475
+ content:
476
+ "---\nname: release-check\ndescription: Validate release readiness.\n---\nRun typecheck and focused tests.",
477
+ },
478
+ ],
479
+ },
480
+ extensions: {
481
+ cursor: {
482
+ runtime: "local",
483
+ local: { settingSources: ["project", "user"] },
484
+ force: true,
485
+ },
486
+ },
487
+ },
488
+ },
489
+ });
490
+
305
491
  // Use OpenCode with an inline profile
306
492
  await box.prompt("Audit this repo", {
307
493
  backend: {
@@ -332,6 +518,36 @@ await box.prompt("Analyze this", {
332
518
 
333
519
  The SDK serializes `backend.profile` into the required wire format automatically.
334
520
 
521
+ Cursor profiles map portable MCP, resources, skills, subagents, hooks, permissions,
522
+ and Cursor-native `extensions.cursor` fields into the Cursor Agent SDK. Local
523
+ Cursor runs materialize `.cursor/*` project files inside the sandbox workspace.
524
+ Cloud Cursor runs fail closed when given uncommitted local resources that cannot
525
+ be delivered to the remote Cursor workspace.
526
+
527
+ Provider-native metadata is available through `box.backend` when the backend
528
+ SDK exposes it:
529
+
530
+ ```typescript
531
+ const account = await box.backend.account();
532
+ const models = await box.backend.models();
533
+ const repositories = await box.backend.repositories();
534
+ const agents = await box.backend.agents({ limit: 20 });
535
+ const agent = await box.backend.agent(agents.items[0].agentId);
536
+ const runs = await box.backend.runs(agent.agentId, { limit: 20 });
537
+ const run = await box.backend.run(runs.items[0].id, {
538
+ agentId: agent.agentId,
539
+ });
540
+ const messages = await box.backend.agentMessages(agent.agentId, { limit: 50 });
541
+ const artifacts = await box.backend.artifacts("active-session-id");
542
+ const bytes = await box.backend.downloadArtifact(
543
+ "active-session-id",
544
+ artifacts[0].path,
545
+ );
546
+ ```
547
+
548
+ Unsupported provider-control methods return the backend error; the SDK does not
549
+ fabricate catalog, run, or artifact data for backends that do not expose it.
550
+
335
551
  #### `box.task(message, options?)`
336
552
 
337
553
  Run a multi-turn agent task. The agent keeps working until completion.
@@ -551,6 +767,317 @@ box.expiresAt // Date | undefined
551
767
  box.error // Error message if failed
552
768
  ```
553
769
 
770
+ ## Fleets
771
+
772
+ A **fleet** is a coordinated group of sandboxes that runs one logical workload across many machines. Fleets are the canonical primitive for parallel work, distributed simulation, multi-agent experiments, or any task that needs more than one sandbox under one lifecycle.
773
+
774
+ Fleets give you:
775
+
776
+ - A single id (`fleetId`) plus a stable machine id (`machineId`) per member that you choose
777
+ - Policy enforcement (CPU / memory / storage / accelerator caps, allowed drivers / images / templates, max spend, max concurrent creates) — checked client-side **before** sandboxes are provisioned
778
+ - Per-dispatch parallelism, retries, timeouts, idempotency, cancellation, and result buffering
779
+ - Shared workspace modes (`isolated`, `shared`) with cross-machine snapshot, restore, and reconcile
780
+ - Fleet-scoped telemetry: usage, cost estimate, trace bundle, embedded intelligence envelope, and full intelligence reports
781
+ - Scoped tokens for handing a fleet to a downstream service without leaking the parent API key
782
+
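The policy bullet above says caps are checked before any sandbox is provisioned. A minimal sketch of what such a pre-flight check could look like, using the field names from the fleet creation example in this README (`maxMachines`, `maxTotalCpu`, `maxTotalMemoryMb`, `cpuCores`, `memoryMB`). `checkFleetPolicy` and its types are hypothetical and do not reflect the SDK's implementation:

```typescript
interface MachineSpec {
  machineId: string;
  resources: { cpuCores: number; memoryMB: number };
}

interface FleetPolicy {
  maxMachines?: number;
  maxTotalCpu?: number;
  maxTotalMemoryMb?: number;
}

// Hypothetical pre-flight check: returns human-readable violations, or an
// empty array when the requested machines fit inside the policy.
function checkFleetPolicy(machines: MachineSpec[], policy: FleetPolicy): string[] {
  const violations: string[] = [];
  const totalCpu = machines.reduce((sum, m) => sum + m.resources.cpuCores, 0);
  const totalMemoryMb = machines.reduce((sum, m) => sum + m.resources.memoryMB, 0);

  if (policy.maxMachines !== undefined && machines.length > policy.maxMachines) {
    violations.push(`machines: ${machines.length} > ${policy.maxMachines}`);
  }
  if (policy.maxTotalCpu !== undefined && totalCpu > policy.maxTotalCpu) {
    violations.push(`cpu: ${totalCpu} > ${policy.maxTotalCpu}`);
  }
  if (policy.maxTotalMemoryMb !== undefined && totalMemoryMb > policy.maxTotalMemoryMb) {
    violations.push(`memoryMb: ${totalMemoryMb} > ${policy.maxTotalMemoryMb}`);
  }
  return violations;
}

const requested: MachineSpec[] = [
  { machineId: "worker-1", resources: { cpuCores: 2, memoryMB: 4096 } },
  { machineId: "worker-2", resources: { cpuCores: 2, memoryMB: 4096 } },
];
console.log(checkFleetPolicy(requested, { maxMachines: 4, maxTotalCpu: 8 })); // []
console.log(checkFleetPolicy(requested, { maxTotalCpu: 3 })); // [ 'cpu: 4 > 3' ]
```

Returning every violation at once, rather than throwing on the first, lets a caller report all problems with a fleet request in a single pass.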
783
+ ### Create a fleet
784
+
785
+ There are two creation shapes. Use `create` when every machine is symmetric, and `createWithCoordinator` when one machine acts as orchestrator over a pool of workers.
786
+
787
+ ```typescript
788
+ // Symmetric fleet
789
+ const fleet = await client.fleets.create({
790
+ defaults: {
791
+ image: "python:3.12",
792
+ maxLifetimeSeconds: 60 * 60,
793
+ },
794
+ policy: {
795
+ maxMachines: 4,
796
+ maxConcurrentCreates: 2,
797
+ maxTotalCpu: 8,
798
+ maxTotalMemoryMb: 16_384,
799
+ maxSpendUsd: 5,
800
+ allowAccelerators: false,
801
+ },
802
+ machines: [
803
+ { machineId: "worker-1", resources: { cpuCores: 2, memoryMB: 4096 } },
804
+ { machineId: "worker-2", resources: { cpuCores: 2, memoryMB: 4096 } },
805
+ ],
806
+ workspace: { mode: "isolated" },
807
+ metadata: { experiment: "react-19-bump" },
808
+ idempotencyKey: "exp-react-19-2025-05-18",
809
+ });
810
+
811
+ // Coordinator + workers
812
+ const cluster = await client.fleets.createWithCoordinator({
813
+ defaults: { image: "python:3.12" },
814
+ coordinator: { resources: { cpuCores: 1, memoryMB: 1024 } },
815
+ workers: [
816
+ { machineId: "worker-1", resources: { cpuCores: 2, memoryMB: 4096 } },
817
+ { machineId: "worker-2", resources: { cpuCores: 2, memoryMB: 4096 } },
818
+ ],
819
+ });
820
+ ```
821
+
822
+ `createWithCoordinator` is sugar over `create`: it injects a `coordinator` machine with `role: "coordinator"` and tags the workers `role: "worker"` in metadata. After creation both shapes return a `SandboxFleet` instance.
823
+
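The expansion described above can be sketched as a plain function. This is an illustration under assumptions: the injected machine id `"coordinator"`, the `MachineInput` shape, and storing the role under `metadata.role` are guesses based on the sentence above, not the SDK's actual code:

```typescript
interface MachineInput {
  machineId: string;
  resources?: { cpuCores: number; memoryMB: number };
  metadata?: Record<string, string>;
}

// Hypothetical expansion: builds the machine list that a symmetric `create`
// call would receive, tagging each entry with its role.
function expandCoordinatorFleet(
  coordinator: Omit<MachineInput, "machineId">,
  workers: MachineInput[],
): MachineInput[] {
  return [
    {
      ...coordinator,
      machineId: "coordinator", // assumed id for the injected machine
      metadata: { ...coordinator.metadata, role: "coordinator" },
    },
    ...workers.map((worker) => ({
      ...worker,
      metadata: { ...worker.metadata, role: "worker" },
    })),
  ];
}

const expanded = expandCoordinatorFleet(
  { resources: { cpuCores: 1, memoryMB: 1024 } },
  [
    { machineId: "worker-1", resources: { cpuCores: 2, memoryMB: 4096 } },
    { machineId: "worker-2", resources: { cpuCores: 2, memoryMB: 4096 } },
  ],
);
console.log(expanded.map((m) => `${m.machineId}:${m.metadata?.role}`));
// [ 'coordinator:coordinator', 'worker-1:worker', 'worker-2:worker' ]
```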
824
+ ### Dispatch across machines
825
+
826
+ ```typescript
827
+ // Fire and collect — returns FleetExecDispatchResult[]
828
+ const results = await fleet.dispatchExec("pytest -q", {
829
+ machines: fleet.ids, // default: every machine
830
+ maxConcurrent: 2,
831
+ timeoutMs: 60_000,
832
+ retry: { attempts: 2, backoffMs: 1_000 },
833
+ dispatchId: "pytest-run-1", // idempotent — same id replays the same dispatch
834
+ });
835
+
836
+ // Detailed — returns the full response including dispatchId, durationMs,
837
+ // trace, and the embedded intelligence envelope (always present on dispatch).
838
+ const detailed = await fleet.dispatchExecDetailed("pytest -q");
839
+ console.log(detailed.intelligence.signals);
840
+
841
+ // Prompt variant — runs an agent prompt on each selected machine
842
+ const promptResults = await fleet.dispatchPrompt(
843
+ "Summarize what changed in this branch and why.",
844
+ { machines: ["worker-1"], model: "anthropic/claude-sonnet-4-20250514" },
845
+ );
846
+
847
+ // Stream events as they happen
848
+ for await (const event of fleet.dispatchExecStream("npm test", {
849
+ machines: fleet.ids,
850
+ })) {
851
+ if (event.type === "result") console.log(event.data);
852
+ }
853
+
854
+ // Read previously-buffered results by dispatchId
855
+ const buffered = await fleet.dispatchResults("pytest-run-1", {
856
+ limit: 100,
857
+ machines: ["worker-1", "worker-2"],
858
+ });
859
+
860
+ // Cancel an in-flight dispatch
861
+ await fleet.cancelDispatch("pytest-run-1", "abandoning experiment");
862
+ ```
863
+
864
+ ### Single-machine helpers
865
+
866
+ When you want to talk to one machine in the fleet directly:
867
+
868
+ ```typescript
869
+ const result = await fleet.exec("worker-1", "echo hello");
870
+ const reply = await fleet.prompt("worker-1", "What is the structure?");
871
+ const box = await fleet.sandbox("worker-1"); // full SandboxInstance
872
+ ```
873
+
874
+ ### Workspace snapshots (shared workspace mode)
875
+
876
+ ```typescript
877
+ const snap = await fleet.createWorkspaceSnapshot();
878
+ await fleet.restoreWorkspaceSnapshot(snap.snapshotId);
879
+ await fleet.reconcileWorkspace();
880
+ ```
881
+
882
+ ### Dynamic topology
883
+
884
+ ```typescript
885
+ await fleet.attachMachine({
886
+ machineId: "worker-3",
887
+ sandboxId: "sbx_abc", // existing sandbox to bind into the fleet
888
+ role: "worker",
889
+ });
890
+ await fleet.detachMachine("worker-3");
891
+ ```
892
+
893
+ ### Artifacts, usage, cost, tokens
894
+
895
+ ```typescript
896
+ const artifacts = await fleet.collectArtifacts([
897
+ { machineId: "worker-1", path: "/workspace/report.json", maxBytes: 1_000_000 },
898
+ ]);
899
+
900
+ const usage = await fleet.usage(); // current usage rollup
901
+ const estimate = await fleet.cost(); // cost estimate
902
+ const manifest = await fleet.manifest(); // machine manifest as persisted
903
+
904
+ // Scoped token — hand to a downstream service without leaking the parent key
905
+ const token = await fleet.createToken({ expiresInSeconds: 3600 });
906
+ ```
907
+
908
+ ### Pre-flight cost estimate (without creating)
909
+
910
+ ```typescript
911
+ const preEstimate = await client.fleets.estimateCost({
912
+ defaults: { image: "python:3.12" },
913
+ policy: { maxMachines: 4, maxSpendUsd: 5 },
914
+ machines: [
915
+ { machineId: "worker-1", resources: { cpuCores: 2, memoryMB: 4096 } },
916
+ { machineId: "worker-2", resources: { cpuCores: 2, memoryMB: 4096 } },
917
+ ],
918
+ });
919
+ ```
920
+
921
+ ### Operations
922
+
923
+ ```typescript
924
+ const caps = await client.fleets.capabilities(); // which drivers / templates / images
925
+ const ops = await client.fleets.operations(); // operations summary
926
+ const recon = await client.fleets.reconcile(); // reconcile drift
927
+ const reaped = await client.fleets.reapExpired(); // sweep expired fleets
928
+ ```
929
+
930
+ ### Lookup and delete
931
+
932
+ ```typescript
933
+ const found = await client.fleets.list({ fleetId: "fleet_abc" });
934
+ await client.fleets.delete("fleet_abc", { continueOnError: true });
935
+ ```
936
+
937
+ ### Single-shot batches without fleets
938
+
939
+ For one-shot, ad-hoc parallel work that does not need fleet-level policy / workspaces / dispatch buffering / intelligence reports, `client.runBatch(tasks, options)` is the simpler primitive. New code that needs more than one sandbox under one logical lifecycle should reach for fleets.
940
+
941
+ ## Intelligence Reports
942
+
943
+ The **Intelligence Reports** API generates structured post-hoc analyses over two subject types: a single **sandbox** or a single **fleet**. A fleet subject can optionally be narrowed to one dispatch within the fleet via `subject.dispatchId`. Two modes:
944
+
945
+ - **`deterministic`** (default) — platform-side rule-based analysis. Free. Returns immediately or near-immediately. Surfaces lifecycle, runtime, plan-headroom, and resource-density signals derived directly from your trace evidence.
946
+ - **`agentic`** — runs the **Tangle Trace Analyst**, an LLM-driven reasoning loop, over your OTLP trace evidence. Returns findings with evidence references, recommended actions, and a validation plan. **Billed** against `budget.maxUsd`; the platform never spends past the budget you set. Async (returns a job; poll for terminal state).
947
+
948
+ The dedicated client lives on `client.intelligence`:
949
+
950
+ ```typescript
951
+ import { Sandbox } from "@tangle-network/sandbox";
952
+ import type { IntelligenceClient } from "@tangle-network/sandbox";
953
+
954
+ const client = new Sandbox({ apiKey, baseUrl });
955
+ const intelligence: IntelligenceClient = client.intelligence;
956
+ ```
957
+
958
+ ### Create a report
959
+
960
+ ```typescript
961
+ // Deterministic — over a single sandbox
962
+ const det = await client.intelligence.createReport({
963
+ subject: { type: "sandbox", id: box.id },
964
+ });
965
+
966
+ // Deterministic — over a fleet
967
+ await client.intelligence.createReport({
968
+ subject: { type: "fleet", id: fleet.fleetId },
969
+ });
970
+
971
+ // Deterministic — narrowed to one dispatch within a fleet
972
+ await client.intelligence.createReport({
973
+ subject: {
974
+ type: "fleet",
975
+ id: fleet.fleetId,
976
+ dispatchId: "pytest-run-1",
977
+ },
978
+ });
979
+
980
+ // Agentic — billed against the budget
981
+ const agentic = await client.intelligence.createReport({
982
+ subject: { type: "fleet", id: fleet.fleetId },
983
+ mode: "agentic",
984
+ budget: { billTo: "customer", maxUsd: 5 },
985
+ acknowledgeCost: true,
986
+ metadata: { experiment: "react-19-bump" },
987
+ });
988
+
989
+ // Shorthand for the agentic + budget pattern
990
+ const shortcut = await client.intelligence.createAgenticReport({
991
+ subject: { type: "sandbox", id: box.id },
992
+ maxUsd: 2,
993
+ });
994
+ ```
995
+
996
+ ### Poll a report to completion
997
+
998
+ Agentic reports return `status: "pending"` immediately. Either poll yourself with `getReport`, or use the built-in `waitForReport`:
999
+
1000
+ ```typescript
1001
+ const job = await client.intelligence.createAgenticReport({
1002
+ subject: { type: "fleet", id: fleet.fleetId },
1003
+ maxUsd: 5,
1004
+ });
1005
+
1006
+ const completed = await client.intelligence.waitForReport(job.jobId, {
1007
+ timeoutMs: 5 * 60 * 1000,
1008
+ pollMs: 2_000,
1009
+ });
1010
+
1011
+ if (completed.status === "completed") {
1012
+ console.log(completed.findings);
1013
+ console.log(completed.recommendedActions);
1014
+ }
1015
+ ```
1016
+
1017
+ ### List existing reports
1018
+
1019
+ ```typescript
1020
+ const recent = await client.intelligence.listReports({
1021
+ subjectType: "fleet",
1022
+ subjectId: fleet.fleetId,
1023
+ limit: 20,
1024
+ });
1025
+ ```
1026
+
1027
+ ### Per-subject shortcuts
1028
+
1029
+ `SandboxInstance` and `SandboxFleet` expose convenience wrappers so you don't have to thread `subject` manually:
1030
+
1031
+ ```typescript
1032
+ await box.createIntelligenceReport({ mode: "deterministic" });
1033
+ await box.createAgenticIntelligenceReport({ maxUsd: 2 });
1034
+
1035
+ await fleet.createIntelligenceReport({ mode: "deterministic" });
1036
+ await fleet.createAgenticIntelligenceReport({ maxUsd: 5 });
1037
+
1038
+ // Fleet helpers accept the v2 refinement fields directly:
1039
+ await fleet.createIntelligenceReport({
1040
+ mode: "deterministic",
1041
+ dispatchId: "pytest-run-1",
1042
+ });
1043
+ ```
1044
+
1045
+ Both wrappers send a `POST /v1/intelligence/reports` request with the right `subject` filled in.
1046
+
1047
+ ### Time windows and baselines
1048
+
1049
+ Every report can be narrowed by a time window and compared against a same-type baseline. The analyzer rejects mixed-type comparisons because the delta would be meaningless.
1050
+
1051
+ ```typescript
1052
+ // Bound the analysis to a one-hour window.
1053
+ await fleet.createIntelligenceReport({
1054
+ window: { since: Date.now() - 60 * 60 * 1000 },
1055
+ });
1056
+
1057
+ // Compare two dispatches of the same fleet against each other.
1058
+ await fleet.createIntelligenceReport({
1059
+ dispatchId: "run-after",
1060
+ compareTo: { type: "fleet", id: fleet.fleetId, dispatchId: "run-before" },
1061
+ });
1062
+
1063
+ // Sandbox baseline.
1064
+ await box.createIntelligenceReport({
1065
+ compareTo: { type: "sandbox", id: previousBox.id },
1066
+ });
1067
+ ```
1068
+
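The same-type rule and the time window shown above can be checked locally before sending a request. A minimal sketch, assuming the subject shapes used in this README. `assertSameTypeBaseline` and `lastHours` are hypothetical helpers, and the server-side analyzer remains the authority on what it accepts:

```typescript
type ReportSubject =
  | { type: "sandbox"; id: string }
  | { type: "fleet"; id: string; dispatchId?: string };

// Hypothetical guard mirroring the documented rule: a report and its baseline
// must have the same subject type.
function assertSameTypeBaseline(subject: ReportSubject, compareTo: ReportSubject): void {
  if (subject.type !== compareTo.type) {
    throw new Error(
      `Cannot compare a ${subject.type} report against a ${compareTo.type} baseline`,
    );
  }
}

// Hypothetical helper: builds a `window` value covering the last N hours,
// expressed in epoch milliseconds like the `Date.now()` example above.
function lastHours(hours: number, now: number = Date.now()): { since: number } {
  return { since: now - hours * 60 * 60 * 1000 };
}

// Two dispatches of the same fleet: allowed.
assertSameTypeBaseline(
  { type: "fleet", id: "fleet_abc", dispatchId: "run-after" },
  { type: "fleet", id: "fleet_abc", dispatchId: "run-before" },
);

// Fleet against sandbox: rejected before any request is made.
try {
  assertSameTypeBaseline({ type: "fleet", id: "fleet_abc" }, { type: "sandbox", id: "sbx_abc" });
} catch (error) {
  console.log((error as Error).message);
}
```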
1069
+ ### Cost before commit
1070
+
1071
+ Estimate cost without creating a report. Subject ownership is verified the same way as `createReport`, so the endpoint never becomes an existence oracle for foreign subjects.
1072
+
1073
+ ```typescript
1074
+ const estimate = await client.intelligence.estimateReport({
1075
+ subject: { type: "fleet", id: fleet.fleetId, dispatchId: "pytest-run-1" },
1076
+ mode: "agentic",
1077
+ });
1078
+ console.log(`Would cost ${estimate.costUsd} USD (${estimate.reason})`);
1079
+ ```
1080
+
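One way to use the estimate is as a gate in front of an agentic report. A minimal sketch, assuming the estimate exposes the `costUsd` and `reason` fields printed above; `withinBudget` and the `ReportEstimate` type are hypothetical and not SDK exports:

```typescript
interface ReportEstimate {
  costUsd: number;
  reason: string;
}

// Hypothetical gate: proceed only when the estimate is a finite number that
// does not exceed the budget the caller is willing to spend.
function withinBudget(estimate: ReportEstimate, maxUsd: number): boolean {
  return Number.isFinite(estimate.costUsd) && estimate.costUsd <= maxUsd;
}

const sampleEstimate: ReportEstimate = { costUsd: 1.25, reason: "example value" };
console.log(withinBudget(sampleEstimate, 2)); // true
console.log(withinBudget(sampleEstimate, 1)); // false
```

A caller would create the agentic report only when the gate returns `true`. The platform-side `budget.maxUsd` cap still applies either way, so this check is a convenience for avoiding jobs that are expected to stop at the budget.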
554
1081
  ## Error Handling
555
1082
 
556
1083
  ```typescript
@@ -602,6 +1129,38 @@ import type {
602
1129
  BatchResult,
603
1130
  BatchOptions,
604
1131
  UsageInfo,
1132
+ // Fleets
1133
+ CreateSandboxFleetOptions,
1134
+ CreateSandboxFleetWithCoordinatorOptions,
1135
+ SandboxFleetMachineSpec,
1136
+ SandboxFleetInfo,
1137
+ SandboxFleetManifest,
1138
+ SandboxFleetUsage,
1139
+ SandboxFleetCostEstimate,
1140
+ SandboxFleetToken,
1141
+ SandboxFleetTraceBundle,
1142
+ SandboxFleetTraceOptions,
1143
+ SandboxFleetDispatchResponse,
1144
+ FleetExecDispatchOptions,
1145
+ FleetExecDispatchResult,
1146
+ FleetPromptDispatchOptions,
1147
+ FleetPromptDispatchResult,
1148
+ FleetDispatchResultBuffer,
1149
+ FleetDispatchResultBufferOptions,
1150
+ FleetDispatchStreamOptions,
1151
+ FleetDispatchCancelResult,
1152
+ FleetMachineId,
1153
+ // Intelligence Reports
1154
+ IntelligenceReport,
1155
+ IntelligenceReportBudget,
1156
+ CreateIntelligenceReportOptions,
1157
+ } from "@tangle-network/sandbox";
1158
+
1159
+ // Concrete classes — useful when you need to reference the type itself
1160
+ import {
1161
+ IntelligenceClient,
1162
+ SandboxFleet,
1163
+ SandboxFleetClient,
605
1164
  } from "@tangle-network/sandbox";
606
1165
  ```
607
1166