@tangle-network/agent-runtime 0.23.0 → 0.25.0
- package/README.md +85 -498
- package/dist/agent.d.ts +5 -206
- package/dist/chunk-GLR25NG7.js +92 -0
- package/dist/chunk-GLR25NG7.js.map +1 -0
- package/dist/{chunk-7HN72MF3.js → chunk-QZEDHTT2.js} +2 -2
- package/dist/chunk-QZEDHTT2.js.map +1 -0
- package/dist/{chunk-IQHYOJU3.js → chunk-ZJACJZF7.js} +289 -1
- package/dist/chunk-ZJACJZF7.js.map +1 -0
- package/dist/improvement-adapter-CaZxFxTd.d.ts +207 -0
- package/dist/improvement.d.ts +120 -0
- package/dist/improvement.js +161 -0
- package/dist/improvement.js.map +1 -0
- package/dist/index.js +7 -1
- package/dist/index.js.map +1 -1
- package/dist/local-harness-KrdFTY5R.d.ts +82 -0
- package/dist/mcp/bin.js +2 -1
- package/dist/mcp/bin.js.map +1 -1
- package/dist/mcp/index.d.ts +190 -2
- package/dist/mcp/index.js +21 -13
- package/dist/mcp/index.js.map +1 -1
- package/package.json +17 -23
- package/dist/chunk-7HN72MF3.js.map +0 -1
- package/dist/chunk-IQHYOJU3.js.map +0 -1
package/README.md
CHANGED
|
@@ -1,551 +1,138 @@
|
|
|
1
1
|
# @tangle-network/agent-runtime
|
|
2
2
|
|
|
3
|
-
Production runtime substrate for domain agents. Owns the task lifecycle
|
|
4
|
-
(knowledge readiness, control loop, session resume, sanitized telemetry,
|
|
5
|
-
canonical `RuntimeRunRow` persistence + cost ledger), the chat-turn
|
|
6
|
-
engine (NDJSON envelope + product hooks), the chat-model catalog +
|
|
7
|
-
admission, and the declarative `defineAgent` manifest — so domain
|
|
8
|
-
repos stop inventing their own. Long-running execution durability
|
|
9
|
-
(reconnect, replay, dedup) lives in `@tangle-network/sandbox`.
|
|
3
|
+
Production runtime substrate for domain agents. Owns the chat-turn engine, task lifecycle, knowledge readiness, sanitized telemetry, OTEL export, model admission, and the declarative `defineAgent` manifest. Long-running execution durability lives in `@tangle-network/sandbox`.
|
|
10
4
|
|
|
11
5
|
```bash
|
|
12
|
-
pnpm add @tangle-network/agent-runtime @tangle-network/agent-eval
|
|
6
|
+
pnpm add @tangle-network/agent-runtime @tangle-network/agent-eval @tangle-network/sandbox
|
|
13
7
|
```
|
|
14
8
|
|
|
15
|
-
##
|
|
9
|
+
## Hello world
|
|
16
10
|
|
|
17
|
-
|
|
18
|
-
|---|---|
|
|
19
|
-
| `runAgentTask` | Single-shot adapter-driven task with eval/verification |
|
|
20
|
-
| `runAgentTaskStream` | Streaming product loop with session resume + backends |
|
|
21
|
-
| `handleChatTurn` | Framework-neutral chat-turn orchestrator (NDJSON + `session.run.*` envelope + product hooks) |
|
|
22
|
-
| `deriveExecutionId` | Stable substrate executionId for `X-Execution-ID` cross-process reconnect |
|
|
23
|
-
| `startRuntimeRun` | Canonical production-run row + cost ledger |
|
|
24
|
-
| `defineAgent` | Declarative per-vertical agent manifest — surfaces, knowledge, rubric, run fn |
|
|
25
|
-
| `createMcpServer` (`/mcp`) + `agent-runtime-mcp` bin | Stdio MCP server with the 5 delegation tools (`delegate_code`, `delegate_research`, `delegate_feedback`, `delegation_status`, `delegation_history`) |
|
|
26
|
-
| `resolveChatModel` / `validateChatModelId` / `getModels` | Router catalog fetch + fail-closed admission + precedence resolver |
|
|
27
|
-
| `decideKnowledgeReadiness` | `ready` / `blocked` / `caveat` branch for routes / UI |
|
|
28
|
-
| `createOpenAICompatibleBackend` | OpenAI-compatible streaming backend (TCloud / cli-bridge) |
|
|
29
|
-
| `createSandboxPromptBackend` | Sandbox / sidecar `streamPrompt` clients |
|
|
30
|
-
| `createRuntimeStreamEventCollector` | Default-redacted sanitized telemetry over a stream |
|
|
31
|
-
| `PlatformAuthClient` + `PlatformHubClient` (`/platform`) | Cross-site SSO + integrations hub |
|
|
32
|
-
|
|
33
|
-
Every public export is annotated `@stable` or `@experimental`. `@stable`
|
|
34
|
-
exports do not change shape inside a minor; `@experimental` exports may
|
|
35
|
-
change inside a minor and require a deliberate consumer bump.
|
|
36
|
-
|
|
37
|
-
## Quickstart
|
|
38
|
-
|
|
39
|
-
```ts
|
|
40
|
-
import { runAgentTask } from '@tangle-network/agent-runtime'
|
|
41
|
-
|
|
42
|
-
const result = await runAgentTask({
|
|
43
|
-
task: { id: 'review-2026-return', intent: 'Review the return', domain: 'tax' },
|
|
44
|
-
adapter: {
|
|
45
|
-
async observe() { return { /* domain state */ } },
|
|
46
|
-
async validate({ state }) { return [/* eval results */] },
|
|
47
|
-
async decide({ state }) { return { type: 'stop', pass: true, score: 1, reason: 'done' } },
|
|
48
|
-
async act() { return undefined },
|
|
49
|
-
},
|
|
50
|
-
})
|
|
51
|
-
console.log(result.status, result.runRecords)
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
## Chat turns
|
|
55
|
-
|
|
56
|
-
`handleChatTurn` wraps a product `produce()` hook with the `session.run.*`
|
|
57
|
-
lifecycle envelope, drains the producer stream through the NDJSON line
|
|
58
|
-
protocol, and calls the persist / post-process hooks after drain.
|
|
59
|
-
Framework-neutral: takes already-resolved values, never a `Request` or
|
|
60
|
-
`Context`.
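As a rough illustration of the envelope-plus-NDJSON idea in the paragraph above, the sketch below serializes a list of producer events as newline-delimited JSON, bracketed by lifecycle lines. The event names `session.run.started` and `session.run.completed`, the `TurnEvent` type, and the `encodeTurnAsNdjson` helper are assumptions made for illustration; they are not taken from the package, whose real `handleChatTurn` works on streams rather than arrays.

```ts
// Hypothetical event shape; the real runtime's event types may differ.
type TurnEvent = { type: string; [key: string]: unknown };

// Serialize one event as a single NDJSON line: one JSON object plus "\n".
// JSON.stringify escapes newlines inside strings, so each event stays on one line.
function toNdjsonLine(event: TurnEvent): string {
  return JSON.stringify(event) + "\n";
}

// Wrap the produced events in an assumed start/complete lifecycle envelope.
function encodeTurnAsNdjson(sessionId: string, produced: TurnEvent[]): string {
  const lines: string[] = [];
  lines.push(toNdjsonLine({ type: "session.run.started", sessionId }));
  for (const event of produced) {
    lines.push(toNdjsonLine(event));
  }
  lines.push(toNdjsonLine({ type: "session.run.completed", sessionId }));
  return lines.join("");
}

const body = encodeTurnAsNdjson("thread-1", [
  { type: "text", text: "hello" },
  { type: "text", text: "line\nbreak" },
]);
```

A client can split `body` on `"\n"` and parse each non-empty line independently, which is what makes the format convenient for streaming responses.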
|
|
11
|
+
Every product agent is a `handleChatTurn` call inside a route. This 20-line snippet is what gtm / creative / legal / tax all run:
|
|
61
12
|
|
|
62
13
|
```ts
|
|
63
14
|
import { handleChatTurn } from '@tangle-network/agent-runtime'
|
|
64
15
|
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
return new Response(result.body, { headers: { 'content-type': result.contentType } })
|
|
79
|
-
```
|
|
80
|
-
|
|
81
|
-
## Execution continuity
|
|
82
|
-
|
|
83
|
-
Long-running execution durability — reconnect, replay, dedup — lives in
|
|
84
|
-
the substrate. `@tangle-network/sandbox`'s `box.streamPrompt`
|
|
85
|
-
auto-reconnects in-call (extracts `executionId` from the response and
|
|
86
|
-
replays via the runtime endpoint on drop). Cross-process reconnect —
|
|
87
|
-
worker dies, a fresh worker resumes the same execution — requires
|
|
88
|
-
either bypassing the SDK and POSTing directly with `X-Execution-ID`
|
|
89
|
-
(see `tax-agent/sessions.ts`) or a future SDK release that surfaces the
|
|
90
|
-
field on `PromptOptions`.
|
|
91
|
-
|
|
92
|
-
`deriveExecutionId` is the convention helper for the stable id the
|
|
93
|
-
product persists alongside its session row:
|
|
94
|
-
|
|
95
|
-
```ts
|
|
96
|
-
import { deriveExecutionId } from '@tangle-network/agent-runtime'
|
|
97
|
-
|
|
98
|
-
const executionId = deriveExecutionId({ projectId, sessionId, turnIndex })
|
|
99
|
-
// pass as `X-Execution-ID` header when calling the orchestrator directly
|
|
100
|
-
```
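The property that matters for cross-process reconnect is that the same `(projectId, sessionId, turnIndex)` triple always yields the same id. The sketch below shows one way such a stable id could be computed. It is not the package's `deriveExecutionId` implementation: `sketchExecutionId`, the `exec_` prefix, and the FNV-1a-style hash are all assumptions chosen to keep the example dependency-free.

```ts
interface ExecutionKey {
  projectId: string;
  sessionId: string;
  turnIndex: number;
}

// FNV-1a-style 32-bit hash over UTF-16 code units. Not cryptographic;
// used here only to get a short deterministic digest without imports.
function fnv1a32(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Hypothetical stand-in for deriveExecutionId: canonicalize, then hash.
// JSON.stringify of an array avoids delimiter ambiguity between fields.
function sketchExecutionId(key: ExecutionKey): string {
  const canonical = JSON.stringify([key.projectId, key.sessionId, key.turnIndex]);
  return "exec_" + fnv1a32(canonical);
}

const firstAttempt = sketchExecutionId({ projectId: "p1", sessionId: "s1", turnIndex: 0 });
const afterRestart = sketchExecutionId({ projectId: "p1", sessionId: "s1", turnIndex: 0 });
const nextTurn = sketchExecutionId({ projectId: "p1", sessionId: "s1", turnIndex: 1 });
```

Because the id depends only on persisted fields, a fresh worker can recompute it after a crash and send it as `X-Execution-ID` without having stored the id itself.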
|
|
101
|
-
|
|
102
|
-
## Chat-model resolution
|
|
103
|
-
|
|
104
|
-
One primitive every chat handler needs and was hand-rolling per repo:
|
|
105
|
-
router catalog fetch, malformed-id guard, fail-closed catalog admission,
|
|
106
|
-
precedence resolver. Policy-free — the caller passes its own precedence
|
|
107
|
-
order and known-good allowlist.
|
|
108
|
-
|
|
109
|
-
```ts
|
|
110
|
-
import {
|
|
111
|
-
resolveChatModel, resolveRouterBaseUrl, validateChatModelId, getModels,
|
|
112
|
-
} from '@tangle-network/agent-runtime'
|
|
113
|
-
|
|
114
|
-
const routerBaseUrl = resolveRouterBaseUrl(env)
|
|
115
|
-
const { model, source } = resolveChatModel(
|
|
116
|
-
[
|
|
117
|
-
{ source: 'request', model: requestBody.model },
|
|
118
|
-
{ source: 'workspace', model: workspace.pinnedModel },
|
|
119
|
-
{ source: 'env', model: env.TCLOUD_CHAT_MODEL },
|
|
120
|
-
],
|
|
121
|
-
{ source: 'default', model: 'claude-sonnet-4-6' },
|
|
122
|
-
)
|
|
123
|
-
const validation = await validateChatModelId(model, {
|
|
124
|
-
routerBaseUrl,
|
|
125
|
-
allowlist: ['claude-sonnet-4-6'],
|
|
126
|
-
})
|
|
127
|
-
if (!validation.succeeded) throw new ConfigError(validation.error)
|
|
128
|
-
```
|
|
129
|
-
|
|
130
|
-
Full runnable: [`examples/model-resolution/`](./examples/model-resolution/).
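To make the precedence idea concrete, here is a minimal sketch of a "first usable candidate wins" resolver with the same call shape as the snippet above. It is an illustration only: `resolveFirstModel` is a hypothetical helper, and treating blank or missing values as "not set" is an assumption about how the real `resolveChatModel` behaves.

```ts
interface ModelCandidate {
  source: string;
  model: string | null | undefined;
}

interface ResolvedModel {
  source: string;
  model: string;
}

// Walk candidates in caller-supplied order; the first non-blank model wins.
// If none qualifies, return the fallback unchanged.
function resolveFirstModel(
  candidates: ModelCandidate[],
  fallback: ResolvedModel,
): ResolvedModel {
  for (const candidate of candidates) {
    const model = typeof candidate.model === "string" ? candidate.model.trim() : "";
    if (model.length > 0) {
      return { source: candidate.source, model };
    }
  }
  return fallback;
}

// "example-model-from-env" is a placeholder id, not a real catalog entry.
const resolved = resolveFirstModel(
  [
    { source: "request", model: undefined },
    { source: "workspace", model: "   " },
    { source: "env", model: "example-model-from-env" },
  ],
  { source: "default", model: "claude-sonnet-4-6" },
);

const resolvedFallback = resolveFirstModel(
  [{ source: "request", model: null }],
  { source: "default", model: "claude-sonnet-4-6" },
);
```

Returning the `source` alongside the model lets a handler log which layer supplied the id before it runs catalog admission on the result.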
|
|
131
|
-
|
|
132
|
-
## Define an agent — declarative manifest
|
|
133
|
-
|
|
134
|
-
`defineAgent` is the per-vertical layer that pairs a runtime adapter with
|
|
135
|
-
the surfaces / knowledge / rubric / outcome contract `agent-eval`'s analyst
|
|
136
|
-
loop drives improvement against.
|
|
137
|
-
|
|
138
|
-
```ts
|
|
139
|
-
import { defineAgent } from '@tangle-network/agent-runtime/agent'
|
|
140
|
-
|
|
141
|
-
export const myAgent = defineAgent({
|
|
142
|
-
id: 'legal-agent',
|
|
143
|
-
surfaces: { /* prompt, tools, skills — the levers an analyst can edit */ },
|
|
144
|
-
knowledge: { /* requirements + provider */ },
|
|
145
|
-
rubric: { /* dimensions + weights */ },
|
|
146
|
-
run: async (ctx) => {
|
|
147
|
-
/* product-specific run — typically wraps handleChatTurn or runAgentTaskStream */
|
|
148
|
-
},
|
|
149
|
-
})
|
|
150
|
-
```
|
|
151
|
-
|
|
152
|
-
## Canonical production-run lifecycle
|
|
153
|
-
|
|
154
|
-
`startRuntimeRun` records what the agent did for a customer, what it
|
|
155
|
-
cost, and how it ended. Replaces bespoke `agentRuns` helpers across
|
|
156
|
-
consumer repos.
|
|
157
|
-
|
|
158
|
-
```ts
|
|
159
|
-
import { startRuntimeRun, runAgentTaskStream } from '@tangle-network/agent-runtime'
|
|
160
|
-
|
|
161
|
-
const run = startRuntimeRun({
|
|
162
|
-
workspaceId: 'ws-1', sessionId: threadId, agentId: 'legal-chat-runtime',
|
|
163
|
-
taskSpec, scenarioId: `legal-chat:${threadId}`,
|
|
164
|
-
adapter: { upsert: (row) => db.insert(agentRuns).values(row) },
|
|
165
|
-
})
|
|
166
|
-
for await (const event of runAgentTaskStream({ task: taskSpec, backend, input })) {
|
|
167
|
-
run.observe(event)
|
|
168
|
-
if (event.type === 'final') {
|
|
169
|
-
run.complete({ status: event.status === 'completed' ? 'completed' : 'failed', resultSummary: event.text ?? '' })
|
|
170
|
-
}
|
|
171
|
-
}
|
|
172
|
-
await run.persist({ runtimeEvents: telemetry.events })
|
|
173
|
-
```
|
|
174
|
-
|
|
175
|
-
Full runnable: [`examples/runtime-run/`](./examples/runtime-run/).
|
|
176
|
-
|
|
177
|
-
## Delegation tools (MCP)
|
|
178
|
-
|
|
179
|
-
`@tangle-network/agent-runtime/mcp` ships a stdio MCP server that exposes
|
|
180
|
-
five delegation tools to a sandbox coding-harness agent (claude-code,
|
|
181
|
-
codex, opencode, ...). The product agent itself runs inside a sandbox
|
|
182
|
-
during a chat; when it needs a long-running coder or researcher loop, it
|
|
183
|
-
calls one of these tools instead of doing the work in-line.
|
|
184
|
-
|
|
185
|
-
| Tool | Kind | Use |
|
|
186
|
-
|---|---|---|
|
|
187
|
-
| `delegate_code` | async | Code-modification task — returns a `taskId`; poll `delegation_status` for the patch |
|
|
188
|
-
| `delegate_research` | async | Source-grounded research task — returns a `taskId`; poll for items + citations |
|
|
189
|
-
| `delegate_feedback` | sync | Append an agent/user/judge rating against a delegation, artifact, or outcome |
|
|
190
|
-
| `delegation_status` | sync | Snapshot of a delegation's state machine (`pending` → `running` → `completed` \| `failed` \| `cancelled`) |
|
|
191
|
-
| `delegation_history` | sync | Newest-first read of past delegations + attached feedback |
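The state machine in the `delegation_status` row can be sketched as a small transition table plus a poll-until-terminal helper. This is an illustration under assumptions: the type and function names are hypothetical, only the transitions literally drawn in the table are allowed (so `pending` cannot jump straight to `cancelled` here, which the real server may permit), and the poll helper is synchronous for brevity where a real caller would await the tool call and sleep between polls.

```ts
type DelegationState = "pending" | "running" | "completed" | "failed" | "cancelled";

// Allowed next states, following pending -> running -> completed | failed | cancelled.
const NEXT_STATES: Record<DelegationState, DelegationState[]> = {
  pending: ["running"],
  running: ["completed", "failed", "cancelled"],
  completed: [],
  failed: [],
  cancelled: [],
};

// A state is terminal when nothing can follow it.
function isTerminal(state: DelegationState): boolean {
  return NEXT_STATES[state].length === 0;
}

function canTransition(from: DelegationState, to: DelegationState): boolean {
  return NEXT_STATES[from].indexOf(to) !== -1;
}

// Call readStatus up to maxPolls times, stopping early on a terminal state.
function pollUntilTerminal(
  readStatus: () => DelegationState,
  maxPolls: number,
): DelegationState {
  let state = readStatus();
  for (let polls = 1; polls < maxPolls && !isTerminal(state); polls++) {
    state = readStatus();
  }
  return state;
}

// Scripted status source standing in for repeated delegation_status calls.
const scripted: DelegationState[] = ["pending", "running", "running", "completed"];
let cursor = 0;
const finalState = pollUntilTerminal(
  () => scripted[Math.min(cursor++, scripted.length - 1)] as DelegationState,
  10,
);
```

Bounding the loop with `maxPolls` keeps an agent from spinning forever if a delegation never reaches a terminal state.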
|
|
192
|
-
|
|
193
|
-
Mount the server from a Node entry point:
|
|
194
|
-
|
|
195
|
-
```ts
|
|
196
|
-
import { Sandbox } from '@tangle-network/sandbox'
|
|
197
|
-
import {
|
|
198
|
-
createMcpServer,
|
|
199
|
-
createDefaultCoderDelegate,
|
|
200
|
-
} from '@tangle-network/agent-runtime/mcp'
|
|
201
|
-
|
|
202
|
-
const sandboxClient = new Sandbox({ apiKey: process.env.TANGLE_API_KEY! })
|
|
203
|
-
const server = createMcpServer({
|
|
204
|
-
coderDelegate: createDefaultCoderDelegate({ sandboxClient }),
|
|
205
|
-
// researcherDelegate: wire your own — see below.
|
|
206
|
-
})
|
|
207
|
-
await server.serve() // reads JSON-RPC from stdin, writes responses to stdout
|
|
208
|
-
```
|
|
209
|
-
|
|
210
|
-
Or run the ready-made bin:
|
|
211
|
-
|
|
212
|
-
```bash
|
|
213
|
-
TANGLE_API_KEY=sk_sandbox_... agent-runtime-mcp
|
|
214
|
-
```
|
|
215
|
-
|
|
216
|
-
### Surfacing the tools through `createOpenAICompatibleBackend`
|
|
217
|
-
|
|
218
|
-
Sandbox callers discover MCP tools through the runtime mount. Callers that
|
|
219
|
-
route through the OpenAI-compat backend (tcloud, OpenRouter, cli-bridge,
|
|
220
|
-
OpenAI direct) must hand the model an explicit `tools[]` array — the
|
|
221
|
-
backend does not auto-discover. `mcpToolsForRuntimeMcp()` returns the
|
|
222
|
-
canonical projection so the model can call any of the 5 delegation tools
|
|
223
|
-
through the OpenAI-compat path:
|
|
224
|
-
|
|
225
|
-
```ts
|
|
226
|
-
import {
|
|
227
|
-
createOpenAICompatibleBackend,
|
|
228
|
-
mcpToolsForRuntimeMcp,
|
|
229
|
-
} from '@tangle-network/agent-runtime'
|
|
230
|
-
|
|
231
|
-
const backend = createOpenAICompatibleBackend({
|
|
232
|
-
apiKey,
|
|
233
|
-
baseUrl,
|
|
234
|
-
model,
|
|
235
|
-
tools: mcpToolsForRuntimeMcp(),
|
|
236
|
-
})
|
|
237
|
-
```
|
|
238
|
-
|
|
239
|
-
Use `mcpToolsForRuntimeMcpSubset(['delegate_research', 'delegation_status'])`
|
|
240
|
-
when you want a curated subset (e.g. read-only research without the coder
|
|
241
|
-
queue).
|
|
242
|
-
|
|
243
|
-
The bin auto-wires the coder delegate and, when
|
|
244
|
-
`@tangle-network/agent-knowledge` is installed as a peer, the researcher
|
|
245
|
-
delegate. Environment knobs:
|
|
246
|
-
|
|
247
|
-
- `TANGLE_API_KEY` — required (unless both `MCP_DISABLE_*` are set)
|
|
248
|
-
- `SANDBOX_BASE_URL` — sandbox-SDK base URL override
|
|
249
|
-
- `TANGLE_FLEET_ID` — switches placement from sibling-sandbox to fleet-workspace (see [Placement modes](#placement-modes))
|
|
250
|
-
- `TANGLE_FLEET_EXCLUDE_MACHINES` — comma-separated machine ids to skip during fleet-mode round-robin (typically the coordinator)
|
|
251
|
-
- `MCP_MAX_CONCURRENT_SANDBOXES` — kernel `maxConcurrency` cap (default 4)
|
|
252
|
-
- `MCP_CODER_FANOUT_HARNESSES` — comma-separated harness ids for `variants > 1`
|
|
253
|
-
- `MCP_DISABLE_CODER` / `MCP_DISABLE_RESEARCHER` — omit the matching tool
|
|
254
|
-
|
|
255
|
-
### Placement modes
|
|
256
|
-
|
|
257
|
-
Where worker iterations land — sibling sandboxes vs the caller's fleet
|
|
258
|
-
workspace — is controlled by `TANGLE_FLEET_ID`.
|
|
259
|
-
|
|
260
|
-
**Sibling-sandbox mode (default).** No `TANGLE_FLEET_ID` set. Every
|
|
261
|
-
`delegate_code` / `delegate_research` call invokes `sandboxClient.create(...)`
|
|
262
|
-
and runs the worker in a fresh sandbox. The worker's diff lives in the
|
|
263
|
-
worker's filesystem; the caller pulls it back via the structured tool
|
|
264
|
-
result. Use this when the MCP server runs as a standalone CLI mounted
|
|
265
|
-
outside a fleet (developer workflows, single-process integrations).
|
|
266
|
-
|
|
267
|
-
**Fleet-workspace mode.** `TANGLE_FLEET_ID` set by the parent sandbox when
|
|
268
|
-
it launches the MCP server. Each delegation dispatches onto an existing
|
|
269
|
-
machine in that fleet via `fleet.sandbox(machineId).streamPrompt(...)`.
|
|
270
|
-
The fleet's shared-workspace policy means worker machines mount the same
|
|
271
|
-
filesystem as the caller — diffs land in-place, no cross-sandbox copy
|
|
272
|
-
step. The bin logs `fleet-aware delegation: fleetId=...` to stderr on
|
|
273
|
-
startup so the operator can confirm the placement.
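The two placement modes can be summarized as a small decision on the environment plus a round-robin pick over fleet machines. The sketch below is a simplified model, not the package's `detectExecutor`: `choosePlacement`, `createRoundRobinPicker`, and the `Placement` type are hypothetical names, and the machine ids are made up.

```ts
type EnvLike = { [name: string]: string | undefined };

type Placement =
  | { mode: "sibling" }
  | { mode: "fleet"; fleetId: string; excludeMachineIds: string[] };

// Split a comma-separated env value, dropping blanks and surrounding spaces.
function parseCsv(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Fleet mode when TANGLE_FLEET_ID is set to a non-blank value, else sibling mode.
function choosePlacement(env: EnvLike): Placement {
  const fleetId = (env["TANGLE_FLEET_ID"] || "").trim();
  if (fleetId.length === 0) return { mode: "sibling" };
  return {
    mode: "fleet",
    fleetId,
    excludeMachineIds: parseCsv(env["TANGLE_FLEET_EXCLUDE_MACHINES"]),
  };
}

// Cycle through eligible machines in order, skipping excluded ids.
function createRoundRobinPicker(machineIds: string[], exclude: string[]): () => string {
  const eligible = machineIds.filter((id) => exclude.indexOf(id) === -1);
  if (eligible.length === 0) throw new Error("no eligible fleet machines");
  let next = 0;
  return () => {
    const id = eligible[next] as string;
    next = (next + 1) % eligible.length;
    return id;
  };
}

const siblingPlacement = choosePlacement({});
const fleetPlacement = choosePlacement({
  TANGLE_FLEET_ID: "fleet-1",
  TANGLE_FLEET_EXCLUDE_MACHINES: "coordinator, m3",
});
const pick = createRoundRobinPicker(["coordinator", "m1", "m2"], ["coordinator"]);
const picks = [pick(), pick(), pick()];
```

Excluding the coordinator up front, rather than at pick time, means the rotation never lands on the machine that is running the MCP server itself.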
|
|
274
|
-
|
|
275
|
-
Pass `TANGLE_FLEET_ID` from a parent sandbox's `AgentProfile.mcpServers`
|
|
276
|
-
config:
|
|
277
|
-
|
|
278
|
-
```ts
|
|
279
|
-
import { defineAgentProfile } from '@tangle-network/sandbox'
|
|
280
|
-
|
|
281
|
-
const parentProfile = defineAgentProfile({
|
|
282
|
-
name: 'tax-orchestrator',
|
|
283
|
-
mcp: {
|
|
284
|
-
'agent-runtime': {
|
|
285
|
-
transport: 'stdio',
|
|
286
|
-
command: 'agent-runtime-mcp',
|
|
287
|
-
env: {
|
|
288
|
-
TANGLE_API_KEY: '${TANGLE_API_KEY}',
|
|
289
|
-
TANGLE_FLEET_ID: '${TANGLE_FLEET_ID}', // injected by orchestrator
|
|
290
|
-
TANGLE_FLEET_EXCLUDE_MACHINES: 'coordinator', // skip the machine running this MCP server
|
|
291
|
-
},
|
|
16
|
+
export async function POST({ request, env, ctx }: { request: Request; env: Env; ctx: ExecutionContext }) {
|
|
17
|
+
const { workspaceId, threadId, userMessage } = await request.json()
|
|
18
|
+
const box = await ensureWorkspaceSandbox(workspaceId)
|
|
19
|
+
|
|
20
|
+
const result = handleChatTurn({
|
|
21
|
+
identity: { tenantId: workspaceId, sessionId: threadId, userId: 'demo', turnIndex: 0 },
|
|
22
|
+
hooks: {
|
|
23
|
+
produce: () => ({
|
|
24
|
+
stream: box.streamPrompt(userMessage),
|
|
25
|
+
finalText: () => box.lastResponse(),
|
|
26
|
+
}),
|
|
27
|
+
persistAssistantMessage: async ({ identity, finalText }) => env.db.insertMessage(identity, finalText),
|
|
28
|
+
traceFlush: () => env.traceSink.flush(),
|
|
292
29
|
},
|
|
293
|
-
|
|
294
|
-
})
|
|
295
|
-
|
|
296
|
-
|
|
297
|
-
For non-bin entry points, wire an executor directly:
|
|
298
|
-
|
|
299
|
-
```ts
|
|
300
|
-
import { Sandbox } from '@tangle-network/sandbox'
|
|
301
|
-
import {
|
|
302
|
-
createMcpServer,
|
|
303
|
-
createDefaultCoderDelegate,
|
|
304
|
-
createFleetWorkspaceExecutor,
|
|
305
|
-
createSiblingSandboxExecutor,
|
|
306
|
-
detectExecutor,
|
|
307
|
-
} from '@tangle-network/agent-runtime/mcp'
|
|
308
|
-
|
|
309
|
-
const sandboxClient = new Sandbox({ apiKey: process.env.TANGLE_API_KEY! })
|
|
310
|
-
|
|
311
|
-
// Either pick automatically from env:
|
|
312
|
-
const executor = await detectExecutor({ sandboxClient })
|
|
313
|
-
|
|
314
|
-
// Or pin it explicitly:
|
|
315
|
-
const fleet = await sandboxClient.fleets.get(process.env.TANGLE_FLEET_ID!)
|
|
316
|
-
const fleetExecutor = createFleetWorkspaceExecutor({
|
|
317
|
-
fleet,
|
|
318
|
-
excludeMachineIds: ['coordinator'],
|
|
319
|
-
})
|
|
320
|
-
|
|
321
|
-
const server = createMcpServer({
|
|
322
|
-
coderDelegate: createDefaultCoderDelegate({ executor: fleetExecutor }),
|
|
323
|
-
})
|
|
30
|
+
waitUntil: ctx.waitUntil.bind(ctx),
|
|
31
|
+
})
|
|
32
|
+
return new Response(result.body, { headers: { 'content-type': result.contentType } })
|
|
33
|
+
}
|
|
324
34
|
```
|
|
325
35
|
|
|
326
|
-
|
|
327
|
-
iteration: `{ placement: 'sibling', sandboxId }` in sibling mode,
|
|
328
|
-
`{ placement: 'fleet', fleetId, machineId, sandboxId }` in fleet mode.
|
|
329
|
-
Analyst loops use this to correlate worker activity with the caller's
|
|
330
|
-
machine.
|
|
36
|
+
That's the centerpiece. Everything else is "when chat alone isn't enough."
|
|
331
37
|
|
|
332
|
-
|
|
333
|
-
|
|
334
|
-
Coder + researcher delegations are **fire-and-poll**. The handler returns
|
|
335
|
-
a `taskId` immediately; the agent calls `delegation_status(taskId)` until
|
|
336
|
-
the state is terminal. Identical inputs return the same `taskId` —
|
|
337
|
-
duplicate-call safety is built in via canonical-form hashing.
|
|
38
|
+
## Which entry point do I reach for?
|
|
338
39
|
|
|
339
40
|
```
|
|
340
|
-
|
|
341
|
-
agent →
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
|
|
41
|
+
Production chat turn (90% of products) → handleChatTurn
|
|
42
|
+
Declarative agent manifest → defineAgent (/agent)
|
|
43
|
+
Cross-process reconnect (X-Execution-ID) → deriveExecutionId
|
|
44
|
+
One-shot task with verification + eval → runAgentTask
|
|
45
|
+
Streaming task without chat-turn envelope → runAgentTaskStream
|
|
46
|
+
Multi-iteration parallel fanout (coders /
|
|
47
|
+
researchers proposing N variants) → runLoop + a Driver (/loops)
|
|
48
|
+
Tool/MCP delegation server (stdio) → createMcpServer (/mcp)
|
|
49
|
+
Analyst surface mutations → runAnalystLoop (/analyst-loop)
|
|
50
|
+
Production-run persistence + cost ledger → startRuntimeRun
|
|
51
|
+
Cross-site SSO / integrations hub → PlatformAuthClient (/platform)
|
|
345
52
|
```
|
|
346
53
|
|
|
347
|
-
|
|
348
|
-
pending delegations — Phase 2 will move state into sqlite.
|
|
54
|
+
## Defaults
|
|
349
55
|
|
|
350
|
-
|
|
56
|
+
When nothing is specified:
|
|
351
57
|
|
|
352
|
-
|
|
353
|
-
|
|
354
|
-
|
|
58
|
+
| Knob | Default | Override |
|
|
59
|
+
|---|---|---|
|
|
60
|
+
| Backend model | `gpt-4o-mini` (when via `createOpenAICompatibleBackend`) | `model` option, or `MODEL_NAME` env |
|
|
61
|
+
| Backend provider | `openai-compat` when `TANGLE_API_KEY` present, else `openai` if `OPENAI_API_KEY` | `MODEL_PROVIDER` env |
|
|
62
|
+
| Router base URL | `https://router.tangle.tools/v1` | `TANGLE_ROUTER_BASE_URL` env |
|
|
63
|
+
| Sandbox base URL | `https://sandbox.tangle.tools` | `SANDBOX_API_URL` env |
|
|
64
|
+
| Loop iteration cap | 8 | `runLoop({ maxIterations })` |
|
|
65
|
+
| Driver | none — required to pass `Refine` or `FanoutVote` | `createRefineDriver()` or `createFanoutVoteDriver({ n })` |
|
|
66
|
+
| Validator | none — required if using `runLoop` | profile preset (e.g., `coderProfile().validator`) or your own |
|
|
67
|
+
| OTEL export | off | set `OTEL_EXPORTER_OTLP_ENDPOINT` |
|
|
68
|
+
| Trace propagation through MCP subprocess | off until product wires it | `env.TRACE_ID` + `env.PARENT_SPAN_ID` at MCP launch |
|
|
355
69
|
|
|
356
|
-
|
|
357
|
-
import { runLoop } from '@tangle-network/agent-runtime/loops'
|
|
358
|
-
import { researcherProfile, multiHarnessResearcherFanout } from '@tangle-network/agent-knowledge/profiles'
|
|
359
|
-
import { createMcpServer, type ResearcherDelegate } from '@tangle-network/agent-runtime/mcp'
|
|
360
|
-
|
|
361
|
-
const researcherDelegate: ResearcherDelegate = async (args, ctx) => {
|
|
362
|
-
const task = {
|
|
363
|
-
question: args.question,
|
|
364
|
-
knowledgeNamespace: args.namespace,
|
|
365
|
-
scope: args.scope,
|
|
366
|
-
sources: args.sources,
|
|
367
|
-
/* ...map config.recencyWindow ISO strings to Date objects */
|
|
368
|
-
}
|
|
369
|
-
if ((args.variants ?? 1) <= 1) {
|
|
370
|
-
const preset = researcherProfile({ task })
|
|
371
|
-
const result = await runLoop({
|
|
372
|
-
driver: { /* single-shot */ async plan(t, h) { return h.length === 0 ? [t] : [] }, decide(h) { return h.length > 0 ? 'pick-winner' : 'fail' } },
|
|
373
|
-
agentRun: preset.agentRunSpec, output: preset.output, validator: preset.validator,
|
|
374
|
-
task, ctx: { sandboxClient, signal: ctx.signal }, maxIterations: 1,
|
|
375
|
-
})
|
|
376
|
-
return result.winner!.output
|
|
377
|
-
}
|
|
378
|
-
const fanout = multiHarnessResearcherFanout({ task })
|
|
379
|
-
const result = await runLoop({
|
|
380
|
-
driver: fanout.driver,
|
|
381
|
-
agentRuns: fanout.agentRuns.slice(0, args.variants),
|
|
382
|
-
output: fanout.output, validator: fanout.validator,
|
|
383
|
-
task, ctx: { sandboxClient, signal: ctx.signal },
|
|
384
|
-
maxIterations: args.variants ?? 1,
|
|
385
|
-
})
|
|
386
|
-
return result.winner!.output
|
|
387
|
-
}
|
|
70
|
+
## Composition with the rest of the stack
|
|
388
71
|
|
|
389
|
-
createMcpServer({ researcherDelegate })
|
|
390
72
|
```
|
|
73
|
+
agent-runtime ──── handleChatTurn (chat turn lifecycle)
|
|
74
|
+
defineAgent (declarative manifest)
|
|
75
|
+
runLoop (multi-shot kernel)
|
|
76
|
+
createMcpServer (delegation tools server)
|
|
77
|
+
OTEL export (trace pipeline)
|
|
391
78
|
|
|
392
|
-
|
|
79
|
+
agent-eval ──── runEvalCampaign / runProductionLoop / runAgentMatrix
|
|
80
|
+
(consumes agent-runtime traces, scores, gates promotion)
|
|
393
81
|
|
|
394
|
-
|
|
395
|
-
|
|
396
|
-
(both OpenAI delta shape and the Anthropic `tool_use` shape proxied by
|
|
397
|
-
the router) are assembled across SSE chunks and emitted as a single
|
|
398
|
-
`tool_call` RuntimeStreamEvent per call. The backend does NOT execute
|
|
399
|
-
tools — surfacing the call is the contract; dispatch is the caller's
|
|
400
|
-
problem.
|
|
82
|
+
agent-knowledge ─── proposeKnowledgeWrites / applyKnowledgeWriteBlocks
|
|
83
|
+
(analyst-loop produces these; runtime consumes them)
|
|
401
84
|
|
|
402
|
-
|
|
403
|
-
|
|
404
|
-
createOpenAICompatibleBackend,
|
|
405
|
-
runAgentTaskStream,
|
|
406
|
-
type OpenAIChatTool,
|
|
407
|
-
} from '@tangle-network/agent-runtime'
|
|
408
|
-
|
|
409
|
-
const delegateResearch: OpenAIChatTool = {
|
|
410
|
-
type: 'function',
|
|
411
|
-
function: {
|
|
412
|
-
name: 'delegate_research',
|
|
413
|
-
description: 'Spin up a researcher loop and return a taskId.',
|
|
414
|
-
parameters: {
|
|
415
|
-
type: 'object',
|
|
416
|
-
properties: { question: { type: 'string' } },
|
|
417
|
-
required: ['question'],
|
|
418
|
-
},
|
|
419
|
-
},
|
|
420
|
-
}
|
|
421
|
-
|
|
422
|
-
const backend = createOpenAICompatibleBackend({
|
|
423
|
-
apiKey: process.env.TANGLE_API_KEY!,
|
|
424
|
-
baseUrl: 'https://router.tangle.tools/v1',
|
|
425
|
-
model: 'claude-sonnet-4-6',
|
|
426
|
-
tools: [delegateResearch /* + delegate_code, delegate_feedback, etc. */],
|
|
427
|
-
toolChoice: 'auto', // or 'none' | 'required' | { type: 'function', function: { name } }
|
|
428
|
-
})
|
|
429
|
-
|
|
430
|
-
for await (const event of runAgentTaskStream({ task, backend, input })) {
|
|
431
|
-
if (event.type === 'tool_call') {
|
|
432
|
-
// Dispatch through your MCP / sandbox runtime. `args` is JSON-parsed
|
|
433
|
-
// when the model produced a valid object, raw string otherwise.
|
|
434
|
-
const result = await dispatch(event.toolName, event.args)
|
|
435
|
-
// Feed `result` back on a follow-up turn via `input.messages`.
|
|
436
|
-
}
|
|
437
|
-
}
|
|
85
|
+
sandbox ──── AgentProfile (substrate type), Sandbox.create, exportTraceBundle
|
|
86
|
+
(provides the harness execution surface)
|
|
438
87
|
```
|
|
439
88
|
|
|
440
|
-
|
|
441
|
-
server's `tools/list` response into this shape once at config time and
|
|
442
|
-
pass the array as `tools`. The runtime intentionally does NOT depend on
|
|
443
|
-
`@modelcontextprotocol/sdk` — keeping the backend transport thin lets
|
|
444
|
-
domain repos own MCP plumbing.
|
|
445
|
-
|
|
446
|
-
### Transport errors fail loud
|
|
447
|
-
|
|
448
|
-
Non-success HTTP responses (4xx/5xx after retry exhaustion) and
|
|
449
|
-
connection failures throw `BackendTransportError` from inside the
|
|
450
|
-
`stream()` generator. `runAgentTaskStream` catches the throw and emits:
|
|
451
|
-
|
|
452
|
-
- `backend_error` event with `error: { kind: 'transport', message, status, body }`
|
|
453
|
-
- terminal `final` event with `status: 'failed'` carrying the same `error` detail
|
|
454
|
-
|
|
455
|
-
Consumers building a `RunRecord` MUST map `final.error` onto
|
|
456
|
-
`RunRecord.error`. Treating an empty `finalText` as "agent produced
|
|
457
|
-
nothing" hides credit exhaustion (HTTP 402), auth failure (401),
|
|
458
|
-
model-not-found (404), and upstream outages (5xx).
|
|
459
|
-
|
|
460
|
-
```ts
|
|
461
|
-
for await (const event of runAgentTaskStream({ task, backend, input })) {
|
|
462
|
-
run.observe(event)
|
|
463
|
-
if (event.type === 'final') {
|
|
464
|
-
run.complete({
|
|
465
|
-
status: event.status === 'completed' ? 'completed' : 'failed',
|
|
466
|
-
resultSummary: event.text ?? '',
|
|
467
|
-
error: event.error
|
|
468
|
-
? `${event.error.kind} ${event.error.status ?? ''}: ${event.error.message}`
|
|
469
|
-
: undefined,
|
|
470
|
-
})
|
|
471
|
-
}
|
|
472
|
-
}
|
|
473
|
-
```
|
|
89
|
+
Self-improving products consume all four. See [`agent-stack-adoption` skill](https://github.com/drewstone/dotfiles/blob/main/claude/skills/agent-stack-adoption/SKILL.md) for the end-to-end 10-phase adoption runbook.
|
|
474
90
|
|
|
475
|
-
|
|
476
|
-
telemetry envelope surfaces `error.kind` + `error.status` but redacts
|
|
477
|
-
`error.body` (it can echo user-visible text from a provider's error
|
|
478
|
-
page). Opt in with `RuntimeTelemetryOptions.includeControlPayloads`.
|
|
91
|
+
## Examples
|
|
479
92
|
|
|
480
|
-
|
|
93
|
+
Ordered as a learning progression — each example introduces one concept.
|
|
481
94
|
|
|
482
|
-
|
|
483
|
-
|
|
484
|
-
| `ValidationError` | Caller passed invalid arguments |
|
|
485
|
-
| `ConfigError` | Required env / config missing |
|
|
486
|
-
| `NotFoundError` | A named resource does not exist |
|
|
487
|
-
| `BackendTransportError` | Backend HTTP / IPC call returned non-success — carries `status` + truncated `body` |
|
|
488
|
-
| `SessionMismatchError` | Resume requested against a different backend |
|
|
489
|
-
| `RuntimeRunStateError` | `RuntimeRunHandle` lifecycle methods called out of order |
|
|
95
|
+
**Start here:**
|
|
96
|
+
- [`chat-handler/`](./examples/chat-handler/) — `handleChatTurn`, the production centerpiece
|
|
490
97
|
|
|
491
|
-
|
|
492
|
-
|
|
493
|
-
|
|
98
|
+
**Add observability + readiness:**
|
|
99
|
+
- [`with-knowledge-readiness/`](./examples/with-knowledge-readiness/) — `requiredKnowledge` + `decideKnowledgeReadiness`
|
|
100
|
+
- [`sanitized-telemetry-streaming/`](./examples/sanitized-telemetry-streaming/) — `createRuntimeStreamEventCollector` + redaction
|
|
101
|
+
- [`runtime-run/`](./examples/runtime-run/) — `startRuntimeRun` + cost ledger persistence
|
|
494
102
|
|
|
495
|
-
|
|
103
|
+
**Add delegation:**
|
|
104
|
+
- [`mcp-delegation/`](./examples/mcp-delegation/) — mount `agent-runtime-mcp` in an `AgentProfile`
|
|
496
105
|
|
|
497
|
-
|
|
498
|
-
|
|
499
|
-
|
|
500
|
-
|
|
106
|
+
**Multi-agent fanout (advanced):**
|
|
107
|
+
- [`coder-loop/`](./examples/coder-loop/) — `coderProfile` + `runLoop` + `FanoutVote`
|
|
108
|
+
- [`researcher-loop/`](./examples/researcher-loop/) — `researcherProfile` + `runLoop` (peer dep: `@tangle-network/agent-knowledge`)
|
|
109
|
+
- [`fleet-delegation/`](./examples/fleet-delegation/) — `TANGLE_FLEET_ID` + `createFleetWorkspaceExecutor`
|
|
501
110
|
|
|
502
|
-
|
|
503
|
-
import { createRuntimeStreamEventCollector, runAgentTaskStream } from '@tangle-network/agent-runtime'
|
|
111
|
+
## Stability
|
|
504
112
|
|
|
505
|
-
|
|
506
|
-
for await (const event of runAgentTaskStream({ task, backend })) telemetry.onEvent(event)
|
|
507
|
-
console.log(telemetry.events, telemetry.summary())
|
|
508
|
-
```
|
|
113
|
+
Every public export is annotated `@stable` or `@experimental`. `@stable` exports do not change shape inside a minor. `@experimental` exports may change inside a minor and require a deliberate consumer bump.
|
|
509
114
|
|
|
510
115
|
## Package boundaries
|
|
511
116
|
|
|
512
117
|
| Package | Owns |
|
|
513
118
|
|---|---|
|
|
514
|
-
| `agent-runtime` | Task lifecycle, adapters, backends, chat-turn engine,
|
|
515
|
-
| `agent-runtime/platform` | Cross-site SSO
|
|
119
|
+
| `agent-runtime` | Task lifecycle, adapters, backends, chat-turn engine, model resolution, trace bridge, `defineAgent` |
|
|
120
|
+
| `agent-runtime/platform` | Cross-site SSO + integrations hub |
|
|
516
121
|
| `agent-runtime/agent` | `defineAgent` + surfaces / outcome adapters |
|
|
517
122
|
| `agent-runtime/analyst-loop` | `runAnalystLoop` — analyst registry driver |
|
|
518
|
-
| `agent-
|
|
123
|
+
| `agent-runtime/loops` | `runLoop` kernel + `Refine` / `FanoutVote` drivers |
|
|
124
|
+
| `agent-runtime/profiles` | `coderProfile`, `researcherProfile` presets |
|
|
125
|
+
| `agent-runtime/mcp` | `createMcpServer` + `agent-runtime-mcp` bin (5 delegation tools) |
|
|
126
|
+
| `agent-eval` | Evals, judges, scorecards, RL bridge, release evidence, matrix |
|
|
519
127
|
| `agent-knowledge` | Evidence, claims, wiki pages, retrieval |
|
|
520
|
-
|
|
|
521
|
-
|
|
522
|
-
See [`docs/concepts.md`](./docs/concepts.md) for the mental model.
|
|
523
|
-
|
|
524
|
-
## Examples
|
|
128
|
+
| `sandbox` | `AgentProfile`, `Sandbox.create`, `streamPrompt`, `exportTraceBundle` |
|
|
525
129
|
|
|
526
|
-
|
|
527
|
-
`@tangle-network/agent-runtime` (the same surface consumers use):
|
|
528
|
-
|
|
529
|
-
- [`basic-task/`](./examples/basic-task/) — smallest `runAgentTask`
|
|
530
|
-
- [`with-knowledge-readiness/`](./examples/with-knowledge-readiness/) — readiness gating
|
|
531
|
-
- [`sanitized-telemetry/`](./examples/sanitized-telemetry/) + [`-streaming/`](./examples/sanitized-telemetry-streaming/) — redaction
|
|
532
|
-
- [`sse-stream/`](./examples/sse-stream/) — SSE helpers for browser clients
|
|
533
|
-
- [`sandbox-stream-backend/`](./examples/sandbox-stream-backend/) — `createSandboxPromptBackend`
|
|
534
|
-
- [`openai-stream-backend/`](./examples/openai-stream-backend/) — `createOpenAICompatibleBackend`
|
|
535
|
-
- [`runtime-run/`](./examples/runtime-run/) — production-run row + cost ledger
|
|
536
|
-
- [`model-resolution/`](./examples/model-resolution/) — router catalog + fail-closed admission
|
|
537
|
-
- [`agent-into-reviewer/`](./examples/agent-into-reviewer/) — pipe one runtime's stream into a reviewer agent
|
|
538
|
-
- [`chat-handler/`](./examples/chat-handler/) — `handleChatTurn` (the centerpiece production pattern)
|
|
539
|
-
- [`coder-loop/`](./examples/coder-loop/) — `coderProfile` + `runLoop` + `FanoutVote` (driven-loop kernel)
|
|
540
|
-
- [`researcher-loop/`](./examples/researcher-loop/) — `researcherProfile` + `runLoop` + `FanoutVote` (peer dep: `@tangle-network/agent-knowledge`)
|
|
541
|
-
- [`mcp-delegation/`](./examples/mcp-delegation/) — mount `agent-runtime-mcp` in a product `AgentProfile` + stdio `tools/list` smoke
|
|
542
|
-
- [`fleet-delegation/`](./examples/fleet-delegation/) — `TANGLE_FLEET_ID` env flip + `createFleetWorkspaceExecutor` topology
|
|
130
|
+
See [`docs/concepts.md`](./docs/concepts.md) for the deeper mental model.
|
|
543
131
|
|
|
544
132
|
## Tests
|
|
545
133
|
|
|
546
134
|
```bash
|
|
547
|
-
pnpm test
|
|
135
|
+
pnpm test # 283+ tests across the kernel + drivers + MCP + backends + analyst-loop
|
|
548
136
|
pnpm typecheck
|
|
549
|
-
pnpm lint
|
|
550
137
|
pnpm build
|
|
551
138
|
```
|