@tangle-network/agent-runtime 0.6.0 → 0.8.0
- package/README.md +98 -166
- package/dist/index.d.ts +488 -78
- package/dist/index.js +943 -421
- package/dist/index.js.map +1 -1
- package/package.json +17 -12
package/README.md
CHANGED
@@ -1,54 +1,32 @@
-# agent-runtime
-
-Reusable runtime lifecycle for domain-specific agents. Standardizes the
-task lifecycle (knowledge readiness → questions/acquisition → control loop
-→ eval) and delegates domain behavior to an adapter. Owns no domain
-policy, models, tools, connectors, or UI.
-
-## Contents
-
-- [Overview](#overview)
-- [Install](#install)
-- [Getting started](#getting-started)
-- [When to use which entry point](#when-to-use-which-entry-point)
-- [Backends for `runAgentTaskStream`](#backends-for-runagenttaskstream)
-- [Lifecycle events](#lifecycle-events)
-- [Knowledge providers](#knowledge-providers)
-- [Sanitized telemetry](#sanitized-telemetry)
-- [Package boundaries](#package-boundaries)
-- [Examples](#examples)
-
-## Overview
-
-```txt
-TaskSpec
-→ Knowledge readiness
-→ Question / acquisition decision
-→ Agent control loop (observe / validate / decide / act)
-→ Eval / verification
-→ Run evidence
-```
-
-For product agents that own a streaming model backend:
-
-```txt
-TaskSpec
-→ Knowledge readiness
-→ Session create/resume
-→ Backend stream
-→ Sanitized RuntimeStreamEvent / SSE
-```
+# @tangle-network/agent-runtime
 
-
+Production runtime substrate for domain agents. Owns the task lifecycle
+(knowledge readiness, control loop, session resume, sanitized telemetry,
+canonical `RuntimeRunRow` persistence + cost ledger) so domain repos stop
+inventing their own.
 
 ```bash
 pnpm add @tangle-network/agent-runtime @tangle-network/agent-eval
 ```
 
-##
+## What you get
+
+| Entry point | When to reach for it |
+|---|---|
+| `runAgentTask` | Single-shot adapter-driven task with eval/verification |
+| `runAgentTaskStream` | Streaming product loop with session resume + backends |
+| `startRuntimeRun` | Canonical production-run row + cost ledger (NEW in 0.7.0) |
+| `createTraceBridge` | Map `RuntimeStreamEvent` → `agent-eval` `TraceEvent` (NEW in 0.7.0) |
+| `decideKnowledgeReadiness` | `ready` / `blocked` / `caveat` branch for routes / UI |
+| `createOpenAICompatibleBackend` | OpenAI-compatible streaming backend (TCloud / cli-bridge) |
+| `createSandboxPromptBackend` | Sandbox / sidecar `streamPrompt` clients |
+| `createRuntimeStreamEventCollector` | Default-redacted sanitized telemetry over a stream |
+
+Every public export is annotated `@stable` or `@experimental`. `@stable`
+exports do not change shape inside a minor; `@experimental` exports may
+change inside a minor and require a deliberate consumer bump.
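
As an illustration of the `ready` / `blocked` / `caveat` branching that the table attributes to `decideKnowledgeReadiness`, the sketch below shows one way a route could switch exhaustively over such a decision. The `ReadinessDecision` shape, the `gapIds` field, and the `routeResponse` helper are assumptions made for this example; the diff does not show the real return type.

```typescript
// ASSUMPTION: hypothetical decision shape. Only the three status
// literals ('ready' | 'blocked' | 'caveat') come from the README.
type ReadinessDecision =
  | { status: 'ready' }
  | { status: 'blocked'; gapIds: string[] }
  | { status: 'caveat'; gapIds: string[] }

// Hypothetical route helper: maps a decision to an HTTP status and banner.
function routeResponse(decision: ReadinessDecision): { http: number; banner?: string } {
  switch (decision.status) {
    case 'ready':
      return { http: 200 }
    case 'caveat':
      // Proceed, but tell the user which knowledge gaps remain.
      return { http: 200, banner: `Answering with gaps: ${decision.gapIds.join(', ')}` }
    case 'blocked':
      // Refuse to run until the missing knowledge is acquired.
      return { http: 409, banner: `Missing knowledge: ${decision.gapIds.join(', ')}` }
    default: {
      // Compile-time exhaustiveness check: a new status fails to type-check here.
      const unreachable: never = decision
      throw new Error(`Unhandled readiness decision: ${JSON.stringify(unreachable)}`)
    }
  }
}

const blocked = routeResponse({ status: 'blocked', gapIds: ['g1', 'g2'] })
console.log(blocked.http, blocked.banner)
```

The `never` assignment in the default branch is a general TypeScript idiom, not something the package is known to require.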
 
-
-no streaming:
+## Quickstart
 
 ```ts
 import { runAgentTask } from '@tangle-network/agent-runtime'
@@ -63,7 +41,7 @@ const result = await runAgentTask({
     async observe() { return { /* domain state */ } },
     async validate({ state }) { return [/* eval results */] },
     async decide({ state }) {
-      return {
+      return { type: 'stop', pass: true, score: 1, reason: 'review complete' }
     },
     async act() { return undefined },
   },
@@ -72,165 +50,119 @@ const result = await runAgentTask({
 console.log(result.status, result.runRecords)
 ```
 
-
-
-## When to use which entry point
-
-| You want… | Use |
-|---|---|
-| Single-shot task with eval/verification | `runAgentTask` |
-| Streaming product loop with session resume | `runAgentTaskStream` + a backend factory |
-| Just SSE serialization for an existing readiness report | `readinessServerSentEvent` |
-| Just sanitized telemetry over an existing run | `createRuntimeEventCollector` (+ `summarizeAgentTaskRun`) for `runAgentTask`, or `createRuntimeStreamEventCollector` for `runAgentTaskStream` |
-| Stable readiness branching (`ready` / `blocked` / `caveat`) in a route | `decideKnowledgeReadiness` |
-
-## Backends for `runAgentTaskStream`
+## Canonical production-run lifecycle (NEW in 0.7.0)
 
-
-
-
-
-
-| `createSandboxPromptBackend` | Sandbox / sidecar `streamPrompt` clients |
-| `createIterableBackend` | Custom coding harnesses, browser agents |
-
-For [cli-bridge](https://github.com/drewstone/cli-bridge) (or any other
-OpenAI-compatible HTTP gateway), use `createOpenAICompatibleBackend` pointed
-at the gateway's `/v1/chat/completions` URL — the cli-bridge harness/model
-selector is just an OpenAI `model` string like `claude/sonnet` or
-`codex/gpt-5-codex`.
-
-Adapters are intentionally thin. Product repos still own client
-construction, auth, concrete tool permissions, and UI behavior. See
-[`examples/sandbox-stream-backend/`](./examples/sandbox-stream-backend/) and
-[`examples/openai-stream-backend/`](./examples/openai-stream-backend/) for
-runnable wirings.
-
-## Lifecycle events
-
-`runAgentTask` and `runAgentTaskStream` emit typed lifecycle events
-through `onEvent`:
+`startRuntimeRun` is the ONE abstraction for "the agent did a thing on
+behalf of a customer; record what it did, what it cost, how it ended."
+Replaces bespoke `agentRuns`-row helpers (legal-agent's
+`completeProductionAgentRun` + `persistRuntimeRun` pair is the canonical
+example of what this subsumes).
 
 ```ts
-
-
-
-
-
+import { startRuntimeRun, runAgentTaskStream } from '@tangle-network/agent-runtime'
+
+const run = startRuntimeRun({
+  workspaceId: 'ws-1',
+  sessionId: threadId,
+  agentId: 'legal-chat-runtime',
+  taskSpec,
+  scenarioId: `legal-chat:${threadId}`,
+  adapter: { upsert: (row) => db.insert(agentRuns).values(row) },
 })
-```
 
-
-
-
-
-
-
-
+for await (const event of runAgentTaskStream({ task: taskSpec, backend, input })) {
+  run.observe(event) // llm_call events update the cost ledger
+  if (event.type === 'final') {
+    run.complete({
+      status: event.status === 'completed' ? 'completed' : 'failed',
+      resultSummary: event.text ?? '',
+      error: event.status === 'failed' ? event.reason : undefined,
+    })
+  }
+}
 
-
+await run.persist({ runtimeEvents: telemetry.events })
+console.log(run.cost()) // { tokensIn, tokensOut, costUsd, wallMs, llmCalls }
+```
 
-
+Full runnable: [`examples/runtime-run/`](./examples/runtime-run/).
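
To make the cost ledger in the new README text more concrete, here is a minimal, self-contained sketch of how `llm_call` events could be folded into the `{ tokensIn, tokensOut, costUsd, wallMs, llmCalls }` shape that the README comment attributes to `run.cost()`. The numeric fields on the event and the `createLedger` helper are assumptions for illustration; the diff does not show the package's actual event payload or accumulation logic.

```typescript
// Ledger shape taken from the README comment on run.cost().
interface CostLedger {
  tokensIn: number
  tokensOut: number
  costUsd: number
  wallMs: number
  llmCalls: number
}

// ASSUMPTION: the per-event fields are hypothetical. Only the
// 'llm_call' and 'final' type literals appear in the README.
type SketchEvent =
  | { type: 'llm_call'; tokensIn: number; tokensOut: number; costUsd: number }
  | { type: 'final'; status: 'completed' | 'failed' }

// Hypothetical stand-in for the ledger part of a run handle.
function createLedger(now: () => number = Date.now) {
  const startedAt = now()
  const totals = { tokensIn: 0, tokensOut: 0, costUsd: 0, llmCalls: 0 }
  return {
    // Only llm_call events contribute to cost; everything else is ignored.
    observe(event: SketchEvent): void {
      if (event.type !== 'llm_call') return
      totals.tokensIn += event.tokensIn
      totals.tokensOut += event.tokensOut
      totals.costUsd += event.costUsd
      totals.llmCalls += 1
    },
    // Snapshot of the totals plus wall-clock time since creation.
    cost(): CostLedger {
      return { ...totals, wallMs: now() - startedAt }
    },
  }
}

const ledger = createLedger()
ledger.observe({ type: 'llm_call', tokensIn: 120, tokensOut: 40, costUsd: 0.002 })
ledger.observe({ type: 'llm_call', tokensIn: 80, tokensOut: 10, costUsd: 0.001 })
ledger.observe({ type: 'final', status: 'completed' })
const snapshot = ledger.cost()
console.log(snapshot)
```

Passing the clock in as a parameter keeps `wallMs` testable; whether the real handle does this is not shown in the diff.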
 
--
-- `answerQuestions` — handle outstanding user questions
-- `executeAcquisitionPlans` — fetch missing evidence
-- `refreshReadiness` — rerun scoring after acquisition
+## agent-eval trace bridge (NEW in 0.7.0)
 
-
-
-
-to emit a domain action (asking a user, querying a connector, etc.).
+If you persist traces in agent-eval's `TraceStore`, map runtime stream
+events to `TraceEvent` once and stop hand-rolling the adapter in every
+domain repo:
 
-
-
-`blocked`, or `caveat` plus gap IDs and the recommended action.
+```ts
+import { createTraceBridge } from '@tangle-network/agent-runtime'
 
-
+const bridge = createTraceBridge({ runId, spanId })
+for await (const event of runAgentTaskStream({ task, backend, input })) {
+  const trace = bridge.toTraceEvent(event)
+  if (trace) await traceStore.appendEvent(trace)
+}
+```
 
-
-Use the built-in sanitized collector:
+## Error taxonomy
 
-
-import {
-  createRuntimeEventCollector,
-  summarizeAgentTaskRun,
-} from '@tangle-network/agent-runtime'
+Every public function throws one of:
 
-
-
+| Error | When |
+|---|---|
+| `ValidationError` | Caller passed invalid arguments |
+| `ConfigError` | Required env / config missing |
+| `NotFoundError` | A named resource does not exist |
+| `BackendTransportError` | Backend HTTP / IPC call returned non-success |
+| `SessionMismatchError` | Resume requested against a different backend |
+| `RuntimeRunStateError` | `RuntimeRunHandle` lifecycle methods called out of order |
 
-
-
-
+All extend `AgentEvalError` (re-exported from `@tangle-network/agent-eval`)
+and carry a stable `code` so cross-package handlers can pattern-match
+without importing the runtime.
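
Matching on `code` instead of on class identity might look like the sketch below. It is self-contained and imports nothing from the runtime, which is the point of the stable `code` field. The code strings (`'VALIDATION'`, `'NOT_FOUND'`, `'BACKEND_TRANSPORT'`) are placeholders, since the diff does not list the real values; `hasErrorCode`, `toHttpStatus`, and `DemoError` are hypothetical names.

```typescript
// Structural guard: accepts any thrown Error that carries a string `code`,
// so the handler never needs the runtime's error classes.
function hasErrorCode(err: unknown): err is Error & { code: string } {
  return (
    err instanceof Error &&
    'code' in err &&
    typeof (err as Error & { code: unknown }).code === 'string'
  )
}

// ASSUMPTION: these code strings are placeholders, not the package's real values.
function toHttpStatus(err: unknown): number {
  if (!hasErrorCode(err)) return 500
  switch (err.code) {
    case 'VALIDATION':
      return 400
    case 'NOT_FOUND':
      return 404
    case 'BACKEND_TRANSPORT':
      return 502
    default:
      return 500
  }
}

// Stand-in error for the demo; in practice the runtime would throw the error.
class DemoError extends Error {
  code: string
  constructor(message: string, code: string) {
    super(message)
    this.code = code
  }
}

console.log(toHttpStatus(new DemoError('no such session', 'NOT_FOUND')))
console.log(toHttpStatus(new Error('plain error without a code')))
```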
 
-
-questions, control payloads, evidence IDs, task metadata, and eval
-details. Private diagnostics opt-in via `RuntimeTelemetryOptions` flags
-(`includeInputs`, `includeUserAnswers`, `includeControlPayloads`,
-`includeEvidenceIds`, `includeRequirementDescriptions`,
-`includeMetadata`, `includeEvalDetails`).
+## Sanitized telemetry
 
-
-
+`task.intent` flows through sanitized telemetry on every event. **Never
+set it to user input** — use a fixed string describing the operation
+kind (e.g. `"Run a chat turn"`, `"Score a tax return"`). Route user-
+visible content through `task.inputs` (redacted by default).
 
 ```ts
-import {
-  createRuntimeStreamEventCollector,
-  runAgentTaskStream,
-} from '@tangle-network/agent-runtime'
+import { createRuntimeStreamEventCollector, runAgentTaskStream } from '@tangle-network/agent-runtime'
 
 const telemetry = createRuntimeStreamEventCollector()
 for await (const event of runAgentTaskStream({ task, backend })) {
   telemetry.onEvent(event)
 }
-
-console.log(telemetry.events)
-console.log(telemetry.summary())
+console.log(telemetry.events, telemetry.summary())
 ```
 
-
-
-
-a single dispatcher would silently misroute events whose `type` literals
-overlap (`task_start`, `readiness_end`, etc.).
-
-### `task.intent` is sanitized telemetry by default
-
-`task.intent` flows through sanitized telemetry on every event. **Never
-set it to user input** — use a fixed string describing the operation
-kind (e.g. `"Run a chat turn"`, `"Score a tax return"`). If you need to
-log user-visible intent, route it through `inputs` (which are redacted
-by default) instead.
-
-For SSE-over-HTTP, use the helpers:
-
-```ts
-import { readinessServerSentEvent } from '@tangle-network/agent-runtime'
-writer.write(encoder.encode(readinessServerSentEvent(readinessReport)))
-```
+By default the collector redacts task inputs, user answers, credential
+questions, control payloads, evidence IDs, task metadata, and eval
+details. Private diagnostics opt-in via `RuntimeTelemetryOptions`.
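
The guidance on `task.intent` and default redaction can be sketched as follows: the intent is a fixed constant, user text travels only in `inputs`, and a telemetry view drops `inputs` unless a caller opts in. This is an illustration of the idea, not the collector's implementation. `SketchTask`, `buildChatTask`, and `toTelemetry` are hypothetical, and while the `includeInputs` flag name appears in the 0.6.0 README text, its behavior here is assumed.

```typescript
// Only the `intent` and `inputs` field names come from the README.
interface SketchTask {
  intent: string
  inputs: Record<string, unknown>
}

// A fixed operation-kind string, never derived from user input.
const CHAT_TURN_INTENT = 'Run a chat turn'

// Hypothetical builder: user text goes into inputs, not intent.
function buildChatTask(userMessage: string): SketchTask {
  return { intent: CHAT_TURN_INTENT, inputs: { message: userMessage } }
}

// ASSUMPTION: a simplified redaction rule standing in for the real collector.
function toTelemetry(task: SketchTask, options: { includeInputs?: boolean } = {}) {
  return {
    intent: task.intent,
    inputs: options.includeInputs ? task.inputs : '[redacted]',
  }
}

const chatTask = buildChatTask('Please review the attached confidential lease')
const safeView = toTelemetry(chatTask)
const privateView = toTelemetry(chatTask, { includeInputs: true })
console.log(safeView, privateView)
```

Because the intent string is a constant, it is safe to emit on every event, while anything the user typed stays behind the opt-in flag.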
 
 ## Package boundaries
 
 | Package | Owns |
 |---|---|
-| `agent-runtime` |
-| `agent-eval` | Control loops, readiness scoring, traces, evals, failure classes,
+| `agent-runtime` | Lifecycle, adapters, backends, `RuntimeRunHandle`, trace bridge |
+| `agent-eval` | Control loops, readiness scoring, traces, evals, failure classes, release evidence |
 | `agent-knowledge` | Evidence, claims, wiki pages, retrieval, knowledge bundle builders |
 | Domain packages | Domain tools, policies, credentials, UI text, rubrics |
 
 The API uses `runAgentTask`, not `runVerticalAgentTask`. `domain` is
-metadata on the task
-
+metadata on the task because the runtime is reusable across many kinds of
+agents without baking taxonomy into type names.
 
 ## Examples
 
 Runnable in [`examples/`](./examples/):
 
-- [`basic-task/`](./examples/basic-task/) —
-- [`with-knowledge-readiness/`](./examples/with-knowledge-readiness/) — readiness gating +
-- [`sanitized-telemetry/`](./examples/sanitized-telemetry/) — `createRuntimeEventCollector` + redaction
-- [`sanitized-telemetry-streaming/`](./examples/sanitized-telemetry-streaming/) —
+- [`basic-task/`](./examples/basic-task/) — smallest `runAgentTask`
+- [`with-knowledge-readiness/`](./examples/with-knowledge-readiness/) — readiness gating + `onKnowledgeBlocked`
+- [`sanitized-telemetry/`](./examples/sanitized-telemetry/) — `createRuntimeEventCollector` + redaction
+- [`sanitized-telemetry-streaming/`](./examples/sanitized-telemetry-streaming/) — streaming collector + redaction
 - [`sse-stream/`](./examples/sse-stream/) — Server-Sent Events for browser clients
-- [`sandbox-stream-backend/`](./examples/sandbox-stream-backend/) — `
-- [`openai-stream-backend/`](./examples/openai-stream-backend/) — `
+- [`sandbox-stream-backend/`](./examples/sandbox-stream-backend/) — `createSandboxPromptBackend`
+- [`openai-stream-backend/`](./examples/openai-stream-backend/) — `createOpenAICompatibleBackend`
+- [`runtime-run/`](./examples/runtime-run/) — `startRuntimeRun` + cost ledger + persistence adapter (NEW)