@tangle-network/agent-runtime 0.5.6 → 0.7.0
- package/README.md +98 -161
- package/dist/index.d.ts +488 -84
- package/dist/index.js +943 -454
- package/dist/index.js.map +1 -1
- package/docs/product-runtime-kernel.md +13 -11
- package/package.json +13 -12
package/README.md
CHANGED
@@ -1,54 +1,32 @@
-# agent-runtime
-
-Reusable runtime lifecycle for domain-specific agents. Standardizes the
-task lifecycle (knowledge readiness → questions/acquisition → control loop
-→ eval) and delegates domain behavior to an adapter. Owns no domain
-policy, models, tools, connectors, or UI.
-
-## Contents
-
-- [Overview](#overview)
-- [Install](#install)
-- [Getting started](#getting-started)
-- [When to use which entry point](#when-to-use-which-entry-point)
-- [Backends for `runAgentTaskStream`](#backends-for-runagenttaskstream)
-- [Lifecycle events](#lifecycle-events)
-- [Knowledge providers](#knowledge-providers)
-- [Sanitized telemetry](#sanitized-telemetry)
-- [Package boundaries](#package-boundaries)
-- [Examples](#examples)
-
-## Overview
-
-```txt
-TaskSpec
-→ Knowledge readiness
-→ Question / acquisition decision
-→ Agent control loop (observe / validate / decide / act)
-→ Eval / verification
-→ Run evidence
-```
-
-For product agents that own a streaming model backend:
-
-```txt
-TaskSpec
-→ Knowledge readiness
-→ Session create/resume
-→ Backend stream
-→ Sanitized RuntimeStreamEvent / SSE
-```
+# @tangle-network/agent-runtime
 
-
+Production runtime substrate for domain agents. Owns the task lifecycle
+(knowledge readiness, control loop, session resume, sanitized telemetry,
+canonical `RuntimeRunRow` persistence + cost ledger) so domain repos stop
+inventing their own.
 
 ```bash
 pnpm add @tangle-network/agent-runtime @tangle-network/agent-eval
 ```
 
-##
+## What you get
 
-
-
+| Entry point | When to reach for it |
+|---|---|
+| `runAgentTask` | Single-shot adapter-driven task with eval/verification |
+| `runAgentTaskStream` | Streaming product loop with session resume + backends |
+| `startRuntimeRun` | Canonical production-run row + cost ledger (NEW in 0.7.0) |
+| `createTraceBridge` | Map `RuntimeStreamEvent` → `agent-eval` `TraceEvent` (NEW in 0.7.0) |
+| `decideKnowledgeReadiness` | `ready` / `blocked` / `caveat` branch for routes / UI |
+| `createOpenAICompatibleBackend` | OpenAI-compatible streaming backend (TCloud / cli-bridge) |
+| `createSandboxPromptBackend` | Sandbox / sidecar `streamPrompt` clients |
+| `createRuntimeStreamEventCollector` | Default-redacted sanitized telemetry over a stream |
+
+Every public export is annotated `@stable` or `@experimental`. `@stable`
+exports do not change shape inside a minor; `@experimental` exports may
+change inside a minor and require a deliberate consumer bump.
+
+## Quickstart
 
 ```ts
 import { runAgentTask } from '@tangle-network/agent-runtime'
@@ -63,7 +41,7 @@ const result = await runAgentTask({
 async observe() { return { /* domain state */ } },
 async validate({ state }) { return [/* eval results */] },
 async decide({ state }) {
-return {
+return { type: 'stop', pass: true, score: 1, reason: 'review complete' }
 },
 async act() { return undefined },
 },
@@ -72,160 +50,119 @@ const result = await runAgentTask({
 console.log(result.status, result.runRecords)
 ```
 
-
-
-## When to use which entry point
-
-| You want… | Use |
-|---|---|
-| Single-shot task with eval/verification | `runAgentTask` |
-| Streaming product loop with session resume | `runAgentTaskStream` + a backend factory |
-| Just SSE serialization for an existing readiness report | `readinessServerSentEvent` |
-| Just sanitized telemetry over an existing run | `createRuntimeEventCollector` (+ `summarizeAgentTaskRun`) for `runAgentTask`, or `createRuntimeStreamEventCollector` for `runAgentTaskStream` |
-| Stable readiness branching (`ready` / `blocked` / `caveat`) in a route | `decideKnowledgeReadiness` |
-
-## Backends for `runAgentTaskStream`
-
-Four SDK-agnostic factories ship in core:
-
-| Factory | When |
-|---|---|
-| `createOpenAICompatibleBackend` | TCloud / OpenAI-compatible chat APIs |
-| `createCliBridgeBackend` | HTTP CLI bridge streams |
-| `createSandboxPromptBackend` | Sandbox / sidecar `streamPrompt` clients |
-| `createIterableBackend` | Custom coding harnesses, browser agents |
-
-Adapters are intentionally thin. Product repos still own client
-construction, auth, concrete tool permissions, and UI behavior. See
-[`examples/sandbox-stream-backend/`](./examples/sandbox-stream-backend/) and
-[`examples/openai-stream-backend/`](./examples/openai-stream-backend/) for
-runnable wirings.
+## Canonical production-run lifecycle (NEW in 0.7.0)
 
-
-
-
-
+`startRuntimeRun` is the ONE abstraction for "the agent did a thing on
+behalf of a customer; record what it did, what it cost, how it ended."
+Replaces bespoke `agentRuns`-row helpers (legal-agent's
+`completeProductionAgentRun` + `persistRuntimeRun` pair is the canonical
+example of what this subsumes).
 
 ```ts
-
-
-
-
-
+import { startRuntimeRun, runAgentTaskStream } from '@tangle-network/agent-runtime'
+
+const run = startRuntimeRun({
+  workspaceId: 'ws-1',
+  sessionId: threadId,
+  agentId: 'legal-chat-runtime',
+  taskSpec,
+  scenarioId: `legal-chat:${threadId}`,
+  adapter: { upsert: (row) => db.insert(agentRuns).values(row) },
 })
-```
 
-
-
-
-
-
-
-
+for await (const event of runAgentTaskStream({ task: taskSpec, backend, input })) {
+  run.observe(event) // llm_call events update the cost ledger
+  if (event.type === 'final') {
+    run.complete({
+      status: event.status === 'completed' ? 'completed' : 'failed',
+      resultSummary: event.text ?? '',
+      error: event.status === 'failed' ? event.reason : undefined,
+    })
+  }
+}
 
-
+await run.persist({ runtimeEvents: telemetry.events })
+console.log(run.cost()) // { tokensIn, tokensOut, costUsd, wallMs, llmCalls }
+```
 
-
+Full runnable: [`examples/runtime-run/`](./examples/runtime-run/).
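Reading aid for the `startRuntimeRun` example added in this hunk: the sketch below shows one way a run handle could fold `llm_call` events into a cost ledger and reject calls made after completion. It is not the package's implementation. `SketchRunHandle`, the event fields (`tokensIn`, `tokensOut`, `costUsd` on the event), and the plain `Error` are assumptions made for illustration; only the field names of the `cost()` result come from the comment in the diff.

```ts
// Hypothetical event shapes; the real RuntimeStreamEvent union may differ.
type SketchLlmCallEvent = { type: 'llm_call'; tokensIn: number; tokensOut: number; costUsd: number }
type SketchOtherEvent = { type: 'text_delta' | 'final' }
type SketchStreamEvent = SketchLlmCallEvent | SketchOtherEvent

// Field names mirror the `run.cost()` comment in the diff.
type SketchCost = { tokensIn: number; tokensOut: number; costUsd: number; wallMs: number; llmCalls: number }

class SketchRunHandle {
  private tokensIn = 0
  private tokensOut = 0
  private costUsd = 0
  private llmCalls = 0
  private completed = false
  private readonly startedAt = Date.now()

  // Fold one stream event into the ledger; only llm_call events carry cost.
  observe(event: SketchStreamEvent): void {
    if (this.completed) throw new Error('observe() called after complete()')
    if (event.type !== 'llm_call') return
    this.tokensIn += event.tokensIn
    this.tokensOut += event.tokensOut
    this.costUsd += event.costUsd
    this.llmCalls += 1
  }

  // Mark the run finished; a second call is treated as a lifecycle error.
  complete(): void {
    if (this.completed) throw new Error('complete() called twice')
    this.completed = true
  }

  // Snapshot of the ledger plus wall-clock time since the handle was created.
  cost(): SketchCost {
    return {
      tokensIn: this.tokensIn,
      tokensOut: this.tokensOut,
      costUsd: this.costUsd,
      wallMs: Date.now() - this.startedAt,
      llmCalls: this.llmCalls,
    }
  }
}

const sketchRun = new SketchRunHandle()
sketchRun.observe({ type: 'llm_call', tokensIn: 120, tokensOut: 40, costUsd: 0.002 })
sketchRun.observe({ type: 'text_delta' }) // ignored by the ledger
sketchRun.complete()
console.log(sketchRun.cost().llmCalls) // 1
```

Keeping the ledger inside the handle means callers only forward events; whether the real handle throws `RuntimeRunStateError` in the out-of-order cases shown here is not confirmed by the diff.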
 
--
-- `answerQuestions` — handle outstanding user questions
-- `executeAcquisitionPlans` — fetch missing evidence
-- `refreshReadiness` — rerun scoring after acquisition
+## agent-eval trace bridge (NEW in 0.7.0)
 
-
-
-
-to emit a domain action (asking a user, querying a connector, etc.).
+If you persist traces in agent-eval's `TraceStore`, map runtime stream
+events to `TraceEvent` once and stop hand-rolling the adapter in every
+domain repo:
 
-
-
-`blocked`, or `caveat` plus gap IDs and the recommended action.
+```ts
+import { createTraceBridge } from '@tangle-network/agent-runtime'
 
-
+const bridge = createTraceBridge({ runId, spanId })
+for await (const event of runAgentTaskStream({ task, backend, input })) {
+  const trace = bridge.toTraceEvent(event)
+  if (trace) await traceStore.appendEvent(trace)
+}
+```
 
-
-Use the built-in sanitized collector:
+## Error taxonomy
 
-
-import {
-createRuntimeEventCollector,
-summarizeAgentTaskRun,
-} from '@tangle-network/agent-runtime'
+Every public function throws one of:
 
-
-
+| Error | When |
+|---|---|
+| `ValidationError` | Caller passed invalid arguments |
+| `ConfigError` | Required env / config missing |
+| `NotFoundError` | A named resource does not exist |
+| `BackendTransportError` | Backend HTTP / IPC call returned non-success |
+| `SessionMismatchError` | Resume requested against a different backend |
+| `RuntimeRunStateError` | `RuntimeRunHandle` lifecycle methods called out of order |
 
-
-
-
+All extend `AgentEvalError` (re-exported from `@tangle-network/agent-eval`)
+and carry a stable `code` so cross-package handlers can pattern-match
+without importing the runtime.
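The added README text says handlers can pattern-match on a stable `code` without importing the runtime. A minimal sketch of what such a handler might look like follows. The code strings (`'VALIDATION_ERROR'`, `'NOT_FOUND'`, `'BACKEND_TRANSPORT_ERROR'`) and the helper names are invented for illustration; the diff does not show the actual `code` values, so check the package's type declarations before relying on any of them.

```ts
// Hypothetical: the README only promises that errors carry a stable string `code`.
type SketchCodedError = Error & { code: string }

// Structural check, so this module needs no import from the runtime package.
function hasStringCode(err: unknown): err is SketchCodedError {
  if (!(err instanceof Error)) return false
  const code: unknown = Reflect.get(err, 'code')
  return typeof code === 'string'
}

// Map an unknown thrown value to an HTTP status using only its `code`.
// The case labels are placeholder values, not the package's real codes.
function toHttpStatus(err: unknown): number {
  if (!hasStringCode(err)) return 500
  switch (err.code) {
    case 'VALIDATION_ERROR':
      return 400
    case 'NOT_FOUND':
      return 404
    case 'BACKEND_TRANSPORT_ERROR':
      return 502
    default:
      return 500
  }
}
```

Matching on a string field instead of `instanceof` keeps the handler working even when two copies of the package end up in the dependency tree, which is presumably part of the motivation for a stable `code`.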
 
-
-questions, control payloads, evidence IDs, task metadata, and eval
-details. Private diagnostics opt-in via `RuntimeTelemetryOptions` flags
-(`includeInputs`, `includeUserAnswers`, `includeControlPayloads`,
-`includeEvidenceIds`, `includeRequirementDescriptions`,
-`includeMetadata`, `includeEvalDetails`).
+## Sanitized telemetry
 
-
-
+`task.intent` flows through sanitized telemetry on every event. **Never
+set it to user input** — use a fixed string describing the operation
+kind (e.g. `"Run a chat turn"`, `"Score a tax return"`). Route user-
+visible content through `task.inputs` (redacted by default).
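To make the `task.intent` rule above concrete, here is a small sketch of building a task where the intent is a fixed constant and the user's text travels only in `inputs`. `SketchTask`, `buildChatTask`, and the `message` key are hypothetical; the real `TaskSpec` has a shape this diff does not show.

```ts
// Hypothetical minimal task shape; only the `intent` and `inputs` names come from the README.
type SketchTask = { intent: string; inputs: Record<string, unknown> }

// Fixed operation kind: safe to appear in telemetry on every event.
const CHAT_TURN_INTENT = 'Run a chat turn'

// User-supplied text goes into `inputs`, never into `intent`.
function buildChatTask(userMessage: string): SketchTask {
  return {
    intent: CHAT_TURN_INTENT,
    inputs: { message: userMessage },
  }
}
```

Because the intent never varies with the request, it can be logged or aggregated without a redaction step.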
 
 ```ts
-import {
-createRuntimeStreamEventCollector,
-runAgentTaskStream,
-} from '@tangle-network/agent-runtime'
+import { createRuntimeStreamEventCollector, runAgentTaskStream } from '@tangle-network/agent-runtime'
 
 const telemetry = createRuntimeStreamEventCollector()
 for await (const event of runAgentTaskStream({ task, backend })) {
   telemetry.onEvent(event)
 }
-
-console.log(telemetry.events)
-console.log(telemetry.summary())
+console.log(telemetry.events, telemetry.summary())
 ```
 
-
-
-
-a single dispatcher would silently misroute events whose `type` literals
-overlap (`task_start`, `readiness_end`, etc.).
-
-### `task.intent` is sanitized telemetry by default
-
-`task.intent` flows through sanitized telemetry on every event. **Never
-set it to user input** — use a fixed string describing the operation
-kind (e.g. `"Run a chat turn"`, `"Score a tax return"`). If you need to
-log user-visible intent, route it through `inputs` (which are redacted
-by default) instead.
-
-For SSE-over-HTTP, use the helpers:
-
-```ts
-import { readinessServerSentEvent } from '@tangle-network/agent-runtime'
-writer.write(encoder.encode(readinessServerSentEvent(readinessReport)))
-```
+By default the collector redacts task inputs, user answers, credential
+questions, control payloads, evidence IDs, task metadata, and eval
+details. Private diagnostics opt-in via `RuntimeTelemetryOptions`.
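The redact-by-default, opt-in-to-include behaviour described above can be illustrated with a tiny sanitizer. This is a sketch, not the collector's code: `sanitizeSketchEvent` and both type names are hypothetical, and the `includeInputs` flag name is borrowed from the 0.5.6 text removed in this diff, so it may not match `RuntimeTelemetryOptions` in 0.7.0.

```ts
// Hypothetical shapes for illustration; not the package's types.
type SketchTelemetryOptions = { includeInputs?: boolean }
type SketchTelemetryEvent = { type: string; intent: string; inputs?: Record<string, unknown> }

// Copy only the always-safe fields; attach inputs only when explicitly requested.
function sanitizeSketchEvent(
  event: SketchTelemetryEvent,
  options: SketchTelemetryOptions = {},
): SketchTelemetryEvent {
  const sanitized: SketchTelemetryEvent = { type: event.type, intent: event.intent }
  if (options.includeInputs === true && event.inputs !== undefined) {
    sanitized.inputs = { ...event.inputs }
  }
  return sanitized
}
```

Building the sanitized event from an allow-list of fields, instead of deleting known-sensitive ones, means a newly added field stays private until someone opts it in.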
 
 ## Package boundaries
 
 | Package | Owns |
 |---|---|
-| `agent-runtime` |
-| `agent-eval` | Control loops, readiness scoring, traces, evals, failure classes,
+| `agent-runtime` | Lifecycle, adapters, backends, `RuntimeRunHandle`, trace bridge |
+| `agent-eval` | Control loops, readiness scoring, traces, evals, failure classes, release evidence |
 | `agent-knowledge` | Evidence, claims, wiki pages, retrieval, knowledge bundle builders |
 | Domain packages | Domain tools, policies, credentials, UI text, rubrics |
 
 The API uses `runAgentTask`, not `runVerticalAgentTask`. `domain` is
-metadata on the task
-
+metadata on the task because the runtime is reusable across many kinds of
+agents without baking taxonomy into type names.
 
 ## Examples
 
 Runnable in [`examples/`](./examples/):
 
-- [`basic-task/`](./examples/basic-task/) —
-- [`with-knowledge-readiness/`](./examples/with-knowledge-readiness/) — readiness gating +
-- [`sanitized-telemetry/`](./examples/sanitized-telemetry/) — `createRuntimeEventCollector` + redaction
-- [`sanitized-telemetry-streaming/`](./examples/sanitized-telemetry-streaming/) —
+- [`basic-task/`](./examples/basic-task/) — smallest `runAgentTask`
+- [`with-knowledge-readiness/`](./examples/with-knowledge-readiness/) — readiness gating + `onKnowledgeBlocked`
+- [`sanitized-telemetry/`](./examples/sanitized-telemetry/) — `createRuntimeEventCollector` + redaction
+- [`sanitized-telemetry-streaming/`](./examples/sanitized-telemetry-streaming/) — streaming collector + redaction
 - [`sse-stream/`](./examples/sse-stream/) — Server-Sent Events for browser clients
-- [`sandbox-stream-backend/`](./examples/sandbox-stream-backend/) — `
-- [`openai-stream-backend/`](./examples/openai-stream-backend/) — `
+- [`sandbox-stream-backend/`](./examples/sandbox-stream-backend/) — `createSandboxPromptBackend`
+- [`openai-stream-backend/`](./examples/openai-stream-backend/) — `createOpenAICompatibleBackend`
+- [`runtime-run/`](./examples/runtime-run/) — `startRuntimeRun` + cost ledger + persistence adapter (NEW)