@mantyx/sdk 0.10.0 → 0.11.0
- package/CHANGELOG.md +8 -1
- package/README.md +37 -31
- package/dist/a2a-server.cjs +9 -0
- package/dist/a2a-server.cjs.map +1 -1
- package/dist/a2a-server.d.cts +1 -1
- package/dist/a2a-server.d.ts +1 -1
- package/dist/a2a-server.js +1 -1
- package/dist/{chunk-XMUCELMH.js → chunk-DR625E6B.js} +69 -9
- package/dist/chunk-DR625E6B.js.map +1 -0
- package/dist/{client-DHwh8MPj.d.cts → client-Byb0Zdo7.d.cts} +199 -84
- package/dist/{client-DHwh8MPj.d.ts → client-Byb0Zdo7.d.ts} +199 -84
- package/dist/index.cjs +76 -78
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +2 -2
- package/dist/index.d.ts +2 -2
- package/dist/index.js +9 -69
- package/dist/index.js.map +1 -1
- package/docs/agent-runs-protocol.md +373 -220
- package/docs/wire-protocol.md +415 -252
- package/package.json +1 -1
- package/dist/chunk-XMUCELMH.js.map +0 -1
- package/docs/oauth.md +0 -356
package/docs/wire-protocol.md
CHANGED
|
@@ -9,7 +9,7 @@ SDK is expected to ship for client-resolved (`*_local`) tools.
|
|
|
9
9
|
|
|
10
10
|
If you're just looking for HTTP routes, auth, body shapes, or session
|
|
11
11
|
semantics, start with `agent-runs-protocol.md`. If you're writing or
|
|
12
|
-
maintaining an SDK and want to know
|
|
12
|
+
maintaining an SDK and want to know _exactly_ what a `local_tool_call` event
|
|
13
13
|
looks like for `mcp_local`, you're in the right place.
|
|
14
14
|
|
|
15
15
|
> **Authentication.** Every example below uses
|
|
@@ -21,9 +21,10 @@ looks like for `mcp_local`, you're in the right place.
|
|
|
21
21
|
> `models:read`, `mantyx.identity:read`); see §2 of
|
|
22
22
|
> `agent-runs-protocol.md` for the per-endpoint scope table and
|
|
23
23
|
> [`docs/oauth.md`](./oauth.md) for the registration / Authorization Code
|
|
24
|
-
>
|
|
24
|
+
>
|
|
25
|
+
> - PKCE flow.
|
|
25
26
|
|
|
26
|
-
> **Stability.** Field names listed in
|
|
27
|
+
> **Stability.** Field names listed in _bold_ are part of the documented
|
|
27
28
|
> stable surface. Any other fields are passed through verbatim and survive
|
|
28
29
|
> round-trips, but their semantics are not contractually guaranteed. The
|
|
29
30
|
> server uses Zod with `passthrough` for all `*_local` resolved-content
|
|
@@ -34,15 +35,15 @@ looks like for `mcp_local`, you're in the right place.
|
|
|
34
35
|
|
|
35
36
|
## 0. Glossary
|
|
36
37
|
|
|
37
|
-
| Term | Meaning
|
|
38
|
-
| ------------------- |
|
|
39
|
-
| **MANTYX** | The agent operating system server (this repo). Owns LLM orchestration, tool execution for server-resolved tools, persistence.
|
|
40
|
-
| **SDK** | Anything calling the public agent-runs API — typically `@mantyx/ts-sdk`, but also other-language SDKs and direct HTTP clients.
|
|
41
|
-
| **Agent run** | A single LLM execution. Streams events; ends with a terminal `result` / `error` / `cancelled`.
|
|
42
|
-
| **Spec** | The JSON object describing what the run does — model, prompt, tools, budgets, optional `reasoningLevel`. Sent in the `POST /agent-runs` (or `.../messages`) body.
|
|
43
|
-
| **Tool ref** | One entry in `spec.tools[]`. A discriminated union keyed by `kind`.
|
|
44
|
-
| **Server-resolved** | A tool MANTYX executes itself (`mantyx`, `mantyx_plugin`, `a2a`, `mcp`). The SDK only sees informational `tool_result` events.
|
|
45
|
-
| **Client-resolved** | A tool the SDK executes (`local`, `a2a_local`, `mcp_local`). MANTYX emits `local_tool_call`, the SDK does the work, the SDK posts back to `.../tool-results`.
|
|
38
|
+
| Term | Meaning |
|
|
39
|
+
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
40
|
+
| **MANTYX** | The agent operating system server (this repo). Owns LLM orchestration, tool execution for server-resolved tools, persistence. |
|
|
41
|
+
| **SDK** | Anything calling the public agent-runs API — typically `@mantyx/ts-sdk`, but also other-language SDKs and direct HTTP clients. |
|
|
42
|
+
| **Agent run** | A single LLM execution. Streams events; ends with a terminal `result` / `error` / `cancelled`. |
|
|
43
|
+
| **Spec** | The JSON object describing what the run does — model, prompt, tools, budgets, optional `reasoningLevel`. Sent in the `POST /agent-runs` (or `.../messages`) body. |
|
|
44
|
+
| **Tool ref** | One entry in `spec.tools[]`. A discriminated union keyed by `kind`. |
|
|
45
|
+
| **Server-resolved** | A tool MANTYX executes itself (`mantyx`, `mantyx_plugin`, `a2a`, `mcp`). The SDK only sees informational `tool_result` events. |
|
|
46
|
+
| **Client-resolved** | A tool the SDK executes (`local`, `a2a_local`, `mcp_local`). MANTYX emits `local_tool_call`, the SDK does the work, the SDK posts back to `.../tool-results`. |
|
|
46
47
|
| **Resolution** | The act of turning an external resource (A2A peer, MCP server) into a self-contained JSON document the model can reason about. For `*_local` kinds, resolution is the **SDK's** responsibility. |
|
|
47
48
|
|
|
48
49
|
---
|
|
@@ -108,23 +109,30 @@ short-circuit, etc.) see `agent-runs-protocol.md` §4.
|
|
|
108
109
|
{
|
|
109
110
|
"modelId": "openai:gpt-5.5",
|
|
110
111
|
"systemPrompt": "...",
|
|
111
|
-
"prompt": "...",
|
|
112
|
-
"tools": [
|
|
113
|
-
|
|
112
|
+
"prompt": "...", // OR "messages": [...]
|
|
113
|
+
"tools": [
|
|
114
|
+
/* tool refs — see §3 */
|
|
115
|
+
],
|
|
116
|
+
"reasoningLevel": "medium", // optional; see §6
|
|
114
117
|
"budgets": { "maxToolTurns": 32 },
|
|
115
|
-
"outputSchema": {
|
|
116
|
-
|
|
117
|
-
"
|
|
118
|
+
"outputSchema": {
|
|
119
|
+
// optional; see §7
|
|
120
|
+
"name": "weather_report", // defaults to "output"
|
|
121
|
+
"schema": {
|
|
122
|
+
/* JSON Schema */
|
|
123
|
+
},
|
|
118
124
|
},
|
|
119
|
-
"loopDetection": {
|
|
125
|
+
"loopDetection": {
|
|
126
|
+
// optional; see §8
|
|
120
127
|
"consecutiveThreshold": 3,
|
|
121
|
-
"hardCutoffThreshold":
|
|
128
|
+
"hardCutoffThreshold": 6,
|
|
122
129
|
},
|
|
123
|
-
"toolBudgets": {
|
|
124
|
-
|
|
125
|
-
"
|
|
130
|
+
"toolBudgets": {
|
|
131
|
+
// optional; see §8
|
|
132
|
+
"recall": { "maxCalls": 4 },
|
|
133
|
+
"hive_consult_ontology": { "maxCalls": 4 },
|
|
126
134
|
},
|
|
127
|
-
"metadata": { "customer": "acme" }
|
|
135
|
+
"metadata": { "customer": "acme" }, // optional, free-form k/v
|
|
128
136
|
}
|
|
129
137
|
```
|
|
130
138
|
|
|
@@ -132,7 +140,7 @@ short-circuit, etc.) see `agent-runs-protocol.md` §4.
|
|
|
132
140
|
|
|
133
141
|
Same body shape, posted to `POST /agent-sessions/:id/messages`. The session
|
|
134
142
|
keeps the conversation history; per-message `tools`, `reasoningLevel`,
|
|
135
|
-
`outputSchema`, `loopDetection`, and `toolBudgets`
|
|
143
|
+
`outputSchema`, `loopDetection`, and `toolBudgets` _replace_ the session's
|
|
136
144
|
defaults for that single run only — the next run falls back to whatever
|
|
137
145
|
the session was created with.
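
The override rule above can be sketched as a small pure function. This is only an illustration: `RunSettings` and `settingsForRun` are hypothetical names, and the fallback for fields the message omits is an assumption drawn from the sentence above, not a confirmed server algorithm.

```ts
// Per-run settings a message may override (field names taken from the text above).
interface RunSettings {
  tools?: unknown[];
  reasoningLevel?: string;
  outputSchema?: unknown;
  loopDetection?: unknown;
  toolBudgets?: Record<string, unknown>;
}

// Hypothetical helper: compute what a single run would use.
// A field present on the message replaces the session default wholesale
// (no deep merge); an absent field falls back to the default (assumption).
function settingsForRun(sessionDefaults: RunSettings, message: RunSettings): RunSettings {
  return {
    tools: message.tools !== undefined ? message.tools : sessionDefaults.tools,
    reasoningLevel:
      message.reasoningLevel !== undefined ? message.reasoningLevel : sessionDefaults.reasoningLevel,
    outputSchema:
      message.outputSchema !== undefined ? message.outputSchema : sessionDefaults.outputSchema,
    loopDetection:
      message.loopDetection !== undefined ? message.loopDetection : sessionDefaults.loopDetection,
    toolBudgets:
      message.toolBudgets !== undefined ? message.toolBudgets : sessionDefaults.toolBudgets,
  };
}

// Example: the message's toolBudgets replaces the default map instead of merging into it.
const exampleDefaults: RunSettings = {
  reasoningLevel: "medium",
  toolBudgets: { recall: { maxCalls: 4 } },
};
const exampleRun = settingsForRun(exampleDefaults, {
  toolBudgets: { hive_consult_ontology: { maxCalls: 4 } },
});
```

Because the function never mutates `exampleDefaults`, the next run starts again from whatever the session was created with.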
|
|
138
146
|
|
|
@@ -141,20 +149,20 @@ the session was created with.
|
|
|
141
149
|
## 3. Tool ref taxonomy
|
|
142
150
|
|
|
143
151
|
Every entry in `spec.tools[]` is one of the seven shapes below. The
|
|
144
|
-
|
|
152
|
+
_resolution column_ is the contract that drives everything else: **server**
|
|
145
153
|
means MANTYX runs the tool itself and the SDK only ever sees a
|
|
146
154
|
`tool_result` event; **client** means MANTYX is a transport and the SDK
|
|
147
155
|
must answer `local_tool_call` events.
|
|
148
156
|
|
|
149
|
-
| Kind
|
|
150
|
-
|
|
|
151
|
-
| `mantyx`
|
|
152
|
-
| `mantyx_plugin`
|
|
153
|
-
| `local`
|
|
154
|
-
| `a2a`
|
|
155
|
-
| `a2a_local`
|
|
156
|
-
| `mcp`
|
|
157
|
-
| `mcp_local`
|
|
157
|
+
| Kind | Resolution | Wire-payload contract |
|
|
158
|
+
| --------------- | ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
159
|
+
| `mantyx` | server | `{ id }` reference to a workspace `Tool` row. |
|
|
160
|
+
| `mantyx_plugin` | server | `{ name }` reference to a platform plugin tool. |
|
|
161
|
+
| `local` | client | `{ name, description?, parameters?, outputSchema?, longRunning? }` — `parameters` is **JSON Schema** (object schema with `properties`/`required`); forwarded verbatim to the LLM provider and validated against incoming tool-call args before execution. `outputSchema` (optional) is JSON Schema for the tool's structured return value, surfaced to providers that accept per-tool response schemas. `longRunning` (optional, default `false`) annotates the model-facing description with a "don't double-call while pending" hint so every provider treats the tool as long-running. |
|
|
162
|
+
| `a2a` | server | `{ name, agentCardUrl, headers?, contextId?, description? }`. |
|
|
163
|
+
| `a2a_local` | client | `{ name, agentCard }` — **resolved A2A Agent Card JSON content**. |
|
|
164
|
+
| `mcp` | server | `{ name, url, headers?, toolFilter? }`. |
|
|
165
|
+
| `mcp_local` | client | `{ name, serverInfo?, tools[] }` — **resolved MCP `Tool[]`**. |
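
As a reading aid, the resolution column of the table above can be written down as a lookup. The type and helper names here (`ToolKind`, `RESOLUTION_BY_KIND`, `needsLocalToolCallHandler`) are illustrative and are not exports of the SDK.

```ts
// The seven tool-ref kinds and who executes each, mirroring the table above.
type ToolKind =
  | "mantyx"
  | "mantyx_plugin"
  | "local"
  | "a2a"
  | "a2a_local"
  | "mcp"
  | "mcp_local";
type Resolution = "server" | "client";

const RESOLUTION_BY_KIND: Record<ToolKind, Resolution> = {
  mantyx: "server",
  mantyx_plugin: "server",
  local: "client",
  a2a: "server",
  a2a_local: "client",
  mcp: "server",
  mcp_local: "client",
};

// Hypothetical helper: true when the spec contains at least one tool the SDK
// itself must execute, i.e. it has to be ready to answer `local_tool_call`.
function needsLocalToolCallHandler(tools: ReadonlyArray<{ kind: ToolKind }>): boolean {
  return tools.some((tool) => RESOLUTION_BY_KIND[tool.kind] === "client");
}
```

Typing the lookup as `Record<ToolKind, Resolution>` makes the compiler flag any kind that is added to the union but not classified.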
|
|
158
166
|
|
|
159
167
|
The remainder of this document focuses on `local`, `a2a_local`, and
|
|
160
168
|
`mcp_local`, because they're the ones that carry SDK-defined structured
|
|
@@ -173,38 +181,40 @@ caller-specific business logic.
|
|
|
173
181
|
```jsonc
|
|
174
182
|
{
|
|
175
183
|
"kind": "local",
|
|
176
|
-
"name": "send_email",
|
|
184
|
+
"name": "send_email", // model-facing; /^[a-zA-Z0-9_]{1,64}$/
|
|
177
185
|
"description": "Send a transactional email.",
|
|
178
|
-
"parameters": {
|
|
186
|
+
"parameters": {
|
|
187
|
+
// OPTIONAL; JSON Schema for args
|
|
179
188
|
"type": "object",
|
|
180
189
|
"properties": {
|
|
181
|
-
"to":
|
|
190
|
+
"to": { "type": "string", "format": "email" },
|
|
182
191
|
"subject": { "type": "string" },
|
|
183
|
-
"body":
|
|
192
|
+
"body": { "type": "string" },
|
|
184
193
|
},
|
|
185
194
|
"required": ["to", "subject", "body"],
|
|
186
|
-
"additionalProperties": false
|
|
195
|
+
"additionalProperties": false,
|
|
187
196
|
},
|
|
188
|
-
"outputSchema": {
|
|
197
|
+
"outputSchema": {
|
|
198
|
+
// OPTIONAL; JSON Schema for the return value
|
|
189
199
|
"type": "object",
|
|
190
200
|
"properties": { "id": { "type": "string" } },
|
|
191
201
|
"required": ["id"],
|
|
192
|
-
"additionalProperties": false
|
|
202
|
+
"additionalProperties": false,
|
|
193
203
|
},
|
|
194
|
-
"longRunning": false
|
|
204
|
+
"longRunning": false, // OPTIONAL; default false
|
|
195
205
|
}
|
|
196
206
|
```
|
|
197
207
|
|
|
198
208
|
**Field reference:**
|
|
199
209
|
|
|
200
|
-
| Field | Required | Notes
|
|
201
|
-
| -------------- | -------- |
|
|
202
|
-
| `kind` | yes | Discriminator literal `"local"`.
|
|
203
|
-
| `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`.
|
|
204
|
-
| `description` | no | Free-form. When omitted the model sees an empty description (acceptable but reduces tool selection accuracy).
|
|
205
|
-
| `parameters` | no | JSON Schema for the tool's input. Must be an object schema (`type: "object"` with `properties`); other shapes are coerced to an empty object schema server-side. Nested constraints (`array.items`, `enum`, `anyOf`, …) are preserved end-to-end. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call.
|
|
206
|
-
| `outputSchema` | no | JSON Schema for the structured value the tool returns. Forwarded to providers with per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. The model uses it to plan follow-up arguments more reliably. Must be an object schema; non-object roots are dropped server-side (engines reject non-object roots in this position).
|
|
207
|
-
| `longRunning` | no | When `true`, MANTYX appends a stable hint to the description:<br
|
|
210
|
+
| Field | Required | Notes |
|
|
211
|
+
| -------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
212
|
+
| `kind` | yes | Discriminator literal `"local"`. |
|
|
213
|
+
| `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
214
|
+
| `description` | no | Free-form. When omitted the model sees an empty description (acceptable but reduces tool selection accuracy). |
|
|
215
|
+
| `parameters` | no | JSON Schema for the tool's input. Must be an object schema (`type: "object"` with `properties`); other shapes are coerced to an empty object schema server-side. Nested constraints (`array.items`, `enum`, `anyOf`, …) are preserved end-to-end. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call. |
|
|
216
|
+
| `outputSchema` | no | JSON Schema for the structured value the tool returns. Forwarded to providers with per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. The model uses it to plan follow-up arguments more reliably. Must be an object schema; non-object roots are dropped server-side (engines reject non-object roots in this position). |
|
|
217
|
+
| `longRunning` | no | When `true`, MANTYX appends a stable hint to the description:<br>_"NOTE: This is a long-running operation. Do not call this tool again if it has already returned an intermediate or pending status."_<br>Useful for tools where a single call may yield a `pending` / status response and the SDK polls on its own; without the hint, models routinely fire repeat calls and waste turns. Pure declarative — MANTYX does not change scheduling. |
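
An SDK may want to catch the constraints from the field reference before a spec is submitted. The following is a minimal sketch under that assumption; `LocalToolRef` and `checkLocalToolRef` are hypothetical names, and the schema checks only look at the root `type`, which is a simplification of the server-side rules described in the table.

```ts
// Name pattern quoted in the field reference above.
const TOOL_NAME_PATTERN = /^[a-zA-Z0-9_]{1,64}$/;

interface LocalToolRef {
  kind: "local";
  name: string;
  description?: string;
  parameters?: Record<string, unknown>;
  outputSchema?: Record<string, unknown>;
  longRunning?: boolean;
}

// Hypothetical pre-flight check: returns human-readable problems, empty when clean.
function checkLocalToolRef(ref: LocalToolRef): string[] {
  const problems: string[] = [];
  if (!TOOL_NAME_PATTERN.test(ref.name)) {
    problems.push(`name "${ref.name}" does not match ${String(TOOL_NAME_PATTERN)}`);
  }
  // Per the table, a non-object `parameters` schema is coerced to an empty object schema.
  if (ref.parameters !== undefined && ref.parameters["type"] !== "object") {
    problems.push("parameters is not an object schema and would be coerced server-side");
  }
  // Per the table, a non-object `outputSchema` root is dropped.
  if (ref.outputSchema !== undefined && ref.outputSchema["type"] !== "object") {
    problems.push("outputSchema root is not an object and would be dropped server-side");
  }
  return problems;
}
```

Returning a list rather than throwing lets the caller report every problem in one pass.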
|
|
208
218
|
|
|
209
219
|
**Tool call dispatch.** When the model calls a `local` tool, the SSE
|
|
210
220
|
stream emits `local_tool_call` with `kind: "local"` (or omitted, for
|
|
@@ -222,9 +232,10 @@ the `agentCard` field. MANTYX never reaches out to discover it.
|
|
|
222
232
|
```jsonc
|
|
223
233
|
{
|
|
224
234
|
"kind": "a2a_local",
|
|
225
|
-
"name": "intranet_hr_agent",
|
|
226
|
-
"description": "...",
|
|
227
|
-
"agentCard": {
|
|
235
|
+
"name": "intranet_hr_agent", // model-facing; /^[a-zA-Z0-9_]{1,64}$/
|
|
236
|
+
"description": "...", // OPTIONAL; overrides the synthesized one
|
|
237
|
+
"agentCard": {
|
|
238
|
+
// REQUIRED; A2A Agent Card content
|
|
228
239
|
"protocolVersion": "0.3.0",
|
|
229
240
|
"name": "Acme HR",
|
|
230
241
|
"description": "Answers questions about HR policies and benefits.",
|
|
@@ -242,34 +253,38 @@ the `agentCard` field. MANTYX never reaches out to discover it.
|
|
|
242
253
|
"name": "PTO lookup",
|
|
243
254
|
"description": "Find a teammate's remaining PTO days for the year.",
|
|
244
255
|
"tags": ["hr", "pto"],
|
|
245
|
-
"examples": ["How many PTO days does Alice have left?"]
|
|
246
|
-
}
|
|
256
|
+
"examples": ["How many PTO days does Alice have left?"],
|
|
257
|
+
},
|
|
258
|
+
],
|
|
259
|
+
"securitySchemes": {
|
|
260
|
+
/* spec-shaped, never read by MANTYX */
|
|
261
|
+
},
|
|
262
|
+
"security": [
|
|
263
|
+
/* spec-shaped, never read by MANTYX */
|
|
247
264
|
],
|
|
248
|
-
"securitySchemes": { /* spec-shaped, never read by MANTYX */ },
|
|
249
|
-
"security": [ /* spec-shaped, never read by MANTYX */ ]
|
|
250
265
|
/* …any other A2A spec field passes through unchanged. */
|
|
251
|
-
}
|
|
266
|
+
},
|
|
252
267
|
}
|
|
253
268
|
```
|
|
254
269
|
|
|
255
270
|
**Where the SDK obtains `agentCard`:**
|
|
256
271
|
|
|
257
|
-
-
|
|
272
|
+
- _Well-known URL._ Most peers expose the card at
|
|
258
273
|
`<peer>/.well-known/agent-card.json`. The SDK can simply
|
|
259
274
|
`fetch` it (with whatever auth applies on the local network).
|
|
260
|
-
-
|
|
275
|
+
- _Static config._ For peers that don't publish a card, hand-craft one — the
|
|
261
276
|
spec only requires a couple of fields and the rest is all metadata.
|
|
262
|
-
-
|
|
277
|
+
- _Registry / cache._ Cache cards locally and refresh periodically. MANTYX
|
|
263
278
|
treats every spec submission as a fresh snapshot, so new cards take
|
|
264
279
|
effect on the next run / message.
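
The well-known URL and cache options can be combined in a few lines. This is a sketch, not SDK code: `AgentCardCache`, `wellKnownCardUrl`, and the injected `CardFetcher` are hypothetical, and the fetcher is passed in so that authentication and transport stay under the caller's control.

```ts
type AgentCard = Record<string, unknown>;
type CardFetcher = (url: string) => Promise<AgentCard>;

// Build the well-known card URL for a peer base URL (trailing slashes tolerated).
function wellKnownCardUrl(peerBaseUrl: string): string {
  return peerBaseUrl.replace(/\/+$/, "") + "/.well-known/agent-card.json";
}

// Hypothetical time-based cache: re-fetches a card once it is older than `ttlMs`.
class AgentCardCache {
  private entries = new Map<string, { card: AgentCard; fetchedAt: number }>();
  private fetcher: CardFetcher;
  private ttlMs: number;
  private now: () => number;

  constructor(fetcher: CardFetcher, ttlMs: number, now: () => number = Date.now) {
    this.fetcher = fetcher;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(peerBaseUrl: string): Promise<AgentCard> {
    const hit = this.entries.get(peerBaseUrl);
    if (hit !== undefined && this.now() - hit.fetchedAt < this.ttlMs) {
      return hit.card; // still fresh, reuse the cached snapshot
    }
    const card = await this.fetcher(wellKnownCardUrl(peerBaseUrl));
    this.entries.set(peerBaseUrl, { card, fetchedAt: this.now() });
    return card;
  }
}
```

A fetcher built on `fetch` would typically issue a GET, check the response status, and return the parsed JSON body. Since every spec submission is treated as a fresh snapshot, a refreshed card applies from the next run or message.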
|
|
265
280
|
|
|
266
281
|
**What MANTYX does with `agentCard`:**
|
|
267
282
|
|
|
268
|
-
| Field
|
|
269
|
-
|
|
|
270
|
-
| `name`, `description`
|
|
271
|
-
| `skills[]` (first 12)
|
|
272
|
-
| All other fields
|
|
283
|
+
| Field | Used for | Notes |
|
|
284
|
+
| --------------------- | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
285
|
+
| `name`, `description` | Tool description for the model | Used to compose `"Delegate a task to <name>: <description>"` if no `description` override is supplied at the ref level. |
|
|
286
|
+
| `skills[]` (first 12) | Tool description for the model | Bulleted into the description so the model can choose a peer based on capability. |
|
|
287
|
+
| All other fields | Echo only | Forwarded back to the SDK in every `local_tool_call` event so the SDK can dispatch by `url`, by `provider.organization`, by `protocolVersion`, or whatever it indexed on. |
|
|
273
288
|
|
|
274
289
|
### 3.3 `mcp_local` — SDK-resolved Tool catalog
|
|
275
290
|
|
|
@@ -283,28 +298,32 @@ ship the `Implementation` block from MCP `Initialize` as `serverInfo`.
|
|
|
283
298
|
```jsonc
|
|
284
299
|
{
|
|
285
300
|
"kind": "mcp_local",
|
|
286
|
-
"name": "fs",
|
|
287
|
-
"serverInfo": {
|
|
301
|
+
"name": "fs", // SDK-side server label; not a name prefix
|
|
302
|
+
"serverInfo": {
|
|
303
|
+
// OPTIONAL; from MCP Initialize
|
|
288
304
|
"name": "mcp-server-filesystem",
|
|
289
|
-
"version": "0.4.1"
|
|
305
|
+
"version": "0.4.1",
|
|
290
306
|
/* …any other Implementation field passes through unchanged. */
|
|
291
307
|
},
|
|
292
|
-
"tools": [
|
|
308
|
+
"tools": [
|
|
309
|
+
// REQUIRED; verbatim MCP tools/list output
|
|
293
310
|
{
|
|
294
|
-
"name": "fs_read_file",
|
|
311
|
+
"name": "fs_read_file", // model-facing; /^[a-zA-Z0-9_]{1,64}$/; SDK owns naming
|
|
295
312
|
"description": "Read a file under /workspace.",
|
|
296
|
-
"inputSchema": {
|
|
313
|
+
"inputSchema": {
|
|
314
|
+
// MCP's term for the JSON Schema
|
|
297
315
|
"type": "object",
|
|
298
316
|
"properties": { "path": { "type": "string" } },
|
|
299
|
-
"required": ["path"]
|
|
317
|
+
"required": ["path"],
|
|
300
318
|
},
|
|
301
|
-
"annotations": {
|
|
319
|
+
"annotations": {
|
|
320
|
+
// OPTIONAL; spec-defined hints
|
|
302
321
|
"readOnlyHint": true,
|
|
303
|
-
"openWorldHint": false
|
|
304
|
-
}
|
|
322
|
+
"openWorldHint": false,
|
|
323
|
+
},
|
|
305
324
|
/* …any other MCP Tool field passes through unchanged. */
|
|
306
|
-
}
|
|
307
|
-
]
|
|
325
|
+
},
|
|
326
|
+
],
|
|
308
327
|
}
|
|
309
328
|
```
|
|
310
329
|
|
|
@@ -313,8 +332,8 @@ ship the `Implementation` block from MCP `Initialize` as `serverInfo`.
|
|
|
313
332
|
```ts
|
|
314
333
|
// pseudo-code, MCP-SDK-flavoured
|
|
315
334
|
const client = new McpClient(stdio("./fs-server"));
|
|
316
|
-
const init = await client.initialize();
|
|
317
|
-
const list = await client.listTools();
|
|
335
|
+
const init = await client.initialize(); // → { name, version, … }
|
|
336
|
+
const list = await client.listTools(); // → { tools: [...] }
|
|
318
337
|
|
|
319
338
|
// drop straight into the spec
|
|
320
339
|
const ref = {
|
|
@@ -327,18 +346,18 @@ const ref = {
|
|
|
327
346
|
|
|
328
347
|
**What MANTYX does with the catalog:**
|
|
329
348
|
|
|
330
|
-
| Field
|
|
331
|
-
|
|
|
332
|
-
| `tools[].name`
|
|
333
|
-
| `tools[].description`
|
|
334
|
-
| `tools[].inputSchema`
|
|
335
|
-
| `tools[].annotations`
|
|
336
|
-
| `serverInfo`
|
|
349
|
+
| Field | Used for | Notes |
|
|
350
|
+
| --------------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
351
|
+
| `tools[].name` | Model-facing tool name | Used as-is. MANTYX does **not** prefix with the ref's `name`. The SDK is responsible for any naming convention (e.g. emit `fs_read_file` instead of `read_file` if you have multiple servers). |
|
|
352
|
+
| `tools[].description` | Model-facing description | Used as-is. |
|
|
353
|
+
| `tools[].inputSchema` | LLM tool-call schema | Forwarded **verbatim** to the LLM provider as the tool's JSON Schema, then validated against incoming tool-call args (Ajv) before execution. Nested constraints (`array.items`, `enum`, `anyOf`, …) are preserved end-to-end. Empty / missing schema → no-arg tool. Args that violate the schema produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the tool. |
|
|
354
|
+
| `tools[].annotations` | Echo only | Forwarded to the SDK in `local_tool_call` events (as part of the call envelope) for observability. |
|
|
355
|
+
| `serverInfo` | Echo only | Forwarded to the SDK in `local_tool_call.mcpServerInfo`. |
|
|
337
356
|
|
|
338
357
|
> **Naming convention reminder.** Because MANTYX doesn't prefix names for
|
|
339
358
|
> `mcp_local`, two refs that both expose a tool called `read_file` will
|
|
340
359
|
> collide. Either give the second one a different `name` in the catalog or
|
|
341
|
-
> drop it via SDK-side filtering. (For `mcp` —
|
|
360
|
+
> drop it via SDK-side filtering. (For `mcp` — _remote_ MCP — MANTYX does
|
|
342
361
|
> auto-prefix with the ref's `name`, so collisions are impossible.)
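
Since collisions between `mcp_local` refs are the SDK's problem, a pre-flight check along these lines may help. It is a sketch with hypothetical names (`McpLocalRef` is trimmed to the fields needed here, `findToolNameCollisions` is not an SDK export).

```ts
// Minimal shape of an `mcp_local` ref for this check; real refs carry more fields.
interface McpLocalRef {
  kind: "mcp_local";
  name: string; // SDK-side server label
  tools: Array<{ name: string }>;
}

// Map each tool name that appears under more than one ref to the labels exposing it.
function findToolNameCollisions(refs: ReadonlyArray<McpLocalRef>): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const ref of refs) {
    for (const tool of ref.tools) {
      const labels = owners.get(tool.name) ?? [];
      labels.push(ref.name);
      owners.set(tool.name, labels);
    }
  }
  const collisions = new Map<string, string[]>();
  owners.forEach((labels, toolName) => {
    if (labels.length > 1) collisions.set(toolName, labels);
  });
  return collisions;
}
```

If the returned map is non-empty, the SDK can rename or filter the affected tools before building `spec.tools[]`.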
|
|
343
362
|
|
|
344
363
|
---
|
|
@@ -352,24 +371,30 @@ so reconnects can use `Last-Event-ID`.
|
|
|
352
371
|
Every event payload has the same envelope:
|
|
353
372
|
|
|
354
373
|
```jsonc
|
|
355
|
-
{
|
|
374
|
+
{
|
|
375
|
+
"seq": 7,
|
|
376
|
+
"type": "<event-type>",
|
|
377
|
+
"data": {
|
|
378
|
+
/* type-specific */
|
|
379
|
+
},
|
|
380
|
+
}
|
|
356
381
|
```
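
A client that reads the stream by hand may want to validate the envelope before dispatching on `type`. The sketch below assumes the SSE `data:` field carries the envelope as a JSON string; `parseEnvelope` and `highestSeq` are hypothetical helpers, and whether `seq` is the value to send back as `Last-Event-ID` should be checked against `agent-runs-protocol.md`.

```ts
// Envelope shape shown above.
interface EventEnvelope {
  seq: number;
  type: string;
  data: Record<string, unknown>;
}

// Parse one raw JSON payload into an envelope, rejecting malformed input.
function parseEnvelope(raw: string): EventEnvelope {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null) {
    throw new Error("event payload is not a JSON object");
  }
  const record = value as Record<string, unknown>;
  const seq = record["seq"];
  const type = record["type"];
  const data = record["data"];
  if (typeof seq !== "number" || typeof type !== "string") {
    throw new Error("event payload is missing seq or type");
  }
  // Treat a missing or non-object `data` as empty (assumption; the doc always shows an object).
  const safeData =
    typeof data === "object" && data !== null ? (data as Record<string, unknown>) : {};
  return { seq, type, data: safeData };
}

// Track the highest sequence number seen so far, e.g. to resume after a reconnect.
function highestSeq(previous: number | undefined, event: EventEnvelope): number {
  return previous === undefined ? event.seq : Math.max(previous, event.seq);
}
```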
|
|
357
382
|
|
|
358
383
|
The vocabulary (`EphemeralEventType` in `bus.ts`):
|
|
359
384
|
|
|
360
|
-
| Type
|
|
361
|
-
|
|
|
362
|
-
| `assistant_delta`
|
|
363
|
-
| `thinking_delta`
|
|
364
|
-
| `tool_result`
|
|
365
|
-
| `local_tool_call`
|
|
366
|
-
| `local_tool_result_in`
|
|
367
|
-
| `loop_detected`
|
|
368
|
-
| `tool_budget_exceeded`
|
|
369
|
-
| `assistant_message`
|
|
370
|
-
| `result`
|
|
371
|
-
| `error`
|
|
372
|
-
| `cancelled`
|
|
385
|
+
| Type | Direction | Frequency | Purpose |
|
|
386
|
+
| ---------------------- | --------- | ------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
387
|
+
| `assistant_delta` | M → SDK | Many | Streamed assistant text token / chunk. |
|
|
388
|
+
| `thinking_delta` | M → SDK | Many (iff `reasoningLevel > 0`) | Streamed extended-thinking text (provider redacts when policy requires). |
|
|
389
|
+
| `tool_result` | M → SDK | Per server-resolved tool call | Informational — tells the SDK that MANTYX ran a server-resolved tool (`mantyx`, `mantyx_plugin`, `a2a`, `mcp`) and got a result. The SDK does not need to act on it. |
|
|
390
|
+
| `local_tool_call` | M → SDK | Per client-resolved tool call | **Action required.** SDK must POST a tool-result. |
|
|
391
|
+
| `local_tool_result_in` | M → SDK | Per client-resolved tool call | Informational mirror of the tool-result the SDK just posted, persisted for observability. Re-emitted to late subscribers so they can replay the conversation. |
|
|
392
|
+
| `loop_detected` | M → SDK | 0–2× per run (soft nudge + optional hard cutoff) | Observability for the loop-detection guard (see §8). The server already substituted the synthetic skip + steering nudge — SDK clients render a status note (`looping — nudged` / `looping — gave up`) and otherwise leave the run alone. |
|
|
393
|
+
| `tool_budget_exceeded` | M → SDK | Per intercepted tool call | Observability for per-tool call budgets (see §8). The synthetic `tool_result` carrying the "budget exceeded — pivot or finalize" body lands on the normal tool-result channel; this event is purely so SDK clients can surface a UI banner. |
|
|
394
|
+
| `assistant_message` | M → SDK | 1× per turn | Final assistant message for the turn (concatenated, persistence-ready). |
|
|
395
|
+
| `result` | M → SDK | 1× terminal | Successful completion. Carries the final assistant text and run summary. |
|
|
396
|
+
| `error` | M → SDK | 1× terminal | Failure. Carries `error` (message), `code` / `errorClass` (category), `finishReason`, and an optional `partialText` salvage payload. See §4.7. |
|
|
397
|
+
| `cancelled` | M → SDK | 1× terminal | Cancellation. Run was aborted via `POST /cancel`. |
|
|
373
398
|
|
|
374
399
|
`result`, `error`, and `cancelled` are the **terminal** events — the SDK
|
|
375
400
|
should close the SSE stream after one of them arrives.
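
The table and the terminal-event rule above suggest a simple consumer loop. This is a sketch over an in-memory array rather than a live SSE connection; `RunEvent`, `isTerminal`, and `drainEvents` are hypothetical names.

```ts
interface RunEvent {
  seq: number;
  type: string;
  data: Record<string, unknown>;
}

// The three terminal event types named above.
const TERMINAL_TYPES: ReadonlyArray<string> = ["result", "error", "cancelled"];

function isTerminal(event: RunEvent): boolean {
  return TERMINAL_TYPES.indexOf(event.type) !== -1;
}

interface EventHandlers {
  // `local_tool_call` is the only event that requires the SDK to act.
  onLocalToolCall: (event: RunEvent) => void;
  // Everything else is informational.
  onInformational?: (event: RunEvent) => void;
}

// Process events in order and stop at the first terminal one, which is returned
// so the caller can close the stream. Returns undefined if none was seen yet.
function drainEvents(events: ReadonlyArray<RunEvent>, handlers: EventHandlers): RunEvent | undefined {
  for (const event of events) {
    if (isTerminal(event)) return event;
    if (event.type === "local_tool_call") {
      handlers.onLocalToolCall(event);
    } else if (handlers.onInformational !== undefined) {
      handlers.onInformational(event);
    }
  }
  return undefined;
}
```

Events that arrive after the terminal one are ignored here, which matches closing the stream as soon as a terminal event is seen.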
|
|
@@ -395,8 +420,8 @@ progress text — it's not part of the canonical assistant response.
|
|
|
395
420
|
"data": {
|
|
396
421
|
"toolUseId": "tu_a",
|
|
397
422
|
"name": "github_search_repos",
|
|
398
|
-
"result": "..."
|
|
399
|
-
}
|
|
423
|
+
"result": "...", // truncated for display; never JSON-parsed by SDK
|
|
424
|
+
},
|
|
400
425
|
}
|
|
401
426
|
```
|
|
402
427
|
|
|
@@ -436,8 +461,8 @@ No extras. Dispatch by `name`.
|
|
|
436
461
|
"toolUseId": "tu_x",
|
|
437
462
|
"name": "compute_total",
|
|
438
463
|
"args": { "amount": 42, "currency": "USD" },
|
|
439
|
-
"kind": "local"
|
|
440
|
-
}
|
|
464
|
+
"kind": "local", // OR omitted (legacy)
|
|
465
|
+
},
|
|
441
466
|
}
|
|
442
467
|
```
|
|
443
468
|
|
|
@@ -455,17 +480,20 @@ dispatch to the right A2A client when it manages multiple peers.
|
|
|
455
480
|
"name": "intranet_hr_agent",
|
|
456
481
|
"args": { "message": "When does PTO reset?" },
|
|
457
482
|
"kind": "a2a_local",
|
|
458
|
-
"agentCard": {
|
|
483
|
+
"agentCard": {
|
|
484
|
+
// full Agent Card from the spec
|
|
459
485
|
"name": "Acme HR",
|
|
460
486
|
"url": "https://hr.intranet.acme/a2a",
|
|
461
|
-
"skills": [
|
|
487
|
+
"skills": [
|
|
488
|
+
/* ... */
|
|
489
|
+
],
|
|
462
490
|
/* ...all other fields the SDK shipped... */
|
|
463
|
-
}
|
|
464
|
-
}
|
|
491
|
+
},
|
|
492
|
+
},
|
|
465
493
|
}
|
|
466
494
|
```
|
|
467
495
|
|
|
468
|
-
`args.message` is
|
|
496
|
+
`args` is _always_ `{ "message": string }` for `a2a_local` — the
|
|
469
497
|
LLM's task is reduced to "what do I want to ask the peer in plain text?"
|
|
470
498
|
so the SDK doesn't have to re-derive an A2A `message` envelope from a
|
|
471
499
|
tool-specific schema.
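
Because the args shape is fixed and the card is echoed back, routing an `a2a_local` call needs very little code. The sketch below keys on `agentCard.url`, one of the dispatch options mentioned earlier; `A2aLocalToolCall` and `planA2aDispatch` are hypothetical names, and building the actual A2A request is left to whatever A2A client the SDK uses.

```ts
// Shape of `data` for an `a2a_local` call, following the example above.
interface A2aLocalToolCall {
  toolUseId: string;
  name: string;
  kind: "a2a_local";
  args: { message: string };
  agentCard: Record<string, unknown>;
}

// Decide where to send the call and what plain text to send.
function planA2aDispatch(call: A2aLocalToolCall): { peerUrl: string; text: string } {
  const url = call.agentCard["url"];
  if (typeof url !== "string" || url.length === 0) {
    throw new Error(`agent card for ${call.name} carries no url to dispatch on`);
  }
  if (typeof call.args.message !== "string") {
    throw new Error(`a2a_local call ${call.toolUseId} has no string message`);
  }
  return { peerUrl: url, text: call.args.message };
}
```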
|
|
@@ -481,23 +509,24 @@ parsing the tool name back into pieces.
|
|
|
481
509
|
"type": "local_tool_call",
|
|
482
510
|
"data": {
|
|
483
511
|
"toolUseId": "tu_z",
|
|
484
|
-
"name": "fs_read_file",
|
|
512
|
+
"name": "fs_read_file", // identical to what the SDK declared
|
|
485
513
|
"args": { "path": "/etc/hosts" },
|
|
486
514
|
"kind": "mcp_local",
|
|
487
|
-
"mcpServer": "fs",
|
|
488
|
-
"mcpToolName": "fs_read_file",
|
|
489
|
-
"mcpServerInfo": {
|
|
515
|
+
"mcpServer": "fs", // ref's `name` — SDK's MCP-client key
|
|
516
|
+
"mcpToolName": "fs_read_file", // duplicates `name` for the SDK's convenience
|
|
517
|
+
"mcpServerInfo": {
|
|
518
|
+
// present iff the spec carried `serverInfo`
|
|
490
519
|
"name": "mcp-server-filesystem",
|
|
491
|
-
"version": "0.4.1"
|
|
492
|
-
}
|
|
493
|
-
}
|
|
520
|
+
"version": "0.4.1",
|
|
521
|
+
},
|
|
522
|
+
},
|
|
494
523
|
}
|
|
495
524
|
```
|
|
496
525
|
|
|
497
526
|
The SDK's typical dispatch path is:
|
|
498
527
|
|
|
499
528
|
```ts
|
|
500
|
-
const client = mcpClients.get(call.mcpServer);
|
|
529
|
+
const client = mcpClients.get(call.mcpServer); // by SDK label
|
|
501
530
|
if (!client) throw new Error(`unknown MCP server ${call.mcpServer}`);
|
|
502
531
|
const result = await client.callTool({
|
|
503
532
|
name: call.mcpToolName,
|
|
@@ -509,7 +538,10 @@ const text = result.content
|
|
|
509
538
|
.join("\n");
|
|
510
539
|
await fetch(`${baseUrl}/agent-runs/${runId}/tool-results`, {
|
|
511
540
|
method: "POST",
|
|
512
|
-
headers: {
|
|
541
|
+
headers: {
|
|
542
|
+
"Content-Type": "application/json",
|
|
543
|
+
Authorization: `Bearer ${apiKey}`,
|
|
544
|
+
},
|
|
513
545
|
body: JSON.stringify({ toolUseId: call.toolUseId, result: text }),
|
|
514
546
|
});
|
|
515
547
|
```
|
|
@@ -523,20 +555,27 @@ await fetch(`${baseUrl}/agent-runs/${runId}/tool-results`, {
|
|
|
523
555
|
"data": {
|
|
524
556
|
"text": "Here's what I found...",
|
|
525
557
|
"turn": 0,
|
|
526
|
-
"finishReason": "tool_use",
|
|
527
|
-
"toolCalls": [
|
|
528
|
-
|
|
529
|
-
|
|
530
|
-
|
|
558
|
+
"finishReason": "tool_use", // optional; canonical lowercase token
|
|
559
|
+
"toolCalls": [
|
|
560
|
+
// optional; absent when the turn was text-only
|
|
561
|
+
{
|
|
562
|
+
"id": "call_abc",
|
|
563
|
+
"name": "search",
|
|
564
|
+
"input": {
|
|
565
|
+
/* JSON Schema-matching args */
|
|
566
|
+
},
|
|
567
|
+
},
|
|
568
|
+
],
|
|
569
|
+
},
|
|
531
570
|
}
|
|
532
571
|
```
|
|
533
572
|
|
|
534
|
-
| Field
|
|
535
|
-
|
|
|
536
|
-
| `text`
|
|
537
|
-
| `turn`
|
|
538
|
-
| `finishReason`
|
|
539
|
-
| `toolCalls`
|
|
573
|
+
| Field | Type | Required | Notes |
|
|
574
|
+
| -------------- | ------------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
575
|
+
| `text` | string | yes | Full assistant text for this turn (concatenation of every preceding `assistant_delta` for this turn, plus any non-streaming snapshot the engine appended at close). May be empty when the turn was tool-only. |
|
|
576
|
+
| `turn` | integer | yes | 0-based tool-turn index this assistant message closes. Useful for SDK clients pairing the message with the subsequent `tool_result` rows. |
|
|
577
|
+
| `finishReason` | string\|null | no | Canonical lowercase stop reason normalized across providers (`"end_turn"`, `"tool_use"`, `"max_tokens"`, `"refusal"`, `"malformed_function_call"`, …). Pulled from the engine's per-turn `stopReason` after normalization — Gemini's `MAX_TOKENS` lands as `"max_tokens"`, OpenAI's `length` lands as `"max_tokens"`, etc. `null` / omitted when the provider did not report one. |
|
|
578
|
+
| `toolCalls` | array | no | Tool calls the model emitted on this turn (id, sanitized pipeline-side name, JSON-matching `input`). Omitted when the model did not call any tools. |
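
The field table above maps naturally onto a TypeScript type. The sketch below is illustrative only: the names `ToolCall`, `AssistantMessageData`, and `summarizeAssistantMessage` are hypothetical and not part of the SDK. Only the field names and their optionality are taken from the table.

```typescript
// Hypothetical type names; only the field names come from the table above.
interface ToolCall {
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface AssistantMessageData {
  text: string; // may be empty on tool-only turns
  turn: number; // 0-based
  finishReason?: string | null;
  toolCalls?: ToolCall[];
}

// Summarise one assistant turn, treating the optional fields defensively.
function summarizeAssistantMessage(data: AssistantMessageData): string {
  const calls = data.toolCalls ?? []; // omitted when the turn was text-only
  const reason = data.finishReason ?? "unreported"; // null or omitted
  const names = calls.map((c) => c.name).join(", ") || "none";
  return `turn ${data.turn}: finish=${reason}, tools=${names}, chars=${data.text.length}`;
}
```

Defaulting `toolCalls` to an empty array and `finishReason` to a placeholder keeps downstream code free of `undefined` checks.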
|
|
540
579
|
|
|
541
580
|
**Emission frequency.** Exactly **one** `assistant_message` per completed
|
|
542
581
|
assistant turn — including the last turn before a terminal `error`. SDK
|
|
@@ -548,7 +587,7 @@ and avoid stitching a turn out of `assistant_delta` chunks themselves
|
|
|
548
587
|
Gemini `MAX_TOKENS` while emitting `outputSchema` JSON), the last
|
|
549
588
|
`assistant_message` preceding the `error` carries the partial text plus
|
|
550
589
|
`finishReason: "max_tokens"`. The terminal `error` event then carries the
|
|
551
|
-
|
|
590
|
+
_same_ text on `data.partialText` so reconnect / replay sees both pieces
|
|
552
591
|
without depending on event ordering.
|
|
553
592
|
|
|
554
593
|
### 4.5 `loop_detected`
|
|
@@ -563,11 +602,11 @@ without depending on event ordering.
|
|
|
563
602
|
"data": { "consecutiveCount": 6, "hardCutoff": true, "tools": ["recall"] } }
|
|
564
603
|
```
|
|
565
604
|
|
|
566
|
-
| Field | Type | Notes
|
|
567
|
-
| ------------------ | ------- |
|
|
568
|
-
| `consecutiveCount` | integer | Length of the identical-batch streak that just tripped the threshold (`>= consecutiveThreshold`).
|
|
605
|
+
| Field | Type | Notes |
|
|
606
|
+
| ------------------ | ------- | ---------------------------------------------------------------------------------------------------------------------------- |
|
|
607
|
+
| `consecutiveCount` | integer | Length of the identical-batch streak that just tripped the threshold (`>= consecutiveThreshold`). |
|
|
569
608
|
| `hardCutoff` | boolean | `false` for the soft nudge round; `true` once the pipeline forces finalisation. The SDK may see one of each in a single run. |
|
|
570
|
-
| `tools` | array | Names of the tool calls in the looping batch (no args — those are persisted on the matching `tool_result` events).
|
|
609
|
+
| `tools` | array | Names of the tool calls in the looping batch (no args — those are persisted on the matching `tool_result` events). |
|
|
571
610
|
|
|
572
611
|
Observability only: the synthetic skip + steering nudge are emitted on the
|
|
573
612
|
normal `tool_result` and assistant-message channels by the time this event
|
|
@@ -580,14 +619,17 @@ See §8 for the wire-spec field that controls thresholds.
|
|
|
580
619
|
### 4.6 `tool_budget_exceeded`
|
|
581
620
|
|
|
582
621
|
```jsonc
|
|
583
|
-
{
|
|
584
|
-
"
|
|
622
|
+
{
|
|
623
|
+
"seq": 14,
|
|
624
|
+
"type": "tool_budget_exceeded",
|
|
625
|
+
"data": { "tool": "recall", "maxCalls": 4, "callIndex": 5 },
|
|
626
|
+
}
|
|
585
627
|
```
|
|
586
628
|
|
|
587
|
-
| Field | Type | Notes
|
|
588
|
-
| ----------- | ------- |
|
|
589
|
-
| `tool` | string | Logical tool name as the model saw it (matches the key in `spec.toolBudgets`).
|
|
590
|
-
| `maxCalls` | integer | Configured cap.
|
|
629
|
+
| Field | Type | Notes |
|
|
630
|
+
| ----------- | ------- | ----------------------------------------------------------------------------------------------------------- |
|
|
631
|
+
| `tool` | string | Logical tool name as the model saw it (matches the key in `spec.toolBudgets`). |
|
|
632
|
+
| `maxCalls` | integer | Configured cap. |
|
|
591
633
|
| `callIndex` | integer | 1-based count of attempts to call this tool over the run lifetime; always strictly greater than `maxCalls`. |
|
|
592
634
|
|
|
593
635
|
Observability only: the synthetic "budget exceeded — pivot or finalize"
|
|
@@ -601,14 +643,25 @@ See §8 for the wire-spec field that defines budgets.
|
|
|
601
643
|
### 4.7 Terminal events
|
|
602
644
|
|
|
603
645
|
```jsonc
|
|
604
|
-
|
|
646
|
+
// Every terminal `result` and `error` event also carries `tokens`, `turns`,
|
|
647
|
+
// and `model` for cost attribution and dashboards — see §4.7.1.
|
|
648
|
+
{ "seq": 14, "type": "result", "data": {
|
|
649
|
+
"ok": true,
|
|
650
|
+
"text": "...",
|
|
651
|
+
"tokens": { "inputTokens": 1283, "cachedTokens": 512, "reasoningTokens": 96, "outputTokens": 240 },
|
|
652
|
+
"turns": 3,
|
|
653
|
+
"model": { "id": "platform:demo", "provider": "openai", "vendorModelId": "gpt-5.4-mini", "reasoningEffort": "low" }
|
|
654
|
+
} }
|
|
605
655
|
{ "seq": 14, "type": "error", "data": {
|
|
606
656
|
"error": "Model output was truncated (stop_reason=max_tokens). …",
|
|
607
657
|
"code": "truncation", // mirrors `errorClass`; legacy alias
|
|
608
658
|
"errorClass": "truncation", // canonical category (see below)
|
|
609
659
|
"finishReason": "max_tokens", // canonical lowercase stop reason
|
|
610
660
|
"partialText": "{\n \"answer\":… (truncated JSON) …",
|
|
611
|
-
"retryable": false
|
|
661
|
+
"retryable": false, // optional; per-class retry hint
|
|
662
|
+
"tokens": { "inputTokens": 8190, "cachedTokens": 0, "reasoningTokens": 0, "outputTokens": 1024 },
|
|
663
|
+
"turns": 1,
|
|
664
|
+
"model": { "id": "provider:cmf…", "provider": "google", "vendorModelId": "gemini-2.5-pro" }
|
|
612
665
|
} }
|
|
613
666
|
{ "seq": 14, "type": "cancelled", "data": { "reason": "user" } }
|
|
614
667
|
```
|
|
@@ -620,25 +673,116 @@ SSE stream.
|
|
|
620
673
|
with structured triage attributes when the failure carried a salvage path
|
|
621
674
|
(typically truncation, upstream deadline, or max-budget-with-text):
|
|
622
675
|
|
|
623
|
-
| Field | Type
|
|
624
|
-
| -------------- |
|
|
625
|
-
| `error` | string
|
|
626
|
-
| `code` | string
|
|
627
|
-
| `errorClass` | string
|
|
628
|
-
| `finishReason` | string\|null | no
|
|
629
|
-
| `partialText` | string
|
|
630
|
-
| `retryable` | boolean
|
|
676
|
+
| Field | Type | Required | Notes |
|
|
677
|
+
| -------------- | ------------ | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
678
|
+
| `error` | string | yes | Human-readable message (also persisted on `EphemeralAgentRun.error`). |
|
|
679
|
+
| `code` | string | yes | Legacy alias for `errorClass`. Equals `errorClass` when present; otherwise a small lowercase token (`"error"`, `"invalid_spec"`, `"worker_error"`, …) the SDK can switch on. |
|
|
680
|
+
| `errorClass` | string | no | Canonical category. One of `"rate_limit"`, `"overloaded"`, `"server"`, `"context_window"` (input too big), `"truncation"` (output budget exhausted), `"invalid_request"`, `"auth"`, `"timeout"`, `"local_timeout"`, `"upstream_deadline"`, `"unknown"`. New categories may land additively. |
|
|
681
|
+
| `finishReason` | string\|null | no | Canonical lowercase stop reason normalized across providers (`"max_tokens"`, `"refusal"`, `"malformed_function_call"`, …). When present, mirrors the value on the last `assistant_message`. |
|
|
682
|
+
| `partialText` | string | no | **Best-effort raw bytes** the model emitted before the failure. For `outputSchema` runs this is likely **incomplete JSON** that will fail `JSON.parse` — see §7 below. Also persisted on `EphemeralAgentRun.finalText` so the Calls UI can render it alongside a truncation banner. |
|
|
683
|
+
| `retryable` | boolean | no | Coarse retry hint inherited from the pipeline's error classifier. Informational; the SDK still owns the actual retry decision. |
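
Because the SDK owns the retry decision, a client might switch on `errorClass` and fall back to `retryable` only for classes it does not recognise. The following is a minimal sketch of such a policy. The grouping of classes into retry buckets is an assumption made for illustration, not behaviour defined by the protocol, and `decideRetry` is a hypothetical name.

```typescript
interface TerminalErrorData {
  error: string;
  code: string;
  errorClass?: string;
  finishReason?: string | null;
  partialText?: string;
  retryable?: boolean;
}

type RetryDecision = "retry" | "retry_with_backoff" | "give_up";

// Hypothetical client-side policy; the bucket assignments are assumptions.
function decideRetry(data: TerminalErrorData): RetryDecision {
  const cls = data.errorClass ?? data.code; // `code` is the legacy alias
  switch (cls) {
    case "rate_limit":
    case "overloaded":
      return "retry_with_backoff";
    case "server":
    case "timeout":
    case "upstream_deadline":
      return "retry";
    case "truncation":
    case "context_window":
    case "invalid_request":
    case "auth":
      return "give_up"; // resending the same request is unlikely to help
    default:
      // Unknown or newly added class: defer to the server's coarse hint.
      return data.retryable === true ? "retry" : "give_up";
  }
}
```

The `default` branch matters because new categories may be added; an exhaustive switch without it would silently drop them.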
|
|
631
684
|
|
|
632
685
|
When `errorClass` is `"truncation"`, the `EphemeralAgentRun` row that the
|
|
633
686
|
SDK can re-fetch via `GET /agent-runs/:runId` will have:
|
|
634
687
|
|
|
635
|
-
| Field | Value
|
|
636
|
-
| --------------- |
|
|
637
|
-
| `status` | `"failed"`
|
|
638
|
-
| `finalText` | Same string as `data.partialText` (so SDKs can ignore the SSE stream and still recover the salvage).
|
|
639
|
-
| `error` | Same string as `data.error`.
|
|
688
|
+
| Field | Value |
|
|
689
|
+
| --------------- | ------------------------------------------------------------------------------------------------------------------------ |
|
|
690
|
+
| `status` | `"failed"` |
|
|
691
|
+
| `finalText` | Same string as `data.partialText` (so SDKs can ignore the SSE stream and still recover the salvage). |
|
|
692
|
+
| `error` | Same string as `data.error`. |
|
|
640
693
|
| `failureReason` | `{ "errorClass": "truncation", "finishReason": "max_tokens" }` (JSON object, future-proof for additional triage fields). |
|
|
641
694
|
|
|
695
|
+
### 4.7.1 Cost-attribution fields (`tokens`, `turns`, `model`)
|
|
696
|
+
|
|
697
|
+
Every terminal `result` and `error` event carries three additional
|
|
698
|
+
fields so callers can drive cost dashboards, per-turn budgets, and
|
|
699
|
+
provider/model spend reports without a follow-up `GET /agent-runs/:runId`
|
|
700
|
+
round trip. The same fields are persisted on the `EphemeralAgentRun`
|
|
701
|
+
row (columns `tokens` / `turns` / `model`) and surfaced by that
|
|
702
|
+
endpoint.
|
|
703
|
+
|
|
704
|
+
| Field | Type | Notes |
|
|
705
|
+
| -------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
706
|
+
| `tokens` | object | Per-run token totals aggregated across every model invocation. Schema below. |
|
|
707
|
+
| `turns` | int | Total `engine.completeTurn(...)` invocations for the run, **including** the failing call when a run errors out mid-loop. A single-shot run reports `1`; a tool loop is `>= 2`. Tracked by the pipeline as `modelInvocations` in `PipelineLoopState` and emitted on the terminal `PipelineEvent` (see `packages/agent-pipeline/src/types.ts`). Distinct from "tool turns" — `turns` counts **model invocations**, regardless of whether the model called any tools. |
|
|
708
|
+
| `model` | object | Resolved model that actually executed the run. Schema below. |
|
|
709
|
+
|
|
710
|
+
Always present on terminal events for runs created against
|
|
711
|
+
**MANTYX ≥ 2026-09** servers. Older servers omit these fields entirely;
|
|
712
|
+
SDK clients (TS/Go/Python) detect "no usage data" by checking that
|
|
713
|
+
`model.provider` is empty / falsy. JSON keys follow MANTYX's standard
|
|
714
|
+
camelCase wire convention.
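
The "no usage data" check described above can be sketched as a one-line predicate. `TerminalDataLike` and `hasUsageData` are hypothetical names, and the interface lists only the fields this check reads.

```typescript
// Hypothetical shape: only the fields this check needs.
interface TerminalDataLike {
  model?: { provider?: string } | null;
}

// True when the terminal event carries cost-attribution data.
// Older servers omit `model` entirely, so every step is optional.
function hasUsageData(data: TerminalDataLike): boolean {
  return Boolean(data.model?.provider);
}
```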
|
|
715
|
+
|
|
716
|
+
**`tokens` schema** — mirrors the wire shape produced by
|
|
717
|
+
`tokenUsageToWireTokens` in `packages/ts-sdk/src/usage-wire.ts`, which
|
|
718
|
+
is the single source of truth across the TS SDK return value, REST/SSE,
|
|
719
|
+
and A2A surfaces:
|
|
720
|
+
|
|
721
|
+
| Field | Type | Notes |
|
|
722
|
+
| ----------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
723
|
+
| `inputTokens` | int | **Total billable input** — fresh prompt tokens **plus** the cached-read slice the provider still bills (at a discount) **plus** any cache-creation tokens **plus** tool-prompt tokens. Equal to the sum of every provider-reported input bucket for the run. |
|
|
724
|
+
| `cachedTokens` | int | The discounted slice of `inputTokens` that came from a prompt cache hit (Anthropic prompt caching, OpenAI cached prompt, Gemini implicit cache). `0` when the provider doesn't report cache reads or the run didn't hit cache. |
|
|
725
|
+
| `reasoningTokens` | int | Non-visible thinking tokens. **Already counted inside `outputTokens`** — surfaced separately so dashboards can break out "thinking cost" vs visible output. `0` when the model didn't reason or didn't report it. |
|
|
726
|
+
| `outputTokens` | int | **All** tokens the model emitted for this run, visible + reasoning. Matches the provider's "completion tokens" / "output tokens" billing line. |
|
|
727
|
+
|
|
728
|
+
`inputTokens` and `outputTokens` together cover every billable token the
|
|
729
|
+
run consumed; `cachedTokens` and `reasoningTokens` are diagnostic
|
|
730
|
+
breakdowns _inside_ those two totals (not separate buckets to be added).
|
|
731
|
+
All four are clamped to non-negative integers — a misbehaving engine
|
|
732
|
+
emitting `NaN` or negatives cannot poison the JSON snapshot or Prisma
|
|
733
|
+
write.
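
To make the "breakdowns inside the totals" rule concrete, here is a small sketch that splits a `tokens` object into non-overlapping parts. `clampCount` and `breakdown` are hypothetical helpers written for this example; they are not the server's clamping code.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Clamp to a non-negative integer; NaN, negatives, and non-numbers become 0.
function clampCount(value: unknown): number {
  const n = typeof value === "number" ? value : Number.NaN;
  return Number.isFinite(n) && n > 0 ? Math.floor(n) : 0;
}

// Split the two totals into four non-overlapping slices.
function breakdown(tokens: WireTokens) {
  const input = clampCount(tokens.inputTokens);
  const cached = Math.min(clampCount(tokens.cachedTokens), input);
  const output = clampCount(tokens.outputTokens);
  const reasoning = Math.min(clampCount(tokens.reasoningTokens), output);
  return {
    uncachedInput: input - cached,
    cachedInput: cached,
    visibleOutput: output - reasoning,
    reasoningOutput: reasoning,
    totalBillable: input + output, // never add cached or reasoning again
  };
}
```

With the `result` example from §4.7 (`inputTokens: 1283`, `cachedTokens: 512`, `reasoningTokens: 96`, `outputTokens: 240`), this yields 771 uncached input tokens, 144 visible output tokens, and 1523 billable tokens in total.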
|
|
734
|
+
|
|
735
|
+
**`model` schema** — fields the platform stamps onto every successful
|
|
736
|
+
or failed run via `services/agent-runs/resolve-model.ts`:
|
|
737
|
+
|
|
738
|
+
| Field | Type | Notes |
|
|
739
|
+
| ----------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
740
|
+
| `id` | string | Catalog id — the same string a caller would pass back as `modelId` (in §2.1) to re-select this exact entry (e.g. `"platform:demo"`, `"provider:cmf…"`). Empty string against legacy fallbacks that didn't synthesise a catalog id. |
|
|
741
|
+
| `provider` | string | Lowercase provider id: `"openai"`, `"anthropic"`, `"google"`, `"azure-openai"`. |
|
|
742
|
+
| `vendorModelId` | string | The model id the platform actually sent to the provider (e.g. `"gpt-5.4-mini"`, `"claude-opus-4-7"`, `"gemini-2.5-pro"`). Carried through from the `model` field on `AgentSpec` after resolution. |
|
|
743
|
+
| `reasoningEffort` | string | Optional. `"off"`, `"low"`, `"medium"`, `"high"`. Computed via `resolveReasoningEffortForOptions` (`packages/ts-sdk/src/usage-wire.ts`) from the unified 0–100 `reasoningLevel` knob: 0 → `"off"`, 1–35 → `"low"`, 36–65 → `"medium"`, 66–100 → `"high"`. Omitted when the provider doesn't expose a reasoning-level knob or the run didn't request one. |
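
The bucket boundaries in the `reasoningEffort` row can be written out as a short function. This is a sketch of the documented ranges only, not the actual `resolveReasoningEffortForOptions` implementation. The name `effortForLevel`, the rounding step, and the handling of non-finite input are assumptions.

```typescript
type ReasoningEffort = "off" | "low" | "medium" | "high";

// Map the unified 0-100 level onto the four documented buckets.
function effortForLevel(level: number): ReasoningEffort {
  // Assumption: non-finite input is treated as "off".
  if (!Number.isFinite(level)) return "off";
  // Assumption: fractional levels are rounded and out-of-range ones clamped.
  const clamped = Math.min(100, Math.max(0, Math.round(level)));
  if (clamped === 0) return "off";
  if (clamped <= 35) return "low";
  if (clamped <= 65) return "medium";
  return "high";
}
```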
|
|
744
|
+
|
|
745
|
+
**Per-provider token mapping.** Provider responses vary in how they
|
|
746
|
+
report token usage. MANTYX normalises them into the wire shape above as
|
|
747
|
+
follows (see `packages/agent-pipeline/src/engines/*` for the engine-
|
|
748
|
+
side aggregation that feeds `tokenUsageToWireTokens`):
|
|
749
|
+
|
|
750
|
+
| Provider | `inputTokens` ← | `cachedTokens` ← | `reasoningTokens` ← | `outputTokens` ← |
|
|
751
|
+
| --------- | ----------------------------------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
|
|
752
|
+
| OpenAI | `usage.prompt_tokens` (already includes cached read tokens) | `usage.prompt_tokens_details.cached_tokens` | `usage.completion_tokens_details.reasoning_tokens` | `usage.completion_tokens` |
|
|
753
|
+
| Anthropic | `usage.input_tokens` + `usage.cache_read_input_tokens` + `usage.cache_creation_input_tokens` | `usage.cache_read_input_tokens` | (extended-thinking tokens; folded into `output_tokens` by the provider) | `usage.output_tokens` |
|
|
754
|
+
| Google | `usageMetadata.promptTokenCount` + `usageMetadata.cachedContentTokenCount` + tool-prompt tokens | `usageMetadata.cachedContentTokenCount` | `usageMetadata.thoughtsTokenCount` | `usageMetadata.candidatesTokenCount` (or `totalTokenCount - promptTokenCount` for older Gemini SDKs) |
|
|
755
|
+
|
|
756
|
+
If a provider doesn't report a given bucket the corresponding field is
|
|
757
|
+
`0`, never `null`.
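
As an illustration of the mapping table, the sketch below normalises OpenAI-style and Anthropic-style usage objects into the wire shape. It is not the real `tokenUsageToWireTokens`: the function names are hypothetical, the input interfaces list only the provider fields named in the table, and reporting `reasoningTokens` as `0` for Anthropic is an assumption (the table only says those tokens are folded into `output_tokens`).

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Missing, negative, or non-finite buckets become 0, never null.
const count = (v: number | undefined): number =>
  typeof v === "number" && Number.isFinite(v) && v > 0 ? Math.floor(v) : 0;

interface OpenAIUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}

interface AnthropicUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

function fromOpenAI(u: OpenAIUsage): WireTokens {
  return {
    inputTokens: count(u.prompt_tokens), // already includes cached reads
    cachedTokens: count(u.prompt_tokens_details?.cached_tokens),
    reasoningTokens: count(u.completion_tokens_details?.reasoning_tokens),
    outputTokens: count(u.completion_tokens),
  };
}

function fromAnthropic(u: AnthropicUsage): WireTokens {
  return {
    inputTokens:
      count(u.input_tokens) +
      count(u.cache_read_input_tokens) +
      count(u.cache_creation_input_tokens),
    cachedTokens: count(u.cache_read_input_tokens),
    reasoningTokens: 0, // assumption: no separate bucket is available
    outputTokens: count(u.output_tokens),
  };
}
```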
|
|
758
|
+
|
|
759
|
+
**Tool-loop accounting.** When the run executes tool turns, every
|
|
760
|
+
`engine.completeTurn(...)` invocation contributes its usage to the
|
|
761
|
+
aggregated `tokens` object — so a run with one tool round (model →
|
|
762
|
+
tool → model) reports `turns: 2` and the **sum** of both model calls'
|
|
763
|
+
token usage. The counter is incremented in a `try/finally` around the
|
|
764
|
+
engine call inside `runMainPipelineLoop`
|
|
765
|
+
(`packages/agent-pipeline/src/pipeline.ts`), so the failing call still
|
|
766
|
+
counts toward `turns` even when the engine throws. The terminal event
|
|
767
|
+
carries cumulative totals only; per-turn observability lives on
|
|
768
|
+
`assistant_message` events.
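
The accounting rule above can be sketched as follows. This is not the code in `runMainPipelineLoop`; it is a simplified, synchronous stand-in with hypothetical names (`runLoop`, `TurnOutcome`, `countedTurn`) that shows why a throwing call still increments `turns` while contributing no token usage.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

interface TurnOutcome {
  usage: WireTokens;
  done: boolean; // true when the model produced a final answer
}

// Simplification: the real engine call is asynchronous.
function runLoop(completeTurn: (turn: number) => TurnOutcome, maxTurns = 10) {
  const tokens: WireTokens = { inputTokens: 0, cachedTokens: 0, reasoningTokens: 0, outputTokens: 0 };
  let turns = 0;
  let error: unknown;

  const countedTurn = (turn: number): TurnOutcome => {
    try {
      return completeTurn(turn);
    } finally {
      turns += 1; // runs on success and when the call throws
    }
  };

  try {
    for (let turn = 0; turn < maxTurns; turn++) {
      const outcome = countedTurn(turn);
      tokens.inputTokens += outcome.usage.inputTokens;
      tokens.cachedTokens += outcome.usage.cachedTokens;
      tokens.reasoningTokens += outcome.usage.reasoningTokens;
      tokens.outputTokens += outcome.usage.outputTokens;
      if (outcome.done) break;
    }
  } catch (e) {
    error = e; // totals gathered so far are still returned
  }
  return { tokens, turns, error };
}
```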
|
|
769
|
+
|
|
770
|
+
**A2A exposure.** The MANTYX-hosted A2A endpoint
|
|
771
|
+
(`POST /api/a2a/{workspaceSlug}/agents/{agentSlug}`) returns the same
|
|
772
|
+
triple under `result.metadata.mantyx`. The block is omitted entirely
|
|
773
|
+
against legacy runners that haven't implemented the optional
|
|
774
|
+
`runWithUsage` method on `AgentRunner` (see
|
|
775
|
+
`packages/ts-sdk/src/a2a/adapter.ts`); cross-platform A2A clients
|
|
776
|
+
should treat its absence as "no usage data" rather than as zero usage.
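
A cross-platform A2A client could read the block defensively along these lines. This sketch assumes the triple appears as `{ tokens, turns, model }` directly under `result.metadata.mantyx`; the exact nesting of the block is not spelled out here, so treat the shape, `MantyxUsage`, and `readMantyxUsage` as hypothetical.

```typescript
interface MantyxUsage {
  tokens: Record<string, unknown>;
  turns: number;
  model: Record<string, unknown>;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns undefined ("no usage data") rather than a zero-filled object.
function readMantyxUsage(response: unknown): MantyxUsage | undefined {
  if (!isRecord(response)) return undefined;
  const result = response["result"];
  if (!isRecord(result)) return undefined;
  const metadata = result["metadata"];
  if (!isRecord(metadata)) return undefined;
  const block = metadata["mantyx"];
  if (!isRecord(block)) return undefined;
  const tokens = block["tokens"];
  const turns = block["turns"];
  const model = block["model"];
  if (!isRecord(tokens) || typeof turns !== "number" || !isRecord(model)) return undefined;
  return { tokens, turns, model };
}
```

Returning `undefined` keeps "absent" distinguishable from "zero usage", which matters when aggregating spend across runners.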
|
|
777
|
+
|
|
778
|
+
**SDK return-value exposure.** The TS SDK exposes the same triple via
|
|
779
|
+
the opt-in `runAgentWithUsage` (returning a `RunAgentResult` with
|
|
780
|
+
`text`, `tokens`, `turns`, `model`). The legacy `runAgent` still
|
|
781
|
+
returns just `string` for backward compatibility — see
|
|
782
|
+
`packages/ts-sdk/src/run.ts`. Go and Python SDKs surface the fields
|
|
783
|
+
directly on the existing `RunResult` struct/dataclass (additive,
|
|
784
|
+
non-breaking since those return types were already objects).
|
|
785
|
+
|
|
642
786
|
---
|
|
643
787
|
|
|
644
788
|
## 5. SDK → MANTYX: tool-result POST
|
|
@@ -657,19 +801,19 @@ Authorization: Bearer <api-key>
|
|
|
657
801
|
}
|
|
658
802
|
```
|
|
659
803
|
|
|
660
|
-
| Field
|
|
661
|
-
|
|
|
662
|
-
| `toolUseId`
|
|
663
|
-
| `result`
|
|
664
|
-
| `error`
|
|
804
|
+
| Field | Type | Required | Notes |
|
|
805
|
+
| ----------- | ------ | -------- | ------------------------------------------------------------------------------------------------------------------------------ |
|
|
806
|
+
| `toolUseId` | string | yes | Must match a pending `local_tool_call`'s id. |
|
|
807
|
+
| `result` | string | one-of | Successful textual result (≤ 2 MB). For MCP tools, flatten content blocks to text. For A2A delegations, the peer's reply text. |
|
|
808
|
+
| `error` | string | one-of | Human-readable failure message (≤ 8 KB). Surfaced to the model so it can recover. |
|
|
665
809
|
|
|
666
810
|
Server response codes:
|
|
667
811
|
|
|
668
|
-
| Code
|
|
669
|
-
|
|
|
670
|
-
| `204` | Accepted; the runner was woken and will resume the model loop.
|
|
671
|
-
| `400` | Body failed Zod validation (missing `toolUseId`, both/neither of `result`/`error`, etc.).
|
|
672
|
-
| `404` | `unknown_tool_use` — `toolUseId` doesn't match any pending call (already answered or unknown id).
|
|
812
|
+
| Code | When |
|
|
813
|
+
| ----- | ------------------------------------------------------------------------------------------------------------------- |
|
|
814
|
+
| `204` | Accepted; the runner was woken and will resume the model loop. |
|
|
815
|
+
| `400` | Body failed Zod validation (missing `toolUseId`, both/neither of `result`/`error`, etc.). |
|
|
816
|
+
| `404` | `unknown_tool_use` — `toolUseId` doesn't match any pending call (already answered or unknown id). |
|
|
673
817
|
| `409` | `run_terminal` — the run already finished (success, failure, cancel, or local-tool timeout). The result is dropped. |
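
The body rules and response codes above can be checked client-side before sending. The sketch below is a hypothetical helper pair, not SDK API. It assumes the size limits are measured in UTF-8 bytes and that "MB" and "KB" are binary units; neither is stated in the tables.

```typescript
type ToolResultBody =
  | { toolUseId: string; result: string }
  | { toolUseId: string; error: string };

const MAX_RESULT_BYTES = 2 * 1024 * 1024; // assumption: 2 MB as binary megabytes
const MAX_ERROR_BYTES = 8 * 1024; // assumption: 8 KB as binary kilobytes
const utf8Bytes = (s: string): number => new TextEncoder().encode(s).length;

// Build a body that satisfies the one-of rule, or throw before any request is made.
function buildToolResultBody(
  toolUseId: string,
  outcome: { result?: string; error?: string },
): ToolResultBody {
  if (toolUseId.length === 0) throw new Error("toolUseId is required");
  const { result, error } = outcome;
  if ((result === undefined) === (error === undefined)) {
    throw new Error("provide exactly one of result or error");
  }
  if (result !== undefined) {
    if (utf8Bytes(result) > MAX_RESULT_BYTES) throw new Error("result exceeds 2 MB");
    return { toolUseId, result };
  }
  const message = error as string; // defined, by the one-of check above
  if (utf8Bytes(message) > MAX_ERROR_BYTES) throw new Error("error exceeds 8 KB");
  return { toolUseId, error: message };
}

// Translate the documented status codes into short labels.
function describeToolResultStatus(status: number): string {
  switch (status) {
    case 204:
      return "accepted";
    case 400:
      return "invalid body";
    case 404:
      return "unknown_tool_use";
    case 409:
      return "run_terminal";
    default:
      return `unexpected status ${status}`;
  }
}
```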
|
|
674
818
|
|
|
675
819
|
The runner enforces a per-call `localToolTimeoutMs` (default 5 minutes).
|
|
@@ -684,20 +828,20 @@ After timeout the model loop unblocks with a synthetic
|
|
|
684
828
|
`spec.reasoningLevel` controls the LLM's extended-thinking effort. Two
|
|
685
829
|
input shapes are accepted; both map to a numeric `0–100` internally.
|
|
686
830
|
|
|
687
|
-
| Form
|
|
688
|
-
|
|
|
689
|
-
| **String**
|
|
690
|
-
| **Number**
|
|
831
|
+
| Form | Values | Notes |
|
|
832
|
+
| ---------- | -------------------------------------- | ---------------------------------------------------------- |
|
|
833
|
+
| **String** | `"off"`, `"low"`, `"medium"`, `"high"` | Snaps to `0`, `30`, `50`, `80` (matches the web composer). |
|
|
834
|
+
| **Number** | integer `0`–`100` | Pass-through. `0` explicitly disables provider thinking. |
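
The two accepted input forms reduce to a single number as in the sketch below. `normalizeReasoningLevel` is a hypothetical helper; rejecting out-of-range or non-integer numbers with an exception is an assumption about how a client might choose to fail early.

```typescript
type ReasoningLevelInput = "off" | "low" | "medium" | "high" | number;

// Snap points for the string form, as listed in the table above.
const SNAP: Record<"off" | "low" | "medium" | "high", number> = {
  off: 0,
  low: 30,
  medium: 50,
  high: 80,
};

function normalizeReasoningLevel(input: ReasoningLevelInput): number {
  if (typeof input === "string") return SNAP[input];
  if (!Number.isInteger(input) || input < 0 || input > 100) {
    throw new RangeError("reasoningLevel must be an integer from 0 to 100");
  }
  return input; // numbers pass through unchanged
}
```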
|
|
691
835
|
|
|
692
836
|
Per provider:
|
|
693
837
|
|
|
694
|
-
| Provider
|
|
695
|
-
|
|
|
696
|
-
| OpenAI Responses (o-series, GPT-5.x) | `reasoning.effort`
|
|
697
|
-
| Gemini ≥ 3
|
|
698
|
-
| Gemini ≤ 2.5
|
|
699
|
-
| Anthropic / Bedrock-Anthropic
|
|
700
|
-
| xAI Grok, others
|
|
838
|
+
| Provider | Knob driven by `reasoningLevel` |
|
|
839
|
+
| ------------------------------------ | -------------------------------------------------------------------- |
|
|
840
|
+
| OpenAI Responses (o-series, GPT-5.x) | `reasoning.effort` |
|
|
841
|
+
| Gemini ≥ 3 | `thinkingConfig.thinkingLevel` |
|
|
842
|
+
| Gemini ≤ 2.5 | `thinkingConfig.thinkingBudget` (token budget; scaled) |
|
|
843
|
+
| Anthropic / Bedrock-Anthropic | extended thinking budget (≈ 512 tokens at `low` → ≈ 8 000 at `high`) |
|
|
844
|
+
| xAI Grok, others | ignored |
|
|
701
845
|
|
|
702
846
|
When `reasoningLevel > 0` and the provider supports it, the SSE stream
|
|
703
847
|
will include `thinking_delta` events alongside `assistant_delta`.
|
|
@@ -718,21 +862,21 @@ guaranteed-parseable JSON matching the supplied schema.
|
|
|
718
862
|
}
|
|
719
863
|
```
|
|
720
864
|
|
|
721
|
-
| Field | Type | Required | Notes
|
|
722
|
-
| -------- | ------ | -------- |
|
|
723
|
-
| `name` | string | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`.
|
|
865
|
+
| Field | Type | Required | Notes |
|
|
866
|
+
| -------- | ------ | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
867
|
+
| `name` | string | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. |
|
|
724
868
|
| `schema` | object | yes | JSON Schema for the assistant text. Root must be a JSON object — most providers reject array/scalar roots in structured-output mode. Passed through verbatim; MANTYX does not validate the schema's contents. |
|
|
725
869
|
|
|
726
870
|
Per provider:
|
|
727
871
|
|
|
728
|
-
| Provider
|
|
729
|
-
|
|
|
730
|
-
| OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every `completeTurn` (compatible with tool calls).
|
|
731
|
-
| Gemini 3+ (any turn)
|
|
732
|
-
| Gemini ≤ 2.5 with no tools
|
|
733
|
-
| Gemini ≤ 2.5 **with tools**
|
|
734
|
-
| Anthropic / Bedrock-Anthropic
|
|
735
|
-
| xAI Grok, others
|
|
872
|
+
| Provider | How the schema is enforced |
|
|
873
|
+
| --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
874
|
+
| OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every `completeTurn` (compatible with tool calls). |
|
|
875
|
+
| Gemini 3+ (any turn) | `responseMimeType: "application/json"` + `responseJsonSchema` on every `completeTurn`. Gemini 3 accepts the schema alongside `functionDeclarations`. |
|
|
876
|
+
| Gemini ≤ 2.5 with no tools | Same as Gemini 3+: `responseMimeType: "application/json"` + `responseJsonSchema`. |
|
|
877
|
+
| Gemini ≤ 2.5 **with tools** | Synthetic `set_model_response` function declaration is injected; its `parametersJsonSchema` is the supplied schema. The system instruction is augmented to direct the model to call this tool with the final answer. The engine intercepts the call, hides it from the SDK, and surfaces the call's arguments as the assistant text (JSON-stringified). Sidesteps the API rejection ("Function calling with a response mime type: 'application/json' is unsupported") without round-tripping a 4xx. |
|
|
878
|
+
| Anthropic / Bedrock-Anthropic | Synthetic `final_report` tool whose `input_schema` is the supplied schema; `tool_choice` is forced on the no-tools finishing turn. The tool's input is surfaced as the assistant text. |
|
|
879
|
+
| xAI Grok, others | Ignored — the model returns plain text. |
|
|
736
880
|
|
|
737
881
|
The synthetic-tool paths (Gemini 2.5 + tools, Anthropic) are entirely
|
|
738
882
|
internal: the SDK still receives `data.text: string` on the terminal
|
|
@@ -741,11 +885,11 @@ or `final_report`. They never appear in the tools array the SDK declared.
|
|
|
741
885
|
|
|
742
886
|
Validation (server-side, `400 invalid_request` on violation):
|
|
743
887
|
|
|
744
|
-
| Constraint
|
|
745
|
-
|
|
|
746
|
-
| Serialized JSON size of `outputSchema`
|
|
747
|
-
| `name` regex
|
|
748
|
-
| `schema` shape
|
|
888
|
+
| Constraint | Limit |
|
|
889
|
+
| -------------------------------------- | --------------------------------- |
|
|
890
|
+
| Serialized JSON size of `outputSchema` | ≤ 32 KB |
|
|
891
|
+
| `name` regex | `/^[a-zA-Z0-9_-]{1,64}$/` |
|
|
892
|
+
| `schema` shape | non-`null`, non-array JSON object |
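
A client can mirror these three constraints before submitting, which avoids a round trip that would end in `400 invalid_request`. The sketch below is a hypothetical pre-flight check; it assumes the size limit applies to the UTF-8 bytes of the whole serialized `outputSchema` object and that 32 KB means 32 × 1024 bytes.

```typescript
interface OutputSchema {
  name?: string;
  schema: unknown;
}

const NAME_RE = /^[a-zA-Z0-9_-]{1,64}$/;
const MAX_SCHEMA_BYTES = 32 * 1024; // assumption: binary kilobytes

// Returns a list of problems; an empty list means the spec looks acceptable.
function validateOutputSchema(spec: OutputSchema): string[] {
  const problems: string[] = [];
  if (spec.name !== undefined && !NAME_RE.test(spec.name)) {
    problems.push("name must match /^[a-zA-Z0-9_-]{1,64}$/");
  }
  const s = spec.schema;
  if (typeof s !== "object" || s === null || Array.isArray(s)) {
    problems.push("schema must be a non-null, non-array JSON object");
  }
  const size = new TextEncoder().encode(JSON.stringify(spec)).length;
  if (size > MAX_SCHEMA_BYTES) {
    problems.push(`serialized outputSchema is ${size} bytes, limit is ${MAX_SCHEMA_BYTES}`);
  }
  return problems;
}
```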
|
|
749
893
|
|
|
750
894
|
**SDK guidance.** Even though the server enforces JSON shape via the
|
|
751
895
|
provider, transient model errors (refusal text, truncation under
|
|
@@ -768,8 +912,8 @@ bytes that already streamed. Instead:
|
|
|
768
912
|
bytes (§4.7).
|
|
769
913
|
3. The run row exposes the salvage on
|
|
770
914
|
`GET /agent-runs/:runId` as `{ status: "failed", finalText: "<partial JSON>",
|
|
771
|
-
|
|
772
|
-
|
|
915
|
+
error: "Model output was truncated …", failureReason: { errorClass:
|
|
916
|
+
"truncation", finishReason: "max_tokens" } }`.
|
|
773
917
|
|
|
774
918
|
`partialText` is a **best-effort raw byte sequence** — for `outputSchema`
|
|
775
919
|
runs it will almost always fail `JSON.parse` because the JSON object was
|
|
@@ -781,7 +925,7 @@ falling back to it as the answer is not.
|
|
|
781
925
|
`outputSchema` works for both ephemeral runs (`systemPrompt`-defined) and
|
|
782
926
|
`agentId`-backed runs — the runner applies the schema to whichever
|
|
783
927
|
`AgentSpec` it built. `outputSchema` is independent of `reasoningLevel`:
|
|
784
|
-
the model can think extensively
|
|
928
|
+
the model can think extensively _and_ emit JSON.
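
Putting the guidance in this section together, a client might parse the terminal event as sketched below. The types and `parseStructured` are hypothetical. The key design choice is that `partialText` is surfaced only as salvage for logs or UI and is never parsed as the answer.

```typescript
type Terminal =
  | { type: "result"; data: { ok: boolean; text: string } }
  | { type: "error"; data: { error: string; errorClass?: string; partialText?: string } };

type Parsed<T> =
  | { ok: true; value: T }
  | { ok: false; reason: string; salvage?: string };

function parseStructured<T>(event: Terminal): Parsed<T> {
  if (event.type === "error") {
    // Do not JSON.parse partialText: it is likely to be incomplete JSON.
    return { ok: false, reason: event.data.errorClass ?? "error", salvage: event.data.partialText };
  }
  try {
    // The cast is unchecked; validate against the schema separately if needed.
    return { ok: true, value: JSON.parse(event.data.text) as T };
  } catch {
    // Still guard the success path against transient model errors.
    return { ok: false, reason: "invalid_json", salvage: event.data.text };
  }
}
```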
|
|
785
929
|
|
|
786
930
|
---
|
|
787
931
|
|
|
@@ -798,10 +942,10 @@ The pipeline tracks an order-invariant canonical signature for every
|
|
|
798
942
|
assistant turn that emits one or more tool calls. When the same signature
|
|
799
943
|
repeats consecutively the guard intervenes:
|
|
800
944
|
|
|
801
|
-
| Trigger
|
|
802
|
-
|
|
|
945
|
+
| Trigger | Server action |
|
|
946
|
+
| ------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
803
947
|
| `consecutiveThreshold` identical batches in a row | Skip the duplicate batch with a synthetic "you've made this exact call before" tool result, prepend a user-style **steering nudge** ("either deliver a final answer or change strategy") before the next model turn. |
|
|
804
|
-
| `hardCutoffThreshold` identical batches in a row | Force a tools-disabled finalise turn (same path as `budgets.maxToolTurnsExceeded: "finalize"`) so the run lands cleanly.
|
|
948
|
+
| `hardCutoffThreshold` identical batches in a row | Force a tools-disabled finalise turn (same path as `budgets.maxToolTurnsExceeded: "finalize"`) so the run lands cleanly. |
|
|
805
949
|
|
|
806
950
|
```jsonc
|
|
807
951
|
"loopDetection": {
|
|
@@ -813,11 +957,11 @@ repeats consecutively the guard intervenes:
|
|
|
813
957
|
"loopDetection": false // explicitly disable for this run
|
|
814
958
|
```
|
|
815
959
|
|
|
816
|
-
| Field | Type | Notes
|
|
817
|
-
| ---------------------- | --------------- |
|
|
818
|
-
| `consecutiveThreshold` | integer ≥ 2 | Default `3`. Single batch = single tool call, not a loop, so the floor is `2`.
|
|
960
|
+
| Field | Type | Notes |
|
|
961
|
+
| ---------------------- | --------------- | --------------------------------------------------------------------------------------------------------------------- |
|
|
962
|
+
| `consecutiveThreshold` | integer ≥ 2 | Default `3`. Single batch = single tool call, not a loop, so the floor is `2`. |
|
|
819
963
|
| `hardCutoffThreshold` | integer ≥ 3 | Default `6`. Must be **strictly greater** than `consecutiveThreshold` (otherwise the soft nudge never gets a chance). |
|
|
820
|
-
| (top-level `false`) | literal `false` | Disables the guard. `budgets.maxToolTurns` still applies.
|
|
964
|
+
| (top-level `false`) | literal `false` | Disables the guard. `budgets.maxToolTurns` still applies. |
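
The field rules in this table, together with the cap of `100` on both thresholds, can be sketched as a resolver that applies defaults and rejects invalid combinations. `resolveLoopDetection` is a hypothetical name, and filling in a default for each omitted field independently is an assumption.

```typescript
type LoopDetectionInput =
  | false
  | { consecutiveThreshold?: number; hardCutoffThreshold?: number };

interface LoopDetectionConfig {
  consecutiveThreshold: number;
  hardCutoffThreshold: number;
}

// Returns null when the guard is explicitly disabled.
function resolveLoopDetection(input?: LoopDetectionInput): LoopDetectionConfig | null {
  if (input === false) return null;
  const consecutiveThreshold = input?.consecutiveThreshold ?? 3;
  const hardCutoffThreshold = input?.hardCutoffThreshold ?? 6;
  const inRange = (v: number, min: number): boolean =>
    Number.isInteger(v) && v >= min && v <= 100;
  if (!inRange(consecutiveThreshold, 2)) {
    throw new RangeError("consecutiveThreshold must be an integer in 2..100");
  }
  if (!inRange(hardCutoffThreshold, 3)) {
    throw new RangeError("hardCutoffThreshold must be an integer in 3..100");
  }
  if (hardCutoffThreshold <= consecutiveThreshold) {
    throw new RangeError("hardCutoffThreshold must exceed consecutiveThreshold");
  }
  return { consecutiveThreshold, hardCutoffThreshold };
}
```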
|
|
821
965
|
|
|
822
966
|
Validation (server-side, `400 invalid_request` on violation): both
|
|
823
967
|
thresholds capped at `100`; `hardCutoffThreshold` must exceed
|
|
@@ -844,10 +988,10 @@ loop and either changes strategy or finalises.
|
|
|
844
988
|
}
|
|
845
989
|
```
|
|
846
990
|
|
|
847
|
-
| Field | Type
|
|
848
|
-
| ---------- |
|
|
991
|
+
| Field | Type | Notes |
|
|
992
|
+
| ---------- | -------------------- | -------------------------------------------------------------------------------------------------------------- |
|
|
849
993
|
| `<key>` | string (1–120 chars) | Logical tool name as the model sees it (`ResolvedTool.name`). The SDK + pipeline handle internal sanitisation. |
|
|
850
|
-
| `maxCalls` | integer ≥ 0
|
|
994
|
+
| `maxCalls` | integer ≥ 0 | Hard cap. `0` disables the tool entirely (the first attempt returns the synthetic body). |
|
|
851
995
|
|
|
852
996
|
Budgets are **per-tool, not pooled** — `hive_search_deals: { maxCalls: 5 }`
|
|
853
997
|
and `hive_search_meetings: { maxCalls: 5 }` give the agent five of each,
|
|
@@ -855,23 +999,23 @@ not five between them.
|
|
|
855
999
|
|
|
856
1000
|
Validation (server-side, `400 invalid_request` on violation):
|
|
857
1001
|
|
|
858
|
-
| Constraint
|
|
859
|
-
|
|
|
860
|
-
| Max entries
|
|
861
|
-
| `<key>` length
|
|
1002
|
+
| Constraint | Limit |
|
|
1003
|
+
| ---------------------- | ---------------------------------------------------------------- |
|
|
1004
|
+
| Max entries | `32` |
|
|
1005
|
+
| `<key>` length | `1..120` |
|
|
862
1006
|
| `maxCalls` upper bound | `1000` (functionally unlimited; `maxToolTurns: 100` fires first) |
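
These limits are simple enough to check before the request leaves the client. The sketch below is a hypothetical validator that mirrors the table; it assumes key length is counted in JavaScript string length units.

```typescript
type ToolBudgets = Record<string, { maxCalls: number }>;

// Returns a list of problems; an empty list means the budgets look acceptable.
function validateToolBudgets(budgets: ToolBudgets): string[] {
  const problems: string[] = [];
  const entries = Object.entries(budgets);
  if (entries.length > 32) {
    problems.push(`too many entries: ${entries.length} (max 32)`);
  }
  for (const [name, budget] of entries) {
    if (name.length < 1 || name.length > 120) {
      problems.push(`key "${name}" must be 1..120 characters`);
    }
    const { maxCalls } = budget;
    if (!Number.isInteger(maxCalls) || maxCalls < 0 || maxCalls > 1000) {
      problems.push(`maxCalls for "${name}" must be an integer in 0..1000`);
    }
  }
  return problems;
}
```

Note that `maxCalls: 0` is valid: it disables the tool rather than failing validation.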
|
|
863
1007
|
|
|
864
1008
|
**Default budgets** (applied when the field is omitted; caller-provided
|
|
865
1009
|
entries are layered on top so per-run overrides win):
|
|
866
1010
|
|
|
867
|
-
| Tool
|
|
868
|
-
|
|
|
869
|
-
| `recall` (workspace memory hybrid search)
|
|
870
|
-
| `traverse` (memory graph BFS)
|
|
871
|
-
| `hive_consult_ontology` (per-hive ontology read; same name across all three hives)
|
|
872
|
-
| `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search)
|
|
873
|
-
| `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search)
|
|
874
|
-
| `hive_search_releases` / `_issues` (Product Hive general search)
|
|
1011
|
+
| Tool | Default `maxCalls` |
|
|
1012
|
+
| ---------------------------------------------------------------------------------------- | ------------------ |
|
|
1013
|
+
| `recall` (workspace memory hybrid search) | `4` |
|
|
1014
|
+
| `traverse` (memory graph BFS) | `3` |
|
|
1015
|
+
| `hive_consult_ontology` (per-hive ontology read; same name across all three hives) | `4` |
|
|
1016
|
+
| `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search) | `5` |
|
|
1017
|
+
| `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search) | `5` |
|
|
1018
|
+
| `hive_search_releases` / `_issues` (Product Hive general search) | `5` |
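
The layering rule is easier to see in code. This sketch models the documented semantics only: an omitted field keeps the defaults, an explicit empty object clears them, and caller entries override defaults per tool. `effectiveToolBudgets` and `DEFAULT_TOOL_BUDGETS` are hypothetical names for illustration, and the `hive_search_*` keys are expanded from the abbreviated names in the table. The runtime applies this logic server-side; an SDK should still pass `toolBudgets` through unchanged rather than merging locally.

```ts
// Illustrative model of how the runtime is documented to resolve budgets.
type ToolBudgets = Record<string, { maxCalls: number }>;

// Values copied from the default-budget table; key names for the
// abbreviated rows (`_meetings`, `_companies`, ...) are expanded here.
const DEFAULT_TOOL_BUDGETS: ToolBudgets = {
  recall: { maxCalls: 4 },
  traverse: { maxCalls: 3 },
  hive_consult_ontology: { maxCalls: 4 },
  hive_search_deals: { maxCalls: 5 },
  hive_search_meetings: { maxCalls: 5 },
  hive_search_companies: { maxCalls: 5 },
  hive_search_people: { maxCalls: 5 },
  hive_search_tickets: { maxCalls: 5 },
  hive_search_conversations: { maxCalls: 5 },
  hive_search_accounts: { maxCalls: 5 },
  hive_search_releases: { maxCalls: 5 },
  hive_search_issues: { maxCalls: 5 },
};

function effectiveToolBudgets(requested?: ToolBudgets): ToolBudgets {
  // Field omitted: the defaults apply as-is.
  if (requested === undefined) return { ...DEFAULT_TOOL_BUDGETS };
  // Explicit `{}`: clean slate, no defaults at all.
  if (Object.keys(requested).length === 0) return {};
  // Otherwise caller entries are layered on top, so overrides win per tool.
  return { ...DEFAULT_TOOL_BUDGETS, ...requested };
}
```

For example, `effectiveToolBudgets({ recall: { maxCalls: 10 } })` raises the `recall` cap to 10 while `traverse` keeps its default of 3. Because budgets are per-tool rather than pooled, each key in the result is counted independently.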
|
|
875
1019
|
|
|
876
1020
|
Pass `"toolBudgets": {}` to start from a clean slate (no defaults applied
|
|
877
1021
|
on top — useful for runs that intentionally want unbounded research). When
|
|
@@ -942,23 +1086,27 @@ terminal event.
|
|
|
942
1086
|
import { fetch } from "undici";
|
|
943
1087
|
|
|
944
1088
|
// ── 1. Resolve the Agent Card locally ───────────────────────────────────
|
|
945
|
-
const cardResp = await fetch(
|
|
946
|
-
|
|
947
|
-
|
|
948
|
-
|
|
1089
|
+
const cardResp = await fetch(
|
|
1090
|
+
"https://hr.intranet.acme/.well-known/agent-card.json",
|
|
1091
|
+
{
|
|
1092
|
+
headers: { Authorization: `Bearer ${INTRANET_TOKEN}` },
|
|
1093
|
+
},
|
|
1094
|
+
);
|
|
1095
|
+
const agentCard = await cardResp.json(); // ← whole document, passed through
|
|
949
1096
|
|
|
950
1097
|
// ── 2. Submit the spec ──────────────────────────────────────────────────
|
|
951
1098
|
const create = await fetch(`${MANTYX}/api/v1/workspaces/${slug}/agent-runs`, {
|
|
952
1099
|
method: "POST",
|
|
953
|
-
headers: {
|
|
1100
|
+
headers: {
|
|
1101
|
+
"Content-Type": "application/json",
|
|
1102
|
+
Authorization: `Bearer ${apiKey}`,
|
|
1103
|
+
},
|
|
954
1104
|
body: JSON.stringify({
|
|
955
1105
|
modelId: "openai:gpt-5.5",
|
|
956
1106
|
systemPrompt: "You can delegate HR questions to the Acme HR agent.",
|
|
957
1107
|
prompt: "How many PTO days does Alice have left this year?",
|
|
958
1108
|
reasoningLevel: "low",
|
|
959
|
-
tools: [
|
|
960
|
-
{ kind: "a2a_local", name: "intranet_hr_agent", agentCard },
|
|
961
|
-
],
|
|
1109
|
+
tools: [{ kind: "a2a_local", name: "intranet_hr_agent", agentCard }],
|
|
962
1110
|
}),
|
|
963
1111
|
});
|
|
964
1112
|
const { runId, streamUrl } = await create.json();
|
|
@@ -972,14 +1120,23 @@ for await (const ev of parseSSE(stream)) {
|
|
|
972
1120
|
if (ev.type !== "local_tool_call") continue;
|
|
973
1121
|
if (ev.data.kind !== "a2a_local") continue;
|
|
974
1122
|
|
|
975
|
-
const peer = a2aClients.get(ev.data.agentCard.url);
|
|
1123
|
+
const peer = a2aClients.get(ev.data.agentCard.url); // ← dispatch by URL
|
|
976
1124
|
const reply = await peer.send({ message: ev.data.args.message });
|
|
977
1125
|
|
|
978
|
-
await fetch(
|
|
979
|
-
|
|
980
|
-
|
|
981
|
-
|
|
982
|
-
|
|
1126
|
+
await fetch(
|
|
1127
|
+
`${MANTYX}/api/v1/workspaces/${slug}/agent-runs/${runId}/tool-results`,
|
|
1128
|
+
{
|
|
1129
|
+
method: "POST",
|
|
1130
|
+
headers: {
|
|
1131
|
+
"Content-Type": "application/json",
|
|
1132
|
+
Authorization: `Bearer ${apiKey}`,
|
|
1133
|
+
},
|
|
1134
|
+
body: JSON.stringify({
|
|
1135
|
+
toolUseId: ev.data.toolUseId,
|
|
1136
|
+
result: reply.text,
|
|
1137
|
+
}),
|
|
1138
|
+
},
|
|
1139
|
+
);
|
|
983
1140
|
}
|
|
984
1141
|
```
|
|
985
1142
|
|
|
@@ -990,13 +1147,16 @@ for await (const ev of parseSSE(stream)) {
|
|
|
990
1147
|
```ts
|
|
991
1148
|
// ── 1. Connect + resolve catalog locally ────────────────────────────────
|
|
992
1149
|
const mcp = new McpClient(stdio("./mcp-server-filesystem"));
|
|
993
|
-
const initImpl = await mcp.initialize();
|
|
994
|
-
const { tools } = await mcp.listTools();
|
|
1150
|
+
const initImpl = await mcp.initialize(); // → { name, version, ... }
|
|
1151
|
+
const { tools } = await mcp.listTools(); // → MCP Tool[]
|
|
995
1152
|
|
|
996
1153
|
// ── 2. Submit the spec ──────────────────────────────────────────────────
|
|
997
1154
|
const create = await fetch(`${MANTYX}/api/v1/workspaces/${slug}/agent-runs`, {
|
|
998
1155
|
method: "POST",
|
|
999
|
-
headers: {
|
|
1156
|
+
headers: {
|
|
1157
|
+
"Content-Type": "application/json",
|
|
1158
|
+
Authorization: `Bearer ${apiKey}`,
|
|
1159
|
+
},
|
|
1000
1160
|
body: JSON.stringify({
|
|
1001
1161
|
modelId: "openai:gpt-5.5",
|
|
1002
1162
|
prompt: "Tell me what's at /etc/hosts.",
|
|
@@ -1005,7 +1165,7 @@ const create = await fetch(`${MANTYX}/api/v1/workspaces/${slug}/agent-runs`, {
|
|
|
1005
1165
|
kind: "mcp_local",
|
|
1006
1166
|
name: "fs",
|
|
1007
1167
|
serverInfo: initImpl,
|
|
1008
|
-
tools,
|
|
1168
|
+
tools, // ← verbatim from listTools()
|
|
1009
1169
|
},
|
|
1010
1170
|
],
|
|
1011
1171
|
}),
|
|
@@ -1018,7 +1178,7 @@ for await (const ev of parseSSE(streamFromUrl(streamUrl, apiKey))) {
|
|
|
1018
1178
|
if (ev.data.kind !== "mcp_local") continue;
|
|
1019
1179
|
|
|
1020
1180
|
const result = await mcp.callTool({
|
|
1021
|
-
name: ev.data.mcpToolName,
|
|
1181
|
+
name: ev.data.mcpToolName, // identical to ev.data.name
|
|
1022
1182
|
arguments: ev.data.args,
|
|
1023
1183
|
});
|
|
1024
1184
|
const text = result.content
|
|
@@ -1026,11 +1186,17 @@ for await (const ev of parseSSE(streamFromUrl(streamUrl, apiKey))) {
|
|
|
1026
1186
|
.map((b) => b.text)
|
|
1027
1187
|
.join("\n");
|
|
1028
1188
|
|
|
1029
|
-
await fetch(
|
|
1030
|
-
|
|
1031
|
-
|
|
1032
|
-
|
|
1033
|
-
|
|
1189
|
+
await fetch(
|
|
1190
|
+
`${MANTYX}/api/v1/workspaces/${slug}/agent-runs/${runId}/tool-results`,
|
|
1191
|
+
{
|
|
1192
|
+
method: "POST",
|
|
1193
|
+
headers: {
|
|
1194
|
+
"Content-Type": "application/json",
|
|
1195
|
+
Authorization: `Bearer ${apiKey}`,
|
|
1196
|
+
},
|
|
1197
|
+
body: JSON.stringify({ toolUseId: ev.data.toolUseId, result: text }),
|
|
1198
|
+
},
|
|
1199
|
+
);
|
|
1034
1200
|
}
|
|
1035
1201
|
```
|
|
1036
1202
|
|
|
@@ -1050,7 +1216,7 @@ A reference SDK should:
|
|
|
1050
1216
|
JSON shape via the provider, but transient model errors can still
|
|
1051
1217
|
produce strings that fail to parse in rare cases.
|
|
1052
1218
|
- [ ] Accept `loopDetection` and `toolBudgets` from the caller and pass
|
|
1053
|
-
them through unchanged (see §8). Both are
|
|
1219
|
+
them through unchanged (see §8). Both are _additive_ — omitting
|
|
1054
1220
|
them keeps the runtime defaults; passing `loopDetection: false` opts
|
|
1055
1221
|
out; passing `toolBudgets: {}` clears the defaults; passing entries
|
|
1056
1222
|
layers caller overrides on top of the defaults. Do **not** translate
|
|
@@ -1061,12 +1227,9 @@ A reference SDK should:
|
|
|
1061
1227
|
synthetic tool-results / steering nudges, so the SDK should keep
|
|
1062
1228
|
consuming the stream until the terminal event lands.
|
|
1063
1229
|
- [ ] Maintain three local-callback registries (or one tagged-union
|
|
1064
|
-
registry), keyed by `name`:
|
|
1065
|
-
-
|
|
1066
|
-
|
|
1067
|
-
field — typically `agentCard.url`),
|
|
1068
|
-
- local MCP servers (`kind: "mcp_local"`, indexed by the SDK-side
|
|
1069
|
-
server label that matches `local_tool_call.mcpServer`).
|
|
1230
|
+
registry), keyed by `name`:
  - generic local tools (`kind: "local"`),
  - local A2A peers (`kind: "a2a_local"`, indexed by some Agent Card
    field, typically `agentCard.url`),
  - local MCP servers (`kind: "mcp_local"`, indexed by the SDK-side
    server label that matches `local_tool_call.mcpServer`).
|
|
1070
1233
|
- [ ] For `kind: "local"`, accept developer-supplied `parameters` (Zod /
|
|
1071
1234
|
JSON Schema) and serialize to JSON Schema before submission. When the
|
|
1072
1235
|
caller declares an output schema, forward it as `outputSchema` (same
|
|
@@ -1099,4 +1262,4 @@ A reference SDK should:
|
|
|
1099
1262
|
- [A2A spec](https://google.github.io/A2A/specification/) — canonical
|
|
1100
1263
|
Agent Card schema.
|
|
1101
1264
|
- [MCP spec](https://spec.modelcontextprotocol.io/) — canonical `Tool` and
|
|
1102
|
-
`Implementation` shapes.
|
|
1265
|
+
`Implementation` shapes.
|