@mantyx/sdk 0.10.1 → 0.12.0
- package/CHANGELOG.md +14 -0
- package/dist/a2a-server.cjs +9 -0
- package/dist/a2a-server.cjs.map +1 -1
- package/dist/a2a-server.d.cts +1 -1
- package/dist/a2a-server.d.ts +1 -1
- package/dist/a2a-server.js +1 -1
- package/dist/{chunk-XMUCELMH.js → chunk-2K4BGJGJ.js} +88 -9
- package/dist/chunk-2K4BGJGJ.js.map +1 -0
- package/dist/{client-CZUVldDx.d.cts → client-LQlx7iYY.d.cts} +217 -2
- package/dist/{client-CZUVldDx.d.ts → client-LQlx7iYY.d.ts} +217 -2
- package/dist/index.cjs +88 -9
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +2 -2
- package/dist/index.d.ts +2 -2
- package/dist/index.js +2 -2
- package/dist/index.js.map +1 -1
- package/docs/agent-runs-protocol.md +450 -234
- package/docs/wire-protocol.md +525 -272
- package/package.json +1 -1
- package/dist/chunk-XMUCELMH.js.map +0 -1
- package/docs/oauth.md +0 -356
@@ -16,7 +16,7 @@ Companion documents:
 
 ## 1. Concepts
 
-**Ephemeral agent.** A run-time agent that is
+**Ephemeral agent.** A run-time agent that is _defined by the request_ rather
 than persisted as a row in MANTYX's `Agent` table. The full spec (system
 prompt, model, tools) is stored as part of each session/run for observability
 but is not editable from the dashboard.
@@ -24,15 +24,15 @@ but is not editable from the dashboard.
 **Tool refs.** Seven flavours, all carried inside the agent spec's `tools`
 array:
 
-| `kind`
-|
-| `mantyx`
-| `mantyx_plugin`
-| `local`
-| `a2a`
-| `a2a_local`
-| `mcp`
-| `mcp_local`
+| `kind` | Resolved by | Notes |
+| --------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `mantyx` | server | A workspace `Tool` row referenced by id (HTTP / Code / Plugin). |
+| `mantyx_plugin` | server | A platform plugin tool referenced by name. |
+| `local` | client | A custom tool defined and executed in the SDK's process. Carries `parameters` (input JSON Schema) plus optional `outputSchema` (return-value JSON Schema) and `longRunning` flag — see §4.1.1. |
+| `a2a` | server | A _remote_ Agent2Agent peer MANTYX can reach; invoked via `message/send` and the reply is surfaced as the tool result. |
+| `a2a_local` | client | An A2A peer MANTYX **cannot** reach. SDK resolves the [Agent Card](https://google.github.io/A2A/specification/#agent-card) locally and ships it inline; MANTYX uses it for the model description and routes calls back to the SDK over SSE. |
+| `mcp` | server | A _remote_ MCP server (Streamable HTTP). At run start MANTYX lists the catalog and exposes every tool as `<server>_<tool>` (subject to `toolFilter`). |
+| `mcp_local` | client | An MCP server MANTYX **cannot** reach. SDK runs `Initialize` + `tools/list` locally and ships the resolved `Tool[]` (with `inputSchema`); MANTYX exposes them to the model with the SDK-declared names and routes calls back over SSE. |
 
 The split is deliberate:
 
@@ -42,7 +42,7 @@ The split is deliberate:
 MCP/A2A this also means MANTYX does discovery (`listTools`, agent-card
 fetch).
 - **Client-resolved / "local"** (`local`, `a2a_local`, `mcp_local`) —
-MANTYX has
+MANTYX has _no_ access to the resource. The SDK does **all** of the
 work: connection, discovery, listing, expansion, arg validation, auth,
 execution, retries. MANTYX is a thin LLM-routing layer that emits a
 `local_tool_call` event and blocks until the SDK POSTs back to
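The tool-ref table and the server/client split in the hunks above can be summarised as a small lookup. This is a minimal TypeScript sketch: only the seven `kind` strings and their "Resolved by" values come from the documentation, while the `ToolKind`, `Resolver`, `RESOLVED_BY`, and `isClientResolved` names are hypothetical and not taken from the SDK.

```typescript
// Hypothetical names for illustration; only the `kind` strings come from the table.
type ToolKind =
  | "mantyx"
  | "mantyx_plugin"
  | "local"
  | "a2a"
  | "a2a_local"
  | "mcp"
  | "mcp_local";

type Resolver = "server" | "client";

// Mirrors the "Resolved by" column of the tool-ref table.
const RESOLVED_BY: Record<ToolKind, Resolver> = {
  mantyx: "server",
  mantyx_plugin: "server",
  local: "client",
  a2a: "server",
  a2a_local: "client",
  mcp: "server",
  mcp_local: "client",
};

// True when the SDK itself must execute the call after a `local_tool_call` event.
function isClientResolved(kind: ToolKind): boolean {
  return RESOLVED_BY[kind] === "client";
}
```

A caller could use `isClientResolved` to decide whether a handler has to be registered locally before a run starts.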
@@ -52,9 +52,9 @@ The split is deliberate:
 
 **One-shot run vs. session.** A run is an LLM execution. Runs may be:
 
--
+- _one-shot_ (`POST /agent-runs`) — fire-and-stream, no persistent state apart
 from observability.
--
+- _session-scoped_ (`POST /agent-sessions/:id/messages`) — the run inherits the
 session's full message history, and the new user/assistant turns are
 appended back to the session on success.
 
@@ -75,10 +75,10 @@ Authorization: Bearer <credential>
 X-API-Key: <credential>
 ```
 
-| Credential
-|
-| **Workspace API key**
-| **OAuth 2.0 access token
+| Credential | Token format | Identifies | Bound to | Use when |
+| -------------------------- | ------------- | ---------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
+| **Workspace API key** | `mantyx_…` | The workspace | One workspace, no end-user | Personal scripts, internal automations, anything the SDK caller owns end-to-end. |
+| **OAuth 2.0 access token** | `mantyx_at_…` | An end user **and** the workspace they consented for | One workspace, one user (or one app for `client_credentials`) | "Sign in with MANTYX" apps, third-party integrations, anywhere consent + scopes matter. |
 
 The server resolves whichever it sees by token-prefix sniffing (see
 `packages/api/src/services/bearer-credential.ts`) — SDKs do **not** need
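The token-prefix convention in the credential table can be mirrored on the client, for example to label credentials in logs. The sketch below is an assumption about how such a check might look; it is not the logic in `bearer-credential.ts`, which is not shown in this diff, and the `classifyCredential` and `authHeaders` names are hypothetical.

```typescript
type CredentialKind = "oauth_access_token" | "workspace_api_key" | "unknown";

// Check the longer prefix first: `mantyx_at_` also starts with `mantyx_`.
function classifyCredential(token: string): CredentialKind {
  if (token.startsWith("mantyx_at_")) return "oauth_access_token";
  if (token.startsWith("mantyx_")) return "workspace_api_key";
  return "unknown";
}

// Both credential types travel in the same header, so the caller does not branch here.
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}
```

The ordering of the two `startsWith` checks is the only subtle point: testing `mantyx_` first would misclassify every OAuth access token as an API key.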
@@ -115,19 +115,19 @@ two differences:
 multi-scope ones — see §2.3). The SDK is expected to surface this
 verbatim. The agent-runs surface uses these scopes:
 
-| Endpoint
-|
-| `GET .../models`
-| `POST .../agent-runs`
-| `GET .../agent-runs/{runId}`
-| `GET .../agent-runs/{runId}/stream`
-| `POST .../agent-runs/{runId}/cancel`
-| `POST .../agent-runs/{runId}/tool-results`
-| `POST .../agent-sessions`
-| `GET .../agent-sessions/{sessionId}`
-| `DELETE .../agent-sessions/{sessionId}`
-| `POST .../agent-sessions/{sessionId}/messages`
-| `GET /api/oauth/userinfo`
+| Endpoint | Required scope |
+| ------------------------------------------------ | ---------------------- |
+| `GET .../models` | `models:read` |
+| `POST .../agent-runs` | `runs:write` |
+| `GET .../agent-runs/{runId}` | `runs:read` |
+| `GET .../agent-runs/{runId}/stream` | `runs:read` |
+| `POST .../agent-runs/{runId}/cancel` | `runs:write` |
+| `POST .../agent-runs/{runId}/tool-results` | `runs:write` |
+| `POST .../agent-sessions` | `sessions:write` |
+| `GET .../agent-sessions/{sessionId}` | `sessions:read` |
+| `DELETE .../agent-sessions/{sessionId}` | `sessions:write` |
+| `POST .../agent-sessions/{sessionId}/messages` | `sessions:write` |
+| `GET /api/oauth/userinfo` | `mantyx.identity:read` |
 
 For an SDK that exposes one-shot runs and sessions end-to-end, request
 at minimum `models:read runs:read runs:write sessions:read sessions:write`,
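The endpoint-to-scope table lends itself to a pre-flight check before an SDK issues a request. The following is a sketch under stated assumptions: the scope strings are copied from the table, but the operation labels (`createRun`, `getRun`, and so on) and the `missingScopes` helper are invented here, and the granted scopes are assumed to arrive as a space-separated string.

```typescript
// Scope names are copied from the endpoint table; the keys are shorthand labels
// invented for this sketch, not real route strings.
const REQUIRED_SCOPE = {
  listModels: "models:read",
  createRun: "runs:write",
  getRun: "runs:read",
  streamRun: "runs:read",
  cancelRun: "runs:write",
  postToolResults: "runs:write",
  createSession: "sessions:write",
  getSession: "sessions:read",
  deleteSession: "sessions:write",
  postSessionMessage: "sessions:write",
  userinfo: "mantyx.identity:read",
} as const;

type Operation = keyof typeof REQUIRED_SCOPE;

// `granted` is assumed to be a space-separated scope string.
// Returns the distinct scopes the given operations need but the token lacks.
function missingScopes(granted: string, operations: Operation[]): string[] {
  const have = new Set(granted.split(/\s+/).filter(Boolean));
  const needed = new Set<string>(operations.map((op) => REQUIRED_SCOPE[op]));
  return Array.from(needed).filter((scope) => !have.has(scope));
}
```

A non-empty result suggests the request would be rejected with `insufficient_scope`, so the caller can prompt for re-consent before sending it.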
@@ -143,9 +143,10 @@ two differences:
 OAuth tokens **also** honor the per-token agent allow-list
 (`OAuthAccessToken.agentIds`) the user picked at consent time — see
 [`docs/oauth.md`](./oauth.md) for the full registration / authorization-code
-
-
-
+
+- PKCE flow. PKCE (`S256`) is mandatory and every MANTYX OAuth app is a
+confidential client, so the token endpoint requires both `client_secret`
+and `code_verifier`.
 
 **Token lifetimes.** Access tokens live **1 hour** (`expires_in: 3600`).
 Refresh tokens are **persistent and non-rotating**: they have no
@@ -176,14 +177,14 @@ Content-Type: application/json
 
 ### 2.3 Error model for credentials
 
-| Status | Body shape
-| ------ |
-| `401` | `{ "error": "Unauthorized", "message": "API key or OAuth access token required..." }`
-| `401` | `{ "error": "Invalid API key or OAuth access token" }`
-| `403` | `{ "error": "This API key is not for the Developer API", "hint": "..." }`
-| `403` | `{ "error": "Workspace API keys are not available on this plan.", "code": "api_keys_plan" }` <br> `{ "error": "OAuth applications are not available on this plan.", "code": "oauth_apps_plan" }` | Workspace tier lacks the `apiKeys` / `oauthApps` feature.
-| `403` | `{ "error": "insufficient_scope", "required": "runs:write" }` (or an array if a route needs multiple)
-| `404` | `{ "error": "Workspace path does not match this credential", "hint": "..." }`
+| Status | Body shape | When |
+| ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ |
+| `401` | `{ "error": "Unauthorized", "message": "API key or OAuth access token required..." }` | No `Authorization` / `X-API-Key` header. |
+| `401` | `{ "error": "Invalid API key or OAuth access token" }` | Token doesn't match a row, expired, or revoked. |
+| `403` | `{ "error": "This API key is not for the Developer API", "hint": "..." }` | API key has wrong `usage`. |
+| `403` | `{ "error": "Workspace API keys are not available on this plan.", "code": "api_keys_plan" }` <br> `{ "error": "OAuth applications are not available on this plan.", "code": "oauth_apps_plan" }` | Workspace tier lacks the `apiKeys` / `oauthApps` feature. |
+| `403` | `{ "error": "insufficient_scope", "required": "runs:write" }` (or an array if a route needs multiple) | OAuth token is missing a scope a route demands. The response also sets `WWW-Authenticate: Bearer error="insufficient_scope", scope="..."`. |
+| `404` | `{ "error": "Workspace path does not match this credential", "hint": "..." }` | URL slug ≠ token's workspace. |
 
 ## 3. Models
 
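The credential error table from §2.3 in the hunk above maps naturally onto a small classifier. This sketch assumes the body shapes listed in that table; the `CredentialFailure` labels and both function names are hypothetical, and the 403 and 404 fallbacks are deliberately coarse because other routes may use those statuses for unrelated reasons.

```typescript
interface CredentialErrorBody {
  error: string;
  message?: string;
  hint?: string;
  code?: string;
  required?: string | string[];
}

type CredentialFailure =
  | "missing_credential"
  | "invalid_credential"
  | "wrong_key_usage"
  | "plan_restricted"
  | "insufficient_scope"
  | "workspace_mismatch"
  | "other";

// Maps a status/body pair from the credential error table to a coarse label.
function classifyCredentialError(status: number, body: CredentialErrorBody): CredentialFailure {
  if (status === 401) {
    return body.error === "Unauthorized" ? "missing_credential" : "invalid_credential";
  }
  if (status === 403) {
    if (body.error === "insufficient_scope") return "insufficient_scope";
    if (body.code === "api_keys_plan" || body.code === "oauth_apps_plan") return "plan_restricted";
    return "wrong_key_usage"; // assumption: the remaining 403 row in the table
  }
  if (status === 404) return "workspace_mismatch"; // assumption: only meaningful on workspace-scoped routes
  return "other";
}

// `required` is a single scope, or an array when a route needs several.
function requiredScopes(body: CredentialErrorBody): string[] {
  if (body.required === undefined) return [];
  return Array.isArray(body.required) ? body.required : [body.required];
}
```

Normalising `required` to an array keeps downstream code from branching on the single-scope versus multi-scope shape.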
@@ -204,7 +205,11 @@ platform-hosted offerings visible to the workspace's tier.
 "vendorModelId": "claude-sonnet-4-5",
 "source": "platform_offering",
 "contextWindowTokens": 200000,
-"pricing": {
+"pricing": {
+"inputPer1MUsd": 3.0,
+"outputPer1MUsd": 15.0,
+"cacheReadPer1MUsd": 0.3,
+},
 },
 {
 "id": "provider:cm6def456",
@@ -213,10 +218,10 @@ platform-hosted offerings visible to the workspace's tier.
 "vendorModelId": "gpt-5.5",
 "source": "workspace_provider",
 "contextWindowTokens": 200000,
-"pricing": null
-}
+"pricing": null,
+},
 ],
-"defaultModelId": "platform:cm6abc123"
+"defaultModelId": "platform:cm6abc123",
 }
 ```
 
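The `pricing` object in the models response above can feed a rough cost estimate. This sketch assumes the field names mean US dollars per one million tokens, as their names suggest, and that `cacheReadPer1MUsd` may be absent; the `ModelEntry` interface and `estimateCostUsd` helper are hypothetical and cache reads are ignored.

```typescript
interface ModelPricing {
  inputPer1MUsd: number;
  outputPer1MUsd: number;
  cacheReadPer1MUsd?: number; // assumption: optional
}

interface ModelEntry {
  id: string;
  vendorModelId: string;
  source: string;
  contextWindowTokens: number;
  pricing: ModelPricing | null; // null for workspace-provider models in the example
}

// Returns an estimated USD cost, or null when the model carries no pricing.
function estimateCostUsd(model: ModelEntry, inputTokens: number, outputTokens: number): number | null {
  if (model.pricing === null) return null;
  const inputCost = (inputTokens / 1e6) * model.pricing.inputPer1MUsd;
  const outputCost = (outputTokens / 1e6) * model.pricing.outputPer1MUsd;
  return inputCost + outputCost;
}
```

Returning `null` rather than `0` for unpriced models keeps "free" and "unknown" distinguishable for the caller.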
@@ -240,11 +245,11 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
 
 ```jsonc
 {
-"name": "ephemeral",
-"agentId": "agent_cm6abc123",
-"systemPrompt": "You are helpful.",
-"modelId": "platform:cm6abc123",
-"reasoningLevel": "medium",
+"name": "ephemeral", // optional, observability only
+"agentId": "agent_cm6abc123", // optional — see §4.1
+"systemPrompt": "You are helpful.", // required unless agentId is set
+"modelId": "platform:cm6abc123", // optional, see §3
+"reasoningLevel": "medium", // optional, see §4.4
 "tools": [
 { "kind": "mantyx", "id": "tool_cm6..." },
 { "kind": "mantyx_plugin", "name": "web_search" },
@@ -252,20 +257,22 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
 "kind": "local",
 "name": "read_file",
 "description": "Read a file from the user's machine",
-"parameters": {
+"parameters": {
+// JSON Schema for the args object
 "type": "object",
 "properties": { "path": { "type": "string" } },
 "required": ["path"],
-"additionalProperties": false
+"additionalProperties": false,
 },
-"outputSchema": {
+"outputSchema": {
+// optional — JSON Schema for the return value
 "type": "object",
 "properties": {
-"bytes": { "type": "string", "description": "UTF-8 file contents" }
+"bytes": { "type": "string", "description": "UTF-8 file contents" },
 },
-"required": ["bytes"]
+"required": ["bytes"],
 },
-"longRunning": false
+"longRunning": false, // optional — default false
 },
 {
 "kind": "a2a",
@@ -273,12 +280,13 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
 "description": "Delegate billing questions to the Acme billing agent.",
 "agentCardUrl": "https://billing.acme.com/.well-known/agent-card.json",
 "headers": { "Authorization": "Bearer ${BILLING_TOKEN}" },
-"contextId": "ctx_abc"
+"contextId": "ctx_abc", // optional A2A context to thread turns
 },
 {
 "kind": "a2a_local",
 "name": "intranet_hr_agent",
-"agentCard": {
+"agentCard": {
+// SDK-resolved A2A Agent Card content
 "protocolVersion": "0.3.0",
 "name": "Acme HR",
 "description": "Answers questions about HR policies and benefits.",
@@ -289,72 +297,83 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
 {
 "id": "pto_lookup",
 "name": "PTO lookup",
-"description": "Find a teammate's remaining PTO days for the year."
+"description": "Find a teammate's remaining PTO days for the year.",
 },
 {
 "id": "benefits_qa",
 "name": "Benefits Q&A",
-"description": "Answer questions about insurance, 401k, and parental leave."
-}
-]
-}
+"description": "Answer questions about insurance, 401k, and parental leave.",
+},
+],
+},
 },
 {
 "kind": "mcp",
-"name": "github",
+"name": "github", // → tools become github_<tool>
 "url": "https://mcp.github.com/v1",
 "headers": { "Authorization": "Bearer ${GH_PAT}" },
-"toolFilter": ["search_repos", "read_file"]
+"toolFilter": ["search_repos", "read_file"], // optional allowlist
 },
 {
 "kind": "mcp_local",
-"name": "fs",
-"serverInfo": {
+"name": "fs", // SDK-side server label only — NOT a prefix
+"serverInfo": {
+// optional; from MCP Initialize
 "name": "mcp-server-filesystem",
-"version": "0.4.1"
+"version": "0.4.1",
 },
-"tools": [
+"tools": [
+// verbatim MCP tools/list response
 {
-"name": "fs_read_file",
+"name": "fs_read_file", // model-facing name, exactly as declared
 "description": "Read a file from the user's workstation",
-"inputSchema": {
+"inputSchema": {
+// MCP's term — JSON Schema
 "type": "object",
 "properties": { "path": { "type": "string" } },
-"required": ["path"]
-}
-}
-]
-}
+"required": ["path"],
+},
+},
+],
+},
 ],
-"budgets": { "maxToolTurns": 32 },
-"outputSchema": {
+"budgets": { "maxToolTurns": 32 }, // optional safety cap
+"outputSchema": {
+// optional, see §4.5
 "name": "weather_report",
 "schema": {
 "type": "object",
 "properties": {
 "city": { "type": "string" },
-"temperature_c": { "type": "number" }
+"temperature_c": { "type": "number" },
 },
-"required": ["city", "temperature_c"]
-}
+"required": ["city", "temperature_c"],
+},
 },
-"loopDetection": {
+"loopDetection": {
+// optional, see §4.6
 "consecutiveThreshold": 3,
-"hardCutoffThreshold": 6
+"hardCutoffThreshold": 6,
 },
-"toolBudgets": {
-
+"toolBudgets": {
+// optional, see §4.7
+"recall": { "maxCalls": 4 },
 "hive_consult_ontology": { "maxCalls": 4 },
-"scary_tool":
+"scary_tool": { "maxCalls": 0 },
+},
+"supervisor": {
+// optional, see §4.8 — platform LLM judge; pass false to disable
+"interval": 5,
 },
-"metadata": {
+"metadata": {
+// optional, see §4.9
 "customer": "acme",
-"env": "prod"
-}
+"env": "prod",
+},
 }
 ```
 
-`POST /agent-runs` additionally accepts `prompt`
+`POST /agent-runs` additionally accepts `prompt` _or_ `messages` (an array of
 `{role, content}`). Sending both is a `400 invalid_request`.
 
 ### 4.1 Triggering a persisted MANTYX agent (`agentId`)
@@ -366,7 +385,7 @@ defining an ephemeral one inline. When `agentId` is set:
 stored system prompt at run time.
 - `modelId` becomes optional. If omitted, the server uses the agent's
 configured LLM provider (or the workspace automation provider if the agent
-has
+has _Use workspace default model_ turned on).
 - The agent's own tools are loaded from its workspace configuration —
 including memory, skills, and plugin tools — and your `tools` array is
 **merged on top**. This is typically used to attach `local` tools so the
@@ -389,14 +408,14 @@ the handler in its own process. MANTYX never executes the body — it
|
|
|
389
408
|
emits a `local_tool_call` event when the model picks the tool and waits
|
|
390
409
|
for the SDK to POST a tool-result.
|
|
391
410
|
|
|
392
|
-
| Field | Required | Notes
|
|
393
|
-
| -------------- | -------- |
|
|
394
|
-
| `kind` | yes | Discriminator literal `"local"`.
|
|
395
|
-
| `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`.
|
|
396
|
-
| `description` | no | Free-form. Empty when omitted (acceptable, but reduces tool-selection accuracy).
|
|
397
|
-
| `parameters` | no | JSON Schema for the tool's input. Must be a `type: "object"` schema with `properties`; non-object roots are coerced to an empty object schema server-side. Forwarded **verbatim** to the LLM provider so nested constraints (`array.items`, `enum`, `anyOf`, numeric formats, …) survive. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call.
|
|
398
|
-
| `outputSchema` | no | JSON Schema for the structured value the tool returns. When present, forwarded to providers that accept per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. Helps the model emit follow-up arguments that round-trip cleanly. Must be an object schema; non-object roots are dropped server-side.
|
|
399
|
-
| `longRunning` | no | When `true`, MANTYX appends a stable hint to the model-facing description so every provider treats the tool as long-running:<br
|
|
411
|
+
| Field | Required | Notes |
|
|
412
|
+
| -------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
413
|
+
| `kind` | yes | Discriminator literal `"local"`. |
|
|
414
|
+
| `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
415
|
+
| `description` | no | Free-form. Empty when omitted (acceptable, but reduces tool-selection accuracy). |
|
|
416
|
+
| `parameters` | no | JSON Schema for the tool's input. Must be a `type: "object"` schema with `properties`; non-object roots are coerced to an empty object schema server-side. Forwarded **verbatim** to the LLM provider so nested constraints (`array.items`, `enum`, `anyOf`, numeric formats, …) survive. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call. |
|
|
417
|
+
| `outputSchema` | no | JSON Schema for the structured value the tool returns. When present, forwarded to providers that accept per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. Helps the model emit follow-up arguments that round-trip cleanly. Must be an object schema; non-object roots are dropped server-side. |
|
|
418
|
+
| `longRunning` | no | When `true`, MANTYX appends a stable hint to the model-facing description so every provider treats the tool as long-running:<br>_"NOTE: This is a long-running operation. Do not call this tool again if it has already returned an intermediate or pending status."_<br>Useful for tools that return `pending` and rely on SDK-side polling — without the hint the model routinely fires repeat calls and burns turns. Pure declarative — MANTYX does not change scheduling. |
|
|
400
419
|
|
|
401
420
|
The `outputSchema` and `longRunning` fields are **additive** since wire
|
|
402
421
|
protocol v1: SDKs that don't ship them keep working unchanged. Providers
|
|
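Because the server coerces or drops schemas that break the rules in the `kind: "local"` field table, an SDK may want to flag such refs before sending them. The sketch below is one possible client-side pre-check, assuming only the constraints stated in that table; the `LocalToolRef` interface mirrors the documented fields, while `validateLocalTool` is a hypothetical helper and not an SDK function.

```typescript
interface LocalToolRef {
  kind: "local";
  name: string;
  description?: string;
  parameters?: Record<string, unknown>;
  outputSchema?: Record<string, unknown>;
  longRunning?: boolean;
}

// Name pattern quoted in the field table.
const TOOL_NAME_RE = /^[a-zA-Z0-9_]{1,64}$/;

// Returns human-readable problems; an empty array means the ref passes these checks.
function validateLocalTool(ref: LocalToolRef): string[] {
  const problems: string[] = [];
  if (!TOOL_NAME_RE.test(ref.name)) {
    problems.push(`name "${ref.name}" must match ${String(TOOL_NAME_RE)}`);
  }
  for (const key of ["parameters", "outputSchema"] as const) {
    const schema = ref[key];
    // Non-object roots are coerced (parameters) or dropped (outputSchema) server-side.
    if (schema !== undefined && schema["type"] !== "object") {
      problems.push(`${key} should be a JSON Schema with type "object"`);
    }
  }
  return problems;
}
```

Surfacing these as warnings at registration time avoids the silent coercion the table describes, where a non-object `parameters` root becomes an empty object schema.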
@@ -410,10 +429,10 @@ A2A delegation lets the agent hand a task to another
 [Agent2Agent](https://google.github.io/A2A/) peer. The wire protocol exposes
 two kinds depending on **who can reach the peer**:
 
-- `kind: "a2a"` —
+- `kind: "a2a"` — _remote_ (server-resolved). MANTYX dials `agentCardUrl`
 directly. Pick this when the peer is on the public internet or in the
 same VPC as MANTYX.
-- `kind: "a2a_local"` —
+- `kind: "a2a_local"` — _local_ (client-resolved). The SDK invokes the peer
 on its side and posts back the reply. Pick this when the peer lives on an
 intranet, behind a VPN, or on the user's device — anywhere MANTYX can't
 reach but the SDK can.
@@ -431,14 +450,14 @@ POSTs the model's `message` argument to `agentCardUrl` over A2A's standard
 and `/message/send` endpoints are probed in order) and forwards the remote
 agent's text reply back as the tool result.
 
-| Field
-|
-| `kind`
-| `name`
-| `description`
-| `agentCardUrl`
-| `headers`
-| `contextId`
+| Field | Required | Notes |
+| -------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `kind` | yes | Discriminator literal `"a2a"`. |
+| `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
+| `description` | no | Model-facing description. Defaults to `"Delegate a task to the <name> agent over A2A. Pass the full task as a single message."`. Mention the remote agent's purpose so the model picks it for the right turn. |
+| `agentCardUrl` | yes | URL of the remote Agent Card (`/.well-known/agent-card.json`) or the JSON-RPC root the peer accepts. |
+| `headers` | no | Flat string→string HTTP headers sent on every A2A request — typically `Authorization`. Each value is capped at 8 KB. |
+| `contextId` | no | A2A `contextId` to thread multiple delegations into the same remote conversation. Omit for fresh per-call context. |
 
 > **Secret handling.** `headers` are forwarded **as-is** by the SDK API. If
 > you need long-lived credentials (refresh tokens, rotating API keys),
@@ -476,30 +495,30 @@ Per-run lifecycle:
 5. **Continuation (MANTYX).** MANTYX feeds the reply back into the model
 loop as the tool result.
 
-| Field
-|
-| `kind`
-| `name`
-| `description`
-| `agentCard`
+| Field | Required | Notes |
+| ------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `kind` | yes | Discriminator literal `"a2a_local"`. |
+| `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
+| `description` | no | Model-facing description override. When omitted, MANTYX synthesizes one from `agentCard.name`, `agentCard.description`, and the first 12 skills. |
+| `agentCard` | yes | The resolved A2A Agent Card (JSON content). Schema follows the [A2A Agent Card spec](https://google.github.io/A2A/specification/#agent-card) — passthrough for unknown fields, so any spec-compliant card works. See the _Agent Card shape_ table below for the fields MANTYX actually reads. |
 
 **Agent Card shape** (only the fields MANTYX inspects; everything else is
 forwarded verbatim back to the SDK):
 
-| Card field
-|
-| `protocolVersion`
-| `name`
-| `description`
-| `url`
-| `version`
-| `provider`
-| `capabilities`
-| `defaultInputModes`
-| `defaultOutputModes`
-| `skills[]`
-| `securitySchemes`, `security` | echo only
-|
+| Card field | Used by MANTYX | Notes |
+| ----------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
+| `protocolVersion` | echo only | A2A protocol version (e.g. `"0.3.0"`). |
+| `name` | description | Used when synthesizing the tool description (`"Delegate a task to the <name> agent ..."`). |
+| `description` | description | One-paragraph summary of what the peer does — surfaced to the model. |
+| `url` | echo only | Peer's A2A endpoint. Forwarded back to the SDK in the `local_tool_call` event so the SDK can dispatch by URL. Never fetched server-side. |
+| `version` | echo only | Peer agent version. |
+| `provider` | echo only | Vendor info. |
+| `capabilities` | echo only | A2A capability flags (streaming, push notifications, …). |
+| `defaultInputModes` | echo only | Modalities the peer accepts. |
+| `defaultOutputModes` | echo only | Modalities the peer returns. |
+| `skills[]` | description | First 12 skills (`name`, `description`) are bulleted into the tool description so the model knows what to ask for. |
+| `securitySchemes`, `security` | echo only | Forwarded to the SDK; MANTYX does no auth. |
+| _anything else_ | echo only | Passthrough — survives round-trip unchanged. |
 
 Local A2A respects the same `localToolTimeoutMs` budget (default 5 minutes)
 as `kind: "local"`. Tool-result POSTs after timeout return `409 run_terminal`.
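The Agent Card table says the tool description is synthesized from the card's `name`, `description`, and first 12 skills. The exact server-side wording and layout are not shown in this diff, so the sketch below only approximates the idea; `synthesizeDescription` and the line format are assumptions, and only the 12-skill cap and the fields consulted come from the table.

```typescript
interface AgentSkill {
  id?: string;
  name: string;
  description?: string;
}

interface AgentCard {
  name: string;
  description?: string;
  skills?: AgentSkill[];
  [key: string]: unknown; // passthrough for fields MANTYX only echoes
}

const MAX_SKILLS = 12; // cap stated in the Agent Card shape table

// Approximates the synthesized description; the real server-side format may differ.
function synthesizeDescription(card: AgentCard): string {
  const lines: string[] = [`Delegate a task to the ${card.name} agent.`];
  if (card.description) lines.push(card.description);
  for (const skill of (card.skills ?? []).slice(0, MAX_SKILLS)) {
    lines.push(skill.description ? `- ${skill.name}: ${skill.description}` : `- ${skill.name}`);
  }
  return lines.join("\n");
}
```

Since only the first 12 skills reach the model, an SDK resolving a large card locally may want to order `skills` so the most relevant ones come first.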
@@ -510,25 +529,25 @@ as `kind: "local"`. Tool-result POSTs after timeout return `409 run_terminal`.
 expose every tool published by an MCP server to the agent loop in one go.
 Like A2A, the protocol distinguishes by **where the server lives**:
 
-- `kind: "mcp"` —
+- `kind: "mcp"` — _remote_ MCP (Streamable HTTP). MANTYX has network access
 to the server, dials it, lists the catalog at run start, and proxies each
 call server-side. **MANTYX prefixes every discovered tool name with the
 ref's `name`** (e.g. `github_search_repos`) so multiple MCP servers
 can coexist without colliding.
-- `kind: "mcp_local"` —
+- `kind: "mcp_local"` — _local_ MCP (stdio, on-device, intranet). MANTYX
 has **no** access to the server; the SDK does discovery, validation, and
 execution. The SDK declares the tool catalog with **the exact names it
 wants the model to see** — MANTYX does not auto-prefix.
 
 #### `kind: "mcp"` — remote MCP
 
-| Field
-|
-| `kind`
-| `name`
-| `url`
-| `headers`
-| `toolFilter`
+| Field | Required | Notes |
+| ------------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `kind` | yes | Discriminator literal `"mcp"`. |
+| `name` | yes | Server label — MANTYX prefixes every discovered tool name as `<name>_<tool>`. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
+| `url` | yes | Streamable HTTP MCP endpoint. |
+| `headers` | no | Flat string→string HTTP headers (e.g. `Authorization`). Each value capped at 8 KB. |
+| `toolFilter` | no | Allowlist of MCP tool names (un-prefixed, as the server returns them). When set, tools not in the list are silently dropped. When omitted, every published tool is exposed. |
 
 If the MCP server is unreachable when the run starts, MANTYX still exposes
 a single stub tool named `<server>_unavailable` so the model can report the
@@ -566,16 +585,17 @@ Per-run lifecycle:
|
|
|
566
585
|
"type": "local_tool_call",
|
|
567
586
|
"data": {
|
|
568
587
|
"toolUseId": "tu_x",
|
|
569
|
-
"name": "fs_read_file",
|
|
588
|
+
"name": "fs_read_file", // SDK-declared name; same string the model called
|
|
570
589
|
"args": { "path": "/etc/hosts" },
|
|
571
590
|
"kind": "mcp_local",
|
|
572
|
-
"mcpServer": "fs",
|
|
591
|
+
"mcpServer": "fs", // the SDK-side label from the ref's `name`
|
|
573
592
|
"mcpToolName": "fs_read_file", // duplicates `name` for the SDK's convenience
|
|
574
|
-
"mcpServerInfo": {
|
|
593
|
+
"mcpServerInfo": {
|
|
594
|
+
// present iff the ref carried `serverInfo`
|
|
575
595
|
"name": "mcp-server-filesystem",
|
|
576
|
-
"version": "0.4.1"
|
|
577
|
-
}
|
|
578
|
-
}
|
|
596
|
+
"version": "0.4.1",
|
|
597
|
+
},
|
|
598
|
+
},
|
|
579
599
|
}
|
|
580
600
|
```
|
|
581
601
|
|
|
@@ -587,12 +607,12 @@ Per-run lifecycle:
|
|
|
587
607
|
updated `mcp_local` ref inside `POST /agent-sessions/:id/messages`'s
|
|
588
608
|
`tools` field; the catalog snapshot lives on the run, not the session.
|
|
589
609
|
|
|
590
|
-
| Field
|
|
591
|
-
|
|
|
592
|
-
| `kind`
|
|
593
|
-
| `name`
|
|
594
|
-
| `serverInfo`
|
|
595
|
-
| `tools`
|
|
610
|
+
| Field | Required | Notes |
|
|
611
|
+
| ------------ | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
612
|
+
| `kind` | yes | Discriminator literal `"mcp_local"`. |
|
|
613
|
+
| `name` | yes | SDK-side server label (e.g. `"fs"`, `"jira"`). Echoed back unchanged as `mcpServer` on every `local_tool_call`. **Not used to prefix tool names.** Match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
614
|
+
| `serverInfo` | no | The MCP `Implementation` block the SDK got from `Initialize` (`{ name, version? }`, plus any extra fields the server returned). Forwarded to the SDK in `local_tool_call.mcpServerInfo` for observability; not used to drive behavior. |
|
|
615
|
+
| `tools` | yes | Verbatim MCP `tools/list` output (1–64 entries). Each item is the standard MCP `Tool` shape: `{ name, description?, inputSchema?, annotations?, … }`. `name` is the model-facing tool name (SDK owns naming). `inputSchema` is the MCP-spec JSON Schema for the tool's arguments — used to constrain the LLM's tool call. Empty `inputSchema` means a no-arg tool. |
|
|
596
616
|
|
|
597
617
|
Older SDKs that ignore the `kind` discriminator still see a normal
|
|
598
618
|
`local_tool_call` and can match on `name` alone.
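The backward-compatibility note can be sketched as a dispatcher that keys on `name` only and treats the MCP-specific fields as optional extras. `LocalToolCall`, `Handler`, and `dispatch` are hypothetical names; only the event field names are taken from the example payload earlier in this section.

```typescript
// Subset of the `local_tool_call` payload; MCP-specific fields are optional
// so the same type also describes calls for plain `local` tools.
interface LocalToolCall {
  toolUseId: string;
  name: string;
  args: Record<string, unknown>;
  kind?: string;
  mcpServer?: string;
}

type Handler = (args: Record<string, unknown>) => string;

// Illustrative dispatcher: looks the handler up by `name` alone, which is
// all an SDK that ignores `kind` needs to do.
function dispatch(call: LocalToolCall, handlers: Map<string, Handler>): string {
  const handler = handlers.get(call.name);
  if (!handler) {
    throw new Error(`no handler bound for tool ${call.name}`);
  }
  return handler(call.args);
}
```

A newer SDK could additionally branch on `call.kind === "mcp_local"` and route by `call.mcpServer`, but nothing in this sketch requires it.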
|
|
@@ -612,10 +632,10 @@ provider:
|
|
|
612
632
|
|
|
613
633
|
Two equivalent input shapes are accepted:
|
|
614
634
|
|
|
615
|
-
| Form
|
|
616
|
-
|
|
|
617
|
-
| **String**
|
|
618
|
-
| **Number**
|
|
635
|
+
| Form | Values | Notes |
|
|
636
|
+
| ---------- | -------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
|
|
637
|
+
| **String** | `"off"`, `"low"`, `"medium"`, `"high"` | Snaps to the same anchors the web composer uses (Fast=30, Moderate=50, Smart=80; off=0). |
|
|
638
|
+
| **Number** | integer `0`–`100` | Pass-through to `RunAgentOptions.reasoningLevel`. `0` explicitly disables provider thinking even on reasoning models. |
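As a rough illustration of the two input forms, the sketch below normalises either one to an integer level. It assumes that `"low"`, `"medium"`, and `"high"` correspond to the Fast, Moderate, and Smart anchors respectively; the table implies that pairing but does not state it outright. `toReasoningLevel` is a hypothetical helper name.

```typescript
type ReasoningLabel = "off" | "low" | "medium" | "high";
type ReasoningInput = ReasoningLabel | number;

// Assumed label-to-anchor pairing (off=0, Fast=30, Moderate=50, Smart=80).
const ANCHORS: Record<ReasoningLabel, number> = {
  off: 0,
  low: 30,
  medium: 50,
  high: 80,
};

// Illustrative normaliser: strings snap to anchors, numbers pass through
// after an integer 0..100 range check.
function toReasoningLevel(input: ReasoningInput): number {
  if (typeof input === "string") {
    return ANCHORS[input];
  }
  if (!Number.isInteger(input) || input < 0 || input > 100) {
    throw new RangeError(`reasoning level must be an integer 0..100, got ${input}`);
  }
  return input;
}
```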
|
|
619
639
|
|
|
620
640
|
When omitted, MANTYX falls back to the agent's default — for ephemeral
|
|
621
641
|
specs, that means thinking is off; for `agentId`-backed specs, it follows
|
|
@@ -649,29 +669,29 @@ reply directly into downstream code without LLM-flavoured prose to parse out.
|
|
|
649
669
|
}
|
|
650
670
|
```
|
|
651
671
|
|
|
652
|
-
| Field | Required | Notes
|
|
653
|
-
| -------- | -------- |
|
|
654
|
-
| `name` | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. Must match `/^[a-zA-Z0-9_-]{1,64}$/`.
|
|
672
|
+
| Field | Required | Notes |
|
|
673
|
+
| -------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
674
|
+
| `name` | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. Must match `/^[a-zA-Z0-9_-]{1,64}$/`. |
|
|
655
675
|
| `schema` | yes | JSON Schema describing the final assistant text. Root must be a JSON **object** (most providers reject array / scalar roots in structured-output mode). The schema is passed through verbatim — MANTYX does not validate its contents; the provider does. |
|
|
656
676
|
|
|
657
677
|
Validation (server-side, `400 invalid_request` on violation):
|
|
658
678
|
|
|
659
|
-
| Constraint
|
|
660
|
-
|
|
|
661
|
-
| Serialized JSON size of `outputSchema` | ≤ 32 KB
|
|
662
|
-
| `name` regex
|
|
663
|
-
| `schema` shape
|
|
679
|
+
| Constraint | Limit |
|
|
680
|
+
| -------------------------------------- | --------------------------------- |
|
|
681
|
+
| Serialized JSON size of `outputSchema` | ≤ 32 KB |
|
|
682
|
+
| `name` regex | `/^[a-zA-Z0-9_-]{1,64}$/` |
|
|
683
|
+
| `schema` shape | non-`null`, non-array JSON object |
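A client can pre-check these constraints before sending the request. The following is a sketch under two assumptions: that "32 KB" means 32 × 1024 bytes of UTF-8, and that the size is measured over the whole serialized `outputSchema` object. `validateOutputSchema` and `utf8Length` are hypothetical helpers, not SDK functions.

```typescript
interface OutputSchema {
  name?: string;
  schema: unknown;
}

const NAME_RE = /^[a-zA-Z0-9_-]{1,64}$/;
const MAX_BYTES = 32 * 1024; // assumption: KB = 1024 bytes

// Counts UTF-8 bytes without relying on platform encoders.
function utf8Length(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff) { bytes += 4; i++; } // surrogate pair
    else bytes += 3;
  }
  return bytes;
}

// Returns the list of violated constraints; an empty list means the value
// would pass the three checks in the table above.
function validateOutputSchema(o: OutputSchema): string[] {
  const problems: string[] = [];
  const name = o.name ?? "output"; // documented default
  if (!NAME_RE.test(name)) problems.push("name");
  if (typeof o.schema !== "object" || o.schema === null || Array.isArray(o.schema)) {
    problems.push("schema");
  }
  if (utf8Length(JSON.stringify(o)) > MAX_BYTES) problems.push("size");
  return problems;
}
```

The server remains the source of truth; a local check like this only avoids an obvious `400 invalid_request` round trip.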
|
|
664
684
|
|
|
665
685
|
**Per-provider behaviour** (mirrors the SDK's `RunAgentOptions.finalResponseSchema`):
|
|
666
686
|
|
|
667
|
-
| Provider
|
|
668
|
-
|
|
|
669
|
-
| OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every turn (works alongside tool calls).
|
|
670
|
-
| Gemini 3+ (any turn)
|
|
671
|
-
| Gemini ≤ 2.5 (no-tools turn)
|
|
672
|
-
| Gemini ≤ 2.5 (with tools)
|
|
673
|
-
| Anthropic / Bedrock-Anthropic
|
|
674
|
-
| xAI Grok, others
|
|
687
|
+
| Provider | How the schema is enforced |
|
|
688
|
+
| --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
689
|
+
| OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every turn (works alongside tool calls). |
|
|
690
|
+
| Gemini 3+ (any turn) | `responseMimeType: "application/json"` + `responseJsonSchema` on every `completeTurn`. Gemini 3 accepts the schema alongside `functionDeclarations`. |
|
|
691
|
+
| Gemini ≤ 2.5 (no-tools turn) | `responseMimeType: "application/json"` + `responseJsonSchema`. |
|
|
692
|
+
| Gemini ≤ 2.5 (with tools) | Synthetic `set_model_response` function declaration is injected; its `parametersJsonSchema` is the supplied schema. The system instruction is augmented to direct the model to call this tool with the final answer. The engine intercepts the call, hides it from the SDK, and surfaces the call's arguments as the assistant text (JSON-stringified). Sidesteps the API rejection ("Function calling with a response mime type: 'application/json' is unsupported") without round-tripping a 4xx. |
|
|
693
|
+
| Anthropic / Bedrock-Anthropic | Synthetic `final_report` tool whose `input_schema` is the supplied schema; `tool_choice` is forced on the no-tools finishing turn. The tool's input is surfaced as the assistant text. |
|
|
694
|
+
| xAI Grok, others | Ignored (the model returns plain text). |
|
|
675
695
|
|
|
676
696
|
The synthetic-tool paths (Gemini 2.5 + tools, Anthropic) are entirely
|
|
677
697
|
internal: the SDK never receives a `local_tool_call` for
|
|
@@ -727,17 +747,17 @@ The wire shape also accepts the literal `false`:
|
|
|
727
747
|
"loopDetection": false // explicitly disable the guard for this run
|
|
728
748
|
```
|
|
729
749
|
|
|
730
|
-
| Field | Type | Required | Notes
|
|
731
|
-
| ---------------------- | --------------- | -------- |
|
|
732
|
-
| `consecutiveThreshold` | integer ≥ 2 | no | Defaults to **3** when the field is omitted. Must be `>= 2` (one identical batch is just a single tool call, not a loop).
|
|
750
|
+
| Field | Type | Required | Notes |
|
|
751
|
+
| ---------------------- | --------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
|
|
752
|
+
| `consecutiveThreshold` | integer ≥ 2 | no | Defaults to **3** when the field is omitted. Must be `>= 2` (one identical batch is just a single tool call, not a loop). |
|
|
733
753
|
| `hardCutoffThreshold` | integer ≥ 3 | no | Defaults to **6** when the field is omitted. Must be `> consecutiveThreshold`; otherwise the soft nudge would never get a chance to land. |
|
|
734
|
-
| (top-level `false`) | literal `false` | no | Disables the guard entirely for this run. The pipeline still enforces `budgets.maxToolTurns`.
|
|
754
|
+
| (top-level `false`) | literal `false` | no | Disables the guard entirely for this run. The pipeline still enforces `budgets.maxToolTurns`. |
|
|
735
755
|
|
|
736
756
|
Validation (server-side, `400 invalid_request` on violation):
|
|
737
757
|
|
|
738
|
-
| Constraint
|
|
739
|
-
|
|
|
740
|
-
| `consecutiveThreshold` / `hardCutoffThreshold` upper bound
|
|
758
|
+
| Constraint | Limit |
|
|
759
|
+
| ------------------------------------------------------------------ | -------- |
|
|
760
|
+
| `consecutiveThreshold` / `hardCutoffThreshold` upper bound | `100` |
|
|
741
761
|
| `hardCutoffThreshold` strictly greater than `consecutiveThreshold` | enforced |
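The field defaults and constraints above can be combined into one resolution step. This is a sketch only: `resolveLoopDetection` is a hypothetical name, and the choice to apply defaults before checking the ordering constraint is an assumption about server behaviour rather than something the tables specify.

```typescript
type LoopDetectionInput =
  | false
  | { consecutiveThreshold?: number; hardCutoffThreshold?: number }
  | undefined;

interface LoopDetection {
  consecutiveThreshold: number;
  hardCutoffThreshold: number;
}

// Returns null when the guard is disabled, otherwise the effective thresholds.
function resolveLoopDetection(input: LoopDetectionInput): LoopDetection | null {
  if (input === false) return null;
  const consecutiveThreshold = input?.consecutiveThreshold ?? 3; // documented default
  const hardCutoffThreshold = input?.hardCutoffThreshold ?? 6; // documented default
  const valid =
    Number.isInteger(consecutiveThreshold) &&
    Number.isInteger(hardCutoffThreshold) &&
    consecutiveThreshold >= 2 &&
    hardCutoffThreshold >= 3 &&
    hardCutoffThreshold > consecutiveThreshold &&
    consecutiveThreshold <= 100 &&
    hardCutoffThreshold <= 100;
  if (!valid) {
    throw new RangeError("invalid_request: loopDetection thresholds out of range");
  }
  return { consecutiveThreshold, hardCutoffThreshold };
}
```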
|
|
742
762
|
|
|
743
763
|
**Defaults.** When `loopDetection` is omitted entirely, MANTYX applies the
|
|
@@ -776,31 +796,31 @@ tool result.
|
|
|
776
796
|
}
|
|
777
797
|
```
|
|
778
798
|
|
|
779
|
-
| Field | Type | Required | Notes
|
|
780
|
-
| ---------- | ----------- | -------- |
|
|
781
|
-
| `<key>` | string | yes | Logical tool name as the model sees it (the same name on `ResolvedTool.name`; the SDK + pipeline handle sanitisation). 1–120 characters.
|
|
799
|
+
| Field | Type | Required | Notes |
|
|
800
|
+
| ---------- | ----------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
801
|
+
| `<key>` | string | yes | Logical tool name as the model sees it (the same name on `ResolvedTool.name`; the SDK + pipeline handle sanitisation). 1–120 characters. |
|
|
782
802
|
| `maxCalls` | integer ≥ 0 | yes | Hard cap on executed calls per run. `0` disables the tool entirely (every attempt returns the synthetic body on the first try). Budgets are **per-tool, not pooled**: `hive_search_deals: { maxCalls: 5 }` and `hive_search_meetings: { maxCalls: 5 }` give the agent five of each, not five between them. |
|
|
783
803
|
|
|
784
804
|
Validation (server-side, `400 invalid_request` on violation):
|
|
785
805
|
|
|
786
|
-
| Constraint
|
|
787
|
-
|
|
|
788
|
-
| Max entries
|
|
789
|
-
| `<key>` length
|
|
806
|
+
| Constraint | Limit |
|
|
807
|
+
| ---------------------- | -------------------------------------------------------------------------- |
|
|
808
|
+
| Max entries | `32` |
|
|
809
|
+
| `<key>` length | `1..120` chars |
|
|
790
810
|
| `maxCalls` upper bound | `1000` (functionally unlimited; the SDK's `maxToolTurns: 100` fires first) |
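The per-tool (not pooled) semantics are easiest to see in code. The sketch below validates a `toolBudgets` map against the limits above and tracks calls per tool; `validateToolBudgets` and `makeBudgetTracker` are hypothetical helpers, and the real enforcement happens server-side.

```typescript
type ToolBudgets = Record<string, { maxCalls: number }>;

// Mirrors the documented limits: at most 32 entries, keys of 1..120 chars,
// integer maxCalls in 0..1000.
function validateToolBudgets(budgets: ToolBudgets): void {
  const keys = Object.keys(budgets);
  if (keys.length > 32) throw new RangeError("more than 32 toolBudgets entries");
  for (const key of keys) {
    const budget = budgets[key];
    if (key.length < 1 || key.length > 120) {
      throw new RangeError(`bad tool name length: ${key}`);
    }
    const ok =
      budget !== undefined &&
      Number.isInteger(budget.maxCalls) &&
      budget.maxCalls >= 0 &&
      budget.maxCalls <= 1000;
    if (!ok) throw new RangeError(`bad maxCalls for ${key}`);
  }
}

// Returns a function that reports whether the next call to `tool` is still
// within budget. Each tool has its own counter; tools without an entry are
// treated as uncapped in this sketch.
function makeBudgetTracker(budgets: ToolBudgets): (tool: string) => boolean {
  const used = new Map<string, number>();
  return (tool: string): boolean => {
    const budget = budgets[tool];
    if (!budget) return true;
    const callIndex = (used.get(tool) ?? 0) + 1;
    used.set(tool, callIndex);
    return callIndex <= budget.maxCalls;
  };
}
```

With `recall: { maxCalls: 4 }`, the fifth `recall` attempt is the first one over budget, while other tools' counters are unaffected.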
|
|
791
811
|
|
|
792
812
|
**Defaults.** When `toolBudgets` is omitted, MANTYX layers the runtime
|
|
793
813
|
defaults from `runtime/default-run-guards.ts` on top of the spec. The
|
|
794
814
|
default research-tool surface is:
|
|
795
815
|
|
|
796
|
-
| Tool
|
|
797
|
-
|
|
|
798
|
-
| `recall` (workspace memory hybrid search)
|
|
799
|
-
| `traverse` (memory graph BFS)
|
|
800
|
-
| `hive_consult_ontology` (per-hive ontology read; same name across all three hives)
|
|
801
|
-
| `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search)
|
|
802
|
-
| `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search)
|
|
803
|
-
| `hive_search_releases` / `_issues` (Product Hive general search)
|
|
816
|
+
| Tool | Default `maxCalls` |
|
|
817
|
+
| ---------------------------------------------------------------------------------------- | ------------------ |
|
|
818
|
+
| `recall` (workspace memory hybrid search) | `4` |
|
|
819
|
+
| `traverse` (memory graph BFS) | `3` |
|
|
820
|
+
| `hive_consult_ontology` (per-hive ontology read; same name across all three hives) | `4` |
|
|
821
|
+
| `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search) | `5` |
|
|
822
|
+
| `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search) | `5` |
|
|
823
|
+
| `hive_search_releases` / `_issues` (Product Hive general search) | `5` |
|
|
804
824
|
|
|
805
825
|
Pass `"toolBudgets": {}` to start from a clean slate (no defaults applied
|
|
806
826
|
on top — useful for runs that intentionally want unbounded research). When
|
|
@@ -828,7 +848,64 @@ during normal multi-entity reads. The loop-detection guard catches the
|
|
|
828
848
|
pathological "same `(name, args)` batch over and over" case for that
|
|
829
849
|
family without needing per-tool caps.
|
|
830
850
|
|
|
831
|
-
### 4.8 `
|
|
851
|
+
### 4.8 `supervisor` (run judge)
|
|
852
|
+
|
|
853
|
+
`supervisor` controls the optional **run supervisor** — an LLM judge that
|
|
854
|
+
periodically reviews the agent's transcript (reasoning, tool calls, tool
|
|
855
|
+
results, visible text) and may steer the run:
|
|
856
|
+
|
|
857
|
+
- **`on_track`** — no-op; the run continues.
|
|
858
|
+
- **`redirect`** — a steering user message is injected; tools stay available.
|
|
859
|
+
- **`finalize`** — the next turn is forced tools-disabled so the run lands a
|
|
860
|
+
clean final answer.
|
|
861
|
+
|
|
862
|
+
Reviews fire every **`interval` LLM calls** (`completeTurn` invocations) at
|
|
863
|
+
the bottom of tool-emitting rounds. Default interval is **5** when enabled.
|
|
864
|
+
|
|
865
|
+
```jsonc
|
|
866
|
+
"supervisor": {
|
|
867
|
+
"interval": 5 // optional — LLM calls between reviews; default 5
|
|
868
|
+
}
|
|
869
|
+
|
|
870
|
+
// or:
|
|
871
|
+
"supervisor": false // explicitly disable the platform judge for this run
|
|
872
|
+
```
|
|
873
|
+
|
|
874
|
+
| Field | Type | Required | Notes |
|
|
875
|
+
| ----------------- | --------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------- |
|
|
876
|
+
| `interval` | integer ≥ 1 | no | Defaults to **5** when the supervisor is enabled and `interval` is omitted. Capped at **100** server-side. |
|
|
877
|
+
| (literal `false`) | `false` | no | Disables the run supervisor for this run. `loopDetection` and `toolBudgets` still apply. |
|
|
878
|
+
|
|
879
|
+
**Defaults.** When `supervisor` is **omitted**, MANTYX enables the platform
|
|
880
|
+
LLM judge on ephemeral runs. Pass `"supervisor": false` to opt out.
|
|
881
|
+
|
|
882
|
+
**SDK-only usage.** When calling `@mantyx/ts-sdk` directly (not via
|
|
883
|
+
`POST /agent-runs`), the supervisor is **off unless explicitly configured**:
|
|
884
|
+
pass `supervisor: { review, interval? }` on `RunAgentOptions` to enable a
|
|
885
|
+
caller-supplied judge, or pass `supervisor: false` (or omit the field) to
|
|
886
|
+
keep it disabled. The wire field above controls the **platform-hosted** judge
|
|
887
|
+
on API/ephemeral runs only.
|
|
888
|
+
|
|
889
|
+
Validation (server-side, `400 invalid_request` on violation):
|
|
890
|
+
|
|
891
|
+
| Constraint             | Limit |
|
|
892
|
+
| ---------------------- | ----- |
|
|
893
|
+
| `interval` upper bound | `100` |
|
|
894
|
+
|
|
895
|
+
**Inheritance for sessions.**
|
|
896
|
+
|
|
897
|
+
- `POST /agent-sessions { supervisor }` — sets the session-default, applied
|
|
898
|
+
to every subsequent message run.
|
|
899
|
+
- `POST /agent-sessions/:id/messages { supervisor }` — optional per-message
|
|
900
|
+
override; applies to that one run only and does not mutate the session's
|
|
901
|
+
stored value.
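The inheritance rules can be read as "per-message override wins, otherwise the session default, otherwise the platform default". The sketch below encodes that reading. `resolveSupervisor` is a hypothetical helper, and treating an omitted value at both levels as "enabled with interval 5" is an assumption carried over from the ephemeral-run default described above.

```typescript
type SupervisorInput = false | { interval?: number } | undefined;
type Supervisor = false | { interval: number };

// Picks the effective supervisor setting for one message run without
// mutating the session default.
function resolveSupervisor(
  sessionDefault: SupervisorInput,
  messageOverride: SupervisorInput,
): Supervisor {
  const chosen = messageOverride !== undefined ? messageOverride : sessionDefault;
  if (chosen === false) return false;
  const interval = chosen?.interval ?? 5; // documented default when enabled
  if (!Number.isInteger(interval) || interval < 1 || interval > 100) {
    throw new RangeError("invalid_request: supervisor.interval must be an integer 1..100");
  }
  return { interval };
}
```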
|
|
902
|
+
|
|
903
|
+
**Observability.** Each review emits an SSE `supervisor` event (see §7),
|
|
904
|
+
including `on_track` checks, so SDK clients can render supervisor activity.
|
|
905
|
+
When `action` is `redirect` or `finalize`, the pipeline has already applied
|
|
906
|
+
the verdict by the time the event arrives.
|
|
907
|
+
|
|
908
|
+
### 4.9 `metadata` (developer-supplied KV for filtering)
|
|
832
909
|
|
|
833
910
|
`metadata` is a flat string→string KV that is **persisted alongside the run /
|
|
834
911
|
session** and surfaced in the MANTYX dashboard. Use it to tag runs with your
|
|
@@ -838,12 +915,12 @@ prompt.
|
|
|
838
915
|
|
|
839
916
|
Validation (server-side, `400 invalid_request` on violation):
|
|
840
917
|
|
|
841
|
-
| Constraint
|
|
842
|
-
|
|
|
843
|
-
| Max entries
|
|
844
|
-
| Key pattern
|
|
845
|
-
| Value type / length
|
|
846
|
-
| Serialized JSON size
|
|
918
|
+
| Constraint | Limit |
|
|
919
|
+
| -------------------- | ------------------------ |
|
|
920
|
+
| Max entries | 16 |
|
|
921
|
+
| Key pattern | `^[A-Za-z0-9._-]{1,64}$` |
|
|
922
|
+
| Value type / length | string ≤ 256 chars |
|
|
923
|
+
| Serialized JSON size | ≤ 4 KB |
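These four limits are simple to pre-check on the client. The sketch below assumes "4 KB" means 4 × 1024 UTF-8 bytes and that "256 chars" can be approximated by JavaScript string length; both are assumptions, and `validateMetadata` and `utf8Length` are hypothetical helpers.

```typescript
const KEY_RE = /^[A-Za-z0-9._-]{1,64}$/;

// Counts UTF-8 bytes without relying on platform encoders.
function utf8Length(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff) { bytes += 4; i++; } // surrogate pair
    else bytes += 3;
  }
  return bytes;
}

// Returns human-readable problems; an empty list means all four checks pass.
function validateMetadata(metadata: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const keys = Object.keys(metadata);
  if (keys.length > 16) problems.push("too many entries");
  for (const key of keys) {
    const value = metadata[key];
    if (!KEY_RE.test(key)) problems.push(`bad key: ${key}`);
    if (typeof value !== "string" || value.length > 256) {
      problems.push(`bad value: ${key}`);
    }
  }
  if (utf8Length(JSON.stringify(metadata)) > 4 * 1024) problems.push("too large");
  return problems;
}
```

Note that the size cap can bind before the per-entry caps do: sixteen entries with 256-character values already exceed 4 KB once serialized.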
|
|
847
924
|
|
|
848
925
|
For session-scoped runs the inheritance rules are:
|
|
849
926
|
|
|
@@ -872,13 +949,18 @@ POST /api/v1/workspaces/{slug}/agent-runs/{runId}/cancel
|
|
|
872
949
|
`POST /agent-runs` returns `202 Accepted` immediately:
|
|
873
950
|
|
|
874
951
|
```json
|
|
875
|
-
{
|
|
952
|
+
{
|
|
953
|
+
"runId": "run_abc",
|
|
954
|
+
"streamUrl": "/api/v1/workspaces/acme/agent-runs/run_abc/stream"
|
|
955
|
+
}
|
|
876
956
|
```
|
|
877
957
|
|
|
878
958
|
`GET .../stream` is the canonical event channel; see §7.
|
|
879
959
|
|
|
880
960
|
`GET /agent-runs/{runId}` returns the run snapshot (status, final text, error,
|
|
881
|
-
spec
|
|
961
|
+
spec, plus the cost-attribution triple `tokens` / `turns` / `model` —
|
|
962
|
+
see §7.1) without subscribing to live events. Useful for polling long
|
|
963
|
+
runs or attributing spend after the SSE stream was already consumed.
|
|
882
964
|
|
|
883
965
|
## 6. Sessions
|
|
884
966
|
|
|
@@ -903,13 +985,15 @@ and returns `{ runId, streamUrl }` just like a one-shot run. Body:
|
|
|
903
985
|
```jsonc
|
|
904
986
|
{
|
|
905
987
|
"prompt": "What's in /etc/hosts?",
|
|
906
|
-
"tools": [
|
|
988
|
+
"tools": [
|
|
989
|
+
/* optional refresh of tool definitions */
|
|
990
|
+
],
|
|
907
991
|
}
|
|
908
992
|
```
|
|
909
993
|
|
|
910
994
|
The server prepends the session's prior messages, runs the model, and on
|
|
911
995
|
success appends the new user/assistant turns back to the session row. Local
|
|
912
|
-
tool **handlers** are
|
|
996
|
+
tool **handlers** are _not_ persisted: the session stores definitions
|
|
913
997
|
(name, schema, description) so that a restarted SDK can re-bind handlers and
|
|
914
998
|
keep going.
|
|
915
999
|
|
|
@@ -939,9 +1023,6 @@ data: <utf-8 JSON>
|
|
|
939
1023
|
`<type>` and `<data>` shapes:
|
|
940
1024
|
|
|
941
1025
|
```jsonc
|
|
942
|
-
// running message
|
|
943
|
-
{ "seq": 1, "type": "started", "data": {} }
|
|
944
|
-
|
|
945
1026
|
// streamed assistant tokens (zero or more per turn)
|
|
946
1027
|
{ "seq": 2, "type": "assistant_delta", "data": { "text": "Hello" } }
|
|
947
1028
|
|
|
@@ -978,9 +1059,30 @@ data: <utf-8 JSON>
|
|
|
978
1059
|
// is observability so SDK clients can render "memory budget exhausted" status notes.
|
|
979
1060
|
{ "seq": 7, "type": "tool_budget_exceeded", "data": { "tool": "recall", "maxCalls": 4, "callIndex": 5 } }
|
|
980
1061
|
|
|
1062
|
+
// run-supervisor check (see §4.8). Fired on every review — on_track included.
|
|
1063
|
+
{ "seq": 7, "type": "supervisor", "data": { "action": "on_track", "reason": "Agent is making progress.", "llmCalls": 5 } }
|
|
1064
|
+
{ "seq": 8, "type": "supervisor", "data": { "action": "redirect", "reason": "Stuck re-querying.", "redirect": "Answer from the data you already have.", "llmCalls": 10 } }
|
|
1065
|
+
{ "seq": 9, "type": "supervisor", "data": { "action": "finalize", "reason": "Enough to answer.", "llmCalls": 15 } }
|
|
1066
|
+
|
|
981
1067
|
// terminal event
|
|
982
|
-
|
|
983
|
-
|
|
1068
|
+
// Every terminal `result` event also carries `tokens`, `turns`, and `model`
|
|
1069
|
+
// for cost attribution and dashboards — see §7.1. Older platforms (pre-
|
|
1070
|
+
// 2026-09) omit these fields; SDK clients detect "no usage data" by
|
|
1071
|
+
// checking that `model.provider` is empty / falsy.
|
|
1072
|
+
{ "seq": 8, "type": "result", "data": {
|
|
1073
|
+
"subtype": "success",
|
|
1074
|
+
"text": "Final reply",
|
|
1075
|
+
"tokens": { "inputTokens": 1283, "cachedTokens": 512, "reasoningTokens": 96, "outputTokens": 240 },
|
|
1076
|
+
"turns": 3,
|
|
1077
|
+
"model": { "id": "platform:demo", "provider": "openai", "vendorModelId": "gpt-5.4-mini", "reasoningEffort": "low" }
|
|
1078
|
+
} }
|
|
1079
|
+
{ "seq": 8, "type": "result", "data": {
|
|
1080
|
+
"subtype": "error_local_tool_timeout",
|
|
1081
|
+
"error": "...",
|
|
1082
|
+
"tokens": { "inputTokens": 980, "cachedTokens": 0, "reasoningTokens": 0, "outputTokens": 14 },
|
|
1083
|
+
"turns": 2,
|
|
1084
|
+
"model": { "id": "platform:demo", "provider": "anthropic", "vendorModelId": "claude-opus-4-7" }
|
|
1085
|
+
} }
|
|
984
1086
|
{ "seq": 8, "type": "cancelled", "data": {} }
|
|
985
1087
|
```
|
|
986
1088
|
|
|
@@ -991,6 +1093,117 @@ field and the parsed `type` inside `data` — they are always equal, but
|
|
|
991
1093
|
implementations should rely on `data.type` because some HTTP middleware
|
|
992
1094
|
strips the `event:` line.
|
|
993
1095
|
|
|
1096
|
+
### 7.1 Cost-attribution fields (`tokens`, `turns`, `model`)
|
|
1097
|
+
|
|
1098
|
+
Every terminal `result` SSE event (and every terminal `error` event on
|
|
1099
|
+
platforms that emit it — see `docs/wire-protocol.md` §4.7) carries three
|
|
1100
|
+
additional fields so callers can drive cost dashboards, per-turn budgets,
|
|
1101
|
+
and provider/model spend reports without a follow-up
|
|
1102
|
+
`GET /agent-runs/:runId` round trip. The same fields are persisted on the
|
|
1103
|
+
`EphemeralAgentRun` row and surfaced by that endpoint.
|
|
1104
|
+
|
|
1105
|
+
| Field | Type | Notes |
|
|
1106
|
+
| -------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
1107
|
+
| `tokens` | object | Per-run token totals aggregated across every model invocation. See schema below. |
|
|
1108
|
+
| `turns` | int | Total `engine.completeTurn(...)` invocations for the run. Counts the failing call too — so a single-shot run is `1`, a tool loop is `>= 2`, and a run that errored on its first model call is `1`. Distinct from "tool turns" — `turns` is **model invocations**, regardless of whether the model called any tools. |
|
|
1109
|
+
| `model` | object | Resolved model that actually executed the run. See schema below. |
|
|
1110
|
+
|
|
1111
|
+
Always present on the terminal event for runs created against
|
|
1112
|
+
**MANTYX ≥ 2026-09** servers. Older servers omit these fields entirely;
|
|
1113
|
+
SDK clients (TS/Go/Python) detect "no usage data" by checking that
|
|
1114
|
+
`model.provider` is empty / falsy. JSON keys follow MANTYX's standard
|
|
1115
|
+
camelCase wire convention.
|
|
1116
|
+
|
|
1117
|
+
**`tokens` schema** — mirrors the wire shape produced by
|
|
1118
|
+
`tokenUsageToWireTokens` in `packages/ts-sdk/src/usage-wire.ts`:
|
|
1119
|
+
|
|
1120
|
+
| Field | Type | Notes |
|
|
1121
|
+
| ----------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
1122
|
+
| `inputTokens` | int | **Total billable input** — fresh prompt tokens **plus** the cached-read slice the provider still bills (at a discount) **plus** any cache-creation tokens **plus** tool-prompt tokens. Equal to the sum of every provider-reported input bucket for the run. |
|
|
1123
|
+
| `cachedTokens` | int | The discounted slice of `inputTokens` that came from a prompt cache hit (Anthropic prompt caching, OpenAI cached prompt, Gemini implicit cache). `0` when the provider doesn't report cache reads or the run didn't hit cache. |
|
|
1124
|
+
| `reasoningTokens` | int | Non-visible thinking tokens. **Already counted inside `outputTokens`** — surfaced separately so dashboards can break out "thinking cost" vs visible output. `0` when the model didn't reason or didn't report it. |
|
|
1125
|
+
| `outputTokens` | int | **All** tokens the model emitted for this run, visible + reasoning. Matches the provider's "completion tokens" / "output tokens" billing line. |
|
|
1126
|
+
|
|
1127
|
+
`inputTokens` and `outputTokens` together cover every billable token the
|
|
1128
|
+
run consumed; `cachedTokens` and `reasoningTokens` are diagnostic
|
|
1129
|
+
breakdowns _inside_ those two totals (not separate buckets to be added).
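Because the two diagnostic fields are slices of the two totals, a dashboard would subtract rather than add. A minimal sketch, with `WireTokens` and `breakdown` as hypothetical names and the field names taken from the table above:

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Splits the two billable totals into their diagnostic slices.
// `uncachedInput` covers everything in `inputTokens` that was not a cache read.
function breakdown(t: WireTokens) {
  return {
    uncachedInput: t.inputTokens - t.cachedTokens,
    cachedInput: t.cachedTokens,
    visibleOutput: t.outputTokens - t.reasoningTokens,
    reasoningOutput: t.reasoningTokens,
    totalBillable: t.inputTokens + t.outputTokens,
  };
}
```

For the sample `result` event shown earlier (`1283` / `512` / `96` / `240`), this yields 771 uncached input tokens, 144 visible output tokens, and 1523 billable tokens in total.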
|
|
1130
|
+
|
|
1131
|
+
**`model` schema** — fields the platform stamps onto every successful
|
|
1132
|
+
or failed run:
|
|
1133
|
+
|
|
1134
|
+
| Field | Type | Notes |
|
|
1135
|
+
| ----------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
1136
|
+
| `id` | string | Catalog id — the same string a caller would pass back as `modelId` to re-select this exact entry (e.g. `"platform:demo"`, `"provider:cmf…"`). Empty string against legacy fallbacks that didn't synthesise a catalog id. |
|
|
1137
|
+
| `provider` | string | Lowercase provider id: `"openai"`, `"anthropic"`, `"google"`, `"azure-openai"`. |
|
|
1138
|
+
| `vendorModelId` | string | The model id the platform actually sent to the provider (e.g. `"gpt-5.4-mini"`, `"claude-opus-4-7"`, `"gemini-2.5-pro"`). Carried through from the `model` field on `AgentSpec` after resolution. |
|
|
1139
|
+
| `reasoningEffort` | string | Optional. `"off"`, `"low"`, `"medium"`, `"high"`. Omitted when the provider doesn't expose a reasoning-level knob or the run didn't request one. |
|
|
1140
|
+
|
|
1141
|
+
**Per-provider token mapping.** Provider responses vary in how they
|
|
1142
|
+
report token usage. MANTYX normalises them into the wire shape above as
|
|
1143
|
+
follows:
|
|
1144
|
+
|
|
1145
|
+
| Provider | `inputTokens` ← | `cachedTokens` ← | `reasoningTokens` ← | `outputTokens` ← |
|
|
1146
|
+
| --------- | ----------------------------------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
|
|
1147
|
+
| OpenAI | `usage.prompt_tokens` (already includes cached read tokens) | `usage.prompt_tokens_details.cached_tokens` | `usage.completion_tokens_details.reasoning_tokens` | `usage.completion_tokens` |
|
|
1148
|
+
| Anthropic | `usage.input_tokens` + `usage.cache_read_input_tokens` + `usage.cache_creation_input_tokens` | `usage.cache_read_input_tokens` | (extended-thinking tokens; folded into `output_tokens` by the provider) | `usage.output_tokens` |
|
|
1149
|
+
| Google | `usageMetadata.promptTokenCount` + `usageMetadata.cachedContentTokenCount` + tool-prompt tokens | `usageMetadata.cachedContentTokenCount` | `usageMetadata.thoughtsTokenCount` | `usageMetadata.candidatesTokenCount` (or `totalTokenCount - promptTokenCount` for older Gemini SDKs) |
|
|
1150
|
+
|
|
1151
|
+
If a provider doesn't report a given bucket the corresponding field is
|
|
1152
|
+
`0`, never `null`.
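The mapping table can be sketched as one small function per provider. The usage interfaces below are hypothetical partial shapes that mirror only the paths named in the table, not full provider types. For Anthropic, the table does not say what value ends up in `reasoningTokens`, so this sketch reports `0` there as an explicit assumption.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Partial, illustrative shapes covering only the fields named in the table.
interface OpenAiUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}

interface AnthropicUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// OpenAI: prompt_tokens already includes cached reads, so nothing is added.
function fromOpenAi(u: OpenAiUsage): WireTokens {
  return {
    inputTokens: u.prompt_tokens ?? 0,
    cachedTokens: u.prompt_tokens_details?.cached_tokens ?? 0,
    reasoningTokens: u.completion_tokens_details?.reasoning_tokens ?? 0,
    outputTokens: u.completion_tokens ?? 0,
  };
}

// Anthropic: the three input buckets are summed into inputTokens.
function fromAnthropic(u: AnthropicUsage): WireTokens {
  const cacheRead = u.cache_read_input_tokens ?? 0;
  return {
    inputTokens: (u.input_tokens ?? 0) + cacheRead + (u.cache_creation_input_tokens ?? 0),
    cachedTokens: cacheRead,
    reasoningTokens: 0, // assumption: no separate figure is reported
    outputTokens: u.output_tokens ?? 0,
  };
}
```

Missing buckets fall back to `0` through `??`, matching the "`0`, never `null`" rule.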
|
|
1153
|
+
|
|
1154
|
+
**Tool-loop accounting.** When the run executes tool turns, every
|
|
1155
|
+
`engine.completeTurn(...)` invocation contributes its usage to the
|
|
1156
|
+
aggregated `tokens` object — so a run with one tool round (model →
|
|
1157
|
+
tool → model) reports `turns: 2` and the **sum** of both model calls'
|
|
1158
|
+
token usage. The terminal event carries only the cumulative totals, with no
|
|
1159
|
+
per-turn breakdown (use the
|
|
1160
|
+
`assistant_message` events for per-turn observability).
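The accounting rule amounts to a field-wise sum plus a count of model invocations. A minimal sketch, assuming each model call's usage has already been normalised to the wire shape; `accumulate` is a hypothetical helper name:

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Sums per-call usage into run totals; `turns` is the number of model
// invocations, including one that failed.
function accumulate(perCall: WireTokens[]): { tokens: WireTokens; turns: number } {
  const tokens: WireTokens = {
    inputTokens: 0,
    cachedTokens: 0,
    reasoningTokens: 0,
    outputTokens: 0,
  };
  for (const call of perCall) {
    tokens.inputTokens += call.inputTokens;
    tokens.cachedTokens += call.cachedTokens;
    tokens.reasoningTokens += call.reasoningTokens;
    tokens.outputTokens += call.outputTokens;
  }
  return { tokens, turns: perCall.length };
}
```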
|
|
1161
|
+
|
|
1162
|
+
**Snapshot exposure.** `GET /api/v1/workspaces/{slug}/agent-runs/{runId}`
|
|
1163
|
+
also returns `tokens` / `turns` / `model` on the run snapshot JSON, with
|
|
1164
|
+
the same wire shape. The keys are always present (as `null` until the
|
|
1165
|
+
worker writes the terminal event, and on legacy rows pre-rollout) so
|
|
1166
|
+
SDK clients can probe server capability via `"tokens" in body` without
|
|
1167
|
+
triggering an undefined-vs-null distinction across HTTP/JSON
|
|
1168
|
+
serialization.
|
|
1169
|
+
|
|
1170
|
+
**A2A exposure.** The MANTYX-hosted A2A endpoint
|
|
1171
|
+
(`POST /api/a2a/{workspaceSlug}/agents/{agentSlug}`) returns the same
|
|
1172
|
+
triple on the JSON-RPC response under `result.metadata.mantyx`:
|
|
1173
|
+
|
|
1174
|
+
```jsonc
|
|
1175
|
+
{
|
|
1176
|
+
"result": {
|
|
1177
|
+
"kind": "message",
|
|
1178
|
+
"messageId": "msg_abc",
|
|
1179
|
+
"role": "agent",
|
|
1180
|
+
"parts": [{ "kind": "text", "text": "Final reply" }],
|
|
1181
|
+
"metadata": {
|
|
1182
|
+
"mantyx": {
|
|
1183
|
+
"tokens": {
|
|
1184
|
+
"inputTokens": 1283,
|
|
1185
|
+
"cachedTokens": 512,
|
|
1186
|
+
"reasoningTokens": 96,
|
|
1187
|
+
"outputTokens": 240,
|
|
1188
|
+
},
|
|
1189
|
+
"turns": 3,
|
|
1190
|
+
"model": {
|
|
1191
|
+
"id": "platform:demo",
|
|
1192
|
+
"provider": "openai",
|
|
1193
|
+
"vendorModelId": "gpt-5.4-mini",
|
|
1194
|
+
"reasoningEffort": "low",
|
|
1195
|
+
},
|
|
1196
|
+
},
|
|
1197
|
+
},
|
|
1198
|
+
},
|
|
1199
|
+
}
|
|
1200
|
+
```
|
|
1201
|
+
|
|
1202
|
+
The `metadata.mantyx` block is omitted entirely against legacy runners
|
|
1203
|
+
that haven't implemented `runWithUsage` on the A2A adapter (see
|
|
1204
|
+
`packages/ts-sdk/src/a2a/adapter.ts`); cross-platform A2A clients
|
|
1205
|
+
should treat its absence as "no usage data" rather than as zero usage.
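On the client side, "absent means no data" suggests returning `undefined` rather than a zero-filled object. The sketch below walks the JSON-RPC response defensively; `readMantyxUsage` and `isRecord` are hypothetical helpers, and only the `result.metadata.mantyx` path comes from the example above.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns the `mantyx` usage block, or undefined when the runner did not
// attach one. Callers should not substitute zeros for a missing block.
function readMantyxUsage(rpcResponse: unknown): Record<string, unknown> | undefined {
  if (!isRecord(rpcResponse)) return undefined;
  const result = rpcResponse["result"];
  if (!isRecord(result)) return undefined;
  const metadata = result["metadata"];
  if (!isRecord(metadata)) return undefined;
  const mantyx = metadata["mantyx"];
  return isRecord(mantyx) ? mantyx : undefined;
}
```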
|
|
1206
|
+
|
|
994
1207
|
## 8. Local tool result
|
|
995
1208
|
|
|
996
1209
|
```
|
|
@@ -1027,23 +1240,25 @@ All non-2xx responses use this body shape:
|
|
|
1027
1240
|
|
|
1028
1241
|
```jsonc
|
|
1029
1242
|
{
|
|
1030
|
-
"error": "invalid_model",
|
|
1243
|
+
"error": "invalid_model", // machine-readable code
|
|
1031
1244
|
"message": "Model 'foo' is ambiguous; pick one of: provider:cm6...",
|
|
1032
|
-
"candidates": [
|
|
1245
|
+
"candidates": [
|
|
1246
|
+
/* sometimes present */
|
|
1247
|
+
],
|
|
1033
1248
|
}
|
|
1034
1249
|
```
|
|
1035
1250
|
|
|
1036
1251
|
Common codes:
|
|
1037
1252
|
|
|
1038
|
-
| Code
|
|
1039
|
-
|
|
|
1040
|
-
| `unauthorized`
|
|
1041
|
-
| `not_found`
|
|
1042
|
-
| `invalid_request`
|
|
1043
|
-
| `invalid_model`
|
|
1044
|
-
| `unknown_tool_use`
|
|
1045
|
-
| `run_terminal`
|
|
1046
|
-
| `rate_limited`
|
|
1253
|
+
| Code | HTTP | Notes |
|
|
1254
|
+
| ------------------ | ---: | -------------------------------------- |
|
|
1255
|
+
| `unauthorized` | 401 | Missing/invalid API key |
|
|
1256
|
+
| `not_found` | 404 | Workspace, run, or session unknown |
|
|
1257
|
+
| `invalid_request` | 400 | Body failed Zod validation |
|
|
1258
|
+
| `invalid_model` | 400 | `modelId` couldn't be resolved |
|
|
1259
|
+
| `unknown_tool_use` | 404 | Tool-result for an unknown `toolUseId` |
|
|
1260
|
+
| `run_terminal` | 409 | Tool-result after run finished |
|
|
1261
|
+
| `rate_limited` | 429 | Per-API-key sliding window |
|
|
1047
1262
|
|
|
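The error body and code table above could be surfaced to callers roughly as follows. This is a sketch only: `MantyxApiError`, `toApiError`, and the `unknown_error` fallback code are hypothetical names introduced for illustration, while the codes and HTTP statuses are copied from the table.

```typescript
// Codes and statuses as listed in the "Common codes" table.
const ERROR_STATUS: Record<string, number> = {
  unauthorized: 401,
  not_found: 404,
  invalid_request: 400,
  invalid_model: 400,
  unknown_tool_use: 404,
  run_terminal: 409,
  rate_limited: 429,
};

// Hypothetical error class carrying the machine-readable code.
class MantyxApiError extends Error {
  status: number;
  code: string;
  candidates: unknown[] | undefined;

  constructor(status: number, code: string, message: string, candidates?: unknown[]) {
    super(message);
    this.name = "MantyxApiError";
    this.status = status;
    this.code = code;
    this.candidates = candidates;
  }
}

// Converts a non-2xx response body into a typed error.
// Falls back to "unknown_error" (an assumption, not a documented code)
// when the body does not match the documented shape.
function toApiError(status: number, body: unknown): MantyxApiError {
  if (typeof body === "object" && body !== null) {
    const b = body as { error?: unknown; message?: unknown; candidates?: unknown };
    if (typeof b.error === "string") {
      const message = typeof b.message === "string" ? b.message : b.error;
      const candidates = Array.isArray(b.candidates) ? b.candidates : undefined;
      return new MantyxApiError(status, b.error, message, candidates);
    }
  }
  return new MantyxApiError(status, "unknown_error", `HTTP ${status}`);
}

// True when the code is one of the documented ones and the status matches the table.
function isDocumentedError(err: MantyxApiError): boolean {
  return ERROR_STATUS[err.code] === err.status;
}
```

Because `candidates` is only sometimes present, the sketch keeps it optional rather than defaulting to an empty array.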
1048
1263
|
## 11. Suggested client architecture
|
|
1049
1264
|
|
|
@@ -1061,8 +1276,8 @@ A reference SDK should:
|
|
|
1061
1276
|
model-side "don't double-call" hint without hand-editing the
|
|
1062
1277
|
description.
|
|
1063
1278
|
- **Local A2A peers** (`kind: "a2a_local"`) — caller-supplied A2A
|
|
1064
|
-
clients. Resolve the peer's Agent Card
|
|
1065
|
-
|
|
1279
|
+
clients. Resolve the peer's Agent Card _first_ (e.g. `fetch
|
|
1280
|
+
"<peer>/.well-known/agent-card.json"` or read from a local registry),
|
|
1066
1281
|
attach it to the spec as `agentCard`, and in the dispatcher look the
|
|
1067
1282
|
client up by `agentCard.url` (or any other field you indexed on)
|
|
1068
1283
|
when the `local_tool_call` arrives.
|
|
@@ -1074,9 +1289,10 @@ A reference SDK should:
|
|
|
1074
1289
|
|
|
1075
1290
|
`mantyx`, `mantyx_plugin`, `a2a`, and `mcp` refs are server-resolved —
|
|
1076
1291
|
no SDK-side registry needed.
|
|
1292
|
+
|
|
1077
1293
|
3. On `runAgent` / `session.send`:
|
|
1078
1294
|
- Accept `reasoningLevel` from the caller and pass it through unchanged
|
|
1079
|
-
(string `"off" | "low" | "medium" | "high"`
|
|
1295
|
+
(string `"off" | "low" | "medium" | "high"` _or_ number `0–100`); do
|
|
1080
1296
|
**not** translate to a vendor-specific knob — the server owns that
|
|
1081
1297
|
mapping so all SDKs stay aligned with the web composer.
|
|
1082
1298
|
- POST the run/message, get `{ runId, streamUrl }`.
|
|
@@ -1088,18 +1304,18 @@ A reference SDK should:
|
|
|
1088
1304
|
- Treat `thinking_delta` events as opt-in callback fodder; many UIs hide
|
|
1089
1305
|
them by default. Their presence depends on `reasoningLevel > 0` and
|
|
1090
1306
|
on the active model exposing thought parts.
|
|
1091
|
-
- Accept `loopDetection` and `
|
|
1092
|
-
them through unchanged (see §4.6 / §4.7).
|
|
1093
|
-
omitting them keeps MANTYX's runtime defaults; passing
|
|
1094
|
-
`loopDetection: false`
|
|
1095
|
-
defaults; passing entries layers caller
|
|
1096
|
-
defaults.
|
|
1097
|
-
- Treat `loop_detected` and `
|
|
1098
|
-
observability-only — the server already substituted
|
|
1099
|
-
tool-results / steering nudges
|
|
1100
|
-
the event to the caller (status banner,
|
|
1101
|
-
**not** abort the run on these events; the run
|
|
1102
|
-
`result` / `error` / `cancelled` as usual.
|
|
1307
|
+
- Accept `loopDetection`, `toolBudgets`, and `supervisor` from the caller
|
|
1308
|
+
and pass them through unchanged (see §4.6 / §4.7 / §4.8). All three are
|
|
1309
|
+
_additive_: omitting them keeps MANTYX's runtime defaults; passing
|
|
1310
|
+
`loopDetection: false` or `supervisor: false` opts out; passing
|
|
1311
|
+
`toolBudgets: {}` clears the defaults; passing entries layers caller
|
|
1312
|
+
overrides on top of the defaults.
|
|
1313
|
+
- Treat `loop_detected`, `tool_budget_exceeded`, and `supervisor` SSE
|
|
1314
|
+
events as observability-only — the server already substituted synthetic
|
|
1315
|
+
tool-results / steering nudges / supervisor verdicts where applicable, so
|
|
1316
|
+
the SDK's job is just to surface the event to the caller (status banner,
|
|
1317
|
+
log line, telemetry). Do **not** abort the run on these events; the run
|
|
1318
|
+
continues through `result` / `error` / `cancelled` as usual.
|
|
1103
1319
|
- On terminal `result`, resolve the call. On `error` subtype, throw.
|
|
1104
1320
|
4. Re-emit assistant deltas/events as a stream/iterator for callers who care
|
|
1105
1321
|
about live output.
|
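Two points in step 3 are easy to get wrong in a client: preserving `false` and `{}` when passing options through, and not treating observability events as terminal. The sketch below illustrates both under stated assumptions: `RunOptions`, `applyPassthrough`, `RunEvent`, `classifyEvent`, and `drain` are hypothetical names, the events are assumed to carry a `type` field, and where these options sit in the request body is not confirmed by this section.

```typescript
// Hypothetical option shapes; the nested config objects are left opaque.
interface RunOptions {
  reasoningLevel?: "off" | "low" | "medium" | "high" | number;
  loopDetection?: false | Record<string, unknown>;
  toolBudgets?: Record<string, unknown>;
  supervisor?: false | Record<string, unknown>;
}

const PASSTHROUGH_KEYS = ["reasoningLevel", "loopDetection", "toolBudgets", "supervisor"] as const;

// Copies only the options the caller actually set. A truthiness check would
// drop `false` (opt out) and could mishandle `0`, so compare against undefined.
function applyPassthrough(body: Record<string, unknown>, opts: RunOptions): Record<string, unknown> {
  const out: Record<string, unknown> = { ...body };
  for (const key of PASSTHROUGH_KEYS) {
    if (opts[key] !== undefined) out[key] = opts[key];
  }
  return out;
}

// Assumption: each decoded SSE event exposes its name as `type`.
interface RunEvent {
  type: string;
  [key: string]: unknown;
}

const OBSERVE_ONLY = new Set(["loop_detected", "tool_budget_exceeded", "supervisor"]);
const TERMINAL = new Set(["result", "error", "cancelled"]);

function classifyEvent(type: string): "observe" | "terminal" | "other" {
  if (OBSERVE_ONLY.has(type)) return "observe";
  if (TERMINAL.has(type)) return "terminal";
  return "other";
}

// Surfaces observability events via onNotice and keeps reading; stops at the
// first terminal event and returns it (or null if the stream ended early).
function drain(events: Iterable<RunEvent>, onNotice: (event: RunEvent) => void): RunEvent | null {
  for (const event of events) {
    const cls = classifyEvent(event.type);
    if (cls === "observe") onNotice(event);
    else if (cls === "terminal") return event;
  }
  return null;
}
```

A caller would then resolve or throw based on the returned terminal event, while `onNotice` feeds a status banner, log line, or telemetry sink.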