@mantyx/sdk 0.10.0 → 0.11.0
- package/CHANGELOG.md +8 -1
- package/README.md +37 -31
- package/dist/a2a-server.cjs +9 -0
- package/dist/a2a-server.cjs.map +1 -1
- package/dist/a2a-server.d.cts +1 -1
- package/dist/a2a-server.d.ts +1 -1
- package/dist/a2a-server.js +1 -1
- package/dist/{chunk-XMUCELMH.js → chunk-DR625E6B.js} +69 -9
- package/dist/chunk-DR625E6B.js.map +1 -0
- package/dist/{client-DHwh8MPj.d.cts → client-Byb0Zdo7.d.cts} +199 -84
- package/dist/{client-DHwh8MPj.d.ts → client-Byb0Zdo7.d.ts} +199 -84
- package/dist/index.cjs +76 -78
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +2 -2
- package/dist/index.d.ts +2 -2
- package/dist/index.js +9 -69
- package/dist/index.js.map +1 -1
- package/docs/agent-runs-protocol.md +373 -220
- package/docs/wire-protocol.md +415 -252
- package/package.json +1 -1
- package/dist/chunk-XMUCELMH.js.map +0 -1
- package/docs/oauth.md +0 -356
|
@@ -16,7 +16,7 @@ Companion documents:
|
|
|
16
16
|
|
|
17
17
|
## 1. Concepts
|
|
18
18
|
|
|
19
|
-
**Ephemeral agent.** A run-time agent that is
|
|
19
|
+
**Ephemeral agent.** A run-time agent that is _defined by the request_ rather
|
|
20
20
|
than persisted as a row in MANTYX's `Agent` table. The full spec (system
|
|
21
21
|
prompt, model, tools) is stored as part of each session/run for observability
|
|
22
22
|
but is not editable from the dashboard.
|
|
@@ -24,15 +24,15 @@ but is not editable from the dashboard.
|
|
|
24
24
|
**Tool refs.** Seven flavours, all carried inside the agent spec's `tools`
|
|
25
25
|
array:
|
|
26
26
|
|
|
27
|
-
| `kind`
|
|
28
|
-
|
|
|
29
|
-
| `mantyx`
|
|
30
|
-
| `mantyx_plugin`
|
|
31
|
-
| `local`
|
|
32
|
-
| `a2a`
|
|
33
|
-
| `a2a_local`
|
|
34
|
-
| `mcp`
|
|
35
|
-
| `mcp_local`
|
|
27
|
+
| `kind` | Resolved by | Notes |
|
|
28
|
+
| --------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
29
|
+
| `mantyx` | server | A workspace `Tool` row referenced by id (HTTP / Code / Plugin). |
|
|
30
|
+
| `mantyx_plugin` | server | A platform plugin tool referenced by name. |
|
|
31
|
+
| `local` | client | A custom tool defined and executed in the SDK's process. Carries `parameters` (input JSON Schema) plus optional `outputSchema` (return-value JSON Schema) and `longRunning` flag — see §4.1.1. |
|
|
32
|
+
| `a2a` | server | A _remote_ Agent2Agent peer MANTYX can reach; invoked via `message/send` and the reply is surfaced as the tool result. |
|
|
33
|
+
| `a2a_local` | client | An A2A peer MANTYX **cannot** reach. SDK resolves the [Agent Card](https://google.github.io/A2A/specification/#agent-card) locally and ships it inline; MANTYX uses it for the model description and routes calls back to the SDK over SSE. |
|
|
34
|
+
| `mcp` | server | A _remote_ MCP server (Streamable HTTP). At run start MANTYX lists the catalog and exposes every tool as `<server>_<tool>` (subject to `toolFilter`). |
|
|
35
|
+
| `mcp_local` | client | An MCP server MANTYX **cannot** reach. SDK runs `Initialize` + `tools/list` locally and ships the resolved `Tool[]` (with `inputSchema`); MANTYX exposes them to the model with the SDK-declared names and routes calls back over SSE. |
|
|
36
36
|
|
|
37
37
|
The split is deliberate:
|
|
38
38
|
|
|
@@ -42,7 +42,7 @@ The split is deliberate:
|
|
|
42
42
|
MCP/A2A this also means MANTYX does discovery (`listTools`, agent-card
|
|
43
43
|
fetch).
|
|
44
44
|
- **Client-resolved / "local"** (`local`, `a2a_local`, `mcp_local`) —
|
|
45
|
-
MANTYX has
|
|
45
|
+
MANTYX has _no_ access to the resource. The SDK does **all** of the
|
|
46
46
|
work: connection, discovery, listing, expansion, arg validation, auth,
|
|
47
47
|
execution, retries. MANTYX is a thin LLM-routing layer that emits a
|
|
48
48
|
`local_tool_call` event and blocks until the SDK POSTs back to
|
|
@@ -52,9 +52,9 @@ The split is deliberate:
|
|
|
52
52
|
|
|
53
53
|
**One-shot run vs. session.** A run is an LLM execution. Runs may be:
|
|
54
54
|
|
|
55
|
-
-
|
|
55
|
+
- _one-shot_ (`POST /agent-runs`) — fire-and-stream, no persistent state apart
|
|
56
56
|
from observability.
|
|
57
|
-
-
|
|
57
|
+
- _session-scoped_ (`POST /agent-sessions/:id/messages`) — the run inherits the
|
|
58
58
|
session's full message history, and the new user/assistant turns are
|
|
59
59
|
appended back to the session on success.
|
|
60
60
|
|
|
@@ -75,10 +75,10 @@ Authorization: Bearer <credential>
|
|
|
75
75
|
X-API-Key: <credential>
|
|
76
76
|
```
|
|
77
77
|
|
|
78
|
-
| Credential
|
|
79
|
-
|
|
|
80
|
-
| **Workspace API key**
|
|
81
|
-
| **OAuth 2.0 access token
|
|
78
|
+
| Credential | Token format | Identifies | Bound to | Use when |
|
|
79
|
+
| -------------------------- | ------------- | ---------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
|
|
80
|
+
| **Workspace API key** | `mantyx_…` | The workspace | One workspace, no end-user | Personal scripts, internal automations, anything the SDK caller owns end-to-end. |
|
|
81
|
+
| **OAuth 2.0 access token** | `mantyx_at_…` | An end user **and** the workspace they consented for | One workspace, one user (or one app for `client_credentials`) | "Sign in with MANTYX" apps, third-party integrations, anywhere consent + scopes matter. |
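
The two token formats in the table can be told apart by prefix alone. The following is a minimal client-side sketch of that idea, not the server's actual logic; `classifyCredential` and `authHeaders` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical helpers. The prefixes come from the "Token format" column;
// the server-side resolution code is not reproduced here.
type CredentialKind = "oauth_access_token" | "workspace_api_key" | "unknown";

function classifyCredential(token: string): CredentialKind {
  // Check the longer prefix first: "mantyx_at_" also starts with "mantyx_".
  if (token.startsWith("mantyx_at_")) return "oauth_access_token";
  if (token.startsWith("mantyx_")) return "workspace_api_key";
  return "unknown";
}

// Builds the Bearer form of the credential header shown earlier in this section.
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}
```

The order of the two checks matters, because every OAuth access token would also match the workspace-key prefix.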
|
|
82
82
|
|
|
83
83
|
The server resolves whichever it sees by token-prefix sniffing (see
|
|
84
84
|
`packages/api/src/services/bearer-credential.ts`) — SDKs do **not** need
|
|
@@ -115,19 +115,19 @@ two differences:
|
|
|
115
115
|
multi-scope ones — see §2.3). The SDK is expected to surface this
|
|
116
116
|
verbatim. The agent-runs surface uses these scopes:
|
|
117
117
|
|
|
118
|
-
| Endpoint
|
|
119
|
-
|
|
|
120
|
-
| `GET .../models`
|
|
121
|
-
| `POST .../agent-runs`
|
|
122
|
-
| `GET .../agent-runs/{runId}`
|
|
123
|
-
| `GET .../agent-runs/{runId}/stream`
|
|
124
|
-
| `POST .../agent-runs/{runId}/cancel`
|
|
125
|
-
| `POST .../agent-runs/{runId}/tool-results`
|
|
126
|
-
| `POST .../agent-sessions`
|
|
127
|
-
| `GET .../agent-sessions/{sessionId}`
|
|
128
|
-
| `DELETE .../agent-sessions/{sessionId}`
|
|
129
|
-
| `POST .../agent-sessions/{sessionId}/messages`
|
|
130
|
-
| `GET /api/oauth/userinfo`
|
|
118
|
+
| Endpoint | Required scope |
|
|
119
|
+
| ------------------------------------------------ | ---------------------- |
|
|
120
|
+
| `GET .../models` | `models:read` |
|
|
121
|
+
| `POST .../agent-runs` | `runs:write` |
|
|
122
|
+
| `GET .../agent-runs/{runId}` | `runs:read` |
|
|
123
|
+
| `GET .../agent-runs/{runId}/stream` | `runs:read` |
|
|
124
|
+
| `POST .../agent-runs/{runId}/cancel` | `runs:write` |
|
|
125
|
+
| `POST .../agent-runs/{runId}/tool-results` | `runs:write` |
|
|
126
|
+
| `POST .../agent-sessions` | `sessions:write` |
|
|
127
|
+
| `GET .../agent-sessions/{sessionId}` | `sessions:read` |
|
|
128
|
+
| `DELETE .../agent-sessions/{sessionId}` | `sessions:write` |
|
|
129
|
+
| `POST .../agent-sessions/{sessionId}/messages` | `sessions:write` |
|
|
130
|
+
| `GET /api/oauth/userinfo` | `mantyx.identity:read` |
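
As an illustration, the endpoint-to-scope table can be encoded as a lookup so an SDK can tell, before calling, which scopes a token still lacks. This is a sketch under assumptions: `REQUIRED_SCOPE` and `missingScopes` are hypothetical names, the keys keep the elided `...` path prefix exactly as the table prints it, and granted scopes are assumed to arrive as one space-separated string.

```typescript
// Scope requirements copied from the table; keys keep the doc's "..." prefix.
const REQUIRED_SCOPE: Record<string, string> = {
  "GET .../models": "models:read",
  "POST .../agent-runs": "runs:write",
  "GET .../agent-runs/{runId}": "runs:read",
  "GET .../agent-runs/{runId}/stream": "runs:read",
  "POST .../agent-runs/{runId}/cancel": "runs:write",
  "POST .../agent-runs/{runId}/tool-results": "runs:write",
  "POST .../agent-sessions": "sessions:write",
  "GET .../agent-sessions/{sessionId}": "sessions:read",
  "DELETE .../agent-sessions/{sessionId}": "sessions:write",
  "POST .../agent-sessions/{sessionId}/messages": "sessions:write",
  "GET /api/oauth/userinfo": "mantyx.identity:read",
};

// Returns the scopes the listed endpoints need that `granted` does not contain.
function missingScopes(granted: string, endpoints: string[]): string[] {
  const have = new Set(granted.split(/\s+/).filter(Boolean));
  const missing = new Set<string>();
  for (const endpoint of endpoints) {
    const need = REQUIRED_SCOPE[endpoint];
    if (need === undefined) throw new Error(`unknown endpoint: ${endpoint}`);
    if (!have.has(need)) missing.add(need);
  }
  return Array.from(missing);
}
```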
|
|
131
131
|
|
|
132
132
|
For an SDK that exposes one-shot runs and sessions end-to-end, request
|
|
133
133
|
at minimum `models:read runs:read runs:write sessions:read sessions:write`,
|
|
@@ -143,9 +143,10 @@ two differences:
|
|
|
143
143
|
OAuth tokens **also** honor the per-token agent allow-list
|
|
144
144
|
(`OAuthAccessToken.agentIds`) the user picked at consent time — see
|
|
145
145
|
[`docs/oauth.md`](./oauth.md) for the full registration / authorization-code
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
146
|
+
|
|
147
|
+
- PKCE flow. PKCE (`S256`) is mandatory and every MANTYX OAuth app is a
|
|
148
|
+
confidential client, so the token endpoint requires both `client_secret`
|
|
149
|
+
and `code_verifier`.
|
|
149
150
|
|
|
150
151
|
**Token lifetimes.** Access tokens live **1 hour** (`expires_in: 3600`).
|
|
151
152
|
Refresh tokens are **persistent and non-rotating**: they have no
|
|
@@ -176,14 +177,14 @@ Content-Type: application/json
|
|
|
176
177
|
|
|
177
178
|
### 2.3 Error model for credentials
|
|
178
179
|
|
|
179
|
-
| Status | Body shape
|
|
180
|
-
| ------ |
|
|
181
|
-
| `401` | `{ "error": "Unauthorized", "message": "API key or OAuth access token required..." }`
|
|
182
|
-
| `401` | `{ "error": "Invalid API key or OAuth access token" }`
|
|
183
|
-
| `403` | `{ "error": "This API key is not for the Developer API", "hint": "..." }`
|
|
184
|
-
| `403` | `{ "error": "Workspace API keys are not available on this plan.", "code": "api_keys_plan" }` <br> `{ "error": "OAuth applications are not available on this plan.", "code": "oauth_apps_plan" }` | Workspace tier lacks the `apiKeys` / `oauthApps` feature.
|
|
185
|
-
| `403` | `{ "error": "insufficient_scope", "required": "runs:write" }` (or an array if a route needs multiple)
|
|
186
|
-
| `404` | `{ "error": "Workspace path does not match this credential", "hint": "..." }`
|
|
180
|
+
| Status | Body shape | When |
|
|
181
|
+
| ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
182
|
+
| `401` | `{ "error": "Unauthorized", "message": "API key or OAuth access token required..." }` | No `Authorization` / `X-API-Key` header. |
|
|
183
|
+
| `401` | `{ "error": "Invalid API key or OAuth access token" }` | Token doesn't match a row, expired, or revoked. |
|
|
184
|
+
| `403` | `{ "error": "This API key is not for the Developer API", "hint": "..." }` | API key has wrong `usage`. |
|
|
185
|
+
| `403` | `{ "error": "Workspace API keys are not available on this plan.", "code": "api_keys_plan" }` <br> `{ "error": "OAuth applications are not available on this plan.", "code": "oauth_apps_plan" }` | Workspace tier lacks the `apiKeys` / `oauthApps` feature. |
|
|
186
|
+
| `403` | `{ "error": "insufficient_scope", "required": "runs:write" }` (or an array if a route needs multiple) | OAuth token is missing a scope a route demands. The response also sets `WWW-Authenticate: Bearer error="insufficient_scope", scope="..."`. |
|
|
187
|
+
| `404` | `{ "error": "Workspace path does not match this credential", "hint": "..." }` | URL slug ≠ token's workspace. |
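
A client can map the rows of this table onto a small discriminated union. The sketch below is one possible shape, not the SDK's own error types; `CredentialError`, `ErrorBody`, and `classifyCredentialError` are hypothetical names, and matching on the exact `error` strings assumes the server sends them as printed in the table.

```typescript
type CredentialError =
  | { kind: "missing_or_invalid_credential" }
  | { kind: "wrong_key_usage" }
  | { kind: "plan_restricted"; code: string }
  | { kind: "insufficient_scope"; required: string[] }
  | { kind: "workspace_mismatch" }
  | { kind: "other" };

interface ErrorBody {
  error?: string;
  code?: string;
  required?: string | string[];
}

function classifyCredentialError(status: number, body: ErrorBody): CredentialError {
  if (status === 401) return { kind: "missing_or_invalid_credential" };
  if (status === 403) {
    if (body.error === "insufficient_scope") {
      // `required` is a single scope or an array when a route needs several.
      const r = body.required;
      const required = r === undefined ? [] : Array.isArray(r) ? r : [r];
      return { kind: "insufficient_scope", required };
    }
    if (body.code === "api_keys_plan" || body.code === "oauth_apps_plan") {
      return { kind: "plan_restricted", code: body.code };
    }
    if (body.error === "This API key is not for the Developer API") {
      return { kind: "wrong_key_usage" };
    }
  }
  if (status === 404 && body.error === "Workspace path does not match this credential") {
    return { kind: "workspace_mismatch" };
  }
  return { kind: "other" };
}
```

Anything that does not match a row falls through to `other`, so unrelated 403 or 404 responses are not misreported as credential problems.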
|
|
187
188
|
|
|
188
189
|
## 3. Models
|
|
189
190
|
|
|
@@ -204,7 +205,11 @@ platform-hosted offerings visible to the workspace's tier.
|
|
|
204
205
|
"vendorModelId": "claude-sonnet-4-5",
|
|
205
206
|
"source": "platform_offering",
|
|
206
207
|
"contextWindowTokens": 200000,
|
|
207
|
-
"pricing": {
|
|
208
|
+
"pricing": {
|
|
209
|
+
"inputPer1MUsd": 3.0,
|
|
210
|
+
"outputPer1MUsd": 15.0,
|
|
211
|
+
"cacheReadPer1MUsd": 0.3,
|
|
212
|
+
},
|
|
208
213
|
},
|
|
209
214
|
{
|
|
210
215
|
"id": "provider:cm6def456",
|
|
@@ -213,10 +218,10 @@ platform-hosted offerings visible to the workspace's tier.
|
|
|
213
218
|
"vendorModelId": "gpt-5.5",
|
|
214
219
|
"source": "workspace_provider",
|
|
215
220
|
"contextWindowTokens": 200000,
|
|
216
|
-
"pricing": null
|
|
217
|
-
}
|
|
221
|
+
"pricing": null,
|
|
222
|
+
},
|
|
218
223
|
],
|
|
219
|
-
"defaultModelId": "platform:cm6abc123"
|
|
224
|
+
"defaultModelId": "platform:cm6abc123",
|
|
220
225
|
}
|
|
221
226
|
```
|
|
222
227
|
|
|
@@ -240,11 +245,11 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
|
|
|
240
245
|
|
|
241
246
|
```jsonc
|
|
242
247
|
{
|
|
243
|
-
"name": "ephemeral",
|
|
244
|
-
"agentId": "agent_cm6abc123",
|
|
245
|
-
"systemPrompt": "You are helpful.",
|
|
246
|
-
"modelId": "platform:cm6abc123",
|
|
247
|
-
"reasoningLevel": "medium",
|
|
248
|
+
"name": "ephemeral", // optional, observability only
|
|
249
|
+
"agentId": "agent_cm6abc123", // optional — see §4.1
|
|
250
|
+
"systemPrompt": "You are helpful.", // required unless agentId is set
|
|
251
|
+
"modelId": "platform:cm6abc123", // optional, see §3
|
|
252
|
+
"reasoningLevel": "medium", // optional, see §4.4
|
|
248
253
|
"tools": [
|
|
249
254
|
{ "kind": "mantyx", "id": "tool_cm6..." },
|
|
250
255
|
{ "kind": "mantyx_plugin", "name": "web_search" },
|
|
@@ -252,20 +257,22 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
|
|
|
252
257
|
"kind": "local",
|
|
253
258
|
"name": "read_file",
|
|
254
259
|
"description": "Read a file from the user's machine",
|
|
255
|
-
"parameters": {
|
|
260
|
+
"parameters": {
|
|
261
|
+
// JSON Schema for the args object
|
|
256
262
|
"type": "object",
|
|
257
263
|
"properties": { "path": { "type": "string" } },
|
|
258
264
|
"required": ["path"],
|
|
259
|
-
"additionalProperties": false
|
|
265
|
+
"additionalProperties": false,
|
|
260
266
|
},
|
|
261
|
-
"outputSchema": {
|
|
267
|
+
"outputSchema": {
|
|
268
|
+
// optional — JSON Schema for the return value
|
|
262
269
|
"type": "object",
|
|
263
270
|
"properties": {
|
|
264
|
-
"bytes": { "type": "string", "description": "UTF-8 file contents" }
|
|
271
|
+
"bytes": { "type": "string", "description": "UTF-8 file contents" },
|
|
265
272
|
},
|
|
266
|
-
"required": ["bytes"]
|
|
273
|
+
"required": ["bytes"],
|
|
267
274
|
},
|
|
268
|
-
"longRunning": false
|
|
275
|
+
"longRunning": false, // optional — default false
|
|
269
276
|
},
|
|
270
277
|
{
|
|
271
278
|
"kind": "a2a",
|
|
@@ -273,12 +280,13 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
|
|
|
273
280
|
"description": "Delegate billing questions to the Acme billing agent.",
|
|
274
281
|
"agentCardUrl": "https://billing.acme.com/.well-known/agent-card.json",
|
|
275
282
|
"headers": { "Authorization": "Bearer ${BILLING_TOKEN}" },
|
|
276
|
-
"contextId": "ctx_abc"
|
|
283
|
+
"contextId": "ctx_abc", // optional A2A context to thread turns
|
|
277
284
|
},
|
|
278
285
|
{
|
|
279
286
|
"kind": "a2a_local",
|
|
280
287
|
"name": "intranet_hr_agent",
|
|
281
|
-
"agentCard": {
|
|
288
|
+
"agentCard": {
|
|
289
|
+
// SDK-resolved A2A Agent Card content
|
|
282
290
|
"protocolVersion": "0.3.0",
|
|
283
291
|
"name": "Acme HR",
|
|
284
292
|
"description": "Answers questions about HR policies and benefits.",
|
|
@@ -289,72 +297,79 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
|
|
|
289
297
|
{
|
|
290
298
|
"id": "pto_lookup",
|
|
291
299
|
"name": "PTO lookup",
|
|
292
|
-
"description": "Find a teammate's remaining PTO days for the year."
|
|
300
|
+
"description": "Find a teammate's remaining PTO days for the year.",
|
|
293
301
|
},
|
|
294
302
|
{
|
|
295
303
|
"id": "benefits_qa",
|
|
296
304
|
"name": "Benefits Q&A",
|
|
297
|
-
"description": "Answer questions about insurance, 401k, and parental leave."
|
|
298
|
-
}
|
|
299
|
-
]
|
|
300
|
-
}
|
|
305
|
+
"description": "Answer questions about insurance, 401k, and parental leave.",
|
|
306
|
+
},
|
|
307
|
+
],
|
|
308
|
+
},
|
|
301
309
|
},
|
|
302
310
|
{
|
|
303
311
|
"kind": "mcp",
|
|
304
|
-
"name": "github",
|
|
312
|
+
"name": "github", // → tools become github_<tool>
|
|
305
313
|
"url": "https://mcp.github.com/v1",
|
|
306
314
|
"headers": { "Authorization": "Bearer ${GH_PAT}" },
|
|
307
|
-
"toolFilter": ["search_repos", "read_file"]
|
|
315
|
+
"toolFilter": ["search_repos", "read_file"], // optional allowlist
|
|
308
316
|
},
|
|
309
317
|
{
|
|
310
318
|
"kind": "mcp_local",
|
|
311
|
-
"name": "fs",
|
|
312
|
-
"serverInfo": {
|
|
319
|
+
"name": "fs", // SDK-side server label only — NOT a prefix
|
|
320
|
+
"serverInfo": {
|
|
321
|
+
// optional; from MCP Initialize
|
|
313
322
|
"name": "mcp-server-filesystem",
|
|
314
|
-
"version": "0.4.1"
|
|
323
|
+
"version": "0.4.1",
|
|
315
324
|
},
|
|
316
|
-
"tools": [
|
|
325
|
+
"tools": [
|
|
326
|
+
// verbatim MCP tools/list response
|
|
317
327
|
{
|
|
318
|
-
"name": "fs_read_file",
|
|
328
|
+
"name": "fs_read_file", // model-facing name, exactly as declared
|
|
319
329
|
"description": "Read a file from the user's workstation",
|
|
320
|
-
"inputSchema": {
|
|
330
|
+
"inputSchema": {
|
|
331
|
+
// MCP's term — JSON Schema
|
|
321
332
|
"type": "object",
|
|
322
333
|
"properties": { "path": { "type": "string" } },
|
|
323
|
-
"required": ["path"]
|
|
324
|
-
}
|
|
325
|
-
}
|
|
326
|
-
]
|
|
327
|
-
}
|
|
334
|
+
"required": ["path"],
|
|
335
|
+
},
|
|
336
|
+
},
|
|
337
|
+
],
|
|
338
|
+
},
|
|
328
339
|
],
|
|
329
|
-
"budgets": { "maxToolTurns": 32 },
|
|
330
|
-
"outputSchema": {
|
|
340
|
+
"budgets": { "maxToolTurns": 32 }, // optional safety cap
|
|
341
|
+
"outputSchema": {
|
|
342
|
+
// optional, see §4.5
|
|
331
343
|
"name": "weather_report",
|
|
332
344
|
"schema": {
|
|
333
345
|
"type": "object",
|
|
334
346
|
"properties": {
|
|
335
347
|
"city": { "type": "string" },
|
|
336
|
-
"temperature_c": { "type": "number" }
|
|
348
|
+
"temperature_c": { "type": "number" },
|
|
337
349
|
},
|
|
338
|
-
"required": ["city", "temperature_c"]
|
|
339
|
-
}
|
|
350
|
+
"required": ["city", "temperature_c"],
|
|
351
|
+
},
|
|
340
352
|
},
|
|
341
|
-
"loopDetection": {
|
|
353
|
+
"loopDetection": {
|
|
354
|
+
// optional, see §4.6
|
|
342
355
|
"consecutiveThreshold": 3,
|
|
343
|
-
"hardCutoffThreshold": 6
|
|
356
|
+
"hardCutoffThreshold": 6,
|
|
344
357
|
},
|
|
345
|
-
"toolBudgets": {
|
|
346
|
-
|
|
358
|
+
"toolBudgets": {
|
|
359
|
+
// optional, see §4.7
|
|
360
|
+
"recall": { "maxCalls": 4 },
|
|
347
361
|
"hive_consult_ontology": { "maxCalls": 4 },
|
|
348
|
-
"scary_tool":
|
|
362
|
+
"scary_tool": { "maxCalls": 0 },
|
|
349
363
|
},
|
|
350
|
-
"metadata": {
|
|
364
|
+
"metadata": {
|
|
365
|
+
// optional, see §4.8
|
|
351
366
|
"customer": "acme",
|
|
352
|
-
"env": "prod"
|
|
353
|
-
}
|
|
367
|
+
"env": "prod",
|
|
368
|
+
},
|
|
354
369
|
}
|
|
355
370
|
```
|
|
356
371
|
|
|
357
|
-
`POST /agent-runs` additionally accepts `prompt`
|
|
372
|
+
`POST /agent-runs` additionally accepts `prompt` _or_ `messages` (an array of
|
|
358
373
|
`{role, content}`). Sending both is a `400 invalid_request`.
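
A caller can catch the "both fields" mistake before the request leaves the process. This is a minimal sketch; `RunInput` and `checkRunInput` are hypothetical names, and typing `content` as a plain string is an assumption made for the example.

```typescript
interface RunInput {
  prompt?: string;
  // Assumption: content is modelled as a string for this sketch.
  messages?: { role: string; content: string }[];
}

// Returns a problem description, or null when the input is acceptable.
function checkRunInput(input: RunInput): string | null {
  if (input.prompt !== undefined && input.messages !== undefined) {
    return "send either `prompt` or `messages`, not both (the server answers 400 invalid_request)";
  }
  return null;
}
```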
|
|
359
374
|
|
|
360
375
|
### 4.1 Triggering a persisted MANTYX agent (`agentId`)
|
|
@@ -366,7 +381,7 @@ defining an ephemeral one inline. When `agentId` is set:
|
|
|
366
381
|
stored system prompt at run time.
|
|
367
382
|
- `modelId` becomes optional. If omitted, the server uses the agent's
|
|
368
383
|
configured LLM provider (or the workspace automation provider if the agent
|
|
369
|
-
has
|
|
384
|
+
has _Use workspace default model_ turned on).
|
|
370
385
|
- The agent's own tools are loaded from its workspace configuration —
|
|
371
386
|
including memory, skills, and plugin tools — and your `tools` array is
|
|
372
387
|
**merged on top**. This is typically used to attach `local` tools so the
|
|
@@ -389,14 +404,14 @@ the handler in its own process. MANTYX never executes the body — it
|
|
|
389
404
|
emits a `local_tool_call` event when the model picks the tool and waits
|
|
390
405
|
for the SDK to POST a tool-result.
|
|
391
406
|
|
|
392
|
-
| Field | Required | Notes
|
|
393
|
-
| -------------- | -------- |
|
|
394
|
-
| `kind` | yes | Discriminator literal `"local"`.
|
|
395
|
-
| `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`.
|
|
396
|
-
| `description` | no | Free-form. Empty when omitted (acceptable, but reduces tool-selection accuracy).
|
|
397
|
-
| `parameters` | no | JSON Schema for the tool's input. Must be a `type: "object"` schema with `properties`; non-object roots are coerced to an empty object schema server-side. Forwarded **verbatim** to the LLM provider so nested constraints (`array.items`, `enum`, `anyOf`, numeric formats, …) survive. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call.
|
|
398
|
-
| `outputSchema` | no | JSON Schema for the structured value the tool returns. When present, forwarded to providers that accept per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. Helps the model emit follow-up arguments that round-trip cleanly. Must be an object schema; non-object roots are dropped server-side.
|
|
399
|
-
| `longRunning` | no | When `true`, MANTYX appends a stable hint to the model-facing description so every provider treats the tool as long-running:<br
|
|
407
|
+
| Field | Required | Notes |
|
|
408
|
+
| -------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
409
|
+
| `kind` | yes | Discriminator literal `"local"`. |
|
|
410
|
+
| `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
411
|
+
| `description` | no | Free-form. Empty when omitted (acceptable, but reduces tool-selection accuracy). |
|
|
412
|
+
| `parameters` | no | JSON Schema for the tool's input. Must be a `type: "object"` schema with `properties`; non-object roots are coerced to an empty object schema server-side. Forwarded **verbatim** to the LLM provider so nested constraints (`array.items`, `enum`, `anyOf`, numeric formats, …) survive. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call. |
|
|
413
|
+
| `outputSchema` | no | JSON Schema for the structured value the tool returns. When present, forwarded to providers that accept per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. Helps the model emit follow-up arguments that round-trip cleanly. Must be an object schema; non-object roots are dropped server-side. |
|
|
414
|
+
| `longRunning` | no | When `true`, MANTYX appends a stable hint to the model-facing description so every provider treats the tool as long-running:<br>_"NOTE: This is a long-running operation. Do not call this tool again if it has already returned an intermediate or pending status."_<br>Useful for tools that return `pending` and rely on SDK-side polling — without the hint the model routinely fires repeat calls and burns turns. Pure declarative — MANTYX does not change scheduling. |
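
The constraints in this table lend themselves to a client-side pre-flight check. The sketch below assumes nothing about how the SDK validates refs internally; `LocalToolRef` and `checkLocalToolRef` are hypothetical names, and the check only flags conditions that the table says the server would coerce, drop, or handle poorly.

```typescript
const TOOL_NAME = /^[a-zA-Z0-9_]{1,64}$/;

interface LocalToolRef {
  kind: "local";
  name: string;
  description?: string;
  parameters?: Record<string, unknown>;
  outputSchema?: Record<string, unknown>;
  longRunning?: boolean;
}

// Returns human-readable problems; an empty array means nothing was flagged.
function checkLocalToolRef(ref: LocalToolRef): string[] {
  const problems: string[] = [];
  if (!TOOL_NAME.test(ref.name)) {
    problems.push(`name: "${ref.name}" does not match ${TOOL_NAME}`);
  }
  if (!ref.description) {
    problems.push("description: empty descriptions reduce tool-selection accuracy");
  }
  if (ref.parameters !== undefined && ref.parameters["type"] !== "object") {
    problems.push("parameters: non-object root is coerced to an empty object schema");
  }
  if (ref.outputSchema !== undefined && ref.outputSchema["type"] !== "object") {
    problems.push("outputSchema: non-object root is dropped server-side");
  }
  return problems;
}
```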
|
|
400
415
|
|
|
401
416
|
The `outputSchema` and `longRunning` fields are **additive** since wire
|
|
402
417
|
protocol v1: SDKs that don't ship them keep working unchanged. Providers
|
|
@@ -410,10 +425,10 @@ A2A delegation lets the agent hand a task to another
|
|
|
410
425
|
[Agent2Agent](https://google.github.io/A2A/) peer. The wire protocol exposes
|
|
411
426
|
two kinds depending on **who can reach the peer**:
|
|
412
427
|
|
|
413
|
-
- `kind: "a2a"` —
|
|
428
|
+
- `kind: "a2a"` — _remote_ (server-resolved). MANTYX dials `agentCardUrl`
|
|
414
429
|
directly. Pick this when the peer is on the public internet or in the
|
|
415
430
|
same VPC as MANTYX.
|
|
416
|
-
- `kind: "a2a_local"` —
|
|
431
|
+
- `kind: "a2a_local"` — _local_ (client-resolved). The SDK invokes the peer
|
|
417
432
|
on its side and posts back the reply. Pick this when the peer lives on an
|
|
418
433
|
intranet, behind a VPN, or on the user's device — anywhere MANTYX can't
|
|
419
434
|
reach but the SDK can.
|
|
@@ -431,14 +446,14 @@ POSTs the model's `message` argument to `agentCardUrl` over A2A's standard
|
|
|
431
446
|
and `/message/send` endpoints are probed in order) and forwards the remote
|
|
432
447
|
agent's text reply back as the tool result.
|
|
433
448
|
|
|
434
|
-
| Field
|
|
435
|
-
|
|
|
436
|
-
| `kind`
|
|
437
|
-
| `name`
|
|
438
|
-
| `description`
|
|
439
|
-
| `agentCardUrl`
|
|
440
|
-
| `headers`
|
|
441
|
-
| `contextId`
|
|
449
|
+
| Field | Required | Notes |
|
|
450
|
+
| -------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
451
|
+
| `kind` | yes | Discriminator literal `"a2a"`. |
|
|
452
|
+
| `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
453
|
+
| `description` | no | Model-facing description. Defaults to `"Delegate a task to the <name> agent over A2A. Pass the full task as a single message."`. Mention the remote agent's purpose so the model picks it for the right turn. |
|
|
454
|
+
| `agentCardUrl` | yes | URL of the remote Agent Card (`/.well-known/agent-card.json`) or the JSON-RPC root the peer accepts. |
|
|
455
|
+
| `headers` | no | Flat string→string HTTP headers sent on every A2A request — typically `Authorization`. Each value is capped at 8 KB. |
|
|
456
|
+
| `contextId` | no | A2A `contextId` to thread multiple delegations into the same remote conversation. Omit for fresh per-call context. |
|
|
442
457
|
|
|
443
458
|
> **Secret handling.** `headers` are forwarded **as-is** by the SDK API. If
|
|
444
459
|
> you need long-lived credentials (refresh tokens, rotating API keys),
|
|
@@ -476,30 +491,30 @@ Per-run lifecycle:
|
|
|
476
491
|
5. **Continuation (MANTYX).** MANTYX feeds the reply back into the model
|
|
477
492
|
loop as the tool result.
|
|
478
493
|
|
|
479
|
-
| Field
|
|
480
|
-
|
|
|
481
|
-
| `kind`
|
|
482
|
-
| `name`
|
|
483
|
-
| `description`
|
|
484
|
-
| `agentCard`
|
|
494
|
+
| Field | Required | Notes |
|
|
495
|
+
| ------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
496
|
+
| `kind` | yes | Discriminator literal `"a2a_local"`. |
|
|
497
|
+
| `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
498
|
+
| `description` | no | Model-facing description override. When omitted, MANTYX synthesizes one from `agentCard.name`, `agentCard.description`, and the first 12 skills. |
|
|
499
|
+
| `agentCard` | yes | The resolved A2A Agent Card (JSON content). Schema follows the [A2A Agent Card spec](https://google.github.io/A2A/specification/#agent-card) — passthrough for unknown fields, so any spec-compliant card works. See the _Agent Card shape_ table below for the fields MANTYX actually reads. |
|
|
485
500
|
|
|
486
501
|
**Agent Card shape** (only the fields MANTYX inspects; everything else is
|
|
487
502
|
forwarded verbatim back to the SDK):
|
|
488
503
|
|
|
489
|
-
| Card field
|
|
490
|
-
|
|
|
491
|
-
| `protocolVersion`
|
|
492
|
-
| `name`
|
|
493
|
-
| `description`
|
|
494
|
-
| `url`
|
|
495
|
-
| `version`
|
|
496
|
-
| `provider`
|
|
497
|
-
| `capabilities`
|
|
498
|
-
| `defaultInputModes`
|
|
499
|
-
| `defaultOutputModes`
|
|
500
|
-
| `skills[]`
|
|
501
|
-
| `securitySchemes`, `security` | echo only
|
|
502
|
-
|
|
|
504
|
+
| Card field | Used by MANTYX | Notes |
|
|
505
|
+
| ----------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
|
|
506
|
+
| `protocolVersion` | echo only | A2A protocol version (e.g. `"0.3.0"`). |
|
|
507
|
+
| `name` | description | Used when synthesizing the tool description (`"Delegate a task to the <name> agent ..."`). |
|
|
508
|
+
| `description` | description | One-paragraph summary of what the peer does — surfaced to the model. |
|
|
509
|
+
| `url` | echo only | Peer's A2A endpoint. Forwarded back to the SDK in the `local_tool_call` event so the SDK can dispatch by URL. Never fetched server-side. |
|
|
510
|
+
| `version` | echo only | Peer agent version. |
|
|
511
|
+
| `provider` | echo only | Vendor info. |
|
|
512
|
+
| `capabilities` | echo only | A2A capability flags (streaming, push notifications, …). |
|
|
513
|
+
| `defaultInputModes` | echo only | Modalities the peer accepts. |
|
|
514
|
+
| `defaultOutputModes` | echo only | Modalities the peer returns. |
|
|
515
|
+
| `skills[]` | description | First 12 skills (`name`, `description`) are bulleted into the tool description so the model knows what to ask for. |
|
|
516
|
+
| `securitySchemes`, `security` | echo only | Forwarded to the SDK; MANTYX does no auth. |
|
|
517
|
+
| _anything else_ | echo only | Passthrough — survives round-trip unchanged. |
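
To make the "description" rows concrete, here is a rough sketch of how a tool description could be assembled from a card's `name`, `description`, and first 12 skills. The exact wording and layout MANTYX produces are not given in this document, so the output format below is an assumption; `AgentCardLike` and `sketchToolDescription` are hypothetical names.

```typescript
interface AgentCardLike {
  name: string;
  description?: string;
  skills?: { name: string; description?: string }[];
}

// Illustrative only: the real synthesized text may differ in wording and layout.
function sketchToolDescription(card: AgentCardLike): string {
  const lines: string[] = [`Delegate a task to the ${card.name} agent.`];
  if (card.description) lines.push(card.description);
  // Only the first 12 skills are bulleted into the description.
  for (const skill of (card.skills ?? []).slice(0, 12)) {
    lines.push(skill.description ? `- ${skill.name}: ${skill.description}` : `- ${skill.name}`);
  }
  return lines.join("\n");
}
```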
|
|
503
518
|
|
|
504
519
|
Local A2A respects the same `localToolTimeoutMs` budget (default 5 minutes)
|
|
505
520
|
as `kind: "local"`. Tool-result POSTs after timeout return `409 run_terminal`.
|
|
@@ -510,25 +525,25 @@ as `kind: "local"`. Tool-result POSTs after timeout return `409 run_terminal`.
|
|
|
510
525
|
expose every tool published by an MCP server to the agent loop in one go.
|
|
511
526
|
Like A2A, the protocol distinguishes by **where the server lives**:
|
|
512
527
|
|
|
513
|
-
- `kind: "mcp"` —
|
|
528
|
+
- `kind: "mcp"` — _remote_ MCP (Streamable HTTP). MANTYX has network access
|
|
514
529
|
to the server, dials it, lists the catalog at run start, and proxies each
|
|
515
530
|
call server-side. **MANTYX prefixes every discovered tool name with the
|
|
516
531
|
ref's `name`** (e.g. `github_search_repos`) so multiple MCP servers
|
|
517
532
|
can coexist without colliding.
|
|
518
|
-
- `kind: "mcp_local"` —
|
|
533
|
+
- `kind: "mcp_local"` — _local_ MCP (stdio, on-device, intranet). MANTYX
|
|
519
534
|
has **no** access to the server; the SDK does discovery, validation, and
|
|
520
535
|
execution. The SDK declares the tool catalog with **the exact names it
|
|
521
536
|
wants the model to see** — MANTYX does not auto-prefix.
|
|
522
537
|
|
|
523
538
|
#### `kind: "mcp"` — remote MCP
|
|
524
539
|
|
|
525
|
-
| Field
|
|
526
|
-
|
|
|
527
|
-
| `kind`
|
|
528
|
-
| `name`
|
|
529
|
-
| `url`
|
|
530
|
-
| `headers`
|
|
531
|
-
| `toolFilter`
|
|
540
|
+
| Field | Required | Notes |
|
|
541
|
+
| ------------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
542
|
+
| `kind` | yes | Discriminator literal `"mcp"`. |
|
|
543
|
+
| `name` | yes | Server label — MANTYX prefixes every discovered tool name as `<name>_<tool>`. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
544
|
+
| `url` | yes | Streamable HTTP MCP endpoint. |
|
|
545
|
+
| `headers` | no | Flat string→string HTTP headers (e.g. `Authorization`). Each value capped at 8 KB. |
|
|
546
|
+
| `toolFilter` | no | Allowlist of MCP tool names (un-prefixed, as the server returns them). When set, tools not in the list are silently dropped. When omitted, every published tool is exposed. |
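
The interaction between `name` prefixing and `toolFilter` can be sketched in a few lines. This is an illustration of the rule described in the table, not MANTYX's server code; `McpRef` and `exposedToolNames` are hypothetical names, and `catalog` stands for the un-prefixed tool names the MCP server returns.

```typescript
interface McpRef {
  kind: "mcp";
  name: string;
  url: string;
  headers?: Record<string, string>;
  toolFilter?: string[];
}

// Applies the allowlist to un-prefixed names, then prefixes as <name>_<tool>.
function exposedToolNames(ref: McpRef, catalog: string[]): string[] {
  const allow = ref.toolFilter === undefined ? null : new Set(ref.toolFilter);
  return catalog
    .filter((tool) => allow === null || allow.has(tool))
    .map((tool) => `${ref.name}_${tool}`);
}
```

Filtering happens on the names as the server publishes them, so a `toolFilter` entry such as `github_search_repos` would match nothing.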
|
|
532
547
|
|
|
533
548
|
If the MCP server is unreachable when the run starts, MANTYX still exposes
|
|
534
549
|
a single stub tool named `<server>_unavailable` so the model can report the
|
|
@@ -566,16 +581,17 @@ Per-run lifecycle:
|
|
|
566
581
|
"type": "local_tool_call",
|
|
567
582
|
"data": {
|
|
568
583
|
"toolUseId": "tu_x",
|
|
569
|
-
"name": "fs_read_file",
|
|
584
|
+
"name": "fs_read_file", // SDK-declared name; same string the model called
|
|
570
585
|
"args": { "path": "/etc/hosts" },
|
|
571
586
|
"kind": "mcp_local",
|
|
572
|
-
"mcpServer": "fs",
|
|
587
|
+
"mcpServer": "fs", // the SDK-side label from the ref's `name`
|
|
573
588
|
"mcpToolName": "fs_read_file", // duplicates `name` for the SDK's convenience
|
|
574
|
-
"mcpServerInfo": {
|
|
589
|
+
"mcpServerInfo": {
|
|
590
|
+
// present iff the ref carried `serverInfo`
|
|
575
591
|
"name": "mcp-server-filesystem",
|
|
576
|
-
"version": "0.4.1"
|
|
577
|
-
}
|
|
578
|
-
}
|
|
592
|
+
"version": "0.4.1",
|
|
593
|
+
},
|
|
594
|
+
},
|
|
579
595
|
}
|
|
580
596
|
```
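
One way an SDK might route the event above is to key its MCP clients by the ref's `name`, which comes back as `mcpServer`. This is a minimal sketch: `McpCaller`, `mcpServers`, and `dispatchMcpLocal` are hypothetical names, and the stub handler stands in for a real stdio MCP client.

```typescript
// Shape of `data` on a `local_tool_call` event, limited to the fields shown above.
interface LocalToolCallData {
  toolUseId: string;
  name: string;
  args: Record<string, unknown>;
  kind?: string;
  mcpServer?: string;
  mcpToolName?: string;
  mcpServerInfo?: { name: string; version?: string };
}

// Hypothetical handler signature: forwards one call to an SDK-owned MCP client.
type McpCaller = (toolName: string, args: Record<string, unknown>) => unknown;

// Hypothetical registry keyed by the ref's `name` (echoed back as `mcpServer`).
const mcpServers = new Map<string, McpCaller>();

function dispatchMcpLocal(data: LocalToolCallData): unknown {
  if (data.kind !== "mcp_local" || data.mcpServer === undefined) {
    throw new Error(`not an mcp_local call: ${data.name}`);
  }
  const call = mcpServers.get(data.mcpServer);
  if (call === undefined) {
    throw new Error(`no MCP server registered under "${data.mcpServer}"`);
  }
  // `name` is the SDK-declared, model-facing tool name.
  return call(data.name, data.args);
}

// Stub standing in for a real MCP client.
mcpServers.set("fs", (tool, args) => ({ tool, path: args["path"] }));
const routed = dispatchMcpLocal({
  toolUseId: "tu_x",
  name: "fs_read_file",
  args: { path: "/etc/hosts" },
  kind: "mcp_local",
  mcpServer: "fs",
  mcpToolName: "fs_read_file",
});
```

If the SDK renames tools before declaring them, it would also need its own map from the declared name back to the server's native tool name; that step is omitted here.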
|
|
581
597
|
|
|
@@ -587,12 +603,12 @@ Per-run lifecycle:
|
|
|
587
603
|
updated `mcp_local` ref inside `POST /agent-sessions/:id/messages`'s
|
|
588
604
|
`tools` field; the catalog snapshot lives on the run, not the session.
|
|
589
605
|
|
|
590
|
-
| Field
|
|
591
|
-
|
|
|
592
|
-
| `kind`
|
|
593
|
-
| `name`
|
|
594
|
-
| `serverInfo`
|
|
595
|
-
| `tools`
|
|
606
|
+
| Field | Required | Notes |
|
|
607
|
+
| ------------ | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
608
|
+
| `kind` | yes | Discriminator literal `"mcp_local"`. |
|
|
609
|
+
| `name` | yes | SDK-side server label (e.g. `"fs"`, `"jira"`). Echoed back unchanged as `mcpServer` on every `local_tool_call`. **Not used to prefix tool names.** Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
|
|
610
|
+
| `serverInfo` | no | The MCP `Implementation` block the SDK got from `Initialize` (`{ name, version? }`, plus any extra fields the server returned). Forwarded to the SDK in `local_tool_call.mcpServerInfo` for observability; not used to drive behavior. |
|
|
611
|
+
| `tools` | yes | Verbatim MCP `tools/list` output (1–64 entries). Each item is the standard MCP `Tool` shape: `{ name, description?, inputSchema?, annotations?, … }`. `name` is the model-facing tool name (SDK owns naming). `inputSchema` is the MCP-spec JSON Schema for the tool's arguments — used to constrain the LLM's tool call. Empty `inputSchema` means a no-arg tool. |
|
|
596
612
|
|
|
597
613
|
Older SDKs that ignore the `kind` discriminator still see a normal
|
|
598
614
|
`local_tool_call` and can match on `name` alone.
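
The `mcp_local` field table can be read as the following construction step. This is a sketch only: `buildMcpLocalRef` is a hypothetical helper, and the checks simply restate the documented limits (label pattern, 1 to 64 tools) on the client side before the server would reject them.

```typescript
// Subset of the standard MCP `Tool` shape named in the table above.
interface McpTool {
  name: string;
  description?: string;
  inputSchema?: Record<string, unknown>;
}

interface McpLocalRef {
  kind: "mcp_local";
  name: string;
  serverInfo?: { name: string; version?: string };
  tools: McpTool[];
}

const LABEL_PATTERN = /^[a-zA-Z0-9_]{1,64}$/;

// Hypothetical helper: wraps a `tools/list` result into an `mcp_local` ref.
function buildMcpLocalRef(
  label: string,
  tools: McpTool[],
  serverInfo?: { name: string; version?: string },
): McpLocalRef {
  if (!LABEL_PATTERN.test(label)) {
    throw new Error(`label must match ${LABEL_PATTERN}`);
  }
  if (tools.length < 1 || tools.length > 64) {
    throw new Error(`expected 1 to 64 tools, got ${tools.length}`);
  }
  const ref: McpLocalRef = { kind: "mcp_local", name: label, tools };
  if (serverInfo !== undefined) ref.serverInfo = serverInfo;
  return ref;
}

const fsRef = buildMcpLocalRef(
  "fs",
  [{ name: "fs_read_file", inputSchema: { type: "object", properties: { path: { type: "string" } } } }],
  { name: "mcp-server-filesystem", version: "0.4.1" },
);
```

The tool names are passed through verbatim because the SDK, not MANTYX, owns naming for `mcp_local` refs.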
|
|
@@ -612,10 +628,10 @@ provider:
|
|
|
612
628
|
|
|
613
629
|
Two equivalent input shapes are accepted:
|
|
614
630
|
|
|
615
|
-
| Form
|
|
616
|
-
|
|
|
617
|
-
| **String**
|
|
618
|
-
| **Number**
|
|
631
|
+
| Form | Values | Notes |
|
|
632
|
+
| ---------- | -------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
|
|
633
|
+
| **String** | `"off"`, `"low"`, `"medium"`, `"high"` | Snaps to the same anchors the web composer uses (Fast=30, Moderate=50, Smart=80; off=0). |
|
|
634
|
+
| **Number** | integer `0`–`100` | Pass-through to `RunAgentOptions.reasoningLevel`. `0` explicitly disables provider thinking even on reasoning models. |
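
For display or logging it can help to see both input forms on one scale. The sketch below assumes that `"low"`, `"medium"`, and `"high"` correspond to the Fast, Moderate, and Smart anchors in that order; the table lists the anchors but does not spell out the pairing. `toNumericReasoningLevel` is a hypothetical helper, and an SDK should still send the caller's original value to the server unchanged.

```typescript
type ReasoningLevel = "off" | "low" | "medium" | "high" | number;

// Assumed pairing of string levels to the anchors quoted in the table.
const REASONING_ANCHORS: Record<"off" | "low" | "medium" | "high", number> = {
  off: 0,
  low: 30,
  medium: 50,
  high: 80,
};

// Hypothetical helper: maps either input form onto the 0..100 scale.
function toNumericReasoningLevel(level: ReasoningLevel): number {
  if (typeof level === "number") {
    if (!Number.isInteger(level) || level < 0 || level > 100) {
      throw new RangeError(`reasoningLevel must be an integer in 0..100, got ${level}`);
    }
    return level;
  }
  return REASONING_ANCHORS[level];
}
```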
|
|
619
635
|
|
|
620
636
|
When omitted, MANTYX falls back to the agent's default — for ephemeral
|
|
621
637
|
specs, that means thinking is off; for `agentId`-backed specs, it follows
|
|
@@ -649,29 +665,29 @@ reply directly into downstream code without LLM-flavoured prose to parse out.
|
|
|
649
665
|
}
|
|
650
666
|
```
|
|
651
667
|
|
|
652
|
-
| Field | Required | Notes
|
|
653
|
-
| -------- | -------- |
|
|
654
|
-
| `name` | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. Must match `/^[a-zA-Z0-9_-]{1,64}$/`.
|
|
668
|
+
| Field | Required | Notes |
|
|
669
|
+
| -------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
670
|
+
| `name` | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. Must match `/^[a-zA-Z0-9_-]{1,64}$/`. |
|
|
655
671
|
| `schema` | yes | JSON Schema describing the final assistant text. Root must be a JSON **object** (most providers reject array / scalar roots in structured-output mode). The schema is passed through verbatim — MANTYX does not validate its contents; the provider does. |
|
|
656
672
|
|
|
657
673
|
Validation (server-side, `400 invalid_request` on violation):
|
|
658
674
|
|
|
659
|
-
| Constraint
|
|
660
|
-
|
|
|
661
|
-
| Serialized JSON size of `outputSchema` | ≤ 32 KB
|
|
662
|
-
| `name` regex
|
|
663
|
-
| `schema` shape
|
|
675
|
+
| Constraint | Limit |
|
|
676
|
+
| -------------------------------------- | --------------------------------- |
|
|
677
|
+
| Serialized JSON size of `outputSchema` | ≤ 32 KB |
|
|
678
|
+
| `name` regex | `/^[a-zA-Z0-9_-]{1,64}$/` |
|
|
679
|
+
| `schema` shape | non-`null`, non-array JSON object |
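
A client can mirror these limits before sending the request. The following is a minimal sketch, not the server's validator: `checkOutputSchema` is a hypothetical name, and it assumes the 32 KB limit is measured in UTF-8 bytes with 1 KB = 1024 bytes.

```typescript
interface OutputSchema {
  name?: string;
  schema: unknown;
}

const MAX_OUTPUT_SCHEMA_BYTES = 32 * 1024; // assumption: 1 KB = 1024 bytes
const OUTPUT_NAME_PATTERN = /^[a-zA-Z0-9_-]{1,64}$/;

// Returns a list of human-readable problems; an empty list means the value passes.
function checkOutputSchema(outputSchema: OutputSchema): string[] {
  const problems: string[] = [];
  const { name, schema } = outputSchema;
  if (name !== undefined && !OUTPUT_NAME_PATTERN.test(name)) {
    problems.push("name must match /^[a-zA-Z0-9_-]{1,64}$/");
  }
  if (typeof schema !== "object" || schema === null || Array.isArray(schema)) {
    problems.push("schema must be a non-null, non-array JSON object");
  }
  const bytes = new TextEncoder().encode(JSON.stringify(outputSchema)).length;
  if (bytes > MAX_OUTPUT_SCHEMA_BYTES) {
    problems.push(`serialized outputSchema is ${bytes} bytes; limit is ${MAX_OUTPUT_SCHEMA_BYTES}`);
  }
  return problems;
}
```

The function does not inspect the schema's contents, which matches the note above that MANTYX passes the schema through and leaves content validation to the provider.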
|
|
664
680
|
|
|
665
681
|
**Per-provider behaviour** (mirrors the SDK's `RunAgentOptions.finalResponseSchema`):
|
|
666
682
|
|
|
667
|
-
| Provider
|
|
668
|
-
|
|
|
669
|
-
| OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every turn (works alongside tool calls).
|
|
670
|
-
| Gemini 3+ (any turn)
|
|
671
|
-
| Gemini ≤ 2.5 (no-tools turn)
|
|
672
|
-
| Gemini ≤ 2.5 (with tools)
|
|
673
|
-
| Anthropic / Bedrock-Anthropic
|
|
674
|
-
| xAI Grok, others
|
|
683
|
+
| Provider | How the schema is enforced |
|
|
684
|
+
| --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
685
|
+
| OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every turn (works alongside tool calls). |
|
|
686
|
+
| Gemini 3+ (any turn) | `responseMimeType: "application/json"` + `responseJsonSchema` on every `completeTurn`. Gemini 3 accepts the schema alongside `functionDeclarations`. |
|
|
687
|
+
| Gemini ≤ 2.5 (no-tools turn) | `responseMimeType: "application/json"` + `responseJsonSchema`. |
|
|
688
|
+
| Gemini ≤ 2.5 (with tools) | Synthetic `set_model_response` function declaration is injected; its `parametersJsonSchema` is the supplied schema. The system instruction is augmented to direct the model to call this tool with the final answer. The engine intercepts the call, hides it from the SDK, and surfaces the call's arguments as the assistant text (JSON-stringified). Sidesteps the API rejection ("Function calling with a response mime type: 'application/json' is unsupported") without round-tripping a 4xx. |
|
|
689
|
+
| Anthropic / Bedrock-Anthropic | Synthetic `final_report` tool whose `input_schema` is the supplied schema; `tool_choice` is forced on the no-tools finishing turn. The tool's input is surfaced as the assistant text. |
|
|
690
|
+
| xAI Grok, others | Ignored (the model returns plain text). |
|
|
675
691
|
|
|
676
692
|
The synthetic-tool paths (Gemini 2.5 + tools, Anthropic) are entirely
|
|
677
693
|
internal: the SDK never receives a `local_tool_call` for
|
|
@@ -727,17 +743,17 @@ The wire shape also accepts the literal `false`:
|
|
|
727
743
|
"loopDetection": false // explicitly disable the guard for this run
|
|
728
744
|
```
|
|
729
745
|
|
|
730
|
-
| Field | Type | Required | Notes
|
|
731
|
-
| ---------------------- | --------------- | -------- |
|
|
732
|
-
| `consecutiveThreshold` | integer ≥ 2 | no | Defaults to **3** when the field is omitted. Must be `>= 2` (one identical batch is just a single tool call, not a loop).
|
|
746
|
+
| Field | Type | Required | Notes |
|
|
747
|
+
| ---------------------- | --------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
|
|
748
|
+
| `consecutiveThreshold` | integer ≥ 2 | no | Defaults to **3** when the field is omitted. Must be `>= 2` (one identical batch is just a single tool call, not a loop). |
|
|
733
749
|
| `hardCutoffThreshold` | integer ≥ 3 | no | Defaults to **6** when the field is omitted. Must be `> consecutiveThreshold`; otherwise the soft nudge would never get a chance to land. |
|
|
734
|
-
| (top-level `false`) | literal `false` | no | Disables the guard entirely for this run. The pipeline still enforces `budgets.maxToolTurns`.
|
|
750
|
+
| (top-level `false`) | literal `false` | no | Disables the guard entirely for this run. The pipeline still enforces `budgets.maxToolTurns`. |
|
|
735
751
|
|
|
736
752
|
Validation (server-side, `400 invalid_request` on violation):
|
|
737
753
|
|
|
738
|
-
| Constraint
|
|
739
|
-
|
|
|
740
|
-
| `consecutiveThreshold` / `hardCutoffThreshold` upper bound
|
|
754
|
+
| Constraint | Limit |
|
|
755
|
+
| ------------------------------------------------------------------ | -------- |
|
|
756
|
+
| `consecutiveThreshold` / `hardCutoffThreshold` upper bound | `100` |
|
|
741
757
|
| `hardCutoffThreshold` strictly greater than `consecutiveThreshold` | enforced |
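
The field and constraint tables combine into a small resolution step. The sketch below assumes the per-field defaults (3 and 6) are filled in first and the cross-field rule is checked afterwards; the source does not state that ordering, and `resolveLoopDetection` is a hypothetical helper.

```typescript
interface LoopDetectionInput {
  consecutiveThreshold?: number;
  hardCutoffThreshold?: number;
}

interface ResolvedLoopDetection {
  consecutiveThreshold: number;
  hardCutoffThreshold: number;
}

function assertIntInRange(label: string, value: number, min: number, max: number): void {
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(`${label} must be an integer in ${min}..${max}, got ${value}`);
  }
}

// Applies the documented defaults, then the documented limits.
function resolveLoopDetection(input: false | LoopDetectionInput): false | ResolvedLoopDetection {
  if (input === false) return false; // guard explicitly disabled for this run
  const consecutiveThreshold = input.consecutiveThreshold ?? 3;
  const hardCutoffThreshold = input.hardCutoffThreshold ?? 6;
  assertIntInRange("consecutiveThreshold", consecutiveThreshold, 2, 100);
  assertIntInRange("hardCutoffThreshold", hardCutoffThreshold, 3, 100);
  if (hardCutoffThreshold <= consecutiveThreshold) {
    throw new RangeError("hardCutoffThreshold must be strictly greater than consecutiveThreshold");
  }
  return { consecutiveThreshold, hardCutoffThreshold };
}
```

Under this reading, `{ consecutiveThreshold: 6 }` on its own fails, because the defaulted hard cutoff of 6 is not strictly greater.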
|
|
742
758
|
|
|
743
759
|
**Defaults.** When `loopDetection` is omitted entirely, MANTYX applies the
|
|
@@ -776,31 +792,31 @@ tool result.
|
|
|
776
792
|
}
|
|
777
793
|
```
|
|
778
794
|
|
|
779
|
-
| Field | Type | Required | Notes
|
|
780
|
-
| ---------- | ----------- | -------- |
|
|
781
|
-
| `<key>` | string | yes | Logical tool name as the model sees it (the same name on `ResolvedTool.name`; the SDK + pipeline handle sanitisation). 1–120 characters.
|
|
795
|
+
| Field | Type | Required | Notes |
|
|
796
|
+
| ---------- | ----------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
797
|
+
| `<key>` | string | yes | Logical tool name as the model sees it (the same name on `ResolvedTool.name`; the SDK + pipeline handle sanitisation). 1–120 characters. |
|
|
782
798
|
| `maxCalls` | integer ≥ 0 | yes | Hard cap on executed calls per run. `0` disables the tool entirely (every attempt returns the synthetic body on the first try). Budgets are **per-tool, not pooled**: `hive_search_deals: { maxCalls: 5 }` and `hive_search_meetings: { maxCalls: 5 }` give the agent five of each, not five between them. |
|
|
783
799
|
|
|
784
800
|
Validation (server-side, `400 invalid_request` on violation):
|
|
785
801
|
|
|
786
|
-
| Constraint
|
|
787
|
-
|
|
|
788
|
-
| Max entries
|
|
789
|
-
| `<key>` length
|
|
802
|
+
| Constraint | Limit |
|
|
803
|
+
| ---------------------- | -------------------------------------------------------------------------- |
|
|
804
|
+
| Max entries | `32` |
|
|
805
|
+
| `<key>` length | `1..120` chars |
|
|
790
806
|
| `maxCalls` upper bound | `1000` (functionally unlimited; the SDK's `maxToolTurns: 100` fires first) |
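
These limits are simple enough to check client-side. The following sketch restates them in code; `checkToolBudgets` is a hypothetical helper, and the tool names in the usage example are the ones quoted in the `maxCalls` notes above.

```typescript
type ToolBudgets = Record<string, { maxCalls: number }>;

// Returns a list of problems; an empty list means the map passes the documented limits.
function checkToolBudgets(budgets: ToolBudgets): string[] {
  const problems: string[] = [];
  const entries = Object.entries(budgets);
  if (entries.length > 32) {
    problems.push(`too many entries: ${entries.length} (max 32)`);
  }
  for (const [tool, budget] of entries) {
    if (tool.length < 1 || tool.length > 120) {
      problems.push(`key length must be 1..120 chars: "${tool}"`);
    }
    const { maxCalls } = budget;
    if (!Number.isInteger(maxCalls) || maxCalls < 0 || maxCalls > 1000) {
      problems.push(`${tool}.maxCalls must be an integer in 0..1000, got ${maxCalls}`);
    }
  }
  return problems;
}

// Budgets are per-tool, not pooled: five calls each, not five in total.
const perToolBudgets: ToolBudgets = {
  hive_search_deals: { maxCalls: 5 },
  hive_search_meetings: { maxCalls: 5 },
};
const budgetProblems = checkToolBudgets(perToolBudgets);
```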
|
|
791
807
|
|
|
792
808
|
**Defaults.** When `toolBudgets` is omitted, MANTYX layers the runtime
|
|
793
809
|
defaults from `runtime/default-run-guards.ts` on top of the spec. The
|
|
794
810
|
default research-tool surface is:
|
|
795
811
|
|
|
796
|
-
| Tool
|
|
797
|
-
|
|
|
798
|
-
| `recall` (workspace memory hybrid search)
|
|
799
|
-
| `traverse` (memory graph BFS)
|
|
800
|
-
| `hive_consult_ontology` (per-hive ontology read; same name across all three hives)
|
|
801
|
-
| `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search)
|
|
802
|
-
| `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search)
|
|
803
|
-
| `hive_search_releases` / `_issues` (Product Hive general search)
|
|
812
|
+
| Tool | Default `maxCalls` |
|
|
813
|
+
| ---------------------------------------------------------------------------------------- | ------------------ |
|
|
814
|
+
| `recall` (workspace memory hybrid search) | `4` |
|
|
815
|
+
| `traverse` (memory graph BFS) | `3` |
|
|
816
|
+
| `hive_consult_ontology` (per-hive ontology read; same name across all three hives) | `4` |
|
|
817
|
+
| `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search) | `5` |
|
|
818
|
+
| `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search) | `5` |
|
|
819
|
+
| `hive_search_releases` / `_issues` (Product Hive general search) | `5` |
|
|
804
820
|
|
|
805
821
|
Pass `"toolBudgets": {}` to start from a clean slate (no defaults applied
|
|
806
822
|
on top — useful for runs that intentionally want unbounded research). When
|
|
@@ -838,12 +854,12 @@ prompt.
|
|
|
838
854
|
|
|
839
855
|
Validation (server-side, `400 invalid_request` on violation):
|
|
840
856
|
|
|
841
|
-
| Constraint
|
|
842
|
-
|
|
|
843
|
-
| Max entries
|
|
844
|
-
| Key pattern
|
|
845
|
-
| Value type / length
|
|
846
|
-
| Serialized JSON size
|
|
857
|
+
| Constraint | Limit |
|
|
858
|
+
| -------------------- | ------------------------ |
|
|
859
|
+
| Max entries | 16 |
|
|
860
|
+
| Key pattern | `^[A-Za-z0-9._-]{1,64}$` |
|
|
861
|
+
| Value type / length | string ≤ 256 chars |
|
|
862
|
+
| Serialized JSON size | ≤ 4 KB |
|
|
847
863
|
|
|
848
864
|
For session-scoped runs the inheritance rules are:
|
|
849
865
|
|
|
@@ -872,13 +888,18 @@ POST /api/v1/workspaces/{slug}/agent-runs/{runId}/cancel
|
|
|
872
888
|
`POST /agent-runs` returns `202 Accepted` immediately:
|
|
873
889
|
|
|
874
890
|
```json
|
|
875
|
-
{
|
|
891
|
+
{
|
|
892
|
+
"runId": "run_abc",
|
|
893
|
+
"streamUrl": "/api/v1/workspaces/acme/agent-runs/run_abc/stream"
|
|
894
|
+
}
|
|
876
895
|
```
|
|
877
896
|
|
|
878
897
|
`GET .../stream` is the canonical event channel; see §7.
|
|
879
898
|
|
|
880
899
|
`GET /agent-runs/{runId}` returns the run snapshot (status, final text, error,
|
|
881
|
-
spec
|
|
900
|
+
spec, plus the cost-attribution triple `tokens` / `turns` / `model` —
|
|
901
|
+
see §7.1) without subscribing to live events. Useful for polling long
|
|
902
|
+
runs or attributing spend after the SSE stream was already consumed.
|
|
882
903
|
|
|
883
904
|
## 6. Sessions
|
|
884
905
|
|
|
@@ -903,13 +924,15 @@ and returns `{ runId, streamUrl }` just like a one-shot run. Body:
|
|
|
903
924
|
```jsonc
|
|
904
925
|
{
|
|
905
926
|
"prompt": "What's in /etc/hosts?",
|
|
906
|
-
"tools": [
|
|
927
|
+
"tools": [
|
|
928
|
+
/* optional refresh of tool definitions */
|
|
929
|
+
],
|
|
907
930
|
}
|
|
908
931
|
```
|
|
909
932
|
|
|
910
933
|
The server prepends the session's prior messages, runs the model, and on
|
|
911
934
|
success appends the new user/assistant turns back to the session row. Local
|
|
912
|
-
tool **handlers** are
|
|
935
|
+
tool **handlers** are _not_ persisted: the session stores definitions
|
|
913
936
|
(name, schema, description) so that a restarted SDK can re-bind handlers and
|
|
914
937
|
keep going.
|
|
915
938
|
|
|
@@ -979,8 +1002,24 @@ data: <utf-8 JSON>
|
|
|
979
1002
|
{ "seq": 7, "type": "tool_budget_exceeded", "data": { "tool": "recall", "maxCalls": 4, "callIndex": 5 } }
|
|
980
1003
|
|
|
981
1004
|
// terminal event
|
|
982
|
-
|
|
983
|
-
|
|
1005
|
+
// Every terminal `result` event also carries `tokens`, `turns`, and `model`
|
|
1006
|
+
// for cost attribution and dashboards — see §7.1. Older platforms (pre-
|
|
1007
|
+
// 2026-09) omit these fields; SDK clients detect "no usage data" by
|
|
1008
|
+
// checking that `model.provider` is empty / falsy.
|
|
1009
|
+
{ "seq": 8, "type": "result", "data": {
|
|
1010
|
+
"subtype": "success",
|
|
1011
|
+
"text": "Final reply",
|
|
1012
|
+
"tokens": { "inputTokens": 1283, "cachedTokens": 512, "reasoningTokens": 96, "outputTokens": 240 },
|
|
1013
|
+
"turns": 3,
|
|
1014
|
+
"model": { "id": "platform:demo", "provider": "openai", "vendorModelId": "gpt-5.4-mini", "reasoningEffort": "low" }
|
|
1015
|
+
} }
|
|
1016
|
+
{ "seq": 8, "type": "result", "data": {
|
|
1017
|
+
"subtype": "error_local_tool_timeout",
|
|
1018
|
+
"error": "...",
|
|
1019
|
+
"tokens": { "inputTokens": 980, "cachedTokens": 0, "reasoningTokens": 0, "outputTokens": 14 },
|
|
1020
|
+
"turns": 2,
|
|
1021
|
+
"model": { "id": "platform:demo", "provider": "anthropic", "vendorModelId": "claude-opus-4-7" }
|
|
1022
|
+
} }
|
|
984
1023
|
{ "seq": 8, "type": "cancelled", "data": {} }
|
|
985
1024
|
```
|
|
986
1025
|
|
|
@@ -991,6 +1030,117 @@ field and the parsed `type` inside `data` — they are always equal, but
|
|
|
991
1030
|
implementations should rely on `data.type` because some HTTP middleware
|
|
992
1031
|
strips the `event:` line.
|
|
993
1032
|
|
|
1033
|
+
### 7.1 Cost-attribution fields (`tokens`, `turns`, `model`)
|
|
1034
|
+
|
|
1035
|
+
Every terminal `result` SSE event (and every terminal `error` event on
|
|
1036
|
+
platforms that emit it — see `docs/wire-protocol.md` §4.7) carries three
|
|
1037
|
+
additional fields so callers can drive cost dashboards, per-turn budgets,
|
|
1038
|
+
and provider/model spend reports without a follow-up
|
|
1039
|
+
`GET /agent-runs/:runId` round trip. The same fields are persisted on the
|
|
1040
|
+
`EphemeralAgentRun` row and surfaced by that endpoint.
|
|
1041
|
+
|
|
1042
|
+
| Field | Type | Notes |
|
|
1043
|
+
| -------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
1044
|
+
| `tokens` | object | Per-run token totals aggregated across every model invocation. See schema below. |
|
|
1045
|
+
| `turns` | int | Total `engine.completeTurn(...)` invocations for the run. Counts the failing call too — so a single-shot run is `1`, a tool loop is `>= 2`, and a run that errored on its first model call is `1`. Distinct from "tool turns" — `turns` is **model invocations**, regardless of whether the model called any tools. |
|
|
1046
|
+
| `model` | object | Resolved model that actually executed the run. See schema below. |
|
|
1047
|
+
|
|
1048
|
+
Always present on the terminal event for runs created against
|
|
1049
|
+
**MANTYX ≥ 2026-09** servers. Older servers omit these fields entirely;
|
|
1050
|
+
SDK clients (TS/Go/Python) detect "no usage data" by checking that
|
|
1051
|
+
`model.provider` is empty / falsy. JSON keys follow MANTYX's standard
|
|
1052
|
+
camelCase wire convention.
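
The detection rule described above fits in one predicate. This is a sketch with hypothetical type and function names; only the `model.provider` check itself comes from the text.

```typescript
interface WireModel {
  id: string;
  provider: string;
  vendorModelId: string;
  reasoningEffort?: string;
}

// Fields of a terminal `result` event's `data` that matter for this check.
interface ResultData {
  subtype: string;
  text?: string;
  error?: string;
  turns?: number;
  model?: WireModel;
}

// True when the server stamped usage data onto the terminal event.
function hasUsageData(data: ResultData): boolean {
  return Boolean(data.model && data.model.provider);
}
```

A caller would typically skip cost attribution, rather than record zeros, when this returns `false`.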
|
|
1053
|
+
|
|
1054
|
+
**`tokens` schema** — mirrors the wire shape produced by
|
|
1055
|
+
`tokenUsageToWireTokens` in `packages/ts-sdk/src/usage-wire.ts`:
|
|
1056
|
+
|
|
1057
|
+
| Field | Type | Notes |
|
|
1058
|
+
| ----------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
1059
|
+
| `inputTokens` | int | **Total billable input** — fresh prompt tokens **plus** the cached-read slice the provider still bills (at a discount) **plus** any cache-creation tokens **plus** tool-prompt tokens. Equal to the sum of every provider-reported input bucket for the run. |
|
|
1060
|
+
| `cachedTokens` | int | The discounted slice of `inputTokens` that came from a prompt cache hit (Anthropic prompt caching, OpenAI cached prompt, Gemini implicit cache). `0` when the provider doesn't report cache reads or the run didn't hit cache. |
|
|
1061
|
+
| `reasoningTokens` | int | Non-visible thinking tokens. **Already counted inside `outputTokens`** — surfaced separately so dashboards can break out "thinking cost" vs visible output. `0` when the model didn't reason or didn't report it. |
|
|
1062
|
+
| `outputTokens` | int | **All** tokens the model emitted for this run, visible + reasoning. Matches the provider's "completion tokens" / "output tokens" billing line. |
|
|
1063
|
+
|
|
1064
|
+
`inputTokens` and `outputTokens` together cover every billable token the
|
|
1065
|
+
run consumed; `cachedTokens` and `reasoningTokens` are diagnostic
|
|
1066
|
+
breakdowns _inside_ those two totals (not separate buckets to be added).
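
Because the two breakdown fields sit inside the totals, derived figures come from subtraction rather than addition. The sketch below works through the numbers from the example `result` event; `breakdown` and its output field names are hypothetical.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Derives non-overlapping slices from the wire totals.
function breakdown(tokens: WireTokens) {
  return {
    uncachedInput: tokens.inputTokens - tokens.cachedTokens,
    cachedInput: tokens.cachedTokens,
    visibleOutput: tokens.outputTokens - tokens.reasoningTokens,
    reasoningOutput: tokens.reasoningTokens,
    // Only the two totals are summed; adding the breakdown fields would double count.
    totalBillable: tokens.inputTokens + tokens.outputTokens,
  };
}

const example = breakdown({ inputTokens: 1283, cachedTokens: 512, reasoningTokens: 96, outputTokens: 240 });
// example.uncachedInput === 771, example.visibleOutput === 144, example.totalBillable === 1523
```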
|
|
1067
|
+
|
|
1068
|
+
**`model` schema** — fields the platform stamps onto every successful
|
|
1069
|
+
or failed run:
|
|
1070
|
+
|
|
1071
|
+
| Field | Type | Notes |
|
|
1072
|
+
| ----------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
1073
|
+
| `id` | string | Catalog id — the same string a caller would pass back as `modelId` to re-select this exact entry (e.g. `"platform:demo"`, `"provider:cmf…"`). Empty string against legacy fallbacks that didn't synthesise a catalog id. |
|
|
1074
|
+
| `provider` | string | Lowercase provider id: `"openai"`, `"anthropic"`, `"google"`, `"azure-openai"`. |
|
|
1075
|
+
| `vendorModelId` | string | The model id the platform actually sent to the provider (e.g. `"gpt-5.4-mini"`, `"claude-opus-4-7"`, `"gemini-2.5-pro"`). Carried through from the `model` field on `AgentSpec` after resolution. |
|
|
1076
|
+
| `reasoningEffort` | string | Optional. `"off"`, `"low"`, `"medium"`, `"high"`. Omitted when the provider doesn't expose a reasoning-level knob or the run didn't request one. |
|
|
1077
|
+
|
|
1078
|
+
**Per-provider token mapping.** Provider responses vary in how they
|
|
1079
|
+
report token usage. MANTYX normalises them into the wire shape above as
|
|
1080
|
+
follows:
|
|
1081
|
+
|
|
1082
|
+
| Provider | `inputTokens` ← | `cachedTokens` ← | `reasoningTokens` ← | `outputTokens` ← |
|
|
1083
|
+
| --------- | ----------------------------------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
|
|
1084
|
+
| OpenAI | `usage.prompt_tokens` (already includes cached read tokens) | `usage.prompt_tokens_details.cached_tokens` | `usage.completion_tokens_details.reasoning_tokens` | `usage.completion_tokens` |
|
|
1085
|
+
| Anthropic | `usage.input_tokens` + `usage.cache_read_input_tokens` + `usage.cache_creation_input_tokens` | `usage.cache_read_input_tokens` | (extended-thinking tokens; folded into `output_tokens` by the provider) | `usage.output_tokens` |
|
|
1086
|
+
| Google | `usageMetadata.promptTokenCount` + `usageMetadata.cachedContentTokenCount` + tool-prompt tokens | `usageMetadata.cachedContentTokenCount` | `usageMetadata.thoughtsTokenCount` | `usageMetadata.candidatesTokenCount` (or `totalTokenCount - promptTokenCount` for older Gemini SDKs) |
|
|
1087
|
+
|
|
1088
|
+
If a provider doesn't report a given bucket the corresponding field is
|
|
1089
|
+
`0`, never `null`.
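
Two rows of the mapping table are sketched below as plain functions. This is an illustration of the table, not the platform's `tokenUsageToWireTokens`; the input interfaces list only the usage keys the table names, and reporting `reasoningTokens: 0` for Anthropic is my reading of the "folded into `output_tokens`" note combined with the zero-default rule.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

interface OpenAiUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}

interface AnthropicUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// OpenAI: prompt_tokens already includes cached reads, so nothing is added to it.
function fromOpenAi(usage: OpenAiUsage): WireTokens {
  return {
    inputTokens: usage.prompt_tokens ?? 0,
    cachedTokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
    reasoningTokens: usage.completion_tokens_details?.reasoning_tokens ?? 0,
    outputTokens: usage.completion_tokens ?? 0,
  };
}

// Anthropic: the three input buckets are reported separately and summed here.
function fromAnthropic(usage: AnthropicUsage): WireTokens {
  const cacheRead = usage.cache_read_input_tokens ?? 0;
  const cacheCreation = usage.cache_creation_input_tokens ?? 0;
  return {
    inputTokens: (usage.input_tokens ?? 0) + cacheRead + cacheCreation,
    cachedTokens: cacheRead,
    reasoningTokens: 0, // assumption: not reported as a separate bucket
    outputTokens: usage.output_tokens ?? 0,
  };
}
```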
|
|
1090
|
+
|
|
1091
|
+
**Tool-loop accounting.** When the run executes tool turns, every
|
|
1092
|
+
`engine.completeTurn(...)` invocation contributes its usage to the
|
|
1093
|
+
aggregated `tokens` object — so a run with one tool round (model →
|
|
1094
|
+
tool → model) reports `turns: 2` and the **sum** of both model calls'
|
|
1095
|
+
token usage. The terminal event carries the cumulative totals; no
|
|
1096
|
+
per-turn breakdown is in the terminal event (use the
|
|
1097
|
+
`assistant_message` events for per-turn observability).
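
The accumulation described above amounts to a running sum over model invocations. The sketch below is a hypothetical reconstruction for illustration (`accumulateUsage` is not a MANTYX function), with made-up per-call figures.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

// Sums per-invocation usage; `turns` counts model invocations, not tool calls.
function accumulateUsage(perInvocation: WireTokens[]): { tokens: WireTokens; turns: number } {
  const tokens: WireTokens = { inputTokens: 0, cachedTokens: 0, reasoningTokens: 0, outputTokens: 0 };
  for (const usage of perInvocation) {
    tokens.inputTokens += usage.inputTokens;
    tokens.cachedTokens += usage.cachedTokens;
    tokens.reasoningTokens += usage.reasoningTokens;
    tokens.outputTokens += usage.outputTokens;
  }
  return { tokens, turns: perInvocation.length };
}

// One tool round: model call, tool execution, second model call.
const oneToolRound = accumulateUsage([
  { inputTokens: 400, cachedTokens: 0, reasoningTokens: 10, outputTokens: 30 },
  { inputTokens: 580, cachedTokens: 300, reasoningTokens: 0, outputTokens: 120 },
]);
```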
|
|
1098
|
+
|
|
1099
|
+
**Snapshot exposure.** `GET /api/v1/workspaces/{slug}/agent-runs/{runId}`
|
|
1100
|
+
also returns `tokens` / `turns` / `model` on the run snapshot JSON, with
|
|
1101
|
+
the same wire shape. The keys are always present (as `null` until the
|
|
1102
|
+
worker writes the terminal event, and on legacy rows pre-rollout) so
|
|
1103
|
+
SDK clients can probe server capability via `"tokens" in body` without
|
|
1104
|
+
triggering an undefined-vs-null distinction across HTTP/JSON
|
|
1105
|
+
serialization.
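
The key-presence probe can be written as a three-way classification. This is a sketch with hypothetical names; note that `null` covers both an unfinished run and a legacy row, so a caller would combine it with the snapshot's status to tell them apart.

```typescript
type UsageState = "unsupported" | "empty" | "ready";

// Classifies a parsed run snapshot by how it exposes `tokens`.
function usageState(snapshot: Record<string, unknown>): UsageState {
  if (!("tokens" in snapshot)) return "unsupported"; // server predates the usage fields
  return snapshot["tokens"] === null ? "empty" : "ready";
}
```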
|
|
1106
|
+
|
|
1107
|
+
**A2A exposure.** The MANTYX-hosted A2A endpoint
|
|
1108
|
+
(`POST /api/a2a/{workspaceSlug}/agents/{agentSlug}`) returns the same
|
|
1109
|
+
triple on the JSON-RPC response under `result.metadata.mantyx`:
|
|
1110
|
+
|
|
1111
|
+
```jsonc
|
|
1112
|
+
{
|
|
1113
|
+
"result": {
|
|
1114
|
+
"kind": "message",
|
|
1115
|
+
"messageId": "msg_abc",
|
|
1116
|
+
"role": "agent",
|
|
1117
|
+
"parts": [{ "kind": "text", "text": "Final reply" }],
|
|
1118
|
+
"metadata": {
|
|
1119
|
+
"mantyx": {
|
|
1120
|
+
"tokens": {
|
|
1121
|
+
"inputTokens": 1283,
|
|
1122
|
+
"cachedTokens": 512,
|
|
1123
|
+
"reasoningTokens": 96,
|
|
1124
|
+
"outputTokens": 240,
|
|
1125
|
+
},
|
|
1126
|
+
"turns": 3,
|
|
1127
|
+
"model": {
|
|
1128
|
+
"id": "platform:demo",
|
|
1129
|
+
"provider": "openai",
|
|
1130
|
+
"vendorModelId": "gpt-5.4-mini",
|
|
1131
|
+
"reasoningEffort": "low",
|
|
1132
|
+
},
|
|
1133
|
+
},
|
|
1134
|
+
},
|
|
1135
|
+
},
|
|
1136
|
+
}
|
|
1137
|
+
```
|
|
1138
|
+
|
|
1139
|
+
The `metadata.mantyx` block is omitted entirely against legacy runners
|
|
1140
|
+
that haven't implemented `runWithUsage` on the A2A adapter (see
|
|
1141
|
+
`packages/ts-sdk/src/a2a/adapter.ts`); cross-platform A2A clients
|
|
1142
|
+
should treat its absence as "no usage data" rather than as zero usage.
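
Reading the block defensively might look like the sketch below. The types and the `readMantyxUsage` helper are hypothetical; the only behaviour taken from the text is that a missing `metadata.mantyx` block yields `undefined` instead of zeroed usage.

```typescript
interface MantyxUsage {
  tokens: { inputTokens: number; cachedTokens: number; reasoningTokens: number; outputTokens: number };
  turns: number;
  model: { id: string; provider: string; vendorModelId: string; reasoningEffort?: string };
}

// Only the path needed to reach `result.metadata.mantyx` is modelled.
interface A2aResponse {
  result?: {
    metadata?: {
      mantyx?: MantyxUsage;
    };
  };
}

// Returns the usage triple, or undefined when the runner did not attach one.
function readMantyxUsage(response: A2aResponse): MantyxUsage | undefined {
  return response.result?.metadata?.mantyx;
}
```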
|
|
1143
|
+
|
|
994
1144
|
## 8. Local tool result
|
|
995
1145
|
|
|
996
1146
|
```
|
|
@@ -1027,23 +1177,25 @@ All non-2xx responses use this body shape:
|
|
|
1027
1177
|
|
|
1028
1178
|
```jsonc
|
|
1029
1179
|
{
|
|
1030
|
-
"error": "invalid_model",
|
|
1180
|
+
"error": "invalid_model", // machine-readable code
|
|
1031
1181
|
"message": "Model 'foo' is ambiguous; pick one of: provider:cm6...",
|
|
1032
|
-
"candidates": [
|
|
1182
|
+
"candidates": [
|
|
1183
|
+
/* sometimes present */
|
|
1184
|
+
],
|
|
1033
1185
|
}
|
|
1034
1186
|
```
|
|
1035
1187
|
|
|
1036
1188
|
Common codes:
|
|
1037
1189
|
|
|
1038
|
-
| Code
|
|
1039
|
-
|
|
|
1040
|
-
| `unauthorized`
|
|
1041
|
-
| `not_found`
|
|
1042
|
-
| `invalid_request`
|
|
1043
|
-
| `invalid_model`
|
|
1044
|
-
| `unknown_tool_use`
|
|
1045
|
-
| `run_terminal`
|
|
1046
|
-
| `rate_limited`
|
|
1190
|
+
| Code | HTTP | Notes |
|
|
1191
|
+
| ------------------ | ---: | -------------------------------------- |
|
|
1192
|
+
| `unauthorized` | 401 | Missing/invalid API key |
|
|
1193
|
+
| `not_found` | 404 | Workspace, run, or session unknown |
|
|
1194
|
+
| `invalid_request` | 400 | Body failed Zod validation |
|
|
1195
|
+
| `invalid_model` | 400 | `modelId` couldn't be resolved |
|
|
1196
|
+
| `unknown_tool_use` | 404 | Tool-result for an unknown `toolUseId` |
|
|
1197
|
+
| `run_terminal` | 409 | Tool-result after run finished |
|
|
1198
|
+
| `rate_limited` | 429 | Per-API-key sliding window |
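
A client might turn the table into a small lookup when deciding how to react to a failure. In the sketch below the status numbers come from the table, while the retry policy (retry only `rate_limited`) is an assumption of this example and not something the protocol prescribes; `describeError` is a hypothetical helper.

```typescript
interface MantyxErrorBody {
  error: string;
  message: string;
  candidates?: unknown[];
}

// HTTP status documented for each common code.
const DOCUMENTED_STATUS: Record<string, number> = {
  unauthorized: 401,
  not_found: 404,
  invalid_request: 400,
  invalid_model: 400,
  unknown_tool_use: 404,
  run_terminal: 409,
  rate_limited: 429,
};

function describeError(status: number, body: MantyxErrorBody) {
  return {
    code: body.error,
    // True when the code is in the table and arrived with its documented status.
    matchesDocs: DOCUMENTED_STATUS[body.error] === status,
    // Assumed policy: the other codes describe request problems a retry would not fix.
    retryable: body.error === "rate_limited",
    candidates: body.candidates ?? [],
  };
}
```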
|
|
1047
1199
|
|
|
1048
1200
|
## 11. Suggested client architecture
|
|
1049
1201
|
|
|
@@ -1061,8 +1213,8 @@ A reference SDK should:
|
|
|
1061
1213
|
model-side "don't double-call" hint without hand-editing the
|
|
1062
1214
|
description.
|
|
1063
1215
|
- **Local A2A peers** (`kind: "a2a_local"`) — caller-supplied A2A
|
|
1064
|
-
clients. Resolve the peer's Agent Card
|
|
1065
|
-
|
|
1216
|
+
clients. Resolve the peer's Agent Card _first_ (e.g. `fetch
|
|
1217
|
+
"<peer>/.well-known/agent-card.json"` or read from a local registry),
|
|
1066
1218
|
attach it to the spec as `agentCard`, and in the dispatcher look the
|
|
1067
1219
|
client up by `agentCard.url` (or any other field you indexed on)
|
|
1068
1220
|
when the `local_tool_call` arrives.
|
|
@@ -1074,9 +1226,10 @@ A reference SDK should:
|
|
|
1074
1226
|
|
|
1075
1227
|
`mantyx`, `mantyx_plugin`, `a2a`, and `mcp` refs are server-resolved —
|
|
1076
1228
|
no SDK-side registry needed.
|
|
1229
|
+
|
|
1077
1230
|
3. On `runAgent` / `session.send`:
|
|
1078
1231
|
- Accept `reasoningLevel` from the caller and pass it through unchanged
|
|
1079
|
-
(string `"off" | "low" | "medium" | "high"`
|
|
1232
|
+
(string `"off" | "low" | "medium" | "high"` _or_ number `0–100`); do
|
|
1080
1233
|
**not** translate to a vendor-specific knob — the server owns that
|
|
1081
1234
|
mapping so all SDKs stay aligned with the web composer.
|
|
1082
1235
|
- POST the run/message, get `{ runId, streamUrl }`.
|
|
@@ -1089,7 +1242,7 @@ A reference SDK should:
|
|
|
1089
1242
|
them by default. Their presence depends on `reasoningLevel > 0` and
|
|
1090
1243
|
on the active model exposing thought parts.
|
|
1091
1244
|
- Accept `loopDetection` and `toolBudgets` from the caller and pass
|
|
1092
|
-
them through unchanged (see §4.6 / §4.7). Both fields are
|
|
1245
|
+
them through unchanged (see §4.6 / §4.7). Both fields are _additive_:
|
|
1093
1246
|
omitting them keeps MANTYX's runtime defaults; passing
|
|
1094
1247
|
`loopDetection: false` opts out; passing `toolBudgets: {}` clears the
|
|
1095
1248
|
defaults; passing entries layers caller overrides on top of the
|
|
@@ -1116,4 +1269,4 @@ A reference SDK should:
|
|
|
1116
1269
|
|
|
1117
1270
|
The npm package [`@mantyx/sdk`](https://www.npmjs.com/package/@mantyx/sdk) and the Go module
|
|
1118
1271
|
[`github.com/mantyx/mantyx-go-sdk`](https://github.com/mantyx/mantyx-go-sdk) are reference implementations of this protocol
|
|
1119
|
-
(maintained in the official **mantyx-sdk** repositories).
|
|
1272
|
+
(maintained in the official **mantyx-sdk** repositories).
|