@mantyx/sdk 0.10.1 → 0.11.0

@@ -16,7 +16,7 @@ Companion documents:
16
16
 
17
17
  ## 1. Concepts
18
18
 
19
- **Ephemeral agent.** A run-time agent that is *defined by the request* rather
19
+ **Ephemeral agent.** A run-time agent that is _defined by the request_ rather
20
20
  than persisted as a row in MANTYX's `Agent` table. The full spec (system
21
21
  prompt, model, tools) is stored as part of each session/run for observability
22
22
  but is not editable from the dashboard.
@@ -24,15 +24,15 @@ but is not editable from the dashboard.
24
24
  **Tool refs.** Seven flavours, all carried inside the agent spec's `tools`
25
25
  array:
26
26
 
27
- | `kind` | Resolved by | Notes |
28
- | ---------------- | ----------- | ----- |
29
- | `mantyx` | server | A workspace `Tool` row referenced by id (HTTP / Code / Plugin). |
30
- | `mantyx_plugin` | server | A platform plugin tool referenced by name. |
31
- | `local` | client | A custom tool defined and executed in the SDK's process. Carries `parameters` (input JSON Schema) plus optional `outputSchema` (return-value JSON Schema) and `longRunning` flag — see §4.1.1. |
32
- | `a2a` | server | A *remote* Agent2Agent peer MANTYX can reach; invoked via `message/send` and the reply is surfaced as the tool result. |
33
- | `a2a_local` | client | An A2A peer MANTYX **cannot** reach. SDK resolves the [Agent Card](https://google.github.io/A2A/specification/#agent-card) locally and ships it inline; MANTYX uses it for the model description and routes calls back to the SDK over SSE. |
34
- | `mcp` | server | A *remote* MCP server (Streamable HTTP). At run start MANTYX lists the catalog and exposes every tool as `<server>_<tool>` (subject to `toolFilter`). |
35
- | `mcp_local` | client | An MCP server MANTYX **cannot** reach. SDK runs `Initialize` + `tools/list` locally and ships the resolved `Tool[]` (with `inputSchema`); MANTYX exposes them to the model with the SDK-declared names and routes calls back over SSE. |
27
+ | `kind` | Resolved by | Notes |
28
+ | --------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
29
+ | `mantyx` | server | A workspace `Tool` row referenced by id (HTTP / Code / Plugin). |
30
+ | `mantyx_plugin` | server | A platform plugin tool referenced by name. |
31
+ | `local` | client | A custom tool defined and executed in the SDK's process. Carries `parameters` (input JSON Schema) plus optional `outputSchema` (return-value JSON Schema) and `longRunning` flag — see §4.1.1. |
32
+ | `a2a` | server | A _remote_ Agent2Agent peer MANTYX can reach; invoked via `message/send` and the reply is surfaced as the tool result. |
33
+ | `a2a_local` | client | An A2A peer MANTYX **cannot** reach. SDK resolves the [Agent Card](https://google.github.io/A2A/specification/#agent-card) locally and ships it inline; MANTYX uses it for the model description and routes calls back to the SDK over SSE. |
34
+ | `mcp` | server | A _remote_ MCP server (Streamable HTTP). At run start MANTYX lists the catalog and exposes every tool as `<server>_<tool>` (subject to `toolFilter`). |
35
+ | `mcp_local` | client | An MCP server MANTYX **cannot** reach. SDK runs `Initialize` + `tools/list` locally and ships the resolved `Tool[]` (with `inputSchema`); MANTYX exposes them to the model with the SDK-declared names and routes calls back over SSE. |
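As a rough illustration, the "Resolved by" column above can be captured in a lookup table. The type and helper names here (`ToolKind`, `RESOLVED_BY`, `needsLocalDispatch`) are hypothetical and not part of the SDK; this is only a sketch of how a caller might decide whether it has to handle calls routed back from MANTYX.

```typescript
// The seven `kind` values from the table above.
type ToolKind =
  | "mantyx"
  | "mantyx_plugin"
  | "local"
  | "a2a"
  | "a2a_local"
  | "mcp"
  | "mcp_local";

type Resolver = "server" | "client";

// "Resolved by" column, transcribed from the table.
const RESOLVED_BY: Record<ToolKind, Resolver> = {
  mantyx: "server",
  mantyx_plugin: "server",
  local: "client",
  a2a: "server",
  a2a_local: "client",
  mcp: "server",
  mcp_local: "client",
};

// Hypothetical helper: true when at least one tool in the spec is
// client-resolved, so the caller must be ready to execute tool calls
// that MANTYX routes back to it.
function needsLocalDispatch(tools: { kind: ToolKind }[]): boolean {
  return tools.some((tool) => RESOLVED_BY[tool.kind] === "client");
}
```

A spec that only lists `mantyx` and `mcp` refs would return `false` here, while adding a single `a2a_local` ref flips it to `true`.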
36
36
 
37
37
  The split is deliberate:
38
38
 
@@ -42,7 +42,7 @@ The split is deliberate:
42
42
  MCP/A2A this also means MANTYX does discovery (`listTools`, agent-card
43
43
  fetch).
44
44
  - **Client-resolved / "local"** (`local`, `a2a_local`, `mcp_local`) —
45
- MANTYX has *no* access to the resource. The SDK does **all** of the
45
+ MANTYX has _no_ access to the resource. The SDK does **all** of the
46
46
  work: connection, discovery, listing, expansion, arg validation, auth,
47
47
  execution, retries. MANTYX is a thin LLM-routing layer that emits a
48
48
  `local_tool_call` event and blocks until the SDK POSTs back to
@@ -52,9 +52,9 @@ The split is deliberate:
52
52
 
53
53
  **One-shot run vs. session.** A run is an LLM execution. Runs may be:
54
54
 
55
- - *one-shot* (`POST /agent-runs`) — fire-and-stream, no persistent state apart
55
+ - _one-shot_ (`POST /agent-runs`) — fire-and-stream, no persistent state apart
56
56
  from observability.
57
- - *session-scoped* (`POST /agent-sessions/:id/messages`) — the run inherits the
57
+ - _session-scoped_ (`POST /agent-sessions/:id/messages`) — the run inherits the
58
58
  session's full message history, and the new user/assistant turns are
59
59
  appended back to the session on success.
60
60
 
@@ -75,10 +75,10 @@ Authorization: Bearer <credential>
75
75
  X-API-Key: <credential>
76
76
  ```
77
77
 
78
- | Credential | Token format | Identifies | Bound to | Use when |
79
- | ------------------------- | --------------- | ------------------------ | ----------------------- | -------- |
80
- | **Workspace API key** | `mantyx_…` | The workspace | One workspace, no end-user | Personal scripts, internal automations, anything the SDK caller owns end-to-end. |
81
- | **OAuth 2.0 access token**| `mantyx_at_…` | An end user **and** the workspace they consented for | One workspace, one user (or one app for `client_credentials`) | "Sign in with MANTYX" apps, third-party integrations, anywhere consent + scopes matter. |
78
+ | Credential | Token format | Identifies | Bound to | Use when |
79
+ | -------------------------- | ------------- | ---------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
80
+ | **Workspace API key** | `mantyx_…` | The workspace | One workspace, no end-user | Personal scripts, internal automations, anything the SDK caller owns end-to-end. |
81
+ | **OAuth 2.0 access token** | `mantyx_at_…` | An end user **and** the workspace they consented for | One workspace, one user (or one app for `client_credentials`) | "Sign in with MANTYX" apps, third-party integrations, anywhere consent + scopes matter. |
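The two token formats in the table can be told apart by prefix alone. The sketch below is purely illustrative: `sniffCredential` and `CredentialKind` are hypothetical names, and the server's real logic is not shown in this document. The one detail worth noting is the ordering, since every `mantyx_at_…` token also starts with `mantyx_`.

```typescript
type CredentialKind = "oauth_access_token" | "workspace_api_key" | "unknown";

// Hypothetical helper that classifies a credential by its prefix.
function sniffCredential(token: string): CredentialKind {
  // Test the longer prefix first: an OAuth access token would otherwise
  // be misread as a workspace API key.
  if (token.startsWith("mantyx_at_")) return "oauth_access_token";
  if (token.startsWith("mantyx_")) return "workspace_api_key";
  return "unknown";
}
```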
82
82
 
83
83
  The server resolves whichever it sees by token-prefix sniffing (see
84
84
  `packages/api/src/services/bearer-credential.ts`) — SDKs do **not** need
@@ -115,19 +115,19 @@ two differences:
115
115
  multi-scope ones — see §2.3). The SDK is expected to surface this
116
116
  verbatim. The agent-runs surface uses these scopes:
117
117
 
118
- | Endpoint | Required scope |
119
- | ------------------------------------------------------------ | -------------- |
120
- | `GET .../models` | `models:read` |
121
- | `POST .../agent-runs` | `runs:write` |
122
- | `GET .../agent-runs/{runId}` | `runs:read` |
123
- | `GET .../agent-runs/{runId}/stream` | `runs:read` |
124
- | `POST .../agent-runs/{runId}/cancel` | `runs:write` |
125
- | `POST .../agent-runs/{runId}/tool-results` | `runs:write` |
126
- | `POST .../agent-sessions` | `sessions:write` |
127
- | `GET .../agent-sessions/{sessionId}` | `sessions:read` |
128
- | `DELETE .../agent-sessions/{sessionId}` | `sessions:write` |
129
- | `POST .../agent-sessions/{sessionId}/messages` | `sessions:write` |
130
- | `GET /api/oauth/userinfo` | `mantyx.identity:read` |
118
+ | Endpoint | Required scope |
119
+ | ------------------------------------------------ | ---------------------- |
120
+ | `GET .../models` | `models:read` |
121
+ | `POST .../agent-runs` | `runs:write` |
122
+ | `GET .../agent-runs/{runId}` | `runs:read` |
123
+ | `GET .../agent-runs/{runId}/stream` | `runs:read` |
124
+ | `POST .../agent-runs/{runId}/cancel` | `runs:write` |
125
+ | `POST .../agent-runs/{runId}/tool-results` | `runs:write` |
126
+ | `POST .../agent-sessions` | `sessions:write` |
127
+ | `GET .../agent-sessions/{sessionId}` | `sessions:read` |
128
+ | `DELETE .../agent-sessions/{sessionId}` | `sessions:write` |
129
+ | `POST .../agent-sessions/{sessionId}/messages` | `sessions:write` |
130
+ | `GET /api/oauth/userinfo` | `mantyx.identity:read` |
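One way to use this table client-side is to check, before making any calls, which scopes a token lacks for the routes an SDK intends to hit. The following is a minimal sketch under stated assumptions: the granted scopes arrive as a space-delimited string, the `.../` workspace prefix is dropped from the route keys, and the names `REQUIRED_SCOPE` and `missingScopes` are hypothetical.

```typescript
// Required scope per route, transcribed from the table above.
// Keys are "<METHOD> <path>" with the ".../" workspace prefix removed.
const REQUIRED_SCOPE: Record<string, string> = {
  "GET /models": "models:read",
  "POST /agent-runs": "runs:write",
  "GET /agent-runs/{runId}": "runs:read",
  "GET /agent-runs/{runId}/stream": "runs:read",
  "POST /agent-runs/{runId}/cancel": "runs:write",
  "POST /agent-runs/{runId}/tool-results": "runs:write",
  "POST /agent-sessions": "sessions:write",
  "GET /agent-sessions/{sessionId}": "sessions:read",
  "DELETE /agent-sessions/{sessionId}": "sessions:write",
  "POST /agent-sessions/{sessionId}/messages": "sessions:write",
  "GET /api/oauth/userinfo": "mantyx.identity:read",
};

// Returns the sorted, de-duplicated scopes that `routes` need but the
// space-delimited `grantedScope` string does not contain.
function missingScopes(grantedScope: string, routes: string[]): string[] {
  const granted = new Set(grantedScope.split(/\s+/).filter(Boolean));
  const missing = new Set<string>();
  for (const route of routes) {
    const needed = REQUIRED_SCOPE[route];
    if (needed === undefined) throw new Error(`unknown route: ${route}`);
    if (!granted.has(needed)) missing.add(needed);
  }
  return Array.from(missing).sort();
}
```

A non-empty result suggests the consent request should be repeated with the extra scopes rather than waiting for a `403 insufficient_scope` at call time.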
131
131
 
132
132
  For an SDK that exposes one-shot runs and sessions end-to-end, request
133
133
  at minimum `models:read runs:read runs:write sessions:read sessions:write`,
@@ -143,9 +143,10 @@ two differences:
143
143
  OAuth tokens **also** honor the per-token agent allow-list
144
144
  (`OAuthAccessToken.agentIds`) the user picked at consent time — see
145
145
  [`docs/oauth.md`](./oauth.md) for the full registration / authorization-code
146
- + PKCE flow. PKCE (`S256`) is mandatory and every MANTYX OAuth app is a
147
- confidential client, so the token endpoint requires both `client_secret`
148
- and `code_verifier`.
146
+
147
+ - PKCE flow. PKCE (`S256`) is mandatory and every MANTYX OAuth app is a
148
+ confidential client, so the token endpoint requires both `client_secret`
149
+ and `code_verifier`.
149
150
 
150
151
  **Token lifetimes.** Access tokens live **1 hour** (`expires_in: 3600`).
151
152
  Refresh tokens are **persistent and non-rotating**: they have no
@@ -176,14 +177,14 @@ Content-Type: application/json
176
177
 
177
178
  ### 2.3 Error model for credentials
178
179
 
179
- | Status | Body shape | When |
180
- | ------ | ------------------------------------------------------------------------------------- | ---- |
181
- | `401` | `{ "error": "Unauthorized", "message": "API key or OAuth access token required..." }` | No `Authorization` / `X-API-Key` header. |
182
- | `401` | `{ "error": "Invalid API key or OAuth access token" }` | Token doesn't match a row, expired, or revoked. |
183
- | `403` | `{ "error": "This API key is not for the Developer API", "hint": "..." }` | API key has wrong `usage`. |
184
- | `403` | `{ "error": "Workspace API keys are not available on this plan.", "code": "api_keys_plan" }` <br> `{ "error": "OAuth applications are not available on this plan.", "code": "oauth_apps_plan" }` | Workspace tier lacks the `apiKeys` / `oauthApps` feature. |
185
- | `403` | `{ "error": "insufficient_scope", "required": "runs:write" }` (or an array if a route needs multiple) | OAuth token is missing a scope a route demands. The response also sets `WWW-Authenticate: Bearer error="insufficient_scope", scope="..."`. |
186
- | `404` | `{ "error": "Workspace path does not match this credential", "hint": "..." }` | URL slug ≠ token's workspace. |
180
+ | Status | Body shape | When |
181
+ | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ |
182
+ | `401` | `{ "error": "Unauthorized", "message": "API key or OAuth access token required..." }` | No `Authorization` / `X-API-Key` header. |
183
+ | `401` | `{ "error": "Invalid API key or OAuth access token" }` | Token doesn't match a row, expired, or revoked. |
184
+ | `403` | `{ "error": "This API key is not for the Developer API", "hint": "..." }` | API key has wrong `usage`. |
185
+ | `403` | `{ "error": "Workspace API keys are not available on this plan.", "code": "api_keys_plan" }` <br> `{ "error": "OAuth applications are not available on this plan.", "code": "oauth_apps_plan" }` | Workspace tier lacks the `apiKeys` / `oauthApps` feature. |
186
+ | `403` | `{ "error": "insufficient_scope", "required": "runs:write" }` (or an array if a route needs multiple) | OAuth token is missing a scope a route demands. The response also sets `WWW-Authenticate: Bearer error="insufficient_scope", scope="..."`. |
187
+ | `404` | `{ "error": "Workspace path does not match this credential", "hint": "..." }` | URL slug ≠ token's workspace. |
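The rows above can be folded into a small classifier so that callers branch on a tag instead of re-parsing bodies. This is a sketch, not SDK code: `CredentialFailure` and `classifyCredentialError` are hypothetical names, and matching on the human-readable `error` strings is an assumption made only because the table lists no machine-readable code for those rows.

```typescript
interface CredentialErrorBody {
  error: string;
  message?: string;
  hint?: string;
  code?: string;
  required?: string | string[];
}

type CredentialFailure =
  | { kind: "missing_credential" }
  | { kind: "invalid_credential" }
  | { kind: "wrong_key_usage" }
  | { kind: "plan_restricted"; code: string }
  | { kind: "insufficient_scope"; required: string[] }
  | { kind: "workspace_mismatch" }
  | { kind: "other" };

function classifyCredentialError(
  status: number,
  body: CredentialErrorBody,
): CredentialFailure {
  if (status === 401) {
    // "Unauthorized" means no header was sent; anything else is a bad token.
    return body.error === "Unauthorized"
      ? { kind: "missing_credential" }
      : { kind: "invalid_credential" };
  }
  if (status === 403) {
    if (body.error === "insufficient_scope") {
      // `required` is a string, or an array when a route needs several scopes.
      const r = body.required;
      const required = r === undefined ? [] : Array.isArray(r) ? r : [r];
      return { kind: "insufficient_scope", required };
    }
    if (body.code === "api_keys_plan" || body.code === "oauth_apps_plan") {
      return { kind: "plan_restricted", code: body.code };
    }
    // Brittle string match: no `code` is documented for this row.
    if (body.error === "This API key is not for the Developer API") {
      return { kind: "wrong_key_usage" };
    }
  }
  if (
    status === 404 &&
    body.error === "Workspace path does not match this credential"
  ) {
    return { kind: "workspace_mismatch" };
  }
  return { kind: "other" };
}
```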
187
188
 
188
189
  ## 3. Models
189
190
 
@@ -204,7 +205,11 @@ platform-hosted offerings visible to the workspace's tier.
204
205
  "vendorModelId": "claude-sonnet-4-5",
205
206
  "source": "platform_offering",
206
207
  "contextWindowTokens": 200000,
207
- "pricing": { "inputPer1MUsd": 3.0, "outputPer1MUsd": 15.0, "cacheReadPer1MUsd": 0.3 }
208
+ "pricing": {
209
+ "inputPer1MUsd": 3.0,
210
+ "outputPer1MUsd": 15.0,
211
+ "cacheReadPer1MUsd": 0.3,
212
+ },
208
213
  },
209
214
  {
210
215
  "id": "provider:cm6def456",
@@ -213,10 +218,10 @@ platform-hosted offerings visible to the workspace's tier.
213
218
  "vendorModelId": "gpt-5.5",
214
219
  "source": "workspace_provider",
215
220
  "contextWindowTokens": 200000,
216
- "pricing": null
217
- }
221
+ "pricing": null,
222
+ },
218
223
  ],
219
- "defaultModelId": "platform:cm6abc123"
224
+ "defaultModelId": "platform:cm6abc123",
220
225
  }
221
226
  ```
222
227
 
@@ -240,11 +245,11 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
240
245
 
241
246
  ```jsonc
242
247
  {
243
- "name": "ephemeral", // optional, observability only
244
- "agentId": "agent_cm6abc123", // optional — see §4.1
245
- "systemPrompt": "You are helpful.", // required unless agentId is set
246
- "modelId": "platform:cm6abc123", // optional, see §3
247
- "reasoningLevel": "medium", // optional, see §4.4
248
+ "name": "ephemeral", // optional, observability only
249
+ "agentId": "agent_cm6abc123", // optional — see §4.1
250
+ "systemPrompt": "You are helpful.", // required unless agentId is set
251
+ "modelId": "platform:cm6abc123", // optional, see §3
252
+ "reasoningLevel": "medium", // optional, see §4.4
248
253
  "tools": [
249
254
  { "kind": "mantyx", "id": "tool_cm6..." },
250
255
  { "kind": "mantyx_plugin", "name": "web_search" },
@@ -252,20 +257,22 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
252
257
  "kind": "local",
253
258
  "name": "read_file",
254
259
  "description": "Read a file from the user's machine",
255
- "parameters": { // JSON Schema for the args object
260
+ "parameters": {
261
+ // JSON Schema for the args object
256
262
  "type": "object",
257
263
  "properties": { "path": { "type": "string" } },
258
264
  "required": ["path"],
259
- "additionalProperties": false
265
+ "additionalProperties": false,
260
266
  },
261
- "outputSchema": { // optional — JSON Schema for the return value
267
+ "outputSchema": {
268
+ // optional — JSON Schema for the return value
262
269
  "type": "object",
263
270
  "properties": {
264
- "bytes": { "type": "string", "description": "UTF-8 file contents" }
271
+ "bytes": { "type": "string", "description": "UTF-8 file contents" },
265
272
  },
266
- "required": ["bytes"]
273
+ "required": ["bytes"],
267
274
  },
268
- "longRunning": false // optional — default false
275
+ "longRunning": false, // optional — default false
269
276
  },
270
277
  {
271
278
  "kind": "a2a",
@@ -273,12 +280,13 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
273
280
  "description": "Delegate billing questions to the Acme billing agent.",
274
281
  "agentCardUrl": "https://billing.acme.com/.well-known/agent-card.json",
275
282
  "headers": { "Authorization": "Bearer ${BILLING_TOKEN}" },
276
- "contextId": "ctx_abc" // optional A2A context to thread turns
283
+ "contextId": "ctx_abc", // optional A2A context to thread turns
277
284
  },
278
285
  {
279
286
  "kind": "a2a_local",
280
287
  "name": "intranet_hr_agent",
281
- "agentCard": { // SDK-resolved A2A Agent Card content
288
+ "agentCard": {
289
+ // SDK-resolved A2A Agent Card content
282
290
  "protocolVersion": "0.3.0",
283
291
  "name": "Acme HR",
284
292
  "description": "Answers questions about HR policies and benefits.",
@@ -289,72 +297,79 @@ The agent spec is the body shape used by `POST /agent-runs` and `POST
289
297
  {
290
298
  "id": "pto_lookup",
291
299
  "name": "PTO lookup",
292
- "description": "Find a teammate's remaining PTO days for the year."
300
+ "description": "Find a teammate's remaining PTO days for the year.",
293
301
  },
294
302
  {
295
303
  "id": "benefits_qa",
296
304
  "name": "Benefits Q&A",
297
- "description": "Answer questions about insurance, 401k, and parental leave."
298
- }
299
- ]
300
- }
305
+ "description": "Answer questions about insurance, 401k, and parental leave.",
306
+ },
307
+ ],
308
+ },
301
309
  },
302
310
  {
303
311
  "kind": "mcp",
304
- "name": "github", // → tools become github_<tool>
312
+ "name": "github", // → tools become github_<tool>
305
313
  "url": "https://mcp.github.com/v1",
306
314
  "headers": { "Authorization": "Bearer ${GH_PAT}" },
307
- "toolFilter": ["search_repos", "read_file"] // optional allowlist
315
+ "toolFilter": ["search_repos", "read_file"], // optional allowlist
308
316
  },
309
317
  {
310
318
  "kind": "mcp_local",
311
- "name": "fs", // SDK-side server label only — NOT a prefix
312
- "serverInfo": { // optional; from MCP Initialize
319
+ "name": "fs", // SDK-side server label only — NOT a prefix
320
+ "serverInfo": {
321
+ // optional; from MCP Initialize
313
322
  "name": "mcp-server-filesystem",
314
- "version": "0.4.1"
323
+ "version": "0.4.1",
315
324
  },
316
- "tools": [ // verbatim MCP tools/list response
325
+ "tools": [
326
+ // verbatim MCP tools/list response
317
327
  {
318
- "name": "fs_read_file", // model-facing name, exactly as declared
328
+ "name": "fs_read_file", // model-facing name, exactly as declared
319
329
  "description": "Read a file from the user's workstation",
320
- "inputSchema": { // MCP's term — JSON Schema
330
+ "inputSchema": {
331
+ // MCP's term — JSON Schema
321
332
  "type": "object",
322
333
  "properties": { "path": { "type": "string" } },
323
- "required": ["path"]
324
- }
325
- }
326
- ]
327
- }
334
+ "required": ["path"],
335
+ },
336
+ },
337
+ ],
338
+ },
328
339
  ],
329
- "budgets": { "maxToolTurns": 32 }, // optional safety cap
330
- "outputSchema": { // optional, see §4.5
340
+ "budgets": { "maxToolTurns": 32 }, // optional safety cap
341
+ "outputSchema": {
342
+ // optional, see §4.5
331
343
  "name": "weather_report",
332
344
  "schema": {
333
345
  "type": "object",
334
346
  "properties": {
335
347
  "city": { "type": "string" },
336
- "temperature_c": { "type": "number" }
348
+ "temperature_c": { "type": "number" },
337
349
  },
338
- "required": ["city", "temperature_c"]
339
- }
350
+ "required": ["city", "temperature_c"],
351
+ },
340
352
  },
341
- "loopDetection": { // optional, see §4.6
353
+ "loopDetection": {
354
+ // optional, see §4.6
342
355
  "consecutiveThreshold": 3,
343
- "hardCutoffThreshold": 6
356
+ "hardCutoffThreshold": 6,
344
357
  },
345
- "toolBudgets": { // optional, see §4.7
346
- "recall": { "maxCalls": 4 },
358
+ "toolBudgets": {
359
+ // optional, see §4.7
360
+ "recall": { "maxCalls": 4 },
347
361
  "hive_consult_ontology": { "maxCalls": 4 },
348
- "scary_tool": { "maxCalls": 0 }
362
+ "scary_tool": { "maxCalls": 0 },
349
363
  },
350
- "metadata": { // optional, see §4.8
364
+ "metadata": {
365
+ // optional, see §4.8
351
366
  "customer": "acme",
352
- "env": "prod"
353
- }
367
+ "env": "prod",
368
+ },
354
369
  }
355
370
  ```
356
371
 
357
- `POST /agent-runs` additionally accepts `prompt` *or* `messages` (an array of
372
+ `POST /agent-runs` additionally accepts `prompt` _or_ `messages` (an array of
358
373
  `{role, content}`). Sending both is a `400 invalid_request`.
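A client can catch the "both fields set" case before the request leaves the process. The sketch below assumes `role` and `content` are plain strings (the document only shows the `{role, content}` shape) and uses a hypothetical helper name, `assertSingleInput`. It checks only the documented rule; whether omitting both fields is also rejected is not stated here, so that case is left alone.

```typescript
// Assumed message shape; the document only names the two keys.
interface ChatMessage {
  role: string;
  content: string;
}

interface RunInput {
  prompt?: string;
  messages?: ChatMessage[];
}

// Hypothetical client-side guard mirroring the server's
// `400 invalid_request` for bodies that carry both fields.
function assertSingleInput(input: RunInput): void {
  if (input.prompt !== undefined && input.messages !== undefined) {
    throw new Error(
      "invalid_request: send either `prompt` or `messages`, not both",
    );
  }
}
```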
359
374
 
360
375
  ### 4.1 Triggering a persisted MANTYX agent (`agentId`)
@@ -366,7 +381,7 @@ defining an ephemeral one inline. When `agentId` is set:
366
381
  stored system prompt at run time.
367
382
  - `modelId` becomes optional. If omitted, the server uses the agent's
368
383
  configured LLM provider (or the workspace automation provider if the agent
369
- has *Use workspace default model* turned on).
384
+ has _Use workspace default model_ turned on).
370
385
  - The agent's own tools are loaded from its workspace configuration —
371
386
  including memory, skills, and plugin tools — and your `tools` array is
372
387
  **merged on top**. This is typically used to attach `local` tools so the
@@ -389,14 +404,14 @@ the handler in its own process. MANTYX never executes the body — it
389
404
  emits a `local_tool_call` event when the model picks the tool and waits
390
405
  for the SDK to POST a tool-result.
391
406
 
392
- | Field | Required | Notes |
393
- | -------------- | -------- | ----- |
394
- | `kind` | yes | Discriminator literal `"local"`. |
395
- | `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
396
- | `description` | no | Free-form. Empty when omitted (acceptable, but reduces tool-selection accuracy). |
397
- | `parameters` | no | JSON Schema for the tool's input. Must be a `type: "object"` schema with `properties`; non-object roots are coerced to an empty object schema server-side. Forwarded **verbatim** to the LLM provider so nested constraints (`array.items`, `enum`, `anyOf`, numeric formats, …) survive. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call. |
398
- | `outputSchema` | no | JSON Schema for the structured value the tool returns. When present, forwarded to providers that accept per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. Helps the model emit follow-up arguments that round-trip cleanly. Must be an object schema; non-object roots are dropped server-side. |
399
- | `longRunning` | no | When `true`, MANTYX appends a stable hint to the model-facing description so every provider treats the tool as long-running:<br>*"NOTE: This is a long-running operation. Do not call this tool again if it has already returned an intermediate or pending status."*<br>Useful for tools that return `pending` and rely on SDK-side polling — without the hint the model routinely fires repeat calls and burns turns. Pure declarative — MANTYX does not change scheduling. |
407
+ | Field | Required | Notes |
408
+ | -------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
409
+ | `kind` | yes | Discriminator literal `"local"`. |
410
+ | `name` | yes | Model-facing tool name. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
411
+ | `description` | no | Free-form. Empty when omitted (acceptable, but reduces tool-selection accuracy). |
412
+ | `parameters` | no | JSON Schema for the tool's input. Must be a `type: "object"` schema with `properties`; non-object roots are coerced to an empty object schema server-side. Forwarded **verbatim** to the LLM provider so nested constraints (`array.items`, `enum`, `anyOf`, numeric formats, …) survive. Args that fail server-side validation produce a structured `tool_input_invalid` tool result the model can recover from instead of crashing the call. |
413
+ | `outputSchema` | no | JSON Schema for the structured value the tool returns. When present, forwarded to providers that accept per-tool response schemas (Gemini's `responseJsonSchema` on the FunctionDeclaration); other engines surface it through the description and rely on host-side validation. Helps the model emit follow-up arguments that round-trip cleanly. Must be an object schema; non-object roots are dropped server-side. |
414
+ | `longRunning` | no | When `true`, MANTYX appends a stable hint to the model-facing description so every provider treats the tool as long-running:<br>_"NOTE: This is a long-running operation. Do not call this tool again if it has already returned an intermediate or pending status."_<br>Useful for tools that return `pending` and rely on SDK-side polling — without the hint the model routinely fires repeat calls and burns turns. Pure declarative — MANTYX does not change scheduling. |
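The constraints in this table lend themselves to a pre-flight check on the SDK side. This is a minimal sketch with hypothetical names (`LocalToolRef`, `checkLocalTool`); it reports the cases the table describes as rejected, coerced, or dropped server-side, and does not attempt full JSON Schema validation.

```typescript
// Name pattern quoted from the table above.
const TOOL_NAME = /^[a-zA-Z0-9_]{1,64}$/;

interface LocalToolRef {
  kind: "local";
  name: string;
  description?: string;
  parameters?: Record<string, unknown>;
  outputSchema?: Record<string, unknown>;
  longRunning?: boolean;
}

// Returns human-readable problems; an empty array means nothing was found.
function checkLocalTool(ref: LocalToolRef): string[] {
  const problems: string[] = [];
  if (!TOOL_NAME.test(ref.name)) {
    problems.push(`name "${ref.name}" does not match ${TOOL_NAME}`);
  }
  if (ref.parameters !== undefined && ref.parameters["type"] !== "object") {
    // Per the table, the server coerces this to an empty object schema.
    problems.push("parameters root is not an object schema");
  }
  if (ref.outputSchema !== undefined && ref.outputSchema["type"] !== "object") {
    // Per the table, the server drops non-object output schemas.
    problems.push("outputSchema root is not an object schema");
  }
  return problems;
}
```

Surfacing these locally is a design choice, not a requirement: the server tolerates the schema cases silently, which can otherwise hide a mistake until the model starts calling the tool with no arguments.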
400
415
 
401
416
  The `outputSchema` and `longRunning` fields are **additive** since wire
402
417
  protocol v1: SDKs that don't ship them keep working unchanged. Providers
@@ -410,10 +425,10 @@ A2A delegation lets the agent hand a task to another
410
425
  [Agent2Agent](https://google.github.io/A2A/) peer. The wire protocol exposes
411
426
  two kinds depending on **who can reach the peer**:
412
427
 
413
- - `kind: "a2a"` — *remote* (server-resolved). MANTYX dials `agentCardUrl`
428
+ - `kind: "a2a"` — _remote_ (server-resolved). MANTYX dials `agentCardUrl`
414
429
  directly. Pick this when the peer is on the public internet or in the
415
430
  same VPC as MANTYX.
416
- - `kind: "a2a_local"` — *local* (client-resolved). The SDK invokes the peer
431
+ - `kind: "a2a_local"` — _local_ (client-resolved). The SDK invokes the peer
417
432
  on its side and posts back the reply. Pick this when the peer lives on an
418
433
  intranet, behind a VPN, or on the user's device — anywhere MANTYX can't
419
434
  reach but the SDK can.
@@ -431,14 +446,14 @@ POSTs the model's `message` argument to `agentCardUrl` over A2A's standard
431
446
  and `/message/send` endpoints are probed in order) and forwards the remote
432
447
  agent's text reply back as the tool result.
433
448
 
434
- | Field | Required | Notes |
435
- | --------------- | -------- | ----- |
436
- | `kind` | yes | Discriminator literal `"a2a"`. |
437
- | `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
438
- | `description` | no | Model-facing description. Defaults to `"Delegate a task to the <name> agent over A2A. Pass the full task as a single message."`. Mention the remote agent's purpose so the model picks it for the right turn. |
439
- | `agentCardUrl` | yes | URL of the remote Agent Card (`/.well-known/agent-card.json`) or the JSON-RPC root the peer accepts. |
440
- | `headers` | no | Flat string→string HTTP headers sent on every A2A request — typically `Authorization`. Each value is capped at 8 KB. |
441
- | `contextId` | no | A2A `contextId` to thread multiple delegations into the same remote conversation. Omit for fresh per-call context. |
449
+ | Field | Required | Notes |
450
+ | -------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
451
+ | `kind` | yes | Discriminator literal `"a2a"`. |
452
+ | `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
453
+ | `description` | no | Model-facing description. Defaults to `"Delegate a task to the <name> agent over A2A. Pass the full task as a single message."`. Mention the remote agent's purpose so the model picks it for the right turn. |
454
+ | `agentCardUrl` | yes | URL of the remote Agent Card (`/.well-known/agent-card.json`) or the JSON-RPC root the peer accepts. |
455
+ | `headers` | no | Flat string→string HTTP headers sent on every A2A request — typically `Authorization`. Each value is capped at 8 KB. |
456
+ | `contextId` | no | A2A `contextId` to thread multiple delegations into the same remote conversation. Omit for fresh per-call context. |
442
457
 
443
458
  > **Secret handling.** `headers` are forwarded **as-is** by the SDK API. If
444
459
  > you need long-lived credentials (refresh tokens, rotating API keys),
@@ -476,30 +491,30 @@ Per-run lifecycle:
476
491
  5. **Continuation (MANTYX).** MANTYX feeds the reply back into the model
477
492
  loop as the tool result.
478
493
 
479
- | Field | Required | Notes |
480
- | --------------- | -------- | ----- |
481
- | `kind` | yes | Discriminator literal `"a2a_local"`. |
482
- | `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
483
- | `description` | no | Model-facing description override. When omitted, MANTYX synthesizes one from `agentCard.name`, `agentCard.description`, and the first 12 skills. |
484
- | `agentCard` | yes | The resolved A2A Agent Card (JSON content). Schema follows the [A2A Agent Card spec](https://google.github.io/A2A/specification/#agent-card) — passthrough for unknown fields, so any spec-compliant card works. See the *Agent Card shape* table below for the fields MANTYX actually reads. |
494
+ | Field | Required | Notes |
495
+ | ------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
496
+ | `kind` | yes | Discriminator literal `"a2a_local"`. |
497
+ | `name` | yes | Tool name surfaced to the model — must match `/^[a-zA-Z0-9_]{1,64}$/`. |
498
+ | `description` | no | Model-facing description override. When omitted, MANTYX synthesizes one from `agentCard.name`, `agentCard.description`, and the first 12 skills. |
499
+ | `agentCard` | yes | The resolved A2A Agent Card (JSON content). Schema follows the [A2A Agent Card spec](https://google.github.io/A2A/specification/#agent-card) — passthrough for unknown fields, so any spec-compliant card works. See the _Agent Card shape_ table below for the fields MANTYX actually reads. |
485
500
 
486
501
  **Agent Card shape** (only the fields MANTYX inspects; everything else is
487
502
  forwarded verbatim back to the SDK):
488
503
 
489
- | Card field | Used by MANTYX | Notes |
490
- | --------------------- | -------------- | ----- |
491
- | `protocolVersion` | echo only | A2A protocol version (e.g. `"0.3.0"`). |
492
- | `name` | description | Used when synthesizing the tool description (`"Delegate a task to the <name> agent ..."`). |
493
- | `description` | description | One-paragraph summary of what the peer does — surfaced to the model. |
494
- | `url` | echo only | Peer's A2A endpoint. Forwarded back to the SDK in the `local_tool_call` event so the SDK can dispatch by URL. Never fetched server-side. |
495
- | `version` | echo only | Peer agent version. |
496
- | `provider` | echo only | Vendor info. |
497
- | `capabilities` | echo only | A2A capability flags (streaming, push notifications, …). |
498
- | `defaultInputModes` | echo only | Modalities the peer accepts. |
499
- | `defaultOutputModes` | echo only | Modalities the peer returns. |
500
- | `skills[]` | description | First 12 skills (`name`, `description`) are bulleted into the tool description so the model knows what to ask for. |
501
- | `securitySchemes`, `security` | echo only | Forwarded to the SDK; MANTYX does no auth. |
502
- | *anything else* | echo only | Passthrough — survives round-trip unchanged. |
504
+ | Card field | Used by MANTYX | Notes |
505
+ | ----------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
506
+ | `protocolVersion` | echo only | A2A protocol version (e.g. `"0.3.0"`). |
507
+ | `name` | description | Used when synthesizing the tool description (`"Delegate a task to the <name> agent ..."`). |
508
+ | `description` | description | One-paragraph summary of what the peer does — surfaced to the model. |
509
+ | `url` | echo only | Peer's A2A endpoint. Forwarded back to the SDK in the `local_tool_call` event so the SDK can dispatch by URL. Never fetched server-side. |
510
+ | `version` | echo only | Peer agent version. |
511
+ | `provider` | echo only | Vendor info. |
512
+ | `capabilities` | echo only | A2A capability flags (streaming, push notifications, …). |
513
+ | `defaultInputModes` | echo only | Modalities the peer accepts. |
514
+ | `defaultOutputModes` | echo only | Modalities the peer returns. |
515
+ | `skills[]` | description | First 12 skills (`name`, `description`) are bulleted into the tool description so the model knows what to ask for. |
516
+ | `securitySchemes`, `security` | echo only | Forwarded to the SDK; MANTYX does no auth. |
517
+ | _anything else_ | echo only | Passthrough — survives round-trip unchanged. |
503
518
 
504
519
  Local A2A respects the same `localToolTimeoutMs` budget (default 5 minutes)
505
520
  as `kind: "local"`. Tool-result POSTs after timeout return `409 run_terminal`.
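As a rough illustration of how the "description" card fields in the table above could feed a tool description, here is a minimal sketch. The function name `sketchToolDescription`, the `AgentCardSubset` type, and the exact line layout are assumptions for illustration; the spec only states the opening phrase and the 12-skill cap, not the server's actual wording.

```typescript
// Hypothetical subset of the Agent Card: only the fields marked "description".
interface AgentSkill { name: string; description?: string }
interface AgentCardSubset { name: string; description?: string; skills?: AgentSkill[] }

const MAX_SKILLS = 12; // the table states that the first 12 skills are used

// Illustrative only: the real server-side formatting is not specified here.
function sketchToolDescription(card: AgentCardSubset): string {
  const lines = [`Delegate a task to the ${card.name} agent.`];
  if (card.description) lines.push(card.description);
  for (const skill of (card.skills ?? []).slice(0, MAX_SKILLS)) {
    lines.push(skill.description ? `- ${skill.name}: ${skill.description}` : `- ${skill.name}`);
  }
  return lines.join("\n");
}
```

Skills beyond the twelfth are simply left out of the description in this sketch; they still travel with the card because every field is echoed back unchanged.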
@@ -510,25 +525,25 @@ as `kind: "local"`. Tool-result POSTs after timeout return `409 run_terminal`.
510
525
  expose every tool published by an MCP server to the agent loop in one go.
511
526
  Like A2A, the protocol distinguishes by **where the server lives**:
512
527
 
513
- - `kind: "mcp"` — *remote* MCP (Streamable HTTP). MANTYX has network access
528
+ - `kind: "mcp"` — _remote_ MCP (Streamable HTTP). MANTYX has network access
514
529
  to the server, dials it, lists the catalog at run start, and proxies each
515
530
  call server-side. **MANTYX prefixes every discovered tool name with the
516
531
  ref's `name`** (e.g. `github_search_repos`) so multiple MCP servers
517
532
  can coexist without colliding.
518
- - `kind: "mcp_local"` — *local* MCP (stdio, on-device, intranet). MANTYX
533
+ - `kind: "mcp_local"` — _local_ MCP (stdio, on-device, intranet). MANTYX
519
534
  has **no** access to the server; the SDK does discovery, validation, and
520
535
  execution. The SDK declares the tool catalog with **the exact names it
521
536
  wants the model to see** — MANTYX does not auto-prefix.
522
537
 
523
538
  #### `kind: "mcp"` — remote MCP
524
539
 
525
- | Field | Required | Notes |
526
- | -------------- | -------- | ----- |
527
- | `kind` | yes | Discriminator literal `"mcp"`. |
528
- | `name` | yes | Server label — MANTYX prefixes every discovered tool name as `<name>_<tool>`. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
529
- | `url` | yes | Streamable HTTP MCP endpoint. |
530
- | `headers` | no | Flat string→string HTTP headers (e.g. `Authorization`). Each value capped at 8 KB. |
531
- | `toolFilter` | no | Allowlist of MCP tool names (un-prefixed, as the server returns them). When set, tools not in the list are silently dropped. When omitted, every published tool is exposed. |
540
+ | Field | Required | Notes |
541
+ | ------------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
542
+ | `kind` | yes | Discriminator literal `"mcp"`. |
543
+ | `name` | yes | Server label — MANTYX prefixes every discovered tool name as `<name>_<tool>`. Must match `/^[a-zA-Z0-9_]{1,64}$/`. |
544
+ | `url` | yes | Streamable HTTP MCP endpoint. |
545
+ | `headers` | no | Flat string→string HTTP headers (e.g. `Authorization`). Each value capped at 8 KB. |
546
+ | `toolFilter` | no | Allowlist of MCP tool names (un-prefixed, as the server returns them). When set, tools not in the list are silently dropped. When omitted, every published tool is exposed. |
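The naming and filtering rules in the table can be sketched as a small pure function. This is an illustration of the documented behaviour, not MANTYX's implementation; `exposedToolNames` is a hypothetical helper name.

```typescript
// Regex for the ref's `name`, taken from the table above.
const SERVER_NAME_RE = /^[a-zA-Z0-9_]{1,64}$/;

// Hypothetical helper: given the server label, the un-prefixed tool names the
// MCP server published, and an optional allowlist, return the model-facing names.
function exposedToolNames(serverName: string, published: string[], toolFilter?: string[]): string[] {
  if (!SERVER_NAME_RE.test(serverName)) {
    throw new Error(`invalid MCP server name: ${serverName}`);
  }
  const allowed = toolFilter ? new Set(toolFilter) : undefined;
  return published
    .filter((tool) => allowed === undefined || allowed.has(tool)) // drop tools not in the allowlist
    .map((tool) => `${serverName}_${tool}`);                      // `<name>_<tool>` prefixing
}
```

For example, `exposedToolNames("github", ["search_repos", "create_issue"], ["search_repos"])` yields only `github_search_repos`.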
532
547
 
533
548
  If the MCP server is unreachable when the run starts, MANTYX still exposes
534
549
  a single stub tool named `<server>_unavailable` so the model can report the
@@ -566,16 +581,17 @@ Per-run lifecycle:
566
581
  "type": "local_tool_call",
567
582
  "data": {
568
583
  "toolUseId": "tu_x",
569
- "name": "fs_read_file", // SDK-declared name; same string the model called
584
+ "name": "fs_read_file", // SDK-declared name; same string the model called
570
585
  "args": { "path": "/etc/hosts" },
571
586
  "kind": "mcp_local",
572
- "mcpServer": "fs", // the SDK-side label from the ref's `name`
587
+ "mcpServer": "fs", // the SDK-side label from the ref's `name`
573
588
  "mcpToolName": "fs_read_file", // duplicates `name` for the SDK's convenience
574
- "mcpServerInfo": { // present iff the ref carried `serverInfo`
589
+ "mcpServerInfo": {
590
+ // present iff the ref carried `serverInfo`
575
591
  "name": "mcp-server-filesystem",
576
- "version": "0.4.1"
577
- }
578
- }
592
+ "version": "0.4.1",
593
+ },
594
+ },
579
595
  }
580
596
  ```
581
597
 
@@ -587,12 +603,12 @@ Per-run lifecycle:
587
603
  updated `mcp_local` ref inside `POST /agent-sessions/:id/messages`'s
588
604
  `tools` field; the catalog snapshot lives on the run, not the session.
589
605
 
590
- | Field | Required | Notes |
591
- | -------------- | -------- | ----- |
592
- | `kind` | yes | Discriminator literal `"mcp_local"`. |
593
- | `name` | yes | SDK-side server label (e.g. `"fs"`, `"jira"`). Echoed back unchanged as `mcpServer` on every `local_tool_call`. **Not used to prefix tool names.** Match `/^[a-zA-Z0-9_]{1,64}$/`. |
594
- | `serverInfo` | no | The MCP `Implementation` block the SDK got from `Initialize` (`{ name, version? }`, plus any extra fields the server returned). Forwarded to the SDK in `local_tool_call.mcpServerInfo` for observability; not used to drive behavior. |
595
- | `tools` | yes | Verbatim MCP `tools/list` output (1–64 entries). Each item is the standard MCP `Tool` shape: `{ name, description?, inputSchema?, annotations?, … }`. `name` is the model-facing tool name (SDK owns naming). `inputSchema` is the MCP-spec JSON Schema for the tool's arguments — used to constrain the LLM's tool call. Empty `inputSchema` means a no-arg tool. |
606
+ | Field | Required | Notes |
607
+ | ------------ | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
608
+ | `kind` | yes | Discriminator literal `"mcp_local"`. |
609
+ | `name` | yes | SDK-side server label (e.g. `"fs"`, `"jira"`). Echoed back unchanged as `mcpServer` on every `local_tool_call`. **Not used to prefix tool names.** Match `/^[a-zA-Z0-9_]{1,64}$/`. |
610
+ | `serverInfo` | no | The MCP `Implementation` block the SDK got from `Initialize` (`{ name, version? }`, plus any extra fields the server returned). Forwarded to the SDK in `local_tool_call.mcpServerInfo` for observability; not used to drive behavior. |
611
+ | `tools` | yes | Verbatim MCP `tools/list` output (1–64 entries). Each item is the standard MCP `Tool` shape: `{ name, description?, inputSchema?, annotations?, … }`. `name` is the model-facing tool name (SDK owns naming). `inputSchema` is the MCP-spec JSON Schema for the tool's arguments — used to constrain the LLM's tool call. Empty `inputSchema` means a no-arg tool. |
596
612
 
597
613
  Older SDKs that ignore the `kind` discriminator still see a normal
598
614
  `local_tool_call` and can match on `name` alone.
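A client-side dispatcher that uses the `mcp_local` fields when present and falls back to matching on `name` alone might look like the sketch below. The `Registry` shape and `resolveHandler` are hypothetical names chosen for illustration; only the event fields come from the `local_tool_call` payload shown earlier.

```typescript
// Fields of the `local_tool_call` data payload that matter for dispatch.
interface LocalToolCall {
  toolUseId: string;
  name: string;
  args: Record<string, unknown>;
  kind?: string;
  mcpServer?: string;
  mcpToolName?: string;
}
type Handler = (args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical SDK-side registry.
interface Registry {
  byName: Map<string, Handler>;                   // plain local tools
  byMcpServer: Map<string, Map<string, Handler>>; // mcp_local tools, per server label
}

function resolveHandler(call: LocalToolCall, registry: Registry): Handler | undefined {
  if (call.kind === "mcp_local" && call.mcpServer !== undefined) {
    const serverTools = registry.byMcpServer.get(call.mcpServer);
    const handler = serverTools?.get(call.mcpToolName ?? call.name);
    if (handler) return handler;
  }
  // Fallback path: match on `name` alone, as an SDK unaware of `kind` would.
  return registry.byName.get(call.name);
}
```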
@@ -612,10 +628,10 @@ provider:
612
628
 
613
629
  Two equivalent input shapes are accepted:
614
630
 
615
- | Form | Values | Notes |
616
- | ----------- | ------------------------------------- | ----- |
617
- | **String** | `"off"`, `"low"`, `"medium"`, `"high"` | Snaps to the same anchors the web composer uses (Fast=30, Moderate=50, Smart=80; off=0). |
618
- | **Number** | integer `0`–`100` | Pass-through to `RunAgentOptions.reasoningLevel`. `0` explicitly disables provider thinking even on reasoning models. |
631
+ | Form | Values | Notes |
632
+ | ---------- | -------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
633
+ | **String** | `"off"`, `"low"`, `"medium"`, `"high"` | Snaps to the same anchors the web composer uses (Fast=30, Moderate=50, Smart=80; off=0). |
634
+ | **Number** | integer `0`–`100` | Pass-through to `RunAgentOptions.reasoningLevel`. `0` explicitly disables provider thinking even on reasoning models. |
619
635
 
620
636
  When omitted, MANTYX falls back to the agent's default — for ephemeral
621
637
  specs, that means thinking is off; for `agentId`-backed specs, it follows
@@ -649,29 +665,29 @@ reply directly into downstream code without LLM-flavoured prose to parse out.
649
665
  }
650
666
  ```
651
667
 
652
- | Field | Required | Notes |
653
- | -------- | -------- | ----- |
654
- | `name` | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. Must match `/^[a-zA-Z0-9_-]{1,64}$/`. |
668
+ | Field | Required | Notes |
669
+ | -------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
670
+ | `name` | no | Stable identifier passed to providers (OpenAI `text.format.name`, Anthropic synthetic-tool name). Defaults to `"output"`. Must match `/^[a-zA-Z0-9_-]{1,64}$/`. |
655
671
  | `schema` | yes | JSON Schema describing the final assistant text. Root must be a JSON **object** (most providers reject array / scalar roots in structured-output mode). The schema is passed through verbatim — MANTYX does not validate its contents; the provider does. |
656
672
 
657
673
  Validation (server-side, `400 invalid_request` on violation):
658
674
 
659
- | Constraint | Limit |
660
- | ----------------------------------- | ----- |
661
- | Serialized JSON size of `outputSchema` | ≤ 32 KB |
662
- | `name` regex | `/^[a-zA-Z0-9_-]{1,64}$/` |
663
- | `schema` shape | non-`null`, non-array JSON object |
675
+ | Constraint | Limit |
676
+ | -------------------------------------- | --------------------------------- |
677
+ | Serialized JSON size of `outputSchema` | ≤ 32 KB |
678
+ | `name` regex | `/^[a-zA-Z0-9_-]{1,64}$/` |
679
+ | `schema` shape | non-`null`, non-array JSON object |
664
680
 
665
681
  **Per-provider behaviour** (mirrors the SDK's `RunAgentOptions.finalResponseSchema`):
666
682
 
667
- | Provider | How the schema is enforced |
668
- | ------------------------------ | -------------------------- |
669
- | OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every turn (works alongside tool calls). |
670
- | Gemini 3+ (any turn) | `responseMimeType: "application/json"` + `responseJsonSchema` on every `completeTurn`. Gemini 3 accepts the schema alongside `functionDeclarations`. |
671
- | Gemini ≤ 2.5 (no-tools turn) | `responseMimeType: "application/json"` + `responseJsonSchema`. |
672
- | Gemini ≤ 2.5 (with tools) | Synthetic `set_model_response` function declaration is injected; its `parametersJsonSchema` is the supplied schema. The system instruction is augmented to direct the model to call this tool with the final answer. The engine intercepts the call, hides it from the SDK, and surfaces the call's arguments as the assistant text (JSON-stringified). Sidesteps the API rejection ("Function calling with a response mime type: 'application/json' is unsupported") without round-tripping a 4xx. |
673
- | Anthropic / Bedrock-Anthropic | Synthetic `final_report` tool whose `input_schema` is the supplied schema; `tool_choice` is forced on the no-tools finishing turn. The tool's input is surfaced as the assistant text. |
674
- | xAI Grok, others | Ignored (the model returns plain text). |
683
+ | Provider | How the schema is enforced |
684
+ | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
685
+ | OpenAI Responses (o-series, GPT-5.x, …) | `text.format = { type: "json_schema", strict: true, name, schema }` on every turn (works alongside tool calls). |
686
+ | Gemini 3+ (any turn) | `responseMimeType: "application/json"` + `responseJsonSchema` on every `completeTurn`. Gemini 3 accepts the schema alongside `functionDeclarations`. |
687
+ | Gemini ≤ 2.5 (no-tools turn) | `responseMimeType: "application/json"` + `responseJsonSchema`. |
688
+ | Gemini ≤ 2.5 (with tools) | Synthetic `set_model_response` function declaration is injected; its `parametersJsonSchema` is the supplied schema. The system instruction is augmented to direct the model to call this tool with the final answer. The engine intercepts the call, hides it from the SDK, and surfaces the call's arguments as the assistant text (JSON-stringified). Sidesteps the API rejection ("Function calling with a response mime type: 'application/json' is unsupported") without round-tripping a 4xx. |
689
+ | Anthropic / Bedrock-Anthropic | Synthetic `final_report` tool whose `input_schema` is the supplied schema; `tool_choice` is forced on the no-tools finishing turn. The tool's input is surfaced as the assistant text. |
690
+ | xAI Grok, others | Ignored (the model returns plain text). |
675
691
 
676
692
  The synthetic-tool paths (Gemini 2.5 + tools, Anthropic) are entirely
677
693
  internal: the SDK never receives a `local_tool_call` for
@@ -727,17 +743,17 @@ The wire shape also accepts the literal `false`:
727
743
  "loopDetection": false // explicitly disable the guard for this run
728
744
  ```
729
745
 
730
- | Field | Type | Required | Notes |
731
- | ---------------------- | --------------- | -------- | ----- |
732
- | `consecutiveThreshold` | integer ≥ 2 | no | Defaults to **3** when the field is omitted. Must be `>= 2` (one identical batch is just a single tool call, not a loop). |
746
+ | Field | Type | Required | Notes |
747
+ | ---------------------- | --------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
748
+ | `consecutiveThreshold` | integer ≥ 2 | no | Defaults to **3** when the field is omitted. Must be `>= 2` (one identical batch is just a single tool call, not a loop). |
733
749
  | `hardCutoffThreshold` | integer ≥ 3 | no | Defaults to **6** when the field is omitted. Must be `> consecutiveThreshold`; otherwise the soft nudge would never get a chance to land. |
734
- | (top-level `false`) | literal `false` | no | Disables the guard entirely for this run. The pipeline still enforces `budgets.maxToolTurns`. |
750
+ | (top-level `false`) | literal `false` | no | Disables the guard entirely for this run. The pipeline still enforces `budgets.maxToolTurns`. |
735
751
 
736
752
  Validation (server-side, `400 invalid_request` on violation):
737
753
 
738
- | Constraint | Limit |
739
- | -------------------------------------------------- | ----- |
740
- | `consecutiveThreshold` / `hardCutoffThreshold` upper bound | `100` |
754
+ | Constraint | Limit |
755
+ | ------------------------------------------------------------------ | -------- |
756
+ | `consecutiveThreshold` / `hardCutoffThreshold` upper bound | `100` |
741
757
  | `hardCutoffThreshold` strictly greater than `consecutiveThreshold` | enforced |
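A client can mirror these checks before sending a request and so avoid a `400 invalid_request` round trip. The sketch below applies the documented defaults and limits; `resolveLoopDetection` and its return shape are hypothetical, and whether the server combines a partially specified object with defaults in exactly this way is an assumption.

```typescript
type LoopDetectionInput = false | { consecutiveThreshold?: number; hardCutoffThreshold?: number };
interface LoopDetectionResolved {
  enabled: boolean;
  consecutiveThreshold?: number;
  hardCutoffThreshold?: number;
}

const UPPER_BOUND = 100; // documented upper bound for both thresholds

function assertIntInRange(name: string, value: number, min: number): void {
  if (!Number.isInteger(value) || value < min || value > UPPER_BOUND) {
    throw new Error(`${name} must be an integer between ${min} and ${UPPER_BOUND}`);
  }
}

// Hypothetical client-side pre-check mirroring the documented constraints.
function resolveLoopDetection(input?: LoopDetectionInput): LoopDetectionResolved {
  if (input === false) return { enabled: false }; // guard disabled for this run
  const consecutiveThreshold = input?.consecutiveThreshold ?? 3;
  const hardCutoffThreshold = input?.hardCutoffThreshold ?? 6;
  assertIntInRange("consecutiveThreshold", consecutiveThreshold, 2);
  assertIntInRange("hardCutoffThreshold", hardCutoffThreshold, 3);
  if (hardCutoffThreshold <= consecutiveThreshold) {
    throw new Error("hardCutoffThreshold must be strictly greater than consecutiveThreshold");
  }
  return { enabled: true, consecutiveThreshold, hardCutoffThreshold };
}
```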
742
758
 
743
759
  **Defaults.** When `loopDetection` is omitted entirely, MANTYX applies the
@@ -776,31 +792,31 @@ tool result.
776
792
  }
777
793
  ```
778
794
 
779
- | Field | Type | Required | Notes |
780
- | ---------- | ----------- | -------- | ----- |
781
- | `<key>` | string | yes | Logical tool name as the model sees it (the same name on `ResolvedTool.name`; the SDK + pipeline handle sanitisation). 1–120 characters. |
795
+ | Field | Type | Required | Notes |
796
+ | ---------- | ----------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
797
+ | `<key>` | string | yes | Logical tool name as the model sees it (the same name on `ResolvedTool.name`; the SDK + pipeline handle sanitisation). 1–120 characters. |
782
798
  | `maxCalls` | integer ≥ 0 | yes | Hard cap on executed calls per run. `0` disables the tool entirely (every attempt returns the synthetic body on the first try). Budgets are **per-tool, not pooled**: `hive_search_deals: { maxCalls: 5 }` and `hive_search_meetings: { maxCalls: 5 }` give the agent five of each, not five between them. |
783
799
 
784
800
  Validation (server-side, `400 invalid_request` on violation):
785
801
 
786
- | Constraint | Limit |
787
- | --------------------- | ----- |
788
- | Max entries | `32` |
789
- | `<key>` length | `1..120` chars |
802
+ | Constraint | Limit |
803
+ | ---------------------- | -------------------------------------------------------------------------- |
804
+ | Max entries | `32` |
805
+ | `<key>` length | `1..120` chars |
790
806
  | `maxCalls` upper bound | `1000` (functionally unlimited; the SDK's `maxToolTurns: 100` fires first) |
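The per-tool (not pooled) accounting can be illustrated with a small counter. This is a sketch of the documented semantics only; `createBudgetTracker` is a hypothetical name, and treating tools without an entry as unlimited is an assumption made for the example.

```typescript
type ToolBudgets = Record<string, { maxCalls: number }>;

// Returns a function that records one attempt for `tool` and reports whether
// that attempt is still within the tool's own budget.
function createBudgetTracker(budgets: ToolBudgets): (tool: string) => boolean {
  const attempts = new Map<string, number>();
  return (tool: string): boolean => {
    const callIndex = (attempts.get(tool) ?? 0) + 1;
    attempts.set(tool, callIndex);
    if (!(tool in budgets)) return true;         // assumption: no entry means no cap
    return callIndex <= budgets[tool].maxCalls;  // maxCalls = 0 rejects the first attempt
  };
}
```

With `{ recall: { maxCalls: 4 } }`, the fifth attempt on `recall` is the first to be rejected, which matches the `callIndex: 5` shown in the `tool_budget_exceeded` example event.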
791
807
 
792
808
  **Defaults.** When `toolBudgets` is omitted, MANTYX layers the runtime
793
809
  defaults from `runtime/default-run-guards.ts` on top of the spec. The
794
810
  default research-tool surface is:
795
811
 
796
- | Tool | Default `maxCalls` |
797
- | ------------------------------------------------------------------------------------------------ | ------------------ |
798
- | `recall` (workspace memory hybrid search) | `4` |
799
- | `traverse` (memory graph BFS) | `3` |
800
- | `hive_consult_ontology` (per-hive ontology read; same name across all three hives) | `4` |
801
- | `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search) | `5` |
802
- | `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search) | `5` |
803
- | `hive_search_releases` / `_issues` (Product Hive general search) | `5` |
812
+ | Tool | Default `maxCalls` |
813
+ | ---------------------------------------------------------------------------------------- | ------------------ |
814
+ | `recall` (workspace memory hybrid search) | `4` |
815
+ | `traverse` (memory graph BFS) | `3` |
816
+ | `hive_consult_ontology` (per-hive ontology read; same name across all three hives) | `4` |
817
+ | `hive_search_deals` / `_meetings` / `_companies` / `_people` (Sales Hive general search) | `5` |
818
+ | `hive_search_tickets` / `_conversations` / `_accounts` (Customer Hive general search) | `5` |
819
+ | `hive_search_releases` / `_issues` (Product Hive general search) | `5` |
804
820
 
805
821
  Pass `"toolBudgets": {}` to start from a clean slate (no defaults applied
806
822
  on top — useful for runs that intentionally want unbounded research). When
@@ -838,12 +854,12 @@ prompt.
838
854
 
839
855
  Validation (server-side, `400 invalid_request` on violation):
840
856
 
841
- | Constraint | Limit |
842
- | ------------------------- | ---------------------------------- |
843
- | Max entries | 16 |
844
- | Key pattern | `^[A-Za-z0-9._-]{1,64}$` |
845
- | Value type / length | string ≤ 256 chars |
846
- | Serialized JSON size | ≤ 4 KB |
857
+ | Constraint | Limit |
858
+ | -------------------- | ------------------------ |
859
+ | Max entries | 16 |
860
+ | Key pattern | `^[A-Za-z0-9._-]{1,64}$` |
861
+ | Value type / length | string ≤ 256 chars |
862
+ | Serialized JSON size | ≤ 4 KB |
847
863
 
848
864
  For session-scoped runs the inheritance rules are:
849
865
 
@@ -872,13 +888,18 @@ POST /api/v1/workspaces/{slug}/agent-runs/{runId}/cancel
872
888
  `POST /agent-runs` returns `202 Accepted` immediately:
873
889
 
874
890
  ```json
875
- { "runId": "run_abc", "streamUrl": "/api/v1/workspaces/acme/agent-runs/run_abc/stream" }
891
+ {
892
+ "runId": "run_abc",
893
+ "streamUrl": "/api/v1/workspaces/acme/agent-runs/run_abc/stream"
894
+ }
876
895
  ```
877
896
 
878
897
  `GET .../stream` is the canonical event channel; see §7.
879
898
 
880
899
  `GET /agent-runs/{runId}` returns the run snapshot (status, final text, error,
881
- spec) without subscribing to live events. Useful for polling long runs.
900
+ spec, plus the cost-attribution triple `tokens` / `turns` / `model`;
901
+ see §7.1) without subscribing to live events. Useful for polling long
902
+ runs or attributing spend after the SSE stream was already consumed.
882
903
 
883
904
  ## 6. Sessions
884
905
 
@@ -903,13 +924,15 @@ and returns `{ runId, streamUrl }` just like a one-shot run. Body:
903
924
  ```jsonc
904
925
  {
905
926
  "prompt": "What's in /etc/hosts?",
906
- "tools": [/* optional refresh of tool definitions */]
927
+ "tools": [
928
+ /* optional refresh of tool definitions */
929
+ ],
907
930
  }
908
931
  ```
909
932
 
910
933
  The server prepends the session's prior messages, runs the model, and on
911
934
  success appends the new user/assistant turns back to the session row. Local
912
- tool **handlers** are *not* persisted: the session stores definitions
935
+ tool **handlers** are _not_ persisted: the session stores definitions
913
936
  (name, schema, description) so that a restarted SDK can re-bind handlers and
914
937
  keep going.
915
938
 
@@ -979,8 +1002,24 @@ data: <utf-8 JSON>
979
1002
  { "seq": 7, "type": "tool_budget_exceeded", "data": { "tool": "recall", "maxCalls": 4, "callIndex": 5 } }
980
1003
 
981
1004
  // terminal event
982
- { "seq": 8, "type": "result", "data": { "subtype": "success", "text": "Final reply" } }
983
- { "seq": 8, "type": "result", "data": { "subtype": "error_local_tool_timeout", "error": "..." } }
1005
+ // Every terminal `result` event also carries `tokens`, `turns`, and `model`
1006
+ // for cost attribution and dashboards (see §7.1). Older platforms
1007
+ // (pre-2026-09) omit these fields; SDK clients detect "no usage data" by
1008
+ // checking that `model.provider` is empty / falsy.
1009
+ { "seq": 8, "type": "result", "data": {
1010
+ "subtype": "success",
1011
+ "text": "Final reply",
1012
+ "tokens": { "inputTokens": 1283, "cachedTokens": 512, "reasoningTokens": 96, "outputTokens": 240 },
1013
+ "turns": 3,
1014
+ "model": { "id": "platform:demo", "provider": "openai", "vendorModelId": "gpt-5.4-mini", "reasoningEffort": "low" }
1015
+ } }
1016
+ { "seq": 8, "type": "result", "data": {
1017
+ "subtype": "error_local_tool_timeout",
1018
+ "error": "...",
1019
+ "tokens": { "inputTokens": 980, "cachedTokens": 0, "reasoningTokens": 0, "outputTokens": 14 },
1020
+ "turns": 2,
1021
+ "model": { "id": "platform:demo", "provider": "anthropic", "vendorModelId": "claude-opus-4-7" }
1022
+ } }
984
1023
  { "seq": 8, "type": "cancelled", "data": {} }
985
1024
  ```
986
1025
 
@@ -991,6 +1030,117 @@ field and the parsed `type` inside `data` — they are always equal, but
991
1030
  implementations should rely on `data.type` because some HTTP middleware
992
1031
  strips the `event:` line.
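A frame parser that follows this advice reads only the `data:` lines and takes the event type from the parsed JSON. The sketch below assumes each frame's payload is a JSON object shaped like the examples above (`seq`, `type`, `data`); `parseSseFrame` is a hypothetical helper, not an SDK function.

```typescript
interface RunEvent { seq: number; type: string; data: unknown }

// Parses one SSE frame (the text between blank lines). The `event:` line is
// deliberately ignored so the parser still works when middleware strips it.
function parseSseFrame(frame: string): RunEvent | undefined {
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line.startsWith("data:")) {
      dataLines.push(line.slice(5).replace(/^ /, "")); // drop one optional leading space
    }
  }
  if (dataLines.length === 0) return undefined; // comment or keep-alive frame
  return JSON.parse(dataLines.join("\n")) as RunEvent;
}
```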
993
1032
 
1033
+ ### 7.1 Cost-attribution fields (`tokens`, `turns`, `model`)
1034
+
1035
+ Every terminal `result` SSE event (and every terminal `error` event on
1036
+ platforms that emit it — see `docs/wire-protocol.md` §4.7) carries three
1037
+ additional fields so callers can drive cost dashboards, per-turn budgets,
1038
+ and provider/model spend reports without a follow-up
1039
+ `GET /agent-runs/:runId` round trip. The same fields are persisted on the
1040
+ `EphemeralAgentRun` row and surfaced by that endpoint.
1041
+
1042
+ | Field | Type | Notes |
1043
+ | -------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
1044
+ | `tokens` | object | Per-run token totals aggregated across every model invocation. See schema below. |
1045
+ | `turns` | int | Total `engine.completeTurn(...)` invocations for the run. Counts the failing call too — so a single-shot run is `1`, a tool loop is `>= 2`, and a run that errored on its first model call is `1`. Distinct from "tool turns" — `turns` is **model invocations**, regardless of whether the model called any tools. |
1046
+ | `model` | object | Resolved model that actually executed the run. See schema below. |
1047
+
1048
+ Always present on the terminal event for runs created against
1049
+ **MANTYX ≥ 2026-09** servers. Older servers omit these fields entirely;
1050
+ SDK clients (TS/Go/Python) detect "no usage data" by checking that
1051
+ `model.provider` is empty / falsy. JSON keys follow MANTYX's standard
1052
+ camelCase wire convention.
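The detection rule described above fits in a one-line predicate. The sketch below is a minimal version; `ResultData` and `hasUsageData` are hypothetical names, and only the fields needed for the check are modelled.

```typescript
// Minimal view of a terminal `result` event's data payload.
interface ResultData {
  subtype: string;
  model?: { provider?: string };
}

// True only when the server stamped a non-empty `model.provider`.
function hasUsageData(data: ResultData): boolean {
  return Boolean(data.model?.provider);
}
```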
1053
+
1054
+ **`tokens` schema** — mirrors the wire shape produced by
1055
+ `tokenUsageToWireTokens` in `packages/ts-sdk/src/usage-wire.ts`:
1056
+
1057
+ | Field | Type | Notes |
1058
+ | ----------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
1059
+ | `inputTokens` | int | **Total billable input** — fresh prompt tokens **plus** the cached-read slice the provider still bills (at a discount) **plus** any cache-creation tokens **plus** tool-prompt tokens. Equal to the sum of every provider-reported input bucket for the run. |
1060
+ | `cachedTokens` | int | The discounted slice of `inputTokens` that came from a prompt cache hit (Anthropic prompt caching, OpenAI cached prompt, Gemini implicit cache). `0` when the provider doesn't report cache reads or the run didn't hit cache. |
1061
+ | `reasoningTokens` | int | Non-visible thinking tokens. **Already counted inside `outputTokens`** — surfaced separately so dashboards can break out "thinking cost" vs visible output. `0` when the model didn't reason or didn't report it. |
1062
+ | `outputTokens` | int | **All** tokens the model emitted for this run, visible + reasoning. Matches the provider's "completion tokens" / "output tokens" billing line. |
1063
+
1064
+ `inputTokens` and `outputTokens` together cover every billable token the
1065
+ run consumed; `cachedTokens` and `reasoningTokens` are diagnostic
1066
+ breakdowns _inside_ those two totals (not separate buckets to be added).
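Because the two breakdown fields sit inside the totals, derived figures come from subtraction rather than addition. The following sketch shows the arithmetic; `tokenBreakdown` and its output field names are illustrative choices, not part of the wire format.

```typescript
interface WireTokens {
  inputTokens: number;
  cachedTokens: number;
  reasoningTokens: number;
  outputTokens: number;
}

function tokenBreakdown(t: WireTokens) {
  return {
    nonCacheReadInput: t.inputTokens - t.cachedTokens, // input not served from a cache hit
    visibleOutput: t.outputTokens - t.reasoningTokens, // output excluding thinking tokens
    totalBillable: t.inputTokens + t.outputTokens,     // never add the two breakdown fields
  };
}
```

For the sample success event (`1283` / `512` / `96` / `240`), this gives 771 non-cache-read input tokens, 144 visible output tokens, and 1523 billable tokens in total.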
1067
+
1068
+ **`model` schema** — fields the platform stamps onto every successful
1069
+ or failed run:
1070
+
1071
+ | Field | Type | Notes |
1072
+ | ----------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
1073
+ | `id` | string | Catalog id — the same string a caller would pass back as `modelId` to re-select this exact entry (e.g. `"platform:demo"`, `"provider:cmf…"`). Empty string against legacy fallbacks that didn't synthesise a catalog id. |
1074
+ | `provider` | string | Lowercase provider id: `"openai"`, `"anthropic"`, `"google"`, `"azure-openai"`. |
1075
+ | `vendorModelId` | string | The model id the platform actually sent to the provider (e.g. `"gpt-5.4-mini"`, `"claude-opus-4-7"`, `"gemini-2.5-pro"`). Carried through from the `model` field on `AgentSpec` after resolution. |
1076
+ | `reasoningEffort` | string | Optional. `"off"`, `"low"`, `"medium"`, `"high"`. Omitted when the provider doesn't expose a reasoning-level knob or the run didn't request one. |
1077
+
1078
+ **Per-provider token mapping.** Provider responses vary in how they
1079
+ report token usage. MANTYX normalises them into the wire shape above as
1080
+ follows:
1081
+
1082
+ | Provider | `inputTokens` ← | `cachedTokens` ← | `reasoningTokens` ← | `outputTokens` ← |
1083
+ | --------- | ----------------------------------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
1084
+ | OpenAI | `usage.prompt_tokens` (already includes cached read tokens) | `usage.prompt_tokens_details.cached_tokens` | `usage.completion_tokens_details.reasoning_tokens` | `usage.completion_tokens` |
1085
+ | Anthropic | `usage.input_tokens` + `usage.cache_read_input_tokens` + `usage.cache_creation_input_tokens` | `usage.cache_read_input_tokens` | (extended-thinking tokens; folded into `output_tokens` by the provider) | `usage.output_tokens` |
1086
+ | Google | `usageMetadata.promptTokenCount` + `usageMetadata.cachedContentTokenCount` + tool-prompt tokens | `usageMetadata.cachedContentTokenCount` | `usageMetadata.thoughtsTokenCount` | `usageMetadata.candidatesTokenCount` (or `totalTokenCount - promptTokenCount` for older Gemini SDKs) |
1087
+
1088
+ If a provider doesn't report a given bucket the corresponding field is
1089
+ `0`, never `null`.
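To make the table concrete, the sketch below applies the OpenAI and Anthropic rows to provider usage objects. It is not the `tokenUsageToWireTokens` implementation; `fromOpenAiUsage` and `fromAnthropicUsage` are hypothetical helpers, and reporting `reasoningTokens: 0` for Anthropic is an assumption based on the table's note that thinking tokens are folded into `output_tokens`.

```typescript
interface WireTokens { inputTokens: number; cachedTokens: number; reasoningTokens: number; outputTokens: number }

interface OpenAiUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}
interface AnthropicUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

function fromOpenAiUsage(u: OpenAiUsage): WireTokens {
  return {
    inputTokens: u.prompt_tokens ?? 0, // already includes cached reads
    cachedTokens: u.prompt_tokens_details?.cached_tokens ?? 0,
    reasoningTokens: u.completion_tokens_details?.reasoning_tokens ?? 0,
    outputTokens: u.completion_tokens ?? 0,
  };
}

function fromAnthropicUsage(u: AnthropicUsage): WireTokens {
  const cacheRead = u.cache_read_input_tokens ?? 0;
  return {
    inputTokens: (u.input_tokens ?? 0) + cacheRead + (u.cache_creation_input_tokens ?? 0),
    cachedTokens: cacheRead,
    reasoningTokens: 0, // assumption: not broken out separately in this sketch
    outputTokens: u.output_tokens ?? 0,
  };
}
```

Missing buckets fall back to `0` via `??`, in line with the rule that absent values are reported as `0`, never `null`.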
1090
+
1091
+ **Tool-loop accounting.** When the run executes tool turns, every
1092
+ `engine.completeTurn(...)` invocation contributes its usage to the
1093
+ aggregated `tokens` object — so a run with one tool round (model →
1094
+ tool → model) reports `turns: 2` and the **sum** of both model calls'
1095
+ token usage. The terminal event carries the cumulative totals; no
1096
+ per-turn breakdown is in the terminal event (use the
1097
+ `assistant_message` events for per-turn observability).
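The cumulative accounting amounts to a field-wise sum over model invocations, with `turns` equal to the number of invocations. A minimal sketch, with `accumulateUsage` as a hypothetical helper name:

```typescript
interface WireTokens { inputTokens: number; cachedTokens: number; reasoningTokens: number; outputTokens: number }

// Sums per-invocation usage into the totals a terminal event would carry.
function accumulateUsage(perTurn: WireTokens[]): { tokens: WireTokens; turns: number } {
  const tokens: WireTokens = { inputTokens: 0, cachedTokens: 0, reasoningTokens: 0, outputTokens: 0 };
  for (const t of perTurn) {
    tokens.inputTokens += t.inputTokens;
    tokens.cachedTokens += t.cachedTokens;
    tokens.reasoningTokens += t.reasoningTokens;
    tokens.outputTokens += t.outputTokens;
  }
  return { tokens, turns: perTurn.length };
}
```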
1098
+
1099
+ **Snapshot exposure.** `GET /api/v1/workspaces/{slug}/agent-runs/{runId}`
1100
+ also returns `tokens` / `turns` / `model` on the run snapshot JSON, with
1101
+ the same wire shape. The keys are always present (as `null` until the
1102
+ worker writes the terminal event, and on legacy rows pre-rollout) so
1103
+ SDK clients can probe server capability via `"tokens" in body` without
1104
+ triggering an undefined-vs-null distinction across HTTP/JSON
1105
+ serialization.
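The capability probe can be sketched as a three-way classification of the snapshot body. The state names and `snapshotUsageState` are hypothetical; the sketch only encodes the key-presence and `null` rules stated above.

```typescript
type SnapshotUsageState = "unsupported" | "not_recorded" | "available";

// `body` is the parsed JSON of GET /agent-runs/{runId}.
function snapshotUsageState(body: Record<string, unknown>): SnapshotUsageState {
  if (!("tokens" in body)) return "unsupported";       // server predates the fields
  if (body.tokens === null) return "not_recorded";     // run not terminal yet, or legacy row
  return "available";
}
```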
1106
+
1107
+ **A2A exposure.** The MANTYX-hosted A2A endpoint
1108
+ (`POST /api/a2a/{workspaceSlug}/agents/{agentSlug}`) returns the same
1109
+ triple on the JSON-RPC response under `result.metadata.mantyx`:
1110
+
1111
+ ```jsonc
1112
+ {
1113
+ "result": {
1114
+ "kind": "message",
1115
+ "messageId": "msg_abc",
1116
+ "role": "agent",
1117
+ "parts": [{ "kind": "text", "text": "Final reply" }],
1118
+ "metadata": {
1119
+ "mantyx": {
1120
+ "tokens": {
1121
+ "inputTokens": 1283,
1122
+ "cachedTokens": 512,
1123
+ "reasoningTokens": 96,
1124
+ "outputTokens": 240,
1125
+ },
1126
+ "turns": 3,
1127
+ "model": {
1128
+ "id": "platform:demo",
1129
+ "provider": "openai",
1130
+ "vendorModelId": "gpt-5.4-mini",
1131
+ "reasoningEffort": "low",
1132
+ },
1133
+ },
1134
+ },
1135
+ },
1136
+ }
1137
+ ```
1138
+
1139
+ The `metadata.mantyx` block is omitted entirely against legacy runners
1140
+ that haven't implemented `runWithUsage` on the A2A adapter (see
1141
+ `packages/ts-sdk/src/a2a/adapter.ts`); cross-platform A2A clients
1142
+ should treat its absence as "no usage data" rather than as zero usage.
1143
+
994
1144
  ## 8. Local tool result
995
1145
 
996
1146
  ```
@@ -1027,23 +1177,25 @@ All non-2xx responses use this body shape:
1027
1177
 
1028
1178
  ```jsonc
1029
1179
  {
1030
- "error": "invalid_model", // machine-readable code
1180
+ "error": "invalid_model", // machine-readable code
1031
1181
  "message": "Model 'foo' is ambiguous; pick one of: provider:cm6...",
1032
- "candidates": [/* sometimes present */]
1182
+ "candidates": [
1183
+ /* sometimes present */
1184
+ ],
1033
1185
  }
1034
1186
  ```
1035
1187
 
1036
1188
  Common codes:
1037
1189
 
1038
- | Code | HTTP | Notes |
1039
- | ---------------------- | ---: | ----- |
1040
- | `unauthorized` | 401 | Missing/invalid API key |
1041
- | `not_found` | 404 | Workspace, run, or session unknown |
1042
- | `invalid_request` | 400 | Body failed Zod validation |
1043
- | `invalid_model` | 400 | `modelId` couldn't be resolved |
1044
- | `unknown_tool_use` | 404 | Tool-result for an unknown `toolUseId` |
1045
- | `run_terminal` | 409 | Tool-result after run finished |
1046
- | `rate_limited` | 429 | Per-API-key sliding window |
1190
+ | Code | HTTP | Notes |
1191
+ | ------------------ | ---: | -------------------------------------- |
1192
+ | `unauthorized` | 401 | Missing/invalid API key |
1193
+ | `not_found` | 404 | Workspace, run, or session unknown |
1194
+ | `invalid_request` | 400 | Body failed Zod validation |
1195
+ | `invalid_model` | 400 | `modelId` couldn't be resolved |
1196
+ | `unknown_tool_use` | 404 | Tool-result for an unknown `toolUseId` |
1197
+ | `run_terminal` | 409 | Tool-result after run finished |
1198
+ | `rate_limited` | 429 | Per-API-key sliding window |
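
One way a client could act on these codes is sketched below. The `classifyError` helper and the `ErrorAction` values are hypothetical, and the mapping from code to action is an assumption for illustration; the protocol text only defines the codes and HTTP statuses.

```typescript
// Body shape used by all non-2xx responses, as documented above.
interface MantyxErrorBody {
  error: string; // machine-readable code
  message: string;
  candidates?: unknown[]; // sometimes present
}

// Hypothetical client-side actions; not part of the protocol.
type ErrorAction = "reauthenticate" | "fix_request" | "retry_later" | "give_up";

function classifyError(status: number, body: MantyxErrorBody): ErrorAction {
  switch (body.error) {
    case "unauthorized":
      return "reauthenticate";
    case "rate_limited":
      return "retry_later"; // assumption: back off, then retry
    case "invalid_request":
    case "invalid_model":
      return "fix_request"; // caller must change the request body
    case "not_found":
    case "unknown_tool_use":
    case "run_terminal":
      return "give_up"; // retrying the same call will not help
    default:
      // Unknown code: assume 5xx is transient and anything else is permanent.
      return status >= 500 ? "retry_later" : "give_up";
  }
}
```

Switching on the machine-readable `error` code rather than the HTTP status keeps `not_found` and `unknown_tool_use` distinguishable even though both map to 404.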
1047
1199
 
1048
1200
  ## 11. Suggested client architecture
1049
1201
 
@@ -1061,8 +1213,8 @@ A reference SDK should:
1061
1213
  model-side "don't double-call" hint without hand-editing the
1062
1214
  description.
1063
1215
  - **Local A2A peers** (`kind: "a2a_local"`) — caller-supplied A2A
1064
- clients. Resolve the peer's Agent Card *first* (e.g. `fetch
1065
- "<peer>/.well-known/agent-card.json"` or read from a local registry),
1216
+ clients. Resolve the peer's Agent Card _first_ (e.g. `fetch
1217
+ "<peer>/.well-known/agent-card.json"` or read from a local registry),
1066
1218
  attach it to the spec as `agentCard`, and in the dispatcher look the
1067
1219
  client up by `agentCard.url` (or any other field you indexed on)
1068
1220
  when the `local_tool_call` arrives.
@@ -1074,9 +1226,10 @@ A reference SDK should:
1074
1226
 
1075
1227
  `mantyx`, `mantyx_plugin`, `a2a`, and `mcp` refs are server-resolved —
1076
1228
  no SDK-side registry needed.
1229
+
1077
1230
  3. On `runAgent` / `session.send`:
1078
1231
  - Accept `reasoningLevel` from the caller and pass it through unchanged
1079
- (string `"off" | "low" | "medium" | "high"` *or* number `0–100`); do
1232
+ (string `"off" | "low" | "medium" | "high"` _or_ number `0–100`); do
1080
1233
  **not** translate to a vendor-specific knob — the server owns that
1081
1234
  mapping so all SDKs stay aligned with the web composer.
1082
1235
  - POST the run/message, get `{ runId, streamUrl }`.
@@ -1089,7 +1242,7 @@ A reference SDK should:
1089
1242
  them by default. Their presence depends on `reasoningLevel > 0` and
1090
1243
  on the active model exposing thought parts.
1091
1244
  - Accept `loopDetection` and `toolBudgets` from the caller and pass
1092
- them through unchanged (see §4.6 / §4.7). Both fields are *additive*:
1245
+ them through unchanged (see §4.6 / §4.7). Both fields are _additive_:
1093
1246
  omitting them keeps MANTYX's runtime defaults; passing
1094
1247
  `loopDetection: false` opts out; passing `toolBudgets: {}` clears the
1095
1248
  defaults; passing entries layers caller overrides on top of the
@@ -1116,4 +1269,4 @@ A reference SDK should:
1116
1269
 
1117
1270
  The npm package [`@mantyx/sdk`](https://www.npmjs.com/package/@mantyx/sdk) and the Go module
1118
1271
  [`github.com/mantyx/mantyx-go-sdk`](https://github.com/mantyx/mantyx-go-sdk) are reference implementations of this protocol
1119
- (maintained in the official **mantyx-sdk** repositories).
1272
+ (maintained in the official **mantyx-sdk** repositories).