@octavus/docs 2.16.0 → 2.17.0

@@ -96,6 +96,31 @@ return new Response(toSSEStream(events), {
96
96
  });
97
97
  ```
98
98
 
99
+ ### Computer Capabilities
100
+
101
+ Give agents access to browser, filesystem, and shell via MCP:
102
+
103
+ ```typescript
104
+ import { Computer } from '@octavus/computer';
105
+
106
+ const computer = new Computer({
107
+ mcpServers: {
108
+ browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),
109
+ filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),
110
+ shell: Computer.shell({ cwd: dir, mode: 'unrestricted' }),
111
+ },
112
+ });
113
+
114
+ await computer.start();
115
+
116
+ const session = client.agentSessions.attach(sessionId, {
117
+ tools: {
118
+ 'set-chat-title': async (args) => ({ title: args.title }),
119
+ },
120
+ computer,
121
+ });
122
+ ```
123
+
99
124
  ### Workers
100
125
 
101
126
  Execute worker agents for task-based processing:
@@ -239,3 +264,4 @@ The client uploads files directly to S3 using the presigned upload URL. See [Fil
239
264
  - [Streaming](/docs/server-sdk/streaming) — Understanding stream events
240
265
  - [Workers](/docs/server-sdk/workers) — Executing worker agents
241
266
  - [Debugging](/docs/server-sdk/debugging) — Model request tracing and debugging
267
+ - [Computer](/docs/server-sdk/computer) — Browser, filesystem, and shell via MCP
@@ -87,9 +87,20 @@ const session = client.agentSessions.attach(sessionId, {
87
87
  resources: [
88
88
  // Resource watchers (optional)
89
89
  ],
90
+ computer: computer, // Computer capabilities (optional, see Computer documentation)
90
91
  });
91
92
  ```
92
93
 
94
+ ### Attach Options
95
+
96
+ | Option | Type | Description |
97
+ | ----------- | -------------- | ---------------------------------------------------------- |
98
+ | `tools` | `ToolHandlers` | Server-side tool handler functions |
99
+ | `resources` | `Resource[]` | Resource watchers for real-time updates |
100
+ | `computer` | `ToolProvider` | Computer capabilities — browser, filesystem, shell via MCP |
101
+
102
+ When `computer` is provided, its tool handlers are merged with `tools` (manual handlers take priority on conflict), and its tool schemas are sent to the platform. See [Computer](/docs/server-sdk/computer) for details.
103
+
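The merge rule described above (manual handlers win on a name conflict) can be pictured with a small sketch. This is only an illustration under assumptions: `mergeHandlers`, the local `ToolHandler` type, and the sample handlers are hypothetical names, not the server-sdk's actual internals.

```typescript
// Hypothetical stand-ins for the SDK's handler types (assumption, for illustration only).
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;
type ToolHandlers = Record<string, ToolHandler>;

// Later spreads overwrite earlier ones, so manual handlers take priority on conflict.
function mergeHandlers(providerHandlers: ToolHandlers, manualHandlers: ToolHandlers): ToolHandlers {
  return { ...providerHandlers, ...manualHandlers };
}

// Sample handlers: the provider and the manual map both define 'set-chat-title'.
const providerRun: ToolHandler = async () => 'provider: run_command';
const providerTitle: ToolHandler = async () => 'provider: set-chat-title';
const manualTitle: ToolHandler = async () => 'manual: set-chat-title';

const merged = mergeHandlers(
  { shell__run_command: providerRun, 'set-chat-title': providerTitle },
  { 'set-chat-title': manualTitle },
);
```

In this sketch `merged['set-chat-title']` is the manual handler, while `merged.shell__run_command` still comes from the provider.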
93
104
  ## Executing Requests
94
105
 
95
106
  Once attached, execute requests on the session using `execute()`:
@@ -12,13 +12,16 @@ Tools extend what agents can do. In Octavus, tools can execute either on your se
12
12
  | Location | Use Case | Registration |
13
13
  | ---------- | ------------------------------------------------- | --------------------------------------- |
14
14
  | **Server** | Database queries, API calls, sensitive operations | Register handler in `attach()` |
15
+ | **MCP** | Browser, filesystem, shell, external services | Via `computer` option in `attach()` |
15
16
  | **Client** | Browser APIs, interactive UIs, confirmations | No server handler (forwarded to client) |
16
17
 
17
18
  When the Server SDK encounters a tool call:
18
19
 
19
- 1. **Handler exists** → Execute on server, continue automatically
20
+ 1. **Handler exists** (server or MCP) → Execute on server, continue automatically
20
21
  2. **No handler** → Forward to client via `client-tool-request` event
21
22
 
23
+ MCP tool handlers from `@octavus/computer` are merged with your manual handlers — they work identically from the platform's perspective. See [Computer](/docs/server-sdk/computer) for MCP tool integration.
24
+
22
25
  For client-side tool handling, see [Client Tools](/docs/client-sdk/client-tools).
23
26
 
24
27
  ## Why Server Tools
@@ -0,0 +1,400 @@
1
+ ---
2
+ title: Computer
3
+ description: Adding browser, filesystem, and shell capabilities to agents with @octavus/computer.
4
+ ---
5
+
6
+ # Computer
7
+
8
+ The `@octavus/computer` package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to [MCP](https://modelcontextprotocol.io) servers, discovers their tools, and provides them to the server-sdk.
9
+
10
+ **Current version:** `{{VERSION:@octavus/computer}}`
11
+
12
+ ## Installation
13
+
14
+ ```bash
15
+ npm install @octavus/computer
16
+ ```
17
+
18
+ ## Quick Start
19
+
20
+ ```typescript
21
+ import { Computer } from '@octavus/computer';
22
+ import { OctavusClient } from '@octavus/server-sdk';
23
+
24
+ const computer = new Computer({
25
+ mcpServers: {
26
+ browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=http://127.0.0.1:9222']),
27
+ filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', ['/path/to/workspace']),
28
+ shell: Computer.shell({ cwd: '/path/to/workspace', mode: 'unrestricted' }),
29
+ },
30
+ });
31
+
32
+ await computer.start();
33
+
34
+ const client = new OctavusClient({
35
+ baseUrl: 'https://octavus.ai',
36
+ apiKey: 'your-api-key',
37
+ });
38
+
39
+ const session = client.agentSessions.attach(sessionId, {
40
+ tools: {
41
+ 'set-chat-title': async (args) => ({ title: args.title }),
42
+ },
43
+ computer,
44
+ });
45
+ ```
46
+
47
+ The `computer` is passed to `attach()` — the server-sdk handles the rest. Tool schemas are sent to the platform, and tool calls flow back through the existing execution loop.
48
+
49
+ ## How It Works
50
+
51
+ 1. You configure MCP servers with namespaces (e.g., `browser`, `filesystem`, `shell`)
52
+ 2. `computer.start()` connects to all servers in parallel and discovers their tools
53
+ 3. Each tool is namespaced with `__` (e.g., `browser__navigate_page`, `filesystem__read_file`)
54
+ 4. The server-sdk sends tool schemas to the platform and handles tool call execution
55
+
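The namespacing in step 3 can be sketched as follows. The helper names `namespaceTool` and `splitNamespacedTool` are hypothetical, and splitting at the first `__` assumes that namespace keys never contain `__` themselves; the package may resolve names differently.

```typescript
const SEPARATOR = '__';

// Build the namespaced tool name from the mcpServers key and the tool's own name.
function namespaceTool(namespace: string, toolName: string): string {
  return `${namespace}${SEPARATOR}${toolName}`;
}

// Split a namespaced name back into its parts; returns null for names without a namespace.
// Assumption: the namespace itself never contains the separator.
function splitNamespacedTool(name: string): { namespace: string; toolName: string } | null {
  const index = name.indexOf(SEPARATOR);
  if (index <= 0) return null;
  return {
    namespace: name.slice(0, index),
    toolName: name.slice(index + SEPARATOR.length),
  };
}

const namespaced = namespaceTool('browser', 'navigate_page');
const parts = splitNamespacedTool('filesystem__read_file');
const plain = splitNamespacedTool('set-chat-title');
```

Here a manually registered tool such as `set-chat-title` has no namespace, so the split returns `null`.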
56
+ The agent's protocol must declare matching `mcpServers` with `source: device` — see [MCP Servers](/docs/protocol/mcp-servers).
57
+
58
+ ## Entry Types
59
+
60
+ The `Computer` class supports three types of MCP entries:
61
+
62
+ ### Stdio (MCP Subprocess)
63
+
64
+ Spawns an MCP server as a child process, communicating via stdin/stdout:
65
+
66
+ ```typescript
67
+ Computer.stdio(command: string, args?: string[], options?: {
68
+ env?: Record<string, string>;
69
+ cwd?: string;
70
+ })
71
+ ```
72
+
73
+ Use this for local MCP servers installed as npm packages or standalone executables:
74
+
75
+ ```typescript
76
+ const computer = new Computer({
77
+ mcpServers: {
78
+ browser: Computer.stdio('chrome-devtools-mcp', [
79
+ '--browser-url=http://127.0.0.1:9222',
80
+ '--no-usage-statistics',
81
+ ]),
82
+ filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [
83
+ '/Users/me/projects/my-app',
84
+ ]),
85
+ },
86
+ });
87
+ ```
88
+
89
+ ### HTTP (Remote MCP Endpoint)
90
+
91
+ Connects to an MCP server over Streamable HTTP:
92
+
93
+ ```typescript
94
+ Computer.http(url: string, options?: {
95
+ headers?: Record<string, string>;
96
+ })
97
+ ```
98
+
99
+ Use this for MCP servers running as HTTP services:
100
+
101
+ ```typescript
102
+ const computer = new Computer({
103
+ mcpServers: {
104
+ docs: Computer.http('http://localhost:3001/mcp', {
105
+ headers: { Authorization: 'Bearer token' },
106
+ }),
107
+ },
108
+ });
109
+ ```
110
+
111
+ ### Shell (Built-in)
112
+
113
+ Provides shell command execution without spawning an MCP subprocess:
114
+
115
+ ```typescript
116
+ Computer.shell(options: {
117
+ cwd?: string;
118
+ mode: ShellMode;
119
+ timeout?: number; // Default: 300,000ms (5 minutes)
120
+ })
121
+ ```
122
+
123
+ This exposes a `run_command` tool (namespaced as `shell__run_command` when the key is `shell`). Commands execute in a login shell with the user's full environment.
124
+
125
+ ```typescript
126
+ const computer = new Computer({
127
+ mcpServers: {
128
+ shell: Computer.shell({
129
+ cwd: '/Users/me/projects/my-app',
130
+ mode: 'unrestricted',
131
+ timeout: 300_000,
132
+ }),
133
+ },
134
+ });
135
+ ```
136
+
137
+ #### Shell Safety Modes
138
+
139
+ | Mode | Description |
140
+ | -------------------------------------- | --------------------------------------------- |
141
+ | `'unrestricted'` | All commands allowed (for dedicated machines) |
142
+ | `{ allowedPatterns, blockedPatterns }` | Pattern-based command filtering |
143
+
144
+ Pattern-based filtering:
145
+
146
+ ```typescript
147
+ Computer.shell({
148
+ cwd: workspaceDir,
149
+ mode: {
150
+ blockedPatterns: [/rm\s+-rf/, /sudo/],
151
+ allowedPatterns: [/^git\s/, /^npm\s/, /^ls\s/],
152
+ },
153
+ });
154
+ ```
155
+
156
+ When `allowedPatterns` is set, only matching commands are permitted. When `blockedPatterns` is set, matching commands are rejected. Blocked patterns are checked first, so a command that matches both lists is rejected.
157
+
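The filtering order described above can be sketched as a small predicate. This is a minimal illustration, not the package's implementation: `isCommandAllowed` is a hypothetical name, and the `ShellMode` type is re-declared locally so the block is self-contained.

```typescript
// Local copy of the documented ShellMode shape, so the sketch stands alone.
type ShellMode =
  | 'unrestricted'
  | { allowedPatterns?: RegExp[]; blockedPatterns?: RegExp[] };

// Hypothetical helper: blocked patterns are checked first, then the allow list (if any).
function isCommandAllowed(command: string, mode: ShellMode): boolean {
  if (mode === 'unrestricted') return true;

  // A match on any blocked pattern rejects the command outright.
  if (mode.blockedPatterns?.some((pattern) => pattern.test(command))) return false;

  // With an allow list present, the command must match at least one entry.
  if (mode.allowedPatterns) {
    return mode.allowedPatterns.some((pattern) => pattern.test(command));
  }

  // No allow list: everything not blocked is permitted.
  return true;
}

const mode: ShellMode = {
  blockedPatterns: [/rm\s+-rf/, /sudo/],
  allowedPatterns: [/^git\s/, /^npm\s/, /^ls\s/],
};
```

With the patterns from the example above, `git status` would pass, `git clean && rm -rf .` would be rejected by the block list, and a bare `ls` would be rejected because `/^ls\s/` requires whitespace after the command name.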
158
+ ## Lifecycle
159
+
160
+ ### Starting
161
+
162
+ `computer.start()` connects to all configured MCP servers in parallel. If some servers fail to connect, the computer still starts with the remaining servers — only if _all_ connections fail does it throw an error.
163
+
164
+ ```typescript
165
+ const { errors } = await computer.start();
166
+
167
+ if (errors.length > 0) {
168
+ console.warn('Some MCP servers failed to connect:', errors);
169
+ }
170
+ ```
171
+
172
+ ### Stopping
173
+
174
+ `computer.stop()` closes all MCP connections and kills managed processes:
175
+
176
+ ```typescript
177
+ await computer.stop();
178
+ ```
179
+
180
+ Always call `stop()` when the session ends to clean up MCP subprocesses. For managed processes (like Chrome), pass them in the `managedProcesses` config option for automatic cleanup.
181
+
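One way to make sure `stop()` always runs, even when the session code throws, is a `try`/`finally` wrapper. The sketch below is an assumption-laden illustration: `withComputer` and the `Lifecycle` interface are hypothetical names that only mirror the documented `start()` and `stop()` signatures.

```typescript
// Minimal shape mirroring the documented lifecycle methods (declared locally for the sketch).
interface Lifecycle {
  start(): Promise<{ errors: string[] }>;
  stop(): Promise<void>;
}

// Hypothetical helper: start the computer, run the session, and always stop afterwards.
async function withComputer<T>(computer: Lifecycle, run: () => Promise<T>): Promise<T> {
  const { errors } = await computer.start();
  if (errors.length > 0) {
    console.warn('Some MCP servers failed to connect:', errors);
  }
  try {
    return await run();
  } finally {
    // Runs on success and on failure, so subprocesses are not left behind.
    await computer.stop();
  }
}
```

A caller could pass a `Computer` instance and a function that attaches to the session and consumes the event stream.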
182
+ ## Chrome Launch Helper
183
+
184
+ For desktop applications that need to control a browser, `Computer.launchChrome()` launches Chrome with remote debugging enabled:
185
+
186
+ ```typescript
187
+ const browser = await Computer.launchChrome({
188
+ profileDir: '/Users/me/.my-app/chrome-profiles/agent-1',
189
+ debuggingPort: 9222, // Optional, auto-allocated if omitted
190
+ flags: ['--window-size=1280,800'],
191
+ });
192
+
193
+ console.log(`Chrome running on port ${browser.port}, PID ${browser.pid}`);
194
+ ```
195
+
196
+ Pass the browser to `managedProcesses` for automatic cleanup when the computer stops:
197
+
198
+ ```typescript
199
+ const computer = new Computer({
200
+ mcpServers: {
201
+ browser: Computer.stdio('chrome-devtools-mcp', [
202
+ `--browser-url=http://127.0.0.1:${browser.port}`,
203
+ ]),
204
+ filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [workspaceDir]),
205
+ shell: Computer.shell({ cwd: workspaceDir, mode: 'unrestricted' }),
206
+ },
207
+ managedProcesses: [{ process: browser.process }],
208
+ });
209
+ ```
210
+
211
+ ### ChromeLaunchOptions
212
+
213
+ | Field | Required | Description |
214
+ | --------------- | -------- | ----------------------------------------------------- |
215
+ | `profileDir` | Yes | Directory for Chrome's user data (profile isolation) |
216
+ | `debuggingPort` | No | Port for remote debugging (auto-allocated if omitted) |
217
+ | `flags` | No | Additional Chrome launch flags |
218
+
219
+ ## ToolProvider Interface
220
+
221
+ `Computer` implements the `ToolProvider` interface from `@octavus/core`:
222
+
223
+ ```typescript
224
+ interface ToolProvider {
225
+ toolHandlers(): Record<string, ToolHandler>;
226
+ toolSchemas(): ToolSchema[];
227
+ }
228
+ ```
229
+
230
+ The server-sdk accepts any `ToolProvider` on the `computer` option — you can implement your own if `@octavus/computer` doesn't fit your use case:
231
+
232
+ ```typescript
233
+ const customProvider: ToolProvider = {
234
+ toolHandlers() {
235
+ return {
236
+ custom__my_tool: async (args) => {
237
+ return { result: 'done' };
238
+ },
239
+ };
240
+ },
241
+ toolSchemas() {
242
+ return [
243
+ {
244
+ name: 'custom__my_tool',
245
+ description: 'A custom tool',
246
+ inputSchema: {
247
+ type: 'object',
248
+ properties: {
249
+ input: { type: 'string', description: 'Tool input' },
250
+ },
251
+ required: ['input'],
252
+ },
253
+ },
254
+ ];
255
+ },
256
+ };
257
+
258
+ const session = client.agentSessions.attach(sessionId, {
259
+ tools: { 'set-chat-title': titleHandler },
260
+ computer: customProvider,
261
+ });
262
+ ```
263
+
264
+ ## Complete Example
265
+
266
+ A desktop application with browser, filesystem, and shell capabilities:
267
+
268
+ ```typescript
269
+ import { Computer } from '@octavus/computer';
270
+ import { OctavusClient } from '@octavus/server-sdk';
271
+
272
+ const WORKSPACE_DIR = '/Users/me/projects/my-app';
273
+ const PROFILE_DIR = '/Users/me/.my-app/chrome-profiles/agent';
274
+
275
+ async function startSession(sessionId: string) {
276
+ // 1. Launch Chrome with remote debugging
277
+ const browser = await Computer.launchChrome({
278
+ profileDir: PROFILE_DIR,
279
+ });
280
+
281
+ // 2. Create computer with all capabilities
282
+ const computer = new Computer({
283
+ mcpServers: {
284
+ browser: Computer.stdio('chrome-devtools-mcp', [
285
+ `--browser-url=http://127.0.0.1:${browser.port}`,
286
+ '--no-usage-statistics',
287
+ ]),
288
+ filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [WORKSPACE_DIR]),
289
+ shell: Computer.shell({
290
+ cwd: WORKSPACE_DIR,
291
+ mode: 'unrestricted',
292
+ }),
293
+ },
294
+ managedProcesses: [{ process: browser.process }],
295
+ });
296
+
297
+ // 3. Connect to all MCP servers
298
+ const { errors } = await computer.start();
299
+ if (errors.length > 0) {
300
+ console.warn('Failed to connect:', errors);
301
+ }
302
+
303
+ // 4. Attach to session with computer
304
+ const client = new OctavusClient({
305
+ baseUrl: process.env.OCTAVUS_API_URL!,
306
+ apiKey: process.env.OCTAVUS_API_KEY!,
307
+ });
308
+
309
+ const session = client.agentSessions.attach(sessionId, {
310
+ tools: {
311
+ 'set-chat-title': async (args) => {
312
+ console.log('Chat title:', args.title);
313
+ return { success: true };
314
+ },
315
+ },
316
+ computer,
317
+ });
318
+
319
+ // 5. Execute and stream
320
+ const events = session.execute({
321
+ type: 'trigger',
322
+ triggerName: 'user-message',
323
+ input: { USER_MESSAGE: 'Navigate to github.com and take a screenshot' },
324
+ });
325
+
326
+ for await (const event of events) {
327
+ // Handle stream events
328
+ }
329
+
330
+ // 6. Clean up
331
+ await computer.stop();
332
+ }
333
+ ```
334
+
335
+ ## API Reference
336
+
337
+ ### Computer
338
+
339
+ ```typescript
340
+ class Computer implements ToolProvider {
341
+ constructor(config: ComputerConfig);
342
+
343
+ // Static factories for MCP entries
344
+ static stdio(
345
+ command: string,
346
+ args?: string[],
347
+ options?: {
348
+ env?: Record<string, string>;
349
+ cwd?: string;
350
+ },
351
+ ): StdioConfig;
352
+
353
+ static http(
354
+ url: string,
355
+ options?: {
356
+ headers?: Record<string, string>;
357
+ },
358
+ ): HttpConfig;
359
+
360
+ static shell(options: { cwd?: string; mode: ShellMode; timeout?: number }): ShellConfig;
361
+
362
+ // Chrome launch helper
363
+ static launchChrome(options: ChromeLaunchOptions): Promise<ChromeInstance>;
364
+
365
+ // Lifecycle
366
+ start(): Promise<{ errors: string[] }>;
367
+ stop(): Promise<void>;
368
+
369
+ // ToolProvider implementation
370
+ toolHandlers(): Record<string, ToolHandler>;
371
+ toolSchemas(): ToolSchema[];
372
+ }
373
+ ```
374
+
375
+ ### ComputerConfig
376
+
377
+ ```typescript
378
+ interface ComputerConfig {
379
+ mcpServers: Record<string, McpEntry>;
380
+ managedProcesses?: { process: ChildProcess }[];
381
+ }
382
+
383
+ type McpEntry = StdioConfig | HttpConfig | ShellConfig;
384
+ type ShellMode =
385
+ | 'unrestricted'
386
+ | {
387
+ allowedPatterns?: RegExp[];
388
+ blockedPatterns?: RegExp[];
389
+ };
390
+ ```
391
+
392
+ ### ChromeInstance
393
+
394
+ ```typescript
395
+ interface ChromeInstance {
396
+ port: number;
397
+ process: ChildProcess;
398
+ pid: number;
399
+ }
400
+ ```
@@ -68,6 +68,13 @@ tools:
68
68
  parameters:
69
69
  userId: { type: string }
70
70
 
71
+ # MCP servers (remote services and device capabilities)
72
+ mcpServers:
73
+   figma:
74
+     description: Figma design tool integration
75
+     source: remote
76
+     display: description
77
+
71
78
  # Octavus skills (provider-agnostic code execution)
72
79
  skills:
73
80
  qr-code:
@@ -79,6 +86,7 @@ agent:
79
86
  model: anthropic/claude-sonnet-4-5
80
87
  system: system # References prompts/system.md
81
88
  tools: [get-user-account]
89
+ mcpServers: [figma] # Enable MCP servers
82
90
  skills: [qr-code] # Enable skills
83
91
  imageModel: google/gemini-2.5-flash-image # Enable image generation
84
92
  webSearch: true # Enable web search
@@ -187,6 +195,7 @@ The referenced prompt content is inserted before variable interpolation, so vari
187
195
  - [Input & Resources](/docs/protocol/input-resources) — Defining agent inputs
188
196
  - [Triggers](/docs/protocol/triggers) — How agents are invoked
189
197
  - [Tools](/docs/protocol/tools) — External capabilities
198
+ - [MCP Servers](/docs/protocol/mcp-servers) — Remote services and device capabilities via MCP
190
199
  - [Skills](/docs/protocol/skills) — Code execution and knowledge packages
191
200
  - [References](/docs/protocol/references) — On-demand context documents
192
201
  - [Handlers](/docs/protocol/handlers) — Execution blocks
@@ -8,11 +8,12 @@ description: Defining external tools implemented in your backend.
8
8
  Tools extend what agents can do. Octavus supports multiple types:
9
9
 
10
10
  1. **External Tools** — Defined in the protocol, implemented in your backend (this page)
11
- 2. **Built-in Tools** — Provider-agnostic tools managed by Octavus (web search, image generation)
12
- 3. **Provider Tools** — Provider-specific tools executed by the provider (e.g., Anthropic's code execution)
13
- 4. **Skills** — Code execution and knowledge packages (see [Skills](/docs/protocol/skills))
11
+ 2. **MCP Tools** — Auto-discovered from MCP servers (see [MCP Servers](/docs/protocol/mcp-servers))
12
+ 3. **Built-in Tools** — Provider-agnostic tools managed by Octavus (web search, image generation)
13
+ 4. **Provider Tools** — Provider-specific tools executed by the provider (e.g., Anthropic's code execution)
14
+ 5. **Skills** — Code execution and knowledge packages (see [Skills](/docs/protocol/skills))
14
15
 
15
- This page covers external tools. Built-in tools are enabled via agent config — see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).
16
+ This page covers external tools. For MCP-based tools, whether from remote services like Figma or Sentry or from device capabilities like browser and filesystem, see [MCP Servers](/docs/protocol/mcp-servers). Built-in tools are enabled via agent config; see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).
16
17
 
17
18
  ## External Tools
18
19
 
@@ -144,16 +144,18 @@ Start summary thread:
144
144
  block: start-thread
145
145
  thread: summary # Thread name
146
146
  model: anthropic/claude-sonnet-4-5 # Optional: different model
147
+ backupModel: openai/gpt-4o # Failover on provider errors
147
148
  thinking: low # Extended reasoning level
148
149
  maxSteps: 1 # Tool call limit
149
150
  system: escalation-summary # System prompt
150
151
  input: [COMPANY_NAME] # Variables for prompt
152
+ mcpServers: [figma, browser] # MCP servers for this thread
151
153
  skills: [qr-code] # Octavus skills for this thread
152
154
  sandboxTimeout: 600000 # Skill sandbox timeout (default: 5 min, max: 1 hour)
153
155
  imageModel: google/gemini-2.5-flash-image # Image generation model
154
156
  ```
155
157
 
156
- The `model` field can also reference a variable for dynamic model selection:
158
+ The `model` field can also reference a variable for dynamic model selection. The `backupModel` field follows the same format and supports variable references.
157
159
 
158
160
  ```yaml
159
161
  Start summary thread:
@@ -14,28 +14,31 @@ agent:
14
14
  model: anthropic/claude-sonnet-4-5
15
15
  system: system # References prompts/system.md
16
16
  tools: [get-user-account] # Available tools
17
+ mcpServers: [figma, browser] # MCP server connections
17
18
  skills: [qr-code] # Available skills
18
19
  references: [api-guidelines] # On-demand context documents
19
20
  ```
20
21
 
21
22
  ## Configuration Options
22
23
 
23
- | Field | Required | Description |
24
- | ---------------- | -------- | --------------------------------------------------------- |
25
- | `model` | Yes | Model identifier or variable reference |
26
- | `system` | Yes | System prompt filename (without .md) |
27
- | `input` | No | Variables to pass to the system prompt |
28
- | `tools` | No | List of tools the LLM can call |
29
- | `skills` | No | List of Octavus skills the LLM can use |
30
- | `references` | No | List of references the LLM can fetch on demand |
31
- | `sandboxTimeout` | No | Skill sandbox timeout in ms (default: 5 min, max: 1 hour) |
32
- | `imageModel` | No | Image generation model (enables agentic image generation) |
33
- | `webSearch` | No | Enable built-in web search tool (provider-agnostic) |
34
- | `agentic` | No | Allow multiple tool call cycles |
35
- | `maxSteps` | No | Maximum agentic steps (default: 10) |
36
- | `temperature` | No | Model temperature (0-2) |
37
- | `thinking` | No | Extended reasoning level |
38
- | `anthropic` | No | Anthropic-specific options (tools, skills) |
24
+ | Field | Required | Description |
25
+ | ---------------- | -------- | ------------------------------------------------------------------------------ |
26
+ | `model` | Yes | Model identifier or variable reference |
27
+ | `backupModel` | No | Backup model for automatic failover on provider errors |
28
+ | `system` | Yes | System prompt filename (without .md) |
29
+ | `input` | No | Variables to pass to the system prompt |
30
+ | `tools` | No | List of tools the LLM can call |
31
+ | `mcpServers` | No | List of MCP servers to connect (see [MCP Servers](/docs/protocol/mcp-servers)) |
32
+ | `skills` | No | List of Octavus skills the LLM can use |
33
+ | `references` | No | List of references the LLM can fetch on demand |
34
+ | `sandboxTimeout` | No | Skill sandbox timeout in ms (default: 5 min, max: 1 hour) |
35
+ | `imageModel` | No | Image generation model (enables agentic image generation) |
36
+ | `webSearch` | No | Enable built-in web search tool (provider-agnostic) |
37
+ | `agentic` | No | Allow multiple tool call cycles |
38
+ | `maxSteps` | No | Maximum agentic steps (default: 10) |
39
+ | `temperature` | No | Model temperature (0-2) |
40
+ | `thinking` | No | Extended reasoning level |
41
+ | `anthropic` | No | Anthropic-specific options (tools, skills) |
39
42
 
40
43
  ## Models
41
44
 
@@ -104,6 +107,41 @@ The model value is validated at runtime to ensure it's in the correct `provider/
104
107
 
105
108
  > **Note**: When using dynamic models, provider-specific options (like `anthropic:`) may not apply if the model resolves to a different provider.
106
109
 
110
+ ## Backup Model
111
+
112
+ Configure a fallback model that activates automatically when the primary model encounters a transient provider error (rate limits, outages, timeouts):
113
+
114
+ ```yaml
115
+ agent:
116
+   model: anthropic/claude-sonnet-4-5
117
+   backupModel: openai/gpt-4o
118
+   system: system
119
+ ```
120
+
121
+ When a provider error occurs, the system retries once with the backup model. If the backup also fails, the original error is returned.
122
+
123
+ **Key behaviors:**
124
+
125
+ - Only transient provider errors trigger fallback — authentication and validation errors are not retried
126
+ - Provider-specific options (like `anthropic:`) are only forwarded to the backup model if it uses the same provider
127
+ - For streaming responses, fallback only occurs if no content has been sent to the client yet
128
+
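The non-streaming part of this behavior (retry once on a transient error, surface the original error if the backup also fails) can be sketched like this. It is an illustration only: `callWithBackup`, `isTransient`, and the `transient` flag on the error are assumptions made for the example, not the platform's real error model, and the streaming rule is not modeled.

```typescript
// Hypothetical check: treat any thrown object carrying `transient: true` as retryable.
function isTransient(error: unknown): boolean {
  return (
    typeof error === 'object' &&
    error !== null &&
    (error as { transient?: unknown }).transient === true
  );
}

// Try the primary model; on a transient error, retry once with the backup.
// If the backup fails too, the ORIGINAL error is rethrown.
async function callWithBackup<T>(
  primary: () => Promise<T>,
  backup?: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch (originalError) {
    if (!backup || !isTransient(originalError)) throw originalError;
    try {
      return await backup();
    } catch (backupError) {
      throw originalError;
    }
  }
}
```

Authentication or validation failures would not carry the transient flag in this sketch, so they propagate without touching the backup.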
129
+ Like `model`, `backupModel` supports variable references:
130
+
131
+ ```yaml
132
+ input:
133
+   BACKUP_MODEL:
134
+     type: string
135
+     description: Fallback model for provider errors
136
+
137
+ agent:
138
+   model: anthropic/claude-sonnet-4-5
139
+   backupModel: BACKUP_MODEL
140
+   system: system
141
+ ```
142
+
143
+ > **Tip**: Use a different provider for your backup model (e.g., primary on Anthropic, backup on OpenAI) to maximize resilience against single-provider outages.
144
+
107
145
  ## System Prompt
108
146
 
109
147
  The system prompt sets the agent's persona and instructions. The `input` field controls which variables are available to the prompt — only variables listed in `input` are interpolated.
@@ -358,16 +396,18 @@ handlers:
358
396
  block: start-thread
359
397
  thread: summary
360
398
  model: anthropic/claude-sonnet-4-5 # Different model
399
+ backupModel: openai/gpt-4o # Failover model
361
400
  thinking: low # Different thinking
362
401
  maxSteps: 1 # Limit tool calls
363
402
  system: escalation-summary # Different prompt
403
+ mcpServers: [figma, browser] # Thread-specific MCP servers
364
404
  skills: [data-analysis] # Thread-specific skills
365
405
  references: [escalation-policy] # Thread-specific references
366
406
  imageModel: google/gemini-2.5-flash-image # Thread-specific image model
367
407
  webSearch: true # Thread-specific web search
368
408
  ```
369
409
 
370
- Each thread can have its own skills, references, image model, and web search setting. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
410
+ Each thread can have its own model, backup model, MCP servers, skills, references, image model, and web search setting. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
371
411
 
372
412
  ## Full Example
373
413
 
@@ -399,6 +439,12 @@ tools:
399
439
  summary: { type: string }
400
440
  priority: { type: string } # low, medium, high
401
441
 
442
+ mcpServers:
443
+   figma:
444
+     description: Figma design tool integration
445
+     source: remote
446
+     display: description
447
+
402
448
  skills:
403
449
  qr-code:
404
450
  display: description
@@ -406,6 +452,7 @@ skills:
406
452
 
407
453
  agent:
408
454
  model: anthropic/claude-sonnet-4-5
455
+ backupModel: openai/gpt-4o
409
456
  system: system
410
457
  input:
411
458
  - COMPANY_NAME
@@ -414,6 +461,7 @@ agent:
414
461
  - get-user-account
415
462
  - search-docs
416
463
  - create-support-ticket
464
+ mcpServers: [figma] # MCP server connections
417
465
  skills: [qr-code] # Octavus skills
418
466
  references: [support-policies] # On-demand context
419
467
  webSearch: true # Built-in web search