@octavus/docs 5.0.0 → 5.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -23,7 +23,7 @@ var docs_default = [
23
23
  section: "server-sdk",
24
24
  title: "Overview",
25
25
  description: "Introduction to the Octavus Server SDK for backend integration.",
26
- content: "\n# Server SDK Overview\n\nThe `@octavus/server-sdk` package provides a Node.js SDK for integrating Octavus agents into your backend application. It handles session management, streaming, and the tool execution continuation loop.\n\n**Current version:** `5.0.0`\n\n## Installation\n\n```bash\nnpm install @octavus/server-sdk\n```\n\nFor agent management (sync, validate), install the CLI as a dev dependency:\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Basic Usage\n\n```typescript\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n```\n\n## Key Features\n\n### Agent Management\n\nAgent definitions are managed via the CLI. See the [CLI documentation](/docs/server-sdk/cli) for details.\n\n```bash\n# Sync agent from local files\noctavus sync ./agents/support-chat\n\n# Output: Created: support-chat\n# Agent ID: clxyz123abc456\n```\n\n### Session Management\n\nCreate and manage agent sessions using the agent ID:\n\n```typescript\n// Create a new session (use agent ID from CLI sync)\nconst sessionId = await client.agentSessions.create('clxyz123abc456', {\n COMPANY_NAME: 'Acme Corp',\n PRODUCT_NAME: 'Widget Pro',\n});\n\n// Get UI-ready session messages (for session restore)\nconst session = await client.agentSessions.getMessages(sessionId);\n```\n\n### Tool Handlers\n\nTools run on your server with your data:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'get-user-account': async (args) => {\n // Access your database, APIs, etc.\n return await db.users.findById(args.userId);\n },\n },\n});\n```\n\n### Streaming\n\nAll responses stream in real-time:\n\n```typescript\nimport { toSSEStream } from '@octavus/server-sdk';\n\n// execute() returns an async generator of events\nconst events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Hello!' },\n});\n\n// Convert to SSE stream for HTTP responses\nreturn new Response(toSSEStream(events), {\n headers: { 'Content-Type': 'text/event-stream' },\n});\n```\n\n### Computer Capabilities\n\nGive agents access to browser, filesystem, and shell via MCP:\n\n```typescript\nimport { Computer } from '@octavus/computer';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n shell: Computer.shell({ cwd: dir, mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\n### Workers\n\nExecute worker agents for task-based processing:\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(agentId, {\n TOPIC: 'AI safety',\n});\n\n// Streaming: observe events in real-time\nfor await (const event of client.workers.execute(agentId, input)) {\n // Handle stream events\n}\n```\n\n## API Reference\n\n### OctavusClient\n\nThe main entry point for interacting with Octavus.\n\n```typescript\ninterface OctavusClientConfig {\n baseUrl: string; // Octavus API URL\n apiKey?: string; // Your API key\n traceModelRequests?: boolean; // Enable model request tracing (default: false)\n maxRetries?: number; // Retries for transient network failures during streaming (default: 2, set to 0 to disable)\n}\n\nclass OctavusClient {\n readonly agents: AgentsApi;\n readonly agentSessions: AgentSessionsApi;\n readonly workers: WorkersApi;\n readonly files: FilesApi;\n\n constructor(config: OctavusClientConfig);\n}\n```\n\n### AgentSessionsApi\n\nManages agent sessions.\n\n```typescript\nclass AgentSessionsApi {\n // Create a new session\n async create(agentId: string, input?: Record<string, unknown>): Promise<string>;\n\n // Get full session state (for debugging/internal use)\n async get(sessionId: string): Promise<SessionState>;\n\n // Get UI-ready messages (for client display)\n async getMessages(sessionId: string): Promise<UISessionState>;\n\n // Attach to a session for triggering\n attach(sessionId: string, options?: SessionAttachOptions): AgentSession;\n}\n\n// Full session state (internal format)\ninterface SessionState {\n id: string;\n agentId: string;\n input: Record<string, unknown>;\n variables: Record<string, unknown>;\n resources: Record<string, unknown>;\n messages: ChatMessage[]; // Internal message format\n createdAt: string;\n updatedAt: string;\n}\n\n// UI-ready session state\ninterface UISessionState {\n sessionId: string;\n agentId: string;\n messages: UIMessage[]; // UI-ready messages for frontend\n}\n```\n\n### AgentSession\n\nHandles request execution and streaming for a specific session.\n\n```typescript\nclass AgentSession {\n // Execute a request and stream parsed events\n execute(request: SessionRequest, options?: TriggerOptions): AsyncGenerator<StreamEvent>;\n\n // Get the session ID\n getSessionId(): string;\n\n // Register dynamic tools (e.g., pass a Computer or explicit DynamicTool[])\n setDynamicTools(source: ToolProvider | DynamicTool[]): void;\n}\n\ntype SessionRequest = TriggerRequest | ContinueRequest;\n\ninterface TriggerRequest {\n type: 'trigger';\n triggerName: string;\n input?: Record<string, unknown>;\n}\n\ninterface ContinueRequest {\n type: 'continue';\n executionId: string;\n toolResults: ToolResult[];\n}\n\n// Helper to convert events to SSE stream\nfunction toSSEStream(events: AsyncIterable<StreamEvent>): ReadableStream<Uint8Array>;\n```\n\n### FilesApi\n\nHandles file uploads for sessions.\n\n```typescript\nclass FilesApi {\n // Get presigned URLs for file uploads\n async getUploadUrls(sessionId: string, files: FileUploadRequest[]): Promise<UploadUrlsResponse>;\n}\n\ninterface FileUploadRequest {\n filename: string;\n mediaType: string;\n size: number;\n}\n\ninterface UploadUrlsResponse {\n files: {\n id: string; // File ID for references\n uploadUrl: string; // PUT to this URL\n downloadUrl: string; // GET URL after upload\n }[];\n}\n```\n\nThe client uploads files directly to S3 using the presigned upload URL. See [File Uploads](/docs/client-sdk/file-uploads) for the full integration pattern.\n\n## Next Steps\n\n- [Sessions](/docs/server-sdk/sessions) - Deep dive into session management\n- [Tools](/docs/server-sdk/tools) - Implementing tool handlers\n- [Streaming](/docs/server-sdk/streaming) - Understanding stream events\n- [Workers](/docs/server-sdk/workers) - Executing worker agents\n- [Debugging](/docs/server-sdk/debugging) - Model request tracing and debugging\n- [Computer](/docs/server-sdk/computer) - Browser, filesystem, and shell via MCP\n",
26
+ content: "\n# Server SDK Overview\n\nThe `@octavus/server-sdk` package provides a Node.js SDK for integrating Octavus agents into your backend application. It handles session management, streaming, and the tool execution continuation loop.\n\n**Current version:** `5.1.0`\n\n## Installation\n\n```bash\nnpm install @octavus/server-sdk\n```\n\nFor agent management (sync, validate), install the CLI as a dev dependency:\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Basic Usage\n\n```typescript\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n```\n\n## Key Features\n\n### Agent Management\n\nAgent definitions are managed via the CLI. See the [CLI documentation](/docs/server-sdk/cli) for details.\n\n```bash\n# Sync agent from local files\noctavus sync ./agents/support-chat\n\n# Output: Created: support-chat\n# Agent ID: clxyz123abc456\n```\n\n### Session Management\n\nCreate and manage agent sessions using the agent ID:\n\n```typescript\n// Create a new session (use agent ID from CLI sync)\nconst sessionId = await client.agentSessions.create('clxyz123abc456', {\n COMPANY_NAME: 'Acme Corp',\n PRODUCT_NAME: 'Widget Pro',\n});\n\n// Get UI-ready session messages (for session restore)\nconst session = await client.agentSessions.getMessages(sessionId);\n```\n\n### Tool Handlers\n\nTools run on your server with your data:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'get-user-account': async (args) => {\n // Access your database, APIs, etc.\n return await db.users.findById(args.userId);\n },\n },\n});\n```\n\n### Streaming\n\nAll responses stream in real-time:\n\n```typescript\nimport { toSSEStream } from '@octavus/server-sdk';\n\n// execute() returns an async generator of events\nconst events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Hello!' },\n});\n\n// Convert to SSE stream for HTTP responses\nreturn new Response(toSSEStream(events), {\n headers: { 'Content-Type': 'text/event-stream' },\n});\n```\n\n### Computer Capabilities\n\nGive agents access to browser, filesystem, and shell via MCP:\n\n```typescript\nimport { Computer } from '@octavus/computer';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n shell: Computer.shell({ cwd: dir, mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\n### Workers\n\nExecute worker agents for task-based processing:\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(agentId, {\n TOPIC: 'AI safety',\n});\n\n// Streaming: observe events in real-time\nfor await (const event of client.workers.execute(agentId, input)) {\n // Handle stream events\n}\n```\n\n## API Reference\n\n### OctavusClient\n\nThe main entry point for interacting with Octavus.\n\n```typescript\ninterface OctavusClientConfig {\n baseUrl: string; // Octavus API URL\n apiKey?: string; // Your API key\n traceModelRequests?: boolean; // Enable model request tracing (default: false)\n maxRetries?: number; // Retries for transient network failures during streaming (default: 2, set to 0 to disable)\n}\n\nclass OctavusClient {\n readonly agents: AgentsApi;\n readonly agentSessions: AgentSessionsApi;\n readonly workers: WorkersApi;\n readonly files: FilesApi;\n\n constructor(config: OctavusClientConfig);\n}\n```\n\n### AgentSessionsApi\n\nManages agent sessions.\n\n```typescript\nclass AgentSessionsApi {\n // Create a new session\n async create(agentId: string, input?: Record<string, unknown>): Promise<string>;\n\n // Get full session state (for debugging/internal use)\n async get(sessionId: string): Promise<SessionState>;\n\n // Get UI-ready messages (for client display)\n async getMessages(sessionId: string): Promise<UISessionState>;\n\n // Attach to a session for triggering\n attach(sessionId: string, options?: SessionAttachOptions): AgentSession;\n}\n\n// Full session state (internal format)\ninterface SessionState {\n id: string;\n agentId: string;\n input: Record<string, unknown>;\n variables: Record<string, unknown>;\n resources: Record<string, unknown>;\n messages: ChatMessage[]; // Internal message format\n createdAt: string;\n updatedAt: string;\n}\n\n// UI-ready session state\ninterface UISessionState {\n sessionId: string;\n agentId: string;\n messages: UIMessage[]; // UI-ready messages for frontend\n}\n```\n\n### AgentSession\n\nHandles request execution and streaming for a specific session.\n\n```typescript\nclass AgentSession {\n // Execute a request and stream parsed events\n execute(request: SessionRequest, options?: TriggerOptions): AsyncGenerator<StreamEvent>;\n\n // Get the session ID\n getSessionId(): string;\n\n // Register dynamic tools (e.g., pass a Computer or explicit DynamicTool[])\n setDynamicTools(source: ToolProvider | DynamicTool[]): void;\n}\n\ntype SessionRequest = TriggerRequest | ContinueRequest;\n\ninterface TriggerRequest {\n type: 'trigger';\n triggerName: string;\n input?: Record<string, unknown>;\n}\n\ninterface ContinueRequest {\n type: 'continue';\n executionId: string;\n toolResults: ToolResult[];\n}\n\n// Helper to convert events to SSE stream\nfunction toSSEStream(events: AsyncIterable<StreamEvent>): ReadableStream<Uint8Array>;\n```\n\n### FilesApi\n\nHandles file uploads for sessions.\n\n```typescript\nclass FilesApi {\n // Get presigned URLs for file uploads\n async getUploadUrls(sessionId: string, files: FileUploadRequest[]): Promise<UploadUrlsResponse>;\n}\n\ninterface FileUploadRequest {\n filename: string;\n mediaType: string;\n size: number;\n}\n\ninterface UploadUrlsResponse {\n files: {\n id: string; // File ID for references\n uploadUrl: string; // PUT to this URL\n downloadUrl: string; // GET URL after upload\n }[];\n}\n```\n\nThe client uploads files directly to S3 using the presigned upload URL. See [File Uploads](/docs/client-sdk/file-uploads) for the full integration pattern.\n\n## Next Steps\n\n- [Sessions](/docs/server-sdk/sessions) - Deep dive into session management\n- [Tools](/docs/server-sdk/tools) - Implementing tool handlers\n- [Streaming](/docs/server-sdk/streaming) - Understanding stream events\n- [Workers](/docs/server-sdk/workers) - Executing worker agents\n- [Debugging](/docs/server-sdk/debugging) - Model request tracing and debugging\n- [Computer](/docs/server-sdk/computer) - Browser, filesystem, and shell via MCP\n",
27
27
  excerpt: "Server SDK Overview The package provides a Node.js SDK for integrating Octavus agents into your backend application. It handles session management, streaming, and the tool execution continuation...",
28
28
  order: 1
29
29
  },
@@ -59,7 +59,7 @@ var docs_default = [
59
59
  section: "server-sdk",
60
60
  title: "CLI",
61
61
  description: "Command-line interface for validating and syncing agent definitions.",
62
- content: '\n# Octavus CLI\n\nThe `@octavus/cli` package provides a command-line interface for validating and syncing agent definitions from your local filesystem to the Octavus platform.\n\n**Current version:** `5.0.0`\n\n## Installation\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Configuration\n\nThe CLI requires an API key with the **Agents** permission.\n\n### Environment Variables\n\n| Variable | Description |\n| --------------------- | ---------------------------------------------- |\n| `OCTAVUS_CLI_API_KEY` | API key with "Agents" permission (recommended) |\n| `OCTAVUS_API_KEY` | Fallback if `OCTAVUS_CLI_API_KEY` not set |\n| `OCTAVUS_API_URL` | Optional, defaults to `https://octavus.ai` |\n\n### Two-Key Strategy (Recommended)\n\nFor production deployments, use separate API keys with minimal permissions:\n\n```bash\n# CI/CD or .env.local (not committed)\nOCTAVUS_CLI_API_KEY=oct_sk_... # "Agents" permission only\n\n# Production .env\nOCTAVUS_API_KEY=oct_sk_... # "Sessions" permission only\n```\n\nThis ensures production servers only have session permissions (smaller blast radius if leaked), while agent management is restricted to development/CI environments.\n\n### Multiple Environments\n\nUse separate Octavus projects for staging and production, each with their own API keys. The `--env` flag lets you load different environment files:\n\n```bash\n# Local development (default: .env)\noctavus sync ./agents/my-agent\n\n# Staging project\noctavus --env .env.staging sync ./agents/my-agent\n\n# Production project\noctavus --env .env.production sync ./agents/my-agent\n```\n\nExample environment files:\n\n```bash\n# .env.staging (syncs to your staging project)\nOCTAVUS_CLI_API_KEY=oct_sk_staging_project_key...\n\n# .env.production (syncs to your production project)\nOCTAVUS_CLI_API_KEY=oct_sk_production_project_key...\n```\n\nEach project has its own agents, so you\'ll get different agent IDs per environment.\n\n## Global Options\n\n| Option | Description |\n| -------------- | ------------------------------------------------------- |\n| `--env <file>` | Load environment from a specific file (default: `.env`) |\n| `--help` | Show help |\n| `--version` | Show version |\n\n## Commands\n\n### `octavus sync <path>`\n\nSync an agent definition to the platform. Creates the agent if it doesn\'t exist, or updates it if it does.\n\n```bash\noctavus sync ./agents/my-agent\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading agent from ./agents/my-agent...\n\u2139 Syncing support-chat...\n\u2713 Created: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus validate <path>`\n\nValidate an agent definition without saving. Useful for CI/CD pipelines.\n\n```bash\noctavus validate ./agents/my-agent\n```\n\n**Exit codes:**\n\n- `0` - Validation passed\n- `1` - Validation errors\n- `2` - Configuration errors (missing API key, etc.)\n\n### `octavus list`\n\nList all agents in your project.\n\n```bash\noctavus list\n```\n\n**Example output:**\n\n```\nSLUG NAME FORMAT ID\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nsupport-chat Support Chat Agent interactive clxyz123abc456\n\n1 agent(s)\n```\n\n### `octavus get <slug>`\n\nGet details about a specific agent by its slug.\n\n```bash\noctavus get support-chat\n```\n\n### `octavus archive <slug>`\n\nArchive an agent by slug (soft delete). Archived agents are removed from the active agent list and their slug is freed for reuse.\n\n```bash\noctavus archive support-chat\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Archiving support-chat...\n\u2713 Archived: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus skills sync <path>`\n\nSync a skill to the platform. Packages the skill directory into a bundle (excluding `.env` files, `.git`, and `node_modules`), uploads it, and optionally pushes secrets from the skill\'s `.env` file.\n\n```bash\noctavus skills sync ./skills/github\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading skill from ./skills/github...\n\u2139 Packaging github...\n\u2713 Created: github\n Skill ID: clxyz789def012\n\u2139 Pushing 2 secret(s)...\n\u2713 2 secret(s) updated\n```\n\n**Secret handling:**\n\nIf the skill directory contains a `.env` file, secrets are pushed alongside the bundle. Secrets are cross-validated against the `secrets` declarations in `SKILL.md` - warnings are shown for undeclared or missing required secrets.\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md\n\u251C\u2500\u2500 scripts/\n\u2502 \u2514\u2500\u2500 run.py\n\u2514\u2500\u2500 .env # Secrets (not included in bundle)\n```\n\nSee [Skills](/docs/protocol/skills) for details on skill format, secrets, and secure mode.\n\n## Agent Directory Structure\n\nThe CLI expects agent definitions in a specific directory structure:\n\n```\nmy-agent/\n\u251C\u2500\u2500 settings.json # Required: Agent metadata\n\u251C\u2500\u2500 protocol.yaml # Required: Agent protocol\n\u251C\u2500\u2500 prompts/ # Optional: Prompt templates\n\u2502 \u251C\u2500\u2500 system.md\n\u2502 \u2514\u2500\u2500 user-message.md\n\u2514\u2500\u2500 references/ # Optional: Reference documents\n \u2514\u2500\u2500 api-guidelines.md\n```\n\n### references/\n\nReference files are markdown documents with YAML frontmatter containing a `description`. The agent can fetch these on demand during execution. See [References](/docs/protocol/references) for details.\n\n### settings.json\n\n```json\n{\n "slug": "my-agent",\n "name": "My Agent",\n "description": "A helpful assistant",\n "format": "interactive"\n}\n```\n\n### protocol.yaml\n\nSee the [Protocol documentation](/docs/protocol/overview) for details on protocol syntax.\n\n## CI/CD Integration\n\n### GitHub Actions\n\n```yaml\nname: Validate and Sync Agents\n\non:\n push:\n branches: [main]\n paths:\n - \'agents/**\'\n\njobs:\n sync:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v4\n\n - uses: actions/setup-node@v4\n with:\n node-version: \'22\'\n\n - run: npm install\n\n - name: Validate agent\n run: npx octavus validate ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n\n - name: Sync agent\n run: npx octavus sync ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n```\n\n### Package.json Scripts\n\nAdd sync scripts to your `package.json`:\n\n```json\n{\n "scripts": {\n "agents:validate": "octavus validate ./agents/my-agent",\n "agents:sync": "octavus sync ./agents/my-agent"\n },\n "devDependencies": {\n "@octavus/cli": "^0.1.0"\n }\n}\n```\n\n## Workflow\n\nThe recommended workflow for managing agents:\n\n1. **Define agent locally** - Create `settings.json`, `protocol.yaml`, and prompts\n2. **Validate** - Run `octavus validate ./my-agent` to check for errors\n3. **Sync** - Run `octavus sync ./my-agent` to push to platform\n4. **Store agent ID** - Save the output ID in an environment variable\n5. **Use in app** - Read the ID from env and pass to `client.agentSessions.create()`\n\n```bash\n# After syncing: octavus sync ./agents/support-chat\n# Output: Agent ID: clxyz123abc456\n\n# Add to your .env file\nOCTAVUS_SUPPORT_AGENT_ID=clxyz123abc456\n```\n\n```typescript\nconst agentId = process.env.OCTAVUS_SUPPORT_AGENT_ID;\n\nconst sessionId = await client.agentSessions.create(agentId, {\n COMPANY_NAME: \'Acme Corp\',\n});\n```\n',
62
+ content: '\n# Octavus CLI\n\nThe `@octavus/cli` package provides a command-line interface for validating and syncing agent definitions from your local filesystem to the Octavus platform.\n\n**Current version:** `5.1.0`\n\n## Installation\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Configuration\n\nThe CLI requires an API key with the **Agents** permission.\n\n### Environment Variables\n\n| Variable | Description |\n| --------------------- | ---------------------------------------------- |\n| `OCTAVUS_CLI_API_KEY` | API key with "Agents" permission (recommended) |\n| `OCTAVUS_API_KEY` | Fallback if `OCTAVUS_CLI_API_KEY` not set |\n| `OCTAVUS_API_URL` | Optional, defaults to `https://octavus.ai` |\n\n### Two-Key Strategy (Recommended)\n\nFor production deployments, use separate API keys with minimal permissions:\n\n```bash\n# CI/CD or .env.local (not committed)\nOCTAVUS_CLI_API_KEY=oct_sk_... # "Agents" permission only\n\n# Production .env\nOCTAVUS_API_KEY=oct_sk_... # "Sessions" permission only\n```\n\nThis ensures production servers only have session permissions (smaller blast radius if leaked), while agent management is restricted to development/CI environments.\n\n### Multiple Environments\n\nUse separate Octavus projects for staging and production, each with their own API keys. The `--env` flag lets you load different environment files:\n\n```bash\n# Local development (default: .env)\noctavus sync ./agents/my-agent\n\n# Staging project\noctavus --env .env.staging sync ./agents/my-agent\n\n# Production project\noctavus --env .env.production sync ./agents/my-agent\n```\n\nExample environment files:\n\n```bash\n# .env.staging (syncs to your staging project)\nOCTAVUS_CLI_API_KEY=oct_sk_staging_project_key...\n\n# .env.production (syncs to your production project)\nOCTAVUS_CLI_API_KEY=oct_sk_production_project_key...\n```\n\nEach project has its own agents, so you\'ll get different agent IDs per environment.\n\n## Global Options\n\n| Option | Description |\n| -------------- | ------------------------------------------------------- |\n| `--env <file>` | Load environment from a specific file (default: `.env`) |\n| `--help` | Show help |\n| `--version` | Show version |\n\n## Commands\n\n### `octavus sync <path>`\n\nSync an agent definition to the platform. Creates the agent if it doesn\'t exist, or updates it if it does.\n\n```bash\noctavus sync ./agents/my-agent\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading agent from ./agents/my-agent...\n\u2139 Syncing support-chat...\n\u2713 Created: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus validate <path>`\n\nValidate an agent definition without saving. Useful for CI/CD pipelines.\n\n```bash\noctavus validate ./agents/my-agent\n```\n\n**Exit codes:**\n\n- `0` - Validation passed\n- `1` - Validation errors\n- `2` - Configuration errors (missing API key, etc.)\n\n### `octavus list`\n\nList all agents in your project.\n\n```bash\noctavus list\n```\n\n**Example output:**\n\n```\nSLUG NAME FORMAT ID\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nsupport-chat Support Chat Agent interactive clxyz123abc456\n\n1 agent(s)\n```\n\n### `octavus get <slug>`\n\nGet details about a specific agent by its slug.\n\n```bash\noctavus get support-chat\n```\n\n### `octavus archive <slug>`\n\nArchive an agent by slug (soft delete). Archived agents are removed from the active agent list and their slug is freed for reuse.\n\n```bash\noctavus archive support-chat\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Archiving support-chat...\n\u2713 Archived: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus skills sync <path>`\n\nSync a skill to the platform. Packages the skill directory into a bundle (excluding `.env` files, `.git`, and `node_modules`), uploads it, and optionally pushes secrets from the skill\'s `.env` file.\n\n```bash\noctavus skills sync ./skills/github\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading skill from ./skills/github...\n\u2139 Packaging github...\n\u2713 Created: github\n Skill ID: clxyz789def012\n\u2139 Pushing 2 secret(s)...\n\u2713 2 secret(s) updated\n```\n\n**Secret handling:**\n\nIf the skill directory contains a `.env` file, secrets are pushed alongside the bundle. Secrets are cross-validated against the `secrets` declarations in `SKILL.md` - warnings are shown for undeclared or missing required secrets.\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md\n\u251C\u2500\u2500 scripts/\n\u2502 \u2514\u2500\u2500 run.py\n\u2514\u2500\u2500 .env # Secrets (not included in bundle)\n```\n\nSee [Skills](/docs/protocol/skills) for details on skill format, secrets, and secure mode.\n\n## Agent Directory Structure\n\nThe CLI expects agent definitions in a specific directory structure:\n\n```\nmy-agent/\n\u251C\u2500\u2500 settings.json # Required: Agent metadata\n\u251C\u2500\u2500 protocol.yaml # Required: Agent protocol\n\u251C\u2500\u2500 prompts/ # Optional: Prompt templates\n\u2502 \u251C\u2500\u2500 system.md\n\u2502 \u2514\u2500\u2500 user-message.md\n\u2514\u2500\u2500 references/ # Optional: Reference documents\n \u2514\u2500\u2500 api-guidelines.md\n```\n\n### references/\n\nReference files are markdown documents with YAML frontmatter containing a `description`. The agent can fetch these on demand during execution. See [References](/docs/protocol/references) for details.\n\n### settings.json\n\n```json\n{\n "slug": "my-agent",\n "name": "My Agent",\n "description": "A helpful assistant",\n "format": "interactive"\n}\n```\n\n### protocol.yaml\n\nSee the [Protocol documentation](/docs/protocol/overview) for details on protocol syntax.\n\n## CI/CD Integration\n\n### GitHub Actions\n\n```yaml\nname: Validate and Sync Agents\n\non:\n push:\n branches: [main]\n paths:\n - \'agents/**\'\n\njobs:\n sync:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v4\n\n - uses: actions/setup-node@v4\n with:\n node-version: \'22\'\n\n - run: npm install\n\n - name: Validate agent\n run: npx octavus validate ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n\n - name: Sync agent\n run: npx octavus sync ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n```\n\n### Package.json Scripts\n\nAdd sync scripts to your `package.json`:\n\n```json\n{\n "scripts": {\n "agents:validate": "octavus validate ./agents/my-agent",\n "agents:sync": "octavus sync ./agents/my-agent"\n },\n "devDependencies": {\n "@octavus/cli": "^0.1.0"\n }\n}\n```\n\n## Workflow\n\nThe recommended workflow for managing agents:\n\n1. **Define agent locally** - Create `settings.json`, `protocol.yaml`, and prompts\n2. **Validate** - Run `octavus validate ./my-agent` to check for errors\n3. **Sync** - Run `octavus sync ./my-agent` to push to platform\n4. **Store agent ID** - Save the output ID in an environment variable\n5. **Use in app** - Read the ID from env and pass to `client.agentSessions.create()`\n\n```bash\n# After syncing: octavus sync ./agents/support-chat\n# Output: Agent ID: clxyz123abc456\n\n# Add to your .env file\nOCTAVUS_SUPPORT_AGENT_ID=clxyz123abc456\n```\n\n```typescript\nconst agentId = process.env.OCTAVUS_SUPPORT_AGENT_ID;\n\nconst sessionId = await client.agentSessions.create(agentId, {\n COMPANY_NAME: \'Acme Corp\',\n});\n```\n',
63
63
  excerpt: "Octavus CLI The package provides a command-line interface for validating and syncing agent definitions from your local filesystem to the Octavus platform. Current version: Installation ...",
64
64
  order: 5
65
65
  },
@@ -86,7 +86,7 @@ var docs_default = [
86
86
  section: "server-sdk",
87
87
  title: "Computer",
88
88
  description: "Adding browser, filesystem, and shell capabilities to agents with @octavus/computer.",
89
- content: "\n# Computer\n\nThe `@octavus/computer` package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to [MCP](https://modelcontextprotocol.io) servers, discovers their tools, and provides them to the server-sdk.\n\n**Current version:** `5.0.0`\n\n## Installation\n\n```bash\nnpm install @octavus/computer\n```\n\n## Quick Start\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=http://127.0.0.1:9222']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', ['/path/to/workspace']),\n shell: Computer.shell({ cwd: '/path/to/workspace', mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\nDynamic tools are registered after attaching via `session.setDynamicTools()`. Pass the `computer` directly - the session extracts schemas and handlers from the `ToolProvider`. Tool schemas are sent to the platform on the next `execute()` call, and tool calls flow back through the existing execution loop.\n\n## How It Works\n\n1. You configure MCP servers with namespaces (e.g., `browser`, `filesystem`, `shell`)\n2. `computer.start()` connects to all servers in parallel and discovers their tools\n3. Each tool is namespaced with `__` (e.g., `browser__navigate_page`, `filesystem__read_file`)\n4. The server-sdk sends tool schemas to the platform and handles tool call execution\n\nThe agent's protocol must declare matching `mcpServers` with `source: device` - see [MCP Servers](/docs/protocol/mcp-servers).\n\n## Entry Types\n\nThe `Computer` class supports three types of MCP entries:\n\n### Stdio (MCP Subprocess)\n\nSpawns an MCP server as a child process, communicating via stdin/stdout:\n\n```typescript\nComputer.stdio(command: string, args?: string[], options?: {\n env?: Record<string, string>;\n cwd?: string;\n})\n```\n\nUse this for local MCP servers installed as npm packages or standalone executables:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n '--browser-url=http://127.0.0.1:9222',\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [\n '/Users/me/projects/my-app',\n ]),\n },\n});\n```\n\n### HTTP (Remote MCP Endpoint)\n\nConnects to an MCP server over Streamable HTTP:\n\n```typescript\nComputer.http(url: string, options?: {\n headers?: Record<string, string>;\n})\n```\n\nUse this for MCP servers running as HTTP services:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n docs: Computer.http('http://localhost:3001/mcp', {\n headers: { Authorization: 'Bearer token' },\n }),\n },\n});\n```\n\n### Shell (Built-in)\n\nProvides shell command execution without spawning an MCP subprocess:\n\n```typescript\nComputer.shell(options: {\n cwd?: string;\n mode: ShellMode;\n timeout?: number; // Default: 300,000ms (5 minutes)\n})\n```\n\nThis exposes a `run_command` tool (namespaced as `shell__run_command` when the key is `shell`). Commands execute in a login shell with the user's full environment.\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n shell: Computer.shell({\n cwd: '/Users/me/projects/my-app',\n mode: 'unrestricted',\n timeout: 300_000,\n }),\n },\n});\n```\n\n#### Shell Safety Modes\n\n| Mode | Description |\n| -------------------------------------- | --------------------------------------------- |\n| `'unrestricted'` | All commands allowed (for dedicated machines) |\n| `{ allowedPatterns, blockedPatterns }` | Pattern-based command filtering |\n\nPattern-based filtering:\n\n```typescript\nComputer.shell({\n cwd: workspaceDir,\n mode: {\n blockedPatterns: [/rm\\s+-rf/, /sudo/],\n allowedPatterns: [/^git\\s/, /^npm\\s/, /^ls\\s/],\n },\n});\n```\n\nWhen `allowedPatterns` is set, only matching commands are permitted. When `blockedPatterns` is set, matching commands are rejected. Blocked patterns are checked first.\n\n## Lifecycle\n\n### Starting\n\n`computer.start()` connects to all configured MCP servers in parallel. If some servers fail to connect, the computer still starts with the remaining servers - only if _all_ connections fail does it throw an error.\n\n```typescript\nconst { errors } = await computer.start();\n\nif (errors.length > 0) {\n console.warn('Some MCP servers failed to connect:', errors);\n}\n```\n\n### Stopping\n\n`computer.stop()` closes all MCP connections and kills managed processes:\n\n```typescript\nawait computer.stop();\n```\n\nAlways call `stop()` when the session ends to clean up MCP subprocesses. For managed processes (like Chrome), pass them in the config for automatic cleanup.\n\n## Dynamic Entries\n\nYou can add or remove MCP entries on a running `Computer` after `start()` has returned. This is useful when MCP configurations arrive after construction - for example, when a session-manager receives per-session entries from a dispatch payload and wants to wire them into the existing computer instead of rebuilding it.\n\n### `addEntry(namespace, entry, options?)`\n\nRegisters a new MCP entry under `namespace`. By default, connects immediately:\n\n```typescript\nawait computer.addEntry(\n 'github',\n Computer.stdio('@modelcontextprotocol/server-github', [], {\n env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GH_TOKEN! },\n }),\n);\n```\n\nPass `{ deferred: true }` to register the entry without connecting. The entry starts in a degraded state and connects on the next `restartEntry(namespace)` call - useful for lazy MCPs the agent activates on demand:\n\n```typescript\nawait computer.addEntry('github', githubEntry, { deferred: true });\n\n// Later, when the agent decides it needs GitHub:\nawait computer.restartEntry('github');\n```\n\n`addEntry` throws if the namespace already exists. To replace an entry, call `removeEntry` first.\n\nIf the immediate connection fails, `addEntry` does not throw - the entry is registered as degraded with the error message attached. Inspect via `getHealth()` or `restartEntry()` to retry.\n\n### `removeEntry(namespace)`\n\nCloses the entry's connection (if any) and drops it from the configuration. No-op when the namespace doesn't exist:\n\n```typescript\nawait computer.removeEntry('github');\n```\n\n### `restartEntry(namespace)`\n\nCloses the existing connection (if any) and reconnects with the current configuration:\n\n```typescript\nawait computer.restartEntry('github');\n```\n\nUse this to bring a deferred entry online for the first time, or to recover an entry that became degraded mid-session.\n\n### Detecting dynamic-entry support\n\nConsumers that work with arbitrary `ToolProvider` implementations can detect dynamic-entry capability with `isDynamicMcpProvider`:\n\n```typescript\nimport { isDynamicMcpProvider } from '@octavus/server-sdk';\n\nif (isDynamicMcpProvider(provider)) {\n await provider.addEntry('github', githubEntry);\n}\n```\n\n`Computer` always passes this check.\n\n## Chrome Launch Helper\n\nFor desktop applications that need to control a browser, `Computer.launchChrome()` launches Chrome with remote debugging enabled:\n\n```typescript\nconst browser = await Computer.launchChrome({\n profileDir: '/Users/me/.my-app/chrome-profiles/agent-1',\n debuggingPort: 9222, // Optional, auto-allocated if omitted\n flags: ['--window-size=1280,800'],\n});\n\nconsole.log(`Chrome running on port ${browser.port}, PID ${browser.pid}`);\n```\n\nPass the browser to `managedProcesses` for automatic cleanup when the computer stops:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [workspaceDir]),\n shell: Computer.shell({ cwd: workspaceDir, mode: 'unrestricted' }),\n },\n managedProcesses: [{ process: browser.process }],\n});\n```\n\n### ChromeLaunchOptions\n\n| Field | Required | Description |\n| --------------- | -------- | ----------------------------------------------------- |\n| `profileDir` | Yes | Directory for Chrome's user data (profile isolation) |\n| `debuggingPort` | No | Port for remote debugging (auto-allocated if omitted) |\n| `flags` | No | Additional Chrome launch flags |\n\n## ToolProvider Interface\n\n`Computer` implements the `ToolProvider` interface from `@octavus/core`:\n\n```typescript\ninterface ToolProvider {\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n```\n\n`setDynamicTools()` accepts any `ToolProvider` directly - the session extracts schemas and handlers automatically:\n\n```typescript\nsession.setDynamicTools(computer);\n```\n\nYou can also pass a custom `ToolProvider`:\n\n```typescript\nconst customProvider: ToolProvider = {\n toolHandlers() {\n return {\n custom__my_tool: async (args) => {\n return { result: 'done' };\n },\n };\n },\n toolSchemas() {\n return [\n {\n name: 'custom__my_tool',\n description: 'A custom tool',\n inputSchema: {\n type: 'object',\n properties: {\n input: { type: 'string', description: 'Tool input' },\n },\n required: ['input'],\n },\n },\n ];\n },\n};\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: { 'set-chat-title': titleHandler },\n});\n\nsession.setDynamicTools(customProvider);\n```\n\nFor cases where you need explicit control, `setDynamicTools()` also accepts a `DynamicTool[]` array:\n\n```typescript\ninterface DynamicTool {\n schema: ToolSchema;\n handler: ToolHandler;\n}\n```\n\n## Complete Example\n\nA desktop application with browser, filesystem, and shell capabilities:\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst WORKSPACE_DIR = '/Users/me/projects/my-app';\nconst PROFILE_DIR = '/Users/me/.my-app/chrome-profiles/agent';\n\nasync function startSession(sessionId: string) {\n // 1. Launch Chrome with remote debugging\n const browser = await Computer.launchChrome({\n profileDir: PROFILE_DIR,\n });\n\n // 2. Create computer with all capabilities\n const computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [WORKSPACE_DIR]),\n shell: Computer.shell({\n cwd: WORKSPACE_DIR,\n mode: 'unrestricted',\n }),\n },\n managedProcesses: [{ process: browser.process }],\n });\n\n // 3. Connect to all MCP servers\n const { errors } = await computer.start();\n if (errors.length > 0) {\n console.warn('Failed to connect:', errors);\n }\n\n // 4. Attach to session and register dynamic tools\n const client = new OctavusClient({\n baseUrl: process.env.OCTAVUS_API_URL!,\n apiKey: process.env.OCTAVUS_API_KEY!,\n });\n\n const session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => {\n console.log('Chat title:', args.title);\n return { success: true };\n },\n },\n });\n\n session.setDynamicTools(computer);\n\n // 5. Execute and stream\n const events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Navigate to github.com and take a screenshot' },\n });\n\n for await (const event of events) {\n // Handle stream events\n }\n\n // 6. Clean up\n await computer.stop();\n}\n```\n\n## API Reference\n\n### Computer\n\n```typescript\nclass Computer implements ToolProvider {\n constructor(config: ComputerConfig);\n\n // Static factories for MCP entries\n static stdio(\n command: string,\n args?: string[],\n options?: {\n env?: Record<string, string>;\n cwd?: string;\n },\n ): StdioConfig;\n\n static http(\n url: string,\n options?: {\n headers?: Record<string, string>;\n },\n ): HttpConfig;\n\n static shell(options: { cwd?: string; mode: ShellMode; timeout?: number }): ShellConfig;\n\n // Chrome launch helper\n static launchChrome(options: ChromeLaunchOptions): Promise<ChromeInstance>;\n\n // Lifecycle\n start(): Promise<{ errors: string[] }>;\n stop(): Promise<void>;\n\n // Dynamic entries\n addEntry(namespace: string, entry: McpEntry, options?: { deferred?: boolean }): Promise<void>;\n removeEntry(namespace: string): Promise<void>;\n restartEntry(namespace: string): Promise<void>;\n stopEntry(namespace: string): Promise<void>;\n\n // Health\n getHealth(): Promise<ComputerHealth>;\n ensureReady(): Promise<EnsureReadyResult>;\n retryDegraded(): Promise<{ recovered: string[]; stillDegraded: string[] }>;\n\n // ToolProvider implementation\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n\ninterface ComputerHealth {\n healthy: boolean;\n entries: EntryHealth[];\n totalTools: number;\n}\n\ninterface EntryHealth {\n name: string;\n healthy: boolean;\n error?: string;\n}\n\ninterface EnsureReadyResult extends ComputerHealth {\n recovered?: string[];\n failedEntries?: string[];\n}\n```\n\n### ComputerConfig\n\n```typescript\ninterface ComputerConfig {\n mcpServers: Record<string, McpEntry>;\n managedProcesses?: { process: ChildProcess }[];\n /** Namespaces to skip during start() - they begin as degraded and can be connected on demand via restartEntry(). */\n deferredEntries?: string[];\n}\n\ntype McpEntry = StdioConfig | HttpConfig | ShellConfig;\ntype ShellMode =\n | 'unrestricted'\n | {\n allowedPatterns?: RegExp[];\n blockedPatterns?: RegExp[];\n };\n```\n\n### ChromeInstance\n\n```typescript\ninterface ChromeInstance {\n port: number;\n process: ChildProcess;\n pid: number;\n}\n```\n",
89
+ content: "\n# Computer\n\nThe `@octavus/computer` package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to [MCP](https://modelcontextprotocol.io) servers, discovers their tools, and provides them to the server-sdk.\n\n**Current version:** `5.1.0`\n\n## Installation\n\n```bash\nnpm install @octavus/computer\n```\n\n## Quick Start\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=http://127.0.0.1:9222']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', ['/path/to/workspace']),\n shell: Computer.shell({ cwd: '/path/to/workspace', mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\nDynamic tools are registered after attaching via `session.setDynamicTools()`. Pass the `computer` directly - the session extracts schemas and handlers from the `ToolProvider`. Tool schemas are sent to the platform on the next `execute()` call, and tool calls flow back through the existing execution loop.\n\n## How It Works\n\n1. You configure MCP servers with namespaces (e.g., `browser`, `filesystem`, `shell`)\n2. `computer.start()` connects to all servers in parallel and discovers their tools\n3. Each tool is namespaced with `__` (e.g., `browser__navigate_page`, `filesystem__read_file`)\n4. The server-sdk sends tool schemas to the platform and handles tool call execution\n\nThe agent's protocol must declare matching `mcpServers` with `source: device` - see [MCP Servers](/docs/protocol/mcp-servers).\n\n## Entry Types\n\nThe `Computer` class supports three types of MCP entries:\n\n### Stdio (MCP Subprocess)\n\nSpawns an MCP server as a child process, communicating via stdin/stdout:\n\n```typescript\nComputer.stdio(command: string, args?: string[], options?: {\n env?: Record<string, string>;\n cwd?: string;\n})\n```\n\nUse this for local MCP servers installed as npm packages or standalone executables:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n '--browser-url=http://127.0.0.1:9222',\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [\n '/Users/me/projects/my-app',\n ]),\n },\n});\n```\n\n### HTTP (Remote MCP Endpoint)\n\nConnects to an MCP server over Streamable HTTP:\n\n```typescript\nComputer.http(url: string, options?: {\n headers?: Record<string, string>;\n})\n```\n\nUse this for MCP servers running as HTTP services:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n docs: Computer.http('http://localhost:3001/mcp', {\n headers: { Authorization: 'Bearer token' },\n }),\n },\n});\n```\n\n### Shell (Built-in)\n\nProvides shell command execution without spawning an MCP subprocess:\n\n```typescript\nComputer.shell(options: {\n cwd?: string;\n mode: ShellMode;\n timeout?: number; // Default: 300,000ms (5 minutes)\n})\n```\n\nThis exposes a `run_command` tool (namespaced as `shell__run_command` when the key is `shell`). Commands execute in a login shell with the user's full environment.\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n shell: Computer.shell({\n cwd: '/Users/me/projects/my-app',\n mode: 'unrestricted',\n timeout: 300_000,\n }),\n },\n});\n```\n\n#### Shell Safety Modes\n\n| Mode | Description |\n| -------------------------------------- | --------------------------------------------- |\n| `'unrestricted'` | All commands allowed (for dedicated machines) |\n| `{ allowedPatterns, blockedPatterns }` | Pattern-based command filtering |\n\nPattern-based filtering:\n\n```typescript\nComputer.shell({\n cwd: workspaceDir,\n mode: {\n blockedPatterns: [/rm\\s+-rf/, /sudo/],\n allowedPatterns: [/^git\\s/, /^npm\\s/, /^ls\\s/],\n },\n});\n```\n\nWhen `allowedPatterns` is set, only matching commands are permitted. When `blockedPatterns` is set, matching commands are rejected. Blocked patterns are checked first.\n\n## Lifecycle\n\n### Starting\n\n`computer.start()` connects to all configured MCP servers in parallel. If some servers fail to connect, the computer still starts with the remaining servers - only if _all_ connections fail does it throw an error.\n\n```typescript\nconst { errors } = await computer.start();\n\nif (errors.length > 0) {\n console.warn('Some MCP servers failed to connect:', errors);\n}\n```\n\n### Stopping\n\n`computer.stop()` closes all MCP connections and kills managed processes:\n\n```typescript\nawait computer.stop();\n```\n\nAlways call `stop()` when the session ends to clean up MCP subprocesses. For managed processes (like Chrome), pass them in the config for automatic cleanup.\n\n## Dynamic Entries\n\nYou can add or remove MCP entries on a running `Computer` after `start()` has returned. This is useful when MCP configurations arrive after construction - for example, when a session-manager receives per-session entries from a dispatch payload and wants to wire them into the existing computer instead of rebuilding it.\n\n### `addEntry(namespace, entry, options?)`\n\nRegisters a new MCP entry under `namespace`. By default, connects immediately:\n\n```typescript\nawait computer.addEntry(\n 'github',\n Computer.stdio('@modelcontextprotocol/server-github', [], {\n env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GH_TOKEN! },\n }),\n);\n```\n\nPass `{ deferred: true }` to register the entry without connecting. The entry starts in a degraded state and connects on the next `restartEntry(namespace)` call - useful for lazy MCPs the agent activates on demand:\n\n```typescript\nawait computer.addEntry('github', githubEntry, { deferred: true });\n\n// Later, when the agent decides it needs GitHub:\nawait computer.restartEntry('github');\n```\n\n`addEntry` throws if the namespace already exists. To replace an entry, call `removeEntry` first.\n\nIf the immediate connection fails, `addEntry` does not throw - the entry is registered as degraded with the error message attached. Inspect via `getHealth()` or `restartEntry()` to retry.\n\n### `removeEntry(namespace)`\n\nCloses the entry's connection (if any) and drops it from the configuration. No-op when the namespace doesn't exist:\n\n```typescript\nawait computer.removeEntry('github');\n```\n\n### `restartEntry(namespace)`\n\nCloses the existing connection (if any) and reconnects with the current configuration:\n\n```typescript\nawait computer.restartEntry('github');\n```\n\nUse this to bring a deferred entry online for the first time, or to recover an entry that became degraded mid-session.\n\n### Detecting dynamic-entry support\n\nConsumers that work with arbitrary `ToolProvider` implementations can detect dynamic-entry capability with `isDynamicMcpProvider`:\n\n```typescript\nimport { isDynamicMcpProvider } from '@octavus/server-sdk';\n\nif (isDynamicMcpProvider(provider)) {\n await provider.addEntry('github', githubEntry);\n}\n```\n\n`Computer` always passes this check.\n\n## Chrome Launch Helper\n\nFor desktop applications that need to control a browser, `Computer.launchChrome()` launches Chrome with remote debugging enabled:\n\n```typescript\nconst browser = await Computer.launchChrome({\n profileDir: '/Users/me/.my-app/chrome-profiles/agent-1',\n debuggingPort: 9222, // Optional, auto-allocated if omitted\n flags: ['--window-size=1280,800'],\n});\n\nconsole.log(`Chrome running on port ${browser.port}, PID ${browser.pid}`);\n```\n\nPass the browser to `managedProcesses` for automatic cleanup when the computer stops:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [workspaceDir]),\n shell: Computer.shell({ cwd: workspaceDir, mode: 'unrestricted' }),\n },\n managedProcesses: [{ process: browser.process }],\n});\n```\n\n### ChromeLaunchOptions\n\n| Field | Required | Description |\n| --------------- | -------- | ----------------------------------------------------- |\n| `profileDir` | Yes | Directory for Chrome's user data (profile isolation) |\n| `debuggingPort` | No | Port for remote debugging (auto-allocated if omitted) |\n| `flags` | No | Additional Chrome launch flags |\n\n## ToolProvider Interface\n\n`Computer` implements the `ToolProvider` interface from `@octavus/core`:\n\n```typescript\ninterface ToolProvider {\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n```\n\n`setDynamicTools()` accepts any `ToolProvider` directly - the session extracts schemas and handlers automatically:\n\n```typescript\nsession.setDynamicTools(computer);\n```\n\nYou can also pass a custom `ToolProvider`:\n\n```typescript\nconst customProvider: ToolProvider = {\n toolHandlers() {\n return {\n custom__my_tool: async (args) => {\n return { result: 'done' };\n },\n };\n },\n toolSchemas() {\n return [\n {\n name: 'custom__my_tool',\n description: 'A custom tool',\n inputSchema: {\n type: 'object',\n properties: {\n input: { type: 'string', description: 'Tool input' },\n },\n required: ['input'],\n },\n },\n ];\n },\n};\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: { 'set-chat-title': titleHandler },\n});\n\nsession.setDynamicTools(customProvider);\n```\n\nFor cases where you need explicit control, `setDynamicTools()` also accepts a `DynamicTool[]` array:\n\n```typescript\ninterface DynamicTool {\n schema: ToolSchema;\n handler: ToolHandler;\n}\n```\n\n## Complete Example\n\nA desktop application with browser, filesystem, and shell capabilities:\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst WORKSPACE_DIR = '/Users/me/projects/my-app';\nconst PROFILE_DIR = '/Users/me/.my-app/chrome-profiles/agent';\n\nasync function startSession(sessionId: string) {\n // 1. Launch Chrome with remote debugging\n const browser = await Computer.launchChrome({\n profileDir: PROFILE_DIR,\n });\n\n // 2. Create computer with all capabilities\n const computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [WORKSPACE_DIR]),\n shell: Computer.shell({\n cwd: WORKSPACE_DIR,\n mode: 'unrestricted',\n }),\n },\n managedProcesses: [{ process: browser.process }],\n });\n\n // 3. Connect to all MCP servers\n const { errors } = await computer.start();\n if (errors.length > 0) {\n console.warn('Failed to connect:', errors);\n }\n\n // 4. Attach to session and register dynamic tools\n const client = new OctavusClient({\n baseUrl: process.env.OCTAVUS_API_URL!,\n apiKey: process.env.OCTAVUS_API_KEY!,\n });\n\n const session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => {\n console.log('Chat title:', args.title);\n return { success: true };\n },\n },\n });\n\n session.setDynamicTools(computer);\n\n // 5. Execute and stream\n const events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Navigate to github.com and take a screenshot' },\n });\n\n for await (const event of events) {\n // Handle stream events\n }\n\n // 6. Clean up\n await computer.stop();\n}\n```\n\n## API Reference\n\n### Computer\n\n```typescript\nclass Computer implements ToolProvider {\n constructor(config: ComputerConfig);\n\n // Static factories for MCP entries\n static stdio(\n command: string,\n args?: string[],\n options?: {\n env?: Record<string, string>;\n cwd?: string;\n },\n ): StdioConfig;\n\n static http(\n url: string,\n options?: {\n headers?: Record<string, string>;\n },\n ): HttpConfig;\n\n static shell(options: { cwd?: string; mode: ShellMode; timeout?: number }): ShellConfig;\n\n // Chrome launch helper\n static launchChrome(options: ChromeLaunchOptions): Promise<ChromeInstance>;\n\n // Lifecycle\n start(): Promise<{ errors: string[] }>;\n stop(): Promise<void>;\n\n // Dynamic entries\n addEntry(namespace: string, entry: McpEntry, options?: { deferred?: boolean }): Promise<void>;\n removeEntry(namespace: string): Promise<void>;\n restartEntry(namespace: string): Promise<void>;\n stopEntry(namespace: string): Promise<void>;\n\n // Health\n getHealth(): Promise<ComputerHealth>;\n ensureReady(): Promise<EnsureReadyResult>;\n retryDegraded(): Promise<{ recovered: string[]; stillDegraded: string[] }>;\n\n // ToolProvider implementation\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n\ninterface ComputerHealth {\n healthy: boolean;\n entries: EntryHealth[];\n totalTools: number;\n}\n\ninterface EntryHealth {\n name: string;\n healthy: boolean;\n error?: string;\n}\n\ninterface EnsureReadyResult extends ComputerHealth {\n recovered?: string[];\n failedEntries?: string[];\n}\n```\n\n### ComputerConfig\n\n```typescript\ninterface ComputerConfig {\n mcpServers: Record<string, McpEntry>;\n managedProcesses?: { process: ChildProcess }[];\n /** Namespaces to skip during start() - they begin as degraded and can be connected on demand via restartEntry(). */\n deferredEntries?: string[];\n}\n\ntype McpEntry = StdioConfig | HttpConfig | ShellConfig;\ntype ShellMode =\n | 'unrestricted'\n | {\n allowedPatterns?: RegExp[];\n blockedPatterns?: RegExp[];\n };\n```\n\n### ChromeInstance\n\n```typescript\ninterface ChromeInstance {\n port: number;\n process: ChildProcess;\n pid: number;\n}\n```\n",
90
90
  excerpt: "Computer The package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to MCP servers, discovers their tools, and provides them to the server-sdk....",
91
91
  order: 8
92
92
  },
@@ -104,7 +104,7 @@ var docs_default = [
104
104
  section: "client-sdk",
105
105
  title: "Overview",
106
106
  description: "Introduction to the Octavus Client SDKs for building chat interfaces.",
107
- content: "\n# Client SDK Overview\n\nOctavus provides two packages for frontend integration:\n\n| Package | Purpose | Use When |\n| --------------------- | ------------------------ | ----------------------------------------------------- |\n| `@octavus/react` | React hooks and bindings | Building React applications |\n| `@octavus/client-sdk` | Framework-agnostic core | Using Vue, Svelte, vanilla JS, or custom integrations |\n\n**Most users should install `@octavus/react`** - it includes everything from `@octavus/client-sdk` plus React-specific hooks.\n\n## Installation\n\n### React Applications\n\n```bash\nnpm install @octavus/react\n```\n\n**Current version:** `5.0.0`\n\n### Other Frameworks\n\n```bash\nnpm install @octavus/client-sdk\n```\n\n**Current version:** `5.0.0`\n\n## Transport Pattern\n\nThe Client SDK uses a **transport abstraction** to handle communication with your backend. This gives you flexibility in how events are delivered:\n\n| Transport | Use Case | Docs |\n| ----------------------- | -------------------------------------------- | ----------------------------------------------------- |\n| `createHttpTransport` | HTTP/SSE (Next.js, Express, etc.) | [HTTP Transport](/docs/client-sdk/http-transport) |\n| `createSocketTransport` | WebSocket, SockJS, or other socket protocols | [Socket Transport](/docs/client-sdk/socket-transport) |\n\nWhen the transport changes (e.g., when `sessionId` changes), the `useOctavusChat` hook automatically reinitializes with the new transport.\n\n> **Recommendation**: Use HTTP transport unless you specifically need WebSocket features (custom real-time events, Meteor/Phoenix, etc.).\n\n## React Usage\n\nThe `useOctavusChat` hook provides state management and streaming for React applications:\n\n```tsx\nimport { useMemo } from 'react';\nimport { useOctavusChat, createHttpTransport, type UIMessage } from '@octavus/react';\n\nfunction Chat({ sessionId }: { sessionId: string }) {\n // Create a stable transport instance (memoized on sessionId)\n const transport = useMemo(\n () =>\n createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n }),\n [sessionId],\n );\n\n const { messages, status, send } = useOctavusChat({ transport });\n\n const sendMessage = async (text: string) => {\n await send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n };\n\n return (\n <div>\n {messages.map((msg) => (\n <MessageBubble key={msg.id} message={msg} />\n ))}\n </div>\n );\n}\n\nfunction MessageBubble({ message }: { message: UIMessage }) {\n return (\n <div>\n {message.parts.map((part, i) => {\n if (part.type === 'text') {\n return <p key={i}>{part.text}</p>;\n }\n return null;\n })}\n </div>\n );\n}\n```\n\n## Framework-Agnostic Usage\n\nThe `OctavusChat` class can be used with any framework or vanilla JavaScript:\n\n```typescript\nimport { OctavusChat, createHttpTransport } from '@octavus/client-sdk';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n\nconst chat = new OctavusChat({ transport });\n\n// Subscribe to state changes\nconst unsubscribe = chat.subscribe(() => {\n console.log('Messages:', chat.messages);\n console.log('Status:', chat.status);\n // Update your UI here\n});\n\n// Send a message\nawait chat.send('user-message', { USER_MESSAGE: 'Hello' }, { userMessage: { content: 'Hello' } });\n\n// Cleanup when done\nunsubscribe();\n```\n\n## Key Features\n\n### Unified Send Function\n\nThe `send` function handles both user message display and agent triggering in one call:\n\n```tsx\nconst { send } = useOctavusChat({ transport });\n\n// Add user message to UI and trigger agent\nawait send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n\n// Trigger without adding a user message (e.g., button click)\nawait send('request-human');\n```\n\n### Message Parts\n\nMessages contain ordered `parts` for rich content:\n\n```tsx\nconst { messages } = useOctavusChat({ transport });\n\n// Each message has typed parts\nmessage.parts.map((part) => {\n switch (part.type) {\n case 'text': // Text content\n case 'reasoning': // Extended reasoning/thinking\n case 'tool-call': // Tool execution\n case 'operation': // Internal operations (set-resource, etc.)\n }\n});\n```\n\n### Status Tracking\n\n```tsx\nconst { status } = useOctavusChat({ transport });\n\n// status: 'idle' | 'streaming' | 'error' | 'awaiting-input'\n// 'awaiting-input' occurs when interactive client tools need user action\n```\n\n### Stop Streaming\n\n```tsx\nconst { stop } = useOctavusChat({ transport });\n\n// Stop current stream and finalize message\nstop();\n```\n\n### Retry Last Trigger\n\nRe-execute the last trigger from the same starting point. Messages are rolled back to the state before the trigger, the user message is re-added (if any), and the agent re-executes. Already-uploaded files are reused without re-uploading.\n\n```tsx\nconst { retry, canRetry } = useOctavusChat({ transport });\n\n// Retry after an error, cancellation, or unsatisfactory result\nif (canRetry) {\n await retry();\n}\n```\n\n`canRetry` is `true` when a trigger has been sent and the chat is not currently streaming or awaiting input.\n\n## Hook Reference (React)\n\n### useOctavusChat\n\n```typescript\nfunction useOctavusChat(options: OctavusChatOptions): UseOctavusChatReturn;\n\ninterface OctavusChatOptions {\n // Required: Transport for streaming events\n transport: Transport;\n\n // Optional: Function to request upload URLs for file uploads\n requestUploadUrls?: (\n files: { filename: string; mediaType: string; size: number }[],\n ) => Promise<UploadUrlsResponse>;\n\n // Optional: Client-side tool handlers\n // - Function: executes automatically and returns result\n // - 'interactive': appears in pendingClientTools for user input\n clientTools?: Record<string, ClientToolHandler>;\n\n // Optional: Pre-populate with existing messages (session restore)\n initialMessages?: UIMessage[];\n\n // Optional: Callbacks\n onError?: (error: OctavusError) => void; // Structured error with type, source, retryable\n onFinish?: () => void;\n onStop?: () => void; // Called when user stops generation\n onResourceUpdate?: (name: string, value: unknown) => void;\n}\n\ninterface UseOctavusChatReturn {\n // State\n messages: UIMessage[];\n status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n error: OctavusError | null; // Structured error with type, source, retryable\n\n // Connection (socket transport only - undefined for HTTP)\n connectionState: ConnectionState | undefined; // 'disconnected' | 'connecting' | 'connected' | 'error'\n connectionError: Error | undefined;\n\n // Client tools (interactive tools awaiting user input)\n pendingClientTools: Record<string, InteractiveTool[]>; // Keyed by tool name\n\n // Actions\n send: (\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ) => Promise<void>;\n stop: () => void;\n retry: () => Promise<void>; // Retry last trigger from same starting point\n canRetry: boolean; // Whether retry() can be called\n\n // Connection management (socket transport only - undefined for HTTP)\n connect: (() => Promise<void>) | undefined;\n disconnect: (() => void) | undefined;\n\n // File uploads (requires requestUploadUrls)\n uploadFiles: (\n files: FileList | File[],\n onProgress?: (fileIndex: number, progress: number) => void,\n ) => Promise<FileReference[]>;\n}\n\ninterface UserMessageInput {\n content?: string;\n files?: FileList | File[] | FileReference[];\n}\n```\n\n### useAutoScroll\n\nSmart auto-scroll for chat containers. Scrolls to bottom when content updates, but pauses if the user has scrolled up. See [Streaming - Auto-Scroll](/docs/client-sdk/streaming#auto-scroll) for full usage.\n\n```typescript\nfunction useAutoScroll(options?: UseAutoScrollOptions): {\n scrollRef: RefObject<HTMLDivElement | null>;\n handleScroll: () => void;\n scrollOnUpdate: () => void;\n resetAutoScroll: () => void;\n};\n\ninterface UseAutoScrollOptions {\n scrollRef?: RefObject<HTMLDivElement | null>;\n threshold?: number; // Distance from bottom in px (default: 80)\n}\n```\n\n## Transport Reference\n\n### createHttpTransport\n\nCreates an HTTP/SSE transport using native `fetch()`:\n\n```typescript\nimport { createHttpTransport } from '@octavus/react';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n```\n\n### createSocketTransport\n\nCreates a WebSocket/SockJS transport for real-time connections:\n\n```typescript\nimport { createSocketTransport } from '@octavus/react';\n\nconst transport = createSocketTransport({\n connect: () =>\n new Promise((resolve, reject) => {\n const ws = new WebSocket(`wss://api.example.com/stream?sessionId=${sessionId}`);\n ws.onopen = () => resolve(ws);\n ws.onerror = () => reject(new Error('Connection failed'));\n }),\n});\n```\n\nSocket transport provides additional connection management:\n\n```typescript\n// Access connection state directly\ntransport.connectionState; // 'disconnected' | 'connecting' | 'connected' | 'error'\n\n// Subscribe to state changes\ntransport.onConnectionStateChange((state, error) => {\n /* ... */\n});\n\n// Eager connection (instead of lazy on first send)\nawait transport.connect();\n\n// Manual disconnect\ntransport.disconnect();\n```\n\nFor detailed WebSocket/SockJS usage including custom events, reconnection patterns, and server-side implementation, see [Socket Transport](/docs/client-sdk/socket-transport).\n\n## Class Reference (Framework-Agnostic)\n\n### OctavusChat\n\n```typescript\nclass OctavusChat {\n constructor(options: OctavusChatOptions);\n\n // State (read-only)\n readonly messages: UIMessage[];\n readonly status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n readonly error: OctavusError | null; // Structured error\n readonly pendingClientTools: Record<string, InteractiveTool[]>; // Interactive tools\n\n // Actions\n send(\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ): Promise<void>;\n stop(): void;\n\n // Subscription\n subscribe(callback: () => void): () => void; // Returns unsubscribe function\n}\n```\n\n## Next Steps\n\n- [HTTP Transport](/docs/client-sdk/http-transport) - HTTP/SSE integration (recommended)\n- [Socket Transport](/docs/client-sdk/socket-transport) - WebSocket and SockJS integration\n- [Messages](/docs/client-sdk/messages) - Working with message state\n- [Streaming](/docs/client-sdk/streaming) - Building streaming UIs\n- [Client Tools](/docs/client-sdk/client-tools) - Interactive browser-side tool handling\n- [Operations](/docs/client-sdk/execution-blocks) - Showing agent progress\n- [Error Handling](/docs/client-sdk/error-handling) - Handling errors with type guards\n- [File Uploads](/docs/client-sdk/file-uploads) - Uploading images and documents\n- [Examples](/docs/examples/overview) - Complete working examples\n",
107
+ content: "\n# Client SDK Overview\n\nOctavus provides two packages for frontend integration:\n\n| Package | Purpose | Use When |\n| --------------------- | ------------------------ | ----------------------------------------------------- |\n| `@octavus/react` | React hooks and bindings | Building React applications |\n| `@octavus/client-sdk` | Framework-agnostic core | Using Vue, Svelte, vanilla JS, or custom integrations |\n\n**Most users should install `@octavus/react`** - it includes everything from `@octavus/client-sdk` plus React-specific hooks.\n\n## Installation\n\n### React Applications\n\n```bash\nnpm install @octavus/react\n```\n\n**Current version:** `5.1.0`\n\n### Other Frameworks\n\n```bash\nnpm install @octavus/client-sdk\n```\n\n**Current version:** `5.1.0`\n\n## Transport Pattern\n\nThe Client SDK uses a **transport abstraction** to handle communication with your backend. This gives you flexibility in how events are delivered:\n\n| Transport | Use Case | Docs |\n| ----------------------- | -------------------------------------------- | ----------------------------------------------------- |\n| `createHttpTransport` | HTTP/SSE (Next.js, Express, etc.) | [HTTP Transport](/docs/client-sdk/http-transport) |\n| `createSocketTransport` | WebSocket, SockJS, or other socket protocols | [Socket Transport](/docs/client-sdk/socket-transport) |\n\nWhen the transport changes (e.g., when `sessionId` changes), the `useOctavusChat` hook automatically reinitializes with the new transport.\n\n> **Recommendation**: Use HTTP transport unless you specifically need WebSocket features (custom real-time events, Meteor/Phoenix, etc.).\n\n## React Usage\n\nThe `useOctavusChat` hook provides state management and streaming for React applications:\n\n```tsx\nimport { useMemo } from 'react';\nimport { useOctavusChat, createHttpTransport, type UIMessage } from '@octavus/react';\n\nfunction Chat({ sessionId }: { sessionId: string }) {\n // Create a stable transport instance (memoized on sessionId)\n const transport = useMemo(\n () =>\n createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n }),\n [sessionId],\n );\n\n const { messages, status, send } = useOctavusChat({ transport });\n\n const sendMessage = async (text: string) => {\n await send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n };\n\n return (\n <div>\n {messages.map((msg) => (\n <MessageBubble key={msg.id} message={msg} />\n ))}\n </div>\n );\n}\n\nfunction MessageBubble({ message }: { message: UIMessage }) {\n return (\n <div>\n {message.parts.map((part, i) => {\n if (part.type === 'text') {\n return <p key={i}>{part.text}</p>;\n }\n return null;\n })}\n </div>\n );\n}\n```\n\n## Framework-Agnostic Usage\n\nThe `OctavusChat` class can be used with any framework or vanilla JavaScript:\n\n```typescript\nimport { OctavusChat, createHttpTransport } from '@octavus/client-sdk';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n\nconst chat = new OctavusChat({ transport });\n\n// Subscribe to state changes\nconst unsubscribe = chat.subscribe(() => {\n console.log('Messages:', chat.messages);\n console.log('Status:', chat.status);\n // Update your UI here\n});\n\n// Send a message\nawait chat.send('user-message', { USER_MESSAGE: 'Hello' }, { userMessage: { content: 'Hello' } });\n\n// Cleanup when done\nunsubscribe();\n```\n\n## Key Features\n\n### Unified Send Function\n\nThe `send` function handles both user message display and agent triggering in one call:\n\n```tsx\nconst { send } = useOctavusChat({ transport });\n\n// Add user message to UI and trigger agent\nawait send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n\n// Trigger without adding a user message (e.g., button click)\nawait send('request-human');\n```\n\n### Message Parts\n\nMessages contain ordered `parts` for rich content:\n\n```tsx\nconst { messages } = useOctavusChat({ transport });\n\n// Each message has typed parts\nmessage.parts.map((part) => {\n switch (part.type) {\n case 'text': // Text content\n case 'reasoning': // Extended reasoning/thinking\n case 'tool-call': // Tool execution\n case 'operation': // Internal operations (set-resource, etc.)\n }\n});\n```\n\n### Status Tracking\n\n```tsx\nconst { status } = useOctavusChat({ transport });\n\n// status: 'idle' | 'streaming' | 'error' | 'awaiting-input'\n// 'awaiting-input' occurs when interactive client tools need user action\n```\n\n### Stop Streaming\n\n```tsx\nconst { stop } = useOctavusChat({ transport });\n\n// Stop current stream and finalize message\nstop();\n```\n\n### Retry Last Trigger\n\nRe-execute the last trigger from the same starting point. Messages are rolled back to the state before the trigger, the user message is re-added (if any), and the agent re-executes. Already-uploaded files are reused without re-uploading.\n\n```tsx\nconst { retry, canRetry } = useOctavusChat({ transport });\n\n// Retry after an error, cancellation, or unsatisfactory result\nif (canRetry) {\n await retry();\n}\n```\n\n`canRetry` is `true` when a trigger has been sent and the chat is not currently streaming or awaiting input.\n\n## Hook Reference (React)\n\n### useOctavusChat\n\n```typescript\nfunction useOctavusChat(options: OctavusChatOptions): UseOctavusChatReturn;\n\ninterface OctavusChatOptions {\n // Required: Transport for streaming events\n transport: Transport;\n\n // Optional: Function to request upload URLs for file uploads\n requestUploadUrls?: (\n files: { filename: string; mediaType: string; size: number }[],\n ) => Promise<UploadUrlsResponse>;\n\n // Optional: Client-side tool handlers\n // - Function: executes automatically and returns result\n // - 'interactive': appears in pendingClientTools for user input\n clientTools?: Record<string, ClientToolHandler>;\n\n // Optional: Pre-populate with existing messages (session restore)\n initialMessages?: UIMessage[];\n\n // Optional: Callbacks\n onError?: (error: OctavusError) => void; // Structured error with type, source, retryable\n onFinish?: () => void;\n onStop?: () => void; // Called when user stops generation\n onResourceUpdate?: (name: string, value: unknown) => void;\n}\n\ninterface UseOctavusChatReturn {\n // State\n messages: UIMessage[];\n status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n error: OctavusError | null; // Structured error with type, source, retryable\n\n // Connection (socket transport only - undefined for HTTP)\n connectionState: ConnectionState | undefined; // 'disconnected' | 'connecting' | 'connected' | 'error'\n connectionError: Error | undefined;\n\n // Client tools (interactive tools awaiting user input)\n pendingClientTools: Record<string, InteractiveTool[]>; // Keyed by tool name\n\n // Actions\n send: (\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ) => Promise<void>;\n stop: () => void;\n retry: () => Promise<void>; // Retry last trigger from same starting point\n canRetry: boolean; // Whether retry() can be called\n\n // Connection management (socket transport only - undefined for HTTP)\n connect: (() => Promise<void>) | undefined;\n disconnect: (() => void) | undefined;\n\n // File uploads (requires requestUploadUrls)\n uploadFiles: (\n files: FileList | File[],\n onProgress?: (fileIndex: number, progress: number) => void,\n ) => Promise<FileReference[]>;\n}\n\ninterface UserMessageInput {\n content?: string;\n files?: FileList | File[] | FileReference[];\n}\n```\n\n### useAutoScroll\n\nSmart auto-scroll for chat containers. Scrolls to bottom when content updates, but pauses if the user has scrolled up. See [Streaming - Auto-Scroll](/docs/client-sdk/streaming#auto-scroll) for full usage.\n\n```typescript\nfunction useAutoScroll(options?: UseAutoScrollOptions): {\n scrollRef: RefObject<HTMLDivElement | null>;\n handleScroll: () => void;\n scrollOnUpdate: () => void;\n resetAutoScroll: () => void;\n};\n\ninterface UseAutoScrollOptions {\n scrollRef?: RefObject<HTMLDivElement | null>;\n threshold?: number; // Distance from bottom in px (default: 80)\n}\n```\n\n## Transport Reference\n\n### createHttpTransport\n\nCreates an HTTP/SSE transport using native `fetch()`:\n\n```typescript\nimport { createHttpTransport } from '@octavus/react';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n```\n\n### createSocketTransport\n\nCreates a WebSocket/SockJS transport for real-time connections:\n\n```typescript\nimport { createSocketTransport } from '@octavus/react';\n\nconst transport = createSocketTransport({\n connect: () =>\n new Promise((resolve, reject) => {\n const ws = new WebSocket(`wss://api.example.com/stream?sessionId=${sessionId}`);\n ws.onopen = () => resolve(ws);\n ws.onerror = () => reject(new Error('Connection failed'));\n }),\n});\n```\n\nSocket transport provides additional connection management:\n\n```typescript\n// Access connection state directly\ntransport.connectionState; // 'disconnected' | 'connecting' | 'connected' | 'error'\n\n// Subscribe to state changes\ntransport.onConnectionStateChange((state, error) => {\n /* ... */\n});\n\n// Eager connection (instead of lazy on first send)\nawait transport.connect();\n\n// Manual disconnect\ntransport.disconnect();\n```\n\nFor detailed WebSocket/SockJS usage including custom events, reconnection patterns, and server-side implementation, see [Socket Transport](/docs/client-sdk/socket-transport).\n\n## Class Reference (Framework-Agnostic)\n\n### OctavusChat\n\n```typescript\nclass OctavusChat {\n constructor(options: OctavusChatOptions);\n\n // State (read-only)\n readonly messages: UIMessage[];\n readonly status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n readonly error: OctavusError | null; // Structured error\n readonly pendingClientTools: Record<string, InteractiveTool[]>; // Interactive tools\n\n // Actions\n send(\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ): Promise<void>;\n stop(): void;\n\n // Subscription\n subscribe(callback: () => void): () => void; // Returns unsubscribe function\n}\n```\n\n## Next Steps\n\n- [HTTP Transport](/docs/client-sdk/http-transport) - HTTP/SSE integration (recommended)\n- [Socket Transport](/docs/client-sdk/socket-transport) - WebSocket and SockJS integration\n- [Messages](/docs/client-sdk/messages) - Working with message state\n- [Streaming](/docs/client-sdk/streaming) - Building streaming UIs\n- [Client Tools](/docs/client-sdk/client-tools) - Interactive browser-side tool handling\n- [Operations](/docs/client-sdk/execution-blocks) - Showing agent progress\n- [Error Handling](/docs/client-sdk/error-handling) - Handling errors with type guards\n- [File Uploads](/docs/client-sdk/file-uploads) - Uploading images and documents\n- [Examples](/docs/examples/overview) - Complete working examples\n",
108
108
  excerpt: "Client SDK Overview Octavus provides two packages for frontend integration: | Package | Purpose | Use When | |...",
109
109
  order: 1
110
110
  },
@@ -588,7 +588,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
588
588
  section: "protocol",
589
589
  title: "Tools",
590
590
  description: "Defining external tools implemented in your backend.",
591
- content: '\n# Tools\n\nTools extend what agents can do. Octavus supports multiple types:\n\n1. **External Tools** - Defined in the protocol, implemented in your backend (this page)\n2. **MCP Tools** - Auto-discovered from MCP servers (see [MCP Servers](/docs/protocol/mcp-servers))\n3. **Built-in Tools** - Provider-agnostic tools managed by Octavus (web search, image generation)\n4. **Provider Tools** - Provider-specific tools executed by the provider (e.g., Anthropic\'s code execution)\n5. **Skills** - Code execution and knowledge packages (see [Skills](/docs/protocol/skills))\n\nThis page covers external tools. For MCP-based tools from services like Figma, Sentry, or device capabilities like browser and filesystem, see [MCP Servers](/docs/protocol/mcp-servers). Built-in tools are enabled via agent config - see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).\n\n## External Tools\n\nExternal tools are defined in the `tools:` section and implemented in your backend.\n\n## Defining Tools\n\n```yaml\ntools:\n get-user-account:\n description: Looking up your account information\n display: description\n parameters:\n userId:\n type: string\n description: The user ID to look up\n```\n\n### Tool Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------ |\n| `description` | Yes | What the tool does (shown to LLM and optionally user) |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream` |\n| `parameters` | No | Input parameters the tool accepts |\n\n### Display Modes\n\nControls what the client sees about tool execution. The default is `description`.\n\n| Mode | Behavior |\n| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |\n| `hidden` | No UI events emitted. The tool executes silently and the user has no awareness it was called. Use for internal plumbing tools (title setting, context management). |\n| `name` | Shows the raw tool name while executing. Arguments and result are not displayed. |\n| `description` | Shows the tool\'s description while executing (default). Arguments are visible during live streaming but the result is not preserved after page refresh. |\n| `stream` | Full visibility. Arguments stream progressively as the LLM generates them, and the result is shown after execution. The result is preserved after page refresh. |\n\n**When to use `stream`:**\n\n- Client tools where the user benefits from seeing arguments or results\n- Interactive client tools (user provides input via the tool card)\n- Tools whose result is rendered via `renderToolCallResult`\n- Any tool where transparency into what was sent/received matters\n\n**When to use `hidden`:**\n\n- Internal lifecycle tools (e.g., session title setting)\n- Context-setting tools that would clutter the UI\n- Tools that are implementation details of the agent\'s protocol\n\n**Refresh and restore behavior:**\n\n`stream` is the only mode that preserves the tool result after a page refresh. For all other modes, the result is available during the live session but stripped on refresh. On session restore (when the session expires and is rebuilt from stored `UIMessage[]`), `stream` tools retain their original result while other modes receive a placeholder.\n\n## Parameters\n\nTool calls are always objects where each parameter name maps to a value. The LLM generates: `{ param1: value1, param2: value2, ... }`\n\n### Parameter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | -------------------------------------------------------------------------------- |\n| `type` | Yes | Data type: `string`, `number`, `integer`, `boolean`, `unknown`, or a custom type |\n| `description` | No | Describes what this parameter is for |\n| `optional` | No | If true, parameter is not required (default: false) |\n\n> **Tip**: You can use [custom types](/docs/protocol/types) for complex parameters like `type: ProductFilter` or `type: SearchOptions`.\n\n### Array Parameters\n\nFor array parameters, define a [top-level array type](/docs/protocol/types#top-level-array-types) and use it:\n\n```yaml\ntypes:\n CartItem:\n productId:\n type: string\n quantity:\n type: integer\n\n CartItemList:\n type: array\n items:\n type: CartItem\n\ntools:\n add-to-cart:\n description: Add items to cart\n parameters:\n items:\n type: CartItemList\n description: Items to add\n```\n\nThe tool receives: `{ items: [{ productId: "...", quantity: 1 }, ...] }`\n\n### Optional Parameters\n\nParameters are **required by default**. Use `optional: true` to make a parameter optional:\n\n```yaml\ntools:\n search-products:\n description: Search the product catalog\n parameters:\n query:\n type: string\n description: Search query\n\n category:\n type: string\n description: Filter by category\n optional: true\n\n maxPrice:\n type: number\n description: Maximum price filter\n optional: true\n\n inStock:\n type: boolean\n description: Only show in-stock items\n optional: true\n```\n\n## Making Tools Available\n\nTools defined in `tools:` are available. To make them usable by the LLM, add them to `agent.tools`:\n\n```yaml\ntools:\n get-user-account:\n description: Look up user account\n parameters:\n userId: { type: string }\n\n create-support-ticket:\n description: Create a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string } # low, medium, high, urgent\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools:\n - get-user-account\n - create-support-ticket # LLM can decide when to call these\n agentic: true\n```\n\n## Tool Invocation Modes\n\n### LLM-Decided (Agentic)\n\nThe LLM decides when to call tools based on the conversation:\n\n```yaml\nagent:\n tools: [get-user-account, create-support-ticket]\n agentic: true # Allow multiple tool calls\n maxSteps: 10 # Max tool call cycles\n```\n\n### Deterministic (Block-Based)\n\nForce tool calls at specific points in the handler:\n\n```yaml\nhandlers:\n request-human:\n # Always create a ticket when escalating\n Create support ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # From variable\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n## Tool Results\n\n### In Prompts\n\nTool results are stored in variables. Reference the variable in prompts:\n\n```markdown\n<!-- prompts/ticket-directive.md -->\n\nA support ticket has been created:\n{{TICKET}}\n\nLet the user know their ticket has been created.\n```\n\nWhen the `TICKET` variable contains an object, it\'s automatically serialized as JSON in the prompt:\n\n```\nA support ticket has been created:\n{\n "ticketId": "TKT-123ABC",\n "estimatedResponse": "24 hours"\n}\n\nLet the user know their ticket has been created.\n```\n\n> **Note**: Variables use `{{VARIABLE_NAME}}` syntax with `UPPERCASE_SNAKE_CASE`. Dot notation (like `{{TICKET.ticketId}}`) is not supported. Objects are automatically JSON-serialized.\n\n### In Variables\n\nStore tool results for later use:\n\n```yaml\nhandlers:\n request-human:\n Get account:\n block: tool-call\n tool: get-user-account\n input:\n userId: USER_ID\n output: ACCOUNT # Result stored here\n\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n```\n\n## Implementing Tools\n\nTools are implemented in your backend:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n \'get-user-account\': async (args) => {\n const userId = args.userId as string;\n const user = await db.users.findById(userId);\n\n return {\n name: user.name,\n email: user.email,\n plan: user.subscription.plan,\n createdAt: user.createdAt.toISOString(),\n };\n },\n\n \'create-support-ticket\': async (args) => {\n const ticket = await ticketService.create({\n summary: args.summary as string,\n priority: args.priority as string,\n });\n\n return {\n ticketId: ticket.id,\n estimatedResponse: getEstimatedTime(args.priority),\n };\n },\n },\n});\n```\n\n## Tool Best Practices\n\n### 1. Clear Descriptions\n\n```yaml\ntools:\n # Good - clear and specific\n get-user-account:\n description: >\n Retrieves the user\'s account information including name, email,\n subscription plan, and account creation date. Use this when the\n user asks about their account or you need to verify their identity.\n\n # Avoid - vague\n get-data:\n description: Gets some data\n```\n\n### 2. Document Constrained Values\n\n```yaml\ntools:\n create-support-ticket:\n parameters:\n priority:\n type: string\n description: Ticket priority level (low, medium, high, urgent)\n```\n',
591
+ content: '\n# Tools\n\nTools extend what agents can do. Octavus supports multiple types:\n\n1. **External Tools** - Defined in the protocol, implemented in your backend (this page)\n2. **MCP Tools** - Auto-discovered from MCP servers (see [MCP Servers](/docs/protocol/mcp-servers))\n3. **Built-in Tools** - Provider-agnostic tools managed by Octavus (web search, image generation)\n4. **Provider Tools** - Provider-specific tools executed by the provider (e.g., Anthropic\'s code execution)\n5. **Skills** - Code execution and knowledge packages (see [Skills](/docs/protocol/skills))\n\nThis page covers external tools. For MCP-based tools from services like Figma, Sentry, or device capabilities like browser and filesystem, see [MCP Servers](/docs/protocol/mcp-servers). Built-in tools are enabled via agent config - see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).\n\n## External Tools\n\nExternal tools are defined in the `tools:` section and implemented in your backend.\n\n## Defining Tools\n\n```yaml\ntools:\n get-user-account:\n description: Looking up your account information\n display: description\n parameters:\n userId:\n type: string\n description: The user ID to look up\n```\n\n### Tool Fields\n\n| Field | Required | Description |\n| ------------- | -------- | -------------------------------------------------------------------------- |\n| `description` | Yes | What the tool does (shown to LLM and optionally user) |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream`, `title` |\n| `title` | No | UI label shown when `display: title` (hides the description and arguments) |\n| `parameters` | No | Input parameters the tool accepts |\n\n### Display Modes\n\nControls what the client sees about tool execution. The default is `description`.\n\n| Mode | Behavior |\n| ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | No UI events emitted. The tool executes silently and the user has no awareness it was called. Use for internal plumbing tools (title setting, context management). |\n| `name` | Shows the raw tool name while executing. Arguments and result are not displayed. |\n| `description` | Shows the tool\'s description while executing (default). Arguments are visible during live streaming but the result is not preserved after page refresh. |\n| `stream` | Full visibility. Arguments stream progressively as the LLM generates them, and the result is shown after execution. The result is preserved after page refresh. |\n| `title` | Shows the tool\'s `title` plus the tool name only. The description, arguments, and result are hidden in the UI; the `description` is still sent to the LLM. Use for server-side tools that should appear as a labeled step without exposing their inputs/outputs. |\n\n**When to use `stream`:**\n\n- Client tools where the user benefits from seeing arguments or results\n- Interactive client tools (user provides input via the tool card)\n- Tools whose result is rendered via `renderToolCallResult`\n- Any tool where transparency into what was sent/received matters\n\n**When to use `hidden`:**\n\n- Internal lifecycle tools (e.g., session title setting)\n- Context-setting tools that would clutter the UI\n- Tools that are implementation details of the agent\'s protocol\n\n**When to use `title`:**\n\n- Server-side tools that should appear in the UI as a clean, labeled step (e.g. "Looking up your account") without exposing their arguments or result\n- Tools whose `description` is written for the LLM and shouldn\'t be shown verbatim to the user\n- This is the recommended mode for server-executed tools that should be visible: give them a human-readable `title` for the UI and keep `description` focused on instructing the model\n\n`title` is the preferred choice for server-side tools that should surface in the UI. `name` and `description` remain supported for backward compatibility, but for new server-side tools prefer `title` (a clean UI label) - or `stream` for client tools where the user benefits from seeing the arguments and result.\n\n**Refresh and restore behavior:**\n\n`stream` is the only mode that preserves the tool result after a page refresh. For all other modes, the result is available during the live session but stripped on refresh. On session restore (when the session expires and is rebuilt from stored `UIMessage[]`), `stream` tools retain their original result while other modes receive a placeholder.\n\n## Parameters\n\nTool calls are always objects where each parameter name maps to a value. The LLM generates: `{ param1: value1, param2: value2, ... }`\n\n### Parameter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | -------------------------------------------------------------------------------- |\n| `type` | Yes | Data type: `string`, `number`, `integer`, `boolean`, `unknown`, or a custom type |\n| `description` | No | Describes what this parameter is for |\n| `optional` | No | If true, parameter is not required (default: false) |\n\n> **Tip**: You can use [custom types](/docs/protocol/types) for complex parameters like `type: ProductFilter` or `type: SearchOptions`.\n\n### Array Parameters\n\nFor array parameters, define a [top-level array type](/docs/protocol/types#top-level-array-types) and use it:\n\n```yaml\ntypes:\n CartItem:\n productId:\n type: string\n quantity:\n type: integer\n\n CartItemList:\n type: array\n items:\n type: CartItem\n\ntools:\n add-to-cart:\n description: Add items to cart\n parameters:\n items:\n type: CartItemList\n description: Items to add\n```\n\nThe tool receives: `{ items: [{ productId: "...", quantity: 1 }, ...] }`\n\n### Optional Parameters\n\nParameters are **required by default**. Use `optional: true` to make a parameter optional:\n\n```yaml\ntools:\n search-products:\n description: Search the product catalog\n parameters:\n query:\n type: string\n description: Search query\n\n category:\n type: string\n description: Filter by category\n optional: true\n\n maxPrice:\n type: number\n description: Maximum price filter\n optional: true\n\n inStock:\n type: boolean\n description: Only show in-stock items\n optional: true\n```\n\n## Making Tools Available\n\nTools defined in `tools:` are available. To make them usable by the LLM, add them to `agent.tools`:\n\n```yaml\ntools:\n get-user-account:\n description: Look up user account\n parameters:\n userId: { type: string }\n\n create-support-ticket:\n description: Create a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string } # low, medium, high, urgent\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools:\n - get-user-account\n - create-support-ticket # LLM can decide when to call these\n agentic: true\n```\n\n## Tool Invocation Modes\n\n### LLM-Decided (Agentic)\n\nThe LLM decides when to call tools based on the conversation:\n\n```yaml\nagent:\n tools: [get-user-account, create-support-ticket]\n agentic: true # Allow multiple tool calls\n maxSteps: 10 # Max tool call cycles\n```\n\n### Deterministic (Block-Based)\n\nForce tool calls at specific points in the handler:\n\n```yaml\nhandlers:\n request-human:\n # Always create a ticket when escalating\n Create support ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # From variable\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n## Tool Results\n\n### In Prompts\n\nTool results are stored in variables. Reference the variable in prompts:\n\n```markdown\n<!-- prompts/ticket-directive.md -->\n\nA support ticket has been created:\n{{TICKET}}\n\nLet the user know their ticket has been created.\n```\n\nWhen the `TICKET` variable contains an object, it\'s automatically serialized as JSON in the prompt:\n\n```\nA support ticket has been created:\n{\n "ticketId": "TKT-123ABC",\n "estimatedResponse": "24 hours"\n}\n\nLet the user know their ticket has been created.\n```\n\n> **Note**: Variables use `{{VARIABLE_NAME}}` syntax with `UPPERCASE_SNAKE_CASE`. Dot notation (like `{{TICKET.ticketId}}`) is not supported. Objects are automatically JSON-serialized.\n\n### In Variables\n\nStore tool results for later use:\n\n```yaml\nhandlers:\n request-human:\n Get account:\n block: tool-call\n tool: get-user-account\n input:\n userId: USER_ID\n output: ACCOUNT # Result stored here\n\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n```\n\n## Implementing Tools\n\nTools are implemented in your backend:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n \'get-user-account\': async (args) => {\n const userId = args.userId as string;\n const user = await db.users.findById(userId);\n\n return {\n name: user.name,\n email: user.email,\n plan: user.subscription.plan,\n createdAt: user.createdAt.toISOString(),\n };\n },\n\n \'create-support-ticket\': async (args) => {\n const ticket = await ticketService.create({\n summary: args.summary as string,\n priority: args.priority as string,\n });\n\n return {\n ticketId: ticket.id,\n estimatedResponse: getEstimatedTime(args.priority),\n };\n },\n },\n});\n```\n\n## Tool Best Practices\n\n### 1. Clear Descriptions\n\n```yaml\ntools:\n # Good - clear and specific\n get-user-account:\n description: >\n Retrieves the user\'s account information including name, email,\n subscription plan, and account creation date. Use this when the\n user asks about their account or you need to verify their identity.\n\n # Avoid - vague\n get-data:\n description: Gets some data\n```\n\n### 2. Document Constrained Values\n\n```yaml\ntools:\n create-support-ticket:\n parameters:\n priority:\n type: string\n description: Ticket priority level (low, medium, high, urgent)\n```\n',
592
592
  excerpt: "Tools Tools extend what agents can do. Octavus supports multiple types: 1. External Tools - Defined in the protocol, implemented in your backend (this page) 2. MCP Tools - Auto-discovered from MCP...",
593
593
  order: 4
594
594
  },
@@ -597,7 +597,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
597
597
  section: "protocol",
598
598
  title: "Skills",
599
599
  description: "Using Octavus skills for code execution and specialized capabilities.",
600
- content: "\n# Skills\n\nSkills are knowledge packages that enable agents to execute code and generate files. Unlike external tools (which you implement in your backend), skills are self-contained packages with documentation and scripts. By default, skills run in isolated sandbox environments, but they can also run directly on the agent's computer.\n\n## Overview\n\nOctavus Skills provide **provider-agnostic** code execution. They work with any LLM provider (Anthropic, OpenAI, Google) by using explicit tool calls and system prompt injection.\n\n### How Skills Work\n\n1. **Skill Definition**: Skills are defined in the protocol's `skills:` section\n2. **Skill Resolution**: Skills are resolved from available sources (see below)\n3. **Execution**: Code runs in an isolated sandbox (default) or on the agent's computer\n4. **File Generation**: Files saved to `/output/` are automatically captured and made available for download (sandbox skills)\n\n### Skill Sources\n\nSkills come from two sources, visible in the Skills tab of your organization:\n\n| Source | Badge in UI | Visibility | Example |\n| ----------- | ----------- | ------------------------------ | ------------------ |\n| **Octavus** | `Octavus` | Available to all organizations | `qr-code` |\n| **Custom** | None | Private to your organization | `my-company-skill` |\n\nWhen you reference a skill in your protocol, Octavus resolves it from your available skills. If you create a custom skill with the same name as an Octavus skill, your custom skill takes precedence.\n\n## Defining Skills\n\nDefine skills in the protocol's `skills:` section:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data and generating reports\n```\n\n### Skill Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------------------------------- |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream` (default: `description`) |\n| `description` | No | Custom description shown to users (overrides skill's built-in description) |\n| `execution` | No | Where the skill runs: `sandbox` (default) or `device` |\n\n### Display Modes\n\nThe `display` setting on a skill applies to all tools under that skill namespace. See [Tool Display Modes](/docs/protocol/tools#display-modes) for full details on each mode.\n\n| Mode | Behavior |\n| ------------- | -------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Skill tools run silently, no UI events emitted |\n| `name` | Shows skill name while executing |\n| `description` | Shows description while executing (default). Result not preserved after page refresh. |\n| `stream` | Full visibility - arguments stream progressively, result shown after execution, result preserved after page refresh. |\n\n## Enabling Skills\n\nAfter defining skills in the `skills:` section, specify which skills are available. Skills work in both interactive agents and workers.\n\n### Interactive Agents\n\nReference skills in `agent.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [get-user-account]\n skills: [qr-code]\n agentic: true\n```\n\n### Workers and Named Threads\n\nReference skills per-thread in `start-thread.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n maxSteps: 10\n```\n\nThis also works for named threads in interactive agents, allowing different threads to have different skills.\n\n## Skill Tools\n\nWhen skills are enabled, the LLM has access to these tools:\n\n| Tool | Purpose | Availability |\n| --------------------- | ----------------------------------------------- | ------------------------------ |\n| `octavus_skill_read` | Read skill documentation (SKILL.md) | All skills |\n| `octavus_skill_list` | List available scripts in a skill | All skills |\n| `octavus_skill_run` | Execute a pre-built script from a skill | All skills |\n| `octavus_skill_setup` | Install a skill on the device for file browsing | Device skills only |\n| `octavus_code_run` | Execute arbitrary Python/Bash code | Sandbox skills (standard) only |\n| `octavus_file_write` | Create files in the sandbox | Sandbox skills (standard) only |\n| `octavus_file_read` | Read files from the sandbox | Sandbox skills (standard) only |\n\nThe LLM learns about available skills through system prompt injection and can use these tools to interact with skills.\n\nSkills that have [secrets](#skill-secrets) configured run in **secure mode**, where only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available. See [Skill Secrets](#skill-secrets) below.\n\n## Device Execution\n\nBy default, skills run in an isolated sandbox. When `execution: device` is set, the skill runs on the agent's computer instead.\n\n```yaml\nskills:\n deploy-tool:\n display: description\n description: Deploy applications to production\n execution: device\n qr-code:\n display: description\n description: Generating QR codes\n # execution defaults to sandbox\n```\n\n### How Device Skills Work\n\nDevice skills are installed on the agent's computer so the agent can browse their files and run their scripts directly. After attaching a skill via integrations, the agent uses `octavus_skill_setup` to install it on the device. Once installed, the agent can:\n\n- Read the skill's documentation with `octavus_skill_read`\n- List available scripts with `octavus_skill_list`\n- Run pre-built scripts with `octavus_skill_run`\n\nThe generic workspace tools (`octavus_code_run`, `octavus_file_write`, `octavus_file_read`) are **not available** for device skills. Instead, the agent uses the device's own shell and filesystem MCP servers to interact with files and run commands.\n\n### Sandbox vs Device Skills\n\n| Aspect | Sandbox (default) | Device |\n| ------------------- | ---------------------------------- | ------------------------------------------------------ |\n| **Environment** | Isolated sandbox | The agent's computer |\n| **Available tools** | All 6 skill tools | `skill_read`, `skill_list`, `skill_run`, `skill_setup` |\n| **File access** | Via `octavus_file_read/write` | Via device filesystem MCP |\n| **Code execution** | Via `octavus_code_run` | Via device shell MCP |\n| **Isolation** | Fully sandboxed | Runs alongside other device processes |\n| **File output** | `/output/` directory auto-captured | Files written to device filesystem |\n\n### When to Use Device Execution\n\nUse `execution: device` when the skill needs to:\n\n- Access the agent's local filesystem or running processes\n- Use tools or CLIs installed on the device\n- Interact with services running on the device\n- Persist files beyond a single execution cycle\n\n## Example: QR Code Generation\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n agentic: true\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Respond:\n block: next-message\n```\n\nWhen a user asks \"Create a QR code for octavus.ai\", the LLM will:\n\n1. Recognize the task matches the `qr-code` skill\n2. Call `octavus_skill_read` to learn how to use the skill\n3. Execute code (via `octavus_code_run` or `octavus_skill_run`) to generate the QR code\n4. Save the image to `/output/` in the sandbox\n5. The file is automatically captured and made available for download\n\n## File Output\n\nFiles saved to `/output/` in the sandbox are automatically:\n\n1. **Captured** after code execution\n2. **Uploaded** to S3 storage\n3. **Made available** via presigned URLs\n4. **Included** in the message as file parts\n\nFiles persist across page refreshes and are stored in the session's message history.\n\n## Skill Format\n\nSkills follow the [Agent Skills](https://agentskills.io) open standard:\n\n- `SKILL.md` - Required skill documentation with YAML frontmatter\n- `scripts/` - Optional executable code (Python/Bash)\n- `references/` - Optional documentation loaded as needed\n- `assets/` - Optional files used in outputs (templates, images)\n\n### SKILL.md Format\n\n````yaml\n---\nname: qr-code\ndescription: >\n Generate QR codes from text, URLs, or data. Use when the user needs to create\n a QR code for any purpose - sharing links, contact information, WiFi credentials,\n or any text data that should be scannable.\nversion: 1.0.0\nlicense: MIT\nauthor: Octavus Team\ncategory: Productivity\n---\n\n# QR Code Generator\n\n## Overview\n\nThis skill creates QR codes from text data using Python...\n\n## Quick Start\n\nGenerate a QR code with Python:\n\n```python\nimport qrcode\nimport os\n\noutput_dir = os.environ.get('OUTPUT_DIR', '/output')\n# ... code to generate QR code ...\n````\n\n## Scripts Reference\n\n### scripts/generate.py\n\nMain script for generating QR codes...\n\n````\n\n### Frontmatter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------ |\n| `name` | Yes | Skill slug (lowercase, hyphens) |\n| `description` | Yes | What the skill does (shown to the LLM) |\n| `version` | No | Semantic version string |\n| `license` | No | License identifier |\n| `author` | No | Skill author |\n| `category` | No | Display category used to group and filter skills in the UI |\n| `secrets` | No | Array of secret declarations (enables secure mode) |\n\n## Best Practices\n\n### 1. Clear Descriptions\n\nProvide clear, purpose-driven descriptions:\n\n```yaml\nskills:\n # Good - clear purpose\n qr-code:\n description: Generating QR codes for URLs, contact info, or any text data\n\n # Avoid - vague\n utility:\n description: Does stuff\n````\n\n### 2. When to Use Skills vs Tools\n\n| Use Skills When | Use Tools When |\n| ------------------------ | ---------------------------- |\n| Code execution needed | Simple API calls |\n| File generation | Database queries |\n| Complex calculations | External service integration |\n| Data processing | Authentication required |\n| Provider-agnostic needed | Backend-specific logic |\n\n### 3. Skill Selection\n\nDefine all skills available to this agent in the `skills:` section. Then specify which skills are available for the chat thread in `agent.skills`:\n\n```yaml\n# All skills available to this agent (defined once at protocol level)\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data\n pdf-processor:\n display: description\n description: Processing PDFs\n\n# Skills available for this chat thread\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code, data-analysis] # Skills available for this thread\n```\n\n### 4. Display Modes\n\nChoose appropriate display modes based on user experience:\n\n```yaml\nskills:\n # Background processing - hide from user\n data-analysis:\n display: hidden\n\n # User-facing generation - show description\n qr-code:\n display: description\n\n # Interactive progress - stream updates\n report-generation:\n display: stream\n```\n\n## Comparison: Skills vs Tools vs Provider Options\n\n| Feature | Octavus Skills | External Tools | Provider Tools/Skills |\n| ------------------ | --------------------------- | ------------------- | --------------------- |\n| **Execution** | Sandbox or agent's computer | Your backend | Provider servers |\n| **Provider** | Any (agnostic) | N/A | Provider-specific |\n| **Code Execution** | Yes | No | Yes (provider tools) |\n| **File Output** | Yes | No | Yes (provider skills) |\n| **Implementation** | Skill packages | Your code | Built-in |\n| **Cost** | Sandbox + LLM API | Your infrastructure | Included in API |\n\n## Uploading Custom Skills\n\nYou can upload custom skills to your organization using the CLI or the platform UI.\n\n### Via CLI (Recommended)\n\nUse [`octavus skills sync`](/docs/server-sdk/cli#octavus-skills-sync-path) to package and upload a skill directory. If the skill has a `.env` file, secrets are pushed alongside the bundle:\n\n```bash\noctavus skills sync ./skills/my-skill\n```\n\n### Skill Directory Structure\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md # Required: Skill documentation with frontmatter\n\u251C\u2500\u2500 scripts/ # Optional: Executable scripts\n\u2502 \u251C\u2500\u2500 run.py\n\u2502 \u2514\u2500\u2500 requirements.txt\n\u251C\u2500\u2500 references/ # Optional: Additional documentation\n\u251C\u2500\u2500 assets/ # Optional: Templates, images\n\u2514\u2500\u2500 .env # Optional: Secrets (not included in bundle)\n```\n\nOnce uploaded, reference the skill by slug in your protocol:\n\n```yaml\nskills:\n my-skill:\n display: description\n description: Custom analysis tool\n\nagent:\n skills: [my-skill]\n```\n\n## On-Demand Skills\n\nOn-demand skills (`onDemandSkills`) also support the `execution` field:\n\n```yaml\nonDemandSkills:\n display: description\n execution: device\n```\n\nWhen `execution: device` is set on the on-demand skills declaration, any skill attached at runtime via integrations runs on the agent's computer instead of in a sandbox.\n\n## Sandbox Timeout\n\nThe default sandbox timeout is 5 minutes (applies to sandbox skills only). You can configure a custom timeout using `sandboxTimeout` in the agent config or on individual `start-thread` blocks:\n\n```yaml\n# Agent-level timeout (applies to main thread)\nagent:\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 1800000 # 30 minutes (in milliseconds)\n```\n\n```yaml\n# Thread-level timeout (overrides agent-level for this thread)\nsteps:\n Start thread:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 3600000 # 1 hour\n```\n\nThread-level `sandboxTimeout` takes priority over agent-level. Maximum: 1 hour (3,600,000 ms).\n\n## Skill Secrets\n\nSkills can declare secrets they need to function. When an organization configures those secrets, the skill runs in **secure mode** with additional isolation.\n\n### Declaring Secrets\n\nAdd a `secrets` array to your SKILL.md frontmatter:\n\n```yaml\n---\nname: github\ndescription: >\n Run GitHub CLI (gh) commands to manage repos, issues, PRs, and more.\nsecrets:\n - name: GITHUB_TOKEN\n description: GitHub personal access token with repo access\n required: true\n - name: GITHUB_ORG\n description: Default GitHub organization\n required: false\n---\n```\n\nEach secret declaration has:\n\n| Field | Required | Description |\n| ------------- | -------- | ----------------------------------------------------------- |\n| `name` | Yes | Environment variable name (uppercase, e.g., `GITHUB_TOKEN`) |\n| `description` | No | Explains what this secret is for (shown in the UI) |\n| `required` | No | Whether the secret is required (defaults to `true`) |\n\nSecret names must match the pattern `^[A-Z_][A-Z0-9_]*$` (uppercase letters, digits, and underscores).\n\n### Configuring Secrets\n\nOrganization admins configure secret values through the skill editor in the platform UI. Each organization maintains its own independent set of secrets for each skill.\n\nSecrets are encrypted at rest and only decrypted at execution time.\n\n### Secure Mode\n\nWhen a skill has secrets configured for the organization, it automatically runs in **secure mode**:\n\n- The skill gets its own **isolated sandbox** (separate from other skills)\n- Secrets are injected as **environment variables** available to all scripts\n- Only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available - `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked\n- Scripts receive input as **JSON via stdin** (using the `input` parameter on `octavus_skill_run`) instead of CLI args\n- All output (stdout/stderr) is **automatically redacted** for secret values before being returned to the LLM\n\n### Writing Scripts for Secure Skills\n\nScripts in secure skills read input from stdin as JSON and access secrets from environment variables:\n\n```python\nimport json\nimport os\nimport sys\n\ninput_data = json.load(sys.stdin)\ntoken = os.environ.get('GITHUB_TOKEN')\n\n# Use the token and input_data to perform the task\n```\n\nFor standard skills (without secrets), scripts receive input as CLI arguments. For secure skills, always use stdin JSON.\n\n## Security\n\nSandbox skills run in isolated environments:\n\n- **No network access** (unless explicitly configured)\n- **No persistent storage** (sandbox destroyed after each `next-message` execution)\n- **File output only** via `/output/` directory\n- **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)\n- **Secret redaction** - output from secure skills is automatically scanned for secret values\n\nDevice skills run on the agent's computer and share its environment. They do not have sandbox isolation but benefit from restricted tool access (only slug-bearing tools are available).\n\n## Next Steps\n\n- [Agent Config](/docs/protocol/agent-config) - Configuring skills in agent settings\n- [Provider Options](/docs/protocol/provider-options) - Anthropic's built-in skills\n- [Skills Advanced Guide](/docs/protocol/skills-advanced) - Best practices and advanced patterns\n",
600
+ content: "\n# Skills\n\nSkills are knowledge packages that enable agents to execute code and generate files. Unlike external tools (which you implement in your backend), skills are self-contained packages with documentation and scripts. By default, skills run in isolated sandbox environments, but they can also run directly on the agent's computer.\n\n## Overview\n\nOctavus Skills provide **provider-agnostic** code execution. They work with any LLM provider (Anthropic, OpenAI, Google) by using explicit tool calls and system prompt injection.\n\n### How Skills Work\n\n1. **Skill Definition**: Skills are defined in the protocol's `skills:` section\n2. **Skill Resolution**: Skills are resolved from available sources (see below)\n3. **Execution**: Code runs in an isolated sandbox (default) or on the agent's computer\n4. **File Generation**: Files saved to `/output/` are automatically captured and made available for download (sandbox skills)\n\n### Skill Sources\n\nSkills come from two sources, visible in the Skills tab of your organization:\n\n| Source | Badge in UI | Visibility | Example |\n| ----------- | ----------- | ------------------------------ | ------------------ |\n| **Octavus** | `Octavus` | Available to all organizations | `qr-code` |\n| **Custom** | None | Private to your organization | `my-company-skill` |\n\nWhen you reference a skill in your protocol, Octavus resolves it from your available skills. If you create a custom skill with the same name as an Octavus skill, your custom skill takes precedence.\n\n## Defining Skills\n\nDefine skills in the protocol's `skills:` section:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data and generating reports\n```\n\n### Skill Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------------------------------------- |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream`, `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title` (hides the description and arguments) |\n| `description` | No | Custom description shown to users (overrides skill's built-in description) |\n| `execution` | No | Where the skill runs: `sandbox` (default) or `device` |\n\n### Display Modes\n\nThe `display` setting on a skill applies to all tools under that skill namespace. See [Tool Display Modes](/docs/protocol/tools#display-modes) for full details on each mode.\n\n| Mode | Behavior |\n| ------------- | -------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Skill tools run silently, no UI events emitted |\n| `name` | Shows skill name while executing |\n| `description` | Shows description while executing (default). Result not preserved after page refresh. |\n| `stream` | Full visibility - arguments stream progressively, result shown after execution, result preserved after page refresh. |\n\n## Enabling Skills\n\nAfter defining skills in the `skills:` section, specify which skills are available. Skills work in both interactive agents and workers.\n\n### Interactive Agents\n\nReference skills in `agent.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [get-user-account]\n skills: [qr-code]\n agentic: true\n```\n\n### Workers and Named Threads\n\nReference skills per-thread in `start-thread.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n maxSteps: 10\n```\n\nThis also works for named threads in interactive agents, allowing different threads to have different skills.\n\n## Skill Tools\n\nWhen skills are enabled, the LLM has access to these tools:\n\n| Tool | Purpose | Availability |\n| --------------------- | ----------------------------------------------- | ------------------------------ |\n| `octavus_skill_read` | Read skill documentation (SKILL.md) | All skills |\n| `octavus_skill_list` | List available scripts in a skill | All skills |\n| `octavus_skill_run` | Execute a pre-built script from a skill | All skills |\n| `octavus_skill_setup` | Install a skill on the device for file browsing | Device skills only |\n| `octavus_code_run` | Execute arbitrary Python/Bash code | Sandbox skills (standard) only |\n| `octavus_file_write` | Create files in the sandbox | Sandbox skills (standard) only |\n| `octavus_file_read` | Read files from the sandbox | Sandbox skills (standard) only |\n\nThe LLM learns about available skills through system prompt injection and can use these tools to interact with skills.\n\nSkills that have [secrets](#skill-secrets) configured run in **secure mode**, where only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available. See [Skill Secrets](#skill-secrets) below.\n\n## Device Execution\n\nBy default, skills run in an isolated sandbox. When `execution: device` is set, the skill runs on the agent's computer instead.\n\n```yaml\nskills:\n deploy-tool:\n display: description\n description: Deploy applications to production\n execution: device\n qr-code:\n display: description\n description: Generating QR codes\n # execution defaults to sandbox\n```\n\n### How Device Skills Work\n\nDevice skills are installed on the agent's computer so the agent can browse their files and run their scripts directly. After attaching a skill via integrations, the agent uses `octavus_skill_setup` to install it on the device. Once installed, the agent can:\n\n- Read the skill's documentation with `octavus_skill_read`\n- List available scripts with `octavus_skill_list`\n- Run pre-built scripts with `octavus_skill_run`\n\nThe generic workspace tools (`octavus_code_run`, `octavus_file_write`, `octavus_file_read`) are **not available** for device skills. Instead, the agent uses the device's own shell and filesystem MCP servers to interact with files and run commands.\n\n### Sandbox vs Device Skills\n\n| Aspect | Sandbox (default) | Device |\n| ------------------- | ---------------------------------- | ------------------------------------------------------ |\n| **Environment** | Isolated sandbox | The agent's computer |\n| **Available tools** | All 6 skill tools | `skill_read`, `skill_list`, `skill_run`, `skill_setup` |\n| **File access** | Via `octavus_file_read/write` | Via device filesystem MCP |\n| **Code execution** | Via `octavus_code_run` | Via device shell MCP |\n| **Isolation** | Fully sandboxed | Runs alongside other device processes |\n| **File output** | `/output/` directory auto-captured | Files written to device filesystem |\n\n### When to Use Device Execution\n\nUse `execution: device` when the skill needs to:\n\n- Access the agent's local filesystem or running processes\n- Use tools or CLIs installed on the device\n- Interact with services running on the device\n- Persist files beyond a single execution cycle\n\n## Example: QR Code Generation\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n agentic: true\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Respond:\n block: next-message\n```\n\nWhen a user asks \"Create a QR code for octavus.ai\", the LLM will:\n\n1. Recognize the task matches the `qr-code` skill\n2. Call `octavus_skill_read` to learn how to use the skill\n3. Execute code (via `octavus_code_run` or `octavus_skill_run`) to generate the QR code\n4. Save the image to `/output/` in the sandbox\n5. The file is automatically captured and made available for download\n\n## File Output\n\nFiles saved to `/output/` in the sandbox are automatically:\n\n1. **Captured** after code execution\n2. **Uploaded** to S3 storage\n3. **Made available** via presigned URLs\n4. **Included** in the message as file parts\n\nFiles persist across page refreshes and are stored in the session's message history.\n\n## Skill Format\n\nSkills follow the [Agent Skills](https://agentskills.io) open standard:\n\n- `SKILL.md` - Required skill documentation with YAML frontmatter\n- `scripts/` - Optional executable code (Python/Bash)\n- `references/` - Optional documentation loaded as needed\n- `assets/` - Optional files used in outputs (templates, images)\n\n### SKILL.md Format\n\n````yaml\n---\nname: qr-code\ndescription: >\n Generate QR codes from text, URLs, or data. Use when the user needs to create\n a QR code for any purpose - sharing links, contact information, WiFi credentials,\n or any text data that should be scannable.\nversion: 1.0.0\nlicense: MIT\nauthor: Octavus Team\ncategory: Productivity\n---\n\n# QR Code Generator\n\n## Overview\n\nThis skill creates QR codes from text data using Python...\n\n## Quick Start\n\nGenerate a QR code with Python:\n\n```python\nimport qrcode\nimport os\n\noutput_dir = os.environ.get('OUTPUT_DIR', '/output')\n# ... code to generate QR code ...\n````\n\n## Scripts Reference\n\n### scripts/generate.py\n\nMain script for generating QR codes...\n\n````\n\n### Frontmatter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------ |\n| `name` | Yes | Skill slug (lowercase, hyphens) |\n| `description` | Yes | What the skill does (shown to the LLM) |\n| `version` | No | Semantic version string |\n| `license` | No | License identifier |\n| `author` | No | Skill author |\n| `category` | No | Display category used to group and filter skills in the UI |\n| `secrets` | No | Array of secret declarations (enables secure mode) |\n\n## Best Practices\n\n### 1. Clear Descriptions\n\nProvide clear, purpose-driven descriptions:\n\n```yaml\nskills:\n # Good - clear purpose\n qr-code:\n description: Generating QR codes for URLs, contact info, or any text data\n\n # Avoid - vague\n utility:\n description: Does stuff\n````\n\n### 2. When to Use Skills vs Tools\n\n| Use Skills When | Use Tools When |\n| ------------------------ | ---------------------------- |\n| Code execution needed | Simple API calls |\n| File generation | Database queries |\n| Complex calculations | External service integration |\n| Data processing | Authentication required |\n| Provider-agnostic needed | Backend-specific logic |\n\n### 3. Skill Selection\n\nDefine all skills available to this agent in the `skills:` section. Then specify which skills are available for the chat thread in `agent.skills`:\n\n```yaml\n# All skills available to this agent (defined once at protocol level)\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data\n pdf-processor:\n display: description\n description: Processing PDFs\n\n# Skills available for this chat thread\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code, data-analysis] # Skills available for this thread\n```\n\n### 4. Display Modes\n\nChoose appropriate display modes based on user experience:\n\n```yaml\nskills:\n # Background processing - hide from user\n data-analysis:\n display: hidden\n\n # User-facing generation - show description\n qr-code:\n display: description\n\n # Interactive progress - stream updates\n report-generation:\n display: stream\n```\n\n## Comparison: Skills vs Tools vs Provider Options\n\n| Feature | Octavus Skills | External Tools | Provider Tools/Skills |\n| ------------------ | --------------------------- | ------------------- | --------------------- |\n| **Execution** | Sandbox or agent's computer | Your backend | Provider servers |\n| **Provider** | Any (agnostic) | N/A | Provider-specific |\n| **Code Execution** | Yes | No | Yes (provider tools) |\n| **File Output** | Yes | No | Yes (provider skills) |\n| **Implementation** | Skill packages | Your code | Built-in |\n| **Cost** | Sandbox + LLM API | Your infrastructure | Included in API |\n\n## Uploading Custom Skills\n\nYou can upload custom skills to your organization using the CLI or the platform UI.\n\n### Via CLI (Recommended)\n\nUse [`octavus skills sync`](/docs/server-sdk/cli#octavus-skills-sync-path) to package and upload a skill directory. If the skill has a `.env` file, secrets are pushed alongside the bundle:\n\n```bash\noctavus skills sync ./skills/my-skill\n```\n\n### Skill Directory Structure\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md # Required: Skill documentation with frontmatter\n\u251C\u2500\u2500 scripts/ # Optional: Executable scripts\n\u2502 \u251C\u2500\u2500 run.py\n\u2502 \u2514\u2500\u2500 requirements.txt\n\u251C\u2500\u2500 references/ # Optional: Additional documentation\n\u251C\u2500\u2500 assets/ # Optional: Templates, images\n\u2514\u2500\u2500 .env # Optional: Secrets (not included in bundle)\n```\n\nOnce uploaded, reference the skill by slug in your protocol:\n\n```yaml\nskills:\n my-skill:\n display: description\n description: Custom analysis tool\n\nagent:\n skills: [my-skill]\n```\n\n## On-Demand Skills\n\nOn-demand skills (`onDemandSkills`) also support the `execution` field:\n\n```yaml\nonDemandSkills:\n display: description\n execution: device\n```\n\nWhen `execution: device` is set on the on-demand skills declaration, any skill attached at runtime via integrations runs on the agent's computer instead of in a sandbox.\n\n## Sandbox Timeout\n\nThe default sandbox timeout is 5 minutes (applies to sandbox skills only). You can configure a custom timeout using `sandboxTimeout` in the agent config or on individual `start-thread` blocks:\n\n```yaml\n# Agent-level timeout (applies to main thread)\nagent:\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 1800000 # 30 minutes (in milliseconds)\n```\n\n```yaml\n# Thread-level timeout (overrides agent-level for this thread)\nsteps:\n Start thread:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 3600000 # 1 hour\n```\n\nThread-level `sandboxTimeout` takes priority over agent-level. Maximum: 1 hour (3,600,000 ms).\n\n## Skill Secrets\n\nSkills can declare secrets they need to function. When an organization configures those secrets, the skill runs in **secure mode** with additional isolation.\n\n### Declaring Secrets\n\nAdd a `secrets` array to your SKILL.md frontmatter:\n\n```yaml\n---\nname: github\ndescription: >\n Run GitHub CLI (gh) commands to manage repos, issues, PRs, and more.\nsecrets:\n - name: GITHUB_TOKEN\n description: GitHub personal access token with repo access\n required: true\n - name: GITHUB_ORG\n description: Default GitHub organization\n required: false\n---\n```\n\nEach secret declaration has:\n\n| Field | Required | Description |\n| ------------- | -------- | ----------------------------------------------------------- |\n| `name` | Yes | Environment variable name (uppercase, e.g., `GITHUB_TOKEN`) |\n| `description` | No | Explains what this secret is for (shown in the UI) |\n| `required` | No | Whether the secret is required (defaults to `true`) |\n\nSecret names must match the pattern `^[A-Z_][A-Z0-9_]*$` (uppercase letters, digits, and underscores).\n\n### Configuring Secrets\n\nOrganization admins configure secret values through the skill editor in the platform UI. Each organization maintains its own independent set of secrets for each skill.\n\nSecrets are encrypted at rest and only decrypted at execution time.\n\n### Secure Mode\n\nWhen a skill has secrets configured for the organization, it automatically runs in **secure mode**:\n\n- The skill gets its own **isolated sandbox** (separate from other skills)\n- Secrets are injected as **environment variables** available to all scripts\n- Only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available - `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked\n- Scripts receive input as **JSON via stdin** (using the `input` parameter on `octavus_skill_run`) instead of CLI args\n- All output (stdout/stderr) is **automatically redacted** for secret values before being returned to the LLM\n\n### Writing Scripts for Secure Skills\n\nScripts in secure skills read input from stdin as JSON and access secrets from environment variables:\n\n```python\nimport json\nimport os\nimport sys\n\ninput_data = json.load(sys.stdin)\ntoken = os.environ.get('GITHUB_TOKEN')\n\n# Use the token and input_data to perform the task\n```\n\nFor standard skills (without secrets), scripts receive input as CLI arguments. For secure skills, always use stdin JSON.\n\n## Security\n\nSandbox skills run in isolated environments:\n\n- **No network access** (unless explicitly configured)\n- **No persistent storage** (sandbox destroyed after each `next-message` execution)\n- **File output only** via `/output/` directory\n- **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)\n- **Secret redaction** - output from secure skills is automatically scanned for secret values\n\nDevice skills run on the agent's computer and share its environment. They do not have sandbox isolation but benefit from restricted tool access (only slug-bearing tools are available).\n\n## Next Steps\n\n- [Agent Config](/docs/protocol/agent-config) - Configuring skills in agent settings\n- [Provider Options](/docs/protocol/provider-options) - Anthropic's built-in skills\n- [Skills Advanced Guide](/docs/protocol/skills-advanced) - Best practices and advanced patterns\n",
601
601
  excerpt: "Skills Skills are knowledge packages that enable agents to execute code and generate files. Unlike external tools (which you implement in your backend), skills are self-contained packages with...",
602
602
  order: 5
603
603
  },
@@ -606,7 +606,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
606
606
  section: "protocol",
607
607
  title: "Handlers",
608
608
  description: "Defining execution handlers with blocks.",
609
- content: "\n# Handlers\n\nHandlers define what happens when a trigger fires. They contain execution blocks that run in sequence.\n\n## Handler Structure\n\n```yaml\nhandlers:\n trigger-name:\n Block Name:\n block: block-kind\n # block-specific properties\n\n Another Block:\n block: another-kind\n # ...\n```\n\nEach block has a human-readable name (shown in debug UI) and a `block` field that determines its behavior.\n\n## Block Kinds\n\n### next-message\n\nGenerate a response from the LLM:\n\n```yaml\nhandlers:\n user-message:\n Respond to user:\n block: next-message\n # Uses main conversation thread by default\n # Display defaults to 'stream'\n```\n\nWith options:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary # Use named thread\n display: stream # Show streaming content\n independent: true # Don't add to main chat\n output: SUMMARY # Store output in variable\n description: Generating summary # Shown in UI\n```\n\nFor structured output (typed JSON response):\n\n```yaml\nRespond with suggestions:\n block: next-message\n responseType: ChatResponse # Type defined in types section\n output: RESPONSE # Stores the parsed object\n```\n\nWhen `responseType` is specified:\n\n- The LLM generates JSON matching the type schema\n- The `output` variable receives the parsed object (not plain text)\n- The client receives a `UIObjectPart` for custom rendering\n\nSee [Types](/docs/protocol/types#structured-output) for more details.\n\n### add-message\n\nAdd a message to the conversation:\n\n```yaml\nAdd user message:\n block: add-message\n role: user # user | assistant | system\n prompt: user-message # Reference to prompt file\n input: [USER_MESSAGE] # Variables to interpolate\n display: hidden # Don't show in UI\n```\n\nFor internal directives (LLM sees it, user doesn't):\n\n```yaml\nAdd internal directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS]\n visible: false # LLM sees this, user doesn't\n```\n\nFor structured user input (object shown in UI, prompt for LLM context):\n\n```yaml\nAdd user message:\n block: add-message\n role: user\n prompt: user-message # Rendered for LLM context (hidden from UI)\n input: [USER_INPUT]\n uiContent: USER_INPUT # Variable shown in UI (object \u2192 object part)\n display: hidden\n```\n\nWhen `uiContent` is set:\n\n- The variable value is shown in the UI (string \u2192 text part, object \u2192 object part)\n- The prompt text is hidden from the UI but kept for LLM context\n- Useful for rich UI interactions where the visual differs from the LLM context\n\n### tool-call\n\nCall a tool deterministically:\n\n```yaml\nCreate ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # Variable reference\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n### set-resource\n\nUpdate a persistent resource:\n\n```yaml\nSave summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY # Variable to save\n display: name # Show block name\n```\n\n### start-thread\n\nCreate a named conversation thread:\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary # Thread name\n model: anthropic/claude-sonnet-4-5 # Optional: different model\n backupModel: openai/gpt-4o # Failover on provider errors\n thinking: low # Extended reasoning level\n cache: auto # auto (default) | extended | off\n maxSteps: 1 # Tool call limit\n system: escalation-summary # System prompt\n input: [COMPANY_NAME] # Variables for prompt\n mcpServers: [figma, browser] # MCP servers for this thread\n skills: [qr-code] # Octavus skills for this thread\n sandboxTimeout: 600000 # Skill sandbox timeout (default: 5 min, max: 1 hour)\n imageModel: google/gemini-2.5-flash-image # Image generation model\n```\n\nThe `cache` field controls prompt caching for this thread and defaults to `auto` when omitted. Threads do not inherit the agent's `cache` value - see [Prompt Caching](/docs/protocol/agent-config#prompt-caching).\n\nThe `model` field can also reference a variable for dynamic model selection. The `backupModel`, `temperature`, `thinking`, and `maxSteps` fields also support variable references - see [Dynamic Configuration](/docs/protocol/agent-config#dynamic-configuration).\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary\n model: SUMMARY_MODEL # Resolved from input variable\n system: escalation-summary\n```\n\n### serialize-thread\n\nConvert conversation to text:\n\n```yaml\nSerialize conversation:\n block: serialize-thread\n thread: main # Which thread (default: main)\n format: markdown # markdown | json\n output: CONVERSATION_TEXT # Variable to store result\n```\n\n### generate-image\n\nGenerate an image from a prompt variable:\n\n```yaml\nGenerate image:\n block: generate-image\n prompt: OPTIMIZED_PROMPT # Variable containing the prompt\n imageModel: google/gemini-2.5-flash-image # Required image model\n size: 1024x1024 # 1024x1024 | 1792x1024 | 1024x1792\n output: GENERATED_IMAGE # Store URL in variable\n description: Generating your image... # Shown in UI\n```\n\nEdit an existing image using reference images:\n\n```yaml\nEdit image:\n block: generate-image\n prompt: EDIT_INSTRUCTIONS # e.g., \"Remove the background\"\n referenceImages: [SOURCE_IMAGE_URL] # Variable(s) containing image URLs\n imageModel: google/gemini-2.5-flash-image\n output: EDITED_IMAGE\n description: Editing image...\n```\n\n| Field | Required | Description |\n| ----------------- | -------- | --------------------------------------------------------------- |\n| `prompt` | Yes | Variable name containing the image prompt or edit instructions |\n| `imageModel` | Yes | Image model identifier (e.g., `google/gemini-2.5-flash-image`) |\n| `size` | No | Image dimensions: `1024x1024`, `1792x1024`, or `1024x1792` |\n| `referenceImages` | No | Variable names containing image URLs for editing/transformation |\n| `output` | No | Variable name to store the generated image URL |\n| `thread` | No | Thread to associate the output file with |\n| `description` | No | Description shown in the UI during generation |\n\nThis block is for deterministic image generation pipelines where the prompt is constructed programmatically (e.g., via prompt engineering in a separate thread). When `referenceImages` are provided, the prompt describes how to modify those images.\n\nFor agentic image generation where the LLM decides when to generate, configure `imageModel` in the [agent config](/docs/protocol/agent-config#image-generation).\n\n## Display Modes\n\nEvery block has a `display` property:\n\n| Mode | Default For | Behavior |\n| ------------- | ------------------------- | ----------------- |\n| `hidden` | add-message | Not shown to user |\n| `name` | set-resource | Shows block name |\n| `description` | tool-call, generate-image | Shows description |\n| `stream` | next-message | Streams content |\n\n## Complete Example\n\n```yaml\nhandlers:\n user-message:\n # Add the user's message to conversation\n Add user message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n # Generate response (LLM may call tools)\n Respond to user:\n block: next-message\n # display: stream (default)\n\n request-human:\n # Step 1: Serialize conversation for summary\n Serialize conversation:\n block: serialize-thread\n format: markdown\n output: CONVERSATION_TEXT\n\n # Step 2: Create separate thread for summarization\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n thinking: low\n system: escalation-summary\n input: [COMPANY_NAME]\n\n # Step 3: Add request to summary thread\n Add summarize request:\n block: add-message\n thread: summary\n role: user\n prompt: summarize-request\n input:\n - CONVERSATION: CONVERSATION_TEXT\n\n # Step 4: Generate summary\n Generate summary:\n block: next-message\n thread: summary\n display: stream\n description: Summarizing your conversation\n independent: true\n output: SUMMARY\n\n # Step 5: Save to resource\n Save summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY\n\n # Step 6: Create support ticket\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n\n # Step 7: Add directive for response\n Add directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS: TICKET]\n visible: false\n\n # Step 8: Respond to user\n Respond:\n block: next-message\n```\n\n## Block Input Mapping\n\nThe `input` field on blocks controls which variables are passed to the prompt. Only variables listed in `input` are available for interpolation.\n\nVariables can come from `protocol.input`, `protocol.resources`, `protocol.variables`, `trigger.input`, or outputs from prior blocks.\n\n```yaml\n# Array format (same name)\ninput: [USER_MESSAGE, COMPANY_NAME]\n\n# Array format (rename)\ninput:\n - CONVERSATION: CONVERSATION_TEXT # Prompt sees CONVERSATION, value comes from CONVERSATION_TEXT\n - TICKET_DETAILS: TICKET\n\n# Object format (rename)\ninput:\n CONVERSATION: CONVERSATION_TEXT\n TICKET_DETAILS: TICKET\n```\n\n## Independent Blocks\n\nUse `independent: true` for content that shouldn't go to the main chat:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary\n independent: true # Output stored in variable, not main chat\n output: SUMMARY\n```\n\nThis is useful for:\n\n- Background processing\n- Summarization in separate threads\n- Generating content for tools\n",
609
+ content: "\n# Handlers\n\nHandlers define what happens when a trigger fires. They contain execution blocks that run in sequence.\n\n## Handler Structure\n\n```yaml\nhandlers:\n trigger-name:\n Block Name:\n block: block-kind\n # block-specific properties\n\n Another Block:\n block: another-kind\n # ...\n```\n\nEach block has a human-readable name (shown in debug UI) and a `block` field that determines its behavior.\n\n## Block Kinds\n\n### next-message\n\nGenerate a response from the LLM:\n\n```yaml\nhandlers:\n user-message:\n Respond to user:\n block: next-message\n # Uses main conversation thread by default\n # Display defaults to 'stream'\n```\n\nWith options:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary # Use named thread\n display: stream # Show streaming content\n independent: true # Don't add to main chat\n output: SUMMARY # Store output in variable\n description: Generating summary # Shown in UI\n```\n\nFor structured output (typed JSON response):\n\n```yaml\nRespond with suggestions:\n block: next-message\n responseType: ChatResponse # Type defined in types section\n output: RESPONSE # Stores the parsed object\n```\n\nWhen `responseType` is specified:\n\n- The LLM generates JSON matching the type schema\n- The `output` variable receives the parsed object (not plain text)\n- The client receives a `UIObjectPart` for custom rendering\n\nSee [Types](/docs/protocol/types#structured-output) for more details.\n\n### add-message\n\nAdd a message to the conversation:\n\n```yaml\nAdd user message:\n block: add-message\n role: user # user | assistant | system\n prompt: user-message # Reference to prompt file\n input: [USER_MESSAGE] # Variables to interpolate\n display: hidden # Don't show in UI\n```\n\nFor internal directives (LLM sees it, user doesn't):\n\n```yaml\nAdd internal directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS]\n visible: false # LLM sees this, user doesn't\n```\n\nFor structured user input (object shown in UI, prompt for LLM context):\n\n```yaml\nAdd user message:\n block: add-message\n role: user\n prompt: user-message # Rendered for LLM context (hidden from UI)\n input: [USER_INPUT]\n uiContent: USER_INPUT # Variable shown in UI (object \u2192 object part)\n display: hidden\n```\n\nWhen `uiContent` is set:\n\n- The variable value is shown in the UI (string \u2192 text part, object \u2192 object part)\n- The prompt text is hidden from the UI but kept for LLM context\n- Useful for rich UI interactions where the visual differs from the LLM context\n\n### tool-call\n\nCall a tool deterministically:\n\n```yaml\nCreate ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # Variable reference\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n### set-resource\n\nUpdate a persistent resource:\n\n```yaml\nSave summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY # Variable to save\n display: name # Show block name\n```\n\n### start-thread\n\nCreate a named conversation thread:\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary # Thread name\n model: anthropic/claude-sonnet-4-5 # Optional: different model\n backupModel: openai/gpt-4o # Failover on provider errors\n thinking: low # Extended reasoning level\n cache: auto # auto (default) | extended | off\n maxSteps: 1 # Tool call limit\n system: escalation-summary # System prompt\n input: [COMPANY_NAME] # Variables for prompt\n mcpServers: [figma, browser] # MCP servers for this thread\n skills: [qr-code] # Octavus skills for this thread\n sandboxTimeout: 600000 # Skill sandbox timeout (default: 5 min, max: 1 hour)\n imageModel: google/gemini-2.5-flash-image # Image generation model\n```\n\nThe `cache` field controls prompt caching for this thread and defaults to `auto` when omitted. Threads do not inherit the agent's `cache` value - see [Prompt Caching](/docs/protocol/agent-config#prompt-caching).\n\nThe `model` field can also reference a variable for dynamic model selection. The `backupModel`, `temperature`, `thinking`, and `maxSteps` fields also support variable references - see [Dynamic Configuration](/docs/protocol/agent-config#dynamic-configuration).\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary\n model: SUMMARY_MODEL # Resolved from input variable\n system: escalation-summary\n```\n\n### serialize-thread\n\nConvert conversation to text:\n\n```yaml\nSerialize conversation:\n block: serialize-thread\n thread: main # Which thread (default: main)\n format: markdown # markdown | json\n output: CONVERSATION_TEXT # Variable to store result\n```\n\n### generate-image\n\nGenerate an image from a prompt variable:\n\n```yaml\nGenerate image:\n block: generate-image\n prompt: OPTIMIZED_PROMPT # Variable containing the prompt\n imageModel: google/gemini-2.5-flash-image # Required image model\n size: 1024x1024 # 1024x1024 | 1792x1024 | 1024x1792\n output: GENERATED_IMAGE # Store URL in variable\n description: Generating your image... # Shown in UI\n```\n\nEdit an existing image using reference images:\n\n```yaml\nEdit image:\n block: generate-image\n prompt: EDIT_INSTRUCTIONS # e.g., \"Remove the background\"\n referenceImages: [SOURCE_IMAGE_URL] # Variable(s) containing image URLs\n imageModel: google/gemini-2.5-flash-image\n output: EDITED_IMAGE\n description: Editing image...\n```\n\n| Field | Required | Description |\n| ----------------- | -------- | --------------------------------------------------------------- |\n| `prompt` | Yes | Variable name containing the image prompt or edit instructions |\n| `imageModel` | Yes | Image model identifier (e.g., `google/gemini-2.5-flash-image`) |\n| `size` | No | Image dimensions: `1024x1024`, `1792x1024`, or `1024x1792` |\n| `referenceImages` | No | Variable names containing image URLs for editing/transformation |\n| `output` | No | Variable name to store the generated image URL |\n| `thread` | No | Thread to associate the output file with |\n| `description` | No | Description shown in the UI during generation |\n\nThis block is for deterministic image generation pipelines where the prompt is constructed programmatically (e.g., via prompt engineering in a separate thread). When `referenceImages` are provided, the prompt describes how to modify those images.\n\nFor agentic image generation where the LLM decides when to generate, configure `imageModel` in the [agent config](/docs/protocol/agent-config#image-generation).\n\n## Display Modes\n\nEvery block has a `display` property:\n\n| Mode | Default For | Behavior |\n| ------------- | ------------------------- | ------------------------------- |\n| `hidden` | add-message | Not shown to user |\n| `name` | set-resource | Shows block name |\n| `description` | tool-call, generate-image | Shows description |\n| `stream` | next-message | Streams content |\n| `title` | - | Shows the block's `title` field |\n\n## Complete Example\n\n```yaml\nhandlers:\n user-message:\n # Add the user's message to conversation\n Add user message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n # Generate response (LLM may call tools)\n Respond to user:\n block: next-message\n # display: stream (default)\n\n request-human:\n # Step 1: Serialize conversation for summary\n Serialize conversation:\n block: serialize-thread\n format: markdown\n output: CONVERSATION_TEXT\n\n # Step 2: Create separate thread for summarization\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n thinking: low\n system: escalation-summary\n input: [COMPANY_NAME]\n\n # Step 3: Add request to summary thread\n Add summarize request:\n block: add-message\n thread: summary\n role: user\n prompt: summarize-request\n input:\n - CONVERSATION: CONVERSATION_TEXT\n\n # Step 4: Generate summary\n Generate summary:\n block: next-message\n thread: summary\n display: stream\n description: Summarizing your conversation\n independent: true\n output: SUMMARY\n\n # Step 5: Save to resource\n Save summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY\n\n # Step 6: Create support ticket\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n\n # Step 7: Add directive for response\n Add directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS: TICKET]\n visible: false\n\n # Step 8: Respond to user\n Respond:\n block: next-message\n```\n\n## Block Input Mapping\n\nThe `input` field on blocks controls which variables are passed to the prompt. Only variables listed in `input` are available for interpolation.\n\nVariables can come from `protocol.input`, `protocol.resources`, `protocol.variables`, `trigger.input`, or outputs from prior blocks.\n\n```yaml\n# Array format (same name)\ninput: [USER_MESSAGE, COMPANY_NAME]\n\n# Array format (rename)\ninput:\n - CONVERSATION: CONVERSATION_TEXT # Prompt sees CONVERSATION, value comes from CONVERSATION_TEXT\n - TICKET_DETAILS: TICKET\n\n# Object format (rename)\ninput:\n CONVERSATION: CONVERSATION_TEXT\n TICKET_DETAILS: TICKET\n```\n\n## Independent Blocks\n\nUse `independent: true` for content that shouldn't go to the main chat:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary\n independent: true # Output stored in variable, not main chat\n output: SUMMARY\n```\n\nThis is useful for:\n\n- Background processing\n- Summarization in separate threads\n- Generating content for tools\n",
610
610
  excerpt: "Handlers Handlers define what happens when a trigger fires. They contain execution blocks that run in sequence. Handler Structure Each block has a human-readable name (shown in debug UI) and a ...",
611
611
  order: 6
612
612
  },
@@ -624,7 +624,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
624
624
  section: "protocol",
625
625
  title: "Provider Options",
626
626
  description: "Configuring provider-specific tools and features.",
627
- content: "\n# Provider Options\n\nProvider options let you enable provider-specific features like Anthropic's built-in tools and skills. These features run server-side on the provider's infrastructure.\n\n> **Note**: For provider-agnostic code execution, use [Octavus Skills](/docs/protocol/skills) instead. Octavus Skills work with any LLM provider and run in isolated sandbox environments.\n\n## Anthropic Options\n\nConfigure Anthropic-specific features when using `anthropic/*` models:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n```\n\n> **Note**: Provider options are validated against the model provider. Using `anthropic:` options with non-Anthropic models will result in a validation error.\n\n## Provider Tools\n\nProvider tools are executed server-side by the provider (Anthropic). Unlike external tools that you implement, provider tools are built-in capabilities.\n\n### Available Tools\n\n| Tool | Description |\n| ---------------- | -------------------------------------------- |\n| `web-search` | Search the web for current information |\n| `code-execution` | Execute Python/Bash in a sandboxed container |\n\n### Tool Configuration\n\n```yaml\nanthropic:\n tools:\n web-search:\n display: description # How to show in UI\n description: Searching... # Custom display text\n```\n\n| Field | Required | Description |\n| ------------- | -------- | --------------------------------------------------------------------- |\n| `display` | No | `hidden`, `name`, `description`, or `stream` (default: `description`) |\n| `description` | No | Custom text shown to users during execution |\n\n### Web Search\n\nAllows the agent to search the web using Anthropic's built-in web search:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n web-search:\n display: description\n description: Looking up current information\n```\n\n> **Tip**: Octavus also provides a **provider-agnostic** web search via `webSearch: true` in the agent config. This works with any LLM provider and is the recommended approach for multi-provider agents. See [Web Search](/docs/protocol/agent-config#web-search) for details.\n\n### Code Execution\n\nEnables Python and Bash execution in a sandboxed container:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n code-execution:\n display: description\n description: Running analysis\n```\n\nUse cases:\n\n- Data analysis and calculations\n- File processing\n- Chart generation\n- Script execution\n\n> **Note**: Code execution is automatically enabled when skills are configured (skills require the container environment).\n\n## Skills\n\n> **Important**: This section covers **Anthropic's built-in skills** (provider-specific). For provider-agnostic skills that work with any LLM, see [Octavus Skills](/docs/protocol/skills).\n\nAnthropic skills are knowledge packages that give the agent specialized capabilities. They're loaded into Anthropic's code execution container at `/skills/{skill-id}/` and only work with Anthropic models.\n\n### Skill Configuration\n\n```yaml\nanthropic:\n skills:\n pdf:\n type: anthropic # 'anthropic' or 'custom'\n version: latest # Optional version\n display: description\n description: Processing PDF\n```\n\n| Field | Required | Description |\n| ------------- | -------- | --------------------------------------------------------------------- |\n| `type` | Yes | `anthropic` (built-in) or `custom` (uploaded) |\n| `version` | No | Skill version (default: `latest`) |\n| `display` | No | `hidden`, `name`, `description`, or `stream` (default: `description`) |\n| `description` | No | Custom text shown to users |\n\n### Built-in Skills\n\nAnthropic provides several built-in skills:\n\n| Skill ID | Purpose |\n| -------- | ----------------------------------------------- |\n| `pdf` | PDF manipulation, text extraction, form filling |\n| `xlsx` | Excel spreadsheet operations and analysis |\n| `docx` | Word document creation and editing |\n| `pptx` | PowerPoint presentation creation |\n\n### Using Skills\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n skills:\n pdf:\n type: anthropic\n description: Processing your PDF\n xlsx:\n type: anthropic\n description: Analyzing spreadsheet\n```\n\nWhen skills are configured:\n\n1. Code execution is automatically enabled\n2. Skill files are loaded into the container\n3. The agent can read skill instructions and execute scripts\n\n### Custom Skills\n\nYou can create and upload custom skills to Anthropic:\n\n```yaml\nanthropic:\n skills:\n custom-analysis:\n type: custom\n version: latest\n description: Running custom analysis\n```\n\nCustom skills follow the [Agent Skills standard](https://agentskills.io) and contain:\n\n- `SKILL.md` with instructions and metadata\n- Optional `scripts/`, `references/`, and `assets/` directories\n\n### Octavus Skills vs Anthropic Skills\n\n| Feature | Anthropic Skills | Octavus Skills |\n| ----------------- | ------------------------ | ----------------------------- |\n| **Provider** | Anthropic only | Any (agnostic) |\n| **Execution** | Anthropic's container | Isolated sandbox |\n| **Configuration** | `agent.anthropic.skills` | `agent.skills` |\n| **Definition** | `anthropic:` section | `skills:` section |\n| **Use Case** | Claude-specific features | Cross-provider code execution |\n\nFor provider-agnostic code execution, use Octavus Skills defined in the protocol's `skills:` section and enabled via `agent.skills`. See [Skills](/docs/protocol/skills) for details.\n\n## Display Modes\n\nBoth tools and skills support display modes:\n\n| Mode | Behavior |\n| ------------- | ------------------------------- |\n| `hidden` | Not shown to users |\n| `name` | Shows the tool/skill name |\n| `description` | Shows the description (default) |\n| `stream` | Streams progress if available |\n\n## Full Example\n\n```yaml\ninput:\n COMPANY_NAME: { type: string }\n USER_ID: { type: string, optional: true }\n\ntools:\n get-user-account:\n description: Looking up your account\n parameters:\n userId: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n input: [COMPANY_NAME, USER_ID]\n tools: [get-user-account] # External tools\n agentic: true\n thinking: medium\n\n # Anthropic-specific options\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n xlsx:\n type: anthropic\n display: description\n description: Analyzing spreadsheet\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n## Validation\n\nThe protocol validator enforces:\n\n1. **Model match**: Provider options must match the model provider\n - `anthropic:` options require `anthropic/*` model\n - Using mismatched options results in a validation error\n\n2. **Valid tool types**: Only recognized tools are accepted\n - `web-search` and `code-execution` for Anthropic\n\n3. **Valid skill types**: Only `anthropic` or `custom` are accepted\n\n### Error Example\n\n```yaml\n# This will fail validation\nagent:\n model: openai/gpt-4o # OpenAI model\n anthropic: # Anthropic options - mismatch!\n tools:\n web-search: {}\n```\n\nError: `\"anthropic\" options require an anthropic model. Current model provider: \"openai\"`\n",
627
+ content: "\n# Provider Options\n\nProvider options let you enable provider-specific features like Anthropic's built-in tools and skills. These features run server-side on the provider's infrastructure.\n\n> **Note**: For provider-agnostic code execution, use [Octavus Skills](/docs/protocol/skills) instead. Octavus Skills work with any LLM provider and run in isolated sandbox environments.\n\n## Anthropic Options\n\nConfigure Anthropic-specific features when using `anthropic/*` models:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n```\n\n> **Note**: Provider options are validated against the model provider. Using `anthropic:` options with non-Anthropic models will result in a validation error.\n\n## Provider Tools\n\nProvider tools are executed server-side by the provider (Anthropic). Unlike external tools that you implement, provider tools are built-in capabilities.\n\n### Available Tools\n\n| Tool | Description |\n| ---------------- | -------------------------------------------- |\n| `web-search` | Search the web for current information |\n| `code-execution` | Execute Python/Bash in a sandboxed container |\n\n### Tool Configuration\n\n```yaml\nanthropic:\n tools:\n web-search:\n display: description # How to show in UI\n description: Searching... # Custom display text\n```\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------------------------ |\n| `display` | No | `hidden`, `name`, `description`, `stream`, or `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title` (hides description and arguments) |\n| `description` | No | Custom text shown to users during execution |\n\n### Web Search\n\nAllows the agent to search the web using Anthropic's built-in web search:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n web-search:\n display: description\n description: Looking up current information\n```\n\n> **Tip**: Octavus also provides a **provider-agnostic** web search via `webSearch: true` in the agent config. This works with any LLM provider and is the recommended approach for multi-provider agents. See [Web Search](/docs/protocol/agent-config#web-search) for details.\n\n### Code Execution\n\nEnables Python and Bash execution in a sandboxed container:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n code-execution:\n display: description\n description: Running analysis\n```\n\nUse cases:\n\n- Data analysis and calculations\n- File processing\n- Chart generation\n- Script execution\n\n> **Note**: Code execution is automatically enabled when skills are configured (skills require the container environment).\n\n## Skills\n\n> **Important**: This section covers **Anthropic's built-in skills** (provider-specific). For provider-agnostic skills that work with any LLM, see [Octavus Skills](/docs/protocol/skills).\n\nAnthropic skills are knowledge packages that give the agent specialized capabilities. They're loaded into Anthropic's code execution container at `/skills/{skill-id}/` and only work with Anthropic models.\n\n### Skill Configuration\n\n```yaml\nanthropic:\n skills:\n pdf:\n type: anthropic # 'anthropic' or 'custom'\n version: latest # Optional version\n display: description\n description: Processing PDF\n```\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------------------------ |\n| `type` | Yes | `anthropic` (built-in) or `custom` (uploaded) |\n| `version` | No | Skill version (default: `latest`) |\n| `display` | No | `hidden`, `name`, `description`, `stream`, or `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title` (hides description and arguments) |\n| `description` | No | Custom text shown to users |\n\n### Built-in Skills\n\nAnthropic provides several built-in skills:\n\n| Skill ID | Purpose |\n| -------- | ----------------------------------------------- |\n| `pdf` | PDF manipulation, text extraction, form filling |\n| `xlsx` | Excel spreadsheet operations and analysis |\n| `docx` | Word document creation and editing |\n| `pptx` | PowerPoint presentation creation |\n\n### Using Skills\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n skills:\n pdf:\n type: anthropic\n description: Processing your PDF\n xlsx:\n type: anthropic\n description: Analyzing spreadsheet\n```\n\nWhen skills are configured:\n\n1. Code execution is automatically enabled\n2. Skill files are loaded into the container\n3. The agent can read skill instructions and execute scripts\n\n### Custom Skills\n\nYou can create and upload custom skills to Anthropic:\n\n```yaml\nanthropic:\n skills:\n custom-analysis:\n type: custom\n version: latest\n description: Running custom analysis\n```\n\nCustom skills follow the [Agent Skills standard](https://agentskills.io) and contain:\n\n- `SKILL.md` with instructions and metadata\n- Optional `scripts/`, `references/`, and `assets/` directories\n\n### Octavus Skills vs Anthropic Skills\n\n| Feature | Anthropic Skills | Octavus Skills |\n| ----------------- | ------------------------ | ----------------------------- |\n| **Provider** | Anthropic only | Any (agnostic) |\n| **Execution** | Anthropic's container | Isolated sandbox |\n| **Configuration** | `agent.anthropic.skills` | `agent.skills` |\n| **Definition** | `anthropic:` section | `skills:` section |\n| **Use Case** | Claude-specific features | Cross-provider code execution |\n\nFor provider-agnostic code execution, use Octavus Skills defined in the protocol's `skills:` section and enabled via `agent.skills`. See [Skills](/docs/protocol/skills) for details.\n\n## Display Modes\n\nBoth tools and skills support display modes:\n\n| Mode | Behavior |\n| ------------- | ------------------------------- |\n| `hidden` | Not shown to users |\n| `name` | Shows the tool/skill name |\n| `description` | Shows the description (default) |\n| `stream` | Streams progress if available |\n\n## Full Example\n\n```yaml\ninput:\n COMPANY_NAME: { type: string }\n USER_ID: { type: string, optional: true }\n\ntools:\n get-user-account:\n description: Looking up your account\n parameters:\n userId: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n input: [COMPANY_NAME, USER_ID]\n tools: [get-user-account] # External tools\n agentic: true\n thinking: medium\n\n # Anthropic-specific options\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n xlsx:\n type: anthropic\n display: description\n description: Analyzing spreadsheet\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n## Validation\n\nThe protocol validator enforces:\n\n1. **Model match**: Provider options must match the model provider\n - `anthropic:` options require `anthropic/*` model\n - Using mismatched options results in a validation error\n\n2. **Valid tool types**: Only recognized tools are accepted\n - `web-search` and `code-execution` for Anthropic\n\n3. **Valid skill types**: Only `anthropic` or `custom` are accepted\n\n### Error Example\n\n```yaml\n# This will fail validation\nagent:\n model: openai/gpt-4o # OpenAI model\n anthropic: # Anthropic options - mismatch!\n tools:\n web-search: {}\n```\n\nError: `\"anthropic\" options require an anthropic model. Current model provider: \"openai\"`\n",
628
628
  excerpt: "Provider Options Provider options let you enable provider-specific features like Anthropic's built-in tools and skills. These features run server-side on the provider's infrastructure. > Note: For...",
629
629
  order: 8
630
630
  },
@@ -651,7 +651,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
651
651
  section: "protocol",
652
652
  title: "Workers",
653
653
  description: "Defining worker agents for background and task-based execution.",
654
- content: '\n# Workers\n\nWorkers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value.\n\n## When to Use Workers\n\nWorkers are ideal for:\n\n- **Background processing** - Long-running tasks that don\'t need conversation\n- **Composable tasks** - Reusable units of work called by other agents\n- **Pipelines** - Multi-step processing with structured output\n- **Parallel execution** - Tasks that can run independently\n\nUse interactive agents instead when:\n\n- **Conversation is needed** - Multi-turn dialogue with users\n- **Persistence matters** - State should survive across interactions\n- **Session context** - User context needs to persist\n\n## Worker vs Interactive\n\n| Aspect | Interactive | Worker |\n| ---------- | ---------------------------------- | ----------------------------- |\n| Structure | `triggers` + `handlers` + `agent` | `steps` + `output` |\n| LLM Config | Global `agent:` section | Per-thread via `start-thread` |\n| Invocation | Fire a named trigger | Direct execution with input |\n| Session | Persists across triggers (24h TTL) | Single execution |\n| Result | Streaming chat | Streaming + output value |\n\n## Protocol Structure\n\nWorkers use a simpler protocol structure than interactive agents:\n\n```yaml\n# Input schema - provided when worker is executed\ninput:\n TOPIC:\n type: string\n description: Topic to research\n DEPTH:\n type: string\n optional: true\n default: medium\n\n# Variables for intermediate results\nvariables:\n RESEARCH_DATA:\n type: string\n ANALYSIS:\n type: string\n description: Final analysis result\n\n# Tools available to the worker\ntools:\n web-search:\n description: Search the web\n parameters:\n query: { type: string }\n\n# Sequential execution steps\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC, DEPTH]\n tools: [web-search]\n maxSteps: 5\n\n Add research request:\n block: add-message\n thread: research\n role: user\n prompt: research-prompt\n input: [TOPIC, DEPTH]\n\n Generate research:\n block: next-message\n thread: research\n output: RESEARCH_DATA\n\n Start analysis:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n system: analysis-system\n\n Add analysis request:\n block: add-message\n thread: analysis\n role: user\n prompt: analysis-prompt\n input: [RESEARCH_DATA]\n\n Generate analysis:\n block: next-message\n thread: analysis\n output: ANALYSIS\n\n# Output variable - the worker\'s return value\noutput: ANALYSIS\n```\n\n## settings.json\n\nWorkers are identified by the `format` field:\n\n```json\n{\n "slug": "research-assistant",\n "name": "Research Assistant",\n "description": "Researches topics and returns structured analysis",\n "format": "worker"\n}\n```\n\n## Key Differences\n\n### No Global Agent Config\n\nInteractive agents have a global `agent:` section that configures a main thread. Workers don\'t have this - every thread must be explicitly created via `start-thread`:\n\n```yaml\n# Interactive agent: Global config\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [tool-a, tool-b]\n\n# Worker: Each thread configured independently\nsteps:\n Start thread A:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n tools: [tool-a]\n\n Start thread B:\n block: start-thread\n thread: analysis\n model: openai/gpt-4o\n tools: [tool-b]\n```\n\nThis gives workers flexibility to use different models, tools, skills, and settings at different stages.\n\n### Steps Instead of Handlers\n\nWorkers use `steps:` instead of `handlers:`. Steps execute sequentially, like handler blocks:\n\n```yaml\n# Interactive: Handlers respond to triggers\nhandlers:\n user-message:\n Add message:\n block: add-message\n # ...\n\n# Worker: Steps execute in sequence\nsteps:\n Add message:\n block: add-message\n # ...\n```\n\n### Output Value\n\nWorkers can return an output value to the caller:\n\n```yaml\nvariables:\n RESULT:\n type: string\n\nsteps:\n # ... steps that populate RESULT ...\n\noutput: RESULT # Return this variable\'s value\n```\n\nThe `output` field references a variable declared in `variables:`. If omitted, the worker completes without returning a value.\n\n## Available Blocks\n\nWorkers support the same blocks as handlers:\n\n| Block | Purpose |\n| ------------------ | -------------------------------------------- |\n| `start-thread` | Create a named thread with LLM configuration |\n| `add-message` | Add a message to a thread |\n| `next-message` | Generate LLM response |\n| `tool-call` | Call a tool deterministically |\n| `set-resource` | Update a resource value |\n| `serialize-thread` | Convert thread to text |\n| `generate-image` | Generate an image from a prompt variable |\n\n### start-thread (Required for LLM)\n\nEvery thread must be initialized with `start-thread` before using `next-message`:\n\n```yaml\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC]\n tools: [web-search]\n thinking: medium\n maxSteps: 5\n```\n\nAll LLM configuration goes here:\n\n| Field | Description |\n| --------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |\n| `thread` | Thread name (defaults to block name) |\n| `model` | LLM model to use |\n| `system` | System prompt filename (required) |\n| `input` | Variables for system prompt |\n| `tools` | Tools available in this thread |\n| `skills` | Octavus skills available in this thread |\n| `mcpServers` | MCP servers available in this thread |\n| `imageModel` | Image generation model |\n| `webSearch` | Enable built-in web search tool |\n| `thinking` | Extended reasoning level (`low`/`medium`/`high`/`max`), `"off"`, or variable reference |\n| `cache` | Prompt caching mode: `auto` (default), `extended`, or `off` |\n| `temperature` | Model temperature (0-2), `"off"`, or variable reference |\n| `maxSteps` | Maximum tool call cycles (enables agentic if > 1), or variable reference |\n| `maxToolOutputTokens` | Cap a single tool result at this many tokens in the thread\'s model view (head+tail preview + note). Omit to leave tool output unbounded |\n\n## Simple Example\n\nA worker that generates a title from a summary:\n\n```yaml\n# Input\ninput:\n CONVERSATION_SUMMARY:\n type: string\n description: Summary to generate a title for\n\n# Variables\nvariables:\n TITLE:\n type: string\n description: The generated title\n\n# Steps\nsteps:\n Start title thread:\n block: start-thread\n thread: title-gen\n model: anthropic/claude-sonnet-4-5\n system: title-system\n\n Add title request:\n block: add-message\n thread: title-gen\n role: user\n prompt: title-request\n input: [CONVERSATION_SUMMARY]\n\n Generate title:\n block: next-message\n thread: title-gen\n output: TITLE\n display: stream\n\n# Output\noutput: TITLE\n```\n\n## Advanced Example\n\nA worker with multiple threads, tools, and agentic behavior:\n\n```yaml\ninput:\n USER_MESSAGE:\n type: string\n description: The user\'s message to respond to\n USER_ID:\n type: string\n description: User ID for account lookups\n optional: true\n\ntools:\n get-user-account:\n description: Looking up account information\n parameters:\n userId: { type: string }\n create-support-ticket:\n description: Creating a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string }\n\nvariables:\n ASSISTANT_RESPONSE:\n type: string\n CHAT_TRANSCRIPT:\n type: string\n CONVERSATION_SUMMARY:\n type: string\n\nsteps:\n # Thread 1: Chat with agentic tool calling\n Start chat thread:\n block: start-thread\n thread: chat\n model: anthropic/claude-sonnet-4-5\n system: chat-system\n input: [USER_ID]\n tools: [get-user-account, create-support-ticket]\n thinking: medium\n maxSteps: 5\n\n Add user message:\n block: add-message\n thread: chat\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Generate response:\n block: next-message\n thread: chat\n output: ASSISTANT_RESPONSE\n display: stream\n\n # Serialize for summary\n Save conversation:\n block: serialize-thread\n thread: chat\n output: CHAT_TRANSCRIPT\n\n # Thread 2: Summary generation\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n system: summary-system\n thinking: low\n\n Add summary request:\n block: add-message\n thread: summary\n role: user\n prompt: summary-request\n input: [CHAT_TRANSCRIPT]\n\n Generate summary:\n block: next-message\n thread: summary\n output: CONVERSATION_SUMMARY\n display: stream\n\noutput: CONVERSATION_SUMMARY\n```\n\n## MCP Servers\n\nWorkers can declare and use MCP servers, just like interactive agents. Define them in `mcpServers:` and reference them in `start-thread`:\n\n```yaml\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nWorkers resolve their own MCP connections independently - they don\'t inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.\n\nSee [MCP Servers](/docs/protocol/mcp-servers) for full documentation.\n\n## Skills, Image Generation, and Web Search\n\nWorkers can use Octavus skills, image generation, and web search, configured per-thread via `start-thread`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generate QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n imageModel: google/gemini-2.5-flash-image\n webSearch: true\n maxSteps: 10\n```\n\nWorkers define their own skills independently - they don\'t inherit skills from a parent interactive agent. Each thread gets its own sandbox scoped to only its listed skills.\n\nSkills with `execution: device` work the same way in workers as in interactive agents - the skill runs on the agent\'s computer. Workers resolve their device execution independently, so a worker can use device skills even if the parent agent does not.\n\nSee [Skills](/docs/protocol/skills) for full documentation.\n\n## Tool Handling\n\nWorkers support the same tool handling as interactive agents:\n\n- **Server tools** - Handled by tool handlers you provide\n- **Client tools** - Pause execution, return tool request to caller\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n\n// Streaming: observe events in real-time\nconst events = client.workers.execute(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n```\n\nSee [Server SDK Workers](/docs/server-sdk/workers) for tool handling details.\n\n## Stream Events\n\nWorkers emit the same events as interactive agents, plus worker-specific events:\n\n| Event | Description |\n| --------------- | ---------------------------------- |\n| `worker-start` | Worker execution begins |\n| `worker-result` | Worker completes (includes output) |\n\nAll standard events (text-delta, tool calls, etc.) are also emitted.\n\n## Calling Workers from Interactive Agents\n\nInteractive agents can call workers in three ways:\n\n1. **Deterministically** - Using the `run-worker` block\n2. **Agentically** - LLM calls worker as a tool\n3. **Automatically** - Octavus invokes the worker as part of a built-in capability, not the model. Context management\'s `summarizerWorker` (see [Context Management](/docs/protocol/context-management)) works this way: declare it in `workers:` but leave it out of `agent.workers` so the model never sees it as a tool.\n\n### Worker Declaration\n\nFirst, declare workers in your interactive agent\'s protocol:\n\n```yaml\nworkers:\n generate-title:\n description: Generating conversation title\n display: description\n research-assistant:\n description: Researching topic\n display: stream\n tools:\n search: web-search # Map worker tool \u2192 parent tool\n```\n\n### run-worker Block\n\nCall a worker deterministically from a handler:\n\n```yaml\nhandlers:\n request-human:\n Generate title:\n block: run-worker\n worker: generate-title\n input:\n CONVERSATION_SUMMARY: SUMMARY\n output: CONVERSATION_TITLE\n```\n\n### LLM Tool Invocation\n\nMake workers available to the LLM:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n workers: [generate-title, research-assistant]\n agentic: true\n```\n\nThe LLM can then call workers as tools during conversation.\n\n### Display Modes\n\nControls how worker execution appears to users. The default for workers is `stream`.\n\n| Mode | Behavior |\n| ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Worker runs silently. No events reach the client - no `UIWorkerPart` is created. |\n| `name` | Shows a running/done indicator with the worker name. No nested content (text, tool calls, reasoning) is forwarded. |\n| `description` | Shows a running/done indicator with the worker description. No nested content is forwarded. |\n| `stream` | Full visibility. All nested events are forwarded - text, reasoning, tool calls, sources, files. Worker input is included on start. |\n\n**Progressive input streaming:** When a worker with `display: stream` is invoked agentically (LLM calls it as a tool), the `UIWorkerPart` appears in the UI immediately as the LLM starts generating the worker\'s arguments. The worker input streams progressively into the worker part, the same way text tokens stream into a text part. Once input finishes, worker execution begins and nested content flows into the same worker part. There is no intermediate tool card.\n\n**`name` and `description` modes:** Worker input is stripped from the `worker-start` event (it may contain sensitive data). Only the running/done status and the final `worker-result` are forwarded to the parent stream. Use these for workers where the user only needs to know the worker ran, not what it did internally.\n\n**`hidden` mode:** The worker executes normally but produces no UI presence at all. Use for internal workers that are implementation details.\n\n### Tool Mapping\n\nMap parent tools to worker tools when the worker needs access to your tool handlers:\n\n```yaml\nworkers:\n research-assistant:\n description: Research topics\n tools:\n search: web-search # Worker\'s "search" \u2192 parent\'s "web-search"\n```\n\nWhen the worker calls its `search` tool, your `web-search` handler executes.\n\n## Next Steps\n\n- [Server SDK Workers](/docs/server-sdk/workers) - Executing workers from code\n- [Handlers](/docs/protocol/handlers) - Block reference for steps\n- [Agent Config](/docs/protocol/agent-config) - Model and settings\n',
654
+ content: '\n# Workers\n\nWorkers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value.\n\n## When to Use Workers\n\nWorkers are ideal for:\n\n- **Background processing** - Long-running tasks that don\'t need conversation\n- **Composable tasks** - Reusable units of work called by other agents\n- **Pipelines** - Multi-step processing with structured output\n- **Parallel execution** - Tasks that can run independently\n\nUse interactive agents instead when:\n\n- **Conversation is needed** - Multi-turn dialogue with users\n- **Persistence matters** - State should survive across interactions\n- **Session context** - User context needs to persist\n\n## Worker vs Interactive\n\n| Aspect | Interactive | Worker |\n| ---------- | ---------------------------------- | ----------------------------- |\n| Structure | `triggers` + `handlers` + `agent` | `steps` + `output` |\n| LLM Config | Global `agent:` section | Per-thread via `start-thread` |\n| Invocation | Fire a named trigger | Direct execution with input |\n| Session | Persists across triggers (24h TTL) | Single execution |\n| Result | Streaming chat | Streaming + output value |\n\n## Protocol Structure\n\nWorkers use a simpler protocol structure than interactive agents:\n\n```yaml\n# Input schema - provided when worker is executed\ninput:\n TOPIC:\n type: string\n description: Topic to research\n DEPTH:\n type: string\n optional: true\n default: medium\n\n# Variables for intermediate results\nvariables:\n RESEARCH_DATA:\n type: string\n ANALYSIS:\n type: string\n description: Final analysis result\n\n# Tools available to the worker\ntools:\n web-search:\n description: Search the web\n parameters:\n query: { type: string }\n\n# Sequential execution steps\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC, DEPTH]\n tools: [web-search]\n maxSteps: 5\n\n Add research request:\n block: add-message\n thread: research\n role: user\n prompt: research-prompt\n input: [TOPIC, DEPTH]\n\n Generate research:\n block: next-message\n thread: research\n output: RESEARCH_DATA\n\n Start analysis:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n system: analysis-system\n\n Add analysis request:\n block: add-message\n thread: analysis\n role: user\n prompt: analysis-prompt\n input: [RESEARCH_DATA]\n\n Generate analysis:\n block: next-message\n thread: analysis\n output: ANALYSIS\n\n# Output variable - the worker\'s return value\noutput: ANALYSIS\n```\n\n## settings.json\n\nWorkers are identified by the `format` field:\n\n```json\n{\n "slug": "research-assistant",\n "name": "Research Assistant",\n "description": "Researches topics and returns structured analysis",\n "format": "worker"\n}\n```\n\n## Key Differences\n\n### No Global Agent Config\n\nInteractive agents have a global `agent:` section that configures a main thread. Workers don\'t have this - every thread must be explicitly created via `start-thread`:\n\n```yaml\n# Interactive agent: Global config\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [tool-a, tool-b]\n\n# Worker: Each thread configured independently\nsteps:\n Start thread A:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n tools: [tool-a]\n\n Start thread B:\n block: start-thread\n thread: analysis\n model: openai/gpt-4o\n tools: [tool-b]\n```\n\nThis gives workers flexibility to use different models, tools, skills, and settings at different stages.\n\n### Steps Instead of Handlers\n\nWorkers use `steps:` instead of `handlers:`. Steps execute sequentially, like handler blocks:\n\n```yaml\n# Interactive: Handlers respond to triggers\nhandlers:\n user-message:\n Add message:\n block: add-message\n # ...\n\n# Worker: Steps execute in sequence\nsteps:\n Add message:\n block: add-message\n # ...\n```\n\n### Output Value\n\nWorkers can return an output value to the caller:\n\n```yaml\nvariables:\n RESULT:\n type: string\n\nsteps:\n # ... steps that populate RESULT ...\n\noutput: RESULT # Return this variable\'s value\n```\n\nThe `output` field references a variable declared in `variables:`. If omitted, the worker completes without returning a value.\n\n## Available Blocks\n\nWorkers support the same blocks as handlers:\n\n| Block | Purpose |\n| ------------------ | -------------------------------------------- |\n| `start-thread` | Create a named thread with LLM configuration |\n| `add-message` | Add a message to a thread |\n| `next-message` | Generate LLM response |\n| `tool-call` | Call a tool deterministically |\n| `set-resource` | Update a resource value |\n| `serialize-thread` | Convert thread to text |\n| `generate-image` | Generate an image from a prompt variable |\n\n### start-thread (Required for LLM)\n\nEvery thread must be initialized with `start-thread` before using `next-message`:\n\n```yaml\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC]\n tools: [web-search]\n thinking: medium\n maxSteps: 5\n```\n\nAll LLM configuration goes here:\n\n| Field | Description |\n| --------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |\n| `thread` | Thread name (defaults to block name) |\n| `model` | LLM model to use |\n| `system` | System prompt filename (required) |\n| `input` | Variables for system prompt |\n| `tools` | Tools available in this thread |\n| `skills` | Octavus skills available in this thread |\n| `mcpServers` | MCP servers available in this thread |\n| `imageModel` | Image generation model |\n| `webSearch` | Enable built-in web search tool |\n| `thinking` | Extended reasoning level (`low`/`medium`/`high`/`max`), `"off"`, or variable reference |\n| `cache` | Prompt caching mode: `auto` (default), `extended`, or `off` |\n| `temperature` | Model temperature (0-2), `"off"`, or variable reference |\n| `maxSteps` | Maximum tool call cycles (enables agentic if > 1), or variable reference |\n| `maxToolOutputTokens` | Cap a single tool result at this many tokens in the thread\'s model view (head+tail preview + note). Omit to leave tool output unbounded |\n\n## Simple Example\n\nA worker that generates a title from a summary:\n\n```yaml\n# Input\ninput:\n CONVERSATION_SUMMARY:\n type: string\n description: Summary to generate a title for\n\n# Variables\nvariables:\n TITLE:\n type: string\n description: The generated title\n\n# Steps\nsteps:\n Start title thread:\n block: start-thread\n thread: title-gen\n model: anthropic/claude-sonnet-4-5\n system: title-system\n\n Add title request:\n block: add-message\n thread: title-gen\n role: user\n prompt: title-request\n input: [CONVERSATION_SUMMARY]\n\n Generate title:\n block: next-message\n thread: title-gen\n output: TITLE\n display: stream\n\n# Output\noutput: TITLE\n```\n\n## Advanced Example\n\nA worker with multiple threads, tools, and agentic behavior:\n\n```yaml\ninput:\n USER_MESSAGE:\n type: string\n description: The user\'s message to respond to\n USER_ID:\n type: string\n description: User ID for account lookups\n optional: true\n\ntools:\n get-user-account:\n description: Looking up account information\n parameters:\n userId: { type: string }\n create-support-ticket:\n description: Creating a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string }\n\nvariables:\n ASSISTANT_RESPONSE:\n type: string\n CHAT_TRANSCRIPT:\n type: string\n CONVERSATION_SUMMARY:\n type: string\n\nsteps:\n # Thread 1: Chat with agentic tool calling\n Start chat thread:\n block: start-thread\n thread: chat\n model: anthropic/claude-sonnet-4-5\n system: chat-system\n input: [USER_ID]\n tools: [get-user-account, create-support-ticket]\n thinking: medium\n maxSteps: 5\n\n Add user message:\n block: add-message\n thread: chat\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Generate response:\n block: next-message\n thread: chat\n output: ASSISTANT_RESPONSE\n display: stream\n\n # Serialize for summary\n Save conversation:\n block: serialize-thread\n thread: chat\n output: CHAT_TRANSCRIPT\n\n # Thread 2: Summary generation\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n system: summary-system\n thinking: low\n\n Add summary request:\n block: add-message\n thread: summary\n role: user\n prompt: summary-request\n input: [CHAT_TRANSCRIPT]\n\n Generate summary:\n block: next-message\n thread: summary\n output: CONVERSATION_SUMMARY\n display: stream\n\noutput: CONVERSATION_SUMMARY\n```\n\n## MCP Servers\n\nWorkers can declare and use MCP servers, just like interactive agents. Define them in `mcpServers:` and reference them in `start-thread`:\n\n```yaml\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nWorkers resolve their own MCP connections independently - they don\'t inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.\n\nSee [MCP Servers](/docs/protocol/mcp-servers) for full documentation.\n\n## Skills, Image Generation, and Web Search\n\nWorkers can use Octavus skills, image generation, and web search, configured per-thread via `start-thread`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generate QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n imageModel: google/gemini-2.5-flash-image\n webSearch: true\n maxSteps: 10\n```\n\nWorkers define their own skills independently - they don\'t inherit skills from a parent interactive agent. Each thread gets its own sandbox scoped to only its listed skills.\n\nSkills with `execution: device` work the same way in workers as in interactive agents - the skill runs on the agent\'s computer. Workers resolve their device execution independently, so a worker can use device skills even if the parent agent does not.\n\nSee [Skills](/docs/protocol/skills) for full documentation.\n\n## Tool Handling\n\nWorkers support the same tool handling as interactive agents:\n\n- **Server tools** - Handled by tool handlers you provide\n- **Client tools** - Pause execution, return tool request to caller\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n\n// Streaming: observe events in real-time\nconst events = client.workers.execute(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n```\n\nSee [Server SDK Workers](/docs/server-sdk/workers) for tool handling details.\n\n## Stream Events\n\nWorkers emit the same events as interactive agents, plus worker-specific events:\n\n| Event | Description |\n| --------------- | ---------------------------------- |\n| `worker-start` | Worker execution begins |\n| `worker-result` | Worker completes (includes output) |\n\nAll standard events (text-delta, tool calls, etc.) are also emitted.\n\n## Calling Workers from Interactive Agents\n\nInteractive agents can call workers in three ways:\n\n1. **Deterministically** - Using the `run-worker` block\n2. **Agentically** - LLM calls worker as a tool\n3. **Automatically** - Octavus invokes the worker as part of a built-in capability, not the model. Context management\'s `summarizerWorker` (see [Context Management](/docs/protocol/context-management)) works this way: declare it in `workers:` but leave it out of `agent.workers` so the model never sees it as a tool.\n\n### Worker Declaration\n\nFirst, declare workers in your interactive agent\'s protocol:\n\n```yaml\nworkers:\n generate-title:\n description: Generating conversation title\n display: description\n research-assistant:\n description: Researching topic\n display: stream\n tools:\n search: web-search # Map worker tool \u2192 parent tool\n```\n\n### run-worker Block\n\nCall a worker deterministically from a handler:\n\n```yaml\nhandlers:\n request-human:\n Generate title:\n block: run-worker\n worker: generate-title\n input:\n CONVERSATION_SUMMARY: SUMMARY\n output: CONVERSATION_TITLE\n```\n\n### LLM Tool Invocation\n\nMake workers available to the LLM:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n workers: [generate-title, research-assistant]\n agentic: true\n```\n\nThe LLM can then call workers as tools during conversation.\n\n### Display Modes\n\nControls how worker execution appears to users. The default for workers is `stream`.\n\n| Mode | Behavior |\n| ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Worker runs silently. No events reach the client - no `UIWorkerPart` is created. |\n| `name` | Shows a running/done indicator with the worker name. No nested content (text, tool calls, reasoning) is forwarded. |\n| `description` | Shows a running/done indicator with the worker description. No nested content is forwarded. |\n| `stream` | Full visibility. All nested events are forwarded - text, reasoning, tool calls, sources, files. Worker input is included on start. |\n| `title` | Like `description`, but shows the worker\'s `title` field instead of its description. No nested content or input is forwarded. |\n\n**Progressive input streaming:** When a worker with `display: stream` is invoked agentically (LLM calls it as a tool), the `UIWorkerPart` appears in the UI immediately as the LLM starts generating the worker\'s arguments. The worker input streams progressively into the worker part, the same way text tokens stream into a text part. Once input finishes, worker execution begins and nested content flows into the same worker part. There is no intermediate tool card.\n\n**`name` and `description` modes:** Worker input is stripped from the `worker-start` event (it may contain sensitive data). Only the running/done status and the final `worker-result` are forwarded to the parent stream. Use these for workers where the user only needs to know the worker ran, not what it did internally.\n\n**`hidden` mode:** The worker executes normally but produces no UI presence at all. Use for internal workers that are implementation details.\n\n### Tool Mapping\n\nMap parent tools to worker tools when the worker needs access to your tool handlers:\n\n```yaml\nworkers:\n research-assistant:\n description: Research topics\n tools:\n search: web-search # Worker\'s "search" \u2192 parent\'s "web-search"\n```\n\nWhen the worker calls its `search` tool, your `web-search` handler executes.\n\n## Next Steps\n\n- [Server SDK Workers](/docs/server-sdk/workers) - Executing workers from code\n- [Handlers](/docs/protocol/handlers) - Block reference for steps\n- [Agent Config](/docs/protocol/agent-config) - Model and settings\n',
655
655
  excerpt: "Workers Workers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value. When to...",
656
656
  order: 11
657
657
  },
@@ -669,7 +669,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
669
669
  section: "protocol",
670
670
  title: "MCP Servers",
671
671
  description: "Connecting agents to external tools via Model Context Protocol (MCP).",
672
- content: "\n# MCP Servers\n\nMCP servers extend your agent with tools from external services. Define them in your protocol, and agents automatically discover and use their tools at runtime.\n\nThere are three types of MCP servers:\n\n| Source | Description | Example |\n| ---------- | ----------------------------------------------------------------------------------- | ------------------------------------- |\n| `remote` | HTTP-based MCP servers, managed by the platform | Figma, Sentry, GitHub |\n| `device` | Local MCP servers running on the agent's machine via `@octavus/computer` | Browser automation, filesystem |\n| `consumer` | Inline MCP servers defined in your server-sdk process via `createInlineMcpServer()` | Custom integrations, third-party APIs |\n\n## Defining MCP Servers\n\nMCP servers are defined in the `mcpServers:` section. The key becomes the **namespace** for all tools from that server.\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n```\n\n### Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `description` | Yes | What the MCP server provides |\n| `source` | Yes | `remote`, `device`, or `consumer` (see source types above) |\n| `display` | No | How tool calls appear in UI: `hidden`, `name`, `description` (default: `description`) |\n| `connection` | No | When to connect: `eager` or `lazy` (default: `lazy`). `remote` only. |\n| `execution` | No | Where the MCP process runs: `sandbox` (default) or `device`. `remote` only. See [Device Execution](#device-execution). |\n\n### Display Modes\n\nDisplay modes control visibility of all tool calls from the MCP server, using the same modes as [regular tools](/docs/protocol/tools#display-modes):\n\n| Mode | Behavior |\n| ------------- | -------------------------------------- |\n| `hidden` | Tool calls run silently |\n| `name` | Shows tool name while executing |\n| `description` | Shows tool description while executing |\n\n## Making MCP Servers Available\n\nLike tools, MCP servers defined in `mcpServers:` must be referenced in `agent.mcpServers` to be available:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry, browser, filesystem]\n tools: [set-chat-title]\n agentic: true\n maxSteps: 100\n```\n\n## Tool Namespacing\n\nAll MCP tools are automatically namespaced using `__` (double underscore) as a separator. The namespace comes from the `mcpServers` key.\n\nFor example, a server defined as `browser:` that exposes `navigate_page` and `click` produces:\n\n- `browser__navigate_page`\n- `browser__click`\n\nA server defined as `figma:` that exposes `get_design_context` produces:\n\n- `figma__get_design_context`\n\nThe namespace is stripped before calling the MCP server - the server receives the original tool name. This convention matches Anthropic's MCP integration in Claude Desktop and ensures tool names stay unique across servers.\n\n### What the LLM Sees\n\nWhen an agent has both regular tools and MCP servers configured, the LLM sees all tools combined:\n\n```\nProtocol tools:\n set-chat-title\n\nRemote MCP tools (auto-discovered):\n figma__get_design_context\n figma__get_screenshot\n sentry__get_issues\n sentry__get_issue_details\n\nDevice MCP tools (auto-discovered):\n browser__navigate_page\n browser__click\n browser__take_snapshot\n filesystem__read_file\n filesystem__write_file\n filesystem__list_directory\n\nConsumer MCP tools (provided by the server-sdk):\n github__get-pr-overview\n github__list-issues\n```\n\nYou don't define individual MCP tool schemas in the protocol - remote and device tools are auto-discovered from each MCP server at runtime, and consumer tools are supplied by the server-sdk.\n\n## Remote MCP Servers\n\nRemote MCP servers (`source: remote`) connect to HTTP-based MCP endpoints. The platform manages the connection, authentication, and tool discovery.\n\nConfiguration happens in the Octavus platform UI:\n\n1. Add an MCP server to your project (URL + authentication)\n2. The server's slug must match the namespace in your protocol\n3. The platform connects, discovers tools, and makes them available to the agent\n\n### Connection Modes\n\nThe `connection` field controls when the platform connects to a remote MCP server:\n\n| Mode | Behavior |\n| ------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `lazy` | (default) The agent activates integrations on demand at runtime. The agent starts responding immediately. |\n| `eager` | The platform connects and discovers tools before the first LLM request. Tools are guaranteed available from message 1. |\n\n```yaml\nmcpServers:\n sentry:\n source: remote\n connection: eager # Always connected upfront\n display: name\n\n notion:\n source: remote\n # connection defaults to lazy - agent activates when needed\n display: description\n```\n\nWith **lazy connection** (the default), the agent receives two built-in tools - one for listing available integrations and one for activating them. The agent decides which integrations it needs based on the conversation and activates them on demand. This avoids paying connection latency for integrations the agent doesn't end up using.\n\nWith **eager connection**, the platform connects to the MCP server before the first LLM request, exactly like a declared tool. Use this when the agent needs the MCP's tools from the very first message.\n\nThe `connection` field is only valid on `source: remote` - device MCPs (`source: device`) have their own connection mechanism through the server-sdk. The `connection` field is respected for remote MCPs with `execution: device` the same way as sandbox MCPs.\n\n### Authentication\n\nRemote MCP servers support multiple authentication methods:\n\n| Auth Type | Description |\n| --------- | ------------------------------- |\n| MCP OAuth | Standard MCP OAuth flow |\n| API Key | Static API key sent as a header |\n| Bearer | Bearer token authentication |\n| None | No authentication required |\n\nAuthentication is configured per-project - different projects can connect to the same MCP server with different credentials.\n\n## Device Execution\n\nThe `execution` field controls where a remote MCP server's STDIO process runs. By default (`execution: sandbox`), the process runs in the platform's sandbox. When set to `execution: device`, the STDIO process runs on the agent's computer (VM or desktop) instead.\n\n```yaml\nmcpServers:\n code-tools:\n description: Code analysis and refactoring tools\n source: remote\n execution: device # STDIO process runs on the agent's computer\n display: name\n\n sentry:\n description: Error tracking\n source: remote\n # execution defaults to sandbox - runs in the platform\n display: name\n```\n\n### When to Use\n\nUse `execution: device` when the MCP server needs access to the agent's local environment - for example, tools that read from the local filesystem, interact with running processes, or need CLIs installed on the device.\n\n### Rules\n\n- `execution` is only meaningful for `source: remote` MCPs that use STDIO transport. HTTP-transport remote MCPs always connect from the platform regardless of the `execution` setting.\n- `execution` is **invalid** on `source: device` and `source: consumer` MCPs - they already run outside the platform. Using it produces a validation error.\n- The `connection` field (`eager` or `lazy`) is respected for device-executed MCPs the same way as sandbox-executed MCPs.\n\n## Device MCP Servers\n\nDevice MCP servers (`source: device`) run on the consumer's machine. The consumer provides the MCP tools via the `@octavus/computer` package (or any `ToolProvider` implementation) through the server-sdk.\n\nWhen an agent has device MCP servers:\n\n1. The consumer creates a `Computer` with matching namespaces\n2. `@octavus/computer` discovers tools from each MCP server\n3. Tool schemas are sent to the platform via the server-sdk\n4. Tool calls flow back to the consumer for execution\n\nSee [`@octavus/computer`](/docs/server-sdk/computer) for the full integration guide.\n\n### Namespace Matching\n\nThe `mcpServers` keys in the protocol must match the keys in the consumer's `Computer` configuration:\n\n```yaml\n# protocol.yaml\nmcpServers:\n browser: # \u2190 must match\n source: device\n filesystem: # \u2190 must match\n source: device\n```\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n },\n});\n```\n\nIf the consumer provides a namespace not declared in the protocol, the platform rejects it.\n\n## Consumer MCP Servers\n\nConsumer MCP servers (`source: consumer`) are defined inline in your server-sdk process. Tool schemas and handlers live in your code; the platform learns the namespace from the protocol and routes tool calls to your process via the same `dynamicToolSchemas` channel that device MCPs use.\n\n```yaml\nmcpServers:\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n\nagent:\n mcpServers:\n - github\n```\n\nThe protocol declaration is intentionally minimal - the SDK supplies tool names and JSON schemas at runtime, so adding or evolving tools doesn't require a protocol change.\n\nUse consumer MCPs when:\n\n- The integration's credentials should never reach the platform (OAuth tokens, customer API keys).\n- You want to group an integration's tools (`github__list-prs`, `github__get-issue`) without enumerating each one in YAML.\n- You want type-safe handler arguments via Zod schemas.\n\nSee [`createInlineMcpServer`](/docs/server-sdk/inline-mcp) in the server-sdk reference for the full implementation guide.\n\n### Namespace Matching\n\nThe protocol namespace must match the namespace passed to `createInlineMcpServer()`:\n\n```yaml\n# protocol.yaml\nmcpServers:\n github: # \u2190 must match\n source: consumer\n```\n\n```typescript\nconst github = createInlineMcpServer('github', {\n /* tools... */\n});\n\nsession = client.agentSessions.attach(sessionId, {\n mcpServers: [github],\n});\n```\n\nIf the SDK provides a namespace not declared in the protocol, those tools are filtered out at the runtime boundary and the LLM never sees them.\n\n## Thread-Level Scoping\n\nThreads can scope which MCP servers are available, the same way they scope [tools](/docs/protocol/handlers#start-thread):\n\n```yaml\nhandlers:\n user-message:\n Start research:\n block: start-thread\n thread: research\n mcpServers: [figma, browser]\n tools: [set-chat-title]\n system: research-prompt\n```\n\nThis thread can use Figma and browser tools, but not sentry or filesystem - even if those are available on the main agent.\n\n## On-Demand MCP Servers\n\nBy default, an agent can only call MCP tools whose namespace is listed in `mcpServers`. With `onDemandMcpServers`, a scope can opt into **every connected MCP of a given source** at runtime, without enumerating each one in the protocol.\n\nRemote MCPs are connected at the project level from the Octavus dashboard. Normally, each connected MCP that the agent should be able to use has to be declared in the protocol - connecting a new MCP means editing the protocol and redeploying. `onDemandMcpServers` removes that round-trip: once a source is opted in, any MCP connected to the project under that source becomes available to the agent immediately.\n\nCurrently supported for `source: remote`.\n\n### Protocol-level declaration\n\nAdd an `onDemandMcpServers:` section alongside `mcpServers:`, keyed by source. Each entry configures how the matched MCPs appear in tool lists:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\nonDemandMcpServers:\n remote:\n description: Additional connected integrations\n display: name\n execution: device # on-demand MCPs run on the agent's computer\n contextRetention:\n toolResults: { retainLast: 5 }\n```\n\nOn-demand MCP definitions also support the `execution` field. When set, all MCPs matched by that on-demand source inherit the execution mode.\n\n### Scope-level opt-in\n\nThe agent and individual `start-thread` blocks each choose whether to pick up on-demand MCPs, by listing the sources they want:\n\n```yaml\nagent:\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n\nhandlers:\n user-message:\n focused:\n block: start-thread\n mcpServers: [figma]\n # no onDemandMcpServers - this thread does NOT see on-demand MCPs\n broad:\n block: start-thread\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n```\n\n### Rules\n\n- A scope's tool list includes every **connected** MCP of any referenced source, whether or not any protocol declares that slug.\n- Undeclared namespaces inherit `description`, `display`, and `contextRetention` from the per-source entry in `onDemandMcpServers`.\n- Scopes decide independently - threads do not inherit `onDemandMcpServers` from their parent, the same rule as `mcpServers:`.\n- Tool namespaces are always the connector's slug (for example `notion__search`, `linear__create_issue`). Source keys are never namespaces.\n\nWorkers opt into on-demand MCPs the same way: through `start-thread` blocks inside `steps`. A worker without a `start-thread` that lists a source won't see on-demand MCPs of that source.\n\n## Workers\n\nWorkers can declare and use MCP servers using the same `mcpServers:` syntax. Workers resolve their own MCP connections independently - they don't inherit from a parent interactive agent.\n\n```yaml\n# Worker protocol\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nSince workers don't have a global `agent:` section, MCP servers are scoped per-thread via `start-thread` - the same way tools and skills work in workers. Remote MCP connections are project-scoped, so workers in the same project share the same OAuth connections.\n\nSee [Workers](/docs/protocol/workers) for the full worker protocol reference.\n\n## Full Example\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n shell:\n description: Shell command execution\n source: device\n display: name\n\ntools:\n set-chat-title:\n description: Set the title of the current chat.\n parameters:\n title: { type: string, description: The title to set }\n\nagent:\n model: anthropic/claude-opus-4-6\n system: system\n mcpServers: [figma, sentry, browser, filesystem, shell]\n tools: [set-chat-title]\n thinking: medium\n maxSteps: 300\n agentic: true\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n### Cloud-Only Agent\n\nAgents that only use remote MCP servers don't need `@octavus/computer`:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager # Need design tools from message 1\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n # Lazy (default) - agent activates when debugging is needed\n display: name\n\ntools:\n submit-code:\n description: Submit code to the user.\n parameters:\n code: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry]\n tools: [submit-code]\n agentic: true\n```\n",
672
+ content: "\n# MCP Servers\n\nMCP servers extend your agent with tools from external services. Define them in your protocol, and agents automatically discover and use their tools at runtime.\n\nThere are three types of MCP servers:\n\n| Source | Description | Example |\n| ---------- | ----------------------------------------------------------------------------------- | ------------------------------------- |\n| `remote` | HTTP-based MCP servers, managed by the platform | Figma, Sentry, GitHub |\n| `device` | Local MCP servers running on the agent's machine via `@octavus/computer` | Browser automation, filesystem |\n| `consumer` | Inline MCP servers defined in your server-sdk process via `createInlineMcpServer()` | Custom integrations, third-party APIs |\n\n## Defining MCP Servers\n\nMCP servers are defined in the `mcpServers:` section. The key becomes the **namespace** for all tools from that server.\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n```\n\n### Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `description` | Yes | What the MCP server provides |\n| `source` | Yes | `remote`, `device`, or `consumer` (see source types above) |\n| `display` | No | How tool calls appear in UI: `hidden`, `name`, `description`, `stream`, `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title`; applies to every tool in this namespace (hides description and arguments) |\n| `connection` | No | When to connect: `eager` or `lazy` (default: `lazy`). `remote` only. |\n| `execution` | No | Where the MCP process runs: `sandbox` (default) or `device`. `remote` only. See [Device Execution](#device-execution). |\n\n### Display Modes\n\nDisplay modes control visibility of all tool calls from the MCP server, using the same modes as [regular tools](/docs/protocol/tools#display-modes):\n\n| Mode | Behavior |\n| ------------- | -------------------------------------- |\n| `hidden` | Tool calls run silently |\n| `name` | Shows tool name while executing |\n| `description` | Shows tool description while executing |\n\n## Making MCP Servers Available\n\nLike tools, MCP servers defined in `mcpServers:` must be referenced in `agent.mcpServers` to be available:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry, browser, filesystem]\n tools: [set-chat-title]\n agentic: true\n maxSteps: 100\n```\n\n## Tool Namespacing\n\nAll MCP tools are automatically namespaced using `__` (double underscore) as a separator. The namespace comes from the `mcpServers` key.\n\nFor example, a server defined as `browser:` that exposes `navigate_page` and `click` produces:\n\n- `browser__navigate_page`\n- `browser__click`\n\nA server defined as `figma:` that exposes `get_design_context` produces:\n\n- `figma__get_design_context`\n\nThe namespace is stripped before calling the MCP server - the server receives the original tool name. This convention matches Anthropic's MCP integration in Claude Desktop and ensures tool names stay unique across servers.\n\n### What the LLM Sees\n\nWhen an agent has both regular tools and MCP servers configured, the LLM sees all tools combined:\n\n```\nProtocol tools:\n set-chat-title\n\nRemote MCP tools (auto-discovered):\n figma__get_design_context\n figma__get_screenshot\n sentry__get_issues\n sentry__get_issue_details\n\nDevice MCP tools (auto-discovered):\n browser__navigate_page\n browser__click\n browser__take_snapshot\n filesystem__read_file\n filesystem__write_file\n filesystem__list_directory\n\nConsumer MCP tools (provided by the server-sdk):\n github__get-pr-overview\n github__list-issues\n```\n\nYou don't define individual MCP tool schemas in the protocol - remote and device tools are auto-discovered from each MCP server at runtime, and consumer tools are supplied by the server-sdk.\n\n## Remote MCP Servers\n\nRemote MCP servers (`source: remote`) connect to HTTP-based MCP endpoints. The platform manages the connection, authentication, and tool discovery.\n\nConfiguration happens in the Octavus platform UI:\n\n1. Add an MCP server to your project (URL + authentication)\n2. The server's slug must match the namespace in your protocol\n3. The platform connects, discovers tools, and makes them available to the agent\n\n### Connection Modes\n\nThe `connection` field controls when the platform connects to a remote MCP server:\n\n| Mode | Behavior |\n| ------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `lazy` | (default) The agent activates integrations on demand at runtime. The agent starts responding immediately. |\n| `eager` | The platform connects and discovers tools before the first LLM request. Tools are guaranteed available from message 1. |\n\n```yaml\nmcpServers:\n sentry:\n source: remote\n connection: eager # Always connected upfront\n display: name\n\n notion:\n source: remote\n # connection defaults to lazy - agent activates when needed\n display: description\n```\n\nWith **lazy connection** (the default), the agent receives two built-in tools - one for listing available integrations and one for activating them. The agent decides which integrations it needs based on the conversation and activates them on demand. This avoids paying connection latency for integrations the agent doesn't end up using.\n\nWith **eager connection**, the platform connects to the MCP server before the first LLM request, exactly like a declared tool. Use this when the agent needs the MCP's tools from the very first message.\n\nThe `connection` field is only valid on `source: remote` - device MCPs (`source: device`) have their own connection mechanism through the server-sdk. The `connection` field is respected for remote MCPs with `execution: device` the same way as sandbox MCPs.\n\n### Authentication\n\nRemote MCP servers support multiple authentication methods:\n\n| Auth Type | Description |\n| --------- | ------------------------------- |\n| MCP OAuth | Standard MCP OAuth flow |\n| API Key | Static API key sent as a header |\n| Bearer | Bearer token authentication |\n| None | No authentication required |\n\nAuthentication is configured per-project - different projects can connect to the same MCP server with different credentials.\n\n## Device Execution\n\nThe `execution` field controls where a remote MCP server's STDIO process runs. By default (`execution: sandbox`), the process runs in the platform's sandbox. When set to `execution: device`, the STDIO process runs on the agent's computer (VM or desktop) instead.\n\n```yaml\nmcpServers:\n code-tools:\n description: Code analysis and refactoring tools\n source: remote\n execution: device # STDIO process runs on the agent's computer\n display: name\n\n sentry:\n description: Error tracking\n source: remote\n # execution defaults to sandbox - runs in the platform\n display: name\n```\n\n### When to Use\n\nUse `execution: device` when the MCP server needs access to the agent's local environment - for example, tools that read from the local filesystem, interact with running processes, or need CLIs installed on the device.\n\n### Rules\n\n- `execution` is only meaningful for `source: remote` MCPs that use STDIO transport. HTTP-transport remote MCPs always connect from the platform regardless of the `execution` setting.\n- `execution` is **invalid** on `source: device` and `source: consumer` MCPs - they already run outside the platform. Using it produces a validation error.\n- The `connection` field (`eager` or `lazy`) is respected for device-executed MCPs the same way as sandbox-executed MCPs.\n\n## Device MCP Servers\n\nDevice MCP servers (`source: device`) run on the consumer's machine. The consumer provides the MCP tools via the `@octavus/computer` package (or any `ToolProvider` implementation) through the server-sdk.\n\nWhen an agent has device MCP servers:\n\n1. The consumer creates a `Computer` with matching namespaces\n2. `@octavus/computer` discovers tools from each MCP server\n3. Tool schemas are sent to the platform via the server-sdk\n4. Tool calls flow back to the consumer for execution\n\nSee [`@octavus/computer`](/docs/server-sdk/computer) for the full integration guide.\n\n### Namespace Matching\n\nThe `mcpServers` keys in the protocol must match the keys in the consumer's `Computer` configuration:\n\n```yaml\n# protocol.yaml\nmcpServers:\n browser: # \u2190 must match\n source: device\n filesystem: # \u2190 must match\n source: device\n```\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n },\n});\n```\n\nIf the consumer provides a namespace not declared in the protocol, the platform rejects it.\n\n## Consumer MCP Servers\n\nConsumer MCP servers (`source: consumer`) are defined inline in your server-sdk process. Tool schemas and handlers live in your code; the platform learns the namespace from the protocol and routes tool calls to your process via the same `dynamicToolSchemas` channel that device MCPs use.\n\n```yaml\nmcpServers:\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n\nagent:\n mcpServers:\n - github\n```\n\nThe protocol declaration is intentionally minimal - the SDK supplies tool names and JSON schemas at runtime, so adding or evolving tools doesn't require a protocol change.\n\nUse consumer MCPs when:\n\n- The integration's credentials should never reach the platform (OAuth tokens, customer API keys).\n- You want to group an integration's tools (`github__list-prs`, `github__get-issue`) without enumerating each one in YAML.\n- You want type-safe handler arguments via Zod schemas.\n\nSee [`createInlineMcpServer`](/docs/server-sdk/inline-mcp) in the server-sdk reference for the full implementation guide.\n\n### Namespace Matching\n\nThe protocol namespace must match the namespace passed to `createInlineMcpServer()`:\n\n```yaml\n# protocol.yaml\nmcpServers:\n github: # \u2190 must match\n source: consumer\n```\n\n```typescript\nconst github = createInlineMcpServer('github', {\n /* tools... */\n});\n\nsession = client.agentSessions.attach(sessionId, {\n mcpServers: [github],\n});\n```\n\nIf the SDK provides a namespace not declared in the protocol, those tools are filtered out at the runtime boundary and the LLM never sees them.\n\n## Thread-Level Scoping\n\nThreads can scope which MCP servers are available, the same way they scope [tools](/docs/protocol/handlers#start-thread):\n\n```yaml\nhandlers:\n user-message:\n Start research:\n block: start-thread\n thread: research\n mcpServers: [figma, browser]\n tools: [set-chat-title]\n system: research-prompt\n```\n\nThis thread can use Figma and browser tools, but not sentry or filesystem - even if those are available on the main agent.\n\n## On-Demand MCP Servers\n\nBy default, an agent can only call MCP tools whose namespace is listed in `mcpServers`. With `onDemandMcpServers`, a scope can opt into **every connected MCP of a given source** at runtime, without enumerating each one in the protocol.\n\nRemote MCPs are connected at the project level from the Octavus dashboard. Normally, each connected MCP that the agent should be able to use has to be declared in the protocol - connecting a new MCP means editing the protocol and redeploying. `onDemandMcpServers` removes that round-trip: once a source is opted in, any MCP connected to the project under that source becomes available to the agent immediately.\n\nCurrently supported for `source: remote`.\n\n### Protocol-level declaration\n\nAdd an `onDemandMcpServers:` section alongside `mcpServers:`, keyed by source. Each entry configures how the matched MCPs appear in tool lists:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\nonDemandMcpServers:\n remote:\n description: Additional connected integrations\n display: name\n execution: device # on-demand MCPs run on the agent's computer\n contextRetention:\n toolResults: { retainLast: 5 }\n```\n\nOn-demand MCP definitions also support the `execution` field. When set, all MCPs matched by that on-demand source inherit the execution mode.\n\n### Scope-level opt-in\n\nThe agent and individual `start-thread` blocks each choose whether to pick up on-demand MCPs, by listing the sources they want:\n\n```yaml\nagent:\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n\nhandlers:\n user-message:\n focused:\n block: start-thread\n mcpServers: [figma]\n # no onDemandMcpServers - this thread does NOT see on-demand MCPs\n broad:\n block: start-thread\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n```\n\n### Rules\n\n- A scope's tool list includes every **connected** MCP of any referenced source, whether or not any protocol declares that slug.\n- Undeclared namespaces inherit `description`, `display`, and `contextRetention` from the per-source entry in `onDemandMcpServers`.\n- Scopes decide independently - threads do not inherit `onDemandMcpServers` from their parent, the same rule as `mcpServers:`.\n- Tool namespaces are always the connector's slug (for example `notion__search`, `linear__create_issue`). Source keys are never namespaces.\n\nWorkers opt into on-demand MCPs the same way: through `start-thread` blocks inside `steps`. A worker without a `start-thread` that lists a source won't see on-demand MCPs of that source.\n\n## Workers\n\nWorkers can declare and use MCP servers using the same `mcpServers:` syntax. Workers resolve their own MCP connections independently - they don't inherit from a parent interactive agent.\n\n```yaml\n# Worker protocol\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nSince workers don't have a global `agent:` section, MCP servers are scoped per-thread via `start-thread` - the same way tools and skills work in workers. Remote MCP connections are project-scoped, so workers in the same project share the same OAuth connections.\n\nSee [Workers](/docs/protocol/workers) for the full worker protocol reference.\n\n## Full Example\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n shell:\n description: Shell command execution\n source: device\n display: name\n\ntools:\n set-chat-title:\n description: Set the title of the current chat.\n parameters:\n title: { type: string, description: The title to set }\n\nagent:\n model: anthropic/claude-opus-4-6\n system: system\n mcpServers: [figma, sentry, browser, filesystem, shell]\n tools: [set-chat-title]\n thinking: medium\n maxSteps: 300\n agentic: true\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n### Cloud-Only Agent\n\nAgents that only use remote MCP servers don't need `@octavus/computer`:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager # Need design tools from message 1\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n # Lazy (default) - agent activates when debugging is needed\n display: name\n\ntools:\n submit-code:\n description: Submit code to the user.\n parameters:\n code: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry]\n tools: [submit-code]\n agentic: true\n```\n",
673
673
  excerpt: "MCP Servers MCP servers extend your agent with tools from external services. Define them in your protocol, and agents automatically discover and use their tools at runtime. There are three types of...",
674
674
  order: 13
675
675
  },
@@ -744,6 +744,33 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
744
744
  content: "\n# Socket Chat Example\n\nThis example builds a chat interface using SockJS for bidirectional communication. Use this pattern for Meteor, Phoenix, or when you need custom real-time events.\n\n## What You're Building\n\nA chat interface that:\n\n- Uses SockJS for real-time streaming\n- Manages sessions server-side (client doesn't need sessionId)\n- Supports custom events alongside chat\n- Works with frameworks that use WebSocket-like transports\n\n## Architecture\n\n```mermaid\nflowchart LR\n Browser[\"Browser<br/>(React)\"] <-->|\"SockJS\"| Server[\"Your Server<br/>(Express)\"]\n Server --> Platform[\"Octavus Platform\"]\n```\n\n**Key difference from HTTP:** The server maintains a persistent socket connection and manages sessions internally. The client never needs to know about `sessionId`.\n\n## Prerequisites\n\n- Express (or similar Node.js server)\n- React frontend\n- `sockjs` (server) and `sockjs-client` (client)\n- Octavus account with API key\n\n## Step 1: Install Dependencies\n\n**Server:**\n\n```bash\nnpm install @octavus/server-sdk sockjs express\nnpm install -D @types/sockjs @types/express\n```\n\n**Client:**\n\n```bash\nnpm install @octavus/react sockjs-client\nnpm install -D @types/sockjs-client\n```\n\n## Step 2: Configure Environment\n\n```bash\n# .env\nOCTAVUS_API_URL=https://octavus.ai\nOCTAVUS_API_KEY=your-api-key\nOCTAVUS_AGENT_ID=your-agent-id\n```\n\n## Step 3: Create the Octavus Client (Server)\n\n```typescript\n// server/octavus/client.ts\nimport { OctavusClient } from '@octavus/server-sdk';\n\nexport const octavus = new OctavusClient({\n baseUrl: process.env.OCTAVUS_API_URL!,\n apiKey: process.env.OCTAVUS_API_KEY!,\n});\n\nexport const AGENT_ID = process.env.OCTAVUS_AGENT_ID!;\n```\n\n## Step 4: Create the Socket Handler (Server)\n\nThis is the core of socket integration. Each connection gets its own session:\n\n```typescript\n// server/octavus/socket-handler.ts\nimport type { Connection } from 'sockjs';\nimport { OctavusClient, type AgentSession, type SocketMessage } from '@octavus/server-sdk';\n\nconst octavus = new OctavusClient({\n baseUrl: process.env.OCTAVUS_API_URL!,\n apiKey: process.env.OCTAVUS_API_KEY!,\n});\n\nconst AGENT_ID = process.env.OCTAVUS_AGENT_ID!;\n\nexport function createSocketHandler() {\n return (conn: Connection) => {\n let session: AgentSession | null = null;\n\n const send = (data: unknown) => conn.write(JSON.stringify(data));\n\n conn.on('data', (rawData: string) => {\n void handleMessage(rawData);\n });\n\n async function handleMessage(rawData: string) {\n const msg = JSON.parse(rawData);\n\n // Handle trigger, continue, and stop messages\n if (msg.type === 'trigger' || msg.type === 'continue' || msg.type === 'stop') {\n // Create session lazily on first trigger\n if (!session && msg.type === 'trigger') {\n const sessionId = await octavus.agentSessions.create(AGENT_ID, {\n COMPANY_NAME: 'Acme Corp',\n });\n\n session = octavus.agentSessions.attach(sessionId, {\n tools: {\n 'get-user-account': async () => {\n return { name: 'Demo User', plan: 'pro' };\n },\n 'create-support-ticket': async () => {\n return { ticketId: 'TKT-123', estimatedResponse: '24h' };\n },\n },\n });\n }\n\n if (!session) return;\n\n // handleSocketMessage manages abort controller internally\n await session.handleSocketMessage(msg as SocketMessage, {\n onEvent: send,\n });\n }\n }\n\n conn.on('close', () => {});\n };\n}\n```\n\n## Step 5: Set Up the Express Server\n\n```typescript\n// server/index.ts\nimport express from 'express';\nimport http from 'http';\nimport sockjs from 'sockjs';\nimport { createSocketHandler } from './octavus/socket-handler';\n\nconst app = express();\nconst server = http.createServer(app);\n\n// Create SockJS server\nconst sockServer = sockjs.createServer({\n prefix: '/octavus',\n log: () => {}, // Silence logs\n});\n\n// Attach handler\nsockServer.on('connection', createSocketHandler());\nsockServer.installHandlers(server);\n\n// Serve your frontend\napp.use(express.static('dist/client'));\n\nserver.listen(3001, () => {\n console.log('Server running on http://localhost:3001');\n});\n```\n\n## Step 6: Create the Socket Hook (Client)\n\n```typescript\n// src/hooks/useOctavusSocket.ts\nimport { useEffect, useMemo } from 'react';\nimport SockJS from 'sockjs-client';\nimport { useOctavusChat, createSocketTransport, type SocketLike } from '@octavus/react';\n\nfunction connectSocket(): Promise<SocketLike> {\n return new Promise((resolve, reject) => {\n const sock = new SockJS('/octavus');\n\n sock.onopen = () => resolve(sock);\n sock.onerror = () => reject(new Error('Connection failed'));\n });\n}\n\nexport function useOctavusSocket() {\n // Transport is stable - empty deps because server manages sessions\n const transport = useMemo(() => createSocketTransport({ connect: connectSocket }), []);\n\n const {\n messages,\n status,\n error,\n send,\n stop,\n // Socket-specific connection state\n connectionState,\n connectionError,\n connect,\n disconnect,\n } = useOctavusChat({\n transport,\n onError: (err) => console.error('Chat error:', err),\n });\n\n // Eagerly connect for UI status indicator\n useEffect(() => {\n connect?.();\n return () => disconnect?.();\n }, [connect, disconnect]);\n\n const sendMessage = async (message: string) => {\n await send('user-message', { USER_MESSAGE: message }, { userMessage: { content: message } });\n };\n\n return { messages, status, error, connectionState, connectionError, sendMessage, stop };\n}\n```\n\n## Step 7: Build the Chat Component\n\n```tsx\n// src/components/Chat.tsx\nimport { useState } from 'react';\nimport { useOctavusSocket } from '../hooks/useOctavusSocket';\n\nfunction ConnectionIndicator({ state }: { state: string | undefined }) {\n const colors: Record<string, string> = {\n connected: 'bg-green-500',\n connecting: 'bg-yellow-500',\n error: 'bg-red-500',\n disconnected: 'bg-gray-400',\n };\n const color = colors[state ?? 'disconnected'];\n return <div className={`w-2 h-2 rounded-full ${color}`} title={state} />;\n}\n\nexport function Chat() {\n const [inputValue, setInputValue] = useState('');\n const { messages, status, connectionState, sendMessage, stop } = useOctavusSocket();\n\n const handleSubmit = async (e: React.FormEvent) => {\n e.preventDefault();\n if (!inputValue.trim() || status === 'streaming') return;\n\n const message = inputValue.trim();\n setInputValue('');\n await sendMessage(message);\n };\n\n return (\n <div className=\"flex flex-col h-screen\">\n {/* Header with connection status */}\n <div className=\"p-4 border-b flex items-center justify-between\">\n <h1 className=\"font-semibold\">AI Chat</h1>\n <ConnectionIndicator state={connectionState} />\n </div>\n\n {/* Messages */}\n <div className=\"flex-1 overflow-y-auto p-4 space-y-4\">\n {messages.map((msg) => (\n <div key={msg.id} className={msg.role === 'user' ? 'text-right' : 'text-left'}>\n <div\n className={`inline-block p-3 rounded-lg ${\n msg.role === 'user' ? 'bg-blue-500 text-white' : 'bg-gray-100'\n }`}\n >\n {msg.parts.map((part, i) => {\n if (part.type === 'text') return <p key={i}>{part.text}</p>;\n return null;\n })}\n </div>\n </div>\n ))}\n </div>\n\n {/* Input */}\n <form onSubmit={handleSubmit} className=\"p-4 border-t flex gap-2\">\n <input\n type=\"text\"\n value={inputValue}\n onChange={(e) => setInputValue(e.target.value)}\n placeholder=\"Type a message...\"\n className=\"flex-1 px-4 py-2 border rounded-lg\"\n disabled={status === 'streaming'}\n />\n {status === 'streaming' ? (\n <button\n type=\"button\"\n onClick={stop}\n className=\"px-4 py-2 bg-red-500 text-white rounded-lg\"\n >\n Stop\n </button>\n ) : (\n <button type=\"submit\" className=\"px-4 py-2 bg-blue-500 text-white rounded-lg\">\n Send\n </button>\n )}\n </form>\n </div>\n );\n}\n```\n\n## Custom Events\n\nSocket transport supports custom events alongside Octavus events:\n\n```typescript\n// Client - handle custom events\nconst transport = useMemo(\n () =>\n createSocketTransport({\n connect: connectSocket,\n onMessage: (data) => {\n const msg = data as { type: string; [key: string]: unknown };\n\n if (msg.type === 'typing-indicator') {\n setAgentTyping(msg.isTyping as boolean);\n }\n\n if (msg.type === 'custom-notification') {\n showToast(msg.message as string);\n }\n\n // Octavus events (text-delta, finish, etc.) are handled automatically\n },\n }),\n [],\n);\n```\n\n```typescript\n// Server - send custom events\nconn.write(\n JSON.stringify({\n type: 'typing-indicator',\n isTyping: true,\n }),\n);\n```\n\n## Protocol Integration\n\n### Messages\n\nThe socket handler receives messages and forwards them to Octavus:\n\n```typescript\n// Client sends trigger:\n{ type: 'trigger', triggerName: 'user-message', input: { USER_MESSAGE: 'Hello' } }\n\n// Client sends continuation (after client tool handling):\n{ type: 'continue', executionId: '...', toolResults: [...] }\n\n// Client sends stop:\n{ type: 'stop' }\n\n// Server handles all three with handleSocketMessage:\nawait session.handleSocketMessage(msg, {\n onEvent: (event) => conn.write(JSON.stringify(event)),\n});\n```\n\n### Tools\n\nTools are defined in your agent's protocol. Server-side tools have handlers, client-side tools don't:\n\n```yaml\n# protocol.yaml\ntools:\n get-user-account:\n description: Fetch user details\n parameters:\n userId:\n type: string\n\n get-browser-location:\n description: Get user's location from browser\n # No server handler - handled on client\n```\n\n```typescript\n// Server tool handlers (only for server tools)\ntools: {\n 'get-user-account': async (args) => {\n const userId = args.userId as string;\n return await db.users.find(userId);\n },\n // get-browser-location has no handler - forwarded to client\n}\n```\n\nSee [Client Tools](/docs/client-sdk/client-tools) for handling tools on the frontend.\n\n## Meteor Integration Note\n\nMeteor's bundler may have issues with ES6 imports of `sockjs-client`:\n\n```typescript\n// Use require() instead of import\nconst SockJS: typeof import('sockjs-client') = require('sockjs-client');\n```\n\n## Next Steps\n\n- [Socket Transport](/docs/client-sdk/socket-transport) - Advanced socket patterns\n- [Protocol Overview](/docs/protocol/overview) - Define agent behavior\n- [Tools](/docs/protocol/tools) - Building tool handlers\n",
745
745
  excerpt: "Socket Chat Example This example builds a chat interface using SockJS for bidirectional communication. Use this pattern for Meteor, Phoenix, or when you need custom real-time events. What You're...",
746
746
  order: 3
747
+ },
748
+ {
749
+ slug: "workforce-agents/overview",
750
+ section: "workforce-agents",
751
+ title: "Overview",
752
+ description: "What Workforce Agents are and how to drive them programmatically.",
753
+ content: "\n# Workforce Agents\n\nWorkforce Agents are Octavus's autonomous AI teammates - specialized agents you hire and configure in the dashboard, each with its own computer, skills, and tools. Browse and hire them at [octavus.ai/agents](https://octavus.ai/agents), and see the full roster at [octavus.ai/agents/discover](https://octavus.ai/agents/discover).\n\nThe **Workforce Agents API** lets you drive one of your agents without a browser: give it a task, wait for it to finish, and read what it did. It is built for automation - CI pipelines, scheduled jobs, and backend integrations that hand work to an agent and collect the result.\n\n> Workforce Agents are distinct from agents you build yourself with the SDK. The [Sessions API](/docs/api-reference/sessions) drives agents you define and host; the Workforce Agents API drives the managed agents you hire and configure in the dashboard.\n\n## How it works\n\nA run follows a simple dispatch-then-poll model - there is no stream or webhook to manage:\n\n1. **Dispatch** a message to an agent. This starts a **thread** (a conversation) and returns a `threadId` right away.\n2. **Poll** the thread until its `status` is terminal.\n3. **Read** the thread's messages to get the agent's work.\n\n```mermaid\nsequenceDiagram\n participant C as Your code\n participant A as Workforce Agent\n C->>A: dispatch a task\n A-->>C: threadId and status\n loop until terminal status\n C->>A: get thread\n A-->>C: status and messages\n end\n```\n\n### Thread status\n\n| Status | Meaning |\n| ----------- | ----------------------------------------------------------------------------------- |\n| `pending` | Dispatched, waiting to start |\n| `queued` | Waiting for the agent to free up - an agent runs one task at a time on its computer |\n| `running` | The agent is working |\n| `completed` | Finished successfully (terminal) |\n| `failed` | Ended with an error - see `failureReason` (terminal) |\n| `cancelled` | Stopped (terminal) |\n\nPoll while the status is `pending`, `queued`, or `running`, and stop once it is `completed`, `failed`, or `cancelled`.\n\n## Authentication\n\nEach agent has its own **API key**, created from the agent's settings in the dashboard (Settings -> API). The key:\n\n- Drives only that one agent - a key for one agent can never call another.\n- Is a secret. Use it from a backend, script, or CI, never in a browser or client app.\n- Is sent as a bearer token: `Authorization: Bearer oct_agt_...`.\n\nCreate a key, copy it once (it is shown only at creation), and store it securely.\n\n## Next steps\n\n- [Using the SDK](/docs/workforce-agents/sdk) - the `@octavus/server-sdk` `workforce` client, including a run-and-wait helper.\n- [API reference](/docs/workforce-agents/api-reference) - the REST endpoints, for any language.\n",
754
+ excerpt: "Workforce Agents Workforce Agents are Octavus's autonomous AI teammates - specialized agents you hire and configure in the dashboard, each with its own computer, skills, and tools. Browse and hire...",
755
+ order: 1
756
+ },
757
+ {
758
+ slug: "workforce-agents/sdk",
759
+ section: "workforce-agents",
760
+ title: "Using the SDK",
761
+ description: "Drive a Workforce Agent with the @octavus/server-sdk workforce client.",
762
+ content: "\n# Using the SDK\n\n`@octavus/server-sdk` ships a `workforce` client for driving an agent with its per-agent key.\n\n## Install\n\n```bash\nnpm install @octavus/server-sdk@5.1.0\n```\n\n## Set up the client\n\nConstruct `OctavusClient` with your agent's API key:\n\n```ts\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: process.env.OCTAVUS_AGENT_KEY, // oct_agt_...\n});\n```\n\n## Run a task and wait for the result\n\n`run()` is the all-in-one method: it dispatches a message, polls until the run finishes, and returns the completed thread.\n\n```ts\nconst thread = await client.workforce.run(agentId, 'Summarize the latest sales report');\n\nconsole.log(thread.status); // 'completed' | 'failed' | 'cancelled'\nconsole.log(thread.messages); // the full conversation, including the agent's reply\n```\n\nThe agent's latest turn is at the end of `thread.messages`.\n\n## Dispatch and poll manually\n\nFor more control, dispatch and read the thread yourself:\n\n```ts\nconst { threadId } = await client.workforce.dispatch(agentId, 'Research our top 3 competitors');\n\nconst thread = await client.workforce.getThread(agentId, threadId);\nif (thread.status === 'completed') {\n // ...\n}\n```\n\n`waitForCompletion()` does the polling loop for you until the thread is terminal:\n\n```ts\nconst { threadId } = await client.workforce.dispatch(agentId, 'Draft the Q3 board update');\nconst thread = await client.workforce.waitForCompletion(agentId, threadId);\n```\n\n## Follow up in the same thread\n\nContinue a thread with another message. It runs after the current turn finishes.\n\n```ts\nawait client.workforce.followUp(agentId, threadId, 'Now turn that into a slide deck');\nconst thread = await client.workforce.waitForCompletion(agentId, threadId);\n```\n\n## Options\n\n`run()` and `waitForCompletion()` accept polling options. Full runs can take several minutes, so the defaults are generous.\n\n| Option | Type | Default | Description |\n| ---------------- | ----------- | -------- | --------------------------------------------- |\n| `pollIntervalMs` | number | `3000` | Delay between status checks |\n| `timeoutMs` | number | `900000` | Max time to wait before throwing (15 minutes) |\n| `signal` | AbortSignal | - | Cancel the wait early |\n\n`dispatch()`, `followUp()`, and `run()` also accept `files` (an array of `FileReference`) to attach hosted files to the message.\n\n```ts\nconst thread = await client.workforce.run(agentId, 'Review this spec', {\n timeoutMs: 30 * 60 * 1000, // 30 minutes\n pollIntervalMs: 5000,\n});\n```\n\nIf the timeout elapses first, `waitForCompletion()` and `run()` throw. The run keeps going on the server, so you can read it later with `getThread()`.\n\n## Result shape\n\n`getThread()`, `waitForCompletion()`, and `run()` return a thread:\n\n| Field | Type | Description |\n| --------------- | -------------- | -------------------------------------------------------------------------------------------------------------- |\n| `threadId` | string | The thread identifier |\n| `status` | string | `idle`, `queued`, `pending`, `running`, `completed`, `failed`, or `cancelled` |\n| `failureReason` | string \\| null | Why the run failed, when `status` is `failed` |\n| `messages` | UIMessage[] | The conversation - text, tool and skill steps, and files (see [UIMessage parts](/docs/api-reference/sessions)) |\n\nUse `isTerminalThreadStatus(status)` to check whether a run has finished.\n\nThe same operations are available as plain HTTP - see the [API reference](/docs/workforce-agents/api-reference).\n",
763
+ excerpt: "Using the SDK ships a client for driving an agent with its per-agent key. Install Set up the client Construct with your agent's API key: Run a task and wait for the result is the all-in-one...",
764
+ order: 2
765
+ },
766
+ {
767
+ slug: "workforce-agents/api-reference",
768
+ section: "workforce-agents",
769
+ title: "API Reference",
770
+ description: "REST endpoints for the Workforce Agents API.",
771
+ content: '\n# Workforce Agents API\n\nDrive a single agent over HTTP. Every request is authenticated with that agent\'s API key, sent as a bearer token:\n\n```\nAuthorization: Bearer oct_agt_...\n```\n\nAll endpoints are scoped to one agent through the `{agentId}` path segment - find the agent ID in the agent\'s page URL in the dashboard. A key only works for the agent it was created for.\n\n## Start a thread\n\nDispatch a message to the agent. This starts a new thread and returns immediately.\n\n```\nPOST /api/v1/workforce/agents/{agentId}/threads\n```\n\n### Request Body\n\n```json\n{\n "message": "Summarize the latest sales report"\n}\n```\n\n| Field | Type | Required | Description |\n| --------- | --------------- | -------- | --------------------------------- |\n| `message` | string | Yes | The task or message for the agent |\n| `files` | FileReference[] | No | Hosted file attachments |\n\n### Response\n\nReturns `201`.\n\n```json\n{\n "threadId": "cm5xyz123abc456def",\n "status": "pending"\n}\n```\n\nPoll the [Get a thread](#get-a-thread) endpoint until the status is terminal.\n\n### Example\n\n```bash\ncurl -X POST https://octavus.ai/api/v1/workforce/agents/AGENT_ID/threads \\\n -H "Authorization: Bearer oct_agt_..." \\\n -H "Content-Type: application/json" \\\n -d \'{ "message": "Summarize the latest sales report" }\'\n```\n\n## Get a thread\n\nRead a thread\'s status and messages. Poll this until the run finishes.\n\n```\nGET /api/v1/workforce/agents/{agentId}/threads/{threadId}\n```\n\n### Response\n\n```json\n{\n "threadId": "cm5xyz123abc456def",\n "status": "completed",\n "failureReason": null,\n "messages": []\n}\n```\n\n| Field | Type | Description |\n| --------------- | -------------- | ----------------------------------------------------------------------------- |\n| `threadId` | string | The thread identifier |\n| `status` | string | `idle`, `queued`, `pending`, `running`, `completed`, `failed`, or `cancelled` |\n| `failureReason` | string \\| null | Why the run failed, when `status` is `failed` |\n| `messages` | UIMessage[] | The conversation - see [UIMessage parts](/docs/api-reference/sessions) |\n\nKeep polling while the status is `pending`, `queued`, or `running`. Stop when it is `completed`, `failed`, or `cancelled`.\n\n### Example\n\n```bash\ncurl https://octavus.ai/api/v1/workforce/agents/AGENT_ID/threads/THREAD_ID \\\n -H "Authorization: Bearer oct_agt_..."\n```\n\n## Follow up in a thread\n\nSend another message into an existing thread. If the agent is still working the message runs after the current turn finishes; otherwise it starts immediately.\n\n```\nPOST /api/v1/workforce/agents/{agentId}/threads/{threadId}/messages\n```\n\n### Request Body\n\n```json\n{\n "message": "Now turn that into a slide deck"\n}\n```\n\n| Field | Type | Required | Description |\n| --------- | --------------- | -------- | ----------------------- |\n| `message` | string | Yes | The follow-up message |\n| `files` | FileReference[] | No | Hosted file attachments |\n\n### Response\n\nReturns `202`.\n\n```json\n{\n "threadId": "cm5xyz123abc456def",\n "status": "running"\n}\n```\n\nPoll [Get a thread](#get-a-thread) for the new run\'s result.\n\n### Example\n\n```bash\ncurl -X POST https://octavus.ai/api/v1/workforce/agents/AGENT_ID/threads/THREAD_ID/messages \\\n -H "Authorization: Bearer oct_agt_..." \\\n -H "Content-Type: application/json" \\\n -d \'{ "message": "Now turn that into a slide deck" }\'\n```\n\n## Errors\n\nErrors return `{ "error": string, "code": string }` with an HTTP status:\n\n| Status | Meaning |\n| ------ | ------------------------------------------------- |\n| `401` | Missing or invalid API key |\n| `402` | The agent is blocked by a usage or spending limit |\n| `403` | The key is not authorized for this agent |\n| `404` | The thread does not exist for this agent |\n',
772
+ excerpt: "Workforce Agents API Drive a single agent over HTTP. Every request is authenticated with that agent's API key, sent as a bearer token: All endpoints are scoped to one agent through the path segment...",
773
+ order: 3
747
774
  }
748
775
  ];
749
776
 
@@ -786,7 +813,7 @@ var sections_default = [
786
813
  section: "server-sdk",
787
814
  title: "Overview",
788
815
  description: "Introduction to the Octavus Server SDK for backend integration.",
789
- content: "\n# Server SDK Overview\n\nThe `@octavus/server-sdk` package provides a Node.js SDK for integrating Octavus agents into your backend application. It handles session management, streaming, and the tool execution continuation loop.\n\n**Current version:** `5.0.0`\n\n## Installation\n\n```bash\nnpm install @octavus/server-sdk\n```\n\nFor agent management (sync, validate), install the CLI as a dev dependency:\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Basic Usage\n\n```typescript\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n```\n\n## Key Features\n\n### Agent Management\n\nAgent definitions are managed via the CLI. See the [CLI documentation](/docs/server-sdk/cli) for details.\n\n```bash\n# Sync agent from local files\noctavus sync ./agents/support-chat\n\n# Output: Created: support-chat\n# Agent ID: clxyz123abc456\n```\n\n### Session Management\n\nCreate and manage agent sessions using the agent ID:\n\n```typescript\n// Create a new session (use agent ID from CLI sync)\nconst sessionId = await client.agentSessions.create('clxyz123abc456', {\n COMPANY_NAME: 'Acme Corp',\n PRODUCT_NAME: 'Widget Pro',\n});\n\n// Get UI-ready session messages (for session restore)\nconst session = await client.agentSessions.getMessages(sessionId);\n```\n\n### Tool Handlers\n\nTools run on your server with your data:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'get-user-account': async (args) => {\n // Access your database, APIs, etc.\n return await db.users.findById(args.userId);\n },\n },\n});\n```\n\n### Streaming\n\nAll responses stream in real-time:\n\n```typescript\nimport { toSSEStream } from '@octavus/server-sdk';\n\n// execute() returns an async generator of events\nconst events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Hello!' },\n});\n\n// Convert to SSE stream for HTTP responses\nreturn new Response(toSSEStream(events), {\n headers: { 'Content-Type': 'text/event-stream' },\n});\n```\n\n### Computer Capabilities\n\nGive agents access to browser, filesystem, and shell via MCP:\n\n```typescript\nimport { Computer } from '@octavus/computer';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n shell: Computer.shell({ cwd: dir, mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\n### Workers\n\nExecute worker agents for task-based processing:\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(agentId, {\n TOPIC: 'AI safety',\n});\n\n// Streaming: observe events in real-time\nfor await (const event of client.workers.execute(agentId, input)) {\n // Handle stream events\n}\n```\n\n## API Reference\n\n### OctavusClient\n\nThe main entry point for interacting with Octavus.\n\n```typescript\ninterface OctavusClientConfig {\n baseUrl: string; // Octavus API URL\n apiKey?: string; // Your API key\n traceModelRequests?: boolean; // Enable model request tracing (default: false)\n maxRetries?: number; // Retries for transient network failures during streaming (default: 2, set to 0 to disable)\n}\n\nclass OctavusClient {\n readonly agents: AgentsApi;\n readonly agentSessions: AgentSessionsApi;\n readonly workers: WorkersApi;\n readonly files: FilesApi;\n\n constructor(config: OctavusClientConfig);\n}\n```\n\n### AgentSessionsApi\n\nManages agent sessions.\n\n```typescript\nclass AgentSessionsApi {\n // Create a new session\n async create(agentId: string, input?: Record<string, unknown>): Promise<string>;\n\n // Get full session state (for debugging/internal use)\n async get(sessionId: string): Promise<SessionState>;\n\n // Get UI-ready messages (for client display)\n async getMessages(sessionId: string): Promise<UISessionState>;\n\n // Attach to a session for triggering\n attach(sessionId: string, options?: SessionAttachOptions): AgentSession;\n}\n\n// Full session state (internal format)\ninterface SessionState {\n id: string;\n agentId: string;\n input: Record<string, unknown>;\n variables: Record<string, unknown>;\n resources: Record<string, unknown>;\n messages: ChatMessage[]; // Internal message format\n createdAt: string;\n updatedAt: string;\n}\n\n// UI-ready session state\ninterface UISessionState {\n sessionId: string;\n agentId: string;\n messages: UIMessage[]; // UI-ready messages for frontend\n}\n```\n\n### AgentSession\n\nHandles request execution and streaming for a specific session.\n\n```typescript\nclass AgentSession {\n // Execute a request and stream parsed events\n execute(request: SessionRequest, options?: TriggerOptions): AsyncGenerator<StreamEvent>;\n\n // Get the session ID\n getSessionId(): string;\n\n // Register dynamic tools (e.g., pass a Computer or explicit DynamicTool[])\n setDynamicTools(source: ToolProvider | DynamicTool[]): void;\n}\n\ntype SessionRequest = TriggerRequest | ContinueRequest;\n\ninterface TriggerRequest {\n type: 'trigger';\n triggerName: string;\n input?: Record<string, unknown>;\n}\n\ninterface ContinueRequest {\n type: 'continue';\n executionId: string;\n toolResults: ToolResult[];\n}\n\n// Helper to convert events to SSE stream\nfunction toSSEStream(events: AsyncIterable<StreamEvent>): ReadableStream<Uint8Array>;\n```\n\n### FilesApi\n\nHandles file uploads for sessions.\n\n```typescript\nclass FilesApi {\n // Get presigned URLs for file uploads\n async getUploadUrls(sessionId: string, files: FileUploadRequest[]): Promise<UploadUrlsResponse>;\n}\n\ninterface FileUploadRequest {\n filename: string;\n mediaType: string;\n size: number;\n}\n\ninterface UploadUrlsResponse {\n files: {\n id: string; // File ID for references\n uploadUrl: string; // PUT to this URL\n downloadUrl: string; // GET URL after upload\n }[];\n}\n```\n\nThe client uploads files directly to S3 using the presigned upload URL. See [File Uploads](/docs/client-sdk/file-uploads) for the full integration pattern.\n\n## Next Steps\n\n- [Sessions](/docs/server-sdk/sessions) - Deep dive into session management\n- [Tools](/docs/server-sdk/tools) - Implementing tool handlers\n- [Streaming](/docs/server-sdk/streaming) - Understanding stream events\n- [Workers](/docs/server-sdk/workers) - Executing worker agents\n- [Debugging](/docs/server-sdk/debugging) - Model request tracing and debugging\n- [Computer](/docs/server-sdk/computer) - Browser, filesystem, and shell via MCP\n",
816
+ content: "\n# Server SDK Overview\n\nThe `@octavus/server-sdk` package provides a Node.js SDK for integrating Octavus agents into your backend application. It handles session management, streaming, and the tool execution continuation loop.\n\n**Current version:** `5.1.0`\n\n## Installation\n\n```bash\nnpm install @octavus/server-sdk\n```\n\nFor agent management (sync, validate), install the CLI as a dev dependency:\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Basic Usage\n\n```typescript\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n```\n\n## Key Features\n\n### Agent Management\n\nAgent definitions are managed via the CLI. See the [CLI documentation](/docs/server-sdk/cli) for details.\n\n```bash\n# Sync agent from local files\noctavus sync ./agents/support-chat\n\n# Output: Created: support-chat\n# Agent ID: clxyz123abc456\n```\n\n### Session Management\n\nCreate and manage agent sessions using the agent ID:\n\n```typescript\n// Create a new session (use agent ID from CLI sync)\nconst sessionId = await client.agentSessions.create('clxyz123abc456', {\n COMPANY_NAME: 'Acme Corp',\n PRODUCT_NAME: 'Widget Pro',\n});\n\n// Get UI-ready session messages (for session restore)\nconst session = await client.agentSessions.getMessages(sessionId);\n```\n\n### Tool Handlers\n\nTools run on your server with your data:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'get-user-account': async (args) => {\n // Access your database, APIs, etc.\n return await db.users.findById(args.userId);\n },\n },\n});\n```\n\n### Streaming\n\nAll responses stream in real-time:\n\n```typescript\nimport { toSSEStream } from '@octavus/server-sdk';\n\n// execute() returns an async generator of events\nconst events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Hello!' },\n});\n\n// Convert to SSE stream for HTTP responses\nreturn new Response(toSSEStream(events), {\n headers: { 'Content-Type': 'text/event-stream' },\n});\n```\n\n### Computer Capabilities\n\nGive agents access to browser, filesystem, and shell via MCP:\n\n```typescript\nimport { Computer } from '@octavus/computer';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n shell: Computer.shell({ cwd: dir, mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\n### Workers\n\nExecute worker agents for task-based processing:\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(agentId, {\n TOPIC: 'AI safety',\n});\n\n// Streaming: observe events in real-time\nfor await (const event of client.workers.execute(agentId, input)) {\n // Handle stream events\n}\n```\n\n## API Reference\n\n### OctavusClient\n\nThe main entry point for interacting with Octavus.\n\n```typescript\ninterface OctavusClientConfig {\n baseUrl: string; // Octavus API URL\n apiKey?: string; // Your API key\n traceModelRequests?: boolean; // Enable model request tracing (default: false)\n maxRetries?: number; // Retries for transient network failures during streaming (default: 2, set to 0 to disable)\n}\n\nclass OctavusClient {\n readonly agents: AgentsApi;\n readonly agentSessions: AgentSessionsApi;\n readonly workers: WorkersApi;\n readonly files: FilesApi;\n\n constructor(config: OctavusClientConfig);\n}\n```\n\n### AgentSessionsApi\n\nManages agent sessions.\n\n```typescript\nclass AgentSessionsApi {\n // Create a new session\n async create(agentId: string, input?: Record<string, unknown>): Promise<string>;\n\n // Get full session state (for debugging/internal use)\n async get(sessionId: string): Promise<SessionState>;\n\n // Get UI-ready messages (for client display)\n async getMessages(sessionId: string): Promise<UISessionState>;\n\n // Attach to a session for triggering\n attach(sessionId: string, options?: SessionAttachOptions): AgentSession;\n}\n\n// Full session state (internal format)\ninterface SessionState {\n id: string;\n agentId: string;\n input: Record<string, unknown>;\n variables: Record<string, unknown>;\n resources: Record<string, unknown>;\n messages: ChatMessage[]; // Internal message format\n createdAt: string;\n updatedAt: string;\n}\n\n// UI-ready session state\ninterface UISessionState {\n sessionId: string;\n agentId: string;\n messages: UIMessage[]; // UI-ready messages for frontend\n}\n```\n\n### AgentSession\n\nHandles request execution and streaming for a specific session.\n\n```typescript\nclass AgentSession {\n // Execute a request and stream parsed events\n execute(request: SessionRequest, options?: TriggerOptions): AsyncGenerator<StreamEvent>;\n\n // Get the session ID\n getSessionId(): string;\n\n // Register dynamic tools (e.g., pass a Computer or explicit DynamicTool[])\n setDynamicTools(source: ToolProvider | DynamicTool[]): void;\n}\n\ntype SessionRequest = TriggerRequest | ContinueRequest;\n\ninterface TriggerRequest {\n type: 'trigger';\n triggerName: string;\n input?: Record<string, unknown>;\n}\n\ninterface ContinueRequest {\n type: 'continue';\n executionId: string;\n toolResults: ToolResult[];\n}\n\n// Helper to convert events to SSE stream\nfunction toSSEStream(events: AsyncIterable<StreamEvent>): ReadableStream<Uint8Array>;\n```\n\n### FilesApi\n\nHandles file uploads for sessions.\n\n```typescript\nclass FilesApi {\n // Get presigned URLs for file uploads\n async getUploadUrls(sessionId: string, files: FileUploadRequest[]): Promise<UploadUrlsResponse>;\n}\n\ninterface FileUploadRequest {\n filename: string;\n mediaType: string;\n size: number;\n}\n\ninterface UploadUrlsResponse {\n files: {\n id: string; // File ID for references\n uploadUrl: string; // PUT to this URL\n downloadUrl: string; // GET URL after upload\n }[];\n}\n```\n\nThe client uploads files directly to S3 using the presigned upload URL. See [File Uploads](/docs/client-sdk/file-uploads) for the full integration pattern.\n\n## Next Steps\n\n- [Sessions](/docs/server-sdk/sessions) - Deep dive into session management\n- [Tools](/docs/server-sdk/tools) - Implementing tool handlers\n- [Streaming](/docs/server-sdk/streaming) - Understanding stream events\n- [Workers](/docs/server-sdk/workers) - Executing worker agents\n- [Debugging](/docs/server-sdk/debugging) - Model request tracing and debugging\n- [Computer](/docs/server-sdk/computer) - Browser, filesystem, and shell via MCP\n",
790
817
  excerpt: "Server SDK Overview The package provides a Node.js SDK for integrating Octavus agents into your backend application. It handles session management, streaming, and the tool execution continuation...",
791
818
  order: 1
792
819
  },
@@ -822,7 +849,7 @@ var sections_default = [
822
849
  section: "server-sdk",
823
850
  title: "CLI",
824
851
  description: "Command-line interface for validating and syncing agent definitions.",
825
- content: '\n# Octavus CLI\n\nThe `@octavus/cli` package provides a command-line interface for validating and syncing agent definitions from your local filesystem to the Octavus platform.\n\n**Current version:** `5.0.0`\n\n## Installation\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Configuration\n\nThe CLI requires an API key with the **Agents** permission.\n\n### Environment Variables\n\n| Variable | Description |\n| --------------------- | ---------------------------------------------- |\n| `OCTAVUS_CLI_API_KEY` | API key with "Agents" permission (recommended) |\n| `OCTAVUS_API_KEY` | Fallback if `OCTAVUS_CLI_API_KEY` not set |\n| `OCTAVUS_API_URL` | Optional, defaults to `https://octavus.ai` |\n\n### Two-Key Strategy (Recommended)\n\nFor production deployments, use separate API keys with minimal permissions:\n\n```bash\n# CI/CD or .env.local (not committed)\nOCTAVUS_CLI_API_KEY=oct_sk_... # "Agents" permission only\n\n# Production .env\nOCTAVUS_API_KEY=oct_sk_... # "Sessions" permission only\n```\n\nThis ensures production servers only have session permissions (smaller blast radius if leaked), while agent management is restricted to development/CI environments.\n\n### Multiple Environments\n\nUse separate Octavus projects for staging and production, each with their own API keys. The `--env` flag lets you load different environment files:\n\n```bash\n# Local development (default: .env)\noctavus sync ./agents/my-agent\n\n# Staging project\noctavus --env .env.staging sync ./agents/my-agent\n\n# Production project\noctavus --env .env.production sync ./agents/my-agent\n```\n\nExample environment files:\n\n```bash\n# .env.staging (syncs to your staging project)\nOCTAVUS_CLI_API_KEY=oct_sk_staging_project_key...\n\n# .env.production (syncs to your production project)\nOCTAVUS_CLI_API_KEY=oct_sk_production_project_key...\n```\n\nEach project has its own agents, so you\'ll get different agent IDs per environment.\n\n## Global Options\n\n| Option | Description |\n| -------------- | ------------------------------------------------------- |\n| `--env <file>` | Load environment from a specific file (default: `.env`) |\n| `--help` | Show help |\n| `--version` | Show version |\n\n## Commands\n\n### `octavus sync <path>`\n\nSync an agent definition to the platform. Creates the agent if it doesn\'t exist, or updates it if it does.\n\n```bash\noctavus sync ./agents/my-agent\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading agent from ./agents/my-agent...\n\u2139 Syncing support-chat...\n\u2713 Created: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus validate <path>`\n\nValidate an agent definition without saving. Useful for CI/CD pipelines.\n\n```bash\noctavus validate ./agents/my-agent\n```\n\n**Exit codes:**\n\n- `0` - Validation passed\n- `1` - Validation errors\n- `2` - Configuration errors (missing API key, etc.)\n\n### `octavus list`\n\nList all agents in your project.\n\n```bash\noctavus list\n```\n\n**Example output:**\n\n```\nSLUG NAME FORMAT ID\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nsupport-chat Support Chat Agent interactive clxyz123abc456\n\n1 agent(s)\n```\n\n### `octavus get <slug>`\n\nGet details about a specific agent by its slug.\n\n```bash\noctavus get support-chat\n```\n\n### `octavus archive <slug>`\n\nArchive an agent by slug (soft delete). Archived agents are removed from the active agent list and their slug is freed for reuse.\n\n```bash\noctavus archive support-chat\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Archiving support-chat...\n\u2713 Archived: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus skills sync <path>`\n\nSync a skill to the platform. Packages the skill directory into a bundle (excluding `.env` files, `.git`, and `node_modules`), uploads it, and optionally pushes secrets from the skill\'s `.env` file.\n\n```bash\noctavus skills sync ./skills/github\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading skill from ./skills/github...\n\u2139 Packaging github...\n\u2713 Created: github\n Skill ID: clxyz789def012\n\u2139 Pushing 2 secret(s)...\n\u2713 2 secret(s) updated\n```\n\n**Secret handling:**\n\nIf the skill directory contains a `.env` file, secrets are pushed alongside the bundle. Secrets are cross-validated against the `secrets` declarations in `SKILL.md` - warnings are shown for undeclared or missing required secrets.\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md\n\u251C\u2500\u2500 scripts/\n\u2502 \u2514\u2500\u2500 run.py\n\u2514\u2500\u2500 .env # Secrets (not included in bundle)\n```\n\nSee [Skills](/docs/protocol/skills) for details on skill format, secrets, and secure mode.\n\n## Agent Directory Structure\n\nThe CLI expects agent definitions in a specific directory structure:\n\n```\nmy-agent/\n\u251C\u2500\u2500 settings.json # Required: Agent metadata\n\u251C\u2500\u2500 protocol.yaml # Required: Agent protocol\n\u251C\u2500\u2500 prompts/ # Optional: Prompt templates\n\u2502 \u251C\u2500\u2500 system.md\n\u2502 \u2514\u2500\u2500 user-message.md\n\u2514\u2500\u2500 references/ # Optional: Reference documents\n \u2514\u2500\u2500 api-guidelines.md\n```\n\n### references/\n\nReference files are markdown documents with YAML frontmatter containing a `description`. The agent can fetch these on demand during execution. See [References](/docs/protocol/references) for details.\n\n### settings.json\n\n```json\n{\n "slug": "my-agent",\n "name": "My Agent",\n "description": "A helpful assistant",\n "format": "interactive"\n}\n```\n\n### protocol.yaml\n\nSee the [Protocol documentation](/docs/protocol/overview) for details on protocol syntax.\n\n## CI/CD Integration\n\n### GitHub Actions\n\n```yaml\nname: Validate and Sync Agents\n\non:\n push:\n branches: [main]\n paths:\n - \'agents/**\'\n\njobs:\n sync:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v4\n\n - uses: actions/setup-node@v4\n with:\n node-version: \'22\'\n\n - run: npm install\n\n - name: Validate agent\n run: npx octavus validate ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n\n - name: Sync agent\n run: npx octavus sync ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n```\n\n### Package.json Scripts\n\nAdd sync scripts to your `package.json`:\n\n```json\n{\n "scripts": {\n "agents:validate": "octavus validate ./agents/my-agent",\n "agents:sync": "octavus sync ./agents/my-agent"\n },\n "devDependencies": {\n "@octavus/cli": "^0.1.0"\n }\n}\n```\n\n## Workflow\n\nThe recommended workflow for managing agents:\n\n1. **Define agent locally** - Create `settings.json`, `protocol.yaml`, and prompts\n2. **Validate** - Run `octavus validate ./my-agent` to check for errors\n3. **Sync** - Run `octavus sync ./my-agent` to push to platform\n4. **Store agent ID** - Save the output ID in an environment variable\n5. **Use in app** - Read the ID from env and pass to `client.agentSessions.create()`\n\n```bash\n# After syncing: octavus sync ./agents/support-chat\n# Output: Agent ID: clxyz123abc456\n\n# Add to your .env file\nOCTAVUS_SUPPORT_AGENT_ID=clxyz123abc456\n```\n\n```typescript\nconst agentId = process.env.OCTAVUS_SUPPORT_AGENT_ID;\n\nconst sessionId = await client.agentSessions.create(agentId, {\n COMPANY_NAME: \'Acme Corp\',\n});\n```\n',
852
+ content: '\n# Octavus CLI\n\nThe `@octavus/cli` package provides a command-line interface for validating and syncing agent definitions from your local filesystem to the Octavus platform.\n\n**Current version:** `5.1.0`\n\n## Installation\n\n```bash\nnpm install --save-dev @octavus/cli\n```\n\n## Configuration\n\nThe CLI requires an API key with the **Agents** permission.\n\n### Environment Variables\n\n| Variable | Description |\n| --------------------- | ---------------------------------------------- |\n| `OCTAVUS_CLI_API_KEY` | API key with "Agents" permission (recommended) |\n| `OCTAVUS_API_KEY` | Fallback if `OCTAVUS_CLI_API_KEY` not set |\n| `OCTAVUS_API_URL` | Optional, defaults to `https://octavus.ai` |\n\n### Two-Key Strategy (Recommended)\n\nFor production deployments, use separate API keys with minimal permissions:\n\n```bash\n# CI/CD or .env.local (not committed)\nOCTAVUS_CLI_API_KEY=oct_sk_... # "Agents" permission only\n\n# Production .env\nOCTAVUS_API_KEY=oct_sk_... # "Sessions" permission only\n```\n\nThis ensures production servers only have session permissions (smaller blast radius if leaked), while agent management is restricted to development/CI environments.\n\n### Multiple Environments\n\nUse separate Octavus projects for staging and production, each with their own API keys. The `--env` flag lets you load different environment files:\n\n```bash\n# Local development (default: .env)\noctavus sync ./agents/my-agent\n\n# Staging project\noctavus --env .env.staging sync ./agents/my-agent\n\n# Production project\noctavus --env .env.production sync ./agents/my-agent\n```\n\nExample environment files:\n\n```bash\n# .env.staging (syncs to your staging project)\nOCTAVUS_CLI_API_KEY=oct_sk_staging_project_key...\n\n# .env.production (syncs to your production project)\nOCTAVUS_CLI_API_KEY=oct_sk_production_project_key...\n```\n\nEach project has its own agents, so you\'ll get different agent IDs per environment.\n\n## Global Options\n\n| Option | Description |\n| -------------- | ------------------------------------------------------- |\n| `--env <file>` | Load environment from a specific file (default: `.env`) |\n| `--help` | Show help |\n| `--version` | Show version |\n\n## Commands\n\n### `octavus sync <path>`\n\nSync an agent definition to the platform. Creates the agent if it doesn\'t exist, or updates it if it does.\n\n```bash\noctavus sync ./agents/my-agent\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading agent from ./agents/my-agent...\n\u2139 Syncing support-chat...\n\u2713 Created: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus validate <path>`\n\nValidate an agent definition without saving. Useful for CI/CD pipelines.\n\n```bash\noctavus validate ./agents/my-agent\n```\n\n**Exit codes:**\n\n- `0` - Validation passed\n- `1` - Validation errors\n- `2` - Configuration errors (missing API key, etc.)\n\n### `octavus list`\n\nList all agents in your project.\n\n```bash\noctavus list\n```\n\n**Example output:**\n\n```\nSLUG NAME FORMAT ID\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nsupport-chat Support Chat Agent interactive clxyz123abc456\n\n1 agent(s)\n```\n\n### `octavus get <slug>`\n\nGet details about a specific agent by its slug.\n\n```bash\noctavus get support-chat\n```\n\n### `octavus archive <slug>`\n\nArchive an agent by slug (soft delete). Archived agents are removed from the active agent list and their slug is freed for reuse.\n\n```bash\noctavus archive support-chat\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Archiving support-chat...\n\u2713 Archived: support-chat\n Agent ID: clxyz123abc456\n```\n\n### `octavus skills sync <path>`\n\nSync a skill to the platform. Packages the skill directory into a bundle (excluding `.env` files, `.git`, and `node_modules`), uploads it, and optionally pushes secrets from the skill\'s `.env` file.\n\n```bash\noctavus skills sync ./skills/github\n```\n\n**Options:**\n\n- `--json` - Output as JSON (for CI/CD parsing)\n- `--quiet` - Suppress non-essential output\n\n**Example output:**\n\n```\n\u2139 Reading skill from ./skills/github...\n\u2139 Packaging github...\n\u2713 Created: github\n Skill ID: clxyz789def012\n\u2139 Pushing 2 secret(s)...\n\u2713 2 secret(s) updated\n```\n\n**Secret handling:**\n\nIf the skill directory contains a `.env` file, secrets are pushed alongside the bundle. Secrets are cross-validated against the `secrets` declarations in `SKILL.md` - warnings are shown for undeclared or missing required secrets.\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md\n\u251C\u2500\u2500 scripts/\n\u2502 \u2514\u2500\u2500 run.py\n\u2514\u2500\u2500 .env # Secrets (not included in bundle)\n```\n\nSee [Skills](/docs/protocol/skills) for details on skill format, secrets, and secure mode.\n\n## Agent Directory Structure\n\nThe CLI expects agent definitions in a specific directory structure:\n\n```\nmy-agent/\n\u251C\u2500\u2500 settings.json # Required: Agent metadata\n\u251C\u2500\u2500 protocol.yaml # Required: Agent protocol\n\u251C\u2500\u2500 prompts/ # Optional: Prompt templates\n\u2502 \u251C\u2500\u2500 system.md\n\u2502 \u2514\u2500\u2500 user-message.md\n\u2514\u2500\u2500 references/ # Optional: Reference documents\n \u2514\u2500\u2500 api-guidelines.md\n```\n\n### references/\n\nReference files are markdown documents with YAML frontmatter containing a `description`. The agent can fetch these on demand during execution. See [References](/docs/protocol/references) for details.\n\n### settings.json\n\n```json\n{\n "slug": "my-agent",\n "name": "My Agent",\n "description": "A helpful assistant",\n "format": "interactive"\n}\n```\n\n### protocol.yaml\n\nSee the [Protocol documentation](/docs/protocol/overview) for details on protocol syntax.\n\n## CI/CD Integration\n\n### GitHub Actions\n\n```yaml\nname: Validate and Sync Agents\n\non:\n push:\n branches: [main]\n paths:\n - \'agents/**\'\n\njobs:\n sync:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v4\n\n - uses: actions/setup-node@v4\n with:\n node-version: \'22\'\n\n - run: npm install\n\n - name: Validate agent\n run: npx octavus validate ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n\n - name: Sync agent\n run: npx octavus sync ./agents/support-chat\n env:\n OCTAVUS_CLI_API_KEY: ${{ secrets.OCTAVUS_CLI_API_KEY }}\n```\n\n### Package.json Scripts\n\nAdd sync scripts to your `package.json`:\n\n```json\n{\n "scripts": {\n "agents:validate": "octavus validate ./agents/my-agent",\n "agents:sync": "octavus sync ./agents/my-agent"\n },\n "devDependencies": {\n "@octavus/cli": "^0.1.0"\n }\n}\n```\n\n## Workflow\n\nThe recommended workflow for managing agents:\n\n1. **Define agent locally** - Create `settings.json`, `protocol.yaml`, and prompts\n2. **Validate** - Run `octavus validate ./my-agent` to check for errors\n3. **Sync** - Run `octavus sync ./my-agent` to push to platform\n4. **Store agent ID** - Save the output ID in an environment variable\n5. **Use in app** - Read the ID from env and pass to `client.agentSessions.create()`\n\n```bash\n# After syncing: octavus sync ./agents/support-chat\n# Output: Agent ID: clxyz123abc456\n\n# Add to your .env file\nOCTAVUS_SUPPORT_AGENT_ID=clxyz123abc456\n```\n\n```typescript\nconst agentId = process.env.OCTAVUS_SUPPORT_AGENT_ID;\n\nconst sessionId = await client.agentSessions.create(agentId, {\n COMPANY_NAME: \'Acme Corp\',\n});\n```\n',
826
853
  excerpt: "Octavus CLI The package provides a command-line interface for validating and syncing agent definitions from your local filesystem to the Octavus platform. Current version: Installation ...",
827
854
  order: 5
828
855
  },
@@ -849,7 +876,7 @@ var sections_default = [
849
876
  section: "server-sdk",
850
877
  title: "Computer",
851
878
  description: "Adding browser, filesystem, and shell capabilities to agents with @octavus/computer.",
852
- content: "\n# Computer\n\nThe `@octavus/computer` package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to [MCP](https://modelcontextprotocol.io) servers, discovers their tools, and provides them to the server-sdk.\n\n**Current version:** `5.0.0`\n\n## Installation\n\n```bash\nnpm install @octavus/computer\n```\n\n## Quick Start\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=http://127.0.0.1:9222']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', ['/path/to/workspace']),\n shell: Computer.shell({ cwd: '/path/to/workspace', mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\nDynamic tools are registered after attaching via `session.setDynamicTools()`. Pass the `computer` directly - the session extracts schemas and handlers from the `ToolProvider`. Tool schemas are sent to the platform on the next `execute()` call, and tool calls flow back through the existing execution loop.\n\n## How It Works\n\n1. You configure MCP servers with namespaces (e.g., `browser`, `filesystem`, `shell`)\n2. `computer.start()` connects to all servers in parallel and discovers their tools\n3. Each tool is namespaced with `__` (e.g., `browser__navigate_page`, `filesystem__read_file`)\n4. The server-sdk sends tool schemas to the platform and handles tool call execution\n\nThe agent's protocol must declare matching `mcpServers` with `source: device` - see [MCP Servers](/docs/protocol/mcp-servers).\n\n## Entry Types\n\nThe `Computer` class supports three types of MCP entries:\n\n### Stdio (MCP Subprocess)\n\nSpawns an MCP server as a child process, communicating via stdin/stdout:\n\n```typescript\nComputer.stdio(command: string, args?: string[], options?: {\n env?: Record<string, string>;\n cwd?: string;\n})\n```\n\nUse this for local MCP servers installed as npm packages or standalone executables:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n '--browser-url=http://127.0.0.1:9222',\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [\n '/Users/me/projects/my-app',\n ]),\n },\n});\n```\n\n### HTTP (Remote MCP Endpoint)\n\nConnects to an MCP server over Streamable HTTP:\n\n```typescript\nComputer.http(url: string, options?: {\n headers?: Record<string, string>;\n})\n```\n\nUse this for MCP servers running as HTTP services:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n docs: Computer.http('http://localhost:3001/mcp', {\n headers: { Authorization: 'Bearer token' },\n }),\n },\n});\n```\n\n### Shell (Built-in)\n\nProvides shell command execution without spawning an MCP subprocess:\n\n```typescript\nComputer.shell(options: {\n cwd?: string;\n mode: ShellMode;\n timeout?: number; // Default: 300,000ms (5 minutes)\n})\n```\n\nThis exposes a `run_command` tool (namespaced as `shell__run_command` when the key is `shell`). Commands execute in a login shell with the user's full environment.\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n shell: Computer.shell({\n cwd: '/Users/me/projects/my-app',\n mode: 'unrestricted',\n timeout: 300_000,\n }),\n },\n});\n```\n\n#### Shell Safety Modes\n\n| Mode | Description |\n| -------------------------------------- | --------------------------------------------- |\n| `'unrestricted'` | All commands allowed (for dedicated machines) |\n| `{ allowedPatterns, blockedPatterns }` | Pattern-based command filtering |\n\nPattern-based filtering:\n\n```typescript\nComputer.shell({\n cwd: workspaceDir,\n mode: {\n blockedPatterns: [/rm\\s+-rf/, /sudo/],\n allowedPatterns: [/^git\\s/, /^npm\\s/, /^ls\\s/],\n },\n});\n```\n\nWhen `allowedPatterns` is set, only matching commands are permitted. When `blockedPatterns` is set, matching commands are rejected. Blocked patterns are checked first.\n\n## Lifecycle\n\n### Starting\n\n`computer.start()` connects to all configured MCP servers in parallel. If some servers fail to connect, the computer still starts with the remaining servers - only if _all_ connections fail does it throw an error.\n\n```typescript\nconst { errors } = await computer.start();\n\nif (errors.length > 0) {\n console.warn('Some MCP servers failed to connect:', errors);\n}\n```\n\n### Stopping\n\n`computer.stop()` closes all MCP connections and kills managed processes:\n\n```typescript\nawait computer.stop();\n```\n\nAlways call `stop()` when the session ends to clean up MCP subprocesses. For managed processes (like Chrome), pass them in the config for automatic cleanup.\n\n## Dynamic Entries\n\nYou can add or remove MCP entries on a running `Computer` after `start()` has returned. This is useful when MCP configurations arrive after construction - for example, when a session-manager receives per-session entries from a dispatch payload and wants to wire them into the existing computer instead of rebuilding it.\n\n### `addEntry(namespace, entry, options?)`\n\nRegisters a new MCP entry under `namespace`. By default, connects immediately:\n\n```typescript\nawait computer.addEntry(\n 'github',\n Computer.stdio('@modelcontextprotocol/server-github', [], {\n env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GH_TOKEN! },\n }),\n);\n```\n\nPass `{ deferred: true }` to register the entry without connecting. The entry starts in a degraded state and connects on the next `restartEntry(namespace)` call - useful for lazy MCPs the agent activates on demand:\n\n```typescript\nawait computer.addEntry('github', githubEntry, { deferred: true });\n\n// Later, when the agent decides it needs GitHub:\nawait computer.restartEntry('github');\n```\n\n`addEntry` throws if the namespace already exists. To replace an entry, call `removeEntry` first.\n\nIf the immediate connection fails, `addEntry` does not throw - the entry is registered as degraded with the error message attached. Inspect via `getHealth()` or `restartEntry()` to retry.\n\n### `removeEntry(namespace)`\n\nCloses the entry's connection (if any) and drops it from the configuration. No-op when the namespace doesn't exist:\n\n```typescript\nawait computer.removeEntry('github');\n```\n\n### `restartEntry(namespace)`\n\nCloses the existing connection (if any) and reconnects with the current configuration:\n\n```typescript\nawait computer.restartEntry('github');\n```\n\nUse this to bring a deferred entry online for the first time, or to recover an entry that became degraded mid-session.\n\n### Detecting dynamic-entry support\n\nConsumers that work with arbitrary `ToolProvider` implementations can detect dynamic-entry capability with `isDynamicMcpProvider`:\n\n```typescript\nimport { isDynamicMcpProvider } from '@octavus/server-sdk';\n\nif (isDynamicMcpProvider(provider)) {\n await provider.addEntry('github', githubEntry);\n}\n```\n\n`Computer` always passes this check.\n\n## Chrome Launch Helper\n\nFor desktop applications that need to control a browser, `Computer.launchChrome()` launches Chrome with remote debugging enabled:\n\n```typescript\nconst browser = await Computer.launchChrome({\n profileDir: '/Users/me/.my-app/chrome-profiles/agent-1',\n debuggingPort: 9222, // Optional, auto-allocated if omitted\n flags: ['--window-size=1280,800'],\n});\n\nconsole.log(`Chrome running on port ${browser.port}, PID ${browser.pid}`);\n```\n\nPass the browser to `managedProcesses` for automatic cleanup when the computer stops:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [workspaceDir]),\n shell: Computer.shell({ cwd: workspaceDir, mode: 'unrestricted' }),\n },\n managedProcesses: [{ process: browser.process }],\n});\n```\n\n### ChromeLaunchOptions\n\n| Field | Required | Description |\n| --------------- | -------- | ----------------------------------------------------- |\n| `profileDir` | Yes | Directory for Chrome's user data (profile isolation) |\n| `debuggingPort` | No | Port for remote debugging (auto-allocated if omitted) |\n| `flags` | No | Additional Chrome launch flags |\n\n## ToolProvider Interface\n\n`Computer` implements the `ToolProvider` interface from `@octavus/core`:\n\n```typescript\ninterface ToolProvider {\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n```\n\n`setDynamicTools()` accepts any `ToolProvider` directly - the session extracts schemas and handlers automatically:\n\n```typescript\nsession.setDynamicTools(computer);\n```\n\nYou can also pass a custom `ToolProvider`:\n\n```typescript\nconst customProvider: ToolProvider = {\n toolHandlers() {\n return {\n custom__my_tool: async (args) => {\n return { result: 'done' };\n },\n };\n },\n toolSchemas() {\n return [\n {\n name: 'custom__my_tool',\n description: 'A custom tool',\n inputSchema: {\n type: 'object',\n properties: {\n input: { type: 'string', description: 'Tool input' },\n },\n required: ['input'],\n },\n },\n ];\n },\n};\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: { 'set-chat-title': titleHandler },\n});\n\nsession.setDynamicTools(customProvider);\n```\n\nFor cases where you need explicit control, `setDynamicTools()` also accepts a `DynamicTool[]` array:\n\n```typescript\ninterface DynamicTool {\n schema: ToolSchema;\n handler: ToolHandler;\n}\n```\n\n## Complete Example\n\nA desktop application with browser, filesystem, and shell capabilities:\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst WORKSPACE_DIR = '/Users/me/projects/my-app';\nconst PROFILE_DIR = '/Users/me/.my-app/chrome-profiles/agent';\n\nasync function startSession(sessionId: string) {\n // 1. Launch Chrome with remote debugging\n const browser = await Computer.launchChrome({\n profileDir: PROFILE_DIR,\n });\n\n // 2. Create computer with all capabilities\n const computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [WORKSPACE_DIR]),\n shell: Computer.shell({\n cwd: WORKSPACE_DIR,\n mode: 'unrestricted',\n }),\n },\n managedProcesses: [{ process: browser.process }],\n });\n\n // 3. Connect to all MCP servers\n const { errors } = await computer.start();\n if (errors.length > 0) {\n console.warn('Failed to connect:', errors);\n }\n\n // 4. Attach to session and register dynamic tools\n const client = new OctavusClient({\n baseUrl: process.env.OCTAVUS_API_URL!,\n apiKey: process.env.OCTAVUS_API_KEY!,\n });\n\n const session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => {\n console.log('Chat title:', args.title);\n return { success: true };\n },\n },\n });\n\n session.setDynamicTools(computer);\n\n // 5. Execute and stream\n const events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Navigate to github.com and take a screenshot' },\n });\n\n for await (const event of events) {\n // Handle stream events\n }\n\n // 6. Clean up\n await computer.stop();\n}\n```\n\n## API Reference\n\n### Computer\n\n```typescript\nclass Computer implements ToolProvider {\n constructor(config: ComputerConfig);\n\n // Static factories for MCP entries\n static stdio(\n command: string,\n args?: string[],\n options?: {\n env?: Record<string, string>;\n cwd?: string;\n },\n ): StdioConfig;\n\n static http(\n url: string,\n options?: {\n headers?: Record<string, string>;\n },\n ): HttpConfig;\n\n static shell(options: { cwd?: string; mode: ShellMode; timeout?: number }): ShellConfig;\n\n // Chrome launch helper\n static launchChrome(options: ChromeLaunchOptions): Promise<ChromeInstance>;\n\n // Lifecycle\n start(): Promise<{ errors: string[] }>;\n stop(): Promise<void>;\n\n // Dynamic entries\n addEntry(namespace: string, entry: McpEntry, options?: { deferred?: boolean }): Promise<void>;\n removeEntry(namespace: string): Promise<void>;\n restartEntry(namespace: string): Promise<void>;\n stopEntry(namespace: string): Promise<void>;\n\n // Health\n getHealth(): Promise<ComputerHealth>;\n ensureReady(): Promise<EnsureReadyResult>;\n retryDegraded(): Promise<{ recovered: string[]; stillDegraded: string[] }>;\n\n // ToolProvider implementation\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n\ninterface ComputerHealth {\n healthy: boolean;\n entries: EntryHealth[];\n totalTools: number;\n}\n\ninterface EntryHealth {\n name: string;\n healthy: boolean;\n error?: string;\n}\n\ninterface EnsureReadyResult extends ComputerHealth {\n recovered?: string[];\n failedEntries?: string[];\n}\n```\n\n### ComputerConfig\n\n```typescript\ninterface ComputerConfig {\n mcpServers: Record<string, McpEntry>;\n managedProcesses?: { process: ChildProcess }[];\n /** Namespaces to skip during start() - they begin as degraded and can be connected on demand via restartEntry(). */\n deferredEntries?: string[];\n}\n\ntype McpEntry = StdioConfig | HttpConfig | ShellConfig;\ntype ShellMode =\n | 'unrestricted'\n | {\n allowedPatterns?: RegExp[];\n blockedPatterns?: RegExp[];\n };\n```\n\n### ChromeInstance\n\n```typescript\ninterface ChromeInstance {\n port: number;\n process: ChildProcess;\n pid: number;\n}\n```\n",
879
+ content: "\n# Computer\n\nThe `@octavus/computer` package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to [MCP](https://modelcontextprotocol.io) servers, discovers their tools, and provides them to the server-sdk.\n\n**Current version:** `5.1.0`\n\n## Installation\n\n```bash\nnpm install @octavus/computer\n```\n\n## Quick Start\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=http://127.0.0.1:9222']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', ['/path/to/workspace']),\n shell: Computer.shell({ cwd: '/path/to/workspace', mode: 'unrestricted' }),\n },\n});\n\nawait computer.start();\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: 'your-api-key',\n});\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => ({ title: args.title }),\n },\n});\n\nsession.setDynamicTools(computer);\n```\n\nDynamic tools are registered after attaching via `session.setDynamicTools()`. Pass the `computer` directly - the session extracts schemas and handlers from the `ToolProvider`. Tool schemas are sent to the platform on the next `execute()` call, and tool calls flow back through the existing execution loop.\n\n## How It Works\n\n1. You configure MCP servers with namespaces (e.g., `browser`, `filesystem`, `shell`)\n2. `computer.start()` connects to all servers in parallel and discovers their tools\n3. Each tool is namespaced with `__` (e.g., `browser__navigate_page`, `filesystem__read_file`)\n4. The server-sdk sends tool schemas to the platform and handles tool call execution\n\nThe agent's protocol must declare matching `mcpServers` with `source: device` - see [MCP Servers](/docs/protocol/mcp-servers).\n\n## Entry Types\n\nThe `Computer` class supports three types of MCP entries:\n\n### Stdio (MCP Subprocess)\n\nSpawns an MCP server as a child process, communicating via stdin/stdout:\n\n```typescript\nComputer.stdio(command: string, args?: string[], options?: {\n env?: Record<string, string>;\n cwd?: string;\n})\n```\n\nUse this for local MCP servers installed as npm packages or standalone executables:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n '--browser-url=http://127.0.0.1:9222',\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [\n '/Users/me/projects/my-app',\n ]),\n },\n});\n```\n\n### HTTP (Remote MCP Endpoint)\n\nConnects to an MCP server over Streamable HTTP:\n\n```typescript\nComputer.http(url: string, options?: {\n headers?: Record<string, string>;\n})\n```\n\nUse this for MCP servers running as HTTP services:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n docs: Computer.http('http://localhost:3001/mcp', {\n headers: { Authorization: 'Bearer token' },\n }),\n },\n});\n```\n\n### Shell (Built-in)\n\nProvides shell command execution without spawning an MCP subprocess:\n\n```typescript\nComputer.shell(options: {\n cwd?: string;\n mode: ShellMode;\n timeout?: number; // Default: 300,000ms (5 minutes)\n})\n```\n\nThis exposes a `run_command` tool (namespaced as `shell__run_command` when the key is `shell`). Commands execute in a login shell with the user's full environment.\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n shell: Computer.shell({\n cwd: '/Users/me/projects/my-app',\n mode: 'unrestricted',\n timeout: 300_000,\n }),\n },\n});\n```\n\n#### Shell Safety Modes\n\n| Mode | Description |\n| -------------------------------------- | --------------------------------------------- |\n| `'unrestricted'` | All commands allowed (for dedicated machines) |\n| `{ allowedPatterns, blockedPatterns }` | Pattern-based command filtering |\n\nPattern-based filtering:\n\n```typescript\nComputer.shell({\n cwd: workspaceDir,\n mode: {\n blockedPatterns: [/rm\\s+-rf/, /sudo/],\n allowedPatterns: [/^git\\s/, /^npm\\s/, /^ls\\s/],\n },\n});\n```\n\nWhen `allowedPatterns` is set, only matching commands are permitted. When `blockedPatterns` is set, matching commands are rejected. Blocked patterns are checked first.\n\n## Lifecycle\n\n### Starting\n\n`computer.start()` connects to all configured MCP servers in parallel. If some servers fail to connect, the computer still starts with the remaining servers - only if _all_ connections fail does it throw an error.\n\n```typescript\nconst { errors } = await computer.start();\n\nif (errors.length > 0) {\n console.warn('Some MCP servers failed to connect:', errors);\n}\n```\n\n### Stopping\n\n`computer.stop()` closes all MCP connections and kills managed processes:\n\n```typescript\nawait computer.stop();\n```\n\nAlways call `stop()` when the session ends to clean up MCP subprocesses. For managed processes (like Chrome), pass them in the config for automatic cleanup.\n\n## Dynamic Entries\n\nYou can add or remove MCP entries on a running `Computer` after `start()` has returned. This is useful when MCP configurations arrive after construction - for example, when a session-manager receives per-session entries from a dispatch payload and wants to wire them into the existing computer instead of rebuilding it.\n\n### `addEntry(namespace, entry, options?)`\n\nRegisters a new MCP entry under `namespace`. By default, connects immediately:\n\n```typescript\nawait computer.addEntry(\n 'github',\n Computer.stdio('@modelcontextprotocol/server-github', [], {\n env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GH_TOKEN! },\n }),\n);\n```\n\nPass `{ deferred: true }` to register the entry without connecting. The entry starts in a degraded state and connects on the next `restartEntry(namespace)` call - useful for lazy MCPs the agent activates on demand:\n\n```typescript\nawait computer.addEntry('github', githubEntry, { deferred: true });\n\n// Later, when the agent decides it needs GitHub:\nawait computer.restartEntry('github');\n```\n\n`addEntry` throws if the namespace already exists. To replace an entry, call `removeEntry` first.\n\nIf the immediate connection fails, `addEntry` does not throw - the entry is registered as degraded with the error message attached. Inspect via `getHealth()` or `restartEntry()` to retry.\n\n### `removeEntry(namespace)`\n\nCloses the entry's connection (if any) and drops it from the configuration. No-op when the namespace doesn't exist:\n\n```typescript\nawait computer.removeEntry('github');\n```\n\n### `restartEntry(namespace)`\n\nCloses the existing connection (if any) and reconnects with the current configuration:\n\n```typescript\nawait computer.restartEntry('github');\n```\n\nUse this to bring a deferred entry online for the first time, or to recover an entry that became degraded mid-session.\n\n### Detecting dynamic-entry support\n\nConsumers that work with arbitrary `ToolProvider` implementations can detect dynamic-entry capability with `isDynamicMcpProvider`:\n\n```typescript\nimport { isDynamicMcpProvider } from '@octavus/server-sdk';\n\nif (isDynamicMcpProvider(provider)) {\n await provider.addEntry('github', githubEntry);\n}\n```\n\n`Computer` always passes this check.\n\n## Chrome Launch Helper\n\nFor desktop applications that need to control a browser, `Computer.launchChrome()` launches Chrome with remote debugging enabled:\n\n```typescript\nconst browser = await Computer.launchChrome({\n profileDir: '/Users/me/.my-app/chrome-profiles/agent-1',\n debuggingPort: 9222, // Optional, auto-allocated if omitted\n flags: ['--window-size=1280,800'],\n});\n\nconsole.log(`Chrome running on port ${browser.port}, PID ${browser.pid}`);\n```\n\nPass the browser to `managedProcesses` for automatic cleanup when the computer stops:\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [workspaceDir]),\n shell: Computer.shell({ cwd: workspaceDir, mode: 'unrestricted' }),\n },\n managedProcesses: [{ process: browser.process }],\n});\n```\n\n### ChromeLaunchOptions\n\n| Field | Required | Description |\n| --------------- | -------- | ----------------------------------------------------- |\n| `profileDir` | Yes | Directory for Chrome's user data (profile isolation) |\n| `debuggingPort` | No | Port for remote debugging (auto-allocated if omitted) |\n| `flags` | No | Additional Chrome launch flags |\n\n## ToolProvider Interface\n\n`Computer` implements the `ToolProvider` interface from `@octavus/core`:\n\n```typescript\ninterface ToolProvider {\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n```\n\n`setDynamicTools()` accepts any `ToolProvider` directly - the session extracts schemas and handlers automatically:\n\n```typescript\nsession.setDynamicTools(computer);\n```\n\nYou can also pass a custom `ToolProvider`:\n\n```typescript\nconst customProvider: ToolProvider = {\n toolHandlers() {\n return {\n custom__my_tool: async (args) => {\n return { result: 'done' };\n },\n };\n },\n toolSchemas() {\n return [\n {\n name: 'custom__my_tool',\n description: 'A custom tool',\n inputSchema: {\n type: 'object',\n properties: {\n input: { type: 'string', description: 'Tool input' },\n },\n required: ['input'],\n },\n },\n ];\n },\n};\n\nconst session = client.agentSessions.attach(sessionId, {\n tools: { 'set-chat-title': titleHandler },\n});\n\nsession.setDynamicTools(customProvider);\n```\n\nFor cases where you need explicit control, `setDynamicTools()` also accepts a `DynamicTool[]` array:\n\n```typescript\ninterface DynamicTool {\n schema: ToolSchema;\n handler: ToolHandler;\n}\n```\n\n## Complete Example\n\nA desktop application with browser, filesystem, and shell capabilities:\n\n```typescript\nimport { Computer } from '@octavus/computer';\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst WORKSPACE_DIR = '/Users/me/projects/my-app';\nconst PROFILE_DIR = '/Users/me/.my-app/chrome-profiles/agent';\n\nasync function startSession(sessionId: string) {\n // 1. Launch Chrome with remote debugging\n const browser = await Computer.launchChrome({\n profileDir: PROFILE_DIR,\n });\n\n // 2. Create computer with all capabilities\n const computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', [\n `--browser-url=http://127.0.0.1:${browser.port}`,\n '--no-usage-statistics',\n ]),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [WORKSPACE_DIR]),\n shell: Computer.shell({\n cwd: WORKSPACE_DIR,\n mode: 'unrestricted',\n }),\n },\n managedProcesses: [{ process: browser.process }],\n });\n\n // 3. Connect to all MCP servers\n const { errors } = await computer.start();\n if (errors.length > 0) {\n console.warn('Failed to connect:', errors);\n }\n\n // 4. Attach to session and register dynamic tools\n const client = new OctavusClient({\n baseUrl: process.env.OCTAVUS_API_URL!,\n apiKey: process.env.OCTAVUS_API_KEY!,\n });\n\n const session = client.agentSessions.attach(sessionId, {\n tools: {\n 'set-chat-title': async (args) => {\n console.log('Chat title:', args.title);\n return { success: true };\n },\n },\n });\n\n session.setDynamicTools(computer);\n\n // 5. Execute and stream\n const events = session.execute({\n type: 'trigger',\n triggerName: 'user-message',\n input: { USER_MESSAGE: 'Navigate to github.com and take a screenshot' },\n });\n\n for await (const event of events) {\n // Handle stream events\n }\n\n // 6. Clean up\n await computer.stop();\n}\n```\n\n## API Reference\n\n### Computer\n\n```typescript\nclass Computer implements ToolProvider {\n constructor(config: ComputerConfig);\n\n // Static factories for MCP entries\n static stdio(\n command: string,\n args?: string[],\n options?: {\n env?: Record<string, string>;\n cwd?: string;\n },\n ): StdioConfig;\n\n static http(\n url: string,\n options?: {\n headers?: Record<string, string>;\n },\n ): HttpConfig;\n\n static shell(options: { cwd?: string; mode: ShellMode; timeout?: number }): ShellConfig;\n\n // Chrome launch helper\n static launchChrome(options: ChromeLaunchOptions): Promise<ChromeInstance>;\n\n // Lifecycle\n start(): Promise<{ errors: string[] }>;\n stop(): Promise<void>;\n\n // Dynamic entries\n addEntry(namespace: string, entry: McpEntry, options?: { deferred?: boolean }): Promise<void>;\n removeEntry(namespace: string): Promise<void>;\n restartEntry(namespace: string): Promise<void>;\n stopEntry(namespace: string): Promise<void>;\n\n // Health\n getHealth(): Promise<ComputerHealth>;\n ensureReady(): Promise<EnsureReadyResult>;\n retryDegraded(): Promise<{ recovered: string[]; stillDegraded: string[] }>;\n\n // ToolProvider implementation\n toolHandlers(): Record<string, ToolHandler>;\n toolSchemas(): ToolSchema[];\n}\n\ninterface ComputerHealth {\n healthy: boolean;\n entries: EntryHealth[];\n totalTools: number;\n}\n\ninterface EntryHealth {\n name: string;\n healthy: boolean;\n error?: string;\n}\n\ninterface EnsureReadyResult extends ComputerHealth {\n recovered?: string[];\n failedEntries?: string[];\n}\n```\n\n### ComputerConfig\n\n```typescript\ninterface ComputerConfig {\n mcpServers: Record<string, McpEntry>;\n managedProcesses?: { process: ChildProcess }[];\n /** Namespaces to skip during start() - they begin as degraded and can be connected on demand via restartEntry(). */\n deferredEntries?: string[];\n}\n\ntype McpEntry = StdioConfig | HttpConfig | ShellConfig;\ntype ShellMode =\n | 'unrestricted'\n | {\n allowedPatterns?: RegExp[];\n blockedPatterns?: RegExp[];\n };\n```\n\n### ChromeInstance\n\n```typescript\ninterface ChromeInstance {\n port: number;\n process: ChildProcess;\n pid: number;\n}\n```\n",
853
880
  excerpt: "Computer The package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to MCP servers, discovers their tools, and provides them to the server-sdk....",
854
881
  order: 8
855
882
  },
@@ -875,7 +902,7 @@ var sections_default = [
875
902
  section: "client-sdk",
876
903
  title: "Overview",
877
904
  description: "Introduction to the Octavus Client SDKs for building chat interfaces.",
878
- content: "\n# Client SDK Overview\n\nOctavus provides two packages for frontend integration:\n\n| Package | Purpose | Use When |\n| --------------------- | ------------------------ | ----------------------------------------------------- |\n| `@octavus/react` | React hooks and bindings | Building React applications |\n| `@octavus/client-sdk` | Framework-agnostic core | Using Vue, Svelte, vanilla JS, or custom integrations |\n\n**Most users should install `@octavus/react`** - it includes everything from `@octavus/client-sdk` plus React-specific hooks.\n\n## Installation\n\n### React Applications\n\n```bash\nnpm install @octavus/react\n```\n\n**Current version:** `5.0.0`\n\n### Other Frameworks\n\n```bash\nnpm install @octavus/client-sdk\n```\n\n**Current version:** `5.0.0`\n\n## Transport Pattern\n\nThe Client SDK uses a **transport abstraction** to handle communication with your backend. This gives you flexibility in how events are delivered:\n\n| Transport | Use Case | Docs |\n| ----------------------- | -------------------------------------------- | ----------------------------------------------------- |\n| `createHttpTransport` | HTTP/SSE (Next.js, Express, etc.) | [HTTP Transport](/docs/client-sdk/http-transport) |\n| `createSocketTransport` | WebSocket, SockJS, or other socket protocols | [Socket Transport](/docs/client-sdk/socket-transport) |\n\nWhen the transport changes (e.g., when `sessionId` changes), the `useOctavusChat` hook automatically reinitializes with the new transport.\n\n> **Recommendation**: Use HTTP transport unless you specifically need WebSocket features (custom real-time events, Meteor/Phoenix, etc.).\n\n## React Usage\n\nThe `useOctavusChat` hook provides state management and streaming for React applications:\n\n```tsx\nimport { useMemo } from 'react';\nimport { useOctavusChat, createHttpTransport, type UIMessage } from '@octavus/react';\n\nfunction Chat({ sessionId }: { sessionId: string }) {\n // Create a stable transport instance (memoized on sessionId)\n const transport = useMemo(\n () =>\n createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n }),\n [sessionId],\n );\n\n const { messages, status, send } = useOctavusChat({ transport });\n\n const sendMessage = async (text: string) => {\n await send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n };\n\n return (\n <div>\n {messages.map((msg) => (\n <MessageBubble key={msg.id} message={msg} />\n ))}\n </div>\n );\n}\n\nfunction MessageBubble({ message }: { message: UIMessage }) {\n return (\n <div>\n {message.parts.map((part, i) => {\n if (part.type === 'text') {\n return <p key={i}>{part.text}</p>;\n }\n return null;\n })}\n </div>\n );\n}\n```\n\n## Framework-Agnostic Usage\n\nThe `OctavusChat` class can be used with any framework or vanilla JavaScript:\n\n```typescript\nimport { OctavusChat, createHttpTransport } from '@octavus/client-sdk';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n\nconst chat = new OctavusChat({ transport });\n\n// Subscribe to state changes\nconst unsubscribe = chat.subscribe(() => {\n console.log('Messages:', chat.messages);\n console.log('Status:', chat.status);\n // Update your UI here\n});\n\n// Send a message\nawait chat.send('user-message', { USER_MESSAGE: 'Hello' }, { userMessage: { content: 'Hello' } });\n\n// Cleanup when done\nunsubscribe();\n```\n\n## Key Features\n\n### Unified Send Function\n\nThe `send` function handles both user message display and agent triggering in one call:\n\n```tsx\nconst { send } = useOctavusChat({ transport });\n\n// Add user message to UI and trigger agent\nawait send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n\n// Trigger without adding a user message (e.g., button click)\nawait send('request-human');\n```\n\n### Message Parts\n\nMessages contain ordered `parts` for rich content:\n\n```tsx\nconst { messages } = useOctavusChat({ transport });\n\n// Each message has typed parts\nmessage.parts.map((part) => {\n switch (part.type) {\n case 'text': // Text content\n case 'reasoning': // Extended reasoning/thinking\n case 'tool-call': // Tool execution\n case 'operation': // Internal operations (set-resource, etc.)\n }\n});\n```\n\n### Status Tracking\n\n```tsx\nconst { status } = useOctavusChat({ transport });\n\n// status: 'idle' | 'streaming' | 'error' | 'awaiting-input'\n// 'awaiting-input' occurs when interactive client tools need user action\n```\n\n### Stop Streaming\n\n```tsx\nconst { stop } = useOctavusChat({ transport });\n\n// Stop current stream and finalize message\nstop();\n```\n\n### Retry Last Trigger\n\nRe-execute the last trigger from the same starting point. Messages are rolled back to the state before the trigger, the user message is re-added (if any), and the agent re-executes. Already-uploaded files are reused without re-uploading.\n\n```tsx\nconst { retry, canRetry } = useOctavusChat({ transport });\n\n// Retry after an error, cancellation, or unsatisfactory result\nif (canRetry) {\n await retry();\n}\n```\n\n`canRetry` is `true` when a trigger has been sent and the chat is not currently streaming or awaiting input.\n\n## Hook Reference (React)\n\n### useOctavusChat\n\n```typescript\nfunction useOctavusChat(options: OctavusChatOptions): UseOctavusChatReturn;\n\ninterface OctavusChatOptions {\n // Required: Transport for streaming events\n transport: Transport;\n\n // Optional: Function to request upload URLs for file uploads\n requestUploadUrls?: (\n files: { filename: string; mediaType: string; size: number }[],\n ) => Promise<UploadUrlsResponse>;\n\n // Optional: Client-side tool handlers\n // - Function: executes automatically and returns result\n // - 'interactive': appears in pendingClientTools for user input\n clientTools?: Record<string, ClientToolHandler>;\n\n // Optional: Pre-populate with existing messages (session restore)\n initialMessages?: UIMessage[];\n\n // Optional: Callbacks\n onError?: (error: OctavusError) => void; // Structured error with type, source, retryable\n onFinish?: () => void;\n onStop?: () => void; // Called when user stops generation\n onResourceUpdate?: (name: string, value: unknown) => void;\n}\n\ninterface UseOctavusChatReturn {\n // State\n messages: UIMessage[];\n status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n error: OctavusError | null; // Structured error with type, source, retryable\n\n // Connection (socket transport only - undefined for HTTP)\n connectionState: ConnectionState | undefined; // 'disconnected' | 'connecting' | 'connected' | 'error'\n connectionError: Error | undefined;\n\n // Client tools (interactive tools awaiting user input)\n pendingClientTools: Record<string, InteractiveTool[]>; // Keyed by tool name\n\n // Actions\n send: (\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ) => Promise<void>;\n stop: () => void;\n retry: () => Promise<void>; // Retry last trigger from same starting point\n canRetry: boolean; // Whether retry() can be called\n\n // Connection management (socket transport only - undefined for HTTP)\n connect: (() => Promise<void>) | undefined;\n disconnect: (() => void) | undefined;\n\n // File uploads (requires requestUploadUrls)\n uploadFiles: (\n files: FileList | File[],\n onProgress?: (fileIndex: number, progress: number) => void,\n ) => Promise<FileReference[]>;\n}\n\ninterface UserMessageInput {\n content?: string;\n files?: FileList | File[] | FileReference[];\n}\n```\n\n### useAutoScroll\n\nSmart auto-scroll for chat containers. Scrolls to bottom when content updates, but pauses if the user has scrolled up. See [Streaming - Auto-Scroll](/docs/client-sdk/streaming#auto-scroll) for full usage.\n\n```typescript\nfunction useAutoScroll(options?: UseAutoScrollOptions): {\n scrollRef: RefObject<HTMLDivElement | null>;\n handleScroll: () => void;\n scrollOnUpdate: () => void;\n resetAutoScroll: () => void;\n};\n\ninterface UseAutoScrollOptions {\n scrollRef?: RefObject<HTMLDivElement | null>;\n threshold?: number; // Distance from bottom in px (default: 80)\n}\n```\n\n## Transport Reference\n\n### createHttpTransport\n\nCreates an HTTP/SSE transport using native `fetch()`:\n\n```typescript\nimport { createHttpTransport } from '@octavus/react';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n```\n\n### createSocketTransport\n\nCreates a WebSocket/SockJS transport for real-time connections:\n\n```typescript\nimport { createSocketTransport } from '@octavus/react';\n\nconst transport = createSocketTransport({\n connect: () =>\n new Promise((resolve, reject) => {\n const ws = new WebSocket(`wss://api.example.com/stream?sessionId=${sessionId}`);\n ws.onopen = () => resolve(ws);\n ws.onerror = () => reject(new Error('Connection failed'));\n }),\n});\n```\n\nSocket transport provides additional connection management:\n\n```typescript\n// Access connection state directly\ntransport.connectionState; // 'disconnected' | 'connecting' | 'connected' | 'error'\n\n// Subscribe to state changes\ntransport.onConnectionStateChange((state, error) => {\n /* ... */\n});\n\n// Eager connection (instead of lazy on first send)\nawait transport.connect();\n\n// Manual disconnect\ntransport.disconnect();\n```\n\nFor detailed WebSocket/SockJS usage including custom events, reconnection patterns, and server-side implementation, see [Socket Transport](/docs/client-sdk/socket-transport).\n\n## Class Reference (Framework-Agnostic)\n\n### OctavusChat\n\n```typescript\nclass OctavusChat {\n constructor(options: OctavusChatOptions);\n\n // State (read-only)\n readonly messages: UIMessage[];\n readonly status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n readonly error: OctavusError | null; // Structured error\n readonly pendingClientTools: Record<string, InteractiveTool[]>; // Interactive tools\n\n // Actions\n send(\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ): Promise<void>;\n stop(): void;\n\n // Subscription\n subscribe(callback: () => void): () => void; // Returns unsubscribe function\n}\n```\n\n## Next Steps\n\n- [HTTP Transport](/docs/client-sdk/http-transport) - HTTP/SSE integration (recommended)\n- [Socket Transport](/docs/client-sdk/socket-transport) - WebSocket and SockJS integration\n- [Messages](/docs/client-sdk/messages) - Working with message state\n- [Streaming](/docs/client-sdk/streaming) - Building streaming UIs\n- [Client Tools](/docs/client-sdk/client-tools) - Interactive browser-side tool handling\n- [Operations](/docs/client-sdk/execution-blocks) - Showing agent progress\n- [Error Handling](/docs/client-sdk/error-handling) - Handling errors with type guards\n- [File Uploads](/docs/client-sdk/file-uploads) - Uploading images and documents\n- [Examples](/docs/examples/overview) - Complete working examples\n",
905
+ content: "\n# Client SDK Overview\n\nOctavus provides two packages for frontend integration:\n\n| Package | Purpose | Use When |\n| --------------------- | ------------------------ | ----------------------------------------------------- |\n| `@octavus/react` | React hooks and bindings | Building React applications |\n| `@octavus/client-sdk` | Framework-agnostic core | Using Vue, Svelte, vanilla JS, or custom integrations |\n\n**Most users should install `@octavus/react`** - it includes everything from `@octavus/client-sdk` plus React-specific hooks.\n\n## Installation\n\n### React Applications\n\n```bash\nnpm install @octavus/react\n```\n\n**Current version:** `5.1.0`\n\n### Other Frameworks\n\n```bash\nnpm install @octavus/client-sdk\n```\n\n**Current version:** `5.1.0`\n\n## Transport Pattern\n\nThe Client SDK uses a **transport abstraction** to handle communication with your backend. This gives you flexibility in how events are delivered:\n\n| Transport | Use Case | Docs |\n| ----------------------- | -------------------------------------------- | ----------------------------------------------------- |\n| `createHttpTransport` | HTTP/SSE (Next.js, Express, etc.) | [HTTP Transport](/docs/client-sdk/http-transport) |\n| `createSocketTransport` | WebSocket, SockJS, or other socket protocols | [Socket Transport](/docs/client-sdk/socket-transport) |\n\nWhen the transport changes (e.g., when `sessionId` changes), the `useOctavusChat` hook automatically reinitializes with the new transport.\n\n> **Recommendation**: Use HTTP transport unless you specifically need WebSocket features (custom real-time events, Meteor/Phoenix, etc.).\n\n## React Usage\n\nThe `useOctavusChat` hook provides state management and streaming for React applications:\n\n```tsx\nimport { useMemo } from 'react';\nimport { useOctavusChat, createHttpTransport, type UIMessage } from '@octavus/react';\n\nfunction Chat({ sessionId }: { sessionId: string }) {\n // Create a stable transport instance (memoized on sessionId)\n const transport = useMemo(\n () =>\n createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n }),\n [sessionId],\n );\n\n const { messages, status, send } = useOctavusChat({ transport });\n\n const sendMessage = async (text: string) => {\n await send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n };\n\n return (\n <div>\n {messages.map((msg) => (\n <MessageBubble key={msg.id} message={msg} />\n ))}\n </div>\n );\n}\n\nfunction MessageBubble({ message }: { message: UIMessage }) {\n return (\n <div>\n {message.parts.map((part, i) => {\n if (part.type === 'text') {\n return <p key={i}>{part.text}</p>;\n }\n return null;\n })}\n </div>\n );\n}\n```\n\n## Framework-Agnostic Usage\n\nThe `OctavusChat` class can be used with any framework or vanilla JavaScript:\n\n```typescript\nimport { OctavusChat, createHttpTransport } from '@octavus/client-sdk';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n\nconst chat = new OctavusChat({ transport });\n\n// Subscribe to state changes\nconst unsubscribe = chat.subscribe(() => {\n console.log('Messages:', chat.messages);\n console.log('Status:', chat.status);\n // Update your UI here\n});\n\n// Send a message\nawait chat.send('user-message', { USER_MESSAGE: 'Hello' }, { userMessage: { content: 'Hello' } });\n\n// Cleanup when done\nunsubscribe();\n```\n\n## Key Features\n\n### Unified Send Function\n\nThe `send` function handles both user message display and agent triggering in one call:\n\n```tsx\nconst { send } = useOctavusChat({ transport });\n\n// Add user message to UI and trigger agent\nawait send('user-message', { USER_MESSAGE: text }, { userMessage: { content: text } });\n\n// Trigger without adding a user message (e.g., button click)\nawait send('request-human');\n```\n\n### Message Parts\n\nMessages contain ordered `parts` for rich content:\n\n```tsx\nconst { messages } = useOctavusChat({ transport });\n\n// Each message has typed parts\nmessage.parts.map((part) => {\n switch (part.type) {\n case 'text': // Text content\n case 'reasoning': // Extended reasoning/thinking\n case 'tool-call': // Tool execution\n case 'operation': // Internal operations (set-resource, etc.)\n }\n});\n```\n\n### Status Tracking\n\n```tsx\nconst { status } = useOctavusChat({ transport });\n\n// status: 'idle' | 'streaming' | 'error' | 'awaiting-input'\n// 'awaiting-input' occurs when interactive client tools need user action\n```\n\n### Stop Streaming\n\n```tsx\nconst { stop } = useOctavusChat({ transport });\n\n// Stop current stream and finalize message\nstop();\n```\n\n### Retry Last Trigger\n\nRe-execute the last trigger from the same starting point. Messages are rolled back to the state before the trigger, the user message is re-added (if any), and the agent re-executes. Already-uploaded files are reused without re-uploading.\n\n```tsx\nconst { retry, canRetry } = useOctavusChat({ transport });\n\n// Retry after an error, cancellation, or unsatisfactory result\nif (canRetry) {\n await retry();\n}\n```\n\n`canRetry` is `true` when a trigger has been sent and the chat is not currently streaming or awaiting input.\n\n## Hook Reference (React)\n\n### useOctavusChat\n\n```typescript\nfunction useOctavusChat(options: OctavusChatOptions): UseOctavusChatReturn;\n\ninterface OctavusChatOptions {\n // Required: Transport for streaming events\n transport: Transport;\n\n // Optional: Function to request upload URLs for file uploads\n requestUploadUrls?: (\n files: { filename: string; mediaType: string; size: number }[],\n ) => Promise<UploadUrlsResponse>;\n\n // Optional: Client-side tool handlers\n // - Function: executes automatically and returns result\n // - 'interactive': appears in pendingClientTools for user input\n clientTools?: Record<string, ClientToolHandler>;\n\n // Optional: Pre-populate with existing messages (session restore)\n initialMessages?: UIMessage[];\n\n // Optional: Callbacks\n onError?: (error: OctavusError) => void; // Structured error with type, source, retryable\n onFinish?: () => void;\n onStop?: () => void; // Called when user stops generation\n onResourceUpdate?: (name: string, value: unknown) => void;\n}\n\ninterface UseOctavusChatReturn {\n // State\n messages: UIMessage[];\n status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n error: OctavusError | null; // Structured error with type, source, retryable\n\n // Connection (socket transport only - undefined for HTTP)\n connectionState: ConnectionState | undefined; // 'disconnected' | 'connecting' | 'connected' | 'error'\n connectionError: Error | undefined;\n\n // Client tools (interactive tools awaiting user input)\n pendingClientTools: Record<string, InteractiveTool[]>; // Keyed by tool name\n\n // Actions\n send: (\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ) => Promise<void>;\n stop: () => void;\n retry: () => Promise<void>; // Retry last trigger from same starting point\n canRetry: boolean; // Whether retry() can be called\n\n // Connection management (socket transport only - undefined for HTTP)\n connect: (() => Promise<void>) | undefined;\n disconnect: (() => void) | undefined;\n\n // File uploads (requires requestUploadUrls)\n uploadFiles: (\n files: FileList | File[],\n onProgress?: (fileIndex: number, progress: number) => void,\n ) => Promise<FileReference[]>;\n}\n\ninterface UserMessageInput {\n content?: string;\n files?: FileList | File[] | FileReference[];\n}\n```\n\n### useAutoScroll\n\nSmart auto-scroll for chat containers. Scrolls to bottom when content updates, but pauses if the user has scrolled up. See [Streaming - Auto-Scroll](/docs/client-sdk/streaming#auto-scroll) for full usage.\n\n```typescript\nfunction useAutoScroll(options?: UseAutoScrollOptions): {\n scrollRef: RefObject<HTMLDivElement | null>;\n handleScroll: () => void;\n scrollOnUpdate: () => void;\n resetAutoScroll: () => void;\n};\n\ninterface UseAutoScrollOptions {\n scrollRef?: RefObject<HTMLDivElement | null>;\n threshold?: number; // Distance from bottom in px (default: 80)\n}\n```\n\n## Transport Reference\n\n### createHttpTransport\n\nCreates an HTTP/SSE transport using native `fetch()`:\n\n```typescript\nimport { createHttpTransport } from '@octavus/react';\n\nconst transport = createHttpTransport({\n request: (payload, options) =>\n fetch('/api/trigger', {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({ sessionId, ...payload }),\n signal: options?.signal,\n }),\n});\n```\n\n### createSocketTransport\n\nCreates a WebSocket/SockJS transport for real-time connections:\n\n```typescript\nimport { createSocketTransport } from '@octavus/react';\n\nconst transport = createSocketTransport({\n connect: () =>\n new Promise((resolve, reject) => {\n const ws = new WebSocket(`wss://api.example.com/stream?sessionId=${sessionId}`);\n ws.onopen = () => resolve(ws);\n ws.onerror = () => reject(new Error('Connection failed'));\n }),\n});\n```\n\nSocket transport provides additional connection management:\n\n```typescript\n// Access connection state directly\ntransport.connectionState; // 'disconnected' | 'connecting' | 'connected' | 'error'\n\n// Subscribe to state changes\ntransport.onConnectionStateChange((state, error) => {\n /* ... */\n});\n\n// Eager connection (instead of lazy on first send)\nawait transport.connect();\n\n// Manual disconnect\ntransport.disconnect();\n```\n\nFor detailed WebSocket/SockJS usage including custom events, reconnection patterns, and server-side implementation, see [Socket Transport](/docs/client-sdk/socket-transport).\n\n## Class Reference (Framework-Agnostic)\n\n### OctavusChat\n\n```typescript\nclass OctavusChat {\n constructor(options: OctavusChatOptions);\n\n // State (read-only)\n readonly messages: UIMessage[];\n readonly status: ChatStatus; // 'idle' | 'streaming' | 'error' | 'awaiting-input'\n readonly error: OctavusError | null; // Structured error\n readonly pendingClientTools: Record<string, InteractiveTool[]>; // Interactive tools\n\n // Actions\n send(\n triggerName: string,\n input?: Record<string, unknown>,\n options?: { userMessage?: UserMessageInput },\n ): Promise<void>;\n stop(): void;\n\n // Subscription\n subscribe(callback: () => void): () => void; // Returns unsubscribe function\n}\n```\n\n## Next Steps\n\n- [HTTP Transport](/docs/client-sdk/http-transport) - HTTP/SSE integration (recommended)\n- [Socket Transport](/docs/client-sdk/socket-transport) - WebSocket and SockJS integration\n- [Messages](/docs/client-sdk/messages) - Working with message state\n- [Streaming](/docs/client-sdk/streaming) - Building streaming UIs\n- [Client Tools](/docs/client-sdk/client-tools) - Interactive browser-side tool handling\n- [Operations](/docs/client-sdk/execution-blocks) - Showing agent progress\n- [Error Handling](/docs/client-sdk/error-handling) - Handling errors with type guards\n- [File Uploads](/docs/client-sdk/file-uploads) - Uploading images and documents\n- [Examples](/docs/examples/overview) - Complete working examples\n",
879
906
  excerpt: "Client SDK Overview Octavus provides two packages for frontend integration: | Package | Purpose | Use When | |...",
880
907
  order: 1
881
908
  },
@@ -1367,7 +1394,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1367
1394
  section: "protocol",
1368
1395
  title: "Tools",
1369
1396
  description: "Defining external tools implemented in your backend.",
1370
- content: '\n# Tools\n\nTools extend what agents can do. Octavus supports multiple types:\n\n1. **External Tools** - Defined in the protocol, implemented in your backend (this page)\n2. **MCP Tools** - Auto-discovered from MCP servers (see [MCP Servers](/docs/protocol/mcp-servers))\n3. **Built-in Tools** - Provider-agnostic tools managed by Octavus (web search, image generation)\n4. **Provider Tools** - Provider-specific tools executed by the provider (e.g., Anthropic\'s code execution)\n5. **Skills** - Code execution and knowledge packages (see [Skills](/docs/protocol/skills))\n\nThis page covers external tools. For MCP-based tools from services like Figma, Sentry, or device capabilities like browser and filesystem, see [MCP Servers](/docs/protocol/mcp-servers). Built-in tools are enabled via agent config - see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).\n\n## External Tools\n\nExternal tools are defined in the `tools:` section and implemented in your backend.\n\n## Defining Tools\n\n```yaml\ntools:\n get-user-account:\n description: Looking up your account information\n display: description\n parameters:\n userId:\n type: string\n description: The user ID to look up\n```\n\n### Tool Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------ |\n| `description` | Yes | What the tool does (shown to LLM and optionally user) |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream` |\n| `parameters` | No | Input parameters the tool accepts |\n\n### Display Modes\n\nControls what the client sees about tool execution. The default is `description`.\n\n| Mode | Behavior |\n| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |\n| `hidden` | No UI events emitted. The tool executes silently and the user has no awareness it was called. Use for internal plumbing tools (title setting, context management). |\n| `name` | Shows the raw tool name while executing. Arguments and result are not displayed. |\n| `description` | Shows the tool\'s description while executing (default). Arguments are visible during live streaming but the result is not preserved after page refresh. |\n| `stream` | Full visibility. Arguments stream progressively as the LLM generates them, and the result is shown after execution. The result is preserved after page refresh. |\n\n**When to use `stream`:**\n\n- Client tools where the user benefits from seeing arguments or results\n- Interactive client tools (user provides input via the tool card)\n- Tools whose result is rendered via `renderToolCallResult`\n- Any tool where transparency into what was sent/received matters\n\n**When to use `hidden`:**\n\n- Internal lifecycle tools (e.g., session title setting)\n- Context-setting tools that would clutter the UI\n- Tools that are implementation details of the agent\'s protocol\n\n**Refresh and restore behavior:**\n\n`stream` is the only mode that preserves the tool result after a page refresh. For all other modes, the result is available during the live session but stripped on refresh. On session restore (when the session expires and is rebuilt from stored `UIMessage[]`), `stream` tools retain their original result while other modes receive a placeholder.\n\n## Parameters\n\nTool calls are always objects where each parameter name maps to a value. The LLM generates: `{ param1: value1, param2: value2, ... }`\n\n### Parameter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | -------------------------------------------------------------------------------- |\n| `type` | Yes | Data type: `string`, `number`, `integer`, `boolean`, `unknown`, or a custom type |\n| `description` | No | Describes what this parameter is for |\n| `optional` | No | If true, parameter is not required (default: false) |\n\n> **Tip**: You can use [custom types](/docs/protocol/types) for complex parameters like `type: ProductFilter` or `type: SearchOptions`.\n\n### Array Parameters\n\nFor array parameters, define a [top-level array type](/docs/protocol/types#top-level-array-types) and use it:\n\n```yaml\ntypes:\n CartItem:\n productId:\n type: string\n quantity:\n type: integer\n\n CartItemList:\n type: array\n items:\n type: CartItem\n\ntools:\n add-to-cart:\n description: Add items to cart\n parameters:\n items:\n type: CartItemList\n description: Items to add\n```\n\nThe tool receives: `{ items: [{ productId: "...", quantity: 1 }, ...] }`\n\n### Optional Parameters\n\nParameters are **required by default**. Use `optional: true` to make a parameter optional:\n\n```yaml\ntools:\n search-products:\n description: Search the product catalog\n parameters:\n query:\n type: string\n description: Search query\n\n category:\n type: string\n description: Filter by category\n optional: true\n\n maxPrice:\n type: number\n description: Maximum price filter\n optional: true\n\n inStock:\n type: boolean\n description: Only show in-stock items\n optional: true\n```\n\n## Making Tools Available\n\nTools defined in `tools:` are available. To make them usable by the LLM, add them to `agent.tools`:\n\n```yaml\ntools:\n get-user-account:\n description: Look up user account\n parameters:\n userId: { type: string }\n\n create-support-ticket:\n description: Create a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string } # low, medium, high, urgent\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools:\n - get-user-account\n - create-support-ticket # LLM can decide when to call these\n agentic: true\n```\n\n## Tool Invocation Modes\n\n### LLM-Decided (Agentic)\n\nThe LLM decides when to call tools based on the conversation:\n\n```yaml\nagent:\n tools: [get-user-account, create-support-ticket]\n agentic: true # Allow multiple tool calls\n maxSteps: 10 # Max tool call cycles\n```\n\n### Deterministic (Block-Based)\n\nForce tool calls at specific points in the handler:\n\n```yaml\nhandlers:\n request-human:\n # Always create a ticket when escalating\n Create support ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # From variable\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n## Tool Results\n\n### In Prompts\n\nTool results are stored in variables. Reference the variable in prompts:\n\n```markdown\n<!-- prompts/ticket-directive.md -->\n\nA support ticket has been created:\n{{TICKET}}\n\nLet the user know their ticket has been created.\n```\n\nWhen the `TICKET` variable contains an object, it\'s automatically serialized as JSON in the prompt:\n\n```\nA support ticket has been created:\n{\n "ticketId": "TKT-123ABC",\n "estimatedResponse": "24 hours"\n}\n\nLet the user know their ticket has been created.\n```\n\n> **Note**: Variables use `{{VARIABLE_NAME}}` syntax with `UPPERCASE_SNAKE_CASE`. Dot notation (like `{{TICKET.ticketId}}`) is not supported. Objects are automatically JSON-serialized.\n\n### In Variables\n\nStore tool results for later use:\n\n```yaml\nhandlers:\n request-human:\n Get account:\n block: tool-call\n tool: get-user-account\n input:\n userId: USER_ID\n output: ACCOUNT # Result stored here\n\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n```\n\n## Implementing Tools\n\nTools are implemented in your backend:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n \'get-user-account\': async (args) => {\n const userId = args.userId as string;\n const user = await db.users.findById(userId);\n\n return {\n name: user.name,\n email: user.email,\n plan: user.subscription.plan,\n createdAt: user.createdAt.toISOString(),\n };\n },\n\n \'create-support-ticket\': async (args) => {\n const ticket = await ticketService.create({\n summary: args.summary as string,\n priority: args.priority as string,\n });\n\n return {\n ticketId: ticket.id,\n estimatedResponse: getEstimatedTime(args.priority),\n };\n },\n },\n});\n```\n\n## Tool Best Practices\n\n### 1. Clear Descriptions\n\n```yaml\ntools:\n # Good - clear and specific\n get-user-account:\n description: >\n Retrieves the user\'s account information including name, email,\n subscription plan, and account creation date. Use this when the\n user asks about their account or you need to verify their identity.\n\n # Avoid - vague\n get-data:\n description: Gets some data\n```\n\n### 2. Document Constrained Values\n\n```yaml\ntools:\n create-support-ticket:\n parameters:\n priority:\n type: string\n description: Ticket priority level (low, medium, high, urgent)\n```\n',
1397
+ content: '\n# Tools\n\nTools extend what agents can do. Octavus supports multiple types:\n\n1. **External Tools** - Defined in the protocol, implemented in your backend (this page)\n2. **MCP Tools** - Auto-discovered from MCP servers (see [MCP Servers](/docs/protocol/mcp-servers))\n3. **Built-in Tools** - Provider-agnostic tools managed by Octavus (web search, image generation)\n4. **Provider Tools** - Provider-specific tools executed by the provider (e.g., Anthropic\'s code execution)\n5. **Skills** - Code execution and knowledge packages (see [Skills](/docs/protocol/skills))\n\nThis page covers external tools. For MCP-based tools from services like Figma, Sentry, or device capabilities like browser and filesystem, see [MCP Servers](/docs/protocol/mcp-servers). Built-in tools are enabled via agent config - see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).\n\n## External Tools\n\nExternal tools are defined in the `tools:` section and implemented in your backend.\n\n## Defining Tools\n\n```yaml\ntools:\n get-user-account:\n description: Looking up your account information\n display: description\n parameters:\n userId:\n type: string\n description: The user ID to look up\n```\n\n### Tool Fields\n\n| Field | Required | Description |\n| ------------- | -------- | -------------------------------------------------------------------------- |\n| `description` | Yes | What the tool does (shown to LLM and optionally user) |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream`, `title` |\n| `title` | No | UI label shown when `display: title` (hides the description and arguments) |\n| `parameters` | No | Input parameters the tool accepts |\n\n### Display Modes\n\nControls what the client sees about tool execution. The default is `description`.\n\n| Mode | Behavior |\n| ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | No UI events emitted. The tool executes silently and the user has no awareness it was called. Use for internal plumbing tools (title setting, context management). |\n| `name` | Shows the raw tool name while executing. Arguments and result are not displayed. |\n| `description` | Shows the tool\'s description while executing (default). Arguments are visible during live streaming but the result is not preserved after page refresh. |\n| `stream` | Full visibility. Arguments stream progressively as the LLM generates them, and the result is shown after execution. The result is preserved after page refresh. |\n| `title` | Shows the tool\'s `title` plus the tool name only. The description, arguments, and result are hidden in the UI; the `description` is still sent to the LLM. Use for server-side tools that should appear as a labeled step without exposing their inputs/outputs. |\n\n**When to use `stream`:**\n\n- Client tools where the user benefits from seeing arguments or results\n- Interactive client tools (user provides input via the tool card)\n- Tools whose result is rendered via `renderToolCallResult`\n- Any tool where transparency into what was sent/received matters\n\n**When to use `hidden`:**\n\n- Internal lifecycle tools (e.g., session title setting)\n- Context-setting tools that would clutter the UI\n- Tools that are implementation details of the agent\'s protocol\n\n**When to use `title`:**\n\n- Server-side tools that should appear in the UI as a clean, labeled step (e.g. "Looking up your account") without exposing their arguments or result\n- Tools whose `description` is written for the LLM and shouldn\'t be shown verbatim to the user\n- This is the recommended mode for server-executed tools that should be visible: give them a human-readable `title` for the UI and keep `description` focused on instructing the model\n\n`title` is the preferred choice for server-side tools that should surface in the UI. `name` and `description` remain supported for backward compatibility, but for new server-side tools prefer `title` (a clean UI label) - or `stream` for client tools where the user benefits from seeing the arguments and result.\n\n**Refresh and restore behavior:**\n\n`stream` is the only mode that preserves the tool result after a page refresh. For all other modes, the result is available during the live session but stripped on refresh. On session restore (when the session expires and is rebuilt from stored `UIMessage[]`), `stream` tools retain their original result while other modes receive a placeholder.\n\n## Parameters\n\nTool calls are always objects where each parameter name maps to a value. The LLM generates: `{ param1: value1, param2: value2, ... }`\n\n### Parameter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | -------------------------------------------------------------------------------- |\n| `type` | Yes | Data type: `string`, `number`, `integer`, `boolean`, `unknown`, or a custom type |\n| `description` | No | Describes what this parameter is for |\n| `optional` | No | If true, parameter is not required (default: false) |\n\n> **Tip**: You can use [custom types](/docs/protocol/types) for complex parameters like `type: ProductFilter` or `type: SearchOptions`.\n\n### Array Parameters\n\nFor array parameters, define a [top-level array type](/docs/protocol/types#top-level-array-types) and use it:\n\n```yaml\ntypes:\n CartItem:\n productId:\n type: string\n quantity:\n type: integer\n\n CartItemList:\n type: array\n items:\n type: CartItem\n\ntools:\n add-to-cart:\n description: Add items to cart\n parameters:\n items:\n type: CartItemList\n description: Items to add\n```\n\nThe tool receives: `{ items: [{ productId: "...", quantity: 1 }, ...] }`\n\n### Optional Parameters\n\nParameters are **required by default**. Use `optional: true` to make a parameter optional:\n\n```yaml\ntools:\n search-products:\n description: Search the product catalog\n parameters:\n query:\n type: string\n description: Search query\n\n category:\n type: string\n description: Filter by category\n optional: true\n\n maxPrice:\n type: number\n description: Maximum price filter\n optional: true\n\n inStock:\n type: boolean\n description: Only show in-stock items\n optional: true\n```\n\n## Making Tools Available\n\nTools defined in `tools:` are available. To make them usable by the LLM, add them to `agent.tools`:\n\n```yaml\ntools:\n get-user-account:\n description: Look up user account\n parameters:\n userId: { type: string }\n\n create-support-ticket:\n description: Create a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string } # low, medium, high, urgent\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools:\n - get-user-account\n - create-support-ticket # LLM can decide when to call these\n agentic: true\n```\n\n## Tool Invocation Modes\n\n### LLM-Decided (Agentic)\n\nThe LLM decides when to call tools based on the conversation:\n\n```yaml\nagent:\n tools: [get-user-account, create-support-ticket]\n agentic: true # Allow multiple tool calls\n maxSteps: 10 # Max tool call cycles\n```\n\n### Deterministic (Block-Based)\n\nForce tool calls at specific points in the handler:\n\n```yaml\nhandlers:\n request-human:\n # Always create a ticket when escalating\n Create support ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # From variable\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n## Tool Results\n\n### In Prompts\n\nTool results are stored in variables. Reference the variable in prompts:\n\n```markdown\n<!-- prompts/ticket-directive.md -->\n\nA support ticket has been created:\n{{TICKET}}\n\nLet the user know their ticket has been created.\n```\n\nWhen the `TICKET` variable contains an object, it\'s automatically serialized as JSON in the prompt:\n\n```\nA support ticket has been created:\n{\n "ticketId": "TKT-123ABC",\n "estimatedResponse": "24 hours"\n}\n\nLet the user know their ticket has been created.\n```\n\n> **Note**: Variables use `{{VARIABLE_NAME}}` syntax with `UPPERCASE_SNAKE_CASE`. Dot notation (like `{{TICKET.ticketId}}`) is not supported. Objects are automatically JSON-serialized.\n\n### In Variables\n\nStore tool results for later use:\n\n```yaml\nhandlers:\n request-human:\n Get account:\n block: tool-call\n tool: get-user-account\n input:\n userId: USER_ID\n output: ACCOUNT # Result stored here\n\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n```\n\n## Implementing Tools\n\nTools are implemented in your backend:\n\n```typescript\nconst session = client.agentSessions.attach(sessionId, {\n tools: {\n \'get-user-account\': async (args) => {\n const userId = args.userId as string;\n const user = await db.users.findById(userId);\n\n return {\n name: user.name,\n email: user.email,\n plan: user.subscription.plan,\n createdAt: user.createdAt.toISOString(),\n };\n },\n\n \'create-support-ticket\': async (args) => {\n const ticket = await ticketService.create({\n summary: args.summary as string,\n priority: args.priority as string,\n });\n\n return {\n ticketId: ticket.id,\n estimatedResponse: getEstimatedTime(args.priority),\n };\n },\n },\n});\n```\n\n## Tool Best Practices\n\n### 1. Clear Descriptions\n\n```yaml\ntools:\n # Good - clear and specific\n get-user-account:\n description: >\n Retrieves the user\'s account information including name, email,\n subscription plan, and account creation date. Use this when the\n user asks about their account or you need to verify their identity.\n\n # Avoid - vague\n get-data:\n description: Gets some data\n```\n\n### 2. Document Constrained Values\n\n```yaml\ntools:\n create-support-ticket:\n parameters:\n priority:\n type: string\n description: Ticket priority level (low, medium, high, urgent)\n```\n',
1371
1398
  excerpt: "Tools Tools extend what agents can do. Octavus supports multiple types: 1. External Tools - Defined in the protocol, implemented in your backend (this page) 2. MCP Tools - Auto-discovered from MCP...",
1372
1399
  order: 4
1373
1400
  },
@@ -1376,7 +1403,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1376
1403
  section: "protocol",
1377
1404
  title: "Skills",
1378
1405
  description: "Using Octavus skills for code execution and specialized capabilities.",
1379
- content: "\n# Skills\n\nSkills are knowledge packages that enable agents to execute code and generate files. Unlike external tools (which you implement in your backend), skills are self-contained packages with documentation and scripts. By default, skills run in isolated sandbox environments, but they can also run directly on the agent's computer.\n\n## Overview\n\nOctavus Skills provide **provider-agnostic** code execution. They work with any LLM provider (Anthropic, OpenAI, Google) by using explicit tool calls and system prompt injection.\n\n### How Skills Work\n\n1. **Skill Definition**: Skills are defined in the protocol's `skills:` section\n2. **Skill Resolution**: Skills are resolved from available sources (see below)\n3. **Execution**: Code runs in an isolated sandbox (default) or on the agent's computer\n4. **File Generation**: Files saved to `/output/` are automatically captured and made available for download (sandbox skills)\n\n### Skill Sources\n\nSkills come from two sources, visible in the Skills tab of your organization:\n\n| Source | Badge in UI | Visibility | Example |\n| ----------- | ----------- | ------------------------------ | ------------------ |\n| **Octavus** | `Octavus` | Available to all organizations | `qr-code` |\n| **Custom** | None | Private to your organization | `my-company-skill` |\n\nWhen you reference a skill in your protocol, Octavus resolves it from your available skills. If you create a custom skill with the same name as an Octavus skill, your custom skill takes precedence.\n\n## Defining Skills\n\nDefine skills in the protocol's `skills:` section:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data and generating reports\n```\n\n### Skill Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------------------------------- |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream` (default: `description`) |\n| `description` | No | Custom description shown to users (overrides skill's built-in description) |\n| `execution` | No | Where the skill runs: `sandbox` (default) or `device` |\n\n### Display Modes\n\nThe `display` setting on a skill applies to all tools under that skill namespace. See [Tool Display Modes](/docs/protocol/tools#display-modes) for full details on each mode.\n\n| Mode | Behavior |\n| ------------- | -------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Skill tools run silently, no UI events emitted |\n| `name` | Shows skill name while executing |\n| `description` | Shows description while executing (default). Result not preserved after page refresh. |\n| `stream` | Full visibility - arguments stream progressively, result shown after execution, result preserved after page refresh. |\n\n## Enabling Skills\n\nAfter defining skills in the `skills:` section, specify which skills are available. Skills work in both interactive agents and workers.\n\n### Interactive Agents\n\nReference skills in `agent.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [get-user-account]\n skills: [qr-code]\n agentic: true\n```\n\n### Workers and Named Threads\n\nReference skills per-thread in `start-thread.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n maxSteps: 10\n```\n\nThis also works for named threads in interactive agents, allowing different threads to have different skills.\n\n## Skill Tools\n\nWhen skills are enabled, the LLM has access to these tools:\n\n| Tool | Purpose | Availability |\n| --------------------- | ----------------------------------------------- | ------------------------------ |\n| `octavus_skill_read` | Read skill documentation (SKILL.md) | All skills |\n| `octavus_skill_list` | List available scripts in a skill | All skills |\n| `octavus_skill_run` | Execute a pre-built script from a skill | All skills |\n| `octavus_skill_setup` | Install a skill on the device for file browsing | Device skills only |\n| `octavus_code_run` | Execute arbitrary Python/Bash code | Sandbox skills (standard) only |\n| `octavus_file_write` | Create files in the sandbox | Sandbox skills (standard) only |\n| `octavus_file_read` | Read files from the sandbox | Sandbox skills (standard) only |\n\nThe LLM learns about available skills through system prompt injection and can use these tools to interact with skills.\n\nSkills that have [secrets](#skill-secrets) configured run in **secure mode**, where only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available. See [Skill Secrets](#skill-secrets) below.\n\n## Device Execution\n\nBy default, skills run in an isolated sandbox. When `execution: device` is set, the skill runs on the agent's computer instead.\n\n```yaml\nskills:\n deploy-tool:\n display: description\n description: Deploy applications to production\n execution: device\n qr-code:\n display: description\n description: Generating QR codes\n # execution defaults to sandbox\n```\n\n### How Device Skills Work\n\nDevice skills are installed on the agent's computer so the agent can browse their files and run their scripts directly. After attaching a skill via integrations, the agent uses `octavus_skill_setup` to install it on the device. Once installed, the agent can:\n\n- Read the skill's documentation with `octavus_skill_read`\n- List available scripts with `octavus_skill_list`\n- Run pre-built scripts with `octavus_skill_run`\n\nThe generic workspace tools (`octavus_code_run`, `octavus_file_write`, `octavus_file_read`) are **not available** for device skills. Instead, the agent uses the device's own shell and filesystem MCP servers to interact with files and run commands.\n\n### Sandbox vs Device Skills\n\n| Aspect | Sandbox (default) | Device |\n| ------------------- | ---------------------------------- | ------------------------------------------------------ |\n| **Environment** | Isolated sandbox | The agent's computer |\n| **Available tools** | All 6 skill tools | `skill_read`, `skill_list`, `skill_run`, `skill_setup` |\n| **File access** | Via `octavus_file_read/write` | Via device filesystem MCP |\n| **Code execution** | Via `octavus_code_run` | Via device shell MCP |\n| **Isolation** | Fully sandboxed | Runs alongside other device processes |\n| **File output** | `/output/` directory auto-captured | Files written to device filesystem |\n\n### When to Use Device Execution\n\nUse `execution: device` when the skill needs to:\n\n- Access the agent's local filesystem or running processes\n- Use tools or CLIs installed on the device\n- Interact with services running on the device\n- Persist files beyond a single execution cycle\n\n## Example: QR Code Generation\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n agentic: true\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Respond:\n block: next-message\n```\n\nWhen a user asks \"Create a QR code for octavus.ai\", the LLM will:\n\n1. Recognize the task matches the `qr-code` skill\n2. Call `octavus_skill_read` to learn how to use the skill\n3. Execute code (via `octavus_code_run` or `octavus_skill_run`) to generate the QR code\n4. Save the image to `/output/` in the sandbox\n5. The file is automatically captured and made available for download\n\n## File Output\n\nFiles saved to `/output/` in the sandbox are automatically:\n\n1. **Captured** after code execution\n2. **Uploaded** to S3 storage\n3. **Made available** via presigned URLs\n4. **Included** in the message as file parts\n\nFiles persist across page refreshes and are stored in the session's message history.\n\n## Skill Format\n\nSkills follow the [Agent Skills](https://agentskills.io) open standard:\n\n- `SKILL.md` - Required skill documentation with YAML frontmatter\n- `scripts/` - Optional executable code (Python/Bash)\n- `references/` - Optional documentation loaded as needed\n- `assets/` - Optional files used in outputs (templates, images)\n\n### SKILL.md Format\n\n````yaml\n---\nname: qr-code\ndescription: >\n Generate QR codes from text, URLs, or data. Use when the user needs to create\n a QR code for any purpose - sharing links, contact information, WiFi credentials,\n or any text data that should be scannable.\nversion: 1.0.0\nlicense: MIT\nauthor: Octavus Team\ncategory: Productivity\n---\n\n# QR Code Generator\n\n## Overview\n\nThis skill creates QR codes from text data using Python...\n\n## Quick Start\n\nGenerate a QR code with Python:\n\n```python\nimport qrcode\nimport os\n\noutput_dir = os.environ.get('OUTPUT_DIR', '/output')\n# ... code to generate QR code ...\n````\n\n## Scripts Reference\n\n### scripts/generate.py\n\nMain script for generating QR codes...\n\n````\n\n### Frontmatter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------ |\n| `name` | Yes | Skill slug (lowercase, hyphens) |\n| `description` | Yes | What the skill does (shown to the LLM) |\n| `version` | No | Semantic version string |\n| `license` | No | License identifier |\n| `author` | No | Skill author |\n| `category` | No | Display category used to group and filter skills in the UI |\n| `secrets` | No | Array of secret declarations (enables secure mode) |\n\n## Best Practices\n\n### 1. Clear Descriptions\n\nProvide clear, purpose-driven descriptions:\n\n```yaml\nskills:\n # Good - clear purpose\n qr-code:\n description: Generating QR codes for URLs, contact info, or any text data\n\n # Avoid - vague\n utility:\n description: Does stuff\n````\n\n### 2. When to Use Skills vs Tools\n\n| Use Skills When | Use Tools When |\n| ------------------------ | ---------------------------- |\n| Code execution needed | Simple API calls |\n| File generation | Database queries |\n| Complex calculations | External service integration |\n| Data processing | Authentication required |\n| Provider-agnostic needed | Backend-specific logic |\n\n### 3. Skill Selection\n\nDefine all skills available to this agent in the `skills:` section. Then specify which skills are available for the chat thread in `agent.skills`:\n\n```yaml\n# All skills available to this agent (defined once at protocol level)\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data\n pdf-processor:\n display: description\n description: Processing PDFs\n\n# Skills available for this chat thread\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code, data-analysis] # Skills available for this thread\n```\n\n### 4. Display Modes\n\nChoose appropriate display modes based on user experience:\n\n```yaml\nskills:\n # Background processing - hide from user\n data-analysis:\n display: hidden\n\n # User-facing generation - show description\n qr-code:\n display: description\n\n # Interactive progress - stream updates\n report-generation:\n display: stream\n```\n\n## Comparison: Skills vs Tools vs Provider Options\n\n| Feature | Octavus Skills | External Tools | Provider Tools/Skills |\n| ------------------ | --------------------------- | ------------------- | --------------------- |\n| **Execution** | Sandbox or agent's computer | Your backend | Provider servers |\n| **Provider** | Any (agnostic) | N/A | Provider-specific |\n| **Code Execution** | Yes | No | Yes (provider tools) |\n| **File Output** | Yes | No | Yes (provider skills) |\n| **Implementation** | Skill packages | Your code | Built-in |\n| **Cost** | Sandbox + LLM API | Your infrastructure | Included in API |\n\n## Uploading Custom Skills\n\nYou can upload custom skills to your organization using the CLI or the platform UI.\n\n### Via CLI (Recommended)\n\nUse [`octavus skills sync`](/docs/server-sdk/cli#octavus-skills-sync-path) to package and upload a skill directory. If the skill has a `.env` file, secrets are pushed alongside the bundle:\n\n```bash\noctavus skills sync ./skills/my-skill\n```\n\n### Skill Directory Structure\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md # Required: Skill documentation with frontmatter\n\u251C\u2500\u2500 scripts/ # Optional: Executable scripts\n\u2502 \u251C\u2500\u2500 run.py\n\u2502 \u2514\u2500\u2500 requirements.txt\n\u251C\u2500\u2500 references/ # Optional: Additional documentation\n\u251C\u2500\u2500 assets/ # Optional: Templates, images\n\u2514\u2500\u2500 .env # Optional: Secrets (not included in bundle)\n```\n\nOnce uploaded, reference the skill by slug in your protocol:\n\n```yaml\nskills:\n my-skill:\n display: description\n description: Custom analysis tool\n\nagent:\n skills: [my-skill]\n```\n\n## On-Demand Skills\n\nOn-demand skills (`onDemandSkills`) also support the `execution` field:\n\n```yaml\nonDemandSkills:\n display: description\n execution: device\n```\n\nWhen `execution: device` is set on the on-demand skills declaration, any skill attached at runtime via integrations runs on the agent's computer instead of in a sandbox.\n\n## Sandbox Timeout\n\nThe default sandbox timeout is 5 minutes (applies to sandbox skills only). You can configure a custom timeout using `sandboxTimeout` in the agent config or on individual `start-thread` blocks:\n\n```yaml\n# Agent-level timeout (applies to main thread)\nagent:\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 1800000 # 30 minutes (in milliseconds)\n```\n\n```yaml\n# Thread-level timeout (overrides agent-level for this thread)\nsteps:\n Start thread:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 3600000 # 1 hour\n```\n\nThread-level `sandboxTimeout` takes priority over agent-level. Maximum: 1 hour (3,600,000 ms).\n\n## Skill Secrets\n\nSkills can declare secrets they need to function. When an organization configures those secrets, the skill runs in **secure mode** with additional isolation.\n\n### Declaring Secrets\n\nAdd a `secrets` array to your SKILL.md frontmatter:\n\n```yaml\n---\nname: github\ndescription: >\n Run GitHub CLI (gh) commands to manage repos, issues, PRs, and more.\nsecrets:\n - name: GITHUB_TOKEN\n description: GitHub personal access token with repo access\n required: true\n - name: GITHUB_ORG\n description: Default GitHub organization\n required: false\n---\n```\n\nEach secret declaration has:\n\n| Field | Required | Description |\n| ------------- | -------- | ----------------------------------------------------------- |\n| `name` | Yes | Environment variable name (uppercase, e.g., `GITHUB_TOKEN`) |\n| `description` | No | Explains what this secret is for (shown in the UI) |\n| `required` | No | Whether the secret is required (defaults to `true`) |\n\nSecret names must match the pattern `^[A-Z_][A-Z0-9_]*$` (uppercase letters, digits, and underscores).\n\n### Configuring Secrets\n\nOrganization admins configure secret values through the skill editor in the platform UI. Each organization maintains its own independent set of secrets for each skill.\n\nSecrets are encrypted at rest and only decrypted at execution time.\n\n### Secure Mode\n\nWhen a skill has secrets configured for the organization, it automatically runs in **secure mode**:\n\n- The skill gets its own **isolated sandbox** (separate from other skills)\n- Secrets are injected as **environment variables** available to all scripts\n- Only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available - `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked\n- Scripts receive input as **JSON via stdin** (using the `input` parameter on `octavus_skill_run`) instead of CLI args\n- All output (stdout/stderr) is **automatically redacted** for secret values before being returned to the LLM\n\n### Writing Scripts for Secure Skills\n\nScripts in secure skills read input from stdin as JSON and access secrets from environment variables:\n\n```python\nimport json\nimport os\nimport sys\n\ninput_data = json.load(sys.stdin)\ntoken = os.environ.get('GITHUB_TOKEN')\n\n# Use the token and input_data to perform the task\n```\n\nFor standard skills (without secrets), scripts receive input as CLI arguments. For secure skills, always use stdin JSON.\n\n## Security\n\nSandbox skills run in isolated environments:\n\n- **No network access** (unless explicitly configured)\n- **No persistent storage** (sandbox destroyed after each `next-message` execution)\n- **File output only** via `/output/` directory\n- **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)\n- **Secret redaction** - output from secure skills is automatically scanned for secret values\n\nDevice skills run on the agent's computer and share its environment. They do not have sandbox isolation but benefit from restricted tool access (only slug-bearing tools are available).\n\n## Next Steps\n\n- [Agent Config](/docs/protocol/agent-config) - Configuring skills in agent settings\n- [Provider Options](/docs/protocol/provider-options) - Anthropic's built-in skills\n- [Skills Advanced Guide](/docs/protocol/skills-advanced) - Best practices and advanced patterns\n",
1406
+ content: "\n# Skills\n\nSkills are knowledge packages that enable agents to execute code and generate files. Unlike external tools (which you implement in your backend), skills are self-contained packages with documentation and scripts. By default, skills run in isolated sandbox environments, but they can also run directly on the agent's computer.\n\n## Overview\n\nOctavus Skills provide **provider-agnostic** code execution. They work with any LLM provider (Anthropic, OpenAI, Google) by using explicit tool calls and system prompt injection.\n\n### How Skills Work\n\n1. **Skill Definition**: Skills are defined in the protocol's `skills:` section\n2. **Skill Resolution**: Skills are resolved from available sources (see below)\n3. **Execution**: Code runs in an isolated sandbox (default) or on the agent's computer\n4. **File Generation**: Files saved to `/output/` are automatically captured and made available for download (sandbox skills)\n\n### Skill Sources\n\nSkills come from two sources, visible in the Skills tab of your organization:\n\n| Source | Badge in UI | Visibility | Example |\n| ----------- | ----------- | ------------------------------ | ------------------ |\n| **Octavus** | `Octavus` | Available to all organizations | `qr-code` |\n| **Custom** | None | Private to your organization | `my-company-skill` |\n\nWhen you reference a skill in your protocol, Octavus resolves it from your available skills. If you create a custom skill with the same name as an Octavus skill, your custom skill takes precedence.\n\n## Defining Skills\n\nDefine skills in the protocol's `skills:` section:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data and generating reports\n```\n\n### Skill Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------------------------------------- |\n| `display` | No | How to show in UI: `hidden`, `name`, `description`, `stream`, `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title` (hides the description and arguments) |\n| `description` | No | Custom description shown to users (overrides skill's built-in description) |\n| `execution` | No | Where the skill runs: `sandbox` (default) or `device` |\n\n### Display Modes\n\nThe `display` setting on a skill applies to all tools under that skill namespace. See [Tool Display Modes](/docs/protocol/tools#display-modes) for full details on each mode.\n\n| Mode | Behavior |\n| ------------- | -------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Skill tools run silently, no UI events emitted |\n| `name` | Shows skill name while executing |\n| `description` | Shows description while executing (default). Result not preserved after page refresh. |\n| `stream` | Full visibility - arguments stream progressively, result shown after execution, result preserved after page refresh. |\n\n## Enabling Skills\n\nAfter defining skills in the `skills:` section, specify which skills are available. Skills work in both interactive agents and workers.\n\n### Interactive Agents\n\nReference skills in `agent.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [get-user-account]\n skills: [qr-code]\n agentic: true\n```\n\n### Workers and Named Threads\n\nReference skills per-thread in `start-thread.skills`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n maxSteps: 10\n```\n\nThis also works for named threads in interactive agents, allowing different threads to have different skills.\n\n## Skill Tools\n\nWhen skills are enabled, the LLM has access to these tools:\n\n| Tool | Purpose | Availability |\n| --------------------- | ----------------------------------------------- | ------------------------------ |\n| `octavus_skill_read` | Read skill documentation (SKILL.md) | All skills |\n| `octavus_skill_list` | List available scripts in a skill | All skills |\n| `octavus_skill_run` | Execute a pre-built script from a skill | All skills |\n| `octavus_skill_setup` | Install a skill on the device for file browsing | Device skills only |\n| `octavus_code_run` | Execute arbitrary Python/Bash code | Sandbox skills (standard) only |\n| `octavus_file_write` | Create files in the sandbox | Sandbox skills (standard) only |\n| `octavus_file_read` | Read files from the sandbox | Sandbox skills (standard) only |\n\nThe LLM learns about available skills through system prompt injection and can use these tools to interact with skills.\n\nSkills that have [secrets](#skill-secrets) configured run in **secure mode**, where only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available. See [Skill Secrets](#skill-secrets) below.\n\n## Device Execution\n\nBy default, skills run in an isolated sandbox. When `execution: device` is set, the skill runs on the agent's computer instead.\n\n```yaml\nskills:\n deploy-tool:\n display: description\n description: Deploy applications to production\n execution: device\n qr-code:\n display: description\n description: Generating QR codes\n # execution defaults to sandbox\n```\n\n### How Device Skills Work\n\nDevice skills are installed on the agent's computer so the agent can browse their files and run their scripts directly. After attaching a skill via integrations, the agent uses `octavus_skill_setup` to install it on the device. Once installed, the agent can:\n\n- Read the skill's documentation with `octavus_skill_read`\n- List available scripts with `octavus_skill_list`\n- Run pre-built scripts with `octavus_skill_run`\n\nThe generic workspace tools (`octavus_code_run`, `octavus_file_write`, `octavus_file_read`) are **not available** for device skills. Instead, the agent uses the device's own shell and filesystem MCP servers to interact with files and run commands.\n\n### Sandbox vs Device Skills\n\n| Aspect | Sandbox (default) | Device |\n| ------------------- | ---------------------------------- | ------------------------------------------------------ |\n| **Environment** | Isolated sandbox | The agent's computer |\n| **Available tools** | All 6 skill tools | `skill_read`, `skill_list`, `skill_run`, `skill_setup` |\n| **File access** | Via `octavus_file_read/write` | Via device filesystem MCP |\n| **Code execution** | Via `octavus_code_run` | Via device shell MCP |\n| **Isolation** | Fully sandboxed | Runs alongside other device processes |\n| **File output** | `/output/` directory auto-captured | Files written to device filesystem |\n\n### When to Use Device Execution\n\nUse `execution: device` when the skill needs to:\n\n- Access the agent's local filesystem or running processes\n- Use tools or CLIs installed on the device\n- Interact with services running on the device\n- Persist files beyond a single execution cycle\n\n## Example: QR Code Generation\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n agentic: true\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Respond:\n block: next-message\n```\n\nWhen a user asks \"Create a QR code for octavus.ai\", the LLM will:\n\n1. Recognize the task matches the `qr-code` skill\n2. Call `octavus_skill_read` to learn how to use the skill\n3. Execute code (via `octavus_code_run` or `octavus_skill_run`) to generate the QR code\n4. Save the image to `/output/` in the sandbox\n5. The file is automatically captured and made available for download\n\n## File Output\n\nFiles saved to `/output/` in the sandbox are automatically:\n\n1. **Captured** after code execution\n2. **Uploaded** to S3 storage\n3. **Made available** via presigned URLs\n4. **Included** in the message as file parts\n\nFiles persist across page refreshes and are stored in the session's message history.\n\n## Skill Format\n\nSkills follow the [Agent Skills](https://agentskills.io) open standard:\n\n- `SKILL.md` - Required skill documentation with YAML frontmatter\n- `scripts/` - Optional executable code (Python/Bash)\n- `references/` - Optional documentation loaded as needed\n- `assets/` - Optional files used in outputs (templates, images)\n\n### SKILL.md Format\n\n````yaml\n---\nname: qr-code\ndescription: >\n Generate QR codes from text, URLs, or data. Use when the user needs to create\n a QR code for any purpose - sharing links, contact information, WiFi credentials,\n or any text data that should be scannable.\nversion: 1.0.0\nlicense: MIT\nauthor: Octavus Team\ncategory: Productivity\n---\n\n# QR Code Generator\n\n## Overview\n\nThis skill creates QR codes from text data using Python...\n\n## Quick Start\n\nGenerate a QR code with Python:\n\n```python\nimport qrcode\nimport os\n\noutput_dir = os.environ.get('OUTPUT_DIR', '/output')\n# ... code to generate QR code ...\n````\n\n## Scripts Reference\n\n### scripts/generate.py\n\nMain script for generating QR codes...\n\n````\n\n### Frontmatter Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------ |\n| `name` | Yes | Skill slug (lowercase, hyphens) |\n| `description` | Yes | What the skill does (shown to the LLM) |\n| `version` | No | Semantic version string |\n| `license` | No | License identifier |\n| `author` | No | Skill author |\n| `category` | No | Display category used to group and filter skills in the UI |\n| `secrets` | No | Array of secret declarations (enables secure mode) |\n\n## Best Practices\n\n### 1. Clear Descriptions\n\nProvide clear, purpose-driven descriptions:\n\n```yaml\nskills:\n # Good - clear purpose\n qr-code:\n description: Generating QR codes for URLs, contact info, or any text data\n\n # Avoid - vague\n utility:\n description: Does stuff\n````\n\n### 2. When to Use Skills vs Tools\n\n| Use Skills When | Use Tools When |\n| ------------------------ | ---------------------------- |\n| Code execution needed | Simple API calls |\n| File generation | Database queries |\n| Complex calculations | External service integration |\n| Data processing | Authentication required |\n| Provider-agnostic needed | Backend-specific logic |\n\n### 3. Skill Selection\n\nDefine all skills available to this agent in the `skills:` section. Then specify which skills are available for the chat thread in `agent.skills`:\n\n```yaml\n# All skills available to this agent (defined once at protocol level)\nskills:\n qr-code:\n display: description\n description: Generating QR codes\n data-analysis:\n display: description\n description: Analyzing data\n pdf-processor:\n display: description\n description: Processing PDFs\n\n# Skills available for this chat thread\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code, data-analysis] # Skills available for this thread\n```\n\n### 4. Display Modes\n\nChoose appropriate display modes based on user experience:\n\n```yaml\nskills:\n # Background processing - hide from user\n data-analysis:\n display: hidden\n\n # User-facing generation - show description\n qr-code:\n display: description\n\n # Interactive progress - stream updates\n report-generation:\n display: stream\n```\n\n## Comparison: Skills vs Tools vs Provider Options\n\n| Feature | Octavus Skills | External Tools | Provider Tools/Skills |\n| ------------------ | --------------------------- | ------------------- | --------------------- |\n| **Execution** | Sandbox or agent's computer | Your backend | Provider servers |\n| **Provider** | Any (agnostic) | N/A | Provider-specific |\n| **Code Execution** | Yes | No | Yes (provider tools) |\n| **File Output** | Yes | No | Yes (provider skills) |\n| **Implementation** | Skill packages | Your code | Built-in |\n| **Cost** | Sandbox + LLM API | Your infrastructure | Included in API |\n\n## Uploading Custom Skills\n\nYou can upload custom skills to your organization using the CLI or the platform UI.\n\n### Via CLI (Recommended)\n\nUse [`octavus skills sync`](/docs/server-sdk/cli#octavus-skills-sync-path) to package and upload a skill directory. If the skill has a `.env` file, secrets are pushed alongside the bundle:\n\n```bash\noctavus skills sync ./skills/my-skill\n```\n\n### Skill Directory Structure\n\n```\nmy-skill/\n\u251C\u2500\u2500 SKILL.md # Required: Skill documentation with frontmatter\n\u251C\u2500\u2500 scripts/ # Optional: Executable scripts\n\u2502 \u251C\u2500\u2500 run.py\n\u2502 \u2514\u2500\u2500 requirements.txt\n\u251C\u2500\u2500 references/ # Optional: Additional documentation\n\u251C\u2500\u2500 assets/ # Optional: Templates, images\n\u2514\u2500\u2500 .env # Optional: Secrets (not included in bundle)\n```\n\nOnce uploaded, reference the skill by slug in your protocol:\n\n```yaml\nskills:\n my-skill:\n display: description\n description: Custom analysis tool\n\nagent:\n skills: [my-skill]\n```\n\n## On-Demand Skills\n\nOn-demand skills (`onDemandSkills`) also support the `execution` field:\n\n```yaml\nonDemandSkills:\n display: description\n execution: device\n```\n\nWhen `execution: device` is set on the on-demand skills declaration, any skill attached at runtime via integrations runs on the agent's computer instead of in a sandbox.\n\n## Sandbox Timeout\n\nThe default sandbox timeout is 5 minutes (applies to sandbox skills only). You can configure a custom timeout using `sandboxTimeout` in the agent config or on individual `start-thread` blocks:\n\n```yaml\n# Agent-level timeout (applies to main thread)\nagent:\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 1800000 # 30 minutes (in milliseconds)\n```\n\n```yaml\n# Thread-level timeout (overrides agent-level for this thread)\nsteps:\n Start thread:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n skills: [data-analysis]\n sandboxTimeout: 3600000 # 1 hour\n```\n\nThread-level `sandboxTimeout` takes priority over agent-level. Maximum: 1 hour (3,600,000 ms).\n\n## Skill Secrets\n\nSkills can declare secrets they need to function. When an organization configures those secrets, the skill runs in **secure mode** with additional isolation.\n\n### Declaring Secrets\n\nAdd a `secrets` array to your SKILL.md frontmatter:\n\n```yaml\n---\nname: github\ndescription: >\n Run GitHub CLI (gh) commands to manage repos, issues, PRs, and more.\nsecrets:\n - name: GITHUB_TOKEN\n description: GitHub personal access token with repo access\n required: true\n - name: GITHUB_ORG\n description: Default GitHub organization\n required: false\n---\n```\n\nEach secret declaration has:\n\n| Field | Required | Description |\n| ------------- | -------- | ----------------------------------------------------------- |\n| `name` | Yes | Environment variable name (uppercase, e.g., `GITHUB_TOKEN`) |\n| `description` | No | Explains what this secret is for (shown in the UI) |\n| `required` | No | Whether the secret is required (defaults to `true`) |\n\nSecret names must match the pattern `^[A-Z_][A-Z0-9_]*$` (uppercase letters, digits, and underscores).\n\n### Configuring Secrets\n\nOrganization admins configure secret values through the skill editor in the platform UI. Each organization maintains its own independent set of secrets for each skill.\n\nSecrets are encrypted at rest and only decrypted at execution time.\n\n### Secure Mode\n\nWhen a skill has secrets configured for the organization, it automatically runs in **secure mode**:\n\n- The skill gets its own **isolated sandbox** (separate from other skills)\n- Secrets are injected as **environment variables** available to all scripts\n- Only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available - `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked\n- Scripts receive input as **JSON via stdin** (using the `input` parameter on `octavus_skill_run`) instead of CLI args\n- All output (stdout/stderr) is **automatically redacted** for secret values before being returned to the LLM\n\n### Writing Scripts for Secure Skills\n\nScripts in secure skills read input from stdin as JSON and access secrets from environment variables:\n\n```python\nimport json\nimport os\nimport sys\n\ninput_data = json.load(sys.stdin)\ntoken = os.environ.get('GITHUB_TOKEN')\n\n# Use the token and input_data to perform the task\n```\n\nFor standard skills (without secrets), scripts receive input as CLI arguments. For secure skills, always use stdin JSON.\n\n## Security\n\nSandbox skills run in isolated environments:\n\n- **No network access** (unless explicitly configured)\n- **No persistent storage** (sandbox destroyed after each `next-message` execution)\n- **File output only** via `/output/` directory\n- **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)\n- **Secret redaction** - output from secure skills is automatically scanned for secret values\n\nDevice skills run on the agent's computer and share its environment. They do not have sandbox isolation but benefit from restricted tool access (only slug-bearing tools are available).\n\n## Next Steps\n\n- [Agent Config](/docs/protocol/agent-config) - Configuring skills in agent settings\n- [Provider Options](/docs/protocol/provider-options) - Anthropic's built-in skills\n- [Skills Advanced Guide](/docs/protocol/skills-advanced) - Best practices and advanced patterns\n",
1380
1407
  excerpt: "Skills Skills are knowledge packages that enable agents to execute code and generate files. Unlike external tools (which you implement in your backend), skills are self-contained packages with...",
1381
1408
  order: 5
1382
1409
  },
@@ -1385,7 +1412,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1385
1412
  section: "protocol",
1386
1413
  title: "Handlers",
1387
1414
  description: "Defining execution handlers with blocks.",
1388
- content: "\n# Handlers\n\nHandlers define what happens when a trigger fires. They contain execution blocks that run in sequence.\n\n## Handler Structure\n\n```yaml\nhandlers:\n trigger-name:\n Block Name:\n block: block-kind\n # block-specific properties\n\n Another Block:\n block: another-kind\n # ...\n```\n\nEach block has a human-readable name (shown in debug UI) and a `block` field that determines its behavior.\n\n## Block Kinds\n\n### next-message\n\nGenerate a response from the LLM:\n\n```yaml\nhandlers:\n user-message:\n Respond to user:\n block: next-message\n # Uses main conversation thread by default\n # Display defaults to 'stream'\n```\n\nWith options:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary # Use named thread\n display: stream # Show streaming content\n independent: true # Don't add to main chat\n output: SUMMARY # Store output in variable\n description: Generating summary # Shown in UI\n```\n\nFor structured output (typed JSON response):\n\n```yaml\nRespond with suggestions:\n block: next-message\n responseType: ChatResponse # Type defined in types section\n output: RESPONSE # Stores the parsed object\n```\n\nWhen `responseType` is specified:\n\n- The LLM generates JSON matching the type schema\n- The `output` variable receives the parsed object (not plain text)\n- The client receives a `UIObjectPart` for custom rendering\n\nSee [Types](/docs/protocol/types#structured-output) for more details.\n\n### add-message\n\nAdd a message to the conversation:\n\n```yaml\nAdd user message:\n block: add-message\n role: user # user | assistant | system\n prompt: user-message # Reference to prompt file\n input: [USER_MESSAGE] # Variables to interpolate\n display: hidden # Don't show in UI\n```\n\nFor internal directives (LLM sees it, user doesn't):\n\n```yaml\nAdd internal directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS]\n visible: false # LLM sees this, user doesn't\n```\n\nFor structured user input (object shown in UI, prompt for LLM context):\n\n```yaml\nAdd user message:\n block: add-message\n role: user\n prompt: user-message # Rendered for LLM context (hidden from UI)\n input: [USER_INPUT]\n uiContent: USER_INPUT # Variable shown in UI (object \u2192 object part)\n display: hidden\n```\n\nWhen `uiContent` is set:\n\n- The variable value is shown in the UI (string \u2192 text part, object \u2192 object part)\n- The prompt text is hidden from the UI but kept for LLM context\n- Useful for rich UI interactions where the visual differs from the LLM context\n\n### tool-call\n\nCall a tool deterministically:\n\n```yaml\nCreate ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # Variable reference\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n### set-resource\n\nUpdate a persistent resource:\n\n```yaml\nSave summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY # Variable to save\n display: name # Show block name\n```\n\n### start-thread\n\nCreate a named conversation thread:\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary # Thread name\n model: anthropic/claude-sonnet-4-5 # Optional: different model\n backupModel: openai/gpt-4o # Failover on provider errors\n thinking: low # Extended reasoning level\n cache: auto # auto (default) | extended | off\n maxSteps: 1 # Tool call limit\n system: escalation-summary # System prompt\n input: [COMPANY_NAME] # Variables for prompt\n mcpServers: [figma, browser] # MCP servers for this thread\n skills: [qr-code] # Octavus skills for this thread\n sandboxTimeout: 600000 # Skill sandbox timeout (default: 5 min, max: 1 hour)\n imageModel: google/gemini-2.5-flash-image # Image generation model\n```\n\nThe `cache` field controls prompt caching for this thread and defaults to `auto` when omitted. Threads do not inherit the agent's `cache` value - see [Prompt Caching](/docs/protocol/agent-config#prompt-caching).\n\nThe `model` field can also reference a variable for dynamic model selection. The `backupModel`, `temperature`, `thinking`, and `maxSteps` fields also support variable references - see [Dynamic Configuration](/docs/protocol/agent-config#dynamic-configuration).\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary\n model: SUMMARY_MODEL # Resolved from input variable\n system: escalation-summary\n```\n\n### serialize-thread\n\nConvert conversation to text:\n\n```yaml\nSerialize conversation:\n block: serialize-thread\n thread: main # Which thread (default: main)\n format: markdown # markdown | json\n output: CONVERSATION_TEXT # Variable to store result\n```\n\n### generate-image\n\nGenerate an image from a prompt variable:\n\n```yaml\nGenerate image:\n block: generate-image\n prompt: OPTIMIZED_PROMPT # Variable containing the prompt\n imageModel: google/gemini-2.5-flash-image # Required image model\n size: 1024x1024 # 1024x1024 | 1792x1024 | 1024x1792\n output: GENERATED_IMAGE # Store URL in variable\n description: Generating your image... # Shown in UI\n```\n\nEdit an existing image using reference images:\n\n```yaml\nEdit image:\n block: generate-image\n prompt: EDIT_INSTRUCTIONS # e.g., \"Remove the background\"\n referenceImages: [SOURCE_IMAGE_URL] # Variable(s) containing image URLs\n imageModel: google/gemini-2.5-flash-image\n output: EDITED_IMAGE\n description: Editing image...\n```\n\n| Field | Required | Description |\n| ----------------- | -------- | --------------------------------------------------------------- |\n| `prompt` | Yes | Variable name containing the image prompt or edit instructions |\n| `imageModel` | Yes | Image model identifier (e.g., `google/gemini-2.5-flash-image`) |\n| `size` | No | Image dimensions: `1024x1024`, `1792x1024`, or `1024x1792` |\n| `referenceImages` | No | Variable names containing image URLs for editing/transformation |\n| `output` | No | Variable name to store the generated image URL |\n| `thread` | No | Thread to associate the output file with |\n| `description` | No | Description shown in the UI during generation |\n\nThis block is for deterministic image generation pipelines where the prompt is constructed programmatically (e.g., via prompt engineering in a separate thread). When `referenceImages` are provided, the prompt describes how to modify those images.\n\nFor agentic image generation where the LLM decides when to generate, configure `imageModel` in the [agent config](/docs/protocol/agent-config#image-generation).\n\n## Display Modes\n\nEvery block has a `display` property:\n\n| Mode | Default For | Behavior |\n| ------------- | ------------------------- | ----------------- |\n| `hidden` | add-message | Not shown to user |\n| `name` | set-resource | Shows block name |\n| `description` | tool-call, generate-image | Shows description |\n| `stream` | next-message | Streams content |\n\n## Complete Example\n\n```yaml\nhandlers:\n user-message:\n # Add the user's message to conversation\n Add user message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n # Generate response (LLM may call tools)\n Respond to user:\n block: next-message\n # display: stream (default)\n\n request-human:\n # Step 1: Serialize conversation for summary\n Serialize conversation:\n block: serialize-thread\n format: markdown\n output: CONVERSATION_TEXT\n\n # Step 2: Create separate thread for summarization\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n thinking: low\n system: escalation-summary\n input: [COMPANY_NAME]\n\n # Step 3: Add request to summary thread\n Add summarize request:\n block: add-message\n thread: summary\n role: user\n prompt: summarize-request\n input:\n - CONVERSATION: CONVERSATION_TEXT\n\n # Step 4: Generate summary\n Generate summary:\n block: next-message\n thread: summary\n display: stream\n description: Summarizing your conversation\n independent: true\n output: SUMMARY\n\n # Step 5: Save to resource\n Save summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY\n\n # Step 6: Create support ticket\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n\n # Step 7: Add directive for response\n Add directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS: TICKET]\n visible: false\n\n # Step 8: Respond to user\n Respond:\n block: next-message\n```\n\n## Block Input Mapping\n\nThe `input` field on blocks controls which variables are passed to the prompt. Only variables listed in `input` are available for interpolation.\n\nVariables can come from `protocol.input`, `protocol.resources`, `protocol.variables`, `trigger.input`, or outputs from prior blocks.\n\n```yaml\n# Array format (same name)\ninput: [USER_MESSAGE, COMPANY_NAME]\n\n# Array format (rename)\ninput:\n - CONVERSATION: CONVERSATION_TEXT # Prompt sees CONVERSATION, value comes from CONVERSATION_TEXT\n - TICKET_DETAILS: TICKET\n\n# Object format (rename)\ninput:\n CONVERSATION: CONVERSATION_TEXT\n TICKET_DETAILS: TICKET\n```\n\n## Independent Blocks\n\nUse `independent: true` for content that shouldn't go to the main chat:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary\n independent: true # Output stored in variable, not main chat\n output: SUMMARY\n```\n\nThis is useful for:\n\n- Background processing\n- Summarization in separate threads\n- Generating content for tools\n",
1415
+ content: "\n# Handlers\n\nHandlers define what happens when a trigger fires. They contain execution blocks that run in sequence.\n\n## Handler Structure\n\n```yaml\nhandlers:\n trigger-name:\n Block Name:\n block: block-kind\n # block-specific properties\n\n Another Block:\n block: another-kind\n # ...\n```\n\nEach block has a human-readable name (shown in debug UI) and a `block` field that determines its behavior.\n\n## Block Kinds\n\n### next-message\n\nGenerate a response from the LLM:\n\n```yaml\nhandlers:\n user-message:\n Respond to user:\n block: next-message\n # Uses main conversation thread by default\n # Display defaults to 'stream'\n```\n\nWith options:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary # Use named thread\n display: stream # Show streaming content\n independent: true # Don't add to main chat\n output: SUMMARY # Store output in variable\n description: Generating summary # Shown in UI\n```\n\nFor structured output (typed JSON response):\n\n```yaml\nRespond with suggestions:\n block: next-message\n responseType: ChatResponse # Type defined in types section\n output: RESPONSE # Stores the parsed object\n```\n\nWhen `responseType` is specified:\n\n- The LLM generates JSON matching the type schema\n- The `output` variable receives the parsed object (not plain text)\n- The client receives a `UIObjectPart` for custom rendering\n\nSee [Types](/docs/protocol/types#structured-output) for more details.\n\n### add-message\n\nAdd a message to the conversation:\n\n```yaml\nAdd user message:\n block: add-message\n role: user # user | assistant | system\n prompt: user-message # Reference to prompt file\n input: [USER_MESSAGE] # Variables to interpolate\n display: hidden # Don't show in UI\n```\n\nFor internal directives (LLM sees it, user doesn't):\n\n```yaml\nAdd internal directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS]\n visible: false # LLM sees this, user doesn't\n```\n\nFor structured user input (object shown in UI, prompt for LLM context):\n\n```yaml\nAdd user message:\n block: add-message\n role: user\n prompt: user-message # Rendered for LLM context (hidden from UI)\n input: [USER_INPUT]\n uiContent: USER_INPUT # Variable shown in UI (object \u2192 object part)\n display: hidden\n```\n\nWhen `uiContent` is set:\n\n- The variable value is shown in the UI (string \u2192 text part, object \u2192 object part)\n- The prompt text is hidden from the UI but kept for LLM context\n- Useful for rich UI interactions where the visual differs from the LLM context\n\n### tool-call\n\nCall a tool deterministically:\n\n```yaml\nCreate ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY # Variable reference\n priority: medium # Literal value\n output: TICKET # Store result\n```\n\n### set-resource\n\nUpdate a persistent resource:\n\n```yaml\nSave summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY # Variable to save\n display: name # Show block name\n```\n\n### start-thread\n\nCreate a named conversation thread:\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary # Thread name\n model: anthropic/claude-sonnet-4-5 # Optional: different model\n backupModel: openai/gpt-4o # Failover on provider errors\n thinking: low # Extended reasoning level\n cache: auto # auto (default) | extended | off\n maxSteps: 1 # Tool call limit\n system: escalation-summary # System prompt\n input: [COMPANY_NAME] # Variables for prompt\n mcpServers: [figma, browser] # MCP servers for this thread\n skills: [qr-code] # Octavus skills for this thread\n sandboxTimeout: 600000 # Skill sandbox timeout (default: 5 min, max: 1 hour)\n imageModel: google/gemini-2.5-flash-image # Image generation model\n```\n\nThe `cache` field controls prompt caching for this thread and defaults to `auto` when omitted. Threads do not inherit the agent's `cache` value - see [Prompt Caching](/docs/protocol/agent-config#prompt-caching).\n\nThe `model` field can also reference a variable for dynamic model selection. The `backupModel`, `temperature`, `thinking`, and `maxSteps` fields also support variable references - see [Dynamic Configuration](/docs/protocol/agent-config#dynamic-configuration).\n\n```yaml\nStart summary thread:\n block: start-thread\n thread: summary\n model: SUMMARY_MODEL # Resolved from input variable\n system: escalation-summary\n```\n\n### serialize-thread\n\nConvert conversation to text:\n\n```yaml\nSerialize conversation:\n block: serialize-thread\n thread: main # Which thread (default: main)\n format: markdown # markdown | json\n output: CONVERSATION_TEXT # Variable to store result\n```\n\n### generate-image\n\nGenerate an image from a prompt variable:\n\n```yaml\nGenerate image:\n block: generate-image\n prompt: OPTIMIZED_PROMPT # Variable containing the prompt\n imageModel: google/gemini-2.5-flash-image # Required image model\n size: 1024x1024 # 1024x1024 | 1792x1024 | 1024x1792\n output: GENERATED_IMAGE # Store URL in variable\n description: Generating your image... # Shown in UI\n```\n\nEdit an existing image using reference images:\n\n```yaml\nEdit image:\n block: generate-image\n prompt: EDIT_INSTRUCTIONS # e.g., \"Remove the background\"\n referenceImages: [SOURCE_IMAGE_URL] # Variable(s) containing image URLs\n imageModel: google/gemini-2.5-flash-image\n output: EDITED_IMAGE\n description: Editing image...\n```\n\n| Field | Required | Description |\n| ----------------- | -------- | --------------------------------------------------------------- |\n| `prompt` | Yes | Variable name containing the image prompt or edit instructions |\n| `imageModel` | Yes | Image model identifier (e.g., `google/gemini-2.5-flash-image`) |\n| `size` | No | Image dimensions: `1024x1024`, `1792x1024`, or `1024x1792` |\n| `referenceImages` | No | Variable names containing image URLs for editing/transformation |\n| `output` | No | Variable name to store the generated image URL |\n| `thread` | No | Thread to associate the output file with |\n| `description` | No | Description shown in the UI during generation |\n\nThis block is for deterministic image generation pipelines where the prompt is constructed programmatically (e.g., via prompt engineering in a separate thread). When `referenceImages` are provided, the prompt describes how to modify those images.\n\nFor agentic image generation where the LLM decides when to generate, configure `imageModel` in the [agent config](/docs/protocol/agent-config#image-generation).\n\n## Display Modes\n\nEvery block has a `display` property:\n\n| Mode | Default For | Behavior |\n| ------------- | ------------------------- | ------------------------------- |\n| `hidden` | add-message | Not shown to user |\n| `name` | set-resource | Shows block name |\n| `description` | tool-call, generate-image | Shows description |\n| `stream` | next-message | Streams content |\n| `title` | - | Shows the block's `title` field |\n\n## Complete Example\n\n```yaml\nhandlers:\n user-message:\n # Add the user's message to conversation\n Add user message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n # Generate response (LLM may call tools)\n Respond to user:\n block: next-message\n # display: stream (default)\n\n request-human:\n # Step 1: Serialize conversation for summary\n Serialize conversation:\n block: serialize-thread\n format: markdown\n output: CONVERSATION_TEXT\n\n # Step 2: Create separate thread for summarization\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n thinking: low\n system: escalation-summary\n input: [COMPANY_NAME]\n\n # Step 3: Add request to summary thread\n Add summarize request:\n block: add-message\n thread: summary\n role: user\n prompt: summarize-request\n input:\n - CONVERSATION: CONVERSATION_TEXT\n\n # Step 4: Generate summary\n Generate summary:\n block: next-message\n thread: summary\n display: stream\n description: Summarizing your conversation\n independent: true\n output: SUMMARY\n\n # Step 5: Save to resource\n Save summary:\n block: set-resource\n resource: CONVERSATION_SUMMARY\n value: SUMMARY\n\n # Step 6: Create support ticket\n Create ticket:\n block: tool-call\n tool: create-support-ticket\n input:\n summary: SUMMARY\n priority: medium\n output: TICKET\n\n # Step 7: Add directive for response\n Add directive:\n block: add-message\n role: user\n prompt: ticket-directive\n input: [TICKET_DETAILS: TICKET]\n visible: false\n\n # Step 8: Respond to user\n Respond:\n block: next-message\n```\n\n## Block Input Mapping\n\nThe `input` field on blocks controls which variables are passed to the prompt. Only variables listed in `input` are available for interpolation.\n\nVariables can come from `protocol.input`, `protocol.resources`, `protocol.variables`, `trigger.input`, or outputs from prior blocks.\n\n```yaml\n# Array format (same name)\ninput: [USER_MESSAGE, COMPANY_NAME]\n\n# Array format (rename)\ninput:\n - CONVERSATION: CONVERSATION_TEXT # Prompt sees CONVERSATION, value comes from CONVERSATION_TEXT\n - TICKET_DETAILS: TICKET\n\n# Object format (rename)\ninput:\n CONVERSATION: CONVERSATION_TEXT\n TICKET_DETAILS: TICKET\n```\n\n## Independent Blocks\n\nUse `independent: true` for content that shouldn't go to the main chat:\n\n```yaml\nGenerate summary:\n block: next-message\n thread: summary\n independent: true # Output stored in variable, not main chat\n output: SUMMARY\n```\n\nThis is useful for:\n\n- Background processing\n- Summarization in separate threads\n- Generating content for tools\n",
1389
1416
  excerpt: "Handlers Handlers define what happens when a trigger fires. They contain execution blocks that run in sequence. Handler Structure Each block has a human-readable name (shown in debug UI) and a ...",
1390
1417
  order: 6
1391
1418
  },
@@ -1403,7 +1430,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1403
1430
  section: "protocol",
1404
1431
  title: "Provider Options",
1405
1432
  description: "Configuring provider-specific tools and features.",
1406
- content: "\n# Provider Options\n\nProvider options let you enable provider-specific features like Anthropic's built-in tools and skills. These features run server-side on the provider's infrastructure.\n\n> **Note**: For provider-agnostic code execution, use [Octavus Skills](/docs/protocol/skills) instead. Octavus Skills work with any LLM provider and run in isolated sandbox environments.\n\n## Anthropic Options\n\nConfigure Anthropic-specific features when using `anthropic/*` models:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n```\n\n> **Note**: Provider options are validated against the model provider. Using `anthropic:` options with non-Anthropic models will result in a validation error.\n\n## Provider Tools\n\nProvider tools are executed server-side by the provider (Anthropic). Unlike external tools that you implement, provider tools are built-in capabilities.\n\n### Available Tools\n\n| Tool | Description |\n| ---------------- | -------------------------------------------- |\n| `web-search` | Search the web for current information |\n| `code-execution` | Execute Python/Bash in a sandboxed container |\n\n### Tool Configuration\n\n```yaml\nanthropic:\n tools:\n web-search:\n display: description # How to show in UI\n description: Searching... # Custom display text\n```\n\n| Field | Required | Description |\n| ------------- | -------- | --------------------------------------------------------------------- |\n| `display` | No | `hidden`, `name`, `description`, or `stream` (default: `description`) |\n| `description` | No | Custom text shown to users during execution |\n\n### Web Search\n\nAllows the agent to search the web using Anthropic's built-in web search:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n web-search:\n display: description\n description: Looking up current information\n```\n\n> **Tip**: Octavus also provides a **provider-agnostic** web search via `webSearch: true` in the agent config. This works with any LLM provider and is the recommended approach for multi-provider agents. See [Web Search](/docs/protocol/agent-config#web-search) for details.\n\n### Code Execution\n\nEnables Python and Bash execution in a sandboxed container:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n code-execution:\n display: description\n description: Running analysis\n```\n\nUse cases:\n\n- Data analysis and calculations\n- File processing\n- Chart generation\n- Script execution\n\n> **Note**: Code execution is automatically enabled when skills are configured (skills require the container environment).\n\n## Skills\n\n> **Important**: This section covers **Anthropic's built-in skills** (provider-specific). For provider-agnostic skills that work with any LLM, see [Octavus Skills](/docs/protocol/skills).\n\nAnthropic skills are knowledge packages that give the agent specialized capabilities. They're loaded into Anthropic's code execution container at `/skills/{skill-id}/` and only work with Anthropic models.\n\n### Skill Configuration\n\n```yaml\nanthropic:\n skills:\n pdf:\n type: anthropic # 'anthropic' or 'custom'\n version: latest # Optional version\n display: description\n description: Processing PDF\n```\n\n| Field | Required | Description |\n| ------------- | -------- | --------------------------------------------------------------------- |\n| `type` | Yes | `anthropic` (built-in) or `custom` (uploaded) |\n| `version` | No | Skill version (default: `latest`) |\n| `display` | No | `hidden`, `name`, `description`, or `stream` (default: `description`) |\n| `description` | No | Custom text shown to users |\n\n### Built-in Skills\n\nAnthropic provides several built-in skills:\n\n| Skill ID | Purpose |\n| -------- | ----------------------------------------------- |\n| `pdf` | PDF manipulation, text extraction, form filling |\n| `xlsx` | Excel spreadsheet operations and analysis |\n| `docx` | Word document creation and editing |\n| `pptx` | PowerPoint presentation creation |\n\n### Using Skills\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n skills:\n pdf:\n type: anthropic\n description: Processing your PDF\n xlsx:\n type: anthropic\n description: Analyzing spreadsheet\n```\n\nWhen skills are configured:\n\n1. Code execution is automatically enabled\n2. Skill files are loaded into the container\n3. The agent can read skill instructions and execute scripts\n\n### Custom Skills\n\nYou can create and upload custom skills to Anthropic:\n\n```yaml\nanthropic:\n skills:\n custom-analysis:\n type: custom\n version: latest\n description: Running custom analysis\n```\n\nCustom skills follow the [Agent Skills standard](https://agentskills.io) and contain:\n\n- `SKILL.md` with instructions and metadata\n- Optional `scripts/`, `references/`, and `assets/` directories\n\n### Octavus Skills vs Anthropic Skills\n\n| Feature | Anthropic Skills | Octavus Skills |\n| ----------------- | ------------------------ | ----------------------------- |\n| **Provider** | Anthropic only | Any (agnostic) |\n| **Execution** | Anthropic's container | Isolated sandbox |\n| **Configuration** | `agent.anthropic.skills` | `agent.skills` |\n| **Definition** | `anthropic:` section | `skills:` section |\n| **Use Case** | Claude-specific features | Cross-provider code execution |\n\nFor provider-agnostic code execution, use Octavus Skills defined in the protocol's `skills:` section and enabled via `agent.skills`. See [Skills](/docs/protocol/skills) for details.\n\n## Display Modes\n\nBoth tools and skills support display modes:\n\n| Mode | Behavior |\n| ------------- | ------------------------------- |\n| `hidden` | Not shown to users |\n| `name` | Shows the tool/skill name |\n| `description` | Shows the description (default) |\n| `stream` | Streams progress if available |\n\n## Full Example\n\n```yaml\ninput:\n COMPANY_NAME: { type: string }\n USER_ID: { type: string, optional: true }\n\ntools:\n get-user-account:\n description: Looking up your account\n parameters:\n userId: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n input: [COMPANY_NAME, USER_ID]\n tools: [get-user-account] # External tools\n agentic: true\n thinking: medium\n\n # Anthropic-specific options\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n xlsx:\n type: anthropic\n display: description\n description: Analyzing spreadsheet\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n## Validation\n\nThe protocol validator enforces:\n\n1. **Model match**: Provider options must match the model provider\n - `anthropic:` options require `anthropic/*` model\n - Using mismatched options results in a validation error\n\n2. **Valid tool types**: Only recognized tools are accepted\n - `web-search` and `code-execution` for Anthropic\n\n3. **Valid skill types**: Only `anthropic` or `custom` are accepted\n\n### Error Example\n\n```yaml\n# This will fail validation\nagent:\n model: openai/gpt-4o # OpenAI model\n anthropic: # Anthropic options - mismatch!\n tools:\n web-search: {}\n```\n\nError: `\"anthropic\" options require an anthropic model. Current model provider: \"openai\"`\n",
1433
+ content: "\n# Provider Options\n\nProvider options let you enable provider-specific features like Anthropic's built-in tools and skills. These features run server-side on the provider's infrastructure.\n\n> **Note**: For provider-agnostic code execution, use [Octavus Skills](/docs/protocol/skills) instead. Octavus Skills work with any LLM provider and run in isolated sandbox environments.\n\n## Anthropic Options\n\nConfigure Anthropic-specific features when using `anthropic/*` models:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n```\n\n> **Note**: Provider options are validated against the model provider. Using `anthropic:` options with non-Anthropic models will result in a validation error.\n\n## Provider Tools\n\nProvider tools are executed server-side by the provider (Anthropic). Unlike external tools that you implement, provider tools are built-in capabilities.\n\n### Available Tools\n\n| Tool | Description |\n| ---------------- | -------------------------------------------- |\n| `web-search` | Search the web for current information |\n| `code-execution` | Execute Python/Bash in a sandboxed container |\n\n### Tool Configuration\n\n```yaml\nanthropic:\n tools:\n web-search:\n display: description # How to show in UI\n description: Searching... # Custom display text\n```\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------------------------ |\n| `display` | No | `hidden`, `name`, `description`, `stream`, or `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title` (hides description and arguments) |\n| `description` | No | Custom text shown to users during execution |\n\n### Web Search\n\nAllows the agent to search the web using Anthropic's built-in web search:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n web-search:\n display: description\n description: Looking up current information\n```\n\n> **Tip**: Octavus also provides a **provider-agnostic** web search via `webSearch: true` in the agent config. This works with any LLM provider and is the recommended approach for multi-provider agents. See [Web Search](/docs/protocol/agent-config#web-search) for details.\n\n### Code Execution\n\nEnables Python and Bash execution in a sandboxed container:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n anthropic:\n tools:\n code-execution:\n display: description\n description: Running analysis\n```\n\nUse cases:\n\n- Data analysis and calculations\n- File processing\n- Chart generation\n- Script execution\n\n> **Note**: Code execution is automatically enabled when skills are configured (skills require the container environment).\n\n## Skills\n\n> **Important**: This section covers **Anthropic's built-in skills** (provider-specific). For provider-agnostic skills that work with any LLM, see [Octavus Skills](/docs/protocol/skills).\n\nAnthropic skills are knowledge packages that give the agent specialized capabilities. They're loaded into Anthropic's code execution container at `/skills/{skill-id}/` and only work with Anthropic models.\n\n### Skill Configuration\n\n```yaml\nanthropic:\n skills:\n pdf:\n type: anthropic # 'anthropic' or 'custom'\n version: latest # Optional version\n display: description\n description: Processing PDF\n```\n\n| Field | Required | Description |\n| ------------- | -------- | ------------------------------------------------------------------------------ |\n| `type` | Yes | `anthropic` (built-in) or `custom` (uploaded) |\n| `version` | No | Skill version (default: `latest`) |\n| `display` | No | `hidden`, `name`, `description`, `stream`, or `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title` (hides description and arguments) |\n| `description` | No | Custom text shown to users |\n\n### Built-in Skills\n\nAnthropic provides several built-in skills:\n\n| Skill ID | Purpose |\n| -------- | ----------------------------------------------- |\n| `pdf` | PDF manipulation, text extraction, form filling |\n| `xlsx` | Excel spreadsheet operations and analysis |\n| `docx` | Word document creation and editing |\n| `pptx` | PowerPoint presentation creation |\n\n### Using Skills\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n anthropic:\n skills:\n pdf:\n type: anthropic\n description: Processing your PDF\n xlsx:\n type: anthropic\n description: Analyzing spreadsheet\n```\n\nWhen skills are configured:\n\n1. Code execution is automatically enabled\n2. Skill files are loaded into the container\n3. The agent can read skill instructions and execute scripts\n\n### Custom Skills\n\nYou can create and upload custom skills to Anthropic:\n\n```yaml\nanthropic:\n skills:\n custom-analysis:\n type: custom\n version: latest\n description: Running custom analysis\n```\n\nCustom skills follow the [Agent Skills standard](https://agentskills.io) and contain:\n\n- `SKILL.md` with instructions and metadata\n- Optional `scripts/`, `references/`, and `assets/` directories\n\n### Octavus Skills vs Anthropic Skills\n\n| Feature | Anthropic Skills | Octavus Skills |\n| ----------------- | ------------------------ | ----------------------------- |\n| **Provider** | Anthropic only | Any (agnostic) |\n| **Execution** | Anthropic's container | Isolated sandbox |\n| **Configuration** | `agent.anthropic.skills` | `agent.skills` |\n| **Definition** | `anthropic:` section | `skills:` section |\n| **Use Case** | Claude-specific features | Cross-provider code execution |\n\nFor provider-agnostic code execution, use Octavus Skills defined in the protocol's `skills:` section and enabled via `agent.skills`. See [Skills](/docs/protocol/skills) for details.\n\n## Display Modes\n\nBoth tools and skills support display modes:\n\n| Mode | Behavior |\n| ------------- | ------------------------------- |\n| `hidden` | Not shown to users |\n| `name` | Shows the tool/skill name |\n| `description` | Shows the description (default) |\n| `stream` | Streams progress if available |\n\n## Full Example\n\n```yaml\ninput:\n COMPANY_NAME: { type: string }\n USER_ID: { type: string, optional: true }\n\ntools:\n get-user-account:\n description: Looking up your account\n parameters:\n userId: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n input: [COMPANY_NAME, USER_ID]\n tools: [get-user-account] # External tools\n agentic: true\n thinking: medium\n\n # Anthropic-specific options\n anthropic:\n # Provider tools (server-side)\n tools:\n web-search:\n display: description\n description: Searching the web\n code-execution:\n display: description\n description: Running code\n\n # Skills (knowledge packages)\n skills:\n pdf:\n type: anthropic\n display: description\n description: Processing PDF document\n xlsx:\n type: anthropic\n display: description\n description: Analyzing spreadsheet\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n## Validation\n\nThe protocol validator enforces:\n\n1. **Model match**: Provider options must match the model provider\n - `anthropic:` options require `anthropic/*` model\n - Using mismatched options results in a validation error\n\n2. **Valid tool types**: Only recognized tools are accepted\n - `web-search` and `code-execution` for Anthropic\n\n3. **Valid skill types**: Only `anthropic` or `custom` are accepted\n\n### Error Example\n\n```yaml\n# This will fail validation\nagent:\n model: openai/gpt-4o # OpenAI model\n anthropic: # Anthropic options - mismatch!\n tools:\n web-search: {}\n```\n\nError: `\"anthropic\" options require an anthropic model. Current model provider: \"openai\"`\n",
1407
1434
  excerpt: "Provider Options Provider options let you enable provider-specific features like Anthropic's built-in tools and skills. These features run server-side on the provider's infrastructure. > Note: For...",
1408
1435
  order: 8
1409
1436
  },
@@ -1430,7 +1457,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1430
1457
  section: "protocol",
1431
1458
  title: "Workers",
1432
1459
  description: "Defining worker agents for background and task-based execution.",
1433
- content: '\n# Workers\n\nWorkers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value.\n\n## When to Use Workers\n\nWorkers are ideal for:\n\n- **Background processing** - Long-running tasks that don\'t need conversation\n- **Composable tasks** - Reusable units of work called by other agents\n- **Pipelines** - Multi-step processing with structured output\n- **Parallel execution** - Tasks that can run independently\n\nUse interactive agents instead when:\n\n- **Conversation is needed** - Multi-turn dialogue with users\n- **Persistence matters** - State should survive across interactions\n- **Session context** - User context needs to persist\n\n## Worker vs Interactive\n\n| Aspect | Interactive | Worker |\n| ---------- | ---------------------------------- | ----------------------------- |\n| Structure | `triggers` + `handlers` + `agent` | `steps` + `output` |\n| LLM Config | Global `agent:` section | Per-thread via `start-thread` |\n| Invocation | Fire a named trigger | Direct execution with input |\n| Session | Persists across triggers (24h TTL) | Single execution |\n| Result | Streaming chat | Streaming + output value |\n\n## Protocol Structure\n\nWorkers use a simpler protocol structure than interactive agents:\n\n```yaml\n# Input schema - provided when worker is executed\ninput:\n TOPIC:\n type: string\n description: Topic to research\n DEPTH:\n type: string\n optional: true\n default: medium\n\n# Variables for intermediate results\nvariables:\n RESEARCH_DATA:\n type: string\n ANALYSIS:\n type: string\n description: Final analysis result\n\n# Tools available to the worker\ntools:\n web-search:\n description: Search the web\n parameters:\n query: { type: string }\n\n# Sequential execution steps\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC, DEPTH]\n tools: [web-search]\n maxSteps: 5\n\n Add research request:\n block: add-message\n thread: research\n role: user\n prompt: research-prompt\n input: [TOPIC, DEPTH]\n\n Generate research:\n block: next-message\n thread: research\n output: RESEARCH_DATA\n\n Start analysis:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n system: analysis-system\n\n Add analysis request:\n block: add-message\n thread: analysis\n role: user\n prompt: analysis-prompt\n input: [RESEARCH_DATA]\n\n Generate analysis:\n block: next-message\n thread: analysis\n output: ANALYSIS\n\n# Output variable - the worker\'s return value\noutput: ANALYSIS\n```\n\n## settings.json\n\nWorkers are identified by the `format` field:\n\n```json\n{\n "slug": "research-assistant",\n "name": "Research Assistant",\n "description": "Researches topics and returns structured analysis",\n "format": "worker"\n}\n```\n\n## Key Differences\n\n### No Global Agent Config\n\nInteractive agents have a global `agent:` section that configures a main thread. Workers don\'t have this - every thread must be explicitly created via `start-thread`:\n\n```yaml\n# Interactive agent: Global config\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [tool-a, tool-b]\n\n# Worker: Each thread configured independently\nsteps:\n Start thread A:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n tools: [tool-a]\n\n Start thread B:\n block: start-thread\n thread: analysis\n model: openai/gpt-4o\n tools: [tool-b]\n```\n\nThis gives workers flexibility to use different models, tools, skills, and settings at different stages.\n\n### Steps Instead of Handlers\n\nWorkers use `steps:` instead of `handlers:`. Steps execute sequentially, like handler blocks:\n\n```yaml\n# Interactive: Handlers respond to triggers\nhandlers:\n user-message:\n Add message:\n block: add-message\n # ...\n\n# Worker: Steps execute in sequence\nsteps:\n Add message:\n block: add-message\n # ...\n```\n\n### Output Value\n\nWorkers can return an output value to the caller:\n\n```yaml\nvariables:\n RESULT:\n type: string\n\nsteps:\n # ... steps that populate RESULT ...\n\noutput: RESULT # Return this variable\'s value\n```\n\nThe `output` field references a variable declared in `variables:`. If omitted, the worker completes without returning a value.\n\n## Available Blocks\n\nWorkers support the same blocks as handlers:\n\n| Block | Purpose |\n| ------------------ | -------------------------------------------- |\n| `start-thread` | Create a named thread with LLM configuration |\n| `add-message` | Add a message to a thread |\n| `next-message` | Generate LLM response |\n| `tool-call` | Call a tool deterministically |\n| `set-resource` | Update a resource value |\n| `serialize-thread` | Convert thread to text |\n| `generate-image` | Generate an image from a prompt variable |\n\n### start-thread (Required for LLM)\n\nEvery thread must be initialized with `start-thread` before using `next-message`:\n\n```yaml\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC]\n tools: [web-search]\n thinking: medium\n maxSteps: 5\n```\n\nAll LLM configuration goes here:\n\n| Field | Description |\n| --------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |\n| `thread` | Thread name (defaults to block name) |\n| `model` | LLM model to use |\n| `system` | System prompt filename (required) |\n| `input` | Variables for system prompt |\n| `tools` | Tools available in this thread |\n| `skills` | Octavus skills available in this thread |\n| `mcpServers` | MCP servers available in this thread |\n| `imageModel` | Image generation model |\n| `webSearch` | Enable built-in web search tool |\n| `thinking` | Extended reasoning level (`low`/`medium`/`high`/`max`), `"off"`, or variable reference |\n| `cache` | Prompt caching mode: `auto` (default), `extended`, or `off` |\n| `temperature` | Model temperature (0-2), `"off"`, or variable reference |\n| `maxSteps` | Maximum tool call cycles (enables agentic if > 1), or variable reference |\n| `maxToolOutputTokens` | Cap a single tool result at this many tokens in the thread\'s model view (head+tail preview + note). Omit to leave tool output unbounded |\n\n## Simple Example\n\nA worker that generates a title from a summary:\n\n```yaml\n# Input\ninput:\n CONVERSATION_SUMMARY:\n type: string\n description: Summary to generate a title for\n\n# Variables\nvariables:\n TITLE:\n type: string\n description: The generated title\n\n# Steps\nsteps:\n Start title thread:\n block: start-thread\n thread: title-gen\n model: anthropic/claude-sonnet-4-5\n system: title-system\n\n Add title request:\n block: add-message\n thread: title-gen\n role: user\n prompt: title-request\n input: [CONVERSATION_SUMMARY]\n\n Generate title:\n block: next-message\n thread: title-gen\n output: TITLE\n display: stream\n\n# Output\noutput: TITLE\n```\n\n## Advanced Example\n\nA worker with multiple threads, tools, and agentic behavior:\n\n```yaml\ninput:\n USER_MESSAGE:\n type: string\n description: The user\'s message to respond to\n USER_ID:\n type: string\n description: User ID for account lookups\n optional: true\n\ntools:\n get-user-account:\n description: Looking up account information\n parameters:\n userId: { type: string }\n create-support-ticket:\n description: Creating a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string }\n\nvariables:\n ASSISTANT_RESPONSE:\n type: string\n CHAT_TRANSCRIPT:\n type: string\n CONVERSATION_SUMMARY:\n type: string\n\nsteps:\n # Thread 1: Chat with agentic tool calling\n Start chat thread:\n block: start-thread\n thread: chat\n model: anthropic/claude-sonnet-4-5\n system: chat-system\n input: [USER_ID]\n tools: [get-user-account, create-support-ticket]\n thinking: medium\n maxSteps: 5\n\n Add user message:\n block: add-message\n thread: chat\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Generate response:\n block: next-message\n thread: chat\n output: ASSISTANT_RESPONSE\n display: stream\n\n # Serialize for summary\n Save conversation:\n block: serialize-thread\n thread: chat\n output: CHAT_TRANSCRIPT\n\n # Thread 2: Summary generation\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n system: summary-system\n thinking: low\n\n Add summary request:\n block: add-message\n thread: summary\n role: user\n prompt: summary-request\n input: [CHAT_TRANSCRIPT]\n\n Generate summary:\n block: next-message\n thread: summary\n output: CONVERSATION_SUMMARY\n display: stream\n\noutput: CONVERSATION_SUMMARY\n```\n\n## MCP Servers\n\nWorkers can declare and use MCP servers, just like interactive agents. Define them in `mcpServers:` and reference them in `start-thread`:\n\n```yaml\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nWorkers resolve their own MCP connections independently - they don\'t inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.\n\nSee [MCP Servers](/docs/protocol/mcp-servers) for full documentation.\n\n## Skills, Image Generation, and Web Search\n\nWorkers can use Octavus skills, image generation, and web search, configured per-thread via `start-thread`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generate QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n imageModel: google/gemini-2.5-flash-image\n webSearch: true\n maxSteps: 10\n```\n\nWorkers define their own skills independently - they don\'t inherit skills from a parent interactive agent. Each thread gets its own sandbox scoped to only its listed skills.\n\nSkills with `execution: device` work the same way in workers as in interactive agents - the skill runs on the agent\'s computer. Workers resolve their device execution independently, so a worker can use device skills even if the parent agent does not.\n\nSee [Skills](/docs/protocol/skills) for full documentation.\n\n## Tool Handling\n\nWorkers support the same tool handling as interactive agents:\n\n- **Server tools** - Handled by tool handlers you provide\n- **Client tools** - Pause execution, return tool request to caller\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n\n// Streaming: observe events in real-time\nconst events = client.workers.execute(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n```\n\nSee [Server SDK Workers](/docs/server-sdk/workers) for tool handling details.\n\n## Stream Events\n\nWorkers emit the same events as interactive agents, plus worker-specific events:\n\n| Event | Description |\n| --------------- | ---------------------------------- |\n| `worker-start` | Worker execution begins |\n| `worker-result` | Worker completes (includes output) |\n\nAll standard events (text-delta, tool calls, etc.) are also emitted.\n\n## Calling Workers from Interactive Agents\n\nInteractive agents can call workers in three ways:\n\n1. **Deterministically** - Using the `run-worker` block\n2. **Agentically** - LLM calls worker as a tool\n3. **Automatically** - Octavus invokes the worker as part of a built-in capability, not the model. Context management\'s `summarizerWorker` (see [Context Management](/docs/protocol/context-management)) works this way: declare it in `workers:` but leave it out of `agent.workers` so the model never sees it as a tool.\n\n### Worker Declaration\n\nFirst, declare workers in your interactive agent\'s protocol:\n\n```yaml\nworkers:\n generate-title:\n description: Generating conversation title\n display: description\n research-assistant:\n description: Researching topic\n display: stream\n tools:\n search: web-search # Map worker tool \u2192 parent tool\n```\n\n### run-worker Block\n\nCall a worker deterministically from a handler:\n\n```yaml\nhandlers:\n request-human:\n Generate title:\n block: run-worker\n worker: generate-title\n input:\n CONVERSATION_SUMMARY: SUMMARY\n output: CONVERSATION_TITLE\n```\n\n### LLM Tool Invocation\n\nMake workers available to the LLM:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n workers: [generate-title, research-assistant]\n agentic: true\n```\n\nThe LLM can then call workers as tools during conversation.\n\n### Display Modes\n\nControls how worker execution appears to users. The default for workers is `stream`.\n\n| Mode | Behavior |\n| ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Worker runs silently. No events reach the client - no `UIWorkerPart` is created. |\n| `name` | Shows a running/done indicator with the worker name. No nested content (text, tool calls, reasoning) is forwarded. |\n| `description` | Shows a running/done indicator with the worker description. No nested content is forwarded. |\n| `stream` | Full visibility. All nested events are forwarded - text, reasoning, tool calls, sources, files. Worker input is included on start. |\n\n**Progressive input streaming:** When a worker with `display: stream` is invoked agentically (LLM calls it as a tool), the `UIWorkerPart` appears in the UI immediately as the LLM starts generating the worker\'s arguments. The worker input streams progressively into the worker part, the same way text tokens stream into a text part. Once input finishes, worker execution begins and nested content flows into the same worker part. There is no intermediate tool card.\n\n**`name` and `description` modes:** Worker input is stripped from the `worker-start` event (it may contain sensitive data). Only the running/done status and the final `worker-result` are forwarded to the parent stream. Use these for workers where the user only needs to know the worker ran, not what it did internally.\n\n**`hidden` mode:** The worker executes normally but produces no UI presence at all. Use for internal workers that are implementation details.\n\n### Tool Mapping\n\nMap parent tools to worker tools when the worker needs access to your tool handlers:\n\n```yaml\nworkers:\n research-assistant:\n description: Research topics\n tools:\n search: web-search # Worker\'s "search" \u2192 parent\'s "web-search"\n```\n\nWhen the worker calls its `search` tool, your `web-search` handler executes.\n\n## Next Steps\n\n- [Server SDK Workers](/docs/server-sdk/workers) - Executing workers from code\n- [Handlers](/docs/protocol/handlers) - Block reference for steps\n- [Agent Config](/docs/protocol/agent-config) - Model and settings\n',
1460
+ content: '\n# Workers\n\nWorkers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value.\n\n## When to Use Workers\n\nWorkers are ideal for:\n\n- **Background processing** - Long-running tasks that don\'t need conversation\n- **Composable tasks** - Reusable units of work called by other agents\n- **Pipelines** - Multi-step processing with structured output\n- **Parallel execution** - Tasks that can run independently\n\nUse interactive agents instead when:\n\n- **Conversation is needed** - Multi-turn dialogue with users\n- **Persistence matters** - State should survive across interactions\n- **Session context** - User context needs to persist\n\n## Worker vs Interactive\n\n| Aspect | Interactive | Worker |\n| ---------- | ---------------------------------- | ----------------------------- |\n| Structure | `triggers` + `handlers` + `agent` | `steps` + `output` |\n| LLM Config | Global `agent:` section | Per-thread via `start-thread` |\n| Invocation | Fire a named trigger | Direct execution with input |\n| Session | Persists across triggers (24h TTL) | Single execution |\n| Result | Streaming chat | Streaming + output value |\n\n## Protocol Structure\n\nWorkers use a simpler protocol structure than interactive agents:\n\n```yaml\n# Input schema - provided when worker is executed\ninput:\n TOPIC:\n type: string\n description: Topic to research\n DEPTH:\n type: string\n optional: true\n default: medium\n\n# Variables for intermediate results\nvariables:\n RESEARCH_DATA:\n type: string\n ANALYSIS:\n type: string\n description: Final analysis result\n\n# Tools available to the worker\ntools:\n web-search:\n description: Search the web\n parameters:\n query: { type: string }\n\n# Sequential execution steps\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC, DEPTH]\n tools: [web-search]\n maxSteps: 5\n\n Add research request:\n block: add-message\n thread: research\n role: user\n prompt: research-prompt\n input: [TOPIC, DEPTH]\n\n Generate research:\n block: next-message\n thread: research\n output: RESEARCH_DATA\n\n Start analysis:\n block: start-thread\n thread: analysis\n model: anthropic/claude-sonnet-4-5\n system: analysis-system\n\n Add analysis request:\n block: add-message\n thread: analysis\n role: user\n prompt: analysis-prompt\n input: [RESEARCH_DATA]\n\n Generate analysis:\n block: next-message\n thread: analysis\n output: ANALYSIS\n\n# Output variable - the worker\'s return value\noutput: ANALYSIS\n```\n\n## settings.json\n\nWorkers are identified by the `format` field:\n\n```json\n{\n "slug": "research-assistant",\n "name": "Research Assistant",\n "description": "Researches topics and returns structured analysis",\n "format": "worker"\n}\n```\n\n## Key Differences\n\n### No Global Agent Config\n\nInteractive agents have a global `agent:` section that configures a main thread. Workers don\'t have this - every thread must be explicitly created via `start-thread`:\n\n```yaml\n# Interactive agent: Global config\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n tools: [tool-a, tool-b]\n\n# Worker: Each thread configured independently\nsteps:\n Start thread A:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n tools: [tool-a]\n\n Start thread B:\n block: start-thread\n thread: analysis\n model: openai/gpt-4o\n tools: [tool-b]\n```\n\nThis gives workers flexibility to use different models, tools, skills, and settings at different stages.\n\n### Steps Instead of Handlers\n\nWorkers use `steps:` instead of `handlers:`. Steps execute sequentially, like handler blocks:\n\n```yaml\n# Interactive: Handlers respond to triggers\nhandlers:\n user-message:\n Add message:\n block: add-message\n # ...\n\n# Worker: Steps execute in sequence\nsteps:\n Add message:\n block: add-message\n # ...\n```\n\n### Output Value\n\nWorkers can return an output value to the caller:\n\n```yaml\nvariables:\n RESULT:\n type: string\n\nsteps:\n # ... steps that populate RESULT ...\n\noutput: RESULT # Return this variable\'s value\n```\n\nThe `output` field references a variable declared in `variables:`. If omitted, the worker completes without returning a value.\n\n## Available Blocks\n\nWorkers support the same blocks as handlers:\n\n| Block | Purpose |\n| ------------------ | -------------------------------------------- |\n| `start-thread` | Create a named thread with LLM configuration |\n| `add-message` | Add a message to a thread |\n| `next-message` | Generate LLM response |\n| `tool-call` | Call a tool deterministically |\n| `set-resource` | Update a resource value |\n| `serialize-thread` | Convert thread to text |\n| `generate-image` | Generate an image from a prompt variable |\n\n### start-thread (Required for LLM)\n\nEvery thread must be initialized with `start-thread` before using `next-message`:\n\n```yaml\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: research-system\n input: [TOPIC]\n tools: [web-search]\n thinking: medium\n maxSteps: 5\n```\n\nAll LLM configuration goes here:\n\n| Field | Description |\n| --------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |\n| `thread` | Thread name (defaults to block name) |\n| `model` | LLM model to use |\n| `system` | System prompt filename (required) |\n| `input` | Variables for system prompt |\n| `tools` | Tools available in this thread |\n| `skills` | Octavus skills available in this thread |\n| `mcpServers` | MCP servers available in this thread |\n| `imageModel` | Image generation model |\n| `webSearch` | Enable built-in web search tool |\n| `thinking` | Extended reasoning level (`low`/`medium`/`high`/`max`), `"off"`, or variable reference |\n| `cache` | Prompt caching mode: `auto` (default), `extended`, or `off` |\n| `temperature` | Model temperature (0-2), `"off"`, or variable reference |\n| `maxSteps` | Maximum tool call cycles (enables agentic if > 1), or variable reference |\n| `maxToolOutputTokens` | Cap a single tool result at this many tokens in the thread\'s model view (head+tail preview + note). Omit to leave tool output unbounded |\n\n## Simple Example\n\nA worker that generates a title from a summary:\n\n```yaml\n# Input\ninput:\n CONVERSATION_SUMMARY:\n type: string\n description: Summary to generate a title for\n\n# Variables\nvariables:\n TITLE:\n type: string\n description: The generated title\n\n# Steps\nsteps:\n Start title thread:\n block: start-thread\n thread: title-gen\n model: anthropic/claude-sonnet-4-5\n system: title-system\n\n Add title request:\n block: add-message\n thread: title-gen\n role: user\n prompt: title-request\n input: [CONVERSATION_SUMMARY]\n\n Generate title:\n block: next-message\n thread: title-gen\n output: TITLE\n display: stream\n\n# Output\noutput: TITLE\n```\n\n## Advanced Example\n\nA worker with multiple threads, tools, and agentic behavior:\n\n```yaml\ninput:\n USER_MESSAGE:\n type: string\n description: The user\'s message to respond to\n USER_ID:\n type: string\n description: User ID for account lookups\n optional: true\n\ntools:\n get-user-account:\n description: Looking up account information\n parameters:\n userId: { type: string }\n create-support-ticket:\n description: Creating a support ticket\n parameters:\n summary: { type: string }\n priority: { type: string }\n\nvariables:\n ASSISTANT_RESPONSE:\n type: string\n CHAT_TRANSCRIPT:\n type: string\n CONVERSATION_SUMMARY:\n type: string\n\nsteps:\n # Thread 1: Chat with agentic tool calling\n Start chat thread:\n block: start-thread\n thread: chat\n model: anthropic/claude-sonnet-4-5\n system: chat-system\n input: [USER_ID]\n tools: [get-user-account, create-support-ticket]\n thinking: medium\n maxSteps: 5\n\n Add user message:\n block: add-message\n thread: chat\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n\n Generate response:\n block: next-message\n thread: chat\n output: ASSISTANT_RESPONSE\n display: stream\n\n # Serialize for summary\n Save conversation:\n block: serialize-thread\n thread: chat\n output: CHAT_TRANSCRIPT\n\n # Thread 2: Summary generation\n Start summary thread:\n block: start-thread\n thread: summary\n model: anthropic/claude-sonnet-4-5\n system: summary-system\n thinking: low\n\n Add summary request:\n block: add-message\n thread: summary\n role: user\n prompt: summary-request\n input: [CHAT_TRANSCRIPT]\n\n Generate summary:\n block: next-message\n thread: summary\n output: CONVERSATION_SUMMARY\n display: stream\n\noutput: CONVERSATION_SUMMARY\n```\n\n## MCP Servers\n\nWorkers can declare and use MCP servers, just like interactive agents. Define them in `mcpServers:` and reference them in `start-thread`:\n\n```yaml\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nWorkers resolve their own MCP connections independently - they don\'t inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.\n\nSee [MCP Servers](/docs/protocol/mcp-servers) for full documentation.\n\n## Skills, Image Generation, and Web Search\n\nWorkers can use Octavus skills, image generation, and web search, configured per-thread via `start-thread`:\n\n```yaml\nskills:\n qr-code:\n display: description\n description: Generate QR codes\n\nsteps:\n Start thread:\n block: start-thread\n thread: worker\n model: anthropic/claude-sonnet-4-5\n system: system\n skills: [qr-code]\n imageModel: google/gemini-2.5-flash-image\n webSearch: true\n maxSteps: 10\n```\n\nWorkers define their own skills independently - they don\'t inherit skills from a parent interactive agent. Each thread gets its own sandbox scoped to only its listed skills.\n\nSkills with `execution: device` work the same way in workers as in interactive agents - the skill runs on the agent\'s computer. Workers resolve their device execution independently, so a worker can use device skills even if the parent agent does not.\n\nSee [Skills](/docs/protocol/skills) for full documentation.\n\n## Tool Handling\n\nWorkers support the same tool handling as interactive agents:\n\n- **Server tools** - Handled by tool handlers you provide\n- **Client tools** - Pause execution, return tool request to caller\n\n```typescript\n// Non-streaming: get the output directly\nconst { output } = await client.workers.generate(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n\n// Streaming: observe events in real-time\nconst events = client.workers.execute(\n agentId,\n { TOPIC: \'AI safety\' },\n {\n tools: {\n \'web-search\': async (args) => await searchWeb(args.query),\n },\n },\n);\n```\n\nSee [Server SDK Workers](/docs/server-sdk/workers) for tool handling details.\n\n## Stream Events\n\nWorkers emit the same events as interactive agents, plus worker-specific events:\n\n| Event | Description |\n| --------------- | ---------------------------------- |\n| `worker-start` | Worker execution begins |\n| `worker-result` | Worker completes (includes output) |\n\nAll standard events (text-delta, tool calls, etc.) are also emitted.\n\n## Calling Workers from Interactive Agents\n\nInteractive agents can call workers in three ways:\n\n1. **Deterministically** - Using the `run-worker` block\n2. **Agentically** - LLM calls worker as a tool\n3. **Automatically** - Octavus invokes the worker as part of a built-in capability, not the model. Context management\'s `summarizerWorker` (see [Context Management](/docs/protocol/context-management)) works this way: declare it in `workers:` but leave it out of `agent.workers` so the model never sees it as a tool.\n\n### Worker Declaration\n\nFirst, declare workers in your interactive agent\'s protocol:\n\n```yaml\nworkers:\n generate-title:\n description: Generating conversation title\n display: description\n research-assistant:\n description: Researching topic\n display: stream\n tools:\n search: web-search # Map worker tool \u2192 parent tool\n```\n\n### run-worker Block\n\nCall a worker deterministically from a handler:\n\n```yaml\nhandlers:\n request-human:\n Generate title:\n block: run-worker\n worker: generate-title\n input:\n CONVERSATION_SUMMARY: SUMMARY\n output: CONVERSATION_TITLE\n```\n\n### LLM Tool Invocation\n\nMake workers available to the LLM:\n\n```yaml\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n workers: [generate-title, research-assistant]\n agentic: true\n```\n\nThe LLM can then call workers as tools during conversation.\n\n### Display Modes\n\nControls how worker execution appears to users. The default for workers is `stream`.\n\n| Mode | Behavior |\n| ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |\n| `hidden` | Worker runs silently. No events reach the client - no `UIWorkerPart` is created. |\n| `name` | Shows a running/done indicator with the worker name. No nested content (text, tool calls, reasoning) is forwarded. |\n| `description` | Shows a running/done indicator with the worker description. No nested content is forwarded. |\n| `stream` | Full visibility. All nested events are forwarded - text, reasoning, tool calls, sources, files. Worker input is included on start. |\n| `title` | Like `description`, but shows the worker\'s `title` field instead of its description. No nested content or input is forwarded. |\n\n**Progressive input streaming:** When a worker with `display: stream` is invoked agentically (LLM calls it as a tool), the `UIWorkerPart` appears in the UI immediately as the LLM starts generating the worker\'s arguments. The worker input streams progressively into the worker part, the same way text tokens stream into a text part. Once input finishes, worker execution begins and nested content flows into the same worker part. There is no intermediate tool card.\n\n**`name` and `description` modes:** Worker input is stripped from the `worker-start` event (it may contain sensitive data). Only the running/done status and the final `worker-result` are forwarded to the parent stream. Use these for workers where the user only needs to know the worker ran, not what it did internally.\n\n**`hidden` mode:** The worker executes normally but produces no UI presence at all. Use for internal workers that are implementation details.\n\n### Tool Mapping\n\nMap parent tools to worker tools when the worker needs access to your tool handlers:\n\n```yaml\nworkers:\n research-assistant:\n description: Research topics\n tools:\n search: web-search # Worker\'s "search" \u2192 parent\'s "web-search"\n```\n\nWhen the worker calls its `search` tool, your `web-search` handler executes.\n\n## Next Steps\n\n- [Server SDK Workers](/docs/server-sdk/workers) - Executing workers from code\n- [Handlers](/docs/protocol/handlers) - Block reference for steps\n- [Agent Config](/docs/protocol/agent-config) - Model and settings\n',
1434
1461
  excerpt: "Workers Workers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value. When to...",
1435
1462
  order: 11
1436
1463
  },
@@ -1448,7 +1475,7 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1448
1475
  section: "protocol",
1449
1476
  title: "MCP Servers",
1450
1477
  description: "Connecting agents to external tools via Model Context Protocol (MCP).",
1451
- content: "\n# MCP Servers\n\nMCP servers extend your agent with tools from external services. Define them in your protocol, and agents automatically discover and use their tools at runtime.\n\nThere are three types of MCP servers:\n\n| Source | Description | Example |\n| ---------- | ----------------------------------------------------------------------------------- | ------------------------------------- |\n| `remote` | HTTP-based MCP servers, managed by the platform | Figma, Sentry, GitHub |\n| `device` | Local MCP servers running on the agent's machine via `@octavus/computer` | Browser automation, filesystem |\n| `consumer` | Inline MCP servers defined in your server-sdk process via `createInlineMcpServer()` | Custom integrations, third-party APIs |\n\n## Defining MCP Servers\n\nMCP servers are defined in the `mcpServers:` section. The key becomes the **namespace** for all tools from that server.\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n```\n\n### Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `description` | Yes | What the MCP server provides |\n| `source` | Yes | `remote`, `device`, or `consumer` (see source types above) |\n| `display` | No | How tool calls appear in UI: `hidden`, `name`, `description` (default: `description`) |\n| `connection` | No | When to connect: `eager` or `lazy` (default: `lazy`). `remote` only. |\n| `execution` | No | Where the MCP process runs: `sandbox` (default) or `device`. `remote` only. See [Device Execution](#device-execution). |\n\n### Display Modes\n\nDisplay modes control visibility of all tool calls from the MCP server, using the same modes as [regular tools](/docs/protocol/tools#display-modes):\n\n| Mode | Behavior |\n| ------------- | -------------------------------------- |\n| `hidden` | Tool calls run silently |\n| `name` | Shows tool name while executing |\n| `description` | Shows tool description while executing |\n\n## Making MCP Servers Available\n\nLike tools, MCP servers defined in `mcpServers:` must be referenced in `agent.mcpServers` to be available:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry, browser, filesystem]\n tools: [set-chat-title]\n agentic: true\n maxSteps: 100\n```\n\n## Tool Namespacing\n\nAll MCP tools are automatically namespaced using `__` (double underscore) as a separator. The namespace comes from the `mcpServers` key.\n\nFor example, a server defined as `browser:` that exposes `navigate_page` and `click` produces:\n\n- `browser__navigate_page`\n- `browser__click`\n\nA server defined as `figma:` that exposes `get_design_context` produces:\n\n- `figma__get_design_context`\n\nThe namespace is stripped before calling the MCP server - the server receives the original tool name. This convention matches Anthropic's MCP integration in Claude Desktop and ensures tool names stay unique across servers.\n\n### What the LLM Sees\n\nWhen an agent has both regular tools and MCP servers configured, the LLM sees all tools combined:\n\n```\nProtocol tools:\n set-chat-title\n\nRemote MCP tools (auto-discovered):\n figma__get_design_context\n figma__get_screenshot\n sentry__get_issues\n sentry__get_issue_details\n\nDevice MCP tools (auto-discovered):\n browser__navigate_page\n browser__click\n browser__take_snapshot\n filesystem__read_file\n filesystem__write_file\n filesystem__list_directory\n\nConsumer MCP tools (provided by the server-sdk):\n github__get-pr-overview\n github__list-issues\n```\n\nYou don't define individual MCP tool schemas in the protocol - remote and device tools are auto-discovered from each MCP server at runtime, and consumer tools are supplied by the server-sdk.\n\n## Remote MCP Servers\n\nRemote MCP servers (`source: remote`) connect to HTTP-based MCP endpoints. The platform manages the connection, authentication, and tool discovery.\n\nConfiguration happens in the Octavus platform UI:\n\n1. Add an MCP server to your project (URL + authentication)\n2. The server's slug must match the namespace in your protocol\n3. The platform connects, discovers tools, and makes them available to the agent\n\n### Connection Modes\n\nThe `connection` field controls when the platform connects to a remote MCP server:\n\n| Mode | Behavior |\n| ------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `lazy` | (default) The agent activates integrations on demand at runtime. The agent starts responding immediately. |\n| `eager` | The platform connects and discovers tools before the first LLM request. Tools are guaranteed available from message 1. |\n\n```yaml\nmcpServers:\n sentry:\n source: remote\n connection: eager # Always connected upfront\n display: name\n\n notion:\n source: remote\n # connection defaults to lazy - agent activates when needed\n display: description\n```\n\nWith **lazy connection** (the default), the agent receives two built-in tools - one for listing available integrations and one for activating them. The agent decides which integrations it needs based on the conversation and activates them on demand. This avoids paying connection latency for integrations the agent doesn't end up using.\n\nWith **eager connection**, the platform connects to the MCP server before the first LLM request, exactly like a declared tool. Use this when the agent needs the MCP's tools from the very first message.\n\nThe `connection` field is only valid on `source: remote` - device MCPs (`source: device`) have their own connection mechanism through the server-sdk. The `connection` field is respected for remote MCPs with `execution: device` the same way as sandbox MCPs.\n\n### Authentication\n\nRemote MCP servers support multiple authentication methods:\n\n| Auth Type | Description |\n| --------- | ------------------------------- |\n| MCP OAuth | Standard MCP OAuth flow |\n| API Key | Static API key sent as a header |\n| Bearer | Bearer token authentication |\n| None | No authentication required |\n\nAuthentication is configured per-project - different projects can connect to the same MCP server with different credentials.\n\n## Device Execution\n\nThe `execution` field controls where a remote MCP server's STDIO process runs. By default (`execution: sandbox`), the process runs in the platform's sandbox. When set to `execution: device`, the STDIO process runs on the agent's computer (VM or desktop) instead.\n\n```yaml\nmcpServers:\n code-tools:\n description: Code analysis and refactoring tools\n source: remote\n execution: device # STDIO process runs on the agent's computer\n display: name\n\n sentry:\n description: Error tracking\n source: remote\n # execution defaults to sandbox - runs in the platform\n display: name\n```\n\n### When to Use\n\nUse `execution: device` when the MCP server needs access to the agent's local environment - for example, tools that read from the local filesystem, interact with running processes, or need CLIs installed on the device.\n\n### Rules\n\n- `execution` is only meaningful for `source: remote` MCPs that use STDIO transport. HTTP-transport remote MCPs always connect from the platform regardless of the `execution` setting.\n- `execution` is **invalid** on `source: device` and `source: consumer` MCPs - they already run outside the platform. Using it produces a validation error.\n- The `connection` field (`eager` or `lazy`) is respected for device-executed MCPs the same way as sandbox-executed MCPs.\n\n## Device MCP Servers\n\nDevice MCP servers (`source: device`) run on the consumer's machine. The consumer provides the MCP tools via the `@octavus/computer` package (or any `ToolProvider` implementation) through the server-sdk.\n\nWhen an agent has device MCP servers:\n\n1. The consumer creates a `Computer` with matching namespaces\n2. `@octavus/computer` discovers tools from each MCP server\n3. Tool schemas are sent to the platform via the server-sdk\n4. Tool calls flow back to the consumer for execution\n\nSee [`@octavus/computer`](/docs/server-sdk/computer) for the full integration guide.\n\n### Namespace Matching\n\nThe `mcpServers` keys in the protocol must match the keys in the consumer's `Computer` configuration:\n\n```yaml\n# protocol.yaml\nmcpServers:\n browser: # \u2190 must match\n source: device\n filesystem: # \u2190 must match\n source: device\n```\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n },\n});\n```\n\nIf the consumer provides a namespace not declared in the protocol, the platform rejects it.\n\n## Consumer MCP Servers\n\nConsumer MCP servers (`source: consumer`) are defined inline in your server-sdk process. Tool schemas and handlers live in your code; the platform learns the namespace from the protocol and routes tool calls to your process via the same `dynamicToolSchemas` channel that device MCPs use.\n\n```yaml\nmcpServers:\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n\nagent:\n mcpServers:\n - github\n```\n\nThe protocol declaration is intentionally minimal - the SDK supplies tool names and JSON schemas at runtime, so adding or evolving tools doesn't require a protocol change.\n\nUse consumer MCPs when:\n\n- The integration's credentials should never reach the platform (OAuth tokens, customer API keys).\n- You want to group an integration's tools (`github__list-prs`, `github__get-issue`) without enumerating each one in YAML.\n- You want type-safe handler arguments via Zod schemas.\n\nSee [`createInlineMcpServer`](/docs/server-sdk/inline-mcp) in the server-sdk reference for the full implementation guide.\n\n### Namespace Matching\n\nThe protocol namespace must match the namespace passed to `createInlineMcpServer()`:\n\n```yaml\n# protocol.yaml\nmcpServers:\n github: # \u2190 must match\n source: consumer\n```\n\n```typescript\nconst github = createInlineMcpServer('github', {\n /* tools... */\n});\n\nsession = client.agentSessions.attach(sessionId, {\n mcpServers: [github],\n});\n```\n\nIf the SDK provides a namespace not declared in the protocol, those tools are filtered out at the runtime boundary and the LLM never sees them.\n\n## Thread-Level Scoping\n\nThreads can scope which MCP servers are available, the same way they scope [tools](/docs/protocol/handlers#start-thread):\n\n```yaml\nhandlers:\n user-message:\n Start research:\n block: start-thread\n thread: research\n mcpServers: [figma, browser]\n tools: [set-chat-title]\n system: research-prompt\n```\n\nThis thread can use Figma and browser tools, but not sentry or filesystem - even if those are available on the main agent.\n\n## On-Demand MCP Servers\n\nBy default, an agent can only call MCP tools whose namespace is listed in `mcpServers`. With `onDemandMcpServers`, a scope can opt into **every connected MCP of a given source** at runtime, without enumerating each one in the protocol.\n\nRemote MCPs are connected at the project level from the Octavus dashboard. Normally, each connected MCP that the agent should be able to use has to be declared in the protocol - connecting a new MCP means editing the protocol and redeploying. `onDemandMcpServers` removes that round-trip: once a source is opted in, any MCP connected to the project under that source becomes available to the agent immediately.\n\nCurrently supported for `source: remote`.\n\n### Protocol-level declaration\n\nAdd an `onDemandMcpServers:` section alongside `mcpServers:`, keyed by source. Each entry configures how the matched MCPs appear in tool lists:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\nonDemandMcpServers:\n remote:\n description: Additional connected integrations\n display: name\n execution: device # on-demand MCPs run on the agent's computer\n contextRetention:\n toolResults: { retainLast: 5 }\n```\n\nOn-demand MCP definitions also support the `execution` field. When set, all MCPs matched by that on-demand source inherit the execution mode.\n\n### Scope-level opt-in\n\nThe agent and individual `start-thread` blocks each choose whether to pick up on-demand MCPs, by listing the sources they want:\n\n```yaml\nagent:\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n\nhandlers:\n user-message:\n focused:\n block: start-thread\n mcpServers: [figma]\n # no onDemandMcpServers - this thread does NOT see on-demand MCPs\n broad:\n block: start-thread\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n```\n\n### Rules\n\n- A scope's tool list includes every **connected** MCP of any referenced source, whether or not any protocol declares that slug.\n- Undeclared namespaces inherit `description`, `display`, and `contextRetention` from the per-source entry in `onDemandMcpServers`.\n- Scopes decide independently - threads do not inherit `onDemandMcpServers` from their parent, the same rule as `mcpServers:`.\n- Tool namespaces are always the connector's slug (for example `notion__search`, `linear__create_issue`). Source keys are never namespaces.\n\nWorkers opt into on-demand MCPs the same way: through `start-thread` blocks inside `steps`. A worker without a `start-thread` that lists a source won't see on-demand MCPs of that source.\n\n## Workers\n\nWorkers can declare and use MCP servers using the same `mcpServers:` syntax. Workers resolve their own MCP connections independently - they don't inherit from a parent interactive agent.\n\n```yaml\n# Worker protocol\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nSince workers don't have a global `agent:` section, MCP servers are scoped per-thread via `start-thread` - the same way tools and skills work in workers. Remote MCP connections are project-scoped, so workers in the same project share the same OAuth connections.\n\nSee [Workers](/docs/protocol/workers) for the full worker protocol reference.\n\n## Full Example\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n shell:\n description: Shell command execution\n source: device\n display: name\n\ntools:\n set-chat-title:\n description: Set the title of the current chat.\n parameters:\n title: { type: string, description: The title to set }\n\nagent:\n model: anthropic/claude-opus-4-6\n system: system\n mcpServers: [figma, sentry, browser, filesystem, shell]\n tools: [set-chat-title]\n thinking: medium\n maxSteps: 300\n agentic: true\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n### Cloud-Only Agent\n\nAgents that only use remote MCP servers don't need `@octavus/computer`:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager # Need design tools from message 1\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n # Lazy (default) - agent activates when debugging is needed\n display: name\n\ntools:\n submit-code:\n description: Submit code to the user.\n parameters:\n code: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry]\n tools: [submit-code]\n agentic: true\n```\n",
1478
+ content: "\n# MCP Servers\n\nMCP servers extend your agent with tools from external services. Define them in your protocol, and agents automatically discover and use their tools at runtime.\n\nThere are three types of MCP servers:\n\n| Source | Description | Example |\n| ---------- | ----------------------------------------------------------------------------------- | ------------------------------------- |\n| `remote` | HTTP-based MCP servers, managed by the platform | Figma, Sentry, GitHub |\n| `device` | Local MCP servers running on the agent's machine via `@octavus/computer` | Browser automation, filesystem |\n| `consumer` | Inline MCP servers defined in your server-sdk process via `createInlineMcpServer()` | Custom integrations, third-party APIs |\n\n## Defining MCP Servers\n\nMCP servers are defined in the `mcpServers:` section. The key becomes the **namespace** for all tools from that server.\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n```\n\n### Fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `description` | Yes | What the MCP server provides |\n| `source` | Yes | `remote`, `device`, or `consumer` (see source types above) |\n| `display` | No | How tool calls appear in UI: `hidden`, `name`, `description`, `stream`, `title` (default: `description`) |\n| `title` | No | UI label shown when `display: title`; applies to every tool in this namespace (hides description and arguments) |\n| `connection` | No | When to connect: `eager` or `lazy` (default: `lazy`). `remote` only. |\n| `execution` | No | Where the MCP process runs: `sandbox` (default) or `device`. `remote` only. See [Device Execution](#device-execution). |\n\n### Display Modes\n\nDisplay modes control visibility of all tool calls from the MCP server, using the same modes as [regular tools](/docs/protocol/tools#display-modes):\n\n| Mode | Behavior |\n| ------------- | -------------------------------------- |\n| `hidden` | Tool calls run silently |\n| `name` | Shows tool name while executing |\n| `description` | Shows tool description while executing |\n\n## Making MCP Servers Available\n\nLike tools, MCP servers defined in `mcpServers:` must be referenced in `agent.mcpServers` to be available:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry, browser, filesystem]\n tools: [set-chat-title]\n agentic: true\n maxSteps: 100\n```\n\n## Tool Namespacing\n\nAll MCP tools are automatically namespaced using `__` (double underscore) as a separator. The namespace comes from the `mcpServers` key.\n\nFor example, a server defined as `browser:` that exposes `navigate_page` and `click` produces:\n\n- `browser__navigate_page`\n- `browser__click`\n\nA server defined as `figma:` that exposes `get_design_context` produces:\n\n- `figma__get_design_context`\n\nThe namespace is stripped before calling the MCP server - the server receives the original tool name. This convention matches Anthropic's MCP integration in Claude Desktop and ensures tool names stay unique across servers.\n\n### What the LLM Sees\n\nWhen an agent has both regular tools and MCP servers configured, the LLM sees all tools combined:\n\n```\nProtocol tools:\n set-chat-title\n\nRemote MCP tools (auto-discovered):\n figma__get_design_context\n figma__get_screenshot\n sentry__get_issues\n sentry__get_issue_details\n\nDevice MCP tools (auto-discovered):\n browser__navigate_page\n browser__click\n browser__take_snapshot\n filesystem__read_file\n filesystem__write_file\n filesystem__list_directory\n\nConsumer MCP tools (provided by the server-sdk):\n github__get-pr-overview\n github__list-issues\n```\n\nYou don't define individual MCP tool schemas in the protocol - remote and device tools are auto-discovered from each MCP server at runtime, and consumer tools are supplied by the server-sdk.\n\n## Remote MCP Servers\n\nRemote MCP servers (`source: remote`) connect to HTTP-based MCP endpoints. The platform manages the connection, authentication, and tool discovery.\n\nConfiguration happens in the Octavus platform UI:\n\n1. Add an MCP server to your project (URL + authentication)\n2. The server's slug must match the namespace in your protocol\n3. The platform connects, discovers tools, and makes them available to the agent\n\n### Connection Modes\n\nThe `connection` field controls when the platform connects to a remote MCP server:\n\n| Mode | Behavior |\n| ------- | ---------------------------------------------------------------------------------------------------------------------- |\n| `lazy` | (default) The agent activates integrations on demand at runtime. The agent starts responding immediately. |\n| `eager` | The platform connects and discovers tools before the first LLM request. Tools are guaranteed available from message 1. |\n\n```yaml\nmcpServers:\n sentry:\n source: remote\n connection: eager # Always connected upfront\n display: name\n\n notion:\n source: remote\n # connection defaults to lazy - agent activates when needed\n display: description\n```\n\nWith **lazy connection** (the default), the agent receives two built-in tools - one for listing available integrations and one for activating them. The agent decides which integrations it needs based on the conversation and activates them on demand. This avoids paying connection latency for integrations the agent doesn't end up using.\n\nWith **eager connection**, the platform connects to the MCP server before the first LLM request, exactly like a declared tool. Use this when the agent needs the MCP's tools from the very first message.\n\nThe `connection` field is only valid on `source: remote` - device MCPs (`source: device`) have their own connection mechanism through the server-sdk. The `connection` field is respected for remote MCPs with `execution: device` the same way as sandbox MCPs.\n\n### Authentication\n\nRemote MCP servers support multiple authentication methods:\n\n| Auth Type | Description |\n| --------- | ------------------------------- |\n| MCP OAuth | Standard MCP OAuth flow |\n| API Key | Static API key sent as a header |\n| Bearer | Bearer token authentication |\n| None | No authentication required |\n\nAuthentication is configured per-project - different projects can connect to the same MCP server with different credentials.\n\n## Device Execution\n\nThe `execution` field controls where a remote MCP server's STDIO process runs. By default (`execution: sandbox`), the process runs in the platform's sandbox. When set to `execution: device`, the STDIO process runs on the agent's computer (VM or desktop) instead.\n\n```yaml\nmcpServers:\n code-tools:\n description: Code analysis and refactoring tools\n source: remote\n execution: device # STDIO process runs on the agent's computer\n display: name\n\n sentry:\n description: Error tracking\n source: remote\n # execution defaults to sandbox - runs in the platform\n display: name\n```\n\n### When to Use\n\nUse `execution: device` when the MCP server needs access to the agent's local environment - for example, tools that read from the local filesystem, interact with running processes, or need CLIs installed on the device.\n\n### Rules\n\n- `execution` is only meaningful for `source: remote` MCPs that use STDIO transport. HTTP-transport remote MCPs always connect from the platform regardless of the `execution` setting.\n- `execution` is **invalid** on `source: device` and `source: consumer` MCPs - they already run outside the platform. Using it produces a validation error.\n- The `connection` field (`eager` or `lazy`) is respected for device-executed MCPs the same way as sandbox-executed MCPs.\n\n## Device MCP Servers\n\nDevice MCP servers (`source: device`) run on the consumer's machine. The consumer provides the MCP tools via the `@octavus/computer` package (or any `ToolProvider` implementation) through the server-sdk.\n\nWhen an agent has device MCP servers:\n\n1. The consumer creates a `Computer` with matching namespaces\n2. `@octavus/computer` discovers tools from each MCP server\n3. Tool schemas are sent to the platform via the server-sdk\n4. Tool calls flow back to the consumer for execution\n\nSee [`@octavus/computer`](/docs/server-sdk/computer) for the full integration guide.\n\n### Namespace Matching\n\nThe `mcpServers` keys in the protocol must match the keys in the consumer's `Computer` configuration:\n\n```yaml\n# protocol.yaml\nmcpServers:\n browser: # \u2190 must match\n source: device\n filesystem: # \u2190 must match\n source: device\n```\n\n```typescript\nconst computer = new Computer({\n mcpServers: {\n browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),\n filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),\n },\n});\n```\n\nIf the consumer provides a namespace not declared in the protocol, the platform rejects it.\n\n## Consumer MCP Servers\n\nConsumer MCP servers (`source: consumer`) are defined inline in your server-sdk process. Tool schemas and handlers live in your code; the platform learns the namespace from the protocol and routes tool calls to your process via the same `dynamicToolSchemas` channel that device MCPs use.\n\n```yaml\nmcpServers:\n github:\n description: Repository management - issues, pull requests, code\n source: consumer\n display: name\n\nagent:\n mcpServers:\n - github\n```\n\nThe protocol declaration is intentionally minimal - the SDK supplies tool names and JSON schemas at runtime, so adding or evolving tools doesn't require a protocol change.\n\nUse consumer MCPs when:\n\n- The integration's credentials should never reach the platform (OAuth tokens, customer API keys).\n- You want to group an integration's tools (`github__list-prs`, `github__get-issue`) without enumerating each one in YAML.\n- You want type-safe handler arguments via Zod schemas.\n\nSee [`createInlineMcpServer`](/docs/server-sdk/inline-mcp) in the server-sdk reference for the full implementation guide.\n\n### Namespace Matching\n\nThe protocol namespace must match the namespace passed to `createInlineMcpServer()`:\n\n```yaml\n# protocol.yaml\nmcpServers:\n github: # \u2190 must match\n source: consumer\n```\n\n```typescript\nconst github = createInlineMcpServer('github', {\n /* tools... */\n});\n\nsession = client.agentSessions.attach(sessionId, {\n mcpServers: [github],\n});\n```\n\nIf the SDK provides a namespace not declared in the protocol, those tools are filtered out at the runtime boundary and the LLM never sees them.\n\n## Thread-Level Scoping\n\nThreads can scope which MCP servers are available, the same way they scope [tools](/docs/protocol/handlers#start-thread):\n\n```yaml\nhandlers:\n user-message:\n Start research:\n block: start-thread\n thread: research\n mcpServers: [figma, browser]\n tools: [set-chat-title]\n system: research-prompt\n```\n\nThis thread can use Figma and browser tools, but not sentry or filesystem - even if those are available on the main agent.\n\n## On-Demand MCP Servers\n\nBy default, an agent can only call MCP tools whose namespace is listed in `mcpServers`. With `onDemandMcpServers`, a scope can opt into **every connected MCP of a given source** at runtime, without enumerating each one in the protocol.\n\nRemote MCPs are connected at the project level from the Octavus dashboard. Normally, each connected MCP that the agent should be able to use has to be declared in the protocol - connecting a new MCP means editing the protocol and redeploying. `onDemandMcpServers` removes that round-trip: once a source is opted in, any MCP connected to the project under that source becomes available to the agent immediately.\n\nCurrently supported for `source: remote`.\n\n### Protocol-level declaration\n\nAdd an `onDemandMcpServers:` section alongside `mcpServers:`, keyed by source. Each entry configures how the matched MCPs appear in tool lists:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n display: description\n\nonDemandMcpServers:\n remote:\n description: Additional connected integrations\n display: name\n execution: device # on-demand MCPs run on the agent's computer\n contextRetention:\n toolResults: { retainLast: 5 }\n```\n\nOn-demand MCP definitions also support the `execution` field. When set, all MCPs matched by that on-demand source inherit the execution mode.\n\n### Scope-level opt-in\n\nThe agent and individual `start-thread` blocks each choose whether to pick up on-demand MCPs, by listing the sources they want:\n\n```yaml\nagent:\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n\nhandlers:\n user-message:\n focused:\n block: start-thread\n mcpServers: [figma]\n # no onDemandMcpServers - this thread does NOT see on-demand MCPs\n broad:\n block: start-thread\n mcpServers: [figma]\n onDemandMcpServers: [remote]\n```\n\n### Rules\n\n- A scope's tool list includes every **connected** MCP of any referenced source, whether or not any protocol declares that slug.\n- Undeclared namespaces inherit `description`, `display`, and `contextRetention` from the per-source entry in `onDemandMcpServers`.\n- Scopes decide independently - threads do not inherit `onDemandMcpServers` from their parent, the same rule as `mcpServers:`.\n- Tool namespaces are always the connector's slug (for example `notion__search`, `linear__create_issue`). Source keys are never namespaces.\n\nWorkers opt into on-demand MCPs the same way: through `start-thread` blocks inside `steps`. A worker without a `start-thread` that lists a source won't see on-demand MCPs of that source.\n\n## Workers\n\nWorkers can declare and use MCP servers using the same `mcpServers:` syntax. Workers resolve their own MCP connections independently - they don't inherit from a parent interactive agent.\n\n```yaml\n# Worker protocol\nmcpServers:\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n\nsteps:\n Start research:\n block: start-thread\n thread: research\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [sentry, browser]\n maxSteps: 10\n```\n\nSince workers don't have a global `agent:` section, MCP servers are scoped per-thread via `start-thread` - the same way tools and skills work in workers. Remote MCP connections are project-scoped, so workers in the same project share the same OAuth connections.\n\nSee [Workers](/docs/protocol/workers) for the full worker protocol reference.\n\n## Full Example\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n display: name\n browser:\n description: Chrome DevTools browser automation\n source: device\n display: name\n filesystem:\n description: Filesystem access for reading and writing files\n source: device\n display: hidden\n shell:\n description: Shell command execution\n source: device\n display: name\n\ntools:\n set-chat-title:\n description: Set the title of the current chat.\n parameters:\n title: { type: string, description: The title to set }\n\nagent:\n model: anthropic/claude-opus-4-6\n system: system\n mcpServers: [figma, sentry, browser, filesystem, shell]\n tools: [set-chat-title]\n thinking: medium\n maxSteps: 300\n agentic: true\n\ntriggers:\n user-message:\n input:\n USER_MESSAGE: { type: string }\n\nhandlers:\n user-message:\n Add message:\n block: add-message\n role: user\n prompt: user-message\n input: [USER_MESSAGE]\n display: hidden\n\n Respond:\n block: next-message\n```\n\n### Cloud-Only Agent\n\nAgents that only use remote MCP servers don't need `@octavus/computer`:\n\n```yaml\nmcpServers:\n figma:\n description: Figma design tool integration\n source: remote\n connection: eager # Need design tools from message 1\n display: description\n sentry:\n description: Error tracking and debugging\n source: remote\n # Lazy (default) - agent activates when debugging is needed\n display: name\n\ntools:\n submit-code:\n description: Submit code to the user.\n parameters:\n code: { type: string }\n\nagent:\n model: anthropic/claude-sonnet-4-5\n system: system\n mcpServers: [figma, sentry]\n tools: [submit-code]\n agentic: true\n```\n",
1452
1479
  excerpt: "MCP Servers MCP servers extend your agent with tools from external services. Define them in your protocol, and agents automatically discover and use their tools at runtime. There are three types of...",
1453
1480
  order: 13
1454
1481
  },
@@ -1541,6 +1568,41 @@ See [Streaming Events](/docs/server-sdk/streaming#event-types) for the full list
1541
1568
  order: 3
1542
1569
  }
1543
1570
  ]
1571
+ },
1572
+ {
1573
+ slug: "workforce-agents",
1574
+ title: "Workforce Agents",
1575
+ description: "Automate Octavus Agents over the API - dispatch a task, poll for completion, and read the result.",
1576
+ order: 7,
1577
+ docs: [
1578
+ {
1579
+ slug: "workforce-agents/overview",
1580
+ section: "workforce-agents",
1581
+ title: "Overview",
1582
+ description: "What Workforce Agents are and how to drive them programmatically.",
1583
+ content: "\n# Workforce Agents\n\nWorkforce Agents are Octavus's autonomous AI teammates - specialized agents you hire and configure in the dashboard, each with its own computer, skills, and tools. Browse and hire them at [octavus.ai/agents](https://octavus.ai/agents), and see the full roster at [octavus.ai/agents/discover](https://octavus.ai/agents/discover).\n\nThe **Workforce Agents API** lets you drive one of your agents without a browser: give it a task, wait for it to finish, and read what it did. It is built for automation - CI pipelines, scheduled jobs, and backend integrations that hand work to an agent and collect the result.\n\n> Workforce Agents are distinct from agents you build yourself with the SDK. The [Sessions API](/docs/api-reference/sessions) drives agents you define and host; the Workforce Agents API drives the managed agents you hire and configure in the dashboard.\n\n## How it works\n\nA run follows a simple dispatch-then-poll model - there is no stream or webhook to manage:\n\n1. **Dispatch** a message to an agent. This starts a **thread** (a conversation) and returns a `threadId` right away.\n2. **Poll** the thread until its `status` is terminal.\n3. **Read** the thread's messages to get the agent's work.\n\n```mermaid\nsequenceDiagram\n participant C as Your code\n participant A as Workforce Agent\n C->>A: dispatch a task\n A-->>C: threadId and status\n loop until terminal status\n C->>A: get thread\n A-->>C: status and messages\n end\n```\n\n### Thread status\n\n| Status | Meaning |\n| ----------- | ----------------------------------------------------------------------------------- |\n| `pending` | Dispatched, waiting to start |\n| `queued` | Waiting for the agent to free up - an agent runs one task at a time on its computer |\n| `running` | The agent is working |\n| `completed` | Finished successfully (terminal) |\n| `failed` | Ended with an error - see `failureReason` (terminal) |\n| `cancelled` | Stopped (terminal) |\n\nPoll while the status is `pending`, `queued`, or `running`, and stop once it is `completed`, `failed`, or `cancelled`.\n\n## Authentication\n\nEach agent has its own **API key**, created from the agent's settings in the dashboard (Settings -> API). The key:\n\n- Drives only that one agent - a key for one agent can never call another.\n- Is a secret. Use it from a backend, script, or CI, never in a browser or client app.\n- Is sent as a bearer token: `Authorization: Bearer oct_agt_...`.\n\nCreate a key, copy it once (it is shown only at creation), and store it securely.\n\n## Next steps\n\n- [Using the SDK](/docs/workforce-agents/sdk) - the `@octavus/server-sdk` `workforce` client, including a run-and-wait helper.\n- [API reference](/docs/workforce-agents/api-reference) - the REST endpoints, for any language.\n",
1584
+ excerpt: "Workforce Agents Workforce Agents are Octavus's autonomous AI teammates - specialized agents you hire and configure in the dashboard, each with its own computer, skills, and tools. Browse and hire...",
1585
+ order: 1
1586
+ },
1587
+ {
1588
+ slug: "workforce-agents/sdk",
1589
+ section: "workforce-agents",
1590
+ title: "Using the SDK",
1591
+ description: "Drive a Workforce Agent with the @octavus/server-sdk workforce client.",
1592
+ content: "\n# Using the SDK\n\n`@octavus/server-sdk` ships a `workforce` client for driving an agent with its per-agent key.\n\n## Install\n\n```bash\nnpm install @octavus/server-sdk@5.1.0\n```\n\n## Set up the client\n\nConstruct `OctavusClient` with your agent's API key:\n\n```ts\nimport { OctavusClient } from '@octavus/server-sdk';\n\nconst client = new OctavusClient({\n baseUrl: 'https://octavus.ai',\n apiKey: process.env.OCTAVUS_AGENT_KEY, // oct_agt_...\n});\n```\n\n## Run a task and wait for the result\n\n`run()` is the all-in-one method: it dispatches a message, polls until the run finishes, and returns the completed thread.\n\n```ts\nconst thread = await client.workforce.run(agentId, 'Summarize the latest sales report');\n\nconsole.log(thread.status); // 'completed' | 'failed' | 'cancelled'\nconsole.log(thread.messages); // the full conversation, including the agent's reply\n```\n\nThe agent's latest turn is at the end of `thread.messages`.\n\n## Dispatch and poll manually\n\nFor more control, dispatch and read the thread yourself:\n\n```ts\nconst { threadId } = await client.workforce.dispatch(agentId, 'Research our top 3 competitors');\n\nconst thread = await client.workforce.getThread(agentId, threadId);\nif (thread.status === 'completed') {\n // ...\n}\n```\n\n`waitForCompletion()` does the polling loop for you until the thread is terminal:\n\n```ts\nconst { threadId } = await client.workforce.dispatch(agentId, 'Draft the Q3 board update');\nconst thread = await client.workforce.waitForCompletion(agentId, threadId);\n```\n\n## Follow up in the same thread\n\nContinue a thread with another message. It runs after the current turn finishes.\n\n```ts\nawait client.workforce.followUp(agentId, threadId, 'Now turn that into a slide deck');\nconst thread = await client.workforce.waitForCompletion(agentId, threadId);\n```\n\n## Options\n\n`run()` and `waitForCompletion()` accept polling options. Full runs can take several minutes, so the defaults are generous.\n\n| Option | Type | Default | Description |\n| ---------------- | ----------- | -------- | --------------------------------------------- |\n| `pollIntervalMs` | number | `3000` | Delay between status checks |\n| `timeoutMs` | number | `900000` | Max time to wait before throwing (15 minutes) |\n| `signal` | AbortSignal | - | Cancel the wait early |\n\n`dispatch()`, `followUp()`, and `run()` also accept `files` (an array of `FileReference`) to attach hosted files to the message.\n\n```ts\nconst thread = await client.workforce.run(agentId, 'Review this spec', {\n timeoutMs: 30 * 60 * 1000, // 30 minutes\n pollIntervalMs: 5000,\n});\n```\n\nIf the timeout elapses first, `waitForCompletion()` and `run()` throw. The run keeps going on the server, so you can read it later with `getThread()`.\n\n## Result shape\n\n`getThread()`, `waitForCompletion()`, and `run()` return a thread:\n\n| Field | Type | Description |\n| --------------- | -------------- | -------------------------------------------------------------------------------------------------------------- |\n| `threadId` | string | The thread identifier |\n| `status` | string | `idle`, `queued`, `pending`, `running`, `completed`, `failed`, or `cancelled` |\n| `failureReason` | string \\| null | Why the run failed, when `status` is `failed` |\n| `messages` | UIMessage[] | The conversation - text, tool and skill steps, and files (see [UIMessage parts](/docs/api-reference/sessions)) |\n\nUse `isTerminalThreadStatus(status)` to check whether a run has finished.\n\nThe same operations are available as plain HTTP - see the [API reference](/docs/workforce-agents/api-reference).\n",
1593
+ excerpt: "Using the SDK ships a client for driving an agent with its per-agent key. Install Set up the client Construct with your agent's API key: Run a task and wait for the result is the all-in-one...",
1594
+ order: 2
1595
+ },
1596
+ {
1597
+ slug: "workforce-agents/api-reference",
1598
+ section: "workforce-agents",
1599
+ title: "API Reference",
1600
+ description: "REST endpoints for the Workforce Agents API.",
1601
+ content: '\n# Workforce Agents API\n\nDrive a single agent over HTTP. Every request is authenticated with that agent\'s API key, sent as a bearer token:\n\n```\nAuthorization: Bearer oct_agt_...\n```\n\nAll endpoints are scoped to one agent through the `{agentId}` path segment - find the agent ID in the agent\'s page URL in the dashboard. A key only works for the agent it was created for.\n\n## Start a thread\n\nDispatch a message to the agent. This starts a new thread and returns immediately.\n\n```\nPOST /api/v1/workforce/agents/{agentId}/threads\n```\n\n### Request Body\n\n```json\n{\n "message": "Summarize the latest sales report"\n}\n```\n\n| Field | Type | Required | Description |\n| --------- | --------------- | -------- | --------------------------------- |\n| `message` | string | Yes | The task or message for the agent |\n| `files` | FileReference[] | No | Hosted file attachments |\n\n### Response\n\nReturns `201`.\n\n```json\n{\n "threadId": "cm5xyz123abc456def",\n "status": "pending"\n}\n```\n\nPoll the [Get a thread](#get-a-thread) endpoint until the status is terminal.\n\n### Example\n\n```bash\ncurl -X POST https://octavus.ai/api/v1/workforce/agents/AGENT_ID/threads \\\n -H "Authorization: Bearer oct_agt_..." \\\n -H "Content-Type: application/json" \\\n -d \'{ "message": "Summarize the latest sales report" }\'\n```\n\n## Get a thread\n\nRead a thread\'s status and messages. Poll this until the run finishes.\n\n```\nGET /api/v1/workforce/agents/{agentId}/threads/{threadId}\n```\n\n### Response\n\n```json\n{\n "threadId": "cm5xyz123abc456def",\n "status": "completed",\n "failureReason": null,\n "messages": []\n}\n```\n\n| Field | Type | Description |\n| --------------- | -------------- | ----------------------------------------------------------------------------- |\n| `threadId` | string | The thread identifier |\n| `status` | string | `idle`, `queued`, `pending`, `running`, `completed`, `failed`, or `cancelled` |\n| `failureReason` | string \\| null | Why the run failed, when `status` is `failed` |\n| `messages` | UIMessage[] | The conversation - see [UIMessage parts](/docs/api-reference/sessions) |\n\nKeep polling while the status is `pending`, `queued`, or `running`. Stop when it is `completed`, `failed`, or `cancelled`.\n\n### Example\n\n```bash\ncurl https://octavus.ai/api/v1/workforce/agents/AGENT_ID/threads/THREAD_ID \\\n -H "Authorization: Bearer oct_agt_..."\n```\n\n## Follow up in a thread\n\nSend another message into an existing thread. If the agent is still working the message runs after the current turn finishes; otherwise it starts immediately.\n\n```\nPOST /api/v1/workforce/agents/{agentId}/threads/{threadId}/messages\n```\n\n### Request Body\n\n```json\n{\n "message": "Now turn that into a slide deck"\n}\n```\n\n| Field | Type | Required | Description |\n| --------- | --------------- | -------- | ----------------------- |\n| `message` | string | Yes | The follow-up message |\n| `files` | FileReference[] | No | Hosted file attachments |\n\n### Response\n\nReturns `202`.\n\n```json\n{\n "threadId": "cm5xyz123abc456def",\n "status": "running"\n}\n```\n\nPoll [Get a thread](#get-a-thread) for the new run\'s result.\n\n### Example\n\n```bash\ncurl -X POST https://octavus.ai/api/v1/workforce/agents/AGENT_ID/threads/THREAD_ID/messages \\\n -H "Authorization: Bearer oct_agt_..." \\\n -H "Content-Type: application/json" \\\n -d \'{ "message": "Now turn that into a slide deck" }\'\n```\n\n## Errors\n\nErrors return `{ "error": string, "code": string }` with an HTTP status:\n\n| Status | Meaning |\n| ------ | ------------------------------------------------- |\n| `401` | Missing or invalid API key |\n| `402` | The agent is blocked by a usage or spending limit |\n| `403` | The key is not authorized for this agent |\n| `404` | The thread does not exist for this agent |\n',
1602
+ excerpt: "Workforce Agents API Drive a single agent over HTTP. Every request is authenticated with that agent's API key, sent as a bearer token: All endpoints are scoped to one agent through the path segment...",
1603
+ order: 3
1604
+ }
1605
+ ]
1544
1606
  }
1545
1607
  ];
1546
1608
 
@@ -1574,4 +1636,4 @@ export {
1574
1636
  getDocSlugs,
1575
1637
  getSectionBySlug
1576
1638
  };
1577
- //# sourceMappingURL=chunk-Z2OPVMHI.js.map
1639
+ //# sourceMappingURL=chunk-RW7WOZKG.js.map