npm - @octavus/docs - Versions diffs - 2.15.0 → 2.17.0 - Mend

@octavus/docs 2.15.0 → 2.17.0

Files changed (37)

package/content/01-getting-started/02-quickstart.md +1 -0
package/content/02-server-sdk/01-overview.md +26 -0
package/content/02-server-sdk/02-sessions.md +11 -0
package/content/02-server-sdk/03-tools.md +4 -1
package/content/02-server-sdk/04-streaming.md +9 -0
package/content/02-server-sdk/08-computer.md +400 -0
package/content/03-client-sdk/06-http-transport.md +2 -0
package/content/04-protocol/01-overview.md +9 -0
package/content/04-protocol/04-tools.md +5 -4
package/content/04-protocol/05-skills.md +88 -8
package/content/04-protocol/06-handlers.md +3 -1
package/content/04-protocol/07-agent-config.md +65 -17
package/content/04-protocol/09-skills-advanced.md +89 -8
package/content/04-protocol/13-mcp-servers.md +289 -0
package/content/06-examples/02-nextjs-chat.md +1 -0
package/dist/chunk-4PNP4HF5.js +1549 -0
package/dist/chunk-4PNP4HF5.js.map +1 -0
package/dist/{chunk-2UFDUNPK.js → chunk-54ND2CTI.js} +25 -21
package/dist/chunk-54ND2CTI.js.map +1 -0
package/dist/chunk-B4A36GEV.js +1549 -0
package/dist/chunk-B4A36GEV.js.map +1 -0
package/dist/chunk-CFDET7QG.js +1549 -0
package/dist/chunk-CFDET7QG.js.map +1 -0
package/dist/chunk-DKVYIFV7.js +1549 -0
package/dist/chunk-DKVYIFV7.js.map +1 -0
package/dist/{chunk-JEOGYIRI.js → chunk-UZWGRPRR.js} +61 -21
package/dist/chunk-UZWGRPRR.js.map +1 -0
package/dist/content.js +1 -1
package/dist/docs.json +34 -16
package/dist/index.js +1 -1
package/dist/search-index.json +1 -1
package/dist/search.js +1 -1
package/dist/search.js.map +1 -1
package/dist/sections.json +34 -16
package/package.json +1 -1
package/dist/chunk-2UFDUNPK.js.map +0 -1
package/dist/chunk-JEOGYIRI.js.map +0 -1

package/content/01-getting-started/02-quickstart.md CHANGED

@@ -141,6 +141,7 @@ export async function POST(request: Request) {
       'Content-Type': 'text/event-stream',
       'Cache-Control': 'no-cache',
       Connection: 'keep-alive',
+      'X-Accel-Buffering': 'no',
     },
   });
 }

package/content/02-server-sdk/01-overview.md CHANGED

@@ -96,6 +96,31 @@ return new Response(toSSEStream(events), {
 });
 ```
+### Computer Capabilities
+Give agents access to browser, filesystem, and shell via MCP:
+```typescript
+import { Computer } from '@octavus/computer';
+const computer = new Computer({
+  mcpServers: {
+    browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=...']),
+    filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [dir]),
+    shell: Computer.shell({ cwd: dir, mode: 'unrestricted' }),
+  },
+});
+await computer.start();
+const session = client.agentSessions.attach(sessionId, {
+  tools: {
+    'set-chat-title': async (args) => ({ title: args.title }),
+  },
+  computer,
+});
+```
 ### Workers
 Execute worker agents for task-based processing:
@@ -239,3 +264,4 @@ The client uploads files directly to S3 using the presigned upload URL. See [Fil
 - [Streaming](/docs/server-sdk/streaming) — Understanding stream events
 - [Workers](/docs/server-sdk/workers) — Executing worker agents
 - [Debugging](/docs/server-sdk/debugging) — Model request tracing and debugging
+- [Computer](/docs/server-sdk/computer) — Browser, filesystem, and shell via MCP

package/content/02-server-sdk/02-sessions.md CHANGED

@@ -87,9 +87,20 @@ const session = client.agentSessions.attach(sessionId, {
   resources: [
     // Resource watchers (optional)
   ],
+  computer: computer, // Computer capabilities (optional, see Computer documentation)
 });
 ```
+### Attach Options
+| Option      | Type           | Description                                                |
+| ----------- | -------------- | ---------------------------------------------------------- |
+| `tools`     | `ToolHandlers` | Server-side tool handler functions                         |
+| `resources` | `Resource[]`   | Resource watchers for real-time updates                    |
+| `computer`  | `ToolProvider` | Computer capabilities — browser, filesystem, shell via MCP |
+When `computer` is provided, its tool handlers are merged with `tools` (manual handlers take priority on conflict), and its tool schemas are sent to the platform. See [Computer](/docs/server-sdk/computer) for details.
 ## Executing Requests
 Once attached, execute requests on the session using `execute()`:
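
The merge rule stated in this hunk (manual `tools` handlers take priority over `computer` handlers on a name conflict) can be pictured with a small sketch. The `mergeHandlers` function and the simplified `ToolHandler` type are hypothetical illustrations, not the server-sdk's actual implementation.

```typescript
// Hypothetical handler shape for illustration; the real ToolHandler type lives in the SDK.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Spread order matters: later entries overwrite earlier ones,
// so a manual handler replaces a provider handler with the same name.
function mergeHandlers(
  providerHandlers: Record<string, ToolHandler>,
  manualHandlers: Record<string, ToolHandler>,
): Record<string, ToolHandler> {
  return { ...providerHandlers, ...manualHandlers };
}

// Example: both sides define `shell__run_command`; the manual one is kept.
const fromComputer: Record<string, ToolHandler> = {
  shell__run_command: async () => 'from computer',
  browser__navigate_page: async () => 'from computer',
};
const manual: Record<string, ToolHandler> = {
  shell__run_command: async () => 'from manual',
};
const merged = mergeHandlers(fromComputer, manual);
```

Under this assumption, `merged` keeps `browser__navigate_page` from the provider and uses the manual `shell__run_command`.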

package/content/02-server-sdk/03-tools.md CHANGED

@@ -12,13 +12,16 @@ Tools extend what agents can do. In Octavus, tools can execute either on your se
 | Location   | Use Case                                          | Registration                            |
 | ---------- | ------------------------------------------------- | --------------------------------------- |
 | **Server** | Database queries, API calls, sensitive operations | Register handler in `attach()`          |
+| **MCP**    | Browser, filesystem, shell, external services     | Via `computer` option in `attach()`     |
 | **Client** | Browser APIs, interactive UIs, confirmations      | No server handler (forwarded to client) |
 When the Server SDK encounters a tool call:
-1. **Handler exists** → Execute on server, continue automatically
+1. **Handler exists** (server or MCP) → Execute on server, continue automatically
 2. **No handler** → Forward to client via `client-tool-request` event
+MCP tool handlers from `@octavus/computer` are merged with your manual handlers — they work identically from the platform's perspective. See [Computer](/docs/server-sdk/computer) for MCP tool integration.
 For client-side tool handling, see [Client Tools](/docs/client-sdk/client-tools).
 ## Why Server Tools
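
The two-step routing rule in this hunk (a registered handler runs on the server, anything else is forwarded to the client) might look roughly like the sketch below. `routeToolCall` and the `Route` type are hypothetical names chosen for illustration; the SDK's internal dispatch is not shown in this diff.

```typescript
// Hypothetical handler shape for illustration only.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;
type Route = 'server' | 'client';

// Server and MCP handlers share one lookup table, so the routing
// decision does not need to know where a handler came from.
function routeToolCall(toolName: string, handlers: Record<string, ToolHandler>): Route {
  return Object.prototype.hasOwnProperty.call(handlers, toolName) ? 'server' : 'client';
}

const handlers: Record<string, ToolHandler> = {
  'set-chat-title': async (args) => ({ title: args.title }), // manual server handler
  filesystem__read_file: async () => ({ content: '' }), // stand-in for an MCP handler
};

const serverRoute = routeToolCall('filesystem__read_file', handlers);
const clientRoute = routeToolCall('confirm-dialog', handlers); // no handler registered
```

Here `confirm-dialog` is an invented tool name; because no handler is registered for it, the sketch routes it to the client.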

package/content/02-server-sdk/04-streaming.md CHANGED

@@ -27,6 +27,7 @@ return new Response(toSSEStream(events), {
     'Content-Type': 'text/event-stream',
     'Cache-Control': 'no-cache',
     Connection: 'keep-alive',
+    'X-Accel-Buffering': 'no',
   },
 });
@@ -36,6 +37,14 @@ for await (const event of events) {
 }
 ```
+The `X-Accel-Buffering: no` header disables proxy buffering on Nginx-based infrastructure (including Vercel), ensuring SSE events are forwarded immediately instead of being batched.
+### Heartbeat
+`toSSEStream` automatically sends SSE comment lines (`: heartbeat`) every 15 seconds during idle periods. This prevents proxies and load balancers from closing the connection due to inactivity — particularly important during multi-step executions where the stream may be silent while waiting for tool processing or LLM responses.
+Heartbeat comments are ignored by all SSE parsers per the spec. No client-side handling is needed.
 ## Event Types
 The stream emits various event types for lifecycle, text, reasoning, and tool interactions.
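
To see why the heartbeat needs no client-side handling, here is a simplified sketch of SSE line parsing. In the SSE format a line that begins with a colon is a comment, so `: heartbeat` never produces an event. `extractData` is a hypothetical helper, the event payloads are made up, and the sketch treats every `data:` line as its own payload instead of joining multi-line events as a full parser would.

```typescript
// Extracts `data:` payloads from raw SSE text, skipping comment lines.
function extractData(raw: string): string[] {
  const payloads: string[] = [];
  for (const line of raw.split(/\r?\n/)) {
    if (line.startsWith(':')) continue; // comment line, e.g. ": heartbeat"
    if (line.startsWith('data:')) {
      // One optional space after the colon is not part of the value.
      const value = line.slice('data:'.length);
      payloads.push(value.startsWith(' ') ? value.slice(1) : value);
    }
  }
  return payloads;
}

// Two illustrative events with a heartbeat comment between them.
const raw = 'data: {"type":"start"}\n\n: heartbeat\n\ndata: {"type":"finish"}\n\n';
const payloads = extractData(raw);
```

Only the two `data:` payloads end up in `payloads`; the heartbeat line is dropped before any event is built.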

package/content/02-server-sdk/08-computer.md ADDED

@@ -0,0 +1,400 @@
+---
+title: Computer
+description: Adding browser, filesystem, and shell capabilities to agents with @octavus/computer.
+---
+# Computer
+The `@octavus/computer` package gives agents access to a physical or virtual machine's browser, filesystem, and shell. It connects to [MCP](https://modelcontextprotocol.io) servers, discovers their tools, and provides them to the server-sdk.
+**Current version:** `{{VERSION:@octavus/computer}}`
+## Installation
+```bash
+npm install @octavus/computer
+```
+## Quick Start
+```typescript
+import { Computer } from '@octavus/computer';
+import { OctavusClient } from '@octavus/server-sdk';
+const computer = new Computer({
+  mcpServers: {
+    browser: Computer.stdio('chrome-devtools-mcp', ['--browser-url=http://127.0.0.1:9222']),
+    filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', ['/path/to/workspace']),
+    shell: Computer.shell({ cwd: '/path/to/workspace', mode: 'unrestricted' }),
+  },
+});
+await computer.start();
+const client = new OctavusClient({
+  baseUrl: 'https://octavus.ai',
+  apiKey: 'your-api-key',
+});
+const session = client.agentSessions.attach(sessionId, {
+  tools: {
+    'set-chat-title': async (args) => ({ title: args.title }),
+  },
+  computer,
+});
+```
+The `computer` is passed to `attach()` — the server-sdk handles the rest. Tool schemas are sent to the platform, and tool calls flow back through the existing execution loop.
+## How It Works
+1. You configure MCP servers with namespaces (e.g., `browser`, `filesystem`, `shell`)
+2. `computer.start()` connects to all servers in parallel and discovers their tools
+3. Each tool is namespaced with `__` (e.g., `browser__navigate_page`, `filesystem__read_file`)
+4. The server-sdk sends tool schemas to the platform and handles tool call execution
+The agent's protocol must declare matching `mcpServers` with `source: device` — see [MCP Servers](/docs/protocol/mcp-servers).
+## Entry Types
+The `Computer` class supports three types of MCP entries:
+### Stdio (MCP Subprocess)
+Spawns an MCP server as a child process, communicating via stdin/stdout:
+```typescript
+Computer.stdio(command: string, args?: string[], options?: {
+  env?: Record<string, string>;
+  cwd?: string;
+})
+```
+Use this for local MCP servers installed as npm packages or standalone executables:
+```typescript
+const computer = new Computer({
+  mcpServers: {
+    browser: Computer.stdio('chrome-devtools-mcp', [
+      '--browser-url=http://127.0.0.1:9222',
+      '--no-usage-statistics',
+    ]),
+    filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [
+      '/Users/me/projects/my-app',
+    ]),
+  },
+});
+```
+### HTTP (Remote MCP Endpoint)
+Connects to an MCP server over Streamable HTTP:
+```typescript
+Computer.http(url: string, options?: {
+  headers?: Record<string, string>;
+})
+```
+Use this for MCP servers running as HTTP services:
+```typescript
+const computer = new Computer({
+  mcpServers: {
+    docs: Computer.http('http://localhost:3001/mcp', {
+      headers: { Authorization: 'Bearer token' },
+    }),
+  },
+});
+```
+### Shell (Built-in)
+Provides shell command execution without spawning an MCP subprocess:
+```typescript
+Computer.shell(options: {
+  cwd?: string;
+  mode: ShellMode;
+  timeout?: number;  // Default: 300,000ms (5 minutes)
+})
+```
+This exposes a `run_command` tool (namespaced as `shell__run_command` when the key is `shell`). Commands execute in a login shell with the user's full environment.
+```typescript
+const computer = new Computer({
+  mcpServers: {
+    shell: Computer.shell({
+      cwd: '/Users/me/projects/my-app',
+      mode: 'unrestricted',
+      timeout: 300_000,
+    }),
+  },
+});
+```
+#### Shell Safety Modes
+| Mode                                   | Description                                   |
+| -------------------------------------- | --------------------------------------------- |
+| `'unrestricted'`                       | All commands allowed (for dedicated machines) |
+| `{ allowedPatterns, blockedPatterns }` | Pattern-based command filtering               |
+Pattern-based filtering:
+```typescript
+Computer.shell({
+  cwd: workspaceDir,
+  mode: {
+    blockedPatterns: [/rm\s+-rf/, /sudo/],
+    allowedPatterns: [/^git\s/, /^npm\s/, /^ls\s/],
+  },
+});
+```
+When `allowedPatterns` is set, only matching commands are permitted. When `blockedPatterns` is set, matching commands are rejected. Blocked patterns are checked first.
+## Lifecycle
+### Starting
+`computer.start()` connects to all configured MCP servers in parallel. If some servers fail to connect, the computer still starts with the remaining servers — only if _all_ connections fail does it throw an error.
+```typescript
+const { errors } = await computer.start();
+if (errors.length > 0) {
+  console.warn('Some MCP servers failed to connect:', errors);
+}
+```
+### Stopping
+`computer.stop()` closes all MCP connections and kills managed processes:
+```typescript
+await computer.stop();
+```
+Always call `stop()` when the session ends to clean up MCP subprocesses. For managed processes (like Chrome), pass them in the config for automatic cleanup.
+## Chrome Launch Helper
+For desktop applications that need to control a browser, `Computer.launchChrome()` launches Chrome with remote debugging enabled:
+```typescript
+const browser = await Computer.launchChrome({
+  profileDir: '/Users/me/.my-app/chrome-profiles/agent-1',
+  debuggingPort: 9222, // Optional, auto-allocated if omitted
+  flags: ['--window-size=1280,800'],
+});
+console.log(`Chrome running on port ${browser.port}, PID ${browser.pid}`);
+```
+Pass the browser to `managedProcesses` for automatic cleanup when the computer stops:
+```typescript
+const computer = new Computer({
+  mcpServers: {
+    browser: Computer.stdio('chrome-devtools-mcp', [
+      `--browser-url=http://127.0.0.1:${browser.port}`,
+    ]),
+    filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [workspaceDir]),
+    shell: Computer.shell({ cwd: workspaceDir, mode: 'unrestricted' }),
+  },
+  managedProcesses: [{ process: browser.process }],
+});
+```
+### ChromeLaunchOptions
+| Field           | Required | Description                                           |
+| --------------- | -------- | ----------------------------------------------------- |
+| `profileDir`    | Yes      | Directory for Chrome's user data (profile isolation)  |
+| `debuggingPort` | No       | Port for remote debugging (auto-allocated if omitted) |
+| `flags`         | No       | Additional Chrome launch flags                        |
+## ToolProvider Interface
+`Computer` implements the `ToolProvider` interface from `@octavus/core`:
+```typescript
+interface ToolProvider {
+  toolHandlers(): Record<string, ToolHandler>;
+  toolSchemas(): ToolSchema[];
+}
+```
+The server-sdk accepts any `ToolProvider` on the `computer` option — you can implement your own if `@octavus/computer` doesn't fit your use case:
+```typescript
+const customProvider: ToolProvider = {
+  toolHandlers() {
+    return {
+      custom__my_tool: async (args) => {
+        return { result: 'done' };
+      },
+    };
+  },
+  toolSchemas() {
+    return [
+      {
+        name: 'custom__my_tool',
+        description: 'A custom tool',
+        inputSchema: {
+          type: 'object',
+          properties: {
+            input: { type: 'string', description: 'Tool input' },
+          },
+          required: ['input'],
+        },
+      },
+    ];
+  },
+};
+const session = client.agentSessions.attach(sessionId, {
+  tools: { 'set-chat-title': titleHandler },
+  computer: customProvider,
+});
+```
+## Complete Example
+A desktop application with browser, filesystem, and shell capabilities:
+```typescript
+import { Computer } from '@octavus/computer';
+import { OctavusClient } from '@octavus/server-sdk';
+const WORKSPACE_DIR = '/Users/me/projects/my-app';
+const PROFILE_DIR = '/Users/me/.my-app/chrome-profiles/agent';
+async function startSession(sessionId: string) {
+  // 1. Launch Chrome with remote debugging
+  const browser = await Computer.launchChrome({
+    profileDir: PROFILE_DIR,
+  });
+  // 2. Create computer with all capabilities
+  const computer = new Computer({
+    mcpServers: {
+      browser: Computer.stdio('chrome-devtools-mcp', [
+        `--browser-url=http://127.0.0.1:${browser.port}`,
+        '--no-usage-statistics',
+      ]),
+      filesystem: Computer.stdio('@modelcontextprotocol/server-filesystem', [WORKSPACE_DIR]),
+      shell: Computer.shell({
+        cwd: WORKSPACE_DIR,
+        mode: 'unrestricted',
+      }),
+    },
+    managedProcesses: [{ process: browser.process }],
+  });
+  // 3. Connect to all MCP servers
+  const { errors } = await computer.start();
+  if (errors.length > 0) {
+    console.warn('Failed to connect:', errors);
+  }
+  // 4. Attach to session with computer
+  const client = new OctavusClient({
+    baseUrl: process.env.OCTAVUS_API_URL!,
+    apiKey: process.env.OCTAVUS_API_KEY!,
+  });
+  const session = client.agentSessions.attach(sessionId, {
+    tools: {
+      'set-chat-title': async (args) => {
+        console.log('Chat title:', args.title);
+        return { success: true };
+      },
+    },
+    computer,
+  });
+  // 5. Execute and stream
+  const events = session.execute({
+    type: 'trigger',
+    triggerName: 'user-message',
+    input: { USER_MESSAGE: 'Navigate to github.com and take a screenshot' },
+  });
+  for await (const event of events) {
+    // Handle stream events
+  }
+  // 6. Clean up
+  await computer.stop();
+}
+```
+## API Reference
+### Computer
+```typescript
+class Computer implements ToolProvider {
+  constructor(config: ComputerConfig);
+  // Static factories for MCP entries
+  static stdio(
+    command: string,
+    args?: string[],
+    options?: {
+      env?: Record<string, string>;
+      cwd?: string;
+    },
+  ): StdioConfig;
+  static http(
+    url: string,
+    options?: {
+      headers?: Record<string, string>;
+    },
+  ): HttpConfig;
+  static shell(options: { cwd?: string; mode: ShellMode; timeout?: number }): ShellConfig;
+  // Chrome launch helper
+  static launchChrome(options: ChromeLaunchOptions): Promise<ChromeInstance>;
+  // Lifecycle
+  start(): Promise<{ errors: string[] }>;
+  stop(): Promise<void>;
+  // ToolProvider implementation
+  toolHandlers(): Record<string, ToolHandler>;
+  toolSchemas(): ToolSchema[];
+}
+```
+### ComputerConfig
+```typescript
+interface ComputerConfig {
+  mcpServers: Record<string, McpEntry>;
+  managedProcesses?: { process: ChildProcess }[];
+}
+type McpEntry = StdioConfig | HttpConfig | ShellConfig;
+type ShellMode =
+  | 'unrestricted'
+  | {
+      allowedPatterns?: RegExp[];
+      blockedPatterns?: RegExp[];
+    };
+```
+### ChromeInstance
+```typescript
+interface ChromeInstance {
+  port: number;
+  process: ChildProcess;
+  pid: number;
+}
+```
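
The pattern-based shell mode documented in the added file (blocked patterns checked first, then the allow-list) can be sketched as a small predicate. `isCommandAllowed` is a hypothetical function, not part of `@octavus/computer`; the `ShellMode` type is copied from the API reference above. The sketch assumes that a missing `allowedPatterns` means "no allow-list" and that an empty array permits nothing.

```typescript
type ShellMode =
  | 'unrestricted'
  | { allowedPatterns?: RegExp[]; blockedPatterns?: RegExp[] };

// Returns true when the command may run under the given mode.
function isCommandAllowed(command: string, mode: ShellMode): boolean {
  if (mode === 'unrestricted') return true;
  const { allowedPatterns, blockedPatterns } = mode;
  // Blocked patterns are checked first and always win.
  if (blockedPatterns?.some((pattern) => pattern.test(command))) return false;
  // With an allow-list, at least one pattern must match.
  if (allowedPatterns) return allowedPatterns.some((pattern) => pattern.test(command));
  return true;
}

// Same patterns as the documentation example.
const mode: ShellMode = {
  blockedPatterns: [/rm\s+-rf/, /sudo/],
  allowedPatterns: [/^git\s/, /^npm\s/, /^ls\s/],
};

const gitOk = isCommandAllowed('git status', mode);
const sudoBlocked = isCommandAllowed('sudo git pull', mode);
const bareLs = isCommandAllowed('ls', mode);
```

One consequence worth noticing: `/^ls\s/` requires whitespace after `ls`, so a bare `ls` is rejected under these patterns while `ls -la` passes. Patterns created with the `g` flag would make `test()` stateful, so the sketch assumes flag-free patterns as in the example.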

package/content/03-client-sdk/06-http-transport.md CHANGED

@@ -80,6 +80,7 @@ export async function POST(request: Request) {
       'Content-Type': 'text/event-stream',
       'Cache-Control': 'no-cache',
       Connection: 'keep-alive',
+      'X-Accel-Buffering': 'no',
     },
   });
 }
@@ -234,6 +235,7 @@ app.post('/api/trigger', async (req, res) => {
   res.setHeader('Content-Type', 'text/event-stream');
   res.setHeader('Cache-Control', 'no-cache');
   res.setHeader('Connection', 'keep-alive');
+  res.setHeader('X-Accel-Buffering', 'no');
   // Pipe the stream to the response
   const reader = stream.getReader();

package/content/04-protocol/01-overview.md CHANGED

@@ -68,6 +68,13 @@ tools:
     parameters:
       userId: { type: string }
+# MCP servers (remote services and device capabilities)
+mcpServers:
+  figma:
+    description: Figma design tool integration
+    source: remote
+    display: description
 # Octavus skills (provider-agnostic code execution)
 skills:
   qr-code:
@@ -79,6 +86,7 @@ agent:
   model: anthropic/claude-sonnet-4-5
   system: system # References prompts/system.md
   tools: [get-user-account]
+  mcpServers: [figma] # Enable MCP servers
   skills: [qr-code] # Enable skills
   imageModel: google/gemini-2.5-flash-image # Enable image generation
   webSearch: true # Enable web search
@@ -187,6 +195,7 @@ The referenced prompt content is inserted before variable interpolation, so vari
 - [Input & Resources](/docs/protocol/input-resources) — Defining agent inputs
 - [Triggers](/docs/protocol/triggers) — How agents are invoked
 - [Tools](/docs/protocol/tools) — External capabilities
+- [MCP Servers](/docs/protocol/mcp-servers) — Remote services and device capabilities via MCP
 - [Skills](/docs/protocol/skills) — Code execution and knowledge packages
 - [References](/docs/protocol/references) — On-demand context documents
 - [Handlers](/docs/protocol/handlers) — Execution blocks

package/content/04-protocol/04-tools.md CHANGED

@@ -8,11 +8,12 @@ description: Defining external tools implemented in your backend.
 Tools extend what agents can do. Octavus supports multiple types:
 1. **External Tools** — Defined in the protocol, implemented in your backend (this page)
-2. **Built-in Tools** — Provider-agnostic tools managed by Octavus (web search, image generation)
-3. **Provider Tools** — Provider-specific tools executed by the provider (e.g., Anthropic's code execution)
-4. **Skills** — Code execution and knowledge packages (see [Skills](/docs/protocol/skills))
+2. **MCP Tools** — Auto-discovered from MCP servers (see [MCP Servers](/docs/protocol/mcp-servers))
+3. **Built-in Tools** — Provider-agnostic tools managed by Octavus (web search, image generation)
+4. **Provider Tools** — Provider-specific tools executed by the provider (e.g., Anthropic's code execution)
+5. **Skills** — Code execution and knowledge packages (see [Skills](/docs/protocol/skills))
-This page covers external tools. Built-in tools are enabled via agent config — see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).
+This page covers external tools. For MCP-based tools from services like Figma, Sentry, or device capabilities like browser and filesystem, see [MCP Servers](/docs/protocol/mcp-servers). Built-in tools are enabled via agent config — see [Web Search](/docs/protocol/agent-config#web-search) and [Image Generation](/docs/protocol/agent-config#image-generation). For provider-specific tools, see [Provider Options](/docs/protocol/provider-options). For code execution, see [Skills](/docs/protocol/skills).
 ## External Tools