npm - @octavus/docs - Versions diffs - 3.0.0 → 3.1.0 - Mend

@octavus/docs 3.0.0 → 3.1.0


Files changed (58)

package/content/01-getting-started/01-introduction.md +15 -15
package/content/01-getting-started/02-quickstart.md +2 -2
package/content/02-server-sdk/01-overview.md +6 -6
package/content/02-server-sdk/02-sessions.md +1 -1
package/content/02-server-sdk/03-tools.md +4 -4
package/content/02-server-sdk/04-streaming.md +1 -1
package/content/02-server-sdk/05-cli.md +15 -15
package/content/02-server-sdk/06-workers.md +8 -8
package/content/02-server-sdk/07-debugging.md +7 -7
package/content/02-server-sdk/08-computer.md +4 -4
package/content/03-client-sdk/01-overview.md +11 -11
package/content/03-client-sdk/03-streaming.md +3 -3
package/content/03-client-sdk/04-execution-blocks.md +1 -1
package/content/03-client-sdk/05-socket-transport.md +5 -5
package/content/03-client-sdk/06-http-transport.md +5 -5
package/content/03-client-sdk/07-structured-output.md +3 -3
package/content/03-client-sdk/08-file-uploads.md +2 -2
package/content/03-client-sdk/09-error-handling.md +3 -3
package/content/03-client-sdk/10-client-tools.md +3 -3
package/content/04-protocol/01-overview.md +18 -18
package/content/04-protocol/02-input-resources.md +1 -1
package/content/04-protocol/03-triggers.md +1 -1
package/content/04-protocol/04-tools.md +6 -6
package/content/04-protocol/05-skills.md +5 -5
package/content/04-protocol/06-handlers.md +3 -0
package/content/04-protocol/07-agent-config.md +66 -11
package/content/04-protocol/09-skills-advanced.md +18 -18
package/content/04-protocol/10-types.md +22 -22
package/content/04-protocol/11-workers.md +31 -30
package/content/04-protocol/12-references.md +10 -10
package/content/04-protocol/13-mcp-servers.md +63 -6
package/content/05-api-reference/02-sessions.md +2 -2
package/content/06-examples/02-nextjs-chat.md +3 -3
package/content/06-examples/03-socket-chat.md +3 -3
package/dist/chunk-PD34BHI2.js +1523 -0
package/dist/chunk-PD34BHI2.js.map +1 -0
package/dist/content.js +1 -1
package/dist/docs.json +39 -39
package/dist/index.js +1 -1
package/dist/search-index.json +1 -1
package/dist/search.js +1 -1
package/dist/search.js.map +1 -1
package/dist/sections.json +39 -39
package/package.json +1 -1
package/dist/chunk-4XCEGHY7.js +0 -1549
package/dist/chunk-4XCEGHY7.js.map +0 -1
package/dist/chunk-BKMROUXE.js +0 -1523
package/dist/chunk-BKMROUXE.js.map +0 -1
package/dist/chunk-HMRAGEPN.js +0 -1523
package/dist/chunk-HMRAGEPN.js.map +0 -1
package/dist/chunk-HQCOEPPD.js +0 -1523
package/dist/chunk-HQCOEPPD.js.map +0 -1
package/dist/chunk-J5MPASK3.js +0 -1523
package/dist/chunk-J5MPASK3.js.map +0 -1
package/dist/chunk-TFR7YOK2.js +0 -1523
package/dist/chunk-TFR7YOK2.js.map +0 -1
package/dist/chunk-XVO2F2JU.js +0 -1523
package/dist/chunk-XVO2F2JU.js.map +0 -1

package/content/04-protocol/05-skills.md CHANGED

@@ -413,7 +413,7 @@ When a skill has secrets configured for the organization, it automatically runs
 - The skill gets its own **isolated sandbox** (separate from other skills)
 - Secrets are injected as **environment variables** available to all scripts
-- Only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available — `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked
+- Only `octavus_skill_read`, `octavus_skill_list`, and `octavus_skill_run` are available - `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked
 - Scripts receive input as **JSON via stdin** (using the `input` parameter on `octavus_skill_run`) instead of CLI args
 - All output (stdout/stderr) is **automatically redacted** for secret values before being returned to the LLM
@@ -442,10 +442,10 @@ Skills run in isolated sandbox environments:
 - **No persistent storage** (sandbox destroyed after each `next-message` execution)
 - **File output only** via `/output/` directory
 - **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)
-- **Secret redaction** — output from secure skills is automatically scanned for secret values
+- **Secret redaction** - output from secure skills is automatically scanned for secret values
 ## Next Steps
-- [Agent Config](/docs/protocol/agent-config) — Configuring skills in agent settings
-- [Provider Options](/docs/protocol/provider-options) — Anthropic's built-in skills
-- [Skills Advanced Guide](/docs/protocol/skills-advanced) — Best practices and advanced patterns
+- [Agent Config](/docs/protocol/agent-config) - Configuring skills in agent settings
+- [Provider Options](/docs/protocol/provider-options) - Anthropic's built-in skills
+- [Skills Advanced Guide](/docs/protocol/skills-advanced) - Best practices and advanced patterns

package/content/04-protocol/06-handlers.md CHANGED

@@ -146,6 +146,7 @@ Start summary thread:
   model: anthropic/claude-sonnet-4-5 # Optional: different model
   backupModel: openai/gpt-4o # Failover on provider errors
   thinking: low # Extended reasoning level
+  cache: auto # auto (default) | extended | off
   maxSteps: 1 # Tool call limit
   system: escalation-summary # System prompt
   input: [COMPANY_NAME] # Variables for prompt
@@ -155,6 +156,8 @@ Start summary thread:
   imageModel: google/gemini-2.5-flash-image # Image generation model
 ```
+The `cache` field controls prompt caching for this thread and defaults to `auto` when omitted. Threads do not inherit the agent's `cache` value - see [Prompt Caching](/docs/protocol/agent-config#prompt-caching).
 The `model` field can also reference a variable for dynamic model selection. The `backupModel` field follows the same format and supports variable references.
 ```yaml

package/content/04-protocol/07-agent-config.md CHANGED

@@ -38,6 +38,7 @@ agent:
 | `maxSteps`       | No       | Maximum agentic steps (default: 10)                                            |
 | `temperature`    | No       | Model temperature (0-2)                                                        |
 | `thinking`       | No       | Extended reasoning level                                                       |
+| `cache`          | No       | Prompt caching mode: `auto` (default), `extended`, or `off`                    |
 | `anthropic`      | No       | Anthropic-specific options (tools, skills)                                     |
 ## Models
@@ -99,9 +100,9 @@ const sessionId = await client.agentSessions.create('my-agent', {
 This enables:
-- **Multi-provider support** — Same agent works with different providers
-- **A/B testing** — Test different models without protocol changes
-- **User preferences** — Let users choose their preferred model
+- **Multi-provider support** - Same agent works with different providers
+- **A/B testing** - Test different models without protocol changes
+- **User preferences** - Let users choose their preferred model
 The model value is validated at runtime to ensure it's in the correct `provider/model-id` format.
@@ -122,7 +123,7 @@ When a provider error occurs, the system retries once with the backup model. If
 **Key behaviors:**
-- Only transient provider errors trigger fallback — authentication and validation errors are not retried
+- Only transient provider errors trigger fallback - authentication and validation errors are not retried
 - Provider-specific options (like `anthropic:`) are only forwarded to the backup model if it uses the same provider
 - For streaming responses, fallback only occurs if no content has been sent to the client yet
@@ -144,7 +145,7 @@ agent:
 ## System Prompt
-The system prompt sets the agent's persona and instructions. The `input` field controls which variables are available to the prompt — only variables listed in `input` are interpolated.
+The system prompt sets the agent's persona and instructions. The `input` field controls which variables are available to the prompt - only variables listed in `input` are interpolated.
 ```yaml
 agent:
@@ -232,6 +233,59 @@ agent:
 Thinking content streams to the UI and can be displayed to users.
+## Prompt Caching
+Providers charge less for tokens served from their prompt cache (often 10% of the uncached rate). Octavus exposes a single `cache` field that picks the right retention policy per provider, so the stable prefix of your agent - tools, system prompt, and historical messages - gets billed at the cache-read rate on repeat requests.
+```yaml
+agent:
+  model: anthropic/claude-sonnet-4-5
+  cache: auto # auto (default) | extended | off
+```
+| Mode       | Behavior                                                                      | When to use                                                                                             |
+| ---------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
+| `auto`     | Short-TTL caching. Default when omitted.                                      | Most agents. Free on all supported providers and pays for itself within the same session.               |
+| `extended` | Long-TTL caching. Trades a higher cache-write cost for much longer residency. | Agents triggered with gaps (daily reports, on-call assistants) where the prefix is reused across hours. |
+| `off`      | No opt-in caching emitted.                                                    | When you explicitly want to skip caching - e.g. debugging a non-deterministic prefix.                   |
+### Per-provider behavior
+The `cache` field is provider-agnostic at the protocol level - each provider translates it into its own cache retention policy:
+| Provider  | `auto` TTL                | `extended` TTL |
+| --------- | ------------------------- | -------------- |
+| Anthropic | 5 minutes                 | 1 hour         |
+| OpenAI    | in-memory (~5–10 minutes) | 24 hours       |
+| Google    | Implicit (Gemini 2.5+)    | Implicit       |
+On `off`, Octavus emits no explicit cache options. Providers that auto-cache (OpenAI on prefixes ≥ 1,024 tokens, Gemini 2.5+) may still cache transparently - `off` just disables Octavus's opt-in behavior.
+### Threads don't inherit
+Named threads (created with `start-thread`) read their own `cache` field independently - they **do not** inherit the agent's cache value:
+```yaml
+agent:
+  cache: extended # 1-hour TTL on the main thread
+handlers:
+  summarize:
+    Start summary:
+      block: start-thread
+      thread: summary
+      # No cache field → defaults to 'auto' (5-minute TTL), NOT 'extended'
+      system: summary-system
+```
+This is intentional: named threads are often used for short, one-shot work (summarization, classification) where the long TTL would be wasted. Set `cache` explicitly on `start-thread` when you do want it.
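The non-inheritance rule and the per-provider table can be read together as a small lookup. The following is a minimal sketch for illustration, not Octavus's implementation: the function names `effective_thread_cache` and `ttl_for` are hypothetical, and the TTL strings are copied from the table in this section.

```python
# TTLs as listed in the per-provider table; the dict layout is an assumption.
TTL_BY_PROVIDER = {
    "anthropic": {"auto": "5 minutes", "extended": "1 hour"},
    "openai": {"auto": "in-memory (~5–10 minutes)", "extended": "24 hours"},
    "google": {"auto": "implicit (Gemini 2.5+)", "extended": "implicit"},
}
VALID_MODES = ("auto", "extended", "off")


def effective_thread_cache(thread_cache=None, agent_cache="auto"):
    """Return the cache mode a named thread ends up with.

    `agent_cache` is accepted only to make the point that it is ignored:
    a thread without its own `cache` field falls back to 'auto'.
    """
    mode = "auto" if thread_cache is None else thread_cache
    if mode not in VALID_MODES:
        raise ValueError(f"unknown cache mode: {mode!r}")
    return mode


def ttl_for(provider, mode):
    """Look up the documented TTL; 'off' means no explicit cache options are emitted."""
    if mode == "off":
        return None
    return TTL_BY_PROVIDER[provider][mode]


# Agent uses 'extended', the summary thread sets nothing: the thread gets 'auto'.
summary_mode = effective_thread_cache(None, agent_cache="extended")
summary_ttl = ttl_for("anthropic", summary_mode)
```

With the agent set to `extended`, the summary thread here still resolves to `auto` and Anthropic's 5-minute TTL, which matches the YAML example above.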
+### Cost trade-offs
+- **Cache reads** are always much cheaper than uncached input on any provider - caching is effectively free if your prefix is stable.
+- **Cache writes** on Anthropic cost ~1.25× input for `auto` and 2× input for `extended`. OpenAI and Google don't charge separately for cache writes.
+- Use `extended` only when the same prefix is genuinely reused across sessions that span hours; otherwise the higher write cost dominates the savings.
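The trade-off can be made concrete with a little arithmetic. This is a rough sketch under stated assumptions, not a billing model: it uses the Anthropic write multipliers quoted above (1.25× and 2×), assumes cache reads cost 10% of the uncached rate (the "often 10%" figure from this section), and measures everything in units of "one uncached send of the prefix". The function names are hypothetical.

```python
def cached_prefix_cost(requests, write_multiplier, read_multiplier=0.10):
    """Relative cost of sending one stable prefix `requests` times.

    Assumes the first request writes the cache and every later request
    arrives while the entry is still resident (a cache hit).
    """
    if requests < 1:
        return 0.0
    return write_multiplier + (requests - 1) * read_multiplier


def always_miss_cost(requests, write_multiplier):
    """Relative cost when every request arrives after the TTL has expired,
    so each one pays the cache-write rate again (assumed behavior)."""
    return requests * write_multiplier


# Two requests inside the short TTL: 'auto' already beats no caching (2.0).
auto_two_hits = cached_prefix_cost(2, write_multiplier=1.25)

# Two requests hours apart: 'auto' misses both times, 'extended' hits once.
auto_two_misses = always_miss_cost(2, write_multiplier=1.25)
extended_two = cached_prefix_cost(2, write_multiplier=2.0)
```

Under these assumptions, `extended` only wins when requests are spaced further apart than the short TTL; for back-to-back requests its 2× write cost is pure overhead compared with `auto`.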
 ## Skills
 Enable Octavus skills for code execution and file generation:
@@ -297,9 +351,9 @@ When `imageModel` is configured, the `octavus_generate_image` tool becomes avail
 The tool supports three image sizes:
-- `1024x1024` (default) — Square
-- `1792x1024` — Landscape (16:9)
-- `1024x1792` — Portrait (9:16)
+- `1024x1024` (default) - Square
+- `1792x1024` - Landscape (16:9)
+- `1024x1792` - Portrait (9:16)
 ### Image Editing with Reference Images
@@ -338,7 +392,7 @@ agent:
 When `webSearch` is enabled, the `octavus_web_search` tool becomes available. The LLM can decide when to search the web based on the conversation. Search results include source URLs that are emitted as citations in the UI.
-This is a **provider-agnostic** built-in tool — it works with any LLM provider (Anthropic, Google, OpenAI, etc.). For Anthropic's own web search implementation, see [Provider Options](/docs/protocol/provider-options).
+This is a **provider-agnostic** built-in tool - it works with any LLM provider (Anthropic, Google, OpenAI, etc.). For Anthropic's own web search implementation, see [Provider Options](/docs/protocol/provider-options).
 Use cases:
@@ -381,7 +435,7 @@ agent:
         description: Processing PDF
 ```
-Provider options are validated against the model—using `anthropic:` with a non-Anthropic model will fail validation.
+Provider options are validated against the model - using `anthropic:` with a non-Anthropic model will fail validation.
 See [Provider Options](/docs/protocol/provider-options) for full documentation.
@@ -398,6 +452,7 @@ handlers:
       model: anthropic/claude-sonnet-4-5 # Different model
       backupModel: openai/gpt-4o # Failover model
       thinking: low # Different thinking
+      cache: off # Different cache mode (does not inherit from agent)
       maxSteps: 1 # Limit tool calls
       system: escalation-summary # Different prompt
       mcpServers: [figma, browser] # Thread-specific MCP servers
@@ -407,7 +462,7 @@ handlers:
       webSearch: true # Thread-specific web search
 ```
-Each thread can have its own model, backup model, MCP servers, skills, references, image model, and web search setting. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
+Each thread can have its own model, backup model, cache mode, MCP servers, skills, references, image model, and web search setting. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
 ## Full Example

package/content/04-protocol/09-skills-advanced.md CHANGED

@@ -28,7 +28,7 @@ Use external tools instead when:
 Define all skills in the `skills:` section, then reference which skills are available where they're used:
-**Interactive agents** — reference in `agent.skills`:
+**Interactive agents** - reference in `agent.skills`:
 ```yaml
 skills:
@@ -45,7 +45,7 @@ agent:
   skills: [qr-code]
 ```
-**Workers and named threads** — reference per-thread in `start-thread.skills`:
+**Workers and named threads** - reference per-thread in `start-thread.skills`:
 ```yaml
 skills:
@@ -348,8 +348,8 @@ print(result.stdout)
 Key patterns:
 - **Read stdin**: `json.load(sys.stdin)` to get the `input` object from the `octavus_skill_run` call
-- **Access secrets**: `os.environ["SECRET_NAME"]` — secrets are injected as env vars
-- **Print output**: Write results to stdout — the LLM sees the (redacted) stdout
+- **Access secrets**: `os.environ["SECRET_NAME"]` - secrets are injected as env vars
+- **Print output**: Write results to stdout - the LLM sees the (redacted) stdout
 - **Error handling**: Write errors to stderr and exit with non-zero code
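The four patterns above can be combined into one small script body. This is a minimal sketch rather than code from the package: the secret name `API_TOKEN`, the `query` input field, and the `run` helper are hypothetical, and the streams and environment are passed in as arguments so the function can be exercised without a real sandbox.

```python
import io
import json


def run(stdin, stdout, stderr, env):
    """Hypothetical secure-skill entry point; returns the process exit code."""
    try:
        payload = json.load(stdin)   # the `input` object from octavus_skill_run
        token = env["API_TOKEN"]     # hypothetical secret, injected as an env var
        query = payload["query"]     # hypothetical input field
    except (json.JSONDecodeError, KeyError) as exc:
        # Errors go to stderr and are signaled with a non-zero exit code.
        print(f"invalid input or missing secret: {exc!r}", file=stderr)
        return 1
    # A real script would call an external service with the token here.
    json.dump({"query": query, "authenticated": bool(token)}, stdout)
    return 0


# Local dry run with in-memory streams instead of the sandbox's stdin/stdout.
out, err = io.StringIO(), io.StringIO()
code = run(io.StringIO('{"query": "status"}'), out, err, {"API_TOKEN": "s3cret"})
```

In an actual skill script the last step would be `sys.exit(run(sys.stdin, sys.stdout, sys.stderr, os.environ))`; keeping the logic in a function makes it easy to test locally before uploading.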
 ### Declaring Secrets in SKILL.md
@@ -447,10 +447,10 @@ The LLM sees these errors and can retry or explain to users.
 For skills with configured secrets:
-- **Isolated sandbox** — each secure skill gets its own sandbox, preventing cross-skill secret leakage
-- **No arbitrary code** — `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked for secure skills, so only pre-built scripts can execute
-- **Output redaction** — all stdout and stderr are scanned for secret values before being returned to the LLM
-- **Encrypted at rest** — secrets are encrypted using AES-256-GCM and only decrypted at execution time
+- **Isolated sandbox** - each secure skill gets its own sandbox, preventing cross-skill secret leakage
+- **No arbitrary code** - `octavus_code_run`, `octavus_file_write`, and `octavus_file_read` are blocked for secure skills, so only pre-built scripts can execute
+- **Output redaction** - all stdout and stderr are scanned for secret values before being returned to the LLM
+- **Encrypted at rest** - secrets are encrypted using AES-256-GCM and only decrypted at execution time
 ### Input Validation
@@ -534,16 +534,16 @@ Check execution logs in the platform debug view:
 ## Best Practices Summary
-1. **Enable only needed skills** — Don't overwhelm the LLM
-2. **Choose appropriate display modes** — Match user experience needs
-3. **Write clear skill descriptions** — Help LLM understand when to use
-4. **Handle errors gracefully** — Provide helpful error messages
-5. **Test skills locally** — Verify before uploading
-6. **Monitor execution** — Check logs for issues
-7. **Combine with tools** — Use tools for data, skills for processing
-8. **Consider performance** — Be aware of timeouts and limits
-9. **Use secrets for credentials** — Declare secrets in frontmatter instead of hardcoding tokens
-10. **Design scripts for stdin input** — Secure skills receive JSON via stdin, so plan for both input methods if the skill might be used in either mode
+1. **Enable only needed skills** - Don't overwhelm the LLM
+2. **Choose appropriate display modes** - Match user experience needs
+3. **Write clear skill descriptions** - Help LLM understand when to use
+4. **Handle errors gracefully** - Provide helpful error messages
+5. **Test skills locally** - Verify before uploading
+6. **Monitor execution** - Check logs for issues
+7. **Combine with tools** - Use tools for data, skills for processing
+8. **Consider performance** - Be aware of timeouts and limits
+9. **Use secrets for credentials** - Declare secrets in frontmatter instead of hardcoding tokens
+10. **Design scripts for stdin input** - Secure skills receive JSON via stdin, so plan for both input methods if the skill might be used in either mode
 ## Next Steps

package/content/04-protocol/10-types.md CHANGED

@@ -9,11 +9,11 @@ Types let you define reusable data structures for your agent. Use them in inputs
 ## Why Types?
-- **Reusability** — Define once, use in multiple places
-- **Validation** — Catch errors at protocol validation time
-- **Documentation** — Clear data contracts for your agent
-- **Tool Parameters** — Use complex types in tool parameters
-- **Structured Output** — Get typed JSON responses from the LLM
+- **Reusability** - Define once, use in multiple places
+- **Validation** - Catch errors at protocol validation time
+- **Documentation** - Clear data contracts for your agent
+- **Tool Parameters** - Use complex types in tool parameters
+- **Structured Output** - Get typed JSON responses from the LLM
 ## Defining Types
@@ -504,9 +504,9 @@ The tool receives: `{ cartItems: [{ productId: "...", quantity: 1 }, ...] }`
 Named array types provide:
-- **Reusability** — Use the same array type in multiple tools
-- **Clear schema** — The array structure is validated
-- **Clean tool calls** — No unnecessary wrapper objects
+- **Reusability** - Use the same array type in multiple tools
+- **Clear schema** - The array structure is validated
+- **Clean tool calls** - No unnecessary wrapper objects
 ## Structured Output
@@ -602,9 +602,9 @@ The `responseType` must be an **object type** (regular custom type with properti
 The following cannot be used directly as `responseType`:
-- **Discriminated unions** — LLM providers don't allow `anyOf` at the schema root ([OpenAI docs](https://platform.openai.com/docs/guides/structured-outputs#root-objects-must-not-be-anyof-and-must-be-an-object))
-- **Array types** — Must be wrapped in an object
-- **Primitives** — `string`, `number`, etc. are not valid
+- **Discriminated unions** - LLM providers don't allow `anyOf` at the schema root ([OpenAI docs](https://platform.openai.com/docs/guides/structured-outputs#root-objects-must-not-be-anyof-and-must-be-an-object))
+- **Array types** - Must be wrapped in an object
+- **Primitives** - `string`, `number`, etc. are not valid
 ```yaml
 types:
@@ -695,23 +695,23 @@ Types are validated when the protocol is loaded:
 ### Type Definition Limits
-- **No standalone `array` or `object`** — Define a custom type instead, or use `unknown` for untyped data
-- **No recursive types** — A type cannot reference itself (directly or indirectly)
-- **No generic types** — Types are concrete, not parameterized
-- **String enums only** — `enum` values must be strings
-- **No array constraints** — `minItems` and `maxItems` are not supported (LLM providers don't enforce them)
+- **No standalone `array` or `object`** - Define a custom type instead, or use `unknown` for untyped data
+- **No recursive types** - A type cannot reference itself (directly or indirectly)
+- **No generic types** - Types are concrete, not parameterized
+- **String enums only** - `enum` values must be strings
+- **No array constraints** - `minItems` and `maxItems` are not supported (LLM providers don't enforce them)
 ### Tool Limitations
-- **Tool parameters are always objects** — Each tool call is `{ param1: value1, param2: value2, ... }`
-- **Array parameters need named types** — Use top-level array types for array parameters
+- **Tool parameters are always objects** - Each tool call is `{ param1: value1, param2: value2, ... }`
+- **Array parameters need named types** - Use top-level array types for array parameters
 ### Structured Output Limitations
-- **responseType must be an object type** — Only object types can be used as responseType
-- **Discriminated unions need object wrapper** — Unions (`anyOf`) are not allowed at the schema root
-- **Array types need object wrapper** — Arrays cannot be used directly as responseType
-- **Primitives are not allowed** — `string`, `number`, etc. cannot be used as responseType
+- **responseType must be an object type** - Only object types can be used as responseType
+- **Discriminated unions need object wrapper** - Unions (`anyOf`) are not allowed at the schema root
+- **Array types need object wrapper** - Arrays cannot be used directly as responseType
+- **Primitives are not allowed** - `string`, `number`, etc. cannot be used as responseType
 These limitations exist because LLM providers (OpenAI, Anthropic) require the root schema to be an object:

package/content/04-protocol/11-workers.md CHANGED

@@ -11,16 +11,16 @@ Workers are agents designed for task-based execution. Unlike interactive agents
 Workers are ideal for:
-- **Background processing** — Long-running tasks that don't need conversation
-- **Composable tasks** — Reusable units of work called by other agents
-- **Pipelines** — Multi-step processing with structured output
-- **Parallel execution** — Tasks that can run independently
+- **Background processing** - Long-running tasks that don't need conversation
+- **Composable tasks** - Reusable units of work called by other agents
+- **Pipelines** - Multi-step processing with structured output
+- **Parallel execution** - Tasks that can run independently
 Use interactive agents instead when:
-- **Conversation is needed** — Multi-turn dialogue with users
-- **Persistence matters** — State should survive across interactions
-- **Session context** — User context needs to persist
+- **Conversation is needed** - Multi-turn dialogue with users
+- **Persistence matters** - State should survive across interactions
+- **Session context** - User context needs to persist
 ## Worker vs Interactive
@@ -124,7 +124,7 @@ Workers are identified by the `format` field:
 ### No Global Agent Config
-Interactive agents have a global `agent:` section that configures a main thread. Workers don't have this — every thread must be explicitly created via `start-thread`:
+Interactive agents have a global `agent:` section that configures a main thread. Workers don't have this - every thread must be explicitly created via `start-thread`:
 ```yaml
 # Interactive agent: Global config
@@ -219,20 +219,21 @@ steps:
 All LLM configuration goes here:
-| Field         | Description                                       |
-| ------------- | ------------------------------------------------- |
-| `thread`      | Thread name (defaults to block name)              |
-| `model`       | LLM model to use                                  |
-| `system`      | System prompt filename (required)                 |
-| `input`       | Variables for system prompt                       |
-| `tools`       | Tools available in this thread                    |
-| `skills`      | Octavus skills available in this thread           |
-| `mcpServers`  | MCP servers available in this thread              |
-| `imageModel`  | Image generation model                            |
-| `webSearch`   | Enable built-in web search tool                   |
-| `thinking`    | Extended reasoning level                          |
-| `temperature` | Model temperature                                 |
-| `maxSteps`    | Maximum tool call cycles (enables agentic if > 1) |
+| Field         | Description                                                 |
+| ------------- | ----------------------------------------------------------- |
+| `thread`      | Thread name (defaults to block name)                        |
+| `model`       | LLM model to use                                            |
+| `system`      | System prompt filename (required)                           |
+| `input`       | Variables for system prompt                                 |
+| `tools`       | Tools available in this thread                              |
+| `skills`      | Octavus skills available in this thread                     |
+| `mcpServers`  | MCP servers available in this thread                        |
+| `imageModel`  | Image generation model                                      |
+| `webSearch`   | Enable built-in web search tool                             |
+| `thinking`    | Extended reasoning level                                    |
+| `cache`       | Prompt caching mode: `auto` (default), `extended`, or `off` |
+| `temperature` | Model temperature                                           |
+| `maxSteps`    | Maximum tool call cycles (enables agentic if > 1)           |
 ## Simple Example
@@ -389,7 +390,7 @@ steps:
     maxSteps: 10
 ```
-Workers resolve their own MCP connections independently — they don't inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.
+Workers resolve their own MCP connections independently - they don't inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.
 See [MCP Servers](/docs/protocol/mcp-servers) for full documentation.
@@ -423,8 +424,8 @@ See [Skills](/docs/protocol/skills) for full documentation.
 Workers support the same tool handling as interactive agents:
-- **Server tools** — Handled by tool handlers you provide
-- **Client tools** — Pause execution, return tool request to caller
+- **Server tools** - Handled by tool handlers you provide
+- **Client tools** - Pause execution, return tool request to caller
 ```typescript
 // Non-streaming: get the output directly
@@ -467,8 +468,8 @@ All standard events (text-delta, tool calls, etc.) are also emitted.
 Interactive agents can call workers in two ways:
-1. **Deterministically** — Using the `run-worker` block
-2. **Agentically** — LLM calls worker as a tool
+1. **Deterministically** - Using the `run-worker` block
+2. **Agentically** - LLM calls worker as a tool
 ### Worker Declaration
@@ -542,6 +543,6 @@ When the worker calls its `search` tool, your `web-search` handler executes.
 ## Next Steps
-- [Server SDK Workers](/docs/server-sdk/workers) — Executing workers from code
-- [Handlers](/docs/protocol/handlers) — Block reference for steps
-- [Agent Config](/docs/protocol/agent-config) — Model and settings
+- [Server SDK Workers](/docs/server-sdk/workers) - Executing workers from code
+- [Handlers](/docs/protocol/handlers) - Block reference for steps
+- [Agent Config](/docs/protocol/agent-config) - Model and settings

package/content/04-protocol/12-references.md CHANGED

@@ -11,9 +11,9 @@ References are markdown documents that agents can fetch on demand. Instead of lo
 References are useful for:
-- **Large context** — Documents too long to include in every system prompt
-- **Selective loading** — Let the agent decide which context is relevant
-- **Shared knowledge** — Reusable documents across threads
+- **Large context** - Documents too long to include in every system prompt
+- **Selective loading** - Let the agent decide which context is relevant
+- **Shared knowledge** - Reusable documents across threads
 ### How References Work
@@ -165,10 +165,10 @@ When a user asks the agent to review code, the agent will:
 The CLI and platform validate references during sync and deployment:
-- **Undefined references** — Referencing a name that doesn't have a matching file in `references/`
-- **Unused references** — A reference file exists but isn't listed in any `agent.references` or `start-thread.references`
-- **Invalid names** — Names that don't follow the `lowercase-with-dashes` convention
-- **Missing description** — Reference files without the required `description` in frontmatter
+- **Undefined references** - Referencing a name that doesn't have a matching file in `references/`
+- **Unused references** - A reference file exists but isn't listed in any `agent.references` or `start-thread.references`
+- **Invalid names** - Names that don't follow the `lowercase-with-dashes` convention
+- **Missing description** - Reference files without the required `description` in frontmatter
 ## References vs Skills
@@ -184,6 +184,6 @@ Use **references** when the agent needs access to text-based knowledge. Use **sk
 ## Next Steps
-- [Agent Config](/docs/protocol/agent-config) — Configuring references in agent settings
-- [Skills](/docs/protocol/skills) — Code execution and knowledge packages
-- [Workers](/docs/protocol/workers) — Using references in worker agents
+- [Agent Config](/docs/protocol/agent-config) - Configuring references in agent settings
+- [Skills](/docs/protocol/skills) - Code execution and knowledge packages
+- [Workers](/docs/protocol/workers) - Using references in worker agents

package/content/04-protocol/13-mcp-servers.md CHANGED

@@ -97,7 +97,7 @@ A server defined as `figma:` that exposes `get_design_context` produces:
 - `figma__get_design_context`
-The namespace is stripped before calling the MCP server — the server receives the original tool name. This convention matches Anthropic's MCP integration in Claude Desktop and ensures tool names stay unique across servers.
+The namespace is stripped before calling the MCP server - the server receives the original tool name. This convention matches Anthropic's MCP integration in Claude Desktop and ensures tool names stay unique across servers.
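The naming convention described here (server key, double underscore, original tool name) can be sketched as a pair of helpers. This is an illustration only: the function names are hypothetical, and the sketch assumes server keys never contain `__` themselves.

```python
SEPARATOR = "__"


def namespaced(server, tool):
    """Build the name the LLM sees, e.g. 'figma__get_design_context'."""
    return f"{server}{SEPARATOR}{tool}"


def split_namespaced(name):
    """Recover (server, original_tool_name) before forwarding a call.

    Splits on the first separator only, so a tool name that itself
    contains '__' is passed through unchanged.
    """
    server, sep, tool = name.partition(SEPARATOR)
    if not sep or not server or not tool:
        raise ValueError(f"not a namespaced MCP tool name: {name!r}")
    return server, tool


llm_name = namespaced("figma", "get_design_context")
server, original_tool = split_namespaced(llm_name)
```

Here `original_tool` is the name the MCP server would receive once the namespace has been stripped.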
 ### What the LLM Sees
@@ -122,7 +122,7 @@ Device MCP tools (auto-discovered):
   filesystem__list_directory
 ```
-You don't define individual MCP tool schemas in the protocol — they're auto-discovered from each MCP server at runtime.
+You don't define individual MCP tool schemas in the protocol - they're auto-discovered from each MCP server at runtime.
 ## Remote MCP Servers
@@ -145,7 +145,7 @@ Remote MCP servers support multiple authentication methods:
 | Bearer    | Bearer token authentication     |
 | None      | No authentication required      |
-Authentication is configured per-project — different projects can connect to the same MCP server with different credentials.
+Authentication is configured per-project - different projects can connect to the same MCP server with different credentials.
 ## Device MCP Servers
@@ -199,11 +199,68 @@ handlers:
       system: research-prompt
 ```
-This thread can use Figma and browser tools, but not sentry or filesystem — even if those are available on the main agent.
+This thread can use Figma and browser tools, but not sentry or filesystem - even if those are available on the main agent.
+## On-Demand MCP Servers
+By default, an agent can only call MCP tools whose namespace is listed in `mcpServers`. With `onDemandMcpServers`, a scope can opt into **every connected MCP of a given source** at runtime, without enumerating each one in the protocol.
+Remote MCPs are connected at the project level from the Octavus dashboard. Normally, each connected MCP that the agent should be able to use has to be declared in the protocol - connecting a new MCP means editing the protocol and redeploying. `onDemandMcpServers` removes that round-trip: once a source is opted in, any MCP connected to the project under that source becomes available to the agent immediately.
+Currently supported for `source: remote`.
+### Protocol-level declaration
+Add an `onDemandMcpServers:` section alongside `mcpServers:`, keyed by source. Each entry configures how the matched MCPs appear in tool lists:
+```yaml
+mcpServers:
+  figma:
+    description: Figma design tool integration
+    source: remote
+    display: description
+onDemandMcpServers:
+  remote:
+    description: Additional connected integrations
+    display: name
+    contextRetention:
+      toolResults: { retainLast: 5 }
+```
+### Scope-level opt-in
+The agent and individual `start-thread` blocks each choose whether to pick up on-demand MCPs, by listing the sources they want:
+```yaml
+agent:
+  mcpServers: [figma]
+  onDemandMcpServers: [remote]
+handlers:
+  user-message:
+    focused:
+      block: start-thread
+      mcpServers: [figma]
+      # no onDemandMcpServers - this thread does NOT see on-demand MCPs
+    broad:
+      block: start-thread
+      mcpServers: [figma]
+      onDemandMcpServers: [remote]
+```
+### Rules
+- A scope's tool list includes every **connected** MCP of any referenced source, whether or not any protocol declares that slug.
+- Undeclared namespaces inherit `description`, `display`, and `contextRetention` from the per-source entry in `onDemandMcpServers`.
+- Scopes decide independently - threads do not inherit `onDemandMcpServers` from their parent, the same rule as `mcpServers:`.
+- Tool namespaces are always the connector's slug (for example `notion__search`, `linear__create_issue`). Source keys are never namespaces.
+Workers opt into on-demand MCPs the same way: through `start-thread` blocks inside `steps`. A worker without a `start-thread` that lists a source won't see on-demand MCPs of that source.
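The rules above can be sketched as a small resolution function. This is only an illustrative model, not the Octavus implementation: the types `ConnectedMcp` and `Scope` and the function `visibleNamespaces` are hypothetical, and the sketch assumes that an explicitly listed server is visible only when it is actually connected.

```typescript
// Hypothetical model of scope resolution; names and shapes are assumptions.
interface ConnectedMcp {
  slug: string;   // connector slug, used as the tool namespace
  source: string; // e.g. "remote"
}

interface Scope {
  mcpServers: string[];          // namespaces listed explicitly
  onDemandMcpServers?: string[]; // sources opted into, e.g. ["remote"]
}

// Returns the namespaces a single scope can see. Each scope is resolved
// on its own: nothing is inherited from a parent scope.
function visibleNamespaces(scope: Scope, connected: ConnectedMcp[]): string[] {
  const declared = new Set(scope.mcpServers);
  const optedInSources = new Set(scope.onDemandMcpServers ?? []);
  const visible = new Set<string>();
  for (const mcp of connected) {
    // Visible if listed by slug, or if its source is opted in on demand.
    if (declared.has(mcp.slug) || optedInSources.has(mcp.source)) {
      visible.add(mcp.slug);
    }
  }
  return Array.from(visible);
}

// Mirrors the "focused" and "broad" threads from the example above.
const connected: ConnectedMcp[] = [
  { slug: "figma", source: "remote" },
  { slug: "notion", source: "remote" },
  { slug: "linear", source: "remote" },
];
const focused = visibleNamespaces({ mcpServers: ["figma"] }, connected);
const broad = visibleNamespaces(
  { mcpServers: ["figma"], onDemandMcpServers: ["remote"] },
  connected,
);
```

In this model `focused` contains only `figma`, while `broad` also picks up `notion` and `linear` without either being declared in the protocol.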
 ## Workers
-Workers can declare and use MCP servers using the same `mcpServers:` syntax. Workers resolve their own MCP connections independently — they don't inherit from a parent interactive agent.
+Workers can declare and use MCP servers using the same `mcpServers:` syntax. Workers resolve their own MCP connections independently - they don't inherit from a parent interactive agent.
 ```yaml
 # Worker protocol
@@ -227,7 +284,7 @@ steps:
     maxSteps: 10
 ```
-Since workers don't have a global `agent:` section, MCP servers are scoped per-thread via `start-thread` — the same way tools and skills work in workers. Remote MCP connections are project-scoped, so workers in the same project share the same OAuth connections.
+Since workers don't have a global `agent:` section, MCP servers are scoped per-thread via `start-thread` - the same way tools and skills work in workers. Remote MCP connections are project-scoped, so workers in the same project share the same OAuth connections.
 See [Workers](/docs/protocol/workers) for the full worker protocol reference.

package/content/05-api-reference/02-sessions.md CHANGED

@@ -35,7 +35,7 @@ POST /api/agent-sessions
 | `agentId` | string | Yes      | Agent ID (the `id` field, not `slug`) |
 | `input`   | object | No       | Input variables for the agent         |
-> **Getting the agent ID:** Copy the ID from the agent URL in the [platform](https://octavus.ai) (e.g., `octavus.ai/platform/agents/clxyz123`), or use the [CLI](/docs/server-sdk/cli) (`octavus sync ./agents/my-agent`) for local development workflows.
+> **Getting the agent ID:** Copy the ID from the agent URL in the [platform](https://octavus.ai) (e.g., `octavus.ai/projects/.../agents/clxyz123`), or use the [CLI](/docs/server-sdk/cli) (`octavus sync ./agents/my-agent`) for local development workflows.
 ### Response
@@ -216,7 +216,7 @@ curl -X POST https://octavus.ai/api/agent-sessions/:sessionId/restore \
 Clear session state, transitioning it to `expired` status. The session can be restored afterwards with the [Restore Session](#restore-session) endpoint.
-This is idempotent — clearing an already expired session succeeds without error.
+This is idempotent - clearing an already expired session succeeds without error.
 ```
 DELETE /api/agent-sessions/:sessionId

package/content/06-examples/02-nextjs-chat.md CHANGED

@@ -354,6 +354,6 @@ Tool handlers receive the parameters as `args`:
 ## Next Steps
-- [Protocol Overview](/docs/protocol/overview) — Define agent behavior
-- [Messages](/docs/client-sdk/messages) — Rich message rendering
-- [Streaming](/docs/client-sdk/streaming) — Advanced streaming UI
+- [Protocol Overview](/docs/protocol/overview) - Define agent behavior
+- [Messages](/docs/client-sdk/messages) - Rich message rendering
+- [Streaming](/docs/client-sdk/streaming) - Advanced streaming UI

package/content/06-examples/03-socket-chat.md CHANGED

@@ -404,6 +404,6 @@ const SockJS: typeof import('sockjs-client') = require('sockjs-client');
 ## Next Steps
-- [Socket Transport](/docs/client-sdk/socket-transport) — Advanced socket patterns
-- [Protocol Overview](/docs/protocol/overview) — Define agent behavior
-- [Tools](/docs/protocol/tools) — Building tool handlers
+- [Socket Transport](/docs/client-sdk/socket-transport) - Advanced socket patterns
+- [Protocol Overview](/docs/protocol/overview) - Define agent behavior
+- [Tools](/docs/protocol/tools) - Building tool handlers