npm - @octavus/docs - Versions diffs - 2.10.0 → 2.12.0 - Mend

@octavus/docs 2.10.0 → 2.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

package/content/02-server-sdk/01-overview.md +16 -0
package/content/02-server-sdk/05-cli.md +9 -3
package/content/02-server-sdk/06-workers.md +218 -143
package/content/03-client-sdk/08-file-uploads.md +57 -3
package/content/04-protocol/01-overview.md +33 -6
package/content/04-protocol/05-skills.md +43 -7
package/content/04-protocol/06-handlers.md +3 -0
package/content/04-protocol/07-agent-config.md +38 -13
package/content/04-protocol/09-skills-advanced.md +50 -29
package/content/04-protocol/11-workers.md +40 -5
package/content/04-protocol/12-references.md +189 -0
package/content/05-api-reference/03-agents.md +31 -9
package/dist/{chunk-HPVIPOLY.js → chunk-PQ5AGOPY.js} +27 -27
package/dist/chunk-PQ5AGOPY.js.map +1 -0
package/dist/chunk-RRXIH3DI.js +1507 -0
package/dist/chunk-RRXIH3DI.js.map +1 -0
package/dist/{chunk-RZZE5BMI.js → chunk-SAB5XUB6.js} +49 -31
package/dist/chunk-SAB5XUB6.js.map +1 -0
package/dist/content.js +1 -1
package/dist/docs.json +24 -15
package/dist/index.js +1 -1
package/dist/search-index.json +1 -1
package/dist/search.js +1 -1
package/dist/search.js.map +1 -1
package/dist/sections.json +24 -15
package/package.json +1 -1
package/dist/chunk-HPVIPOLY.js.map +0 -1
package/dist/chunk-RZZE5BMI.js.map +0 -1

package/content/04-protocol/05-skills.md CHANGED

@@ -61,24 +61,48 @@ skills:
 ## Enabling Skills
-After defining skills in the `skills:` section, specify which skills are available for the chat thread in `agent.skills`:
+After defining skills in the `skills:` section, specify which skills are available. Skills work in both interactive agents and workers.
+### Interactive Agents
+Reference skills in `agent.skills`:
 ```yaml
-# All skills available to this agent (defined once at protocol level)
 skills:
   qr-code:
     display: description
     description: Generating QR codes
-# Skills available for this chat thread
 agent:
   model: anthropic/claude-sonnet-4-5
   system: system
   tools: [get-user-account]
-  skills: [qr-code] # Skills available for this thread
+  skills: [qr-code]
   agentic: true
 ```
+### Workers and Named Threads
+Reference skills per-thread in `start-thread.skills`:
+```yaml
+skills:
+  qr-code:
+    display: description
+    description: Generating QR codes
+steps:
+  Start thread:
+    block: start-thread
+    thread: worker
+    model: anthropic/claude-sonnet-4-5
+    system: system
+    skills: [qr-code]
+    maxSteps: 10
+```
+This also works for named threads in interactive agents, allowing different threads to have different skills.
 ## Skill Tools
 When skills are enabled, the LLM has access to these tools:
@@ -290,23 +314,35 @@ agent:
 ## Sandbox Timeout
-The default sandbox timeout is 5 minutes. For long-running operations, you can configure a custom timeout using `sandboxTimeout` in the agent config:
+The default sandbox timeout is 5 minutes. You can configure a custom timeout using `sandboxTimeout` in the agent config or on individual `start-thread` blocks:
 ```yaml
+# Agent-level timeout (applies to main thread)
 agent:
   model: anthropic/claude-sonnet-4-5
   skills: [data-analysis]
   sandboxTimeout: 1800000 # 30 minutes (in milliseconds)
 ```
-`sandboxTimeout` Maximum: 1 hour (3,600,000 ms)
+```yaml
+# Thread-level timeout (overrides agent-level for this thread)
+steps:
+  Start thread:
+    block: start-thread
+    thread: analysis
+    model: anthropic/claude-sonnet-4-5
+    skills: [data-analysis]
+    sandboxTimeout: 3600000 # 1 hour
+```
+Thread-level `sandboxTimeout` takes priority over agent-level. Maximum: 1 hour (3,600,000 ms).
 ## Security
 Skills run in isolated sandbox environments:
 - **No network access** (unless explicitly configured)
-- **No persistent storage** (sandbox destroyed after execution)
+- **No persistent storage** (sandbox destroyed after each `next-message` execution)
 - **File output only** via `/output/` directory
 - **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)
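
The precedence described in the skills diff above (thread-level `sandboxTimeout` over agent-level, a 5-minute default, a 1-hour maximum) can be sketched as a small resolver. This is only an illustration: `resolveSandboxTimeout` and both constants are hypothetical names, not part of the Octavus SDK, and rejecting values above the maximum is an assumption, since the diff does not say whether the platform rejects or clamps them.

```typescript
// Illustrative only: these names are hypothetical and not part of the Octavus SDK.
const DEFAULT_SANDBOX_TIMEOUT_MS = 5 * 60 * 1000; // 5-minute default
const MAX_SANDBOX_TIMEOUT_MS = 60 * 60 * 1000; // 1-hour maximum (3,600,000 ms)

function resolveSandboxTimeout(agentLevelMs?: number, threadLevelMs?: number): number {
  // Thread-level wins over agent-level; fall back to the default when neither is set.
  const chosen = threadLevelMs ?? agentLevelMs ?? DEFAULT_SANDBOX_TIMEOUT_MS;
  if (!isFinite(chosen) || chosen <= 0) {
    throw new RangeError('sandboxTimeout must be a positive number of milliseconds');
  }
  // Assumption: values above the maximum are rejected. The platform might clamp instead.
  if (chosen > MAX_SANDBOX_TIMEOUT_MS) {
    throw new RangeError('sandboxTimeout exceeds the 1-hour maximum');
  }
  return chosen;
}

// Agent-level value only: the main thread gets 30 minutes.
const mainThreadTimeout = resolveSandboxTimeout(1800000);
// Thread-level override: the `analysis` thread gets the full hour.
const analysisThreadTimeout = resolveSandboxTimeout(1800000, 3600000);
```

Keeping the limits in named constants makes the millisecond values in the YAML examples (`1800000`, `3600000`) easier to check against the documented bounds.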

package/content/04-protocol/06-handlers.md CHANGED

@@ -148,6 +148,9 @@ Start summary thread:
   maxSteps: 1 # Tool call limit
   system: escalation-summary # System prompt
   input: [COMPANY_NAME] # Variables for prompt
+  skills: [qr-code] # Octavus skills for this thread
+  sandboxTimeout: 600000 # Skill sandbox timeout (default: 5 min, max: 1 hour)
+  imageModel: google/gemini-2.5-flash-image # Image generation model
 ```
 The `model` field can also reference a variable for dynamic model selection:

package/content/04-protocol/07-agent-config.md CHANGED

@@ -15,23 +15,26 @@ agent:
   system: system # References prompts/system.md
   tools: [get-user-account] # Available tools
   skills: [qr-code] # Available skills
+  references: [api-guidelines] # On-demand context documents
 ```
 ## Configuration Options
-| Field         | Required | Description                                               |
-| ------------- | -------- | --------------------------------------------------------- |
-| `model`       | Yes      | Model identifier or variable reference                    |
-| `system`      | Yes      | System prompt filename (without .md)                      |
-| `input`       | No       | Variables to pass to the system prompt                    |
-| `tools`       | No       | List of tools the LLM can call                            |
-| `skills`      | No       | List of Octavus skills the LLM can use                    |
-| `imageModel`  | No       | Image generation model (enables agentic image generation) |
-| `agentic`     | No       | Allow multiple tool call cycles                           |
-| `maxSteps`    | No       | Maximum agentic steps (default: 10)                       |
-| `temperature` | No       | Model temperature (0-2)                                   |
-| `thinking`    | No       | Extended reasoning level                                  |
-| `anthropic`   | No       | Anthropic-specific options (tools, skills)                |
+| Field            | Required | Description                                               |
+| ---------------- | -------- | --------------------------------------------------------- |
+| `model`          | Yes      | Model identifier or variable reference                    |
+| `system`         | Yes      | System prompt filename (without .md)                      |
+| `input`          | No       | Variables to pass to the system prompt                    |
+| `tools`          | No       | List of tools the LLM can call                            |
+| `skills`         | No       | List of Octavus skills the LLM can use                    |
+| `references`     | No       | List of references the LLM can fetch on demand            |
+| `sandboxTimeout` | No       | Skill sandbox timeout in ms (default: 5 min, max: 1 hour) |
+| `imageModel`     | No       | Image generation model (enables agentic image generation) |
+| `agentic`        | No       | Allow multiple tool call cycles                           |
+| `maxSteps`       | No       | Maximum agentic steps (default: 10)                       |
+| `temperature`    | No       | Model temperature (0-2)                                   |
+| `thinking`       | No       | Extended reasoning level                                  |
+| `anthropic`      | No       | Anthropic-specific options (tools, skills)                |
 ## Models
@@ -211,6 +214,22 @@ Skills provide provider-agnostic code execution in isolated sandboxes. When enab
 See [Skills](/docs/protocol/skills) for full documentation.
+## References
+Enable on-demand context loading via reference documents:
+```yaml
+agent:
+  model: anthropic/claude-sonnet-4-5
+  system: system
+  references: [api-guidelines, error-codes]
+  agentic: true
+```
+References are markdown files stored in the agent's `references/` directory. When enabled, the LLM can list available references and read their content using `octavus_reference_list` and `octavus_reference_read` tools.
+See [References](/docs/protocol/references) for full documentation.
 ## Image Generation
 Enable the LLM to generate images autonomously:
@@ -319,8 +338,13 @@ handlers:
       thinking: low # Different thinking
       maxSteps: 1 # Limit tool calls
       system: escalation-summary # Different prompt
+      skills: [data-analysis] # Thread-specific skills
+      references: [escalation-policy] # Thread-specific references
+      imageModel: google/gemini-2.5-flash-image # Thread-specific image model
 ```
+Each thread can have its own skills, references, and image model. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
 ## Full Example
 ```yaml
@@ -367,6 +391,7 @@ agent:
     - search-docs
     - create-support-ticket
   skills: [qr-code] # Octavus skills
+  references: [support-policies] # On-demand context
   agentic: true
   maxSteps: 10
   thinking: medium

package/content/04-protocol/09-skills-advanced.md CHANGED

@@ -26,10 +26,11 @@ Use external tools instead when:
 ### Defining Available Skills
-Define all skills available to this agent in the `skills:` section. Then specify which skills are available for the chat thread in `agent.skills`:
+Define all skills in the `skills:` section, then reference which skills are available where they're used:
+**Interactive agents** — reference in `agent.skills`:
 ```yaml
-# All skills available to this agent (defined once at protocol level)
 skills:
   qr-code:
     display: description
@@ -37,23 +38,39 @@ skills:
   pdf-processor:
     display: description
     description: Processing PDFs
-  data-analysis:
-    display: description
-    description: Analyzing data
-# Skills available for this chat thread
 agent:
   model: anthropic/claude-sonnet-4-5
   system: system
-  skills: [qr-code] # Skills available for this thread
+  skills: [qr-code]
+```
+**Workers and named threads** — reference per-thread in `start-thread.skills`:
+```yaml
+skills:
+  qr-code:
+    display: description
+    description: Generating QR codes
+  data-analysis:
+    display: description
+    description: Analyzing data
+steps:
+  Start analysis:
+    block: start-thread
+    thread: analysis
+    model: anthropic/claude-sonnet-4-5
+    system: system
+    skills: [qr-code, data-analysis]
+    maxSteps: 10
 ```
 ### Match Skills to Use Cases
-Define all skills available to this agent in the `skills:` section. Then specify which skills are available for the chat thread based on use case:
+Different threads can have different skills. Define all skills at the protocol level, then scope them to each thread:
 ```yaml
-# All skills available to this agent (defined once at protocol level)
 skills:
   qr-code:
     display: description
@@ -65,14 +82,13 @@ skills:
     display: description
     description: Creating charts and visualizations
-# Skills available for this chat thread (support use case)
 agent:
   model: anthropic/claude-sonnet-4-5
   system: system
-  skills: [qr-code] # Skills available for this thread
+  skills: [qr-code]
 ```
-For a data analysis thread, you would specify `[data-analysis, visualization]` in `agent.skills`, but still define all available skills in the `skills:` section above.
+For a data analysis thread, you would specify `[data-analysis, visualization]` in `agent.skills` or in a `start-thread` block's `skills` field.
 ## Display Mode Strategy
@@ -207,43 +223,48 @@ with open(f'{output_dir}/metadata.json', 'w') as f:
 Sandboxes are created only when a skill tool is first called:
 ```yaml
-# Sandbox not created until LLM calls a skill tool
 agent:
-  skills: [qr-code] # Sandbox created on first use
+  skills: [qr-code] # Sandbox created on first skill tool call
 ```
 This means:
 - No cost if skills aren't used
 - Fast startup (no sandbox creation delay)
-- Sandbox reused for all skill calls in a trigger
+- Each `next-message` execution gets its own sandbox with only the skills it needs
 ### Timeout Limits
-Sandboxes have a 5-minute default timeout, which can be configured via `sandboxTimeout`:
+Sandboxes default to a 5-minute timeout. Configure `sandboxTimeout` on the agent config or per thread:
 ```yaml
+# Agent-level
 agent:
   model: anthropic/claude-sonnet-4-5
   skills: [data-analysis]
-  sandboxTimeout: 1800000 # 30 minutes for long-running analysis
+  sandboxTimeout: 1800000 # 30 minutes
 ```
-`sandboxTimeout` Maximum: 1 hour (3,600,000 ms)
-**Timeout guidelines:**
+```yaml
+# Thread-level (overrides agent-level)
+steps:
+  Start thread:
+    block: start-thread
+    thread: analysis
+    skills: [data-analysis]
+    sandboxTimeout: 3600000 # 1 hour for long-running analysis
+```
-- **Short operations** (default 5 min): QR codes, simple calculations
-- **Medium operations** (10-30 min): Data analysis, report generation
-- **Long operations** (30+ min): Complex processing, large dataset analysis
+Thread-level `sandboxTimeout` takes priority. Maximum: 1 hour (3,600,000 ms).
 ### Sandbox Lifecycle
-Each trigger execution gets a fresh sandbox:
+Each `next-message` execution gets its own sandbox:
-- **Clean state** - No leftover files from previous executions
-- **Isolated** - No interference between sessions
-- **Destroyed** - Sandbox cleaned up after trigger completes
+- **Scoped** - Only contains the skills available to that thread
+- **Isolated** - Interactive agents and workers don't share sandboxes
+- **Resilient** - If a sandbox expires, it's transparently recreated
+- **Cleaned up** - Sandbox destroyed when the LLM call completes
 ## Combining Skills with Tools
@@ -348,7 +369,7 @@ The LLM sees these errors and can retry or explain to users.
 ### Sandbox Isolation
 - **No network access** (unless explicitly configured)
-- **No persistent storage** (sandbox destroyed after execution)
+- **No persistent storage** (sandbox destroyed after each `next-message` execution)
 - **File output only** via `/output/` directory
 - **Time limits** enforced (5-minute default, configurable via `sandboxTimeout`)
@@ -373,7 +394,7 @@ if len(data) > 1000:
 Be aware of:
 - **File size limits** - Large files may fail to upload
-- **Execution time** - 5-minute sandbox timeout
+- **Execution time** - Sandbox timeout (5-minute default, 1-hour maximum)
 - **Memory limits** - Sandbox environment constraints
 ## Debugging Skills

package/content/04-protocol/11-workers.md CHANGED

@@ -148,7 +148,7 @@ steps:
     tools: [tool-b]
 ```
-This gives workers flexibility to use different models, tools, and settings at different stages.
+This gives workers flexibility to use different models, tools, skills, and settings at different stages.
 ### Steps Instead of Handlers
@@ -226,7 +226,7 @@ All LLM configuration goes here:
 | `system`      | System prompt filename (required)                 |
 | `input`       | Variables for system prompt                       |
 | `tools`       | Tools available in this thread                    |
-| `workers`     | Workers available to this thread (as LLM tools)   |
+| `skills`      | Octavus skills available in this thread           |
 | `imageModel`  | Image generation model                            |
 | `thinking`    | Extended reasoning level                          |
 | `temperature` | Model temperature                                 |
@@ -362,6 +362,31 @@ steps:
 output: CONVERSATION_SUMMARY
 ```
+## Skills and Image Generation
+Workers can use Octavus skills and image generation, configured per-thread via `start-thread`:
+```yaml
+skills:
+  qr-code:
+    display: description
+    description: Generate QR codes
+steps:
+  Start thread:
+    block: start-thread
+    thread: worker
+    model: anthropic/claude-sonnet-4-5
+    system: system
+    skills: [qr-code]
+    imageModel: google/gemini-2.5-flash-image
+    maxSteps: 10
+```
+Workers define their own skills independently -- they don't inherit skills from a parent interactive agent. Each thread gets its own sandbox scoped to only its listed skills.
+See [Skills](/docs/protocol/skills) for full documentation.
 ## Tool Handling
 Workers support the same tool handling as interactive agents:
@@ -370,14 +395,24 @@ Workers support the same tool handling as interactive agents:
 - **Client tools** — Pause execution, return tool request to caller
 ```typescript
+// Non-streaming: get the output directly
+const { output } = await client.workers.generate(
+  agentId,
+  { TOPIC: 'AI safety' },
+  {
+    tools: {
+      'web-search': async (args) => await searchWeb(args.query),
+    },
+  },
+);
+// Streaming: observe events in real-time
 const events = client.workers.execute(
   agentId,
   { TOPIC: 'AI safety' },
   {
     tools: {
-      'web-search': async (args) => {
-        return await searchWeb(args.query);
-      },
+      'web-search': async (args) => await searchWeb(args.query),
     },
   },
 );

package/content/04-protocol/12-references.md ADDED

@@ -0,0 +1,189 @@
+---
+title: References
+description: Using references for on-demand context loading in agents.
+---
+# References
+References are markdown documents that agents can fetch on demand. Instead of loading everything into the system prompt upfront, references let the agent decide what context it needs and load it when relevant.
+## Overview
+References are useful for:
+- **Large context** — Documents too long to include in every system prompt
+- **Selective loading** — Let the agent decide which context is relevant
+- **Shared knowledge** — Reusable documents across threads
+### How References Work
+1. **Definition**: Reference files live in the `references/` directory alongside your agent
+2. **Configuration**: List available references in `agent.references` or `start-thread.references`
+3. **Discovery**: The agent sees reference names and descriptions in its system prompt
+4. **Fetching**: The agent calls reference tools to read the full content when needed
+## Creating References
+Each reference is a markdown file with YAML frontmatter in the `references/` directory:
+```
+my-agent/
+├── settings.json
+├── protocol.yaml
+├── prompts/
+│   └── system.md
+└── references/
+    ├── api-guidelines.md
+    └── error-codes.md
+```
+### Reference Format
+```markdown
+---
+description: >
+  API design guidelines including naming conventions,
+  error handling patterns, and pagination standards.
+---
+# API Guidelines
+## Naming Conventions
+Use lowercase with dashes for URL paths...
+## Error Handling
+All errors return a standard error envelope...
+```
+The `description` field is required. It tells the agent what the reference contains so it can decide when to fetch it.
+### Naming Convention
+Reference filenames use `lowercase-with-dashes`:
+- `api-guidelines.md`
+- `error-codes.md`
+- `coding-standards.md`
+The filename (without `.md`) becomes the reference name used in the protocol.
+## Enabling References
+After creating reference files, specify which references are available in the protocol.
+### Interactive Agents
+List references in `agent.references`:
+```yaml
+agent:
+  model: anthropic/claude-sonnet-4-5
+  system: system
+  references: [api-guidelines, error-codes]
+  agentic: true
+```
+### Workers and Named Threads
+List references per-thread in `start-thread.references`:
+```yaml
+steps:
+  Start thread:
+    block: start-thread
+    thread: worker
+    model: anthropic/claude-sonnet-4-5
+    system: system
+    references: [api-guidelines]
+    maxSteps: 10
+```
+Different threads can have different references.
+## Reference Tools
+When references are enabled, the agent has access to two tools:
+| Tool                     | Purpose                                         |
+| ------------------------ | ----------------------------------------------- |
+| `octavus_reference_list` | List all available references with descriptions |
+| `octavus_reference_read` | Read the full content of a specific reference   |
+The agent also sees reference names and descriptions in its system prompt, so it knows what's available without calling `octavus_reference_list`.
+## Example
+```yaml
+agent:
+  model: anthropic/claude-sonnet-4-5
+  system: system
+  tools: [review-pull-request]
+  references: [coding-standards, api-guidelines]
+  agentic: true
+handlers:
+  user-message:
+    Add message:
+      block: add-message
+      role: user
+      prompt: user-message
+      input: [USER_MESSAGE]
+    Respond:
+      block: next-message
+```
+With `references/coding-standards.md`:
+```markdown
+---
+description: >
+  Team coding standards including naming conventions,
+  code organization, and review checklist.
+---
+# Coding Standards
+## Naming Conventions
+- Files: kebab-case
+- Variables: camelCase
+- Constants: UPPER_SNAKE_CASE
+  ...
+```
+When a user asks the agent to review code, the agent will:
+1. See "coding-standards" and "api-guidelines" in its system prompt
+2. Decide which references are relevant to the review
+3. Call `octavus_reference_read` to load the relevant reference
+4. Use the loaded context to provide an informed review
+## Validation
+The CLI and platform validate references during sync and deployment:
+- **Undefined references** — Referencing a name that doesn't have a matching file in `references/`
+- **Unused references** — A reference file exists but isn't listed in any `agent.references` or `start-thread.references`
+- **Invalid names** — Names that don't follow the `lowercase-with-dashes` convention
+- **Missing description** — Reference files without the required `description` in frontmatter
+## References vs Skills
+| Aspect        | References                    | Skills                          |
+| ------------- | ----------------------------- | ------------------------------- |
+| **Purpose**   | On-demand context documents   | Code execution and file output  |
+| **Content**   | Markdown text                 | Documentation + scripts         |
+| **Execution** | Synchronous text retrieval    | Sandboxed code execution (E2B)  |
+| **Scope**     | Per-agent (stored with agent) | Per-organization (shared)       |
+| **Tools**     | List and read (2 tools)       | Read, list, run, code (6 tools) |
+Use **references** when the agent needs access to text-based knowledge. Use **skills** when the agent needs to execute code or generate files.
+## Next Steps
+- [Agent Config](/docs/protocol/agent-config) — Configuring references in agent settings
+- [Skills](/docs/protocol/skills) — Code execution and knowledge packages
+- [Workers](/docs/protocol/workers) — Using references in worker agents
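
The four validation rules listed in the references diff above can be approximated with a short checker. This is a sketch under stated assumptions, not the actual CLI or platform logic: `validateReferences`, the issue kinds, and `NAME_PATTERN` are hypothetical, and the pattern is only one reading of `lowercase-with-dashes` (it also allows digits).

```typescript
// Illustrative only: hypothetical types and names, not the Octavus CLI implementation.
type IssueKind =
  | 'undefined-reference'
  | 'unused-reference'
  | 'invalid-name'
  | 'missing-description';

interface ReferenceFile {
  name: string; // filename without `.md`
  description?: string; // from the YAML frontmatter
}

interface Issue {
  kind: IssueKind;
  name: string;
}

// Assumption: lowercase letters and digits in dash-separated segments.
const NAME_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function validateReferences(files: ReferenceFile[], listed: string[]): Issue[] {
  const issues: Issue[] = [];
  const fileNames = files.map((f) => f.name);

  // Names listed in the protocol that have no matching file in `references/`.
  listed.forEach((name, index) => {
    const firstOccurrence = listed.indexOf(name) === index;
    if (firstOccurrence && fileNames.indexOf(name) === -1) {
      issues.push({ kind: 'undefined-reference', name });
    }
  });

  files.forEach((file) => {
    if (!NAME_PATTERN.test(file.name)) {
      issues.push({ kind: 'invalid-name', name: file.name });
    }
    if (!file.description || file.description.trim() === '') {
      issues.push({ kind: 'missing-description', name: file.name });
    }
    // Files that no `agent.references` or `start-thread.references` entry mentions.
    if (listed.indexOf(file.name) === -1) {
      issues.push({ kind: 'unused-reference', name: file.name });
    }
  });

  return issues;
}

// `listed` is the union of every `agent.references` and `start-thread.references` entry.
const exampleIssues = validateReferences(
  [
    { name: 'api-guidelines', description: 'API design guidelines.' },
    { name: 'Error_Codes' },
  ],
  ['api-guidelines', 'coding-standards'],
);
```

For this input the checker reports `coding-standards` as undefined and flags `Error_Codes` as an invalid name, as missing a description, and as unused.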

package/content/05-api-reference/03-agents.md CHANGED

@@ -5,7 +5,7 @@ description: Agent management API endpoints.
 # Agents API
-Manage agent definitions including protocols and prompts.
+Manage agent definitions including protocols, prompts, and references.
 ## Permissions
@@ -82,6 +82,13 @@ GET /api/agents/:id
       "name": "user-message",
       "content": "{{USER_MESSAGE}}"
     }
+  ],
+  "references": [
+    {
+      "name": "api-guidelines",
+      "description": "API design guidelines and conventions",
+      "content": "# API Guidelines\n\nUse lowercase with dashes..."
+    }
   ]
 }
 ```
@@ -119,18 +126,26 @@ POST /api/agents
       "name": "system",
       "content": "You are a support agent..."
     }
+  ],
+  "references": [
+    {
+      "name": "api-guidelines",
+      "description": "API design guidelines and conventions",
+      "content": "# API Guidelines\n..."
+    }
   ]
 }
 ```
-| Field                  | Type   | Required | Description               |
-| ---------------------- | ------ | -------- | ------------------------- |
-| `settings.slug`        | string | Yes      | URL-safe identifier       |
-| `settings.name`        | string | Yes      | Display name              |
-| `settings.description` | string | No       | Agent description         |
-| `settings.format`      | string | Yes      | `interactive` or `worker` |
-| `protocol`             | string | Yes      | YAML protocol definition  |
-| `prompts`              | array  | Yes      | Prompt files              |
+| Field                  | Type   | Required | Description                                      |
+| ---------------------- | ------ | -------- | ------------------------------------------------ |
+| `settings.slug`        | string | Yes      | URL-safe identifier                              |
+| `settings.name`        | string | Yes      | Display name                                     |
+| `settings.description` | string | No       | Agent description                                |
+| `settings.format`      | string | Yes      | `interactive` or `worker`                        |
+| `protocol`             | string | Yes      | YAML protocol definition                         |
+| `prompts`              | array  | Yes      | Prompt files                                     |
+| `references`           | array  | No       | Reference documents (name, description, content) |
 ### Response
@@ -178,6 +193,13 @@ PATCH /api/agents/:id
       "name": "system",
       "content": "Updated system prompt..."
     }
+  ],
+  "references": [
+    {
+      "name": "api-guidelines",
+      "description": "Updated description",
+      "content": "Updated content..."
+    }
   ]
 }
 ```
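
The request-body fields in the agents API diff above can be captured as TypeScript types, which makes the new optional `references` array explicit. This is a minimal sketch: the interface names and `buildCreateAgentBody` are hypothetical, the field shapes are inferred from the table and JSON fragments in the diff, and omitting `references` when the list is empty is a design choice here rather than documented behaviour.

```typescript
// Illustrative only: hypothetical types inferred from the diff, not an official client.
interface PromptFile {
  name: string;
  content: string;
}

interface ReferenceDoc {
  name: string;
  description: string;
  content: string;
}

interface AgentSettings {
  slug: string;
  name: string;
  description?: string;
  format: 'interactive' | 'worker';
}

interface CreateAgentBody {
  settings: AgentSettings;
  protocol: string; // YAML protocol definition
  prompts: PromptFile[];
  references?: ReferenceDoc[]; // optional per the field table
}

function buildCreateAgentBody(
  settings: AgentSettings,
  protocol: string,
  prompts: PromptFile[],
  references: ReferenceDoc[] = [],
): CreateAgentBody {
  const body: CreateAgentBody = { settings, protocol, prompts };
  // Assumption: leave the optional field out entirely when there is nothing to send.
  if (references.length > 0) {
    body.references = references;
  }
  return body;
}

const createBody = buildCreateAgentBody(
  { slug: 'support-chat', name: 'Support Chat', format: 'interactive' },
  'agent:\n  model: anthropic/claude-sonnet-4-5\n  system: system\n',
  [{ name: 'system', content: 'You are a support agent...' }],
  [
    {
      name: 'api-guidelines',
      description: 'API design guidelines and conventions',
      content: '# API Guidelines\n...',
    },
  ],
);

// Serialized form, ready to send as the body of `POST /api/agents`.
const createBodyJson = JSON.stringify(createBody, null, 2);
```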