@octavus/docs 3.1.0 → 3.2.0

@@ -31,7 +31,9 @@ type UIMessagePart =
    | UIOperationPart
    | UISourcePart
    | UIFilePart
-   | UIObjectPart;
+   | UIObjectPart
+   | UITodoPart
+   | UIWorkerPart;
 
  // Text content
  interface UITextPart {
@@ -107,6 +109,31 @@ interface UIObjectPart {
    error?: string;
    thread?: string;
  }
+
+ // Structured task list (when the agent uses octavus_todo_write)
+ interface UITodoPart {
+   type: 'todo';
+   todos: {
+     id: string;
+     content: string;
+     status: 'pending' | 'in_progress' | 'completed' | 'cancelled';
+   }[];
+   status: 'streaming' | 'done';
+   thread?: string;
+ }
+
+ // Sub-agent execution container (when an agent invokes a worker)
+ interface UIWorkerPart {
+   type: 'worker';
+   workerId: string;
+   workerSlug: string;
+   description?: string;
+   input?: Record<string, unknown>;
+   parts: UIMessagePart[]; // Nested parts from the worker (excluding nested workers)
+   output?: unknown;
+   error?: string;
+   status: 'running' | 'done' | 'error';
+ }
  ```
 
  ## Sending Messages
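Reviewer sketch for the hunk above: one way a consumer might handle the two new part types. This is an illustration, not code from the package. The type declarations are trimmed local copies of the interfaces in the diff (with `parts` narrowed to keep the example short), and `todoProgress` and `describePart` are hypothetical helper names.

```typescript
// Trimmed local copies of the interfaces added in the diff.
type TodoStatus = 'pending' | 'in_progress' | 'completed' | 'cancelled';

interface UITodoPart {
  type: 'todo';
  todos: { id: string; content: string; status: TodoStatus }[];
  status: 'streaming' | 'done';
  thread?: string;
}

interface UIWorkerPart {
  type: 'worker';
  workerId: string;
  workerSlug: string;
  description?: string;
  parts: UITodoPart[]; // Simplified for this sketch; the real field is UIMessagePart[]
  error?: string;
  status: 'running' | 'done' | 'error';
}

type SketchPart = UITodoPart | UIWorkerPart;

// Hypothetical helper: "completed/total" over the items that were not cancelled.
function todoProgress(part: UITodoPart): string {
  const active = part.todos.filter((t) => t.status !== 'cancelled');
  const done = active.filter((t) => t.status === 'completed').length;
  return `${done}/${active.length}`;
}

// Hypothetical helper: a one-line label, narrowing on the `type` discriminant.
function describePart(part: SketchPart): string {
  if (part.type === 'todo') {
    return `Plan ${todoProgress(part)} (${part.status})`;
  }
  if (part.status === 'error') {
    return `Worker ${part.workerSlug} failed: ${part.error ?? 'unknown error'}`;
  }
  return `Worker ${part.workerSlug} ${part.status}`;
}
```

Because both interfaces carry a literal `type` field, a plain equality check is enough for TypeScript to narrow the union; no type guards are needed.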
@@ -90,6 +90,7 @@ agent:
    skills: [qr-code] # Enable skills
    imageModel: google/gemini-2.5-flash-image # Enable image generation
    webSearch: true # Enable web search
+   todoList: true # Enable structured task tracking
    agentic: true # Allow multiple tool calls
    thinking: medium # Extended reasoning
 
@@ -47,11 +47,11 @@ Specify models in `provider/model-id` format. Any model supported by the provide
 
  ### Supported Providers
 
- | Provider | Format | Examples |
- | --------- | ---------------------- | -------------------------------------------------------------------- |
- | Anthropic | `anthropic/{model-id}` | `claude-opus-4-5`, `claude-sonnet-4-5`, `claude-haiku-4-5` |
- | Google | `google/{model-id}` | `gemini-3-pro-preview`, `gemini-3-flash-preview`, `gemini-2.5-flash` |
- | OpenAI | `openai/{model-id}` | `gpt-5`, `gpt-4o`, `o4-mini`, `o3`, `o3-mini`, `o1` |
+ | Provider | Format | Examples |
+ | --------- | ---------------------- | -------------------------------------------------------------------------------------------------- |
+ | Anthropic | `anthropic/{model-id}` | `claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-sonnet-4-5`, `claude-haiku-4-5` |
+ | Google | `google/{model-id}` | `gemini-3-pro-preview`, `gemini-3-flash-preview`, `gemini-2.5-flash` |
+ | OpenAI | `openai/{model-id}` | `gpt-5`, `gpt-4o`, `o4-mini`, `o3`, `o3-mini`, `o1` |
 
  ### Examples
 
@@ -225,14 +225,28 @@ agent:
    thinking: medium # low | medium | high
  ```
 
- | Level | Token Budget | Use Case |
- | -------- | ------------ | ------------------- |
- | `low` | ~5,000 | Simple reasoning |
- | `medium` | ~10,000 | Moderate complexity |
- | `high` | ~20,000 | Complex analysis |
+ | Level | Use Case |
+ | -------- | ------------------- |
+ | `low` | Simple reasoning |
+ | `medium` | Moderate complexity |
+ | `high` | Complex analysis |
 
  Thinking content streams to the UI and can be displayed to users.
 
+ ### How levels are applied
+
+ Each provider translates `thinking` into its own reasoning controls:
+
+ | Provider | Level mapping |
+ | -------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- |
+ | Anthropic 4.6+ (`claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-6`) | Adaptive thinking - the model decides how much to reason, guided by `effort: low / medium / high` |
+ | Anthropic older (4.5 and earlier) | Fixed token budgets: `low` ~5,000, `medium` ~10,000, `high` ~20,000 |
+ | OpenAI (GPT-5.x, o-series) | `reasoningEffort: low / medium / high` |
+ | Google (Gemini 3.x) | `thinkingLevel: low / high` (`medium` rounds up to `high`) |
+ | Google (Gemini 1.x / 2.x) | Token budgets: `low` 1,024, `medium` 8,192, `high` 24,576 |
+ | OpenRouter | Unified `reasoning.max_tokens` (translated upstream) |
+ | Vercel AI Gateway | Forwards the underlying provider's options |
+
  ## Prompt Caching
 
  Providers charge less for tokens served from their prompt cache (often 10% of the uncached rate). Octavus exposes a single `cache` field that picks the right retention policy per provider, so the stable prefix of your agent - tools, system prompt, and historical messages - gets billed at the cache-read rate on repeat requests.
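Reviewer sketch for the "How levels are applied" table added above: the level mapping written out as a lookup function. It only restates the table and is not the platform's implementation. The `ProviderFamily` labels, the `budgetTokens` field name, and `mapThinking` itself are hypothetical; the Anthropic budgets are the approximate figures from the table; OpenRouter and Vercel AI Gateway are left out because the table gives no per-level values for them.

```typescript
type ThinkingLevel = 'low' | 'medium' | 'high';

// Hypothetical family labels for this sketch; not identifiers from Octavus.
type ProviderFamily =
  | 'anthropic-adaptive' // Anthropic 4.6+
  | 'anthropic-budget' // Anthropic 4.5 and earlier
  | 'openai' // GPT-5.x, o-series
  | 'gemini-3' // Gemini 3.x
  | 'gemini-legacy'; // Gemini 1.x / 2.x

type ReasoningOptions =
  | { effort: ThinkingLevel }
  | { reasoningEffort: ThinkingLevel }
  | { thinkingLevel: 'low' | 'high' }
  | { budgetTokens: number }; // Field name is an assumption; the table only says "token budgets"

// Approximate budgets from the table ("~5,000", "~10,000", "~20,000").
const ANTHROPIC_BUDGETS: Record<ThinkingLevel, number> = { low: 5000, medium: 10000, high: 20000 };
// Budgets from the table for Gemini 1.x / 2.x.
const GEMINI_BUDGETS: Record<ThinkingLevel, number> = { low: 1024, medium: 8192, high: 24576 };

function mapThinking(family: ProviderFamily, level: ThinkingLevel): ReasoningOptions {
  if (family === 'anthropic-adaptive') return { effort: level };
  if (family === 'anthropic-budget') return { budgetTokens: ANTHROPIC_BUDGETS[level] };
  if (family === 'openai') return { reasoningEffort: level };
  // Gemini 3.x has no medium level, so medium rounds up to high.
  if (family === 'gemini-3') return { thinkingLevel: level === 'low' ? 'low' : 'high' };
  return { budgetTokens: GEMINI_BUDGETS[level] };
}
```

The practical consequence for readers of the diff: `thinking: medium` no longer implies a fixed ~10,000-token budget. On Gemini 3.x it behaves like `high`, and on Anthropic 4.6+ it is a hint rather than a cap.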
@@ -400,6 +414,28 @@ Use cases:
  - Fact verification and documentation lookups
  - Any information that may have changed since the model's training
 
+ ## TODO List
+
+ Enable the LLM to maintain a structured task list while it works:
+
+ ```yaml
+ agent:
+   model: anthropic/claude-sonnet-4-5
+   system: system
+   todoList: true
+   agentic: true
+ ```
+
+ When `todoList` is enabled, the `octavus_todo_write` tool becomes available. The LLM creates and updates a list of items - each with `id`, `content`, and `status` (`pending`, `in_progress`, `completed`, `cancelled`) - and the platform emits a `todo-update` stream event with the resolved snapshot. The Client SDK accumulates updates into a single `UITodoPart` per assistant message, so consumers render an evolving "Plan" card without managing state themselves.
+
+ The list persists across messages: the LLM can use `merge=true` to update items by id (sending only the changed fields), or `merge=false` to replace the list entirely.
+
+ Use cases:
+
+ - Multi-step tasks where the user benefits from seeing progress
+ - Long-running agentic loops that should communicate intent
+ - Workflows where the agent plans before acting
+
  ## Temperature
 
  Control response randomness:
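Reviewer sketch for the TODO List section added above: the `merge=true` / `merge=false` behavior it describes, modeled as a pure function. This illustrates the documented semantics and is not the platform's code. `applyTodoWrite` and `TodoPatch` are hypothetical names, and two behaviors are assumptions the docs do not state: an unknown id sent with `merge=true` is appended, and missing fields on a new item default to an empty `content` and `pending` status.

```typescript
type TodoStatus = 'pending' | 'in_progress' | 'completed' | 'cancelled';

interface Todo {
  id: string;
  content: string;
  status: TodoStatus;
}

// With merge=true only the changed fields need to be sent, so everything but `id` is optional.
interface TodoPatch {
  id: string;
  content?: string;
  status?: TodoStatus;
}

function applyTodoWrite(current: Todo[], incoming: TodoPatch[], merge: boolean): Todo[] {
  // merge=false: start from an empty list, which replaces the previous one entirely.
  const byId = new Map<string, Todo>();
  if (merge) {
    for (const todo of current) byId.set(todo.id, { ...todo });
  }
  for (const patch of incoming) {
    const existing = byId.get(patch.id);
    byId.set(patch.id, {
      id: patch.id,
      // Assumed defaults when an item is new and a field was omitted.
      content: patch.content ?? existing?.content ?? '',
      status: patch.status ?? existing?.status ?? 'pending',
    });
  }
  // Map keeps insertion order, so updated items stay in their original position.
  return Array.from(byId.values());
}
```

The result corresponds to the "resolved snapshot" the section mentions: a consumer rendering `UITodoPart.todos` sees the full list after each write, not the patch.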
@@ -460,9 +496,10 @@ handlers:
  references: [escalation-policy] # Thread-specific references
  imageModel: google/gemini-2.5-flash-image # Thread-specific image model
  webSearch: true # Thread-specific web search
+ todoList: true # Thread-specific task list
  ```
 
- Each thread can have its own model, backup model, cache mode, MCP servers, skills, references, image model, and web search setting. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
+ Each thread can have its own model, backup model, cache mode, MCP servers, skills, references, image model, web search setting, and task list setting. Skills must be defined in the protocol's `skills:` section. References must exist in the agent's `references/` directory. Workers use this same pattern since they don't have a global `agent:` section.
 
  ## Full Example
 
@@ -520,6 +557,7 @@ agent:
    skills: [qr-code] # Octavus skills
    references: [support-policies] # On-demand context
    webSearch: true # Built-in web search
+   todoList: true # Structured task tracking
    agentic: true
    maxSteps: 10
    thinking: medium
@@ -38,6 +38,7 @@ mcpServers:
  | `description` | Yes | What the MCP server provides |
  | `source` | Yes | `remote` (platform-managed) or `device` (consumer-provided) |
  | `display` | No | How tool calls appear in UI: `hidden`, `name`, `description` (default: `description`) |
+ | `connection` | No | When to connect: `eager` or `lazy` (default: `lazy`). Remote only. |
 
  ### Display Modes
 
@@ -134,6 +135,34 @@ Configuration happens in the Octavus platform UI:
  2. The server's slug must match the namespace in your protocol
  3. The platform connects, discovers tools, and makes them available to the agent
 
+ ### Connection Modes
+
+ The `connection` field controls when the platform connects to a remote MCP server:
+
+ | Mode | Behavior |
+ | ------- | ---------------------------------------------------------------------------------------------------------------------- |
+ | `lazy` | (default) The agent activates integrations on demand at runtime. The agent starts responding immediately. |
+ | `eager` | The platform connects and discovers tools before the first LLM request. Tools are guaranteed available from message 1. |
+
+ ```yaml
+ mcpServers:
+   sentry:
+     source: remote
+     connection: eager # Always connected upfront
+     display: name
+
+   notion:
+     source: remote
+     # connection defaults to lazy - agent activates when needed
+     display: description
+ ```
+
+ With **lazy connection** (the default), the agent receives two built-in tools - one for listing available integrations and one for activating them. The agent decides which integrations it needs based on the conversation and activates them on demand. This avoids paying connection latency for integrations the agent doesn't end up using.
+
+ With **eager connection**, the platform connects to the MCP server before the first LLM request, exactly like a declared tool. Use this when the agent needs the MCP's tools from the very first message.
+
+ The `connection` field is only valid on `source: remote` - device MCPs have their own connection mechanism through the server-sdk.
+
  ### Authentication
 
  Remote MCP servers support multiple authentication methods:
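Reviewer sketch for the Connection Modes section added above: the rules it states (default `lazy`, `eager` connects before the first LLM request, `connection` valid only on `source: remote`) expressed as a small planning function. This is not how the platform is implemented. `McpServerConfig` is a trimmed, hypothetical shape covering only the two fields involved, and `planConnections` is a hypothetical name.

```typescript
// Hypothetical, trimmed config shape: only the fields this sketch needs.
interface McpServerConfig {
  source: 'remote' | 'device';
  connection?: 'eager' | 'lazy';
}

interface ConnectionPlan {
  eager: string[]; // Connect and discover tools before the first LLM request
  lazy: string[]; // Left for the agent to activate on demand
  errors: string[]; // Config problems found along the way
}

function planConnections(servers: Record<string, McpServerConfig>): ConnectionPlan {
  const plan: ConnectionPlan = { eager: [], lazy: [], errors: [] };
  for (const slug of Object.keys(servers)) {
    const config = servers[slug];
    if (config.source === 'device') {
      // Device MCPs connect through their own mechanism, so `connection` is rejected here.
      if (config.connection !== undefined) {
        plan.errors.push(`${slug}: connection is only valid on source: remote`);
      }
      continue;
    }
    // Remote servers default to lazy when `connection` is omitted.
    if ((config.connection ?? 'lazy') === 'eager') {
      plan.eager.push(slug);
    } else {
      plan.lazy.push(slug);
    }
  }
  return plan;
}
```

Applied to the YAML in the hunk, `sentry` would land in `eager` and `notion` in `lazy`.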
@@ -295,6 +324,7 @@ mcpServers:
    figma:
      description: Figma design tool integration
      source: remote
+     connection: eager
      display: description
    sentry:
      description: Error tracking and debugging
@@ -355,10 +385,12 @@ mcpServers:
    figma:
      description: Figma design tool integration
      source: remote
+     connection: eager # Need design tools from message 1
      display: description
    sentry:
      description: Error tracking and debugging
      source: remote
+     # Lazy (default) - agent activates when debugging is needed
      display: name
 
  tools: