@standardagents/skill 0.14.0 → 0.14.1
- package/package.json +1 -1
- package/skills/agentbuilder/SKILL.md +485 -878
|
@@ -12,970 +12,577 @@ description: >
|
|
|
12
12
|
|
|
13
13
|
# AgentBuilder Skill
|
|
14
14
|
|
|
15
|
-
|
|
15
|
+
This guide is a living record. It is not part of the Standard Agent Specification — it exists to help humans and coding agents *use* the spec to build effective agents. It is updated regularly and not bound to any spec version.
|
|
16
16
|
|
|
17
|
-
|
|
17
|
+
> Commands like `pnpm exec agents …` are placeholders. Use whichever package manager (`npm`, `pnpm`, `yarn`, `bun`) the project actually uses.
|
|
18
18
|
|
|
19
|
-
|
|
19
|
+
For deeper reference material that this skill links to, see:
|
|
20
20
|
|
|
21
|
-
|
|
21
|
+
- `agents/agents/AGENTS.md` — agent definition reference (created by `agents scaffold`)
|
|
22
|
+
- `agents/prompts/AGENTS.md` — prompt and tool config reference
|
|
23
|
+
- `agents/tools/AGENTS.md` — tool-writing patterns and `ThreadState` examples
|
|
24
|
+
- `agents/models/AGENTS.md` — recommended model list (authoritative)
|
|
25
|
+
- `agents/hooks/AGENTS.md` — hook reference
|
|
26
|
+
- Canonical TypeScript types (full signatures): read `node_modules/@standardagents/spec/dist/` in your project, or browse `packages/spec/src/` on GitHub at https://github.com/standardagents/agentbuilder
|
|
27
|
+
- Specification: https://standardagentspec.org/llms.txt
|
|
28
|
+
- Builder docs: https://docs.standardagentbuilder.com/llms.txt
|
|
22
29
|
|
|
23
|
-
|
|
30
|
+
> If the `agents/*/AGENTS.md` files don't exist in your project yet, run `pnpm exec agents scaffold` to create them.
|
|
24
31
|
|
|
25
|
-
|
|
32
|
+
---
|
|
26
33
|
|
|
27
|
-
|
|
34
|
+
## Before you write any code
|
|
28
35
|
|
|
29
|
-
|
|
36
|
+
Most failed Standard Agent projects fail in the same six ways. Read this section first and answer all six questions explicitly before creating any code. Each links to the deeper section that explains *how*.
|
|
30
37
|
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
- decides which domain owns the work
|
|
34
|
-
- aggregates state, plans, and responses
|
|
35
|
-
- has subprompts for producing images, or using a more eloquent model for writing
|
|
36
|
-
- `communications_coordinator`
|
|
37
|
-
- owns all inbound and outbound communications work
|
|
38
|
-
- delegates to `gmail_agent`, `slack_agent`, and `text_message_agent`
|
|
39
|
-
- receives escalations from those channel agents and routes the next action
|
|
40
|
-
- `research_assistant`
|
|
41
|
-
- owns information gathering and synthesis
|
|
42
|
-
- delegates to `browser_use_agent`, `google_drive_agent`, and `notes_agent`
|
|
43
|
-
- optional additional branches
|
|
44
|
-
- `scheduling_coordinator`
|
|
45
|
-
- `travel_coordinator`
|
|
46
|
-
- `finance_ops_coordinator`
|
|
47
|
-
- `crm_coordinator`
|
|
38
|
+
**1. Architect the agent graph before picking types.**
|
|
39
|
+
List the domains the system touches. Determine the tree (coordinator → domain agents → sub-domain agents). *Then* decide what each node is. Do not start by writing one agent and bolting tools onto it. → [Architecture & decomposition](#architecture--decomposition)
|
|
48
40
|
|
|
49
|
-
|
|
41
|
+
**2. `dual_ai` is the default. `ai_human` is the exception.**
|
|
42
|
+
`ai_human` is correct only when the thread itself is the chat surface (chat UI, website widget, AgentBuilder admin, direct API chat). For Slack, email, SMS, Discord, webhooks, polled inboxes, schedulers, or any other mediated channel, the human is just a tool target — the agent is `dual_ai`. The whole graph being `dual_ai` is fine and often cleaner. → [Interaction type](#interaction-type-dual_ai-is-the-default)
|
|
50
43
|
|
|
51
|
-
|
|
44
|
+
**3. Every `dual_ai` agent must have explicit session boundaries.**
|
|
45
|
+
Name its `sessionStop` tool. Name its `sessionFail` tool. Set a finite `maxSessionTurns`. Give side_b a real, non-redundant job. If you cannot say in one sentence what tool call ends the session, the agent is not designed yet. → [Session boundary discipline](#session-boundary-discipline)
|
|
52
46
|
|
|
53
|
-
|
|
54
|
-
-
|
|
55
|
-
- Tools (define the functions that an agent can call to interact with the world, and other agents)
|
|
56
|
-
- Agents (define the agent itself, how it uses prompts and tools)
|
|
57
|
-
- Hooks (define custom code that can perform various internal operations such as modifying the prompt or changing the message history)
|
|
58
|
-
- Effects (custom code that executes at a certain time)
|
|
59
|
-
- Threads (the instance of a given agent, with its own state, message history, and filesystem)
|
|
60
|
-
- Endpoints (built-in and custom endpoints exposed by an agent thread)
|
|
47
|
+
**4. Pick models from `current-models` only.**
|
|
48
|
+
Never invent model strings from memory. Run `pnpm exec agents current-models`, choose by category, then run `pnpm exec agents available-models --provider=<name>` to confirm the exact string. → [Model selection](#model-selection)
|
|
61
49
|
|
|
62
|
-
|
|
50
|
+
**5. Research every third-party API before writing the tool.**
|
|
51
|
+
Fetch the official current docs. Confirm auth, base URL, endpoint paths, payload shapes, rate limits, error codes. Do not write a tool from memory of how an API "usually" works. APIs change. → [API research checklist](#api-research-checklist)
|
|
63
52
|
|
|
64
|
-
|
|
53
|
+
**6. Check `ThreadState` before adding any dependency.**
|
|
54
|
+
Before reaching for S3, Redis, an external cron, a queue service, or even `node:fs`, confirm the framework does not already provide the capability via `ThreadState`. It almost always does. → [ThreadState first](#threadstate-first)
|
|
65
55
|
|
|
66
|
-
|
|
56
|
+
---
|
|
67
57
|
|
|
68
|
-
|
|
58
|
+
## What is a Standard Agent?
|
|
69
59
|
|
|
70
|
-
|
|
71
|
-
pnpm exec agents current-models
|
|
72
|
-
```
|
|
60
|
+
In the Standard Agent paradigm, agents are the atomic unit of an AI system, and it is the *composition* of many domain-specific agents that makes the system effective. Standard Agents can work well with small, cheap models, but those models discern poorly between tools when presented with a broad, undifferentiated set. Decomposition is what makes them work.
|
|
73
61
|
|
|
74
|
-
|
|
62
|
+
A "gmail" agent will outperform a "google apps" agent. A higher-level "communications" coordinator composes the gmail, slack, and SMS agents. A still-higher "personal assistant" coordinator composes communications, research, scheduling, and finance. This is fractal: the same shape scales up to teams, departments, and entire products. These agent-graphs solve real industry problems — progressive tool discovery, model/prompt tuning, context dilution, task resumability, and compaction.
|
|
63
|
+
|
|
64
|
+
Just because subagents *can* compose complex behavior doesn't mean every step needs one. A subagent is a two-sided conversation where each side may take multiple steps per turn. Sometimes you only need a feature of a different model — generating an image, rewriting a paragraph in a more eloquent voice. For those, use a **subprompt** (a single LLM step exposed as a tool), not a subagent. Use a **subagent** when you need iteration, QA, reflection, or long-lived addressable behavior.
|
|
65
|
+
|
|
66
|
+
When a subagent does QA on another model's output, prefer a *different provider* on the reviewing side. Same-lab models tend to rate their own output too generously.
|
|
67
|
+
|
|
68
|
+
### Example graph
|
|
75
69
|
|
|
76
|
-
```
|
|
77
|
-
|
|
70
|
+
```
|
|
71
|
+
personal_assistant_coordinator (dual_ai, top-level reasoning model)
|
|
72
|
+
├── communications_coordinator (dual_ai, resumable, explicit parent communication, note: children are explicit because they receive inbound messages and should filter out noise before returning)
|
|
73
|
+
│ ├── gmail_agent (dual_ai, resumable, immediate, explicit parent communication)
|
|
74
|
+
│ ├── slack_agent (dual_ai, resumable, immediate, explicit parent communication)
|
|
75
|
+
│ └── sms_agent (dual_ai, resumable, immediate, explicit parent communication)
|
|
76
|
+
├── research_assistant (dual_ai, resumable, explicit parent communication)
|
|
77
|
+
│ ├── browser_use_agent (dual_ai, resumable, implicit parent communication)
|
|
78
|
+
│ ├── google_drive_agent (dual_ai, resumable, implicit parent communication)
|
|
79
|
+
│ └── notes_agent (dual_ai, resumable, implicit parent communication)
|
|
80
|
+
├── scheduling_coordinator (subprompt)
|
|
81
|
+
└── finance_ops_coordinator (dual_ai)
|
|
78
82
|
```
|
|
79
83
|
|
|
80
|
-
Note:
|
|
84
|
+
Note: no `ai_human` anywhere. The "human" enters via the gmail/slack/sms tool calls. The whole graph is autonomous.
|
|
81
85
|
|
|
82
|
-
|
|
86
|
+
## The Standard Agent stack
|
|
83
87
|
|
|
84
|
-
|
|
88
|
+
The spec defines an agent as a composition of:
|
|
85
89
|
|
|
134
|
-
/**
|
|
135
|
-
* Tool calling strategy for the LLM.
|
|
136
|
-
*
|
|
137
|
-
* - `auto`: Model decides when to call tools (default)
|
|
138
|
-
* - `none`: Disable tool calling entirely
|
|
139
|
-
* - `required`: Force the model to call at least one tool
|
|
140
|
-
*
|
|
141
|
-
* @default 'auto'
|
|
142
|
-
*/
|
|
143
|
-
toolChoice?: 'auto' | 'none' | 'required';
|
|
144
|
-
|
|
145
|
-
/**
|
|
146
|
-
* Zod schema for validating inputs when this prompt is called as a tool.
|
|
147
|
-
*/
|
|
148
|
-
requiredSchema?: S;
|
|
149
|
-
|
|
150
|
-
/**
|
|
151
|
-
* Declared variables for this prompt.
|
|
152
|
-
*/
|
|
153
|
-
variables?: VariableDefinition[];
|
|
154
|
-
|
|
155
|
-
/**
|
|
156
|
-
* Tools available to this prompt.
|
|
157
|
-
* Can be:
|
|
158
|
-
* - string: Simple tool name (custom or provider tool)
|
|
159
|
-
* - SubpromptConfig: Sub-prompt used as a tool
|
|
160
|
-
* - PromptToolConfig: Tool with environment values and/or options
|
|
161
|
-
* - SubagentToolConfig: `dual_ai` subagent invocation behavior
|
|
162
|
-
*
|
|
163
|
-
* To enable handoffs, include ai_human agent names in this array.
|
|
164
|
-
*
|
|
165
|
-
* @example
|
|
166
|
-
* ```typescript
|
|
167
|
-
* tools: [
|
|
168
|
-
* 'custom_tool', // Simple tool name
|
|
169
|
-
* { name: 'other_prompt' }, // Sub-prompt as tool
|
|
170
|
-
* { name: 'file_search', env: { VECTOR_STORE_ID: 'vs_123' } }, // Tool with env values
|
|
171
|
-
* ]
|
|
172
|
-
* ```
|
|
173
|
-
*/
|
|
174
|
-
tools?: (
|
|
175
|
-
| StandardAgentSpec.Callables
|
|
176
|
-
| SubpromptConfig
|
|
177
|
-
| PromptToolConfig
|
|
178
|
-
| SubagentToolConfig
|
|
179
|
-
)[];
|
|
180
|
-
|
|
181
|
-
/**
|
|
182
|
-
* Environment values provided by this prompt.
|
|
183
|
-
* Prompt values are the lowest-precedence source in runtime resolution.
|
|
184
|
-
*/
|
|
185
|
-
env?: Record<string, string>;
|
|
186
|
-
|
|
187
|
-
/**
|
|
188
|
-
* Reasoning configuration for models that support extended thinking.
|
|
189
|
-
*/
|
|
190
|
-
reasoning?: ReasoningConfig;
|
|
191
|
-
|
|
192
|
-
/**
|
|
193
|
-
* Number of recent messages to keep actual images for in context.
|
|
194
|
-
* @default 10
|
|
195
|
-
*/
|
|
196
|
-
recentImageThreshold?: number;
|
|
197
|
-
|
|
198
|
-
/**
|
|
199
|
-
* Provider-specific options passed through to the provider.
|
|
200
|
-
* These override model-level providerOptions for this prompt.
|
|
201
|
-
*
|
|
202
|
-
* Options are merged in order (later wins):
|
|
203
|
-
* 1. model.providerOptions (defaults)
|
|
204
|
-
* 2. prompt.providerOptions (this field - overrides)
|
|
205
|
-
*
|
|
206
|
-
* @example
|
|
207
|
-
* ```typescript
|
|
208
|
-
* providerOptions: {
|
|
209
|
-
* response_format: { type: 'json_object' },
|
|
210
|
-
* }
|
|
211
|
-
* ```
|
|
212
|
-
*/
|
|
213
|
-
providerOptions?: Record<string, unknown>;
|
|
214
|
-
|
|
215
|
-
/**
|
|
216
|
-
* Hook IDs to run when this prompt is active.
|
|
217
|
-
* References hooks by their unique `id` property from defineHook().
|
|
218
|
-
* If not specified, falls back to agent-level hooks.
|
|
219
|
-
*
|
|
220
|
-
* @example
|
|
221
|
-
* ```typescript
|
|
222
|
-
* hooks: ['limit_to_20_messages', 'log_tool_calls']
|
|
223
|
-
* ```
|
|
224
|
-
*/
|
|
225
|
-
hooks?: StandardAgentSpec.HookIds[];
|
|
226
|
-
}
|
|
90
|
+
- **Providers** — how to talk to LLM providers and model variants
|
|
91
|
+
- **Models** — named model definitions referencing a provider
|
|
92
|
+
- **Prompts** — system instructions plus the tools, subprompts, and subagents available at that step
|
|
93
|
+
- **Tools** — functions the agent can call to interact with the world (and other agents)
|
|
94
|
+
- **Agents** — bind it all together: name, type, sides, session bindings
|
|
95
|
+
- **Hooks** — custom code that intercepts lifecycle events (history filtering, message injection, tool result transforms)
|
|
96
|
+
- **Effects** — custom code scheduled to run later
|
|
97
|
+
- **Threads** — runtime instances of an agent, each with its own state, message history, and filesystem
|
|
98
|
+
- **Endpoints** — built-in and custom HTTP endpoints exposed by a thread
|
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## Architecture & decomposition
|
|
103
|
+
|
|
104
|
+
**Procedure (do this in order, every time):**
|
|
105
|
+
|
|
106
|
+
1. List the domains the system actually touches.
|
|
107
|
+
2. Draw the tree on paper or in a comment block. Coordinators on top, leaf agents at the bottom.
|
|
108
|
+
3. Pick interaction types. (Default `dual_ai`. Use `ai_human` only at chat-surface entry points.)
|
|
109
|
+
4. Name session boundaries for every `dual_ai` node.
|
|
110
|
+
5. Pick a model category for every node.
|
|
111
|
+
6. List the third-party APIs each leaf needs and queue them for research.
|
|
112
|
+
7. Map each capability to a `ThreadState` primitive before writing any custom plumbing.
|
|
113
|
+
|
|
114
|
+
Only then start writing files.
|
|
115
|
+
|
|
116
|
+
### Mega-agent smell test
|
|
117
|
+
|
|
118
|
+
If a single agent has **more than ~8 tools across unrelated domains**, decompose it. Decomposition is rarely wrong; flattening usually is.
|
|
119
|
+
|
|
120
|
+
Other smells that mean "decompose now":
|
|
121
|
+
|
|
122
|
+
- One prompt has tools from two or more clearly different worlds (e.g., `send_email`, `query_postgres`, and `generate_image`).
|
|
123
|
+
- The system prompt is trying to teach the model when to use which tool by writing rules in English. Rules-in-prose are coordinator logic; promote them to a coordinator that picks subagents.
|
|
124
|
+
- A model keeps calling the wrong tool because two tools have similar names or overlapping descriptions. Different domains, different agents.
|
|
125
|
+
- The agent's prompt is over ~150 lines. That's almost always context dilution — split.
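
The two numeric smells above (more than ~8 tools across unrelated domains, a prompt over ~150 lines) can be checked mechanically. The sketch below is only an illustration: `ToolInfo`, `megaAgentSmells`, and the `domain` labels are hypothetical names invented here, not types or functions from `@standardagents/spec`, and the thresholds are the rules of thumb from this section rather than hard limits.

```typescript
// Hypothetical shape for illustration; not a type from @standardagents/spec.
interface ToolInfo {
  name: string;
  domain: string; // a label you assign by hand, e.g. "email" or "database"
}

// Rules of thumb from this section, not limits enforced by any runtime.
const MAX_TOOLS_ACROSS_DOMAINS = 8;
const MAX_PROMPT_LINES = 150;

// Returns a human-readable list of smells; an empty list means none were found.
function megaAgentSmells(tools: ToolInfo[], promptLineCount: number): string[] {
  const smells: string[] = [];
  const domains = new Set(tools.map((tool) => tool.domain));

  // Many tools are only a smell when they span more than one domain.
  if (domains.size > 1 && tools.length > MAX_TOOLS_ACROSS_DOMAINS) {
    smells.push(`${tools.length} tools across ${domains.size} domains`);
  }
  if (promptLineCount > MAX_PROMPT_LINES) {
    smells.push(`prompt is ${promptLineCount} lines`);
  }
  return smells;
}

// Example: a flat agent mixing three unrelated domains.
const flatAgentTools: ToolInfo[] = [
  { name: "send_email", domain: "email" },
  { name: "read_email", domain: "email" },
  { name: "archive_email", domain: "email" },
  { name: "query_postgres", domain: "database" },
  { name: "insert_row", domain: "database" },
  { name: "run_migration", domain: "database" },
  { name: "generate_image", domain: "images" },
  { name: "resize_image", domain: "images" },
  { name: "caption_image", domain: "images" },
];

const flatAgentSmells = megaAgentSmells(flatAgentTools, 40);
```

A check like this cannot catch the other two smells (rules-in-prose and overlapping tool descriptions); those still need a human or reviewing agent to read the prompt.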
|
|
126
|
+
|
|
127
|
+
### Worked example: wrong vs. right
|
|
128
|
+
|
|
129
|
+
**Wrong** — one mega-agent with tools from unrelated domains:
|
|
130
|
+
|
|
131
|
+
```
|
|
132
|
+
personal_assistant (ai_human)
|
|
133
|
+
tools: send_gmail, read_gmail, search_ebay_listings, place_ebay_order,
|
|
134
|
+
track_ebay_shipment, create_calendar_event, list_calendar_events,
|
|
135
|
+
post_slack_message, read_slack_channel, query_stock_price,
|
|
136
|
+
execute_stock_trade, search_recipes, generate_image
|
|
227
137
|
```
|
|
228
138
|
|
|
229
|
-
|
|
139
|
+
Gmail, eBay, calendar, Slack, stock trading, recipes, image gen — seven unrelated worlds in one tool list. The model picks `send_gmail` when it meant `post_slack_message`. It calls `execute_stock_trade` with eBay listing IDs. The system prompt grows 80 lines of rules trying to teach it which tool belongs to which world. Every bug fix breaks two unrelated flows.
|
|
230
140
|
|
|
231
|
-
|
|
141
|
+
**Right** — coordinator + domain subagents:
|
|
232
142
|
|
|
233
|
-
```
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
|
|
143
|
+
```
|
|
144
|
+
personal_assistant_coordinator (dual_ai, reasoning model)
|
|
145
|
+
├── gmail_agent tools: send_gmail, read_gmail
|
|
146
|
+
├── ebay_agent tools: search_ebay_listings, place_ebay_order,
|
|
147
|
+
│ track_ebay_shipment
|
|
148
|
+
├── calendar_agent tools: create_calendar_event, list_calendar_events
|
|
149
|
+
├── slack_agent tools: post_slack_message, read_slack_channel
|
|
150
|
+
├── trading_agent tools: query_stock_price, execute_stock_trade
|
|
151
|
+
└── content_helpers subprompts: search_recipes, generate_image
|
|
240
152
|
```
|
|
241
153
|
|
|
242
|
-
|
|
154
|
+
`personal_assistant_coordinator` runs a reasoning model and decides which domain owns the next step. Each leaf is a fast tool-calling model with a tight tool set it can actually discern between. A fix to the eBay flow can't break Gmail.
|
|
155
|
+
|
|
156
|
+
### Coordinators
|
|
157
|
+
|
|
158
|
+
Coordinators provide two things flat agents can't:
|
|
159
|
+
|
|
160
|
+
1. **Inter-domain communication.** A `coding_coordinator` can ask `research_agent` to investigate a library, then hand the findings to `bash_agent` to run commands. The bash agent never needed to know the research agent existed.
|
|
161
|
+
2. **Filtering.** A `gmail_agent` can filter spam well on its own. But the higher-level objectives of the organization — "what email actually matters to the CEO of the sprinkler hose company" — belong to a coordinator above it. The coordinator filters again before escalating.
|
|
162
|
+
|
|
163
|
+
### Graph depth tradeoffs
|
|
164
|
+
|
|
165
|
+
Deeper graphs add latency and create more inter-agent communication that can fail. But they also enable parallelism and isolation. A `marketing_communications_agent` can have a `social_media_agent` which has `twitter_agent`, `linkedin_agent`, and `facebook_agent` as children. That subtree handles posting workflows entirely without involving the top coordinator. Aim for depth that matches the natural hierarchy of the work, not depth for its own sake.
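
To make the depth tradeoff concrete while planning, it can help to write the tree down as data and measure it. This is a minimal sketch under stated assumptions: `AgentNode`, `graphDepth`, and `leafCount` are hypothetical planning helpers made up for this example, not part of the spec or the runtime. The tree is the marketing subtree described above.

```typescript
// Hypothetical planning type for illustration; not part of @standardagents/spec.
interface AgentNode {
  name: string;
  children?: AgentNode[];
}

// Depth counts levels of agents: a lone leaf has depth 1.
function graphDepth(node: AgentNode): number {
  const childDepths = (node.children ?? []).map((child) => graphDepth(child));
  return 1 + Math.max(0, ...childDepths);
}

// Leaves are the agents with no children of their own.
function leafCount(node: AgentNode): number {
  const children = node.children ?? [];
  if (children.length === 0) return 1;
  return children.reduce((sum, child) => sum + leafCount(child), 0);
}

const marketing: AgentNode = {
  name: "marketing_communications_agent",
  children: [
    {
      name: "social_media_agent",
      children: [
        { name: "twitter_agent" },
        { name: "linkedin_agent" },
        { name: "facebook_agent" },
      ],
    },
  ],
};

const marketingDepth = graphDepth(marketing);
const marketingLeaves = leafCount(marketing);
```

If the measured depth is larger than the natural hierarchy of the work suggests, that is a hint that some intermediate coordinator is not earning its extra hop.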
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
## Interaction type: `dual_ai` is the default
|
|
170
|
+
|
|
171
|
+
> **`dual_ai` is the default agent shape. `ai_human` is the exception**, used only when the thread itself is the chat surface — chat UI, website widget, AgentBuilder admin, or an API the user is talking to directly.
|
|
172
|
+
|
|
173
|
+
For **Slack, email, SMS, Discord, webhooks, polled inboxes, schedulers, or any other mediated channel**, the human is just a tool target. The agent on the framework side is `dual_ai`. The whole graph being `dual_ai` is fine — often cleaner.
|
|
174
|
+
|
|
175
|
+
### The single decision
|
|
176
|
+
|
|
177
|
+
Ask: **"Is the human typing directly into this thread?"**
|
|
178
|
+
|
|
179
|
+
- **Yes** → the top-level agent is `ai_human`.
|
|
180
|
+
- **No** → the top-level agent is `dual_ai`. The human enters the graph via a tool somewhere (e.g., a `send_slack_message` call inside a slack subagent).
|
|
181
|
+
|
|
182
|
+
Do not default to `ai_human` just because a human is eventually involved. Ask where messages physically arrive.
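
The decision above reduces to a lookup on where messages physically arrive. The following sketch assumes a hand-written `Channel` label for each entry point; the labels, `CHAT_SURFACES`, and `topLevelInteractionType` are hypothetical names for this example only. The two result values, `ai_human` and `dual_ai`, are the interaction types this guide describes.

```typescript
type InteractionType = "ai_human" | "dual_ai";

// Hypothetical labels for where messages arrive; adjust to your project.
type Channel =
  | "chat_ui"
  | "website_widget"
  | "admin_ui"
  | "direct_api_chat"
  | "slack"
  | "email"
  | "sms"
  | "discord"
  | "webhook"
  | "polled_inbox"
  | "scheduler";

// Channels where the human types directly into the thread.
const CHAT_SURFACES: ReadonlySet<Channel> = new Set<Channel>([
  "chat_ui",
  "website_widget",
  "admin_ui",
  "direct_api_chat",
]);

// Everything that is not a chat surface is a mediated channel: the human is
// reached through a tool call, so the top-level agent is dual_ai.
function topLevelInteractionType(channel: Channel): InteractionType {
  return CHAT_SURFACES.has(channel) ? "ai_human" : "dual_ai";
}

const slackType = topLevelInteractionType("slack");
const widgetType = topLevelInteractionType("website_widget");
```

The default branch is deliberately `dual_ai`: a new mediated channel added to `Channel` later falls on the safe side without any further change.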
|
|
183
|
+
|
|
184
|
+
### The two shapes
|
|
185
|
+
|
|
186
|
+
**Shape A — chat surface (`ai_human` at top):**
|
|
243
187
|
|
|
244
|
-
```
|
|
245
|
-
|
|
246
|
-
|
|
247
|
-
toolDescription: 'Handles customer questions about shipping.',
|
|
248
|
-
prompt: [
|
|
249
|
-
{ type: 'include', prompt: 'company_tone' }, // Reference the tone prompt as a part
|
|
250
|
-
{ type: 'text', content: `Details about the products:...` }
|
|
251
|
-
],
|
|
252
|
-
model: 'tiny_model',
|
|
253
|
-
};
|
|
188
|
+
```
|
|
189
|
+
website_chatbot (ai_human) ← user types in a chat widget
|
|
190
|
+
tools: search_products, lookup_order, escalate_to_human
|
|
254
191
|
```
|
|
255
192
|
|
|
256
|
-
|
|
193
|
+
**Shape B — mediated (`dual_ai` everywhere):**
|
|
257
194
|
|
|
258
|
-
```
|
|
259
|
-
|
|
260
|
-
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
prompt: [
|
|
264
|
-
{ type: 'text', content: `You are an assistant for an ecommerce business. You help customers with their questions about products, inventory, and orders. Here is the current product inventory: ` },
|
|
265
|
-
{ type: 'env', property: 'PRODUCT_INVENTORY' }, // Reference the PRODUCT_INVENTORY env variable
|
|
266
|
-
],
|
|
267
|
-
model: 'tiny_model',
|
|
268
|
-
variables: [
|
|
269
|
-
{
|
|
270
|
-
/** Environment variable/property name */
|
|
271
|
-
name: 'PRODUCT_INVENTORY',
|
|
272
|
-
/** Value type: 'text' or 'secret' */
|
|
273
|
-
type: 'text',
|
|
274
|
-
/** Whether this variable is required to execute */
|
|
275
|
-
required: true,
|
|
276
|
-
/**
|
|
277
|
-
* Whether this variable is scoped to the declarer agent subtree.
|
|
278
|
-
*
|
|
279
|
-
* Scoped variables do not inherit parent thread env values. Descendants of
|
|
280
|
-
* the declarer still inherit scoped values from that declarer thread.
|
|
281
|
-
*
|
|
282
|
-
* @default false
|
|
283
|
-
*/
|
|
284
|
-
scoped: false,
|
|
285
|
-
/** Human-readable description (empty string when not provided) */
|
|
286
|
-
description: 'The full inventory of products, including names, descriptions, and stock levels.',
|
|
287
|
-
}
|
|
288
|
-
]
|
|
289
|
-
}
|
|
195
|
+
```
|
|
196
|
+
slack_research_assistant (dual_ai, reasoning model)
|
|
197
|
+
├── slack_agent (dual_ai, resumable) ← messages arrive via tool calls,
|
|
198
|
+
│ not as thread messages
|
|
199
|
+
└── research_agent (dual_ai)
|
|
290
200
|
```
|
|
291
201
|
|
|
292
|
-
|
|
202
|
+
In Shape B, side_a of `slack_research_assistant` plans the work and dispatches subagents. Side_b reviews and decides when the work is done. The "user" is a Slack channel reachable through `slack_agent`'s tools.
|
|
293
203
|
|
|
294
|
-
|
|
204
|
+
### Handoffs (a special case of `ai_human`)
|
|
295
205
|
|
|
296
|
-
|
|
206
|
+
When an `ai_human` agent calls another `ai_human` agent as a tool, the runtime does **not** spawn a new thread — it changes which prompt "owns" the existing thread. This is a *handoff*. Useful for stepwise human flows: an `onboarding_agent` collects information, then hands off to a `scheduling_agent` that books a meeting. The user keeps talking to the same thread.
|
|
297
207
|
|
|
298
|
-
|
|
208
|
+
### Subprompts vs. subagents vs. handoffs
|
|
299
209
|
|
|
300
|
-
|
|
210
|
+
- **Subprompt** — one LLM step exposed as a tool. Use to switch models for a focused task: image generation, polished writing, JSON extraction.
|
|
211
|
+
- **Subagent** — full child thread with its own `ThreadState`. Use when you need iteration, QA, reflection, or long-lived addressable behavior. Always `dual_ai`.
|
|
212
|
+
- **Handoff** — `ai_human` → `ai_human`, swaps prompt ownership of the same thread. Use for stepwise human-driven flows.
|
|
301
213
|
|
|
302
|
-
|
|
214
|
+
A subagent can be:
|
|
303
215
|
|
|
376
|
-
/**
|
|
377
|
-
* Scoped env name whose value may be used as the safe per-instance
|
|
378
|
-
* description hint for child bootstrap.
|
|
379
|
-
*/
|
|
380
|
-
descriptionEnv?: string;
|
|
381
|
-
|
|
382
|
-
/**
|
|
383
|
-
* Scoped env names that should be copied into the child thread for each
|
|
384
|
-
* immediate instance group.
|
|
385
|
-
*/
|
|
386
|
-
scopedEnv?: string[];
|
|
387
|
-
};
|
|
388
|
-
|
|
389
|
-
/**
|
|
390
|
-
* Optional branch flag env name.
|
|
391
|
-
*
|
|
392
|
-
* When set, this subagent is only enabled when the named env resolves to
|
|
393
|
-
* `true`, `1`, or `yes` (case-insensitive).
|
|
394
|
-
*/
|
|
395
|
-
optional?: string;
|
|
396
|
-
|
|
397
|
-
/**
|
|
398
|
-
* Resumability configuration.
|
|
399
|
-
*
|
|
400
|
-
* - `false` (default): Non-resumable subagent
|
|
401
|
-
* - Object: Resumable subagent with message routing and instance limits
|
|
402
|
-
*
|
|
403
|
-
* When resumable mode is enabled, runtimes SHOULD provide a built-in create
|
|
404
|
-
* and message lifecycle interface instead of exposing raw agent callables for
|
|
405
|
-
* new instance creation.
|
|
406
|
-
*/
|
|
407
|
-
resumable?:
|
|
408
|
-
| false
|
|
409
|
-
| {
|
|
410
|
-
/**
|
|
411
|
-
* Which side of the child `dual_ai` conversation receives parent messages.
|
|
412
|
-
*
|
|
413
|
-
* - `side_a`: Messages are queued as `role: 'user'`
|
|
414
|
-
* - `side_b`: Messages are queued as `role: 'assistant'`
|
|
415
|
-
*/
|
|
416
|
-
receives_messages: 'side_a' | 'side_b';
|
|
417
|
-
|
|
418
|
-
/**
|
|
419
|
-
* Maximum concurrent instances for this subagent tool.
|
|
420
|
-
*
|
|
421
|
-
* When reached, implementations may remove this tool from subsequent LLM
|
|
422
|
-
* requests and route new messages to existing instances.
|
|
423
|
-
*
|
|
424
|
-
* @default unlimited
|
|
425
|
-
*/
|
|
426
|
-
maxInstances?: number;
|
|
427
|
-
|
|
428
|
-
/**
|
|
429
|
-
* How this child reports back to its parent.
|
|
430
|
-
*
|
|
431
|
-
* - `implicit` (default): Child completion is automatically queued to the parent.
|
|
432
|
-
* - `explicit`: The runtime does not auto-queue child completion; tools/hooks may
|
|
433
|
-
* use thread APIs such as `state.notifyParent()` when they choose to escalate.
|
|
434
|
-
*/
|
|
435
|
-
parentCommunication?: 'implicit' | 'explicit';
|
|
436
|
-
};
|
|
437
|
-
} |
|
|
438
|
-
{
|
|
439
|
-
/**
|
|
440
|
-
* Name of the tool (custom tool or provider tool).
|
|
441
|
-
*/
|
|
442
|
-
name: StandardAgentSpec.Callables;
|
|
443
|
-
|
|
444
|
-
/**
|
|
445
|
-
* Environment variable values for this tool.
|
|
446
|
-
*/
|
|
447
|
-
env?: Record<string, string>;
|
|
448
|
-
/**
|
|
449
|
-
* @deprecated Use `env` instead.
|
|
450
|
-
*/
|
|
451
|
-
tenvs?: Record<string, unknown>;
|
|
452
|
-
|
|
453
|
-
/**
|
|
454
|
-
* Static options for this tool.
|
|
455
|
-
* Passed to the tool handler at execution time.
|
|
456
|
-
*/
|
|
457
|
-
options?: Record<string, unknown>;
|
|
458
|
-
} | {
|
|
459
|
-
/**
|
|
460
|
-
* Name of the sub-prompt or agent to call.
|
|
461
|
-
* Must be a prompt defined in agents/prompts/ or an agent in agents/agents/.
|
|
462
|
-
*/
|
|
463
|
-
name: T;
|
|
464
|
-
|
|
465
|
-
/**
|
|
466
|
-
* Include text response content from sub-prompt execution in the result string.
|
|
467
|
-
* @default true
|
|
468
|
-
*/
|
|
469
|
-
includeTextResponse?: boolean;
|
|
470
|
-
|
|
471
|
-
/**
|
|
472
|
-
* Serialize tool calls made by the sub-prompt (and their results) into the result string.
|
|
473
|
-
* @default true
|
|
474
|
-
*/
|
|
475
|
-
includeToolCalls?: boolean;
|
|
476
|
-
|
|
477
|
-
/**
|
|
478
|
-
* Serialize any errors from the sub-prompt into the result string.
|
|
479
|
-
* @default true
|
|
480
|
-
*/
|
|
481
|
-
includeErrors?: boolean;
|
|
482
|
-
|
|
483
|
-
/**
|
|
484
|
-
* Property from the tool call arguments to use as the initial user message
|
|
485
|
-
* when invoking the sub-prompt or agent.
|
|
486
|
-
*
|
|
487
|
-
* Autocompletes to fields from the prompt's requiredSchema (or agent's side_a prompt schema).
|
|
488
|
-
*
|
|
489
|
-
* @example
|
|
490
|
-
* If the tool is called with `{ query: "search term", limit: 10 }` and
|
|
491
|
-
* `initUserMessageProperty: 'query'`, the sub-prompt will receive
|
|
492
|
-
* "search term" as the initial user message.
|
|
493
|
-
*/
|
|
494
|
-
initUserMessageProperty?: StandardAgentSpec.SchemaFields<T>;
|
|
495
|
-
|
|
496
|
-
/**
|
|
497
|
-
* Property containing attachment path(s) to include as multimodal content
|
|
498
|
-
* when invoking the sub-prompt or agent.
|
|
499
|
-
*
|
|
500
|
-
* Autocompletes to fields from the prompt's requiredSchema (or agent's side_a prompt schema).
|
|
501
|
-
* Supports both a single path string or an array of paths.
|
|
502
|
-
*
|
|
503
|
-
* @example
|
|
504
|
-
* If the tool is called with `{ image: "/attachments/123.jpg" }` and
|
|
505
|
-
* `initAttachmentsProperty: 'image'`, the sub-prompt will receive
|
|
506
|
-
* the image as an attachment in the user message.
|
|
507
|
-
*
|
|
508
|
-
* @example
|
|
509
|
-
* If the tool is called with `{ images: ["/attachments/a.jpg", "/attachments/b.jpg"] }` and
|
|
510
|
-
* `initAttachmentsProperty: 'images'`, the sub-prompt will receive
|
|
511
|
-
* both images as attachments.
|
|
512
|
-
*/
|
|
513
|
-
initAttachmentsProperty?: StandardAgentSpec.SchemaFields<T>;
|
|
514
|
-
} |
|
|
515
|
-
string /* simple tool name with no extra options */;
|
|
516
|
-
}
|
|
216
|
+
- **Blocking + non-resumable** — the parent waits, the child runs once, returns, and is gone (tool-call style)
|
|
217
|
+
- **Blocking + resumable** — the parent waits but the child remains addressable for future calls
|
|
218
|
+
- **Non-blocking + non-resumable** — fire-and-forget one-shot
|
|
219
|
+
- **Non-blocking + resumable** — long-lived addressable child the parent can re-message later (e.g., a Slack monitor that lives as long as the parent)
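
The four combinations above come from two independent choices, which the sketch below spells out. It is a simplification for planning purposes: the boolean `blocking` and `resumable` fields and the `describeSubagentMode` helper are hypothetical, and the real subagent tool configuration in the spec may express these options differently (check the canonical types before copying anything).

```typescript
// Hypothetical planning shape; the actual spec config may differ.
interface SubagentMode {
  blocking: boolean; // does the parent wait for the child's result?
  resumable: boolean; // can the parent message the same child again later?
}

function describeSubagentMode({ blocking, resumable }: SubagentMode): string {
  const wait = blocking
    ? "parent waits for the result"
    : "parent continues without waiting";
  const lifetime = resumable
    ? "child stays addressable for later messages"
    : "child runs once and is gone";
  return `${wait}; ${lifetime}`;
}

// A long-lived monitor, like the Slack example above.
const monitorMode = describeSubagentMode({ blocking: false, resumable: true });

// A plain tool-call-style subagent.
const oneShotMode = describeSubagentMode({ blocking: true, resumable: false });
```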
|
|
220
|
+
|
|
221
|
+
---
|
|
222
|
+
|
|
223
|
+
## Session boundary discipline
|
|
224
|
+
|
|
225
|
+
This section is a hard rule, not a suggestion. **Every `dual_ai` agent in the graph must explicitly define its session boundaries.** This is what makes whole-graph `dual_ai` safe.
|
|
226
|
+
|
|
227
|
+
### Required for every `dual_ai` agent
|
|
228
|
+
|
|
229
|
+
- **`sessionStop` tool binding** — names the tool call that ends the session with a result. Side_a or side_b invokes it when the work is done. Common patterns: `report_findings`, `submit_to_parent`, `final_answer`, `done`.
|
|
230
|
+
- **`sessionFail` tool binding** — names the tool call that ends the session with a failure the parent should see. Common patterns: `escalate_blocked`, `report_unresolvable`, `give_up_with_reason`.
|
|
231
|
+
- **`maxSessionTurns`** — a finite integer sized to the realistic upper bound for the task. Never omit this. A research subagent might be 20; an asset QA loop might be 6; a tight reflection loop might be 3.
|
|
232
|
+
- **A real job for side_b** — reflection, QA, judging, alternative-perspective driving, devil's advocate. If side_b is "another instance of side_a," you don't have a `dual_ai` agent — collapse it to a single-side prompt or a coordinator pattern.
|
|
233
|
+
- **Cross-provider QA** — when side_b reviews side_a's output, pick a model from a *different provider* than side_a. Same-lab models systematically over-rate their own work.
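
The checklist above can be turned into a design-time check that runs before any agent file is written. This is a sketch, not the runtime's validation: `DualAiPlan` and `boundaryProblems` are hypothetical names, the field layout is an assumption made for this example (only `sessionStop`, `sessionFail`, and `maxSessionTurns` are names taken from this guide), and the provider strings are placeholders.

```typescript
// Hypothetical planning record; not the agent definition type from the spec.
interface DualAiPlan {
  name: string;
  sessionStop?: string; // tool that ends the session with a result
  sessionFail?: string; // tool that ends the session with a failure
  maxSessionTurns?: number;
  sideBJob?: string; // one sentence: what side_b is for
  sideBReviewsSideA: boolean;
  sideAProvider: string;
  sideBProvider: string;
}

// Returns every rule from the checklist that the plan violates.
function boundaryProblems(plan: DualAiPlan): string[] {
  const problems: string[] = [];
  if (!plan.sessionStop) problems.push("no sessionStop tool named");
  if (!plan.sessionFail) problems.push("no sessionFail tool named");

  const turns = plan.maxSessionTurns;
  if (turns === undefined || !Number.isInteger(turns) || turns <= 0) {
    problems.push("maxSessionTurns must be a finite positive integer");
  }
  if (!plan.sideBJob) problems.push("side_b has no stated job");
  if (plan.sideBReviewsSideA && plan.sideAProvider === plan.sideBProvider) {
    problems.push("side_b reviews side_a but uses the same provider");
  }
  return problems;
}

const assetQa: DualAiPlan = {
  name: "asset_qa_agent",
  sessionStop: "submit_video_assets",
  sessionFail: "escalate_blocked",
  maxSessionTurns: 6,
  sideBJob: "review each render and approve or reject it",
  sideBReviewsSideA: true,
  sideAProvider: "provider_x", // placeholder
  sideBProvider: "provider_y", // placeholder
};

const undesigned: DualAiPlan = {
  name: "undesigned_agent",
  sideBReviewsSideA: true,
  sideAProvider: "provider_x",
  sideBProvider: "provider_x",
};

const assetQaProblems = boundaryProblems(assetQa);
const undesignedProblems = boundaryProblems(undesigned);
```

An empty result does not prove the design is good; it only shows that each boundary has been named, which is the minimum this section asks for.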
|
|
234
|
+
|
|
235
|
+
### The smell test
|
|
236
|
+
|
|
237
|
+
> If you cannot articulate in one sentence what tool call ends the session, the agent is not designed yet.
|
|
238
|
+
|
|
239
|
+
Examples that pass:
|
|
240
|
+
|
|
241
|
+
- "Side_a calls `submit_video_assets` once side_b approves the renders."
|
|
242
|
+
- "Side_b calls `report_findings` after the research turn count exceeds 5 or it judges the topic exhausted."
|
|
243
|
+
- "Either side calls `escalate_to_parent` if the customer policy question is unresolvable."
|
|
244
|
+
|
|
245
|
+
Examples that fail:
|
|
246
|
+
|
|
247
|
+
- "It just stops when it's done." (How does the runtime know?)
|
|
248
|
+
- "Whichever side decides." (Decides by calling *what*?)
|
|
249
|
+
|
|
250
|
+
### Why this matters
|
|
251
|
+
|
|
252
|
+
A `dual_ai` agent with no `sessionStop`, no `sessionFail`, and no `maxSessionTurns` will burn turns until it hits a runtime cap, then fail in a way the parent can't interpret. Coding agents that have been bitten by this once will start defaulting to `ai_human` to "feel safer" — and then we're back to mega-agents and chat-surface confusion. Boundary discipline is what keeps `dual_ai` the default.
|
|
253
|
+
|
|
254
|
+
---
|
|
255
|
+
|
|
256
|
+
## Model selection
|
|
257
|
+
|
|
258
|
+
**Rule: never write a model string the user did not request and `current-models` did not produce.**
|
|
259
|
+
|
|
260
|
+
### Required workflow
|
|
261
|
+
|
|
262
|
+
1. Run `pnpm exec agents current-models` to see the curated category list (e.g. `extra_reasoning`, `fast_tool_calls`, `writing`, `image_generation`, `tiny`).
|
|
263
|
+
2. Choose the category that fits the role (table below).
|
|
264
|
+
3. Run `pnpm exec agents available-models --provider=<name>` to confirm the exact provider model string.
|
|
265
|
+
4. Define the model in `agents/models/<name>.ts` using `defineModel`.
|
|
266
|
+
|
|
267
|
+
If you have multiple configured providers, pass `--provider=<name>` explicitly.
|
|
268
|
+
|
|
269
|
+
### Role → category mapping
|
|
270
|
+
|
|
271
|
+
| Role | Category | Notes |
|
|
272
|
+
|---|---|---|
|
|
273
|
+
| Top-level coordinator | `extra_reasoning` | The discerning brain. Don't skimp here. |
|
|
274
|
+
| Domain subagent (Gmail, Slack, etc.) | `fast_tool_calls` | High volume, narrow tool set, latency matters. |
|
|
275
|
+
| Eloquent text generation | `writing` | Use as a subprompt from a fast-tool-calling agent. |
|
|
276
|
+
| Image generation | `image_generation` | Use as a subprompt. |
|
|
277
|
+
| QA / reviewer side_b | `extra_reasoning` from a *different provider* than side_a | Same-lab QA is biased. |
|
|
278
|
+
| Cheap classification, tagging, routing | `tiny` | Where speed and cost dominate quality. |
|
|
279
|
+
|
|
280
|
+
### Fallback strategy
|
|
281
|
+
|
|
282
|
+
Define more than one model per category, on different providers. The model `name` should describe the *use case*, not the provider, so a fallback can substitute transparently:
|
|
283
|
+
|
|
284
|
+
```
|
|
285
|
+
agents/models/extra_reasoning.ts → primary (provider A)
|
|
286
|
+
agents/models/extra_reasoning_fallback.ts → secondary (provider B)
|
|
517
287
|
```
|
|
518
288
|
|
|
519
|
-
|
|
289
|
+
Prompts and agents reference `extra_reasoning`; if the primary provider is down, the fallback is one rename away.
|
|
520
290
|
|
|
291
|
+
For the authoritative list of which models currently fill each category, see `agents/models/AGENTS.md` in your project (created by `agents scaffold`). Do not embed the list here — it drifts.
|
|
521
292
|
|
|
522
|
-
|
|
293
|
+
---
|
|
523
294
|
|
|
524
|
-
|
|
295
|
+
## API research checklist
|
|
296
|
+
|
|
297
|
+
This is the single biggest gap in most coding-agent-built tools. **Do not write a `defineTool` that touches a third-party API from memory.** APIs change. Auth flows change. Endpoints get deprecated. Read the current docs every time.
|
|
298
|
+
|
|
299
|
+
### Before writing the tool
|
|
300
|
+
|
|
301
|
+
1. **Fetch the official current docs.** Use WebFetch, the read-website skill, or a browser. Confirm:
|
|
302
|
+
- Auth method (bearer, OAuth, signed request, API key in header vs. query)
|
|
303
|
+
- Base URL and current API version
|
|
304
|
+
- Endpoint paths and HTTP methods
|
|
305
|
+
- Request payload shape (and required vs. optional fields)
|
|
306
|
+
- Response payload shape (success and error)
|
|
307
|
+
- Rate limits and `Retry-After` handling
|
|
308
|
+
- Pagination model (cursor, offset, link header)
|
|
309
|
+
- Idempotency rules — does retrying duplicate side effects?
|
|
310
|
+
2. **Confirm SDK availability and Workers compatibility.** Is there a first-party JS/TS SDK? Does it run in Cloudflare Workers (no Node built-ins, no `fs`, no native modules, no long-lived sockets)? If not, use `fetch` directly. Most provider SDKs are not Workers-compatible out of the box.
|
|
311
|
+
3. **Identify required secrets.** Declare them as `secret` variables, never `text`. Secrets must never be referenced in prompt text or returned to the model.
|
|
312
|
+
4. **Map error modes explicitly.** Each gets handled, not ignored:
|
|
313
|
+
- `401` / `403` — auth failed. Surface a clear message; do not retry blindly.
|
|
314
|
+
- `404` — missing resource. Often a user-facing error; surface it.
|
|
315
|
+
- `409` — conflict. Often means the operation already happened; check before retrying.
|
|
316
|
+
- `429` — rate limited. Honor `Retry-After`. Retry with backoff.
|
|
317
|
+
- `5xx` — server error. Retry with exponential backoff, finite cap.
|
|
318
|
+
- `4xx` (other) — surface to the model so it can correct its arguments.
|
|
319
|
+
5. **Prototype the raw call** in isolation — a single `fetch` against the real endpoint with a real token. Confirm the response shape *as observed*, not as documented. Docs lag.
|
|
320
|
+
6. **Return shapes the model can actually use.** Strip noise. Surface IDs, names, statuses, and the fields the next step needs. Don't return the entire 12 KB JSON blob and hope the model picks the right field.
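
The error-mode mapping in step 4 can be sketched as a pure decision function that a tool calls after each `fetch`. This is an illustration under stated assumptions: `planForStatus`, `ErrorAction`, the attempt cap of 4, and the 500 ms base delay are all hypothetical choices, not values from the spec or from any provider.

```typescript
// Outcome of inspecting one HTTP response. All names here are illustrative.
type ErrorAction =
  | { kind: 'ok' }
  | { kind: 'retry'; delayMs: number }
  | { kind: 'surface'; reason: string };

const MAX_ATTEMPTS = 4; // assumed finite retry cap
const BASE_DELAY_MS = 500; // assumed base for exponential backoff

// `attempt` is 1-based. `retryAfterSeconds` is the parsed Retry-After header, if any.
function planForStatus(status: number, attempt: number, retryAfterSeconds?: number): ErrorAction {
  if (status >= 200 && status < 300) return { kind: 'ok' };
  const canRetry = attempt < MAX_ATTEMPTS;
  const backoffMs = BASE_DELAY_MS * 2 ** (attempt - 1);

  if (status === 429) {
    if (!canRetry) return { kind: 'surface', reason: 'rate limited; retries exhausted' };
    // Honor Retry-After when the server sent one, otherwise back off.
    const delayMs = retryAfterSeconds !== undefined ? retryAfterSeconds * 1000 : backoffMs;
    return { kind: 'retry', delayMs };
  }
  if (status >= 500) {
    return canRetry
      ? { kind: 'retry', delayMs: backoffMs }
      : { kind: 'surface', reason: `server error ${status}; retries exhausted` };
  }
  if (status === 401 || status === 403) {
    return { kind: 'surface', reason: 'auth failed; check credentials (not retried)' };
  }
  if (status === 404) return { kind: 'surface', reason: 'resource not found' };
  if (status === 409) {
    return { kind: 'surface', reason: 'conflict; the operation may already have happened' };
  }
  if (status >= 400) {
    return { kind: 'surface', reason: `request rejected (${status}); correct the arguments` };
  }
  return { kind: 'surface', reason: `unexpected status ${status}` };
}
```

Keeping the decision separate from the network call makes it easy to test each status without a live endpoint, and the `surface` reasons are short enough to return to the model as-is.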
|
|
321
|
+
|
|
322
|
+
> **Anti-pattern:** "I know the Slack API, I'll just write it." You don't. It changed last quarter. Read the docs.
|
|
525
323
|
|
|
526
|
-
|
|
324
|
+
---
|
|
527
325
|
|
|
528
|
-
|
|
326
|
+
## ThreadState first
|
|
529
327
|
|
|
530
|
-
|
|
531
|
-
- Blocking and resumable
|
|
532
|
-
- Non-blocking and non-resumable
|
|
533
|
-
- Non-blocking and resumable
|
|
328
|
+
> **Rule:** Before adding any dependency or external service, check whether `ThreadState` already provides the capability.
|
|
534
329
|
|
|
535
|
-
|
|
536
|
-
Blocking and non-blocking indicates whether the parent thread waits for the subagent to finish before it continues. A non-blocking subagent allows the parent thread to continue its work while the subagent is running, and the parent can receive messages from the subagent.
|
|
330
|
+
`ThreadState` is the unified API passed to every callable, hook, and endpoint. It abstracts the runtime — your tool may run on the edge, on a Worker, or on a Node server, and the same `state.readFile()` call works in all of them. Tools should **not** import `node:fs`, read `process.env`, or assume Node-shaped APIs.
|
|
537
331
|
|
|
538
|
-
|
|
332
|
+
### Capability lookup table
|
|
539
333
|
|
|
540
|
-
|
|
334
|
+
| What you need | Use this | Don't use |
|
|
335
|
+
|---|---|---|
|
|
336
|
+
| Store a file between turns | `state.writeFile` / `state.readFile` | S3, external blob store |
|
|
337
|
+
| Persist structured data across turns | `state.context` (in-memory) + `state.writeFile` JSON (durable) | External KV, Redis |
|
|
338
|
+
| Trigger work later | `state.scheduleEffect` | External cron, queue service |
|
|
339
|
+
| Invoke another tool from inside a tool | `state.invokeTool` / `state.queueTool` | Re-implementing tool logic inline |
|
|
340
|
+
| Read/write config and secrets | `state.env` / `state.setEnv` | `process.env`, `.env` files |
|
|
341
|
+
| Search files the thread has seen | `state.grepFiles` / `state.findFiles` | Reimplementing search |
|
|
342
|
+
| Escalate / report status to the parent | `state.notifyParent` / `state.setStatus` | Custom message bus |
|
|
343
|
+
| Load a sibling prompt / agent / model | `state.loadPrompt` / `state.loadAgent` / `state.loadModel` | Duplicating the definition |
|
|
344
|
+
| Inspect or message a child thread | `state.children` / `state.getChildThread` | External orchestration |
|
|
345
|
+
| Inject context the model should see | `state.injectMessage` / `state.queueMessage` | Stuffing the system prompt |
|
|
346
|
+
| Read the thread's message history | `state.getMessages` / `state.getMessage` | Reimplementing storage |
|
|
347
|
+
| Update an existing message | `state.updateMessage` | Mutating storage directly |
|
|
348
|
+
| Read execution logs | `state.getLogs` | External observability shim |
|
|
349
|
+
| Emit a runtime event | `state.emit` | Console logging |
|
|
350
|
+
| Stop the thread | `state.terminate` | Throwing and hoping |
|
|
541
351
|
|
|
542
|
-
|
|
352
|
+
### Method cheat sheet
|
|
543
353
|
|
|
606
|
-
*
|
|
607
|
-
* @example 'Handles customer support inquiries and resolves issues'
|
|
608
|
-
*/
|
|
609
|
-
description?: string;
|
|
610
|
-
|
|
611
|
-
/**
|
|
612
|
-
* Icon URL or absolute path for the agent.
|
|
613
|
-
* Absolute paths (starting with `/`) are converted to full URLs in API responses.
|
|
614
|
-
*
|
|
615
|
-
* @example 'https://example.com/icon.svg' or '/icons/support.svg'
|
|
616
|
-
*/
|
|
617
|
-
icon?: string;
|
|
618
|
-
|
|
619
|
-
/**
|
|
620
|
-
* Environment values provided by this agent.
|
|
621
|
-
* Agent values are lower priority than thread/account/instance values.
|
|
622
|
-
*/
|
|
623
|
-
env?: Record<string, string>;
|
|
624
|
-
|
|
625
|
-
// ============================================================================
|
|
626
|
-
// Package Metadata (for packing/unpacking)
|
|
627
|
-
// ============================================================================
|
|
628
|
-
|
|
629
|
-
/**
|
|
630
|
-
* npm package name for this agent when packed.
|
|
631
|
-
* Used by the packing system to maintain consistent package identity
|
|
632
|
-
* across pack/unpack cycles.
|
|
633
|
-
*
|
|
634
|
-
* @example 'standardagent-support-agent', '@myorg/support-agent'
|
|
635
|
-
*/
|
|
636
|
-
packageName?: string;
|
|
637
|
-
|
|
638
|
-
/**
|
|
639
|
-
* Package version (semver format).
|
|
640
|
-
* Used by the packing system to track versions across pack/unpack cycles.
|
|
641
|
-
* When re-packing, this version is auto-incremented by the pack modal.
|
|
642
|
-
*
|
|
643
|
-
* @example '1.0.0', '2.3.1-beta.1'
|
|
644
|
-
*/
|
|
645
|
-
version?: string;
|
|
646
|
-
|
|
647
|
-
/**
|
|
648
|
-
* Package author/copyright holder.
|
|
649
|
-
* Used by the packing system for the LICENSE file and package.json author field.
|
|
650
|
-
*
|
|
651
|
-
* @example 'John Doe', 'Acme Corp'
|
|
652
|
-
*/
|
|
653
|
-
author?: string;
|
|
654
|
-
|
|
655
|
-
/**
|
|
656
|
-
* License identifier (SPDX format).
|
|
657
|
-
* Used by the packing system for LICENSE file generation.
|
|
658
|
-
*
|
|
659
|
-
* @example 'MIT', 'Apache-2.0', 'ISC'
|
|
660
|
-
*/
|
|
661
|
-
license?: string;
|
|
662
|
-
|
|
663
|
-
/**
|
|
664
|
-
* Hook IDs to run for this agent.
|
|
665
|
-
* References hooks by their unique `id` property from defineHook().
|
|
666
|
-
* These run when prompts don't specify their own hooks.
|
|
667
|
-
*
|
|
668
|
-
* @example
|
|
669
|
-
* ```typescript
|
|
670
|
-
* hooks: ['log_messages', 'track_tool_usage']
|
|
671
|
-
* ```
|
|
672
|
-
*/
|
|
673
|
-
hooks?: StandardAgentSpec.HookIds[];
|
|
674
|
-
}
|
|
354
|
+
Discoverability reference. For full signatures, read the spec types from `node_modules/@standardagents/spec/dist/` (or browse `packages/spec/src/` on GitHub). Worked examples live in `agents/tools/AGENTS.md`.
|
|
355
|
+
|
|
356
|
+
```
|
|
357
|
+
Identity threadId, agentId, userId, createdAt, children, terminated
|
|
358
|
+
Messages getMessages, getMessage, injectMessage, queueMessage, updateMessage
|
|
359
|
+
Logs getLogs
|
|
360
|
+
Resources loadModel, loadPrompt, loadAgent,
|
|
361
|
+
getChildThread, getParentThread,
|
|
362
|
+
getPromptNames, getAgentNames, getModelNames
|
|
363
|
+
Env env, setEnv
|
|
364
|
+
Parent notifyParent, setStatus
|
|
365
|
+
Tools queueTool, invokeTool
|
|
366
|
+
Effects scheduleEffect, getScheduledEffects, removeScheduledEffect
|
|
367
|
+
Events emit
|
|
368
|
+
Context context (Record<string, unknown>, in-memory only)
|
|
369
|
+
Files writeFile, readFile, readFileStream, statFile, readdirFile,
|
|
370
|
+
unlinkFile, mkdirFile, rmdirFile, getFileStats,
|
|
371
|
+
grepFiles, findFiles, getFileThumbnail
|
|
372
|
+
Execution execution, terminate
|
|
373
|
+
```
|
|
374
|
+
|
|
375
|
+
### Notes on a few that are easy to misuse
|
|
376
|
+
|
|
377
|
+
- **`state.context`** is in-memory for the *current execution*. It is not durable across thread restarts. For durable structured state, write a JSON file with `state.writeFile`.
|
|
378
|
+
- **`state.scheduleEffect`** runs a named effect after a delay. It survives restarts. This is your cron, your queue, and your retry timer all in one.
|
|
379
|
+
- **`state.invokeTool` vs `state.queueTool`** — `invokeTool` runs synchronously and returns the result; `queueTool` schedules the call to run later in the normal tool-call flow. Prefer `queueTool` when the model should see the result as a regular tool call.
|
|
380
|
+
- **`state.notifyParent`** — for resumable subagents with `parentCommunication: 'explicit'`, this is the only way the child talks to the parent. Use it sparingly; every notification interrupts the parent.
|
|
381
|
+
- **File attachments** use the path convention `/attachments/{filename}.{ext}`. Always use this path when passing files between agents — the runtime copies them across thread filesystems automatically.
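
To make the `state.context` versus `state.writeFile` distinction concrete, here is a minimal sketch of the pure part of a durable-state helper. It deliberately does not call `ThreadState`, because the exact `readFile` / `writeFile` signatures are not shown in this guide. `ResearchProgress`, `parseProgress`, `recordVisit`, and `serializeProgress` are hypothetical names. The assumed flow is: read the JSON file through `state.readFile`, pass its text (or `null` if the file is missing) to `parseProgress`, update it, then hand the output of `serializeProgress` to `state.writeFile`.

```typescript
// Hypothetical structured state for a research subagent.
interface ResearchProgress {
  topic: string;
  visitedUrls: string[];
}

// Parse previously stored JSON, falling back to an empty record when the
// file is missing or corrupt so a bad write never wedges the thread.
function parseProgress(raw: string | null, topic: string): ResearchProgress {
  const empty: ResearchProgress = { topic, visitedUrls: [] };
  if (raw === null) return empty;
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data === 'object' && data !== null && 'visitedUrls' in data) {
      const urls = (data as { visitedUrls: unknown }).visitedUrls;
      if (Array.isArray(urls)) {
        return { topic, visitedUrls: urls.filter((u): u is string => typeof u === 'string') };
      }
    }
  } catch {
    // Corrupt JSON: start over rather than throw inside the tool.
  }
  return empty;
}

// Return a new record with the URL added once (no duplicates, no mutation).
function recordVisit(progress: ResearchProgress, url: string): ResearchProgress {
  if (progress.visitedUrls.includes(url)) return progress;
  return { ...progress, visitedUrls: [...progress.visitedUrls, url] };
}

// Text to persist with state.writeFile.
function serializeProgress(progress: ResearchProgress): string {
  return JSON.stringify(progress);
}
```

The same record can also be cached in `state.context` for the current execution, as long as the file remains the source of truth after a restart.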
|
|
382
|
+
|
|
383
|
+
---
|
|
384
|
+
|
|
385
|
+
## Tools
|
|
386
|
+
|
|
387
|
+
A "tool" is anything an agent can call. There are three kinds:
|
|
388
|
+
|
|
389
|
+
1. **Callables** — TypeScript functions defined via `defineTool` in `agents/tools/`. The primary way to interface with the outside world: APIs, databases, business logic, and (sometimes) other agents. Each callable receives a `ThreadState`. Do not assume Node APIs are available — your code may run on the edge.
|
|
390
|
+
2. **Subprompts** — prompts exposed as tools via their `toolDescription`. A single-step LLM call. Use for switching models on a focused task (image generation, polished writing, JSON extraction).
|
|
391
|
+
3. **Subagents** — full agents exposed as tools via `exposeAsTool: true` on the agent definition. Use when you need iteration, QA, reflection, or long-lived addressable behavior. Always `dual_ai`.
|
|
392
|
+
|
|
393
|
+
### `PromptDefinition` cheat sheet
|
|
394
|
+
|
|
395
|
+
A prompt is what actually gets sent to the LLM at one step. Set on each prompt file in `agents/prompts/`. For full signatures, read the spec types from `node_modules/@standardagents/spec/dist/` (or browse `packages/spec/src/` on GitHub), and see `agents/prompts/AGENTS.md`.
|
|
396
|
+
|
|
397
|
+
```
|
|
398
|
+
PromptDefinition
|
|
399
|
+
name string unique snake_case identifier
|
|
400
|
+
toolDescription string shown when this prompt is exposed as a tool
|
|
401
|
+
prompt string | PromptContent[] system prompt (string, or composable parts)
|
|
402
|
+
model ModelName references agents/models/<name>
|
|
403
|
+
includeChat boolean (default false) pass full chat history to this LLM step
|
|
404
|
+
includePastTools boolean (default false) pass past tool call results
|
|
405
|
+
parallelToolCalls boolean (default false) allow multiple tool calls per turn
|
|
406
|
+
toolChoice 'auto' | 'none' | 'required' tool calling strategy (default 'auto')
|
|
407
|
+
requiredSchema ZodSchema validate args when called as a tool
|
|
408
|
+
variables VariableDefinition[] declared text/secret variables
|
|
409
|
+
tools (string | SubpromptConfig | PromptToolConfig | SubagentToolConfig)[]
|
|
410
|
+
tools available at this step
|
|
411
|
+
env Record<string, string> prompt-level env values (lowest precedence)
|
|
412
|
+
reasoning ReasoningConfig extended thinking config (for models that support it)
|
|
413
|
+
recentImageThreshold number (default 10) how many recent messages keep real images
|
|
414
|
+
providerOptions Record<string, unknown> passthrough to the provider (overrides model defaults)
|
|
415
|
+
hooks HookId[] prompt-scoped hooks (overrides agent hooks)
|
|
675
416
|
```
|
|
676
417
|
|
|
677
|
-
|
|
418
|
+
### Composable prompts: the `tone` pattern
|
|
678
419
|
|
|
679
|
-
|
|
420
|
+
`prompt` can be a string or an array of parts. Use parts to compose a shared "tone" or "persona" across many prompts so changes flow through one place.
|
|
680
421
|
|
|
681
|
-
|
|
422
|
+
```ts
|
|
423
|
+
const tonePrompt: PromptDefinition = {
|
|
424
|
+
name: 'company_tone',
|
|
425
|
+
toolDescription: 'Defines the tone and style of the chatbot.',
|
|
426
|
+
prompt: `You are a friendly and helpful customer support assistant for an athletic shoe company. You always respond in a positive and upbeat tone, even when the customer is upset. You use simple language and avoid technical jargon.`,
|
|
427
|
+
model: 'tiny',
|
|
428
|
+
};
|
|
682
429
|
|
|
683
|
-
|
|
430
|
+
const shippingPrompt: PromptDefinition = {
|
|
431
|
+
name: 'shipping_inquiries',
|
|
432
|
+
toolDescription: 'Handles customer questions about shipping.',
|
|
433
|
+
prompt: [
|
|
434
|
+
{ type: 'include', prompt: 'company_tone' },
|
|
435
|
+
{ type: 'text', content: 'Details about shipping policies: ...' },
|
|
436
|
+
],
|
|
437
|
+
model: 'tiny',
|
|
438
|
+
};
|
|
439
|
+
```
|
|
684
440
|
|
|
685
|
-
|
|
441
|
+
Use `{ type: 'env', property: 'PRODUCT_INVENTORY' }` parts to inject runtime values into the prompt. Combined with `variables`, this lets a generic agent be specialized per-thread without code changes:
|
|
686
442
|
|
|
687
443
|
```ts
|
|
704
|
-
/**
|
|
705
|
-
* Property from tool-call arguments used as the initial message sent to the
|
|
706
|
-
* subagent on invocation.
|
|
707
|
-
*
|
|
708
|
-
* Uses the same semantics as {@link SubpromptConfig.initUserMessageProperty}.
|
|
709
|
-
*/
|
|
710
|
-
initUserMessageProperty?: StandardAgentSpec.SchemaFields<T>;
|
|
711
|
-
/**
|
|
712
|
-
* Property from tool-call arguments containing attachment path(s) that should
|
|
713
|
-
* be sent to the subagent on invocation.
|
|
714
|
-
*
|
|
715
|
-
* Uses the same semantics as {@link SubpromptConfig.initAttachmentsProperty}.
|
|
716
|
-
*/
|
|
717
|
-
initAttachmentsProperty?: StandardAgentSpec.SchemaFields<T>;
|
|
718
|
-
/**
|
|
719
|
-
* Property from tool-call arguments used to assign a human-readable name for
|
|
720
|
-
* each spawned child thread instance.
|
|
721
|
-
*
|
|
722
|
-
* Implementations SHOULD store this as a thread tag in the form
|
|
723
|
-
* `name:<value>` so UIs can render a concise per-instance title.
|
|
724
|
-
*/
|
|
725
|
-
initAgentNameProperty?: StandardAgentSpec.SchemaFields<T>;
|
|
726
|
-
/**
|
|
727
|
-
* Execute this tool immediately when the prompt becomes active.
|
|
728
|
-
*
|
|
729
|
-
* - `true`: Execute immediately using runtime defaults.
|
|
730
|
-
* - Object: Execute immediately with explicit per-instance env relationships.
|
|
731
|
-
*
|
|
732
|
-
* When the object form is used:
|
|
733
|
-
* - `scopedEnv` names the per-instance env values copied into the child thread.
|
|
734
|
-
* - `nameEnv` and `descriptionEnv` identify the only per-instance env values
|
|
735
|
-
* that runtimes may expose to an internal bootstrap model when deriving
|
|
736
|
-
* initial child arguments.
|
|
737
|
-
*
|
|
738
|
-
* Runtimes MUST NOT expose `scopedEnv` values to the model unless the same env
|
|
739
|
-
* name is explicitly designated by `nameEnv` or `descriptionEnv`.
|
|
740
|
-
*
|
|
741
|
-
* Immediate tools run before the first LLM step for that activation.
|
|
742
|
-
*/
|
|
743
|
-
immediate?: boolean | {
|
|
744
|
-
/**
|
|
745
|
-
* Scoped env name whose value may be used as the safe per-instance name
|
|
746
|
-
* hint for child bootstrap.
|
|
747
|
-
*/
|
|
748
|
-
nameEnv?: string;
|
|
749
|
-
/**
|
|
750
|
-
* Scoped env name whose value may be used as the safe per-instance
|
|
751
|
-
* description hint for child bootstrap.
|
|
752
|
-
*/
|
|
753
|
-
descriptionEnv?: string;
|
|
754
|
-
/**
|
|
755
|
-
* Scoped env names that should be copied into the child thread for each
|
|
756
|
-
* immediate instance group.
|
|
757
|
-
*/
|
|
758
|
-
scopedEnv?: string[];
|
|
759
|
-
};
|
|
760
|
-
/**
|
|
761
|
-
* Optional branch flag env name.
|
|
762
|
-
*
|
|
763
|
-
* When set, this subagent is only enabled when the named env resolves to
|
|
764
|
-
* `true`, `1`, or `yes` (case-insensitive).
|
|
765
|
-
*/
|
|
766
|
-
optional?: string;
|
|
767
|
-
/**
|
|
768
|
-
* Resumability configuration.
|
|
769
|
-
*
|
|
770
|
-
* - `false` (default): Non-resumable subagent
|
|
771
|
-
* - Object: Resumable subagent with message routing and instance limits
|
|
772
|
-
*
|
|
773
|
-
* When resumable mode is enabled, runtimes SHOULD provide a built-in create
|
|
774
|
-
* and message lifecycle interface instead of exposing raw agent callables for
|
|
775
|
-
* new instance creation.
|
|
776
|
-
*/
|
|
777
|
-
resumable?: false | {
|
|
778
|
-
/**
|
|
779
|
-
* Which side of the child `dual_ai` conversation receives parent messages.
|
|
780
|
-
*
|
|
781
|
-
* - `side_a`: Messages are queued as `role: 'user'`
|
|
782
|
-
* - `side_b`: Messages are queued as `role: 'assistant'`
|
|
783
|
-
*/
|
|
784
|
-
receives_messages: 'side_a' | 'side_b';
|
|
785
|
-
/**
|
|
786
|
-
* Maximum concurrent instances for this subagent tool.
|
|
787
|
-
*
|
|
788
|
-
* When reached, implementations may remove this tool from subsequent LLM
|
|
789
|
-
* requests and route new messages to existing instances.
|
|
790
|
-
*
|
|
791
|
-
* @default unlimited
|
|
792
|
-
*/
|
|
793
|
-
maxInstances?: number;
|
|
794
|
-
/**
|
|
795
|
-
* How this child reports back to its parent.
|
|
796
|
-
*
|
|
797
|
-
* - `implicit` (default): Child completion is automatically queued to the parent.
|
|
798
|
-
* - `explicit`: The runtime does not auto-queue child completion; tools/hooks may
|
|
799
|
-
* use thread APIs such as `state.notifyParent()` when they choose to escalate.
|
|
800
|
-
*/
|
|
801
|
-
parentCommunication?: 'implicit' | 'explicit';
|
|
802
|
-
};
|
|
444
|
+
{
|
|
445
|
+
name: 'ecommerce_assistant',
|
|
446
|
+
toolDescription: 'Assistant for ecommerce businesses.',
|
|
447
|
+
prompt: [
|
|
448
|
+
{ type: 'text', content: 'You help customers with products and orders. Current inventory: ' },
|
|
449
|
+
{ type: 'env', property: 'PRODUCT_INVENTORY' },
|
|
450
|
+
],
|
|
451
|
+
model: 'tiny',
|
|
452
|
+
variables: [
|
|
453
|
+
{
|
|
454
|
+
name: 'PRODUCT_INVENTORY',
|
|
455
|
+
type: 'text',
|
|
456
|
+
required: true,
|
|
457
|
+
description: 'Full product inventory with names, descriptions, and stock levels.',
|
|
458
|
+
},
|
|
459
|
+
],
|
|
803
460
|
}
|
|
804
461
|
```
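
To see how the three part types compose, the sketch below flattens a prompt into a single string. It is an illustration of the idea only, not the runtime's implementation: `renderPrompt`, `PromptSketch`, and the registry `Map` are hypothetical, and joining parts with no separator is an assumption suggested by the trailing space in `'Current inventory: '`.

```typescript
// Part shapes as used in the examples above.
type PromptPart =
  | { type: 'text'; content: string }
  | { type: 'include'; prompt: string }
  | { type: 'env'; property: string };

// Hypothetical minimal prompt record for this sketch.
interface PromptSketch {
  name: string;
  prompt: string | PromptPart[];
}

// Resolve a prompt by name, expanding includes recursively and substituting env values.
function renderPrompt(
  name: string,
  registry: Map<string, PromptSketch>,
  env: Record<string, string>,
  seen: Set<string> = new Set(),
): string {
  if (seen.has(name)) throw new Error(`circular include: ${name}`);
  const def = registry.get(name);
  if (!def) throw new Error(`unknown prompt: ${name}`);
  if (typeof def.prompt === 'string') return def.prompt;

  const nextSeen = new Set(seen).add(name);
  return def.prompt
    .map((part) => {
      switch (part.type) {
        case 'text':
          return part.content;
        case 'include':
          return renderPrompt(part.prompt, registry, env, nextSeen);
        case 'env':
          // Assumption: a missing env value renders as an empty string.
          return env[part.property] ?? '';
      }
    })
    .join('');
}
```

Because `company_tone` is resolved by name at render time, editing that one prompt changes every prompt that includes it, which is the point of the pattern.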
|
|
805
462
|
|
|
806
|
-
|
|
807
|
-
1. Parents explicitly create children by calling the tool `subagent_create`.
|
|
808
|
-
2. Parents implicitly create children by having a subagent tool call with `immediate: { ... }` which creates a child thread as soon as the parent thread is activated, without the parent having to explicitly call the tool.
|
|
809
|
-
2. Children only communicate back to their parents:
|
|
810
|
-
1. Implicit communication: the child thread automatically queues a message to the parent thread when it ends a "session". A session ends, typically, when one side calls the tools assigned to `sessionStop` or `sessionFail` properties in the agent definition, or the `maxSessionTurns` is reached. All subagents communicate implicitly, unless they are resumable and have `resumable.parentCommunication` set to `explicit`.
|
|
811
|
-
2. Explicit communication: the child thread never communicates with the parent unless `state.notifyParent()` is called. Typically this is done in the tools that are given to the child thread. These calls are independent from session management. Subagents that receive a lot of inbound traffic, for example a Slack subagent, may want to use explicit communication to have more control over when the parent is notified and reduce noise.
|
|
812
|
-
3. Resumable subagents can receive messages from their parents. Which side receives the message is indicated by the `resumable.receives_messages`.
|
|
813
|
-
4. When sending messages back to the parent, subagents MUST indicate if they require the parent to provide a response or not (in plain english, for example "After researching you must provide a response back so I can continue with sending this email."). For example, a Gmail subagent may ask its parent for guidance on how to respond to an email regarding company policy, the coordinator may ask another subagent with expertise in legal matters and company policy for advice. The legal subagent does not need a response, it is just providing information, but the gmail subagent does require a response in order to proceed. Thus it is critical that the gmail subagent indicates to the parent that it needs a response in order to continue.
|
|
814
|
-
5. File attachments are represented by simple strings. Anytime a file is added to a standard agent's filesystem by generation or upload it is given an explicit attachment "path" `/attachments/{filename}.{ext}`. This MUST be used anytime an agent is coordinating with subagents. The tool definition for a subagent can indicate an `initAttachmentsProperty`, which should be an array of strings — if these strings are valid attachments in the parent's file system, then those attachments will be copied into the subagent's file system and attached along with the `initUserMessageProperty` when the subagent is created (note: this also is true for sub-prompts).
|
|
815
|
-
6. Resumable agents communicate from one agent to another via "silent" user messages. These messages indicate which agent instance they are from (via uuid) as well as the content of the message. The parent agent can then decide what to do with the message, whether to respond to it, or just use the information in the message to make a decision.
|
|
816
|
-
7. If a message is sent to a parent agent or a sub agent that is currently busy, it will be queued and sent when the agent is free.
|
|
817
|
-
8. When writing the language for prompts that include resumable subagents, ensure you clearly describe when the subagent should be created (assuming it's not immediate) via `subagent_create` vs when it should just be given a message via `subagent_message`. For example, if you have a research subagent that is researching a given topic, and another subagent needs additional information on that topic, just send a message back to the same resumable subagent, so its existing context can be useful. But if it's requesting information on a totally new topic, then perhaps it should use `subagent_create`.
|
|
463
|
+
### `AgentDefinition` cheat sheet
|
|
818
464
|
|
|
465
|
+
The agent definition binds prompts and sides together. Set on each file in `agents/agents/`. For full signatures, see `agents/agents/AGENTS.md`.
|
|
819
466
|
|
|
820
|
-
|
|
467
|
+
```
|
|
468
|
+
AgentDefinition
|
|
469
|
+
name string unique snake_case identifier
|
|
470
|
+
type 'ai_human' | 'dual_ai' default 'ai_human' — but see "dual_ai is the default"
|
|
471
|
+
maxSessionTurns number REQUIRED for dual_ai. Finite turn cap.
|
|
472
|
+
sideA SideConfig AI side (or first AI in dual_ai)
|
|
473
|
+
sideB SideConfig second AI side; required for dual_ai
|
|
474
|
+
exposeAsTool boolean (default false) enables this agent to be called as a tool by other prompts
|
|
475
|
+
toolDescription string required if exposeAsTool: true
|
|
476
|
+
description string brief human description for UIs
|
|
477
|
+
icon string URL or absolute path
|
|
478
|
+
env Record<string, string> agent-level env values
|
|
479
|
+
hooks HookId[] agent-scoped hooks (when prompts don't specify their own)
|
|
480
|
+
```
|
|
821
481
|
|
|
822
|
-
|
|
482
|
+
`SideConfig` is where you bind `defaultPrompt`, `defaultModel`, and the **session lifecycle bindings**:
|
|
823
483
|
|
|
824
|
-
|
|
825
|
-
|
|
484
|
+
- `sessionStop` — name of the tool whose call ends the session with a result
|
|
485
|
+
- `sessionFail` — name of the tool whose call ends the session with a failure
|
|
486
|
+
- `sessionStatus` — optional, a tool used to update the session status mid-run
|
|
826
487
|
|
|
827
|
-
|
|
488
|
+
These are the bindings the [Session boundary discipline](#session-boundary-discipline) section refers to. Every `dual_ai` agent must set them.
|
|
828
489
|
|
|
829
|
-
|
|
490
|
+
Packaging fields (`packageName`, `version`, `author`, `license`) exist for the agent packing system; ignore them unless you're publishing.
|
|
830
491
|
|
|
831
|
-
|
|
492
|
+
### `SubagentToolConfig` cheat sheet
|
|
832
493
|
|
|
833
|
-
|
|
494
|
+
This is where parent/child architecture is actually configured — it lives on the *parent prompt's* `tools` array. Knowing every field exists is essential; full signatures live in the spec types (`node_modules/@standardagents/spec/dist/`, or `packages/spec/src/` on GitHub).
|
|
834
495
|
|
|
496
|
+
```
SubagentToolConfig: entry on a parent prompt's `tools` array

name                      string                    dual_ai agent to invoke (must have exposeAsTool: true)
blocking                  boolean (default true)    parent waits for result, vs. fire-and-forget
immediate                 bool | object             spawn child when prompt activates, before any LLM step
                                                    object form: { nameEnv, descriptionEnv, scopedEnv }
                                                    scopedEnv values are copied into the child but NOT
                                                    exposed to the bootstrap model unless named in
                                                    nameEnv/descriptionEnv
resumable                 false | object            false = tool-call style (default)
                                                    object = long-lived addressable child; runtime
                                                    exposes built-in subagent_create / subagent_message
  receives_messages       'side_a' | 'side_b'       which side hears parent messages
                                                    side_a = queued as 'user'
                                                    side_b = queued as 'assistant'
  maxInstances            number                    cap on concurrent instances; when reached the tool
                                                    is hidden and new messages route to existing instances
  parentCommunication     'implicit' | 'explicit'   implicit (default) auto-queues completion to parent
                                                    explicit requires state.notifyParent() in code
initUserMessageProperty   schema field name         tool arg used as child's first user message
initAttachmentsProperty   schema field name         tool arg holding /attachments/* paths to copy to child
initAgentNameProperty     schema field name         tool arg used to tag the child thread (UI label)
optional                  string (env name)         subagent disabled unless this env resolves truthy
```
|
|
846
520
|
|
|
847
|
-
|
|
521
|
+
A second tool-config form exists for plain callables (`{ name, env, options }`), and a third for subprompts (`{ name, includeTextResponse, includeToolCalls, includeErrors, initUserMessageProperty, initAttachmentsProperty }`). Both are documented in full at `agents/prompts/AGENTS.md`.
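
To make the cheat sheet concrete, here is a minimal sketch of two subagent entries as they might appear on a parent prompt's `tools` array. The interfaces are local stand-ins written for this example, not the real spec types; the agent names (`research_agent`, `slack_monitor`), the arg names (`question`, `files`), and the env name (`SLACK_ENABLED`) are hypothetical. Placing `maxInstances` under `resumable` is an assumption based on the `resumable.receives_messages` and `resumable.parentCommunication` paths used in the rules; check the spec types before relying on it.

```typescript
// Local sketch of the fields in the cheat sheet. The authoritative types live
// in @standardagents/spec; every type and agent name below is hypothetical.
interface ResumableSketch {
  receives_messages: 'side_a' | 'side_b';
  maxInstances?: number; // assumption: nested under `resumable`
  parentCommunication?: 'implicit' | 'explicit';
}

interface SubagentToolConfigSketch {
  name: string;
  blocking?: boolean;
  immediate?: boolean | Record<string, unknown>;
  resumable?: false | ResumableSketch;
  initUserMessageProperty?: string;
  initAttachmentsProperty?: string;
  initAgentNameProperty?: string;
  optional?: string;
}

// One-shot, tool-call style child: the parent waits for the result.
const researcher: SubagentToolConfigSketch = {
  name: 'research_agent',
  blocking: true,
  initUserMessageProperty: 'question',
  initAttachmentsProperty: 'files',
};

// Long-lived child that only interrupts the parent via state.notifyParent().
const slackMonitor: SubagentToolConfigSketch = {
  name: 'slack_monitor',
  resumable: {
    receives_messages: 'side_a',
    maxInstances: 2,
    parentCommunication: 'explicit',
  },
  optional: 'SLACK_ENABLED',
};

// Helpers that apply the defaults stated in the cheat sheet.
function isBlocking(cfg: SubagentToolConfigSketch): boolean {
  return cfg.blocking ?? true;
}

function isResumable(cfg: SubagentToolConfigSketch): boolean {
  return typeof cfg.resumable === 'object';
}

const parentTools: SubagentToolConfigSketch[] = [researcher, slackMonitor];
console.log(parentTools.map((t) => `${t.name}: resumable=${isResumable(t)}`));
```

The two helpers only encode the defaults the cheat sheet states (`blocking` defaults to true, `resumable` defaults to tool-call style); they are not part of any runtime API.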
|
|
848
522
|
|
|
849
|
-
|
|
523
|
+
### Inter-agent communication rules
|
|
850
524
|
|
|
851
|
-
|
|
525
|
+
These are correct as written in the spec — internalize them.
|
|
852
526
|
|
|
853
|
-
|
|
527
|
+
1. **Parents always create children.**
   - Explicitly via the built-in `subagent_create` tool.
   - Implicitly via `immediate: { ... }`: the child spawns the moment the parent prompt activates, before any LLM step.
2. **Children only communicate back to their parents.** Two flavors:
   - **Implicit**: the child auto-queues a message to the parent when the session ends (via `sessionStop`, `sessionFail`, or `maxSessionTurns`). Default for all subagents.
   - **Explicit**: only when `state.notifyParent()` is called. Set with `resumable.parentCommunication: 'explicit'`. Use for high-traffic resumables (e.g., a Slack monitor) where you want control over when the parent is interrupted.
3. **Resumable subagents can receive messages from their parents.** `resumable.receives_messages` chooses which side hears them.
4. **When a child sends back to the parent, it MUST indicate whether it needs a response**, in plain English, e.g. "After researching, you must respond so I can continue with the email." A research subagent providing FYI info doesn't need a response; a gmail subagent waiting on guidance does.
5. **File attachments use path strings.** `/attachments/{filename}.{ext}`. Use this convention everywhere; the runtime copies attachments across thread filesystems automatically when listed in `initAttachmentsProperty`.
6. **Resumable agents communicate via "silent" user messages** that carry the source instance UUID. The receiving agent decides whether to respond or just absorb the information.
7. **Messages to busy agents are queued** and delivered when the agent is free.
8. **Prompt language must distinguish `subagent_create` from `subagent_message`.** When writing prompts that include resumable subagents, explicitly tell the model: *"To research a new topic, use `subagent_create`. To follow up on a topic an existing instance is already researching, use `subagent_message`."* Without this guidance, models will pick wrong.
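
Rules 2 and 4 combine naturally in code: with explicit parent communication, the child decides when to call `state.notifyParent()`, and every message says in plain English whether a reply is expected. The sketch below is one way that could look. `ParentNotifier` is a local one-method stand-in for the thread state (only `notifyParent(content: string): Promise<void>` is assumed), and `formatParentMessage` and `reportToParent` are hypothetical helper names, not runtime APIs.

```typescript
// Minimal stand-in for the part of the thread state this sketch needs.
interface ParentNotifier {
  notifyParent(content: string): Promise<void>;
}

// Builds a message that always states whether the parent must respond.
function formatParentMessage(
  summary: string,
  needsResponse: boolean,
  reason?: string,
): string {
  const tail = needsResponse
    ? `You must respond so I can continue${reason ? ` with ${reason}` : ''}.`
    : 'No response is needed; this is for your information only.';
  return `${summary} ${tail}`;
}

// Sends the formatted message to the parent and returns what was sent.
async function reportToParent(
  state: ParentNotifier,
  summary: string,
  needsResponse: boolean,
  reason?: string,
): Promise<string> {
  const content = formatParentMessage(summary, needsResponse, reason);
  await state.notifyParent(content);
  return content;
}

// Usage with an in-memory fake in place of a real thread state.
const sent: string[] = [];
const fakeState: ParentNotifier = {
  notifyParent: async (content: string) => {
    sent.push(content);
  },
};

reportToParent(fakeState, 'Draft ready.', true, 'the email').then((msg) => {
  console.log(msg);
});
```

Keeping the formatting in a pure function makes the "needs a response" wording easy to test and hard to forget.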
|
|
854
539
|
|
|
855
|
-
|
|
540
|
+
---
|
|
856
541
|
|
|
857
|
-
|
|
542
|
+
## Hooks
|
|
858
543
|
|
|
859
|
-
|
|
860
|
-
|
|
861
|
-
|
|
862
|
-
|
|
863
|
-
|
|
864
|
-
|
|
865
|
-
|
|
866
|
-
|
|
867
|
-
|
|
868
|
-
|
|
869
|
-
|
|
870
|
-
|
|
871
|
-
|
|
872
|
-
|
|
873
|
-
|
|
874
|
-
|
|
875
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
876
|
-
// Messages
|
|
877
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
878
|
-
getMessages(options?: GetMessagesOptions): Promise<MessagesResult>;
|
|
879
|
-
getMessage(messageId: string): Promise<Message | null>;
|
|
880
|
-
injectMessage(input: InjectMessageInput): Promise<Message>;
|
|
881
|
-
queueMessage(input: QueueMessageInput): Promise<void>;
|
|
882
|
-
updateMessage(messageId: string, updates: MessageUpdates): Promise<Message>;
|
|
883
|
-
|
|
884
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
885
|
-
// Logs
|
|
886
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
887
|
-
getLogs(options?: GetLogsOptions): Promise<Log[]>;
|
|
888
|
-
|
|
889
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
890
|
-
// Resource Loading
|
|
891
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
892
|
-
loadModel<T = unknown>(name: string): Promise<T>;
|
|
893
|
-
loadPrompt<T = unknown>(name: string): Promise<T>;
|
|
894
|
-
loadAgent<T = unknown>(name: string): Promise<T>;
|
|
895
|
-
getChildThread(referenceId: string): Promise<ThreadState | null>;
|
|
896
|
-
getParentThread(): Promise<ThreadState | null>;
|
|
897
|
-
getPromptNames(): string[];
|
|
898
|
-
getAgentNames(): string[];
|
|
899
|
-
getModelNames(): string[];
|
|
900
|
-
env(propertyName: string): Promise<string>;
|
|
901
|
-
setEnv(propertyName: string, value: string): Promise<void>;
|
|
902
|
-
notifyParent(content: string): Promise<void>;
|
|
903
|
-
setStatus(status: string): Promise<void>;
|
|
904
|
-
|
|
905
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
906
|
-
// Tool Invocation
|
|
907
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
908
|
-
queueTool(toolName: string, args: Record<string, unknown>): void;
|
|
909
|
-
invokeTool(toolName: string, args: Record<string, unknown>): Promise<ToolResult>;
|
|
910
|
-
|
|
911
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
912
|
-
// Effect Scheduling
|
|
913
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
914
|
-
scheduleEffect(name: string, args: Record<string, unknown>, delay?: number): Promise<string>;
|
|
915
|
-
getScheduledEffects(name?: string): Promise<ScheduledEffect[]>;
|
|
916
|
-
removeScheduledEffect(id: string): Promise<boolean>;
|
|
917
|
-
|
|
918
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
919
|
-
// Events
|
|
920
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
921
|
-
emit(event: string, data: unknown): void;
|
|
922
|
-
|
|
923
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
924
|
-
// Context Storage
|
|
925
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
926
|
-
context: Record<string, unknown>;
|
|
927
|
-
|
|
928
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
929
|
-
// File System
|
|
930
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
931
|
-
writeFile(path: string, data: ArrayBuffer | string, mimeType: string, options?: WriteFileOptions): Promise<FileRecord>;
|
|
932
|
-
readFile(path: string): Promise<ArrayBuffer | null>;
|
|
933
|
-
readFileStream(path: string, options?: ReadFileStreamOptions): Promise<AsyncIterable<FileChunk> | null>;
|
|
934
|
-
statFile(path: string): Promise<FileRecord | null>;
|
|
935
|
-
readdirFile(path: string): Promise<ReaddirResult>;
|
|
936
|
-
unlinkFile(path: string): Promise<void>;
|
|
937
|
-
mkdirFile(path: string): Promise<FileRecord>;
|
|
938
|
-
rmdirFile(path: string): Promise<void>;
|
|
939
|
-
getFileStats(): Promise<FileStats>;
|
|
940
|
-
grepFiles(pattern: string): Promise<GrepResult[]>;
|
|
941
|
-
findFiles(pattern: string): Promise<FindResult>;
|
|
942
|
-
getFileThumbnail(path: string): Promise<ArrayBuffer | null>;
|
|
943
|
-
|
|
944
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
945
|
-
// Execution State
|
|
946
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
947
|
-
execution: ExecutionState | null;
|
|
948
|
-
terminate(): Promise<void>;
|
|
949
|
-
|
|
950
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
951
|
-
// Runtime Context (Non-Portable)
|
|
952
|
-
// ─────────────────────────────────────────────────────────────────────────
|
|
953
|
-
readonly _notPackableRuntimeContext?: Record<string, unknown>;
|
|
954
|
-
}
|
|
955
|
-
```
|
|
544
|
+
Hooks extend agent behavior without modifying core logic. They are defined via `defineHook` and referenced by `id` from agent or prompt definitions. Common uses: truncating history, injecting synthetic tool calls (e.g., real-time clock awareness), logging, and adapting tool results.

| Hook | Execution Point | Purpose |
|---|---|---|
| `filter_messages` | Before LLM context assembly | Filter/transform message history |
| `prefilter_llm_history` | After context assembly | Final adjustments before LLM request |
| `before_create_message` | Before message insert | Transform message before storage |
| `after_create_message` | After message insert | Side effects after storage |
| `before_update_message` | Before message update | Transform update data |
| `after_update_message` | After message update | Side effects after update |
| `before_store_tool_result` | Before tool result storage | Transform tool results |
| `after_tool_call_success` | After successful tool call | Post-process success results |
| `after_tool_call_failure` | After failed tool call | Handle/recover from errors |

> **Caution:** Hooks that filter or truncate messages must keep matching tool calls and tool results together. Separating them produces hard-to-debug failures in many models.
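
The caution about tool calls and tool results can be illustrated with a small truncation helper of the kind a `filter_messages` hook might call. This is only a sketch under stated assumptions: `SketchMessage` is a simplified, hypothetical message shape (not the spec's `Message` type), tool results are assumed to directly follow the assistant message that issued the calls, and the `defineHook` wiring is deliberately omitted because its signature is documented in `agents/hooks/AGENTS.md`.

```typescript
// Simplified, hypothetical message shape for illustration only.
interface SketchMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

// Keeps at most `maxMessages` of the most recent messages. If the cut would
// leave tool results at the front without the assistant message that called
// them, those orphaned results are dropped too, so a call and its results are
// either both kept or both removed.
function truncateKeepingToolPairs(
  messages: SketchMessage[],
  maxMessages: number,
): SketchMessage[] {
  if (messages.length <= maxMessages) return messages.slice();
  let start = messages.length - maxMessages;
  while (start < messages.length && messages[start]?.role === 'tool') {
    start++;
  }
  return messages.slice(start);
}

const history: SketchMessage[] = [
  { role: 'user', content: 'u1' },
  { role: 'assistant', content: 'calls get_time' },
  { role: 'tool', content: 'result of get_time' },
  { role: 'assistant', content: 'a2' },
  { role: 'user', content: 'u2' },
];

// A naive slice(-3) would start on the orphaned tool result.
console.log(truncateKeepingToolPairs(history, 3).map((m) => m.content));
```

Dropping the orphaned result keeps the output within the cap; walking backwards to include the calling assistant message instead is an equally valid choice when slightly exceeding the cap is acceptable.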
|
|
956
559
|
|
|
560
|
+
## Variables and environment
|
|
957
561
|
|
|
958
|
-
|
|
562
|
+
Variables let tools, prompts, and agents declare dynamic values they need. Two types:

- **`text`**: simple string. Safe to render in prompts.
- **`secret`**: encrypted; **MUST only be used inside tools**. Never reference a secret in prompt text and never return it to the model. A `GMAIL_API_KEY` is `secret`; a `LOCATION` is `text`.

When a thread is created, all required variables in the agent graph must be provided. The instance of a variable on a thread is called an "environment variable" or `env`. Resolution precedence (low → high): prompt → tool → agent → thread.

Scoped variables (`scoped: true`) do not inherit from parent thread env; they reset for the declaring agent's subtree. Use this when a subagent must run with different config from its parent (e.g., a per-instance Slack channel ID).
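
The precedence order can be pictured as a layered merge in which later layers overwrite earlier ones. The sketch below models only that ordering for `text` values; `EnvLayers` and `resolveEnv` are hypothetical names for this example, and the real runtime's resolution (including `secret` handling and `scoped` resets) is not reproduced here.

```typescript
type EnvLayer = Record<string, string>;

// One optional layer per level named in the precedence rule.
interface EnvLayers {
  prompt?: EnvLayer;
  tool?: EnvLayer;
  agent?: EnvLayer;
  thread?: EnvLayer;
}

// Merges layers from lowest to highest precedence:
// prompt -> tool -> agent -> thread. Later spreads win on key collisions.
function resolveEnv(layers: EnvLayers): EnvLayer {
  return {
    ...layers.prompt,
    ...layers.tool,
    ...layers.agent,
    ...layers.thread,
  };
}

const resolved = resolveEnv({
  prompt: { LOCATION: 'Paris', TONE: 'formal' },
  agent: { LOCATION: 'Berlin' },
  thread: { LOCATION: 'Tokyo' },
});

// LOCATION comes from the thread layer; TONE falls through from the prompt.
console.log(resolved);
```

A value set at thread creation therefore overrides any default declared lower in the graph, while values no higher layer mentions fall through unchanged.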
|
|
570
|
+
|
|
571
|
+
---
|
|
965
572
|
|
|
966
573
|
## Implementation checking
|
|
967
574
|
|
|
968
|
-
When you edit `agents/`, `prompts/`, `models/`, `tools/`, `hooks/`, or `worker/`,
|
|
969
|
-
|
|
970
|
-
|
|
971
|
-
|
|
972
|
-
|
|
973
|
-
|
|
974
|
-
|
|
975
|
-
|
|
976
|
-
|
|
977
|
-
|
|
978
|
-
|
|
979
|
-
|
|
980
|
-
|
|
981
|
-
|
|
575
|
+
When you edit `agents/`, `prompts/`, `models/`, `tools/`, `hooks/`, or `worker/`, validate before claiming the change is done.

Validation order:

1. Read `package.json` and prefer project scripts if they exist.
2. Refresh Cloudflare types:
   - use `pnpm cf-typegen` if that script exists
   - otherwise run `npx wrangler types`
3. Regenerate AgentBuilder types by running the project's build command:
   - usually `pnpm build`, `npm run build`, `pnpm run dev`, or equivalent
4. Run the project's type checker:
   - prefer `pnpm type-check` / `npm run type-check` if present
   - otherwise use the installed checker directly: `pnpm exec vue-tsc --build`, `pnpm exec tsc -b`, or `pnpm exec tsc --noEmit`
5. If any validation step cannot run, state exactly what is missing.
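
The validation order above can be sketched as a small planning function that reads the `scripts` map from `package.json` and returns the commands to run, plus a note for anything it cannot resolve. This is an illustrative sketch, not a tool shipped with the package: `planValidation` is a hypothetical name, `pnpm` stands in for whichever package manager the project uses, and the fallback type-check command is just one of the options listed in step 4.

```typescript
interface ValidationPlan {
  commands: string[]; // commands to run, in order
  missing: string[]; // steps that could not be planned, stated explicitly
}

// Plans steps 2-4 of the validation order from a package.json `scripts` map.
function planValidation(scripts: Record<string, string>): ValidationPlan {
  const has = (name: string): boolean =>
    Object.prototype.hasOwnProperty.call(scripts, name);
  const commands: string[] = [];
  const missing: string[] = [];

  // Step 2: refresh Cloudflare types.
  commands.push(has('cf-typegen') ? 'pnpm cf-typegen' : 'npx wrangler types');

  // Step 3: regenerate AgentBuilder types via the project's build command.
  if (has('build')) {
    commands.push('pnpm build');
  } else {
    missing.push('no `build` script in package.json');
  }

  // Step 4: run the type checker, preferring the project script.
  commands.push(has('type-check') ? 'pnpm type-check' : 'pnpm exec tsc --noEmit');

  return { commands, missing };
}

const plan = planValidation({
  'cf-typegen': 'wrangler types',
  build: 'vite build',
  'type-check': 'tsc -b',
});
console.log(plan.commands.join(' && '));
```

Returning `missing` separately mirrors step 5: a step that cannot run is reported in words rather than silently skipped.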
|