@bastani/atomic 0.5.23-0 → 0.5.24-0
- package/.agents/skills/workflow-creator/SKILL.md +137 -326
- package/.agents/skills/workflow-creator/references/agent-sessions.md +211 -152
- package/.agents/skills/workflow-creator/references/computation-and-validation.md +12 -37
- package/.agents/skills/workflow-creator/references/control-flow.md +20 -14
- package/.agents/skills/workflow-creator/references/discovery-and-verification.md +1 -1
- package/.agents/skills/workflow-creator/references/failure-modes.md +87 -62
- package/.agents/skills/workflow-creator/references/getting-started.md +14 -40
- package/.agents/skills/workflow-creator/references/running-workflows.md +235 -0
- package/.agents/skills/workflow-creator/references/session-config.md +24 -9
- package/.agents/skills/workflow-creator/references/state-and-data-flow.md +9 -26
- package/.agents/skills/workflow-creator/references/user-input.md +71 -43
- package/.agents/skills/workflow-creator/references/workflow-inputs.md +25 -42
- package/dist/sdk/providers/claude.d.ts +7 -2
- package/dist/sdk/providers/claude.d.ts.map +1 -1
- package/dist/sdk/providers/opencode.d.ts +18 -2
- package/dist/sdk/providers/opencode.d.ts.map +1 -1
- package/dist/sdk/runtime/executor.d.ts +5 -0
- package/dist/sdk/runtime/executor.d.ts.map +1 -1
- package/package.json +1 -1
- package/src/sdk/providers/claude.ts +57 -12
- package/src/sdk/providers/headless-hil-policy.test.ts +171 -0
- package/src/sdk/providers/opencode.ts +62 -2
- package/src/sdk/runtime/executor.ts +57 -14
|
@@ -1,79 +1,66 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: workflow-creator
|
|
3
|
-
description: Create
|
|
3
|
+
description: Create AND run Atomic CLI workflows (`defineWorkflow().run().compile()` with `ctx.stage()`) across Claude, Copilot, and OpenCode SDKs. Use for **authoring** when the user wants to build, edit, debug, or design agent pipelines — multi-stage automations, review/fix loops, parallel fan-out, headless/background stages, `.atomic/workflows/` files, `defineWorkflow`, `ctx.stage`, `ctx.inputs`, or declared `WorkflowInput` schemas. Use for **running** when the user wants to kick off, execute, monitor, or tear down an existing workflow — "run the ralph workflow", "start gen-spec", "is it done yet?", "what's the status?", "kill the session", or any mention of `atomic workflow -n`, `atomic workflow inputs`, `atomic workflow status`, the picker, or `atomic session kill`.
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Workflow Creator
|
|
7
7
|
|
|
8
|
-
You are a workflow architect specializing in the Atomic CLI `defineWorkflow().run().compile()` API.
|
|
8
|
+
You are a workflow architect specializing in the Atomic CLI `defineWorkflow().run().compile()` API. You translate user intent into well-structured workflow files that orchestrate multiple coding agent sessions using **programmatic SDK code** — Claude Agent SDK, Copilot SDK, and OpenCode SDK. Sessions are spawned dynamically via `ctx.stage(stageOpts, clientOpts, sessionOpts, callback)` inside the `.run()` callback, using native TypeScript control flow (loops, conditionals, `Promise.all()`) for orchestration. The runtime auto-creates the SDK client and session, injects them as `s.client` and `s.session`, runs the callback, then auto-cleans up.
|
|
9
9
|
|
|
10
|
-
You also serve as a **context engineering advisor
|
|
10
|
+
You also serve as a **context engineering advisor** — use the design skills listed under "Design Advisory Skills" to make informed architectural decisions about session structure, data flow, prompt composition, and quality assurance.
|
|
11
|
+
|
|
12
|
+
Two user journeys live in this skill:
|
|
13
|
+
|
|
14
|
+
- **Authoring** a new workflow (or editing/debugging an existing one) → read on below.
|
|
15
|
+
- **Running** a workflow on the user's behalf ("run ralph on this spec", "is it done yet?", "kill it") → go to `references/running-workflows.md`.
|
|
11
16
|
|
|
12
17
|
## Reference Files
|
|
13
18
|
|
|
14
|
-
Load
|
|
19
|
+
Load references on demand. **Only `getting-started.md` is always-load.** Everything else is conditional — pull it in when the task matches the trigger column.
|
|
15
20
|
|
|
16
|
-
|
|
|
17
|
-
|
|
18
|
-
|
|
|
19
|
-
|
|
|
20
|
-
|
|
|
21
|
-
|
|
|
22
|
-
|
|
|
23
|
-
|
|
|
24
|
-
|
|
|
25
|
-
|
|
|
26
|
-
|
|
|
27
|
-
|
|
|
21
|
+
| File | Load when |
|
|
22
|
+
|---|---|
|
|
23
|
+
| `getting-started.md` | **Always** — quick-start examples for all 3 SDKs, SDK exports, `SessionContext` field reference |
|
|
24
|
+
| `failure-modes.md` | Before shipping any multi-session workflow. 16 catalogued failures (silent + loud) with wrong-vs-right patterns and a pre-ship design checklist |
|
|
25
|
+
| `workflow-inputs.md` | When declaring structured inputs or documenting how a workflow is invoked — `WorkflowInput` schema, field-type selection, picker + CLI flag semantics, builtin-protection rules |
|
|
26
|
+
| `agent-sessions.md` | When writing SDK calls — `s.session.query()` (Claude), `s.session.send()` (Copilot), `s.client.session.prompt()` (OpenCode); includes session-lifecycle pitfalls and when to use `sendAndWait` with explicit timeouts |
|
|
27
|
+
| `control-flow.md` | When using loops, conditionals, parallel execution (`Promise.all`), headless fan-out, or review/fix patterns |
|
|
28
|
+
| `state-and-data-flow.md` | When passing data between sessions — `s.save()`, `s.transcript()`, `s.getMessages()`, file persistence, transcript compression |
|
|
29
|
+
| `running-workflows.md` | When the user asks you to **run** an existing workflow rather than author one |
|
|
30
|
+
| `computation-and-validation.md` | When adding deterministic computation, response parsing, validation, quality gates, or file I/O |
|
|
31
|
+
| `session-config.md` | When configuring model, tools, permissions, hooks, or structured output per SDK |
|
|
32
|
+
| `user-input.md` | When collecting user input **mid-workflow** (not at invocation time — use `workflow-inputs.md` for that) |
|
|
33
|
+
| `discovery-and-verification.md` | When setting up workflow file structure, validation, or TypeScript config |
|
|
28
34
|
|
|
29
35
|
## Information Flow Is a First-Class Design Concern
|
|
30
36
|
|
|
31
37
|
**A workflow is an information flow problem, not a sequence of prompts.**
|
|
32
|
-
Before
|
|
33
|
-
for every session boundary in your workflow:
|
|
34
|
-
|
|
35
|
-
1. **What context does this session need to succeed?** The original user
|
|
36
|
-
spec? Prior stage output? File paths? Git state? A summary?
|
|
37
|
-
2. **How will that context reach the session?** Built into the prompt?
|
|
38
|
-
Read from a file? Retrieved via a tool? Kept inside one continued
|
|
39
|
-
multi-turn stage instead of crossing a stage boundary?
|
|
40
|
-
3. **What happens if the context window fills up?** Compact? Clear? Spawn
|
|
41
|
-
a sub-session? Offload to files?
|
|
42
|
-
|
|
43
|
-
If you can't answer all three crisply, you don't have a workflow — you
|
|
44
|
-
have a sequence of hopeful prompts that will fail in non-deterministic
|
|
45
|
-
ways at scale.
|
|
46
|
-
|
|
47
|
-
### Session lifecycle controls information flow
|
|
48
|
-
|
|
49
|
-
| Lifecycle state | Context visible to the model | When it happens |
|
|
50
|
-
|---|---|---|
|
|
51
|
-
| **Fresh** | **Nothing** — empty conversation | Each new `ctx.stage()` call — the runtime creates a new session |
|
|
52
|
-
| **Continued** | Everything sent so far in this session | Additional turns within the same stage callback |
|
|
53
|
-
| **Closed** | Gone from the live client; persisted only through what you explicitly saved | Runtime auto-cleanup after the stage callback returns |
|
|
54
|
-
|
|
55
|
-
**Closing a session and creating a new one wipes all in-session context.**
|
|
56
|
-
The new session knows *only* what you put in its first prompt.
|
|
57
|
-
|
|
58
|
-
Claude is different: the runtime reuses a single persistent tmux pane, so every turn within a stage accumulates in the same conversation. But for Copilot and OpenCode, **every `ctx.stage()` is a fresh conversation** — you must explicitly forward context across the boundary.
|
|
59
|
-
|
|
60
|
-
### Avoiding context loss
|
|
38
|
+
Before writing any `ctx.stage()` call, answer for every session boundary:
|
|
61
39
|
|
|
62
|
-
|
|
40
|
+
- What context does this session need, how will it reach the session
|
|
41
|
+
(prompt handoff, file, single multi-turn stage), and what happens if the
|
|
42
|
+
context window fills up?
|
|
63
43
|
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
44
|
+
For Copilot and OpenCode, every `ctx.stage()` is a fresh conversation;
|
|
45
|
+
Claude reuses a tmux pane per stage. Read these before shipping any
|
|
46
|
+
multi-session workflow:
|
|
67
47
|
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
- `references/failure-modes.md` —
|
|
72
|
-
|
|
48
|
+
- `references/agent-sessions.md` §"Critical pitfall: session lifecycle
|
|
49
|
+
controls what context is available" — lifecycle table, context-loss
|
|
50
|
+
patterns, and per-SDK details.
|
|
51
|
+
- `references/failure-modes.md` — silent + loud failures with wrong-vs-right
|
|
52
|
+
patterns and the pre-ship design checklist.
|
|
53
|
+
- `references/state-and-data-flow.md` — `s.save()`, `s.transcript()`, and
|
|
54
|
+
file-based handoff patterns.
|
|
73
55
|
|
|
74
56
|
## Design Advisory Skills
|
|
75
57
|
|
|
76
|
-
Workflow quality depends on two disciplines: **prompt engineering** (crafting
|
|
58
|
+
Workflow quality depends on two disciplines: **prompt engineering** (crafting
|
|
59
|
+
clear, structured prompts each session receives) and **context engineering**
|
|
60
|
+
(ensuring the right information reaches each session without exceeding token
|
|
61
|
+
budgets). Use `prompt-engineer` to improve individual session prompts —
|
|
62
|
+
clarity, XML structure, few-shot examples, chain-of-thought — and the
|
|
63
|
+
context engineering skills below to design information flow between sessions.
|
|
77
64
|
|
|
78
65
|
| Design Concern | Skill | Trigger |
|
|
79
66
|
|---|---|---|
|
|
@@ -94,7 +81,11 @@ Workflow quality depends on two disciplines: **prompt engineering** (crafting cl
|
|
|
94
81
|
|
|
95
82
|
## How Workflows Work
|
|
96
83
|
|
|
97
|
-
A workflow is a TypeScript file with a single `.run()` callback that
|
|
84
|
+
A workflow is a TypeScript file with a single `.run()` callback that
|
|
85
|
+
orchestrates agent sessions dynamically. Inside the callback, `ctx.stage()`
|
|
86
|
+
spawns sessions — each gets its own tmux window and graph node (unless
|
|
87
|
+
running in headless mode). Native TypeScript handles all control flow:
|
|
88
|
+
loops, conditionals, `Promise.all()`, `try`/`catch`.
|
|
98
89
|
|
|
99
90
|
```ts
|
|
100
91
|
import { defineWorkflow, extractAssistantText } from "@bastani/atomic/workflows";
|
|
@@ -114,92 +105,41 @@ export default defineWorkflow({
|
|
|
114
105
|
.compile();
|
|
115
106
|
```
|
|
116
107
|
|
|
117
|
-
The runtime manages the full session lifecycle — callback return marks
|
|
108
|
+
The runtime manages the full session lifecycle — callback return marks
|
|
109
|
+
completion; throws mark errors. `.compile()` produces a branded
|
|
110
|
+
`WorkflowDefinition` consumed by the CLI.
|
|
118
111
|
|
|
119
112
|
### Background (headless) stages
|
|
120
113
|
|
|
121
|
-
|
|
122
|
-
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
{}, {},
|
|
128
|
-
async (s) => {
|
|
129
|
-
const result = await s.session.query("Analyze the codebase structure.");
|
|
130
|
-
s.save(s.sessionId);
|
|
131
|
-
return extractAssistantText(result, 0);
|
|
132
|
-
},
|
|
133
|
-
);
|
|
134
|
-
```
|
|
135
|
-
|
|
136
|
-
**When to use headless stages:**
|
|
137
|
-
- Parallel data-gathering tasks that don't need a visible TUI (e.g., codebase research, infrastructure discovery)
|
|
138
|
-
- Support tasks that should run alongside visible stages without cluttering the graph
|
|
139
|
-
- Any stage where only the result matters, not the live TUI interaction
|
|
140
|
-
|
|
141
|
-
**How they work per provider:**
|
|
142
|
-
- **Claude**: Uses the Agent SDK `query()` API directly in-process (no tmux pane)
|
|
143
|
-
- **Copilot**: SDK spawns its own CLI subprocess internally (no tmux pane needed)
|
|
144
|
-
- **OpenCode**: Uses `createOpencode()` to start both server and client in-process
|
|
145
|
-
|
|
146
|
-
**Key behaviors:**
|
|
147
|
-
- The callback interface is **identical** to interactive stages — `s.client`, `s.session`, `s.save()`, `s.transcript()` all work the same way
|
|
148
|
-
- Headless stages are **transparent to graph topology** — they don't consume or update the execution frontier, so `visible → [3 headless] → visible` renders as `visible → visible` in the graph
|
|
149
|
-
- Errors in headless stages still fail the workflow — they are tracked and recorded identically to interactive stages
|
|
150
|
-
- The `paneId` for headless stages is a virtual identifier: `headless-<name>-<sessionId>`
|
|
151
|
-
|
|
152
|
-
**Common pattern — fan-out with headless background stages:**
|
|
153
|
-
|
|
154
|
-
```ts
|
|
155
|
-
// Visible stage seeds context
|
|
156
|
-
const seed = await ctx.stage({ name: "seed" }, {}, {}, async (s) => { /* ... */ });
|
|
157
|
-
|
|
158
|
-
// Three parallel headless stages gather data in the background
|
|
159
|
-
const [a, b, c] = await Promise.all([
|
|
160
|
-
ctx.stage({ name: "gather-a", headless: true }, {}, {}, async (s) => { /* ... */ }),
|
|
161
|
-
ctx.stage({ name: "gather-b", headless: true }, {}, {}, async (s) => { /* ... */ }),
|
|
162
|
-
ctx.stage({ name: "gather-c", headless: true }, {}, {}, async (s) => { /* ... */ }),
|
|
163
|
-
]);
|
|
164
|
-
|
|
165
|
-
// Visible stage merges background results
|
|
166
|
-
await ctx.stage({ name: "merge" }, {}, {}, async (s) => {
|
|
167
|
-
await s.session.query(`Merge:\n${a.result}\n${b.result}\n${c.result}`);
|
|
168
|
-
s.save(s.sessionId);
|
|
169
|
-
});
|
|
170
|
-
```
|
|
114
|
+
Pass `{ headless: true }` in `stageOpts` to run a stage in-process with no
|
|
115
|
+
tmux window or graph node. The callback interface is identical
|
|
116
|
+
(`s.client`, `s.session`, `s.save()`, `s.transcript()` all work). For
|
|
117
|
+
mechanics, fan-out patterns, and graph topology see
|
|
118
|
+
`references/control-flow.md` §"Headless stages" and
|
|
119
|
+
`references/agent-sessions.md` per-SDK "Headless mode" sections.
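Note: a minimal sketch of the fan-out shape this paragraph points to. It assumes a local stand-in `stage` function with the same four-argument shape as `ctx.stage()`. The names `StageOpts`, `StageHandle`, `stage`, and `fanOut` are hypothetical and exist only for illustration. A real workflow would call `ctx.stage()` inside `.run()` and make SDK calls in the callback.

```ts
// Hypothetical stand-ins; the real types and ctx.stage() come from the SDK runtime.
type StageOpts = { name: string; headless?: boolean };
type StageHandle<T> = { name: string; result: T };

// Stub with the same four-argument shape as ctx.stage(); it only runs the callback.
async function stage<T>(
  opts: StageOpts,
  _clientOpts: object,
  _sessionOpts: object,
  callback: (s: { name: string }) => Promise<T>,
): Promise<StageHandle<T>> {
  const result = await callback({ name: opts.name });
  return { name: opts.name, result };
}

// Runs one headless stage per topic in parallel, then joins the results in input order.
async function fanOut(topics: string[]): Promise<string> {
  const handles = await Promise.all(
    topics.map((topic) =>
      stage({ name: `gather-${topic}`, headless: true }, {}, {}, async (s) => {
        // A real callback would prompt the agent here; this one returns a placeholder.
        return `${s.name}: notes on ${topic}`;
      }),
    ),
  );
  return handles.map((h) => h.result).join("\n");
}
```

Because `Promise.all()` preserves input order, the merged text is stable regardless of which stage finishes first.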
|
|
171
120
|
|
|
172
|
-
|
|
121
|
+
### Installing the workflow SDK
|
|
173
122
|
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
|
|
179
|
-
bun add @github/copilot-sdk # For Copilot workflows
|
|
180
|
-
bun add @opencode-ai/sdk # For OpenCode workflows
|
|
181
|
-
```
|
|
182
|
-
|
|
183
|
-
Workflow files live at `.atomic/workflows/<name>/<agent>/index.ts`. Discovery sources: **Local** (`.atomic/workflows/`), **Global** (`~/.atomic/workflows/`), and **Built-in** (SDK-shipped). Built-in names (`ralph`, `deep-research-codebase`) are **reserved** — any local/global workflow with the same name is dropped before resolution. Among non-reserved names, local takes precedence over global. See `references/discovery-and-verification.md` for full discovery paths and validation.
|
|
123
|
+
Install `@bastani/atomic` plus the native SDK(s) you target
|
|
124
|
+
(`@anthropic-ai/claude-agent-sdk`, `@github/copilot-sdk`,
|
|
125
|
+
`@opencode-ai/sdk`). Workflow files live at
|
|
126
|
+
`.atomic/workflows/<name>/<agent>/index.ts`. Full paths, precedence, and
|
|
127
|
+
reserved built-in names live in `references/discovery-and-verification.md`.
|
|
184
128
|
|
|
185
129
|
### Two context levels
|
|
186
130
|
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
|
|
190
|
-
| `SessionContext` (`s`) | `ctx.stage(opts, clientOpts, sessionOpts, async (s) => ...)` | Yes | Agent work: use `s.client` and `s.session` for SDK calls, save output |
|
|
191
|
-
|
|
192
|
-
Both contexts expose typed `inputs` (keys restricted to declared input names), `stage()`, `transcript()`, and `getMessages()`. See `references/getting-started.md` for the full `SessionContext` field reference.
|
|
193
|
-
|
|
194
|
-
### Declared inputs: one API, three invocation surfaces
|
|
131
|
+
`WorkflowContext` (`ctx`) drives orchestration in `.run()`; `SessionContext`
|
|
132
|
+
(`s`) drives agent work inside each stage callback. Full field reference in
|
|
133
|
+
`references/getting-started.md` §"`SessionContext` reference".
|
|
195
134
|
|
|
196
|
-
|
|
135
|
+
### Declared inputs
|
|
197
136
|
|
|
198
|
-
|
|
199
|
-
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
|
|
137
|
+
Workflows receive user data exclusively through `ctx.inputs` / `s.inputs`,
|
|
138
|
+
declared inline as `inputs: WorkflowInput[]` on `defineWorkflow()`.
|
|
139
|
+
TypeScript restricts `ctx.inputs` to declared keys (undeclared access is a
|
|
140
|
+
compile-time error). Load `references/workflow-inputs.md` for schema shape,
|
|
141
|
+
field types (`string` / `text` / `enum`), validation rules, picker
|
|
142
|
+
semantics, and the "declare your prompt input explicitly" pattern.
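Note: a sketch of how declared input names can narrow the keys of an inputs object at compile time. `InputDecl` and `resolveInputs` are hypothetical, simplified stand-ins and not the SDK's `WorkflowInput` type or its resolution logic. The sketch only illustrates the "declared keys only" idea described above.

```ts
// Hypothetical, simplified declaration; the real WorkflowInput type lives in the SDK.
type InputDecl<N extends string> = {
  name: N;
  type: "string" | "text" | "enum";
  required?: boolean;
  default?: string;
};

// Builds an object whose keys are exactly the declared input names.
function resolveInputs<N extends string>(
  decls: readonly InputDecl<N>[],
  raw: Record<string, string | undefined>,
): Record<N, string | undefined> {
  const out = {} as Record<N, string | undefined>;
  for (const decl of decls) {
    const value = raw[decl.name] ?? decl.default;
    if (value === undefined && decl.required) {
      throw new Error(`missing required input: ${decl.name}`);
    }
    out[decl.name] = value;
  }
  return out;
}

// N is inferred as "prompt" | "focus", so only those two keys are accessible.
const inputs = resolveInputs(
  [
    { name: "prompt", type: "text", required: true },
    { name: "focus", type: "enum", default: "standard" },
  ],
  { prompt: "add OAuth to the API" },
);
```

In this sketch, reading `inputs.prompt` or `inputs.focus` type-checks, while a key that was never declared would be rejected by the compiler.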
|
|
203
143
|
|
|
204
144
|
### Invocation surfaces
|
|
205
145
|
|
|
@@ -214,13 +154,15 @@ Workflows that accept a free-form prompt should declare it explicitly: `{ name:
|
|
|
214
154
|
| Kill non-interactively | `atomic session kill <id> -y` | Tear down a workflow/chat session without the confirmation prompt — the form agents use |
|
|
215
155
|
| Detached (background) | `atomic workflow -n ralph -a claude -d "..."` | Scripted/CI runs where the caller shouldn't block on the TUI — the orchestrator keeps running on the atomic tmux socket; attach later with `atomic workflow session connect <name>` |
|
|
216
156
|
|
|
217
|
-
Any of the named shapes above (positional or structured) accepts
|
|
218
|
-
|
|
219
|
-
|
|
157
|
+
Any of the named shapes above (positional or structured) accepts
|
|
158
|
+
`-d` / `--detach` to run without attaching. Use it when you're automating
|
|
159
|
+
from a script and want the CLI to return as soon as the session is spawned.
|
|
220
160
|
|
|
221
161
|
### Declaring SDK compatibility (`minSDKVersion`)
|
|
222
162
|
|
|
223
|
-
Opt-in version gate for workflows that depend on a specific SDK release.
|
|
163
|
+
Opt-in version gate for workflows that depend on a specific SDK release.
|
|
164
|
+
**Default is unset — do not add it to new workflows unless you have a
|
|
165
|
+
concrete reason.**
|
|
224
166
|
|
|
225
167
|
```ts
|
|
226
168
|
defineWorkflow({
|
|
@@ -229,20 +171,17 @@ defineWorkflow({
|
|
|
229
171
|
})
|
|
230
172
|
```
|
|
231
173
|
|
|
232
|
-
|
|
233
|
-
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
When to set it: the workflow calls into a newly-added SDK surface (new `stage()` option, new helper export, new provider method) that older installs don't ship. Omit it for workflows that use only stable APIs — most workflows qualify.
|
|
174
|
+
When set to a version newer than the installed CLI, the workflow refuses to
|
|
175
|
+
load and surfaces a visible row in `atomic workflow list` and the picker
|
|
176
|
+
(rather than silently vanishing). Set it only when the workflow calls a
|
|
177
|
+
newly-added SDK surface (new `stage()` option, new helper export, new
|
|
178
|
+
provider method); omit it for workflows on stable APIs. Full semver
|
|
179
|
+
semantics and the visible-diagnostic contract live in
|
|
180
|
+
`references/discovery-and-verification.md`.
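Note: a simplified sketch of the version gate described above, assuming plain `major.minor.patch` comparison. `parseVersion` and `requiresNewerSdk` are hypothetical names. Prerelease suffixes such as `-0` are dropped here, which is a simplification; the full semver semantics are the ones documented in the reference file.

```ts
// Splits "0.5.24-0" into [0, 5, 24]; the prerelease/build suffix is dropped (a simplification).
function parseVersion(version: string): number[] {
  const [core = ""] = version.split(/[-+]/);
  return core.split(".").map((part) => Number.parseInt(part, 10) || 0);
}

// True when the workflow's minSDKVersion is newer than the installed version.
function requiresNewerSdk(minSDKVersion: string, installed: string): boolean {
  const wanted = parseVersion(minSDKVersion);
  const have = parseVersion(installed);
  for (let i = 0; i < 3; i++) {
    const diff = (wanted[i] ?? 0) - (have[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return false; // equal versions load normally
}
```

A loader built on this check would show a diagnostic row when `requiresNewerSdk(...)` returns `true`, rather than dropping the workflow silently.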
|
|
240
181
|
|
|
241
|
-
|
|
182
|
+
## Structural Rules (hard constraints)
|
|
242
183
|
|
|
243
|
-
|
|
244
|
-
|
|
245
|
-
Hard constraints enforced by the builder, loader, and runtime:
|
|
184
|
+
Enforced by the builder, loader, and runtime:
|
|
246
185
|
|
|
247
186
|
1. **`.run()` required** — the builder must have a `.run(async (ctx) => { ... })` call.
|
|
248
187
|
2. **`.compile()` required** — the chain must end with `.compile()`.
|
|
@@ -269,10 +208,14 @@ Every workflow pattern maps directly to TypeScript code:
|
|
|
269
208
|
| Return data from session | `const h = await ctx.stage(opts, {}, {}, async (s) => { return value; }); h.result` |
|
|
270
209
|
| Data flow between sessions | `s.save()` to persist → `s.transcript(handle)` or `s.transcript("name")` to retrieve |
|
|
271
210
|
| Deterministic computation (no LLM) | Plain TypeScript inside `.run()` or inside a session callback |
|
|
272
|
-
| Subagent orchestration | Claude:
|
|
211
|
+
| Subagent orchestration | Claude: `--agent` via `chatFlags` (interactive) or `agent` SDK option (headless); Copilot: `{ agent: "name" }` in sessionOpts; OpenCode: `agent` param in `s.client.session.prompt()` |
|
|
273
212
|
| Per-session configuration | Pass `clientOpts` (2nd arg) and `sessionOpts` (3rd arg) to `ctx.stage()` |
|
|
274
213
|
|
|
275
|
-
For full pattern examples with code, see `references/control-flow.md`
|
|
214
|
+
For full pattern examples with code, see `references/control-flow.md`
|
|
215
|
+
(loops, conditionals, review/fix, graph topology, headless fan-out),
|
|
216
|
+
`references/state-and-data-flow.md` (data passing, file coordination,
|
|
217
|
+
transcript compression), and `references/computation-and-validation.md`
|
|
218
|
+
(parsing, validation, quality gates).
|
|
276
219
|
|
|
277
220
|
## Authoring Process
|
|
278
221
|
|
|
@@ -292,24 +235,16 @@ Map the user's intent to sessions and patterns:
|
|
|
292
235
|
| Does the workflow need user input? | SDK-specific user input APIs (see `references/user-input.md`) |
|
|
293
236
|
| Do any steps need a specific model? | SDK-specific session config (see `references/session-config.md`) |
|
|
294
237
|
|
|
295
|
-
Then
|
|
296
|
-
|
|
297
|
-
|
|
298
|
-
|
|
299
|
-
|
|
300
|
-
| Is this task actually viable for agent automation? | `project-development` — validate task-model fit before building |
|
|
301
|
-
| Could any single session exceed context limits? | `context-fundamentals` — budget tokens; split into sub-sessions if needed |
|
|
302
|
-
| Do loops accumulate state that degrades over iterations? | `context-degradation` — add compaction triggers; detect lost-in-middle risk |
|
|
303
|
-
| Are large transcripts passed between sessions? | `context-compression` — summarize at boundaries; preserve key decisions and file paths |
|
|
304
|
-
| Should this be one session or many? | `multi-agent-patterns` — choose coordination topology based on task decomposability |
|
|
305
|
-
| Do sessions coordinate via shared files? | `filesystem-context` — use scratch pads, dynamic loading, file-based handoffs |
|
|
306
|
-
| Does the workflow need automated quality checks? | `evaluation` + `advanced-evaluation` — design rubrics; mitigate judge bias |
|
|
307
|
-
| Does the workflow expose custom tools to agents? | `tool-design` — consolidate tools; write unambiguous descriptions |
|
|
308
|
-
| Does the workflow need cross-run knowledge retention? | `memory-systems` — choose persistence layer based on retrieval needs |
|
|
238
|
+
Then walk the **Design Advisory Skills** table above (§"Design Advisory
|
|
239
|
+
Skills") — for each row whose trigger applies to your workflow, pull that
|
|
240
|
+
skill in *before* writing code. Catching architectural and prompt-quality
|
|
241
|
+
issues at design time is far cheaper than catching them in the first failed
|
|
242
|
+
end-to-end run.
|
|
309
243
|
|
|
310
244
|
### 2. Choose the Target Agent
|
|
311
245
|
|
|
312
|
-
Use `.for<"agent">()` on the builder to narrow all context types and get
|
|
246
|
+
Use `.for<"agent">()` on the builder to narrow all context types and get
|
|
247
|
+
correct `s.client`/`s.session` types. Call `.for()` **before** `.run()`:
|
|
313
248
|
|
|
314
249
|
| Agent | Builder Chain | Primary Session API |
|
|
315
250
|
|-------|---------------|---------------------|
|
|
@@ -317,9 +252,13 @@ Use `.for<"agent">()` on the builder to narrow all context types and get correct
|
|
|
317
252
|
| Copilot | `defineWorkflow({...}).for<"copilot">()` | `s.session.send({ prompt })` — the runtime wraps `send` to block until `session.idle` with no timeout (see `failure-modes.md` §F10); do not use `sendAndWait` in Atomic workflows |
|
|
318
253
|
| OpenCode | `defineWorkflow({...}).for<"opencode">()` | `s.client.session.prompt({ sessionID: s.session.id, parts: [...] })` |
|
|
319
254
|
|
|
320
|
-
The runtime manages client/session lifecycle automatically. For native SDK
|
|
255
|
+
The runtime manages client/session lifecycle automatically. For native SDK
|
|
256
|
+
types and advanced APIs, import directly from the provider packages
|
|
257
|
+
(`@github/copilot-sdk`, `@anthropic-ai/claude-agent-sdk`, `@opencode-ai/sdk/v2`).
|
|
321
258
|
|
|
322
|
-
For cross-agent support, create one workflow file per agent under
|
|
259
|
+
For cross-agent support, create one workflow file per agent under
|
|
260
|
+
`.atomic/workflows/<name>/<agent>/index.ts`. Use shared helper modules for
|
|
261
|
+
SDK-agnostic logic in a sibling `helpers/` directory:
|
|
323
262
|
|
|
324
263
|
```
|
|
325
264
|
.atomic/workflows/<name>/
|
|
@@ -334,23 +273,20 @@ For cross-agent support, create one workflow file per agent under `.atomic/workf
|
|
|
334
273
|
|
|
335
274
|
### 3. Write the Workflow File
|
|
336
275
|
|
|
337
|
-
|
|
338
|
-
|
|
339
|
-
|
|
340
|
-
|
|
341
|
-
|
|
342
|
-
|---------|--------|---------|----------|
|
|
343
|
-
| Send prompt | `s.session.query(prompt)` | `s.session.send({ prompt })` | `s.client.session.prompt({ sessionID: s.session.id, parts: [{ type: "text", text: prompt }] })` |
|
|
344
|
-
| Save output | `s.save(s.sessionId)` | `s.save(await s.session.getMessages())` | `s.save(result.data!)` |
|
|
345
|
-
| Timeout | Per-query defaults via sessionOpts | N/A (`send` has no timeout; `sendAndWait` accepts optional timeout, default 60s) | N/A |
|
|
346
|
-
| Context model | Tmux pane (accumulates across turns) | Fresh per `ctx.stage()` | Fresh per `ctx.stage()` |
|
|
347
|
-
| Extract text | `extractAssistantText(result, 0)` (uses `SessionMessage[]`) | `getAssistantText(messages)` (see `failure-modes.md` F1) | `extractResponseText(result.data!.parts)` (see `failure-modes.md` F3) |
|
|
276
|
+
Write the workflow file using the SDK-specific patterns. See
|
|
277
|
+
`references/getting-started.md` for full quick-start examples for all 3
|
|
278
|
+
SDKs (send/save/extract patterns, idle handling), and
|
|
279
|
+
`references/agent-sessions.md` for per-SDK API details and lifecycle
|
|
280
|
+
caveats.
|
|
348
281
|
|
|
349
|
-
The SDK ships two builtin workflows
|
|
350
|
-
|
|
351
|
-
- **`
|
|
282
|
+
The SDK ships two builtin workflows in `src/sdk/workflows/builtin/` as
|
|
283
|
+
production reference implementations across all 3 SDKs:
|
|
284
|
+
- **`ralph`** — iterative plan → orchestrate → review → debug loop.
|
|
285
|
+
- **`deep-research-codebase`** — deterministic scout → parallel explorers →
|
|
286
|
+
aggregator.
|
|
352
287
|
|
|
353
|
-
|
|
288
|
+
They demonstrate shared helpers, context-aware prompt building, deterministic
|
|
289
|
+
heuristics, and cross-SDK adaptation.
|
|
354
290
|
|
|
355
291
|
### 4. Type-Check the Workflow
|
|
356
292
|
|
|
@@ -377,147 +313,22 @@ atomic workflow -a <agent>
|
|
|
377
313
|
atomic workflow -n <workflow-name> -a <agent> -d "<your prompt>"
|
|
378
314
|
```
|
|
379
315
|
|
|
380
|
-
## Running
|
|
381
|
-
|
|
382
|
-
|
|
383
|
-
|
|
384
|
-
|
|
385
|
-
|
|
386
|
-
|
|
387
|
-
|
|
388
|
-
|
|
389
|
-
-
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
|
|
393
|
-
|
|
394
|
-
|
|
395
|
-
|
|
396
|
-
|
|
397
|
-
|
|
398
|
-
|
|
399
|
-
- Whether the workflow the user named actually exists.
|
|
400
|
-
- What other workflows are available (so you can suggest close matches on a typo).
|
|
401
|
-
- Source + metadata for every discoverable workflow (local vs. global vs. builtin).
|
|
402
|
-
|
|
403
|
-
Skipping this step is how you end up guessing a name, typing it into `atomic workflow -n <name>`, and getting a `workflow not found` error you could have predicted. List first, decide second, run third.
|
|
404
|
-
|
|
405
|
-
If the user's request is ambiguous ("run the research one"), the list output is also how you show them the candidates so they can pick — present the matching names and ask with AskUserQuestion.
|
|
406
|
-
|
|
407
|
-
### If the workflow doesn't exist: offer to create it
|
|
408
|
-
|
|
409
|
-
When the listed workflows don't include what the user asked for:
|
|
410
|
-
|
|
411
|
-
1. **Tell the user explicitly** — "I don't see a `<name>` workflow in `.atomic/workflows/` or `~/.atomic/workflows/`. Available: \<short list from `atomic workflow list`>."
|
|
412
|
-
2. **Check for typos first** — if one of the listed names is a close match, surface it via AskUserQuestion ("Did you mean `<close-match>`?") before offering to author anything.
|
|
413
|
-
3. **Offer to create it** — ask with AskUserQuestion: "Want me to create a `<name>` workflow first?" with choices `Yes, create it` / `No, pick from the list` / `No, cancel`.
|
|
414
|
-
4. **If yes → switch modes** — hand off to the authoring flow in the [Authoring Process](#authoring-process) section above (Steps 1-5). Interview the user for intent, write the file at `.atomic/workflows/<name>/<agent>/index.ts`, typecheck it, *then* come back to this runner section and invoke it. Do not skip the typecheck — an uncompiled workflow won't run.
|
|
415
|
-
5. **If no → stop** — don't fabricate a command that will fail. Let the user redirect you.
|
|
416
|
-
|
|
417
|
-
Never invent a workflow name or silently fall back to a different workflow. If the thing the user asked for doesn't exist, the correct answer is to say so and offer concrete next steps.
|
|
418
|
-
|
|
419
|
-
### Collecting inputs with AskUserQuestion
|
|
420
|
-
|
|
421
|
-
Once you've confirmed the workflow exists, you need to know two things about its invocation shape:
|
|
422
|
-
|
|
423
|
-
1. **Does it declare a `prompt` input?** If so, it's free-form — you pass a positional string.
|
|
424
|
-
2. **Does it declare structured inputs?** If so, you pass `--<field>=<value>` flags, one per required field.
|
|
425
|
-
|
|
426
|
-
**Use `atomic workflow inputs <name> -a <agent>` to get the schema.** This prints a JSON envelope with every field's `name`, `type`, `required`, `default`, `description`, and (for enums) `values` — exactly what AskUserQuestion needs. The `freeform: true` flag tells you whether the workflow takes a positional prompt vs. structured flags, with a synthetic `prompt` field included so the JSON shape is uniform either way.
|
|
427
|
-
|
|
428
|
-
```bash
|
|
429
|
-
atomic workflow inputs gen-spec -a claude
|
|
430
|
-
# {"workflow":"gen-spec","agent":"claude","freeform":false,
|
|
431
|
-
# "inputs":[{"name":"research_doc","type":"string","required":true,...},
|
|
432
|
-
# {"name":"focus","type":"enum","values":["minimal","standard","exhaustive"],"default":"standard"}]}
|
|
433
|
-
```
|
|
434
|
-
|
|
435
|
-
Why this command instead of reading the source file: `inputs` is the contract the CLI actually validates against. It survives refactors, handles built-in workflows that aren't in the project tree, and never falls out of sync with the runtime. Reading TypeScript source is a fallback for the rare case where the command can't resolve the workflow.
|
|
436
|
-
|
|
437
|
-
Once you have the schema, use the **AskUserQuestion tool** to collect any values the user hasn't already provided in their message. One question per missing input field. For enum fields, pass the declared `values` as multiple-choice options so the user sees exactly what's allowed. Keep questions tight and purposeful — if the user's message already answers a question, don't ask it again.
|
|
438
|
-
|
|
439
|
-
Skip AskUserQuestion entirely when:
|
|
440
|
-
- The user already supplied every required value in their message ("run ralph on 'add OAuth to the API'" — the prompt is right there).
|
|
441
|
-
- The workflow declares no required inputs and needs no prompt.
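
The schema-to-questions step above can be sketched as a small pure function. This is a minimal sketch, not the CLI's own code: the `InputsEnvelope` shape is inferred from the sample output, the `Question` shape and the `questionsFor` name are hypothetical, and the real AskUserQuestion payload may differ.

```typescript
// Envelope shape inferred from the sample `atomic workflow inputs` output.
// Anything not visible in that sample is an assumption.
interface WorkflowInputField {
  name: string;
  type: string;
  required?: boolean;
  default?: string;
  description?: string;
  values?: string[]; // present for enum fields
}

interface InputsEnvelope {
  workflow: string;
  agent: string;
  freeform: boolean;
  inputs: WorkflowInputField[];
}

// Hypothetical question shape, for illustration only.
interface Question {
  field: string;
  text: string;
  choices?: string[];
}

// One question per required field the user has not already answered.
function questionsFor(
  envelope: InputsEnvelope,
  provided: Record<string, string>,
): Question[] {
  return envelope.inputs
    .filter((f) => f.required === true && provided[f.name] === undefined)
    .map((f) => {
      const q: Question = {
        field: f.name,
        text: f.description ?? `Value for ${f.name}?`,
      };
      // Enum fields become multiple-choice using the declared values.
      if (f.type === "enum" && f.values) {
        q.choices = f.values;
      }
      return q;
    });
}
```

An empty result corresponds to the "skip AskUserQuestion entirely" case: every required value was already supplied, or none is required.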
|
|
442
|
-
|
|
443
|
-
### End-to-end recipe
|
|
444
|
-
|
|
445
|
-
1. **List available workflows** — run `atomic workflow list`. Always. This is your ground truth.
|
|
446
|
-
2. **Resolve the target**:
|
|
447
|
-
- Exact match in the list → continue.
|
|
448
|
-
- Close match → confirm via AskUserQuestion before proceeding.
|
|
449
|
-
- No match → tell the user what's available and offer to author it (see previous section). If they decline, stop.
|
|
450
|
-
3. **Discover the inputs schema** — run `atomic workflow inputs <name> -a <agent>` and parse the JSON.
|
|
451
|
-
4. **Ask for missing inputs** — use AskUserQuestion, one question per unanswered required field. Enums become multiple-choice.
|
|
452
|
-
5. **Invoke** — build one of these commands:
|
|
453
|
-
- Free-form: `atomic workflow -n <name> "<prompt>"`
|
|
454
|
-
- Structured: `atomic workflow -n <name> --<field1>=<value1> --<field2>=<value2>`
|
|
455
|
-
6. **Report the session name** the CLI printed and tell the user: "attach any time with `atomic workflow session connect <session>` — or `atomic workflow session list` to see what's running."
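
Step 5 of the recipe can be sketched as follows. This is an illustrative helper under stated assumptions: `buildWorkflowArgv` is a hypothetical name, and the only CLI forms it emits are the two shown in step 5. It returns an argv array rather than a shell string so that prompts containing spaces or quotes need no escaping.

```typescript
// Build the argv for `atomic workflow -n <name> ...` as described in step 5.
// `freeform` comes from the inputs envelope; `values` holds the collected answers.
function buildWorkflowArgv(
  name: string,
  freeform: boolean,
  values: Record<string, string>,
): string[] {
  const argv: string[] = ["atomic", "workflow", "-n", name];
  if (freeform) {
    // Free-form workflows take the prompt as a single positional string.
    const prompt = values["prompt"];
    if (prompt === undefined || prompt === "") {
      throw new Error(`workflow "${name}" is free-form but no prompt was given`);
    }
    argv.push(prompt);
    return argv;
  }
  // Structured workflows take one --<field>=<value> flag per collected field.
  for (const field of Object.keys(values)) {
    argv.push(`--${field}=${values[field]}`);
  }
  return argv;
}
```

Pass the resulting array to whatever process-spawning call your environment provides, then report the session name the CLI prints (step 6).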
|
|
456
|
-
|
|
457
|
-
### Monitoring a running workflow
|
|
458
|
-
|
|
459
|
-
Detached workflows return immediately with a session name; the actual work runs in the background on the atomic tmux socket. Use `atomic workflow status` to check whether the workflow is still running, has completed, errored out, or paused for human input — without attaching to its TUI.
|
|
460
|
-
|
|
461
|
-
```bash
|
|
462
|
-
atomic workflow status atomic-wf-claude-gen-spec-a1b2c3d4
|
|
463
|
-
# {"id":"atomic-wf-claude-gen-spec-a1b2c3d4","overall":"in_progress","alive":true,
|
|
464
|
-
# "sessions":[{"name":"orchestrator","status":"running",...}],...}
|
|
465
|
-
```
|
|
466
|
-
|
|
467
|
-
Four overall states the agent must handle distinctly:
|
|
468
|
-
|
|
469
|
-
| Status | Meaning | What you should do |
|
|
470
|
-
|---|---|---|
|
|
471
|
-
| `in_progress` | The orchestrator is running and no stage is paused | Wait, or report progress to the user |
|
|
472
|
-
| `needs_review` | At least one stage is paused for human input (HIL) — Copilot `ask_user`, OpenCode `question.asked`, Copilot/MCP elicitation | **Surface this to the user immediately** — they need to attach with `atomic workflow session connect <id>` to respond, otherwise the workflow stalls indefinitely |
|
|
473
|
-
| `completed` | Workflow finished successfully | Report success and summarize the output |
|
|
474
|
-
| `error` | Fatal error or a stage failed | Report the `fatalError` field and offer to investigate logs |
|
|
475
|
-
|
|
476
|
-
`needs_review` outranks `completed` so a HIL pause near the end is never reported as done while still waiting on a human. A dead orchestrator with a stale snapshot is automatically downgraded to `error`.
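
The table above maps naturally onto an exhaustive switch. The following is a minimal sketch, assuming the status payload parses into the fields visible in the sample output plus the `fatalError` field named in the table (its type is assumed to be a string); `describeStatus` is a hypothetical helper, not part of the CLI.

```typescript
type Overall = "in_progress" | "needs_review" | "completed" | "error";

// Fields taken from the sample `atomic workflow status` output.
// `fatalError` is named in the table; treating it as an optional string is an assumption.
interface WorkflowStatus {
  id: string;
  overall: Overall;
  alive: boolean;
  fatalError?: string;
}

// Turn a parsed status payload into the next message for the user.
function describeStatus(status: WorkflowStatus): string {
  switch (status.overall) {
    case "needs_review":
      // Surface immediately: the workflow stalls until a human responds.
      return `Workflow ${status.id} is paused for human input. Attach with: atomic workflow session connect ${status.id}`;
    case "in_progress":
      return `Workflow ${status.id} is still running.`;
    case "completed":
      return `Workflow ${status.id} finished successfully.`;
    case "error":
      return `Workflow ${status.id} failed: ${status.fatalError ?? "no fatalError reported"}`;
  }
}
```

Because the switch covers every member of `Overall`, the TypeScript compiler flags any state added to the union later that this function does not handle.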
|
|
477
|
-
|
|
478
|
-
Omit the id to list every running workflow at once: `atomic workflow status`. This is useful when checking on multiple parallel runs, or when the user just asks "what's running?".
|
|
479
|
-
|
|
480
|
-
### Cleaning up sessions
|
|
481
|
-
|
|
482
|
-
When the user is done with a workflow — or you launched one detached and it's no longer needed — tear it down with `-y` so no confirmation prompt blocks you:
|
|
483
|
-
|
|
484
|
-
```bash
|
|
485
|
-
atomic session kill atomic-wf-claude-gen-spec-a1b2c3d4 -y
|
|
486
|
-
```
|
|
487
|
-
|
|
488
|
-
The `-y` flag is mandatory for agent use. Without it, the CLI calls `@clack/prompts confirm`, which expects a TTY and will hang indefinitely in a non-interactive context. The same flag works for `atomic workflow session kill` and `atomic chat session kill`. Without an id, `kill -y` tears down every in-scope session, so only do that when the user has asked to stop everything.
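
One way to make the `-y` rule hard to forget is to build the kill command through a helper that always appends it. This is a sketch with a hypothetical `buildKillArgv` name; it deliberately refuses the id-less form, since that tears down every in-scope session.

```typescript
// Build a non-interactive kill command for a single session.
// `-y` is always appended so the confirmation prompt can never block an agent.
function buildKillArgv(sessionId: string): string[] {
  if (sessionId.trim() === "") {
    // Refuse the id-less form here: without an id, `kill -y` stops everything in scope.
    throw new Error("session id is required; stop-everything must be requested explicitly by the user");
  }
  return ["atomic", "session", "kill", sessionId, "-y"];
}
```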
|
|
489
|
-
|
|
490
|
-
### Worked examples
|
|
491
|
-
|
|
492
|
-
**Example A — workflow exists, structured inputs**
|
|
493
|
-
|
|
494
|
-
> **User:** "run gen-spec on research/docs/2026-04-11-auth.md"
|
|
495
|
-
|
|
496
|
-
1. Run `atomic workflow list`. Output includes `gen-spec` under local. Good.
|
|
497
|
-
2. Target resolved exactly: `gen-spec`.
|
|
498
|
-
3. Run `atomic workflow inputs gen-spec -a claude`. Parse the JSON: `research_doc` (required string — already given), `focus` (required enum of `minimal|standard|exhaustive`, default `standard`), `notes` (optional text).
|
|
499
|
-
4. Ask via AskUserQuestion once: "What focus level for the spec?" with choices `minimal`, `standard`, `exhaustive`. User picks `standard`. Skip `notes` since it's optional.
|
|
500
|
-
5. Run: `atomic workflow -n gen-spec --research_doc=research/docs/2026-04-11-auth.md --focus=standard`
|
|
501
|
-
6. The CLI prints a session name like `atomic-wf-claude-gen-spec-a1b2c3d4`. Tell the user: "Started in the background. Attach with `atomic workflow session connect atomic-wf-claude-gen-spec-a1b2c3d4`, check progress with `atomic workflow status atomic-wf-claude-gen-spec-a1b2c3d4`, or stop it with `atomic session kill atomic-wf-claude-gen-spec-a1b2c3d4 -y`."
|
|
502
|
-
|
|
503
|
-
**Example B — workflow does not exist**
|
|
504
|
-
|
|
505
|
-
> **User:** "run the security-audit workflow on src/auth"
|
|
506
|
-
|
|
507
|
-
1. Run `atomic workflow list`. Available: `ralph`, `deep-research-codebase`, `gen-spec`, `review-to-merge`. No `security-audit`.
|
|
508
|
-
2. Tell the user: "I don't see a `security-audit` workflow. Available: ralph, deep-research-codebase, gen-spec, review-to-merge."
|
|
509
|
-
3. Ask via AskUserQuestion: "Want me to create a `security-audit` workflow first?" with choices `Yes, create it`, `No, use one of the existing workflows`, `No, cancel`.
|
|
510
|
-
4. If **Yes**: switch to the Authoring Process — interview the user for what the workflow should do, draft it, typecheck, *then* return here and invoke it.
|
|
511
|
-
5. If **No, use existing**: ask which one via AskUserQuestion over the listed options, then continue from step 3 of the recipe.
|
|
512
|
-
6. If **Cancel**: stop, no command runs.
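
The target-resolution step behind both examples can be sketched as a three-way result. The names below are hypothetical, and the "close match" heuristic (case-insensitive equality or substring overlap) is an assumption for illustration; the original text does not define what counts as close.

```typescript
type Resolution =
  | { kind: "exact"; name: string }
  | { kind: "close"; candidates: string[] }
  | { kind: "none"; available: string[] };

// Resolve a requested workflow name against the output of `atomic workflow list`.
function resolveWorkflow(requested: string, available: string[]): Resolution {
  if (available.indexOf(requested) !== -1) {
    return { kind: "exact", name: requested };
  }
  const wanted = requested.trim().toLowerCase();
  if (wanted === "") {
    return { kind: "none", available };
  }
  // Assumed closeness heuristic: same name ignoring case, or one contains the other.
  const candidates = available.filter((name) => {
    const n = name.toLowerCase();
    return n === wanted || n.indexOf(wanted) !== -1 || wanted.indexOf(n) !== -1;
  });
  if (candidates.length > 0) {
    // Close matches must be confirmed with the user before proceeding.
    return { kind: "close", candidates };
  }
  // No match: report what is available and offer to author the workflow.
  return { kind: "none", available };
}
```

An `exact` result continues to schema discovery, `close` triggers a confirmation question, and `none` leads into the Example B flow.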
|
|
513
|
-
|
|
514
|
-
### Common mistakes to avoid
|
|
515
|
-
|
|
516
|
-
- **Skipping `atomic workflow list`** — leads to guessing and `workflow not found` errors. It's a one-line command; always run it.
|
|
517
|
-
- **Inventing a workflow name** — if it's not in the list, it doesn't exist. Say so and offer to author it.
|
|
518
|
-
- **Reading the workflow source file to discover inputs** — use `atomic workflow inputs <name> -a <agent>` instead. JSON, no TS parsing required, always in sync with the runtime. Source-file reads are a fallback, not a default.
|
|
519
|
-
- **Asking everything at once** — let AskUserQuestion drive one question per field. Enum fields are multiple-choice, not free text.
|
|
520
|
-
- **Re-asking what the user already said** — read their message first.
|
|
521
|
-
- **Forgetting to report the session name** — the user needs it to reattach and to query status later.
|
|
522
|
-
- **Leaving `needs_review` unreported** — when `atomic workflow status` returns `needs_review`, surface it to the user right away. The workflow is blocked on human input and will sit forever otherwise.
|
|
523
|
-
- **Calling `session kill` without `-y`** — the prompt hangs in a non-interactive context. Always pass `-y` from an agent.
|
|
316
|
+
## Running an Existing Workflow
|
|
317
|
+
|
|
318
|
+
If the user asks you to **run** (or "kick off" / "start" / "execute") a
|
|
319
|
+
workflow — not author one — the workflow already exists on disk and you
|
|
320
|
+
just need to invoke it correctly. That's a different playbook from
|
|
321
|
+
authoring.
|
|
322
|
+
|
|
323
|
+
**Read `references/running-workflows.md`.** It covers:
|
|
324
|
+
|
|
325
|
+
- Why you don't usually need `-a` or `-d` (env-driven auto-detach).
|
|
326
|
+
- Why you must run `atomic workflow list` first.
|
|
327
|
+
- How to handle missing workflows (offer to author, not fabricate).
|
|
328
|
+
- Using `atomic workflow inputs <name> -a <agent>` to discover the schema
|
|
329
|
+
and drive AskUserQuestion.
|
|
330
|
+
- The six-step invocation recipe.
|
|
331
|
+
- Monitoring with `atomic workflow status` — and why `needs_review` must be
|
|
332
|
+
surfaced immediately.
|
|
333
|
+
- Tearing down with `atomic session kill -y` (the `-y` is mandatory).
|
|
334
|
+
- Worked examples for "workflow exists" and "workflow doesn't exist".
|