@bastani/atomic 0.5.23 → 0.5.24

@@ -1,79 +1,66 @@
1
1
  ---
2
2
  name: workflow-creator
3
- description: Create multi-agent workflows for Atomic CLI using defineWorkflow().run().compile() with ctx.stage() for session orchestration across Claude, Copilot, and OpenCode SDKs, AND invoke, monitor, and tear down existing workflows on behalf of the user. Use whenever the user wants to create, edit, debug, or RUN workflows ("run the ralph workflow", "kick off deep-research-codebase", "start the gen-spec workflow"), check on a running workflow ("is it done yet?", "what's the status?", "did it error out?"), kill a workflow or session, build agent pipelines, define multi-stage automations, set up review loops, declare workflow inputs, run background/headless stages, or mentions .atomic/workflows/, defineWorkflow, ctx.stage, ctx.inputs, headless, background stages, the atomic workflow picker, `atomic workflow -n`, `atomic workflow inputs`, `atomic workflow status`, or `atomic session kill`.
3
+ description: Create AND run Atomic CLI workflows (`defineWorkflow().run().compile()` with `ctx.stage()`) across Claude, Copilot, and OpenCode SDKs. Use for **authoring** when the user wants to build, edit, debug, or design agent pipelines — multi-stage automations, review/fix loops, parallel fan-out, headless/background stages, `.atomic/workflows/` files, `defineWorkflow`, `ctx.stage`, `ctx.inputs`, or declared `WorkflowInput` schemas. Use for **running** when the user wants to kick off, execute, monitor, or tear down an existing workflow — "run the ralph workflow", "start gen-spec", "is it done yet?", "what's the status?", "kill the session", or any mention of `atomic workflow -n`, `atomic workflow inputs`, `atomic workflow status`, the picker, or `atomic session kill`.
4
4
  ---
5
5
 
6
6
  # Workflow Creator
7
7
 
8
- You are a workflow architect specializing in the Atomic CLI `defineWorkflow().run().compile()` API. Your role is to translate user intent into well-structured workflow files that orchestrate multiple coding agent sessions using **programmatic SDK code** — Claude Agent SDK, Copilot SDK, and OpenCode SDK. Sessions are spawned dynamically via `ctx.stage(opts, clientOpts, sessionOpts, callback)` inside the `.run()` callback, using native TypeScript control flow (loops, conditionals, `Promise.all()`) for orchestration. The runtime auto-creates the SDK client and session, injects them as `s.client` and `s.session`, runs the callback, then auto-cleans up.
8
+ You are a workflow architect specializing in the Atomic CLI `defineWorkflow().run().compile()` API. You translate user intent into well-structured workflow files that orchestrate multiple coding agent sessions using **programmatic SDK code** — Claude Agent SDK, Copilot SDK, and OpenCode SDK. Sessions are spawned dynamically via `ctx.stage(stageOpts, clientOpts, sessionOpts, callback)` inside the `.run()` callback, using native TypeScript control flow (loops, conditionals, `Promise.all()`) for orchestration. The runtime auto-creates the SDK client and session, injects them as `s.client` and `s.session`, runs the callback, then auto-cleans up.
9
9
 
10
- You also serve as a **context engineering advisor**, applying principles from a suite of design skills to make informed architectural decisions about session structure, data flow, prompt composition, and quality assurance. Use these skills to elevate workflows beyond simple pipelines into robust, context-aware systems that respect token budgets, prevent degradation, and produce verifiable results.
10
+ You also serve as a **context engineering advisor**: use the design skills listed under "Design Advisory Skills" to make informed architectural decisions about session structure, data flow, prompt composition, and quality assurance.
11
+
12
+ Two user journeys live in this skill:
13
+
14
+ - **Authoring** a new workflow (or editing/debugging an existing one) → read on below.
15
+ - **Running** a workflow on the user's behalf ("run ralph on this spec", "is it done yet?", "kill it") → go to `references/running-workflows.md`.
11
16
 
12
17
  ## Reference Files
13
18
 
14
- Load the topic-specific reference files from `references/` based on priority. **Always load Tier 1 files.** Load Tier 2-3 files when the task requires that topic.
19
+ Load references on demand. **Only `getting-started.md` is always-load.** Everything else is conditional: pull it in when the task matches the "Load when" column.
15
20
 
16
- | Tier | File | When to load |
17
- |---|---|---|
18
- | **1** | `getting-started.md` | **Always** — quick-start examples for all 3 SDKs, SDK exports, `SessionContext` reference |
19
- | **1** | `failure-modes.md` | **Always for multi-session workflows**: 15 catalogued failures (silent + loud) with wrong-vs-right patterns and a pre-ship design checklist |
20
- | **1** | `workflow-inputs.md` | **Always when declaring structured inputs or documenting how a workflow is invoked** — `WorkflowInput` schema, field-type selection, picker + CLI flag semantics, builtin-protection rules, invocation cheat sheet |
21
- | **2** | `agent-sessions.md` | When writing SDK calls — `s.session.query()` (Claude), `s.session.send()` (Copilot), `s.client.session.prompt()` (OpenCode); includes critical pitfalls on session lifecycle and when to use `sendAndWait` with explicit timeouts |
22
- | **2** | `control-flow.md` | When using loops, conditionals, parallel execution, or review/fix patterns |
23
- | **2** | `state-and-data-flow.md` | When passing data between sessions — `s.save()`, `s.transcript()`, `s.getMessages()`, file persistence, transcript compression |
24
- | **3** | `computation-and-validation.md` | When adding deterministic computation, response parsing, validation, quality gates, or file I/O |
25
- | **3** | `session-config.md` | When configuring model, tools, permissions, hooks, or structured output per SDK |
26
- | **3** | `user-input.md` | When collecting user input **mid-workflow** (not at invocation time): Claude `canUseTool`, Copilot `onElicitationRequest`, OpenCode TUI control. For invocation-time inputs, see `workflow-inputs.md`. |
27
- | **3** | `discovery-and-verification.md` | When setting up workflow file structure, validation, or TypeScript config |
21
+ | File | Load when |
22
+ |---|---|
23
+ | `getting-started.md` | **Always** — quick-start examples for all 3 SDKs, SDK exports, `SessionContext` field reference |
24
+ | `failure-modes.md` | Before shipping any multi-session workflow. 16 catalogued failures (silent + loud) with wrong-vs-right patterns and a pre-ship design checklist |
25
+ | `workflow-inputs.md` | When declaring structured inputs or documenting how a workflow is invoked — `WorkflowInput` schema, field-type selection, picker + CLI flag semantics, builtin-protection rules |
26
+ | `agent-sessions.md` | When writing SDK calls — `s.session.query()` (Claude), `s.session.send()` (Copilot), `s.client.session.prompt()` (OpenCode); includes session-lifecycle pitfalls and when to use `sendAndWait` with explicit timeouts |
27
+ | `control-flow.md` | When using loops, conditionals, parallel execution (`Promise.all`), headless fan-out, or review/fix patterns |
28
+ | `state-and-data-flow.md` | When passing data between sessions — `s.save()`, `s.transcript()`, `s.getMessages()`, file persistence, transcript compression |
29
+ | `running-workflows.md` | When the user asks you to **run** an existing workflow rather than author one |
30
+ | `computation-and-validation.md` | When adding deterministic computation, response parsing, validation, quality gates, or file I/O |
31
+ | `session-config.md` | When configuring model, tools, permissions, hooks, or structured output per SDK |
32
+ | `user-input.md` | When collecting user input **mid-workflow** (not at invocation time; use `workflow-inputs.md` for that) |
33
+ | `discovery-and-verification.md` | When setting up workflow file structure, validation, or TypeScript config |
28
34
 
29
35
  ## Information Flow Is a First-Class Design Concern
30
36
 
31
37
  **A workflow is an information flow problem, not a sequence of prompts.**
32
- Before you write a single `ctx.stage()` call, answer these three questions
33
- for every session boundary in your workflow:
34
-
35
- 1. **What context does this session need to succeed?** The original user
36
- spec? Prior stage output? File paths? Git state? A summary?
37
- 2. **How will that context reach the session?** Built into the prompt?
38
- Read from a file? Retrieved via a tool? Kept inside one continued
39
- multi-turn stage instead of crossing a stage boundary?
40
- 3. **What happens if the context window fills up?** Compact? Clear? Spawn
41
- a sub-session? Offload to files?
42
-
43
- If you can't answer all three crisply, you don't have a workflow — you
44
- have a sequence of hopeful prompts that will fail in non-deterministic
45
- ways at scale.
46
-
47
- ### Session lifecycle controls information flow
48
-
49
- | Lifecycle state | Context visible to the model | When it happens |
50
- |---|---|---|
51
- | **Fresh** | **Nothing** — empty conversation | Each new `ctx.stage()` call — the runtime creates a new session |
52
- | **Continued** | Everything sent so far in this session | Additional turns within the same stage callback |
53
- | **Closed** | Gone from the live client; persisted only through what you explicitly saved | Runtime auto-cleanup after the stage callback returns |
54
-
55
- **Closing a session and creating a new one wipes all in-session context.**
56
- The new session knows *only* what you put in its first prompt.
57
-
58
- Claude is different: the runtime reuses a single persistent tmux pane, so every turn within a stage accumulates in the same conversation. But for Copilot and OpenCode, **every `ctx.stage()` is a fresh conversation** — you must explicitly forward context across the boundary.
59
-
60
- ### Avoiding context loss
38
+ Before writing any `ctx.stage()` call, answer for every session boundary:
61
39
 
62
- Three reliable patterns (they compose; using 1+2 together is common). See `references/agent-sessions.md` for detailed examples and wrong-vs-right code patterns.
40
+ - What context does this session need, how will it reach the session
41
+ (prompt handoff, file, single multi-turn stage), and what happens if the
42
+ context window fills up?
63
43
 
64
- 1. **Explicit prompt handoff**: capture the prior session's output via `s.transcript()` and inject it into the next session's first prompt. Simple, always works.
65
- 2. **External shared state** — write to files, git, or a database; the next session reads from there. Best when data is already structured.
66
- 3. **Keep related turns in one stage callback** — if the next step needs full conversation history, send another turn to `s.session` instead of spawning a new stage. This is the idiomatic way to preserve context.
44
+ For Copilot and OpenCode, every `ctx.stage()` is a fresh conversation;
45
+ Claude reuses a tmux pane per stage. Read these before shipping any
46
+ multi-session workflow:
67
47
 
68
- **Context is finite.** Even within one session, context can overflow. Symptoms: lost-in-middle, repeated questions, forgotten decisions. Compact (summarize prior turns) or clear (drop non-essential turns) before this happens. Consult `context-compression` and `context-optimization` for trade-offs.
69
-
70
- **Load-bearing references for these pitfalls:**
71
- - `references/failure-modes.md` — **read before shipping any multi-session workflow**. Catalogue of 15 silent + loud failures with wrong-vs-right patterns and a pre-ship design checklist.
72
- - `references/agent-sessions.md` §"Critical pitfall: session lifecycle controls what context is available" — full explanation with code examples and the context engineering skill-map.
48
+ - `references/agent-sessions.md` §"Critical pitfall: session lifecycle
49
+ controls what context is available" — lifecycle table, context-loss
50
+ patterns, and per-SDK details.
51
+ - `references/failure-modes.md` — silent + loud failures with wrong-vs-right
52
+ patterns and the pre-ship design checklist.
53
+ - `references/state-and-data-flow.md` — `s.save()`, `s.transcript()`, and
54
+ file-based handoff patterns.
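
As a rough illustration of the "prompt handoff" answer to the questions above, the sketch below builds the first prompt of a fresh session from a prior stage's output and clips it to a budget. `buildHandoffPrompt`, the `<prior_stage_output>` tag, and the character budget are all hypothetical and not part of the Atomic SDK; a character count is only a crude proxy for tokens, and keeping the tail of the output is an assumption that may not suit every workflow.

```typescript
// Hypothetical helper: not part of @bastani/atomic.
// Builds the first prompt for a fresh session from a prior stage's output,
// keeping only the last `maxChars` characters when the output is too long.
function buildHandoffPrompt(
  task: string,
  priorOutput: string,
  maxChars: number,
): string {
  const clipped =
    priorOutput.length > maxChars
      ? "[earlier output truncated]\n" +
        priorOutput.slice(priorOutput.length - maxChars)
      : priorOutput;
  // Wrap the forwarded context in a tag so the model can tell it apart
  // from the instructions.
  return `${task}\n\n<prior_stage_output>\n${clipped}\n</prior_stage_output>`;
}

const handoffPrompt = buildHandoffPrompt(
  "Review the plan.",
  "step 1\nstep 2",
  1000,
);
console.log(handoffPrompt);
```

In a real workflow the prior output would come from something like `s.transcript()` or a saved file, as the references above describe; summarizing rather than clipping is often the better choice when early decisions matter.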
73
55
 
74
56
  ## Design Advisory Skills
75
57
 
76
- Workflow quality depends on two disciplines: **prompt engineering** (crafting clear, structured prompts that each session receives) and **context engineering** (ensuring the right information reaches each session at the right time without exceeding token budgets). Use `prompt-engineer` to improve individual session prompts — clarity, XML structure, few-shot examples, chain-of-thought — and the context engineering skills below to design the information flow between sessions.
58
+ Workflow quality depends on two disciplines: **prompt engineering** (crafting
59
+ clear, structured prompts each session receives) and **context engineering**
60
+ (ensuring the right information reaches each session without exceeding token
61
+ budgets). Use `prompt-engineer` to improve individual session prompts —
62
+ clarity, XML structure, few-shot examples, chain-of-thought — and the
63
+ context engineering skills below to design information flow between sessions.
77
64
 
78
65
  | Design Concern | Skill | Trigger |
79
66
  |---|---|---|
@@ -94,7 +81,11 @@ Workflow quality depends on two disciplines: **prompt engineering** (crafting cl
94
81
 
95
82
  ## How Workflows Work
96
83
 
97
- A workflow is a TypeScript file with a single `.run()` callback that orchestrates agent sessions dynamically. Inside the callback, `ctx.stage()` spawns sessions — each gets its own tmux window and graph node (unless running in headless mode). Native TypeScript handles all control flow: loops, conditionals, `Promise.all()`, `try`/`catch`.
84
+ A workflow is a TypeScript file with a single `.run()` callback that
85
+ orchestrates agent sessions dynamically. Inside the callback, `ctx.stage()`
86
+ spawns sessions — each gets its own tmux window and graph node (unless
87
+ running in headless mode). Native TypeScript handles all control flow:
88
+ loops, conditionals, `Promise.all()`, `try`/`catch`.
98
89
 
99
90
  ```ts
100
91
  import { defineWorkflow, extractAssistantText } from "@bastani/atomic/workflows";
@@ -114,92 +105,41 @@ export default defineWorkflow({
114
105
  .compile();
115
106
  ```
116
107
 
117
- The runtime manages the full session lifecycle — callback return marks completion; throws mark errors. `.compile()` produces a branded `WorkflowDefinition` consumed by the CLI.
108
+ The runtime manages the full session lifecycle — callback return marks
109
+ completion; throws mark errors. `.compile()` produces a branded
110
+ `WorkflowDefinition` consumed by the CLI.
118
111
 
119
112
  ### Background (headless) stages
120
113
 
121
- Stages can run in **headless mode** by passing `{ headless: true }` in `SessionRunOptions`. Headless stages execute the provider SDK **in-process** instead of spawning a tmux window — they are invisible in the workflow graph but tracked via a background task counter in the statusline.
122
-
123
- ```ts
124
- // Headless stage — runs in-process, no tmux window, invisible in graph
125
- await ctx.stage(
126
- { name: "background-analysis", headless: true },
127
- {}, {},
128
- async (s) => {
129
- const result = await s.session.query("Analyze the codebase structure.");
130
- s.save(s.sessionId);
131
- return extractAssistantText(result, 0);
132
- },
133
- );
134
- ```
135
-
136
- **When to use headless stages:**
137
- - Parallel data-gathering tasks that don't need a visible TUI (e.g., codebase research, infrastructure discovery)
138
- - Support tasks that should run alongside visible stages without cluttering the graph
139
- - Any stage where only the result matters, not the live TUI interaction
140
-
141
- **How they work per provider:**
142
- - **Claude**: Uses the Agent SDK `query()` API directly in-process (no tmux pane)
143
- - **Copilot**: SDK spawns its own CLI subprocess internally (no tmux pane needed)
144
- - **OpenCode**: Uses `createOpencode()` to start both server and client in-process
145
-
146
- **Key behaviors:**
147
- - The callback interface is **identical** to interactive stages — `s.client`, `s.session`, `s.save()`, `s.transcript()` all work the same way
148
- - Headless stages are **transparent to graph topology** — they don't consume or update the execution frontier, so `visible → [3 headless] → visible` renders as `visible → visible` in the graph
149
- - Errors in headless stages still fail the workflow — they are tracked and recorded identically to interactive stages
150
- - The `paneId` for headless stages is a virtual identifier: `headless-<name>-<sessionId>`
151
-
152
- **Common pattern — fan-out with headless background stages:**
153
-
154
- ```ts
155
- // Visible stage seeds context
156
- const seed = await ctx.stage({ name: "seed" }, {}, {}, async (s) => { /* ... */ });
157
-
158
- // Three parallel headless stages gather data in the background
159
- const [a, b, c] = await Promise.all([
160
- ctx.stage({ name: "gather-a", headless: true }, {}, {}, async (s) => { /* ... */ }),
161
- ctx.stage({ name: "gather-b", headless: true }, {}, {}, async (s) => { /* ... */ }),
162
- ctx.stage({ name: "gather-c", headless: true }, {}, {}, async (s) => { /* ... */ }),
163
- ]);
164
-
165
- // Visible stage merges background results
166
- await ctx.stage({ name: "merge" }, {}, {}, async (s) => {
167
- await s.session.query(`Merge:\n${a.result}\n${b.result}\n${c.result}`);
168
- s.save(s.sessionId);
169
- });
170
- ```
114
+ Pass `{ headless: true }` in `stageOpts` to run a stage in-process with no
115
+ tmux window or graph node. The callback interface is identical
116
+ (`s.client`, `s.session`, `s.save()`, `s.transcript()` all work). For
117
+ mechanics, fan-out patterns, and graph topology see
118
+ `references/control-flow.md` §"Headless stages" and
119
+ `references/agent-sessions.md` per-SDK "Headless mode" sections.
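
The fan-out shape that headless stages are typically used for can be sketched without any agent SDK. In the block below, `fakeStage` is a hypothetical stand-in for `ctx.stage()`: it only runs the callback and wraps the return value in a handle with a `result` field, mirroring the `h.result` usage documented in this skill. It takes two arguments instead of four and makes no SDK calls, so treat it as an illustration of the control flow only.

```typescript
// Hypothetical stand-ins: these types and fakeStage are NOT the Atomic API.
interface StageOpts {
  name: string;
  headless?: boolean;
}
interface StageHandle<T> {
  name: string;
  result: T;
}

async function fakeStage<T>(
  opts: StageOpts,
  callback: () => Promise<T>,
): Promise<StageHandle<T>> {
  const result = await callback();
  return { name: opts.name, result };
}

async function runFanOut(topics: string[]): Promise<string> {
  // Gatherers start concurrently; Promise.all resolves in input order.
  const handles = await Promise.all(
    topics.map((topic) =>
      fakeStage(
        { name: `gather-${topic}`, headless: true },
        async () => `notes on ${topic}`,
      ),
    ),
  );
  // A single visible stage then merges every gathered result.
  const merged = await fakeStage({ name: "merge" }, async () =>
    handles.map((h) => h.result).join("\n"),
  );
  return merged.result;
}
```

Because `Promise.all` rejects as soon as any gatherer throws, a failure in one background stage surfaces to the caller instead of being silently dropped.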
171
120
 
172
- See `references/control-flow.md` for full headless pattern details and `references/agent-sessions.md` for per-SDK headless session behavior.
121
+ ### Installing the workflow SDK
173
122
 
174
- Workflows are SDK-specific. User-created workflows live in a project with `@bastani/atomic` installed as a dependency, along with the native agent SDK(s) for the provider(s) you target. Install only the SDK(s) you need:
175
-
176
- ```bash
177
- bun add @bastani/atomic # Workflow SDK
178
- bun add @anthropic-ai/claude-agent-sdk # For Claude workflows
179
- bun add @github/copilot-sdk # For Copilot workflows
180
- bun add @opencode-ai/sdk # For OpenCode workflows
181
- ```
182
-
183
- Workflow files live at `.atomic/workflows/<name>/<agent>/index.ts`. Discovery sources: **Local** (`.atomic/workflows/`), **Global** (`~/.atomic/workflows/`), and **Built-in** (SDK-shipped). Built-in names (`ralph`, `deep-research-codebase`) are **reserved** — any local/global workflow with the same name is dropped before resolution. Among non-reserved names, local takes precedence over global. See `references/discovery-and-verification.md` for full discovery paths and validation.
123
+ Install `@bastani/atomic` plus the native SDK(s) you target
124
+ (`@anthropic-ai/claude-agent-sdk`, `@github/copilot-sdk`,
125
+ `@opencode-ai/sdk`). Workflow files live at
126
+ `.atomic/workflows/<name>/<agent>/index.ts`. Full paths, precedence, and
127
+ reserved built-in names live in `references/discovery-and-verification.md`.
184
128
 
185
129
  ### Two context levels
186
130
 
187
- | Context | Available in | Has `client`/`session`/`save`? | Purpose |
188
- |---------|-------------|-------------------------------|---------|
189
- | `WorkflowContext` (`ctx`) | `.run(async (ctx) => ...)` | No | Orchestration: spawn sessions, read transcripts, read `ctx.inputs` |
190
- | `SessionContext` (`s`) | `ctx.stage(opts, clientOpts, sessionOpts, async (s) => ...)` | Yes | Agent work: use `s.client` and `s.session` for SDK calls, save output |
191
-
192
- Both contexts expose typed `inputs` (keys restricted to declared input names), `stage()`, `transcript()`, and `getMessages()`. See `references/getting-started.md` for the full `SessionContext` field reference.
193
-
194
- ### Declared inputs: one API, three invocation surfaces
131
+ `WorkflowContext` (`ctx`) drives orchestration in `.run()`; `SessionContext`
132
+ (`s`) drives agent work inside each stage callback. Full field reference in
133
+ `references/getting-started.md` §"`SessionContext` reference".
195
134
 
196
- Workflows receive user data exclusively through `ctx.inputs` (and `s.inputs` inside stage callbacks).
135
+ ### Declared inputs
197
136
 
198
- Declare `inputs: WorkflowInput[]` inline on `defineWorkflow()`. TypeScript infers literal field names from the array and restricts `ctx.inputs` to only those keys — accessing an undeclared field is a **compile-time error**. The CLI materializes one `--<field>=<value>` flag per entry, validates required fields + enum membership before launching, and the picker renders a form. Three field types: `string` (single-line), `text` (multi-line), `enum` (fixed set).
199
-
200
- Workflows that accept a free-form prompt should declare it explicitly: `{ name: "prompt", type: "text", required: true }`.
201
-
202
- **Load `references/workflow-inputs.md`** for the full schema shape, validation rules, picker semantics, and invocation cheat sheet.
137
+ Workflows receive user data exclusively through `ctx.inputs` / `s.inputs`,
138
+ declared inline as `inputs: WorkflowInput[]` on `defineWorkflow()`.
139
+ TypeScript restricts `ctx.inputs` to declared keys (undeclared access is a
140
+ compile-time error). Load `references/workflow-inputs.md` for schema shape,
141
+ field types (`string` / `text` / `enum`), validation rules, picker
142
+ semantics, and the "declare your prompt input explicitly" pattern.
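
To make the "undeclared access is a compile-time error" point concrete, here is a minimal, self-contained sketch of how a declared array can narrow the keys of an inputs object. `InputDecl`, `InputsOf`, `resolveInputs`, and the `values` field are hypothetical names chosen for illustration; the real `WorkflowInput` type and its validation live in the SDK and may differ.

```typescript
// Hypothetical shapes: illustrative only, not the SDK's WorkflowInput type.
type FieldType = "string" | "text" | "enum";
interface InputDecl {
  readonly name: string;
  readonly type: FieldType;
  readonly required?: boolean;
  readonly values?: readonly string[]; // assumed field for enum choices
}

// Keys are restricted to the literal names found in the declaration tuple.
type InputsOf<T extends readonly InputDecl[]> = Partial<
  Record<T[number]["name"], string>
>;

function resolveInputs<T extends readonly InputDecl[]>(
  decls: T,
  raw: Record<string, string | undefined>,
): InputsOf<T> {
  const out: Record<string, string> = {};
  for (const d of decls) {
    const value = raw[d.name];
    if (value === undefined) {
      if (d.required) throw new Error(`missing required input: ${d.name}`);
      continue;
    }
    if (d.type === "enum" && d.values && d.values.indexOf(value) === -1) {
      throw new Error(`invalid value for ${d.name}: ${value}`);
    }
    out[d.name] = value;
  }
  return out as unknown as InputsOf<T>;
}

// `as const` keeps the names as string literals so the keys can be inferred.
const declared = [
  { name: "prompt", type: "text", required: true },
  { name: "mode", type: "enum", values: ["fast", "thorough"] },
] as const;

const resolved = resolveInputs(declared, {
  prompt: "Summarize the repo",
  mode: "fast",
});
console.log(resolved.prompt, resolved.mode);
// resolved.typo; // would not compile: "typo" is not a declared name
```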
203
143
 
204
144
  ### Invocation surfaces
205
145
 
@@ -214,13 +154,15 @@ Workflows that accept a free-form prompt should declare it explicitly: `{ name:
214
154
  | Kill non-interactively | `atomic session kill <id> -y` | Tear down a workflow/chat session without the confirmation prompt — the form agents use |
215
155
  | Detached (background) | `atomic workflow -n ralph -a claude -d "..."` | Scripted/CI runs where the caller shouldn't block on the TUI — the orchestrator keeps running on the atomic tmux socket; attach later with `atomic workflow session connect <name>` |
216
156
 
217
- Any of the named shapes above (positional or structured) accepts `-d` / `--detach` to run without attaching. Use it when you're automating from a script and want the CLI to return as soon as the session is spawned.
218
-
219
- **Builtin workflows are reserved**: local/global workflows cannot shadow them. Pick distinct names.
157
+ Any of the named shapes above (positional or structured) accepts
158
+ `-d` / `--detach` to run without attaching. Use it when you're automating
159
+ from a script and want the CLI to return as soon as the session is spawned.
220
160
 
221
161
  ### Declaring SDK compatibility (`minSDKVersion`)
222
162
 
223
- Opt-in version gate for workflows that depend on a specific SDK release. **Default is unset — do not add it to new workflows unless you have a concrete reason.**
163
+ Opt-in version gate for workflows that depend on a specific SDK release.
164
+ **Default is unset — do not add it to new workflows unless you have a
165
+ concrete reason.**
224
166
 
225
167
  ```ts
226
168
  defineWorkflow({
@@ -229,20 +171,17 @@ defineWorkflow({
229
171
  })
230
172
  ```
231
173
 
232
- | Behaviour | Unset (default) | Set to a version newer than the installed CLI |
233
- |---|---|---|
234
- | Loader | Always loads | Refuses to load, returns `IncompatibleSDKError` |
235
- | `atomic workflow list` | Normal row | `⚠ needs v<X> (installed v<Y>)` dim name, visible |
236
- | Picker | Normal row | `⚠ update required` glyph + preview explains the gap; Enter is disabled |
237
- | `atomic workflow -n <name>` | Runs | Errors with an upgrade hint, non-zero exit |
238
-
239
- When to set it: the workflow calls into a newly-added SDK surface (new `stage()` option, new helper export, new provider method) that older installs don't ship. Omit it for workflows that use only stable APIs — most workflows qualify.
174
+ When set to a version newer than the installed CLI, the workflow refuses to
175
+ load and surfaces a visible row in `atomic workflow list` and the picker
176
+ (rather than silently vanishing). Set it only when the workflow calls a
177
+ newly-added SDK surface (new `stage()` option, new helper export, new
178
+ provider method); omit it for workflows on stable APIs. Full semver
179
+ semantics and the visible-diagnostic contract live in
180
+ `references/discovery-and-verification.md`.
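
The gate described above amounts to a version comparison. The sketch below shows one plausible form, assuming plain `MAJOR.MINOR.PATCH` strings and ignoring pre-release tags; `parseVersion` and `satisfiesMin` are hypothetical helpers, and the loader's actual semver semantics are the ones documented in the reference file.

```typescript
// Hypothetical helpers: not the SDK's loader logic.
// Parse "MAJOR.MINOR.PATCH" (optionally prefixed with "v") into numbers.
function parseVersion(v: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(v);
  if (!match) throw new Error(`not a semver string: ${v}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when the installed version is at least the declared minimum.
function satisfiesMin(installed: string, min: string | undefined): boolean {
  if (min === undefined) return true; // unset gate: the workflow always loads
  const have = parseVersion(installed);
  const need = parseVersion(min);
  for (let i = 0; i < 3; i++) {
    if (have[i] !== need[i]) return have[i] > need[i];
  }
  return true; // versions are equal
}

console.log(satisfiesMin("0.5.23", "0.5.24")); // false: installed is older
```

Comparing numerically per component matters: a string comparison would wrongly rank `0.10.0` below `0.9.9`.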
240
181
 
241
- The point of the field is to convert a silent "workflow vanished after upgrade" failure into a visible, actionable row the user can fix. See `references/discovery-and-verification.md` for semver semantics and the visible-diagnostic contract.
182
+ ## Structural Rules (hard constraints)
242
183
 
243
- ### Structural Rules
244
-
245
- Hard constraints enforced by the builder, loader, and runtime:
184
+ Enforced by the builder, loader, and runtime:
246
185
 
247
186
  1. **`.run()` required** — the builder must have a `.run(async (ctx) => { ... })` call.
248
187
  2. **`.compile()` required** — the chain must end with `.compile()`.
@@ -269,10 +208,14 @@ Every workflow pattern maps directly to TypeScript code:
269
208
  | Return data from session | `const h = await ctx.stage(opts, {}, {}, async (s) => { return value; }); h.result` |
270
209
  | Data flow between sessions | `s.save()` to persist → `s.transcript(handle)` or `s.transcript("name")` to retrieve |
271
210
  | Deterministic computation (no LLM) | Plain TypeScript inside `.run()` or inside a session callback |
272
- | Subagent orchestration | Claude: `@"agent (agent)"` prefix in prompt; Copilot: `{ agent: "name" }` in sessionOpts; OpenCode: `agent` param in `s.client.session.prompt()` |
211
+ | Subagent orchestration | Claude: `--agent` via `chatFlags` (interactive) or `agent` SDK option (headless); Copilot: `{ agent: "name" }` in sessionOpts; OpenCode: `agent` param in `s.client.session.prompt()` |
273
212
  | Per-session configuration | Pass `clientOpts` (2nd arg) and `sessionOpts` (3rd arg) to `ctx.stage()` |
274
213
 
275
- For full pattern examples with code, see `references/control-flow.md` (loops, conditionals, review/fix, graph topology), `references/state-and-data-flow.md` (data passing, file coordination, transcript compression), and `references/computation-and-validation.md` (parsing, validation, quality gates).
214
+ For full pattern examples with code, see `references/control-flow.md`
215
+ (loops, conditionals, review/fix, graph topology, headless fan-out),
216
+ `references/state-and-data-flow.md` (data passing, file coordination,
217
+ transcript compression), and `references/computation-and-validation.md`
218
+ (parsing, validation, quality gates).
276
219
 
277
220
  ## Authoring Process
278
221
 
@@ -292,24 +235,16 @@ Map the user's intent to sessions and patterns:
292
235
  | Does the workflow need user input? | SDK-specific user input APIs (see `references/user-input.md`) |
293
236
  | Do any steps need a specific model? | SDK-specific session config (see `references/session-config.md`) |
294
237
 
295
- Then apply **design advisory checks**, which catch architectural and prompt-quality issues before you write code:
296
-
297
- | Design Question | If Yes → Consult |
298
- |-----------------|------------------|
299
- | Do session prompts need to be clear, structured, or include examples? | `prompt-engineer` — use XML tags, chain-of-thought, few-shot examples, explicit output format |
300
- | Is this task actually viable for agent automation? | `project-development` — validate task-model fit before building |
301
- | Could any single session exceed context limits? | `context-fundamentals` — budget tokens; split into sub-sessions if needed |
302
- | Do loops accumulate state that degrades over iterations? | `context-degradation` — add compaction triggers; detect lost-in-middle risk |
303
- | Are large transcripts passed between sessions? | `context-compression` — summarize at boundaries; preserve key decisions and file paths |
304
- | Should this be one session or many? | `multi-agent-patterns` — choose coordination topology based on task decomposability |
305
- | Do sessions coordinate via shared files? | `filesystem-context` — use scratch pads, dynamic loading, file-based handoffs |
306
- | Does the workflow need automated quality checks? | `evaluation` + `advanced-evaluation` — design rubrics; mitigate judge bias |
307
- | Does the workflow expose custom tools to agents? | `tool-design` — consolidate tools; write unambiguous descriptions |
308
- | Does the workflow need cross-run knowledge retention? | `memory-systems` — choose persistence layer based on retrieval needs |
238
+ Then walk the **Design Advisory Skills** table above (§"Design Advisory
239
+ Skills") — for each row whose trigger applies to your workflow, pull that
240
+ skill in *before* writing code. Catching architectural and prompt-quality
241
+ issues at design time is far cheaper than catching them in the first failed
242
+ end-to-end run.
309
243
 
310
244
  ### 2. Choose the Target Agent
311
245
 
312
- Use `.for<"agent">()` on the builder to narrow all context types and get correct `s.client`/`s.session` types. Call `.for()` **before** `.run()`:
246
+ Use `.for<"agent">()` on the builder to narrow all context types and get
247
+ correct `s.client`/`s.session` types. Call `.for()` **before** `.run()`:
313
248
 
314
249
  | Agent | Builder Chain | Primary Session API |
315
250
  |-------|---------------|---------------------|
@@ -317,9 +252,13 @@ Use `.for<"agent">()` on the builder to narrow all context types and get correct
317
252
  | Copilot | `defineWorkflow({...}).for<"copilot">()` | `s.session.send({ prompt })` — the runtime wraps `send` to block until `session.idle` with no timeout (see `failure-modes.md` §F10); do not use `sendAndWait` in Atomic workflows |
318
253
  | OpenCode | `defineWorkflow({...}).for<"opencode">()` | `s.client.session.prompt({ sessionID: s.session.id, parts: [...] })` |
319
254
 
320
- The runtime manages client/session lifecycle automatically. For native SDK types and advanced APIs, import directly from the provider packages (`@github/copilot-sdk`, `@anthropic-ai/claude-agent-sdk`, `@opencode-ai/sdk/v2`).
255
+ The runtime manages client/session lifecycle automatically. For native SDK
256
+ types and advanced APIs, import directly from the provider packages
257
+ (`@github/copilot-sdk`, `@anthropic-ai/claude-agent-sdk`, `@opencode-ai/sdk/v2`).
321
258
 
322
- For cross-agent support, create one workflow file per agent under `.atomic/workflows/<name>/<agent>/index.ts`. Use shared helper modules for SDK-agnostic logic in a sibling `helpers/` directory:
259
+ For cross-agent support, create one workflow file per agent under
260
+ `.atomic/workflows/<name>/<agent>/index.ts`. Use shared helper modules for
261
+ SDK-agnostic logic in a sibling `helpers/` directory:
323
262
 
324
263
  ```
325
264
  .atomic/workflows/<name>/
@@ -334,23 +273,20 @@ For cross-agent support, create one workflow file per agent under `.atomic/workf
334
273
 
335
274
  ### 3. Write the Workflow File
336
275
 
337
- **Load `references/getting-started.md`** for complete quick-start examples for all three SDKs with correct save patterns, response extraction, and timeout handling.
338
-
339
- Per-SDK cheat sheet:
340
-
341
- | Concern | Claude | Copilot | OpenCode |
342
- |---------|--------|---------|----------|
343
- | Send prompt | `s.session.query(prompt)` | `s.session.send({ prompt })` | `s.client.session.prompt({ sessionID: s.session.id, parts: [{ type: "text", text: prompt }] })` |
344
- | Save output | `s.save(s.sessionId)` | `s.save(await s.session.getMessages())` | `s.save(result.data!)` |
345
- | Timeout | Per-query defaults via sessionOpts | N/A (`send` has no timeout; `sendAndWait` accepts optional timeout, default 60s) | N/A |
346
- | Context model | Tmux pane (accumulates across turns) | Fresh per `ctx.stage()` | Fresh per `ctx.stage()` |
347
- | Extract text | `extractAssistantText(result, 0)` (uses `SessionMessage[]`) | `getAssistantText(messages)` (see `failure-modes.md` F1) | `extractResponseText(result.data!.parts)` (see `failure-modes.md` F3) |
276
+ Write the workflow file using the SDK-specific patterns. See
277
+ `references/getting-started.md` for full quick-start examples for all 3
278
+ SDKs (send/save/extract patterns, idle handling), and
279
+ `references/agent-sessions.md` for per-SDK API details and lifecycle
280
+ caveats.
348
281
 
349
- The SDK ships two builtin workflows as production reference implementations:
350
- - **`ralph`** — iterative plan → orchestrate → review → debug loop (all 3 SDKs)
351
- - **`deep-research-codebase`** — deterministic scout → parallel explorers → aggregator (all 3 SDKs)
282
+ The SDK ships two builtin workflows in `src/sdk/workflows/builtin/` as
283
+ production reference implementations across all 3 SDKs:
284
+ - **`ralph`** — iterative plan → orchestrate → review → debug loop.
285
+ - **`deep-research-codebase`** — deterministic scout → parallel explorers →
286
+ aggregator.
352
287
 
353
- Both live in `src/sdk/workflows/builtin/` and demonstrate real patterns including shared helpers, context-aware prompt building, deterministic heuristics, and cross-SDK adaptation.
288
+ They demonstrate shared helpers, context-aware prompt building, deterministic
289
+ heuristics, and cross-SDK adaptation.
354
290
 
355
291
  ### 4. Type-Check the Workflow
356
292
 
@@ -377,147 +313,22 @@ atomic workflow -a <agent>
377
313
  atomic workflow -n <workflow-name> -a <agent> -d "<your prompt>"
378
314
  ```
379
315
 
380
- ## Running a Workflow on Behalf of the User
381
-
382
- When the user asks you to **run** (or "kick off" / "start" / "execute") a workflow — *not* author one — your job is to translate their request into a single `atomic workflow` invocation and run it. This section is the playbook for that flow. It is different from the authoring playbook above: the workflow already exists on disk; you just need to invoke it correctly.
383
-
384
- ### You don't need to pass `-a` or `-d`
385
-
386
- When you (the agent) are running inside an atomic chat or workflow pane, the CLI reads `$ATOMIC_AGENT` from your environment and:
387
-
388
- - Fills in `-a <agent>` automatically if you don't pass it.
389
- - Forces detached mode on, so launching a workflow never takes over your pane.
390
-
391
- The practical result: your command is just `atomic workflow -n <name> <inputs>`. No provider flag, no detach flag, no chance of the orchestrator hijacking your terminal. The CLI prints the session name and returns immediately; you relay that name to the user.
392
-
393
- Override only when the user explicitly asks for a different provider (e.g. "run it on Copilot") — pass `-a copilot` and the CLI will honor it over the env var.
394
-
395
- ### Always list first
396
-
397
- **Before anything else, run `atomic workflow list`.** (Optionally filter with `-a <agent>` if the user's pinned to one — usually unnecessary.) This is a cheap, read-only call that tells you three things in one shot:
398
-
399
- - Whether the workflow the user named actually exists.
400
- - What other workflows are available (so you can suggest close matches on a typo).
401
- - Source + metadata for every discoverable workflow (local vs. global vs. builtin).
402
-
403
- Skipping this step is how you end up guessing a name, typing it into `atomic workflow -n <name>`, and getting a `workflow not found` error you could have predicted. List first, decide second, run third.
404
-
405
- If the user's request is ambiguous ("run the research one"), the list output is also how you show them the candidates so they can pick — present the matching names and ask with AskUserQuestion.
406
-
407
- ### If the workflow doesn't exist: offer to create it
408
-
409
- When the listed workflows don't include what the user asked for:
410
-
411
- 1. **Tell the user explicitly** — "I don't see a `<name>` workflow in `.atomic/workflows/` or `~/.atomic/workflows/`. Available: \<short list from `atomic workflow list`>."
412
- 2. **Check for typos first** — if one of the listed names is a close match, surface it via AskUserQuestion ("Did you mean `<close-match>`?") before offering to author anything.
413
- 3. **Offer to create it** — ask with AskUserQuestion: "Want me to create a `<name>` workflow first?" with choices `Yes, create it` / `No, pick from the list` / `No, cancel`.
414
- 4. **If yes → switch modes** — hand off to the authoring flow in the [Authoring Process](#authoring-process) section above (Steps 1-5). Interview the user for intent, write the file at `.atomic/workflows/<name>/<agent>/index.ts`, typecheck it, *then* come back to this runner section and invoke it. Do not skip the typecheck — an uncompiled workflow won't run.
415
- 5. **If no → stop** — don't fabricate a command that will fail. Let the user redirect you.
416
-
417
- Never invent a workflow name or silently fall back to a different workflow. If the thing the user asked for doesn't exist, the correct answer is to say so and offer concrete next steps.
418
-
419
- ### Collecting inputs with AskUserQuestion
420
-
421
- Once you've confirmed the workflow exists, you need to know two things about its invocation shape:
422
-
423
- 1. **Does it declare a `prompt` input?** If so, it's free-form — you pass a positional string.
424
- 2. **Does it declare structured inputs?** If so, you pass `--<field>=<value>` flags, one per required field.
425
-
426
- **Use `atomic workflow inputs <name> -a <agent>` to get the schema.** This prints a JSON envelope with every field's `name`, `type`, `required`, `default`, `description`, and (for enums) `values` — exactly what AskUserQuestion needs. The `freeform: true` flag tells you whether the workflow takes a positional prompt vs. structured flags, with a synthetic `prompt` field included so the JSON shape is uniform either way.
427
-
428
- ```bash
429
- atomic workflow inputs gen-spec -a claude
430
- # {"workflow":"gen-spec","agent":"claude","freeform":false,
431
- # "inputs":[{"name":"research_doc","type":"string","required":true,...},
432
- # {"name":"focus","type":"enum","values":["minimal","standard","exhaustive"],"default":"standard"}]}
433
- ```
434
-
435
- Why this command instead of reading the source file: `inputs` is the contract the CLI actually validates against. It survives refactors, handles built-in workflows that aren't in the project tree, and never falls out of sync with the runtime. Reading TypeScript source is a fallback for the rare case where the command can't resolve the workflow.
436
-
437
- Once you have the schema, use the **AskUserQuestion tool** to collect any values the user hasn't already provided in their message. One question per missing input field. For enum fields, pass the declared `values` as multiple-choice options so the user sees exactly what's allowed. Keep questions tight and purposeful — if the user's message already answers a question, don't ask it again.
438
-
439
- Skip AskUserQuestion entirely when:
440
- - The user already supplied every required value in their message ("run ralph on 'add OAuth to the API'" — the prompt is right there).
441
- - The workflow declares no required inputs and needs no prompt.
442
-
443
- ### End-to-end recipe
444
-
445
- 1. **List available workflows** — run `atomic workflow list`. Always. This is your ground truth.
446
- 2. **Resolve the target**:
447
- - Exact match in the list → continue.
448
- - Close match → confirm via AskUserQuestion before proceeding.
449
- - No match → tell the user what's available and offer to author it (see previous section). If they decline, stop.
450
- 3. **Discover the inputs schema** — run `atomic workflow inputs <name> -a <agent>` and parse the JSON.
451
- 4. **Ask for missing inputs** — use AskUserQuestion, one question per unanswered required field. Enums become multiple-choice.
452
- 5. **Invoke** — build one of these commands:
453
- - Free-form: `atomic workflow -n <name> "<prompt>"`
454
- - Structured: `atomic workflow -n <name> --<field1>=<value1> --<field2>=<value2>`
455
- 6. **Report the session name** the CLI printed and tell the user: "attach any time with `atomic workflow session connect <session>` — or `atomic workflow session list` to see what's running."
456
-
457
- ### Monitoring a running workflow
458
-
459
- Detached workflows return immediately with a session name; the actual work runs in the background on the atomic tmux socket. Use `atomic workflow status` to check whether the workflow is still running, has completed, errored out, or paused for human input — without attaching to its TUI.
460
-
461
- ```bash
462
- atomic workflow status atomic-wf-claude-gen-spec-a1b2c3d4
463
- # {"id":"atomic-wf-claude-gen-spec-a1b2c3d4","overall":"in_progress","alive":true,
464
- # "sessions":[{"name":"orchestrator","status":"running",...}],...}
465
- ```
466
-
467
- Four overall states the agent must handle distinctly:
468
-
469
- | Status | Meaning | What you should do |
470
- |---|---|---|
471
- | `in_progress` | The orchestrator is running and no stage is paused | Wait, or report progress to the user |
472
- | `needs_review` | At least one stage is paused for human input (HIL) — Copilot `ask_user`, OpenCode `question.asked`, Copilot/MCP elicitation | **Surface this to the user immediately** — they need to attach with `atomic workflow session connect <id>` to respond, otherwise the workflow stalls indefinitely |
473
- | `completed` | Workflow finished successfully | Report success and summarize the output |
474
- | `error` | Fatal error or a stage failed | Report the `fatalError` field and offer to investigate logs |
475
-
476
- `needs_review` outranks `completed` so a HIL pause near the end is never reported as done while still waiting on a human. A dead orchestrator with a stale snapshot is automatically downgraded to `error`.
477
-
478
- Omit the id to list every running workflow at once: `atomic workflow status`. Useful when checking on multiple parallel runs, or when the user just asks "what's running?".
479
-
480
- ### Cleaning up sessions
481
-
482
- When the user is done with a workflow — or you launched one detached and it's no longer needed — tear it down with `-y` so no confirmation prompt blocks you:
483
-
484
- ```bash
485
- atomic session kill atomic-wf-claude-gen-spec-a1b2c3d4 -y
486
- ```
487
-
488
- The `-y` flag is mandatory for agent use. Without it, the CLI calls `@clack/prompts confirm`, which expects a TTY and will hang indefinitely in a non-interactive context. Same flag works for `atomic workflow session kill` and `atomic chat session kill`. Without an id, `kill -y` tears down every in-scope session — only do that when the user has asked to stop everything.
489
-
490
- ### Worked examples
491
-
492
- **Example A — workflow exists, structured inputs**
493
-
494
- > **User:** "run gen-spec on research/docs/2026-04-11-auth.md"
495
-
496
- 1. Run `atomic workflow list`. Output includes `gen-spec` under local. Good.
497
- 2. Target resolved exactly: `gen-spec`.
498
- 3. Run `atomic workflow inputs gen-spec -a claude`. Parse the JSON: `research_doc` (required string — already given), `focus` (required enum of `minimal|standard|exhaustive`, default `standard`), `notes` (optional text).
499
- 4. Ask via AskUserQuestion once: "What focus level for the spec?" with choices `minimal`, `standard`, `exhaustive`. User picks `standard`. Skip `notes` since it's optional.
500
- 5. Run: `atomic workflow -n gen-spec --research_doc=research/docs/2026-04-11-auth.md --focus=standard`
501
- 6. The CLI prints a session name like `atomic-wf-claude-gen-spec-a1b2c3d4`. Tell the user: "Started in the background. Attach with `atomic workflow session connect atomic-wf-claude-gen-spec-a1b2c3d4`, check progress with `atomic workflow status atomic-wf-claude-gen-spec-a1b2c3d4`, or stop it with `atomic session kill atomic-wf-claude-gen-spec-a1b2c3d4 -y`."
502
-
503
- **Example B — workflow does not exist**
504
-
505
- > **User:** "run the security-audit workflow on src/auth"
506
-
507
- 1. Run `atomic workflow list`. Available: `ralph`, `deep-research-codebase`, `gen-spec`, `review-to-merge`. No `security-audit`.
508
- 2. Tell the user: "I don't see a `security-audit` workflow. Available: ralph, deep-research-codebase, gen-spec, review-to-merge."
509
- 3. Ask via AskUserQuestion: "Want me to create a `security-audit` workflow first?" with choices `Yes, create it`, `No, use one of the existing workflows`, `No, cancel`.
510
- 4. If **Yes**: switch to the Authoring Process — interview the user for what the workflow should do, draft it, typecheck, *then* return here and invoke it.
511
- 5. If **No, use existing**: ask which one via AskUserQuestion over the listed options, then continue from step 3 of the recipe.
512
- 6. If **Cancel**: stop, no command runs.
513
-
514
- ### Common mistakes to avoid
515
-
516
- - **Skipping `atomic workflow list`** — leads to guessing and `workflow not found` errors. It's a one-line command; always run it.
517
- - **Inventing a workflow name** — if it's not in the list, it doesn't exist. Say so and offer to author it.
518
- - **Reading the workflow source file to discover inputs** — use `atomic workflow inputs <name> -a <agent>` instead. JSON, no TS parsing required, always in sync with the runtime. Source-file reads are a fallback, not a default.
519
- - **Asking everything at once** — let AskUserQuestion drive one question per field. Enum fields are multiple-choice, not free text.
520
- - **Re-asking what the user already said** — read their message first.
521
- - **Forgetting to report the session name** — the user needs it to reattach and to query status later.
522
- - **Leaving `needs_review` unreported** — when `atomic workflow status` returns `needs_review`, surface it to the user right away. The workflow is blocked on human input and will sit forever otherwise.
523
- - **Calling `session kill` without `-y`** — the prompt hangs in a non-interactive context. Always pass `-y` from an agent.
316
+ ## Running an Existing Workflow
317
+
318
+ If the user asks you to **run** (or "kick off" / "start" / "execute") a
319
+ workflow — not author one — the workflow already exists on disk and you
320
+ just need to invoke it correctly. That's a different playbook from
321
+ authoring.
322
+
323
+ **Read `references/running-workflows.md`.** It covers:
324
+
325
+ - Why you don't usually need `-a` or `-d` (env-driven auto-detach).
326
+ - Why you must run `atomic workflow list` first.
327
+ - How to handle missing workflows (offer to author, not fabricate).
328
+ - Using `atomic workflow inputs <name> -a <agent>` to discover the schema
329
+ and drive AskUserQuestion.
330
+ - The six-step invocation recipe.
331
+ - Monitoring with `atomic workflow status` — and why `needs_review` must be
332
+ surfaced immediately.
333
+ - Tearing down with `atomic session kill -y` (the `-y` is mandatory).
334
+ - Worked examples for "workflow exists" and "workflow doesn't exist".
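
As an illustration of the monitoring bullet above, here is a minimal TypeScript sketch of how an agent might turn a parsed `atomic workflow status` snapshot into the message it relays to the user. It is not part of the package. The helper name `describeStatus` is hypothetical, the `StatusSnapshot` interface covers only the fields used here, and treating `fatalError` as an optional string is an assumption. The four `overall` values and the `atomic workflow session connect` command are taken from the removed section of this diff.

```typescript
// The four overall states listed for `atomic workflow status`.
type Overall = "in_progress" | "needs_review" | "completed" | "error";

// Assumed subset of the JSON envelope; the real output carries more fields.
interface StatusSnapshot {
  id: string;
  overall: Overall;
  fatalError?: string; // assumption: a plain string when present
}

// Hypothetical helper: map a snapshot to the message relayed to the user.
function describeStatus(s: StatusSnapshot): string {
  switch (s.overall) {
    case "needs_review":
      // Blocked on human input, so tell the user how to attach and respond.
      return `${s.id} is paused for human input; attach with: atomic workflow session connect ${s.id}`;
    case "error":
      return `${s.id} failed: ${s.fatalError ?? "no fatalError reported"}`;
    case "completed":
      return `${s.id} completed`;
    case "in_progress":
      return `${s.id} is still running`;
  }
}

// Example input shaped like the status output quoted in the removed section.
const raw =
  '{"id":"atomic-wf-claude-gen-spec-a1b2c3d4","overall":"needs_review","alive":true}';
const snapshot = JSON.parse(raw) as StatusSnapshot;
console.log(describeStatus(snapshot));
```

Handling `needs_review` as its own branch keeps a paused workflow from being reported as merely "running", which is the failure the bullet warns about.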