@bastani/atomic 0.5.23-0 → 0.5.24-0
- package/.agents/skills/workflow-creator/SKILL.md +137 -326
- package/.agents/skills/workflow-creator/references/agent-sessions.md +211 -152
- package/.agents/skills/workflow-creator/references/computation-and-validation.md +12 -37
- package/.agents/skills/workflow-creator/references/control-flow.md +20 -14
- package/.agents/skills/workflow-creator/references/discovery-and-verification.md +1 -1
- package/.agents/skills/workflow-creator/references/failure-modes.md +87 -62
- package/.agents/skills/workflow-creator/references/getting-started.md +14 -40
- package/.agents/skills/workflow-creator/references/running-workflows.md +235 -0
- package/.agents/skills/workflow-creator/references/session-config.md +24 -9
- package/.agents/skills/workflow-creator/references/state-and-data-flow.md +9 -26
- package/.agents/skills/workflow-creator/references/user-input.md +71 -43
- package/.agents/skills/workflow-creator/references/workflow-inputs.md +25 -42
- package/dist/sdk/providers/claude.d.ts +7 -2
- package/dist/sdk/providers/claude.d.ts.map +1 -1
- package/dist/sdk/providers/opencode.d.ts +18 -2
- package/dist/sdk/providers/opencode.d.ts.map +1 -1
- package/dist/sdk/runtime/executor.d.ts +5 -0
- package/dist/sdk/runtime/executor.d.ts.map +1 -1
- package/package.json +1 -1
- package/src/sdk/providers/claude.ts +57 -12
- package/src/sdk/providers/headless-hil-policy.test.ts +171 -0
- package/src/sdk/providers/opencode.ts +62 -2
- package/src/sdk/runtime/executor.ts +57 -14
|
@@ -0,0 +1,235 @@
|
|
|
1
|
+
# Running a Workflow on Behalf of the User
|
|
2
|
+
|
|
3
|
+
When the user asks you to **run** (or "kick off" / "start" / "execute") a
|
|
4
|
+
workflow — *not* author one — your job is to translate their request into a
|
|
5
|
+
single `atomic workflow` invocation and run it. This is the playbook for that
|
|
6
|
+
flow. It is different from the authoring playbook in `SKILL.md`: the workflow
|
|
7
|
+
already exists on disk; you just need to invoke it correctly.
|
|
8
|
+
|
|
9
|
+
## You don't need to pass `-a` or `-d`
|
|
10
|
+
|
|
11
|
+
When you (the agent) are running inside an atomic chat or workflow pane, the
|
|
12
|
+
CLI reads `$ATOMIC_AGENT` from your environment and:
|
|
13
|
+
|
|
14
|
+
- Fills in `-a <agent>` automatically if you don't pass it.
|
|
15
|
+
- Forces detached mode on, so launching a workflow never takes over your pane.
|
|
16
|
+
|
|
17
|
+
The practical result: your command is just `atomic workflow -n <name> <inputs>`.
|
|
18
|
+
No provider flag, no detach flag, no chance of the orchestrator hijacking your
|
|
19
|
+
terminal. The CLI prints the session name and returns immediately; you relay
|
|
20
|
+
that name to the user.
|
|
21
|
+
|
|
22
|
+
Override only when the user explicitly asks for a different provider (e.g.
|
|
23
|
+
"run it on Copilot") — pass `-a copilot` and the CLI will honor it over the
|
|
24
|
+
env var.
|
|
25
|
+
|
|
26
|
+
## Always list first
|
|
27
|
+
|
|
28
|
+
**Before anything else, run `atomic workflow list`.** (Optionally filter with
|
|
29
|
+
`-a <agent>` if the user's pinned to one — usually unnecessary.) This is a
|
|
30
|
+
cheap, read-only call that tells you three things in one shot:
|
|
31
|
+
|
|
32
|
+
- Whether the workflow the user named actually exists.
|
|
33
|
+
- What other workflows are available (so you can suggest close matches on a typo).
|
|
34
|
+
- Source + metadata for every discoverable workflow (local vs. global vs. builtin).
|
|
35
|
+
|
|
36
|
+
Skipping this step is how you end up guessing a name, typing it into
|
|
37
|
+
`atomic workflow -n <name>`, and getting a `workflow not found` error you
|
|
38
|
+
could have predicted. List first, decide second, run third.
|
|
39
|
+
|
|
40
|
+
If the user's request is ambiguous ("run the research one"), the list output
|
|
41
|
+
is also how you show them the candidates so they can pick — present the
|
|
42
|
+
matching names and ask with AskUserQuestion.
|
|
43
|
+
|
|
44
|
+
## If the workflow doesn't exist: offer to create it
|
|
45
|
+
|
|
46
|
+
When the listed workflows don't include what the user asked for:
|
|
47
|
+
|
|
48
|
+
1. **Tell the user explicitly** — "I don't see a `<name>` workflow in
|
|
49
|
+
`.atomic/workflows/` or `~/.atomic/workflows/`. Available: \<short list from
|
|
50
|
+
`atomic workflow list`>."
|
|
51
|
+
2. **Check for typos first** — if one of the listed names is a close match,
|
|
52
|
+
surface it via AskUserQuestion ("Did you mean `<close-match>`?") before
|
|
53
|
+
offering to author anything.
|
|
54
|
+
3. **Offer to create it** — ask with AskUserQuestion: "Want me to create a
|
|
55
|
+
`<name>` workflow first?" with choices `Yes, create it` / `No, pick from
|
|
56
|
+
the list` / `No, cancel`.
|
|
57
|
+
4. **If yes → switch modes** — hand off to the authoring flow in SKILL.md
|
|
58
|
+
(Steps 1-5). Interview the user for intent, write the file at
|
|
59
|
+
`.atomic/workflows/<name>/<agent>/index.ts`, typecheck it, *then* come back
|
|
60
|
+
here and invoke it. Do not skip the typecheck — an uncompiled workflow
|
|
61
|
+
won't run.
|
|
62
|
+
5. **If no → stop** — don't fabricate a command that will fail. Let the user
|
|
63
|
+
redirect you.
|
|
64
|
+
|
|
65
|
+
Never invent a workflow name or silently fall back to a different workflow.
|
|
66
|
+
If the thing the user asked for doesn't exist, the correct answer is to say
|
|
67
|
+
so and offer concrete next steps.
|
|
68
|
+
|
|
69
|
+
## Collecting inputs with AskUserQuestion
|
|
70
|
+
|
|
71
|
+
Once you've confirmed the workflow exists, you need to know two things about
|
|
72
|
+
its invocation shape:
|
|
73
|
+
|
|
74
|
+
1. **Does it declare a `prompt` input?** If so, it's free-form — you pass a
|
|
75
|
+
positional string.
|
|
76
|
+
2. **Does it declare structured inputs?** If so, you pass `--<field>=<value>`
|
|
77
|
+
flags, one per required field.
|
|
78
|
+
|
|
79
|
+
**Use `atomic workflow inputs <name> -a <agent>` to get the schema.** This
|
|
80
|
+
prints a JSON envelope with every field's `name`, `type`, `required`,
|
|
81
|
+
`default`, `description`, and (for enums) `values` — exactly what
|
|
82
|
+
AskUserQuestion needs. The `freeform: true` flag tells you whether the
|
|
83
|
+
workflow takes a positional prompt vs. structured flags, with a synthetic
|
|
84
|
+
`prompt` field included so the JSON shape is uniform either way.
|
|
85
|
+
|
|
86
|
+
```bash
atomic workflow inputs gen-spec -a claude
# {"workflow":"gen-spec","agent":"claude","freeform":false,
# "inputs":[{"name":"research_doc","type":"string","required":true,...},
# {"name":"focus","type":"enum","values":["minimal","standard","exhaustive"],"default":"standard"}]}
```
|
|
92
|
+
|
|
93
|
+
Why this command instead of reading the source file: `inputs` is the contract
|
|
94
|
+
the CLI actually validates against. It survives refactors, handles built-in
|
|
95
|
+
workflows that aren't in the project tree, and never falls out of sync with
|
|
96
|
+
the runtime. Reading TypeScript source is a fallback for the rare case where
|
|
97
|
+
the command can't resolve the workflow.
|
|
98
|
+
|
|
99
|
+
Once you have the schema, use the **AskUserQuestion tool** to collect any
|
|
100
|
+
values the user hasn't already provided in their message. One question per
|
|
101
|
+
missing input field. For enum fields, pass the declared `values` as
|
|
102
|
+
multiple-choice options so the user sees exactly what's allowed. Keep
|
|
103
|
+
questions tight and purposeful — if the user's message already answers a
|
|
104
|
+
question, don't ask it again.
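
The "one question per missing input field" rule can be sketched as a small filter over the parsed envelope. This is a minimal illustration, not the CLI's or the skill's actual code: the `InputField`, `InputsEnvelope`, and `Question` types and the `questionsFor` helper are hypothetical names inferred from the example JSON, and the sample data is made up.

```ts
// Hypothetical types inferred from the example envelope; not the CLI's published typings.
interface InputField {
  name: string;
  type: string;
  required?: boolean;
  default?: string;
  description?: string;
  values?: string[]; // only present for enum fields
}

interface InputsEnvelope {
  workflow: string;
  agent: string;
  freeform: boolean;
  inputs: InputField[];
}

interface Question {
  field: string;
  text: string;
  choices?: string[]; // multiple-choice options for enum fields
}

// Build one question per required field the user has not already answered.
function questionsFor(
  envelope: InputsEnvelope,
  provided: Record<string, string>,
): Question[] {
  return envelope.inputs
    .filter((f) => f.required === true && !(f.name in provided))
    .map((f) => {
      const q: Question = {
        field: f.name,
        text: f.description ?? `Value for ${f.name}?`,
      };
      if (f.type === "enum" && f.values) q.choices = f.values;
      return q;
    });
}

// Sample data for illustration only.
const sample: InputsEnvelope = {
  workflow: "gen-spec",
  agent: "claude",
  freeform: false,
  inputs: [
    { name: "research_doc", type: "string", required: true },
    {
      name: "focus",
      type: "enum",
      required: true,
      values: ["minimal", "standard", "exhaustive"],
      default: "standard",
    },
    { name: "notes", type: "text", required: false },
  ],
};

const questions = questionsFor(sample, {
  research_doc: "research/docs/2026-04-11-auth.md",
});
console.log(questions);
```

Fields the user already supplied and optional fields produce no question, which matches the skip conditions listed in this section.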
|
|
105
|
+
|
|
106
|
+
Skip AskUserQuestion entirely when:
|
|
107
|
+
- The user already supplied every required value in their message
|
|
108
|
+
("run ralph on 'add OAuth to the API'" — the prompt is right there).
|
|
109
|
+
- The workflow declares no required inputs and needs no prompt.
|
|
110
|
+
|
|
111
|
+
## End-to-end recipe
|
|
112
|
+
|
|
113
|
+
1. **List available workflows** — run `atomic workflow list`. Always. This is
|
|
114
|
+
your ground truth.
|
|
115
|
+
2. **Resolve the target**:
|
|
116
|
+
- Exact match in the list → continue.
|
|
117
|
+
- Close match → confirm via AskUserQuestion before proceeding.
|
|
118
|
+
- No match → tell the user what's available and offer to author it (see
|
|
119
|
+
previous section). If they decline, stop.
|
|
120
|
+
3. **Discover the inputs schema** — run `atomic workflow inputs <name> -a <agent>`
|
|
121
|
+
and parse the JSON.
|
|
122
|
+
4. **Ask for missing inputs** — use AskUserQuestion, one question per
|
|
123
|
+
unanswered required field. Enums become multiple-choice.
|
|
124
|
+
5. **Invoke** — build one of these commands:
|
|
125
|
+
- Free-form: `atomic workflow -n <name> "<prompt>"`
|
|
126
|
+
- Structured: `atomic workflow -n <name> --<field1>=<value1> --<field2>=<value2>`
|
|
127
|
+
6. **Report the session name** the CLI printed and tell the user: "attach any
|
|
128
|
+
time with `atomic workflow session connect <session>` — or
|
|
129
|
+
`atomic workflow session list` to see what's running."
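
Step 5 of the recipe can be sketched as a function that turns the collected values into an argument vector. The `Invocation` type and `buildArgv` helper are hypothetical names for illustration; only the two command shapes (`-n <name> "<prompt>"` and `-n <name> --<field>=<value>`) come from the recipe. Returning an array rather than a single string is a design choice that sidesteps shell quoting when the prompt contains spaces or quotes.

```ts
// Hypothetical helper: the two invocation shapes from step 5 as a discriminated union.
type Invocation =
  | { kind: "freeform"; name: string; prompt: string }
  | { kind: "structured"; name: string; fields: Record<string, string> };

// Build the argv for a single `atomic workflow` invocation.
function buildArgv(inv: Invocation): string[] {
  const base = ["atomic", "workflow", "-n", inv.name];
  if (inv.kind === "freeform") {
    // Free-form workflows take the prompt as one positional argument.
    return [...base, inv.prompt];
  }
  // Structured workflows take one --field=value flag per collected input.
  const flags = Object.entries(inv.fields).map(([k, v]) => `--${k}=${v}`);
  return [...base, ...flags];
}

console.log(
  buildArgv({
    kind: "structured",
    name: "gen-spec",
    fields: { research_doc: "research/docs/2026-04-11-auth.md", focus: "standard" },
  }).join(" "),
);
```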
|
|
130
|
+
|
|
131
|
+
## Monitoring a running workflow
|
|
132
|
+
|
|
133
|
+
Detached workflows return immediately with a session name; the actual work
|
|
134
|
+
runs in the background on the atomic tmux socket. Use `atomic workflow status`
|
|
135
|
+
to check whether the workflow is still running, has completed, errored out, or
|
|
136
|
+
paused for human input — without attaching to its TUI.
|
|
137
|
+
|
|
138
|
+
```bash
atomic workflow status atomic-wf-claude-gen-spec-a1b2c3d4
# {"id":"atomic-wf-claude-gen-spec-a1b2c3d4","overall":"in_progress","alive":true,
# "sessions":[{"name":"orchestrator","status":"running",...}],...}
```
|
|
143
|
+
|
|
144
|
+
There are four overall states, and the agent must handle each one distinctly:
|
|
145
|
+
|
| Status | Meaning | What you should do |
|---|---|---|
| `in_progress` | The orchestrator is running and no stage is paused | Wait, or report progress to the user |
| `needs_review` | At least one stage is paused for human input (HIL): Copilot `ask_user`, OpenCode `question.asked`, Copilot/MCP elicitation | **Surface this to the user immediately.** They need to attach with `atomic workflow session connect <id>` to respond; otherwise the workflow stalls indefinitely |
| `completed` | Workflow finished successfully | Report success and summarize the output |
| `error` | Fatal error or a stage failed | Report the `fatalError` field and offer to investigate logs |
|
|
152
|
+
|
|
153
|
+
`needs_review` outranks `completed` so a HIL pause near the end is never
|
|
154
|
+
reported as done while still waiting on a human. A dead orchestrator with a
|
|
155
|
+
stale snapshot is automatically downgraded to `error`.
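
As a rough illustration, an agent could dispatch on the `overall` field as below. The `StatusSnapshot` type and `nextAction` helper are hypothetical; the field names `id`, `overall`, and `fatalError` are taken from the example output and the table, and the real JSON contains more fields than this sketch models.

```ts
// Hypothetical subset of the status JSON; only the fields this sketch needs.
type Overall = "in_progress" | "needs_review" | "completed" | "error";

interface StatusSnapshot {
  id: string;
  overall: Overall;
  fatalError?: string;
}

// Map each overall state to the message the agent should relay to the user.
function nextAction(s: StatusSnapshot): string {
  switch (s.overall) {
    case "in_progress":
      return `Workflow ${s.id} is still running; check again later.`;
    case "needs_review":
      return `Workflow ${s.id} is waiting on you. Attach with: atomic workflow session connect ${s.id}`;
    case "completed":
      return `Workflow ${s.id} finished successfully.`;
    case "error":
      return `Workflow ${s.id} failed: ${s.fatalError ?? "no fatalError reported"}`;
  }
}

console.log(
  nextAction({ id: "atomic-wf-claude-gen-spec-a1b2c3d4", overall: "needs_review" }),
);
```

Because the `switch` covers every member of the `Overall` union, the TypeScript compiler flags any state added to the union later that the function does not handle.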
|
|
156
|
+
|
|
157
|
+
Omit the id to list every running workflow at once: `atomic workflow status`.
|
|
158
|
+
Useful when checking on multiple parallel runs, or when the user just asks
|
|
159
|
+
"what's running?".
|
|
160
|
+
|
|
161
|
+
## Cleaning up sessions
|
|
162
|
+
|
|
163
|
+
When the user is done with a workflow — or you launched one detached and it's
|
|
164
|
+
no longer needed — tear it down with `-y` so no confirmation prompt blocks you:
|
|
165
|
+
|
|
166
|
+
```bash
atomic session kill atomic-wf-claude-gen-spec-a1b2c3d4 -y
```
|
|
169
|
+
|
|
170
|
+
The `-y` flag is mandatory for agent use. Without it, the CLI calls
|
|
171
|
+
`@clack/prompts confirm`, which expects a TTY and will hang indefinitely in a
|
|
172
|
+
non-interactive context. The same flag works for `atomic workflow session kill`
|
|
173
|
+
and `atomic chat session kill`. Without an id, `kill -y` tears down every
|
|
174
|
+
in-scope session — only do that when the user has asked to stop everything.
|
|
175
|
+
|
|
176
|
+
## Worked examples
|
|
177
|
+
|
|
178
|
+
**Example A — workflow exists, structured inputs**
|
|
179
|
+
|
|
180
|
+
> **User:** "run gen-spec on research/docs/2026-04-11-auth.md"
|
|
181
|
+
|
|
182
|
+
1. Run `atomic workflow list`. Output includes `gen-spec` under local. Good.
|
|
183
|
+
2. Target resolved exactly: `gen-spec`.
|
|
184
|
+
3. Run `atomic workflow inputs gen-spec -a claude`. Parse the JSON:
|
|
185
|
+
`research_doc` (required string — already given), `focus` (required enum
|
|
186
|
+
of `minimal|standard|exhaustive`, default `standard`), `notes`
|
|
187
|
+
(optional text).
|
|
188
|
+
4. Ask via AskUserQuestion once: "What focus level for the spec?" with
|
|
189
|
+
choices `minimal`, `standard`, `exhaustive`. User picks `standard`. Skip
|
|
190
|
+
`notes` since it's optional.
|
|
191
|
+
5. Run: `atomic workflow -n gen-spec --research_doc=research/docs/2026-04-11-auth.md --focus=standard`
|
|
192
|
+
6. The CLI prints a session name like `atomic-wf-claude-gen-spec-a1b2c3d4`.
|
|
193
|
+
Tell the user: "Started in the background. Attach with
|
|
194
|
+
`atomic workflow session connect atomic-wf-claude-gen-spec-a1b2c3d4`,
|
|
195
|
+
check progress with `atomic workflow status atomic-wf-claude-gen-spec-a1b2c3d4`,
|
|
196
|
+
or stop it with `atomic session kill atomic-wf-claude-gen-spec-a1b2c3d4 -y`."
|
|
197
|
+
|
|
198
|
+
**Example B — workflow does not exist**
|
|
199
|
+
|
|
200
|
+
> **User:** "run the security-audit workflow on src/auth"
|
|
201
|
+
|
|
202
|
+
1. Run `atomic workflow list`. Available: `ralph`, `deep-research-codebase`,
|
|
203
|
+
`gen-spec`, `review-to-merge`. No `security-audit`.
|
|
204
|
+
2. Tell the user: "I don't see a `security-audit` workflow. Available:
|
|
205
|
+
ralph, deep-research-codebase, gen-spec, review-to-merge."
|
|
206
|
+
3. Ask via AskUserQuestion: "Want me to create a `security-audit` workflow
|
|
207
|
+
first?" with choices `Yes, create it`, `No, use one of the existing
|
|
208
|
+
workflows`, `No, cancel`.
|
|
209
|
+
4. If **Yes**: switch to SKILL.md's Authoring Process — interview the user
|
|
210
|
+
for what the workflow should do, draft it, typecheck, *then* return here
|
|
211
|
+
and invoke it.
|
|
212
|
+
5. If **No, use existing**: ask which one via AskUserQuestion over the
|
|
213
|
+
listed options, then continue from step 3 of the recipe.
|
|
214
|
+
6. If **Cancel**: stop, no command runs.
|
|
215
|
+
|
|
216
|
+
## Common mistakes to avoid
|
|
217
|
+
|
|
218
|
+
- **Skipping `atomic workflow list`** — leads to guessing and
|
|
219
|
+
`workflow not found` errors. It's a one-line command; always run it.
|
|
220
|
+
- **Inventing a workflow name** — if it's not in the list, it doesn't exist.
|
|
221
|
+
Say so and offer to author it.
|
|
222
|
+
- **Reading the workflow source file to discover inputs** — use
|
|
223
|
+
`atomic workflow inputs <name> -a <agent>` instead. JSON, no TS parsing
|
|
224
|
+
required, always in sync with the runtime. Source-file reads are a
|
|
225
|
+
fallback, not a default.
|
|
226
|
+
- **Asking everything at once** — let AskUserQuestion drive one question per
|
|
227
|
+
field. Enum fields are multiple-choice, not free text.
|
|
228
|
+
- **Re-asking what the user already said** — read their message first.
|
|
229
|
+
- **Forgetting to report the session name** — the user needs it to reattach
|
|
230
|
+
and to query status later.
|
|
231
|
+
- **Leaving `needs_review` unreported** — when `atomic workflow status`
|
|
232
|
+
returns `needs_review`, surface it to the user right away. The workflow is
|
|
233
|
+
blocked on human input and will sit forever otherwise.
|
|
234
|
+
- **Calling `session kill` without `-y`** — the prompt hangs in a
|
|
235
|
+
non-interactive context. Always pass `-y` from an agent.
|
|
@@ -25,14 +25,24 @@ If you want to configure agent/permission/tools behavior for a **headless** Clau
|
|
|
25
25
|
|
|
26
26
|
```ts
|
|
27
27
|
await ctx.stage({ name: "..." }, {}, {}, async (s) => {
|
|
28
|
-
await s.session.query((
|
|
28
|
+
await s.session.query((s.inputs.prompt ?? ""));
|
|
29
29
|
s.save(s.sessionId);
|
|
30
30
|
});
|
|
31
31
|
```
|
|
32
32
|
|
|
33
|
-
### `query()` options
|
|
33
|
+
### `query()` options (reference for `s.session.query()` sdkOptions)
|
|
34
|
+
|
|
35
|
+
**This block is a reference cheatsheet for the SDK option shape — it is
|
|
36
|
+
not valid workflow code.** Do not import `query` from
|
|
37
|
+
`@anthropic-ai/claude-agent-sdk` inside a `ctx.stage()` callback (see
|
|
38
|
+
`failure-modes.md` §F16). In a **headless** stage, pass these options as
|
|
39
|
+
the second argument to `s.session.query(prompt, sdkOptions)` — the runtime
|
|
40
|
+
forwards them to the Agent SDK. In an **interactive** stage, the options
|
|
41
|
+
are silently ignored; drive behaviour via `chatFlags` in `clientOpts`
|
|
42
|
+
instead.
|
|
34
43
|
|
|
35
44
|
```ts
|
|
45
|
+
// ❌ Reference only — do not call query() like this from a workflow.
|
|
36
46
|
import { query } from "@anthropic-ai/claude-agent-sdk";
|
|
37
47
|
|
|
38
48
|
const result = query({
|
|
@@ -40,7 +50,7 @@ const result = query({
|
|
|
40
50
|
options: {
|
|
41
51
|
// Model selection
|
|
42
52
|
model: "claude-opus-4-6", // Full model ID or alias ("opus", "sonnet", "haiku")
|
|
43
|
-
effort: "high", // "low", "medium", "high", "max" (max is Opus 4.6 only)
|
|
53
|
+
effort: "high", // "low", "medium", "high", "xhigh", "max" (max is Opus 4.6/4.7 only)
|
|
44
54
|
thinking: { type: "adaptive" }, // Default for supported models; or { type: "enabled", budgetTokens: N }
|
|
45
55
|
maxTurns: 50, // Maximum conversation turns
|
|
46
56
|
maxBudgetUsd: 5.0, // Spending cap in USD
|
|
@@ -218,10 +228,15 @@ await ctx.stage({ name: "plan" }, {}, {
|
|
|
218
228
|
onErrorOccurred: (event) => { /* error handling */ },
|
|
219
229
|
},
|
|
220
230
|
|
|
221
|
-
// Advanced
|
|
222
|
-
|
|
231
|
+
// Advanced — auto-manage context via compaction. Pass an InfiniteSessionConfig,
|
|
232
|
+
// not a boolean. See docs/copilot-cli/sdk.md for the full threshold surface.
|
|
233
|
+
infiniteSessions: {
|
|
234
|
+
enabled: true,
|
|
235
|
+
backgroundCompactionThreshold: 0.8, // start compacting at 80% window usage
|
|
236
|
+
bufferExhaustionThreshold: 0.95, // block at 95% until compaction completes
|
|
237
|
+
},
|
|
223
238
|
}, async (s) => {
|
|
224
|
-
await s.session.send({ prompt: (
|
|
239
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
225
240
|
s.save(await s.session.getMessages());
|
|
226
241
|
});
|
|
227
242
|
```
|
|
@@ -231,7 +246,7 @@ await ctx.stage({ name: "plan" }, {}, {
|
|
|
231
246
|
```ts
|
|
232
247
|
// Approve everything (autonomous) — this is the default
|
|
233
248
|
await ctx.stage({ name: "plan" }, {}, { onPermissionRequest: approveAll }, async (s) => {
|
|
234
|
-
await s.session.send({ prompt: (
|
|
249
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
235
250
|
s.save(await s.session.getMessages());
|
|
236
251
|
});
|
|
237
252
|
|
|
@@ -251,7 +266,7 @@ await ctx.stage({ name: "plan" }, {}, {
|
|
|
251
266
|
}
|
|
252
267
|
},
|
|
253
268
|
}, async (s) => {
|
|
254
|
-
await s.session.send({ prompt: (
|
|
269
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
255
270
|
s.save(await s.session.getMessages());
|
|
256
271
|
});
|
|
257
272
|
```
|
|
@@ -295,7 +310,7 @@ await ctx.stage({ name: "implement" }, {}, {}, async (s) => {
|
|
|
295
310
|
// Basic prompt
|
|
296
311
|
const result = await s.client.session.prompt({
|
|
297
312
|
sessionID: s.session.id,
|
|
298
|
-
parts: [{ type: "text", text: (
|
|
313
|
+
parts: [{ type: "text", text: (s.inputs.prompt ?? "") }],
|
|
299
314
|
});
|
|
300
315
|
|
|
301
316
|
// Structured output
|
|
@@ -110,7 +110,7 @@ Use closures and variables for state within a single session:
|
|
|
110
110
|
|
|
111
111
|
for (let cycle = 0; cycle < 10; cycle++) {
|
|
112
112
|
const result = await s.session.query(
|
|
113
|
-
buildReviewPrompt((
|
|
113
|
+
buildReviewPrompt((s.inputs.prompt ?? ""), priorOutput),
|
|
114
114
|
);
|
|
115
115
|
|
|
116
116
|
// Accumulate findings
|
|
@@ -128,7 +128,7 @@ Use closures and variables for state within a single session:
|
|
|
128
128
|
consecutiveClean = 0;
|
|
129
129
|
|
|
130
130
|
// Apply fix
|
|
131
|
-
const fixResult = await s.session.query(buildFixSpec(review, (
|
|
131
|
+
const fixResult = await s.session.query(buildFixSpec(review, (s.inputs.prompt ?? "")));
|
|
132
132
|
priorOutput = extractAssistantText(fixResult, 0);
|
|
133
133
|
}
|
|
134
134
|
|
|
@@ -148,10 +148,6 @@ import { readFile, writeFile, mkdir } from "fs/promises";
|
|
|
148
148
|
import { join } from "path";
|
|
149
149
|
|
|
150
150
|
.run(async (ctx) => {
|
|
151
|
-
await ctx.stage({ name: "plan" }, {}, {}, async (s) => {
|
|
152
|
-
// ... plan session ...
|
|
153
|
-
});
|
|
154
|
-
|
|
155
151
|
const planHandle = await ctx.stage({ name: "plan" }, {}, {}, async (s) => {
|
|
156
152
|
// Write artifacts to session directory
|
|
157
153
|
const artifactDir = join(s.sessionDir, "artifacts");
|
|
@@ -214,6 +210,11 @@ export function buildReviewPrompt(spec: string, priorOutput?: string): string {
|
|
|
214
210
|
|
|
215
211
|
### Response parsers
|
|
216
212
|
|
|
213
|
+
For tolerant JSON parsing, see `failure-modes.md` §F8 — the canonical
|
|
214
|
+
`parseReviewResult` helper uses a layered fallback (direct parse → last
|
|
215
|
+
fenced block → last balanced object) that survives prose interleaving.
|
|
216
|
+
Copy that implementation into `helpers/parsers.ts` and import.
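
For orientation, the layered fallback described above (direct parse, then last fenced block, then last balanced object) might look roughly like the sketch below. It is not the canonical §F8 implementation: `ReviewLike`, `tryParse`, `lastFencedBlock`, `lastBalancedObject`, and `parseReviewSketch` are hypothetical names, and the brace scan is deliberately naive (it does not account for braces inside JSON strings).

```ts
// Hypothetical minimal shape; the real ReviewResult interface has more fields.
interface ReviewLike {
  findings: unknown[];
  overall_correctness?: string;
}

// Built at runtime so this snippet can itself sit inside a fenced block.
const FENCE = "`".repeat(3);

// Parse a candidate string and accept it only if it has a findings array.
function tryParse(candidate: string): ReviewLike | null {
  try {
    const parsed: unknown = JSON.parse(candidate);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      Array.isArray((parsed as { findings?: unknown }).findings)
    ) {
      return parsed as ReviewLike;
    }
  } catch {
    // Not valid JSON; the caller moves on to the next layer.
  }
  return null;
}

// Layer 2: body of the LAST fenced block (optionally tagged json).
function lastFencedBlock(text: string): string | null {
  const re = new RegExp(FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)\\n" + FENCE, "g");
  let last: string | null = null;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m[1] !== undefined) last = m[1];
  }
  return last;
}

// Layer 3: walk backwards from the last "}" to its matching "{".
function lastBalancedObject(text: string): string | null {
  const end = text.lastIndexOf("}");
  if (end === -1) return null;
  let depth = 0;
  for (let i = end; i >= 0; i--) {
    const ch = text[i];
    if (ch === "}") depth++;
    else if (ch === "{") {
      depth--;
      if (depth === 0) return text.slice(i, end + 1);
    }
  }
  return null;
}

function parseReviewSketch(text: string): ReviewLike | null {
  const direct = tryParse(text.trim());
  if (direct) return direct;
  const block = lastFencedBlock(text);
  if (block !== null) {
    const fromBlock = tryParse(block);
    if (fromBlock) return fromBlock;
  }
  const obj = lastBalancedObject(text);
  return obj !== null ? tryParse(obj) : null;
}
```

Each layer runs only when the previous one fails, so well-formed output takes the cheap path and prose-wrapped output still has a chance of being recovered.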
|
|
217
|
+
|
|
217
218
|
```ts
|
|
218
219
|
// .atomic/workflows/my-workflow/helpers/parsers.ts
|
|
219
220
|
export interface ReviewResult {
|
|
@@ -221,27 +222,9 @@ export interface ReviewResult {
|
|
|
221
222
|
overall_correctness: string;
|
|
222
223
|
}
|
|
223
224
|
|
|
224
|
-
|
|
225
|
-
* Always extract the LAST block, not the first — see failure-modes.md §F4/§F8. */
|
|
225
|
+
// See failure-modes.md §F8 for the full implementation.
|
|
226
226
|
export function parseReviewResult(text: string): ReviewResult | null {
|
|
227
|
-
//
|
|
228
|
-
try {
|
|
229
|
-
const parsed = JSON.parse(text);
|
|
230
|
-
if (parsed?.findings) return parsed;
|
|
231
|
-
} catch {}
|
|
232
|
-
|
|
233
|
-
// 2. Last fenced block
|
|
234
|
-
const blockRe = /```(?:json)?\s*\n([\s\S]*?)\n```/g;
|
|
235
|
-
let lastBlock: string | null = null;
|
|
236
|
-
let m: RegExpExecArray | null;
|
|
237
|
-
while ((m = blockRe.exec(text)) !== null) {
|
|
238
|
-
if (m[1]) lastBlock = m[1];
|
|
239
|
-
}
|
|
240
|
-
if (lastBlock) {
|
|
241
|
-
try { return JSON.parse(lastBlock); } catch {}
|
|
242
|
-
}
|
|
243
|
-
|
|
244
|
-
return null;
|
|
227
|
+
// ... three-layer fallback per §F8
|
|
245
228
|
}
|
|
246
229
|
```
|
|
247
230
|
|
|
@@ -6,58 +6,87 @@ For **invocation-time** inputs (the values the user supplies when they launch th
|
|
|
6
6
|
|
|
7
7
|
## Claude
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Never import `query` from `@anthropic-ai/claude-agent-sdk` inside a stage
|
|
10
|
+
callback — that's the F16 anti-pattern (see `failure-modes.md` §F16). All
|
|
11
|
+
options route through `s.session.query(prompt, sdkOptions)` in headless
|
|
12
|
+
stages, or through `chatFlags` in interactive stages.
|
|
10
13
|
|
|
11
|
-
|
|
14
|
+
### Via `canUseTool` callback (headless stages only)
|
|
12
15
|
|
|
13
|
-
|
|
14
|
-
|
|
16
|
+
`canUseTool` is an SDK option — it only applies in a headless stage, where
|
|
17
|
+
the second argument to `s.session.query()` is forwarded to the Agent SDK as
|
|
18
|
+
`Partial<SDKOptions>`. In interactive stages the option is silently ignored
|
|
19
|
+
because `s.session.query()` is driving the `claude` CLI binary, not the SDK.
|
|
15
20
|
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
21
|
+
```ts
|
|
22
|
+
await ctx.stage(
|
|
23
|
+
{ name: "implement", headless: true },
|
|
24
|
+
{}, {},
|
|
25
|
+
async (s) => {
|
|
26
|
+
const messages = await s.session.query(
|
|
27
|
+
"Implement the feature, but ask me before making any database changes.",
|
|
28
|
+
{
|
|
29
|
+
canUseTool: async (toolName, toolInput) => {
|
|
30
|
+
if (toolName === "Write" && typeof toolInput.file_path === "string" && toolInput.file_path.includes("migration")) {
|
|
31
|
+
const approved = await promptUser("Allow database migration?");
|
|
32
|
+
return approved
|
|
33
|
+
? { behavior: "allow", updatedInput: toolInput }
|
|
34
|
+
: { behavior: "deny", message: "User declined migration" };
|
|
35
|
+
}
|
|
36
|
+
return { behavior: "allow", updatedInput: toolInput };
|
|
37
|
+
},
|
|
30
38
|
},
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
},
|
|
39
|
+
);
|
|
40
|
+
s.save(s.sessionId);
|
|
41
|
+
return extractAssistantText(messages, 0);
|
|
42
|
+
},
|
|
43
|
+
);
|
|
35
44
|
```
|
|
36
45
|
|
|
37
46
|
### Via `AskUserQuestion` tool
|
|
38
47
|
|
|
39
|
-
Allow the agent to ask the user questions by including `AskUserQuestion` in
|
|
48
|
+
Allow the agent to ask the user questions by including `AskUserQuestion` in
|
|
49
|
+
`allowedTools`. This works for both interactive stages (via `chatFlags`) and
|
|
50
|
+
headless stages (via sdkOptions on `s.session.query()`).
|
|
51
|
+
|
|
52
|
+
**Interactive stage** — pass the tool allowlist via `chatFlags`:
|
|
40
53
|
|
|
41
54
|
```ts
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
55
|
+
await ctx.stage(
|
|
56
|
+
{ name: "implement" },
|
|
57
|
+
{ chatFlags: ["--allowed-tools", "Read,Write,Edit,Bash,AskUserQuestion"] },
|
|
58
|
+
{},
|
|
59
|
+
async (s) => {
|
|
60
|
+
await s.session.query(s.inputs.prompt ?? "");
|
|
61
|
+
s.save(s.sessionId);
|
|
46
62
|
},
|
|
47
|
-
|
|
63
|
+
);
|
|
48
64
|
```
|
|
49
65
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
For interactive sessions, use streaming mode to feed user input:
|
|
66
|
+
**Headless stage** — pass `allowedTools` in the sdkOptions:
|
|
53
67
|
|
|
54
68
|
```ts
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
69
|
+
await ctx.stage(
|
|
70
|
+
{ name: "implement", headless: true },
|
|
71
|
+
{}, {},
|
|
72
|
+
async (s) => {
|
|
73
|
+
const messages = await s.session.query(s.inputs.prompt ?? "", {
|
|
74
|
+
allowedTools: ["Read", "Write", "Edit", "Bash", "AskUserQuestion"],
|
|
75
|
+
});
|
|
76
|
+
s.save(s.sessionId);
|
|
77
|
+
return extractAssistantText(messages, 0);
|
|
78
|
+
},
|
|
79
|
+
);
|
|
59
80
|
```
|
|
60
81
|
|
|
82
|
+
### Via streaming input (headless stages only)
|
|
83
|
+
|
|
84
|
+
The Agent SDK's `streamInput()` feeds additional input while a query is
|
|
85
|
+
running. It's only reachable from headless stages via an async iterable
|
|
86
|
+
prompt — pass an `AsyncIterable<SDKUserMessage>` as the first argument to
|
|
87
|
+
`s.session.query()` instead of a plain string. In interactive stages, send
|
|
88
|
+
follow-up turns with another `s.session.query()` call to the same session.
|
|
89
|
+
|
|
61
90
|
## Copilot
|
|
62
91
|
|
|
63
92
|
Session callbacks (`onUserInputRequest`, `onElicitationRequest`,
|
|
@@ -77,7 +106,7 @@ await ctx.stage({ name: "plan" }, {}, {
|
|
|
77
106
|
return answer;
|
|
78
107
|
},
|
|
79
108
|
}, async (s) => {
|
|
80
|
-
await s.session.send({ prompt: (
|
|
109
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
81
110
|
s.save(await s.session.getMessages());
|
|
82
111
|
});
|
|
83
112
|
```
|
|
@@ -97,7 +126,7 @@ await ctx.stage({ name: "plan" }, {}, {
|
|
|
97
126
|
};
|
|
98
127
|
},
|
|
99
128
|
}, async (s) => {
|
|
100
|
-
await s.session.send({ prompt: (
|
|
129
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
101
130
|
s.save(await s.session.getMessages());
|
|
102
131
|
});
|
|
103
132
|
```
|
|
@@ -112,7 +141,7 @@ import { approveAll } from "@github/copilot-sdk";
|
|
|
112
141
|
|
|
113
142
|
// Explicit (same as the default):
|
|
114
143
|
await ctx.stage({ name: "plan" }, {}, { onPermissionRequest: approveAll }, async (s) => {
|
|
115
|
-
await s.session.send({ prompt: (
|
|
144
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
116
145
|
s.save(await s.session.getMessages());
|
|
117
146
|
});
|
|
118
147
|
```
|
|
@@ -131,7 +160,7 @@ await ctx.stage({ name: "plan" }, {}, {
|
|
|
131
160
|
return { kind: "approved" };
|
|
132
161
|
},
|
|
133
162
|
}, async (s) => {
|
|
134
|
-
await s.session.send({ prompt: (
|
|
163
|
+
await s.session.send({ prompt: (s.inputs.prompt ?? "") });
|
|
135
164
|
s.save(await s.session.getMessages());
|
|
136
165
|
});
|
|
137
166
|
```
|
|
@@ -186,11 +215,10 @@ Use user input results in conditional logic. This Claude example uses
|
|
|
186
215
|
user directly, and you parse the response to branch:
|
|
187
216
|
|
|
188
217
|
```ts
|
|
189
|
-
|
|
190
|
-
|
|
191
|
-
// Inside a ctx.stage() callback (Claude example):
|
|
218
|
+
// Inside a ctx.stage() callback (Claude example).
|
|
219
|
+
// AskUserQuestion must be in allowedTools — see "Via AskUserQuestion tool" above.
|
|
192
220
|
async (s) => {
|
|
193
|
-
const plan = await s.transcript("plan");
|
|
221
|
+
const plan = await s.transcript("plan"); // or s.transcript(handle) if a handle is in scope (preferred)
|
|
194
222
|
|
|
195
223
|
// Let the agent ask the user for approval via AskUserQuestion
|
|
196
224
|
const result = await s.session.query(
|
|
@@ -89,6 +89,18 @@ The nullish coalescing on `notes` handles the optional field case —
|
|
|
89
89
|
declared-but-unset inputs resolve to `undefined` unless they have a
|
|
90
90
|
`default`.
|
|
91
91
|
|
|
92
|
+
**Style convention.** Inside a stage callback, both `s.inputs.<name>` and
|
|
93
|
+
`ctx.inputs.<name>` resolve to the same value. Either of these patterns
|
|
94
|
+
works:
|
|
95
|
+
|
|
96
|
+
- **Destructure once at the top of `.run()`** so each stage closes over a
|
|
97
|
+
bare local. Best when many stages reference the same input.
|
|
98
|
+
- **Inline access** with `(s.inputs.<name> ?? "")` at each call site. Best
|
|
99
|
+
for short workflows or when each stage uses a different field.
|
|
100
|
+
|
|
101
|
+
Pick whichever reads cleaner for your workflow. Examples in other reference
|
|
102
|
+
files use the inline form for brevity in focused snippets.
|
|
103
|
+
|
|
92
104
|
## Declaring an input schema
|
|
93
105
|
|
|
94
106
|
Pass an `inputs` array to `defineWorkflow({ ... })`. Each entry is a
|
|
@@ -205,50 +217,15 @@ because the form teaches the schema as the user fills it in.
|
|
|
205
217
|
|
|
206
218
|
## Builtin protection
|
|
207
219
|
|
|
208
|
-
Builtin
|
|
209
|
-
|
|
210
|
-
|
|
211
|
-
time — the runtime drops user-defined workflows with reserved names
|
|
212
|
-
before any precedence merge. This prevents a user from accidentally
|
|
213
|
-
redefining the canonical version of a workflow in a way that confuses
|
|
214
|
-
teammates or breaks automation.
|
|
215
|
-
|
|
216
|
-
You'll still see shadowed local/global workflows in
|
|
217
|
-
`atomic workflow list` output so the collision is visible, but running
|
|
218
|
-
`atomic workflow -n ralph -a claude` will always land on the builtin.
|
|
219
|
-
|
|
220
|
-
The practical implication: **don't name a new workflow `ralph` or
|
|
221
|
-
`deep-research-codebase`**. Pick a distinct name and you'll never hit
|
|
222
|
-
this.
|
|
223
|
-
|
|
224
|
-
## Invocation cheat sheet
|
|
225
|
-
|
|
226
|
-
```bash
|
|
227
|
-
# List everything, grouped by source
|
|
228
|
-
atomic workflow list
|
|
229
|
-
|
|
230
|
-
# Launch the picker for a pinned agent
|
|
231
|
-
atomic workflow -a claude
|
|
220
|
+
Builtin names (`ralph`, `deep-research-codebase`) are reserved — pick
|
|
221
|
+
distinct names for your workflows. Full precedence + shadowing rules
|
|
222
|
+
live in `discovery-and-verification.md`.
|
|
232
223
|
|
|
233
|
-
|
|
234
|
-
atomic workflow -n hello -a claude "hello world"
|
|
224
|
+
## Invocation details
|
|
235
225
|
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
--focus=standard \
|
|
240
|
-
--notes="pay special attention to session token storage"
|
|
241
|
-
|
|
242
|
-
# Structured, long-form flag value (= form)
|
|
243
|
-
atomic workflow -n gen-spec -a claude --focus standard --research_doc notes.md
|
|
244
|
-
|
|
245
|
-
# Detached (background) — starts the orchestrator on the atomic tmux
|
|
246
|
-
# socket and returns immediately. The command prints the session name
|
|
247
|
-
# and hints for attaching later. Use this for scripted / CI runs where
|
|
248
|
-
# the caller shouldn't block on the TUI.
|
|
249
|
-
atomic workflow -n hello -a claude -d "hello world"
|
|
250
|
-
atomic workflow session connect atomic-wf-claude-hello-<id> # attach later
|
|
251
|
-
```
|
|
226
|
+
See SKILL.md §"Invocation surfaces" for the table of every top-level
|
|
227
|
+
command. This section only covers the flag-parsing nuances specific to
|
|
228
|
+
structured inputs.
|
|
252
229
|
|
|
253
230
|
Both `--flag=value` and `--flag value` forms are accepted. Short flags
|
|
254
231
|
(`-x value`) are NOT parsed as structured inputs — only long-form
|
|
@@ -257,6 +234,12 @@ Both `--flag=value` and `--flag value` forms are accepted. Short flags
|
|
|
257
234
|
The `-d` / `--detach` flag composes with any named shape (positional
|
|
258
235
|
prompt, structured flags) and is independent of the inputs schema.
|
|
259
236
|
|
|
237
|
+
```bash
|
|
238
|
+
# Structured, both flag forms work identically
|
|
239
|
+
atomic workflow -n gen-spec -a claude --focus=standard --research_doc=notes.md
|
|
240
|
+
atomic workflow -n gen-spec -a claude --focus standard --research_doc notes.md
|
|
241
|
+
```
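
To make the equivalence of the two flag forms concrete, here is a small sketch of how such parsing could work. It is an illustration under stated assumptions, not the CLI's actual parser: `parseStructuredFlags` is a hypothetical name, and the rule that a following token starting with `-` is never consumed as a value is an assumption of this sketch.

```ts
// Hypothetical parser: collects long-form flags only, in either accepted form.
function parseStructuredFlags(argv: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    // Short flags (-x) and positionals are not structured inputs; skip them.
    if (arg === undefined || !arg.startsWith("--")) continue;
    const body = arg.slice(2);
    const eq = body.indexOf("=");
    if (eq !== -1) {
      // --flag=value form
      out[body.slice(0, eq)] = body.slice(eq + 1);
      continue;
    }
    // --flag value form: consume the next token if it is not another flag.
    const next = argv[i + 1];
    if (next !== undefined && !next.startsWith("-")) {
      out[body] = next;
      i++;
    }
  }
  return out;
}

const equalsForm = parseStructuredFlags(["-a", "claude", "--focus=standard", "--research_doc=notes.md"]);
const spaceForm = parseStructuredFlags(["-a", "claude", "--focus", "standard", "--research_doc", "notes.md"]);
console.log(equalsForm, spaceForm);
```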
|
|
242
|
+
|
|
260
243
|
## Pitfalls
|
|
261
244
|
|
|
262
245
|
### Declare every field you access
|