npm - @bastani/atomic - Versions diffs - 0.5.12-3 → 0.5.12-5 - Mend

@bastani/atomic 0.5.12-3 → 0.5.12-5

Files changed (27)

package/.agents/skills/workflow-creator/SKILL.md CHANGED

@@ -97,9 +97,16 @@ Workflow quality depends on two disciplines: **prompt engineering** (crafting cl
 A workflow is a TypeScript file with a single `.run()` callback that orchestrates agent sessions dynamically. Inside the callback, `ctx.stage()` spawns sessions — each gets its own tmux window and graph node (unless running in headless mode). Native TypeScript handles all control flow: loops, conditionals, `Promise.all()`, `try`/`catch`.
 ```ts
-import { defineWorkflow } from "@bastani/atomic/workflows";
-export default defineWorkflow<"claude">({ name: "my-workflow", description: "..." })
+import { defineWorkflow, extractAssistantText } from "@bastani/atomic/workflows";
+export default defineWorkflow({
+    name: "my-workflow",
+    description: "...",
+    inputs: [
+      { name: "prompt", type: "text", required: true, description: "task to perform" },
+    ],
+  })
+  .for<"claude">()
   .run(async (ctx) => {
     const step1 = await ctx.stage({ name: "step-1" }, {}, {}, async (s) => { /* s.client, s.session */ });
     await ctx.stage({ name: "step-2" }, {}, {}, async (s) => { /* s.client, s.session */ });
@@ -121,7 +128,7 @@ await ctx.stage(
   async (s) => {
     const result = await s.session.query("Analyze the codebase structure.");
     s.save(s.sessionId);
-    return result.output;
+    return extractAssistantText(result, 0);
   },
 );
 ```
@@ -182,23 +189,23 @@ Workflow files live at `.atomic/workflows/<name>/<agent>/index.ts`. Discovery so
 | `WorkflowContext` (`ctx`) | `.run(async (ctx) => ...)` | No | Orchestration: spawn sessions, read transcripts, read `ctx.inputs` |
 | `SessionContext` (`s`) | `ctx.stage(opts, clientOpts, sessionOpts, async (s) => ...)` | Yes | Agent work: use `s.client` and `s.session` for SDK calls, save output |
-Both contexts expose `inputs: Record<string, string>`, `stage()`, `transcript()`, and `getMessages()`. See `references/getting-started.md` for the full `SessionContext` field reference.
+Both contexts expose typed `inputs` (keys restricted to declared input names), `stage()`, `transcript()`, and `getMessages()`. See `references/getting-started.md` for the full `SessionContext` field reference.
 ### Declared inputs: one API, three invocation surfaces
-Workflows receive user data exclusively through `ctx.inputs` (and `s.inputs` inside stage callbacks). You have two choices:
+Workflows receive user data exclusively through `ctx.inputs` (and `s.inputs` inside stage callbacks).
-**Free-form** — no schema. The positional CLI prompt lands under `ctx.inputs.prompt`. Read via `ctx.inputs.prompt ?? ""`.
+Declare `inputs: WorkflowInput[]` inline on `defineWorkflow()`. TypeScript infers literal field names from the array and restricts `ctx.inputs` to only those keys — accessing an undeclared field is a **compile-time error**. The CLI materializes one `--<field>=<value>` flag per entry, validates required fields + enum membership before launching, and the picker renders a form. Three field types: `string` (single-line), `text` (multi-line), `enum` (fixed set).
-**Structured** — declare `inputs: WorkflowInput[]` on `defineWorkflow`. The CLI materializes one `--<field>=<value>` flag per entry, validates required fields + enum membership before launching, and the picker renders a form. Three field types: `string` (single-line), `text` (multi-line), `enum` (fixed set).
+Workflows that accept a free-form prompt should declare it explicitly: `{ name: "prompt", type: "text", required: true }`.
-**Load `references/workflow-inputs.md`** for the full schema shape, validation rules, free-form-vs-structured decision guide, picker semantics, and invocation cheat sheet.
+**Load `references/workflow-inputs.md`** for the full schema shape, validation rules, picker semantics, and invocation cheat sheet.
 ### Invocation surfaces
 | Surface | Command | When |
 |---|---|---|
-| Named, free-form | `atomic workflow -n hello -a claude "fix the bug"` | Scripted runs; prompt lands in `ctx.inputs.prompt` |
+| Named, with prompt | `atomic workflow -n hello -a claude "fix the bug"` | Scripted runs; requires the workflow to declare a `prompt` input |
 | Named, structured | `atomic workflow -n gen-spec -a claude --research_doc=notes.md` | Scripted structured runs |
 | Interactive picker | `atomic workflow -a claude` | Discovery; shows fuzzy list + form |
 | List | `atomic workflow -l` | Browse everything by source |
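
The declared-inputs behaviour described in this hunk (one `--<field>=<value>` flag per entry, required-field and enum checks before launch, keys restricted to declared names) can be sketched roughly as follows. This is only an illustration under stated assumptions: `InputSpec`, `parseInputs`, and the `values` field are hypothetical stand-ins, not the package's real `WorkflowInput` type or CLI code, and positional prompt handling is left out.

```typescript
// Hypothetical, simplified shape of a declared input.
// The real WorkflowInput type in @bastani/atomic may differ.
interface InputSpec<N extends string> {
  name: N;
  type: "string" | "text" | "enum";
  required?: boolean;
  values?: readonly string[]; // assumed name for the allowed enum members
}

// Parse `--field=value` flags and validate them against the declared inputs.
// The result type only exposes the declared names, so `result.typo`
// would be rejected by the compiler.
function parseInputs<N extends string>(
  specs: readonly InputSpec<N>[],
  argv: readonly string[],
): Partial<Record<N, string>> {
  const raw = new Map<string, string>();
  for (const arg of argv) {
    if (!arg.startsWith("--")) continue; // positional arguments are ignored here
    const eq = arg.indexOf("=");
    if (eq === -1) continue;
    raw.set(arg.slice(2, eq), arg.slice(eq + 1));
  }

  const result: Partial<Record<N, string>> = {};
  for (const spec of specs) {
    const value = raw.get(spec.name);
    if (value === undefined) {
      if (spec.required) throw new Error(`missing required input: --${spec.name}`);
      continue;
    }
    if (spec.type === "enum" && (spec.values ?? []).indexOf(value) === -1) {
      throw new Error(`--${spec.name} must be one of: ${(spec.values ?? []).join(", ")}`);
    }
    result[spec.name] = value;
  }
  return result;
}

const inputs = parseInputs(
  [
    { name: "research_doc", type: "string", required: true },
    { name: "mode", type: "enum", values: ["fast", "deep"] },
  ],
  ["--research_doc=notes.md", "--mode=deep"],
);
console.log(inputs.research_doc); // prints "notes.md"
```

Inferring `N` from the inline array is what lets the compiler reject undeclared keys; validation still has to happen at runtime because the flag values only arrive when the command is invoked.
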
@@ -274,13 +281,13 @@ Then apply **design advisory checks** — these catch architectural and prompt q
 ### 2. Choose the Target Agent
-Pass a type parameter to `defineWorkflow<"agent">()` to narrow all context types and get correct `s.client`/`s.session` types:
+Use `.for<"agent">()` on the builder to narrow all context types and get correct `s.client`/`s.session` types. Call `.for()` **before** `.run()`:
-| Agent | Type Parameter | Primary Session API |
+| Agent | Builder Chain | Primary Session API |
 |-------|---------------|---------------------|
-| Claude | `defineWorkflow<"claude">` | `s.session.query(prompt)` — sends prompt to the Claude TUI pane |
-| Copilot | `defineWorkflow<"copilot">` | `s.session.send({ prompt })` — fire-and-forget; use `sendAndWait({ prompt }, timeoutMs)` only when the user explicitly requests timeout-based waiting |
-| OpenCode | `defineWorkflow<"opencode">` | `s.client.session.prompt({ sessionID: s.session.id, parts: [...] })` |
+| Claude | `defineWorkflow({...}).for<"claude">()` | `s.session.query(prompt)` — sends prompt to the Claude TUI pane |
+| Copilot | `defineWorkflow({...}).for<"copilot">()` | `s.session.send({ prompt })` — fire-and-forget; use `sendAndWait({ prompt }, timeoutMs)` only when the user explicitly requests timeout-based waiting |
+| OpenCode | `defineWorkflow({...}).for<"opencode">()` | `s.client.session.prompt({ sessionID: s.session.id, parts: [...] })` |
 The runtime manages client/session lifecycle automatically. For native SDK types and advanced APIs, import directly from the provider packages (`@github/copilot-sdk`, `@anthropic-ai/claude-agent-sdk`, `@opencode-ai/sdk/v2`).
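
One way to picture what `.for<"agent">()` does is a builder whose type parameter selects the session type passed to the callback. The sketch below is an assumption-laden illustration, not the package's implementation: `defineWorkflowSketch`, `WorkflowBuilderSketch`, `SessionFor`, and the three session interfaces are hypothetical and far simpler than the real provider types.

```typescript
// Hypothetical, simplified session shapes; the real provider types are richer.
interface ClaudeSessionSketch {
  query(prompt: string): Promise<string>;
}
interface CopilotSessionSketch {
  send(msg: { prompt: string }): Promise<void>;
}
interface OpenCodeSessionSketch {
  id: string;
}

type Agent = "claude" | "copilot" | "opencode";
type SessionFor = {
  claude: ClaudeSessionSketch;
  copilot: CopilotSessionSketch;
  opencode: OpenCodeSessionSketch;
};

// `A` carries no runtime data; it only selects the session type seen by run().
class WorkflowBuilderSketch<A extends Agent = Agent> {
  readonly name: string;

  constructor(name: string) {
    this.name = name;
  }

  // Re-types the builder for a single agent.
  for<B extends Agent>(): WorkflowBuilderSketch<B> {
    return new WorkflowBuilderSketch<B>(this.name);
  }

  run<R>(callback: (session: SessionFor[A]) => Promise<R>) {
    return { name: this.name, callback };
  }
}

function defineWorkflowSketch(opts: { name: string }): WorkflowBuilderSketch {
  return new WorkflowBuilderSketch(opts.name);
}

const workflow = defineWorkflowSketch({ name: "demo" })
  .for<"claude">()
  .run(async (session) => session.query("Analyze the codebase structure."));
// Without .for<"claude">(), `session` would be the union of all three shapes,
// and `session.query` would not type-check.
```

Because the narrowing happens on the builder, it has to be applied before `.run()` so that the callback parameter is already typed for one agent when the callback is checked.
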
@@ -309,7 +316,7 @@ Per-SDK cheat sheet:
 | Save output | `s.save(s.sessionId)` | `s.save(await s.session.getMessages())` | `s.save(result.data!)` |
 | Timeout | Per-query defaults via sessionOpts | N/A (`send` has no timeout; `sendAndWait` accepts optional timeout, default 60s) | N/A |
 | Context model | Tmux pane (accumulates across turns) | Fresh per `ctx.stage()` | Fresh per `ctx.stage()` |
-| Extract text | `result.output` (string) | `getAssistantText(messages)` (see `failure-modes.md` F1) | `extractResponseText(result.data!.parts)` (see `failure-modes.md` F3) |
+| Extract text | `extractAssistantText(result, 0)` (uses `SessionMessage[]`) | `getAssistantText(messages)` (see `failure-modes.md` F1) | `extractResponseText(result.data!.parts)` (see `failure-modes.md` F3) |
 The SDK ships two builtin workflows as production reference implementations:
 - **`ralph`** — iterative plan → orchestrate → review → debug loop (all 3 SDKs)
@@ -326,7 +333,7 @@ bun typecheck
 ### 5. Test the Workflow
 ```bash
-# Free-form workflow
+# Workflow with a declared prompt input
 atomic workflow -n <workflow-name> -a <agent> "<your prompt>"
 # Structured workflow

package/.agents/skills/workflow-creator/references/agent-sessions.md CHANGED

@@ -18,14 +18,14 @@ import { defineWorkflow } from "@bastani/atomic/workflows";
   await ctx.stage(
     { name: "implement", description: "Implement the feature" },
     {}, // clientOpts: chatFlags and readyTimeoutMs go here
-    {}, // sessionOpts: query defaults (timeoutMs, pollIntervalMs, etc.) go here
+    {}, // sessionOpts: query defaults (pollIntervalMs, readyTimeoutMs, etc.) go here
     async (s) => {
       // s.client — Claude CLI wrapper (already started by runtime)
       // s.session — session wrapper (ready to accept queries via s.session.query())
       // Send queries — Claude maintains conversation context across calls
+      // Returns SessionMessage[] (native SDK type from @anthropic-ai/claude-agent-sdk)
       const result = await s.session.query((s.inputs.prompt ?? ""));
-      // result.output contains the captured response text
       // Save transcript
       s.save(s.sessionId);
@@ -44,26 +44,32 @@ Client options (2nd arg to `ctx.stage()`):
 - `readyTimeoutMs` — timeout waiting for TUI readiness (default: 30s)
 Session options (3rd arg to `ctx.stage()`), applied as defaults to every `s.session.query()` call:
-- `timeoutMs` — timeout waiting for Claude to finish responding (default: 300s)
 - `pollIntervalMs` — polling interval (default: 2000ms)
 - `submitPresses` — C-m presses per submit round (default: 1)
 - `maxSubmitRounds` — max submit rounds (default: 6)
 - `readyTimeoutMs` — timeout waiting for pane readiness before sending (default: 30s)
+No manual timeout is needed — idle detection watches for the pane prompt to return, and the session transcript is used to extract the response text.
 ### Basic usage with `s.session.query()`
 ```ts
 import { defineWorkflow } from "@bastani/atomic/workflows";
-export default defineWorkflow<"claude">({ name: "implement" })
+export default defineWorkflow({
+    name: "implement",
+    inputs: [{ name: "prompt", type: "text", required: true, description: "task prompt" }],
+  })
+  .for<"claude">()
   .run(async (ctx) => {
     await ctx.stage(
       { name: "implement", description: "Implement the feature" },
       {},
       {},
       async (s) => {
-        const result = await s.session.query((s.inputs.prompt ?? ""));
-        // result.output contains the captured response text
+        const messages = await s.session.query((s.inputs.prompt ?? ""));
+        // messages is SessionMessage[] — native SDK type
+        // Use extractAssistantText(messages, 0) to get the text response
         s.save(s.sessionId);
       },
     );
@@ -71,7 +77,7 @@ export default defineWorkflow<"claude">({ name: "implement" })
   .compile();
 ```
-`s.session.query(prompt)` sends text to the Claude pane, verifies delivery, retries if needed, and waits for output stabilization. Returns `{ output: string }`.
+`s.session.query(prompt)` sends text to the Claude pane, verifies delivery, retries if needed, and waits for output stabilization. Returns `SessionMessage[]` (the native transcript messages from this turn, imported from `@anthropic-ai/claude-agent-sdk`). Use `extractAssistantText(messages, 0)` to extract the plain text response.
 ### Multi-turn conversations
@@ -183,43 +189,72 @@ const result = query({ prompt: "Continue...", options: { resume: sessionId } });
 const result = query({ prompt: "Try a different approach", options: { resume: sessionId, forkSession: true } });
 ```
-### Sub-agent delegation via `s.session.query()`
+### Sub-agent delegation
+For stages that call a single sub-agent, use `--agent` (interactive) or the SDK `agent` option (headless) to route all prompts through that agent. The agent must be defined in `.claude/agents/` or `.agents/skills/`.
-Invoke named sub-agents by prefixing the prompt with `@"agent-name (agent)"`. The agent must be defined in `.claude/agents/`:
+**Interactive stages** — pass `--agent` via `chatFlags` in client opts (2nd arg):
 ```ts
 .run(async (ctx) => {
-  await ctx.stage({ name: "plan-and-implement" }, {}, {}, async (s) => {
-    // Delegate to the "planner" agent
-    await s.session.query(`@"planner (agent)" Create a plan for: ${(s.inputs.prompt ?? "")}`);
+  await ctx.stage(
+    { name: "plan" },
+    { chatFlags: ["--agent", "planner", "--allow-dangerously-skip-permissions", "--dangerously-skip-permissions"] },
+    {},
+    async (s) => {
+      await s.session.query(`Create a plan for: ${(s.inputs.prompt ?? "")}`);
+      s.save(s.sessionId);
+    },
+  );
+})
+```
-    // Delegate to the "orchestrator" agent
-    await s.session.query(`@"orchestrator (agent)" Execute the plan above.`);
+**Headless stages** — pass `agent` via SDK options in the `query()` call:
-    s.save(s.sessionId);
-  });
+```ts
+.run(async (ctx) => {
+  const handle = await ctx.stage(
+    { name: "locate", headless: true },
+    {}, {},
+    async (s) => {
+      const result = await s.session.query(
+        "Find all API endpoint files",
+        { agent: "codebase-locator", permissionMode: "bypassPermissions", allowDangerouslySkipPermissions: true },
+      );
+      s.save(s.sessionId);
+      return extractAssistantText(result, 0);
+    },
+  );
 })
 ```
+> **Note:** The `@"agent-name (agent)"` prompt prefix is for multi-agent conversations in a single stage where you switch between agents mid-session. For single-agent stages, prefer `--agent` (interactive) or the `agent` SDK option (headless) as shown above.
 ### Headless mode (background stages)
-Claude headless stages use the Agent SDK's `query()` API directly in-process instead of automating a tmux pane. Set `headless: true` in the stage options:
+Claude headless stages use the Agent SDK's `query()` API directly in-process instead of automating a tmux pane. Set `headless: true` in the stage options. SDK options like `agent`, `permissionMode`, and `allowDangerouslySkipPermissions` can be passed directly in the `query()` call:
 ```ts
+import { defineWorkflow, extractAssistantText } from "@bastani/atomic/workflows";
+// ...
 await ctx.stage(
   { name: "background-analysis", headless: true },
   {}, {},
   async (s) => {
-    // s.session.query() works identically — the runtime uses
-    // HeadlessClaudeSessionWrapper which calls the Agent SDK directly
-    const result = await s.session.query("Analyze the codebase.");
+    const result = await s.session.query(
+      "Analyze the codebase.",
+      { agent: "codebase-analyzer", permissionMode: "bypassPermissions", allowDangerouslySkipPermissions: true },
+    );
     s.save(s.sessionId);
-    return result.output;
+    return extractAssistantText(result, 0);
   },
 );
 ```
-The callback interface is identical to interactive stages. Internally, the runtime uses `HeadlessClaudeClientWrapper` (no-op start/stop) and `HeadlessClaudeSessionWrapper` (calls `query()` from `@anthropic-ai/claude-agent-sdk` directly). No tmux pane is created, and the stage is invisible in the workflow graph.
+The callback interface is identical to interactive stages — `s.session.query()` returns `SessionMessage[]` in both cases. Internally, the runtime uses `HeadlessClaudeSessionWrapper` which calls `query()` from `@anthropic-ai/claude-agent-sdk` directly. No tmux pane is created, and the stage is invisible in the workflow graph.
+**Design principle:** Never create custom message types. All provider return types are native SDK types — `SessionMessage[]` for Claude, `SessionEvent[]` for Copilot, `SessionPromptResponse` for OpenCode. Use `extractAssistantText()` to extract plain text from Claude's `SessionMessage[]`.
 ## Copilot SDK
@@ -230,7 +265,11 @@ Copilot uses a client-server architecture. The runtime auto-creates a `CopilotCl
 ```ts
 import { defineWorkflow } from "@bastani/atomic/workflows";
-export default defineWorkflow<"copilot">({ name: "implement" })
+export default defineWorkflow({
+    name: "implement",
+    inputs: [{ name: "prompt", type: "text", required: true, description: "task prompt" }],
+  })
+  .for<"copilot">()
   .run(async (ctx) => {
     await ctx.stage(
       { name: "implement" },
@@ -563,7 +602,11 @@ OpenCode uses a client-server model. The runtime auto-creates an `OpencodeClient
 ```ts
 import { defineWorkflow } from "@bastani/atomic/workflows";
-export default defineWorkflow<"opencode">({ name: "implement" })
+export default defineWorkflow({
+    name: "implement",
+    inputs: [{ name: "prompt", type: "text", required: true, description: "task prompt" }],
+  })
+  .for<"opencode">()
   .run(async (ctx) => {
     await ctx.stage(
       { name: "implement" },

package/.agents/skills/workflow-creator/references/computation-and-validation.md CHANGED

@@ -34,11 +34,13 @@ Each SDK returns responses in different formats. Use helpers to extract text:
 ### Claude
-`s.session.query()` returns `{ output: string, delivered: boolean }` — the captured response text.
+`s.session.query()` returns `SessionMessage[]` — the native SDK transcript messages from this turn. Use `extractAssistantText()` to extract the plain text:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 const result = await s.session.query("...");
-const text = result.output; // Already a string
+const text = extractAssistantText(result, 0); // Extract text from SessionMessage[]
 ```
 ### Copilot
@@ -212,7 +214,7 @@ ${implTranscript.content}
 Respond with JSON: { "correctness": N, "completeness": N, "style": N, "pass": boolean, "issues": [...] }`,
     );
-    const scores = parseJsonResponse(result.output);
+    const scores = parseJsonResponse(extractAssistantText(result, 0));
     if (!scores.pass) {
       await s.session.query(`Fix these quality issues:\n${scores.issues.join("\n")}`);

package/.agents/skills/workflow-creator/references/control-flow.md CHANGED

@@ -16,6 +16,8 @@ Prefer inter-session control flow when you want the workflow graph to reflect wh
 Run a triage session first, then branch at the `.run()` level to spawn a purpose-built session for each outcome. Every branch appears as a distinct node in the graph:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 .run(async (ctx) => {
   // Step 1: Classify the request
   const triage = await ctx.stage({ name: "triage" }, {}, {}, async (s) => {
@@ -23,7 +25,7 @@ Run a triage session first, then branch at the `.run()` level to spawn a purpose
       `Classify this as "bug", "feature", or "question": ${(ctx.inputs.prompt ?? "")}`,
     );
     s.save(s.sessionId);
-    return result.output.toLowerCase();
+    return extractAssistantText(result, 0).toLowerCase();
   });
   const classification = triage.result;
@@ -53,13 +55,15 @@ Run a triage session first, then branch at the `.run()` level to spawn a purpose
 When the branching logic is simple and you want the agent to retain full context across both the triage and the action, do it all inside a single session callback:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 .run(async (ctx) => {
   await ctx.stage({ name: "triage-and-act" }, {}, {}, async (s) => {
     const triageResult = await s.session.query(
       `Classify this as "bug", "feature", or "question": ${(ctx.inputs.prompt ?? "")}`,
     );
-    const classification = triageResult.output.toLowerCase();
+    const classification = extractAssistantText(triageResult, 0).toLowerCase();
     if (classification.includes("bug")) {
       await s.session.query("Diagnose and fix the bug described above.");
@@ -81,6 +85,8 @@ When the branching logic is simple and you want the agent to retain full context
 Each iteration spawns its own session, so the graph shows exactly how many passes ran:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 .run(async (ctx) => {
   const MAX_ITERATIONS = 5;
@@ -88,7 +94,7 @@ Each iteration spawns its own session, so the graph shows exactly how many passe
     const iteration = await ctx.stage({ name: `refine-${i}` }, {}, {}, async (s) => {
       const result = await s.session.query(`Iteration ${i}: Improve the implementation.`);
       s.save(s.sessionId);
-      return result.output;
+      return extractAssistantText(result, 0);
     });
     if (iteration.result.includes("LGTM") || iteration.result.includes("no issues")) {
@@ -103,6 +109,8 @@ Each iteration spawns its own session, so the graph shows exactly how many passe
 When the agent must remember every prior iteration's output to make progress, keep the loop inside one session:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 .run(async (ctx) => {
   await ctx.stage({ name: "iterative-refinement" }, {}, {}, async (s) => {
     const MAX_ITERATIONS = 5;
@@ -110,7 +118,7 @@ When the agent must remember every prior iteration's output to make progress, ke
     for (let i = 0; i < MAX_ITERATIONS; i++) {
       const result = await s.session.query(`Iteration ${i + 1}: Improve the implementation.`);
-      if (result.output.includes("LGTM") || result.output.includes("no issues")) {
+      if (extractAssistantText(result, 0).includes("LGTM") || extractAssistantText(result, 0).includes("no issues")) {
         break;
       }
     }
@@ -125,6 +133,8 @@ When the agent must remember every prior iteration's output to make progress, ke
 The inter-session pattern is the right fit here: every review and every fix becomes its own graph node, so the executed path is fully visible. This is the production-grade approach with consecutive clean-pass detection:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 .run(async (ctx) => {
   const MAX_CYCLES = 10;
   const CLEAN_THRESHOLD = 2;
@@ -135,7 +145,7 @@ The inter-session pattern is the right fit here: every review and every fix beco
     const review = await ctx.stage({ name: `review-${cycle}` }, {}, {}, async (s) => {
       const result = await s.session.query(buildReviewPrompt((ctx.inputs.prompt ?? "")));
       s.save(s.sessionId);
-      return result.output;
+      return extractAssistantText(result, 0);
     });
     const reviewRaw = review.result;
@@ -292,12 +302,14 @@ Each iteration's stages form a natural chain because each `await` follows the pr
 Headless stages (`{ headless: true }`) are **invisible in the workflow graph** — they don't consume or update the execution frontier. This means they don't affect the parent-child edges inferred for visible stages.
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 // ✅ Graph renders: seed → merge (headless stages are transparent)
 .run(async (ctx) => {
   const seed = await ctx.stage({ name: "seed" }, {}, {}, async (s) => {
     const result = await s.session.query("Describe the project.");
     s.save(s.sessionId);
-    return result.output;
+    return extractAssistantText(result, 0);
   });
   // Three parallel headless stages — invisible in the graph
@@ -305,17 +317,17 @@ Headless stages (`{ headless: true }`) are **invisible in the workflow graph**
     ctx.stage({ name: "gather-a", headless: true }, {}, {}, async (s) => {
       const result = await s.session.query(`List 3 pros:\n\n${seed.result}`);
       s.save(s.sessionId);
-      return result.output;
+      return extractAssistantText(result, 0);
     }),
     ctx.stage({ name: "gather-b", headless: true }, {}, {}, async (s) => {
       const result = await s.session.query(`List 3 cons:\n\n${seed.result}`);
       s.save(s.sessionId);
-      return result.output;
+      return extractAssistantText(result, 0);
     }),
     ctx.stage({ name: "gather-c", headless: true }, {}, {}, async (s) => {
       const result = await s.session.query(`List 3 uses:\n\n${seed.result}`);
       s.save(s.sessionId);
-      return result.output;
+      return extractAssistantText(result, 0);
     }),
   ]);
@@ -417,12 +429,14 @@ async function retryWithBackoff<T>(
 Combine loops, conditionals, and inter-session data passing. Session callbacks return typed values via `SessionHandle<T>.result`, and `s.transcript(handle)` accepts a prior `SessionHandle` to read another session's saved output:
 ```ts
+import { extractAssistantText } from "@bastani/atomic/workflows";
 .run(async (ctx) => {
   // Step 1: Analyse — result is available as a typed handle
   const analysisHandle = await ctx.stage({ name: "analyze" }, {}, {}, async (s) => {
     const result = await s.session.query(`Analyse the task: ${(ctx.inputs.prompt ?? "")}`);
     s.save(s.sessionId);
-    return result.output;
+    return extractAssistantText(result, 0);
   });
   const isComplex = analysisHandle.result.includes("complex");
@@ -439,7 +453,7 @@ Combine loops, conditionals, and inter-session data passing. Session callbacks r
           : "Continue improving the implementation.",
       );
       s.save(s.sessionId);
-      return result.output;
+      return extractAssistantText(result, 0);
     });
     if (impl.result.includes("all tests pass")) {

package/.agents/skills/workflow-creator/references/discovery-and-verification.md CHANGED

@@ -66,10 +66,11 @@ Every workflow file must use `export default` with a compiled workflow:
 ```ts
 import { defineWorkflow } from "@bastani/atomic/workflows";
-export default defineWorkflow<"claude">({
+export default defineWorkflow({
     name: "my-workflow",
     description: "What this workflow does",
   })
+  .for<"claude">()
   .run(async (ctx) => {
     await ctx.stage({ name: "step-1" }, {}, {}, async (s) => { /* ... */ });
     await ctx.stage({ name: "step-2" }, {}, {}, async (s) => { /* ... */ });
@@ -126,7 +127,7 @@ This catches:
 - SDK type mismatches (e.g., passing wrong types to `s.save()`)
 - Incorrect provider-specific method calls (e.g., calling `s.session.query()` in a Copilot workflow)
-**Note on generic type parameter:** Using `defineWorkflow<"claude">()`, `defineWorkflow<"copilot">()`, or `defineWorkflow<"opencode">()` narrows `s.client` and `s.session` to the correct provider types throughout the `.run()` callback and all `ctx.stage()` callbacks. Without the type parameter, `s.client` and `s.session` resolve to a union of all provider types, which requires type guards to use provider-specific methods.
+**Note on provider type parameter:** Using `.for<"claude">()`, `.for<"copilot">()`, or `.for<"opencode">()` narrows `s.client` and `s.session` to the correct provider types throughout the `.run()` callback and all `ctx.stage()` callbacks. Without the type parameter, `s.client` and `s.session` resolve to a union of all provider types, which requires type guards to use provider-specific methods.
 ## Testing

package/.agents/skills/workflow-creator/references/failure-modes.md CHANGED

@@ -33,7 +33,7 @@ Silent failures are catalogued first below. Loud failures are grouped at the end
 | [F1](#f1-copilot-getlastassistanttext-returns-empty-string) | Copilot: `getLastAssistantText` returns empty string | Copilot | silent |
 | [F2](#f2-copilot-sub-agent-messages-pollute-getmessages-stream) | Copilot: sub-agent messages pollute `getMessages()` stream | Copilot | silent |
 | [F3](#f3-opencode-result-parts-contain-non-text-parts) | OpenCode: `result.data.parts` contains non-text parts | OpenCode | silent |
-| [F4](#f4-claudequery-output-includes-tui-scrollback-not-just-the-last-turn) | Claude: `s.session.query()` output includes TUI scrollback, not just the last turn | Claude | silent |
+| [F4](#f4-claude-ssessionquery-returns-sessionmessage-extract-text-with-extractassistanttext) | Claude: `s.session.query()` returns `SessionMessage[]` — extract text with `extractAssistantText(result, 0)` | Claude | silent |
 | [F5](#f5-fresh-session-wipes-prior-stage-context) | Fresh session wipes prior stage context | Copilot, OpenCode | silent |
 | [F6](#f6-planner-prompts-that-dont-request-trailing-commentary-produce-empty-handoffs) | Planner prompts that don't request trailing commentary produce empty handoffs | all | silent |
 | [F7](#f7-continued-sessions-accumulate-state-across-loop-iterations) | Continued sessions accumulate state across loop iterations (lost-in-middle) | all | silent |
@@ -176,49 +176,48 @@ function extractResponseText(
 ---
-## F4. Claude: `s.session.query()` output includes TUI scrollback, not just the last turn
+## F4. Claude: `s.session.query()` returns `SessionMessage[]` — extract text with `extractAssistantText`
-**Symptom.** Parsers matching "the last fenced JSON block" pick up an old
-turn's JSON because the captured output contains multiple turns of scrollback.
+**Symptom.** Workflow code tries to access `.output` or `.text` on the
+result of `s.session.query()` and gets `undefined`, or passes the result
+directly to a string parser that throws.
-**Root cause.** `s.session.query()` captures the tmux pane's visible scrollback after output stabilizes — it's not a scoped
-"this call's response only" string. Earlier sub-agent output, prior-turn
-assistant text, and even the user's own prompt echo all end up in
-`result.output`.
+**Root cause.** `s.session.query()` returns `SessionMessage[]` — the native
+Claude Agent SDK type. It does NOT return a `{ output: string }` object or a
+raw TUI scrollback string. The assistant's text lives inside structured content
+blocks within those messages and must be extracted explicitly.
-**Affected SDKs.** Claude (tmux-based query).
+**Affected SDKs.** Claude.
 ### ❌ Wrong
 ```ts
-// Assumes `output` is only the latest turn's JSON
-const parsed = JSON.parse(reviewResult.output);
+// result is SessionMessage[], not { output: string }
+const result = await s.session.query(prompt);
+const parsed = JSON.parse(result.output);  // result.output is undefined, so JSON.parse throws a SyntaxError
 ```
-### ✅ Right — extract the LAST fenced block, not the first
+### ✅ Right — use `extractAssistantText(result, 0)`
 ```ts
-export function extractLastFencedBlock(
-  content: string,
-  lang = "json",
-): string | null {
-  const re = new RegExp("```" + lang + "\\s*\\n([\\s\\S]*?)\\n```", "g");
-  let last: string | null = null;
-  let match: RegExpExecArray | null;
-  while ((match = re.exec(content)) !== null) {
-    if (match[1]) last = match[1];
-  }
-  return last;
-}
+import { extractAssistantText } from "@bastani/atomic/workflows";
+const result = await s.session.query(prompt);
+const text = extractAssistantText(result, 0);
+// Now `text` is the concatenated assistant prose for this turn
 ```
+`extractAssistantText(msgs, afterIndex)` walks `SessionMessage[]` from
+`afterIndex` forward, pulls `TextBlock.text` from each `assistant` message's
+content array, and joins them with newlines.
 The ralph helpers in `src/sdk/workflows/builtin/ralph/helpers/prompts.ts`
-(`parseReviewResult`, `extractMarkdownBlock`) use this pattern — always take
-the **last** block, never the first.
+(`parseReviewResult`, `extractMarkdownBlock`) use this pattern — always
+extract text first, then parse.
-**Detection.** Run the workflow twice in the same session; if the
-downstream parser returns stale data from the prior iteration, F4 is the
-cause.
+**Detection.** Log `Array.isArray(result)` after `s.session.query()`. If it's
+`true`, you need `extractAssistantText`. Accessing `.output` on an array
+returns `undefined`.
 ---
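
The F4 description of `extractAssistantText(msgs, afterIndex)` (walk the messages from `afterIndex` forward, take the text blocks of each assistant message, join with newlines) can be sketched as below. This is a stand-in written from that description only: `SketchMessage`, `ContentBlock`, and `extractAssistantTextSketch` are hypothetical, the real `SessionMessage` type has a different and richer shape, and treating `afterIndex` as inclusive is an assumption based on the `extractAssistantText(result, 0)` calls in the docs.

```typescript
// Hypothetical, minimal stand-ins for the SDK types.
interface TextBlock {
  type: "text";
  text: string;
}
interface ToolUseBlock {
  type: "tool_use";
  name: string;
}
type ContentBlock = TextBlock | ToolUseBlock;

interface SketchMessage {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Collect the text blocks of every assistant message at or after `afterIndex`
// and join them with newlines. Non-text blocks and user messages are skipped.
function extractAssistantTextSketch(
  msgs: readonly SketchMessage[],
  afterIndex: number,
): string {
  const texts: string[] = [];
  for (let i = afterIndex; i < msgs.length; i++) {
    const msg = msgs[i];
    if (msg === undefined || msg.role !== "assistant") continue;
    for (const block of msg.content) {
      if (block.type === "text") texts.push(block.text);
    }
  }
  return texts.join("\n");
}

const turn: SketchMessage[] = [
  { role: "user", content: [{ type: "text", text: "Review the change." }] },
  {
    role: "assistant",
    content: [
      { type: "text", text: "Plan ready." },
      { type: "tool_use", name: "read_file" },
    ],
  },
  { role: "assistant", content: [{ type: "text", text: "Done." }] },
];

console.log(extractAssistantTextSketch(turn, 0)); // prints "Plan ready." then "Done." on separate lines
console.log(extractAssistantTextSketch(turn, 2)); // prints "Done."
```

The point of the sketch is that the result of a query is structured data, so any JSON or marker parsing should run on the extracted string rather than on the message array itself.
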
@@ -232,9 +231,9 @@ returns a **fresh, empty conversation**. The CLIENT object is just the
 transport — each session is independent. The new session sees only what you
 put in its first prompt.
-**Affected SDKs.** Copilot, OpenCode. (Claude's tmux pane model is
-different — context accumulates in the same pane, so this failure mode
-does NOT apply to `s.session.query()`.)
+**Affected SDKs.** Copilot, OpenCode. (Claude's session model is
+different — context accumulates within the same SDK session, so this failure
+mode does NOT apply to `s.session.query()`.)
 ### ❌ Wrong
@@ -329,8 +328,8 @@ or "forgetting" a requirement that was clearly stated in the original spec.
 session, and context grows past the attention window. The model starts
 dropping middle-of-context information (classic lost-in-middle).
-**Affected SDKs.** All three. Claude's long tmux pane is especially
-vulnerable because the scrollback captures every intermediate turn.
+**Affected SDKs.** All three. Claude's session transcript accumulates every
+intermediate turn, so long loops grow the context window substantially.
 ### ❌ Wrong — unbounded loop on a single session
@@ -455,8 +454,8 @@ expects, and the runtime doesn't type-check the argument beyond "anything".
 ### ❌ Wrong
 ```ts
-// Claude — saves the wrong thing
-s.save(result.output);
+// Claude — saves the wrong thing (result is SessionMessage[], not { output: string })
+s.save(result.output);  // result.output is undefined on an array; use s.save(s.sessionId)
 // Copilot — saves an empty array if called before send
 s.save(await s.session.getMessages());