npm - @bastani/atomic - Versions diffs - 0.6.4 → 0.6.5 - Mend

@bastani/atomic 0.6.4 → 0.6.5

Files changed (120)

package/.agents/skills/create-spec/SKILL.md +6 -3
package/.agents/skills/tdd/SKILL.md +107 -0
package/.agents/skills/tdd/deep-modules.md +33 -0
package/.agents/skills/tdd/interface-design.md +31 -0
package/.agents/skills/tdd/mocking.md +59 -0
package/.agents/skills/tdd/refactoring.md +10 -0
package/.agents/skills/tdd/tests.md +61 -0
package/.agents/skills/workflow-creator/SKILL.md +550 -0
package/.agents/skills/workflow-creator/references/agent-sessions.md +891 -0
package/.agents/skills/workflow-creator/references/agent-setup-recipe.md +266 -0
package/.agents/skills/workflow-creator/references/computation-and-validation.md +201 -0
package/.agents/skills/workflow-creator/references/control-flow.md +470 -0
package/.agents/skills/workflow-creator/references/failure-modes.md +1014 -0
package/.agents/skills/workflow-creator/references/getting-started.md +392 -0
package/.agents/skills/workflow-creator/references/registry-and-validation.md +141 -0
package/.agents/skills/workflow-creator/references/running-workflows.md +418 -0
package/.agents/skills/workflow-creator/references/session-config.md +384 -0
package/.agents/skills/workflow-creator/references/state-and-data-flow.md +356 -0
package/.agents/skills/workflow-creator/references/user-input.md +234 -0
package/.agents/skills/workflow-creator/references/workflow-inputs.md +392 -0
package/.claude/agents/debugger.md +2 -2
package/.claude/agents/reviewer.md +1 -1
package/.claude/agents/worker.md +2 -2
package/.github/agents/debugger.md +1 -1
package/.github/agents/worker.md +1 -1
package/.mcp.json +5 -1
package/.opencode/agents/debugger.md +1 -1
package/.opencode/agents/worker.md +1 -1
package/README.md +236 -201
package/dist/sdk/define-workflow.d.ts +11 -6
package/dist/sdk/define-workflow.d.ts.map +1 -1
package/dist/sdk/errors.d.ts +10 -0
package/dist/sdk/errors.d.ts.map +1 -1
package/dist/sdk/index.d.ts +21 -9
package/dist/sdk/index.d.ts.map +1 -1
package/dist/sdk/primitives/inputs.d.ts +36 -0
package/dist/sdk/primitives/inputs.d.ts.map +1 -0
package/dist/sdk/primitives/metadata.d.ts +40 -0
package/dist/sdk/primitives/metadata.d.ts.map +1 -0
package/dist/sdk/primitives/run.d.ts +57 -0
package/dist/sdk/primitives/run.d.ts.map +1 -0
package/dist/sdk/primitives/sessions.d.ts +128 -0
package/dist/sdk/primitives/sessions.d.ts.map +1 -0
package/dist/sdk/runtime/executor.d.ts +24 -56
package/dist/sdk/runtime/executor.d.ts.map +1 -1
package/dist/sdk/runtime/orchestrator-entry.d.ts +26 -0
package/dist/sdk/runtime/orchestrator-entry.d.ts.map +1 -0
package/dist/sdk/runtime/tmux.d.ts +20 -0
package/dist/sdk/runtime/tmux.d.ts.map +1 -1
package/dist/sdk/types.d.ts +26 -86
package/dist/sdk/types.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/deep-research-codebase/claude/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/deep-research-codebase/copilot/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/deep-research-codebase/opencode/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/open-claude-design/claude/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/open-claude-design/copilot/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/open-claude-design/opencode/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/ralph/claude/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/ralph/copilot/index.d.ts.map +1 -1
package/dist/sdk/workflows/builtin/ralph/opencode/index.d.ts.map +1 -1
package/dist/sdk/workflows/index.d.ts +20 -12
package/dist/sdk/workflows/index.d.ts.map +1 -1
package/dist/services/config/additional-instructions.d.ts +1 -1
package/dist/services/config/additional-instructions.d.ts.map +1 -1
package/package.json +4 -4
package/src/cli.ts +39 -56
package/src/commands/builtin-registry.ts +37 -0
package/src/commands/cli/chat/index.ts +1 -3
package/src/{sdk → commands/cli}/management-commands.ts +15 -55
package/src/commands/cli/session.ts +1 -1
package/src/commands/cli/workflow-command.test.ts +250 -16
package/src/commands/cli/workflow-inputs.test.ts +1 -0
package/src/commands/cli/workflow-inputs.ts +13 -3
package/src/commands/cli/workflow-list.test.ts +1 -0
package/src/commands/cli/workflow-list.ts +0 -0
package/src/commands/cli/workflow-status.ts +1 -1
package/src/commands/cli/workflow.ts +191 -11
package/src/sdk/define-workflow.test.ts +47 -16
package/src/sdk/define-workflow.ts +24 -6
package/src/sdk/errors.test.ts +11 -0
package/src/sdk/errors.ts +13 -0
package/src/sdk/index.test.ts +92 -0
package/src/sdk/index.ts +71 -15
package/src/sdk/primitives/inputs.ts +48 -0
package/src/sdk/primitives/metadata.ts +63 -0
package/src/sdk/primitives/run.ts +81 -0
package/src/sdk/primitives/sessions.test.ts +594 -0
package/src/sdk/primitives/sessions.ts +328 -0
package/src/sdk/runtime/executor.ts +36 -115
package/src/sdk/runtime/orchestrator-entry.ts +110 -0
package/src/sdk/runtime/tmux.ts +33 -0
package/src/sdk/types.ts +26 -91
package/src/sdk/workflows/builtin/deep-research-codebase/claude/index.ts +1 -0
package/src/sdk/workflows/builtin/deep-research-codebase/copilot/index.ts +1 -0
package/src/sdk/workflows/builtin/deep-research-codebase/opencode/index.ts +1 -0
package/src/sdk/workflows/builtin/open-claude-design/claude/index.ts +1 -0
package/src/sdk/workflows/builtin/open-claude-design/copilot/index.ts +1 -0
package/src/sdk/workflows/builtin/open-claude-design/opencode/index.ts +1 -0
package/src/sdk/workflows/builtin/ralph/claude/index.ts +1 -0
package/src/sdk/workflows/builtin/ralph/copilot/index.ts +1 -0
package/src/sdk/workflows/builtin/ralph/opencode/index.ts +1 -0
package/src/sdk/workflows/index.ts +68 -51
package/src/services/config/additional-instructions.ts +1 -1
package/.agents/skills/test-driven-development/SKILL.md +0 -371
package/.agents/skills/test-driven-development/testing-anti-patterns.md +0 -299
package/dist/commands/cli/session.d.ts +0 -67
package/dist/commands/cli/session.d.ts.map +0 -1
package/dist/commands/cli/workflow-status.d.ts +0 -63
package/dist/commands/cli/workflow-status.d.ts.map +0 -1
package/dist/sdk/commander.d.ts +0 -74
package/dist/sdk/commander.d.ts.map +0 -1
package/dist/sdk/management-commands.d.ts +0 -42
package/dist/sdk/management-commands.d.ts.map +0 -1
package/dist/sdk/workflow-cli.d.ts +0 -103
package/dist/sdk/workflow-cli.d.ts.map +0 -1
package/dist/sdk/workflows/builtin-registry.d.ts +0 -113
package/dist/sdk/workflows/builtin-registry.d.ts.map +0 -1
package/src/sdk/commander.ts +0 -161
package/src/sdk/workflow-cli.ts +0 -409
package/src/sdk/workflows/builtin-registry.ts +0 -23

package/README.md CHANGED

@@ -9,27 +9,21 @@
 [![Bun](https://img.shields.io/badge/Bun-Runtime-f9f1e1?logo=bun&logoColor=black)](./package.json)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)
-**An open-source CLI and TypeScript SDK for building harnesses around your coding agent** — Claude Code, OpenCode, or GitHub Copilot CLI. Chain agent sessions into deterministic pipelines, add human-in-the-loop approval gates, dispatch **12 specialized sub-agents**, and tap **57 built-in skills** — then ship it as TypeScript your whole team runs.
+**Turn coding agents into reliable engineering workflows.** Atomic is an open-source CLI and TypeScript SDK for Claude Code, OpenCode, and GitHub Copilot CLI. Define the steps, guardrails, review gates, and execution environment your agent should follow, then run the workflow as TypeScript your whole team can review and reuse.
-> Define how your agent works. Start for yourself, scale to your team — across GitHub, Azure DevOps (ADO), or Sapling.
+> Build the workflow once. Run it across agents, repos, and teams — with GitHub, Azure DevOps (ADO), or Sapling.
 ---
-## Two surfaces: CLI and SDK
-Atomic ships **two** things that share one orchestrator runtime. You can use either on its own or both together:
+## Why Atomic
-|                       | Atomic CLI                                                                                                                                                                                                                         | Atomic SDK                                                                                                                                                      |
-| --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **What it is**        | Global `atomic` binary                                                                                                                                                                                                             | `@bastani/atomic/workflows` TypeScript library                                                                                                                  |
-| **Install**           | `bun install -g @bastani/atomic` (or `install.sh` / `install.ps1`)                                                                                                                                                                 | `bun add @bastani/atomic` inside your project                                                                                                                   |
-| **Entrypoint**        | `atomic <command>`                                                                                                                                                                                                                 | `bun run src/<agent>-worker.ts`                                                                                                                                 |
-| **Code required?**    | No — everything is pre-built                                                                                                                                                                                                       | Yes — you write `defineWorkflow(...)` + a 3-line composition root                                                                                               |
-| **What you get**      | `atomic chat` (agent REPL), three autonomous builtins (`ralph`, `deep-research-codebase`, `open-claude-design`), session management, the live orchestrator panel, Atomic skills (`/init`, `/research-codebase`, `/create-spec`, …) | `defineWorkflow`, `createWorkflowCli`, `createRegistry`, `ctx.stage`, `s.save` / `s.transcript`, headless stages, the Commander adapter (`toCommand`, `runCli`) |
-| **When to reach for** | You want autonomous execution of a standard pattern out of the box, or interactive chat with your agent's full toolset                                                                                                             | You want to encode **your** team's process — review flows, deployment gates, custom research pipelines — as TypeScript every teammate runs identically          |
-| **Read next**         | [Quick Start](#quick-start) (steps 1–3)                                                                                                                                                                                            | [Quick Start step 4](#4-build-your-own-workflow--sdk) and [Building your own atomic-powered app](#building-your-own-atomic-powered-app)                         |
+Coding agents are great inside a single session. They can inspect code, use tools, make edits, and explain their work. The trouble starts when the task is ambiguous/complex, tied to specific outcomes/exit criteria, long-running, or tied to a large codebase: you end up reminding the agent of the process, moving output between sessions, checking whether it followed the right steps, and deciding when a human needs to review the work. Atomic turns that process into code. A workflow can branch, retry, run stages in parallel, isolate sessions, pass only the right transcript forward, pause for human approval, and run inside a devcontainer so the agent is not loose on your host machine.
-Both surfaces call the same runtime underneath (tmux/psmux session graph, provider SDKs, detach/reattach) — they're two entry points, not two products. Neither depends on the other: you can `bun add @bastani/atomic` in a project without ever installing the global binary, and you can use `atomic chat` and the builtins without writing any TypeScript.
+- **Start with your own process.** Automate the repetitive parts of research, product feedback, debugging, review, migrations, or PR prep. One TypeScript file, versioned with the repo.
+- **Scale to your team.** Encode review gates, quality checks, and approvals so every teammate runs the same workflow instead of manually steering an agent.
+- **Keep the coding agent.** Atomic adds structure around Claude Code, OpenCode, and Copilot CLI without rebuilding their file editing, tool use, MCP setup, hooks, or context handling from scratch.
+- **Use natural language to get started.** Ask the `workflow-creator` skill to turn a workflow description into `defineWorkflow()` code, or let an agent use the skill when a complex task needs a repeatable workflow.
+- **Control the outer loop.** Instead of trusting a black-box harness to improvise process, Atomic makes the orchestration inspectable: the agent still uses its harness with its native tools and context management, but the workflow, gates, handoffs, and execution graph are TypeScript you can read, edit, and version. This allows you to enhance your existing coding agent's capabilities.
 ---
@@ -39,7 +33,7 @@ Install, generate context, try Ralph, then write your own workflow — four step
 ### Prerequisites
-Atomic doesn't replace your coding agent or terminal — it orchestrates them. Three things have to exist on the host before a workflow can run:
+Atomic doesn't replace your coding agent or terminal — it gives them a workflow to follow. Three things have to exist on the host before a workflow can run:
 - **[Bun](https://bun.sh/)** as the JavaScript runtime — Atomic and the SDK ship source that relies on `Bun.spawn`, native pty handling, and Bun-specific module resolution. **They do not run on Node.js.** The bootstrap installer below installs Bun for you; if you install `@bastani/atomic` manually, install Bun first.
 - **A terminal multiplexer** — every stage runs inside a detachable session on a dedicated `atomic` socket (your personal tmux is untouched). That's how workflows survive terminal disconnects, how `-d/--detach` puts a run in the background, and how `atomic session connect` reattaches later from any shell.
@@ -51,7 +45,7 @@ Atomic doesn't replace your coding agent or terminal — it orchestrates them. T
   - [GitHub Copilot CLI](https://github.com/features/copilot/cli) — run `copilot` and authenticate
 - **Windows only:** PowerShell 7+ ([install guide](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-windows))
-> The bootstrap installer below installs Bun and Atomic but **does not** install tmux/psmux or the coding agents. Install those separately before running any workflow — `bun run src/claude-worker.ts -n <workflow-name> -a claude` will fail loudly at stage spawn if either is missing. Using a [devcontainer](#alternative-devcontainer-recommended-for-autonomous-workflows) short-circuits all of this: the atomic feature bundles Bun + tmux + the agent CLI into the container image.
+> The bootstrap installer below installs Bun and Atomic but **does not** install tmux/psmux or the coding agents. Install those separately before running any workflow — `bun run src/claude-worker.ts` will fail loudly at stage spawn if either is missing. Using a [devcontainer](#alternative-devcontainer-recommended-for-autonomous-workflows) short-circuits all of this: the atomic feature bundles Bun + tmux + the agent CLI into the container image.
 ### 1. Install — CLI + SDK share the same package
@@ -233,22 +227,52 @@ export default defineWorkflow({
   .compile();
 ```
-Wire it to a CLI in `src/claude-worker.ts` — three lines:
+Wire it to a CLI in `src/claude-worker.ts`. The SDK ships pure
+primitives — no wrapper to opt into. Compose with your CLI library of
+choice (Commander, citty, yargs, …) and call `runWorkflow`. Catch the
+SDK's typed errors (`MissingDependencyError`, `SessionNotFoundError`,
+…) for friendly CLI output:
 ```ts
-import { createWorkflowCli } from "@bastani/atomic/workflows";
+import { Command } from "@commander-js/extra-typings";
+import {
+  getInputSchema,
+  runWorkflow,
+  MissingDependencyError,
+} from "@bastani/atomic/workflows";
 import workflow from "./workflows/review-to-merge/claude.ts";
-await createWorkflowCli(workflow).run();
+const program = new Command();
+for (const input of getInputSchema(workflow)) {
+  program.option(`--${input.name} <value>`, input.description ?? "");
+}
+program.action(async (rawOpts) => {
+  try {
+    await runWorkflow({ workflow, inputs: rawOpts as Record<string, string> });
+  } catch (err) {
+    if (err instanceof MissingDependencyError) {
+      console.error(`Missing dependency: ${err.dependency}. Install it and retry.`);
+      process.exit(1);
+    }
+    throw err;
+  }
+});
+await program.parseAsync();
 ```
 Run it:
 ```bash
-bun run src/claude-worker.ts -n review-to-merge -a claude
+bun run src/claude-worker.ts --target_branch=main
 ```
-That's the full shape — one workflow file, one three-line composition root. `createWorkflowCli` handles named dispatch (`-n/--name` + `-a/--agent`), the `--<input>` flags declared by your workflow, detached execution, and the interactive picker. Pass an array (`createWorkflowCli([claude, copilot])`) for multi-agent or multi-workflow apps; the file stays three lines. See [Workflow SDK](#workflow-sdk--build-your-own-deterministic-harness) for parallel stages, input schemas, headless stages, and the full API reference.
+That's the full shape — one workflow file, one composition root. The
+SDK exposes primitives (`runWorkflow`, `getInputSchema`, `listWorkflows`,
+`getName`, `getAgent`, `validateInputs`, `listSessions`, …) and the
+developer composes them into whatever CLI shape they prefer. See
+[Workflow SDK](#workflow-sdk--build-reliable-engineering-workflows) for
+parallel stages, input schemas, headless stages, and the full API
+reference.
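
To make the "compose the primitives yourself" idea concrete, here is a minimal, self-contained sketch of the kind of glue a composition root might contain: parsing `--name=value` flags and checking them against an input schema. It does not import the SDK. `InputField`, `parseFlags`, and `checkInputs` are hypothetical names, and the `required` field is an assumption (the diff above only shows `name` and `description` on schema entries). The real `getInputSchema` and `validateInputs` may behave differently.

```typescript
// Hypothetical shape of one schema entry. Only `name` and `description`
// appear in the README diff; `required` is an assumption for illustration.
interface InputField {
  name: string;
  description?: string;
  required?: boolean;
}

// Parses `--key=value` arguments into a plain record.
// Arguments in any other form (positionals, `--key value`) are ignored here.
function parseFlags(argv: readonly string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (const arg of argv) {
    const match = /^--([^=]+)=(.*)$/.exec(arg);
    if (match === null) continue;
    const [, key, value] = match;
    if (key !== undefined && value !== undefined) out[key] = value;
  }
  return out;
}

// Returns a list of human-readable problems; an empty list means the
// inputs satisfy the schema under the assumptions above.
function checkInputs(
  schema: readonly InputField[],
  inputs: Record<string, string>,
): string[] {
  const problems: string[] = [];
  const known = new Set(schema.map((field) => field.name));
  for (const key of Object.keys(inputs)) {
    if (!known.has(key)) problems.push(`unknown input: ${key}`);
  }
  for (const field of schema) {
    if (field.required === true && inputs[field.name] === undefined) {
      problems.push(`missing required input: ${field.name}`);
    }
  }
  return problems;
}
```

In a real worker file, the record returned by a parser like this (or by Commander, as in the diff above) would be what gets passed as `inputs` to `runWorkflow`.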
 ### Managing sessions
@@ -270,42 +294,42 @@ atomic workflow -n ralph -a claude -d "build the auth module"   # returns immedi
 atomic workflow session connect atomic-wf-claude-ralph-<id>      # attach later
 ```
-Detached mode is what you want for scripted / CI automation and long-running tasks — the orchestrator keeps running on the atomic tmux socket regardless of your terminal.
+Detached mode is what you want for scripted / CI automation and long-running tasks — the workflow keeps running on the atomic tmux socket regardless of your terminal.
 ---
-## Why Atomic
+## Two surfaces: CLI and SDK
-Better models make harnesses **more** important, not less. The more you trust an agent to execute complex tasks, the more value you get from defining exactly **what** it should execute, in **what order**, with **what checks** along the way. The harness is the durable layer — models keep improving underneath it, but your process stays the same.
+Atomic ships **two** things that share one workflow runtime. You can use either on its own or both together:
-- **Start for yourself.** Automate the repetitive parts of your own workflow — research a codebase, add monitoring, generate specs. One TypeScript file, one afternoon.
-- **Scale to your team.** Encode your team's review process, deployment gates, and quality checks as TypeScript every member runs identically — versioned, testable, reproducible.
-- **Work across agents.** Write a harness once, run it on Claude Code, OpenCode, or Copilot CLI with a flag change.
+|                       | Atomic CLI                                                                                                                                                                                                                               | Atomic SDK                                                                                                                                                                                                                                                      |
+| --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **What it is**        | Global `atomic` binary                                                                                                                                                                                                                   | `@bastani/atomic/workflows` TypeScript library                                                                                                                                                                                                                  |
+| **Install**           | `bun install -g @bastani/atomic` (or `install.sh` / `install.ps1`)                                                                                                                                                                       | `bun add @bastani/atomic` inside your project                                                                                                                                                                                                                   |
+| **Entrypoint**        | `atomic <command>`                                                                                                                                                                                                                       | `bun run src/<agent>-worker.ts`                                                                                                                                                                                                                                 |
+| **Code required?**    | No — everything is pre-built. You can also ask the agent inside `atomic chat` to use the `workflow-creator` skill, decide when a complex task needs its own workflow, and build/run that workflow on the fly.                            | No to start — describe the workflow in natural language and use the `workflow-creator` skill to generate it. Then refine it in natural language or edit the TypeScript workflow and composition root directly, with full visibility into exactly what will run. |
+| **What you get**      | `atomic chat` (agent REPL), three autonomous built-in workflows (`ralph`, `deep-research-codebase`, `open-claude-design`), session management, the live workflow panel, Atomic skills (`/init`, `/research-codebase`, `/create-spec`, …) | `defineWorkflow`, `createRegistry`, `runWorkflow`, metadata accessors (`getName`, `getInputSchema`, …), session primitives (`listSessions`, `getSessionStatus`, `attachSession` / `detachSession`, `nextWindow` / `previousWindow` / `gotoOrchestrator`), typed errors (`MissingDependencyError`, `SessionNotFoundError`, …), `ctx.stage`, `s.save` / `s.transcript`, headless stages |
+| **When to reach for** | You want autonomous execution of a standard pattern out of the box, interactive chat with your agent's full toolset, or a CLI agent that can create a purpose-built workflow before doing complex work.                                  | You want to control the outer loop yourself — review flows, deployment gates, custom research pipelines — with full visibility into the TypeScript your team will run identically.                                                                              |
+| **Read next**         | [Quick Start](#quick-start) (steps 1–3)                                                                                                                                                                                                  | [Quick Start step 4](#4-build-your-own-workflow--sdk) and [Building your own atomic-powered app](#building-your-own-atomic-powered-app)                                                                                                                         |
-### Example use cases
+Both surfaces call the same runtime underneath (tmux/psmux session graph, provider SDKs, detach/reattach) — they're two entry points, not two products. Neither depends on the other: you can `bun add @bastani/atomic` in a project without ever installing the global binary, and you can use `atomic chat` and the built-in workflows without writing any TypeScript.
-These are shapes you'd **author** with `defineWorkflow` and then run from your own `src/<agent>-worker.ts` — see [step 4 of Quick Start](#4-build-your-own-workflow) for the three-line entrypoint. Atomic ships three built-in workflows (`ralph`, `deep-research-codebase`, `open-claude-design`); everything else is yours to define.
+## Example use cases
-**Add production monitoring.** Research observability gaps, implement missing metrics and health checks, review the changes.
+These are workflows you'd author with `defineWorkflow` and run from your own `src/<agent>-worker.ts` — see [step 4 of Quick Start](#4-build-your-own-workflow--sdk) for the three-line entrypoint. Atomic ships three built-in workflows (`ralph`, `deep-research-codebase`, `open-claude-design`); everything else is yours to define.
-```bash
-bun run src/claude-worker.ts -n observability -a claude "add Prometheus metrics and health checks to all API endpoints"
-```
-**Parallel UX testing with 50 personas.** Spin up 50 agents, each with a distinct persona (power user, accessibility-dependent, non-technical stakeholder), each using [Playwright](#built-in-skills) to test your app.
-```bash
-bun run src/claude-worker.ts -n ux-personas -a claude --personas=50
-```
-**Review-to-merge pipeline.** The workflow from [step 4](#4-build-your-own-workflow) above — reviews code, runs CI in parallel, opens a PR, notifies Slack, waits for approval, merges.
+- **Review-to-merge pipeline.** Review code, run CI in parallel, open a PR, notify Slack, wait for approval, merge.
+- **Support ticket to draft PR.** Reproduce the issue, find the root cause, try a fix in a sandbox, run tests, pause for review.
+- **Production alert investigation.** Pull the failing trace, inspect recent commits, rank likely causes, then draft a fix or page the on-call with evidence.
+- **Parallel UX testing.** Run many persona-specific agents against the same feature, aggregate structured feedback, and turn selected issues into tasks.
+- **Large migration or refactor.** Research the codebase, split the work into safe batches, run implementation and review passes, and keep artifacts for later runs.
 ---
 ## Table of Contents
 - [Atomic](#atomic)
-  - [Two surfaces: CLI and SDK](#two-surfaces-cli-and-sdk)
+  - [Why Atomic](#why-atomic)
   - [Quick Start](#quick-start)
     - [Prerequisites](#prerequisites)
     - [1. Install — CLI + SDK share the same package](#1-install--cli--sdk-share-the-same-package)
@@ -313,16 +337,16 @@ bun run src/claude-worker.ts -n ux-personas -a claude --personas=50
     - [3. Try Ralph — CLI (autonomous coding)](#3-try-ralph--cli-autonomous-coding)
     - [4. Build your own workflow — SDK](#4-build-your-own-workflow--sdk)
     - [Managing sessions](#managing-sessions)
-  - [Why Atomic](#why-atomic)
-    - [Example use cases](#example-use-cases)
+  - [Two surfaces: CLI and SDK](#two-surfaces-cli-and-sdk)
+  - [Example use cases](#example-use-cases)
   - [Table of Contents](#table-of-contents)
   - [Security: Workflow Permissions Model](#security-workflow-permissions-model)
   - [Core Features](#core-features)
     - [Multi-Agent Support](#multi-agent-support)
-    - [Workflow SDK — Build Your Own Deterministic Harness](#workflow-sdk--build-your-own-deterministic-harness)
+    - [Workflow SDK — Build Reliable Engineering Workflows](#workflow-sdk--build-reliable-engineering-workflows)
       - [Runnable examples shipped with the repo](#runnable-examples-shipped-with-the-repo)
       - [Builder API](#builder-api)
-      - [WorkflowContext (`ctx`) — top-level orchestrator](#workflowcontext-ctx--top-level-orchestrator)
+      - [WorkflowContext (`ctx`) — top-level workflow context](#workflowcontext-ctx--top-level-workflow-context)
       - [SessionContext (`s`) — inside each session callback](#sessioncontext-s--inside-each-session-callback)
       - [Session Options (`SessionRunOptions`)](#session-options-sessionrunoptions)
       - [Saving Transcripts](#saving-transcripts)
@@ -334,7 +358,7 @@ bun run src/claude-worker.ts -n ux-personas -a claude --personas=50
     - [Containerized Execution](#containerized-execution)
     - [Specialized Sub-Agents](#specialized-sub-agents)
     - [Built-in Skills](#built-in-skills)
-    - [Workflow Orchestrator Panel](#workflow-orchestrator-panel)
+    - [Workflow Panel](#workflow-panel)
   - [Commands Reference](#commands-reference)
     - [CLI Commands](#cli-commands)
       - [Global Flags](#global-flags)
@@ -346,8 +370,8 @@ bun run src/claude-worker.ts -n ux-personas -a claude --personas=50
   - [Building your own atomic-powered app](#building-your-own-atomic-powered-app)
     - [One factory, three input shapes](#one-factory-three-input-shapes)
     - [One method: `run()`](#one-method-run)
-    - [Embedding under a parent CLI — `toCommand` + `runCli`](#embedding-under-a-parent-cli--tocommand--runcli)
-    - [`entry` — for bundled apps and test harnesses](#entry--for-bundled-apps-and-test-harnesses)
+    - [Embedding under a parent CLI — `runWorkflow` inside any Commander tree](#embedding-under-a-parent-cli--runworkflow-inside-any-commander-tree)
+    - [`entry` — for bundled apps and tests](#entry--for-bundled-apps-and-tests)
     - [Registry rules](#registry-rules)
     - [Input precedence](#input-precedence)
     - [Builtin workflows via the `atomic` CLI](#builtin-workflows-via-the-atomic-cli)
@@ -395,21 +419,23 @@ Atomic works across **three production coding agents** — switch with a flag an
 Each agent gets its own configuration directory (`.claude/`, `.opencode/`, `.github/`), skills, and context files — all managed by Atomic.
-### Workflow SDK — Build Your Own Deterministic Harness
+### Workflow SDK — Build Reliable Engineering Workflows
-The Workflow SDK (`@bastani/atomic/workflows`) lets you encode your team's process as TypeScript — spawn agent sessions dynamically with native control flow (`for`, `if`, `Promise.all()`), and watch them appear in a live graph as they execute.
+The Workflow SDK (`@bastani/atomic/workflows`) lets you encode your team's process as TypeScript — spawn agent sessions dynamically with native control flow (`for`, `if`, `Promise.all()`), pass state explicitly, and watch each stage appear in a live graph as it runs.
-Set up a workflow project (`bun init && bun add @bastani/atomic`), define your workflow with `defineWorkflow`, then bind it to a CLI with `createWorkflowCli(definition)` (single workflow) or `createWorkflowCli(registry)` (many workflows):
+Set up a workflow project (`bun init && bun add @bastani/atomic`), define your workflow with `defineWorkflow`, then call `runWorkflow({ workflow, inputs })` from inside whatever CLI library you prefer (Commander, citty, yargs, an OpenTUI app, …). The SDK ships pure primitives — no opinionated wrapper:
 ```bash
-bun run src/claude-worker.ts -n <workflow-name> -a claude --prompt "describe this project"
+bun run src/claude-worker.ts --prompt="describe this project"
 ```
-See [step 4 of Quick Start](#4-build-your-own-workflow) for a complete review-to-merge example. More examples and the full API reference below.
+See [step 4 of Quick Start](#4-build-your-own-workflow--sdk) for a complete review-to-merge example. More examples and the full API reference below.
 #### Runnable examples shipped with the repo
-The [`examples/`](./examples) directory contains small, complete user apps you can run directly. Most subdirectories ship `claude/`, `copilot/`, and `opencode/` variants plus one agent-scoped worker file per agent — `claude-worker.ts`, `copilot-worker.ts`, `opencode-worker.ts` — each a three-line `createWorkflowCli(workflow).run()` entrypoint. `multi-workflow/` and `commander-embed/` use a single `cli.ts` instead, to demonstrate multi-workflow dispatch and Commander embedding respectively.
+The [`examples/`](./examples) directory contains small, complete user apps you can run directly. Most subdirectories ship `claude/`, `copilot/`, and `opencode/` variants plus one agent-scoped worker file per agent — `claude-worker.ts`, `copilot-worker.ts`, `opencode-worker.ts` — each a small Commander entrypoint that calls `runWorkflow({ workflow, inputs })`. `multi-workflow/` and `commander-embed/` use a single `cli.ts` instead, to demonstrate multi-workflow dispatch and Commander embedding respectively.
+**Design principle — when does `-a/--agent` belong on your CLI?** Each agent-scoped worker file imports a single workflow pinned to one agent (`import workflow from "./claude/index.ts"`), so there's nothing to disambiguate — no `-a` flag. Only reach for `-a/--agent` when one CLI dispatches across workflows that exist in multiple agent variants — e.g. a `cli.ts` that registers `hello` for claude *and* copilot. The atomic CLI itself uses `-a` for exactly that reason: its builtin registry has cross-agent variants of `ralph`, `deep-research-codebase`, and `open-claude-design`.
 | Example                         | What it demonstrates                                                                                                                                                                                                                                         |
 | ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
@@ -421,29 +447,28 @@ The [`examples/`](./examples) directory contains small, complete user apps you c
 | `hil-favorite-color-headless`   | HIL pause inside a headless stage                                                                                                                                                                                                                            |
 | `structured-output-demo`        | Per-SDK structured output (JSON-schema validation, Zod)                                                                                                                                                                                                      |
 | `reviewer-tool-test`            | Custom reviewer tool wiring (Copilot — copilot-worker.ts only)                                                                                                                                                                                               |
-| `review-fix-loop`               | Draft → loop(review → fix) with bounded iterations and early exit on a `CLEAN` verdict — the quintessential harness pattern, showing how a stage's return value (`handle.result`) drives TypeScript control flow                                             |
-| `multi-workflow`                | Two Claude workflows under one `cli.ts` — `-n/--name` dispatch, per-workflow `--<input>` flag union, and the interactive picker. Shows the array form (`createWorkflowCli([hello, goodbye])`) and the `createRegistry().register(...)` variant side by side. |
-| `commander-embed`               | Mount an atomic workflow under a parent Commander CLI with `toCommand(cli, "greet")`, alongside a plain Commander sibling command. `runCli` replaces `program.parseAsync()` and transparently handles detached orchestrator re-entry.                        |
+| `review-fix-loop`               | Draft → loop(review → fix) with bounded iterations and early exit on a `CLEAN` verdict — a reliable review gate showing how a stage's return value (`handle.result`) drives TypeScript control flow                                                          |
+| `multi-workflow`                | Two Claude workflows under one `cli.ts` — uses `listWorkflows(registry)` to register one Commander subcommand per workflow with each workflow's declared inputs as `--<flag>` options.                                                |
+| `commander-embed`               | Mount an atomic workflow under a parent Commander CLI by calling `runWorkflow({ workflow, inputs })` inside a Commander action, alongside a plain Commander sibling command. No re-entry boilerplate — the SDK ships its own orchestrator entry script. |
+| `pane-navigation`               | Driver CLI for the SDK pane-navigation primitives (`nextWindow`, `previousWindow`, `gotoOrchestrator`, `attachSession`, `detachSession`). Spawns a 3-stage workflow detached and exposes `start / list / status / next / prev / home / attach / stop` subcommands. Catches `SessionNotFoundError` for friendly errors. |
 Run any of them with:
 ```bash
-# Single-workflow examples — one worker file per agent
-bun run examples/<name>/<agent>-worker.ts -n <workflow-name> -a <agent> [--field=value | "<prompt>"]
-# e.g.
-bun run examples/hello-world/claude-worker.ts -n hello-world -a claude --greeting="Hello" --style=casual
-bun run examples/sequential-describe-summarize/claude-worker.ts -n sequential-describe-summarize -a claude --topic="Bun"
-bun run examples/review-fix-loop/claude-worker.ts -n review-fix-loop -a claude --topic="adopting Bun" --max_iterations=3
-bun run examples/headless-test/copilot-worker.ts -n headless-test -a copilot "TypeScript"
-# Multi-workflow — one cli.ts, dispatch by `-n/--name`
-bun run examples/multi-workflow/cli.ts -n hello   -a claude --who=Alex
-bun run examples/multi-workflow/cli.ts -n goodbye -a claude --tone=melodramatic
-bun run examples/multi-workflow/cli.ts -a claude              # interactive picker (TTY)
-# Commander embedding — workflow mounted as a subcommand under a parent CLI
-bun run examples/commander-embed/cli.ts greet -n greet -a claude --who=Alex
+# Single-workflow workers — agent is pinned by which file you run, so no `-a` flag.
+# Inputs map to `--<input>=<value>` flags; if the workflow declares no inputs,
+# trailing positional tokens become the prompt.
+bun run examples/hello-world/claude-worker.ts --greeting="Hello" --style=casual
+bun run examples/sequential-describe-summarize/claude-worker.ts --topic="Bun"
+bun run examples/review-fix-loop/claude-worker.ts --topic="adopting Bun" --max_iterations=3
+bun run examples/headless-test/copilot-worker.ts --prompt="TypeScript"
+# Multi-workflow CLI — one cli.ts, one Commander subcommand per registered workflow.
+bun run examples/multi-workflow/cli.ts hello   --who=Alex
+bun run examples/multi-workflow/cli.ts goodbye --tone=melodramatic
+# Commander embedding — atomic workflow mounted as `greet` alongside plain Commander commands.
+bun run examples/commander-embed/cli.ts greet --who=Alex
 bun run examples/commander-embed/cli.ts status                # sibling Commander command
 bun run examples/commander-embed/cli.ts --help                # all commands
 ```
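The flag convention described in the comments above (`--<input>=<value>` flags, with trailing positional tokens becoming the prompt when no inputs are declared) can be sketched roughly as follows. This is an illustrative sketch only: `toInputs` is a hypothetical helper, not an SDK export, and the real example workers parse their flags through Commander.

```typescript
// Hypothetical helper, not part of @bastani/atomic: maps argv tokens to workflow inputs.
type Inputs = Record<string, string>;

function toInputs(argv: string[], declared: string[]): Inputs {
  const inputs: Inputs = {};
  const positional: string[] = [];
  for (const token of argv) {
    const eq = token.indexOf("=");
    if (token.startsWith("--") && eq > 2) {
      // `--name=value` becomes inputs[name] = value
      inputs[token.slice(2, eq)] = token.slice(eq + 1);
    } else {
      positional.push(token);
    }
  }
  // Assumed rule, taken from the comment above: with no declared inputs,
  // positional tokens are joined into the prompt.
  if (declared.length === 0 && positional.length > 0 && !("prompt" in inputs)) {
    inputs.prompt = positional.join(" ");
  }
  return inputs;
}
```

Under these assumptions, `toInputs(["--topic=Bun"], ["topic"])` yields `{ topic: "Bun" }`, while `toInputs(["describe", "this"], [])` yields `{ prompt: "describe this" }`.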
@@ -583,13 +608,11 @@ export default defineWorkflow({
   .compile();
 ```
-Wire it into `src/claude-worker.ts` (three lines — see [step 4 of Quick Start](#4-build-your-own-workflow)) and run it:
+Wire it into `src/claude-worker.ts` (three lines — see [step 4 of Quick Start](#4-build-your-own-workflow--sdk)) and run it:
 ```bash
 # Scriptable; CI-friendly
 bun run src/claude-worker.ts \
-  -n gen-spec \
-  -a claude \
   --research_doc=research/docs/2026-04-11-auth.md \
   --focus=standard
 ```
@@ -662,15 +685,17 @@ The graph shows `seed → merge` — headless stages are transparent to the topo
 | ---------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
 | **Dynamic session spawning**       | `ctx.stage()` spawns sessions at runtime — each gets its own tmux window and graph node                                            |
 | **Native TypeScript control flow** | Use `for`, `if/else`, `Promise.all()`, `try/catch` — no framework DSL                                                              |
+| **Review gates and approvals**     | Pause for human input, run structured review stages, and decide whether the next stage should continue                             |
 | **Session return values**          | Session callbacks can return data: `const h = await ctx.stage(...); h.result`                                                      |
 | **Transcript passing**             | Access prior output via handle (`s.transcript(handle)`) or name (`s.transcript("name")`)                                           |
 | **Declared input schemas**         | Add an `inputs: [...]` array and the CLI materialises `--<field>=<value>` flags with built-in validation                           |
-| **Interactive picker**             | `atomic workflow -a <agent>` is the explicit no-`-n` discovery path; direct runs use `-n <name>`                                    |
+| **Interactive picker**             | `atomic workflow -a <agent>` is the explicit no-`-n` discovery path; direct runs use `-n <name>`                                   |
 | **Nested sub-sessions**            | `s.stage()` inside a callback spawns child sessions — visible as nested graph nodes                                                |
 | **Auto-inferred graph**            | Topology derived from `await` / `Promise.all` patterns — no annotations                                                            |
 | **Provider-agnostic**              | Write raw SDK code for Claude, Copilot, or OpenCode inside each callback                                                           |
 | **Live graph visualization**       | Sessions appear in the TUI graph as they spawn — loops and conditionals visible in real time                                       |
 | **Background (headless) stages**   | `headless: true` runs in-process without a tmux window — invisible in graph, tracked by statusline counter, identical callback API |
+| **Token-aware handoffs**           | Save transcripts to disk and pass paths or distilled outputs forward instead of stuffing every stage with the full history         |
 **Deterministic execution guarantees:**
@@ -682,7 +707,7 @@ Workflows are deterministic by design — the same definition produces the same
 - **Isolated context windows** — Each session runs in its own tmux pane with a fresh context. Data flows only through explicit `ctx.transcript()` / `ctx.getMessages()` calls.
 - **Persisted artifacts** — Every session writes messages, transcript, and metadata to disk — a complete, inspectable execution record.
-Variance comes only from the LLM's responses, not from the harness.
+Variance comes from the LLM's responses, not from a changing workflow.
 > Ask Atomic to build workflows for you: `Use your workflow-creator skill to create a workflow that plans, implements, and reviews a feature.`
@@ -697,7 +722,7 @@ Variance comes only from the LLM's responses, not from the harness.
 | `.run(async (ctx) => { ... })`          | Set the workflow's entry point — `ctx` is a `WorkflowContext`     |
 | `.compile()`                            | **Required** — terminal method that seals the workflow definition |
-#### WorkflowContext (`ctx`) — top-level orchestrator
+#### WorkflowContext (`ctx`) — top-level workflow context
 | Property                                       | Type                        | Description                                                                                                                                                                                        |
 | ---------------------------------------------- | --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
@@ -759,7 +784,7 @@ The runtime auto-creates `s.client` and `s.session` — use them directly inside
 2. Session names must be unique within a workflow run
 3. `transcript()` / `getMessages()` only access completed sessions (callback returned + saves flushed)
 4. Each session runs in its own tmux window with the chosen agent
-5. Bind a workflow to a CLI with `createWorkflowCli(workflow)` (single workflow) or `createWorkflowCli(createRegistry().register(...))` (many workflows)
+5. Run a workflow by calling `runWorkflow({ workflow, inputs })` from inside any CLI library (Commander, citty, yargs, …). Use `listWorkflows(registry)` to iterate when registering multiple workflows.
 6. Set up your workflow project with `bun init && bun add @bastani/atomic`
 7. Background (headless) stages use the same callback API — `s.client`, `s.session`, `s.save()`, return values all work identically
@@ -818,7 +843,7 @@ The [Ralph Method](https://ghuntley.com/ralph/) enables **multi-hour autonomous
 **How Ralph works:**
 1. **Task Decomposition** — A `planner` sub-agent breaks your spec into a task list with dependency tracking, stored in SQLite (WAL mode for parallel access).
-2. **Orchestration** — An `orchestrator` retrieves the task list, validates the dependency graph, and dispatches `worker` sub-agents for ready tasks.
+2. **Execution** — An `orchestrator` retrieves the task list, validates the dependency graph, and dispatches `worker` sub-agents for ready tasks.
 3. **Review & Debug** — A `reviewer` audits the implementation with structured JSON output; if P0–P2 findings exist, a `debugger` investigates root causes and feeds back to the planner on the next iteration.
 **Loop config:** Up to **10 iterations**. Exits early after **2 consecutive clean reviews** (zero actionable findings). P3 (minor) findings are non-actionable.
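The loop policy above (at most 10 iterations, early exit after 2 consecutive clean reviews, P3 findings treated as non-actionable) can be sketched as plain TypeScript control flow. This is a sketch of the policy only, not Ralph's actual implementation: `review` is a hypothetical stand-in for the planner, worker, and reviewer stages, and the `Priority` type is an assumption about how findings are labeled.

```typescript
// Sketch of the loop policy only; `review` is a hypothetical stand-in for the real stages.
const MAX_ITERATIONS = 10;
const CLEAN_STREAK_TO_EXIT = 2;

type Priority = "P0" | "P1" | "P2" | "P3";

// P3 findings are non-actionable, so a review with only P3 findings counts as clean.
function isClean(findings: Priority[]): boolean {
  return findings.every((p) => p === "P3");
}

async function runLoop(
  review: (iteration: number) => Promise<Priority[]>,
): Promise<{ iterations: number; exitedEarly: boolean }> {
  let cleanStreak = 0;
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    const findings = await review(i);
    // Any actionable finding resets the streak.
    cleanStreak = isClean(findings) ? cleanStreak + 1 : 0;
    if (cleanStreak >= CLEAN_STREAK_TO_EXIT) {
      return { iterations: i, exitedEarly: true };
    }
  }
  return { iterations: MAX_ITERATIONS, exitedEarly: false };
}
```

Resetting the streak on any actionable finding is what makes the exit condition "consecutive" rather than cumulative.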
@@ -917,16 +942,16 @@ Skills are structured capability modules that give agents best practices and reu
 <details>
 <summary><b>Development workflows</b></summary>
-| Skill                     | Description                                                                 |
-| ------------------------- | --------------------------------------------------------------------------- |
-| `init`                    | Generate `CLAUDE.md` and `AGENTS.md` by exploring the codebase              |
-| `research-codebase`       | Analyze codebase with parallel sub-agents and document findings             |
-| `create-spec`             | Create detailed execution plans from research documents                     |
-| `workflow-creator`        | Create multi-agent workflows using the session-based `defineWorkflow()` API |
-| `explain-code`            | Explain code functionality in detail using DeepWiki                         |
-| `find-skills`             | Discover and install agent skills from the community                        |
-| `test-driven-development` | Write tests first; includes a testing anti-patterns guide                   |
-| `prompt-engineer`         | Create, improve, and optimize prompts using best practices                  |
+| Skill               | Description                                                                 |
+| ------------------- | --------------------------------------------------------------------------- |
+| `init`              | Generate `CLAUDE.md` and `AGENTS.md` by exploring the codebase              |
+| `research-codebase` | Analyze codebase with parallel sub-agents and document findings             |
+| `create-spec`       | Create detailed execution plans from research documents                     |
+| `workflow-creator`  | Create multi-agent workflows using the session-based `defineWorkflow()` API |
+| `explain-code`      | Explain code functionality in detail using DeepWiki                         |
+| `find-skills`       | Discover and install agent skills from the community                        |
+| `tdd`               | Write tests first; includes a testing anti-patterns guide                   |
+| `prompt-engineer`   | Create, improve, and optimize prompts using best practices                  |
 </details>
@@ -1025,9 +1050,9 @@ Skills are structured capability modules that give agents best practices and reu
 Skills are auto-invoked when relevant. Run `ls .agents/skills/` for the complete, current list on disk.
-### Workflow Orchestrator Panel
+### Workflow Panel
-During `atomic workflow` execution, Atomic renders a live orchestrator panel built on [OpenTUI](https://github.com/anomalyco/opentui) over the workflow's tmux session graph. It shows:
+During `atomic workflow` execution, Atomic renders a live workflow panel built on [OpenTUI](https://github.com/anomalyco/opentui) over the workflow's tmux session graph. It shows:
 - **Session graph** — Nodes per `.stage()` with status (pending / running / completed / failed) and edges for sequential / parallel dependencies
 - **Task list tracking** — Ralph's decomposed task list with dependency arrows, updated in real time
@@ -1038,7 +1063,7 @@ During `atomic chat`, there is no Atomic-owned TUI — `atomic chat -a <agent>`
 | Context                                | UI provider                                                 |
 | -------------------------------------- | ----------------------------------------------------------- |
-| `atomic workflow -n <name> -a <agent>` | Atomic (orchestrator panel + tmux session graph)            |
+| `atomic workflow -n <name> -a <agent>` | Atomic (workflow panel + tmux session graph)                |
 | `atomic chat -a <agent>`               | The native agent CLI (Claude Code / OpenCode / Copilot CLI) |
 ---
@@ -1047,16 +1072,16 @@ During `atomic chat`, there is no Atomic-owned TUI — `atomic chat -a <agent>`
 ### CLI Commands
-| Command                         | Description                                                     |
-| ------------------------------- | --------------------------------------------------------------- |
-| `atomic chat`                   | Spawn the native agent CLI inside a tmux session                |
-| `atomic workflow`               | Run a named multi-session workflow with the Atomic orchestrator panel |
-| `atomic workflow list`          | List available workflows, grouped by source                     |
-| `atomic session list`           | List all running sessions on the atomic tmux socket             |
-| `atomic session connect [name]` | Attach to a session (interactive picker when no name given)     |
-| `atomic session kill [name]`    | Kill a session by name, or all sessions when no name is given   |
-| `atomic completions <shell>`    | Output shell completion script (bash, zsh, fish, powershell)    |
-| `atomic config set <k> <v>`     | Set configuration values (supports `telemetry` and `scm`)       |
+| Command                         | Description                                                       |
+| ------------------------------- | ----------------------------------------------------------------- |
+| `atomic chat`                   | Spawn the native agent CLI inside a tmux session                  |
+| `atomic workflow`               | Run a named multi-session workflow with the Atomic workflow panel |
+| `atomic workflow list`          | List available workflows, grouped by source                       |
+| `atomic session list`           | List all running sessions on the atomic tmux socket               |
+| `atomic session connect [name]` | Attach to a session (interactive picker when no name given)       |
+| `atomic session kill [name]`    | Kill a session by name, or all sessions when no name is given     |
+| `atomic completions <shell>`    | Output shell completion script (bash, zsh, fish, powershell)      |
+| `atomic config set <k> <v>`     | Set configuration values (supports `telemetry` and `scm`)         |
 #### Global Flags
@@ -1113,7 +1138,7 @@ atomic chat -a claude --verbose              # forward --verbose to claude
 | Flag                 | Description                                                                                                                                       |
 | -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `-n, --name <name>`  | Workflow name (required for direct runs; omit only for the interactive picker)                                                                     |
+| `-n, --name <name>`  | Workflow name (required for direct runs; omit only for the interactive picker)                                                                    |
 | `-a, --agent <name>` | Agent: `claude`, `opencode`, `copilot`                                                                                                            |
 | `-d, --detach`       | Start the workflow in the background without attaching — ideal for scripted / CI runs; attach later with `atomic workflow session connect <name>` |
 | `--<field>=<value>`  | Structured input for workflows that declare an `inputs` schema (also accepts `--<field> <value>`)                                                 |
@@ -1137,7 +1162,7 @@ atomic workflow -n open-claude-design -a claude \
   --prompt="a dashboard for monitoring API latency" \
   --output-type=prototype
-# 5. Run detached — orchestrator runs in the background; prints the session name
+# 5. Run detached — workflow runs in the background; prints the session name
 #    and returns immediately. Attach any time with `atomic workflow session connect`.
 atomic workflow -n ralph -a claude -d "build a REST API for user management"
 ```
@@ -1212,150 +1237,160 @@ Native slash commands (`/help`, `/clear`, `/compact`, `/model`, `/theme`, `/agen
 `@bastani/atomic/workflows` is a library, not just a CLI. Use it directly to build your own TypeScript app that runs your team's workflows.
-> **SDK-only users:** you don't need the global `atomic` binary, but you still need the runtime prerequisites — **[Bun](https://bun.sh/) (the SDK does not run on Node.js)**, a terminal multiplexer (tmux on macOS/Linux, psmux on Windows), and at least one authenticated coding agent CLI (`claude`, `opencode`, or `copilot`). See [Prerequisites](#prerequisites) for the "why" and install commands. The SDK spawns the agent CLI at each stage and wraps it in a detachable multiplexer session — those are orchestration primitives the SDK doesn't embed.
+> **SDK-only users:** you don't need the global `atomic` binary, but you still need the runtime prerequisites — **[Bun](https://bun.sh/) (the SDK does not run on Node.js)**, a terminal multiplexer (tmux on macOS/Linux, psmux on Windows), and at least one authenticated coding agent CLI (`claude`, `opencode`, or `copilot`). See [Prerequisites](#prerequisites) for the "why" and install commands. The SDK spawns the agent CLI at each stage and wraps it in a detachable multiplexer session.
 >
-> **Management commands ship natively.** `createWorkflowCli` auto-registers `session` and `status` subcommands on every worker CLI by default, so `bun run src/claude-worker.ts session list`, `… status <id>`, `… session connect <id>`, and `… session kill <id> -y` all work with zero extra code. Sessions live on the shared `atomic` tmux socket, so the worker CLI, the global `atomic` binary, and `bunx atomic` (for SDK-only installs) all see the same runtime state. Opt out with `createWorkflowCli(workflow, { includeManagementCommands: false })` when you want a minimal CLI or are embedding under a parent Commander program that owns session management. The names `session` and `status` are reserved — workflow inputs with those names throw at `defineWorkflow` time to prevent flag collisions.
+> **Session management primitives.** The SDK exposes `listSessions`, `getSession`, `stopSession`, `attachSession`, `detachSession`, `getSessionStatus`, `getSessionTranscript`, plus pane-navigation verbs `nextWindow` / `previousWindow` / `gotoOrchestrator` — wire them into your CLI's `session list`, `status`, etc. subcommands as you see fit. Sessions live on the shared `atomic` tmux socket, so a worker CLI built on the primitives, the global `atomic` binary, and `bunx atomic` all see the same runtime state.
+>
+> **Typed errors.** Every error path the SDK throws — missing tmux/psmux/bun, unknown session id, missing `.compile()`, invalid workflow file, `minSDKVersion` mismatch — is a typed class (`MissingDependencyError`, `SessionNotFoundError`, `WorkflowNotCompiledError`, `InvalidWorkflowError`, `IncompatibleSDKError`). Catch them with `instanceof` to render friendly CLI output without parsing message text. See `examples/pane-navigation/cli.ts` for a worked example.
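The `instanceof` pattern described in the note above might look like the sketch below. The two error classes are local stand-ins declared here so the snippet runs without the SDK installed; in a real app they would presumably be imported from `@bastani/atomic/workflows` (the import path is an assumption). `friendlyMessage` and its message strings are hypothetical.

```typescript
// Local stand-ins so the sketch is self-contained; the SDK's real classes may
// carry extra fields. The names match those listed in the note above.
class SessionNotFoundError extends Error {}
class MissingDependencyError extends Error {}

// Hypothetical helper: turn known SDK errors into friendly CLI text.
function friendlyMessage(err: unknown): string {
  if (err instanceof SessionNotFoundError) {
    return "No such session. Try listing sessions first.";
  }
  if (err instanceof MissingDependencyError) {
    return "A required tool (tmux, psmux, or bun) is missing.";
  }
  // Unknown errors are rethrown rather than swallowed.
  throw err;
}
```

Branching on the class rather than on `err.message` keeps the CLI output stable even if the SDK rewords its messages.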
-### One factory, three input shapes
+### Primitives, not a wrapper
-`createWorkflowCli` is the single factory. Pick whichever input shape matches how you organize your workflows:
+The SDK ships pure functions you compose into whatever CLI shape you want:
-| Input                           | When to use                                                                                          |
-| ------------------------------- | ---------------------------------------------------------------------------------------------------- |
-| `createWorkflowCli(workflow)`   | One workflow. Direct runs still use `-n/--name` + `-a/--agent`; the CLI exposes only that workflow's declared `--<input>` flags. |
-| `createWorkflowCli([wf1, wf2])` | Multiple workflows inline. Uses the same `-n/--name` + `-a/--agent` dispatch and the interactive picker.                         |
-| `createWorkflowCli(registry)`   | Dynamic composition (loop-register, conditional registration). Same runtime shape as the array form.                             |
+| Primitive               | Purpose                                                                                                |
+| ----------------------- | ------------------------------------------------------------------------------------------------------ |
+| `defineWorkflow`        | Author a workflow with `.for(agent).run(...).compile()`. Pass `source: import.meta.path`.              |
+| `createRegistry`        | Build an immutable registry of workflows for iteration / lookup.                                       |
+| `listWorkflows(reg)`    | Snapshot every workflow in a registry.                                                                 |
+| `getWorkflow(reg, …)`   | Resolve `(agent, name)` → workflow.                                                                    |
+| `getName / getAgent / getInputSchema / getDescription / getSource / getMinSDKVersion` | Read workflow metadata.        |
+| `validateInputs(wf, raw)` | Run the same validation pipeline atomic uses (required, defaults, enum, integer).                    |
+| `runWorkflow({ workflow, inputs, detach? })` | Spawn the orchestrator tmux session and (optionally) attach. Resolves with `{ id, tmuxSessionName }`. |
+| `listSessions / getSession / stopSession / attachSession / detachSession` | Manage running tmux sessions on the shared atomic socket. |
+| `getSessionStatus / getSessionTranscript`    | Read the orchestrator-written status snapshot or per-session messages from disk.  |
+| `nextWindow / previousWindow / gotoOrchestrator` | **Pane navigation** — pure tmux verbs that update the session's current-window pointer. Never auto-attach; an attached client sees the change live, otherwise a subsequent `attachSession` lands on the new window. |
+| `MissingDependencyError / SessionNotFoundError / WorkflowNotCompiledError / InvalidWorkflowError / IncompatibleSDKError` | **Typed errors** thrown by the primitives above. Catch with `instanceof` to render friendly CLI messages without parsing message text. |
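To illustrate the kind of checks the table attributes to `validateInputs` (required, defaults, enum, integer), here is a rough standalone sketch. It is not the SDK's implementation: the `InputDef` shape, its field names, and the `validate` function are assumptions made for illustration, and the real input schema may differ.

```typescript
// Hypothetical schema shape; the SDK's real input schema may differ.
type InputDef = {
  name: string;
  type: "string" | "integer" | "enum";
  required?: boolean;
  default?: string;
  values?: string[]; // allowed values when type is "enum"
};

function validate(
  schema: InputDef[],
  raw: Record<string, string | undefined>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const def of schema) {
    // Fall back to the declared default when the flag was not supplied.
    const value = raw[def.name] ?? def.default;
    if (value === undefined) {
      if (def.required) throw new Error(`Missing required input: ${def.name}`);
      continue;
    }
    if (def.type === "integer" && !/^-?\d+$/.test(value)) {
      throw new Error(`Input ${def.name} must be an integer, got "${value}"`);
    }
    if (def.type === "enum" && !(def.values ?? []).includes(value)) {
      throw new Error(`Input ${def.name} must be one of: ${(def.values ?? []).join(", ")}`);
    }
    out[def.name] = value;
  }
  return out;
}
```

Values stay as strings here because CLI flags arrive as strings; whether the SDK coerces integers to numbers is not stated in the source.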
-**Single workflow (most common)** — one file, three lines:
+**Single workflow (most common):**
 ```ts
 // src/claude-worker.ts
-import { createWorkflowCli } from "@bastani/atomic/workflows";
+import { Command } from "@commander-js/extra-typings";
+import { getInputSchema, runWorkflow } from "@bastani/atomic/workflows";
 import workflow from "./workflows/review-to-merge/claude.ts";
-await createWorkflowCli(workflow).run({ inputs: { target_branch: "main" } });
-// defaults above; CLI flags override.
+const program = new Command();
+for (const input of getInputSchema(workflow)) {
+  program.option(`--${input.name} <value>`, input.description ?? "");
+}
+program.action(async (rawOpts) => {
+  const inputs = rawOpts as Record<string, string>;
+  await runWorkflow({ workflow, inputs });
+});
+await program.parseAsync();
 ```
 Run it:
 ```bash
-bun run src/claude-worker.ts -n review-to-merge -a claude --target_branch=release/v2
+bun run src/claude-worker.ts --target_branch=release/v2
 ```
-**Multiple workflows — inline array:**
+**Multiple workflows — iterate a registry:**
 ```ts
 // src/cli.ts
-import { createWorkflowCli } from "@bastani/atomic/workflows";
+import { Command } from "@commander-js/extra-typings";
+import {
+  createRegistry,
+  getInputSchema,
+  getName,
+  listWorkflows,
+  runWorkflow,
+} from "@bastani/atomic/workflows";
 import reviewToMerge from "./workflows/review-to-merge/claude.ts";
 import genSpec from "./workflows/gen-spec/claude.ts";
-await createWorkflowCli([reviewToMerge, genSpec]).run();
-```
-Run it:
-```bash
-bun run src/cli.ts -n review-to-merge -a claude
-bun run src/cli.ts -a claude                    # interactive picker (TTY)
-```
-See [`examples/multi-workflow/`](./examples/multi-workflow) for a complete runnable version — two Claude workflows (`hello`, `goodbye`) registered under one `cli.ts`, with the `createRegistry()` variant shown side by side in a comment.
-**Dynamic composition — `createRegistry`:**
+const registry = createRegistry().register(reviewToMerge).register(genSpec);
+const program = new Command("my-app");
-```ts
-import { createWorkflowCli, createRegistry } from "@bastani/atomic/workflows";
+for (const wf of listWorkflows(registry)) {
+  const sub = program.command(getName(wf)).description(wf.description);
+  for (const input of getInputSchema(wf)) {
+    sub.option(`--${input.name} <value>`, input.description ?? "");
+  }
+  sub.action(async (rawOpts) => {
+    await runWorkflow({ workflow: wf, inputs: rawOpts as Record<string, string> });
+  });
+}
-const registry = workflowFiles.reduce((r, wf) => r.register(wf), createRegistry());
-await createWorkflowCli(registry).run();
+await program.parseAsync();
 ```
-Need a listing subcommand? Use `toCommand(cli)` from `@bastani/atomic/workflows/commander` and attach your own `list` subcommand — the same way `atomic workflow list` is wired up in `src/cli.ts`.
-### One method: `run()`
+See [`examples/multi-workflow/`](./examples/multi-workflow) for a complete runnable version — two Claude workflows (`hello`, `goodbye`) registered under one `cli.ts`.
-`WorkflowCli` exposes one method — `run(options?)`. Default parses `process.argv`; pass `argv: [...]` to parse an explicit list, or `argv: false` to skip parsing entirely. `inputs` merge as defaults under CLI flags; `argv: false` makes them final. `run()` also accepts `name` / `agent`, which layer the same way.
+### Programmatic invocation
-The `WorkflowCli` type is framework-agnostic — no Commander imports in sight. If you want one, reach for the adapter below.
-Example — programmatic invocation without argv:
+`runWorkflow({ workflow, inputs })` is a plain async function — you don't need a CLI at all:
 ```ts
-// Single workflow: name + agent are still required when argv parsing is skipped.
-await cli.run({
-  argv: false,
-  name: "review-to-merge",
-  agent: "claude",
-  inputs: { target_branch: "main" },
-});
+import { runWorkflow } from "@bastani/atomic/workflows";
+import workflow from "./workflows/review-to-merge/claude.ts";
-// Multi-workflow cli: name + agent required under argv: false.
-await cli.run({
-  argv: false,
-  name: "review-to-merge",
-  agent: "claude",
+const { id, tmuxSessionName } = await runWorkflow({
+  workflow,
   inputs: { target_branch: "main" },
+  detach: true,
 });
 ```
-### Embedding under a parent CLI — `toCommand` + `runCli`
+Combine with `getSessionStatus(tmuxSessionName)` and `attachSession(id)` to build your own monitoring UI on top of the SDK.
+### Embedding under a parent CLI — `runWorkflow` inside any Commander tree
-For integration with a bigger Commander program, import the adapter from the dedicated subpath:
+The SDK no longer ships a Commander adapter — it doesn't need one. Just call `runWorkflow` from inside any Commander action:
 ```ts
-import { createWorkflowCli } from "@bastani/atomic/workflows";
-import { toCommand, runCli } from "@bastani/atomic/workflows/commander";
 import { Command } from "@commander-js/extra-typings";
+import { getInputSchema, runWorkflow } from "@bastani/atomic/workflows";
 import workflow from "./workflows/deploy/claude.ts";
-const cli = createWorkflowCli(workflow);
 const program = new Command("my-app");
-program.addCommand(toCommand(cli, "deploy"));
+const deploy = program.command("deploy").description(workflow.description);
+for (const input of getInputSchema(workflow)) {
+  deploy.option(`--${input.name} <value>`, input.description ?? "");
+}
+deploy.action(async (rawOpts) => {
+  await runWorkflow({ workflow, inputs: rawOpts as Record<string, string> });
+});
 program.command("hello").action(() => console.log("hi"));
-// Replaces program.parseAsync(). runCli transparently handles detached
-// re-entry — when the process is a tmux-spawned orchestrator, it drives
-// runOrchestrator; otherwise it invokes your callback (argv parse + any
-// bootstrap you want). PyTorch's init_process_group for rank-zero
-// dispatch — no guards, no env-var checks in user code.
-await runCli(cli, () => program.parseAsync());
+await program.parseAsync();
 ```
-`toCommand(cli, "workflow")` is exactly how the internal `atomic workflow` command is wired (`src/commands/cli/workflow.ts`). Because the Commander dependency lives only on the subpath, a future `@bastani/atomic/workflows/yargs` adapter can ship alongside without touching the core SDK.
+There's no re-entry boilerplate — the SDK ships its own internal orchestrator entry script and re-execs *that* with positional args (`workflowSource`, `agent`, base64-encoded inputs). Your CLI is never re-imported, so there's nothing to guard against orchestrator-mode env vars.
-### `entry` — for bundled apps and test harnesses
+### `WorkflowPicker` component
-`createWorkflowCli` accepts `{ entry?: string }`, defaulting to `process.argv[1]`. That's the file the runtime re-executes on `--detach` to resume the orchestrator, so it has to be the composition root. Override it when you bundle the app (`entry` should point at the bundle), when the composition root isn't argv[1] (tests, embedded CLIs), or with `import.meta.url` for ESM-native correctness.
+The interactive picker (the same one `atomic workflow -a claude` opens) is exposed as a component:
 ```ts
-const cli = createWorkflowCli(workflow, { entry: import.meta.url });
+import { WorkflowPicker } from "@bastani/atomic/workflows/components";
 ```
+Mount it inside your own OpenTUI app or imperatively via `WorkflowPickerPanel.create({ agent, registry })`.
 ### Registry rules
 - `createRegistry()` returns an **immutable** registry. Each `.register(wf)` call returns a **new** registry — the original is unchanged. Chain calls to accumulate workflows.
 - Each workflow is keyed by `${agent}/${name}` — the `(agent, name)` pair must be unique. Registering a duplicate throws immediately.
-- `createWorkflowCli(registry)` inspects every registered workflow and builds a union of their declared inputs. Same-name / same-type flags are shared; same-name / different-type conflicts throw at construction time so ambiguity never reaches runtime.
 - Builtin workflows (`ralph`, `deep-research-codebase`, `open-claude-design`) are managed by `atomic`'s internal `createBuiltinRegistry()`. They are reserved — user-registered workflows with the same name will not shadow builtins when running the `atomic` CLI.
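
The first two registry rules above (immutable `register`, unique `${agent}/${name}` key) can be illustrated with a small stand-alone sketch. This is not the SDK's implementation: `WorkflowLike`, `SketchRegistry`, and `makeSketchRegistry` are hypothetical names introduced only for illustration, and the real `createRegistry()` may differ in shape and error messages.

```ts
// Hypothetical stand-in for a workflow definition; the real SDK type is richer.
interface WorkflowLike {
  readonly agent: string;
  readonly name: string;
}

// Hypothetical registry shape used only in this sketch.
interface SketchRegistry {
  readonly entries: ReadonlyMap<string, WorkflowLike>;
  register(wf: WorkflowLike): SketchRegistry;
}

function makeSketchRegistry(
  entries: ReadonlyMap<string, WorkflowLike> = new Map<string, WorkflowLike>(),
): SketchRegistry {
  return {
    entries,
    register(wf: WorkflowLike): SketchRegistry {
      // Workflows are keyed by the (agent, name) pair.
      const key = `${wf.agent}/${wf.name}`;
      if (entries.has(key)) {
        // Duplicates fail immediately instead of silently overwriting.
        throw new Error(`Duplicate workflow: ${key}`);
      }
      // Copy instead of mutating, so the receiver stays unchanged.
      const next = new Map<string, WorkflowLike>(entries);
      next.set(key, wf);
      return makeSketchRegistry(next);
    },
  };
}

// Chained calls accumulate workflows; `base` itself remains empty.
const base = makeSketchRegistry();
const withBoth = base
  .register({ agent: "claude", name: "hello" })
  .register({ agent: "claude", name: "goodbye" });
```

Because each call returns a new value, forgetting to use the return value of `register` leaves you with the old registry, which is why chaining is the recommended pattern.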
 ### Input precedence
-CLI flags always win when parsing is active. Under them, the order is:
+`runWorkflow({ workflow, inputs })` runs `validateInputs(workflow, inputs)` for you, applying:
-1. `defineWorkflow` default values (on each `WorkflowInput`)
-2. Layer supplied at construction or invocation:
-   - `cli.run({ inputs })` for the single-workflow shape
-   - `createWorkflowCli(registry, { inputs })` / `cli.run({ inputs })` for the multi-workflow shape
-3. CLI flags — `--<field>=<value>` passed at runtime
+1. `defineWorkflow` default values (on each `WorkflowInput`) when no value is provided
+2. The first declared enum value when `required: true` and no value is provided
+3. Whatever you pass in `inputs`
-With `argv: false`, the CLI-flag layer is skipped — your programmatic `inputs` become top-of-chain.
+CLI flags compose entirely at the calling-CLI layer — the SDK only sees the final `inputs` map.
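
The resolution order described above can be sketched as a plain function. This is a minimal illustration under stated assumptions, not the SDK's `validateInputs`: the `SketchInput` shape (including the `values` and `default` field names) and `resolveInputsSketch` are hypothetical, and throwing on a missing required input with no fallback is an assumption about behaviour the text does not specify.

```ts
// Hypothetical input declaration; field names are assumptions for this sketch.
interface SketchInput {
  name: string;
  required?: boolean;
  default?: string;
  values?: string[]; // declared enum values, in declaration order
}

function resolveInputsSketch(
  schema: SketchInput[],
  provided: Record<string, string>,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const spec of schema) {
    const given = provided[spec.name];
    const firstEnumValue = spec.values?.[0];
    if (given !== undefined) {
      // An explicitly passed value always wins.
      resolved[spec.name] = given;
    } else if (spec.default !== undefined) {
      // Otherwise fall back to the declared default.
      resolved[spec.name] = spec.default;
    } else if (spec.required && firstEnumValue !== undefined) {
      // Required enum with no value: use the first declared enum value.
      resolved[spec.name] = firstEnumValue;
    } else if (spec.required) {
      // Assumption: a required input with no fallback is an error.
      throw new Error(`Missing required input: ${spec.name}`);
    }
  }
  return resolved;
}

const resolvedExample = resolveInputsSketch(
  [
    { name: "target_branch", default: "main" },
    { name: "mode", required: true, values: ["fast", "thorough"] },
    { name: "label" },
  ],
  { target_branch: "develop" },
);
```

A calling CLI would parse its own flags first, build the `provided` map, and hand only that map to the SDK, which matches the note that flag handling lives entirely in the calling layer.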
 ### Builtin workflows via the `atomic` CLI
-The `atomic workflow` command still works for the three built-in workflows — internally it's `toCommand(createWorkflowCli(createBuiltinRegistry()), "workflow")`:
+The `atomic workflow` command runs the built-in registry via the same primitives:
 ```bash
 atomic workflow -n ralph -a claude "Build the auth module"
@@ -1365,19 +1400,14 @@ atomic workflow -n open-claude-design -a claude
 These are not affected by your own `createRegistry()` — they are separate.
-### Migration from 0.x (directory-scanning) to current
-> This is a breaking change. The SDK no longer scans `.atomic/workflows/` directories.
+### Migration from 0.x (directory-scanning) and the `createWorkflowCli` wrapper
-1. **Delete** `.atomic/workflows/` from your repo.
-2. **Create one entrypoint file per agent**, e.g. `src/claude-worker.ts`:
-   ```ts
-   import { createWorkflowCli } from "@bastani/atomic/workflows";
-   import workflow from "./workflows/my-workflow/claude.ts";
+> Two breaking changes: workflows must declare `source: import.meta.path`, and the `createWorkflowCli` / `toCommand` / `runCli` wrappers were removed in favour of primitives.
-   await createWorkflowCli(workflow).run();
-   ```
-3. **Update invocations**: replace `atomic workflow -n foo -a claude` with `bun run src/claude-worker.ts -n foo -a claude` for your custom workflows. For the Atomic builtin set (`ralph`, `deep-research-codebase`, `open-claude-design`) keep using `atomic workflow -n <name> -a <agent>`.
+1. **Add `source: import.meta.path`** to every `defineWorkflow({ ... })` call. The SDK uses it to import the workflow module inside the orchestrator child process.
+2. **Replace `createWorkflowCli(workflow).run()`** with a small Commander (or citty / yargs) entrypoint that calls `runWorkflow({ workflow, inputs })` — see the snippets above. The SDK no longer ships a CLI wrapper.
+3. **Remove `handleOrchestratorReentry` / `runCli` calls**: the SDK ships its own orchestrator entry script, so your CLI is never re-executed.
+4. **Update invocations**: replace `atomic workflow -n foo -a claude` with `bun run src/claude-worker.ts --<input>=<value>` for your custom workflows. For the Atomic builtin set (`ralph`, `deep-research-codebase`, `open-claude-design`) keep using `atomic workflow -n <name> -a <agent>`.
 ---
@@ -1481,6 +1511,11 @@ Ensure the agent CLI is in your PATH. Atomic uses `Bun.which()`, which handles `
 ## FAQ
+<details>
+<summary><b>Why not markdown, a coding agent alone, or a general agent framework?</b></summary>
+Markdown is great for guidance: conventions, commands, repo notes, and checklists. Use Claude Code, OpenCode, or Copilot CLI directly for normal single-session coding. Atomic is for the point where the work needs branching, retries, parallel sessions, state, human approval, sandboxed execution, or reliable handoff between stages. General agent frameworks can do some of this, but you often rebuild coding-agent basics yourself: file editing, terminal interaction, MCP setup, hooks, session handling, and repo-specific context. Atomic starts from production coding agents and adds the workflow layer around them.
+</details>
 <details>
 <summary><b>How does Atomic differ from Spec-Kit?</b></summary>
@@ -1507,9 +1542,9 @@ Ensure the agent CLI is in your PATH. Atomic uses `Bun.which()`, which handles `
 <details>
 <summary><b>How does Atomic differ from DeerFlow?</b></summary>
-[DeerFlow](https://github.com/bytedance/deer-flow) is ByteDance's agent harness built on LangGraph/LangChain. Both are multi-agent orchestrators, but take different approaches:
+[DeerFlow](https://github.com/bytedance/deer-flow) is ByteDance's agent runtime built on LangGraph/LangChain. Both can run multi-agent work, but they take different approaches:
-**In short:** DeerFlow is a general-purpose agent orchestrator with a web UI. Atomic is narrowly focused on coding workflows. The key difference is that Atomic runs on top of production coding agents (Claude Code, OpenCode, Copilot CLI) rather than reimplementing coding tools through a generic API — you get each agent's native file editing, permissions, MCP integrations, and hooks out of the box. Atomic also gives you deterministic execution, which matters when encoding a team's dev process.
+**In short:** DeerFlow is a general-purpose agent system with a web UI. Atomic is narrowly focused on coding workflows. The key difference is that Atomic runs on top of production coding agents (Claude Code, OpenCode, Copilot CLI) rather than reimplementing coding tools through a generic API — you get each agent's native file editing, permissions, MCP integrations, and hooks out of the box. Atomic also gives you deterministic execution, which matters when encoding a team's dev process.
 | Aspect                  | DeerFlow                                        | Atomic                                                                                        |
 | ----------------------- | ----------------------------------------------- | --------------------------------------------------------------------------------------------- |
@@ -1531,9 +1566,9 @@ Ensure the agent CLI is in your PATH. Atomic uses `Bun.which()`, which handles `
 <details>
 <summary><b>How does Atomic differ from Hermes Agent?</b></summary>
-[Hermes Agent](https://github.com/NousResearch/hermes-agent) is Nous Research's general-purpose AI agent with a self-improving learning loop. Both are open-source agent frameworks, but serve different use cases:
+[Hermes Agent](https://github.com/NousResearch/hermes-agent) is Nous Research's general-purpose AI agent with a self-improving learning loop. Both are open-source agent projects, but they serve different use cases:
-**In short:** Hermes is a broad AI assistant that learns across sessions and connects to messaging platforms. Atomic is a coding-specific harness for engineering teams. It lets you encode your development process as deterministic TypeScript workflows that run identically across team members, machines, and CI. Atomic inherits production-hardened tools from Claude Code, OpenCode, and Copilot CLI — including their permission systems, MCP integrations, and hooks — giving you two independent security boundaries (devcontainer isolation + agent permissions). Fresh context per session keeps output sharp over multi-hour tasks. Developer-authored skills don't drift the way auto-generated ones can.
+**In short:** Hermes is a broad AI assistant that learns across sessions and connects to messaging platforms. Atomic is coding-specific workflow software for engineering teams. It lets you encode your development process as deterministic TypeScript workflows that run identically across team members, machines, and CI. Atomic inherits production-hardened tools from Claude Code, OpenCode, and Copilot CLI — including their permission systems, MCP integrations, and hooks — giving you two independent security boundaries (devcontainer isolation + agent permissions). Fresh context per session keeps output sharp over multi-hour tasks. Developer-authored skills don't drift the way auto-generated ones can.
 | Aspect                    | Hermes Agent                                                                                 | Atomic                                                                                                                                       |
 | ------------------------- | -------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |