@bastani/atomic 0.5.0 → 0.5.1-0

package/README.md CHANGED
@@ -9,59 +9,49 @@
9
9
  [![Bun](https://img.shields.io/badge/Bun-Runtime-f9f1e1?logo=bun&logoColor=black)](./package.json)
10
10
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)
11
11
 
12
- Atomic is an open-source **multi-agent harness** that orchestrates **Claude Code**, **OpenCode**, and **GitHub Copilot CLI** through a unified interface — with a **workflow SDK**, **containerized execution**, **deep codebase research**, and **autonomous multi-hour coding sessions**.
12
+ Atomic is an open-source **agent harness framework** that lets you build, compose, and run **multi-session coding workflows** on top of **Claude Code**, **OpenCode**, and **GitHub Copilot CLI** — with **58 built-in skills**, **12 specialized sub-agents**, and **containerized execution**.
13
13
 
14
- > One CLI. Three agent SDKs. Research it, spec it, ship it, then wake up to completed code ready for review.
14
+ > Build any agent harness you want. Define workflows as TypeScript. Run them on any coding agent.
15
+
16
+ ---
17
+
18
+ ## Why Atomic
19
+
20
+ Building harnesses and workflows around coding agents is harder than it should be. Teams hit the same walls:
21
+
22
+ - **No way to chain agent sessions.** You can prompt an agent, but there's no standard way to feed one session's output into the next — research into planning, planning into implementation, implementation into review. Teams resort to copy-pasting between terminals.
23
+ - **Context degrades in long sessions.** A single agent asked to research, plan, implement, and review in one session produces increasingly unreliable output as its context window fills up. There's no built-in mechanism to isolate concerns across sessions.
24
+ - **Agent-specific configuration is fragmented.** Claude Code, OpenCode, and Copilot CLI each have their own config directories, skill formats, and agent definitions. Building a workflow that works across agents means maintaining three separate configurations.
25
+ - **Team processes live in wikis, not in code.** Every team has a process — triage bugs this way, ship features that way, review PRs with these checks. But those processes are prose in a wiki, not executable code that an agent can follow.
26
+ - **Autonomous execution is unsafe without isolation.** Agents run shell commands, delete files, and execute arbitrary code. Running them autonomously on your host system is a risk most teams won't take.
27
+ - **Specialized work requires specialized agents.** A single general-purpose agent juggling file search, code analysis, web research, and implementation will lose track of details. There's no framework for dispatching purpose-built sub-agents with scoped tools and isolated context windows.
28
+ - **Agent workflows aren't deterministic.** Even when you do chain sessions together, there's no guarantee they'll execute in the same order, pass data the same way, or produce an inspectable record. Without strict ordering and controlled data flow, workflows become unpredictable — hard to debug, impossible to reproduce.
29
+
30
+ Atomic solves these by giving you a **Workflow SDK** to define multi-session pipelines as TypeScript with **deterministic execution** — strict step ordering, frozen definitions, and controlled transcript passing — plus **12 specialized sub-agents** that keep context windows small and focused, and **containerized execution** via devcontainer features that isolate agents from your host system. Write a workflow once, run it on Claude Code, OpenCode, or Copilot CLI with a flag change.
15
31
 
16
32
  ---
17
33
 
18
34
  ## Table of Contents
19
35
 
20
- - [Atomic](#atomic)
21
- - [Table of Contents](#table-of-contents)
22
- - [Quick Start](#quick-start)
23
- - [Prerequisites](#prerequisites)
24
- - [1. Install](#1-install)
25
- - [2. Initialize Your Project](#2-initialize-your-project)
26
- - [3. Generate Context Files](#3-generate-context-files)
27
- - [4. Ship Features](#4-ship-features)
28
- - [Video Overview](#video-overview)
29
- - [Core Features](#core-features)
30
- - [Multi-Agent SDK Support](#multi-agent-sdk-support)
31
- - [Workflow SDK — Build Your Own Harness](#workflow-sdk--build-your-own-harness)
32
- - [Builder API](#builder-api)
33
- - [Session Context (`ctx`)](#session-context-ctx)
34
- - [Session Options (`SessionRunOptions`)](#session-options-sessionrunoptions)
35
- - [Saving Transcripts](#saving-transcripts)
36
- - [Provider Helpers](#provider-helpers)
37
- - [Key Rules](#key-rules)
38
- - [Deep Codebase Research](#deep-codebase-research)
39
- - [Autonomous Execution (Ralph)](#autonomous-execution-ralph)
40
- - [Containerized Execution](#containerized-execution)
41
- - [Specialized Sub-Agents](#specialized-sub-agents)
42
- - [Built-in Skills](#built-in-skills)
43
- - [Workflow Orchestrator Panel](#workflow-orchestrator-panel)
44
- - [Architecture](#architecture)
45
- - [Why Research → Plan → Implement → Verify Works](#why-research--plan--implement--verify-works)
46
- - [Commands Reference](#commands-reference)
47
- - [CLI Commands](#cli-commands)
48
- - [Global Flags](#global-flags)
49
- - [`atomic init` Flags](#atomic-init-flags)
50
- - [`atomic chat` Flags](#atomic-chat-flags)
51
- - [`atomic workflow` Flags](#atomic-workflow-flags)
52
- - [Atomic-Provided Skills (invokable from any agent chat)](#atomic-provided-skills-invokable-from-any-agent-chat)
53
- - [Configuration](#configuration)
54
- - [`.atomic/settings.json`](#atomicsettingsjson)
55
- - [Agent-Specific Files](#agent-specific-files)
56
- - [Installation Options](#installation-options)
57
- - [Updating \& Uninstalling](#updating--uninstalling)
58
- - [Update](#update)
59
- - [Uninstall](#uninstall)
60
- - [Troubleshooting](#troubleshooting)
61
- - [FAQ](#faq)
62
- - [Contributing](#contributing)
63
- - [License](#license)
64
- - [Credits](#credits)
36
+ - [Quick Start](#quick-start)
37
+ - [Core Features](#core-features)
38
+ - [Multi-Agent SDK Support](#multi-agent-sdk-support)
39
+ - [Workflow SDK — Build Your Own Deterministic Harness](#workflow-sdk--build-your-own-deterministic-harness)
40
+ - [Deep Codebase Research](#deep-codebase-research)
41
+ - [Autonomous Execution (Ralph)](#autonomous-execution-ralph)
42
+ - [Containerized Execution](#containerized-execution)
43
+ - [Specialized Sub-Agents](#specialized-sub-agents)
44
+ - [Built-in Skills](#built-in-skills)
45
+ - [Workflow Orchestrator Panel](#workflow-orchestrator-panel)
46
+ - [Commands Reference](#commands-reference)
47
+ - [Configuration](#configuration)
48
+ - [Installation Options](#installation-options)
49
+ - [Updating & Uninstalling](#updating--uninstalling)
50
+ - [Troubleshooting](#troubleshooting)
51
+ - [FAQ](#faq)
52
+ - [Contributing](#contributing)
53
+ - [License](#license)
54
+ - [Credits](#credits)
65
55
 
66
56
  ---
67
57
 
@@ -70,6 +60,7 @@ Atomic is an open-source **multi-agent harness** that orchestrates **Claude Code
70
60
  ### Prerequisites
71
61
 
72
62
  - **macOS, Linux, or Windows** (PowerShell 7+ required on Windows — [install guide](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-windows))
63
+ - **[Bun](https://bun.sh/)** runtime installed
73
64
  - **At least one coding agent installed and logged in:**
74
65
  - [Claude Code](https://code.claude.com/docs/en/quickstart) — run `claude` and complete authentication
75
66
  - [OpenCode](https://opencode.ai) — run `opencode` and complete authentication
@@ -96,13 +87,7 @@ your-project/
96
87
  └── ...
97
88
  ```
98
89
 
99
- ```jsonc
100
- {
101
- "features": {
102
- "ghcr.io/flora131/atomic/claude:1": {} // or /opencode:1 or /copilot:1
103
- }
104
- }
105
- ```
90
+ On first run, Atomic automatically sets up all required tooling (Node.js, tmux, Playwright CLI, config files, skills, and agent configurations). This happens once and takes about a minute.
106
91
 
107
92
  | Feature | Reference | Agent |
108
93
  | -------------------- | ------------------------------------ | ---------------------------------------------------- |
@@ -130,8 +115,6 @@ macOS / Linux:
130
115
 
131
116
  ```bash
132
117
  curl -fsSL https://raw.githubusercontent.com/flora131/atomic/main/install.sh | bash
133
- # or with wget:
134
- wget -qO- https://raw.githubusercontent.com/flora131/atomic/main/install.sh | bash
135
118
  ```
136
119
 
137
120
  Windows PowerShell 7+:
@@ -140,6 +123,58 @@ Windows PowerShell 7+:
140
123
  irm https://raw.githubusercontent.com/flora131/atomic/main/install.ps1 | iex
141
124
  ```
142
125
 
126
+ <details>
127
+ <summary>Migrating from v0.4.x (Binary) to v0.5.x (npm)?</summary>
128
+
129
+ Atomic has moved from a standalone binary distribution to an **npm package**. The new version gives you the Workflow SDK, 58 skills, and 12 sub-agents as a single installable package.
130
+
131
+ #### Migration Steps
132
+
133
+ **1. Uninstall the old binary:**
134
+
135
+ ```bash
136
+ atomic uninstall
137
+ ```
138
+
139
+ **2. Remove the old Workflow SDK global package:**
140
+
141
+ ```bash
142
+ bun uninstall -g @bastani/atomic-workflows
143
+ ```
144
+
145
+ **3. Delete the old configuration directory:**
146
+
147
+ ```bash
148
+ rm -rf ~/.atomic
149
+ ```
150
+
151
+ **4. Install the new version:**
152
+
153
+ ```bash
154
+ bun install -g @bastani/atomic
155
+ ```
156
+
157
+ **5. Re-initialize your project:**
158
+
159
+ ```bash
160
+ cd your-project
161
+ atomic init
162
+ ```
163
+
164
+ > On first run after install, Atomic automatically syncs all agent configurations, skills, workflows, and tooling. This replaces the old `atomic update` command — updates now happen lazily on CLI startup when a version mismatch is detected.
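The lazy update described above amounts to comparing the installed package version against the version that was last synced. A minimal sketch of such a check follows; `SyncState`, `needsSync`, and `startCli` are hypothetical names used for illustration and are not taken from Atomic's source.

```ts
// Hypothetical sketch: none of these names come from Atomic's source.
interface SyncState {
  /** Version whose configs were last synced; undefined on a fresh install. */
  syncedVersion?: string;
}

/** Returns true when the tooling and config sync should run on startup. */
function needsSync(installedVersion: string, state: SyncState): boolean {
  return state.syncedVersion !== installedVersion;
}

/** Runs the sync step only on a mismatch, then records the new version. */
function startCli(installedVersion: string, state: SyncState, sync: () => void): SyncState {
  if (!needsSync(installedVersion, state)) return state;
  sync();
  return { syncedVersion: installedVersion };
}

let runs = 0;
let state: SyncState = {};
state = startCli("0.5.1-0", state, () => { runs++; }); // first run: sync executes
state = startCli("0.5.1-0", state, () => { runs++; }); // same version: sync skipped
console.log(runs, state.syncedVersion);
```

Under these assumptions the sync callback fires once per installed version, which matches the "happens lazily on CLI startup" behavior the note describes.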
165
+
166
+ #### What Changed
167
+
168
+ | Aspect | v0.4.x (Binary) | v0.5.x (npm) |
169
+ | --- | --- | --- |
170
+ | **Distribution** | Pre-compiled binary via `install.sh` | npm package via `bun install -g` |
171
+ | **Updates** | `atomic update` command | Reinstall via `bun install -g @bastani/atomic` + auto-sync on first run |
172
+ | **Uninstall** | `atomic uninstall` | `bun uninstall -g @bastani/atomic` |
173
+ | **Workflow SDK** | Separate `@bastani/atomic-workflows` global package | Bundled with CLI as workspace package |
174
+ | **Config sync** | Manual via install scripts | Automatic on first run after upgrade |
175
+
176
+ </details>
177
+
143
178
  ### 2. Initialize Your Project
144
179
 
145
180
  ```bash
@@ -147,7 +182,7 @@ cd your-project
147
182
  atomic init
148
183
  ```
149
184
 
150
- Select your coding agent when prompted. The CLI configures your project automatically.
185
+ Select your coding agent and source control system when prompted. The CLI configures your project automatically.
151
186
 
152
187
  ### 3. Generate Context Files
153
188
 
@@ -163,47 +198,70 @@ atomic chat -a <claude|opencode|copilot>
163
198
 
164
199
  This explores your codebase using sub-agents and generates documentation that gives coding agents the context they need.
165
200
 
166
- ### 4. Ship Features
201
+ ### 4. Build a Workflow
167
202
 
168
- ```
169
- Research → Spec → Implement, Verify & Review → PR
170
- ```
171
-
172
- ```bash
173
- # Research the codebase
174
- /research-codebase Describe your feature or question
175
- /clear
203
+ Every team has a process. Atomic lets you encode it as TypeScript — chain agent sessions together, pass transcripts between them, and run the whole thing from the CLI.
176
204
 
177
- # Create a specification (review carefully; it becomes the contract)
178
- /create-spec research-path
179
- /clear
205
+ Drop a `.ts` file in `.atomic/workflows/<name>/<agent>/index.ts` and run it:
180
206
 
181
- # Implement autonomously (run from a separate terminal)
182
- atomic workflow -n ralph -a <claude|opencode|copilot> "<prompt-or-spec-path>"
207
+ ```bash
208
+ atomic workflow -n my-workflow -a claude "add user avatars to the profile page"
209
+ ```
183
210
 
184
- # Review the implementation
185
- # Ralph runs tests, reviews correctness, and fixes issues automatically —
186
- # but you should still read the code changes before shipping.
187
- Review the code changes against the spec. Flag anything that doesn't match.
211
+ Here's a workflow that researches a codebase, implements a feature, and reviews the result — three sessions, each in its own context window:
188
212
 
189
- # Commit and ship
190
- /gh-commit
191
- /gh-create-pr
192
- ```
213
+ ```ts
214
+ // .atomic/workflows/my-workflow/claude/index.ts
215
+ import { defineWorkflow, createClaudeSession, claudeQuery } from "@bastani/atomic/workflows";
193
216
 
194
- > **Testing and verification are automated.** Ralph's review-debug loop runs tests, checks correctness and test coverage against the spec, and fixes issues, but we suggest reviewing the final diff yourself before committing.
217
+ export default defineWorkflow({
218
+ name: "my-workflow",
219
+ description: "Research -> Implement -> Review",
220
+ })
221
+ .run(async (ctx) => {
222
+ const research = await ctx.session(
223
+ { name: "research", description: "Analyze the codebase for the requested change" },
224
+ async (s) => {
225
+ await createClaudeSession({ paneId: s.paneId });
226
+ await claudeQuery({
227
+ paneId: s.paneId,
228
+ prompt: `/research-codebase ${s.userPrompt}`,
229
+ });
230
+ s.save(s.sessionId);
231
+ },
232
+ );
195
233
 
196
- If something breaks, use the debugging agent:
234
+ const implement = await ctx.session(
235
+ { name: "implement", description: "Implement the feature based on research findings" },
236
+ async (s) => {
237
+ const transcript = await s.transcript(research);
238
+ await createClaudeSession({ paneId: s.paneId });
239
+ await claudeQuery({
240
+ paneId: s.paneId,
241
+ prompt: `Read ${transcript.path} and implement the changes described. Run tests to verify.`,
242
+ });
243
+ s.save(s.sessionId);
244
+ },
245
+ );
197
246
 
247
+ await ctx.session(
248
+ { name: "review", description: "Review the implementation for correctness" },
249
+ async (s) => {
250
+ await createClaudeSession({ paneId: s.paneId });
251
+ await claudeQuery({
252
+ paneId: s.paneId,
253
+ prompt: "Review all uncommitted changes. Flag any issues with correctness, tests, or style.",
254
+ });
255
+ s.save(s.sessionId);
256
+ },
257
+ );
258
+ })
259
+ .compile();
198
260
  ```
199
- Use the debugging agent to create a debugging report for [error message]
200
- ```
201
-
202
- ---
203
261
 
204
- ## Video Overview
262
+ This is just one example. Add a spec phase, parallelize independent sessions, or swap in a different agent: the workflow is yours to define. See [the Workflow SDK section](#workflow-sdk--build-your-own-deterministic-harness) for the full API and more examples.
205
263
 
206
- [![Atomic Video Overview](https://img.youtube.com/vi/Lq8-qzGfoy4/maxresdefault.jpg)](https://www.youtube.com/watch?v=Lq8-qzGfoy4)
264
+ > **Want something that works out of the box?** Atomic ships with `ralph`, a built-in workflow that plans, implements, reviews, and debugs autonomously — see [Autonomous Execution (Ralph)](#autonomous-execution-ralph).
207
265
 
208
266
  ---
209
267
 
@@ -221,7 +279,7 @@ Atomic is the only harness that unifies **three production agent SDKs** behind a
221
279
 
222
280
  Each agent gets its own configuration directory (`.claude/`, `.opencode/`, `.github/`), skills, and context files — all managed by Atomic. Write a workflow once, run it on any agent.
223
281
 
224
- ### Workflow SDK — Build Your Own Harness
282
+ ### Workflow SDK — Build Your Own Deterministic Harness
225
283
 
226
284
  Every team has a process — triage bugs this way, ship features that way, review PRs with these checks. Most of it lives in a wiki nobody reads or in one senior engineer's head. The **Workflow SDK** (`@bastani/atomic/workflows`) lets you encode that process as TypeScript — spawn agent sessions dynamically with native control flow (`for`, `if`, `Promise.all()`), and watch them appear in a live graph as they execute.
227
285
 
@@ -232,7 +290,7 @@ atomic workflow -n hello -a claude "describe this project"
232
290
  ```
233
291
 
234
292
  <details>
235
- <summary>See an example of the workflow definition</summary>
293
+ <summary>Example: Sequential workflow (describe -> summarize)</summary>
236
294
 
237
295
  ```ts
238
296
  // .atomic/workflows/hello/claude/index.ts
@@ -240,10 +298,10 @@ import { defineWorkflow, createClaudeSession, claudeQuery } from "@bastani/atomi
240
298
 
241
299
  export default defineWorkflow({
242
300
  name: "hello",
243
- description: "Two-session Claude demo: describe → summarize",
301
+ description: "Two-session Claude demo: describe -> summarize",
244
302
  })
245
303
  .run(async (ctx) => {
246
- const describe = await ctx.session(
304
+ const describe = await ctx.stage(
247
305
  { name: "describe", description: "Ask Claude to describe the project" },
248
306
  async (s) => {
249
307
  await createClaudeSession({ paneId: s.paneId });
@@ -252,7 +310,7 @@ export default defineWorkflow({
252
310
  },
253
311
  );
254
312
 
255
- await ctx.session(
313
+ await ctx.stage(
256
314
  { name: "summarize", description: "Summarize the previous session's output" },
257
315
  async (s) => {
258
316
  const research = await s.transcript(describe);
@@ -270,19 +328,106 @@ export default defineWorkflow({
270
328
 
271
329
  </details>
272
330
 
331
+ <details>
332
+ <summary>Example: Parallel workflow (describe -> [summarize-a, summarize-b] -> merge)</summary>
333
+
334
+ ```ts
335
+ // .atomic/workflows/hello-parallel/claude/index.ts
336
+ import { defineWorkflow, createClaudeSession, claudeQuery } from "@bastani/atomic/workflows";
337
+
338
+ export default defineWorkflow({
339
+ name: "hello-parallel",
340
+ description: "Parallel Claude demo: describe -> [summarize-a, summarize-b] -> merge",
341
+ })
342
+ .run(async (ctx) => {
343
+ const describe = await ctx.session(
344
+ { name: "describe", description: "Ask Claude to describe the project" },
345
+ async (s) => {
346
+ await createClaudeSession({ paneId: s.paneId });
347
+ await claudeQuery({ paneId: s.paneId, prompt: s.userPrompt });
348
+ s.save(s.sessionId);
349
+ },
350
+ );
351
+
352
+ const [summarizeA, summarizeB] = await Promise.all([
353
+ ctx.session(
354
+ { name: "summarize-a", description: "Summarize the description as bullet points" },
355
+ async (s) => {
356
+ const research = await s.transcript(describe);
357
+ await createClaudeSession({ paneId: s.paneId });
358
+ await claudeQuery({
359
+ paneId: s.paneId,
360
+ prompt: `Read ${research.path} and summarize it in 2-3 bullet points.`,
361
+ });
362
+ s.save(s.sessionId);
363
+ },
364
+ ),
365
+ ctx.session(
366
+ { name: "summarize-b", description: "Summarize the description as a one-liner" },
367
+ async (s) => {
368
+ const research = await s.transcript(describe);
369
+ await createClaudeSession({ paneId: s.paneId });
370
+ await claudeQuery({
371
+ paneId: s.paneId,
372
+ prompt: `Read ${research.path} and summarize it in a single sentence.`,
373
+ });
374
+ s.save(s.sessionId);
375
+ },
376
+ ),
377
+ ]);
378
+
379
+ await ctx.session(
380
+ { name: "merge", description: "Merge both summaries into a final output" },
381
+ async (s) => {
382
+ const bullets = await s.transcript(summarizeA);
383
+ const oneliner = await s.transcript(summarizeB);
384
+ await createClaudeSession({ paneId: s.paneId });
385
+ await claudeQuery({
386
+ paneId: s.paneId,
387
+ prompt: [
388
+ "Combine the following two summaries into one concise paragraph:",
389
+ "",
390
+ "## Bullet points",
391
+ bullets.content,
392
+ "",
393
+ "## One-liner",
394
+ oneliner.content,
395
+ ].join("\n"),
396
+ });
397
+ s.save(s.sessionId);
398
+ },
399
+ );
400
+ })
401
+ .compile();
402
+ ```
403
+
404
+ </details>
405
+
273
406
  **Key capabilities:**
274
407
 
275
408
  | Capability | Description |
276
409
  | ---------------------------- | ------------------------------------------------------------------------------------ |
277
- | **Dynamic session spawning** | Call `ctx.session()` to spawn sessions at runtime — each gets its own tmux window and graph node |
410
+ | **Dynamic session spawning** | Call `ctx.stage()` to spawn sessions at runtime — each gets its own tmux window and graph node |
278
411
  | **Native TypeScript control flow** | Use `for`, `if/else`, `Promise.all()`, `try/catch` — no framework DSL needed |
279
- | **Session return values** | Session callbacks can return data: `const h = await ctx.session(...); h.result` |
412
+ | **Session return values** | Session callbacks can return data: `const h = await ctx.stage(...); h.result` |
280
413
  | **Transcript passing** | Access prior session output via handle (`s.transcript(handle)`) or name (`s.transcript("name")`) |
281
- | **Nested sub-sessions** | Call `s.session()` inside a session callback to spawn child sessions — visible as nested nodes in the graph |
282
- | **Dependency tracking** | Use `dependsOn: ["name"]` to declare session ordering; the runtime waits and the graph shows the edges |
414
+ | **Nested sub-sessions** | Call `s.stage()` inside a session callback to spawn child sessions — visible as nested nodes in the graph |
415
+ | **Auto-inferred graph** | Graph topology is auto-inferred from `await`/`Promise.all` patterns; no annotations needed |
283
416
  | **Provider-agnostic** | Write raw SDK code for Claude, Copilot, or OpenCode inside each session callback |
284
417
  | **Live graph visualization** | Sessions appear in the TUI graph as they're spawned — loops and conditionals are visible in real time |
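As a rough illustration of session return values combined with native control flow, the sketch below retries an implement/review pair in a plain `for` loop. To keep it runnable on its own, `ctx` is a local stand-in that only mimics the `ctx.stage(opts, fn)` shape from the table (its callback takes no session argument), and `review` is a made-up helper; neither is Atomic's implementation.

```ts
// Stand-in for the SDK so the sketch runs on its own. In a real workflow,
// `ctx` comes from `defineWorkflow(...).run(async (ctx) => ...)`.
interface Handle<T> { name: string; result: T }

const spawned: string[] = [];
const ctx = {
  // Hypothetical mock: records the stage name, runs the callback,
  // and wraps whatever the callback returns in a handle.
  async stage<T>(opts: { name: string }, fn: () => Promise<T>): Promise<Handle<T>> {
    spawned.push(opts.name);
    return { name: opts.name, result: await fn() };
  },
};

// Pretend reviewer: rejects the first attempt, accepts the second.
async function review(attempt: number): Promise<{ passed: boolean }> {
  return { passed: attempt >= 2 };
}

async function implementUntilReviewed(maxAttempts: number): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await ctx.stage({ name: `implement-${attempt}` }, async () => undefined);
    const verdict = await ctx.stage({ name: `review-${attempt}` }, () => review(attempt));
    if (verdict.result.passed) return attempt; // a plain `if` decides whether to loop again
  }
  throw new Error("review never passed");
}

const attempts = implementUntilReviewed(3);
attempts.then((n) => console.log(n, spawned));
```

Because stage names must be unique within a run, the loop suffixes each name with the attempt number.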
285
418
 
419
+ **Deterministic execution guarantees:**
420
+
421
+ Workflows are deterministic by design — the same definition always produces the same execution order with the same data flow, regardless of when or where you run it.
422
+
423
+ - **Strict step ordering** — Steps execute sequentially. Step 2 never starts until Step 1 finishes. Parallel sessions within a step all complete (or fail fast) before the next step begins.
424
+ - **Frozen definitions** — `.compile()` freezes the workflow structure. Once compiled, the step order, session names, and execution graph are immutable.
425
+ - **Controlled transcript access** — Sessions can only read transcripts from *completed* upstream sessions. Parallel siblings are blocked from reading each other, eliminating race conditions on shared state.
426
+ - **Isolated context windows** — Each session runs in its own tmux pane with a fresh context window. No session inherits stale state from another — data flows only through explicit `ctx.transcript()` and `ctx.getMessages()` calls.
427
+ - **Persisted artifacts** — Every session writes its messages, transcript, and metadata to disk. The workflow produces a complete, inspectable execution record you can replay or debug after the fact.
428
+
429
+ This means you can run the same workflow on different machines, different agents, or at different times and get structurally identical execution — same steps, same data flow, same ordering. The only variance comes from the LLM's responses, not from the harness.
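One way to picture the controlled transcript access rule is a store that refuses reads until a session has completed. The sketch below is an assumption about how such a rule could be enforced, not Atomic's actual runtime; `TranscriptStore` and its methods are hypothetical.

```ts
// Hypothetical sketch: this is not Atomic's runtime.
type Entry = { done: boolean; content?: string };

class TranscriptStore {
  private entries = new Map<string, Entry>();

  /** Registers a running session; names must be unique within a run. */
  start(name: string): void {
    if (this.entries.has(name)) throw new Error(`duplicate session name: ${name}`);
    this.entries.set(name, { done: false });
  }

  /** Marks a session as completed and stores its transcript. */
  complete(name: string, content: string): void {
    if (!this.entries.has(name)) throw new Error(`unknown session: ${name}`);
    this.entries.set(name, { done: true, content });
  }

  /** Only completed sessions are readable; running siblings are rejected. */
  read(name: string): string {
    const entry = this.entries.get(name);
    if (!entry || !entry.done || entry.content === undefined) {
      throw new Error(`transcript "${name}" is not available`);
    }
    return entry.content;
  }
}

const store = new TranscriptStore();
store.start("describe");
store.complete("describe", "A CLI for agent workflows.");

// Two parallel siblings start after "describe" has completed.
store.start("summarize-a");
store.start("summarize-b");

const upstream = store.read("describe"); // allowed: upstream is complete
let siblingBlocked = false;
try {
  store.read("summarize-b"); // rejected: sibling is still running
} catch {
  siblingBlocked = true;
}
console.log(upstream, siblingBlocked);
```

Failing loudly on an early read, rather than returning partial content, is what keeps parallel siblings from racing on shared state in this sketch.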
430
+
286
431
  Drop a `.ts` file in `.atomic/workflows/<name>/<agent>/` (project-local) or `~/.atomic/workflows/` (global). You can also ask Atomic to create workflows for you:
287
432
 
288
433
  ```
@@ -306,7 +451,7 @@ Use your workflow-creator skill to create a workflow that plans, implements, and
306
451
  | ----------------------- | ------------------------- | -------------------------------------------------------------- |
307
452
  | `ctx.userPrompt` | `string` | Original user prompt from the CLI invocation |
308
453
  | `ctx.agent` | `AgentType` | Which agent is running (`"claude"`, `"copilot"`, `"opencode"`) |
309
- | `ctx.session(opts, fn)` | `Promise<SessionHandle<T>>` | Spawn a session — returns handle with `name`, `id`, `result` |
454
+ | `ctx.stage(opts, fn)` | `Promise<SessionHandle<T>>` | Spawn a session — returns handle with `name`, `id`, `result` |
310
455
  | `ctx.transcript(ref)` | `Promise<Transcript>` | Get a completed session's transcript (`{ path, content }`) |
311
456
  | `ctx.getMessages(ref)` | `Promise<SavedMessage[]>` | Get a completed session's raw native messages |
312
457
 
@@ -323,7 +468,7 @@ Use your workflow-creator skill to create a workflow that plans, implements, and
323
468
  | `s.save(messages)` | `SaveTranscript` | Save this session's output for subsequent sessions |
324
469
  | `s.transcript(ref)` | `Promise<Transcript>` | Get a completed session's transcript |
325
470
  | `s.getMessages(ref)` | `Promise<SavedMessage[]>` | Get a completed session's raw native messages |
326
- | `s.session(opts, fn)` | `Promise<SessionHandle<T>>` | Spawn a nested sub-session (child in the graph) |
471
+ | `s.stage(opts, fn)` | `Promise<SessionHandle<T>>` | Spawn a nested sub-session (child in the graph) |
327
472
 
328
473
  #### Session Options (`SessionRunOptions`)
329
474
 
@@ -331,17 +476,8 @@ Use your workflow-creator skill to create a workflow that plans, implements, and
331
476
  | ------------- | ---------- | ----------------------------------------------------------------------------- |
332
477
  | `name` | `string` | Unique session name within the workflow run |
333
478
  | `description` | `string?` | Human-readable description shown in the graph |
334
- | `dependsOn` | `string[]?`| Names of sessions that must complete before this one starts (creates graph edges) |
335
479
 
336
- `dependsOn` is useful when spawning sessions with `Promise.all()`: it lets the runtime enforce ordering while still allowing parallel spawning of independent sessions:
337
-
338
- ```ts
339
- await Promise.all([
340
- ctx.session({ name: "migrate-db" }, async (s) => { /* ... */ }),
341
- ctx.session({ name: "seed-data", dependsOn: ["migrate-db"] }, async (s) => { /* ... */ }),
342
- ctx.session({ name: "gen-types", dependsOn: ["migrate-db"] }, async (s) => { /* ... */ }),
343
- ]);
344
- ```
480
+ The runtime auto-infers parent-child edges from execution order: sequential `await` creates a chain, while `Promise.all` creates parallel fan-out/fan-in; no annotations needed.
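To make the inference rule concrete, here is one heuristic that produces a chain for sequential `await` and fan-out/fan-in for `Promise.all`. It is a sketch under stated assumptions: `stage`, `frontier`, and `edges` are hypothetical names, and the real runtime may infer edges differently.

```ts
// Hypothetical sketch of edge inference; not Atomic's actual runtime.
const edges: Array<[string, string]> = [];
let frontier: string[] = []; // completed sessions the next stage should depend on

async function stage(name: string, work: () => Promise<void> = async () => {}): Promise<string> {
  // This part runs synchronously at call time, so siblings passed to
  // Promise.all all capture the same parents before any of them finishes.
  const parents = [...frontier];
  for (const parent of parents) edges.push([parent, name]);

  await work();

  // On completion, this session replaces its parents in the frontier.
  frontier = frontier.filter((n) => !parents.includes(n));
  frontier.push(name);
  return name;
}

async function main(): Promise<string[]> {
  await stage("describe");
  await Promise.all([stage("summarize-a"), stage("summarize-b")]);
  await stage("merge");
  return edges.map(([from, to]) => `${from} -> ${to}`);
}

const inferred = main();
inferred.then((lines) => console.log(lines.join("\n")));
```

For the describe, summarize, merge shape used in the parallel example, this yields two edges out of `describe` and two edges into `merge`, with no explicit dependency annotations in the workflow code.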
345
481
 
346
482
  #### Saving Transcripts
347
483
 
@@ -520,7 +656,7 @@ Each feature installs Atomic + one agent. Mix and match across projects:
520
656
 
521
657
  ### Specialized Sub-Agents
522
658
 
523
- Atomic doesn't use one general-purpose agent for everything. It dispatches **purpose-built sub-agents**, each with scoped context, tools, and termination conditions:
659
+ Atomic dispatches **purpose-built sub-agents**, each with scoped context, tools, and termination conditions:
524
660
 
525
661
  | Sub-Agent | Purpose |
526
662
  | ---------------------------- | ---------------------------------------------------------------------- |
@@ -654,7 +790,7 @@ Skills are auto-invoked when relevant — `test-driven-development` activates be
654
790
 
655
791
  During `atomic workflow` execution, Atomic renders a live orchestrator panel built on [OpenTUI](https://github.com/anomalyco/opentui) on top of the workflow's tmux session graph. It shows:
656
792
 
657
- - **Session graph** — Nodes for each `.session()` call with status (pending, running, completed, failed) and edges for sequential / parallel dependencies
793
+ - **Session graph** — Nodes for each `.stage()` call with status (pending, running, completed, failed) and edges for sequential / parallel dependencies
658
794
  - **Task list tracking** — Ralph's decomposed task list with dependency arrows, updated in real time as workers complete tasks
659
795
  - **Pane previews** — Thumbnail of each tmux pane so you can see what every agent is doing without switching contexts
660
796
  - **Transcript passing visibility** — Highlights `s.save()` / `s.transcript()` handoffs as they happen between sessions
@@ -668,42 +804,6 @@ During `atomic chat`, there is no Atomic-owned TUI — `atomic chat -a <agent>`
668
804
 
669
805
  ---
670
806
 
671
- ## Architecture
672
-
673
- **You own the decisions. Agents own the execution.**
674
-
675
- Every feature follows this cycle. Specs and research become persistent context for future sessions. You review at two critical points: after research (did the agent understand the codebase?) and after the spec (is the plan correct?).
676
-
677
- ```
678
- Research → Specs → Execution → Outcomes → Specs (persistent context)
679
- ↑ ↓
680
- └────────────────────────────────────┘
681
- ```
682
-
683
- ### Why Research → Plan → Implement → Verify Works
684
-
685
- Most failures in AI-assisted coding come from the same root cause: **the agent didn't have enough context before it started writing code**. An agent that jumps straight to implementation is guessing at architecture, conventions, and constraints — and the further it gets, the more expensive it is to correct course. This is true regardless of model capability.
686
-
687
- Atomic's architecture is built around a four-phase cycle that plays to how LLMs actually work best:
688
-
689
- **1. Research** — Before touching any code, the agent builds a factual understanding of the codebase. Specialized research sub-agents fan out in parallel: locating relevant files, analyzing implementations, querying external documentation. The output is a structured research document — not a plan, not code, just facts. This gives the human a checkpoint: *did the agent actually understand the codebase?* If the research is wrong, you catch it here instead of after 500 lines of incorrect implementation.
690
-
691
- **2. Plan (Spec)** — The agent produces a technical specification grounded in the research. This is the most important human review point. A spec is a contract: it defines what will be built, what files will be touched, what the expected behavior is. Specs are cheap to revise; implementations are expensive to rewrite. By forcing a planning phase, Atomic ensures the agent commits to a coherent strategy before writing any code.
692
-
693
- **3. Implement** — With a validated spec, the planner decomposes work into discrete tasks with dependency tracking. Worker sub-agents execute tasks in parallel, each in its own context window, each focused on a single unit of work. This is where specialization pays off — a worker implementing a database migration doesn't need to hold the full API spec in context. It just needs its task, the relevant files, and the tools to edit them.
694
-
695
- **4. Verify** — A reviewer sub-agent audits the implementation against the original spec. If issues are found, a debugger generates a report that feeds back to the planner on the next iteration. This catches errors before they compound — a misnamed field caught during review is a one-line fix; the same error caught by a user in production is a multi-file cascade.
696
-
697
- **Why this matters for LLMs specifically:**
698
-
699
- LLMs are stateless — they don't retain memory between turns beyond what's in the context window. Without structure, a long coding session becomes a degrading context window where early decisions get pushed out and the agent loses coherence. Atomic's phased approach solves this by externalizing state: research documents persist to disk, specs become files, task lists live in a SQLite database, and review feedback generates new tasks. Each phase produces artifacts that the next phase consumes, so no single agent needs to hold the entire problem in its context window.
700
-
701
- This is also why the cycle is iterative. Research and specs become persistent context for future sessions — every investigation compounds. The agent that implements your next feature starts with richer context than the one that implemented the first, without anyone having to re-explain the codebase.
702
-
703
- [![Architecture](assets/architecture.svg)](assets/architecture.svg)
704
-
705
- ---
706
-
707
807
  ## Commands Reference
708
808
 
709
809
  ### CLI Commands
@@ -826,6 +926,65 @@ Created automatically during `atomic init`. Resolution order:
826
926
 
827
927
  ## Installation Options
828
928
 
929
+ ### Bun (recommended)
930
+
931
+ ```bash
932
+ bun install -g @bastani/atomic
933
+ ```
934
+
935
+ ### Devcontainer (recommended for autonomous agents)
936
+
937
+ > [!TIP]
938
+ > Devcontainers isolate the coding agent from your host system, reducing the risk of destructive actions like unintended file deletions or misapplied shell commands. This makes them the safest way to run Atomic.
939
+ >
940
+ > Use the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) for VS Code or [DevPod](https://devpod.sh) to spawn and manage your devcontainers.
941
+
942
+ Add a single feature to your `.devcontainer/devcontainer.json`:
943
+
944
+ ```
945
+ your-project/
946
+ +-- .devcontainer/
947
+ | +-- devcontainer.json <-- add the feature here
948
+ +-- src/
949
+ +-- ...
950
+ ```
951
+
952
+ ```jsonc
953
+ {
954
+ "features": {
955
+ "ghcr.io/flora131/atomic/claude:1": {} // or /opencode:1 or /copilot:1
956
+ }
957
+ }
958
+ ```
959
+
960
+ | Feature | Reference | Agent |
961
+ |---------|-----------|-------|
962
+ | Atomic + Claude Code | `ghcr.io/flora131/atomic/claude:1` | [Claude Code](https://claude.ai) |
963
+ | Atomic + OpenCode | `ghcr.io/flora131/atomic/opencode:1` | [OpenCode](https://opencode.ai) |
964
+ | Atomic + Copilot CLI | `ghcr.io/flora131/atomic/copilot:1` | [Copilot CLI](https://github.com/github/copilot-cli) |
965
+
966
+ Each feature installs the Atomic CLI, all shared dependencies (bun, playwright-cli), agent-specific configurations (agents, skills), and the agent CLI itself. Features are versioned in sync with Atomic CLI releases.
967
+
968
+ <details>
969
+ <summary>Standalone binary (macOS / Linux)</summary>
970
+
971
+ ```bash
972
+ curl -fsSL https://raw.githubusercontent.com/flora131/atomic/main/install.sh | bash
973
+ # or with wget:
974
+ wget -qO- https://raw.githubusercontent.com/flora131/atomic/main/install.sh | bash
975
+ ```
976
+
977
+ </details>
978
+
979
+ <details>
980
+ <summary>Standalone binary (Windows PowerShell)</summary>
981
+
982
+ ```powershell
983
+ irm https://raw.githubusercontent.com/flora131/atomic/main/install.ps1 | iex
984
+ ```
985
+
986
+ </details>
987
+
829
988
  <details>
830
989
  <summary>Install a specific version</summary>
831
990
 
@@ -1004,6 +1163,7 @@ Remove-Item -Path "$env:USERPROFILE\.atomic" -Recurse -Force
1004
1163
 
1005
1164
  ---
1006
1165
 
1166
+
1007
1167
  ## Troubleshooting
1008
1168
 
1009
1169
  <details>
@@ -1030,6 +1190,27 @@ If agents fail to spawn on Windows, ensure the agent CLI is in your PATH. Atomic
1030
1190
 
1031
1191
  </details>
1032
1192
 
1193
+ <details>
1194
+ <summary>Sub-agent tree stuck on "Initializing..."</summary>
1195
+
1196
+ 1. Update to the latest release (`bun install -g @bastani/atomic`) and retry
1197
+ 2. Check for terminal progress events in verbose mode
1198
+ 3. Press `Ctrl+F` twice to terminate stuck background agents, then resend your prompt
1199
+ 4. If the issue persists, capture reproduction steps and [open an issue](https://github.com/flora131/atomic/issues)
1200
+
1201
+ </details>
1202
+
1203
+ <details>
1204
+ <summary>Shift+Enter not inserting newline</summary>
1205
+
1206
+ - **VS Code terminal:** Keep `terminal.integrated.enableKittyKeyboardProtocol` enabled
1207
+ - **GNOME Terminal, xterm, Alacritty, WezTerm, iTerm2:** `modifyOtherKeys` mode is enabled automatically
1208
+ - **Universal fallback:** Use `Ctrl+J` for newline
1209
+ - **Last resort:** End line with `\` and press Enter
1210
+
1211
+ </details>
1212
+
1213
+
1033
1214
  ---
1034
1215
 
1035
1216
  ## FAQ
@@ -1039,17 +1220,21 @@ If agents fail to spawn on Windows, ensure the agent CLI is in your PATH. Atomic
1039
1220
 
1040
1221
  [Spec Kit](https://github.com/github/spec-kit) is GitHub's toolkit for "Spec-Driven Development." Both improve AI-assisted development, but solve different problems:
1041
1222
 
1042
- | Aspect | Spec-Kit | Atomic |
1043
- | ---------------- | ------------------------------- | ----------------------------------------------- |
1044
- | **Focus** | Greenfield projects | Large existing codebases + greenfield |
1045
- | **First Step** | Define project principles | Analyze existing architecture |
1046
- | **Context** | Per-feature specs | Research → Specs → Execution → Outcomes |
1047
- | **Agents** | Single agent with shell scripts | 12+ specialized sub-agents across 3 SDKs |
1048
- | **Workflows** | Not available | Session-based pipelines with transcript passing |
1049
- | **Human Review** | Implicit | Explicit checkpoints |
1050
- | **Debugging** | Not addressed | Dedicated debugging workflow |
1051
- | **Autonomous** | Not available | Ralph for multi-hour execution |
1052
- | **Isolation** | Not addressed | Devcontainer features for safe execution |
1223
+ **In short:** Spec-Kit works well for greenfield projects where you start from a spec and use a single Copilot session to generate code. Atomic is built for the harder case — large existing codebases where you need to research what's already there before changing anything. It gives you multi-session pipelines with isolated context windows (so the agent doesn't degrade over long tasks), deterministic execution, and support for Claude Code, OpenCode, and Copilot CLI instead of just one agent. If you're starting a new project from scratch with Copilot, Spec-Kit is simpler. If you're working on an established codebase and need chained sessions, parallel research, or autonomous execution, that's what Atomic is for.
1224
+
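To make the "deterministic execution" and "isolated context windows" claims more concrete, here is a minimal TypeScript sketch of a strictly ordered pipeline with frozen step definitions and explicit transcript passing. It is not the Atomic Workflow SDK: `definePipeline`, `runPipeline`, `Step`, and `Transcript` are hypothetical names invented for illustration, and the real `defineWorkflow()` API may look quite different.

```typescript
// Illustrative model only; none of these names come from the Atomic SDK.
type Transcript = readonly string[];

interface Step {
  readonly name: string;
  // Earlier steps whose transcripts this step is allowed to see.
  readonly reads: readonly string[];
  readonly run: (inputs: Record<string, Transcript>) => Transcript;
}

// Freeze the step list and each step so the definition cannot change after creation.
function definePipeline(steps: Step[]): readonly Step[] {
  return Object.freeze(steps.map((step) => Object.freeze({ ...step })));
}

// Run steps strictly in definition order, handing each one only its declared inputs.
function runPipeline(steps: readonly Step[]): Record<string, Transcript> {
  const transcripts: Record<string, Transcript> = {};
  for (const step of steps) {
    // A fresh input object per step stands in for a fresh context window.
    const inputs: Record<string, Transcript> = {};
    for (const dep of step.reads) {
      if (!Object.prototype.hasOwnProperty.call(transcripts, dep)) {
        throw new Error(`step "${step.name}" reads "${dep}" before it has run`);
      }
      inputs[dep] = transcripts[dep];
    }
    transcripts[step.name] = Object.freeze([...step.run(inputs)]);
  }
  return transcripts;
}

const pipeline = definePipeline([
  { name: "research", reads: [], run: () => ["auth lives in src/auth.ts"] },
  {
    name: "plan",
    reads: ["research"],
    run: (inputs) => inputs["research"].map((finding) => `plan around: ${finding}`),
  },
  {
    name: "implement",
    reads: ["plan"],
    run: (inputs) => [`implemented ${inputs["plan"].length} planned item(s)`],
  },
]);

const result = runPipeline(pipeline);
console.log(result["implement"]);
```

In this sketch the `implement` step never sees the research transcript, because it only declares `plan` in `reads`. That is one plausible way to keep a later session's context small. How Atomic enforces this internally is not shown in the README.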
1225
+ | Aspect | Spec-Kit | Atomic |
1226
+ | --- | --- | --- |
1227
+ | **Focus** | Greenfield projects with spec-first workflow | Large existing codebases + greenfield; research-first or spec-first |
1228
+ | **First Step** | Define project principles and specs | Analyze existing architecture with parallel research sub-agents |
1229
+ | **Workflow Definition** | Shell scripts and markdown templates | TypeScript Workflow SDK (`defineWorkflow()` → `.run()` → `.compile()`) with deterministic execution |
1230
+ | **Session Management** | Single agent session | Multi-session pipelines — sequential and parallel — each in isolated context windows |
1231
+ | **Data Flow** | Manual — copy output between steps | Controlled transcript passing via `ctx.transcript()` and `ctx.getMessages()` |
1232
+ | **Agent Support** | GitHub Copilot CLI | Claude Code + OpenCode + Copilot CLI — switch with a flag |
1233
+ | **Sub-Agents** | Single general-purpose agent | 12 specialized sub-agents with scoped tools and isolated contexts |
1234
+ | **Skills** | Not available | 58 built-in skills (development, design, docs, agent architecture) |
1235
+ | **Autonomous Execution** | Not available | Ralph — multi-hour autonomous sessions with plan/implement/review/debug loop |
1236
+ | **Execution Guarantees** | Non-deterministic | Deterministic — strict step ordering, frozen definitions, controlled transcript access |
1237
+ | **Isolation** | Not addressed | Devcontainer features for containerized execution |
1053
1238
 
1054
1239
  </details>
1055
1240
 
@@ -1058,15 +1243,54 @@ If agents fail to spawn on Windows, ensure the agent CLI is in your PATH. Atomic
1058
1243
 
1059
1244
  [DeerFlow](https://github.com/bytedance/deer-flow) is ByteDance's agent harness built on LangGraph/LangChain. Both are multi-agent orchestrators, but take different approaches:
1060
1245
 
1061
- | Aspect | DeerFlow | Atomic |
1062
- | -------------- | --------------------------- | -------------------------------------------------- |
1063
- | **Runtime** | Python (LangGraph) | TypeScript (Bun) |
1064
- | **Agent SDKs** | OpenAI-compatible API | Claude Code + OpenCode + Copilot CLI SDKs natively |
1065
- | **Focus** | General-purpose agent tasks | Coding-specific: research, spec, implement, review |
1066
- | **Workflows** | LangGraph state machines | Session-based chainable API with `.compile()` |
1067
- | **Execution** | Sandbox containers | Devcontainer features + git worktrees |
1068
- | **Interface** | Web UI | Terminal TUI with agent activity tree |
1069
- | **Autonomous** | Not available | Ralph for multi-hour coding sessions |
1246
+ **In short:** DeerFlow is a general-purpose agent orchestrator — it handles research, report generation, and other tasks through a LangGraph DAG with a web UI. Atomic is narrowly focused on coding workflows. The key difference is that Atomic runs on top of production coding agents (Claude Code, OpenCode, Copilot CLI) rather than reimplementing coding tools through a generic API. You get each agent's native file editing, permissions, MCP integrations, and hooks out of the box. Atomic also gives you deterministic execution — same step order, same data flow every run — which matters when you're encoding a team's dev process and need it to be reproducible across people and CI. If you need a general-purpose agent pipeline with a web UI, DeerFlow is the better fit. If you need coding-specific workflows with strict execution guarantees, Atomic is more appropriate.
1247
+
1248
+ | Aspect | DeerFlow | Atomic |
1249
+ | --- | --- | --- |
1250
+ | **Runtime** | Python (LangGraph) | TypeScript (Bun) |
1251
+ | **Agent SDKs** | OpenAI-compatible API | Claude Code + OpenCode + Copilot CLI native SDKs — write raw SDK code in each session |
1252
+ | **Focus** | General-purpose agent tasks (research, reports) | Coding-specific: research, spec, implement, review, debug |
1253
+ | **Workflow Definition** | LangGraph state machines with graph nodes | TypeScript Workflow SDK (`defineWorkflow()` → `.run()` → `.compile()`) |
1254
+ | **Execution Model** | DAG-based with conditional edges | Deterministic: strict step ordering, frozen definitions, controlled transcript passing |
1255
+ | **Parallelism** | Via LangGraph branch nodes | Native parallel sessions via `Promise.all()` with `ctx.session()` in isolated context windows |
1256
+ | **Sub-Agents** | Researcher, coder, reporter nodes | 12 specialized sub-agents with scoped tools (planner, worker, reviewer, debugger, etc.) |
1257
+ | **Skills** | Not available | 58 built-in skills auto-invoked by context |
1258
+ | **Isolation** | Sandbox containers | Devcontainer features + git worktrees |
1259
+ | **Interface** | Web UI (Next.js) | Terminal chat with tmux-based session management |
1260
+ | **Autonomous** | Not available | Ralph — bounded iteration with plan/implement/review/debug loop |
1261
+ | **Distribution** | `pip install` + local server | `bun install -g` or devcontainer features |
1262
+
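The **Parallelism** row above mentions parallel sessions via `Promise.all()`. The following is a rough, self-contained TypeScript sketch of that pattern. `runSession` and `researchInParallel` are hypothetical stand-ins written for this example; they do not call the real `ctx.session()`, whose signature is not documented here.

```typescript
// Hypothetical stand-in for an agent session; not the Atomic SDK's `ctx.session()`.
interface SessionResult {
  readonly name: string;
  readonly transcript: readonly string[];
}

async function runSession(name: string, prompt: string): Promise<SessionResult> {
  // Each call builds its own message array, so sessions share no context.
  const messages: string[] = [`user: ${prompt}`];
  await Promise.resolve(); // placeholder for real agent I/O
  messages.push(`assistant: findings for ${name}`);
  return { name, transcript: messages };
}

async function researchInParallel(topics: readonly string[]): Promise<string[]> {
  // Promise.all resolves to results in input order, so the merged output
  // is stable regardless of which session finishes first.
  const results = await Promise.all(
    topics.map((topic) => runSession(topic, `Research ${topic}`)),
  );
  // Pass forward only the final message of each session, not the full history.
  return results.map((r) => `${r.name}: ${r.transcript[r.transcript.length - 1]}`);
}

researchInParallel(["auth", "db"]).then((summary) => console.log(summary));
```

Returning only the last message of each session is one simple way to model "controlled transcript passing": the downstream step gets a distilled summary rather than every intermediate turn.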
1263
+ </details>
1264
+
1265
+ <details>
1266
+ <summary>How does Atomic differ from Hermes Agent?</summary>
1267
+
1268
+ [Hermes Agent](https://github.com/NousResearch/hermes-agent) is Nous Research's general-purpose AI agent with a self-improving learning loop. Both are open-source agent frameworks, but serve different use cases:
1269
+
1270
+ **In short:** Hermes Agent is a broad AI assistant that learns and improves across sessions, connects to messaging platforms, and works with any OpenAI-compatible model. Atomic is a coding-specific harness built for engineering teams. It lets you encode your development process as deterministic TypeScript workflows that run identically across team members, machines, and CI pipelines. Instead of reimplementing coding tools from scratch, Atomic inherits production-hardened tool ecosystems from Claude Code, OpenCode, and Copilot CLI — including their permission systems, MCP integrations, and hooks — giving you two independent security boundaries (devcontainer isolation + agent permissions) rather than one. Each workflow session runs in a fresh context window with only distilled transcripts passed forward, so output stays sharp over multi-hour coding tasks instead of degrading through lossy compression. And because skills are developer-authored and version-controlled, they don't drift or accumulate errors the way auto-generated skills can. Choose Hermes if you want a self-improving general-purpose agent with multi-platform messaging; choose Atomic if you want repeatable, auditable coding workflows with strict execution guarantees and production-grade isolation.
1271
+
1272
+ | Aspect | Hermes Agent | Atomic |
1273
+ | --- | --- | --- |
1274
+ | **Focus** | General-purpose AI assistant (coding, messaging, smart home, research) | Coding-specific: multi-session workflows on coding agents |
1275
+ | **Runtime** | Python 3.11+ (uv) | TypeScript (Bun) |
1276
+ | **Agent SDKs** | OpenAI-compatible API as universal adapter (200+ models via OpenRouter) | Claude Code + OpenCode + Copilot CLI native SDKs — write raw SDK code in each session |
1277
+ | **Workflow Definition** | Cron scheduler + subagent delegation | TypeScript Workflow SDK — `defineWorkflow()` → `.run()` → `.compile()` |
1278
+ | **Session Management** | Single conversation loop with context compression | Multi-session pipelines — sequential and parallel — each in isolated context windows |
1279
+ | **Data Flow** | In-context within a single conversation | Controlled transcript passing via `ctx.transcript()` and `ctx.getMessages()` |
1280
+ | **Self-Improvement** | Closed learning loop — auto-creates skills from experience, persistent user model via Honcho | Skills authored by developers; memory via CLAUDE.md / AGENTS.md context files |
1281
+ | **Sub-Agents** | `delegate_task` spawns isolated subagents | 12 specialized sub-agents with scoped tools and model tiers (Opus, Sonnet, Haiku) |
1282
+ | **Skills** | 40+ tools + community Skills Hub (agentskills.io) | 58 built-in skills (development, design, docs, agent architecture) |
1283
+ | **Interface** | Terminal TUI + multi-platform messaging gateway (Telegram, Discord, Slack, WhatsApp, etc.) | Terminal chat with tmux-based session management |
1284
+ | **Isolation** | Six terminal backends (local, Docker, SSH, Daytona, Singularity, Modal) | Devcontainer features + git worktrees |
1285
+ | **Autonomous Execution** | Cron scheduler with inactivity-based timeouts | Ralph — bounded iteration with plan/implement/review/debug loop |
1286
+ | **Execution Guarantees** | Non-deterministic conversation loop | Deterministic — strict step ordering, frozen definitions, controlled transcript access |
1287
+ | **Team Process Encoding** | Personal assistant — no concept of team-shared workflows | Encode your team's dev process as TypeScript — repeatable across members, projects, and CI |
1288
+ | **Coding Agent Tooling** | Reimplements file/terminal tools from scratch via `model_tools.py` | Inherits production-hardened tool ecosystems from Claude Code, OpenCode, and Copilot CLI (file editing, permissions, MCP, hooks) |
1289
+ | **Reproducibility** | Conversation loop produces different execution paths each run | Frozen workflow definitions run identically across machines, team members, and CI pipelines |
1290
+ | **Context Quality** | Lossy compression within a single conversation — degrades on long coding tasks | Fresh context window per session with only distilled transcripts passed forward — stays sharp over multi-hour tasks |
1291
+ | **Skill Authoring** | Auto-created skills may drift, accumulate errors, or encode bad patterns over time | Developer-authored, version-controlled skills — intentional and auditable |
1292
+ | **Security Model** | Command approval + container backends (single boundary) | Devcontainer isolation + coding agent permission systems (Claude Code permissions, Copilot safeguards) — two independent security boundaries |
1293
+ | **Distribution** | `uv` / `pip` | `bun install -g` or devcontainer features |
1070
1294
 
1071
1295
  </details>
1072
1296
 
@@ -1089,3 +1313,4 @@ MIT License — see [LICENSE](LICENSE) for details.
1089
1313
  - [Ralph Wiggum Method](https://ghuntley.com/ralph/)
1090
1314
  - [OpenAI Codex Cookbook](https://github.com/openai/openai-cookbook)
1091
1315
  - [HumanLayer](https://github.com/humanlayer/humanlayer)
1316
+ - [Impeccable](https://github.com/pbakaus/impeccable)