@bastani/atomic 0.8.17-0 → 0.8.18-0

Files changed (53)
  1. package/CHANGELOG.md +16 -0
  2. package/dist/builtin/intercom/CHANGELOG.md +5 -0
  3. package/dist/builtin/intercom/package.json +1 -1
  4. package/dist/builtin/mcp/CHANGELOG.md +5 -0
  5. package/dist/builtin/mcp/package.json +1 -1
  6. package/dist/builtin/subagents/CHANGELOG.md +5 -0
  7. package/dist/builtin/subagents/package.json +1 -1
  8. package/dist/builtin/web-access/CHANGELOG.md +5 -0
  9. package/dist/builtin/web-access/package.json +1 -1
  10. package/dist/builtin/workflows/CHANGELOG.md +25 -0
  11. package/dist/builtin/workflows/README.md +62 -3
  12. package/dist/builtin/workflows/builtin/deep-research-codebase.ts +555 -537
  13. package/dist/builtin/workflows/builtin/goal.ts +5 -0
  14. package/dist/builtin/workflows/builtin/open-claude-design.ts +3 -3
  15. package/dist/builtin/workflows/builtin/ralph.ts +737 -713
  16. package/dist/builtin/workflows/builtin/shared-prompts.ts +11 -0
  17. package/dist/builtin/workflows/package.json +1 -1
  18. package/dist/builtin/workflows/src/extension/discovery.ts +61 -22
  19. package/dist/builtin/workflows/src/extension/index.ts +2 -0
  20. package/dist/builtin/workflows/src/extension/runtime.ts +4 -0
  21. package/dist/builtin/workflows/src/extension/workflow-schema.ts +4 -0
  22. package/dist/builtin/workflows/src/runs/foreground/executor.ts +96 -6
  23. package/dist/builtin/workflows/src/runs/foreground/stage-runner.ts +2 -0
  24. package/dist/builtin/workflows/src/runs/shared/workflow-runner.ts +7 -0
  25. package/dist/builtin/workflows/src/runs/shared/worktree.ts +214 -1
  26. package/dist/builtin/workflows/src/sdk-surface.ts +2 -0
  27. package/dist/builtin/workflows/src/shared/types.ts +32 -3
  28. package/dist/builtin/workflows/src/workflows/define-workflow.ts +18 -1
  29. package/dist/core/agent-session-services.d.ts +2 -1
  30. package/dist/core/agent-session-services.d.ts.map +1 -1
  31. package/dist/core/agent-session-services.js +1 -0
  32. package/dist/core/agent-session-services.js.map +1 -1
  33. package/dist/core/agent-session.d.ts +3 -0
  34. package/dist/core/agent-session.d.ts.map +1 -1
  35. package/dist/core/agent-session.js +16 -5
  36. package/dist/core/agent-session.js.map +1 -1
  37. package/dist/core/atomic-guide-command.d.ts.map +1 -1
  38. package/dist/core/atomic-guide-command.js +40 -28
  39. package/dist/core/atomic-guide-command.js.map +1 -1
  40. package/dist/core/sdk.d.ts +9 -1
  41. package/dist/core/sdk.d.ts.map +1 -1
  42. package/dist/core/sdk.js +2 -2
  43. package/dist/core/sdk.js.map +1 -1
  44. package/dist/core/system-prompt.d.ts +2 -0
  45. package/dist/core/system-prompt.d.ts.map +1 -1
  46. package/dist/core/system-prompt.js +22 -13
  47. package/dist/core/system-prompt.js.map +1 -1
  48. package/docs/quickstart.md +13 -5
  49. package/docs/sdk.md +20 -5
  50. package/docs/workflows.md +44 -17
  51. package/examples/sdk/05-tools.ts +22 -1
  52. package/examples/sdk/README.md +7 -3
  53. package/package.json +1 -1
package/docs/workflows.md CHANGED
@@ -17,11 +17,12 @@ Use a workflow when a task should be repeatable, inspectable, resumable, or spli
17
17
  - **Package distribution** - Ship workflows through Atomic packages, settings, or conventional directories
18
18
 
19
19
  **Example use cases:**
20
+ - Small, outcome-driven code or docs changes with explicit done criteria
20
21
  - Codebase research with parallel local and external research stages
21
22
  - Review/fix loops with independent reviewers and a synthesis stage
22
23
  - Release planning with human approval gates
23
24
  - Documentation audits that save findings as artifacts
24
- - Multi-stage migrations with validation and rollback checks
25
+ - Multi-stage migrations, broad refactors, and validation/rollback plans
25
26
  - Reusable team workflows distributed through npm, git, or project settings
26
27
 
27
28
  ## Table of Contents
@@ -139,8 +140,8 @@ Atomic bundles four workflows that cover the most common multi-stage jobs. They
139
140
  | Workflow | What it does | When to use |
140
141
  |---|---|---|
141
142
  | `deep-research-codebase` | Scout + research-history chain → parallel specialist waves → aggregator. Indexes the whole repo and synthesizes findings. | Broad or cross-cutting research before you decide what to change. Prefer `/skill:research-codebase` for one subsystem. |
142
- | `goal` | Persisted goal ledger → bounded worker turns → receipts → three-reviewer gate → deterministic reducer → final report. | Focused implementation against a clear objective or spec where you want auditable progress, reviewer-quorum completion, repeated-blocker detection, and explicit stop decisions. |
143
- | `ralph` | RFC planning → sub-agent orchestration → simplification → infrastructure discovery → parallel review → PR handoff. | Larger spec-to-PR jobs where you want a generated technical plan, delegated implementation, iterative review, and pull-request preparation. |
143
+ | `goal` | Persisted goal ledger → bounded worker turns → receipts → three-reviewer gate → deterministic reducer → final report. | Small-to-medium scope changes when you can identify the work surface, state the exact outcome, and name the validation that proves it is done — for example tests, lint/typecheck, docs builds, or observable behavior. |
144
+ | `ralph` | RFC planning → sub-agent orchestration → simplification → infrastructure discovery → parallel review → PR handoff. | Larger migrations, broad refactors, multi-package changes, and spec-to-PR work where you want Atomic to plan the approach, delegate implementation through sub-agents, simplify, review, iterate, and prepare a pull-request report. |
144
145
  | `open-claude-design` | Design-system onboarding → reference import → HTML generation → impeccable-driven refinement → quality gate → rich HTML handoff. Renders a live `preview.html` you can iterate against (opens through `playwright-cli` when available). | UI, page, component, theme, or design-token work that benefits from generation + critique loops. |
145
146
 
146
147
  ### `deep-research-codebase`
@@ -192,7 +193,7 @@ Inputs:
192
193
 
193
194
  | Input | Type | Required | Default | Description |
194
195
  |---|---|---|---|---|
195
- | `objective` | text | yes | — | Goal-runner objective. |
196
+ | `objective` | text | yes | — | Goal-runner objective. Include the desired end state, expected outcome, testing/validation instructions, and any explicit done criteria. |
196
197
  | `max_turns` | number | no | `10` | Maximum worker/review turns before human follow-up is needed. |
197
198
  | `base_branch` | string | no | `origin/main` | Branch reviewers compare the current code delta against. |
198
199
 
@@ -201,13 +202,15 @@ Inputs:
201
202
  Run examples:
202
203
 
203
204
  ```text
204
- /workflow goal objective="Implement specs/2026-03-rate-limit.md and validate the changed behavior"
205
- /workflow goal objective="Migrate the database layer to Drizzle" base_branch=develop
206
- /workflow goal objective="Finish the docs refresh" max_turns=2
205
+ /workflow goal objective="Implement specs/2026-03-rate-limit.md, add the requested regression tests, run bun test packages/api/rate-limit.test.ts, and finish only when burst traffic returns 429 with Retry-After"
206
+ /workflow goal objective="Update the CLI docs to describe the new --json flag, include one usage example, and verify the docs build still passes" max_turns=3
207
+ /workflow goal objective="Fix the settings form validation bug; add/adjust the focused test and consider it done when invalid emails show the inline error without submitting"
207
208
  ```
208
209
 
209
210
  `goal` creates an OS-temp `goal-ledger.json` artifact, renders goal-continuation context for each worker turn, writes each worker receipt to `work-turn-N.md`, and appends receipts, reviewer decisions, blockers, reducer decisions, and lifecycle events to the ledger. The objective is treated as user-provided data, not higher-priority instructions.
210
211
 
212
+ Write the `objective` like a compact acceptance spec. Say what should exist when the run is done, how you want testing handled, which command(s) or manual checks matter, and what outcome proves completion. The workflow is intentionally lean: it does not first generate an RFC or migration plan, so the developer-supplied objective is where scope, validation, and completion criteria belong.
213
+
211
214
  The worker may claim readiness, but it cannot finalize completion. Three reviewers independently inspect the ledger, worker receipt, repository state, and diff against `base_branch`; each returns structured JSON with findings, evidence, verification still remaining, and an optional blocker. A TypeScript reducer marks the goal complete only when reviewer quorum approves, marks blocked only when the same dependency/tool blocker repeats for the blocker threshold, continues when evidence is missing, and returns `needs_human` when `max_turns` is exhausted or worker execution fails.
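
The reducer rules in the paragraph above can be sketched as a pure function. This is only an illustration: the names `reduceGoal`, `ReducerInput`, `quorum`, `blockerThreshold`, and `repeatedBlockerCount` are hypothetical, the `continue` label is invented here, and the order in which the checks run is an assumption, since the docs list the outcomes but not their precedence.

```ts
// Hypothetical sketch of the documented reducer outcomes.
// Not the package's reducer: field names, thresholds, and check order are assumptions.
type ReviewerDecision = { approved: boolean; blocker?: string };
type ReducerOutcome = "complete" | "blocked" | "continue" | "needs_human";

interface ReducerInput {
  turn: number;                 // 1-based index of the turn just reviewed
  maxTurns: number;             // corresponds to the `max_turns` input
  workerFailed: boolean;        // worker execution failed this turn
  decisions: ReviewerDecision[]; // one entry per reviewer
  repeatedBlockerCount: number; // consecutive turns reporting the same blocker
  quorum: number;               // approvals required (value not stated in the docs)
  blockerThreshold: number;     // repeats required (value not stated in the docs)
}

function reduceGoal(input: ReducerInput): ReducerOutcome {
  // Worker failure hands control back to a human.
  if (input.workerFailed) return "needs_human";

  // Completion requires reviewer quorum, never the worker's own claim.
  const approvals = input.decisions.filter((d) => d.approved).length;
  if (approvals >= input.quorum) return "complete";

  // Blocked only when the same blocker has repeated often enough.
  if (input.repeatedBlockerCount >= input.blockerThreshold) return "blocked";

  // Out of turns without quorum: ask for human follow-up.
  if (input.turn >= input.maxTurns) return "needs_human";

  // Otherwise evidence is still missing, so run another worker turn.
  return "continue";
}
```

Keeping the decision in deterministic code rather than in a model turn is what makes the stop condition auditable from the ledger.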
212
215
 
213
216
  Result fields:
@@ -226,8 +229,6 @@ Result fields:
226
229
  | `remaining_work` | Remaining gaps/blockers when incomplete, or `none`. |
227
230
  | `review_report` | Markdown report containing the last structured reviewer decision payloads used by the reducer. |
228
231
 
229
- Use `goal` when you already have a clear objective or reviewed spec and want the leanest auditable implementation loop.
230
-
231
232
  ### `ralph`
232
233
 
233
234
  Inputs:
@@ -236,16 +237,20 @@ Inputs:
236
237
  |---|---|---|---|---|
237
238
  | `prompt` | text | yes | — | Task, feature request, issue summary, or spec path to plan, execute, refine, review, and prepare for PR. |
238
239
  | `max_loops` | number | no | `10` | Maximum plan/orchestrate/review iterations before the workflow proceeds to PR handoff without reviewer approval. |
239
- | `base_branch` | string | no | `origin/main` | Branch reviewers and the PR-prep stage compare the current code delta against. |
240
+ | `base_branch` | string | no | `origin/main` | Branch reviewers and the PR-prep stage compare the current code delta against; also used to create a missing worktree. |
241
+ | `git_worktree_dir` | string | no | `""` | Optional reusable Git worktree root. Empty runs in the invoking checkout; non-empty values run Ralph stages in the created/reused worktree. |
240
242
 
241
243
  Run examples:
242
244
 
243
245
  ```text
244
- /workflow ralph prompt="Implement specs/2026-03-rate-limit.md and prepare the PR"
245
246
  /workflow ralph prompt="Plan and migrate the database layer to Drizzle" max_loops=3 base_branch=develop
247
+ /workflow ralph prompt="Refactor authentication across the API, CLI, and web UI, then prepare the PR"
248
+ /workflow ralph prompt="Safely implement the API refactor" git_worktree_dir=../atomic-ralph-api-wt base_branch=main
246
249
  ```
247
250
 
248
- `ralph` is a heavier spec-to-PR workflow. Each iteration writes an RFC-style technical design document under `specs/`, initializes an OS-temp implementation notes file, delegates implementation through sub-agents, runs a behavior-preserving code simplifier, discovers review infrastructure, and asks two reviewers to inspect the patch against `base_branch`. The loop stops when every reviewer approves or `max_loops` is reached, then runs a pull-request preparation stage.
251
+ Each `ralph` iteration writes an RFC-style technical design document under `specs/`, initializes an OS-temp implementation notes file, delegates implementation through sub-agents, runs a behavior-preserving code simplifier, discovers review infrastructure, and asks two reviewers to inspect the patch against `base_branch`. The loop stops when every reviewer approves or `max_loops` is reached, then runs a pull-request preparation stage.
252
+
253
+ Set `git_worktree_dir` when you want Ralph's worker stages isolated in a reusable Git worktree. Relative paths resolve from the invoking repository root, existing same-repository worktree roots are reused, and missing paths are created from `base_branch`. Ralph preserves the invoking repo-relative cwd inside the worktree, so launching from `repo/packages/api` with `git_worktree_dir=../repo-wt` runs stages from `../repo-wt/packages/api`.
249
254
 
250
255
  Result fields:
251
256
 
@@ -260,7 +265,7 @@ Result fields:
260
265
  | `iterations_completed` | Number of plan/orchestrate/review loops completed. |
261
266
  | `review_report` | Markdown report containing the latest reviewer payloads. |
262
267
 
263
- A typical end-to-end flow is `/skill:research-codebase` → `/skill:create-spec` → `/workflow goal objective="Implement the researched rate-limit behavior and validate it"`. Use `/workflow ralph` instead when you want Atomic to generate the RFC, coordinate implementation sub-agents, iterate on review findings, and prepare a PR report in one workflow.
268
+ A typical end-to-end flow is `/skill:research-codebase` → `/skill:create-spec` → `/workflow goal objective="Implement the researched rate-limit behavior, run the focused tests, and finish when the documented burst behavior is validated"` when you can identify the work surface, state the exact outcome, and name the validation that proves it is done. Keep using `/workflow ralph` for larger migrations, broad refactors, multi-package changes, and spec-to-PR work where you want Atomic to plan, delegate through sub-agents, simplify, review, iterate, and prepare a pull-request report.
264
269
 
265
270
  ### `open-claude-design`
266
271
 
@@ -291,11 +296,11 @@ Run a deep codebase research workflow on how the rate limiter behaves under burs
291
296
  ```
292
297
 
293
298
  ```text
294
- Use the goal workflow to implement specs/2026-03-rate-limit.md and cap it at 5 turns.
299
+ Use the goal workflow to implement specs/2026-03-rate-limit.md, run the focused rate-limit tests, finish only when burst traffic returns 429 with Retry-After, and cap it at 5 turns.
295
300
  ```
296
301
 
297
302
  ```text
298
- Use the ralph workflow to plan, implement, review, and prepare a PR for specs/2026-03-rate-limit.md.
303
+ Use the ralph workflow to plan a database-layer migration, implement it, review it, and prepare a PR.
299
304
  ```
300
305
 
301
306
  ```text
@@ -338,6 +343,8 @@ If the task is only deterministic TypeScript with no LLM/session stage, use a sc
338
343
  | User goal | Use |
339
344
  |-----------|-----|
340
345
  | Run, inspect, attach to, pause, interrupt, resume, or check status for an existing workflow | `/workflow ...` or `workflow({ action: ... })` |
346
+ | Implement a small-to-medium scope change with an identifiable work surface, exact outcome, and named validation | `/workflow goal objective="..."` so Atomic keeps the run bounded, captures receipts in a goal ledger, gates completion through reviewers, and stops as `complete`, `blocked`, or `needs_human` |
347
+ | Plan and execute a larger migration, broad refactor, multi-package change, or spec-to-PR effort | `/workflow ralph prompt="..."` so Atomic can plan the approach, delegate implementation through sub-agents, simplify, review, iterate, and prepare a pull-request report |
341
348
  | Create or edit reusable automation | a TypeScript workflow definition with `defineWorkflow(...).run(...).compile()` |
342
349
  | Track one-off work without saving a workflow file | direct `workflow({ task })`, `workflow({ tasks })`, or `workflow({ chain })` calls |
343
350
  | Make a workflow robust | design the stage graph, context handoffs, artifacts, validation gates, model fallbacks, and human approval points before coding |
@@ -660,7 +667,7 @@ workflow({
660
667
  })
661
668
  ```
662
669
 
663
- Direct mode supports top-level/default options and per-task options such as `context`, `forkFromSessionFile`, `model`, `fallbackModels`, `thinkingLevel`, `tools`, `noTools`, `customTools`, `mcp`, `output`, `outputMode`, `reads`, `worktree`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, and `agentDir`. Direct chains also support `chainName`, `chainDir`, and `failFast`.
670
+ Direct mode supports top-level/default options and per-task options such as `context`, `forkFromSessionFile`, `model`, `fallbackModels`, `thinkingLevel`, `tools`, `noTools`, `customTools`, `mcp`, `output`, `outputMode`, `reads`, `worktree`, `gitWorktreeDir`, `baseBranch`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, and `agentDir`. Direct chains also support `chainName`, `chainDir`, and `failFast`.
664
671
 
665
672
  For large fan-outs, prefer `outputMode: "file-only"` so the parent result contains compact file references instead of full output. Treat intercom payloads from async direct runs as user-visible workflow output.
666
673
 
@@ -710,6 +717,7 @@ Builder basics:
710
717
  - Workflow names normalize for lookup: trim, lowercase, convert whitespace/underscore to hyphen, remove other punctuation, and collapse hyphens.
711
718
  - `.description(text)` sets the listing text.
712
719
  - `.input(key, schema)` declares typed user inputs.
720
+ - `.worktreeFromInputs({ gitWorktreeDir, baseBranch })` optionally maps input names to workflow-wide reusable Git worktree defaults.
713
721
  - `.run(async (ctx) => { ... })` defines the workflow body.
714
722
  - `.compile()` returns the workflow definition for discovery.
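
The name-normalization rule in the list above can be sketched as a short chain of string operations. `normalizeWorkflowName` is a hypothetical name, and two details are assumptions not stated in the docs: "other punctuation" is taken to mean anything outside ASCII `a-z`, `0-9`, and hyphen, and leading or trailing hyphens are left alone.

```ts
// Hypothetical sketch of the documented lookup normalization.
// Assumption: only ASCII letters, digits, and hyphens survive.
function normalizeWorkflowName(name: string): string {
  return name
    .trim()                     // trim surrounding whitespace
    .toLowerCase()              // lowercase
    .replace(/[\s_]+/g, "-")    // whitespace/underscore -> hyphen
    .replace(/[^a-z0-9-]/g, "") // remove other punctuation
    .replace(/-+/g, "-");       // collapse repeated hyphens
}

console.log(normalizeWorkflowName("  Deep_Research  Codebase! "));
```

Under these assumptions, `Deep_Research Codebase` and `deep-research-codebase` resolve to the same lookup key.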
715
723
 
@@ -770,9 +778,28 @@ Common task/stage options include:
770
778
  - `context: "fresh" | "fork"`, `forkFromSessionFile`
771
779
  - `model`, `fallbackModels`, `thinkingLevel`, `scopedModels`, `modelRegistry`
772
780
  - `tools`, `noTools`, `customTools`, `mcp: { allow?: string[], deny?: string[] }`
773
- - `output`, `outputMode`, `reads`, `worktree`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, `agentDir`
781
+ - `output`, `outputMode`, `reads`, `worktree`, `gitWorktreeDir`, `baseBranch`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, `agentDir`
774
782
  - advanced host-supplied SDK seams: `authStorage`, `resourceLoader`, `sessionManager`, `settingsManager`, `sessionStartEvent`
775
783
 
784
+ `gitWorktreeDir` selects a reusable Git worktree root for `ctx.stage`, `ctx.task`, `ctx.chain`, and `ctx.parallel`. If the path is missing, Atomic creates it with `git worktree add --detach <path> <baseBranch>`; if it exists, it must be a same-repository worktree root. The default stage cwd becomes the matching cwd inside the worktree and preserves the invoking repo-relative subdirectory. Explicit `cwd` still wins; relative `cwd` values resolve from the worktree cwd, while absolute `cwd` values are used as provided. `gitWorktreeDir` is mutually exclusive with `worktree: true`: use `gitWorktreeDir` for named/reusable worktrees and `worktree: true` for temporary direct-mode worktrees that are cleaned up after the run.
785
+
786
+ To bind user inputs to a workflow-wide worktree default, use the builder method:
787
+
788
+ ```ts
789
+ export default defineWorkflow("safe-implementation")
790
+ .input("task", { type: "text", required: true })
791
+ .input("git_worktree_dir", { type: "string", default: "" })
792
+ .input("base_branch", { type: "string", default: "origin/main" })
793
+ .worktreeFromInputs({ gitWorktreeDir: "git_worktree_dir", baseBranch: "base_branch" })
794
+ .run(async (ctx) => {
795
+ const result = await ctx.task("implement", { task: String(ctx.inputs.task) });
796
+ return { result: result.text };
797
+ })
798
+ .compile();
799
+ ```
800
+
801
+ For lower-level integrations, `@bastani/workflows` also exports `setupGitWorktree({ gitWorktreeDir, baseBranch, cwd })`, returning `{ worktreeRoot, cwd, repositoryRoot, created }` with the same validation, symlink-preserving path handling, and cwd-preservation behavior used by workflow stages.
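
The cwd rules described for `gitWorktreeDir` come down to a few path operations. The sketch below models only that arithmetic, assuming POSIX-style absolute paths. `resolveStageCwd` and its input shape are hypothetical and are not the package's `setupGitWorktree`; it never calls Git, validates the worktree, or handles symlinks.

```ts
import * as path from "node:path";

// Hypothetical helper that models the documented path rules only.
interface StageCwdInput {
  repositoryRoot: string; // absolute root of the invoking checkout
  invokingCwd: string;    // absolute directory the workflow was launched from
  gitWorktreeDir: string; // relative values resolve from repositoryRoot
  cwd?: string;           // optional explicit stage cwd
}

function resolveStageCwd(input: StageCwdInput): string {
  const p = path.posix; // POSIX semantics keep the sketch platform-independent

  // Relative worktree paths resolve from the invoking repository root.
  const worktreeRoot = p.resolve(input.repositoryRoot, input.gitWorktreeDir);

  // Preserve the invoking repo-relative subdirectory inside the worktree.
  const subdir = p.relative(input.repositoryRoot, input.invokingCwd);
  const worktreeCwd = p.join(worktreeRoot, subdir);

  // No explicit cwd: the stage runs from the matching worktree directory.
  if (input.cwd === undefined) return worktreeCwd;

  // Explicit cwd wins: absolute is used as provided, relative resolves
  // from the worktree cwd.
  return p.isAbsolute(input.cwd) ? input.cwd : p.resolve(worktreeCwd, input.cwd);
}

console.log(
  resolveStageCwd({
    repositoryRoot: "/work/repo",
    invokingCwd: "/work/repo/packages/api",
    gitWorktreeDir: "../repo-wt",
  }),
);
```

With these inputs the sketch yields `/work/repo-wt/packages/api`, matching the `repo/packages/api` example given for Ralph.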
802
+
776
803
  `fallbackModels` retries transient provider/model failures with the primary `model` first, then each fallback, then the current Atomic-selected model when available. It is for rate limits, quota/auth/provider outages, unavailable models, network timeouts, and 5xx errors — not workflow-code errors, tool failures, validation failures, or cancellations.
777
804
 
778
805
  ## Programmatic Usage
package/examples/sdk/05-tools.ts CHANGED
@@ -1,7 +1,11 @@
1
1
  /**
2
2
  * Tools Configuration
3
3
  *
4
- * Use tool names to choose which built-in tools are enabled.
4
+ * Use tool names to choose which tools are exposed.
5
+ *
6
+ * `tools` is an allowlist. `excludeTools` removes names from the final exposed
7
+ * set after any allowlist is applied, which is useful for keeping the default
8
+ * tools while removing one tool such as ask_user_question.
5
9
  *
6
10
  * Tool names are matched against all available tools. If you use a custom `cwd`,
7
11
  * createAgentSession() applies that cwd when it builds the actual built-in tools.
@@ -28,6 +32,23 @@ const { session: customToolsSession } = await createAgentSession({
28
32
  console.log("Custom tools session created");
29
33
  customToolsSession.dispose();
30
34
 
35
+ // Keep defaults but remove one tool (for example, no human-in-the-loop prompts)
36
+ const { session: defaultsWithoutAskSession } = await createAgentSession({
37
+ excludeTools: ["ask_user_question"],
38
+ sessionManager: SessionManager.inMemory(),
39
+ });
40
+ console.log("Defaults minus ask_user_question session created");
41
+ defaultsWithoutAskSession.dispose();
42
+
43
+ // Allowlist first, then subtract exclusions
44
+ const { session: allowlistWithExclusionSession } = await createAgentSession({
45
+ tools: ["read", "bash", "ask_user_question"],
46
+ excludeTools: ["ask_user_question"],
47
+ sessionManager: SessionManager.inMemory(),
48
+ });
49
+ console.log("Allowlist with exclusion session created");
50
+ allowlistWithExclusionSession.dispose();
51
+
31
52
  // With custom cwd
32
53
  const customCwd = "/path/to/project";
33
54
  const { session: customCwdSession } = await createAgentSession({
package/examples/sdk/README.md CHANGED
@@ -12,7 +12,7 @@ The runtime example shows how to build a recreate function that closes over proc
12
12
  | `02-custom-model.ts` | Select model and thinking level |
13
13
  | `03-custom-prompt.ts` | Replace or modify system prompt |
14
14
  | `04-skills.ts` | Discover, filter, or replace skills |
15
- | `05-tools.ts` | Built-in tool allowlists |
15
+ | `05-tools.ts` | Tool allowlists and exclusions |
16
16
  | `06-extensions.ts` | Logging, blocking, result modification |
17
17
  | `07-context-files.ts` | AGENTS.md context files |
18
18
  | `08-slash-commands.ts` | File-based slash commands |
@@ -26,7 +26,7 @@ The runtime example shows how to build a recreate function that closes over proc
26
26
 
27
27
  ```bash
28
28
  cd packages/coding-agent
29
- npx tsx examples/sdk/01-minimal.ts
29
+ bun examples/sdk/01-minimal.ts
30
30
  ```
31
31
 
32
32
  ## Quick Reference
@@ -63,6 +63,9 @@ const { session } = await createAgentSession({ resourceLoader: loader, authStora
63
63
  // Read-only
64
64
  const { session } = await createAgentSession({ tools: ["read", "grep", "find", "ls"], authStorage, modelRegistry });
65
65
 
66
+ // Defaults minus one tool
67
+ const { session } = await createAgentSession({ excludeTools: ["ask_user_question"], authStorage, modelRegistry });
68
+
66
69
  // In-memory
67
70
  const { session } = await createAgentSession({
68
71
  sessionManager: SessionManager.inMemory(),
@@ -114,7 +117,8 @@ await session.prompt("Hello");
114
117
  | `agentDir` | `~/.pi/agent` | Config directory |
115
118
  | `model` | From settings/first available | Model to use |
116
119
  | `thinkingLevel` | From settings/"off" | off, low, medium, high |
117
- | `tools` | `["read", "bash", "edit", "write"]` built-ins | Allowlist tool names across built-in, extension, and custom tools |
120
+ | `tools` | Default active built-ins | Allowlist tool names across built-in, extension, and custom tools |
121
+ | `excludeTools` | `[]` | Blocklist tool names across built-in, extension, and custom tools; applied after `tools` |
118
122
  | `customTools` | `[]` | Additional tool definitions |
119
123
  | `resourceLoader` | DefaultResourceLoader | Resource loader for extensions, skills, prompts, themes |
120
124
  | `sessionManager` | `SessionManager.create(cwd)` | Persistence |
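
The interaction between `tools` and `excludeTools` documented above (allowlist first, exclusions applied afterwards) can be modelled as a set filter. `resolveExposedTools` is a hypothetical function, not part of the SDK, and the `available` and `defaults` lists below are invented for illustration; the real default set is whatever the SDK treats as "default active built-ins".

```ts
// Hypothetical model of the documented ordering: allowlist, then exclusions.
function resolveExposedTools(
  available: readonly string[],      // every tool name the session could expose
  defaults: readonly string[],       // assumed default-active names
  tools?: readonly string[],         // optional allowlist
  excludeTools: readonly string[] = [], // names removed after the allowlist
): string[] {
  const allow = new Set(tools ?? defaults);
  const deny = new Set(excludeTools);
  // Keep a tool only if it is allowed and not excluded.
  return available.filter((name) => allow.has(name) && !deny.has(name));
}

// Illustrative tool names only.
const available = ["read", "bash", "edit", "write", "grep", "ask_user_question"];
const defaults = ["read", "bash", "edit", "write", "ask_user_question"];

// Defaults minus one tool.
console.log(resolveExposedTools(available, defaults, undefined, ["ask_user_question"]));

// Allowlist first, then subtract exclusions.
console.log(
  resolveExposedTools(available, defaults, ["read", "bash", "ask_user_question"], ["ask_user_question"]),
);
```

Because exclusions are applied last in this model, a name that appears in both lists is never exposed, which is the behavior the `05-tools.ts` example relies on.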
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@bastani/atomic",
3
- "version": "0.8.17-0",
3
+ "version": "0.8.18-0",
4
4
  "description": "Atomic coding agent CLI with read, bash, edit, write tools and session management",
5
5
  "type": "module",
6
6
  "atomicConfig": {