npm - @bastani/atomic - Versions diffs - 0.8.17-0 → 0.8.18-0 - Mend

@bastani/atomic 0.8.17-0 → 0.8.18-0

Files changed (53)

package/CHANGELOG.md +16 -0
package/dist/builtin/intercom/CHANGELOG.md +5 -0
package/dist/builtin/intercom/package.json +1 -1
package/dist/builtin/mcp/CHANGELOG.md +5 -0
package/dist/builtin/mcp/package.json +1 -1
package/dist/builtin/subagents/CHANGELOG.md +5 -0
package/dist/builtin/subagents/package.json +1 -1
package/dist/builtin/web-access/CHANGELOG.md +5 -0
package/dist/builtin/web-access/package.json +1 -1
package/dist/builtin/workflows/CHANGELOG.md +25 -0
package/dist/builtin/workflows/README.md +62 -3
package/dist/builtin/workflows/builtin/deep-research-codebase.ts +555 -537
package/dist/builtin/workflows/builtin/goal.ts +5 -0
package/dist/builtin/workflows/builtin/open-claude-design.ts +3 -3
package/dist/builtin/workflows/builtin/ralph.ts +737 -713
package/dist/builtin/workflows/builtin/shared-prompts.ts +11 -0
package/dist/builtin/workflows/package.json +1 -1
package/dist/builtin/workflows/src/extension/discovery.ts +61 -22
package/dist/builtin/workflows/src/extension/index.ts +2 -0
package/dist/builtin/workflows/src/extension/runtime.ts +4 -0
package/dist/builtin/workflows/src/extension/workflow-schema.ts +4 -0
package/dist/builtin/workflows/src/runs/foreground/executor.ts +96 -6
package/dist/builtin/workflows/src/runs/foreground/stage-runner.ts +2 -0
package/dist/builtin/workflows/src/runs/shared/workflow-runner.ts +7 -0
package/dist/builtin/workflows/src/runs/shared/worktree.ts +214 -1
package/dist/builtin/workflows/src/sdk-surface.ts +2 -0
package/dist/builtin/workflows/src/shared/types.ts +32 -3
package/dist/builtin/workflows/src/workflows/define-workflow.ts +18 -1
package/dist/core/agent-session-services.d.ts +2 -1
package/dist/core/agent-session-services.d.ts.map +1 -1
package/dist/core/agent-session-services.js +1 -0
package/dist/core/agent-session-services.js.map +1 -1
package/dist/core/agent-session.d.ts +3 -0
package/dist/core/agent-session.d.ts.map +1 -1
package/dist/core/agent-session.js +16 -5
package/dist/core/agent-session.js.map +1 -1
package/dist/core/atomic-guide-command.d.ts.map +1 -1
package/dist/core/atomic-guide-command.js +40 -28
package/dist/core/atomic-guide-command.js.map +1 -1
package/dist/core/sdk.d.ts +9 -1
package/dist/core/sdk.d.ts.map +1 -1
package/dist/core/sdk.js +2 -2
package/dist/core/sdk.js.map +1 -1
package/dist/core/system-prompt.d.ts +2 -0
package/dist/core/system-prompt.d.ts.map +1 -1
package/dist/core/system-prompt.js +22 -13
package/dist/core/system-prompt.js.map +1 -1
package/docs/quickstart.md +13 -5
package/docs/sdk.md +20 -5
package/docs/workflows.md +44 -17
package/examples/sdk/05-tools.ts +22 -1
package/examples/sdk/README.md +7 -3
package/package.json +1 -1

package/docs/workflows.md CHANGED

@@ -17,11 +17,12 @@ Use a workflow when a task should be repeatable, inspectable, resumable, or spli
 - **Package distribution** - Ship workflows through Atomic packages, settings, or conventional directories
 **Example use cases:**
+- Small, outcome-driven code or docs changes with explicit done criteria
 - Codebase research with parallel local and external research stages
 - Review/fix loops with independent reviewers and a synthesis stage
 - Release planning with human approval gates
 - Documentation audits that save findings as artifacts
-- Multi-stage migrations with validation and rollback checks
+- Multi-stage migrations, broad refactors, and validation/rollback plans
 - Reusable team workflows distributed through npm, git, or project settings
 ## Table of Contents
@@ -139,8 +140,8 @@ Atomic bundles four workflows that cover the most common multi-stage jobs. They
 | Workflow | What it does | When to use |
 |---|---|---|
 | `deep-research-codebase` | Scout + research-history chain → parallel specialist waves → aggregator. Indexes the whole repo and synthesizes findings. | Broad or cross-cutting research before you decide what to change. Prefer `/skill:research-codebase` for one subsystem. |
-| `goal` | Persisted goal ledger → bounded worker turns → receipts → three-reviewer gate → deterministic reducer → final report. | Focused implementation against a clear objective or spec where you want auditable progress, reviewer-quorum completion, repeated-blocker detection, and explicit stop decisions. |
-| `ralph` | RFC planning → sub-agent orchestration → simplification → infrastructure discovery → parallel review → PR handoff. | Larger spec-to-PR jobs where you want a generated technical plan, delegated implementation, iterative review, and pull-request preparation. |
+| `goal` | Persisted goal ledger → bounded worker turns → receipts → three-reviewer gate → deterministic reducer → final report. | Small-to-medium scope changes when you can identify the work surface, state the exact outcome, and name the validation that proves it is done — for example tests, lint/typecheck, docs builds, or observable behavior. |
+| `ralph` | RFC planning → sub-agent orchestration → simplification → infrastructure discovery → parallel review → PR handoff. | Larger migrations, broad refactors, multi-package changes, and spec-to-PR work where you want Atomic to plan the approach, delegate implementation through sub-agents, simplify, review, iterate, and prepare a pull-request report. |
 | `open-claude-design` | Design-system onboarding → reference import → HTML generation → impeccable-driven refinement → quality gate → rich HTML handoff. Renders a live `preview.html` you can iterate against (opens through `playwright-cli` when available). | UI, page, component, theme, or design-token work that benefits from generation + critique loops. |
 ### `deep-research-codebase`
@@ -192,7 +193,7 @@ Inputs:
 | Input | Type | Required | Default | Description |
 |---|---|---|---|---|
-| `objective` | text | yes | — | Goal-runner objective. |
+| `objective` | text | yes | — | Goal-runner objective. Include the desired end state, expected outcome, testing/validation instructions, and any explicit done criteria. |
 | `max_turns` | number | no | `10` | Maximum worker/review turns before human follow-up is needed. |
 | `base_branch` | string | no | `origin/main` | Branch reviewers compare the current code delta against. |
@@ -201,13 +202,15 @@ Inputs:
 Run examples:
 ```text
-/workflow goal objective="Implement specs/2026-03-rate-limit.md and validate the changed behavior"
-/workflow goal objective="Migrate the database layer to Drizzle" base_branch=develop
-/workflow goal objective="Finish the docs refresh" max_turns=2
+/workflow goal objective="Implement specs/2026-03-rate-limit.md, add the requested regression tests, run bun test packages/api/rate-limit.test.ts, and finish only when burst traffic returns 429 with Retry-After"
+/workflow goal objective="Update the CLI docs to describe the new --json flag, include one usage example, and verify the docs build still passes" max_turns=3
+/workflow goal objective="Fix the settings form validation bug; add/adjust the focused test and consider it done when invalid emails show the inline error without submitting"
 ```
 `goal` creates an OS-temp `goal-ledger.json` artifact, renders goal-continuation context for each worker turn, writes each worker receipt to `work-turn-N.md`, and appends receipts, reviewer decisions, blockers, reducer decisions, and lifecycle events to the ledger. The objective is treated as user-provided data, not higher-priority instructions.
+Write the `objective` like a compact acceptance spec. Say what should exist when the run is done, how you want testing handled, which command(s) or manual checks matter, and what outcome proves completion. The workflow is intentionally lean: it does not first generate an RFC or migration plan, so the developer-supplied objective is where scope, validation, and completion criteria belong.
 The worker may claim readiness, but it cannot finalize completion. Three reviewers independently inspect the ledger, worker receipt, repository state, and diff against `base_branch`; each returns structured JSON with findings, evidence, verification still remaining, and an optional blocker. A TypeScript reducer marks the goal complete only when reviewer quorum approves, marks blocked only when the same dependency/tool blocker repeats for the blocker threshold, continues when evidence is missing, and returns `needs_human` when `max_turns` is exhausted or worker execution fails.
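The reducer rules in the paragraph above can be sketched as a pure function. This is only an illustration: the diff does not show the real reducer, so the type names, the quorum size (2 of 3), the blocker threshold (2 consecutive turns), and the order in which the rules are checked are all assumptions.

```ts
// Hypothetical shapes; the actual reviewer payloads are not shown in this diff.
type ReviewerDecision = {
  approved: boolean;
  blocker?: string; // optional dependency/tool blocker reported by a reviewer
};

type TurnState = {
  turn: number; // 1-based index of the turn that just finished
  maxTurns: number; // mirrors the `max_turns` input
  workerFailed: boolean;
  reviews: ReviewerDecision[];
  priorBlockers: string[]; // blockers recorded on earlier turns, oldest first
};

type Decision = "complete" | "blocked" | "continue" | "needs_human";

const QUORUM = 2; // assumption: two of three reviewers must approve
const BLOCKER_THRESHOLD = 2; // assumption: same blocker on two consecutive turns

function reduceTurn(state: TurnState): Decision {
  // Worker execution failure always escalates to a human.
  if (state.workerFailed) return "needs_human";

  // Completion requires reviewer quorum, never the worker's own claim.
  const approvals = state.reviews.filter((r) => r.approved).length;
  if (approvals >= QUORUM) return "complete";

  // Block only when the same blocker has repeated up to the threshold.
  const current = state.reviews.map((r) => r.blocker).find((b) => b !== undefined);
  if (current !== undefined) {
    const tail = [...state.priorBlockers, current].slice(-BLOCKER_THRESHOLD);
    if (tail.length === BLOCKER_THRESHOLD && tail.every((b) => b === current)) {
      return "blocked";
    }
  }

  // Out of turns: hand the remaining work back to a human.
  if (state.turn >= state.maxTurns) return "needs_human";

  // Otherwise evidence is still missing, so run another worker turn.
  return "continue";
}
```

Keeping the decision in deterministic code rather than in a model prompt means the same ledger state always yields the same stop decision.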
 Result fields:
@@ -226,8 +229,6 @@ Result fields:
 | `remaining_work` | Remaining gaps/blockers when incomplete, or `none`. |
 | `review_report` | Markdown report containing the last structured reviewer decision payloads used by the reducer. |
-Use `goal` when you already have a clear objective or reviewed spec and want the leanest auditable implementation loop.
 ### `ralph`
 Inputs:
@@ -236,16 +237,20 @@ Inputs:
 |---|---|---|---|---|
 | `prompt` | text | yes | — | Task, feature request, issue summary, or spec path to plan, execute, refine, review, and prepare for PR. |
 | `max_loops` | number | no | `10` | Maximum plan/orchestrate/review iterations before the workflow proceeds to PR handoff without reviewer approval. |
-| `base_branch` | string | no | `origin/main` | Branch reviewers and the PR-prep stage compare the current code delta against. |
+| `base_branch` | string | no | `origin/main` | Branch reviewers and the PR-prep stage compare the current code delta against; also used to create a missing worktree. |
+| `git_worktree_dir` | string | no | `""` | Optional reusable Git worktree root. Empty runs in the invoking checkout; non-empty values run Ralph stages in the created/reused worktree. |
 Run examples:
 ```text
-/workflow ralph prompt="Implement specs/2026-03-rate-limit.md and prepare the PR"
 /workflow ralph prompt="Plan and migrate the database layer to Drizzle" max_loops=3 base_branch=develop
+/workflow ralph prompt="Refactor authentication across the API, CLI, and web UI, then prepare the PR"
+/workflow ralph prompt="Safely implement the API refactor" git_worktree_dir=../atomic-ralph-api-wt base_branch=main
 ```
-`ralph` is a heavier spec-to-PR workflow. Each iteration writes an RFC-style technical design document under `specs/`, initializes an OS-temp implementation notes file, delegates implementation through sub-agents, runs a behavior-preserving code simplifier, discovers review infrastructure, and asks two reviewers to inspect the patch against `base_branch`. The loop stops when every reviewer approves or `max_loops` is reached, then runs a pull-request preparation stage.
+Each `ralph` iteration writes an RFC-style technical design document under `specs/`, initializes an OS-temp implementation notes file, delegates implementation through sub-agents, runs a behavior-preserving code simplifier, discovers review infrastructure, and asks two reviewers to inspect the patch against `base_branch`. The loop stops when every reviewer approves or `max_loops` is reached, then runs a pull-request preparation stage.
+Set `git_worktree_dir` when you want Ralph's worker stages isolated in a reusable Git worktree. Relative paths resolve from the invoking repository root, existing same-repository worktree roots are reused, and missing paths are created from `base_branch`. Ralph preserves the invoking repo-relative cwd inside the worktree, so launching from `repo/packages/api` with `git_worktree_dir=../repo-wt` runs stages from `../repo-wt/packages/api`.
 Result fields:
@@ -260,7 +265,7 @@ Result fields:
 | `iterations_completed` | Number of plan/orchestrate/review loops completed. |
 | `review_report` | Markdown report containing the latest reviewer payloads. |
-A typical end-to-end flow is `/skill:research-codebase` → `/skill:create-spec` → `/workflow goal objective="Implement the researched rate-limit behavior and validate it"`. Use `/workflow ralph` instead when you want Atomic to generate the RFC, coordinate implementation sub-agents, iterate on review findings, and prepare a PR report in one workflow.
+A typical end-to-end flow is `/skill:research-codebase` → `/skill:create-spec` → `/workflow goal objective="Implement the researched rate-limit behavior, run the focused tests, and finish when the documented burst behavior is validated"` when you can identify the work surface, state the exact outcome, and name the validation that proves it is done. Keep using `/workflow ralph` for larger migrations, broad refactors, multi-package changes, and spec-to-PR work where you want Atomic to plan, delegate through sub-agents, simplify, review, iterate, and prepare a pull-request report.
 ### `open-claude-design`
@@ -291,11 +296,11 @@ Run a deep codebase research workflow on how the rate limiter behaves under burs
 ```
 ```text
-Use the goal workflow to implement specs/2026-03-rate-limit.md and cap it at 5 turns.
+Use the goal workflow to implement specs/2026-03-rate-limit.md, run the focused rate-limit tests, finish only when burst traffic returns 429 with Retry-After, and cap it at 5 turns.
 ```
 ```text
-Use the ralph workflow to plan, implement, review, and prepare a PR for specs/2026-03-rate-limit.md.
+Use the ralph workflow to plan a database-layer migration, implement it, review it, and prepare a PR.
 ```
 ```text
@@ -338,6 +343,8 @@ If the task is only deterministic TypeScript with no LLM/session stage, use a sc
 | User goal | Use |
 |-----------|-----|
 | Run, inspect, attach to, pause, interrupt, resume, or check status for an existing workflow | `/workflow ...` or `workflow({ action: ... })` |
+| Implement a small-to-medium scope change with an identifiable work surface, exact outcome, and named validation | `/workflow goal objective="..."` so Atomic keeps the run bounded, captures receipts in a goal ledger, gates completion through reviewers, and stops as `complete`, `blocked`, or `needs_human` |
+| Plan and execute a larger migration, broad refactor, multi-package change, or spec-to-PR effort | `/workflow ralph prompt="..."` so Atomic can plan the approach, delegate implementation through sub-agents, simplify, review, iterate, and prepare a pull-request report |
 | Create or edit reusable automation | a TypeScript workflow definition with `defineWorkflow(...).run(...).compile()` |
 | Track one-off work without saving a workflow file | direct `workflow({ task })`, `workflow({ tasks })`, or `workflow({ chain })` calls |
 | Make a workflow robust | design the stage graph, context handoffs, artifacts, validation gates, model fallbacks, and human approval points before coding |
@@ -660,7 +667,7 @@ workflow({
 })
 ```
-Direct mode supports top-level/default options and per-task options such as `context`, `forkFromSessionFile`, `model`, `fallbackModels`, `thinkingLevel`, `tools`, `noTools`, `customTools`, `mcp`, `output`, `outputMode`, `reads`, `worktree`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, and `agentDir`. Direct chains also support `chainName`, `chainDir`, and `failFast`.
+Direct mode supports top-level/default options and per-task options such as `context`, `forkFromSessionFile`, `model`, `fallbackModels`, `thinkingLevel`, `tools`, `noTools`, `customTools`, `mcp`, `output`, `outputMode`, `reads`, `worktree`, `gitWorktreeDir`, `baseBranch`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, and `agentDir`. Direct chains also support `chainName`, `chainDir`, and `failFast`.
 For large fan-outs, prefer `outputMode: "file-only"` so the parent result contains compact file references instead of full output. Treat intercom payloads from async direct runs as user-visible workflow output.
@@ -710,6 +717,7 @@ Builder basics:
 - Workflow names normalize for lookup: trim, lowercase, convert whitespace/underscore to hyphen, remove other punctuation, and collapse hyphens.
 - `.description(text)` sets the listing text.
 - `.input(key, schema)` declares typed user inputs.
+- `.worktreeFromInputs({ gitWorktreeDir, baseBranch })` optionally maps input names to workflow-wide reusable Git worktree defaults.
 - `.run(async (ctx) => { ... })` defines the workflow body.
 - `.compile()` returns the workflow definition for discovery.
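The name-normalization rule in the builder basics above can be sketched as a chain of string transforms. The function name is hypothetical, and treating everything outside `a-z`, `0-9`, and `-` as "other punctuation" is an assumption; the package's own implementation is not part of this diff.

```ts
// Hypothetical helper mirroring the documented lookup normalization steps.
function normalizeWorkflowName(name: string): string {
  return name
    .trim() // drop surrounding whitespace
    .toLowerCase()
    .replace(/[\s_]+/g, "-") // whitespace and underscores become hyphens
    .replace(/[^a-z0-9-]/g, "") // assumption: strip every other character
    .replace(/-+/g, "-"); // collapse runs of hyphens
}

console.log(normalizeWorkflowName("  Deep_Research Codebase! "));
```

Under these assumptions, `Deep_Research Codebase!` and `deep-research-codebase` resolve to the same lookup key.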
@@ -770,9 +778,28 @@ Common task/stage options include:
 - `context: "fresh" | "fork"`, `forkFromSessionFile`
 - `model`, `fallbackModels`, `thinkingLevel`, `scopedModels`, `modelRegistry`
 - `tools`, `noTools`, `customTools`, `mcp: { allow?: string[], deny?: string[] }`
-- `output`, `outputMode`, `reads`, `worktree`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, `agentDir`
+- `output`, `outputMode`, `reads`, `worktree`, `gitWorktreeDir`, `baseBranch`, `maxOutput`, `artifacts`, `sessionDir`, `cwd`, `agentDir`
 - advanced host-supplied SDK seams: `authStorage`, `resourceLoader`, `sessionManager`, `settingsManager`, `sessionStartEvent`
+`gitWorktreeDir` selects a reusable Git worktree root for `ctx.stage`, `ctx.task`, `ctx.chain`, and `ctx.parallel`. If the path is missing, Atomic creates it with `git worktree add --detach <path> <baseBranch>`; if it exists, it must be a same-repository worktree root. The default stage cwd becomes the matching cwd inside the worktree and preserves the invoking repo-relative subdirectory. Explicit `cwd` still wins; relative `cwd` values resolve from the worktree cwd, while absolute `cwd` values are used as provided. `gitWorktreeDir` is mutually exclusive with `worktree: true`: use `gitWorktreeDir` for named/reusable worktrees and `worktree: true` for temporary direct-mode worktrees that are cleaned up after the run.
+To bind user inputs to a workflow-wide worktree default, use the builder method:
+```ts
+export default defineWorkflow("safe-implementation")
+  .input("task", { type: "text", required: true })
+  .input("git_worktree_dir", { type: "string", default: "" })
+  .input("base_branch", { type: "string", default: "origin/main" })
+  .worktreeFromInputs({ gitWorktreeDir: "git_worktree_dir", baseBranch: "base_branch" })
+  .run(async (ctx) => {
+    const result = await ctx.task("implement", { task: String(ctx.inputs.task) });
+    return { result: result.text };
+  })
+  .compile();
+```
+For lower-level integrations, `@bastani/workflows` also exports `setupGitWorktree({ gitWorktreeDir, baseBranch, cwd })`, returning `{ worktreeRoot, cwd, repositoryRoot, created }` with the same validation, symlink-preserving path handling, and cwd-preservation behavior used by workflow stages.
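The cwd rules described for `gitWorktreeDir` can be illustrated with plain path arithmetic. This sketch is not `setupGitWorktree`: it runs no Git commands, performs no validation or symlink handling, and the function and option names are hypothetical. It only shows how a repo-relative subdirectory and an explicit `cwd` might be resolved, using POSIX paths for determinism.

```ts
import * as path from "node:path";

// Hypothetical options; names are illustrative, not the package's API.
type StageCwdOptions = {
  repositoryRoot: string; // absolute root of the invoking repository
  invokingCwd: string; // absolute directory the workflow was launched from
  gitWorktreeDir: string; // relative values resolve from repositoryRoot
  cwd?: string; // optional explicit stage cwd
};

function resolveStageCwd(opts: StageCwdOptions): string {
  const p = path.posix;
  const worktreeRoot = p.resolve(opts.repositoryRoot, opts.gitWorktreeDir);

  // Preserve the invoking repo-relative subdirectory inside the worktree.
  const subdir = p.relative(opts.repositoryRoot, opts.invokingCwd);
  const worktreeCwd = p.join(worktreeRoot, subdir);

  if (opts.cwd === undefined) return worktreeCwd;

  // Explicit cwd wins: absolute as provided, relative from the worktree cwd.
  return p.isAbsolute(opts.cwd) ? opts.cwd : p.resolve(worktreeCwd, opts.cwd);
}

console.log(
  resolveStageCwd({
    repositoryRoot: "/work/repo",
    invokingCwd: "/work/repo/packages/api",
    gitWorktreeDir: "../repo-wt",
  }),
);
```

The call above mirrors the documented example: launching from `repo/packages/api` with `git_worktree_dir=../repo-wt` lands in `repo-wt/packages/api`.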
 `fallbackModels` retries transient provider/model failures with the primary `model` first, then each fallback, then the current Atomic-selected model when available. It is for rate limits, quota/auth/provider outages, unavailable models, network timeouts, and 5xx errors — not workflow-code errors, tool failures, validation failures, or cancellations.
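The retry order in the `fallbackModels` paragraph above can be sketched as a small helper that builds the candidate list. The function name and model identifiers are hypothetical, and de-duplicating repeated entries is an assumption the documentation does not state.

```ts
// Hypothetical helper: primary model, then fallbacks, then the current
// Atomic-selected model when one is available.
function fallbackOrder(
  model: string,
  fallbackModels: readonly string[],
  currentModel?: string,
): string[] {
  const ordered = [model, ...fallbackModels];
  if (currentModel !== undefined) ordered.push(currentModel);

  // Assumption: a model already tried earlier is not retried later.
  return ordered.filter((name, index) => ordered.indexOf(name) === index);
}

console.log(fallbackOrder("model-a", ["model-b", "model-c"], "model-d"));
```

A caller would walk this list only on transient provider or model failures, not on workflow-code errors, tool failures, validation failures, or cancellations.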
 ## Programmatic Usage

package/examples/sdk/05-tools.ts CHANGED

@@ -1,7 +1,11 @@
 /**
  * Tools Configuration
  *
- * Use tool names to choose which built-in tools are enabled.
+ * Use tool names to choose which tools are exposed.
+ *
+ * `tools` is an allowlist. `excludeTools` removes names from the final exposed
+ * set after any allowlist is applied, which is useful for keeping the default
+ * tools while removing one tool such as ask_user_question.
  *
  * Tool names are matched against all available tools. If you use a custom `cwd`,
  * createAgentSession() applies that cwd when it builds the actual built-in tools.
@@ -28,6 +32,23 @@ const { session: customToolsSession } = await createAgentSession({
 console.log("Custom tools session created");
 customToolsSession.dispose();
+// Keep defaults but remove one tool (for example, no human-in-the-loop prompts)
+const { session: defaultsWithoutAskSession } = await createAgentSession({
+	excludeTools: ["ask_user_question"],
+	sessionManager: SessionManager.inMemory(),
+});
+console.log("Defaults minus ask_user_question session created");
+defaultsWithoutAskSession.dispose();
+// Allowlist first, then subtract exclusions
+const { session: allowlistWithExclusionSession } = await createAgentSession({
+	tools: ["read", "bash", "ask_user_question"],
+	excludeTools: ["ask_user_question"],
+	sessionManager: SessionManager.inMemory(),
+});
+console.log("Allowlist with exclusion session created");
+allowlistWithExclusionSession.dispose();
 // With custom cwd
 const customCwd = "/path/to/project";
 const { session: customCwdSession } = await createAgentSession({

package/examples/sdk/README.md CHANGED

@@ -12,7 +12,7 @@ The runtime example shows how to build a recreate function that closes over proc
 | `02-custom-model.ts` | Select model and thinking level |
 | `03-custom-prompt.ts` | Replace or modify system prompt |
 | `04-skills.ts` | Discover, filter, or replace skills |
-| `05-tools.ts` | Built-in tool allowlists |
+| `05-tools.ts` | Tool allowlists and exclusions |
 | `06-extensions.ts` | Logging, blocking, result modification |
 | `07-context-files.ts` | AGENTS.md context files |
 | `08-slash-commands.ts` | File-based slash commands |
@@ -26,7 +26,7 @@ The runtime example shows how to build a recreate function that closes over proc
 ```bash
 cd packages/coding-agent
-npx tsx examples/sdk/01-minimal.ts
+bun examples/sdk/01-minimal.ts
 ```
 ## Quick Reference
@@ -63,6 +63,9 @@ const { session } = await createAgentSession({ resourceLoader: loader, authStora
 // Read-only
 const { session } = await createAgentSession({ tools: ["read", "grep", "find", "ls"], authStorage, modelRegistry });
+// Defaults minus one tool
+const { session } = await createAgentSession({ excludeTools: ["ask_user_question"], authStorage, modelRegistry });
 // In-memory
 const { session } = await createAgentSession({
   sessionManager: SessionManager.inMemory(),
@@ -114,7 +117,8 @@ await session.prompt("Hello");
 | `agentDir` | `~/.pi/agent` | Config directory |
 | `model` | From settings/first available | Model to use |
 | `thinkingLevel` | From settings/"off" | off, low, medium, high |
-| `tools` | `["read", "bash", "edit", "write"]` built-ins | Allowlist tool names across built-in, extension, and custom tools |
+| `tools` | Default active built-ins | Allowlist tool names across built-in, extension, and custom tools |
+| `excludeTools` | `[]` | Blocklist tool names across built-in, extension, and custom tools; applied after `tools` |
 | `customTools` | `[]` | Additional tool definitions |
 | `resourceLoader` | DefaultResourceLoader | Resource loader for extensions, skills, prompts, themes |
 | `sessionManager` | `SessionManager.create(cwd)` | Persistence |
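The interaction between `tools` and `excludeTools` in the table above can be modeled as a filter over the available tool names. This is a sketch of the documented semantics only, not the SDK's implementation: `resolveExposedTools` is a hypothetical name, and the default active set used in the example is assumed for illustration.

```ts
// Hypothetical model of the documented order: allowlist first, then exclusions.
function resolveExposedTools(
  available: readonly string[],
  defaultActive: readonly string[],
  options: { tools?: readonly string[]; excludeTools?: readonly string[] } = {},
): string[] {
  // Without an allowlist, start from the default active tools.
  const requested = options.tools ?? defaultActive;
  const known = new Set(available);
  const excluded = new Set(options.excludeTools ?? []);

  // Keep known names, then subtract exclusions from the final set.
  return requested.filter((name) => known.has(name) && !excluded.has(name));
}

// Assumed tool names for illustration only.
const available = ["read", "bash", "edit", "write", "grep", "ask_user_question"];
const defaultActive = ["read", "bash", "edit", "write", "ask_user_question"];

console.log(resolveExposedTools(available, defaultActive, { excludeTools: ["ask_user_question"] }));
console.log(
  resolveExposedTools(available, defaultActive, {
    tools: ["read", "bash", "ask_user_question"],
    excludeTools: ["ask_user_question"],
  }),
);
```

Because exclusions apply last, a name that appears in both lists is never exposed, which matches the allowlist-with-exclusion case in `05-tools.ts`.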

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/atomic",
-  "version": "0.8.17-0",
+  "version": "0.8.18-0",
   "description": "Atomic coding agent CLI with read, bash, edit, write tools and session management",
   "type": "module",
   "atomicConfig": {