@bastani/atomic 0.9.0 → 0.9.1-alpha.1
- package/CHANGELOG.md +10 -0
- package/dist/builtin/cursor/CHANGELOG.md +6 -0
- package/dist/builtin/cursor/package.json +2 -2
- package/dist/builtin/intercom/CHANGELOG.md +6 -0
- package/dist/builtin/intercom/package.json +1 -1
- package/dist/builtin/mcp/CHANGELOG.md +6 -0
- package/dist/builtin/mcp/package.json +1 -1
- package/dist/builtin/subagents/CHANGELOG.md +6 -0
- package/dist/builtin/subagents/package.json +1 -1
- package/dist/builtin/web-access/CHANGELOG.md +6 -0
- package/dist/builtin/web-access/package.json +1 -1
- package/dist/builtin/workflows/CHANGELOG.md +12 -0
- package/dist/builtin/workflows/README.md +1 -1
- package/dist/builtin/workflows/builtin/goal-runner.ts +8 -5
- package/dist/builtin/workflows/builtin/prompt-refinement.ts +8 -20
- package/dist/builtin/workflows/builtin/ralph-models.ts +12 -15
- package/dist/builtin/workflows/builtin/ralph-runner.ts +1 -1
- package/dist/builtin/workflows/package.json +1 -1
- package/docs/workflows.md +2 -2
- package/package.json +2 -2
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,16 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
6
|
+
|
|
7
|
+
### Changed
|
|
8
|
+
|
|
9
|
+
- Changed the bundled `goal`/`ralph` workflow prompt-refinement stage to use a workflow-neutral, model-only rubric prompt that returns only the refined objective instead of invoking the `prompt-engineer` skill directly.
|
|
10
|
+
|
|
11
|
+
### Fixed
|
|
12
|
+
|
|
13
|
+
- Fixed the bundled `ralph` workflow reviewer-c model configuration to use Gemini 3.1 Pro as the third reviewer with Gemini 3.1 provider fallbacks, removing Gemini 3.5 Flash from that slot's fallback chain ([#1484](https://github.com/bastani-inc/atomic/issues/1484)).
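The reviewer-c entry above describes a primary model backed by an ordered fallback chain. The following is a minimal TypeScript sketch of how such a chain could be walked, not Atomic's actual resolver. `ModelConfig`, `pickModel`, and the `isAvailable` callback are hypothetical names introduced for illustration. The model strings are copied from the `reviewerCModelConfig` hunk in this diff and cover only the first entries of the chain.

```typescript
// Hypothetical shape mirroring the `model` / `fallbackModels` fields shown in ralph-models.ts.
interface ModelConfig {
  readonly model: string;
  readonly fallbackModels: readonly string[];
}

// Assumption: availability is decided by a caller-supplied check
// (credentials present, provider reachable, and so on).
function pickModel(
  config: ModelConfig,
  isAvailable: (id: string) => boolean,
): string | undefined {
  // The primary model is tried first, then each fallback in declared order.
  const chain = [config.model, ...config.fallbackModels];
  return chain.find((id) => isAvailable(id));
}

// First entries of the reviewer-c chain as they appear after this release.
const reviewerC: ModelConfig = {
  model: "github-copilot/gemini-3.1-pro-preview (1m):high",
  fallbackModels: [
    "google/gemini-3.1-pro-preview:high",
    "google-vertex/gemini-3.1-pro-preview:high",
    "zai/glm-5.2:xhigh",
    "zai-coding-cn/glm-5.2:xhigh",
  ],
};

// Example: pretend only the Vertex provider is configured.
const onlyVertex = (id: string): boolean => id.startsWith("google-vertex/");
console.log(pickModel(reviewerC, onlyVertex));
```

Because the chain is ordered, moving the Gemini 3.1 Pro entries ahead of other providers changes which model is reached first when the primary is unavailable, which is the kind of reordering the `ralph-models.ts` and `goal-runner.ts` hunks show.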
|
|
14
|
+
|
|
5
15
|
## [0.9.0] - 2026-06-22
|
|
6
16
|
|
|
7
17
|
### Breaking Changes
|
|
package/dist/builtin/cursor/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,12 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
6
|
+
|
|
7
|
+
### Changed
|
|
8
|
+
|
|
9
|
+
- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the Cursor provider package; no functional Cursor provider changes were made after 0.9.0.
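For readers unsure where a prerelease such as `0.9.1-alpha.1` sorts relative to `0.9.0`, the sketch below applies the Semantic Versioning precedence rules in TypeScript. `parseVersion` and `compareSemver` are hypothetical helpers written for this note. They are not part of Atomic, and they assume well-formed version strings.

```typescript
interface ParsedVersion {
  readonly core: readonly number[];
  readonly pre: readonly string[];
}

// Split "MAJOR.MINOR.PATCH-prerelease+build" into numeric core and prerelease identifiers.
function parseVersion(v: string): ParsedVersion {
  const withoutBuild = v.split("+")[0] ?? v; // build metadata never affects precedence
  const dash = withoutBuild.indexOf("-");
  const corePart = dash === -1 ? withoutBuild : withoutBuild.slice(0, dash);
  const prePart = dash === -1 ? "" : withoutBuild.slice(dash + 1);
  return {
    core: corePart.split(".").map((n) => Number(n)),
    pre: prePart === "" ? [] : prePart.split("."),
  };
}

// Returns -1, 0, or 1 following SemVer 2.0.0 precedence.
function compareSemver(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const d = (pa.core[i] ?? 0) - (pb.core[i] ?? 0);
    if (d !== 0) return Math.sign(d);
  }
  // A version without a prerelease tag outranks the same core with one.
  if (pa.pre.length === 0 && pb.pre.length === 0) return 0;
  if (pa.pre.length === 0) return 1;
  if (pb.pre.length === 0) return -1;
  const n = Math.min(pa.pre.length, pb.pre.length);
  for (let i = 0; i < n; i++) {
    const x = pa.pre[i] ?? "";
    const y = pb.pre[i] ?? "";
    if (x === y) continue;
    const xNumeric = /^\d+$/.test(x);
    const yNumeric = /^\d+$/.test(y);
    if (xNumeric && yNumeric) return Math.sign(Number(x) - Number(y));
    if (xNumeric) return -1; // numeric identifiers sort below alphanumeric ones
    if (yNumeric) return 1;
    return x < y ? -1 : 1;
  }
  // All shared identifiers are equal: the longer prerelease list wins.
  return Math.sign(pa.pre.length - pb.pre.length);
}

console.log(compareSemver("0.9.0", "0.9.1-alpha.1")); // -1
console.log(compareSemver("0.9.1-alpha.1", "0.9.1")); // -1
```

So `0.9.1-alpha.1` is newer than `0.9.0` but older than a future `0.9.1`. The `package.json` hunks in this diff pin `@bastani/atomic-natives` to the exact string `0.9.1-alpha.1` rather than a range.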
|
|
10
|
+
|
|
5
11
|
## [0.9.0] - 2026-06-22
|
|
6
12
|
|
|
7
13
|
### Changed
|
|
package/dist/builtin/cursor/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@bastani/cursor",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.1-alpha.1",
|
|
4
4
|
"private": true,
|
|
5
5
|
"description": "Experimental first-party Atomic extension for Cursor OAuth, model discovery, and streaming provider registration.",
|
|
6
6
|
"contributors": [
|
|
@@ -40,7 +40,7 @@
|
|
|
40
40
|
}
|
|
41
41
|
},
|
|
42
42
|
"dependencies": {
|
|
43
|
-
"@bastani/atomic-natives": "0.9.
|
|
43
|
+
"@bastani/atomic-natives": "0.9.1-alpha.1",
|
|
44
44
|
"@bufbuild/protobuf": "^2.0.0"
|
|
45
45
|
}
|
|
46
46
|
}
|
|
package/dist/builtin/intercom/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,12 @@ All notable changes to the `pi-intercom` extension will be documented in this fi
|
|
|
4
4
|
|
|
5
5
|
## [Unreleased]
|
|
6
6
|
|
|
7
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
8
|
+
|
|
9
|
+
### Changed
|
|
10
|
+
|
|
11
|
+
- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the intercom extension; no functional intercom changes were made after 0.9.0.
|
|
12
|
+
|
|
7
13
|
## [0.9.0] - 2026-06-22
|
|
8
14
|
|
|
9
15
|
### Changed
|
|
package/dist/builtin/intercom/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@bastani/intercom",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.1-alpha.1",
|
|
4
4
|
"private": true,
|
|
5
5
|
"description": "Atomic extension providing a private coordination channel between parent and child agent sessions. Fork of: https://github.com/nicobailon/pi-intercom",
|
|
6
6
|
"contributors": [
|
|
package/dist/builtin/mcp/CHANGELOG.md
CHANGED
|
@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
7
7
|
|
|
8
8
|
## [Unreleased]
|
|
9
9
|
|
|
10
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
11
|
+
|
|
12
|
+
### Changed
|
|
13
|
+
|
|
14
|
+
- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the MCP extension; no functional MCP changes were made after 0.9.0.
|
|
15
|
+
|
|
10
16
|
## [0.9.0] - 2026-06-22
|
|
11
17
|
|
|
12
18
|
### Fixed
|
|
package/dist/builtin/mcp/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@bastani/mcp",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.1-alpha.1",
|
|
4
4
|
"private": true,
|
|
5
5
|
"description": "Atomic extension that adapts MCP (Model Context Protocol) servers into the coding agent. Fork of: https://github.com/nicobailon/pi-mcp-adapter",
|
|
6
6
|
"contributors": [
|
|
package/dist/builtin/subagents/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,12 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
6
|
+
|
|
7
|
+
### Changed
|
|
8
|
+
|
|
9
|
+
- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the subagents extension; no functional subagents changes were made after 0.9.0.
|
|
10
|
+
|
|
5
11
|
## [0.9.0] - 2026-06-22
|
|
6
12
|
|
|
7
13
|
### Added
|
|
package/dist/builtin/subagents/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@bastani/subagents",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.1-alpha.1",
|
|
4
4
|
"private": true,
|
|
5
5
|
"description": "Atomic extension for delegating tasks to subagents with chains, parallel execution, and TUI clarification. Fork of: https://github.com/nicobailon/pi-subagents",
|
|
6
6
|
"contributors": [
|
|
package/dist/builtin/web-access/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.
|
|
|
4
4
|
|
|
5
5
|
## [Unreleased]
|
|
6
6
|
|
|
7
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
8
|
+
|
|
9
|
+
### Changed
|
|
10
|
+
|
|
11
|
+
- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the web-access extension; no functional web-access changes were made after 0.9.0.
|
|
12
|
+
|
|
7
13
|
## [0.9.0] - 2026-06-22
|
|
8
14
|
|
|
9
15
|
### Changed
|
|
package/dist/builtin/web-access/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@bastani/web-access",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.1-alpha.1",
|
|
4
4
|
"private": true,
|
|
5
5
|
"description": "Atomic extension for web search, URL fetching, GitHub repo cloning, PDF/video extraction. Fork of: https://github.com/nicobailon/pi-web-access",
|
|
6
6
|
"contributors": [
|
|
package/dist/builtin/workflows/CHANGELOG.md
CHANGED
|
@@ -6,6 +6,16 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
|
6
6
|
|
|
7
7
|
## [Unreleased]
|
|
8
8
|
|
|
9
|
+
## [0.9.1-alpha.1] - 2026-06-22
|
|
10
|
+
|
|
11
|
+
### Changed
|
|
12
|
+
|
|
13
|
+
- Changed the shared `goal`/`ralph` prompt-refinement stage to use a workflow-neutral, model-only rubric prompt that returns only the refined objective instead of invoking the `prompt-engineer` skill directly.
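The entry above describes a model-only prompt built from a rubric and returning only the refined objective. A minimal TypeScript sketch of that assembly, loosely following the `renderPromptRefinementPrompt` hunk in this diff, is shown below. `renderRefinementPromptSketch` is a hypothetical name, the two rubric lines are shortened paraphrases of the real `PROMPT_REFINEMENT_CRITERIA` entries, and joining the sections with a blank line is an assumption, since the diff does not show how the real sections are joined.

```typescript
// Shortened, paraphrased rubric lines; the real PROMPT_REFINEMENT_CRITERIA array is longer.
const rubricSketch: readonly string[] = [
  "Objective: state what should be true when the work is complete.",
  "Scope: state what is allowed to change (the smallest correct change).",
];

// Assemble the tagged sections into one prompt string.
function renderRefinementPromptSketch(
  request: string,
  rubric: readonly string[],
): string {
  const sections: readonly string[] = [
    "Refine the following user request into a clear and verifiable objective.",
    `<original_request>\n${request}\n</original_request>`,
    // Join explicitly so each rubric line lands on its own line.
    `<instructions>\n${rubric.join("\n")}\n</instructions>`,
    "<output_format>\nReturn ONLY the refined request.\n</output_format>",
  ];
  // Assumption: sections are separated by a blank line.
  return sections.join("\n\n");
}

console.log(renderRefinementPromptSketch("Fix the login bug", rubricSketch));
```

The explicit `join("\n")` matters in a sketch like this because interpolating an array directly into a template literal stringifies it with commas between elements.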
|
|
14
|
+
|
|
15
|
+
### Fixed
|
|
16
|
+
|
|
17
|
+
- Fixed the builtin `ralph` reviewer-c model configuration to use Gemini 3.1 Pro as the third reviewer with Gemini 3.1 provider fallbacks, removing Gemini 3.5 Flash from that slot's fallback chain ([#1484](https://github.com/bastani-inc/atomic/issues/1484)).
|
|
18
|
+
|
|
9
19
|
## [0.9.0] - 2026-06-22
|
|
10
20
|
|
|
11
21
|
### Breaking Changes
|
|
@@ -89,6 +99,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
|
89
99
|
|
|
90
100
|
### Changed
|
|
91
101
|
|
|
102
|
+
- Changed the shared `goal`/`ralph` prompt-refinement stage to use a workflow-neutral, model-only rubric prompt that returns only the refined objective instead of invoking the `prompt-engineer` skill directly.
|
|
92
103
|
- Changed the builtin `ralph`, `goal`, and `open-claude-design` workflows and the shared end-to-end verification guidance to drive browsers through the `playwright-cli` skill and `playwright-cli` command instead of the removed `browser` skill / `browse` CLI. Ralph/goal subagents now verify web and full-stack flows with `skill: "playwright-cli"`, and `open-claude-design`'s deterministic setup step now ensures `playwright-cli` (`npm install -g @playwright/cli@latest`) instead of `browse`, with every preview/review stage prompt updated to `playwright-cli open`/`snapshot`/`screenshot --filename`/`resize`/`show --annotate`.
|
|
93
104
|
- Changed the builtin `ralph` workflow review fan-out from two reviewers to three independent reviewers, each running on a different primary model family (Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro) with shared fallbacks, so the adversarial review gets cross-model coverage instead of repeated passes from one model. The review loop stops only when all three reviewers independently approve (find no issues), so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. Also strengthened the orchestrator's implementation-notes contract to require verifiable evidence for any claims recorded in the notes and reviewer artifacts.
|
|
94
105
|
- Changed the builtin `deep-research-codebase`, `goal`, `ralph`, and `open-claude-design` workflows to run their GitHub Copilot `claude-opus-4.8` fallbacks at the model's largest advertised long-context (~1M/936K) window via the new `(1m)` token, automatically degrading to the 200K short window when Copilot's long-context tier is unavailable. Other models in each fallback chain are unaffected.
|
|
@@ -97,6 +108,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
|
97
108
|
|
|
98
109
|
### Fixed
|
|
99
110
|
|
|
111
|
+
- Fixed the builtin `ralph` reviewer-c model configuration to use Gemini 3.1 Pro as the third reviewer and remove Gemini 3.5 Flash from that slot's fallback chain ([#1484](https://github.com/bastani-inc/atomic/issues/1484)).
|
|
100
112
|
- Fixed workflow stage transcripts ignoring the host's resolved non-default session directory in headless runs. Stages without an explicit `sessionDir` now inherit the active main-session directory when it comes from `--session-dir`, `ATOMIC_CODING_AGENT_SESSION_DIR`, or settings; explicit per-stage `sessionDir` still wins, default host sessions keep writing stages to the global store, and forked stages inherit the non-default directory too ([#1444](https://github.com/bastani-inc/atomic/issues/1444)).
|
|
101
113
|
- Fixed a manual workflow pause/resume not updating the main-chat run status the way the `workflow` tool and `/workflow pause`/`/workflow resume` do. Pausing a stage from the attached stage chat (Escape) or any direct live-handle path recorded only the **stage** as paused (`recordStagePaused`) and never the **run** (`recordRunPaused`), so the below-editor status widget and `/workflow status` kept showing the run as `running` (`●`) even though work was paused; resume had the symmetric gap. The executor stage-control handle now records run-level pause/resume itself — marking the run paused once no stage is still actively running (mirroring `pauseRun`'s all-active-stages-paused rule) and restoring it on resume — so manual and tool-driven pause/resume update the main chat identically. Both run-level transitions are idempotent, so the tool/slash path and cascade re-entry stay safe.
|
|
102
114
|
- Fixed the builtin `ralph` workflow review loop iterating until `max_loops` even when reviewers judged the patch correct. The unanimous-approval gate required a literally empty `findings` array, so a single low-priority **P3** nit — or a placeholder/dummy finding a reviewer appended because it wrongly believed an empty array would fail schema validation — kept the loop spinning despite every reviewer reporting `overall_correctness: "patch is correct"`. Approval is now **severity-aware and deterministic**: a reviewer approves when it judged the patch correct, reported no `reviewer_error`, and filed no *blocking* finding, where blocking = **P0/P1/P2** (priority 0/1/2) and **P3** (priority 3) is a non-blocking nice-to-have; a finding without a determinable priority (`null`/`undefined`) is treated as blocking so ambiguity never silently approves. The decision is computed from finding priorities rather than the reviewer's self-reported `stop_review_loop` flag. Extracted the gate into `builtin/ralph-review-gate.ts` (`reviewDecisionApproved`, `isBlockingFinding`) with unit coverage, and updated the reviewer prompt so an empty `findings` array is explicitly valid and placeholder findings are never fabricated ([#1407](https://github.com/bastani-inc/atomic/issues/1407)).
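The severity-aware approval rule described in the entry above can be sketched in TypeScript as follows. The function names `isBlockingFinding` and `reviewDecisionApproved` and the field names come from the changelog text, but the type shapes and function bodies here are assumptions made for illustration. The real `builtin/ralph-review-gate.ts` is not shown in this diff and may differ.

```typescript
// Hypothetical shapes; only the field names are taken from the changelog entry.
interface Finding {
  readonly title: string;
  readonly priority?: number | null;
}

interface ReviewDecision {
  readonly overall_correctness: string;
  readonly reviewer_error?: string | null;
  readonly findings: readonly Finding[];
  readonly stop_review_loop?: boolean; // deliberately ignored by the gate
}

// P0/P1/P2 block; P3 is a non-blocking nice-to-have.
// A finding without a determinable priority is treated as blocking.
function isBlockingFinding(finding: Finding): boolean {
  const p = finding.priority;
  if (typeof p !== "number" || !Number.isFinite(p)) return true;
  return p <= 2;
}

// A reviewer approves when it judged the patch correct, reported no error,
// and filed no blocking finding.
function reviewDecisionApproved(decision: ReviewDecision): boolean {
  const judgedCorrect = decision.overall_correctness === "patch is correct";
  const hasError =
    decision.reviewer_error !== undefined &&
    decision.reviewer_error !== null &&
    decision.reviewer_error !== "";
  return judgedCorrect && !hasError && !decision.findings.some(isBlockingFinding);
}

// Unanimous approval: every reviewer must approve independently.
function allReviewersApproved(decisions: readonly ReviewDecision[]): boolean {
  return decisions.length > 0 && decisions.every(reviewDecisionApproved);
}

const nitOnly: ReviewDecision = {
  overall_correctness: "patch is correct",
  findings: [{ title: "Rename a local variable", priority: 3 }],
};
console.log(reviewDecisionApproved(nitOnly)); // true
```

Computing the decision from finding priorities, rather than from the self-reported `stop_review_loop` flag, is what lets a P3-only review count as approval while an unprioritized finding still blocks. `allReviewersApproved` is an extra hypothetical helper showing how the per-reviewer result could feed the unanimous gate.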
|
|
package/dist/builtin/workflows/README.md
CHANGED
|
@@ -678,7 +678,7 @@ Child workflow outputs: `result`, `status`, `approved`, `goal_id`, `objective`,
|
|
|
678
678
|
|
|
679
679
|
### `ralph`
|
|
680
680
|
|
|
681
|
-
Prompt-
|
|
681
|
+
Prompt-refinement → prompt-engineering research → orchestrate → review workflow with optional final-stage PR handoff: first sharpen the raw prompt into a clearer objective, then transform it into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers across Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro model families, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video, references it in the implementation notes, and exposes it as the `qa_video_path` output; when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
|
|
682
682
|
|
|
683
683
|
```text
|
|
684
684
|
/workflow ralph prompt="Migrate the database layer to Drizzle ORM" max_loops=3 base_branch=develop
|
|
package/dist/builtin/workflows/builtin/goal-runner.ts
CHANGED
|
@@ -86,7 +86,7 @@ export async function runGoalWorkflow(ctx: GoalRunnerContext, options: GoalWorkf
|
|
|
86
86
|
if (!rawObjective) {
|
|
87
87
|
throw new Error("goal requires an objective input.");
|
|
88
88
|
}
|
|
89
|
-
const objective = await runPromptRefinementStage(ctx, { request: rawObjective,
|
|
89
|
+
const objective = await runPromptRefinementStage(ctx, { request: rawObjective, modelConfig: promptEngineerModelConfig });
|
|
90
90
|
|
|
91
91
|
const maxTurns = positiveInteger(inputs.max_turns, DEFAULT_MAX_TURNS);
|
|
92
92
|
const reviewQuorum = DEFAULT_REVIEW_QUORUM;
|
|
@@ -103,9 +103,12 @@ export async function runGoalWorkflow(ctx: GoalRunnerContext, options: GoalWorkf
|
|
|
103
103
|
"anthropic/claude-opus-4-8:medium",
|
|
104
104
|
"zai/glm-5.2:medium",
|
|
105
105
|
"zai-coding-cn/glm-5.2:medium",
|
|
106
|
+
"github-copilot/gemini-3.1-pro-preview (1m):medium",
|
|
107
|
+
"google/gemini-3.1-pro-preview:medium",
|
|
108
|
+
"google-vertex/gemini-3.1-pro-preview:medium",
|
|
106
109
|
"github-copilot/gemini-3.5-flash (1m):medium",
|
|
107
110
|
"google/gemini-3.5-flash:medium",
|
|
108
|
-
"google-vertex/gemini-3.5-flash:medium"
|
|
111
|
+
"google-vertex/gemini-3.5-flash:medium",
|
|
109
112
|
],
|
|
110
113
|
tools: goalRunnerTools,
|
|
111
114
|
};
|
|
@@ -120,12 +123,12 @@ export async function runGoalWorkflow(ctx: GoalRunnerContext, options: GoalWorkf
|
|
|
120
123
|
"anthropic/claude-opus-4-8:xhigh",
|
|
121
124
|
"zai/glm-5.2:xhigh",
|
|
122
125
|
"zai-coding-cn/glm-5.2:xhigh",
|
|
123
|
-
"github-copilot/gemini-3.5-flash (1m):high",
|
|
124
|
-
"google/gemini-3.5-flash:high",
|
|
125
|
-
"google-vertex/gemini-3.5-flash:high",
|
|
126
126
|
"github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
127
127
|
"google/gemini-3.1-pro-preview:high",
|
|
128
128
|
"google-vertex/gemini-3.1-pro-preview:high",
|
|
129
|
+
"github-copilot/gemini-3.5-flash (1m):high",
|
|
130
|
+
"google/gemini-3.5-flash:high",
|
|
131
|
+
"google-vertex/gemini-3.5-flash:high",
|
|
129
132
|
],
|
|
130
133
|
tools: goalRunnerTools,
|
|
131
134
|
schema: reviewDecisionSchema,
|
|
package/dist/builtin/workflows/builtin/prompt-refinement.ts
CHANGED
|
@@ -2,12 +2,11 @@
|
|
|
2
2
|
* Shared prompt-refinement stage used by the ralph and goal workflows.
|
|
3
3
|
*
|
|
4
4
|
* Before the main work loop begins, both workflows run this single
|
|
5
|
-
* `prompt-refinement` stage.
|
|
6
|
-
*
|
|
7
|
-
*
|
|
8
|
-
*
|
|
9
|
-
*
|
|
10
|
-
* preserved by each workflow for reporting.
|
|
5
|
+
* `prompt-refinement` stage. The stage uses the Workflow Best Practices prompt
|
|
6
|
+
* anatomy documented in `packages/coding-agent/docs/workflows.md` to sharpen the
|
|
7
|
+
* raw user request into a clearer, more actionable objective. The refined
|
|
8
|
+
* request replaces the original as the operative objective downstream; the
|
|
9
|
+
* original is preserved by each workflow for reporting.
|
|
11
10
|
*/
|
|
12
11
|
|
|
13
12
|
import type { WorkflowModelValue, WorkflowTaskOptions, WorkflowTaskResult } from "../src/shared/types.js";
|
|
@@ -21,7 +20,7 @@ export type PromptSection = readonly [tag: string, content: string];
|
|
|
21
20
|
* inferred from the raw request.
|
|
22
21
|
*/
|
|
23
22
|
export const PROMPT_REFINEMENT_CRITERIA = [
|
|
24
|
-
"Apply the workflow best practices documented in the `## Workflow Best Practices` section of `docs/workflows.md
|
|
23
|
+
"Apply the workflow best practices documented in the `## Workflow Best Practices` section of `docs/workflows.md` to transform the raw request into a clear and verifiable objective. Treat that section as the authoritative prompt-anatomy rubric: use its Objective, Context, Scope, Non-goals, Done criteria, Validation command, Reporting requirements, and Stop conditions when refining the request.",
|
|
25
24
|
"Objective — state what should be true when the work is complete.",
|
|
26
25
|
"Context — note why it matters and where the relevant code or area likely lives.",
|
|
27
26
|
"Scope — state what is allowed to change (the smallest correct change).",
|
|
@@ -39,21 +38,12 @@ export const PROMPT_REFINEMENT_CRITERIA = [
|
|
|
39
38
|
*/
|
|
40
39
|
export function renderPromptRefinementPrompt(args: {
|
|
41
40
|
readonly request: string;
|
|
42
|
-
readonly workflowLabel: string;
|
|
43
41
|
readonly workflowCwdContext?: PromptSection;
|
|
44
42
|
}): string {
|
|
45
43
|
const sections: readonly string[] = [
|
|
46
|
-
|
|
44
|
+
`Refine the following user request into a clear and verifiable objective. Improve clarity and completeness using the rubric below without changing the user's intent, expanding scope, or inventing requirements that cannot be reasonably inferred from the request.`,
|
|
47
45
|
`<original_request>\n${args.request}\n</original_request>`,
|
|
48
|
-
`<
|
|
49
|
-
[
|
|
50
|
-
"<refinement_rules>",
|
|
51
|
-
"- Preserve the user's original intent and scope; do not add unrelated work.",
|
|
52
|
-
"- If the original request is already clear and complete, return it essentially unchanged with only clarity improvements.",
|
|
53
|
-
"- Where a criterion cannot be reasonably inferred, state it as a concise assumption or a 'to confirm' note rather than fabricating specifics.",
|
|
54
|
-
"- Do not implement anything, run commands, or edit files. This stage only produces the refined request text.",
|
|
55
|
-
"</refinement_rules>",
|
|
56
|
-
].join("\n"),
|
|
46
|
+
`<instructions>\n${PROMPT_REFINEMENT_CRITERIA}\n</instructions>`,
|
|
57
47
|
`<output_format>\nReturn ONLY the refined request. No preamble, no explanation, and no Markdown fences. The returned text replaces the original request as the operative objective for the rest of the workflow, so it must be a single self-contained request.\n</output_format>`,
|
|
58
48
|
];
|
|
59
49
|
const tail = args.workflowCwdContext === undefined
|
|
@@ -84,7 +74,6 @@ export async function runPromptRefinementStage(
|
|
|
84
74
|
ctx: PromptRefinementContext,
|
|
85
75
|
options: {
|
|
86
76
|
readonly request: string;
|
|
87
|
-
readonly workflowLabel: string;
|
|
88
77
|
readonly workflowCwdContext?: PromptSection;
|
|
89
78
|
readonly modelConfig: PromptRefinementModelConfig;
|
|
90
79
|
},
|
|
@@ -92,7 +81,6 @@ export async function runPromptRefinementStage(
|
|
|
92
81
|
const result = await ctx.task("prompt-refinement", {
|
|
93
82
|
prompt: renderPromptRefinementPrompt({
|
|
94
83
|
request: options.request,
|
|
95
|
-
workflowLabel: options.workflowLabel,
|
|
96
84
|
...(options.workflowCwdContext === undefined ? {} : { workflowCwdContext: options.workflowCwdContext }),
|
|
97
85
|
}),
|
|
98
86
|
...options.modelConfig,
|
|
package/dist/builtin/workflows/builtin/ralph-models.ts
CHANGED
|
@@ -48,12 +48,12 @@ export const orchestratorModelConfig = {
|
|
|
48
48
|
"anthropic/claude-opus-4-8:medium",
|
|
49
49
|
"zai/glm-5.2:medium",
|
|
50
50
|
"zai-coding-cn/glm-5.2:medium",
|
|
51
|
-
"github-copilot/gemini-3.5-flash (1m):medium",
|
|
52
|
-
"google/gemini-3.5-flash:medium",
|
|
53
|
-
"google-vertex/gemini-3.5-flash:medium",
|
|
54
51
|
"github-copilot/gemini-3.1-pro-preview (1m):medium",
|
|
55
52
|
"google/gemini-3.1-pro-preview:medium",
|
|
56
53
|
"google-vertex/gemini-3.1-pro-preview:medium",
|
|
54
|
+
"github-copilot/gemini-3.5-flash (1m):medium",
|
|
55
|
+
"google/gemini-3.5-flash:medium",
|
|
56
|
+
"google-vertex/gemini-3.5-flash:medium",
|
|
57
57
|
],
|
|
58
58
|
excludedTools: ["ask_user_question"],
|
|
59
59
|
};
|
|
@@ -68,12 +68,12 @@ export const reviewerAModelConfig = {
|
|
|
68
68
|
"openai/gpt-5.5:xhigh",
|
|
69
69
|
"zai/glm-5.2:xhigh",
|
|
70
70
|
"zai-coding-cn/glm-5.2:xhigh",
|
|
71
|
+
"github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
72
|
+
"google/gemini-3.1-pro-preview:high",
|
|
73
|
+
"google-vertex/gemini-3.1-pro-preview:high",
|
|
71
74
|
"github-copilot/gemini-3.5-flash (1m):high",
|
|
72
75
|
"google/gemini-3.5-flash:high",
|
|
73
76
|
"google-vertex/gemini-3.5-flash:high",
|
|
74
|
-
"github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
75
|
-
"google/gemini-3.1-pro-preview:high",
|
|
76
|
-
"google-vertex/gemini-3.1-pro-preview:high"
|
|
77
77
|
],
|
|
78
78
|
excludedTools: ["ask_user_question"],
|
|
79
79
|
schema: reviewDecisionSchema,
|
|
@@ -89,27 +89,24 @@ export const reviewerBModelConfig = {
|
|
|
89
89
|
"anthropic/claude-opus-4-8:xhigh",
|
|
90
90
|
"zai/glm-5.2:xhigh",
|
|
91
91
|
"zai-coding-cn/glm-5.2:xhigh",
|
|
92
|
+
"github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
93
|
+
"google/gemini-3.1-pro-preview:high",
|
|
94
|
+
"google-vertex/gemini-3.1-pro-preview:high",
|
|
92
95
|
"github-copilot/gemini-3.5-flash (1m):high",
|
|
93
96
|
"google/gemini-3.5-flash:high",
|
|
94
97
|
"google-vertex/gemini-3.5-flash:high",
|
|
95
|
-
"github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
96
|
-
"google/gemini-3.1-pro-preview:high",
|
|
97
|
-
"google-vertex/gemini-3.1-pro-preview:high"
|
|
98
98
|
],
|
|
99
99
|
excludedTools: ["ask_user_question"],
|
|
100
100
|
schema: reviewDecisionSchema,
|
|
101
101
|
};
|
|
102
102
|
|
|
103
103
|
export const reviewerCModelConfig = {
|
|
104
|
-
model: "
|
|
104
|
+
model: "github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
105
105
|
fallbackModels: [
|
|
106
|
-
"zai-coding-cn/glm-5.2:xhigh",
|
|
107
|
-
"github-copilot/gemini-3.5-flash (1m):high",
|
|
108
|
-
"google/gemini-3.5-flash:high",
|
|
109
|
-
"google-vertex/gemini-3.5-flash:high",
|
|
110
|
-
"github-copilot/gemini-3.1-pro-preview (1m):high",
|
|
111
106
|
"google/gemini-3.1-pro-preview:high",
|
|
112
107
|
"google-vertex/gemini-3.1-pro-preview:high",
|
|
108
|
+
"zai/glm-5.2:xhigh",
|
|
109
|
+
"zai-coding-cn/glm-5.2:xhigh",
|
|
113
110
|
"openai-codex/gpt-5.5:xhigh",
|
|
114
111
|
"github-copilot/gpt-5.5:xhigh",
|
|
115
112
|
"openai/gpt-5.5:xhigh",
|
|
package/dist/builtin/workflows/builtin/ralph-runner.ts
CHANGED
|
@@ -49,7 +49,7 @@ export async function runRalphWorkflow(
|
|
|
49
49
|
let finalResult = "";
|
|
50
50
|
let finalPrReport: string | undefined;
|
|
51
51
|
const workflowCwdContext = workflowCwdContextSection(workflowStartCwd);
|
|
52
|
-
const refinedPrompt = await runPromptRefinementStage(ctx, { request: prompt,
|
|
52
|
+
const refinedPrompt = await runPromptRefinementStage(ctx, { request: prompt, workflowCwdContext, modelConfig: promptEngineerModelConfig });
|
|
53
53
|
const workflowResearchPath = resolve(workflowStartCwd, defaultResearchPath(refinedPrompt));
|
|
54
54
|
const implementationNotesPath = await createImplementationNotesFile(refinedPrompt);
|
|
55
55
|
const qaVideoPath = await createQaEvidenceVideoPath();
|
package/docs/workflows.md
CHANGED
|
@@ -228,7 +228,7 @@ Run examples:
|
|
|
228
228
|
/workflow goal objective="Implement the focused docs fix, run the docs validation command, and open a PR when complete" create_pr=true
|
|
229
229
|
```
|
|
230
230
|
|
|
231
|
-
`goal` starts with a single `prompt-refinement` stage that
|
|
231
|
+
`goal` starts with a single model-only `prompt-refinement` stage that sharpens the raw objective into a clearer, more actionable form using the Workflow Best Practices prompt anatomy documented later in this guide; the refined objective becomes the operative one recorded in the ledger (the original is preserved as `original_objective` and shown in the final report when it differs). `goal` then creates an OS-temp `goal-ledger.json` artifact, renders goal-continuation context for each worker turn, writes each worker receipt to `work-turn-N.md`, and appends receipts, reviewer decisions, blockers, reducer decisions, and lifecycle events to the ledger. The objective is treated as user-provided data, not higher-priority instructions. By default `goal` does not start the final `pull-request` stage, and `pr_report` is omitted. Prompt text alone does not opt in. Pass `create_pr=true` only when you explicitly want the final stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation, such as GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling, after Goal reaches `complete` within `max_turns`. Goal worker and reviewer prompts explicitly tell intermediate stages to ignore PR-creation requests; only the final `pull-request` stage may attempt that handoff.
|
|
232
232
|
|
|
233
233
|
Write the `objective` like a compact acceptance spec. Say what should exist when the run is done, how you want testing handled, which command(s) or manual checks matter, and what outcome proves completion. The workflow is intentionally lean: it does not first generate an RFC or migration plan, so the developer-supplied objective is where scope, validation, and completion criteria belong.
|
|
234
234
|
|
|
@@ -273,7 +273,7 @@ Run examples:
|
|
|
273
273
|
/workflow ralph prompt="Safely implement the API refactor" git_worktree_dir=../atomic-ralph-api-wt base_branch=main
|
|
274
274
|
```
|
|
275
275
|
|
|
276
|
-
Each `ralph` run starts with a single `prompt-refinement` stage that
|
|
276
|
+
Each `ralph` run starts with a single model-only `prompt-refinement` stage that sharpens the raw user prompt into a clearer, more actionable objective using the Workflow Best Practices prompt anatomy documented later in this guide; that refined prompt becomes the operative objective for research, orchestration, and review, while the original is surfaced as `original_prompt`. Each iteration then transforms the refined prompt with `/skill:prompt-engineer Transform the following refined user request into a codebase and online research question which can be thoroughly explored: ...` (`research-prompt-refinement`), researches that transformed question with `/skill:research-codebase ...`, and writes the findings under `research/`. The orchestrator treats that research artifact as its primary implementation context, initializes/updates an OS-temp implementation notes file while generating verifiable evidence for any claims it records in the notes and reviewer artifacts, delegates implementation through sub-agents, and asks three independent reviewers to inspect the patch directly against `base_branch`. The reviewer fan-out runs reviewers on different primary model families (Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro, with shared fallbacks) so the adversarial review gets cross-model coverage instead of three passes from one model. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical, using `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video (referenced in the implementation notes and surfaced as `qa_video_path`); when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. 
If reviewers find issues, the next `research-prompt-refinement` and research stages receive the review artifact path so follow-up research can address unresolved findings, and research stages fork from prior research session data when available. The loop stops only when all three reviewers independently approve (each finds no issues) or `max_loops` is reached, so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. By default Ralph does not start the final `pull-request` stage, and `pr_report` is omitted. Prompt text alone does not opt in. Pass `create_pr=true` only when you explicitly want the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation, such as GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling; Ralph's own PR-creation instructions live in that final stage.
|
|
277
277
|
|
|
278
278
|
Set `git_worktree_dir` when you want Ralph's worker stages isolated in a reusable Git worktree. Relative paths resolve from the invoking repository root, existing same-repository worktree roots are reused, and missing paths are created from `base_branch`. Ralph preserves the invoking repo-relative cwd inside the worktree, so launching from `repo/packages/api` with `git_worktree_dir=../repo-wt` runs stages from `../repo-wt/packages/api`.
|
|
279
279
|
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@bastani/atomic",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.1-alpha.1",
|
|
4
4
|
"description": "Atomic coding agent CLI with read, bash, edit, write tools and session management",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"atomicConfig": {
|
|
@@ -68,7 +68,7 @@
|
|
|
68
68
|
"prepublishOnly": "bun run clean && bun run build"
|
|
69
69
|
},
|
|
70
70
|
"dependencies": {
|
|
71
|
-
"@bastani/atomic-natives": "0.9.
|
|
71
|
+
"@bastani/atomic-natives": "0.9.1-alpha.1",
|
|
72
72
|
"@bufbuild/protobuf": "^2.0.0",
|
|
73
73
|
"@earendil-works/pi-agent-core": "^0.79.9",
|
|
74
74
|
"@earendil-works/pi-ai": "^0.79.9",
|