@bastani/atomic (npm) 0.9.0 → 0.9.1-alpha.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

package/CHANGELOG.md +10 -0
package/dist/builtin/cursor/CHANGELOG.md +6 -0
package/dist/builtin/cursor/package.json +2 -2
package/dist/builtin/intercom/CHANGELOG.md +6 -0
package/dist/builtin/intercom/package.json +1 -1
package/dist/builtin/mcp/CHANGELOG.md +6 -0
package/dist/builtin/mcp/package.json +1 -1
package/dist/builtin/subagents/CHANGELOG.md +6 -0
package/dist/builtin/subagents/package.json +1 -1
package/dist/builtin/web-access/CHANGELOG.md +6 -0
package/dist/builtin/web-access/package.json +1 -1
package/dist/builtin/workflows/CHANGELOG.md +12 -0
package/dist/builtin/workflows/README.md +1 -1
package/dist/builtin/workflows/builtin/goal-runner.ts +8 -5
package/dist/builtin/workflows/builtin/prompt-refinement.ts +8 -20
package/dist/builtin/workflows/builtin/ralph-models.ts +12 -15
package/dist/builtin/workflows/builtin/ralph-runner.ts +1 -1
package/dist/builtin/workflows/package.json +1 -1
package/docs/workflows.md +2 -2
package/package.json +2 -2

package/CHANGELOG.md CHANGED

@@ -2,6 +2,16 @@
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Changed the bundled `goal`/`ralph` workflow prompt-refinement stage to use a workflow-neutral, model-only rubric prompt that returns only the refined objective instead of invoking the `prompt-engineer` skill directly.
+### Fixed
+- Fixed the bundled `ralph` workflow reviewer-c model configuration to use Gemini 3.1 Pro as the third reviewer with Gemini 3.1 provider fallbacks, removing Gemini 3.5 Flash from that slot's fallback chain ([#1484](https://github.com/bastani-inc/atomic/issues/1484)).
 ## [0.9.0] - 2026-06-22
 ### Breaking Changes

package/dist/builtin/cursor/CHANGELOG.md CHANGED

@@ -2,6 +2,12 @@
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the Cursor provider package; no functional Cursor provider changes were made after 0.9.0.
 ## [0.9.0] - 2026-06-22
 ### Changed

package/dist/builtin/cursor/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/cursor",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "private": true,
   "description": "Experimental first-party Atomic extension for Cursor OAuth, model discovery, and streaming provider registration.",
   "contributors": [
@@ -40,7 +40,7 @@
     }
   },
   "dependencies": {
-    "@bastani/atomic-natives": "0.9.0",
+    "@bastani/atomic-natives": "0.9.1-alpha.1",
     "@bufbuild/protobuf": "^2.0.0"
   }
 }

package/dist/builtin/intercom/CHANGELOG.md CHANGED

@@ -4,6 +4,12 @@ All notable changes to the `pi-intercom` extension will be documented in this fi
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the intercom extension; no functional intercom changes were made after 0.9.0.
 ## [0.9.0] - 2026-06-22
 ### Changed

package/dist/builtin/intercom/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/intercom",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "private": true,
   "description": "Atomic extension providing a private coordination channel between parent and child agent sessions. Fork of: https://github.com/nicobailon/pi-intercom",
   "contributors": [

package/dist/builtin/mcp/CHANGELOG.md CHANGED

@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the MCP extension; no functional MCP changes were made after 0.9.0.
 ## [0.9.0] - 2026-06-22
 ### Fixed

package/dist/builtin/mcp/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/mcp",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "private": true,
   "description": "Atomic extension that adapts MCP (Model Context Protocol) servers into the coding agent. Fork of: https://github.com/nicobailon/pi-mcp-adapter",
   "contributors": [

package/dist/builtin/subagents/CHANGELOG.md CHANGED

@@ -2,6 +2,12 @@
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the subagents extension; no functional subagents changes were made after 0.9.0.
 ## [0.9.0] - 2026-06-22
 ### Added

package/dist/builtin/subagents/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/subagents",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "private": true,
   "description": "Atomic extension for delegating tasks to subagents with chains, parallel execution, and TUI clarification. Fork of: https://github.com/nicobailon/pi-subagents",
   "contributors": [

package/dist/builtin/web-access/CHANGELOG.md CHANGED

@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Published a synchronized Atomic 0.9.1-alpha.1 prerelease for the web-access extension; no functional web-access changes were made after 0.9.0.
 ## [0.9.0] - 2026-06-22
 ### Changed

package/dist/builtin/web-access/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/web-access",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "private": true,
   "description": "Atomic extension for web search, URL fetching, GitHub repo cloning, PDF/video extraction. Fork of: https://github.com/nicobailon/pi-web-access",
   "contributors": [

package/dist/builtin/workflows/CHANGELOG.md CHANGED

@@ -6,6 +6,16 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 ## [Unreleased]
+## [0.9.1-alpha.1] - 2026-06-22
+### Changed
+- Changed the shared `goal`/`ralph` prompt-refinement stage to use a workflow-neutral, model-only rubric prompt that returns only the refined objective instead of invoking the `prompt-engineer` skill directly.
+### Fixed
+- Fixed the builtin `ralph` reviewer-c model configuration to use Gemini 3.1 Pro as the third reviewer with Gemini 3.1 provider fallbacks, removing Gemini 3.5 Flash from that slot's fallback chain ([#1484](https://github.com/bastani-inc/atomic/issues/1484)).
 ## [0.9.0] - 2026-06-22
 ### Breaking Changes
@@ -89,6 +99,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 ### Changed
+- Changed the shared `goal`/`ralph` prompt-refinement stage to use a workflow-neutral, model-only rubric prompt that returns only the refined objective instead of invoking the `prompt-engineer` skill directly.
 - Changed the builtin `ralph`, `goal`, and `open-claude-design` workflows and the shared end-to-end verification guidance to drive browsers through the `playwright-cli` skill and `playwright-cli` command instead of the removed `browser` skill / `browse` CLI. Ralph/goal subagents now verify web and full-stack flows with `skill: "playwright-cli"`, and `open-claude-design`'s deterministic setup step now ensures `playwright-cli` (`npm install -g @playwright/cli@latest`) instead of `browse`, with every preview/review stage prompt updated to `playwright-cli open`/`snapshot`/`screenshot --filename`/`resize`/`show --annotate`.
 - Changed the builtin `ralph` workflow review fan-out from two reviewers to three independent reviewers, each running on a different primary model family (Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro) with shared fallbacks, so the adversarial review gets cross-model coverage instead of repeated passes from one model. The review loop stops only when all three reviewers independently approve (find no issues), so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. Also strengthened the orchestrator's implementation-notes contract to require verifiable evidence for any claims recorded in the notes and reviewer artifacts.
 - Changed the builtin `deep-research-codebase`, `goal`, `ralph`, and `open-claude-design` workflows to run their GitHub Copilot `claude-opus-4.8` fallbacks at the model's largest advertised long-context (~1M/936K) window via the new `(1m)` token, automatically degrading to the 200K short window when Copilot's long-context tier is unavailable. Other models in each fallback chain are unaffected.
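
The model strings in these fallback chains appear to follow a `<provider>/<model>[ (1m)]:<effort>` shape. The following is a minimal sketch of how such a string might be parsed and how a chain might be walked in order, assuming that shape. `parseModelSpec`, `pickModel`, `effectiveWindow`, and the `isAvailable` callback are hypothetical names for illustration; the actual Atomic resolver is not shown in this diff and may behave differently.

```typescript
// Hypothetical parser for model strings of the shape seen in this diff:
//   "<provider>/<model>[ (1m)]:<effort>"
// Illustration only; not the Atomic implementation.
interface ModelSpec {
  readonly provider: string;
  readonly model: string;
  readonly longContext: boolean; // true when the "(1m)" token is present
  readonly effort: string;
}

function parseModelSpec(spec: string): ModelSpec {
  const match = /^([^/\s]+)\/(\S+?)(\s+\(1m\))?:([a-z]+)$/.exec(spec);
  if (match === null) {
    throw new Error(`Unrecognized model spec: ${spec}`);
  }
  return {
    provider: match[1] ?? "",
    model: match[2] ?? "",
    longContext: match[3] !== undefined,
    effort: match[4] ?? "",
  };
}

// Walk the primary model and then each fallback in order, returning the
// first entry the (assumed) availability check accepts.
function pickModel(
  primary: string,
  fallbacks: readonly string[],
  isAvailable: (spec: ModelSpec) => boolean,
): ModelSpec | undefined {
  for (const raw of [primary, ...fallbacks]) {
    const spec = parseModelSpec(raw);
    if (isAvailable(spec)) {
      return spec;
    }
  }
  return undefined;
}

// Assumed degradation rule: a "(1m)" request uses the long window only
// when the long-context tier is available, otherwise the short window.
function effectiveWindow(spec: ModelSpec, longTierAvailable: boolean): "long" | "short" {
  return spec.longContext && longTierAvailable ? "long" : "short";
}
```

Under these assumptions, the order of the array is the whole policy: moving the Gemini 3.1 Pro entries ahead of the Gemini 3.5 Flash entries changes which one is tried first without any other code change.
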
@@ -97,6 +108,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 ### Fixed
+- Fixed the builtin `ralph` reviewer-c model configuration to use Gemini 3.1 Pro as the third reviewer and remove Gemini 3.5 Flash from that slot's fallback chain ([#1484](https://github.com/bastani-inc/atomic/issues/1484)).
 - Fixed workflow stage transcripts ignoring the host's resolved non-default session directory in headless runs. Stages without an explicit `sessionDir` now inherit the active main-session directory when it comes from `--session-dir`, `ATOMIC_CODING_AGENT_SESSION_DIR`, or settings; explicit per-stage `sessionDir` still wins, default host sessions keep writing stages to the global store, and forked stages inherit the non-default directory too ([#1444](https://github.com/bastani-inc/atomic/issues/1444)).
 - Fixed a manual workflow pause/resume not updating the main-chat run status the way the `workflow` tool and `/workflow pause`/`/workflow resume` do. Pausing a stage from the attached stage chat (Escape) or any direct live-handle path recorded only the **stage** as paused (`recordStagePaused`) and never the **run** (`recordRunPaused`), so the below-editor status widget and `/workflow status` kept showing the run as `running` (`●`) even though work was paused; resume had the symmetric gap. The executor stage-control handle now records run-level pause/resume itself — marking the run paused once no stage is still actively running (mirroring `pauseRun`'s all-active-stages-paused rule) and restoring it on resume — so manual and tool-driven pause/resume update the main chat identically. Both run-level transitions are idempotent, so the tool/slash path and cascade re-entry stay safe.
 - Fixed the builtin `ralph` workflow review loop iterating until `max_loops` even when reviewers judged the patch correct. The unanimous-approval gate required a literally empty `findings` array, so a single low-priority **P3** nit — or a placeholder/dummy finding a reviewer appended because it wrongly believed an empty array would fail schema validation — kept the loop spinning despite every reviewer reporting `overall_correctness: "patch is correct"`. Approval is now **severity-aware and deterministic**: a reviewer approves when it judged the patch correct, reported no `reviewer_error`, and filed no *blocking* finding, where blocking = **P0/P1/P2** (priority 0/1/2) and **P3** (priority 3) is a non-blocking nice-to-have; a finding without a determinable priority (`null`/`undefined`) is treated as blocking so ambiguity never silently approves. The decision is computed from finding priorities rather than the reviewer's self-reported `stop_review_loop` flag. Extracted the gate into `builtin/ralph-review-gate.ts` (`reviewDecisionApproved`, `isBlockingFinding`) with unit coverage, and updated the reviewer prompt so an empty `findings` array is explicitly valid and placeholder findings are never fabricated ([#1407](https://github.com/bastani-inc/atomic/issues/1407)).
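
The severity-aware approval rule described in the entry above can be sketched as follows. This is not the contents of `builtin/ralph-review-gate.ts`, which this diff does not include. The function names `reviewDecisionApproved` and `isBlockingFinding` and the fields `findings`, `overall_correctness`, and `reviewer_error` come from the changelog text; the type shapes, the `title` field, and `allReviewersApprove` are assumptions made for illustration.

```typescript
// Assumed shapes; the real review decision schema is not shown in this diff.
interface Finding {
  readonly title: string; // hypothetical field
  readonly priority?: number | null; // 0..3, or missing when undetermined
}

interface ReviewDecision {
  readonly overall_correctness: string;
  readonly reviewer_error?: string | null;
  readonly findings: readonly Finding[];
}

// P3 (priority 3) is the only non-blocking case. P0/P1/P2 block, and a
// missing priority (null/undefined) also blocks so that ambiguity never
// silently approves. Treating any other value as blocking is an assumption.
function isBlockingFinding(finding: Finding): boolean {
  return finding.priority !== 3;
}

// A reviewer approves when it judged the patch correct, reported no
// reviewer error, and filed no blocking finding. An empty findings array
// is valid and approves.
function reviewDecisionApproved(decision: ReviewDecision): boolean {
  const judgedCorrect = decision.overall_correctness === "patch is correct";
  const hasError =
    decision.reviewer_error !== undefined &&
    decision.reviewer_error !== null &&
    decision.reviewer_error !== "";
  return judgedCorrect && !hasError && !decision.findings.some(isBlockingFinding);
}

// Hypothetical helper: the loop stops only on unanimous approval.
function allReviewersApprove(decisions: readonly ReviewDecision[]): boolean {
  return decisions.length > 0 && decisions.every(reviewDecisionApproved);
}
```

Because the decision is derived from finding priorities, a reviewer's self-reported `stop_review_loop` flag plays no part in this sketch, which matches the behavior the entry describes.
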

package/dist/builtin/workflows/README.md CHANGED

@@ -678,7 +678,7 @@ Child workflow outputs: `result`, `status`, `approved`, `goal_id`, `objective`,
 ### `ralph`
-Prompt-engineering → research → orchestrate → review workflow with optional final-stage PR handoff: transform the user prompt into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video, references it in the implementation notes, and exposes it as the `qa_video_path` output; when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
+Prompt-refinement → prompt-engineering research → orchestrate → review workflow with optional final-stage PR handoff: first sharpen the raw prompt into a clearer objective, then transform it into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers across Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro model families, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video, references it in the implementation notes, and exposes it as the `qa_video_path` output; when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
 ```text
 /workflow ralph prompt="Migrate the database layer to Drizzle ORM" max_loops=3 base_branch=develop

package/dist/builtin/workflows/builtin/goal-runner.ts CHANGED

@@ -86,7 +86,7 @@ export async function runGoalWorkflow(ctx: GoalRunnerContext, options: GoalWorkf
     if (!rawObjective) {
       throw new Error("goal requires an objective input.");
     }
-    const objective = await runPromptRefinementStage(ctx, { request: rawObjective, workflowLabel: "Goal", modelConfig: promptEngineerModelConfig });
+    const objective = await runPromptRefinementStage(ctx, { request: rawObjective, modelConfig: promptEngineerModelConfig });
     const maxTurns = positiveInteger(inputs.max_turns, DEFAULT_MAX_TURNS);
     const reviewQuorum = DEFAULT_REVIEW_QUORUM;
@@ -103,9 +103,12 @@ export async function runGoalWorkflow(ctx: GoalRunnerContext, options: GoalWorkf
           "anthropic/claude-opus-4-8:medium",
           "zai/glm-5.2:medium",
           "zai-coding-cn/glm-5.2:medium",
+          "github-copilot/gemini-3.1-pro-preview (1m):medium",
+          "google/gemini-3.1-pro-preview:medium",
+          "google-vertex/gemini-3.1-pro-preview:medium",
           "github-copilot/gemini-3.5-flash (1m):medium",
           "google/gemini-3.5-flash:medium",
-          "google-vertex/gemini-3.5-flash:medium"
+          "google-vertex/gemini-3.5-flash:medium",
       ],
       tools: goalRunnerTools,
     };
@@ -120,12 +123,12 @@ export async function runGoalWorkflow(ctx: GoalRunnerContext, options: GoalWorkf
           "anthropic/claude-opus-4-8:xhigh",
           "zai/glm-5.2:xhigh",
           "zai-coding-cn/glm-5.2:xhigh",
-          "github-copilot/gemini-3.5-flash (1m):high",
-          "google/gemini-3.5-flash:high",
-          "google-vertex/gemini-3.5-flash:high",
           "github-copilot/gemini-3.1-pro-preview (1m):high",
           "google/gemini-3.1-pro-preview:high",
           "google-vertex/gemini-3.1-pro-preview:high",
+          "github-copilot/gemini-3.5-flash (1m):high",
+          "google/gemini-3.5-flash:high",
+          "google-vertex/gemini-3.5-flash:high",
       ],
       tools: goalRunnerTools,
       schema: reviewDecisionSchema,

package/dist/builtin/workflows/builtin/prompt-refinement.ts CHANGED

@@ -2,12 +2,11 @@
  * Shared prompt-refinement stage used by the ralph and goal workflows.
  *
  * Before the main work loop begins, both workflows run this single
- * `prompt-refinement` stage. It invokes the prompt-engineer skill
- * (`/skill:prompt-engineer`) to sharpen the raw user request into a clearer,
- * more actionable objective using the Workflow Best Practices prompt anatomy
- * documented in `packages/coding-agent/docs/workflows.md`. The refined request
- * replaces the original as the operative objective downstream; the original is
- * preserved by each workflow for reporting.
+ * `prompt-refinement` stage. The stage uses the Workflow Best Practices prompt
+ * anatomy documented in `packages/coding-agent/docs/workflows.md` to sharpen the
+ * raw user request into a clearer, more actionable objective. The refined
+ * request replaces the original as the operative objective downstream; the
+ * original is preserved by each workflow for reporting.
  */
 import type { WorkflowModelValue, WorkflowTaskOptions, WorkflowTaskResult } from "../src/shared/types.js";
@@ -21,7 +20,7 @@ export type PromptSection = readonly [tag: string, content: string];
  * inferred from the raw request.
  */
 export const PROMPT_REFINEMENT_CRITERIA = [
-  "Apply the workflow best practices documented in the `## Workflow Best Practices` section of `docs/workflows.md`. Treat that section as the authoritative prompt-anatomy rubric: use its Objective, Context, Scope, Non-goals, Done criteria, Validation command, Reporting requirements, and Stop conditions when refining the request.",
+  "Apply the workflow best practices documented in the `## Workflow Best Practices` section of `docs/workflows.md` to transform the raw request into a clear and verifiable objective. Treat that section as the authoritative prompt-anatomy rubric: use its Objective, Context, Scope, Non-goals, Done criteria, Validation command, Reporting requirements, and Stop conditions when refining the request.",
   "Objective — state what should be true when the work is complete.",
   "Context — note why it matters and where the relevant code or area likely lives.",
   "Scope — state what is allowed to change (the smallest correct change).",
@@ -39,21 +38,12 @@ export const PROMPT_REFINEMENT_CRITERIA = [
  */
 export function renderPromptRefinementPrompt(args: {
   readonly request: string;
-  readonly workflowLabel: string;
   readonly workflowCwdContext?: PromptSection;
 }): string {
   const sections: readonly string[] = [
-    `/skill:prompt-engineer Refine the following user request into a clearer, more actionable objective for the ${args.workflowLabel} workflow. Improve clarity and completeness using the rubric below without changing the user's intent, expanding scope, or inventing requirements that cannot be reasonably inferred from the request.`,
+    `Refine the following user request into a clear and verifiable objective. Improve clarity and completeness using the rubric below without changing the user's intent, expanding scope, or inventing requirements that cannot be reasonably inferred from the request.`,
     `<original_request>\n${args.request}\n</original_request>`,
-    `<clarity_rubric>\nApply the Workflow Best Practices prompt anatomy. Make each of the following explicit where it can be reasonably inferred from the original request:\n${PROMPT_REFINEMENT_CRITERIA}\n</clarity_rubric>`,
-    [
-      "<refinement_rules>",
-      "- Preserve the user's original intent and scope; do not add unrelated work.",
-      "- If the original request is already clear and complete, return it essentially unchanged with only clarity improvements.",
-      "- Where a criterion cannot be reasonably inferred, state it as a concise assumption or a 'to confirm' note rather than fabricating specifics.",
-      "- Do not implement anything, run commands, or edit files. This stage only produces the refined request text.",
-      "</refinement_rules>",
-    ].join("\n"),
+    `<instructions>\n${PROMPT_REFINEMENT_CRITERIA}\n</instructions>`,
     `<output_format>\nReturn ONLY the refined request. No preamble, no explanation, and no Markdown fences. The returned text replaces the original request as the operative objective for the rest of the workflow, so it must be a single self-contained request.\n</output_format>`,
   ];
   const tail = args.workflowCwdContext === undefined
@@ -84,7 +74,6 @@ export async function runPromptRefinementStage(
   ctx: PromptRefinementContext,
   options: {
     readonly request: string;
-    readonly workflowLabel: string;
     readonly workflowCwdContext?: PromptSection;
     readonly modelConfig: PromptRefinementModelConfig;
   },
@@ -92,7 +81,6 @@ export async function runPromptRefinementStage(
   const result = await ctx.task("prompt-refinement", {
     prompt: renderPromptRefinementPrompt({
       request: options.request,
-      workflowLabel: options.workflowLabel,
       ...(options.workflowCwdContext === undefined ? {} : { workflowCwdContext: options.workflowCwdContext }),
     }),
     ...options.modelConfig,

package/dist/builtin/workflows/builtin/ralph-models.ts CHANGED

@@ -48,12 +48,12 @@ export const orchestratorModelConfig = {
         "anthropic/claude-opus-4-8:medium",
         "zai/glm-5.2:medium",
         "zai-coding-cn/glm-5.2:medium",
-        "github-copilot/gemini-3.5-flash (1m):medium",
-        "google/gemini-3.5-flash:medium",
-        "google-vertex/gemini-3.5-flash:medium",
         "github-copilot/gemini-3.1-pro-preview (1m):medium",
         "google/gemini-3.1-pro-preview:medium",
         "google-vertex/gemini-3.1-pro-preview:medium",
+        "github-copilot/gemini-3.5-flash (1m):medium",
+        "google/gemini-3.5-flash:medium",
+        "google-vertex/gemini-3.5-flash:medium",
     ],
     excludedTools: ["ask_user_question"],
 };
@@ -68,12 +68,12 @@ export const reviewerAModelConfig = {
       "openai/gpt-5.5:xhigh",
       "zai/glm-5.2:xhigh",
       "zai-coding-cn/glm-5.2:xhigh",
+      "github-copilot/gemini-3.1-pro-preview (1m):high",
+      "google/gemini-3.1-pro-preview:high",
+      "google-vertex/gemini-3.1-pro-preview:high",
       "github-copilot/gemini-3.5-flash (1m):high",
       "google/gemini-3.5-flash:high",
       "google-vertex/gemini-3.5-flash:high",
-      "github-copilot/gemini-3.1-pro-preview (1m):high",
-      "google/gemini-3.1-pro-preview:high",
-      "google-vertex/gemini-3.1-pro-preview:high"
     ],
     excludedTools: ["ask_user_question"],
     schema: reviewDecisionSchema,
@@ -89,27 +89,24 @@ export const reviewerBModelConfig = {
       "anthropic/claude-opus-4-8:xhigh",
       "zai/glm-5.2:xhigh",
       "zai-coding-cn/glm-5.2:xhigh",
+      "github-copilot/gemini-3.1-pro-preview (1m):high",
+      "google/gemini-3.1-pro-preview:high",
+      "google-vertex/gemini-3.1-pro-preview:high",
       "github-copilot/gemini-3.5-flash (1m):high",
       "google/gemini-3.5-flash:high",
       "google-vertex/gemini-3.5-flash:high",
-      "github-copilot/gemini-3.1-pro-preview (1m):high",
-      "google/gemini-3.1-pro-preview:high",
-      "google-vertex/gemini-3.1-pro-preview:high"
     ],
     excludedTools: ["ask_user_question"],
     schema: reviewDecisionSchema,
 };
 export const reviewerCModelConfig = {
-    model: "zai/glm-5.2:xhigh",
+    model: "github-copilot/gemini-3.1-pro-preview (1m):high",
     fallbackModels: [
-      "zai-coding-cn/glm-5.2:xhigh",
-      "github-copilot/gemini-3.5-flash (1m):high",
-      "google/gemini-3.5-flash:high",
-      "google-vertex/gemini-3.5-flash:high",
-      "github-copilot/gemini-3.1-pro-preview (1m):high",
       "google/gemini-3.1-pro-preview:high",
       "google-vertex/gemini-3.1-pro-preview:high",
+      "zai/glm-5.2:xhigh",
+      "zai-coding-cn/glm-5.2:xhigh",
       "openai-codex/gpt-5.5:xhigh",
       "github-copilot/gpt-5.5:xhigh",
       "openai/gpt-5.5:xhigh",

package/dist/builtin/workflows/builtin/ralph-runner.ts CHANGED

@@ -49,7 +49,7 @@ export async function runRalphWorkflow(
   let finalResult = "";
   let finalPrReport: string | undefined;
   const workflowCwdContext = workflowCwdContextSection(workflowStartCwd);
-  const refinedPrompt = await runPromptRefinementStage(ctx, { request: prompt, workflowLabel: "Ralph", workflowCwdContext, modelConfig: promptEngineerModelConfig });
+  const refinedPrompt = await runPromptRefinementStage(ctx, { request: prompt, workflowCwdContext, modelConfig: promptEngineerModelConfig });
   const workflowResearchPath = resolve(workflowStartCwd, defaultResearchPath(refinedPrompt));
   const implementationNotesPath = await createImplementationNotesFile(refinedPrompt);
   const qaVideoPath = await createQaEvidenceVideoPath();

package/dist/builtin/workflows/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/workflows",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "private": true,
   "description": "Atomic extension for multi-stage workflow authoring and execution.",
   "contributors": [

package/docs/workflows.md CHANGED

@@ -228,7 +228,7 @@ Run examples:
 /workflow goal objective="Implement the focused docs fix, run the docs validation command, and open a PR when complete" create_pr=true
 ```
-`goal` starts with a single `prompt-refinement` stage that invokes the `prompt-engineer` skill (`/skill:prompt-engineer`) to sharpen the raw objective into a clearer, more actionable form using the Workflow Best Practices prompt anatomy documented later in this guide; the refined objective becomes the operative one recorded in the ledger (the original is preserved as `original_objective` and shown in the final report when it differs). `goal` then creates an OS-temp `goal-ledger.json` artifact, renders goal-continuation context for each worker turn, writes each worker receipt to `work-turn-N.md`, and appends receipts, reviewer decisions, blockers, reducer decisions, and lifecycle events to the ledger. The objective is treated as user-provided data, not higher-priority instructions. By default `goal` does not start the final `pull-request` stage, and `pr_report` is omitted. Prompt text alone does not opt in. Pass `create_pr=true` only when you explicitly want the final stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation, such as GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling, after Goal reaches `complete` within `max_turns`. Goal worker and reviewer prompts explicitly tell intermediate stages to ignore PR-creation requests; only the final `pull-request` stage may attempt that handoff.
+`goal` starts with a single model-only `prompt-refinement` stage that sharpens the raw objective into a clearer, more actionable form using the Workflow Best Practices prompt anatomy documented later in this guide; the refined objective becomes the operative one recorded in the ledger (the original is preserved as `original_objective` and shown in the final report when it differs). `goal` then creates an OS-temp `goal-ledger.json` artifact, renders goal-continuation context for each worker turn, writes each worker receipt to `work-turn-N.md`, and appends receipts, reviewer decisions, blockers, reducer decisions, and lifecycle events to the ledger. The objective is treated as user-provided data, not higher-priority instructions. By default `goal` does not start the final `pull-request` stage, and `pr_report` is omitted. Prompt text alone does not opt in. Pass `create_pr=true` only when you explicitly want the final stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation, such as GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling, after Goal reaches `complete` within `max_turns`. Goal worker and reviewer prompts explicitly tell intermediate stages to ignore PR-creation requests; only the final `pull-request` stage may attempt that handoff.
 Write the `objective` like a compact acceptance spec. Say what should exist when the run is done, how you want testing handled, which command(s) or manual checks matter, and what outcome proves completion. The workflow is intentionally lean: it does not first generate an RFC or migration plan, so the developer-supplied objective is where scope, validation, and completion criteria belong.
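
A model-only `prompt-refinement` prompt of the kind described above might be assembled roughly as follows. This is a simplified sketch, not the shipped `renderPromptRefinementPrompt`: the function name `renderRefinementPrompt`, the shortened `CRITERIA` list, the blank-line section separator, and the way the optional working-directory section is rendered are all assumptions made for illustration.

```typescript
type PromptSection = readonly [tag: string, content: string];

// Shortened, illustrative rubric; the real criteria list is longer.
const CRITERIA: readonly string[] = [
  "Objective: state what should be true when the work is complete.",
  "Scope: state what is allowed to change (the smallest correct change).",
  "Done criteria: state how completion will be verified.",
];

function renderRefinementPrompt(args: {
  readonly request: string;
  readonly cwdContext?: PromptSection;
}): string {
  const sections: string[] = [
    // Plain instruction with no skill invocation and no workflow label.
    "Refine the following user request into a clear and verifiable objective.",
    `<original_request>\n${args.request}\n</original_request>`,
    // Join explicitly: interpolating an array directly would comma-join it.
    `<instructions>\n${CRITERIA.join("\n")}\n</instructions>`,
    "<output_format>\nReturn ONLY the refined request.\n</output_format>",
  ];
  if (args.cwdContext !== undefined) {
    const [tag, content] = args.cwdContext;
    sections.push(`<${tag}>\n${content}\n</${tag}>`);
  }
  return sections.join("\n\n");
}
```

The same rendered prompt would serve both `goal` and `ralph` in this sketch, since nothing in it names the calling workflow.
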
@@ -273,7 +273,7 @@ Run examples:
 /workflow ralph prompt="Safely implement the API refactor" git_worktree_dir=../atomic-ralph-api-wt base_branch=main
 ```
-Each `ralph` run starts with a single `prompt-refinement` stage that invokes the `prompt-engineer` skill (`/skill:prompt-engineer`) to sharpen the raw user prompt into a clearer, more actionable objective using the Workflow Best Practices prompt anatomy documented later in this guide; that refined prompt becomes the operative objective for research, orchestration, and review, while the original is surfaced as `original_prompt`. Each iteration then transforms the refined prompt with `/skill:prompt-engineer Transform the following refined user request into a codebase and online research question which can be thoroughly explored: ...` (`research-prompt-refinement`), researches that transformed question with `/skill:research-codebase ...`, and writes the findings under `research/`. The orchestrator treats that research artifact as its primary implementation context, initializes/updates an OS-temp implementation notes file while generating verifiable evidence for any claims it records in the notes and reviewer artifacts, delegates implementation through sub-agents, and asks three independent reviewers to inspect the patch directly against `base_branch`. The reviewer fan-out runs each reviewer on a different primary model family (with shared fallbacks) so the adversarial review gets cross-model coverage instead of three passes from one model. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical, using `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video (referenced in the implementation notes and surfaced as `qa_video_path`); when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. 
If reviewers find issues, the next `research-prompt-refinement` and research stages receive the review artifact path so follow-up research can address unresolved findings, and research stages fork from prior research session data when available. The loop stops only when all three reviewers independently approve (each finds no issues) or `max_loops` is reached, so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. By default Ralph does not start the final `pull-request` stage, and `pr_report` is omitted. Prompt text alone does not opt in. Pass `create_pr=true` only when you explicitly want the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation, such as GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling; Ralph's own PR-creation instructions live in that final stage.
+Each `ralph` run starts with a single model-only `prompt-refinement` stage that sharpens the raw user prompt into a clearer, more actionable objective using the Workflow Best Practices prompt anatomy documented later in this guide; that refined prompt becomes the operative objective for research, orchestration, and review, while the original is surfaced as `original_prompt`. Each iteration then transforms the refined prompt with `/skill:prompt-engineer Transform the following refined user request into a codebase and online research question which can be thoroughly explored: ...` (`research-prompt-refinement`), researches that transformed question with `/skill:research-codebase ...`, and writes the findings under `research/`. The orchestrator treats that research artifact as its primary implementation context, initializes/updates an OS-temp implementation notes file while generating verifiable evidence for any claims it records in the notes and reviewer artifacts, delegates implementation through sub-agents, and asks three independent reviewers to inspect the patch directly against `base_branch`. The reviewer fan-out runs reviewers on different primary model families (Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro, with shared fallbacks) so the adversarial review gets cross-model coverage instead of three passes from one model. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical, using `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video (referenced in the implementation notes and surfaced as `qa_video_path`); when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. 
If reviewers find issues, the next `research-prompt-refinement` and research stages receive the review artifact path so follow-up research can address unresolved findings, and research stages fork from prior research session data when available. The loop stops only when all three reviewers independently approve (each finds no issues) or `max_loops` is reached, so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. By default Ralph does not start the final `pull-request` stage, and `pr_report` is omitted. Prompt text alone does not opt in. Pass `create_pr=true` only when you explicitly want the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation, such as GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling; Ralph's own PR-creation instructions live in that final stage.
 Set `git_worktree_dir` when you want Ralph's worker stages isolated in a reusable Git worktree. Relative paths resolve from the invoking repository root, existing same-repository worktree roots are reused, and missing paths are created from `base_branch`. Ralph preserves the invoking repo-relative cwd inside the worktree, so launching from `repo/packages/api` with `git_worktree_dir=../repo-wt` runs stages from `../repo-wt/packages/api`.
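
Reviewer note on the stop condition described in the hunk above: the loop ends only on unanimous approval or when `max_loops` is reached. A minimal sketch of that rule, not the code in `ralph-runner.ts` (which this diff does not show). `ReviewerVerdict`, `shouldStopLoop`, and the 1-based `loop` counter are hypothetical names and assumptions made for illustration.

```typescript
// Hypothetical shape: one entry per reviewer, listing its P0-P3 findings.
type ReviewerVerdict = { reviewer: string; findings: string[] };

// Assumption: `loop` is the 1-based index of the iteration that just finished.
function shouldStopLoop(
  verdicts: ReviewerVerdict[],
  loop: number,
  maxLoops: number,
): boolean {
  // Unanimity, not a majority quorum: every reviewer must report no findings.
  const unanimous =
    verdicts.length > 0 && verdicts.every((v) => v.findings.length === 0);
  return unanimous || loop >= maxLoops;
}

const oneDissent: ReviewerVerdict[] = [
  { reviewer: "reviewer-a", findings: [] },
  { reviewer: "reviewer-b", findings: [] },
  { reviewer: "reviewer-c", findings: ["P2: missing error handling"] },
];

// Two approvals out of three are not enough to stop before max_loops.
console.log(shouldStopLoop(oneDissent, 1, 3));
```

Under these assumptions a single finding from any reviewer keeps the loop going, which matches the "instead of being out-voted by a majority quorum" wording in the changed paragraph.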

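Reviewer note on the `git_worktree_dir` context line above: the cwd mapping it describes can be sketched with Node's `path` module. This is an illustration under stated assumptions, not the package's implementation. `resolveWorktreeCwd` and the `/work/repo` paths are hypothetical, and `path.posix` is used so the result does not depend on the host platform.

```typescript
import * as path from "node:path";

// Hypothetical helper: maps the invoking cwd to the matching directory
// inside the worktree, preserving the repo-relative path.
function resolveWorktreeCwd(
  repoRoot: string,
  invokingCwd: string,
  gitWorktreeDir: string,
): string {
  // Relative worktree paths resolve from the invoking repository root.
  const worktreeRoot = path.posix.resolve(repoRoot, gitWorktreeDir);
  // The part of the cwd below the repository root, e.g. "packages/api".
  const relativeCwd = path.posix.relative(repoRoot, invokingCwd);
  return path.posix.join(worktreeRoot, relativeCwd);
}

console.log(
  resolveWorktreeCwd("/work/repo", "/work/repo/packages/api", "../repo-wt"),
);
// prints "/work/repo-wt/packages/api"
```

This mirrors the documented example, where launching from `repo/packages/api` with `git_worktree_dir=../repo-wt` runs stages from `../repo-wt/packages/api`.
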
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/atomic",
-  "version": "0.9.0",
+  "version": "0.9.1-alpha.1",
   "description": "Atomic coding agent CLI with read, bash, edit, write tools and session management",
   "type": "module",
   "atomicConfig": {
@@ -68,7 +68,7 @@
     "prepublishOnly": "bun run clean && bun run build"
   },
   "dependencies": {
-    "@bastani/atomic-natives": "0.9.0",
+    "@bastani/atomic-natives": "0.9.1-alpha.1",
     "@bufbuild/protobuf": "^2.0.0",
     "@earendil-works/pi-agent-core": "^0.79.9",
     "@earendil-works/pi-ai": "^0.79.9",