npm - @bastani/atomic - Versions diffs - 0.8.31-alpha.1 → 0.8.31-alpha.3 - Mend

@bastani/atomic 0.8.31-alpha.1 → 0.8.31-alpha.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (148)

package/CHANGELOG.md +17 -5
package/README.md +12 -10
package/dist/builtin/cursor/CHANGELOG.md +1 -1
package/dist/builtin/cursor/package.json +2 -2
package/dist/builtin/intercom/CHANGELOG.md +1 -1
package/dist/builtin/intercom/package.json +2 -2
package/dist/builtin/mcp/CHANGELOG.md +1 -1
package/dist/builtin/mcp/package.json +3 -3
package/dist/builtin/subagents/CHANGELOG.md +10 -1
package/dist/builtin/subagents/agents/codebase-online-researcher.md +8 -8
package/dist/builtin/subagents/agents/debugger.md +6 -6
package/dist/builtin/subagents/package.json +4 -4
package/dist/builtin/subagents/skills/effective-liteparse/SKILL.md +118 -0
package/dist/builtin/subagents/skills/effective-liteparse/scripts/search.py +128 -0
package/dist/builtin/subagents/skills/playwright-cli/SKILL.md +404 -0
package/dist/builtin/subagents/skills/playwright-cli/references/element-attributes.md +23 -0
package/dist/builtin/subagents/skills/playwright-cli/references/playwright-tests.md +39 -0
package/dist/builtin/subagents/skills/playwright-cli/references/request-mocking.md +87 -0
package/dist/builtin/subagents/skills/playwright-cli/references/running-code.md +241 -0
package/dist/builtin/subagents/skills/playwright-cli/references/session-management.md +225 -0
package/dist/builtin/subagents/skills/playwright-cli/references/spec-driven-testing.md +305 -0
package/dist/builtin/subagents/skills/playwright-cli/references/storage-state.md +275 -0
package/dist/builtin/subagents/skills/playwright-cli/references/test-generation.md +134 -0
package/dist/builtin/subagents/skills/playwright-cli/references/tracing.md +139 -0
package/dist/builtin/subagents/skills/playwright-cli/references/video-recording.md +143 -0
package/dist/builtin/web-access/CHANGELOG.md +1 -1
package/dist/builtin/web-access/package.json +2 -2
package/dist/builtin/workflows/CHANGELOG.md +7 -1
package/dist/builtin/workflows/README.md +4 -4
package/dist/builtin/workflows/builtin/open-claude-design.ts +59 -56
package/dist/builtin/workflows/builtin/ralph.ts +56 -3
package/dist/builtin/workflows/builtin/shared-prompts.ts +1 -1
package/dist/builtin/workflows/package.json +2 -2
package/dist/builtin/workflows/skills/research-codebase/SKILL.md +1 -1
package/dist/cli/args.d.ts.map +1 -1
package/dist/cli/args.js +1 -1
package/dist/cli/args.js.map +1 -1
package/dist/core/agent-session.d.ts +1 -0
package/dist/core/agent-session.d.ts.map +1 -1
package/dist/core/agent-session.js +49 -21
package/dist/core/agent-session.js.map +1 -1
package/dist/core/context-window.d.ts +26 -1
package/dist/core/context-window.d.ts.map +1 -1
package/dist/core/context-window.js +30 -6
package/dist/core/context-window.js.map +1 -1
package/dist/core/copilot-model-catalog.d.ts +39 -21
package/dist/core/copilot-model-catalog.d.ts.map +1 -1
package/dist/core/copilot-model-catalog.js +44 -16
package/dist/core/copilot-model-catalog.js.map +1 -1
package/dist/core/model-registry.d.ts.map +1 -1
package/dist/core/model-registry.js +6 -4
package/dist/core/model-registry.js.map +1 -1
package/dist/core/project-trust.d.ts.map +1 -1
package/dist/core/project-trust.js +2 -1
package/dist/core/project-trust.js.map +1 -1
package/dist/core/sdk.d.ts.map +1 -1
package/dist/core/sdk.js +18 -7
package/dist/core/sdk.js.map +1 -1
package/dist/core/settings-manager.d.ts +11 -2
package/dist/core/settings-manager.d.ts.map +1 -1
package/dist/core/settings-manager.js +62 -8
package/dist/core/settings-manager.js.map +1 -1
package/dist/core/system-prompt.d.ts.map +1 -1
package/dist/core/system-prompt.js +1 -0
package/dist/core/system-prompt.js.map +1 -1
package/dist/core/tools/edit-diff.d.ts +1 -2
package/dist/core/tools/edit-diff.d.ts.map +1 -1
package/dist/core/tools/edit-diff.js +1 -2
package/dist/core/tools/edit-diff.js.map +1 -1
package/dist/index.d.ts +2 -1
package/dist/index.d.ts.map +1 -1
package/dist/index.js +2 -1
package/dist/index.js.map +1 -1
package/dist/modes/interactive/components/config-selector.d.ts.map +1 -1
package/dist/modes/interactive/components/config-selector.js +5 -7
package/dist/modes/interactive/components/config-selector.js.map +1 -1
package/dist/modes/interactive/components/model-selector.d.ts.map +1 -1
package/dist/modes/interactive/components/model-selector.js +2 -1
package/dist/modes/interactive/components/model-selector.js.map +1 -1
package/dist/modes/interactive/components/scoped-models-selector.d.ts.map +1 -1
package/dist/modes/interactive/components/scoped-models-selector.js +4 -1
package/dist/modes/interactive/components/scoped-models-selector.js.map +1 -1
package/dist/modes/interactive/components/settings-selector.d.ts +2 -0
package/dist/modes/interactive/components/settings-selector.d.ts.map +1 -1
package/dist/modes/interactive/components/settings-selector.js +165 -15
package/dist/modes/interactive/components/settings-selector.js.map +1 -1
package/dist/modes/interactive/components/tree-selector.d.ts.map +1 -1
package/dist/modes/interactive/components/tree-selector.js +44 -4
package/dist/modes/interactive/components/tree-selector.js.map +1 -1
package/dist/modes/interactive/interactive-mode.d.ts +1 -1
package/dist/modes/interactive/interactive-mode.d.ts.map +1 -1
package/dist/modes/interactive/interactive-mode.js +24 -54
package/dist/modes/interactive/interactive-mode.js.map +1 -1
package/dist/modes/interactive/model-search.d.ts +7 -0
package/dist/modes/interactive/model-search.d.ts.map +1 -0
package/dist/modes/interactive/model-search.js +6 -0
package/dist/modes/interactive/model-search.js.map +1 -0
package/dist/modes/interactive/theme/theme-controller.d.ts +30 -0
package/dist/modes/interactive/theme/theme-controller.d.ts.map +1 -0
package/dist/modes/interactive/theme/theme-controller.js +108 -0
package/dist/modes/interactive/theme/theme-controller.js.map +1 -0
package/dist/modes/interactive/theme/theme-schema.json +2 -1
package/dist/modes/interactive/theme/theme.d.ts +5 -0
package/dist/modes/interactive/theme/theme.d.ts.map +1 -1
package/dist/modes/interactive/theme/theme.js +70 -29
package/dist/modes/interactive/theme/theme.js.map +1 -1
package/dist/modes/rpc/rpc-client.d.ts +1 -1
package/dist/modes/rpc/rpc-client.d.ts.map +1 -1
package/dist/modes/rpc/rpc-client.js +1 -1
package/dist/modes/rpc/rpc-client.js.map +1 -1
package/dist/modes/rpc/rpc-mode.d.ts.map +1 -1
package/dist/modes/rpc/rpc-mode.js +1 -1
package/dist/modes/rpc/rpc-mode.js.map +1 -1
package/dist/package-manager-cli.d.ts.map +1 -1
package/dist/package-manager-cli.js +39 -9
package/dist/package-manager-cli.js.map +1 -1
package/docs/extensions.md +21 -0
package/docs/models.md +3 -3
package/docs/packages.md +13 -9
package/docs/providers.md +3 -3
package/docs/quickstart.md +14 -0
package/docs/rpc.md +3 -3
package/docs/sdk.md +15 -11
package/docs/session-format.md +1 -1
package/docs/settings.md +8 -3
package/docs/themes.md +3 -1
package/docs/tui.md +1 -1
package/docs/usage.md +12 -9
package/docs/workflows.md +9 -7
package/examples/extensions/custom-provider-anthropic/package-lock.json +2 -2
package/examples/extensions/custom-provider-anthropic/package.json +1 -1
package/examples/extensions/custom-provider-gitlab-duo/package.json +1 -1
package/examples/extensions/gondolin/package-lock.json +2 -2
package/examples/extensions/gondolin/package.json +1 -1
package/examples/extensions/preset.ts +10 -4
package/examples/extensions/provider-payload.ts +5 -5
package/examples/extensions/sandbox/index.ts +2 -2
package/examples/extensions/sandbox/package-lock.json +3 -3
package/examples/extensions/sandbox/package.json +2 -2
package/examples/extensions/subagent/agents.ts +2 -2
package/examples/extensions/subagent/index.ts +4 -2
package/examples/extensions/with-deps/package-lock.json +2 -2
package/examples/extensions/with-deps/package.json +1 -1
package/package.json +5 -5
package/dist/builtin/subagents/skills/browser/EXAMPLES.md +0 -151
package/dist/builtin/subagents/skills/browser/LICENSE.txt +0 -21
package/dist/builtin/subagents/skills/browser/REFERENCE.md +0 -451
package/dist/builtin/subagents/skills/browser/SKILL.md +0 -170

package/dist/builtin/subagents/skills/playwright-cli/references/tracing.md ADDED

@@ -0,0 +1,139 @@
+# Tracing
+Capture detailed execution traces for debugging and analysis. Traces include DOM snapshots, screenshots, network activity, and console logs.
+## Basic Usage
+```bash
+# Start trace recording
+playwright-cli tracing-start
+# Perform actions
+playwright-cli open https://example.com
+playwright-cli click e1
+playwright-cli fill e2 "test"
+# Stop trace recording
+playwright-cli tracing-stop
+```
+## Trace Output Files
+When you start tracing, Playwright creates a `traces/` directory with several files:
+### `trace-{timestamp}.trace`
+**Action log** - The main trace file containing:
+- Every action performed (clicks, fills, navigations)
+- DOM snapshots before and after each action
+- Screenshots at each step
+- Timing information
+- Console messages
+- Source locations
+### `trace-{timestamp}.network`
+**Network log** - Complete network activity:
+- All HTTP requests and responses
+- Request headers and bodies
+- Response headers and bodies
+- Timing (DNS, connect, TLS, TTFB, download)
+- Resource sizes
+- Failed requests and errors
+### `resources/`
+**Resources directory** - Cached resources:
+- Images, fonts, stylesheets, scripts
+- Response bodies for replay
+- Assets needed to reconstruct page state
+## What Traces Capture
+| Category | Details |
+|----------|---------|
+| **Actions** | Clicks, fills, hovers, keyboard input, navigations |
+| **DOM** | Full DOM snapshot before/after each action |
+| **Screenshots** | Visual state at each step |
+| **Network** | All requests, responses, headers, bodies, timing |
+| **Console** | All console.log, warn, error messages |
+| **Timing** | Precise timing for each operation |
+## Use Cases
+### Debugging Failed Actions
+```bash
+playwright-cli tracing-start
+playwright-cli open https://app.example.com
+# This click fails - why?
+playwright-cli click e5
+playwright-cli tracing-stop
+# Open trace to see DOM state when click was attempted
+```
+### Analyzing Performance
+```bash
+playwright-cli tracing-start
+playwright-cli open https://slow-site.com
+playwright-cli tracing-stop
+# View network waterfall to identify slow resources
+```
+### Capturing Evidence
+```bash
+# Record a complete user flow for documentation
+playwright-cli tracing-start
+playwright-cli open https://app.example.com/checkout
+playwright-cli fill e1 "4111111111111111"
+playwright-cli fill e2 "12/25"
+playwright-cli fill e3 "123"
+playwright-cli click e4
+playwright-cli tracing-stop
+# Trace shows exact sequence of events
+```
+## Trace vs Video vs Screenshot
+| Feature | Trace | Video | Screenshot |
+|---------|-------|-------|------------|
+| **Format** | .trace file | .webm video | .png/.jpeg image |
+| **DOM inspection** | Yes | No | No |
+| **Network details** | Yes | No | No |
+| **Step-by-step replay** | Yes | Continuous | Single frame |
+| **File size** | Medium | Large | Small |
+| **Best for** | Debugging | Demos | Quick capture |
+## Best Practices
+### 1. Start Tracing Before the Problem
+```bash
+# Trace the entire flow, not just the failing step
+playwright-cli tracing-start
+playwright-cli open https://example.com
+# ... all steps leading to the issue ...
+playwright-cli tracing-stop
+```
+### 2. Clean Up Old Traces
+Traces can consume significant disk space:
+```bash
+# Remove traces older than 7 days
+find .playwright-cli/traces -mtime +7 -delete
+```
+## Limitations
+- Traces add overhead to automation
+- Large traces can consume significant disk space
+- Some dynamic content may not replay perfectly
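
Reviewer note on the cleanup command in `tracing.md`: GNU `find -mtime +7` counts age in whole 24-hour periods and discards the fraction, so a file must be at least eight days old to match. The sketch below reproduces that selection rule as a pure function. The `TraceFile` shape and the `selectStaleTraces` name are hypothetical and are not part of `playwright-cli`.

```typescript
// Hypothetical shape: a file path plus its last-modified time in epoch milliseconds.
interface TraceFile {
  path: string;
  mtimeMs: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Mirrors `find -mtime +N`: age is measured in whole 24-hour periods
// (fraction discarded) and must be strictly greater than N to match.
function selectStaleTraces(files: TraceFile[], nowMs: number, days: number): string[] {
  return files
    .filter((file) => Math.floor((nowMs - file.mtimeMs) / DAY_MS) > days)
    .map((file) => file.path);
}
```

A caller could feed the returned paths to whatever deletion routine it already uses. Keeping the selection pure makes the age rule easy to test without touching the disk.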

package/dist/builtin/subagents/skills/playwright-cli/references/video-recording.md ADDED

@@ -0,0 +1,143 @@
+# Video Recording
+Capture browser automation sessions as video for debugging, documentation, or verification. Produces WebM (VP8/VP9 codec).
+## Basic Recording
+```bash
+# Open browser first
+playwright-cli open
+# Start recording
+playwright-cli video-start demo.webm
+# Add a chapter marker for section transitions
+playwright-cli video-chapter "Getting Started" --description="Opening the homepage" --duration=2000
+# Navigate and perform actions
+playwright-cli goto https://example.com
+playwright-cli snapshot
+playwright-cli click e1
+# Add another chapter
+playwright-cli video-chapter "Filling Form" --description="Entering test data" --duration=2000
+playwright-cli fill e2 "test input"
+# Stop and save
+playwright-cli video-stop
+```
+## Best Practices
+### 1. Use Descriptive Filenames
+```bash
+# Include context in filename
+playwright-cli video-start recordings/login-flow-2024-01-15.webm
+playwright-cli video-start recordings/checkout-test-run-42.webm
+```
+### 2. Record Entire Hero Scripts
+When recording a video for the user or as proof of work, it is best to write the scenario as a code snippet and execute it with `run-code`.
+A script lets you insert appropriate pauses between actions and annotate the video. There are new Playwright APIs for that.
+1) Perform the scenario using the CLI and take note of all locators and actions. You'll need those locators to request their bounding boxes for highlighting.
+2) Create a file with the intended script for the video (below). Use `pressSequentially` with a `delay` for natural-looking typing, and add reasonable pauses.
+3) Run it with `playwright-cli run-code --filename your-script.js`.
+**Important**: Overlays are `pointer-events: none` — they do not interfere with page interactions. You can safely keep sticky overlays visible while clicking, filling, or performing any actions on the page.
+```js
+async page => {
+  await page.screencast.start({ path: 'video.webm', size: { width: 1280, height: 800 } });
+  await page.goto('https://demo.playwright.dev/todomvc');
+  // Show a chapter card — blurs the page and shows a dialog.
+  // Blocks until duration expires, then auto-removes.
+  // Use this for simple use cases, but always feel free to hand-craft your own beautiful
+  // overlay via await page.screencast.showOverlay().
+  await page.screencast.showChapter('Adding Todo Items', {
+    description: 'We will add several items to the todo list.',
+    duration: 2000,
+  });
+  // Perform action
+  await page.getByRole('textbox', { name: 'What needs to be done?' }).pressSequentially('Walk the dog', { delay: 60 });
+  await page.getByRole('textbox', { name: 'What needs to be done?' }).press('Enter');
+  await page.waitForTimeout(1000);
+  // Show next chapter
+  await page.screencast.showChapter('Verifying Results', {
+    description: 'Checking the item appeared in the list.',
+    duration: 2000,
+  });
+  // Add a sticky annotation that stays while you perform actions.
+  // Overlays are pointer-events: none, so they won't block clicks.
+  const annotation = await page.screencast.showOverlay(`
+    <div style="position: absolute; top: 8px; right: 8px;
+      padding: 6px 12px; background: rgba(0,0,0,0.7);
+      border-radius: 8px; font-size: 13px; color: white;">
+      ✓ Item added successfully
+    </div>
+  `);
+  // Perform more actions while the annotation is visible
+  await page.getByRole('textbox', { name: 'What needs to be done?' }).pressSequentially('Buy groceries', { delay: 60 });
+  await page.getByRole('textbox', { name: 'What needs to be done?' }).press('Enter');
+  await page.waitForTimeout(1500);
+  // Remove the annotation when done
+  await annotation.dispose();
+  // You can also highlight relevant locators and provide contextual annotations.
+  const bounds = await page.getByText('Walk the dog').boundingBox();
+  await page.screencast.showOverlay(`
+    <div style="position: absolute;
+      top: ${bounds.y}px;
+      left: ${bounds.x}px;
+      width: ${bounds.width}px;
+      height: ${bounds.height}px;
+      border: 1px solid red;">
+    </div>
+    <div style="position: absolute;
+      top: ${bounds.y + bounds.height + 5}px;
+      left: ${bounds.x + bounds.width / 2}px;
+      transform: translateX(-50%);
+      padding: 6px;
+      background: #808080;
+      border-radius: 10px;
+      font-size: 14px;
+      color: white;">Check it out, it is right above this text
+    </div>
+  `, { duration: 2000 });
+  await page.screencast.stop();
+}
+```
+Embrace creativity: overlays are powerful.
+### Overlay API Summary
+| Method | Use Case |
+|--------|----------|
+| `page.screencast.showChapter(title, { description?, duration?, styleSheet? })` | Full-screen chapter card with blurred backdrop — ideal for section transitions |
+| `page.screencast.showOverlay(html, { duration? })` | Custom HTML overlay — use for callouts, labels, highlights |
+| `disposable.dispose()` | Remove a sticky overlay added without duration |
+| `page.screencast.hideOverlays()` / `page.screencast.showOverlays()` | Temporarily hide/show all overlays |
+## Tracing vs Video
+| Feature | Video | Tracing |
+|---------|-------|---------|
+| Output | WebM file | Trace file (viewable in Trace Viewer) |
+| Shows | Visual recording | DOM snapshots, network, console, actions |
+| Use case | Demos, documentation | Debugging, analysis |
+| Size | Larger | Smaller |
+## Limitations
+- Recording adds slight overhead to automation
+- Large recordings can consume significant disk space
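
Reviewer note on the highlight step in `video-recording.md`: Playwright's `locator.boundingBox()` resolves to `null` when the element is not visible, and the script reads `bounds.y` without checking. A minimal sketch of a guard follows, assuming the same markup as the script. `highlightOverlayHtml` is a hypothetical helper name, and the caption is assumed to be trusted text (it is not HTML-escaped).

```typescript
// Shape of a non-null result from Playwright's locator.boundingBox().
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: builds the red frame plus a caption centered below it,
// or returns null so the caller can skip the overlay when there is no box.
// The caption is inserted as-is, so it is assumed to be trusted text.
function highlightOverlayHtml(box: Box | null, caption: string): string | null {
  if (box === null) return null;
  const frame =
    `<div style="position: absolute; top: ${box.y}px; left: ${box.x}px; ` +
    `width: ${box.width}px; height: ${box.height}px; border: 1px solid red;"></div>`;
  const label =
    `<div style="position: absolute; top: ${box.y + box.height + 5}px; ` +
    `left: ${box.x + box.width / 2}px; transform: translateX(-50%); ` +
    `padding: 6px; background: #808080; border-radius: 10px; ` +
    `font-size: 14px; color: white;">${caption}</div>`;
  return frame + label;
}
```

In a recording script, the result would be passed to `page.screencast.showOverlay(...)` (as used in the file above) only when it is not `null`.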

package/dist/builtin/web-access/CHANGELOG.md CHANGED

@@ -6,7 +6,7 @@ All notable changes to this project will be documented in this file.
 ### Changed
-- Aligned the web-access extension peer dependency with upstream pi TUI `^0.79.6` so web-access curator and summary UI surfaces consume the latest shared TUI compatibility fixes; no web-access extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
+- Aligned the web-access extension peer dependency with upstream pi TUI `^0.79.7` so web-access curator and summary UI surfaces consume the latest shared TUI color-scheme, Warp image capability, and compatibility fixes; no web-access extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
 ## [0.8.30] - 2026-06-17
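
Reviewer note on the peer-dependency bump: for `0.x` versions a caret range only floats the patch component, so `^0.79.6` already admitted `0.79.7`, and the change to `^0.79.7` raises the minimum rather than widening the range. The sketch below illustrates that rule under simplifying assumptions. It handles plain `MAJOR.MINOR.PATCH` versions only (no prerelease or build tags) and is not the resolver npm itself uses.

```typescript
type Version = [number, number, number];

// Parses "MAJOR.MINOR.PATCH"; prerelease and build tags are out of scope here.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (!match) throw new Error(`unsupported version: ${text}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative when a < b, zero when equal, positive when a > b.
function compareVersions(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// Exclusive upper bound of a caret range: bump the left-most non-zero component.
function caretUpperBound(floor: Version): Version {
  const [major, minor, patch] = floor;
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

function satisfiesCaret(version: string, range: string): boolean {
  if (!range.startsWith("^")) throw new Error(`not a caret range: ${range}`);
  const floor = parseVersion(range.slice(1));
  const candidate = parseVersion(version);
  return (
    compareVersions(candidate, floor) >= 0 &&
    compareVersions(candidate, caretUpperBound(floor)) < 0
  );
}
```

Under these rules `satisfiesCaret("0.79.6", "^0.79.7")` is false, which is the practical effect of the bump: an install that still resolves pi TUI `0.79.6` no longer satisfies the peer range.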

package/dist/builtin/web-access/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@bastani/web-access",
-  "version": "0.8.31-alpha.1",
+  "version": "0.8.31-alpha.3",
   "private": true,
   "description": "Atomic extension for web search, URL fetching, GitHub repo cloning, PDF/video extraction. Fork of: https://github.com/nicobailon/pi-web-access",
   "contributors": [
@@ -30,7 +30,7 @@
   },
   "peerDependencies": {
     "@bastani/atomic": "*",
-    "@earendil-works/pi-tui": "^0.79.6"
+    "@earendil-works/pi-tui": "^0.79.7"
   },
   "peerDependenciesMeta": {
     "@bastani/atomic": {

package/dist/builtin/workflows/CHANGELOG.md CHANGED

@@ -6,15 +6,21 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 ## [Unreleased]
+### Breaking Changes
+- Renamed the builtin `open-claude-design` workflow output `browse_cli_status` to `playwright_cli_status` as part of migrating the workflow's preview/review tooling from the removed `browse` CLI to the `playwright-cli` command. Update any workflow-composition consumers that read `browse_cli_status`.
 ### Added
+- Added a QA end-to-end proof video to the builtin `ralph` workflow. For UI-applicable or full-stack changes, the orchestrator now runs a `playwright-cli` end-to-end QA pass that drives the running app like a user, records a reviewable video (`playwright-cli video-start`/`video-stop`) to a stable run path, references it in the implementation notes (`## QA E2E Video`), and exposes it as the new optional `qa_video_path` output so the proof is available when the orchestrator finishes. When `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review (embedding/linking where the provider supports media uploads, otherwise surfacing the absolute path). When no user-visible UI scenario applies, no video is produced and the notes record why.
 - Added a per-model context-window authoring token to workflow model strings: a parenthesized size token placed in the model-name portion, *before* the optional `:reasoning` suffix, e.g. `github-copilot/claude-opus-4.8 (1m):xhigh`. Adopting GitHub Copilot's `Claude Opus 4.8 (1M context)` naming convention keeps the window separate from the reasoning level so the two never collide. The token is resolved against the candidate model's advertised windows — an exact match wins, otherwise the largest supported window not exceeding the request (so `(1m)` selects a model's ~936K long-context tier), and it falls back to the model's default (short) window when no larger tier is available. It applies only to the candidate that carries the token, leaving primary and other fallback models untouched. Also surfaced `contextWindow`/`contextWindowStrict` on `StageOptions` and the workflow tool's direct-task schema for stage-level selection.
 ### Changed
+- Changed the builtin `ralph`, `goal`, and `open-claude-design` workflows and the shared end-to-end verification guidance to drive browsers through the `playwright-cli` skill and `playwright-cli` command instead of the removed `browser` skill / `browse` CLI. Ralph/goal subagents now verify web and full-stack flows with `skill: "playwright-cli"`, and `open-claude-design`'s deterministic setup step now ensures `playwright-cli` (`npm install -g @playwright/cli@latest`) instead of `browse`, with every preview/review stage prompt updated to `playwright-cli open`/`snapshot`/`screenshot --filename`/`resize`/`show --annotate`.
 - Changed the builtin `ralph` workflow review fan-out from two reviewers to three independent reviewers, each running on a different primary model family (Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro) with shared fallbacks, so the adversarial review gets cross-model coverage instead of repeated passes from one model. The review loop stops only when all three reviewers independently approve (find no issues), so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. Also strengthened the orchestrator's implementation-notes contract to require verifiable evidence for any claims recorded in the notes and reviewer artifacts.
 - Changed the builtin `deep-research-codebase`, `goal`, `ralph`, and `open-claude-design` workflows to run their GitHub Copilot `claude-opus-4.8` fallbacks at the model's largest advertised long-context (~1M/936K) window via the new `(1m)` token, automatically degrading to the 200K short window when Copilot's long-context tier is unavailable. Other models in each fallback chain are unaffected.
-- Aligned the workflows extension peer dependency with upstream pi TUI `^0.79.6` so workflow graph, custom UI, and prompt-broker integrations consume the latest shared TUI compatibility fixes; no workflows extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
+- Aligned the workflows extension peer dependency with upstream pi TUI `^0.79.7` so workflow graph, custom UI, and prompt-broker integrations consume the latest shared TUI color-scheme, Warp image capability, and compatibility fixes; no workflows extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
 ## [0.8.30] - 2026-06-17
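
Reviewer note on the context-window token described in this changelog: the resolution order (exact match, else the largest advertised window not exceeding the request, else the model's default window) can be sketched as below. This is an illustration of the prose only, not the package's implementation. The function names, the regular expression, and the reading of `k`/`m` as 1,000 and 1,000,000 tokens are all assumptions.

```typescript
interface ParsedModel {
  model: string;          // e.g. "github-copilot/claude-opus-4.8"
  contextWindow?: number; // requested size in tokens, if a "(1m)"-style token is present
  reasoning?: string;     // e.g. "xhigh"
}

// Splits "provider/model (SIZE):reasoning"; the size token and the reasoning
// suffix are both optional. Assumes k = 1,000 and m = 1,000,000 tokens.
function parseModelString(input: string): ParsedModel {
  const pattern = /^(.+?)(?:\s*\((\d+(?:\.\d+)?)([km])\))?(?::([a-z]+))?$/i;
  const match = pattern.exec(input.trim());
  if (!match) throw new Error(`unparseable model string: ${input}`);
  const parsed: ParsedModel = { model: match[1] };
  if (match[2] !== undefined && match[3] !== undefined) {
    const unit = match[3].toLowerCase() === "m" ? 1000000 : 1000;
    parsed.contextWindow = Number(match[2]) * unit;
  }
  if (match[4] !== undefined) parsed.reasoning = match[4];
  return parsed;
}

// Exact match wins; otherwise the largest advertised window that does not
// exceed the request; otherwise the model's default (short) window.
function resolveContextWindow(
  requested: number | undefined,
  advertised: number[],
  defaultWindow: number,
): number {
  if (requested === undefined) return defaultWindow;
  if (advertised.includes(requested)) return requested;
  const fitting = advertised.filter((size) => size <= requested);
  return fitting.length > 0 ? Math.max(...fitting) : defaultWindow;
}
```

With advertised windows of 200,000 and 936,000 tokens, a `(1m)` request resolves to 936,000. With only the 200,000 window advertised, it degrades to 200,000, matching the fallback behavior the changelog describes.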

package/dist/builtin/workflows/README.md CHANGED

@@ -591,7 +591,7 @@ Child workflow outputs: `result`, `findings`, `research_doc_path`, `artifact_dir
 ### `goal`
-Goal Runner workflow: initialize a persisted goal ledger with a per-run goal id and lifecycle events, render goal-continuation context, run bounded worker LM turns, append receipts, run three independent reviewers, and let a TypeScript reducer decide `complete`, `continue`, `blocked`, or `needs_human`. Workers and reviewers are prompted to verify user-visible behavior end-to-end when practical with browser-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. Token budget behavior is intentionally excluded.
+Goal Runner workflow: initialize a persisted goal ledger with a per-run goal id and lifecycle events, render goal-continuation context, run bounded worker LM turns, append receipts, run three independent reviewers, and let a TypeScript reducer decide `complete`, `continue`, `blocked`, or `needs_human`. Workers and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. Token budget behavior is intentionally excluded.
 ```text
 /workflow goal objective="Migrate the database layer to Drizzle ORM" base_branch=develop
@@ -609,7 +609,7 @@ Child workflow outputs: `result`, `status`, `approved`, `goal_id`, `objective`,
 ### `ralph`
-Prompt-engineering → research → orchestrate → review workflow with optional final-stage PR handoff: transform the user prompt into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with browser-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
+Prompt-engineering → research → orchestrate → review workflow with optional final-stage PR handoff: transform the user prompt into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video, references it in the implementation notes, and exposes it as the `qa_video_path` output; when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
 ```text
 /workflow ralph prompt="Migrate the database layer to Drizzle ORM" max_loops=3 base_branch=develop
@@ -624,7 +624,7 @@ Prompt-engineering → research → orchestrate → review workflow with optiona
 | `git_worktree_dir` | `string`  | —        | `""`          | Optional reusable Git worktree root. Empty runs in the invoking checkout; non-empty values run Ralph stages in the created/reused worktree. |
 | `create_pr`        | `boolean` | —        | `false`       | Safe-by-default PR creation flag. Omitted or `false` skips the final `pull-request` stage and omits `pr_report`; prompt text alone does not opt in, and only strict `true` authorizes the final `pull-request` stage to attempt provider-appropriate PR/MR/review creation. |
-Child workflow outputs: `result`, `plan` (latest transformed research question), `plan_path` (compatibility alias for `research_path`), `research`, `research_path`, `implementation_notes_path`, `approved`, `iterations_completed`, `review_report`, and `review_report_path`. `pr_report` is included only when `create_pr=true` and the final `pull-request` stage runs.
+Child workflow outputs: `result`, `plan` (latest transformed research question), `plan_path` (compatibility alias for `research_path`), `research`, `research_path`, `implementation_notes_path`, `qa_video_path` (reviewable QA end-to-end proof video recorded with `playwright-cli` for UI-applicable changes, when produced), `approved`, `iterations_completed`, `review_report`, and `review_report_path`. `pr_report` is included only when `create_pr=true` and the final `pull-request` stage runs.
 ### `open-claude-design`
@@ -642,7 +642,7 @@ Design-system onboarding → reference import → generation → refinement →
 | `design_system`   | `text`   | —        | —           | Existing design-system reference / Design.md path.                   |
 | `max_refinements` | `number` | —        | `3`         | Maximum critique/apply refinement iterations.                        |
-Child workflow outputs: `output_type`, `design_system`, `artifact`, `handoff`, `approved_for_export`, `refinements_completed`, `import_context`, `run_id`, `artifact_dir`, `preview_path`, `preview_file_url`, `spec_path`, and `spec_file_url`. `open-claude-design` has no `result` output; it exposes only the declared fields listed here.
+Child workflow outputs: `output_type`, `design_system`, `artifact`, `handoff`, `approved_for_export`, `refinements_completed`, `import_context`, `run_id`, `artifact_dir`, `preview_path`, `preview_file_url`, `spec_path`, `spec_file_url`, and `playwright_cli_status`. `open-claude-design` has no `result` output; it exposes only the declared fields listed here.
 ---
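
Reviewer note for workflow-composition consumers: the `browse_cli_status` output is renamed to `playwright_cli_status`, and `qa_video_path` is only present when a video was produced. A defensive reader might look like the sketch below. The `WorkflowOutputs` record shape, the helper names, and the assumption that both values are strings are hypothetical and not confirmed by the package.

```typescript
// Hypothetical shape: child workflow outputs as a plain string-keyed record.
type WorkflowOutputs = Record<string, unknown>;

// Prefers the new output name and falls back to the pre-rename name so a
// consumer keeps working against runs from either package version.
function readCliStatus(outputs: WorkflowOutputs): string | undefined {
  const current = outputs["playwright_cli_status"];
  if (typeof current === "string") return current;
  const legacy = outputs["browse_cli_status"];
  return typeof legacy === "string" ? legacy : undefined;
}

// `qa_video_path` is optional: treat a missing or empty value as "no video".
function readQaVideoPath(outputs: WorkflowOutputs): string | undefined {
  const value = outputs["qa_video_path"];
  return typeof value === "string" && value.length > 0 ? value : undefined;
}
```

The legacy fallback is only useful during a transition. Once every producer is on the renamed output, the `browse_cli_status` branch can be dropped.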