@bastani/atomic 0.8.31-alpha.1 → 0.8.31-alpha.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (148) hide show
  1. package/CHANGELOG.md +17 -5
  2. package/README.md +12 -10
  3. package/dist/builtin/cursor/CHANGELOG.md +1 -1
  4. package/dist/builtin/cursor/package.json +2 -2
  5. package/dist/builtin/intercom/CHANGELOG.md +1 -1
  6. package/dist/builtin/intercom/package.json +2 -2
  7. package/dist/builtin/mcp/CHANGELOG.md +1 -1
  8. package/dist/builtin/mcp/package.json +3 -3
  9. package/dist/builtin/subagents/CHANGELOG.md +10 -1
  10. package/dist/builtin/subagents/agents/codebase-online-researcher.md +8 -8
  11. package/dist/builtin/subagents/agents/debugger.md +6 -6
  12. package/dist/builtin/subagents/package.json +4 -4
  13. package/dist/builtin/subagents/skills/effective-liteparse/SKILL.md +118 -0
  14. package/dist/builtin/subagents/skills/effective-liteparse/scripts/search.py +128 -0
  15. package/dist/builtin/subagents/skills/playwright-cli/SKILL.md +404 -0
  16. package/dist/builtin/subagents/skills/playwright-cli/references/element-attributes.md +23 -0
  17. package/dist/builtin/subagents/skills/playwright-cli/references/playwright-tests.md +39 -0
  18. package/dist/builtin/subagents/skills/playwright-cli/references/request-mocking.md +87 -0
  19. package/dist/builtin/subagents/skills/playwright-cli/references/running-code.md +241 -0
  20. package/dist/builtin/subagents/skills/playwright-cli/references/session-management.md +225 -0
  21. package/dist/builtin/subagents/skills/playwright-cli/references/spec-driven-testing.md +305 -0
  22. package/dist/builtin/subagents/skills/playwright-cli/references/storage-state.md +275 -0
  23. package/dist/builtin/subagents/skills/playwright-cli/references/test-generation.md +134 -0
  24. package/dist/builtin/subagents/skills/playwright-cli/references/tracing.md +139 -0
  25. package/dist/builtin/subagents/skills/playwright-cli/references/video-recording.md +143 -0
  26. package/dist/builtin/web-access/CHANGELOG.md +1 -1
  27. package/dist/builtin/web-access/package.json +2 -2
  28. package/dist/builtin/workflows/CHANGELOG.md +7 -1
  29. package/dist/builtin/workflows/README.md +4 -4
  30. package/dist/builtin/workflows/builtin/open-claude-design.ts +59 -56
  31. package/dist/builtin/workflows/builtin/ralph.ts +56 -3
  32. package/dist/builtin/workflows/builtin/shared-prompts.ts +1 -1
  33. package/dist/builtin/workflows/package.json +2 -2
  34. package/dist/builtin/workflows/skills/research-codebase/SKILL.md +1 -1
  35. package/dist/cli/args.d.ts.map +1 -1
  36. package/dist/cli/args.js +1 -1
  37. package/dist/cli/args.js.map +1 -1
  38. package/dist/core/agent-session.d.ts +1 -0
  39. package/dist/core/agent-session.d.ts.map +1 -1
  40. package/dist/core/agent-session.js +49 -21
  41. package/dist/core/agent-session.js.map +1 -1
  42. package/dist/core/context-window.d.ts +26 -1
  43. package/dist/core/context-window.d.ts.map +1 -1
  44. package/dist/core/context-window.js +30 -6
  45. package/dist/core/context-window.js.map +1 -1
  46. package/dist/core/copilot-model-catalog.d.ts +39 -21
  47. package/dist/core/copilot-model-catalog.d.ts.map +1 -1
  48. package/dist/core/copilot-model-catalog.js +44 -16
  49. package/dist/core/copilot-model-catalog.js.map +1 -1
  50. package/dist/core/model-registry.d.ts.map +1 -1
  51. package/dist/core/model-registry.js +6 -4
  52. package/dist/core/model-registry.js.map +1 -1
  53. package/dist/core/project-trust.d.ts.map +1 -1
  54. package/dist/core/project-trust.js +2 -1
  55. package/dist/core/project-trust.js.map +1 -1
  56. package/dist/core/sdk.d.ts.map +1 -1
  57. package/dist/core/sdk.js +18 -7
  58. package/dist/core/sdk.js.map +1 -1
  59. package/dist/core/settings-manager.d.ts +11 -2
  60. package/dist/core/settings-manager.d.ts.map +1 -1
  61. package/dist/core/settings-manager.js +62 -8
  62. package/dist/core/settings-manager.js.map +1 -1
  63. package/dist/core/system-prompt.d.ts.map +1 -1
  64. package/dist/core/system-prompt.js +1 -0
  65. package/dist/core/system-prompt.js.map +1 -1
  66. package/dist/core/tools/edit-diff.d.ts +1 -2
  67. package/dist/core/tools/edit-diff.d.ts.map +1 -1
  68. package/dist/core/tools/edit-diff.js +1 -2
  69. package/dist/core/tools/edit-diff.js.map +1 -1
  70. package/dist/index.d.ts +2 -1
  71. package/dist/index.d.ts.map +1 -1
  72. package/dist/index.js +2 -1
  73. package/dist/index.js.map +1 -1
  74. package/dist/modes/interactive/components/config-selector.d.ts.map +1 -1
  75. package/dist/modes/interactive/components/config-selector.js +5 -7
  76. package/dist/modes/interactive/components/config-selector.js.map +1 -1
  77. package/dist/modes/interactive/components/model-selector.d.ts.map +1 -1
  78. package/dist/modes/interactive/components/model-selector.js +2 -1
  79. package/dist/modes/interactive/components/model-selector.js.map +1 -1
  80. package/dist/modes/interactive/components/scoped-models-selector.d.ts.map +1 -1
  81. package/dist/modes/interactive/components/scoped-models-selector.js +4 -1
  82. package/dist/modes/interactive/components/scoped-models-selector.js.map +1 -1
  83. package/dist/modes/interactive/components/settings-selector.d.ts +2 -0
  84. package/dist/modes/interactive/components/settings-selector.d.ts.map +1 -1
  85. package/dist/modes/interactive/components/settings-selector.js +165 -15
  86. package/dist/modes/interactive/components/settings-selector.js.map +1 -1
  87. package/dist/modes/interactive/components/tree-selector.d.ts.map +1 -1
  88. package/dist/modes/interactive/components/tree-selector.js +44 -4
  89. package/dist/modes/interactive/components/tree-selector.js.map +1 -1
  90. package/dist/modes/interactive/interactive-mode.d.ts +1 -1
  91. package/dist/modes/interactive/interactive-mode.d.ts.map +1 -1
  92. package/dist/modes/interactive/interactive-mode.js +24 -54
  93. package/dist/modes/interactive/interactive-mode.js.map +1 -1
  94. package/dist/modes/interactive/model-search.d.ts +7 -0
  95. package/dist/modes/interactive/model-search.d.ts.map +1 -0
  96. package/dist/modes/interactive/model-search.js +6 -0
  97. package/dist/modes/interactive/model-search.js.map +1 -0
  98. package/dist/modes/interactive/theme/theme-controller.d.ts +30 -0
  99. package/dist/modes/interactive/theme/theme-controller.d.ts.map +1 -0
  100. package/dist/modes/interactive/theme/theme-controller.js +108 -0
  101. package/dist/modes/interactive/theme/theme-controller.js.map +1 -0
  102. package/dist/modes/interactive/theme/theme-schema.json +2 -1
  103. package/dist/modes/interactive/theme/theme.d.ts +5 -0
  104. package/dist/modes/interactive/theme/theme.d.ts.map +1 -1
  105. package/dist/modes/interactive/theme/theme.js +70 -29
  106. package/dist/modes/interactive/theme/theme.js.map +1 -1
  107. package/dist/modes/rpc/rpc-client.d.ts +1 -1
  108. package/dist/modes/rpc/rpc-client.d.ts.map +1 -1
  109. package/dist/modes/rpc/rpc-client.js +1 -1
  110. package/dist/modes/rpc/rpc-client.js.map +1 -1
  111. package/dist/modes/rpc/rpc-mode.d.ts.map +1 -1
  112. package/dist/modes/rpc/rpc-mode.js +1 -1
  113. package/dist/modes/rpc/rpc-mode.js.map +1 -1
  114. package/dist/package-manager-cli.d.ts.map +1 -1
  115. package/dist/package-manager-cli.js +39 -9
  116. package/dist/package-manager-cli.js.map +1 -1
  117. package/docs/extensions.md +21 -0
  118. package/docs/models.md +3 -3
  119. package/docs/packages.md +13 -9
  120. package/docs/providers.md +3 -3
  121. package/docs/quickstart.md +14 -0
  122. package/docs/rpc.md +3 -3
  123. package/docs/sdk.md +15 -11
  124. package/docs/session-format.md +1 -1
  125. package/docs/settings.md +8 -3
  126. package/docs/themes.md +3 -1
  127. package/docs/tui.md +1 -1
  128. package/docs/usage.md +12 -9
  129. package/docs/workflows.md +9 -7
  130. package/examples/extensions/custom-provider-anthropic/package-lock.json +2 -2
  131. package/examples/extensions/custom-provider-anthropic/package.json +1 -1
  132. package/examples/extensions/custom-provider-gitlab-duo/package.json +1 -1
  133. package/examples/extensions/gondolin/package-lock.json +2 -2
  134. package/examples/extensions/gondolin/package.json +1 -1
  135. package/examples/extensions/preset.ts +10 -4
  136. package/examples/extensions/provider-payload.ts +5 -5
  137. package/examples/extensions/sandbox/index.ts +2 -2
  138. package/examples/extensions/sandbox/package-lock.json +3 -3
  139. package/examples/extensions/sandbox/package.json +2 -2
  140. package/examples/extensions/subagent/agents.ts +2 -2
  141. package/examples/extensions/subagent/index.ts +4 -2
  142. package/examples/extensions/with-deps/package-lock.json +2 -2
  143. package/examples/extensions/with-deps/package.json +1 -1
  144. package/package.json +5 -5
  145. package/dist/builtin/subagents/skills/browser/EXAMPLES.md +0 -151
  146. package/dist/builtin/subagents/skills/browser/LICENSE.txt +0 -21
  147. package/dist/builtin/subagents/skills/browser/REFERENCE.md +0 -451
  148. package/dist/builtin/subagents/skills/browser/SKILL.md +0 -170
@@ -0,0 +1,139 @@
1
+ # Tracing
2
+
3
+ Capture detailed execution traces for debugging and analysis. Traces include DOM snapshots, screenshots, network activity, and console logs.
4
+
5
+ ## Basic Usage
6
+
7
+ ```bash
8
+ # Start trace recording
9
+ playwright-cli tracing-start
10
+
11
+ # Perform actions
12
+ playwright-cli open https://example.com
13
+ playwright-cli click e1
14
+ playwright-cli fill e2 "test"
15
+
16
+ # Stop trace recording
17
+ playwright-cli tracing-stop
18
+ ```
19
+
20
+ ## Trace Output Files
21
+
22
+ When you start tracing, Playwright creates a `traces/` directory with several files:
23
+
24
+ ### `trace-{timestamp}.trace`
25
+
26
+ **Action log** - The main trace file containing:
27
+ - Every action performed (clicks, fills, navigations)
28
+ - DOM snapshots before and after each action
29
+ - Screenshots at each step
30
+ - Timing information
31
+ - Console messages
32
+ - Source locations
33
+
34
+ ### `trace-{timestamp}.network`
35
+
36
+ **Network log** - Complete network activity:
37
+ - All HTTP requests and responses
38
+ - Request headers and bodies
39
+ - Response headers and bodies
40
+ - Timing (DNS, connect, TLS, TTFB, download)
41
+ - Resource sizes
42
+ - Failed requests and errors
43
+
44
+ ### `resources/`
45
+
46
+ **Resources directory** - Cached resources:
47
+ - Images, fonts, stylesheets, scripts
48
+ - Response bodies for replay
49
+ - Assets needed to reconstruct page state
50
+
51
+ ## What Traces Capture
52
+
53
+ | Category | Details |
54
+ |----------|---------|
55
+ | **Actions** | Clicks, fills, hovers, keyboard input, navigations |
56
+ | **DOM** | Full DOM snapshot before/after each action |
57
+ | **Screenshots** | Visual state at each step |
58
+ | **Network** | All requests, responses, headers, bodies, timing |
59
+ | **Console** | All console.log, warn, error messages |
60
+ | **Timing** | Precise timing for each operation |
61
+
62
+ ## Use Cases
63
+
64
+ ### Debugging Failed Actions
65
+
66
+ ```bash
67
+ playwright-cli tracing-start
68
+ playwright-cli open https://app.example.com
69
+
70
+ # This click fails - why?
71
+ playwright-cli click e5
72
+
73
+ playwright-cli tracing-stop
74
+ # Open trace to see DOM state when click was attempted
75
+ ```
76
+
77
+ ### Analyzing Performance
78
+
79
+ ```bash
80
+ playwright-cli tracing-start
81
+ playwright-cli open https://slow-site.com
82
+ playwright-cli tracing-stop
83
+
84
+ # View network waterfall to identify slow resources
85
+ ```
86
+
87
+ ### Capturing Evidence
88
+
89
+ ```bash
90
+ # Record a complete user flow for documentation
91
+ playwright-cli tracing-start
92
+
93
+ playwright-cli open https://app.example.com/checkout
94
+ playwright-cli fill e1 "4111111111111111"
95
+ playwright-cli fill e2 "12/25"
96
+ playwright-cli fill e3 "123"
97
+ playwright-cli click e4
98
+
99
+ playwright-cli tracing-stop
100
+ # Trace shows exact sequence of events
101
+ ```
102
+
103
+ ## Trace vs Video vs Screenshot
104
+
105
+ | Feature | Trace | Video | Screenshot |
106
+ |---------|-------|-------|------------|
107
+ | **Format** | .trace file | .webm video | .png/.jpeg image |
108
+ | **DOM inspection** | Yes | No | No |
109
+ | **Network details** | Yes | No | No |
110
+ | **Step-by-step replay** | Yes | Continuous | Single frame |
111
+ | **File size** | Medium | Large | Small |
112
+ | **Best for** | Debugging | Demos | Quick capture |
113
+
114
+ ## Best Practices
115
+
116
+ ### 1. Start Tracing Before the Problem
117
+
118
+ ```bash
119
+ # Trace the entire flow, not just the failing step
120
+ playwright-cli tracing-start
121
+ playwright-cli open https://example.com
122
+ # ... all steps leading to the issue ...
123
+ playwright-cli tracing-stop
124
+ ```
125
+
126
+ ### 2. Clean Up Old Traces
127
+
128
+ Traces can consume significant disk space:
129
+
130
+ ```bash
131
+ # Remove traces older than 7 days
132
+ find .playwright-cli/traces -mtime +7 -delete
133
+ ```
134
+
135
+ ## Limitations
136
+
137
+ - Traces add overhead to automation
138
+ - Large traces can consume significant disk space
139
+ - Some dynamic content may not replay perfectly
@@ -0,0 +1,143 @@
1
+ # Video Recording
2
+
3
+ Capture browser automation sessions as video for debugging, documentation, or verification. Produces WebM (VP8/VP9 codec).
4
+
5
+ ## Basic Recording
6
+
7
+ ```bash
8
+ # Open browser first
9
+ playwright-cli open
10
+
11
+ # Start recording
12
+ playwright-cli video-start demo.webm
13
+
14
+ # Add a chapter marker for section transitions
15
+ playwright-cli video-chapter "Getting Started" --description="Opening the homepage" --duration=2000
16
+
17
+ # Navigate and perform actions
18
+ playwright-cli goto https://example.com
19
+ playwright-cli snapshot
20
+ playwright-cli click e1
21
+
22
+ # Add another chapter
23
+ playwright-cli video-chapter "Filling Form" --description="Entering test data" --duration=2000
24
+ playwright-cli fill e2 "test input"
25
+
26
+ # Stop and save
27
+ playwright-cli video-stop
28
+ ```
29
+
30
+ ## Best Practices
31
+
32
+ ### 1. Use Descriptive Filenames
33
+
34
+ ```bash
35
+ # Include context in filename
36
+ playwright-cli video-start recordings/login-flow-2024-01-15.webm
37
+ playwright-cli video-start recordings/checkout-test-run-42.webm
38
+ ```
39
+
40
+ ### 2. Record entire hero scripts.
41
+
42
+ When recording a video for the user or as a proof of work, it is best to create a code snippet and execute it with run-code.
43
+ It allows inserting appropriate pauses between the actions and annotating the video. There are new Playwright APIs for that.
44
+
45
+ 1) Perform scenario using CLI and take note of all locators and actions. You'll need those locators to request their bounding boxes for highlight.
46
+ 2) Create a file with the intended script for video (below). Use pressSequentially w/ delay for nice typing, make reasonable pauses.
47
+ 3) Use playwright-cli run-code --filename your-script.js
48
+
49
+ **Important**: Overlays are `pointer-events: none` — they do not interfere with page interactions. You can safely keep sticky overlays visible while clicking, filling, or performing any actions on the page.
50
+
51
+ ```js
52
+ async page => {
53
+ await page.screencast.start({ path: 'video.webm', size: { width: 1280, height: 800 } });
54
+ await page.goto('https://demo.playwright.dev/todomvc');
55
+
56
+ // Show a chapter card — blurs the page and shows a dialog.
57
+ // Blocks until duration expires, then auto-removes.
58
+ // Use this for simple use cases, but always feel free to hand-craft your own beautiful
59
+ // overlay via await page.screencast.showOverlay().
60
+ await page.screencast.showChapter('Adding Todo Items', {
61
+ description: 'We will add several items to the todo list.',
62
+ duration: 2000,
63
+ });
64
+
65
+ // Perform action
66
+ await page.getByRole('textbox', { name: 'What needs to be done?' }).pressSequentially('Walk the dog', { delay: 60 });
67
+ await page.getByRole('textbox', { name: 'What needs to be done?' }).press('Enter');
68
+ await page.waitForTimeout(1000);
69
+
70
+ // Show next chapter
71
+ await page.screencast.showChapter('Verifying Results', {
72
+ description: 'Checking the item appeared in the list.',
73
+ duration: 2000,
74
+ });
75
+
76
+ // Add a sticky annotation that stays while you perform actions.
77
+ // Overlays are pointer-events: none, so they won't block clicks.
78
+ const annotation = await page.screencast.showOverlay(`
79
+ <div style="position: absolute; top: 8px; right: 8px;
80
+ padding: 6px 12px; background: rgba(0,0,0,0.7);
81
+ border-radius: 8px; font-size: 13px; color: white;">
82
+ ✓ Item added successfully
83
+ </div>
84
+ `);
85
+
86
+ // Perform more actions while the annotation is visible
87
+ await page.getByRole('textbox', { name: 'What needs to be done?' }).pressSequentially('Buy groceries', { delay: 60 });
88
+ await page.getByRole('textbox', { name: 'What needs to be done?' }).press('Enter');
89
+ await page.waitForTimeout(1500);
90
+
91
+ // Remove the annotation when done
92
+ await annotation.dispose();
93
+
94
+ // You can also highlight relevant locators and provide contextual annotations.
95
+ const bounds = await page.getByText('Walk the dog').boundingBox();
96
+ await page.screencast.showOverlay(`
97
+ <div style="position: absolute;
98
+ top: ${bounds.y}px;
99
+ left: ${bounds.x}px;
100
+ width: ${bounds.width}px;
101
+ height: ${bounds.height}px;
102
+ border: 1px solid red;">
103
+ </div>
104
+ <div style="position: absolute;
105
+ top: ${bounds.y + bounds.height + 5}px;
106
+ left: ${bounds.x + bounds.width / 2}px;
107
+ transform: translateX(-50%);
108
+ padding: 6px;
109
+ background: #808080;
110
+ border-radius: 10px;
111
+ font-size: 14px;
112
+ color: white;">Check it out, it is right above this text
113
+ </div>
114
+ `, { duration: 2000 });
115
+
116
+ await page.screencast.stop();
117
+ }
118
+ ```
119
+
120
+ Embrace creativity, overlays are powerful.
121
+
122
+ ### Overlay API Summary
123
+
124
+ | Method | Use Case |
125
+ |--------|----------|
126
+ | `page.screencast.showChapter(title, { description?, duration?, styleSheet? })` | Full-screen chapter card with blurred backdrop — ideal for section transitions |
127
+ | `page.screencast.showOverlay(html, { duration? })` | Custom HTML overlay — use for callouts, labels, highlights |
128
+ | `disposable.dispose()` | Remove a sticky overlay added without duration |
129
+ | `page.screencast.hideOverlays()` / `page.screencast.showOverlays()` | Temporarily hide/show all overlays |
130
+
131
+ ## Tracing vs Video
132
+
133
+ | Feature | Video | Tracing |
134
+ |---------|-------|---------|
135
+ | Output | WebM file | Trace file (viewable in Trace Viewer) |
136
+ | Shows | Visual recording | DOM snapshots, network, console, actions |
137
+ | Use case | Demos, documentation | Debugging, analysis |
138
+ | Size | Larger | Smaller |
139
+
140
+ ## Limitations
141
+
142
+ - Recording adds slight overhead to automation
143
+ - Large recordings can consume significant disk space
@@ -6,7 +6,7 @@ All notable changes to this project will be documented in this file.
6
6
 
7
7
  ### Changed
8
8
 
9
- - Aligned the web-access extension peer dependency with upstream pi TUI `^0.79.6` so web-access curator and summary UI surfaces consume the latest shared TUI compatibility fixes; no web-access extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
9
+ - Aligned the web-access extension peer dependency with upstream pi TUI `^0.79.7` so web-access curator and summary UI surfaces consume the latest shared TUI color-scheme, Warp image capability, and compatibility fixes; no web-access extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
10
10
 
11
11
  ## [0.8.30] - 2026-06-17
12
12
 
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@bastani/web-access",
3
- "version": "0.8.31-alpha.1",
3
+ "version": "0.8.31-alpha.3",
4
4
  "private": true,
5
5
  "description": "Atomic extension for web search, URL fetching, GitHub repo cloning, PDF/video extraction. Fork of: https://github.com/nicobailon/pi-web-access",
6
6
  "contributors": [
@@ -30,7 +30,7 @@
30
30
  },
31
31
  "peerDependencies": {
32
32
  "@bastani/atomic": "*",
33
- "@earendil-works/pi-tui": "^0.79.6"
33
+ "@earendil-works/pi-tui": "^0.79.7"
34
34
  },
35
35
  "peerDependenciesMeta": {
36
36
  "@bastani/atomic": {
@@ -6,15 +6,21 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
 
7
7
  ## [Unreleased]
8
8
 
9
+ ### Breaking Changes
10
+
11
+ - Renamed the builtin `open-claude-design` workflow output `browse_cli_status` to `playwright_cli_status` as part of migrating the workflow's preview/review tooling from the removed `browse` CLI to the `playwright-cli` command. Update any workflow-composition consumers that read `browse_cli_status`.
12
+
9
13
  ### Added
10
14
 
15
+ - Added a QA end-to-end proof video to the builtin `ralph` workflow. For UI-applicable or full-stack changes, the orchestrator now runs a `playwright-cli` end-to-end QA pass that drives the running app like a user, records a reviewable video (`playwright-cli video-start`/`video-stop`) to a stable run path, references it in the implementation notes (`## QA E2E Video`), and exposes it as the new optional `qa_video_path` output so the proof is available when the orchestrator finishes. When `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review (embedding/linking where the provider supports media uploads, otherwise surfacing the absolute path). When no user-visible UI scenario applies, no video is produced and the notes record why.
11
16
  - Added a per-model context-window authoring token to workflow model strings: a parenthesized size token placed in the model-name portion, *before* the optional `:reasoning` suffix, e.g. `github-copilot/claude-opus-4.8 (1m):xhigh`. Adopting GitHub Copilot's `Claude Opus 4.8 (1M context)` naming convention keeps the window separate from the reasoning level so the two never collide. The token is resolved against the candidate model's advertised windows — an exact match wins, otherwise the largest supported window not exceeding the request (so `(1m)` selects a model's ~936K long-context tier), and it falls back to the model's default (short) window when no larger tier is available. It applies only to the candidate that carries the token, leaving primary and other fallback models untouched. Also surfaced `contextWindow`/`contextWindowStrict` on `StageOptions` and the workflow tool's direct-task schema for stage-level selection.
12
17
 
13
18
  ### Changed
14
19
 
20
+ - Changed the builtin `ralph`, `goal`, and `open-claude-design` workflows and the shared end-to-end verification guidance to drive browsers through the `playwright-cli` skill and `playwright-cli` command instead of the removed `browser` skill / `browse` CLI. Ralph/goal subagents now verify web and full-stack flows with `skill: "playwright-cli"`, and `open-claude-design`'s deterministic setup step now ensures `playwright-cli` (`npm install -g @playwright/cli@latest`) instead of `browse`, with every preview/review stage prompt updated to `playwright-cli open`/`snapshot`/`screenshot --filename`/`resize`/`show --annotate`.
15
21
  - Changed the builtin `ralph` workflow review fan-out from two reviewers to three independent reviewers, each running on a different primary model family (Claude Fable 5, GPT-5.5 Codex, and Gemini 3.1 Pro) with shared fallbacks, so the adversarial review gets cross-model coverage instead of repeated passes from one model. The review loop stops only when all three reviewers independently approve (find no issues), so a P0–P3 finding from any single reviewer keeps Ralph iterating instead of being out-voted by a majority quorum. Also strengthened the orchestrator's implementation-notes contract to require verifiable evidence for any claims recorded in the notes and reviewer artifacts.
16
22
  - Changed the builtin `deep-research-codebase`, `goal`, `ralph`, and `open-claude-design` workflows to run their GitHub Copilot `claude-opus-4.8` fallbacks at the model's largest advertised long-context (~1M/936K) window via the new `(1m)` token, automatically degrading to the 200K short window when Copilot's long-context tier is unavailable. Other models in each fallback chain are unaffected.
17
- - Aligned the workflows extension peer dependency with upstream pi TUI `^0.79.6` so workflow graph, custom UI, and prompt-broker integrations consume the latest shared TUI compatibility fixes; no workflows extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
23
+ - Aligned the workflows extension peer dependency with upstream pi TUI `^0.79.7` so workflow graph, custom UI, and prompt-broker integrations consume the latest shared TUI color-scheme, Warp image capability, and compatibility fixes; no workflows extension code changes were made for this metadata sync ([#1413](https://github.com/bastani-inc/atomic/issues/1413)).
18
24
 
19
25
  ## [0.8.30] - 2026-06-17
20
26
 
@@ -591,7 +591,7 @@ Child workflow outputs: `result`, `findings`, `research_doc_path`, `artifact_dir
591
591
 
592
592
  ### `goal`
593
593
 
594
- Goal Runner workflow: initialize a persisted goal ledger with a per-run goal id and lifecycle events, render goal-continuation context, run bounded worker LM turns, append receipts, run three independent reviewers, and let a TypeScript reducer decide `complete`, `continue`, `blocked`, or `needs_human`. Workers and reviewers are prompted to verify user-visible behavior end-to-end when practical with browser-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. Token budget behavior is intentionally excluded.
594
+ Goal Runner workflow: initialize a persisted goal ledger with a per-run goal id and lifecycle events, render goal-continuation context, run bounded worker LM turns, append receipts, run three independent reviewers, and let a TypeScript reducer decide `complete`, `continue`, `blocked`, or `needs_human`. Workers and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. Token budget behavior is intentionally excluded.
595
595
 
596
596
  ```text
597
597
  /workflow goal objective="Migrate the database layer to Drizzle ORM" base_branch=develop
@@ -609,7 +609,7 @@ Child workflow outputs: `result`, `status`, `approved`, `goal_id`, `objective`,
609
609
 
610
610
  ### `ralph`
611
611
 
612
- Prompt-engineering → research → orchestrate → review workflow with optional final-stage PR handoff: transform the user prompt into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with browser-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
612
+ Prompt-engineering → research → orchestrate → review workflow with optional final-stage PR handoff: transform the user prompt into a codebase and online research question with `/skill:prompt-engineer`, run `/skill:research-codebase` against it, write findings under `research/`, delegate implementation through sub-agents from that research, run parallel reviewers, and iterate until approval or the loop limit. Ralph's orchestrator and reviewers are prompted to verify user-visible behavior end-to-end when practical with `playwright-cli`-skilled subagents for web/frontend flows that may depend on backend/API behavior and tmux-skilled subagents for TUI or terminal-app scenarios. For UI-applicable or full-stack changes, the orchestrator runs a `playwright-cli` end-to-end QA pass and records a reviewable proof video, references it in the implementation notes, and exposes it as the `qa_video_path` output; when `create_pr=true`, the final `pull-request` stage attaches or links that video to the created PR/MR/review. Follow-up iterations pass unresolved review artifacts into prompt-engineering/research and fork research from prior research session data when available. Ralph skips PR creation by default; prompt text alone does not opt in. Pass `create_pr=true` to authorize only the final `pull-request` stage to inspect provider credentials and attempt provider-appropriate PR/MR/review creation (for example GitHub `gh`, Azure Repos `az repos pr create`, or Sapling/Phabricator tooling). Ralph's own PR-creation instructions live in that final stage. Reviewers inspect repository infrastructure directly as needed; Ralph no longer runs separate `infra-*` discovery stages.
613
613
 
614
614
  ```text
615
615
  /workflow ralph prompt="Migrate the database layer to Drizzle ORM" max_loops=3 base_branch=develop
@@ -624,7 +624,7 @@ Prompt-engineering → research → orchestrate → review workflow with optiona
624
624
  | `git_worktree_dir` | `string` | — | `""` | Optional reusable Git worktree root. Empty runs in the invoking checkout; non-empty values run Ralph stages in the created/reused worktree. |
625
625
  | `create_pr` | `boolean` | — | `false` | Safe-by-default PR creation flag. Omitted or `false` skips the final `pull-request` stage and omits `pr_report`; prompt text alone does not opt in, and only strict `true` authorizes the final `pull-request` stage to attempt provider-appropriate PR/MR/review creation. |
626
626
 
627
- Child workflow outputs: `result`, `plan` (latest transformed research question), `plan_path` (compatibility alias for `research_path`), `research`, `research_path`, `implementation_notes_path`, `approved`, `iterations_completed`, `review_report`, and `review_report_path`. `pr_report` is included only when `create_pr=true` and the final `pull-request` stage runs.
627
+ Child workflow outputs: `result`, `plan` (latest transformed research question), `plan_path` (compatibility alias for `research_path`), `research`, `research_path`, `implementation_notes_path`, `qa_video_path` (reviewable QA end-to-end proof video recorded with `playwright-cli` for UI-applicable changes, when produced), `approved`, `iterations_completed`, `review_report`, and `review_report_path`. `pr_report` is included only when `create_pr=true` and the final `pull-request` stage runs.
628
628
 
629
629
  ### `open-claude-design`
630
630
 
@@ -642,7 +642,7 @@ Design-system onboarding → reference import → generation → refinement →
642
642
  | `design_system` | `text` | — | — | Existing design-system reference / Design.md path. |
643
643
  | `max_refinements` | `number` | — | `3` | Maximum critique/apply refinement iterations. |
644
644
 
645
- Child workflow outputs: `output_type`, `design_system`, `artifact`, `handoff`, `approved_for_export`, `refinements_completed`, `import_context`, `run_id`, `artifact_dir`, `preview_path`, `preview_file_url`, `spec_path`, and `spec_file_url`. `open-claude-design` has no `result` output; it exposes only the declared fields listed here.
645
+ Child workflow outputs: `output_type`, `design_system`, `artifact`, `handoff`, `approved_for_export`, `refinements_completed`, `import_context`, `run_id`, `artifact_dir`, `preview_path`, `preview_file_url`, `spec_path`, `spec_file_url`, and `playwright_cli_status`. `open-claude-design` has no `result` output; it exposes only the declared fields listed here.
646
646
 
647
647
  ---
648
648