ptywright 0.1.0 → 0.2.0

Files changed (68)
  1. package/README.md +459 -116
  2. package/dist/agent.mjs +2 -0
  3. package/dist/bin/ptywright.mjs +6 -0
  4. package/dist/cli-DIUx2w6X.mjs +3587 -0
  5. package/dist/cli.mjs +2 -0
  6. package/{src/index.ts → dist/index.mjs} +7 -9
  7. package/dist/mcp.mjs +2 -0
  8. package/dist/pty-cassette.mjs +24 -0
  9. package/dist/pty_like-Cpkh_O9B.mjs +404 -0
  10. package/dist/runner-DzZlFrt1.mjs +1897 -0
  11. package/dist/runner-zApMYWZx.mjs +3257 -0
  12. package/dist/script.mjs +2 -0
  13. package/dist/server-VHuEWWj_.mjs +3068 -0
  14. package/dist/session.mjs +2 -0
  15. package/dist/terminal_session-DopC7Xg6.mjs +893 -0
  16. package/package.json +28 -21
  17. package/schemas/ptywright-agent-cassette.schema.json +57 -0
  18. package/schemas/ptywright-agent-check.schema.json +122 -0
  19. package/schemas/ptywright-agent-manifest.schema.json +107 -0
  20. package/schemas/ptywright-agent-promote.schema.json +146 -0
  21. package/schemas/ptywright-agent-replay-summary.schema.json +140 -0
  22. package/schemas/ptywright-agent-run.schema.json +126 -0
  23. package/schemas/ptywright-agent.schema.json +182 -0
  24. package/schemas/ptywright-pty-cassette.schema.json +86 -0
  25. package/schemas/ptywright-script-manifest.schema.json +75 -0
  26. package/schemas/ptywright-script-run-summary.schema.json +114 -0
  27. package/schemas/ptywright-script.schema.json +55 -3
  28. package/skills/ptywright-testing/SKILL.md +53 -33
  29. package/bin/ptywright +0 -4
  30. package/src/cli.ts +0 -414
  31. package/src/generator/doc_parser.ts +0 -341
  32. package/src/generator/generate.ts +0 -161
  33. package/src/generator/index.ts +0 -10
  34. package/src/generator/script_generator.ts +0 -209
  35. package/src/generator/step_extractor.ts +0 -397
  36. package/src/mcp/http_server.ts +0 -174
  37. package/src/mcp/script_recording.ts +0 -238
  38. package/src/mcp/server.ts +0 -1348
  39. package/src/pty/bun_pty_adapter.ts +0 -34
  40. package/src/pty/bun_terminal_adapter.ts +0 -149
  41. package/src/pty/pty_adapter.ts +0 -31
  42. package/src/script/dsl.ts +0 -188
  43. package/src/script/module.ts +0 -43
  44. package/src/script/path.ts +0 -151
  45. package/src/script/run.ts +0 -108
  46. package/src/script/run_all.ts +0 -229
  47. package/src/script/runner.ts +0 -983
  48. package/src/script/schema.ts +0 -237
  49. package/src/script/steps/assert_snapshot_equals.ts +0 -21
  50. package/src/script/steps/index.ts +0 -2
  51. package/src/script/suite_report.ts +0 -626
  52. package/src/session/session_manager.ts +0 -145
  53. package/src/session/terminal_session.ts +0 -473
  54. package/src/terminal/ansi.ts +0 -142
  55. package/src/terminal/keys.ts +0 -180
  56. package/src/terminal/mask.ts +0 -70
  57. package/src/terminal/mouse.ts +0 -75
  58. package/src/terminal/snapshot.ts +0 -196
  59. package/src/terminal/style.ts +0 -121
  60. package/src/terminal/view.ts +0 -49
  61. package/src/trace/asciicast.ts +0 -20
  62. package/src/trace/asciinema_player_assets.ts +0 -44
  63. package/src/trace/cast_to_txt.ts +0 -116
  64. package/src/trace/recorder.ts +0 -110
  65. package/src/trace/report.ts +0 -2092
  66. package/src/types.ts +0 -86
  67. package/src/util/hash.ts +0 -8
  68. package/src/util/sleep.ts +0 -5
package/README.md CHANGED
@@ -1,188 +1,531 @@
1
1
  # ptywright
2
2
 
3
- A general-purpose "Terminal DevTools / Playwright driver" prototype: launch any CLI/TUI via a PTY, feed the ANSI/VT output to `@xterm/headless` to rebuild the screen grid, and expose the tool interface over MCP (stdio).
3
+ [中文文档](./README_ZH.md)
4
4
 
5
- ## Run
5
+ A universal "Terminal DevTools / Playwright driver": Launch any CLI/TUI via PTY, feed ANSI/VT output to `@xterm/headless` to rebuild the screen grid, and expose it as MCP (stdio) tools.
6
6
 
7
- ```bash
8
- bun install
9
-
10
- # Default: start the MCP server (equivalent to `ptywright mcp`)
11
- bun run bin/ptywright
7
+ ## Installation
12
8
 
13
- # Explicit form:
14
- # bun run bin/ptywright mcp
9
+ ```bash
10
+ # Recommended: Run with bunx (no install needed)
11
+ bunx ptywright@latest --help
15
12
 
16
- # Optional: start over Streamable HTTP (Web transport)
17
- # bun run bin/ptywright mcp-http --port 3000
13
+ # Or install globally
14
+ bun add -g ptywright
15
+ ptywright --help
18
16
 
19
- # Optional: reduce the number of tools (lower Agent context pressure)
20
- # bun run bin/ptywright mcp --caps core
21
- # Or: PTYWRIGHT_CAPS=core bun run bin/ptywright
17
+ # Or via npm/npx
18
+ npx -y ptywright@latest --help
19
+ npm install -g ptywright
22
20
  ```
23
21
 
24
- ## Tools (MVP)
22
+ ## Quick Start
25
23
 
26
- All tools are enabled by default (equivalent to `PTYWRIGHT_CAPS=all`). To reduce the number of tools, set `PTYWRIGHT_CAPS=core` or combine capabilities as needed:
24
+ ### Use as MCP Server
27
25
 
28
- - Default: `PTYWRIGHT_CAPS=all`
29
- - Minimal: `PTYWRIGHT_CAPS=core`
30
- - Combined: `PTYWRIGHT_CAPS=core,debug,script,recording`
26
+ ```bash
27
+ # stdio mode (default)
28
+ bunx ptywright@latest mcp
31
29
 
32
- ### core
30
+ # HTTP mode
31
+ bunx ptywright@latest mcp-http --port 3000
33
32
 
34
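- - `list_sessions` / `select_session`: manage and select sessions
35
- - `launch_session`: start a PTY session (it automatically becomes the default session)
36
- - `send_text` / `press_key`: send input
37
- - `snapshot_text`: return the visible screen text (suited to letting an Agent "see the UI" and to goldens)
38
- - `snapshot_view`: a more human-readable snapshot (with metadata + line numbers)
39
- - `wait_for_text`: wait for text/regex to appear
40
- - `wait_for_stable_screen`: wait for the screen to stay stable within a quiet window (reduces flakiness)
41
- - `assert`: assert on the current screen (text/regex/semantic)
42
- - `close_session`: close the session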
33
+ # Minimal tools (reduce Agent context pressure)
34
+ bunx ptywright@latest mcp --caps core
35
+ ```
43
36
 
44
- ### debug (optional)
37
+ ### Configure MCP Client
45
38
 
46
- - `snapshot_ansi`: return the visible screen with ANSI/SGR styles (suited to debugging/visual review)
47
- - `snapshot_view_ansi`: `snapshot_view` with ANSI/SGR styles
39
+ **Claude Desktop / Cursor** (`~/.config/claude/claude_desktop_config.json`):
48
40
 
49
- ### script (optional)
41
+ ```json
42
+ {
43
+ "mcpServers": {
44
+ "ptywright": {
45
+ "command": "bunx",
46
+ "args": ["ptywright@latest", "mcp"]
47
+ }
48
+ }
49
+ }
50
+ ```
50
51
 
51
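- - `run_routine`: run a multi-step interaction in one call (type/key/wait/assert/snapshot)
52
- - `run_script`: run `scriptPath=file.json|file.ts` and produce artifacts (cast/report/failure snapshots)
53
- - `run_all_scripts`: run the scripts in a directory in batch (recursive; `includeEntries/maxEntries` control the output)
54
- - `generate_test_from_doc`: generate an executable script from a document (local/URL)
55
- - `inspect_failure`: view the screen and error of the most recent failure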
52
+ **Minimal mode** (load core tools only):
56
53
 
57
- ### recording (optional)
54
+ ```json
55
+ {
56
+ "mcpServers": {
57
+ "ptywright": {
58
+ "command": "bunx",
59
+ "args": ["ptywright@latest", "mcp", "--caps", "core"]
60
+ }
61
+ }
62
+ }
63
+ ```
58
64
 
59
- - `start_script_recording` / `stop_script_recording`: record MCP tool calls and export a re-runnable script (JSON + goldens)
60
- - `mark`: add a marker to the trace (asciicast marker event)
65
+ **HTTP mode** (for Web clients):
61
66
 
62
- `mask` (optional): `snapshot_text/snapshot_ansi/snapshot_view/snapshot_view_ansi` accept `mask=[{regex,flags?,replacement?,preserveLength?}]`, which turns random ids, timestamps, and similar values into stable, diffable snapshots
67
+ ```json
68
+ {
69
+ "mcpServers": {
70
+ "ptywright": {
71
+ "command": "bunx",
72
+ "args": ["ptywright@latest", "mcp-http", "--port", "3000"]
73
+ }
74
+ }
75
+ }
76
+ ```
63
77
 
64
- ### `press_key` Key Spec
78
+ ### CLI Commands
65
79
 
66
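- Supports single keys and "modifier + single key" combinations (case-insensitive; both `+` and `-` work as separators):
67
- - Single character: `"a"` / `"?"` (written to the PTY as-is)
68
- - Special keys: `Enter|Return`, `Esc|Escape`, `Backspace`, `Space`, `Tab`, `BackTab`
69
- - Combinations: `Ctrl+C`, `Ctrl+Shift+R`, `Alt+X`/`Meta+X`, `Shift+Tab`, `Ctrl+Up`
70
- - Navigation keys: `Up/Down/Left/Right`, `Home/End`, `PageUp/PageDown`, `Insert/Delete`, `F1..F12`
71
- - Compatibility: `c-x` (equivalent to `Ctrl+X`)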
80
+ ```bash
81
+ # Run a single test script
82
+ bunx ptywright@latest run scripts/demo.json
72
83
 
73
- ## Tests
84
+ # Run all scripts (generate HTML report)
85
+ bunx ptywright@latest run-all --dir scripts
74
86
 
75
- ```bash
76
- bun test
87
+ # Show help
88
+ bunx ptywright@latest --help
77
89
  ```
78
90
 
79
- Includes:
80
- - PTY + xterm parsing and snapshot tests
81
- - MCP server end-to-end smoke test (the client starts the server over stdio and calls tools)
91
+ ### Raw PTY Cassette
82
92
 
83
- ## Use With MCP Clients (optional)
93
+ `ptywright pty` records the raw PTY stream once and replays it later without
94
+ rerunning the original command. This is intended for browser terminal renderers
95
+ that need deterministic regression tests for prompts or AI sessions that are
96
+ hard to reproduce live.
84
97
 
85
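- This repository also provides an optional Codex skill, `skills/ptywright-testing/`, which guides an Agent in running regressions with the ptywright MCP/CLI and reading `run.summary.json` (to avoid stuffing overly long reports into the context where possible).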
98
+ ```bash
99
+ # Record output/input/resize/exit as base64 PTY events
100
+ bunx ptywright@latest pty record --out tests/cassettes/codex.pty.json -- codex
86
101
 
87
- This project is a stdio transport MCP server. Start it directly:
102
+ # Replay the same raw output stream instantly
103
+ bunx ptywright@latest pty replay tests/cassettes/codex.pty.json
88
104
 
89
- ```bash
90
- # Equivalent to `ptywright mcp` by default
91
- bun run bin/ptywright
105
+ # Validate or inspect the portable artifact
106
+ bunx ptywright@latest pty validate tests/cassettes/codex.pty.json
107
+ bunx ptywright@latest pty inspect tests/cassettes/codex.pty.json
92
108
  ```
93
109
 
94
- It can also be started over Streamable HTTP (default endpoint: `http://127.0.0.1:3000/mcp`):
110
+ External projects do not need to depend on aitty. Use the structural
111
+ `wrapPtyLike` API for `node-pty`/`bun-pty` style objects:
95
112
 
96
- ```bash
97
- bun run bin/ptywright mcp-http --port 3000
113
+ ```ts
114
+ import { wrapPtyLike } from "ptywright/pty-cassette";
115
+
116
+ const recorded = wrapPtyLike(pty, {
117
+ path: "tests/cassettes/session.pty.json",
118
+ terminal: { cols: 120, rows: 40, term: "xterm-256color" },
119
+ command: { file: "codex", args: [] },
120
+ });
121
+
122
+ recorded.write("hello\r");
123
+ // output and exit are captured from pty.onData/onExit
98
124
  ```
99
125
 
100
- Then configure it as a stdio server in your MCP client (configuration differs between clients).
126
+ For Bun Terminal callback-style integration, create a recorder and call
127
+ `recordOutput` from the terminal `data` hook, or use
128
+ `wrapBunTerminalOptions`. The cassette can then be replayed into any renderer
129
+ and compared by that renderer's DOM/text snapshot tests.
130
+
131
+ ## Tools
132
+
133
+ All tools are enabled by default (`--caps all`). Use `--caps core` or combine as needed:
134
+
135
+ - Default: `--caps all`
136
+ - Minimal: `--caps core`
137
+ - Combined: `--caps core,debug,script,recording`
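
A minimal sketch of how a comma-separated capability value such as `core,debug,script,recording` could be parsed. The group names come from the README; the parsing rules (trimming, case-insensitivity, rejecting unknown names, treating an empty value as `all`) are assumptions for illustration, not ptywright's confirmed behavior, and `parseCaps` is a hypothetical helper.

```typescript
// Capability group names as listed in the README; everything else is an assumption.
const KNOWN_CAPS = ["core", "debug", "script", "recording"] as const;
type Cap = (typeof KNOWN_CAPS)[number];

// Hypothetical helper: turn "--caps"/PTYWRIGHT_CAPS text into a set of groups.
function parseCaps(value: string | undefined): Set<Cap> {
  // Assumption: a missing or empty value behaves like "all".
  if (value === undefined || value.trim() === "") {
    return new Set<Cap>(KNOWN_CAPS);
  }
  const names = value
    .split(",")
    .map((part) => part.trim().toLowerCase())
    .filter((part) => part.length > 0);
  if (names.includes("all")) {
    return new Set<Cap>(KNOWN_CAPS);
  }
  const caps = new Set<Cap>();
  for (const name of names) {
    if (!(KNOWN_CAPS as readonly string[]).includes(name)) {
      throw new Error(`unknown capability: ${name}`);
    }
    caps.add(name as Cap);
  }
  return caps;
}
```

Rejecting unknown names early keeps a typo from silently disabling a tool group.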
138
+
139
+ ### core
140
+
141
+ - `list_sessions` / `select_session`: Manage and select sessions
142
+ - `launch_session`: Start a PTY session (becomes default session)
143
+ - `send_text` / `press_key`: Send input
144
+ - `snapshot_text`: Return visible screen text (for Agent "seeing" and golden snapshots)
145
+ - `snapshot_view`: Human-friendly snapshot (with metadata + line numbers)
146
+ - `wait_for_text`: Wait for text/regex to appear
147
+ - `wait_for_stable_screen`: Wait for screen to stabilize within quiet window (reduces flakiness)
148
+ - `assert`: Assert on current screen (text/regex/semantic)
149
+ - `close_session`: Close session
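
The quiet-window idea behind `wait_for_stable_screen` can be sketched as a pure function over timestamped screen samples. This is only an illustration of the technique; the `Sample` shape, the `findStableTime` name, and the exact stability rule are assumptions, not the tool's actual implementation.

```typescript
// Hypothetical sample type: the screen text observed at a given time.
interface Sample {
  atMs: number;
  screen: string;
}

// Returns the timestamp of the first sample at which the screen has stayed
// unchanged for at least quietMs, or null if it never settles.
function findStableTime(samples: Sample[], quietMs: number): number | null {
  let lastScreen: string | null = null;
  let lastChangeAt = 0;
  for (const sample of samples) {
    if (lastScreen === null || sample.screen !== lastScreen) {
      // The screen changed (or this is the first sample): restart the window.
      lastScreen = sample.screen;
      lastChangeAt = sample.atMs;
    } else if (sample.atMs - lastChangeAt >= quietMs) {
      return sample.atMs;
    }
  }
  return null;
}
```

A live implementation would poll snapshots on a timer and apply the same rule, with an overall timeout for screens that never settle.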
150
+
151
+ ### debug (optional)
152
+
153
+ - `snapshot_ansi`: Return visible screen with ANSI/SGR styles (for debug/human review)
154
+ - `snapshot_view_ansi`: `snapshot_view` with ANSI/SGR styles
155
+
156
+ ### script (optional)
157
+
158
+ - `run_routine`: Execute multi-step interactions in one call (type/key/wait/assert/snapshot)
159
+ - `run_script`: Run `scriptPath=file.json|file.ts` and produce artifacts (cast/report/failure snapshots)
160
+ - `run_all_scripts`: Run scripts in directory recursively (supports `includeEntries/maxEntries`)
161
+ - `generate_test_from_doc`: Generate executable scripts from documentation (local/URL)
162
+ - `inspect_failure`: View last failure screen and error
163
+
164
+ ### recording (optional)
165
+
166
+ - `start_script_recording` / `stop_script_recording`: Record MCP tool calls and export replayable scripts (JSON + goldens)
167
+ - `mark`: Add marker to trace (asciicast marker event)
168
+
169
+ ### `mask` Parameter
170
+
171
+ `snapshot_text/snapshot_ansi/snapshot_view/snapshot_view_ansi` support `mask=[{regex,flags?,replacement?,preserveLength?}]` to convert random IDs/timestamps into diffable stable snapshots.
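
A hedged sketch of what mask rules with this shape might do to snapshot text. The field names (`regex`, `flags`, `replacement`, `preserveLength`) come from the README; the default replacement of `*`, the forced global matching, and the `applyMask` helper are assumptions made for illustration.

```typescript
// Rule shape taken from the README's mask parameter.
interface MaskRule {
  regex: string;
  flags?: string;
  replacement?: string;
  preserveLength?: boolean;
}

// Hypothetical helper: apply each rule in order to the snapshot text.
function applyMask(text: string, rules: MaskRule[]): string {
  let out = text;
  for (const rule of rules) {
    // Assumption: every occurrence is masked, so "g" is always added.
    const flags = new Set((rule.flags ?? "").split(""));
    flags.add("g");
    const re = new RegExp(rule.regex, Array.from(flags).join(""));
    const replacement = rule.replacement ?? "*"; // assumed default
    out = out.replace(re, (match) => {
      if (!rule.preserveLength) return replacement;
      // Repeat the replacement so the masked span keeps its width.
      const pad = replacement.length > 0 ? replacement : "*";
      return pad.repeat(Math.ceil(match.length / pad.length)).slice(0, match.length);
    });
  }
  return out;
}
```

Preserving length matters for grid-like snapshots, where a shorter replacement would shift every later column on the line.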
172
+
173
+ ### `press_key` Key Spec
174
+
175
+ Supports single keys and modifier combinations (case-insensitive, `+`/`-` as separator):
176
+ - Single char: `"a"` / `"?"` (written to PTY as-is)
177
+ - Special keys: `Enter|Return`, `Esc|Escape`, `Backspace`, `Space`, `Tab`, `BackTab`
178
+ - Combos: `Ctrl+C`, `Ctrl+Shift+R`, `Alt+X`/`Meta+X`, `Shift+Tab`, `Ctrl+Up`
179
+ - Navigation: `Up/Down/Left/Right`, `Home/End`, `PageUp/PageDown`, `Insert/Delete`, `F1..F12`
180
+ - Compatible: `c-x` (equals `Ctrl+X`)
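
To illustrate the key spec, here is a small sketch that encodes a subset of it (single characters, a few special keys, `Ctrl`, `Alt`/`Meta`, and the `c-x` form) into the bytes a terminal conventionally receives. `encodeKey` is a hypothetical helper; the byte choices (for example `Backspace` as `\x7f`, `Alt` as an `ESC` prefix) are common terminal conventions assumed here, not confirmed ptywright output, and navigation keys and `Shift` are deliberately left out.

```typescript
// Conventional byte sequences for a few special keys (assumed, see lead-in).
const SPECIAL_KEYS = new Map<string, string>([
  ["enter", "\r"],
  ["return", "\r"],
  ["esc", "\x1b"],
  ["escape", "\x1b"],
  ["tab", "\t"],
  ["space", " "],
  ["backspace", "\x7f"],
  ["backtab", "\x1b[Z"],
]);

// Hypothetical helper covering only a subset of the documented key spec.
function encodeKey(spec: string): string {
  if (spec.length === 1) return spec; // single characters are written as-is
  const parts = spec.split(/[+-]/).map((part) => part.trim().toLowerCase());
  const key = parts[parts.length - 1] ?? "";
  const mods = new Set<string>();
  for (const raw of parts.slice(0, -1)) {
    if (raw === "ctrl" || raw === "c") mods.add("ctrl");
    else if (raw === "alt" || raw === "meta") mods.add("alt");
    else throw new Error(`modifier not covered by this sketch: ${raw}`);
  }
  const special = SPECIAL_KEYS.get(key);
  let bytes: string;
  if (special !== undefined) bytes = special;
  else if (key.length === 1) bytes = key;
  else throw new Error(`key not covered by this sketch: ${key}`);
  if (mods.has("ctrl")) {
    if (!/^[a-z]$/.test(key)) throw new Error("this sketch only maps Ctrl+letter");
    // Ctrl+letter maps to control codes 1..26 (Ctrl+A = 0x01).
    bytes = String.fromCharCode(key.charCodeAt(0) - 96);
  }
  if (mods.has("alt")) bytes = "\x1b" + bytes; // Alt/Meta as an ESC prefix
  return bytes;
}
```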
101
181
 
102
182
  ## Script Runner (JSON)
103
183
 
104
- Write a TUI test as JSON: launch → input → wait → snapshot (maskable) → assert, automatically producing `.cast` + `report.html`.
184
+ Write TUI tests as JSON: launch → input → wait → snapshot (with mask) → assert, automatically producing `.cast` + `report.html`.
105
185
 
106
- Optional: add a schema at the top of the JSON (friendlier editor completion/validation):
186
+ Optional: Add schema for editor completion/validation:
107
187
 
108
188
  ```json
109
- { "$schema": "../schemas/ptywright-script.schema.json" }
189
+ { "$schema": "node_modules/ptywright/schemas/ptywright-script.schema.json" }
110
190
  ```
111
191
 
112
192
  ```bash
113
- bun run script:run scripts/m5_mask_demo.json
114
- # Or (CLI)
115
- bun run bin/ptywright run scripts/m5_mask_demo.json
116
- #
117
- bun run script:m5-mask-demo
193
+ # Run single script
194
+ bunx ptywright@latest run scripts/m5_mask_demo.json
195
+
196
+ # Run all scripts
197
+ bunx ptywright@latest run-all --dir scripts
118
198
  ```
119
199
 
120
- Batch run (local/CI):
200
+ Artifacts go to `.tmp/runs/<name>/` by default (override with `--artifacts-dir`).
201
+
202
+ Batch runs generate an overview report:
203
+ - Default: `.tmp/run-all/index.html` + `.tmp/run-all/run.summary.json`
204
+ - With `--artifacts-root <dir>`: `<dir>/index.html` + `<dir>/run.summary.json`
205
+ - `run.summary.json` stores `commands.runAll.argv` and
206
+ `commands.updateGoldens.argv` so automation can replay the suite or update
207
+ goldens without reconstructing CLI arguments.
208
+
209
+ You can read or execute those commands directly from the generated artifact:
121
210
 
122
211
  ```bash
123
- bun run script:run-all
124
- # Or (CLI)
125
- bun run bin/ptywright run-all
212
+ bunx ptywright@latest script commands .tmp/run-all --json
213
+ bunx ptywright@latest script commands .tmp/run-all/run.summary.json --command runAll
214
+ bunx ptywright@latest script inspect .tmp/run-all
215
+ bunx ptywright@latest script validate .tmp/run-all
216
+ bunx ptywright@latest script exec .tmp/run-all --command updateGoldens
126
217
  ```
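
For automation that prefers to read the stored commands itself, a minimal sketch of pulling an argv array out of a parsed `run.summary.json`. The `commands.<name>.argv` path is taken from the README; the surrounding `RunSummary` type, the `storedArgv` helper, and the sample argv values are assumptions.

```typescript
// Assumed minimal shape: only the commands.<name>.argv path is from the README.
interface StoredCommand {
  argv: string[];
}
interface RunSummary {
  commands?: { [name: string]: StoredCommand | undefined };
}

// Hypothetical helper: return the stored argv or fail loudly.
function storedArgv(summary: RunSummary, name: string): string[] {
  const argv = summary.commands?.[name]?.argv;
  if (
    !Array.isArray(argv) ||
    argv.length === 0 ||
    !argv.every((arg) => typeof arg === "string")
  ) {
    throw new Error(`summary has no usable commands.${name}.argv`);
  }
  return argv;
}

// Example with made-up values; a real summary is produced by run-all.
const exampleSummary: RunSummary = {
  commands: { runAll: { argv: ["ptywright", "run-all", "--dir", "scripts"] } },
};
const exampleArgv = storedArgv(exampleSummary, "runAll");
```

Because the result is an array, it can be handed to a process-spawning API as program plus arguments, with no shell string to quote or parse.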
127
218
 
128
- Generates an overview report (similar to the Playwright report home page):
129
- - Default: `.tmp/run-all/index.html` + `.tmp/run-all/run.summary.json`
130
- - With `--artifacts-root <dir>`: written to `<dir>/index.html` + `<dir>/run.summary.json`
219
+ Suite directories also include `ptywright-script.manifest.json`, which indexes
220
+ the generated summary, reports, casts, data, and failure artifacts with
221
+ `bytes`/`sha256`. `script validate`, `script inspect`, `script commands`, and
222
+ `script exec` verify that manifest before using a directory bundle, so copied
223
+ script run artifacts can be replayed or updated without trusting stale files.
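
The `bytes`/`sha256` check described above can be sketched as follows. The entry shape (`path`, `bytes`, `sha256`) and the `checkEntry` helper are assumptions for illustration; the real manifest schema ships under `schemas/` and may differ.

```typescript
import { createHash } from "node:crypto";

// Assumed entry shape; only the bytes/sha256 fields are named in the README.
interface ManifestEntry {
  path: string;
  bytes: number;
  sha256: string;
}

// Hypothetical helper: compare one indexed file against its actual content
// and return a list of human-readable problems (empty when it matches).
function checkEntry(entry: ManifestEntry, content: Uint8Array): string[] {
  const problems: string[] = [];
  if (content.byteLength !== entry.bytes) {
    problems.push(
      `${entry.path}: expected ${entry.bytes} bytes, found ${content.byteLength}`,
    );
  }
  const digest = createHash("sha256").update(content).digest("hex");
  if (digest !== entry.sha256.toLowerCase()) {
    problems.push(`${entry.path}: sha256 mismatch`);
  }
  return problems;
}
```

Checking size before hash gives a cheap first signal, while the hash catches same-size edits to a copied bundle.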
131
224
 
132
- If the JSON uses `type:"custom"`, inject handlers with `--steps <module.ts>` (the module only needs to export a `steps` object):
225
+ On failure, additional files are saved:
226
+ - `failure.error.txt` (error stack)
227
+ - `failure.step.json` (failed step info)
228
+ - `failure.last.txt` / `failure.last.view.txt` (last frame snapshot)
133
229
 
134
- ```bash
135
- bun run script:run examples/json_custom_steps_demo.json --steps scripts/m6_json_custom_steps.ts
136
- ```
230
+ `report.html` includes **Timeline View** showing screen snapshots after each step. Click the `debug` badge to switch to debug view.
137
231
 
138
- Artifacts are written to `.tmp/runs/<name>/` by default (override with `--artifacts-dir`).
232
+ Built-in steps (no `--steps` needed):
233
+ - `assert`: Assert text/regex (`text`/`regex`)
234
+ - `assertSemantic`: Semantic assertion placeholder (`prompt`)
235
+ - `sleep`: Fixed wait
236
+ - `expectMeta`: Assert terminal meta
237
+ - `waitForExit`: Wait for process exit
238
+ - `sendMouse`: Send SGR mouse events
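
For readers unfamiliar with SGR mouse reporting, a short sketch of the escape sequence such an event uses in xterm's SGR (1006) encoding: `ESC [ < button ; column ; row` followed by `M` for press or `m` for release, with 1-based coordinates. `encodeSgrMouse` is a hypothetical helper; the parameters of ptywright's `sendMouse` step are not shown in this diff and are not assumed here.

```typescript
type MouseAction = "press" | "release";

// Hypothetical helper: build an xterm SGR (1006) mouse report.
// button 0 is the left button; col and row are 1-based cell coordinates.
function encodeSgrMouse(
  button: number,
  col: number,
  row: number,
  action: MouseAction,
): string {
  if (col < 1 || row < 1) {
    throw new RangeError("SGR mouse coordinates are 1-based");
  }
  const final = action === "press" ? "M" : "m";
  return `\x1b[<${button};${col};${row}${final}`;
}
```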
139
239
 
140
- On failure, these files are also written to disk:
141
- - `failure.error.txt` (error stack)
142
- - `failure.step.json` (info about the failed step)
143
- - `failure.last.txt` / `failure.last.view.txt` (last-frame snapshot)
240
+ ### Framework Backends
144
241
 
145
- `report.html` now includes a **Timeline View** showing the screen snapshot after each step (not only on failure). Click the `debug` badge at the top to switch to the debug view.
242
+ `launch.backend` defaults to `pty`. For faster framework-level checks, use
243
+ `frames`, `ratatui`, or `ink` to run the same script steps against deterministic
244
+ frames without starting a PTY:
146
245
 
147
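- Cast Playback (the full recording) first loads `asciinema-player.min.js` / `asciinema-player.css` from the report's directory (copied automatically when the report is generated), so the report can play back when opened offline; if the local assets are missing, it falls back to a CDN.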
246
+ ```json
247
+ {
248
+ "$schema": "../schemas/ptywright-script.schema.json",
249
+ "name": "ratatui_snapshot",
250
+ "launch": {
251
+ "backend": "ratatui",
252
+ "cols": 60,
253
+ "rows": 12,
254
+ "frames": [
255
+ "Screen: Dashboard\nMode: HIGH",
256
+ "Screen: Permissions\nMode: LOW"
257
+ ]
258
+ },
259
+ "steps": [
260
+ { "type": "waitForText", "text": "Dashboard" },
261
+ { "type": "pressKey", "key": "Enter" },
262
+ { "type": "snapshot", "kind": "text", "saveAs": "final" },
263
+ { "type": "expect", "from": "final", "contains": ["Mode: LOW"] }
264
+ ]
265
+ }
266
+ ```
148
267
 
149
- Built-in steps (no `--steps` needed):
150
- - `assert`: **[NEW]** assert text/regex (`text`/`regex`)
151
- - `assertSemantic`: **[NEW]** semantic assertion placeholder (`prompt`)
152
- - `sleep`: fixed wait
153
- - `expectMeta`: assert terminal meta
154
- - `waitForExit`: wait for process exit
155
- - `sendMouse`: send SGR mouse events
268
+ `ratatui` is intended for text emitted by `TestBackend`/insta-style snapshots.
269
+ `ink` can load a module via `frameModule` that exports `frames`, `frame`,
270
+ `snapshot`, or `lastFrame`. Input steps such as `pressKey` and `sendText`
271
+ advance to the next frame by default, so the assertion path stays identical to
272
+ the PTY end-to-end script.
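
The frame-advancing behavior can be pictured with a tiny stand-in backend. This is a sketch of the idea only: the `FrameBackend` class, its method names, and the choice to stay on the last frame once the list is exhausted are assumptions, not ptywright's implementation.

```typescript
// Hypothetical stand-in for a deterministic frames backend.
class FrameBackend {
  private readonly frames: string[];
  private index = 0;

  constructor(frames: string[]) {
    if (frames.length === 0) throw new Error("at least one frame is required");
    this.frames = frames;
  }

  // What a snapshot or waitForText step would look at.
  snapshotText(): string {
    return this.frames[this.index] ?? "";
  }

  // Input steps do not reach a process; they move to the next frame.
  pressKey(_key: string): void {
    this.advance();
  }

  sendText(_text: string): void {
    this.advance();
  }

  private advance(): void {
    // Assumption: stay on the final frame instead of running off the end.
    this.index = Math.min(this.index + 1, this.frames.length - 1);
  }
}

// Mirrors the example script: Dashboard first, then Permissions after Enter.
const demoBackend = new FrameBackend([
  "Screen: Dashboard\nMode: HIGH",
  "Screen: Permissions\nMode: LOW",
]);
const firstFrame = demoBackend.snapshotText();
demoBackend.pressKey("Enter");
const finalFrame = demoBackend.snapshotText();
```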
156
273
 
157
- ## Script Recording (MCP)
274
+ For `type:"custom"` steps, inject handlers with `--steps <module.ts>`:
158
275
 
159
- If you have set `PTYWRIGHT_CAPS` without `recording`, you need to enable `recording` (for example `PTYWRIGHT_CAPS=core,recording`).
276
+ ```bash
277
+ bunx ptywright@latest run demo.json --steps custom_steps.ts
278
+ ```
279
+
280
+ ## Script Recording (MCP)
160
281
 
161
- When any MCP client/Agent is driving through MCP tools, you can "record" the tool calls into a script in one step, with goldens written to disk automatically at each `mark`:
282
+ Record tool calls into replayable scripts from any MCP client/Agent:
162
283
 
163
284
  1) `start_script_recording(name="my_flow")`
164
- 2) Run as usual: `launch_session/send_text/press_key/wait_for_*`
165
- 3) Mark key checkpoints: `mark(label="checkpoint")` (automatically generates `snapshot + expectGolden`)
166
- 4) `stop_script_recording(recordingId=...)` (writes `scripts/my_flow.json` + `tests/golden/scripts/my_flow/*.txt`)
285
+ 2) Execute normally: `launch_session/send_text/press_key/wait_for_*`
286
+ 3) Add checkpoints: `mark(label="checkpoint")` (auto-generates `snapshot + expectGolden`)
287
+ 4) `stop_script_recording(recordingId=...)` (writes `scripts/my_flow.json` + `tests/golden/scripts/my_flow/*.txt`)
167
288
 
168
289
  ## Script DSL (TypeScript)
169
290
 
170
- TS builder script (type-safe, composable, supports custom steps); it still reuses the same runner underneath:
291
+ Write scripts with TS builder (type-safe, composable, custom steps):
171
292
 
172
293
  ```bash
173
- bun run script:run scripts/m6_dsl_demo.ts
294
+ bunx ptywright@latest run scripts/demo.ts
295
+ ```
296
+
297
+ Conventions:
298
+ - Module default export (`export default`), or export `script`.
299
+ - Optional `steps` export (custom step handlers) for `type:"custom"` steps.
300
+ - Use `pasteText("...", { bracketed: true })` for bracketed paste testing.
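
As background for bracketed paste testing: in xterm's bracketed paste mode, pasted text is wrapped in `ESC [ 200 ~` and `ESC [ 201 ~` so the application can tell a paste from typed input. The sketch below shows that framing; `encodePaste` is a hypothetical helper, and whether `pasteText` emits exactly these bytes is an assumption.

```typescript
// Standard xterm bracketed-paste delimiters.
const PASTE_START = "\x1b[200~";
const PASTE_END = "\x1b[201~";

// Hypothetical helper: frame the text only when bracketed mode is requested.
function encodePaste(text: string, bracketed: boolean): string {
  return bracketed ? `${PASTE_START}${text}${PASTE_END}` : text;
}
```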
301
+
302
+ ## Cast -> SVG/GIF (Optional)
303
+
304
+ Recording artifacts are best for failure diagnosis or manual review; prefer `snapshot_grid` diff for stable regression.
305
+
306
+ - SVG: `bunx svg-term --in <castPath> --out <outSvg>`
307
+ - GIF: `agg --fps 30 <castPath> <outGif>` (requires [asciinema/agg](https://github.com/asciinema/agg))
308
+
309
+ ## Browser Agent Regression
310
+
311
+ The new destructive path is browser-first: ptywright can launch an agent through
312
+ `@aitty/cli`, drive the browser-hosted wterm DOM with Playwright, and persist a
313
+ replayable run artifact plus terminal/DOM snapshots.
314
+
315
+ ```bash
316
+ # First run records snapshots, screenshots, replay metadata, and report.
317
+ bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
318
+
319
+ # Later runs compare terminal + DOM snapshots like a test snapshot.
320
+ bun run bin/ptywright agent run examples/agent_deterministic.json
321
+
322
+ # Replay does not need AI; it uses the recorded flow artifact.
323
+ bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.agent-run.json
324
+
325
+ # Cassette files are also directly replayable.
326
+ bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.cassette.json
327
+
328
+ # Promote a live run/cassette into the committed non-AI regression suite.
329
+ bun run bin/ptywright agent promote \
330
+ .tmp/agent/agent_deterministic/agent_deterministic.cassette.json \
331
+ --update-snapshots
332
+
333
+ # Batch replay committed cassettes/run records as a regression suite.
334
+ bun run bin/ptywright agent replay-all .tmp/agent --artifacts-root .tmp/agent-replay-all
335
+
336
+ # Rerun directly from a generated summary artifact.
337
+ bun run bin/ptywright agent rerun .tmp/agent-promote/agent_deterministic/agent-promote.summary.json
338
+ bun run bin/ptywright agent rerun .tmp/agent-check/agent-check.summary.json
339
+ bun run bin/ptywright agent rerun .tmp/agent-check/agent-replay.summary.json --update-snapshots
340
+
341
+ # Read reusable commands from any supported agent artifact.
342
+ bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --json
343
+ bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --command rerun
344
+ bun run bin/ptywright agent commands .tmp/agent-check --json
345
+ bun run bin/ptywright agent inspect .tmp/agent-check
346
+ bun run bin/ptywright agent inspect .tmp/agent-check --json
347
+ bun run bin/ptywright agent validate .tmp/agent-check
348
+ bun run bin/ptywright agent exec .tmp/agent-check --command rerun
349
+ bun run bin/ptywright agent exec .tmp/agent-check --command updateSnapshots
350
+ bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command rerun
351
+ bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command updateSnapshots
352
+
353
+ # Validate flow/cassette/run-record/summary artifacts before committing.
354
+ bun run bin/ptywright agent validate .tmp/agent-replay-all
355
+
356
+ # Run committed cassette replay regression without launching live agents.
357
+ bun run bin/ptywright agent check
358
+ bun run bin/ptywright agent check --json
359
+
360
+ # Update terminal/DOM baselines from committed cassettes intentionally.
361
+ bun run bin/ptywright agent replay-all tests/agent-cassettes --update-snapshots
362
+
363
+ # Record browser interactions into a replayable flow spec.
364
+ bun run bin/ptywright agent record examples/agents/codex_browser_smoke.json \
365
+ --out scripts/agents/codex_recorded.flow.json \
366
+ --duration-ms 60000 \
367
+ --headed
368
+
369
+ # Generate starter specs for real agents.
370
+ bun run bin/ptywright agent init codex examples/agents/codex_browser_smoke.json
371
+ bun run bin/ptywright agent init claude examples/agents/claude_browser_smoke.json
372
+ bun run bin/ptywright agent init droidx examples/agents/droidx_browser_smoke.json
373
+ ```
374
+
375
+ Artifacts are split intentionally:
376
+ - `.tmp/agent/<name>/` contains run output, screenshots, `*.flow.json`,
377
+ `*.agent-run.json`, `*.cassette.json`, `index.html`, and
378
+ `ptywright-agent.manifest.json`.
379
+ - `tests/agent-snapshots/<name>/` contains stable terminal/DOM baselines.
380
+ - `--update-snapshots` is the explicit update path for intentional UI changes.
381
+
382
+ `launch.mode=aitty` runs `aitty exec --launch print -- <agent>`. By default
383
+ ptywright resolves the sibling `../aitty/packages/cli/dist/cli.js`; set
384
+ `PTYWRIGHT_AITTY_CLI` or `launch.aitty.command` to override it.
385
+
386
+ Set `launch.agentFlavor` to `codex`, `claude`, `droid`, or `generic` to opt
387
+ into built-in mask presets for timestamps, generated ids, model names, token
388
+ counts, and other non-deterministic terminal text. Explicit
389
+ `defaults.mask=[...]` rules are appended after the preset, so project-specific
390
+ noise can be hidden without rewriting the runner.
391
+
392
+ `agent record` opens the same browser-hosted terminal and writes the captured
393
+ keyboard/click steps back to a normal flow JSON. The output can be committed and
394
+ run later with `agent run`, while `.agent-run.json` remains the per-run replay
395
+ record generated by the runner. Run records must include
396
+ `commands.replay.argv` and `commands.updateSnapshots.argv`, so automation can
397
+ replay or intentionally update the captured flow without parsing shell strings.
398
+
399
+ `agent run` is the live path: it launches the configured process and updates or
400
+ compares terminal/DOM snapshots. `agent promote <run|cassette>` is the
401
+ intentional solidify step after a good live run: it copies the cassette into
402
+ `tests/agent-cassettes/<name>/`, rewrites its `snapshotDir`, optionally updates
403
+ terminal/DOM baselines, replays the promoted cassette, and writes
404
+ `agent-promote.summary.json` with direct commands for future non-AI checks. HTML
405
+ reports also surface replay/update/inspect commands so failed runs can be
406
+ reproduced directly from the report page.
407
+ `agent replay` is the single-case cassette regression path: it accepts either
408
+ `.agent-run.json` or `.cassette.json`, serves a local replay page, and reproduces
409
+ the previously captured terminal DOM without launching Codex, Claude, Droid, or
410
+ any other live agent process. `agent replay-all` recursively scans a directory
411
+ for `.agent-run.json` and `.cassette.json` files, then writes
412
+ `agent-replay.summary.json` and an HTML suite report so committed cassettes can
413
+ be run like a snapshot regression suite.
414
+ `--update-snapshots` works on `agent replay-all`, so intentional DOM/terminal
415
+ baseline changes can be updated from committed cassettes without a live agent.
416
+ Cassette files embed the normalized flow spec plus frame hashes, so they remain
417
+ self-contained replay artifacts when copied away from the original run
418
+ directory. Replay runs also copy the source cassette into the replay artifact
419
+ directory and write run records that point at that local copy, so the replay
420
+ directory can be moved as a durable reproduction bundle. Run/check/promote and
421
+ replay-all outputs also include `ptywright-agent.manifest.json`, which indexes
422
+ produced files with artifact-root-relative paths plus `bytes` and `sha256`,
423
+ stores reusable `commands.*.argv`, and can be passed to `agent commands`,
424
+ `agent inspect`, `agent exec`, or `agent validate`. `agent inspect
425
+ <artifact|dir>` is the self-describing bundle check: when pointed at an artifact
426
+ directory it prefers `ptywright-agent.manifest.json`, validates indexed file
427
+ hashes, summarizes manifest file kinds/validation stages, and prints the
428
+ relocated reusable commands. Because file entries are relative to the manifest
429
+ directory and manifest commands are relocated when read, copying the whole
430
+ artifact directory preserves both manifest validation and direct
431
+ `agent inspect <copied>` /
432
+ `agent commands <copied>` /
433
+ `agent exec <copied> --command rerun` workflows.
434
+ When inspecting a moved summary file that is the manifest primary artifact,
435
+ `agent inspect` also prints `commandsManifest=<path>` and includes
436
+ `commands.manifestPath` in JSON output, making it explicit which manifest bundle
437
+ will validate and relocate the stored commands before execution.
438
+ The same directory entrypoint works for copied live-run bundles too, so
439
+ `agent exec <copied-run> --command replay` and `--command updateSnapshots`
440
+ remain usable after the original run directory is deleted.
441
+ Copied replay-suite bundles are rerun from the run records stored under the
442
+ bundle's own `tests/` directory, so they do not need the original cassette input
443
+ directory. Promote bundles can move their artifact root and still rerun from the
444
+ copied manifest, while continuing to target the promoted cassette suite.
445
+ If `agent inspect <dir>` sees agent artifacts but no top-level manifest, it
446
+ still reports recursive validation results and prints a `directoryManifest`
447
+ diagnostic so the directory is not confused with a portable commands/exec
448
+ bundle.
449
+ For `agent commands` and `agent exec`, a directory argument means a manifest
450
+ bundle directory and must contain `ptywright-agent.manifest.json`; use
451
+ `agent validate <dir>` when you want recursive artifact discovery.
452
+ The generated agent flow, cassette, run-record, manifest, promote-summary,
453
+ replay-summary, and check-summary JSON files each carry a `$schema` URL under
454
+ `schemas/` so editors and CI tooling can validate the replay contract directly.
455
+ Run-record and summary schemas also encode the expected stored command prefixes,
456
+ for example `ptywright agent replay`, `ptywright agent replay-all`, and
457
+ `ptywright agent rerun`, so malformed commands can be caught before execution.
458
+ Run records and summaries reject missing or stale `commands.*.argv` metadata,
459
+ because those argv arrays are the non-AI replay/update contract for the artifact.
460
+ Promote, replay, and check summaries include `commands.*.argv` arrays for direct
461
+ non-AI reruns and snapshot updates. Each summary also includes
462
+ `commands.rerun.argv`, so downstream automation can re-execute the exact summary
463
+ artifact without reconstructing CLI arguments. `agent commands <artifact>
464
+ --command <name>` prints one shell-safe command line for scripts that want to
465
+ execute a specific replay/update/rerun path directly; with `--json`, the same
466
+ command includes `cwd`, `command.argv`, and `shell` so automation can choose
467
+ structured spawn or shell execution. When a moved summary/run-record is backed
468
+ by a sibling manifest bundle, `agent commands` also reports the manifest path in
469
+ plain output and JSON so automation can see which bundle is responsible for
470
+ relocation and integrity checks; manifest-backed command discovery validates
471
+ stored command targets and indexed file hashes before printing commands.
472
+ `agent exec <artifact> --command <name>`
473
+ executes a stored agent command through ptywright's own CLI dispatcher, so it
474
+ does not depend on shell parsing or a global `ptywright` binary. This includes
475
+ stored `updateSnapshots` commands, which provide the non-AI equivalent of a
476
+ snapshot update run from an existing summary artifact. `agent validate
477
+ <artifact>` also checks that every stored argv starts with a supported
478
+ `ptywright agent <subcommand>` shape before accepting the artifact. If validation
479
+ fails on `commands.*.argv`, regenerate the run/summary with `agent run`,
480
+ `agent replay-all`, `agent promote`, or `agent check`; do not hand-edit shell
481
+ strings as a recovery path, because the argv arrays are the replay contract.
482
+ `agent rerun <summary>` reads `agent-promote.summary.json`,
483
+ `agent-check.summary.json`, or
484
+ `agent-replay.summary.json` and replays the stored cassette directory/artifact
485
+ root without launching a live agent. `agent commands <artifact>` reads
486
+ flow/cassette/run-record/summary artifacts and prints the reusable argv commands
487
+ without executing them. `agent validate <path>` accepts a single artifact or a
488
+ directory and returns a non-zero exit code when any known agent replay artifact
489
+ is malformed. `agent check [dir]` validates committed cassettes under
490
+ `tests/agent-cassettes` by default, replays them into `.tmp/agent-check`, writes
491
+ `agent-check.summary.json`, then validates the generated suite output. Add
492
+ `--json` for a CI-friendly summary with input/replay/output counts and failure
493
+ details.
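
The argv-shape check described above can be sketched as a simple predicate. The three subcommands listed are the ones the README names as examples of stored command prefixes; the real set accepted by `agent validate`, and the assumption that `argv[0]` is literally `ptywright`, are not confirmed by this diff. `isAgentArgv` is a hypothetical helper.

```typescript
// Subcommands named as examples in the README; the full supported set is unknown.
const EXAMPLE_SUBCOMMANDS = new Set(["replay", "replay-all", "rerun"]);

// Hypothetical predicate: does this value look like `ptywright agent <subcommand> ...`?
function isAgentArgv(argv: unknown): argv is string[] {
  return (
    Array.isArray(argv) &&
    argv.length >= 3 &&
    argv.every((arg) => typeof arg === "string") &&
    argv[0] === "ptywright" && // assumption: stored argv starts with the bare name
    argv[1] === "agent" &&
    EXAMPLE_SUBCOMMANDS.has(argv[2])
  );
}
```

Validating the array form, rather than a shell string, means a malformed or hand-edited command is rejected before anything is executed.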
494
+
495
+ ## Development
496
+
497
+ ```bash
498
+ bun install
499
+
500
+ # Start MCP server
501
+ bun run bin/ptywright mcp
502
+
503
+ # Run tests
504
+ bun run test
505
+ bun run agent:check
506
+ bun run check
507
+
508
+ # CI installs Chromium, runs bun run check, and uploads .tmp/agent-check.
509
+
510
+ # Lint & Format
511
+ bun run lint
512
+ bun run format:check
513
+
514
+ # Run scripts
515
+ bun run bin/ptywright run scripts/m5_mask_demo.json
516
+ bun run bin/ptywright run-all
517
+
518
+ # Run browser agent regression
519
+ bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
174
520
  ```
175
521
 
176
- Conventions:
177
- - The module uses a default export (`export default`), or exports `script`.
178
- - Optionally export `steps` (custom step handlers) to run `type:"custom"` steps.
179
- - Common handlers can be reused: `src/script/steps/*`.
180
- - To test "paste", use `pasteText("...", { bracketed: true })` (bracketed paste).
522
+ ## Environment Variables
181
523
 
182
- ## Cast -> SVG/GIF (optional)
524
+ - `TUI_TEST_PTY_BACKEND=auto|bun-terminal|bun-pty`
525
+ - default `auto`: macOS/Linux prefers `bun-terminal`, Windows uses `bun-pty`
526
+ - `PTYWRIGHT_CAPS=all|core|debug|script|recording`
527
+ - Equivalent to `--caps` parameter
183
528
 
184
- Recording-style artifacts are best kept for failure diagnosis or manual review; for stable regression, prefer diffing with `snapshot_grid`.
529
+ ## License
185
530
 
186
- - SVG: `svg-term` (for example: `bunx svg-term --in <castPath> --out <outSvg>`)
187
- - TXT: `bun run src/trace/cast_to_txt.ts --in <castPath> --out <outTxt>`
188
- - GIF: `asciinema/agg` (for example: `agg --fps 30 <castPath> <outGif>`)
531
+ Apache-2.0
package/dist/agent.mjs ADDED
@@ -0,0 +1,2 @@
1
+ import { a as runAgentSpecPath, i as runAgentSpec, n as printAittyLaunchPlan, r as replayAgentRecordPath, t as defaultSpecNameForPath } from "./runner-DzZlFrt1.mjs";
2
+ export { defaultSpecNameForPath, printAittyLaunchPlan, replayAgentRecordPath, runAgentSpec, runAgentSpecPath };
package/dist/bin/ptywright.mjs ADDED
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/env bun
2
+ import { t as main } from "../cli-DIUx2w6X.mjs";
3
+ //#region src/bin/ptywright.ts
4
+ await main();
5
+ //#endregion
6
+ export {};