ptywright 0.1.0 → 0.2.0
- package/README.md +459 -116
- package/dist/agent.mjs +2 -0
- package/dist/bin/ptywright.mjs +6 -0
- package/dist/cli-DIUx2w6X.mjs +3587 -0
- package/dist/cli.mjs +2 -0
- package/{src/index.ts → dist/index.mjs} +7 -9
- package/dist/mcp.mjs +2 -0
- package/dist/pty-cassette.mjs +24 -0
- package/dist/pty_like-Cpkh_O9B.mjs +404 -0
- package/dist/runner-DzZlFrt1.mjs +1897 -0
- package/dist/runner-zApMYWZx.mjs +3257 -0
- package/dist/script.mjs +2 -0
- package/dist/server-VHuEWWj_.mjs +3068 -0
- package/dist/session.mjs +2 -0
- package/dist/terminal_session-DopC7Xg6.mjs +893 -0
- package/package.json +28 -21
- package/schemas/ptywright-agent-cassette.schema.json +57 -0
- package/schemas/ptywright-agent-check.schema.json +122 -0
- package/schemas/ptywright-agent-manifest.schema.json +107 -0
- package/schemas/ptywright-agent-promote.schema.json +146 -0
- package/schemas/ptywright-agent-replay-summary.schema.json +140 -0
- package/schemas/ptywright-agent-run.schema.json +126 -0
- package/schemas/ptywright-agent.schema.json +182 -0
- package/schemas/ptywright-pty-cassette.schema.json +86 -0
- package/schemas/ptywright-script-manifest.schema.json +75 -0
- package/schemas/ptywright-script-run-summary.schema.json +114 -0
- package/schemas/ptywright-script.schema.json +55 -3
- package/skills/ptywright-testing/SKILL.md +53 -33
- package/bin/ptywright +0 -4
- package/src/cli.ts +0 -414
- package/src/generator/doc_parser.ts +0 -341
- package/src/generator/generate.ts +0 -161
- package/src/generator/index.ts +0 -10
- package/src/generator/script_generator.ts +0 -209
- package/src/generator/step_extractor.ts +0 -397
- package/src/mcp/http_server.ts +0 -174
- package/src/mcp/script_recording.ts +0 -238
- package/src/mcp/server.ts +0 -1348
- package/src/pty/bun_pty_adapter.ts +0 -34
- package/src/pty/bun_terminal_adapter.ts +0 -149
- package/src/pty/pty_adapter.ts +0 -31
- package/src/script/dsl.ts +0 -188
- package/src/script/module.ts +0 -43
- package/src/script/path.ts +0 -151
- package/src/script/run.ts +0 -108
- package/src/script/run_all.ts +0 -229
- package/src/script/runner.ts +0 -983
- package/src/script/schema.ts +0 -237
- package/src/script/steps/assert_snapshot_equals.ts +0 -21
- package/src/script/steps/index.ts +0 -2
- package/src/script/suite_report.ts +0 -626
- package/src/session/session_manager.ts +0 -145
- package/src/session/terminal_session.ts +0 -473
- package/src/terminal/ansi.ts +0 -142
- package/src/terminal/keys.ts +0 -180
- package/src/terminal/mask.ts +0 -70
- package/src/terminal/mouse.ts +0 -75
- package/src/terminal/snapshot.ts +0 -196
- package/src/terminal/style.ts +0 -121
- package/src/terminal/view.ts +0 -49
- package/src/trace/asciicast.ts +0 -20
- package/src/trace/asciinema_player_assets.ts +0 -44
- package/src/trace/cast_to_txt.ts +0 -116
- package/src/trace/recorder.ts +0 -110
- package/src/trace/report.ts +0 -2092
- package/src/types.ts +0 -86
- package/src/util/hash.ts +0 -8
- package/src/util/sleep.ts +0 -5
package/README.md
CHANGED
|
@@ -1,188 +1,531 @@
|
|
|
1
1
|
# ptywright
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
[中文文档](./README_ZH.md)
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
A universal "Terminal DevTools / Playwright driver": launch any CLI/TUI via a PTY, feed its ANSI/VT output to `@xterm/headless` to rebuild the screen grid, and expose the result as MCP (stdio) tools.
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
bun install
|
|
9
|
-
|
|
10
|
-
# Default: start the MCP server (equivalent to `ptywright mcp`)
|
|
11
|
-
bun run bin/ptywright
|
|
7
|
+
## Installation
|
|
12
8
|
|
|
13
|
-
|
|
14
|
-
#
|
|
9
|
+
```bash
|
|
10
|
+
# Recommended: Run with bunx (no install needed)
|
|
11
|
+
bunx ptywright@latest --help
|
|
15
12
|
|
|
16
|
-
#
|
|
17
|
-
|
|
13
|
+
# Or install globally
|
|
14
|
+
bun add -g ptywright
|
|
15
|
+
ptywright --help
|
|
18
16
|
|
|
19
|
-
#
|
|
20
|
-
|
|
21
|
-
|
|
17
|
+
# Or via npm/npx
|
|
18
|
+
npx -y ptywright@latest --help
|
|
19
|
+
npm install -g ptywright
|
|
22
20
|
```
|
|
23
21
|
|
|
24
|
-
##
|
|
22
|
+
## Quick Start
|
|
25
23
|
|
|
26
|
-
|
|
24
|
+
### Use as MCP Server
|
|
27
25
|
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
26
|
+
```bash
|
|
27
|
+
# stdio mode (default)
|
|
28
|
+
bunx ptywright@latest mcp
|
|
31
29
|
|
|
32
|
-
|
|
30
|
+
# HTTP mode
|
|
31
|
+
bunx ptywright@latest mcp-http --port 3000
|
|
33
32
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
- `snapshot_text`: Return visible screen text (for Agent "seeing" and golden snapshots)
|
|
38
|
-
- `snapshot_view`: Human-friendly snapshot (with metadata + line numbers)
|
|
39
|
-
- `wait_for_text`: Wait for text/regex to appear
|
|
40
|
-
- `wait_for_stable_screen`: Wait for the screen to stabilize within a quiet window (reduces flakiness)
|
|
41
|
-
- `assert`: Assert on the current screen (text/regex/semantic)
|
|
42
|
-
- `close_session`: Close the session
|
|
33
|
+
# Minimal tools (reduce Agent context pressure)
|
|
34
|
+
bunx ptywright@latest mcp --caps core
|
|
35
|
+
```
|
|
43
36
|
|
|
44
|
-
###
|
|
37
|
+
### Configure MCP Client
|
|
45
38
|
|
|
46
|
-
|
|
47
|
-
- `snapshot_view_ansi`: `snapshot_view` with ANSI/SGR styles
|
|
39
|
+
**Claude Desktop / Cursor** (`~/.config/claude/claude_desktop_config.json`):
|
|
48
40
|
|
|
49
|
-
|
|
41
|
+
```json
|
|
42
|
+
{
|
|
43
|
+
"mcpServers": {
|
|
44
|
+
"ptywright": {
|
|
45
|
+
"command": "bunx",
|
|
46
|
+
"args": ["ptywright@latest", "mcp"]
|
|
47
|
+
}
|
|
48
|
+
}
|
|
49
|
+
}
|
|
50
|
+
```
|
|
50
51
|
|
|
51
|
-
|
|
52
|
-
- `run_script`: Run `scriptPath=file.json|file.ts` and produce artifacts (cast/report/failure snapshots)
|
|
53
|
-
- `run_all_scripts`: Batch-run scripts in a directory (recursive; supports `includeEntries/maxEntries` to control output)
|
|
54
|
-
- `generate_test_from_doc`: Generate executable scripts from documentation (local/URL)
|
|
55
|
-
- `inspect_failure`: View the screen and error from the most recent failure
|
|
52
|
+
**Minimal mode** (load core tools only):
|
|
56
53
|
|
|
57
|
-
|
|
54
|
+
```json
|
|
55
|
+
{
|
|
56
|
+
"mcpServers": {
|
|
57
|
+
"ptywright": {
|
|
58
|
+
"command": "bunx",
|
|
59
|
+
"args": ["ptywright@latest", "mcp", "--caps", "core"]
|
|
60
|
+
}
|
|
61
|
+
}
|
|
62
|
+
}
|
|
63
|
+
```
|
|
58
64
|
|
|
59
|
-
|
|
60
|
-
- `mark`: Add a marker to the trace (asciicast marker event)
|
|
65
|
+
**HTTP mode** (for Web clients):
|
|
61
66
|
|
|
62
|
-
|
|
67
|
+
```json
|
|
68
|
+
{
|
|
69
|
+
"mcpServers": {
|
|
70
|
+
"ptywright": {
|
|
71
|
+
"command": "bunx",
|
|
72
|
+
"args": ["ptywright@latest", "mcp-http", "--port", "3000"]
|
|
73
|
+
}
|
|
74
|
+
}
|
|
75
|
+
}
|
|
76
|
+
```
|
|
63
77
|
|
|
64
|
-
###
|
|
78
|
+
### CLI Commands
|
|
65
79
|
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
- Combos: `Ctrl+C`, `Ctrl+Shift+R`, `Alt+X`/`Meta+X`, `Shift+Tab`, `Ctrl+Up`
|
|
70
|
-
- Navigation: `Up/Down/Left/Right`, `Home/End`, `PageUp/PageDown`, `Insert/Delete`, `F1..F12`
|
|
71
|
-
- Compatible: `c-x` (equivalent to `Ctrl+X`)
|
|
80
|
+
```bash
|
|
81
|
+
# Run a single test script
|
|
82
|
+
bunx ptywright@latest run scripts/demo.json
|
|
72
83
|
|
|
73
|
-
|
|
84
|
+
# Run all scripts (generate HTML report)
|
|
85
|
+
bunx ptywright@latest run-all --dir scripts
|
|
74
86
|
|
|
75
|
-
|
|
76
|
-
|
|
87
|
+
# Show help
|
|
88
|
+
bunx ptywright@latest --help
|
|
77
89
|
```
|
|
78
90
|
|
|
79
|
-
|
|
80
|
-
- PTY + xterm parsing and snapshot tests
|
|
81
|
-
- MCP server end-to-end smoke test (the client launches the server over stdio and calls tools)
|
|
91
|
+
### Raw PTY Cassette
|
|
82
92
|
|
|
83
|
-
|
|
93
|
+
`ptywright pty` records the raw PTY stream once and replays it later without
|
|
94
|
+
rerunning the original command. This is intended for browser terminal renderers
|
|
95
|
+
that need deterministic regression tests for prompts or AI sessions that are
|
|
96
|
+
hard to reproduce live.
|
|
84
97
|
|
|
85
|
-
|
|
98
|
+
```bash
|
|
99
|
+
# Record output/input/resize/exit as base64 PTY events
|
|
100
|
+
bunx ptywright@latest pty record --out tests/cassettes/codex.pty.json -- codex
|
|
86
101
|
|
|
87
|
-
|
|
102
|
+
# Replay the same raw output stream instantly
|
|
103
|
+
bunx ptywright@latest pty replay tests/cassettes/codex.pty.json
|
|
88
104
|
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
105
|
+
# Validate or inspect the portable artifact
|
|
106
|
+
bunx ptywright@latest pty validate tests/cassettes/codex.pty.json
|
|
107
|
+
bunx ptywright@latest pty inspect tests/cassettes/codex.pty.json
|
|
92
108
|
```
|
|
93
109
|
|
|
94
|
-
|
|
110
|
+
External projects do not need to depend on aitty. Use the structural
|
|
111
|
+
`wrapPtyLike` API for `node-pty`/`bun-pty` style objects:
|
|
95
112
|
|
|
96
|
-
```
|
|
97
|
-
|
|
113
|
+
```ts
|
|
114
|
+
import { wrapPtyLike } from "ptywright/pty-cassette";
|
|
115
|
+
|
|
116
|
+
const recorded = wrapPtyLike(pty, {
|
|
117
|
+
path: "tests/cassettes/session.pty.json",
|
|
118
|
+
terminal: { cols: 120, rows: 40, term: "xterm-256color" },
|
|
119
|
+
command: { file: "codex", args: [] },
|
|
120
|
+
});
|
|
121
|
+
|
|
122
|
+
recorded.write("hello\r");
|
|
123
|
+
// output and exit are captured from pty.onData/onExit
|
|
98
124
|
```
|
|
99
125
|
|
|
100
|
-
|
|
126
|
+
For Bun Terminal callback-style integration, create a recorder and call
|
|
127
|
+
`recordOutput` from the terminal `data` hook, or use
|
|
128
|
+
`wrapBunTerminalOptions`. The cassette can then be replayed into any renderer
|
|
129
|
+
and compared by that renderer's DOM/text snapshot tests.
|
|
130
|
+
|
|
131
|
+
## Tools
|
|
132
|
+
|
|
133
|
+
All tools are enabled by default (`--caps all`). Use `--caps core` or combine as needed:
|
|
134
|
+
|
|
135
|
+
- Default: `--caps all`
|
|
136
|
+
- Minimal: `--caps core`
|
|
137
|
+
- Combined: `--caps core,debug,script,recording`
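
As a rough illustration of how a comma-separated `--caps` value could be interpreted, the sketch below expands it into a set of capability groups. `parseCaps` and `CAP_GROUPS` are hypothetical names for this example only, and treating a missing value as `all` is an assumption based on the documented default, not ptywright's actual option parser.

```typescript
// Hypothetical sketch: not ptywright's option parser.
const CAP_GROUPS = ["core", "debug", "script", "recording"] as const;
type Cap = (typeof CAP_GROUPS)[number];

function parseCaps(value: string | undefined): Set<Cap> {
  // Assumption: a missing or empty value behaves like "all" (the documented default).
  const names = (value ?? "all")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const caps = new Set<Cap>();
  for (const name of names.length > 0 ? names : ["all"]) {
    if (name === "all") {
      // "all" expands to every known group.
      for (const cap of CAP_GROUPS) caps.add(cap);
    } else if ((CAP_GROUPS as readonly string[]).indexOf(name) !== -1) {
      caps.add(name as Cap);
    } else {
      throw new Error(`unknown capability group: ${name}`);
    }
  }
  return caps;
}

console.log(parseCaps("core,debug").has("debug"));
```

Rejecting unknown group names early, as this sketch does, is one possible design; a real implementation might instead ignore them.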
|
|
138
|
+
|
|
139
|
+
### core
|
|
140
|
+
|
|
141
|
+
- `list_sessions` / `select_session`: Manage and select sessions
|
|
142
|
+
- `launch_session`: Start a PTY session (becomes default session)
|
|
143
|
+
- `send_text` / `press_key`: Send input
|
|
144
|
+
- `snapshot_text`: Return visible screen text (for Agent "seeing" and golden snapshots)
|
|
145
|
+
- `snapshot_view`: Human-friendly snapshot (with metadata + line numbers)
|
|
146
|
+
- `wait_for_text`: Wait for text/regex to appear
|
|
147
|
+
- `wait_for_stable_screen`: Wait for the screen to stabilize within a quiet window (reduces flakiness)
|
|
148
|
+
- `assert`: Assert on current screen (text/regex/semantic)
|
|
149
|
+
- `close_session`: Close session
|
|
150
|
+
|
|
151
|
+
### debug (optional)
|
|
152
|
+
|
|
153
|
+
- `snapshot_ansi`: Return visible screen with ANSI/SGR styles (for debug/human review)
|
|
154
|
+
- `snapshot_view_ansi`: `snapshot_view` with ANSI/SGR styles
|
|
155
|
+
|
|
156
|
+
### script (optional)
|
|
157
|
+
|
|
158
|
+
- `run_routine`: Execute multi-step interactions in one call (type/key/wait/assert/snapshot)
|
|
159
|
+
- `run_script`: Run `scriptPath=file.json|file.ts` and produce artifacts (cast/report/failure snapshots)
|
|
160
|
+
- `run_all_scripts`: Run scripts in directory recursively (supports `includeEntries/maxEntries`)
|
|
161
|
+
- `generate_test_from_doc`: Generate executable scripts from documentation (local/URL)
|
|
162
|
+
- `inspect_failure`: View last failure screen and error
|
|
163
|
+
|
|
164
|
+
### recording (optional)
|
|
165
|
+
|
|
166
|
+
- `start_script_recording` / `stop_script_recording`: Record MCP tool calls and export replayable scripts (JSON + goldens)
|
|
167
|
+
- `mark`: Add marker to trace (asciicast marker event)
|
|
168
|
+
|
|
169
|
+
### `mask` Parameter
|
|
170
|
+
|
|
171
|
+
`snapshot_text/snapshot_ansi/snapshot_view/snapshot_view_ansi` support `mask=[{regex,flags?,replacement?,preserveLength?}]` to convert random IDs/timestamps into diffable stable snapshots.
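
To make the effect of a mask rule concrete, the sketch below applies rules of the same shape to a snapshot string. `applyMask` and `MaskRule` are hypothetical names written for this example, and the defaults used here (global matching, a `<masked>` replacement, `*` padding for `preserveLength`) are assumptions rather than ptywright's documented behavior.

```typescript
// Hypothetical sketch: not ptywright's implementation.
interface MaskRule {
  regex: string;
  flags?: string;
  replacement?: string;
  preserveLength?: boolean;
}

function applyMask(text: string, rules: MaskRule[]): string {
  let out = text;
  for (const rule of rules) {
    // Assumption: rules always match globally so every occurrence is masked.
    const flags = rule.flags ?? "";
    const re = new RegExp(rule.regex, flags.indexOf("g") !== -1 ? flags : flags + "g");
    const replacement = rule.replacement ?? "<masked>";
    out = out.replace(re, (match: string) => {
      if (!rule.preserveLength) return replacement;
      // Assumption: preserveLength repeats one fill character so that
      // column alignment in the snapshot stays unchanged.
      const fill = (rule.replacement ?? "").charAt(0) || "*";
      return fill.repeat(match.length);
    });
  }
  return out;
}

const screen = "session id=a1b2c3 started at 12:34:56";
const masked = applyMask(screen, [
  { regex: "id=[0-9a-f]+", replacement: "id=<id>" },
  { regex: "\\d{2}:\\d{2}:\\d{2}", replacement: "#", preserveLength: true },
]);
console.log(masked); // session id=<id> started at ########
```

Masking before comparison keeps golden files stable across runs while leaving the surrounding layout intact.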
|
|
172
|
+
|
|
173
|
+
### `press_key` Key Spec
|
|
174
|
+
|
|
175
|
+
Supports single keys and modifier combinations (case-insensitive, `+`/`-` as separator):
|
|
176
|
+
- Single char: `"a"` / `"?"` (written to PTY as-is)
|
|
177
|
+
- Special keys: `Enter|Return`, `Esc|Escape`, `Backspace`, `Space`, `Tab`, `BackTab`
|
|
178
|
+
- Combos: `Ctrl+C`, `Ctrl+Shift+R`, `Alt+X`/`Meta+X`, `Shift+Tab`, `Ctrl+Up`
|
|
179
|
+
- Navigation: `Up/Down/Left/Right`, `Home/End`, `PageUp/PageDown`, `Insert/Delete`, `F1..F12`
|
|
180
|
+
- Compatible: `c-x` (equals `Ctrl+X`)
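
The key spec rules above can be sketched as a small parser. `parseKeySpec`, `ParsedKey`, and `ctrlByte` are hypothetical names for illustration only; the real parser in ptywright may differ in details such as how named keys are normalized.

```typescript
// Hypothetical sketch: not ptywright's parser.
interface ParsedKey {
  ctrl: boolean;
  alt: boolean;
  shift: boolean;
  key: string;
}

function parseKeySpec(spec: string): ParsedKey {
  // A single character (including "+" or "-") is treated as a literal key.
  if (spec.length === 1) {
    return { ctrl: false, alt: false, shift: false, key: spec };
  }
  // Both "+" and "-" separate modifiers from the final key.
  const parts = spec.split(/[+-]/).filter((p) => p.length > 0);
  const key = parts.pop() ?? "";
  const parsed: ParsedKey = { ctrl: false, alt: false, shift: false, key };
  for (const part of parts) {
    const mod = part.toLowerCase(); // modifiers are case-insensitive
    if (mod === "ctrl" || mod === "c") parsed.ctrl = true; // "c-x" equals "Ctrl+X"
    else if (mod === "alt" || mod === "meta") parsed.alt = true;
    else if (mod === "shift") parsed.shift = true;
    else throw new Error(`unknown modifier: ${part}`);
  }
  return parsed;
}

// Ctrl+<letter> is conventionally sent as the letter's code masked to its low five bits.
function ctrlByte(letter: string): number {
  return letter.toUpperCase().charCodeAt(0) & 0x1f;
}

console.log(parseKeySpec("Ctrl+Shift+R"), ctrlByte("C"));
```

Under this sketch, `Ctrl+C` resolves to byte `0x03`, which is the usual interrupt character a terminal application receives.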
|
|
101
181
|
|
|
102
182
|
## Script Runner (JSON)
|
|
103
183
|
|
|
104
|
-
|
|
184
|
+
Write TUI tests as JSON: launch → input → wait → snapshot (with mask) → assert, automatically producing `.cast` + `report.html`.
|
|
105
185
|
|
|
106
|
-
|
|
186
|
+
Optional: Add schema for editor completion/validation:
|
|
107
187
|
|
|
108
188
|
```json
|
|
109
|
-
{ "$schema": "
|
|
189
|
+
{ "$schema": "node_modules/ptywright/schemas/ptywright-script.schema.json" }
|
|
110
190
|
```
|
|
111
191
|
|
|
112
192
|
```bash
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
#
|
|
117
|
-
|
|
193
|
+
# Run single script
|
|
194
|
+
bunx ptywright@latest run scripts/m5_mask_demo.json
|
|
195
|
+
|
|
196
|
+
# Run all scripts
|
|
197
|
+
bunx ptywright@latest run-all --dir scripts
|
|
118
198
|
```
|
|
119
199
|
|
|
120
|
-
|
|
200
|
+
Artifacts go to `.tmp/runs/<name>/` by default (override with `--artifacts-dir`).
|
|
201
|
+
|
|
202
|
+
Batch runs generate an overview report:
|
|
203
|
+
- Default: `.tmp/run-all/index.html` + `.tmp/run-all/run.summary.json`
|
|
204
|
+
- With `--artifacts-root <dir>`: `<dir>/index.html` + `<dir>/run.summary.json`
|
|
205
|
+
- `run.summary.json` stores `commands.runAll.argv` and
|
|
206
|
+
`commands.updateGoldens.argv` so automation can replay the suite or update
|
|
207
|
+
goldens without reconstructing CLI arguments.
|
|
208
|
+
|
|
209
|
+
You can read or execute those commands directly from the generated artifact:
|
|
121
210
|
|
|
122
211
|
```bash
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
|
|
212
|
+
bunx ptywright@latest script commands .tmp/run-all --json
|
|
213
|
+
bunx ptywright@latest script commands .tmp/run-all/run.summary.json --command runAll
|
|
214
|
+
bunx ptywright@latest script inspect .tmp/run-all
|
|
215
|
+
bunx ptywright@latest script validate .tmp/run-all
|
|
216
|
+
bunx ptywright@latest script exec .tmp/run-all --command updateGoldens
|
|
126
217
|
```
|
|
127
218
|
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
219
|
+
Suite directories also include `ptywright-script.manifest.json`, which indexes
|
|
220
|
+
the generated summary, reports, casts, data, and failure artifacts with
|
|
221
|
+
`bytes`/`sha256`. `script validate`, `script inspect`, `script commands`, and
|
|
222
|
+
`script exec` verify that manifest before using a directory bundle, so copied
|
|
223
|
+
script run artifacts can be replayed or updated without trusting stale files.
|
|
131
224
|
|
|
132
|
-
|
|
225
|
+
On failure, additional files are saved:
|
|
226
|
+
- `failure.error.txt` (error stack)
|
|
227
|
+
- `failure.step.json` (failed step info)
|
|
228
|
+
- `failure.last.txt` / `failure.last.view.txt` (last frame snapshot)
|
|
133
229
|
|
|
134
|
-
|
|
135
|
-
bun run script:run examples/json_custom_steps_demo.json --steps scripts/m6_json_custom_steps.ts
|
|
136
|
-
```
|
|
230
|
+
`report.html` includes a **Timeline View** that shows the screen snapshot after each step. Click the `debug` badge to switch to the debug view.
|
|
137
231
|
|
|
138
|
-
|
|
232
|
+
Built-in steps (no `--steps` needed):
|
|
233
|
+
- `assert`: Assert text/regex (`text`/`regex`)
|
|
234
|
+
- `assertSemantic`: Semantic assertion placeholder (`prompt`)
|
|
235
|
+
- `sleep`: Fixed wait
|
|
236
|
+
- `expectMeta`: Assert terminal meta
|
|
237
|
+
- `waitForExit`: Wait for process exit
|
|
238
|
+
- `sendMouse`: Send SGR mouse events
|
|
139
239
|
|
|
140
|
-
|
|
141
|
-
- `failure.error.txt` (error stack)
|
|
142
|
-
- `failure.step.json` (failed step info)
|
|
143
|
-
- `failure.last.txt` / `failure.last.view.txt` (last frame snapshot)
|
|
240
|
+
### Framework Backends
|
|
144
241
|
|
|
145
|
-
`
|
|
242
|
+
`launch.backend` defaults to `pty`. For faster framework-level checks, use
|
|
243
|
+
`frames`, `ratatui`, or `ink` to run the same script steps against deterministic
|
|
244
|
+
frames without starting a PTY:
|
|
146
245
|
|
|
147
|
-
|
|
246
|
+
```json
|
|
247
|
+
{
|
|
248
|
+
"$schema": "../schemas/ptywright-script.schema.json",
|
|
249
|
+
"name": "ratatui_snapshot",
|
|
250
|
+
"launch": {
|
|
251
|
+
"backend": "ratatui",
|
|
252
|
+
"cols": 60,
|
|
253
|
+
"rows": 12,
|
|
254
|
+
"frames": [
|
|
255
|
+
"Screen: Dashboard\nMode: HIGH",
|
|
256
|
+
"Screen: Permissions\nMode: LOW"
|
|
257
|
+
]
|
|
258
|
+
},
|
|
259
|
+
"steps": [
|
|
260
|
+
{ "type": "waitForText", "text": "Dashboard" },
|
|
261
|
+
{ "type": "pressKey", "key": "Enter" },
|
|
262
|
+
{ "type": "snapshot", "kind": "text", "saveAs": "final" },
|
|
263
|
+
{ "type": "expect", "from": "final", "contains": ["Mode: LOW"] }
|
|
264
|
+
]
|
|
265
|
+
}
|
|
266
|
+
```
|
|
148
267
|
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
-
|
|
154
|
-
- `waitForExit`: Wait for process exit
|
|
155
|
-
- `sendMouse`: Send SGR mouse events
|
|
268
|
+
`ratatui` is intended for text emitted by `TestBackend`/insta-style snapshots.
|
|
269
|
+
`ink` can load a module via `frameModule` that exports `frames`, `frame`,
|
|
270
|
+
`snapshot`, or `lastFrame`. Input steps such as `pressKey` and `sendText`
|
|
271
|
+
advance to the next frame by default, so the assertion path stays identical to
|
|
272
|
+
the PTY end-to-end script.
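
The frame-stepping behavior described here can be pictured with a minimal sketch. `FrameSession` is a hypothetical class written for this example, and staying on the last frame once the list is exhausted is an assumption, not documented ptywright behavior.

```typescript
// Hypothetical sketch of frame-based stepping; not ptywright's runner.
class FrameSession {
  private index = 0;
  private readonly frames: string[];

  constructor(frames: string[]) {
    if (frames.length === 0) throw new Error("at least one frame is required");
    this.frames = frames;
  }

  // The current "screen" is simply the current frame.
  snapshotText(): string {
    return this.frames[this.index] ?? "";
  }

  // Input advances to the next frame; at the end it stays on the last one (assumption).
  pressKey(_key: string): void {
    this.index = Math.min(this.index + 1, this.frames.length - 1);
  }
}

const session = new FrameSession([
  "Screen: Dashboard\nMode: HIGH",
  "Screen: Permissions\nMode: LOW",
]);
const before = session.snapshotText();
session.pressKey("Enter");
const after = session.snapshotText();
console.log(before.indexOf("Dashboard") !== -1, after.indexOf("Mode: LOW") !== -1);
```

Because assertions only ever read the current frame, the same `waitForText`/`expect` steps can run against either a live PTY screen or a prepared frame list.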
|
|
156
273
|
|
|
157
|
-
|
|
274
|
+
For `type:"custom"` steps, inject handlers with `--steps <module.ts>`:
|
|
158
275
|
|
|
159
|
-
|
|
276
|
+
```bash
|
|
277
|
+
bunx ptywright@latest run demo.json --steps custom_steps.ts
|
|
278
|
+
```
|
|
279
|
+
|
|
280
|
+
## Script Recording (MCP)
|
|
160
281
|
|
|
161
|
-
|
|
282
|
+
Record tool calls into replayable scripts from any MCP client/Agent:
|
|
162
283
|
|
|
163
284
|
1) `start_script_recording(name="my_flow")`
|
|
164
|
-
2)
|
|
165
|
-
3)
|
|
166
|
-
4) `stop_script_recording(recordingId=...)
|
|
285
|
+
2) Execute normally: `launch_session/send_text/press_key/wait_for_*`
|
|
286
|
+
3) Add checkpoints: `mark(label="checkpoint")` (auto-generates `snapshot + expectGolden`)
|
|
287
|
+
4) `stop_script_recording(recordingId=...)` (writes `scripts/my_flow.json` + `tests/golden/scripts/my_flow/*.txt`)
|
|
167
288
|
|
|
168
289
|
## Script DSL (TypeScript)
|
|
169
290
|
|
|
170
|
-
|
|
291
|
+
Write scripts with the TypeScript builder (type-safe, composable, supports custom steps):
|
|
171
292
|
|
|
172
293
|
```bash
|
|
173
|
-
|
|
294
|
+
bunx ptywright@latest run scripts/demo.ts
|
|
295
|
+
```
|
|
296
|
+
|
|
297
|
+
Conventions:
|
|
298
|
+
- Module default export (`export default`), or export `script`.
|
|
299
|
+
- Optional `steps` export (custom step handlers) for `type:"custom"` steps.
|
|
300
|
+
- Use `pasteText("...", { bracketed: true })` for bracketed paste testing.
|
|
301
|
+
|
|
302
|
+
## Cast -> SVG/GIF (Optional)
|
|
303
|
+
|
|
304
|
+
Recording artifacts are best for failure diagnosis or manual review; prefer `snapshot_grid` diff for stable regression.
|
|
305
|
+
|
|
306
|
+
- SVG: `bunx svg-term --in <castPath> --out <outSvg>`
|
|
307
|
+
- GIF: `agg --fps 30 <castPath> <outGif>` (requires [asciinema/agg](https://github.com/asciinema/agg))
|
|
308
|
+
|
|
309
|
+
## Browser Agent Regression
|
|
310
|
+
|
|
311
|
+
The new destructive path is browser-first: ptywright can launch an agent through
|
|
312
|
+
`@aitty/cli`, drive the browser-hosted wterm DOM with Playwright, and persist a
|
|
313
|
+
replayable run artifact plus terminal/DOM snapshots.
|
|
314
|
+
|
|
315
|
+
```bash
|
|
316
|
+
# First run records snapshots, screenshots, replay metadata, and report.
|
|
317
|
+
bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
|
|
318
|
+
|
|
319
|
+
# Later runs compare terminal + DOM snapshots like a test snapshot.
|
|
320
|
+
bun run bin/ptywright agent run examples/agent_deterministic.json
|
|
321
|
+
|
|
322
|
+
# Replay does not need AI; it uses the recorded flow artifact.
|
|
323
|
+
bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.agent-run.json
|
|
324
|
+
|
|
325
|
+
# Cassette files are also directly replayable.
|
|
326
|
+
bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.cassette.json
|
|
327
|
+
|
|
328
|
+
# Promote a live run/cassette into the committed non-AI regression suite.
|
|
329
|
+
bun run bin/ptywright agent promote \
|
|
330
|
+
.tmp/agent/agent_deterministic/agent_deterministic.cassette.json \
|
|
331
|
+
--update-snapshots
|
|
332
|
+
|
|
333
|
+
# Batch replay committed cassettes/run records as a regression suite.
|
|
334
|
+
bun run bin/ptywright agent replay-all .tmp/agent --artifacts-root .tmp/agent-replay-all
|
|
335
|
+
|
|
336
|
+
# Rerun directly from a generated summary artifact.
|
|
337
|
+
bun run bin/ptywright agent rerun .tmp/agent-promote/agent_deterministic/agent-promote.summary.json
|
|
338
|
+
bun run bin/ptywright agent rerun .tmp/agent-check/agent-check.summary.json
|
|
339
|
+
bun run bin/ptywright agent rerun .tmp/agent-check/agent-replay.summary.json --update-snapshots
|
|
340
|
+
|
|
341
|
+
# Read reusable commands from any supported agent artifact.
|
|
342
|
+
bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --json
|
|
343
|
+
bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --command rerun
|
|
344
|
+
bun run bin/ptywright agent commands .tmp/agent-check --json
|
|
345
|
+
bun run bin/ptywright agent inspect .tmp/agent-check
|
|
346
|
+
bun run bin/ptywright agent inspect .tmp/agent-check --json
|
|
347
|
+
bun run bin/ptywright agent validate .tmp/agent-check
|
|
348
|
+
bun run bin/ptywright agent exec .tmp/agent-check --command rerun
|
|
349
|
+
bun run bin/ptywright agent exec .tmp/agent-check --command updateSnapshots
|
|
350
|
+
bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command rerun
|
|
351
|
+
bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command updateSnapshots
|
|
352
|
+
|
|
353
|
+
# Validate flow/cassette/run-record/summary artifacts before committing.
|
|
354
|
+
bun run bin/ptywright agent validate .tmp/agent-replay-all
|
|
355
|
+
|
|
356
|
+
# Run committed cassette replay regression without launching live agents.
|
|
357
|
+
bun run bin/ptywright agent check
|
|
358
|
+
bun run bin/ptywright agent check --json
|
|
359
|
+
|
|
360
|
+
# Update terminal/DOM baselines from committed cassettes intentionally.
|
|
361
|
+
bun run bin/ptywright agent replay-all tests/agent-cassettes --update-snapshots
|
|
362
|
+
|
|
363
|
+
# Record browser interactions into a replayable flow spec.
|
|
364
|
+
bun run bin/ptywright agent record examples/agents/codex_browser_smoke.json \
|
|
365
|
+
--out scripts/agents/codex_recorded.flow.json \
|
|
366
|
+
--duration-ms 60000 \
|
|
367
|
+
--headed
|
|
368
|
+
|
|
369
|
+
# Generate starter specs for real agents.
|
|
370
|
+
bun run bin/ptywright agent init codex examples/agents/codex_browser_smoke.json
|
|
371
|
+
bun run bin/ptywright agent init claude examples/agents/claude_browser_smoke.json
|
|
372
|
+
bun run bin/ptywright agent init droidx examples/agents/droidx_browser_smoke.json
|
|
373
|
+
```
|
|
374
|
+
|
|
375
|
+
Artifacts are split intentionally:
|
|
376
|
+
- `.tmp/agent/<name>/` contains run output, screenshots, `*.flow.json`,
|
|
377
|
+
`*.agent-run.json`, `*.cassette.json`, `index.html`, and
|
|
378
|
+
`ptywright-agent.manifest.json`.
|
|
379
|
+
- `tests/agent-snapshots/<name>/` contains stable terminal/DOM baselines.
|
|
380
|
+
- `--update-snapshots` is the explicit update path for intentional UI changes.
|
|
381
|
+
|
|
382
|
+
`launch.mode=aitty` runs `aitty exec --launch print -- <agent>`. By default
|
|
383
|
+
ptywright resolves the sibling `../aitty/packages/cli/dist/cli.js`; set
|
|
384
|
+
`PTYWRIGHT_AITTY_CLI` or `launch.aitty.command` to override it.
|
|
385
|
+
|
|
386
|
+
Set `launch.agentFlavor` to `codex`, `claude`, `droid`, or `generic` to opt
|
|
387
|
+
into built-in mask presets for timestamps, generated ids, model names, token
|
|
388
|
+
counts, and other non-deterministic terminal text. Explicit
|
|
389
|
+
`defaults.mask=[...]` rules are appended after the preset, so project-specific
|
|
390
|
+
noise can be hidden without rewriting the runner.
|
|
391
|
+
|
|
392
|
+
`agent record` opens the same browser-hosted terminal and writes the captured
|
|
393
|
+
keyboard/click steps back to a normal flow JSON. The output can be committed and
|
|
394
|
+
run later with `agent run`, while `.agent-run.json` remains the per-run replay
|
|
395
|
+
record generated by the runner. Run records must include
|
|
396
|
+
`commands.replay.argv` and `commands.updateSnapshots.argv`, so automation can
|
|
397
|
+
replay or intentionally update the captured flow without parsing shell strings.
|
|
398
|
+
|
|
399
|
+
`agent run` is the live path: it launches the configured process and updates or
|
|
400
|
+
compares terminal/DOM snapshots. `agent promote <run|cassette>` is the
|
|
401
|
+
deliberate consolidation step after a good live run: it copies the cassette into
|
|
402
|
+
`tests/agent-cassettes/<name>/`, rewrites its `snapshotDir`, optionally updates
|
|
403
|
+
terminal/DOM baselines, replays the promoted cassette, and writes
|
|
404
|
+
`agent-promote.summary.json` with direct commands for future non-AI checks. HTML
|
|
405
|
+
reports also surface replay/update/inspect commands so failed runs can be
|
|
406
|
+
reproduced directly from the report page.
|
|
407
|
+
`agent replay` is the single-case cassette regression path: it accepts either
|
|
408
|
+
`.agent-run.json` or `.cassette.json`, serves a local replay page, and reproduces
|
|
409
|
+
the previously captured terminal DOM without launching Codex, Claude, Droid, or
|
|
410
|
+
any other live agent process. `agent replay-all` recursively scans a directory
|
|
411
|
+
for `.agent-run.json` and `.cassette.json` files, then writes
|
|
412
|
+
`agent-replay.summary.json` and an HTML suite report so committed cassettes can
|
|
413
|
+
be run like a snapshot regression suite.
|
|
414
|
+
`--update-snapshots` works on `agent replay-all`, so intentional DOM/terminal
|
|
415
|
+
baseline changes can be updated from committed cassettes without a live agent.
|
|
416
|
+
Cassette files embed the normalized flow spec plus frame hashes, so they remain
|
|
417
|
+
self-contained replay artifacts when copied away from the original run
|
|
418
|
+
directory. Replay runs also copy the source cassette into the replay artifact
|
|
419
|
+
directory and write run records that point at that local copy, so the replay
|
|
420
|
+
directory can be moved as a durable reproduction bundle. Run/check/promote and
|
|
421
|
+
replay-all outputs also include `ptywright-agent.manifest.json`, which indexes
|
|
422
|
+
produced files with artifact-root-relative paths plus `bytes` and `sha256`,
|
|
423
|
+
stores reusable `commands.*.argv`, and can be passed to `agent commands`,
|
|
424
|
+
`agent inspect`, `agent exec`, or `agent validate`. `agent inspect
|
|
425
|
+
<artifact|dir>` is the self-describing bundle check: when pointed at an artifact
|
|
426
|
+
directory it prefers `ptywright-agent.manifest.json`, validates indexed file
|
|
427
|
+
hashes, summarizes manifest file kinds/validation stages, and prints the
|
|
428
|
+
relocated reusable commands. Because file entries are relative to the manifest
|
|
429
|
+
directory and manifest commands are relocated when read, copying the whole
|
|
430
|
+
artifact directory preserves both manifest validation and direct
|
|
431
|
+
`agent inspect <copied>` /
|
|
432
|
+
`agent commands <copied>` /
|
|
433
|
+
`agent exec <copied> --command rerun` workflows.
|
|
434
|
+
When inspecting a moved summary file that is the manifest primary artifact,
|
|
435
|
+
`agent inspect` also prints `commandsManifest=<path>` and includes
|
|
436
|
+
`commands.manifestPath` in JSON output, making it explicit which manifest bundle
|
|
437
|
+
will validate and relocate the stored commands before execution.
|
|
438
|
+
The same directory entrypoint works for copied live-run bundles too, so
|
|
439
|
+
`agent exec <copied-run> --command replay` and `--command updateSnapshots`
|
|
440
|
+
remain usable after the original run directory is deleted.
|
|
441
|
+
Copied replay-suite bundles are rerun from the run records stored under the
|
|
442
|
+
bundle's own `tests/` directory, so they do not need the original cassette input
|
|
443
|
+
directory. Promote bundles can move their artifact root and still rerun from the
|
|
444
|
+
copied manifest, while continuing to target the promoted cassette suite.
|
|
445
|
+
If `agent inspect <dir>` sees agent artifacts but no top-level manifest, it
|
|
446
|
+
still reports recursive validation results and prints a `directoryManifest`
|
|
447
|
+
diagnostic so the directory is not confused with a portable commands/exec
|
|
448
|
+
bundle.
|
|
449
|
+
For `agent commands` and `agent exec`, a directory argument means a manifest
|
|
450
|
+
bundle directory and must contain `ptywright-agent.manifest.json`; use
|
|
451
|
+
`agent validate <dir>` when you want recursive artifact discovery.
|
|
452
|
+
The generated agent flow, cassette, run-record, manifest, promote-summary,
|
|
453
|
+
replay-summary, and check-summary JSON files each carry a `$schema` URL under
|
|
454
|
+
`schemas/` so editors and CI tooling can validate the replay contract directly.
|
|
455
|
+
Run-record and summary schemas also encode the expected stored command prefixes,
|
|
456
|
+
for example `ptywright agent replay`, `ptywright agent replay-all`, and
|
|
457
|
+
`ptywright agent rerun`, so malformed commands can be caught before execution.
|
|
458
|
+
Run records and summaries reject missing or stale `commands.*.argv` metadata,
|
|
459
|
+
because those argv arrays are the non-AI replay/update contract for the artifact.
|
|
460
|
+
Promote, replay, and check summaries include `commands.*.argv` arrays for direct
|
|
461
|
+
non-AI reruns and snapshot updates. Each summary also includes
|
|
462
|
+
`commands.rerun.argv`, so downstream automation can re-execute the exact summary
|
|
463
|
+
artifact without reconstructing CLI arguments. `agent commands <artifact>
|
|
464
|
+
--command <name>` prints one shell-safe command line for scripts that want to
|
|
465
|
+
execute a specific replay/update/rerun path directly; with `--json`, the same
|
|
466
|
+
command includes `cwd`, `command.argv`, and `shell` so automation can choose
|
|
467
|
+
structured spawn or shell execution. When a moved summary/run-record is backed
|
|
468
|
+
by a sibling manifest bundle, `agent commands` also reports the manifest path in
|
|
469
|
+
plain output and JSON so automation can see which bundle is responsible for
|
|
470
|
+
relocation and integrity checks; manifest-backed command discovery validates
|
|
471
|
+
stored command targets and indexed file hashes before printing commands.
|
|
472
|
+
`agent exec <artifact> --command <name>`
|
|
473
|
+
executes a stored agent command through ptywright's own CLI dispatcher, so it
|
|
474
|
+
does not depend on shell parsing or a global `ptywright` binary. This includes
|
|
475
|
+
stored `updateSnapshots` commands, which provide the non-AI equivalent of a
|
|
476
|
+
snapshot update run from an existing summary artifact. `agent validate
|
|
477
|
+
<artifact>` also checks that every stored argv starts with a supported
|
|
478
|
+
`ptywright agent <subcommand>` shape before accepting the artifact. If validation
|
|
479
|
+
fails on `commands.*.argv`, regenerate the run/summary with `agent run`,
|
|
480
|
+
`agent replay-all`, `agent promote`, or `agent check`; do not hand-edit shell
|
|
481
|
+
strings as a recovery path, because the argv arrays are the replay contract.
|
|
482
|
+
`agent rerun <summary>` reads `agent-promote.summary.json`,
|
|
483
|
+
`agent-check.summary.json`, or
|
|
484
|
+
`agent-replay.summary.json` and replays the stored cassette directory/artifact
|
|
485
|
+
root without launching a live agent. `agent commands <artifact>` reads
|
|
486
|
+
flow/cassette/run-record/summary artifacts and prints the reusable argv commands
|
|
487
|
+
without executing them. `agent validate <path>` accepts a single artifact or a
|
|
488
|
+
directory and returns a non-zero exit code when any known agent replay artifact
|
|
489
|
+
is malformed. `agent check [dir]` validates committed cassettes under
|
|
490
|
+
`tests/agent-cassettes` by default, replays them into `.tmp/agent-check`, writes
|
|
491
|
+
`agent-check.summary.json`, then validates the generated suite output. Add
|
|
492
|
+
`--json` for a CI-friendly summary with input/replay/output counts and failure
|
|
493
|
+
details.
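
One way to see why the artifacts store `commands.*.argv` arrays instead of shell strings is to look at what rendering an argv as a single shell-safe line involves. The sketch below is a hypothetical POSIX-style quoting helper (`shellQuote` and `toShellLine` are names invented for this example, and the path containing a space is illustrative); it is not the code ptywright uses for `agent commands --command <name>`.

```typescript
// Hypothetical helper: renders an argv array as one POSIX shell-safe line.
function shellQuote(arg: string): string {
  // Arguments made only of safe characters need no quoting.
  if (/^[A-Za-z0-9_\/.:=@%+-]+$/.test(arg)) return arg;
  // Otherwise wrap in single quotes, closing and reopening around embedded single quotes.
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

function toShellLine(argv: string[]): string {
  return argv.map(shellQuote).join(" ");
}

// Illustrative argv; the directory name with a space is made up for this example.
const argv = ["ptywright", "agent", "rerun", ".tmp/agent check/agent-check.summary.json"];
console.log(toShellLine(argv));
// ptywright agent rerun '.tmp/agent check/agent-check.summary.json'
```

Going from argv to a shell line is mechanical, while the reverse requires a shell parser, which is a plausible reason to treat the arrays as the contract and the shell line as a derived convenience.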
|
|
494
|
+
|
|
495
|
+
## Development
|
|
496
|
+
|
|
497
|
+
```bash
|
|
498
|
+
bun install
|
|
499
|
+
|
|
500
|
+
# Start MCP server
|
|
501
|
+
bun run bin/ptywright mcp
|
|
502
|
+
|
|
503
|
+
# Run tests
|
|
504
|
+
bun run test
|
|
505
|
+
bun run agent:check
|
|
506
|
+
bun run check
|
|
507
|
+
|
|
508
|
+
# CI installs Chromium, runs bun run check, and uploads .tmp/agent-check.
|
|
509
|
+
|
|
510
|
+
# Lint & Format
|
|
511
|
+
bun run lint
|
|
512
|
+
bun run format:check
|
|
513
|
+
|
|
514
|
+
# Run scripts
|
|
515
|
+
bun run bin/ptywright run scripts/m5_mask_demo.json
|
|
516
|
+
bun run bin/ptywright run-all
|
|
517
|
+
|
|
518
|
+
# Run browser agent regression
|
|
519
|
+
bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
|
|
174
520
|
```
|
|
175
521
|
|
|
176
|
-
|
|
177
|
-
- Module default export (`export default`), or export `script`.
|
|
178
|
-
- Optional `steps` export (custom step handlers), used to run `type:"custom"` steps.
|
|
179
|
-
- Common handlers can be reused: `src/script/steps/*`.
|
|
180
|
-
- To test pasting, use `pasteText("...", { bracketed: true })` (bracketed paste).
|
|
522
|
+
## Environment Variables
|
|
181
523
|
|
|
182
|
-
|
|
524
|
+
- `TUI_TEST_PTY_BACKEND=auto|bun-terminal|bun-pty`
|
|
525
|
+
- default `auto`: macOS/Linux prefers `bun-terminal`, Windows uses `bun-pty`
|
|
526
|
+
- `PTYWRIGHT_CAPS=all|core|debug|script|recording`
|
|
527
|
+
- Equivalent to `--caps` parameter
|
|
183
528
|
|
|
184
|
-
|
|
529
|
+
## License
|
|
185
530
|
|
|
186
|
-
-
|
|
187
|
-
- TXT: `bun run src/trace/cast_to_txt.ts --in <castPath> --out <outTxt>`
|
|
188
|
-
- GIF: `asciinema/agg` (for example: `agg --fps 30 <castPath> <outGif>`)
|
|
531
|
+
Apache-2.0
|
package/dist/agent.mjs
ADDED
|
@@ -0,0 +1,2 @@
|
|
|
1
|
+
import { a as runAgentSpecPath, i as runAgentSpec, n as printAittyLaunchPlan, r as replayAgentRecordPath, t as defaultSpecNameForPath } from "./runner-DzZlFrt1.mjs";
|
|
2
|
+
export { defaultSpecNameForPath, printAittyLaunchPlan, replayAgentRecordPath, runAgentSpec, runAgentSpecPath };
|