ptywright 0.1.1 → 0.3.0

Files changed (67)
  1. package/README.md +318 -1
  2. package/dist/agent.mjs +2 -0
  3. package/dist/bin/ptywright.mjs +6 -0
  4. package/dist/cli-CfvlbRoZ.mjs +3585 -0
  5. package/dist/cli.mjs +2 -0
  6. package/{src/index.ts → dist/index.mjs} +7 -9
  7. package/dist/mcp.mjs +2 -0
  8. package/dist/pty-cassette.mjs +24 -0
  9. package/dist/pty_like-Cpkh_O9B.mjs +404 -0
  10. package/dist/runner-zApMYWZx.mjs +3257 -0
  11. package/dist/runner-zi0nItvB.mjs +1874 -0
  12. package/dist/script.mjs +2 -0
  13. package/dist/server-BC3yo-dq.mjs +3068 -0
  14. package/dist/session.mjs +2 -0
  15. package/dist/terminal_session-DopC7Xg6.mjs +893 -0
  16. package/package.json +28 -21
  17. package/schemas/ptywright-agent-cassette.schema.json +57 -0
  18. package/schemas/ptywright-agent-check.schema.json +122 -0
  19. package/schemas/ptywright-agent-manifest.schema.json +107 -0
  20. package/schemas/ptywright-agent-promote.schema.json +146 -0
  21. package/schemas/ptywright-agent-replay-summary.schema.json +140 -0
  22. package/schemas/ptywright-agent-run.schema.json +126 -0
  23. package/schemas/ptywright-agent.schema.json +166 -0
  24. package/schemas/ptywright-pty-cassette.schema.json +86 -0
  25. package/schemas/ptywright-script-manifest.schema.json +75 -0
  26. package/schemas/ptywright-script-run-summary.schema.json +114 -0
  27. package/schemas/ptywright-script.schema.json +55 -3
  28. package/bin/ptywright +0 -4
  29. package/src/cli.ts +0 -414
  30. package/src/generator/doc_parser.ts +0 -341
  31. package/src/generator/generate.ts +0 -161
  32. package/src/generator/index.ts +0 -10
  33. package/src/generator/script_generator.ts +0 -209
  34. package/src/generator/step_extractor.ts +0 -397
  35. package/src/mcp/http_server.ts +0 -174
  36. package/src/mcp/script_recording.ts +0 -238
  37. package/src/mcp/server.ts +0 -1348
  38. package/src/pty/bun_pty_adapter.ts +0 -34
  39. package/src/pty/bun_terminal_adapter.ts +0 -149
  40. package/src/pty/pty_adapter.ts +0 -31
  41. package/src/script/dsl.ts +0 -188
  42. package/src/script/module.ts +0 -43
  43. package/src/script/path.ts +0 -151
  44. package/src/script/run.ts +0 -108
  45. package/src/script/run_all.ts +0 -229
  46. package/src/script/runner.ts +0 -983
  47. package/src/script/schema.ts +0 -237
  48. package/src/script/steps/assert_snapshot_equals.ts +0 -21
  49. package/src/script/steps/index.ts +0 -2
  50. package/src/script/suite_report.ts +0 -626
  51. package/src/session/session_manager.ts +0 -145
  52. package/src/session/terminal_session.ts +0 -473
  53. package/src/terminal/ansi.ts +0 -142
  54. package/src/terminal/keys.ts +0 -180
  55. package/src/terminal/mask.ts +0 -70
  56. package/src/terminal/mouse.ts +0 -75
  57. package/src/terminal/snapshot.ts +0 -196
  58. package/src/terminal/style.ts +0 -121
  59. package/src/terminal/view.ts +0 -49
  60. package/src/trace/asciicast.ts +0 -20
  61. package/src/trace/asciinema_player_assets.ts +0 -44
  62. package/src/trace/cast_to_txt.ts +0 -116
  63. package/src/trace/recorder.ts +0 -110
  64. package/src/trace/report.ts +0 -2092
  65. package/src/types.ts +0 -86
  66. package/src/util/hash.ts +0 -8
  67. package/src/util/sleep.ts +0 -5
package/README.md CHANGED
@@ -88,6 +88,46 @@ bunx ptywright@latest run-all --dir scripts
  bunx ptywright@latest --help
  ```
 
+ ### Raw PTY Cassette
+
+ `ptywright pty` records the raw PTY stream once and replays it later without
+ rerunning the original command. This is intended for browser terminal renderers
+ that need deterministic regression tests for prompts or AI sessions that are
+ hard to reproduce live.
+
+ ```bash
+ # Record output/input/resize/exit as base64 PTY events
+ bunx ptywright@latest pty record --out tests/cassettes/codex.pty.json -- codex
+
+ # Replay the same raw output stream instantly
+ bunx ptywright@latest pty replay tests/cassettes/codex.pty.json
+
+ # Validate or inspect the portable artifact
+ bunx ptywright@latest pty validate tests/cassettes/codex.pty.json
+ bunx ptywright@latest pty inspect tests/cassettes/codex.pty.json
+ ```
+
+ External projects do not need a ptywright-specific PTY wrapper. Use the structural
+ `wrapPtyLike` API for `node-pty`/`bun-pty` style objects:
+
+ ```ts
+ import { wrapPtyLike } from "ptywright/pty-cassette";
+
+ const recorded = wrapPtyLike(pty, {
+ path: "tests/cassettes/session.pty.json",
+ terminal: { cols: 120, rows: 40, term: "xterm-256color" },
+ command: { file: "codex", args: [] },
+ });
+
+ recorded.write("hello\r");
+ // output and exit are captured from pty.onData/onExit
+ ```
+
+ For Bun Terminal callback-style integration, create a recorder and call
+ `recordOutput` from the terminal `data` hook, or use
+ `wrapBunTerminalOptions`. The cassette can then be replayed into any renderer
+ and compared by that renderer's DOM/text snapshot tests.
+
  ## Tools
 
  All tools are enabled by default (`--caps all`). Use `--caps core` or combine as needed:
@@ -162,6 +202,25 @@ Artifacts go to `.tmp/runs/<name>/` by default (override with `--artifacts-dir`)
  Batch runs generate an overview report:
  - Default: `.tmp/run-all/index.html` + `.tmp/run-all/run.summary.json`
  - With `--artifacts-root <dir>`: `<dir>/index.html` + `<dir>/run.summary.json`
+ - `run.summary.json` stores `commands.runAll.argv` and
+ `commands.updateGoldens.argv` so automation can replay the suite or update
+ goldens without reconstructing CLI arguments.
+
+ You can read or execute those commands directly from the generated artifact:
+
+ ```bash
+ bunx ptywright@latest script commands .tmp/run-all --json
+ bunx ptywright@latest script commands .tmp/run-all/run.summary.json --command runAll
+ bunx ptywright@latest script inspect .tmp/run-all
+ bunx ptywright@latest script validate .tmp/run-all
+ bunx ptywright@latest script exec .tmp/run-all --command updateGoldens
+ ```
+
+ Suite directories also include `ptywright-script.manifest.json`, which indexes
+ the generated summary, reports, casts, data, and failure artifacts with
+ `bytes`/`sha256`. `script validate`, `script inspect`, `script commands`, and
+ `script exec` verify that manifest before using a directory bundle, so copied
+ script run artifacts can be replayed or updated without trusting stale files.
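
The `bytes`/`sha256` check described in the added README text can be sketched as follows. This is not part of the diff and is not ptywright's implementation: the entry field name `path`, the `verifyManifestEntries` function, and the injected `readFile` callback are all hypothetical, chosen only to illustrate how a size-plus-digest index lets a consumer detect stale or modified files.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape. The README only states that files are indexed
// with `bytes` and `sha256`; the `path` field name is an assumption.
interface ManifestFileEntry {
  path: string;
  bytes: number;
  sha256: string;
}

// Injected reader so the check can run against disk or in-memory data.
type ReadFile = (path: string) => Uint8Array;

// Returns human-readable problems; an empty array means every entry matched.
function verifyManifestEntries(
  entries: ManifestFileEntry[],
  readFile: ReadFile,
): string[] {
  const problems: string[] = [];
  for (const entry of entries) {
    let content: Uint8Array;
    try {
      content = readFile(entry.path);
    } catch {
      problems.push(`${entry.path}: missing or unreadable`);
      continue;
    }
    // Cheap size check first, then the digest comparison.
    if (content.byteLength !== entry.bytes) {
      problems.push(
        `${entry.path}: expected ${entry.bytes} bytes, found ${content.byteLength}`,
      );
      continue;
    }
    const digest = createHash("sha256").update(content).digest("hex");
    if (digest !== entry.sha256.toLowerCase()) {
      problems.push(`${entry.path}: sha256 mismatch`);
    }
  }
  return problems;
}
```

A caller reading from disk could pass `(p) => readFileSync(join(bundleDir, p))` from `node:fs` and `node:path` as the reader, where `bundleDir` is whatever directory holds the manifest.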
 
  On failure, additional files are saved:
  - `failure.error.txt` (error stack)
@@ -178,6 +237,40 @@ Built-in steps (no `--steps` needed):
  - `waitForExit`: Wait for process exit
  - `sendMouse`: Send SGR mouse events
 
+ ### Framework Backends
+
+ `launch.backend` defaults to `pty`. For faster framework-level checks, use
+ `frames`, `ratatui`, or `ink` to run the same script steps against deterministic
+ frames without starting a PTY:
+
+ ```json
+ {
+ "$schema": "../schemas/ptywright-script.schema.json",
+ "name": "ratatui_snapshot",
+ "launch": {
+ "backend": "ratatui",
+ "cols": 60,
+ "rows": 12,
+ "frames": [
+ "Screen: Dashboard\nMode: HIGH",
+ "Screen: Permissions\nMode: LOW"
+ ]
+ },
+ "steps": [
+ { "type": "waitForText", "text": "Dashboard" },
+ { "type": "pressKey", "key": "Enter" },
+ { "type": "snapshot", "kind": "text", "saveAs": "final" },
+ { "type": "expect", "from": "final", "contains": ["Mode: LOW"] }
+ ]
+ }
+ ```
+
+ `ratatui` is intended for text emitted by `TestBackend`/insta-style snapshots.
+ `ink` can load a module via `frameModule` that exports `frames`, `frame`,
+ `snapshot`, or `lastFrame`. Input steps such as `pressKey` and `sendText`
+ advance to the next frame by default, so the assertion path stays identical to
+ the PTY end-to-end script.
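
The frame-advance behaviour described above can be modelled with a small cursor over pre-rendered frames. This is a sketch outside the diff, not ptywright's runner: `runFrameScript` and the `Step` type are hypothetical, and clamping at the last frame when input steps outnumber frames is an assumption the README does not state.

```typescript
// Hypothetical subset of the step types shown in the README example.
type Step =
  | { type: "waitForText"; text: string }
  | { type: "pressKey"; key: string }
  | { type: "sendText"; text: string }
  | { type: "snapshot"; kind?: "text"; saveAs: string }
  | { type: "expect"; from: string; contains: string[] };

// Runs steps against a fixed list of frames and returns the saved snapshots.
function runFrameScript(frames: string[], steps: Step[]): Map<string, string> {
  if (frames.length === 0) throw new Error("at least one frame is required");
  let index = 0;
  const current = (): string => frames[index] ?? "";
  const saved = new Map<string, string>();

  for (const step of steps) {
    switch (step.type) {
      case "waitForText":
        // Frames are static, so "waiting" reduces to a containment check.
        if (!current().includes(step.text)) {
          throw new Error(`text not found in frame ${index}: ${step.text}`);
        }
        break;
      case "pressKey":
      case "sendText":
        // Input advances to the next frame; clamping is an assumption.
        index = Math.min(index + 1, frames.length - 1);
        break;
      case "snapshot":
        saved.set(step.saveAs, current());
        break;
      case "expect": {
        const snapshot = saved.get(step.from);
        if (snapshot === undefined) {
          throw new Error(`unknown snapshot: ${step.from}`);
        }
        for (const needle of step.contains) {
          if (!snapshot.includes(needle)) {
            throw new Error(`snapshot ${step.from} is missing: ${needle}`);
          }
        }
        break;
      }
    }
  }
  return saved;
}
```

Fed the two frames and four steps from the JSON example, this model saves the second frame as `final`, which is why the `Mode: LOW` expectation holds after a single `pressKey`.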
+
  For `type:"custom"` steps, inject handlers with `--steps <module.ts>`:
 
  ```bash
@@ -213,6 +306,223 @@ Recording artifacts are best for failure diagnosis or manual review; prefer `sna
  - SVG: `bunx svg-term --in <castPath> --out <outSvg>`
  - GIF: `agg --fps 30 <castPath> <outGif>` (requires [asciinema/agg](https://github.com/asciinema/agg))
 
+ ## Browser Agent Regression
+
+ The browser-first path is integration-agnostic: ptywright launches any command
+ that prints a browser URL, drives the terminal DOM with Playwright, and persists
+ a replayable run artifact plus terminal/DOM snapshots. The browser page must
+ expose the terminal root as `[data-terminal-root]`.
+
+ ```bash
+ # First run records snapshots, screenshots, replay metadata, and report.
+ bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
+
+ # Later runs compare terminal + DOM snapshots like a test snapshot.
+ bun run bin/ptywright agent run examples/agent_deterministic.json
+
+ # Replay does not need AI; it uses the recorded flow artifact.
+ bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.agent-run.json
+
+ # Cassette files are also directly replayable.
+ bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.cassette.json
+
+ # Promote a live run/cassette into the committed non-AI regression suite.
+ bun run bin/ptywright agent promote \
+ .tmp/agent/agent_deterministic/agent_deterministic.cassette.json \
+ --update-snapshots
+
+ # Batch replay committed cassettes/run records as a regression suite.
+ bun run bin/ptywright agent replay-all .tmp/agent --artifacts-root .tmp/agent-replay-all
+
+ # Rerun directly from a generated summary artifact.
+ bun run bin/ptywright agent rerun .tmp/agent-promote/agent_deterministic/agent-promote.summary.json
+ bun run bin/ptywright agent rerun .tmp/agent-check/agent-check.summary.json
+ bun run bin/ptywright agent rerun .tmp/agent-check/agent-replay.summary.json --update-snapshots
+
+ # Read reusable commands from any supported agent artifact.
+ bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --json
+ bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --command rerun
+ bun run bin/ptywright agent commands .tmp/agent-check --json
+ bun run bin/ptywright agent inspect .tmp/agent-check
+ bun run bin/ptywright agent inspect .tmp/agent-check --json
+ bun run bin/ptywright agent validate .tmp/agent-check
+ bun run bin/ptywright agent exec .tmp/agent-check --command rerun
+ bun run bin/ptywright agent exec .tmp/agent-check --command updateSnapshots
+ bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command rerun
+ bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command updateSnapshots
+
+ # Validate flow/cassette/run-record/summary artifacts before committing.
+ bun run bin/ptywright agent validate .tmp/agent-replay-all
+
+ # Run committed cassette replay regression without launching live agents.
+ bun run bin/ptywright agent check
+ bun run bin/ptywright agent check --json
+
+ # Update terminal/DOM baselines from committed cassettes intentionally.
+ bun run bin/ptywright agent replay-all tests/agent-cassettes --update-snapshots
+
+ # Record browser interactions into a replayable flow spec.
+ bun run bin/ptywright agent record examples/agents/codex_browser_smoke.json \
+ --out scripts/agents/codex_recorded.flow.json \
+ --duration-ms 60000 \
+ --headed
+
+ # Generate starter specs for real agents.
+ bun run bin/ptywright agent init codex examples/agents/codex_browser_smoke.json
+ bun run bin/ptywright agent init claude examples/agents/claude_browser_smoke.json
+ bun run bin/ptywright agent init droidx examples/agents/droidx_browser_smoke.json
+ ```
+
+ Artifacts are split intentionally:
+ - `.tmp/agent/<name>/` contains run output, screenshots, `*.flow.json`,
+ `*.agent-run.json`, `*.cassette.json`, `index.html`, and
+ `ptywright-agent.manifest.json`.
+ - `tests/agent-snapshots/<name>/` contains stable terminal/DOM baselines.
+ - `--update-snapshots` is the explicit update path for intentional UI changes.
+
+ `launch.mode=command` is the recommended integration contract. `command` and
+ `args` are spawned directly, and ptywright reads the first URL printed to stdout
+ or stderr. Use `waitForUrlMs` to tune startup timeouts and `urlRegex` when the
+ URL is embedded in structured output. Set `launch.agentFlavor` explicitly when
+ the command is a wrapper, so mask presets still match the underlying agent.
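
The "first URL printed to stdout or stderr" contract can be illustrated with a small extraction helper. This sketch is not taken from the package: `firstUrl`, the default pattern, and the rule that a capture group in `urlRegex` takes precedence over the whole match are all assumptions made for illustration.

```typescript
// Generic fallback pattern used when no custom regex is supplied (an assumption).
const DEFAULT_URL_PATTERN = /https?:\/\/[^\s"'<>]+/;

// Returns the first URL found in the captured process output, or undefined.
function firstUrl(output: string, urlRegex?: string): string | undefined {
  const pattern = urlRegex ? new RegExp(urlRegex) : DEFAULT_URL_PATTERN;
  const match = pattern.exec(output);
  if (!match) return undefined;
  // Prefer the first capture group when the custom regex defines one
  // (hypothetical convention, useful for URLs embedded in JSON log lines).
  return match[1] ?? match[0];
}
```

With plain log output the default pattern is enough; for structured output such as `{"event":"ready","url":"..."}`, a custom pattern like `"url":"([^"]+)"` selects the field explicitly rather than relying on whichever URL happens to appear first.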
+
+ `launch.mode=url` skips process launch and points ptywright at an already
+ running browser terminal.
+
+ A wrapper integration is just a normal command that prints its browser URL:
+
+ ```json
+ {
+ "name": "codex_browser_replay",
+ "launch": {
+ "mode": "command",
+ "agentFlavor": "codex",
+ "command": "node_modules/.bin/browser-terminal-launcher",
+ "args": [
+ "--replay",
+ "test/recordings/codex-yolo.pty.json",
+ "--speed",
+ "0",
+ "--print-url"
+ ],
+ "waitForUrlMs": 15000
+ },
+ "steps": [
+ { "type": "waitForStableDom" },
+ { "type": "snapshot", "name": "codex", "targets": ["terminal", "dom"] }
+ ]
+ }
+ ```
+
+ Set `launch.agentFlavor` to `codex`, `claude`, `droid`, or `generic` to opt
+ into built-in mask presets for timestamps, generated ids, model names, token
+ counts, and other non-deterministic terminal text. Explicit
+ `defaults.mask=[...]` rules are appended after the preset, so project-specific
+ noise can be hidden without rewriting the runner.
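
The "preset first, explicit rules appended" ordering can be sketched as below. Nothing here comes from the package: the `MaskRule` shape, the example preset patterns, and both function names are hypothetical, and the real schema for `defaults.mask` entries is not shown in this diff.

```typescript
// Hypothetical rule shape: a global regex and the text that replaces matches.
interface MaskRule {
  pattern: RegExp;
  replacement: string;
}

// Stand-in for a built-in flavor preset (illustrative patterns only).
const exampleFlavorPreset: MaskRule[] = [
  { pattern: /\b\d{2}:\d{2}:\d{2}\b/g, replacement: "<time>" },
  { pattern: /\b\d+ tokens\b/g, replacement: "<n> tokens" },
];

// Preset rules run first; explicit project rules are appended after them.
function buildMaskRules(preset: MaskRule[], explicit: MaskRule[]): MaskRule[] {
  return [...preset, ...explicit];
}

// Applies every rule in order to the snapshot text.
function applyMask(text: string, rules: MaskRule[]): string {
  return rules.reduce(
    (masked, rule) => masked.replace(rule.pattern, rule.replacement),
    text,
  );
}
```

Because the rules run in sequence, an explicit rule sees text that the preset has already masked, so project rules only need to cover noise the preset leaves behind.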
422
+
423
+ `agent record` opens the same browser-hosted terminal and writes the captured
424
+ keyboard/click steps back to a normal flow JSON. The output can be committed and
425
+ run later with `agent run`, while `.agent-run.json` remains the per-run replay
426
+ record generated by the runner. Run records must include
427
+ `commands.replay.argv` and `commands.updateSnapshots.argv`, so automation can
428
+ replay or intentionally update the captured flow without parsing shell strings.
429
+
430
+ `agent run` is the live path: it launches the configured process and updates or
431
+ compares terminal/DOM snapshots. `agent promote <run|cassette>` is the
432
+ intentional solidify step after a good live run: it copies the cassette into
433
+ `tests/agent-cassettes/<name>/`, rewrites its `snapshotDir`, optionally updates
434
+ terminal/DOM baselines, replays the promoted cassette, and writes
435
+ `agent-promote.summary.json` with direct commands for future non-AI checks. HTML
436
+ reports also surface replay/update/inspect commands so failed runs can be
437
+ reproduced directly from the report page.
438
+ `agent replay` is the single-case cassette regression path: it accepts either
439
+ `.agent-run.json` or `.cassette.json`, serves a local replay page, and reproduces
440
+ the previously captured terminal DOM without launching Codex, Claude, Droid, or
441
+ any other live agent process. `agent replay-all` recursively scans a directory
442
+ for `.agent-run.json` and `.cassette.json` files, then writes
443
+ `agent-replay.summary.json` and an HTML suite report so committed cassettes can
444
+ be run like a snapshot regression suite.
445
+ `--update-snapshots` works on `agent replay-all`, so intentional DOM/terminal
446
+ baseline changes can be updated from committed cassettes without a live agent.
447
+ Cassette files embed the normalized flow spec plus frame hashes, so they remain
448
+ self-contained replay artifacts when copied away from the original run
449
+ directory. Replay runs also copy the source cassette into the replay artifact
450
+ directory and write run records that point at that local copy, so the replay
451
+ directory can be moved as a durable reproduction bundle. Run/check/promote and
452
+ replay-all outputs also include `ptywright-agent.manifest.json`, which indexes
453
+ produced files with artifact-root-relative paths plus `bytes` and `sha256`,
454
+ stores reusable `commands.*.argv`, and can be passed to `agent commands`,
455
+ `agent inspect`, `agent exec`, or `agent validate`. `agent inspect
456
+ <artifact|dir>` is the self-describing bundle check: when pointed at an artifact
457
+ directory it prefers `ptywright-agent.manifest.json`, validates indexed file
458
+ hashes, summarizes manifest file kinds/validation stages, and prints the
459
+ relocated reusable commands. Because file entries are relative to the manifest
460
+ directory and manifest commands are relocated when read, copying the whole
461
+ artifact directory preserves both manifest validation and direct
462
+ `agent inspect <copied>` /
463
+ `agent commands <copied>` /
464
+ `agent exec <copied> --command rerun` workflows.
465
+ When inspecting a moved summary file that is the manifest primary artifact,
466
+ `agent inspect` also prints `commandsManifest=<path>` and includes
467
+ `commands.manifestPath` in JSON output, making it explicit which manifest bundle
468
+ will validate and relocate the stored commands before execution.
469
+ The same directory entrypoint works for copied live-run bundles too, so
470
+ `agent exec <copied-run> --command replay` and `--command updateSnapshots`
471
+ remain usable after the original run directory is deleted.
472
+ Copied replay-suite bundles are rerun from the run records stored under the
473
+ bundle's own `tests/` directory, so they do not need the original cassette input
474
+ directory. Promote bundles can move their artifact root and still rerun from the
475
+ copied manifest, while continuing to target the promoted cassette suite.
476
+ If `agent inspect <dir>` sees agent artifacts but no top-level manifest, it
477
+ still reports recursive validation results and prints a `directoryManifest`
478
+ diagnostic so the directory is not confused with a portable commands/exec
479
+ bundle.
480
+ For `agent commands` and `agent exec`, a directory argument means a manifest
481
+ bundle directory and must contain `ptywright-agent.manifest.json`; use
482
+ `agent validate <dir>` when you want recursive artifact discovery.
483
+ The generated agent flow, cassette, run-record, manifest, promote-summary,
484
+ replay-summary, and check-summary JSON files each carry a `$schema` URL under
485
+ `schemas/` so editors and CI tooling can validate the replay contract directly.
486
+ Run-record and summary schemas also encode the expected stored command prefixes,
487
+ for example `ptywright agent replay`, `ptywright agent replay-all`, and
488
+ `ptywright agent rerun`, so malformed commands can be caught before execution.
489
+ Run records and summaries reject missing or stale `commands.*.argv` metadata,
490
+ because those argv arrays are the non-AI replay/update contract for the artifact.
491
+ Promote, replay, and check summaries include `commands.*.argv` arrays for direct
492
+ non-AI reruns and snapshot updates. Each summary also includes
493
+ `commands.rerun.argv`, so downstream automation can re-execute the exact summary
494
+ artifact without reconstructing CLI arguments. `agent commands <artifact>
495
+ --command <name>` prints one shell-safe command line for scripts that want to
496
+ execute a specific replay/update/rerun path directly; with `--json`, the same
497
+ command includes `cwd`, `command.argv`, and `shell` so automation can choose
498
+ structured spawn or shell execution. When a moved summary/run-record is backed
499
+ by a sibling manifest bundle, `agent commands` also reports the manifest path in
500
+ plain output and JSON so automation can see which bundle is responsible for
501
+ relocation and integrity checks; manifest-backed command discovery validates
502
+ stored command targets and indexed file hashes before printing commands.
503
+ `agent exec <artifact> --command <name>`
504
+ executes a stored agent command through ptywright's own CLI dispatcher, so it
505
+ does not depend on shell parsing or a global `ptywright` binary. This includes
506
+ stored `updateSnapshots` commands, which provide the non-AI equivalent of a
507
+ snapshot update run from an existing summary artifact. `agent validate
508
+ <artifact>` also checks that every stored argv starts with a supported
509
+ `ptywright agent <subcommand>` shape before accepting the artifact. If validation
510
+ fails on `commands.*.argv`, regenerate the run/summary with `agent run`,
511
+ `agent replay-all`, `agent promote`, or `agent check`; do not hand-edit shell
512
+ strings as a recovery path, because the argv arrays are the replay contract.
513
+ `agent rerun <summary>` reads `agent-promote.summary.json`,
514
+ `agent-check.summary.json`, or
515
+ `agent-replay.summary.json` and replays the stored cassette directory/artifact
516
+ root without launching a live agent. `agent commands <artifact>` reads
517
+ flow/cassette/run-record/summary artifacts and prints the reusable argv commands
518
+ without executing them. `agent validate <path>` accepts a single artifact or a
519
+ directory and returns a non-zero exit code when any known agent replay artifact
520
+ is malformed. `agent check [dir]` validates committed cassettes under
521
+ `tests/agent-cassettes` by default, replays them into `.tmp/agent-check`, writes
522
+ `agent-check.summary.json`, then validates the generated suite output. Add
523
+ `--json` for a CI-friendly summary with input/replay/output counts and failure
524
+ details.
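
The `commands.*.argv` contract above lends itself to a prefix check before anything is executed. The sketch below is an illustration outside the diff, not the package's validator: `selectArgv`, the `StoredCommands` shape, and the allowed subcommand set (limited to the three prefixes the README names as examples) are assumptions.

```typescript
// Subcommands the README lists as example stored prefixes; the real
// validator may accept more (this set is an assumption).
const ALLOWED_SUBCOMMANDS = new Set(["replay", "replay-all", "rerun"]);

// Hypothetical view of the `commands` object inside a summary or run record.
interface StoredCommand {
  argv: string[];
}
type StoredCommands = Record<string, StoredCommand | undefined>;

// Returns the argv for a named command after checking its prefix shape.
function selectArgv(commands: StoredCommands, name: string): string[] {
  const stored = commands[name];
  if (!stored || !Array.isArray(stored.argv)) {
    throw new Error(`no stored command named "${name}"`);
  }
  const [bin, group, sub] = stored.argv;
  const wellFormed =
    bin === "ptywright" &&
    group === "agent" &&
    sub !== undefined &&
    ALLOWED_SUBCOMMANDS.has(sub);
  if (!wellFormed) {
    throw new Error(`stored command "${name}" has an unsupported argv prefix`);
  }
  // Return a copy so callers cannot mutate the parsed artifact.
  return [...stored.argv];
}
```

A validated argv can be handed to `spawn(argv[0], argv.slice(1))` from `node:child_process`, which bypasses shell parsing. Unlike `agent exec`, which the README says dispatches through ptywright's own CLI, that approach does need a `ptywright` binary on `PATH`.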
+
  ## Development
 
  ```bash
@@ -222,7 +532,11 @@ bun install
  bun run bin/ptywright mcp
 
  # Run tests
- bun test
+ bun run test
+ bun run agent:check
+ bun run check
+
+ # CI installs Chromium, runs bun run check, and uploads .tmp/agent-check.
 
  # Lint & Format
  bun run lint
@@ -231,6 +545,9 @@ bun run format:check
  # Run scripts
  bun run bin/ptywright run scripts/m5_mask_demo.json
  bun run bin/ptywright run-all
+
+ # Run browser agent regression
+ bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
  ```
 
  ## Environment Variables
package/dist/agent.mjs ADDED
@@ -0,0 +1,2 @@
+ import { a as runAgentSpecPath, i as runAgentSpec, n as printAgentLaunchPlan, r as replayAgentRecordPath, t as defaultSpecNameForPath } from "./runner-zi0nItvB.mjs";
+ export { defaultSpecNameForPath, printAgentLaunchPlan, replayAgentRecordPath, runAgentSpec, runAgentSpecPath };
package/dist/bin/ptywright.mjs ADDED
@@ -0,0 +1,6 @@
+ #!/usr/bin/env bun
+ import { t as main } from "../cli-CfvlbRoZ.mjs";
+ //#region src/bin/ptywright.ts
+ await main();
+ //#endregion
+ export {};