ptywright 0.1.1 → 0.2.0

Files changed (67)
  1. package/README.md +287 -1
  2. package/dist/agent.mjs +2 -0
  3. package/dist/bin/ptywright.mjs +6 -0
  4. package/dist/cli-DIUx2w6X.mjs +3587 -0
  5. package/dist/cli.mjs +2 -0
  6. package/{src/index.ts → dist/index.mjs} +7 -9
  7. package/dist/mcp.mjs +2 -0
  8. package/dist/pty-cassette.mjs +24 -0
  9. package/dist/pty_like-Cpkh_O9B.mjs +404 -0
  10. package/dist/runner-DzZlFrt1.mjs +1897 -0
  11. package/dist/runner-zApMYWZx.mjs +3257 -0
  12. package/dist/script.mjs +2 -0
  13. package/dist/server-VHuEWWj_.mjs +3068 -0
  14. package/dist/session.mjs +2 -0
  15. package/dist/terminal_session-DopC7Xg6.mjs +893 -0
  16. package/package.json +28 -21
  17. package/schemas/ptywright-agent-cassette.schema.json +57 -0
  18. package/schemas/ptywright-agent-check.schema.json +122 -0
  19. package/schemas/ptywright-agent-manifest.schema.json +107 -0
  20. package/schemas/ptywright-agent-promote.schema.json +146 -0
  21. package/schemas/ptywright-agent-replay-summary.schema.json +140 -0
  22. package/schemas/ptywright-agent-run.schema.json +126 -0
  23. package/schemas/ptywright-agent.schema.json +182 -0
  24. package/schemas/ptywright-pty-cassette.schema.json +86 -0
  25. package/schemas/ptywright-script-manifest.schema.json +75 -0
  26. package/schemas/ptywright-script-run-summary.schema.json +114 -0
  27. package/schemas/ptywright-script.schema.json +55 -3
  28. package/bin/ptywright +0 -4
  29. package/src/cli.ts +0 -414
  30. package/src/generator/doc_parser.ts +0 -341
  31. package/src/generator/generate.ts +0 -161
  32. package/src/generator/index.ts +0 -10
  33. package/src/generator/script_generator.ts +0 -209
  34. package/src/generator/step_extractor.ts +0 -397
  35. package/src/mcp/http_server.ts +0 -174
  36. package/src/mcp/script_recording.ts +0 -238
  37. package/src/mcp/server.ts +0 -1348
  38. package/src/pty/bun_pty_adapter.ts +0 -34
  39. package/src/pty/bun_terminal_adapter.ts +0 -149
  40. package/src/pty/pty_adapter.ts +0 -31
  41. package/src/script/dsl.ts +0 -188
  42. package/src/script/module.ts +0 -43
  43. package/src/script/path.ts +0 -151
  44. package/src/script/run.ts +0 -108
  45. package/src/script/run_all.ts +0 -229
  46. package/src/script/runner.ts +0 -983
  47. package/src/script/schema.ts +0 -237
  48. package/src/script/steps/assert_snapshot_equals.ts +0 -21
  49. package/src/script/steps/index.ts +0 -2
  50. package/src/script/suite_report.ts +0 -626
  51. package/src/session/session_manager.ts +0 -145
  52. package/src/session/terminal_session.ts +0 -473
  53. package/src/terminal/ansi.ts +0 -142
  54. package/src/terminal/keys.ts +0 -180
  55. package/src/terminal/mask.ts +0 -70
  56. package/src/terminal/mouse.ts +0 -75
  57. package/src/terminal/snapshot.ts +0 -196
  58. package/src/terminal/style.ts +0 -121
  59. package/src/terminal/view.ts +0 -49
  60. package/src/trace/asciicast.ts +0 -20
  61. package/src/trace/asciinema_player_assets.ts +0 -44
  62. package/src/trace/cast_to_txt.ts +0 -116
  63. package/src/trace/recorder.ts +0 -110
  64. package/src/trace/report.ts +0 -2092
  65. package/src/types.ts +0 -86
  66. package/src/util/hash.ts +0 -8
  67. package/src/util/sleep.ts +0 -5
package/README.md CHANGED
@@ -88,6 +88,46 @@ bunx ptywright@latest run-all --dir scripts
  bunx ptywright@latest --help
  ```

+ ### Raw PTY Cassette
+
+ `ptywright pty` records the raw PTY stream once and replays it later without
+ rerunning the original command. This is intended for browser terminal renderers
+ that need deterministic regression tests for prompts or AI sessions that are
+ hard to reproduce live.
+
+ ```bash
+ # Record output/input/resize/exit as base64 PTY events
+ bunx ptywright@latest pty record --out tests/cassettes/codex.pty.json -- codex
+
+ # Replay the same raw output stream instantly
+ bunx ptywright@latest pty replay tests/cassettes/codex.pty.json
+
+ # Validate or inspect the portable artifact
+ bunx ptywright@latest pty validate tests/cassettes/codex.pty.json
+ bunx ptywright@latest pty inspect tests/cassettes/codex.pty.json
+ ```
+
+ External projects do not need to depend on aitty. Use the structural
+ `wrapPtyLike` API for `node-pty`/`bun-pty` style objects:
+
+ ```ts
+ import { wrapPtyLike } from "ptywright/pty-cassette";
+
+ const recorded = wrapPtyLike(pty, {
+   path: "tests/cassettes/session.pty.json",
+   terminal: { cols: 120, rows: 40, term: "xterm-256color" },
+   command: { file: "codex", args: [] },
+ });
+
+ recorded.write("hello\r");
+ // output and exit are captured from pty.onData/onExit
+ ```
+
+ For Bun Terminal callback-style integration, create a recorder and call
+ `recordOutput` from the terminal `data` hook, or use
+ `wrapBunTerminalOptions`. The cassette can then be replayed into any renderer
+ and compared by that renderer's DOM/text snapshot tests.
+
  ## Tools

  All tools are enabled by default (`--caps all`). Use `--caps core` or combine as needed:
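Reviewer sketch for the hunk above. The README calls `wrapPtyLike` structural, meaning it accepts any object that has `node-pty`/`bun-pty` style members. The TypeScript below is a minimal illustration of that idea under stated assumptions, not ptywright's implementation: the `PtyLike` interface, the `CassetteEvent` shape, and the `recordPtyLike` and `toBase64` helpers are hypothetical names. Only the capture points (`onData`, `onExit`, `write`) and the base64 encoding of events are taken from the README text.

```ts
// Hypothetical event shape; the real cassette schema is not shown in this diff.
type CassetteEvent =
  | { kind: "output" | "input"; dataBase64: string }
  | { kind: "exit"; exitCode: number };

// Structural interface: any object with these three members qualifies.
interface PtyLike {
  onData(listener: (data: string) => void): void;
  onExit(listener: (event: { exitCode: number }) => void): void;
  write(data: string): void;
}

// Encode text as UTF-8 bytes, then as base64, using only web-standard globals.
function toBase64(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Subscribe to output and exit, and return a write() that logs input
// before forwarding it to the underlying PTY.
function recordPtyLike(pty: PtyLike): { events: CassetteEvent[]; write(data: string): void } {
  const events: CassetteEvent[] = [];
  pty.onData((data) => {
    events.push({ kind: "output", dataBase64: toBase64(data) });
  });
  pty.onExit((event) => {
    events.push({ kind: "exit", exitCode: event.exitCode });
  });
  return {
    events,
    write(data: string): void {
      events.push({ kind: "input", dataBase64: toBase64(data) });
      pty.write(data);
    },
  };
}
```

Because the wrapper depends only on the three members of `PtyLike`, a test double with the same members can stand in for a real PTY, which is presumably why the README says external projects need no dependency on aitty.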
@@ -162,6 +202,25 @@ Artifacts go to `.tmp/runs/<name>/` by default (override with `--artifacts-dir`)
  Batch runs generate an overview report:
  - Default: `.tmp/run-all/index.html` + `.tmp/run-all/run.summary.json`
  - With `--artifacts-root <dir>`: `<dir>/index.html` + `<dir>/run.summary.json`
+ - `run.summary.json` stores `commands.runAll.argv` and
+ `commands.updateGoldens.argv` so automation can replay the suite or update
+ goldens without reconstructing CLI arguments.
+
+ You can read or execute those commands directly from the generated artifact:
+
+ ```bash
+ bunx ptywright@latest script commands .tmp/run-all --json
+ bunx ptywright@latest script commands .tmp/run-all/run.summary.json --command runAll
+ bunx ptywright@latest script inspect .tmp/run-all
+ bunx ptywright@latest script validate .tmp/run-all
+ bunx ptywright@latest script exec .tmp/run-all --command updateGoldens
+ ```
+
+ Suite directories also include `ptywright-script.manifest.json`, which indexes
+ the generated summary, reports, casts, data, and failure artifacts with
+ `bytes`/`sha256`. `script validate`, `script inspect`, `script commands`, and
+ `script exec` verify that manifest before using a directory bundle, so copied
+ script run artifacts can be replayed or updated without trusting stale files.

  On failure, additional files are saved:
  - `failure.error.txt` (error stack)
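Reviewer sketch for the hunk above. The README says the manifest indexes files with `bytes`/`sha256` and that the `script` subcommands verify it before trusting a copied bundle. A check of that kind could look roughly like the TypeScript below. This is an assumption-laden illustration: the `ManifestEntry` shape (in particular the `path` field) and the `checkEntry` helper are hypothetical, and the actual manifest layout is defined by `ptywright-script-manifest.schema.json`, which this excerpt does not show.

```ts
import { createHash } from "node:crypto";

// Hypothetical entry shape: `bytes` and `sha256` are named in the README,
// `path` is assumed for illustration.
interface ManifestEntry {
  path: string;
  bytes: number;
  sha256: string;
}

// Compare a file's content against its manifest entry.
// Returns human-readable problems; an empty list means the entry matches.
function checkEntry(entry: ManifestEntry, content: Uint8Array): string[] {
  const problems: string[] = [];
  if (content.byteLength !== entry.bytes) {
    problems.push(entry.path + ": expected " + entry.bytes + " bytes, found " + content.byteLength);
  }
  const actual = createHash("sha256").update(content).digest("hex");
  if (actual !== entry.sha256.toLowerCase()) {
    problems.push(entry.path + ": sha256 mismatch");
  }
  return problems;
}
```

Checking the size first gives a cheap, readable diagnostic for truncated copies; the hash then catches edits that leave the length unchanged.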
@@ -178,6 +237,40 @@ Built-in steps (no `--steps` needed):
  - `waitForExit`: Wait for process exit
  - `sendMouse`: Send SGR mouse events

+ ### Framework Backends
+
+ `launch.backend` defaults to `pty`. For faster framework-level checks, use
+ `frames`, `ratatui`, or `ink` to run the same script steps against deterministic
+ frames without starting a PTY:
+
+ ```json
+ {
+   "$schema": "../schemas/ptywright-script.schema.json",
+   "name": "ratatui_snapshot",
+   "launch": {
+     "backend": "ratatui",
+     "cols": 60,
+     "rows": 12,
+     "frames": [
+       "Screen: Dashboard\nMode: HIGH",
+       "Screen: Permissions\nMode: LOW"
+     ]
+   },
+   "steps": [
+     { "type": "waitForText", "text": "Dashboard" },
+     { "type": "pressKey", "key": "Enter" },
+     { "type": "snapshot", "kind": "text", "saveAs": "final" },
+     { "type": "expect", "from": "final", "contains": ["Mode: LOW"] }
+   ]
+ }
+ ```
+
+ `ratatui` is intended for text emitted by `TestBackend`/insta-style snapshots.
+ `ink` can load a module via `frameModule` that exports `frames`, `frame`,
+ `snapshot`, or `lastFrame`. Input steps such as `pressKey` and `sendText`
+ advance to the next frame by default, so the assertion path stays identical to
+ the PTY end-to-end script.
+
  For `type:"custom"` steps, inject handlers with `--steps <module.ts>`:

  ```bash
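Reviewer sketch for the hunk above. The README states that input steps advance to the next frame by default, which is what lets the example script pass without a PTY. The TypeScript below models that behaviour for the two frames in the JSON example. It is a hypothetical model, not ptywright's runner: `FrameCursor`, `openFrames`, `currentText`, and `applyInput` are invented names, and staying on the last frame once the frames run out is an assumption the README does not confirm.

```ts
// Hypothetical model of a frame-based backend.
interface FrameCursor {
  frames: string[];
  index: number;
}

function openFrames(frames: string[]): FrameCursor {
  if (frames.length === 0) {
    throw new Error("at least one frame is required");
  }
  return { frames: frames, index: 0 };
}

// Text of the frame the script currently sees.
function currentText(cursor: FrameCursor): string {
  return cursor.frames[cursor.index];
}

// Input steps advance one frame. Clamping at the last frame is an assumption.
function applyInput(cursor: FrameCursor): void {
  cursor.index = Math.min(cursor.index + 1, cursor.frames.length - 1);
}

// Walk the steps from the README example by hand.
const cursor = openFrames([
  "Screen: Dashboard\nMode: HIGH",
  "Screen: Permissions\nMode: LOW",
]);
const sawDashboard = currentText(cursor).indexOf("Dashboard") !== -1; // waitForText
applyInput(cursor); // pressKey "Enter"
const finalSnapshot = currentText(cursor); // snapshot saved as "final"
const expectPassed = finalSnapshot.indexOf("Mode: LOW") !== -1; // expect contains
```

The same four steps would run unchanged against a real PTY; only the source of the screen text differs, which matches the README's point about keeping the assertion path identical.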
@@ -213,6 +306,192 @@ Recording artifacts are best for failure diagnosis or manual review; prefer `sna
  - SVG: `bunx svg-term --in <castPath> --out <outSvg>`
  - GIF: `agg --fps 30 <castPath> <outGif>` (requires [asciinema/agg](https://github.com/asciinema/agg))

+ ## Browser Agent Regression
+
+ The new destructive path is browser-first: ptywright can launch an agent through
+ `@aitty/cli`, drive the browser-hosted wterm DOM with Playwright, and persist a
+ replayable run artifact plus terminal/DOM snapshots.
+
+ ```bash
+ # First run records snapshots, screenshots, replay metadata, and report.
+ bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
+
+ # Later runs compare terminal + DOM snapshots like a test snapshot.
+ bun run bin/ptywright agent run examples/agent_deterministic.json
+
+ # Replay does not need AI; it uses the recorded flow artifact.
+ bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.agent-run.json
+
+ # Cassette files are also directly replayable.
+ bun run bin/ptywright agent replay .tmp/agent/agent_deterministic/agent_deterministic.cassette.json
+
+ # Promote a live run/cassette into the committed non-AI regression suite.
+ bun run bin/ptywright agent promote \
+   .tmp/agent/agent_deterministic/agent_deterministic.cassette.json \
+   --update-snapshots
+
+ # Batch replay committed cassettes/run records as a regression suite.
+ bun run bin/ptywright agent replay-all .tmp/agent --artifacts-root .tmp/agent-replay-all
+
+ # Rerun directly from a generated summary artifact.
+ bun run bin/ptywright agent rerun .tmp/agent-promote/agent_deterministic/agent-promote.summary.json
+ bun run bin/ptywright agent rerun .tmp/agent-check/agent-check.summary.json
+ bun run bin/ptywright agent rerun .tmp/agent-check/agent-replay.summary.json --update-snapshots
+
+ # Read reusable commands from any supported agent artifact.
+ bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --json
+ bun run bin/ptywright agent commands .tmp/agent-check/agent-check.summary.json --command rerun
+ bun run bin/ptywright agent commands .tmp/agent-check --json
+ bun run bin/ptywright agent inspect .tmp/agent-check
+ bun run bin/ptywright agent inspect .tmp/agent-check --json
+ bun run bin/ptywright agent validate .tmp/agent-check
+ bun run bin/ptywright agent exec .tmp/agent-check --command rerun
+ bun run bin/ptywright agent exec .tmp/agent-check --command updateSnapshots
+ bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command rerun
+ bun run bin/ptywright agent exec .tmp/agent-check/agent-check.summary.json --command updateSnapshots
+
+ # Validate flow/cassette/run-record/summary artifacts before committing.
+ bun run bin/ptywright agent validate .tmp/agent-replay-all
+
+ # Run committed cassette replay regression without launching live agents.
+ bun run bin/ptywright agent check
+ bun run bin/ptywright agent check --json
+
+ # Update terminal/DOM baselines from committed cassettes intentionally.
+ bun run bin/ptywright agent replay-all tests/agent-cassettes --update-snapshots
+
+ # Record browser interactions into a replayable flow spec.
+ bun run bin/ptywright agent record examples/agents/codex_browser_smoke.json \
+   --out scripts/agents/codex_recorded.flow.json \
+   --duration-ms 60000 \
+   --headed
+
+ # Generate starter specs for real agents.
+ bun run bin/ptywright agent init codex examples/agents/codex_browser_smoke.json
+ bun run bin/ptywright agent init claude examples/agents/claude_browser_smoke.json
+ bun run bin/ptywright agent init droidx examples/agents/droidx_browser_smoke.json
+ ```
+
+ Artifacts are split intentionally:
+ - `.tmp/agent/<name>/` contains run output, screenshots, `*.flow.json`,
+ `*.agent-run.json`, `*.cassette.json`, `index.html`, and
+ `ptywright-agent.manifest.json`.
+ - `tests/agent-snapshots/<name>/` contains stable terminal/DOM baselines.
+ - `--update-snapshots` is the explicit update path for intentional UI changes.
+
+ `launch.mode=aitty` runs `aitty exec --launch print -- <agent>`. By default
+ ptywright resolves the sibling `../aitty/packages/cli/dist/cli.js`; set
+ `PTYWRIGHT_AITTY_CLI` or `launch.aitty.command` to override it.
+
+ Set `launch.agentFlavor` to `codex`, `claude`, `droid`, or `generic` to opt
+ into built-in mask presets for timestamps, generated ids, model names, token
+ counts, and other non-deterministic terminal text. Explicit
+ `defaults.mask=[...]` rules are appended after the preset, so project-specific
+ noise can be hidden without rewriting the runner.
+
+ `agent record` opens the same browser-hosted terminal and writes the captured
+ keyboard/click steps back to a normal flow JSON. The output can be committed and
+ run later with `agent run`, while `.agent-run.json` remains the per-run replay
+ record generated by the runner. Run records must include
+ `commands.replay.argv` and `commands.updateSnapshots.argv`, so automation can
+ replay or intentionally update the captured flow without parsing shell strings.
+
+ `agent run` is the live path: it launches the configured process and updates or
+ compares terminal/DOM snapshots. `agent promote <run|cassette>` is the
+ intentional solidify step after a good live run: it copies the cassette into
+ `tests/agent-cassettes/<name>/`, rewrites its `snapshotDir`, optionally updates
+ terminal/DOM baselines, replays the promoted cassette, and writes
+ `agent-promote.summary.json` with direct commands for future non-AI checks. HTML
+ reports also surface replay/update/inspect commands so failed runs can be
+ reproduced directly from the report page.
+ `agent replay` is the single-case cassette regression path: it accepts either
+ `.agent-run.json` or `.cassette.json`, serves a local replay page, and reproduces
+ the previously captured terminal DOM without launching Codex, Claude, Droid, or
+ any other live agent process. `agent replay-all` recursively scans a directory
+ for `.agent-run.json` and `.cassette.json` files, then writes
+ `agent-replay.summary.json` and an HTML suite report so committed cassettes can
+ be run like a snapshot regression suite.
+ `--update-snapshots` works on `agent replay-all`, so intentional DOM/terminal
+ baseline changes can be updated from committed cassettes without a live agent.
+ Cassette files embed the normalized flow spec plus frame hashes, so they remain
+ self-contained replay artifacts when copied away from the original run
+ directory. Replay runs also copy the source cassette into the replay artifact
+ directory and write run records that point at that local copy, so the replay
+ directory can be moved as a durable reproduction bundle. Run/check/promote and
+ replay-all outputs also include `ptywright-agent.manifest.json`, which indexes
+ produced files with artifact-root-relative paths plus `bytes` and `sha256`,
+ stores reusable `commands.*.argv`, and can be passed to `agent commands`,
+ `agent inspect`, `agent exec`, or `agent validate`. `agent inspect
+ <artifact|dir>` is the self-describing bundle check: when pointed at an artifact
+ directory it prefers `ptywright-agent.manifest.json`, validates indexed file
+ hashes, summarizes manifest file kinds/validation stages, and prints the
+ relocated reusable commands. Because file entries are relative to the manifest
+ directory and manifest commands are relocated when read, copying the whole
+ artifact directory preserves both manifest validation and direct
+ `agent inspect <copied>` /
+ `agent commands <copied>` /
+ `agent exec <copied> --command rerun` workflows.
+ When inspecting a moved summary file that is the manifest primary artifact,
+ `agent inspect` also prints `commandsManifest=<path>` and includes
+ `commands.manifestPath` in JSON output, making it explicit which manifest bundle
+ will validate and relocate the stored commands before execution.
+ The same directory entrypoint works for copied live-run bundles too, so
+ `agent exec <copied-run> --command replay` and `--command updateSnapshots`
+ remain usable after the original run directory is deleted.
+ Copied replay-suite bundles are rerun from the run records stored under the
+ bundle's own `tests/` directory, so they do not need the original cassette input
+ directory. Promote bundles can move their artifact root and still rerun from the
+ copied manifest, while continuing to target the promoted cassette suite.
+ If `agent inspect <dir>` sees agent artifacts but no top-level manifest, it
+ still reports recursive validation results and prints a `directoryManifest`
+ diagnostic so the directory is not confused with a portable commands/exec
+ bundle.
+ For `agent commands` and `agent exec`, a directory argument means a manifest
+ bundle directory and must contain `ptywright-agent.manifest.json`; use
+ `agent validate <dir>` when you want recursive artifact discovery.
+ The generated agent flow, cassette, run-record, manifest, promote-summary,
+ replay-summary, and check-summary JSON files each carry a `$schema` URL under
+ `schemas/` so editors and CI tooling can validate the replay contract directly.
+ Run-record and summary schemas also encode the expected stored command prefixes,
+ for example `ptywright agent replay`, `ptywright agent replay-all`, and
+ `ptywright agent rerun`, so malformed commands can be caught before execution.
+ Run records and summaries reject missing or stale `commands.*.argv` metadata,
+ because those argv arrays are the non-AI replay/update contract for the artifact.
+ Promote, replay, and check summaries include `commands.*.argv` arrays for direct
+ non-AI reruns and snapshot updates. Each summary also includes
+ `commands.rerun.argv`, so downstream automation can re-execute the exact summary
+ artifact without reconstructing CLI arguments. `agent commands <artifact>
+ --command <name>` prints one shell-safe command line for scripts that want to
+ execute a specific replay/update/rerun path directly; with `--json`, the same
+ command includes `cwd`, `command.argv`, and `shell` so automation can choose
+ structured spawn or shell execution. When a moved summary/run-record is backed
+ by a sibling manifest bundle, `agent commands` also reports the manifest path in
+ plain output and JSON so automation can see which bundle is responsible for
+ relocation and integrity checks; manifest-backed command discovery validates
+ stored command targets and indexed file hashes before printing commands.
+ `agent exec <artifact> --command <name>`
+ executes a stored agent command through ptywright's own CLI dispatcher, so it
+ does not depend on shell parsing or a global `ptywright` binary. This includes
+ stored `updateSnapshots` commands, which provide the non-AI equivalent of a
+ snapshot update run from an existing summary artifact. `agent validate
+ <artifact>` also checks that every stored argv starts with a supported
+ `ptywright agent <subcommand>` shape before accepting the artifact. If validation
+ fails on `commands.*.argv`, regenerate the run/summary with `agent run`,
+ `agent replay-all`, `agent promote`, or `agent check`; do not hand-edit shell
+ strings as a recovery path, because the argv arrays are the replay contract.
+ `agent rerun <summary>` reads `agent-promote.summary.json`,
+ `agent-check.summary.json`, or
+ `agent-replay.summary.json` and replays the stored cassette directory/artifact
+ root without launching a live agent. `agent commands <artifact>` reads
+ flow/cassette/run-record/summary artifacts and prints the reusable argv commands
+ without executing them. `agent validate <path>` accepts a single artifact or a
+ directory and returns a non-zero exit code when any known agent replay artifact
+ is malformed. `agent check [dir]` validates committed cassettes under
+ `tests/agent-cassettes` by default, replays them into `.tmp/agent-check`, writes
+ `agent-check.summary.json`, then validates the generated suite output. Add
+ `--json` for a CI-friendly summary with input/replay/output counts and failure
+ details.
+
  ## Development

  ```bash
@@ -222,7 +501,11 @@ bun install
  bun run bin/ptywright mcp

  # Run tests
- bun test
+ bun run test
+ bun run agent:check
+ bun run check
+
+ # CI installs Chromium, runs bun run check, and uploads .tmp/agent-check.

  # Lint & Format
  bun run lint
@@ -231,6 +514,9 @@ bun run format:check
  # Run scripts
  bun run bin/ptywright run scripts/m5_mask_demo.json
  bun run bin/ptywright run-all
+
+ # Run browser agent regression
+ bun run bin/ptywright agent run examples/agent_deterministic.json --update-snapshots
  ```

  ## Environment Variables
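Reviewer sketch for the README changes above. The new text repeatedly treats `commands.*.argv` as the replay contract and says `agent validate` checks that every stored argv starts with a supported `ptywright agent <subcommand>` shape. A prefix check of that kind might look like the TypeScript below. It is an illustration only: `hasSupportedPrefix` and `KNOWN_SUBCOMMANDS` are hypothetical names, and the subcommand list contains just the three prefixes the README cites as examples, so the real schemas may accept more.

```ts
// Subcommands the README names as examples of expected stored-command prefixes.
// Illustrative only; the shipped schemas may allow others.
const KNOWN_SUBCOMMANDS: string[] = ["replay", "replay-all", "rerun"];

// True when argv has the shape ["ptywright", "agent", <known subcommand>, ...].
function hasSupportedPrefix(argv: string[], subcommands: string[] = KNOWN_SUBCOMMANDS): boolean {
  return (
    argv.length >= 3 &&
    argv[0] === "ptywright" &&
    argv[1] === "agent" &&
    subcommands.indexOf(argv[2]) !== -1
  );
}
```

Checking an argv array element by element sidesteps quoting and word-splitting questions entirely, which is consistent with the README's advice not to hand-edit shell strings.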
package/dist/agent.mjs ADDED
@@ -0,0 +1,2 @@
+ import { a as runAgentSpecPath, i as runAgentSpec, n as printAittyLaunchPlan, r as replayAgentRecordPath, t as defaultSpecNameForPath } from "./runner-DzZlFrt1.mjs";
+ export { defaultSpecNameForPath, printAittyLaunchPlan, replayAgentRecordPath, runAgentSpec, runAgentSpecPath };
package/dist/bin/ptywright.mjs ADDED
@@ -0,0 +1,6 @@
+ #!/usr/bin/env bun
+ import { t as main } from "../cli-DIUx2w6X.mjs";
+ //#region src/bin/ptywright.ts
+ await main();
+ //#endregion
+ export {};