@humanjs/playwright 0.3.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -13,8 +13,7 @@ pnpm add @humanjs/playwright playwright
13
13
  ## Quick start
14
14
 
15
15
  ```ts
16
- import { chromium } from 'playwright';
17
- import { createHuman } from '@humanjs/playwright';
16
+ import { chromium, createHuman } from '@humanjs/playwright';
18
17
 
19
18
  const browser = await chromium.launch();
20
19
  const page = await browser.newPage();
@@ -46,6 +45,114 @@ await human.type('input[name="email"]', 'gonzalo@example.com');
46
45
 
47
46
  Pass a `seed` and every random decision (path curvature, typo placement, keystroke jitter) becomes reproducible. Same seed + same personality + same value = same keystrokes.
48
47
 
48
+ ### Primitives
49
+
50
+ The full `Human` surface, at a glance. Each one fires real DOM events through Playwright; the humanization wraps the timing and the path, not the dispatch.
51
+
52
+ | Primitive | Purpose |
53
+ |---|---|
54
+ | `goto(url)` | Navigate the page. |
55
+ | `click(target)` | Bezier path → pre-click hover dwell → click. Occasionally near-misses (cursor wobble outside the target, then corrects) per `personality.mouse.misclickProbability`. |
56
+ | `rightClick(target)` | Same as `click` but with `button: 'right'`. Fires `contextmenu`. Same near-miss behavior. |
57
+ | `drag(from, to)` | Two-phase Bezier (to start → mouse down → curve to end → mouse up). Both endpoints accept `Locator \| string \| Point` — `Point` is essential for canvas / SVG / slider drags. Both endpoints independently near-miss per `misclickProbability` — a drag may wobble on the grab, the drop, both, or neither. |
58
+ | `hover(target)` | Walk to the element and settle. No click. Hover-state UI (tooltips, dropdowns) fires. |
59
+ | `move(target)` | Walk to a `Locator \| string \| Point`. Pure positional motion. No dwell, no element interaction — use this when you want the cursor parked somewhere with no implied click. |
60
+ | `type(target, value)` | Clicks the field for focus, then per-key rhythm with optional typos + Backspace recovery. |
61
+ | `paste(target, value)` | Clicks the field for focus, then `insertText` — instant insertion, no per-character timing. Cmd-V semantic. |
62
+ | `press(key)` | Single key (`'Tab'`) or chord (`'Mod+S'`). See [Keyboard](#keyboard) below. |
63
+ | `read(target)` | Dwell as a reader would. Cursor scans across the text in humanized mode. See [Reading](#reading). |
64
+ | `scroll(target?)` | Multi-segment wheel motion, bell-curve velocity, optional mid-scroll pauses. See [Scrolling](#scrolling). |
65
+ | `sleep(ms)` | Re-exported from `@humanjs/core` for convenience. |
66
+ | `record(fn)` | Wrap a block and export as mp4 / gif / JSON. See [Recording](#recording). |
67
+
68
+ Targets accept a CSS selector string or a Playwright `Locator`. `move` and `drag` additionally accept raw `Point` coordinates. Auto-scroll fires for any element-bound primitive when the target is outside the viewport — humanized scroll in normal speed modes, `scrollIntoViewIfNeeded` in `'instant'`.
69
+
70
+ Near-miss (cursor wobble before committing — see `personality.mouse.misclickProbability`) applies to the primitives that commit a button event at the resolved coordinates: `click`, `rightClick`, both `drag` endpoints, and the implicit focus-acquiring click inside `type` / `paste` (the keystrokes themselves are unaffected). `hover`, `move`, `press`, `read`, and `scroll` never misclick, by design — a wobble would trigger handlers on the wrong element for `hover`, and would contradict the explicit-coordinate contract for `move`. The misclick is also skipped when the cursor is already on the target (no approach means no overshoot).
71
+
72
+ ### Keyboard
73
+
74
+ ```ts
75
+ await human.press('Tab'); // single key
76
+ await human.press('Mod+S'); // cross-platform save (Meta on Mac, Control elsewhere)
77
+ await human.press('Cmd+Shift+P'); // literal Meta+Shift+P on every OS
78
+ await human.press('Control+C'); // literal Ctrl+C
79
+ await human.press('Shift+ArrowDown'); // extend selection down
80
+ ```
81
+
82
+ `press` accepts a single key or a keyboard chord. IDE autocomplete enumerates every `Modifier+Key` combination — type `'Shift+'` and you get `Shift+A`, `Shift+B`, …, `Shift+Tab`, etc. as completions.
83
+
84
+ **Modifier rules:**
85
+
86
+ | Token | Resolves to | Notes |
87
+ |---|---|---|
88
+ | `Mod` / `CmdOrCtrl` / `CommandOrControl` | `Meta` on macOS, `Control` elsewhere | The right token for cross-platform app shortcuts. All three are aliases; `Mod` is shortest. |
89
+ | `Cmd` / `Command` / `Meta` / `Win` / `Super` | `Meta` keycode | Literal — does **not** auto-translate to `Control`. Same physical key on every OS. |
90
+ | `Ctrl` / `Control` | `Control` keycode | Literal — stays `Control` everywhere, so Mac-specific things like terminal `Ctrl+C` still work. |
91
+ | `Alt` / `Option` / `Opt` | `Alt` keycode | Literal. |
92
+ | `Shift` | `Shift` keycode | Literal. |
93
+
94
+ Case-insensitive at runtime. Modifier typos (`'Mosd+S'`) are caught at compile time — the modifier union is closed.
95
+
96
+ **Escape hatch for uncommon keys.** Uncommon keys (`'BracketLeft'`, `'NumpadAdd'`, locale-specific keys) and 3+ modifier chords aren't in the typed `KeyOrChord` union. Cast at the call site — the runtime parser handles them:
97
+
98
+ ```ts
99
+ import type { KeyOrChord } from '@humanjs/playwright';
100
+
101
+ await human.press('Mod+BracketLeft' as KeyOrChord);
102
+ await human.press('Ctrl+Shift+Alt+K' as KeyOrChord);
103
+ ```
104
+
105
+ Why a cast? Including a `(string & {})` escape hatch in the type collapses TypeScript's literal-template IntelliSense, so `'Shift+...'` completions disappear. Autocomplete wins for the 95% case; the cast handles the 5%.
106
+
107
+ **Press does NOT move the cursor** — keyboard input dispatches against focus, not cursor position. Compose with `click` / `hover` / `move` when you need both.
108
+
109
+ ### Typing
110
+
111
+ ```ts
112
+ await human.type('input[name="email"]', 'gonzalo@example.com');
113
+ ```
114
+
115
+ `type` simulates a real keyboard. Per-key delays scale with the personality's typing speed (with jitter); the `distracted` personality occasionally injects QWERTY typos and recovers them with `Backspace`. Single ASCII characters route through Playwright's `keyboard.press` so per-key handlers (autocomplete, validation) fire; non-ASCII characters fall back to `keyboard.insertText` since `press` is keyboard-layout-aware and can't reliably synthesize `é` or `🎉` on every layout.
116
+
117
+ Like every other element-bound primitive, `type` clicks the field first to focus it — a real user moves the cursor to the input and clicks; they don't teleport-focus a field. The implicit click is a sub-step of the `'type'` action, not its own timeline event.
118
+
119
+ **Privacy.** The typed value is never echoed to plugin params. The `'type'` action surfaces only `{ target, length }` — by design, since this argument may carry passwords, tokens, or other secrets.
120
+
121
+ In `'instant'` speed mode, the humanized loop is bypassed for Playwright's `locator.pressSequentially(value, { delay: 0 })` — per-key events still fire, just without the timing.
122
+
123
+ ### Pasting
124
+
125
+ ```ts
126
+ await human.paste('textarea', longCodeBlock);
127
+ ```
128
+
129
+ The Cmd-V semantic. `paste` clicks the field for focus (same mouse-led pattern as `type`), then dispatches the value via `page.keyboard.insertText` — instant, no per-character timing, the value lands in the field in a single beat.
130
+
131
+ When to use which:
132
+
133
+ - **`type`** — short strings where the per-key rhythm is the showcase (form fills, search queries, demo content). Slow on long input by design.
134
+ - **`paste`** — long content where humanized typing would be slow and uninformative (code blocks, multi-paragraph text, anything you'd realistically Cmd-V into the field).
135
+
136
+ `paste` does NOT fire the page's `paste` event. If you need that, drive it yourself: write to the clipboard, focus the field, then `human.press('Mod+V')`.
137
+
138
+ **Privacy.** Same posture as `type` — `{ target, length }` only.
139
+
140
+ ### Dragging
141
+
142
+ ```ts
143
+ await human.drag('#card-1', '#slot-3'); // selector → selector
144
+ await human.drag('#slider-thumb', { x: 800, y: 450 }); // selector → Point
145
+ await human.drag('#card', locator); // any combination
146
+ ```
147
+
148
+ Two-phase Bezier motion: walk to `from`, press the left button, curve to `to` with the button held, release. Each endpoint accepts a `Locator`, a CSS selector, or a raw `Point` coordinate.
149
+
150
+ The `Point` form is essential for canvas / SVG drags where the destination isn't a DOM element — sliders, signature pads, freehand drawing tools. The selector-to-Point shape is the canonical "drag this thumb to that position" pattern.
151
+
152
+ Both Bezier paths (start-to-`from` and `from`-to-`to`) are humanized independently with their own curvature and jitter, so drags don't trace robotic straight lines mid-flight.
153
+
154
+ Auto-scroll fires on both endpoints when needed — if the destination is below the fold, the cursor scrolls to bring it into view before releasing, the way a real user would scroll-to-grab then scroll-to-drop. Raw `Point` endpoints opt out of auto-scroll (explicit coordinates are the caller's responsibility).
155
+
49
156
  ### Reading
50
157
 
51
158
  ```ts
@@ -75,15 +182,16 @@ await human.read('ul.changelog', { kind: 'scan' }); // explicit skim
75
182
 
76
183
  Explicit `kind` always wins over auto-detection.
77
184
 
78
- **Visible eye-scan motion** during the dwell:
185
+ **Eye-scan cursor motion** runs during the dwell by default:
79
186
 
80
187
  ```ts
81
- await human.read('article', { withMotion: true });
188
+ await human.read('article'); // motion: on
189
+ await human.read('article', { withMotion: false }); // motion: off
82
190
  ```
83
191
 
84
- The cursor walks a humanized L→R sweep through every line of rendered text and emits a small return-saccade between lines — same `mousemove` events a real reader would dispatch (so reading-time tooltip / hover handlers fire). Off by default.
192
+ The cursor walks a humanized L→R sweep through every line of rendered text and emits a small return-saccade between lines — same `mousemove` events a real reader would dispatch (so reading-time tooltip / hover handlers fire). Pass `{ withMotion: false }` when you only care about the temporal pattern (typical AI-agent use case).
85
193
 
86
- For demos and screen recordings, pair `withMotion` with `installMouseHelper(page)` to render a visible cursor that follows the synthetic motion:
194
+ For demos and screen recordings, pair it with `installMouseHelper(page)` to render a visible cursor that follows the synthetic motion:
87
195
 
88
196
  ```ts
89
197
  import { createHuman, installMouseHelper } from '@humanjs/playwright';
@@ -93,7 +201,7 @@ await page.goto('https://example.com/article');
93
201
  await installMouseHelper(page);
94
202
 
95
203
  const human = await createHuman(page, { personality: 'careful' });
96
- await human.read('article', { withMotion: true });
204
+ await human.read('article');
97
205
  ```
98
206
 
99
207
  **Returns** a `ReadResult`:
@@ -171,6 +279,91 @@ In `speed: 'instant'`, the page jumps directly via `window.scrollTo` — no whee
171
279
 
172
280
  See [humanjs.dev](https://humanjs.dev) for the full feature set and personality reference.
173
281
 
282
+ ### Recording
283
+
284
+ ```ts
285
+ import { chromium, createHuman, installMouseHelper } from '@humanjs/playwright';
286
+
287
+ const browser = await chromium.launch({ headless: false });
288
+ const context = await browser.newContext({ viewport: { width: 1920, height: 1080 } });
289
+ const page = await context.newPage();
290
+
291
+ // Visible cursor overlay so the recorded video shows mouse motion.
292
+ await installMouseHelper(context);
293
+
294
+ const human = await createHuman(page);
295
+
296
+ const rec = await human.record(async () => {
297
+ await human.click('#login');
298
+ await human.type('#email', 'demo@humanjs.dev');
299
+ });
300
+
301
+ await rec.toVideo('demo.mp4');
302
+ await rec.toTimeline('demo.json');
303
+ await browser.close();
304
+ ```
305
+
306
+ `human.record(cb)` polls `page.screenshot()` at the target FPS, writes each frame to a temp directory, then assembles them via ffmpeg when you call an exporter:
307
+
308
+ - `rec.toVideo(path)` — `.mp4` (H.264 / yuv420p) or `.webm` (VP9)
309
+ - `rec.toGif(path, { fps?, width? })` — palette-optimized animated GIF (`palettegen` + `paletteuse`, Bayer dither). Defaults to 15 fps, source viewport size.
310
+
311
+ Both exporters are **repeatable and interleavable** — they read the captured frames, they don't consume them. Want an mp4 for the landing page *and* a GIF for the README from the same recording? Just call both:
312
+
313
+ ```ts
314
+ const rec = await human.record(fn);
315
+ await rec.toVideo('demo.mp4');
316
+ await rec.toGif('demo.gif', { width: 720 });
317
+ // No explicit cleanup needed for one-shot scripts — see below.
318
+ ```
319
+
320
+ Captured frames live in a temp directory under `os.tmpdir()`. Cleanup happens automatically at process exit (a single `process.on('exit')` handler sweeps any un-disposed frame dirs), so casual scripts don't have to think about it. For long-running services, batch jobs, or anywhere you want predictable disk usage, release them proactively:
321
+
322
+ ```ts
323
+ await rec.dispose(); // explicit, idempotent
324
+ // or with TS ≥ 5.2 / Node ≥ 20.4:
325
+ await using rec = await human.record(fn); // auto-disposes at scope exit
326
+ ```
327
+
328
+ The same `Recording` exposes a **structured action timeline** of everything that happened during the callback:
329
+
330
+ ```ts
331
+ await rec.toTimeline('session.json'); // → JSON on disk
332
+ const timeline = rec.timeline; // → in-memory object
333
+ ```
334
+
335
+ The shape (`Timeline` with `personality`, `seed`, `speed`, `durationMs`, and an `events` array of `{ type, params, tMs, durationMs }`) is intended for observability pipelines, replay infrastructure, analytics, and debugger UIs. `toTimeline()` doesn't touch the browser context — call it before or after `toVideo()`, multiple times, in any order.
336
+
337
+ **Quality presets** trade off file size, encoding time, and visual fidelity. Defaults to `'high'`:
338
+
339
+ ```ts
340
+ await rec.toVideo('demo.mp4', { quality: 'high' });
341
+ // 'fast' — JPEG q=85, CRF 23, preset fast (iteration)
342
+ // 'standard' — JPEG q=90, CRF 20, preset fast (balanced)
343
+ // 'high' — JPEG q=95, CRF 18, preset slow, animation (DEFAULT)
344
+ // 'lossless' — PNG capture, CRF 12, preset veryslow (archival)
345
+ ```
346
+
347
+ Individual ffmpeg knobs (`crf`, `preset`, `tune`) can override the preset for fine-grained control.
348
+
349
+ **Timeline-only mode** — skip the capture overhead entirely when you only need the action timeline:
350
+
351
+ ```ts
352
+ const rec = await human.record({ video: false }, async () => {
353
+ await human.click('#login');
354
+ });
355
+ await rec.toTimeline('session.json'); // works
356
+ // rec.toVideo('demo.mp4') // throws with a clear message
357
+ ```
358
+
359
+ **Lifecycle notes**:
360
+
361
+ - Each session can produce **one** recording. `human.record()` throws if called twice on the same session — open a new context (and a new human) to record a separate clip.
362
+ - `Recording.toVideo()` / `Recording.toGif()` are repeatable and interleavable. Frames live until `rec.dispose()` (or `await using` goes out of scope, or the process exits — a sweep-on-exit handler covers forgotten disposes).
363
+ - For a one-call API that owns the entire lifecycle (launch → record → close), use [`@humanjs/recorder`](../recorder)'s `record(options, fn)` instead.
364
+
365
+ Every recording is a regular plugin action — `beforeAction` and `afterAction` observe `{ type: 'record' }` exactly like `'click'` or `'scroll'`.
366
+
174
367
  ## License
175
368
 
176
369
  MIT