@humanjs/playwright 0.3.0 → 0.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +200 -7
- package/dist/index.cjs +1022 -183
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +406 -31
- package/dist/index.d.ts +406 -31
- package/dist/index.js +1004 -186
- package/dist/index.js.map +1 -1
- package/package.json +4 -2
package/README.md
CHANGED
|
@@ -13,8 +13,7 @@ pnpm add @humanjs/playwright playwright
|
|
|
13
13
|
## Quick start
|
|
14
14
|
|
|
15
15
|
```ts
|
|
16
|
-
import { chromium } from 'playwright';
|
|
17
|
-
import { createHuman } from '@humanjs/playwright';
|
|
16
|
+
import { chromium, createHuman } from '@humanjs/playwright';
|
|
18
17
|
|
|
19
18
|
const browser = await chromium.launch();
|
|
20
19
|
const page = await browser.newPage();
|
|
@@ -46,6 +45,114 @@ await human.type('input[name="email"]', 'gonzalo@example.com');
|
|
|
46
45
|
|
|
47
46
|
Pass a `seed` and every random decision (path curvature, typo placement, keystroke jitter) becomes reproducible. Same seed + same personality + same value = same keystrokes.
|
|
48
47
|
|
|
48
|
+
### Primitives
|
|
49
|
+
|
|
50
|
+
The full `Human` surface, at a glance. Each one fires real DOM events through Playwright; the humanization wraps the timing and the path, not the dispatch.
|
|
51
|
+
|
|
52
|
+
| Primitive | Purpose |
|
|
53
|
+
|---|---|
|
|
54
|
+
| `goto(url)` | Navigate the page. |
|
|
55
|
+
| `click(target)` | Bezier path → pre-click hover dwell → click. Occasionally near-misses (cursor wobble outside the target, then corrects) per `personality.mouse.misclickProbability`. |
|
|
56
|
+
| `rightClick(target)` | Same as `click` but with `button: 'right'`. Fires `contextmenu`. Same near-miss behavior. |
|
|
57
|
+
| `drag(from, to)` | Two-phase Bezier (to start → mouse down → curve to end → mouse up). Both endpoints accept `Locator \| string \| Point` — `Point` is essential for canvas / SVG / slider drags. Both endpoints independently near-miss per `misclickProbability` — a drag may wobble on the grab, the drop, both, or neither. |
|
|
58
|
+
| `hover(target)` | Walk to the element and settle. No click. Hover-state UI (tooltips, dropdowns) fires. |
|
|
59
|
+
| `move(target)` | Walk to a `Locator \| string \| Point`. Pure positional motion. No dwell, no element interaction — use this when you want the cursor parked somewhere with no implied click. |
|
|
60
|
+
| `type(target, value)` | Clicks the field for focus, then per-key rhythm with optional typos + Backspace recovery. |
|
|
61
|
+
| `paste(target, value)` | Clicks the field for focus, then `insertText` — instant insertion, no per-character timing. Cmd-V semantic. |
|
|
62
|
+
| `press(key)` | Single key (`'Tab'`) or chord (`'Mod+S'`). See [Keyboard](#keyboard) below. |
|
|
63
|
+
| `read(target)` | Dwell as a reader would. Cursor scans across the text in humanized mode. See [Reading](#reading). |
|
|
64
|
+
| `scroll(target?)` | Multi-segment wheel motion, bell-curve velocity, optional mid-scroll pauses. See [Scrolling](#scrolling). |
|
|
65
|
+
| `sleep(ms)` | Re-exported from `@humanjs/core` for convenience. |
|
|
66
|
+
| `record(fn)` | Wrap a block and export as mp4 / gif / JSON. See [Recording](#recording). |
|
|
67
|
+
|
|
68
|
+
Targets accept a CSS selector string or a Playwright `Locator`. `move` and `drag` additionally accept raw `Point` coordinates. Auto-scroll fires for any element-bound primitive when the target is outside the viewport — humanized scroll in normal speed modes, `scrollIntoViewIfNeeded` in `'instant'`.
|
|
69
|
+
|
|
70
|
+
Near-miss (cursor wobble before committing — see `personality.mouse.misclickProbability`) applies to the primitives that commit a button event at the resolved coordinates: `click`, `rightClick`, both `drag` endpoints, and the implicit focus-acquiring click inside `type` / `paste` (the keystrokes themselves are unaffected). `hover`, `move`, `press`, `read`, and `scroll` never misclick, by design — a wobble would trigger handlers on the wrong element for `hover`, and would contradict the explicit-coordinate contract for `move`. The misclick is also skipped when the cursor is already on the target (no approach means no overshoot).
|
|
71
|
+
|
|
72
|
+
### Keyboard
|
|
73
|
+
|
|
74
|
+
```ts
|
|
75
|
+
await human.press('Tab'); // single key
|
|
76
|
+
await human.press('Mod+S'); // cross-platform save (Meta on Mac, Control elsewhere)
|
|
77
|
+
await human.press('Cmd+Shift+P'); // literal Meta+Shift+P on every OS
|
|
78
|
+
await human.press('Control+C'); // literal Ctrl+C
|
|
79
|
+
await human.press('Shift+ArrowDown'); // extend selection down
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
`press` accepts a single key or a keyboard chord. IDE autocomplete enumerates every `Modifier+Key` combination — type `'Shift+'` and you get `Shift+A`, `Shift+B`, …, `Shift+Tab`, etc. as completions.
|
|
83
|
+
|
|
84
|
+
**Modifier rules:**
|
|
85
|
+
|
|
86
|
+
| Token | Resolves to | Notes |
|
|
87
|
+
|---|---|---|
|
|
88
|
+
| `Mod` / `CmdOrCtrl` / `CommandOrControl` | `Meta` on macOS, `Control` elsewhere | The right token for cross-platform app shortcuts. All three are aliases; `Mod` is shortest. |
|
|
89
|
+
| `Cmd` / `Command` / `Meta` / `Win` / `Super` | `Meta` keycode | Literal — does **not** auto-translate to `Control`. Same physical key on every OS. |
|
|
90
|
+
| `Ctrl` / `Control` | `Control` keycode | Literal — stays `Control` everywhere, so Mac-specific things like terminal `Ctrl+C` still work. |
|
|
91
|
+
| `Alt` / `Option` / `Opt` | `Alt` keycode | Literal. |
|
|
92
|
+
| `Shift` | `Shift` keycode | Literal. |
|
|
93
|
+
|
|
94
|
+
Case-insensitive at runtime. Modifier typos (`'Mosd+S'`) are caught at compile time — the modifier union is closed.
|
|
95
|
+
|
|
96
|
+
**Escape hatch for uncommon keys.** Uncommon keys (`'BracketLeft'`, `'NumpadAdd'`, locale-specific keys) and 3+ modifier chords aren't in the typed `KeyOrChord` union. Cast at the call site — the runtime parser handles them:
|
|
97
|
+
|
|
98
|
+
```ts
|
|
99
|
+
import type { KeyOrChord } from '@humanjs/playwright';
|
|
100
|
+
|
|
101
|
+
await human.press('Mod+BracketLeft' as KeyOrChord);
|
|
102
|
+
await human.press('Ctrl+Shift+Alt+K' as KeyOrChord);
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
Why a cast? Including a `(string & {})` escape hatch in the type collapses TypeScript's literal-template IntelliSense, so `'Shift+...'` completions disappear. Autocomplete wins for the 95% case; the cast handles the 5%.
|
|
106
|
+
|
|
107
|
+
**Press does NOT move the cursor** — keyboard input dispatches against focus, not cursor position. Compose with `click` / `hover` / `move` when you need both.
|
|
108
|
+
|
|
109
|
+
### Typing
|
|
110
|
+
|
|
111
|
+
```ts
|
|
112
|
+
await human.type('input[name="email"]', 'gonzalo@example.com');
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
`type` simulates a real keyboard. Per-key delays scale with the personality's typing speed (with jitter); the `distracted` personality occasionally injects QWERTY typos and recovers them with `Backspace`. Single ASCII characters route through Playwright's `keyboard.press` so per-key handlers (autocomplete, validation) fire; non-ASCII characters fall back to `keyboard.insertText` since `press` is keyboard-layout-aware and can't reliably synthesize `é` or `🎉` on every layout.
|
|
116
|
+
|
|
117
|
+
Like every other element-bound primitive, `type` clicks the field first to focus it — a real user moves the cursor to the input and clicks; they don't teleport-focus a field. The implicit click is a sub-step of the `'type'` action, not its own timeline event.
|
|
118
|
+
|
|
119
|
+
**Privacy.** The typed value is never echoed to plugin params. The `'type'` action surfaces only `{ target, length }` — by design, since this argument may carry passwords, tokens, or other secrets.
|
|
120
|
+
|
|
121
|
+
In `'instant'` speed mode, the humanized loop is bypassed for Playwright's `locator.pressSequentially(value, { delay: 0 })` — per-key events still fire, just without the timing.
|
|
122
|
+
|
|
123
|
+
### Pasting
|
|
124
|
+
|
|
125
|
+
```ts
|
|
126
|
+
await human.paste('textarea', longCodeBlock);
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
The Cmd-V semantic. `paste` clicks the field for focus (same mouse-led pattern as `type`), then dispatches the value via `page.keyboard.insertText` — instant, no per-character timing, the value lands in the field in a single beat.
|
|
130
|
+
|
|
131
|
+
When to use which:
|
|
132
|
+
|
|
133
|
+
- **`type`** — short strings where the per-key rhythm is the showcase (form fills, search queries, demo content). Slow on long input by design.
|
|
134
|
+
- **`paste`** — long content where humanized typing would be slow and uninformative (code blocks, multi-paragraph text, anything you'd realistically Cmd-V into the field).
|
|
135
|
+
|
|
136
|
+
`paste` does NOT fire the page's `paste` event. If you need that, drive it yourself: write to the clipboard, focus the field, then `human.press('Mod+V')`.
|
|
137
|
+
|
|
138
|
+
**Privacy.** Same posture as `type` — `{ target, length }` only.
|
|
139
|
+
|
|
140
|
+
### Dragging
|
|
141
|
+
|
|
142
|
+
```ts
|
|
143
|
+
await human.drag('#card-1', '#slot-3'); // selector → selector
|
|
144
|
+
await human.drag('#slider-thumb', { x: 800, y: 450 }); // selector → Point
|
|
145
|
+
await human.drag('#card', locator); // any combination
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
Two-phase Bezier motion: walk to `from`, press the left button, curve to `to` with the button held, release. Each endpoint accepts a `Locator`, a CSS selector, or a raw `Point` coordinate.
|
|
149
|
+
|
|
150
|
+
The `Point` form is essential for canvas / SVG drags where the destination isn't a DOM element — sliders, signature pads, freehand drawing tools. The selector-to-Point shape is the canonical "drag this thumb to that position" pattern.
|
|
151
|
+
|
|
152
|
+
Both Bezier paths (start-to-`from` and `from`-to-`to`) are humanized independently with their own curvature and jitter, so drags don't trace robotic straight lines mid-flight.
|
|
153
|
+
|
|
154
|
+
Auto-scroll fires on both endpoints when needed — if the destination is below the fold, the cursor scrolls to bring it into view before releasing, the way a real user would scroll-to-grab then scroll-to-drop. Raw `Point` endpoints opt out of auto-scroll (explicit coordinates are the caller's responsibility).
|
|
155
|
+
|
|
49
156
|
### Reading
|
|
50
157
|
|
|
51
158
|
```ts
|
|
@@ -75,15 +182,16 @@ await human.read('ul.changelog', { kind: 'scan' }); // explicit skim
|
|
|
75
182
|
|
|
76
183
|
Explicit `kind` always wins over auto-detection.
|
|
77
184
|
|
|
78
|
-
**
|
|
185
|
+
**Eye-scan cursor motion** runs during the dwell by default:
|
|
79
186
|
|
|
80
187
|
```ts
|
|
81
|
-
await human.read('article'
|
|
188
|
+
await human.read('article'); // motion: on
|
|
189
|
+
await human.read('article', { withMotion: false }); // motion: off
|
|
82
190
|
```
|
|
83
191
|
|
|
84
|
-
The cursor walks a humanized L→R sweep through every line of rendered text and emits a small return-saccade between lines — same `mousemove` events a real reader would dispatch (so reading-time tooltip / hover handlers fire).
|
|
192
|
+
The cursor walks a humanized L→R sweep through every line of rendered text and emits a small return-saccade between lines — same `mousemove` events a real reader would dispatch (so reading-time tooltip / hover handlers fire). Pass `{ withMotion: false }` when you only care about the temporal pattern (typical AI-agent use case).
|
|
85
193
|
|
|
86
|
-
For demos and screen recordings, pair
|
|
194
|
+
For demos and screen recordings, pair it with `installMouseHelper(page)` to render a visible cursor that follows the synthetic motion:
|
|
87
195
|
|
|
88
196
|
```ts
|
|
89
197
|
import { createHuman, installMouseHelper } from '@humanjs/playwright';
|
|
@@ -93,7 +201,7 @@ await page.goto('https://example.com/article');
|
|
|
93
201
|
await installMouseHelper(page);
|
|
94
202
|
|
|
95
203
|
const human = await createHuman(page, { personality: 'careful' });
|
|
96
|
-
await human.read('article'
|
|
204
|
+
await human.read('article');
|
|
97
205
|
```
|
|
98
206
|
|
|
99
207
|
**Returns** a `ReadResult`:
|
|
@@ -171,6 +279,91 @@ In `speed: 'instant'`, the page jumps directly via `window.scrollTo` — no whee
|
|
|
171
279
|
|
|
172
280
|
See [humanjs.dev](https://humanjs.dev) for the full feature set and personality reference.
|
|
173
281
|
|
|
282
|
+
### Recording
|
|
283
|
+
|
|
284
|
+
```ts
|
|
285
|
+
import { chromium, createHuman, installMouseHelper } from '@humanjs/playwright';
|
|
286
|
+
|
|
287
|
+
const browser = await chromium.launch({ headless: false });
|
|
288
|
+
const context = await browser.newContext({ viewport: { width: 1920, height: 1080 } });
|
|
289
|
+
const page = await context.newPage();
|
|
290
|
+
|
|
291
|
+
// Visible cursor overlay so the recorded video shows mouse motion.
|
|
292
|
+
await installMouseHelper(context);
|
|
293
|
+
|
|
294
|
+
const human = await createHuman(page);
|
|
295
|
+
|
|
296
|
+
const rec = await human.record(async () => {
|
|
297
|
+
await human.click('#login');
|
|
298
|
+
await human.type('#email', 'demo@humanjs.dev');
|
|
299
|
+
});
|
|
300
|
+
|
|
301
|
+
await rec.toVideo('demo.mp4');
|
|
302
|
+
await rec.toTimeline('demo.json');
|
|
303
|
+
await browser.close();
|
|
304
|
+
```
|
|
305
|
+
|
|
306
|
+
`human.record(cb)` polls `page.screenshot()` at the target FPS, writes each frame to a temp directory, then assembles them via ffmpeg when you call an exporter:
|
|
307
|
+
|
|
308
|
+
- `rec.toVideo(path)` — `.mp4` (H.264 / yuv420p) or `.webm` (VP9)
|
|
309
|
+
- `rec.toGif(path, { fps?, width? })` — palette-optimized animated GIF (`palettegen` + `paletteuse`, Bayer dither). Defaults to 15 fps, source viewport size.
|
|
310
|
+
|
|
311
|
+
Both exporters are **repeatable and interleavable** — they read the captured frames, they don't consume them. Want an mp4 for the landing page *and* a GIF for the README from the same recording? Just call both:
|
|
312
|
+
|
|
313
|
+
```ts
|
|
314
|
+
const rec = await human.record(fn);
|
|
315
|
+
await rec.toVideo('demo.mp4');
|
|
316
|
+
await rec.toGif('demo.gif', { width: 720 });
|
|
317
|
+
// No explicit cleanup needed for one-shot scripts — see below.
|
|
318
|
+
```
|
|
319
|
+
|
|
320
|
+
Captured frames live in a temp directory under `os.tmpdir()`. Cleanup happens automatically at process exit (a single `process.on('exit')` handler sweeps any un-disposed frame dirs), so casual scripts don't have to think about it. For long-running services, batch jobs, or anywhere you want predictable disk usage, release them proactively:
|
|
321
|
+
|
|
322
|
+
```ts
|
|
323
|
+
await rec.dispose(); // explicit, idempotent
|
|
324
|
+
// or with TS ≥ 5.2 / Node ≥ 20.4:
|
|
325
|
+
await using rec = await human.record(fn); // auto-disposes at scope exit
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
The same `Recording` exposes a **structured action timeline** of everything that happened during the callback:
|
|
329
|
+
|
|
330
|
+
```ts
|
|
331
|
+
await rec.toTimeline('session.json'); // → JSON on disk
|
|
332
|
+
const timeline = rec.timeline; // → in-memory object
|
|
333
|
+
```
|
|
334
|
+
|
|
335
|
+
The shape (`Timeline` with `personality`, `seed`, `speed`, `durationMs`, and an `events` array of `{ type, params, tMs, durationMs }`) is intended for observability pipelines, replay infrastructure, analytics, and debugger UIs. `toTimeline()` doesn't touch the browser context — call it before or after `toVideo()`, multiple times, in any order.
|
|
336
|
+
|
|
337
|
+
**Quality presets** trade off file size, encoding time, and visual fidelity. Defaults to `'high'`:
|
|
338
|
+
|
|
339
|
+
```ts
|
|
340
|
+
await rec.toVideo('demo.mp4', { quality: 'high' });
|
|
341
|
+
// 'fast' — JPEG q=85, CRF 23, preset fast (iteration)
|
|
342
|
+
// 'standard' — JPEG q=90, CRF 20, preset fast (balanced)
|
|
343
|
+
// 'high' — JPEG q=95, CRF 18, preset slow, animation (DEFAULT)
|
|
344
|
+
// 'lossless' — PNG capture, CRF 12, preset veryslow (archival)
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
Individual ffmpeg knobs (`crf`, `preset`, `tune`) can override the preset for fine-grained control.
|
|
348
|
+
|
|
349
|
+
**Timeline-only mode** — skip the capture overhead entirely when you only need the action timeline:
|
|
350
|
+
|
|
351
|
+
```ts
|
|
352
|
+
const rec = await human.record({ video: false }, async () => {
|
|
353
|
+
await human.click('#login');
|
|
354
|
+
});
|
|
355
|
+
await rec.toTimeline('session.json'); // works
|
|
356
|
+
// rec.toVideo('demo.mp4') // throws with a clear message
|
|
357
|
+
```
|
|
358
|
+
|
|
359
|
+
**Lifecycle notes**:
|
|
360
|
+
|
|
361
|
+
- Each session can produce **one** recording. `human.record()` throws if called twice on the same session — open a new context (and a new human) to record a separate clip.
|
|
362
|
+
- `Recording.toVideo()` / `Recording.toGif()` are repeatable and interleavable. Frames live until `rec.dispose()` (or `await using` goes out of scope, or the process exits — a sweep-on-exit handler covers forgotten disposes).
|
|
363
|
+
- For a one-call API that owns the entire lifecycle (launch → record → close), use [`@humanjs/recorder`](../recorder)'s `record(options, fn)` instead.
|
|
364
|
+
|
|
365
|
+
Every recording is a regular plugin action — `beforeAction` and `afterAction` observe `{ type: 'record' }` exactly like `'click'` or `'scroll'`.
|
|
366
|
+
|
|
174
367
|
## License
|
|
175
368
|
|
|
176
369
|
MIT
|