@humanjs/playwright 0.4.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,5 +1,14 @@
1
1
  # @humanjs/playwright
2
2
 
3
+ <p>
4
+ <a href="https://www.npmjs.com/package/@humanjs/playwright"><img alt="npm" src="https://img.shields.io/npm/v/@humanjs/playwright"></a>
5
+ <a href="https://www.npmjs.com/package/@humanjs/playwright"><img alt="downloads" src="https://img.shields.io/npm/dt/@humanjs/playwright"></a>
6
+ <a href="https://github.com/totigm/humanjs"><img alt="GitHub" src="https://img.shields.io/badge/GitHub-totigm%2Fhumanjs-181717?logo=github"></a>
7
+ <a href="https://github.com/totigm/humanjs/actions/workflows/ci.yml"><img alt="CI" src="https://github.com/totigm/humanjs/actions/workflows/ci.yml/badge.svg"></a>
8
+ <a href="https://github.com/totigm/humanjs/blob/main/LICENSE"><img alt="license" src="https://img.shields.io/npm/l/@humanjs/playwright"></a>
9
+ <a href="https://humanjs.dev"><img alt="docs" src="https://img.shields.io/badge/docs-humanjs.dev-emerald"></a>
10
+ </p>
11
+
3
12
  Humanize Playwright sessions for AI agents, QA tests, and demos. Drop-in adapter for an existing Playwright `Page`.
4
13
 
5
14
  ## Install
@@ -13,8 +22,7 @@ pnpm add @humanjs/playwright playwright
13
22
  ## Quick start
14
23
 
15
24
  ```ts
16
- import { chromium } from 'playwright';
17
- import { createHuman } from '@humanjs/playwright';
25
+ import { chromium, createHuman } from '@humanjs/playwright';
18
26
 
19
27
  const browser = await chromium.launch();
20
28
  const page = await browser.newPage();
@@ -46,6 +54,114 @@ await human.type('input[name="email"]', 'gonzalo@example.com');
46
54
 
47
55
  Pass a `seed` and every random decision (path curvature, typo placement, keystroke jitter) becomes reproducible. Same seed + same personality + same value = same keystrokes.
48
56
 
57
+ ### Primitives
58
+
59
+ The full `Human` surface, at a glance. Each one fires real DOM events through Playwright; the humanization wraps the timing and the path, not the dispatch.
60
+
61
+ | Primitive | Purpose |
62
+ |---|---|
63
+ | `goto(url)` | Navigate the page. |
64
+ | `click(target)` | Bezier path → pre-click hover dwell → click. Occasionally near-misses (cursor wobble outside the target, then corrects) per `personality.mouse.misclickProbability`. |
65
+ | `rightClick(target)` | Same as `click` but with `button: 'right'`. Fires `contextmenu`. Same near-miss behavior. |
66
+ | `drag(from, to)` | Two-phase Bezier (to start → mouse down → curve to end → mouse up). Both endpoints accept `Locator \| string \| Point` — `Point` is essential for canvas / SVG / slider drags. Both endpoints independently near-miss per `misclickProbability` — a drag may wobble on the grab, the drop, both, or neither. |
67
+ | `hover(target)` | Walk to the element and settle. No click. Hover-state UI (tooltips, dropdowns) fires. |
68
+ | `move(target)` | Walk to a `Locator \| string \| Point`. Pure positional motion. No dwell, no element interaction — use this when you want the cursor parked somewhere with no implied click. |
69
+ | `type(target, value)` | Clicks the field for focus, then per-key rhythm with optional typos + Backspace recovery. |
70
+ | `paste(target, value)` | Clicks the field for focus, then `insertText` — instant insertion, no per-character timing. Cmd-V semantic. |
71
+ | `press(key)` | Single key (`'Tab'`) or chord (`'Mod+S'`). See [Keyboard](#keyboard) below. |
72
+ | `read(target)` | Dwell as a reader would. Cursor scans across the text in humanized mode. See [Reading](#reading). |
73
+ | `scroll(target?)` | Multi-segment wheel motion, bell-curve velocity, optional mid-scroll pauses. See [Scrolling](#scrolling). |
74
+ | `sleep(ms)` | Re-exported from `@humanjs/core` for convenience. |
75
+ | `record(fn)` | Wrap a block and export as mp4 / gif / JSON. See [Recording](#recording). |
76
+
77
+ Targets accept a CSS selector string or a Playwright `Locator`. `move` and `drag` additionally accept raw `Point` coordinates. Auto-scroll fires for any element-bound primitive when the target is outside the viewport — humanized scroll in normal speed modes, `scrollIntoViewIfNeeded` in `'instant'`.
78
+
79
+ Near-miss (cursor wobble before committing — see `personality.mouse.misclickProbability`) applies to the primitives that commit a button event at the resolved coordinates: `click`, `rightClick`, both `drag` endpoints, and the implicit focus-acquiring click inside `type` / `paste` (the keystrokes themselves are unaffected). `hover`, `move`, `press`, `read`, and `scroll` never misclick, by design — a wobble would trigger handlers on the wrong element for `hover`, and would contradict the explicit-coordinate contract for `move`. The misclick is also skipped when the cursor is already on the target (no approach means no overshoot).
80
+
81
+ ### Keyboard
82
+
83
+ ```ts
84
+ await human.press('Tab'); // single key
85
+ await human.press('Mod+S'); // cross-platform save (Meta on Mac, Control elsewhere)
86
+ await human.press('Cmd+Shift+P'); // literal Meta+Shift+P on every OS
87
+ await human.press('Control+C'); // literal Ctrl+C
88
+ await human.press('Shift+ArrowDown'); // extend selection down
89
+ ```
90
+
91
+ `press` accepts a single key or a keyboard chord. IDE autocomplete enumerates every `Modifier+Key` combination — type `'Shift+'` and you get `Shift+A`, `Shift+B`, …, `Shift+Tab`, etc. as completions.
92
+
93
+ **Modifier rules:**
94
+
95
+ | Token | Resolves to | Notes |
96
+ |---|---|---|
97
+ | `Mod` / `CmdOrCtrl` / `CommandOrControl` | `Meta` on macOS, `Control` elsewhere | The right token for cross-platform app shortcuts. All three are aliases; `Mod` is shortest. |
98
+ | `Cmd` / `Command` / `Meta` / `Win` / `Super` | `Meta` keycode | Literal — does **not** auto-translate to `Control`. Same physical key on every OS. |
99
+ | `Ctrl` / `Control` | `Control` keycode | Literal — stays `Control` everywhere, so Mac-specific things like terminal `Ctrl+C` still work. |
100
+ | `Alt` / `Option` / `Opt` | `Alt` keycode | Literal. |
101
+ | `Shift` | `Shift` keycode | Literal. |
102
+
103
+ Case-insensitive at runtime. Modifier typos (`'Mosd+S'`) are caught at compile time — the modifier union is closed.
104
+
105
+ **Escape hatch for uncommon keys.** Uncommon keys (`'BracketLeft'`, `'NumpadAdd'`, locale-specific keys) and 3+ modifier chords aren't in the typed `KeyOrChord` union. Cast at the call site — the runtime parser handles them:
106
+
107
+ ```ts
108
+ import type { KeyOrChord } from '@humanjs/playwright';
109
+
110
+ await human.press('Mod+BracketLeft' as KeyOrChord);
111
+ await human.press('Ctrl+Shift+Alt+K' as KeyOrChord);
112
+ ```
113
+
114
+ Why a cast? Including a `(string & {})` escape hatch in the type collapses TypeScript's literal-template IntelliSense, so `'Shift+...'` completions disappear. Autocomplete wins for the 95% case; the cast handles the 5%.
115
+
116
+ **Press does NOT move the cursor** — keyboard input dispatches against focus, not cursor position. Compose with `click` / `hover` / `move` when you need both.
117
+
118
+ ### Typing
119
+
120
+ ```ts
121
+ await human.type('input[name="email"]', 'gonzalo@example.com');
122
+ ```
123
+
124
+ `type` simulates a real keyboard. Per-key delays scale with the personality's typing speed (with jitter); the `distracted` personality occasionally injects QWERTY typos and recovers them with `Backspace`. Single ASCII characters route through Playwright's `keyboard.press` so per-key handlers (autocomplete, validation) fire; non-ASCII characters fall back to `keyboard.insertText` since `press` is keyboard-layout-aware and can't reliably synthesize `é` or `🎉` on every layout.
125
+
126
+ Like every other element-bound primitive, `type` clicks the field first to focus it — a real user moves the cursor to the input and clicks; they don't teleport-focus a field. The implicit click is a sub-step of the `'type'` action, not its own timeline event.
127
+
128
+ **Privacy.** The typed value is never echoed to plugin params. The `'type'` action surfaces only `{ target, length }` — by design, since this argument may carry passwords, tokens, or other secrets.
129
+
130
+ In `'instant'` speed mode, the humanized loop is bypassed for Playwright's `locator.pressSequentially(value, { delay: 0 })` — per-key events still fire, just without the timing.
131
+
132
+ ### Pasting
133
+
134
+ ```ts
135
+ await human.paste('textarea', longCodeBlock);
136
+ ```
137
+
138
+ The Cmd-V semantic. `paste` clicks the field for focus (same mouse-led pattern as `type`), then dispatches the value via `page.keyboard.insertText` — instant, no per-character timing, the value lands in the field in a single beat.
139
+
140
+ When to use which:
141
+
142
+ - **`type`** — short strings where the per-key rhythm is the showcase (form fills, search queries, demo content). Slow on long input by design.
143
+ - **`paste`** — long content where humanized typing would be slow and uninformative (code blocks, multi-paragraph text, anything you'd realistically Cmd-V into the field).
144
+
145
+ `paste` does NOT fire the page's `paste` event. If you need that, drive it yourself: write to the clipboard, focus the field, then `human.press('Mod+V')`.
146
+
147
+ **Privacy.** Same posture as `type` — `{ target, length }` only.
148
+
149
+ ### Dragging
150
+
151
+ ```ts
152
+ await human.drag('#card-1', '#slot-3'); // selector → selector
153
+ await human.drag('#slider-thumb', { x: 800, y: 450 }); // selector → Point
154
+ await human.drag('#card', locator); // any combination
155
+ ```
156
+
157
+ Two-phase Bezier motion: walk to `from`, press the left button, curve to `to` with the button held, release. Each endpoint accepts a `Locator`, a CSS selector, or a raw `Point` coordinate.
158
+
159
+ The `Point` form is essential for canvas / SVG drags where the destination isn't a DOM element — sliders, signature pads, freehand drawing tools. The selector-to-Point shape is the canonical "drag this thumb to that position" pattern.
160
+
161
+ Both Bezier paths (start-to-`from` and `from`-to-`to`) are humanized independently with their own curvature and jitter, so drags don't trace robotic straight lines mid-flight.
162
+
163
+ Auto-scroll fires on both endpoints when needed — if the destination is below the fold, the cursor scrolls to bring it into view before releasing, the way a real user would scroll-to-grab then scroll-to-drop. Raw `Point` endpoints opt out of auto-scroll (explicit coordinates are the caller's responsibility).
164
+
49
165
  ### Reading
50
166
 
51
167
  ```ts
@@ -257,6 +373,39 @@ await rec.toTimeline('session.json'); // works
257
373
 
258
374
  Every recording is a regular plugin action — `beforeAction` and `afterAction` observe `{ type: 'record' }` exactly like `'click'` or `'scroll'`.
259
375
 
376
+ ## Using your own browser or a persistent profile
377
+
378
+ `createHuman(page)` wraps **any** Playwright `Page` — so reusing a saved login, your installed Chrome, or an already-running browser is just a matter of how you create that page. HumanJS adds nothing special here; these are standard Playwright entry points, collected so you don't have to hunt for them.
379
+
380
+ **Persistent profile** — keep cookies, local storage, and logins across runs. The first run signs in; later runs are already authenticated:
381
+
382
+ ```ts
383
+ import { chromium, createHuman } from '@humanjs/playwright';
384
+
385
+ const context = await chromium.launchPersistentContext('./.humanjs-profile', {
386
+ headless: false,
387
+ channel: 'chrome', // optional: use installed Google Chrome instead of bundled Chromium
388
+ });
389
+ const page = context.pages()[0] ?? (await context.newPage());
390
+
391
+ const human = await createHuman(page, { personality: 'careful' });
392
+ // …drive the page; state persists in ./.humanjs-profile for next time
393
+ ```
394
+
395
+ **Attach to an already-running browser** — drive a Chrome you started yourself, with all its existing tabs, extensions, and sessions. Launch Chrome with a debugging port first (`chrome --remote-debugging-port=9222`), then:
396
+
397
+ ```ts
398
+ import { chromium, createHuman } from '@humanjs/playwright';
399
+
400
+ const browser = await chromium.connectOverCDP('http://localhost:9222');
401
+ const context = browser.contexts()[0];
402
+ const page = context.pages()[0] ?? (await context.newPage());
403
+
404
+ const human = await createHuman(page, { personality: 'careful' });
405
+ ```
406
+
407
+ > **Heads up:** a persistent profile or a connected real browser carries whatever you're signed into. Driving it means the automation can act with those sessions' privileges — keep that in mind for anything sensitive, and be wary of pages that try to manipulate an agent into actions while logged in.
408
+
260
409
  ## License
261
410
 
262
411
  MIT