human-browser 4.4.0 → 4.5.0

Files changed (3)
  1. package/README.md +128 -76
  2. package/SKILL.md +46 -0
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -1,9 +1,10 @@
  # Human Browser — Cloud Stealth Browser for AI Agents

- > **No Mac Mini. No local machine. Your agent runs it anywhere.**
+ > **No Mac Mini. No local install. Your agent runs it anywhere.**
  > Residential IPs from 10+ countries. Bypasses Cloudflare, DataDome, PerimeterX.
  >
- > 🌐 **Product page:** https://humanbrowser.cloud
+ > 🌐 **Product page:** https://humanbrowser.cloud
+ > 🤖 **A2A endpoint:** `https://agent.humanbrowser.cloud/a2a`
  > 💬 **Support:** https://t.me/virixlabs

  ---
@@ -15,115 +16,166 @@ Regular Playwright on a data-center server gets blocked **immediately** by:
  - DataDome (fingerprint analysis)
  - PerimeterX (behavioral analysis)
  - Instagram, LinkedIn, TikTok (residential IP requirement)
+ - Google sign-in, Akamai BMP (TLS / WebRTC fingerprint matching)

- Human Browser solves this by combining:
- 1. **Residential IP** — real ISP address from the target country (not a data center)
- 2. **Real device fingerprint** — iPhone 15 Pro or Windows Chrome, complete with canvas, WebGL, fonts
- 3. **Human-like behavior** — Bezier mouse curves, 60–220ms typing, natural scroll with jitter
- 4. **Full anti-detection** — `webdriver=false`, no automation flags, correct timezone & geolocation
+ Local stealth libraries (patchright, undetected-chromedriver, playwright-stealth) close some of these gaps in JS, but leak others — most notably **WebRTC ICE candidates** that surface the server's real datacenter IP regardless of your proxy.

- ---
+ Human Browser's **cloud build** runs a custom forked Chromium with **C++-source-level fingerprint patches** (TLS ja3/ja4 matched to real Chrome, GPU vendor/renderer spoofing, WebRTC IP replacement, canvas/WebGL/audio noise) that you cannot get from any npm package. Plus residential proxies. Plus a live browser viewer. **Drive it via the Agent2Agent protocol — no install, no version pinning, no Linux build of Chromium for you to maintain.**

- ## Quick Start
+ ---

- **No setup required** — just call `launchHuman()` and it automatically activates a free trial:
+ ## Agent2Agent (A2A) — recommended path

- ```js
- const { launchHuman } = require('./scripts/browser-human');
+ **Endpoint:** `https://agent.humanbrowser.cloud/a2a`
+ **Auth:** `Authorization: Bearer hb_live_<your-token>`
+ **Agent card:** `https://agent.humanbrowser.cloud/.well-known/agent.json`

- // 🚀 Zero config — auto-fetches trial credentials from humanbrowser.cloud
- const { browser, page, humanType, humanClick, humanScroll, sleep } = await launchHuman();
- // Output: 🎉 Human Browser trial activated! (~100MB Romania residential IP)
+ Works with anything that speaks A2A — LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, Claude/Anthropic agents, any hand-rolled JSON-RPC client.

- // Specific country
- const { page } = await launchHuman({ country: 'us' }); // US residential IP
- const { page } = await launchHuman({ country: 'gb' }); // UK residential IP
+ ### Submit + poll (recommended)

- // Desktop Chrome (Windows fingerprint)
- const { page } = await launchHuman({ mobile: false, country: 'us' });
+ ```bash
+ # 1. Submit a goal — get back task_id and viewerUrl
+ curl -sS https://agent.humanbrowser.cloud/a2a \
+   -H "Authorization: Bearer hb_live_<your-token>" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "jsonrpc": "2.0",
+     "id": 1,
+     "method": "message/send",
+     "params": {
+       "message": {
+         "role": "user",
+         "metadata": { "profile": "main", "model": "anthropic/claude-sonnet-4-6" },
+         "parts": [{ "kind": "text", "text": "Log into example.com and report the dashboard total" }]
+       }
+     }
+   }'
+ # → { "result": {
+ #     "id": "t_abc...",
+ #     "metadata": { "viewerUrl": "https://humanbrowser.cloud/a/s_xyz?k=...", "cost": {...} }
+ #   } }
+
+ # 2. Poll task.metadata while running — every 5–10 s.
+ #    metadata is enriched per step: step_count, current_url, last_thinking,
+ #    last_eval, last_action, cost.
+ curl -sS https://agent.humanbrowser.cloud/a2a \
+   -H "Authorization: Bearer hb_live_<your-token>" \
+   -H "Content-Type: application/json" \
+   -d '{"jsonrpc":"2.0","id":2,"method":"tasks/get","params":{"id":"t_abc..."}}'
+
+ # 3. When task.state ∈ {completed, failed, canceled} → done.
+ #    Final outcome is in task.metadata.outcome:
+ #    { success, result, step_count, duration_ms, cost, files }
+ #    Full artifact in task.artifacts[0].
+ ```

- await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });
- await humanScroll(page, 'down');
- await humanType(page, 'input[type="email"]', 'user@example.com');
- await humanClick(page, 760, 400);
- await browser.close();
+ ### Or: webhook callback (no polling)
+
+ Pass `callback_url` in `message.metadata` and we POST the final `Task` envelope to that URL when the task hits a terminal state. Signed with `X-HB-Signature: sha256=<HMAC>` when the server is configured with a webhook secret. Retries 3× on 5xx / network errors with 2 / 8 / 30 s backoff.
+
+ ```json
+ {
+   "message": {
+     "role": "user",
+     "metadata": {
+       "profile": "main",
+       "callback_url": "https://your-agent.host/hb-callback"
+     },
+     "parts": [{ "kind": "text", "text": "..." }]
+   }
+ }
  ```

- > **Trial exhausted?** Get a paid plan at https://humanbrowser.cloud, then set `PROXY_USER` / `PROXY_PASS` in your `.env`.
+ ### Or: SSE streaming
+
+ `method: "message/stream"` returns Server-Sent Events; each step pushes a `status-update` event with the latest `task.metadata`. Use this if your client can hold a long-lived connection.
+
+ ### Live viewer
+
+ Every spawned session ships a `viewerUrl` in `task.metadata`. Open it to:
+ - Watch the browser live
+ - See the agent timeline (👤 you · 🤖 caller-agent like dzeny · 🧠 hb-agent — color-coded)
+ - Inject manual goals into a running session via the input box
+
+ ### Important: don't fire-and-await
+
+ `message/send` resolves only on terminal state, and tasks routinely run 5–30 min. Don't hold an HTTP connection that long — NAT / load-balancer / proxy timeouts will sever it and your client loses context **while our server keeps running and billing**. Use polling or webhooks instead.

  ---

- ## Setup
+ ## Sensitive data handling

- ```bash
- npm install playwright
- npx playwright install chromium --with-deps
+ Pass logins / passwords / API keys via the A2A `DataPart` with `metadata.sensitive=true` — they're injected as task input and are **stripped from logs, never written to artifacts, never echoed back in streams**.

- # Install via skill manager
- clawhub install al1enjesus/human-browser
+ ```json
+ {
+   "message": {
+     "role": "user",
+     "parts": [
+       { "kind": "text", "text": "Log in and download the latest report" },
+       { "kind": "data", "data": { "email": "...", "password": "..." }, "metadata": { "sensitive": true } }
+     ]
+   }
+ }
  ```

  ---

- ## Supported Countries
+ ## Profile persistence + per-token defaults

- | Country | Code | Best for |
- |---------|------|----------|
- | 🇷🇴 Romania | `ro` | Polymarket, Instagram, Binance, Cloudflare |
- | 🇺🇸 United States | `us` | Netflix, DoorDash, US Banks, Amazon |
- | 🇬🇧 United Kingdom | `gb` | Polymarket, Binance, BBC iPlayer |
- | 🇩🇪 Germany | `de` | EU services, German e-commerce |
- | 🇳🇱 Netherlands | `nl` | Crypto, Polymarket, Web3 |
- | 🇯🇵 Japan | `jp` | Japanese e-commerce, Line |
- | 🇫🇷 France | `fr` | EU services, luxury brands |
- | 🇨🇦 Canada | `ca` | North American services |
- | 🇸🇬 Singapore | `sg` | APAC/SEA e-commerce |
- | 🇦🇺 Australia | `au` | Oceania content |
+ Each session inherits a named **profile** that persists cookies / localStorage / IndexedDB / Service Worker storage between spawns. Pass it in `message.metadata.profile`:
+
+ ```json
+ { "message": { "metadata": { "profile": "polymarket-main" }, "role": "user", "parts": [...] } }
+ ```
+
+ If you don't pass one, the server uses the token's `default_profile` (configurable per token via admin API) and falls back to `"default"`. Same profile across reconnects → same cookies, same fingerprint, same WebRTC IP. Up to 5 concurrent sessions per token by default.

  ---

- ## Proxy Providers
+ ## Library mode (advanced — for self-hosting)

- ### Option 1: Human Browser Managed (recommended)
- Buy directly at **humanbrowser.cloud** — we handle everything, from $13.99/mo.
- Supports crypto (USDT/ETH/BTC/SOL) and card. AI agents can auto-purchase.
+ If you want to skip the cloud entirely and drive your own Chromium with a residential proxy you supply yourself, the `human-browser` npm package exposes `launchHuman()` — a drop-in Playwright launcher with our humanizer helpers, geo-fingerprint plumbing, and built-in 2captcha integration.

- ### Option 2: Bring Your Own Proxy
+ **Note:** the library does NOT include our forked Chromium with C++ stealth patches — that binary is part of the cloud build only. Library mode is patchright-stealth-plus-helpers; expect lower pass-rate on Cloudflare BM / DataDome / Google sign-in / WebRTC-aware bot scoring than the cloud.

- Plug any residential proxy into Human Browser via env vars.
- **Recommended providers** (tested and verified):
+ ```js
+ const { launchHuman } = require('human-browser');

- | Provider | Quality | Price | Best for |
- |---|---|---|---|
- | **[Decodo](https://decodo.com)** (ex-Smartproxy) | ⭐⭐⭐⭐⭐ | ~$2.5/GB | Cloudflare, DataDome, all-round. No KYC. |
- | **[Bright Data](https://get.brightdata.com/4ihj1kk8jt0v)** | ⭐⭐⭐⭐⭐ | ~$8.4/GB | Enterprise-grade, 72M+ IPs, 195 countries |
- | **[IPRoyal](https://iproyal.com)** | ⭐⭐⭐⭐ | ~$1.75/GB | High volume, budget, ethically sourced |
- | **[NodeMaven](https://nodemaven.com)** | ⭐⭐⭐⭐ | ~$3.5/GB | High success rate, pay-per-GB, no minimums |
- | **[Oxylabs](https://oxylabs.io)** | ⭐⭐⭐⭐⭐ | ~$8/GB | Business-grade, dedicated support |
+ // Zero config — auto-fetches trial credentials from humanbrowser.cloud
+ const { browser, page, humanType, humanClick, humanScroll, sleep } = await launchHuman();

- ```env
- PROXY_HOST=your-proxy-host
- PROXY_PORT=22225
- PROXY_USER=your-username
- PROXY_PASS=your-password
+ const { page } = await launchHuman({ country: 'us' }); // US residential IP
+ const { page } = await launchHuman({ mobile: false }); // Desktop Chrome fingerprint
  ```

+ ```bash
+ npm install human-browser playwright
+ npx playwright install chromium --with-deps
+ ```
+
+ For proxy-provider env vars (BYO Decodo / IPRoyal / Bright Data / NodeMaven / Oxylabs), see the [proxy setup notes](./references/brightdata-setup.md) and the env section of `scripts/browser-human.js`.
+
  ---

- ## How it compares
+ ## Supported countries

- | Feature | Regular Playwright | Human Browser |
- |---------|-------------------|---------------|
- | IP type | Data center → blocked | Residential → clean |
- | Bot detection | Fails | Passes all |
- | Mouse movement | Instant teleport | Bezier curves |
- | Typing speed | Instant | 60–220ms/char |
- | Fingerprint | Detectable bot | iPhone 15 Pro |
- | Countries | None | 10+ residential |
- | Cloudflare | Blocked | Bypassed |
- | DataDome | Blocked | Bypassed |
+ | Country | Code | Best for |
+ |---------|------|----------|
+ | 🇷🇴 Romania | `ro` | Polymarket, Instagram, Binance, Cloudflare |
+ | 🇺🇸 United States | `us` | Netflix, DoorDash, US banks, Amazon |
+ | 🇬🇧 United Kingdom | `gb` | Polymarket, Binance, BBC iPlayer |
+ | 🇩🇪 Germany | `de` | EU services, German e-commerce |
+ | 🇳🇱 Netherlands | `nl` | Crypto, Polymarket, Web3 |
+ | 🇯🇵 Japan | `jp` | Japanese e-commerce, Line |
+ | 🇫🇷 France | `fr` | EU services, luxury brands |
+ | 🇨🇦 Canada | `ca` | North American services |
+ | 🇸🇬 Singapore | `sg` | APAC/SEA e-commerce |
+ | 🇦🇺 Australia | `au` | Oceania content |

  ---

- → **Product page + pricing:** https://humanbrowser.cloud
+ → **Buy a plan + pricing:** https://humanbrowser.cloud
  → **Support & questions:** https://t.me/virixlabs
+ → **Full spec for hand-rolled A2A clients:** [SKILL.md](./SKILL.md)
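
Reviewer note on the webhook path added in this README: callbacks are said to be signed with `X-HB-Signature: sha256=<HMAC>`, so a receiver would normally check that header before trusting the POST. The diff does not state which bytes are signed or how the digest is encoded. The sketch below therefore assumes a hex-encoded HMAC-SHA256 over the raw request body; `verifyHbSignature` and its parameters are illustrative names, not part of the package.

```javascript
const crypto = require('node:crypto');

// ASSUMPTION (not confirmed by the diff): the header value is
// "sha256=" followed by the hex HMAC-SHA256 of the raw request body,
// keyed with the shared webhook secret.
function verifyHbSignature(rawBody, signatureHeader, secret) {
  const prefix = 'sha256=';
  if (typeof signatureHeader !== 'string' || !signatureHeader.startsWith(prefix)) {
    return false; // header missing or in an unexpected format
  }
  // Invalid hex decodes to a shorter buffer, which the length check rejects.
  const received = Buffer.from(signatureHeader.slice(prefix.length), 'hex');
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}
```

The check has to run on the body bytes exactly as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the digest.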
package/SKILL.md CHANGED
@@ -954,4 +954,50 @@ curl -sX POST https://agent.humanbrowser.cloud/a2a \

  For live streaming, swap `message/send` → `message/stream` and read the response as `text/event-stream`. Each frame is a JSON-RPC notification carrying a `Task`, `TaskStatusUpdateEvent` or `TaskArtifactUpdateEvent` — exactly what `runOnCloud()` parses internally.

+ #### Webhook callback (v77+)
+
+ If you'd rather not poll, pass `callback_url` in `message.metadata` and we POST the final task envelope to that URL when the task hits a terminal state:
+
+ ```bash
+ curl -sX POST https://agent.humanbrowser.cloud/a2a \
+   -H "Authorization: Bearer $HUMANBROWSER_API_TOKEN" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "jsonrpc":"2.0","id":1,"method":"message/send",
+     "params":{"message":{
+       "role":"user",
+       "metadata":{"callback_url":"https://your-host/hb-callback"},
+       "parts":[{"kind":"text","text":"..."}]
+     }}
+   }'
+ ```
+
+ The POST carries the full `Task` JSON (status, history, artifacts, metadata) plus `kind: "task.final"` and a `deliveredAt` timestamp. Headers: `Content-Type: application/json`, `X-HB-Task-Id`, `X-HB-Task-State`, and `X-HB-Signature: sha256=<HMAC>` when the server is configured with `A2A_WEBHOOK_SECRET`. Retries 3× on 5xx / network error with 2 / 8 / 30 s backoff. HTTPS only; max URL length 1000 chars.
+
+ #### Per-step metadata (v77+)
+
+ `task.metadata` is now enriched on every step the agent takes, so polling `tasks/get` gives a rich progress snapshot without parsing `task.history`:
+
+ | Field | When updated | Example |
+ |---|---|---|
+ | `step_count` | each step | `12` |
+ | `current_url` | each step | `"https://featured.com/experts/questions"` |
+ | `last_thinking` | each step | first ~2 KB of the agent's reasoning |
+ | `last_next_goal` | each step | the planner's next-step intent |
+ | `last_eval` | each step | the agent's own verdict on the last action |
+ | `last_action` | each action | `{ "name": "click", "at": "2026-05-11T..." }` |
+ | `cost` | each LLM call | `{ tokens_in, tokens_out, usd, model }` |
+ | `outcome` | terminal `done` | `{ success, result, step_count, duration_ms, cost, files }` |
+ | `viewerUrl` | initial | `https://humanbrowser.cloud/a/s_xyz?k=...` |
+
+ A polling client thus renders a faithful "what's it doing right now" panel:
+
+ ```
+ step 12/50 · https://featured.com/experts/questions
+ last action: click on "Page 2"
+ last eval: Successfully navigated to page 2 of 7
+ cost: $0.58
+ viewer: https://humanbrowser.cloud/a/s_xyz?k=...
+ ```
+
  > **Note on multi-turn**: the A2A spec describes an `input-required` state for tasks that need follow-up input. The current cloud build runs every task to terminal in one shot — multi-turn resumption is reserved in the protocol but not yet wired up server-side. Use `tasks/cancel` and submit a fresh task if you need to redirect.
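
Reviewer note on the per-step metadata added in this SKILL.md hunk: the polling flow it describes could be sketched roughly as below. This is a minimal illustration, not the package's client. `pollTask`, `formatProgress`, `taskState` and the option names are hypothetical; the code assumes `cost.usd` is numeric and that the task state may appear either as `task.status.state` (the A2A `Task` shape) or as `task.state` (the README's shorthand).

```javascript
// Terminal states as listed in the README comment: completed, failed, canceled.
const TERMINAL_STATES = new Set(['completed', 'failed', 'canceled']);

// ASSUMPTION: accept both task.status.state and the shorthand task.state.
function taskState(task) {
  return task?.status?.state ?? task?.state;
}

// Build a text panel from the per-step metadata fields named in the table.
function formatProgress(task) {
  const m = task.metadata ?? {};
  // ASSUMPTION: cost.usd is a number (the example panel shows "$0.58").
  const cost = m.cost?.usd != null ? `$${Number(m.cost.usd).toFixed(2)}` : '-';
  return [
    `step ${m.step_count ?? '?'} · ${m.current_url ?? '(no url yet)'}`,
    `last action: ${m.last_action?.name ?? '-'}`,
    `last eval: ${m.last_eval ?? '-'}`,
    `cost: ${cost}`,
    `viewer: ${m.viewerUrl ?? '-'}`,
  ].join('\n');
}

// Poll tasks/get until the task reaches a terminal state, then return it.
// fetchImpl is injectable so the loop can be exercised without network access.
async function pollTask(taskId, {
  token,
  endpoint = 'https://agent.humanbrowser.cloud/a2a',
  intervalMs = 5000, // the README suggests polling every 5 to 10 s
  fetchImpl = fetch,
  onProgress = console.log,
} = {}) {
  for (let requestId = 1; ; requestId++) {
    const res = await fetchImpl(endpoint, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        jsonrpc: '2.0',
        id: requestId,
        method: 'tasks/get',
        params: { id: taskId },
      }),
    });
    const { result: task, error } = await res.json();
    if (error) throw new Error(`tasks/get failed: ${error.message}`);
    onProgress(formatProgress(task));
    if (TERMINAL_STATES.has(taskState(task))) return task;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Each request is short-lived, so this avoids holding a connection open for the 5 to 30 minutes a task may run; a real client would probably also cap the number of polls.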
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "human-browser",
-   "version": "4.4.0",
+   "version": "4.5.0",
    "description": "Stealth browser for AI agents. Bypasses Cloudflare, DataDome, PerimeterX. Residential IPs from 10+ countries. iPhone 15 Pro fingerprint. Drop-in Playwright replacement — launchHuman() just works.",
    "keywords": [
      "browser-automation",