agentmb 0.1.1 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (41)
  1. package/README.md +818 -153
  2. package/dist/browser/actions.d.ts +24 -1
  3. package/dist/browser/actions.d.ts.map +1 -1
  4. package/dist/browser/actions.js +118 -19
  5. package/dist/browser/actions.js.map +1 -1
  6. package/dist/browser/manager.d.ts +14 -0
  7. package/dist/browser/manager.d.ts.map +1 -1
  8. package/dist/browser/manager.js +117 -4
  9. package/dist/browser/manager.js.map +1 -1
  10. package/dist/cli/commands/actions.d.ts.map +1 -1
  11. package/dist/cli/commands/actions.js +305 -70
  12. package/dist/cli/commands/actions.js.map +1 -1
  13. package/dist/cli/commands/browser-launch.d.ts +7 -0
  14. package/dist/cli/commands/browser-launch.d.ts.map +1 -0
  15. package/dist/cli/commands/browser-launch.js +116 -0
  16. package/dist/cli/commands/browser-launch.js.map +1 -0
  17. package/dist/cli/commands/session.d.ts.map +1 -1
  18. package/dist/cli/commands/session.js +67 -4
  19. package/dist/cli/commands/session.js.map +1 -1
  20. package/dist/cli/index.js +3 -1
  21. package/dist/cli/index.js.map +1 -1
  22. package/dist/daemon/index.js +2 -2
  23. package/dist/daemon/index.js.map +1 -1
  24. package/dist/daemon/routes/actions.d.ts.map +1 -1
  25. package/dist/daemon/routes/actions.js +419 -66
  26. package/dist/daemon/routes/actions.js.map +1 -1
  27. package/dist/daemon/routes/sessions.d.ts.map +1 -1
  28. package/dist/daemon/routes/sessions.js +208 -3
  29. package/dist/daemon/routes/sessions.js.map +1 -1
  30. package/dist/daemon/routes/state.d.ts.map +1 -1
  31. package/dist/daemon/routes/state.js +26 -0
  32. package/dist/daemon/routes/state.js.map +1 -1
  33. package/dist/daemon/server.js +1 -1
  34. package/dist/daemon/session.d.ts +19 -0
  35. package/dist/daemon/session.d.ts.map +1 -1
  36. package/dist/daemon/session.js +13 -0
  37. package/dist/daemon/session.js.map +1 -1
  38. package/dist/policy/types.d.ts.map +1 -1
  39. package/dist/policy/types.js +14 -12
  40. package/dist/policy/types.js.map +1 -1
  41. package/package.json +1 -1
package/README.md CHANGED
@@ -4,28 +4,22 @@ Agent-ready local browser runtime for stable, auditable web automation.
4
4
 
5
5
  ## What It Does
6
6
 
7
- `agent-managed-browser` provides a persistent Chromium daemon with session management, CLI/Python SDK access, and human login handoff support. It is designed for coding/ops agents that need reproducible browser workflows instead of fragile one-off scripts.
7
+ `agent-managed-browser` runs a persistent **Chromium stable** browser daemon (via Playwright's bundled Chromium stable channel) with session management, structured audit logs, multi-modal element targeting, and human login handoff. It exposes a REST API, a CLI, and a Python SDK.
8
8
 
9
- ## Use Cases
10
-
11
- - **Agent web tasks**: Let Codex/Claude run navigation, click/fill, extraction, screenshot, and evaluation in a controlled runtime.
12
- - **Human-in-the-loop login**: Switch to headed mode for manual login, then return to headless automation with the same profile.
13
- - **E2E and CI verification**: Run isolated smoke/auth/handoff/cdp checks with configurable port and data dir.
14
- - **Local automation service**: Keep one daemon running and let multiple tools/agents reuse sessions safely.
15
-
16
- Local Chromium runtime for AI agents, with:
9
+ The browser engine is Chromium (Chrome-compatible). Firefox and WebKit are not supported. Node.js 20 LTS is the runtime baseline.
17
10
 
18
- - daemon API (`agentmb`)
19
- - CLI (`agentmb`)
20
- - Python SDK (`agentmb`)
11
+ Designed for coding and ops agents that need reproducible, inspectable browser workflows rather than fragile one-off scripts.
21
12
 
22
- This repo supports macOS, Linux, and Windows.
13
+ ## Use Cases
23
14
 
24
- ## Agent Skill
15
+ - **Agent web tasks**: navigate, click, fill, extract, screenshot, evaluate JavaScript, all via API or SDK.
16
+ - **Human-in-the-loop login**: switch to headed mode for manual login, then return to headless automation with the same profile and cookies intact.
17
+ - **E2E and CI verification**: run isolated smoke/auth/CDP/policy checks with configurable port and data dir.
18
+ - **Local automation service**: one daemon, multiple sessions, multiple agents reusing sessions safely.
25
19
 
26
- For Codex/Claude/AgentMB operation guidance (initialization, core commands, troubleshooting), see:
20
+ Supports macOS, Linux, and Windows.
27
21
 
28
- - [agentmb-operations-skill/SKILL.md](./agentmb-operations-skill/SKILL.md)
22
+ ---
29
23
 
30
24
  ## Quick Start
31
25
 
@@ -45,9 +39,22 @@ npm link
45
39
  agentmb start
46
40
  ```
47
41
 
48
- ## Install from npm / pip
42
+ In another terminal:
43
+
44
+ ```bash
45
+ agentmb status
46
+ agentmb session new --profile demo
47
+ agentmb session list
48
+ agentmb navigate <session-id> https://example.com
49
+ agentmb screenshot <session-id> -o ./shot.png
50
+ agentmb stop
51
+ ```
52
+
53
+ ---
49
54
 
50
- ### macOS / Linux
55
+ ## Install
56
+
57
+ ### npm + pip (macOS / Linux)
51
58
 
52
59
  ```bash
53
60
  npm i -g agentmb
@@ -56,239 +63,755 @@ agentmb --help
56
63
  python3 -c "import agentmb; print(agentmb.__version__)"
57
64
  ```
58
65
 
59
- ### Windows (PowerShell)
66
+ ### npm + pip (Windows PowerShell)
60
67
 
61
68
  ```powershell
62
69
  npm i -g agentmb
63
70
  py -m pip install --user agentmb
64
71
  agentmb --help
65
- py -c "import agentmb; print(agentmb.__version__)"
66
72
  ```
67
73
 
68
74
  Package roles:
69
- - npm package: CLI + daemon runtime
70
- - pip package: Python SDK client
75
+ - `npm` package: CLI + daemon runtime (Chromium via Playwright)
76
+ - `pip` package: Python SDK client (httpx + pydantic v2)
71
77
 
72
- In another terminal:
73
-
74
- ```bash
75
- agentmb status
76
- agentmb session new --profile demo
77
- agentmb session list
78
- agentmb navigate <session-id> https://example.com
79
- agentmb screenshot <session-id> -o ./shot.png
80
- agentmb stop
81
- ```
78
+ ---
82
79
 
83
80
  ## Python SDK
84
81
 
85
82
  ```bash
86
83
  python3 -m pip install -e sdk/python
87
- python3 -c "from agentmb import BrowserClient; print('SDK OK')"
88
84
  ```
89
85
 
90
- ## Install By Platform
86
+ ```python
87
+ from agentmb import BrowserClient
91
88
 
92
- For full installation steps on all environments:
89
+ with BrowserClient(base_url="http://127.0.0.1:19315") as client:
90
+     sess = client.sessions.create(headless=True, profile="demo")
91
+     sess.navigate("https://example.com")
92
+     res = sess.screenshot()
93
+     res.save("shot.png")
94
+     sess.close()
95
+ ```
93
96
 
94
- - macOS
95
- - Linux (Ubuntu / Debian)
96
- - Windows (PowerShell / WSL2)
97
+ ---
97
98
 
98
- See [INSTALL.md](./INSTALL.md).
99
+ ## Locator Models
99
100
 
100
- ## Action Reference
101
+ Three targeting modes are available; choose one based on page stability and replay requirements.
101
102
 
102
- | Action | CLI command | Description |
103
- |---|---|---|
104
- | navigate | `agentmb navigate <sess> <url>` | Navigate to URL |
105
- | screenshot | `agentmb screenshot <sess> -o out.png` | Capture screenshot |
106
- | eval | `agentmb eval <sess> <expr>` | Run JavaScript expression |
107
- | extract | `agentmb extract <sess> <selector>` | Extract text/attributes |
108
- | click | `agentmb click <sess> <selector>` | Click element |
109
- | fill | `agentmb fill <sess> <selector> <value>` | Fill form field |
110
- | type | `agentmb type <sess> <selector> <text>` | Type char-by-char |
111
- | press | `agentmb press <sess> <selector> <key>` | Press key / combo (e.g. `Enter`, `Control+a`) |
112
- | select | `agentmb select <sess> <selector> <val>` | Select `<option>` in a `<select>` |
113
- | hover | `agentmb hover <sess> <selector>` | Hover over element |
114
- | wait-selector | `agentmb wait-selector <sess> <selector>` | Wait for element state |
115
- | wait-url | `agentmb wait-url <sess> <pattern>` | Wait for URL pattern |
116
- | upload | `agentmb upload <sess> <selector> <file>` | Upload local file to file input |
117
- | download | `agentmb download <sess> <selector> -o out` | Click link and save download |
118
- | element-map | `agentmb element-map <sess>` | Scan page, label interactive elements with stable IDs |
119
- | get | `agentmb get <sess> <property> <selector>` | Read text/html/value/attr/count/box from element |
120
- | assert | `agentmb assert <sess> <property> <selector>` | Assert visible/enabled/checked state |
121
- | wait-stable | `agentmb wait-stable <sess>` | Wait for network idle + DOM quiet + overlay gone |
122
-
123
- Actions that accept `<selector>` also accept `--element-id <eid>` (from `element-map`) as an alternative stable locator. Both remain backward-compatible.
124
-
125
- ### Element Map
126
-
127
- ```bash
128
- # Scan the page and label all interactive elements
103
+ ### 1) Selector Mode
104
+
105
+ Plain CSS selectors passed directly.
106
+
107
+ ```bash
108
+ agentmb click <session-id> "#submit"
109
+ agentmb fill <session-id> "#email" "name@example.com"
110
+ agentmb get <session-id> text "#title"
111
+ ```
112
+
113
+ Best for: stable pages where selectors are reliable.
114
+
115
+ ### 2) Element-ID Mode (`element-map`)
116
+
117
+ Step 1: scan the page, get stable `element_id` values.
118
+
119
+ ```bash
129
120
  agentmb element-map <session-id>
130
- # table: element_id | tag | role | text | rect
121
+ agentmb element-map <session-id> --include-unlabeled # also surface icon-only elements
122
+ ```
131
123
 
132
- # Use element_id in subsequent actions (no selector drift)
124
+ Step 2: pass the ID to any action.
125
+
126
+ ```bash
133
127
  agentmb click <session-id> e3 --element-id
134
- agentmb fill <session-id> e5 "hello" --element-id
128
+ agentmb fill <session-id> e5 "hello" --element-id
129
+ agentmb get <session-id> text e3 --element-id
130
+ agentmb assert <session-id> visible e3 --element-id
131
+ ```
132
+
133
+ The `label` field for each element is synthesized using a 7-level priority chain:
134
+
135
+ | Priority | Source | `label_source` value |
136
+ |---|---|---|
137
+ | 1 | `aria-label` attribute | `"aria-label"` |
138
+ | 2 | `title` attribute | `"title"` |
139
+ | 3 | `aria-labelledby` target text | `"aria-labelledby"` |
140
+ | 4 | SVG `<title>` / `<desc>` | `"svg-title"` |
141
+ | 5 | `innerText` (trimmed) | `"text"` |
142
+ | 6 | `placeholder` attribute | `"placeholder"` |
143
+ | 7 | Fallback (icon-only) | `"none"` / `"[tag @ x,y]"` |
144
+
145
+ Icon-only elements get `label_source="none"` by default; `--include-unlabeled` adds a `[tag @ x,y]` coordinate fallback.
135
146
 
136
- # Read element properties
137
- agentmb get <session-id> text --element-id e3
138
- agentmb get <session-id> value --element-id e5
139
- agentmb get <session-id> count .item-class
140
- agentmb get <session-id> attr "#logo" --attr-name src
147
+ Best for: selector drift, dynamic class names, and icon-heavy SPAs.
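The priority chain in the table above can be sketched in plain Python. This is only an illustration, not agentmb's implementation: the dictionary keys (`aria_label`, `inner_text`, and so on) and the function name `synthesize_label` are hypothetical, and it assumes the coordinate fallback fills `label` while `label_source` stays `"none"`. Only the `label_source` strings and the ordering come from the README.

```python
from typing import Optional, Tuple

# (element key, label_source value) in the README's priority order.
# The element keys are hypothetical names chosen for this sketch.
LABEL_PRIORITY = [
    ("aria_label", "aria-label"),
    ("title", "title"),
    ("aria_labelledby_text", "aria-labelledby"),
    ("svg_title", "svg-title"),
    ("inner_text", "text"),
    ("placeholder", "placeholder"),
]


def synthesize_label(element: dict, include_unlabeled: bool = False) -> Tuple[Optional[str], str]:
    """Return (label, label_source) for one scanned element."""
    for key, source in LABEL_PRIORITY:
        value = (element.get(key) or "").strip()
        if value:
            return value, source
    # Level 7: nothing usable was found (icon-only element).
    if include_unlabeled:
        # Assumption: the fallback label is "[tag @ x,y]" and label_source stays "none".
        return f"[{element['tag']} @ {element['x']},{element['y']}]", "none"
    return None, "none"


submit = {"tag": "button", "title": "Send", "inner_text": "Submit", "x": 10, "y": 20}
icon = {"tag": "button", "x": 320, "y": 48}

print(synthesize_label(submit))                        # title outranks innerText
print(synthesize_label(icon))                          # unlabeled by default
print(synthesize_label(icon, include_unlabeled=True))  # coordinate fallback
```

Because `title` sits above `innerText` in the chain, a button with both is labeled by its `title`.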
141
148
 
142
- # Assert element state
143
- agentmb assert <session-id> visible --element-id e3
144
- agentmb assert <session-id> enabled "#submit" --expected true
145
- agentmb assert <session-id> checked "#agree" --expected false
149
+ ### 3) Snapshot-Ref Mode (`snapshot-map` + `ref_id`)
146
150
 
147
- # Wait for page to be fully stable (network idle + DOM quiet + overlays gone)
148
- agentmb wait-stable <session-id> --timeout-ms 10000 --dom-stable-ms 300
149
- agentmb wait-stable <session-id> --overlay-selector "#loading-overlay"
151
+ Step 1: create a server-side snapshot.
152
+
153
+ ```bash
154
+ agentmb snapshot-map <session-id>
155
+ agentmb snapshot-map <session-id> --include-unlabeled
150
156
  ```
151
157
 
152
- ### Element Map (Python SDK)
158
+ Step 2: use the returned `ref_id` (`snap_XXXXXX:eN`) in API/SDK calls.
159
+
160
+ - `page_rev` is an integer counter returned with each snapshot; it increments on every main-frame navigation. Poll it directly to detect page changes without taking a full snapshot:
161
+
162
+ ```http
163
+ GET /api/v1/sessions/:id/page_rev
164
+ → { "status": "ok", "session_id": "...", "page_rev": 3, "url": "https://..." }
165
+ ```
153
166
 
154
167
  ```python
155
- with client.create_session(headless=True) as sess:
156
- sess.navigate("https://example.com")
168
+ rev = sess.page_rev() # PageRevResult with .page_rev, .url
169
+ ```
170
+
171
+ - If the page has navigated since the snapshot, using a stale `ref_id` returns `409 stale_ref` with a structured payload:
172
+
173
+ ```json
174
+ {
175
+   "error": "stale_ref: page changed",
176
+   "suggestions": ["call snapshot_map to get fresh ref_ids", "re-run your step with the new ref_id"]
177
+ }
178
+ ```
179
+
180
+ - Recovery: call `snapshot-map` again, retry with new `ref_id`.
181
+
182
+ Best for: deterministic replay and safe automation on changing pages.
183
+
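The recovery path for a stale `ref_id` (snapshot again, retry with the new `ref_id`) could be wrapped in a small retry helper. The sketch below runs offline against an in-memory fake. `StaleRefError`, `act_with_fresh_ref`, and `FakeSession` are hypothetical names, and how the real SDK surfaces the `409 stale_ref` response is an assumption.

```python
class StaleRefError(Exception):
    """Stand-in for however the client reports a `409 stale_ref` response."""


def act_with_fresh_ref(take_snapshot, pick_ref, act, max_attempts=3):
    """Snapshot, pick a ref_id, act; on a stale ref, take a new snapshot and retry."""
    for _ in range(max_attempts):
        snapshot = take_snapshot()
        ref_id = pick_ref(snapshot)
        try:
            return act(ref_id)
        except StaleRefError:
            continue  # the page changed after the snapshot; start over
    raise StaleRefError(f"ref still stale after {max_attempts} attempts")


class FakeSession:
    """In-memory fake: the page navigates once, right before the first click."""

    def __init__(self):
        self.page_rev = 1
        self.snapshot_count = 0
        self.rev_by_snapshot = {}
        self.navigate_before_first_click = True

    def snapshot_map(self):
        self.snapshot_count += 1
        snap_id = f"snap_{self.snapshot_count:06d}"
        self.rev_by_snapshot[snap_id] = self.page_rev
        return {"ref_ids": [f"{snap_id}:e3"]}

    def click(self, ref_id):
        if self.navigate_before_first_click:
            self.navigate_before_first_click = False
            self.page_rev += 1  # simulate a main-frame navigation
        snap_id = ref_id.split(":", 1)[0]
        if self.rev_by_snapshot[snap_id] != self.page_rev:
            raise StaleRefError("stale_ref: page changed")
        return {"status": "ok", "ref_id": ref_id}


fake = FakeSession()
result = act_with_fresh_ref(fake.snapshot_map, lambda snap: snap["ref_ids"][0], fake.click)
print(result, fake.snapshot_count)
```

Capping the attempts keeps a page that navigates continuously from turning the retry into an endless loop.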
184
+ ### Mode Selection Guide
185
+
186
+ | Page Type | Recommended Mode |
187
+ |---|---|
188
+ | Text-rich pages (docs, GitHub, HN) | `element-map` + `--element-id` |
189
+ | Icon/SVG-dense SPAs (social apps, dashboards) | CSS selector or `--include-unlabeled` |
190
+ | `contenteditable` / custom components | `eval getBoundingClientRect` + `click-at` |
191
+ | Image feeds (Unsplash, Pinterest) | `snapshot-map` (images have `alt` text) |
192
+
193
+ | Action | Approach |
194
+ |---|---|
195
+ | Search / navigation | Construct the URL directly |
196
+ | Click a labeled button | `element-map` eid or CSS selector |
197
+ | Click `contenteditable` | `click-at <sess> <x> <y>` (get coords via `bbox`) |
198
+ | Scroll SPA content area | Check `scrolled` + `scrollable_hint` in response; use `eval el.scrollBy()` if needed |
199
+ | File upload from disk | `upload <sess> <selector> <file>` (MIME inferred from extension) |
200
+ | File upload from URL | API: `POST /sessions/:id/upload_url` |
201
+ | Click JS-signed links | `click-at` to trigger a real click event |
202
+
203
+ ---
204
+
205
+ ## Action Reference
206
+
207
+ Use `agentmb --help` and `agentmb <command> --help` for full flags.
208
+
209
+ ### Navigation
210
+
211
+ | Command | Notes |
212
+ |---|---|
213
+ | `agentmb navigate <sess> <url>` | Navigate; `--wait-until load\|networkidle\|commit` |
214
+ | `agentmb back <sess>` / `forward <sess>` / `reload <sess>` | Browser history |
215
+ | `agentmb wait-url <sess> <pattern>` | Wait for URL match |
216
+ | `agentmb wait-load-state <sess>` | Wait for load state |
217
+ | `agentmb wait-function <sess> <expr>` | Wait for JS condition |
218
+ | `agentmb wait-text <sess> <text>` | Wait for text to appear |
219
+ | `agentmb wait-stable <sess>` | Network idle + DOM quiet + optional overlay clear |
220
+
221
+ ### Locator / Read / Assert
222
+
223
+ | Command | Notes |
224
+ |---|---|
225
+ | `agentmb element-map <sess>` | Scan; inject `element_id`; return `label` + `label_source` |
226
+ | `agentmb element-map <sess> --include-unlabeled` | Include icon-only elements; fallback label = `[tag @ x,y]` |
227
+ | `agentmb snapshot-map <sess>` | Server snapshot with `page_rev`; returns `ref_id` per element |
228
+ | `agentmb get <sess> <property> <selector-or-eid>` | Read `text/html/value/attr/count/box` |
229
+ | `agentmb assert <sess> <property> <selector-or-eid>` | Assert `visible/enabled/checked` |
230
+ | `agentmb extract <sess> <selector>` | Extract text/attributes as list |
231
+
232
+ `selector-or-eid` accepts a CSS selector, `--element-id` (element-map), or `--ref-id` (snapshot-map) on all commands.
233
+
234
+ ### Element Interaction
235
+
236
+ | Command | Notes |
237
+ |---|---|
238
+ | `agentmb click <sess> <selector-or-eid>` | Click; `contenteditable` supported; returns `422` with diagnostics + `recovery_hint` on failure |
239
+ | `agentmb dblclick <sess> <selector-or-eid>` | Double-click |
240
+ | `agentmb fill <sess> <selector-or-eid> <value>` | Fast fill (replaces value) |
241
+ | `agentmb type <sess> <selector-or-eid> <text>` | Type character by character; `--delay-ms <ms>` |
242
+ | `agentmb press <sess> <selector-or-eid> <key>` | Key / combo (`Enter`, `Tab`, `Control+a`) |
243
+ | `agentmb select <sess> <selector> <value...>` | Select `<option>` in `<select>` |
244
+ | `agentmb hover <sess> <selector-or-eid>` | Hover |
245
+ | `agentmb focus <sess> <selector-or-eid>` | Focus |
246
+ | `agentmb check <sess> <selector-or-eid>` / `uncheck` | Checkbox / radio |
247
+ | `agentmb drag <sess> <source> <target>` | Drag-and-drop; also accepts `--source-ref-id` / `--target-ref-id` |
248
+
249
+ **API/SDK — click advanced options:**
250
+
251
+ ```python
252
+ # executor: 'strict' (default) or 'auto_fallback'
253
+ # auto_fallback: tries Playwright click; if it times out due to overlay/intercept,
254
+ # falls back to page.mouse.click(center_x, center_y).
255
+ # When clicking inside an <iframe>, auto_fallback automatically adds the frame's
256
+ # page-level offset so coordinates land correctly.
257
+ # Response includes executed_via: 'high_level' | 'low_level'
258
+ sess.click(selector="#btn", executor="auto_fallback", timeout_ms=3000)
259
+
260
+ # stability: optional pre/post waits to handle animated UIs
261
+ sess.click(selector="#btn", stability={
262
+     "wait_before_ms": 200,     # pause before the action
263
+     "wait_after_ms": 100,      # pause after the action
264
+     "wait_dom_stable_ms": 500  # wait for DOM readyState before acting
265
+ })
266
+ ```
267
+
268
+ **API/SDK — fill humanization:**
269
+
270
+ ```python
271
+ # fill_strategy='type': types character-by-character (slower, more human-like)
272
+ # char_delay_ms: delay between keystrokes in ms (used with fill_strategy='type')
273
+ sess.fill(selector="#inp", value="hello", fill_strategy="type", char_delay_ms=30)
274
+ ```
275
+
276
+ ### Scroll and Feed
157
277
 
158
- # Scan page elements
159
- result = sess.element_map()
160
- for el in result.elements:
161
- print(el.element_id, el.tag, el.role, el.text)
278
+ | Command | Notes |
279
+ |---|---|
280
+ | `agentmb scroll <sess> <selector-or-eid>` | Scroll element; structured response (see below) |
281
+ | `agentmb scroll-into-view <sess> <selector-or-eid>` | Scroll element into viewport |
282
+ | `agentmb scroll-until <sess>` | Scroll until stop condition (`--stop-selector`, `--stop-text`, `--max-scrolls`) |
283
+ | `agentmb load-more-until <sess> <btn-selector> <item-selector>` | Repeatedly click load-more |
162
284
 
163
- # Click by element_id
164
- btn = next(e for e in result.elements if e.role == "button")
165
- sess.click(element_id=btn.element_id)
285
+ **`scroll` response fields:**
166
286
 
167
- # Read / assert
168
- text = sess.get("text", element_id=btn.element_id)
169
- check = sess.assert_state("visible", selector="#main", expected=True)
170
- print(check.passed)
287
+ ```json
288
+ {
289
+   "scrolled": true,
290
+   "warning": "element not scrollable — scrolled nearest scrollable ancestor",
291
+   "scrollable_hint": [
292
+     { "selector": "#feed", "tag": "div", "scrollHeight": 4200, "clientHeight": 600 },
293
+     ...
294
+   ]
295
+ }
296
+ ```
297
+
298
+ - `scrolled` — `true` if any scroll movement occurred
299
+ - `warning` — present when the target element itself is not scrollable and a fallback was used
300
+ - `scrollable_hint` — top-5 scrollable descendants ranked by `scrollHeight`; use these selectors in subsequent `scroll` calls when `scrolled=false`
301
+
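One way to act on these fields is a helper that picks the next selector to try when `scrolled` is `false`. This is a sketch over a plain dictionary shaped like the response above. The function name `next_scroll_target` is hypothetical, and preferring the candidate with the most hidden content is a choice made here, not documented behavior.

```python
def next_scroll_target(response: dict):
    """Return a selector to retry with, or None if no retry is needed or possible."""
    if response.get("scrolled"):
        return None  # the scroll already moved something
    hints = response.get("scrollable_hint") or []
    if not hints:
        return None
    # Prefer the candidate with the most hidden content (scrollHeight - clientHeight).
    best = max(hints, key=lambda h: h["scrollHeight"] - h["clientHeight"])
    return best["selector"]


response = {
    "scrolled": False,
    "scrollable_hint": [
        {"selector": "#sidebar", "tag": "div", "scrollHeight": 900, "clientHeight": 600},
        {"selector": "#feed", "tag": "div", "scrollHeight": 4200, "clientHeight": 600},
    ],
}

print(next_scroll_target(response))
print(next_scroll_target({"scrolled": True}))
```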
302
+ **`scroll_until` / `load_more_until` response** includes `session_id` for chaining:
303
+
304
+ ```json
305
+ { "status": "ok", "session_id": "sess_...", "scrolls": 12, "stop_reason": "stop_text_found" }
306
+ ```
307
+
308
+ **API/SDK — scroll_until with step_delay:**
309
+
310
+ ```python
311
+ # step_delay_ms: wait between each scroll step (default = stall_ms)
312
+ sess.scroll_until(scroll_selector="#feed", direction="down",
313
+                   stop_selector=".end", max_scrolls=20, step_delay_ms=150)
314
+ ```
315
+
316
+ ### Coordinate and Low-Level Input
317
+
318
+ | Command | Notes |
319
+ |---|---|
320
+ | `agentmb click-at <sess> <x> <y>` | Click absolute page coordinates |
321
+ | `agentmb wheel <sess> --dx --dy` | Low-level wheel event |
322
+ | `agentmb insert-text <sess> <text>` | Insert text into focused element (no keyboard simulation) |
323
+ | `agentmb bbox <sess> <selector-or-eid>` | Bounding box + center coordinates; accepts `--element-id` / `--ref-id` |
324
+ | `agentmb mouse-move <sess> [x] [y]` | Move mouse to absolute coordinates; or use `--selector`/`--element-id`/`--ref-id` to resolve element center |
325
+ | `agentmb mouse-down <sess>` / `mouse-up <sess>` | Mouse button press / release |
326
+ | `agentmb key-down <sess> <key>` / `key-up <sess> <key>` | Raw key press / release |
327
+
328
+ **API/SDK — smooth mouse movement:**
329
+
330
+ ```python
331
+ # Move by absolute coordinates with smooth interpolation
332
+ res = sess.mouse_move(x=400, y=300, steps=10)
333
+
334
+ # Move to an element center by selector / element_id / ref_id (x/y resolved server-side)
335
+ res = sess.mouse_move(selector="#submit-btn", steps=5)
336
+ res = sess.mouse_move(element_id="e3", steps=5)
337
+ res = sess.mouse_move(ref_id="snap_000001:e3")
171
338
 
172
- # Stability gate before next scan
173
- sess.wait_page_stable(timeout_ms=8000, overlay_selector="#spinner")
339
+ # Response includes x, y, steps fields
340
+ print(res.x, res.y, res.steps)
341
+ ```
342
+
343
+ CLI equivalents:
344
+ ```bash
345
+ agentmb mouse-move <sess> 400 300 --steps 10
346
+ agentmb mouse-move <sess> --selector "#btn" --steps 5
347
+ agentmb mouse-move <sess> --element-id e3
348
+ agentmb mouse-move <sess> --ref-id snap_000001:e3
174
349
  ```
175
350
 
351
+ ### Semantic Find (API / SDK)
352
+
353
+ Locate elements by Playwright semantic locators without knowing CSS selectors.
354
+
355
+ ```python
356
+ # query_type: 'role' | 'text' | 'label' | 'placeholder' | 'alt_text'
357
+ # Returns: found (bool), count, tag, text, bbox, nth
358
+ res = sess.find(query_type="role", query="button", name="Submit")
359
+ res = sess.find(query_type="text", query="Sign in", exact=True)
360
+ res = sess.find(query_type="placeholder", query="Search…")
361
+ res = sess.find(query_type="label", query="Email address")
362
+ res = sess.find(query_type="alt_text", query="Product photo", nth=2)
363
+ ```
364
+
365
+ | `query_type` | Playwright call |
366
+ |---|---|
367
+ | `role` | `page.getByRole(query, { name, exact })` |
368
+ | `text` | `page.getByText(query, { exact })` |
369
+ | `label` | `page.getByLabel(query, { exact })` |
370
+ | `placeholder` | `page.getByPlaceholder(query, { exact })` |
371
+ | `alt_text` | `page.getByAltText(query, { exact })` |
372
+
373
+ Returns `FindResult` with `found`, `count`, `nth`, `tag`, `text`, `bbox`.
374
+
375
+ ### Batch Execution — run_steps (API / SDK)
376
+
377
+ Execute a sequence of actions in a single request. Supports `stop_on_error`.
378
+
379
+ Each step's `params` accepts `selector`, `element_id`, or `ref_id` interchangeably for element targeting:
380
+
381
+ ```python
382
+ # First, take a snapshot to get ref_ids
383
+ snap = sess.snapshot_map()
384
+ btn_ref = next(e.ref_id for e in snap.elements if "Login" in (e.label or ""))
385
+
386
+ result = sess.run_steps([
387
+     {"action": "navigate", "params": {"url": "https://example.com"}},
388
+     {"action": "click", "params": {"ref_id": btn_ref}}, # ref_id from snapshot
389
+     {"action": "fill", "params": {"element_id": "e5", "value": "user@example.com"}}, # element_id
390
+     {"action": "fill", "params": {"selector": "#pass", "value": "secret"}}, # CSS selector
391
+     {"action": "press", "params": {"selector": "#pass", "key": "Enter"}},
392
+     {"action": "wait_for_selector", "params": {"selector": ".dashboard"}},
393
+     {"action": "screenshot", "params": {"format": "png"}},
394
+ ], stop_on_error=True)
395
+
396
+ print(result.status) # 'ok' | 'partial' | 'failed'
397
+ print(result.completed_steps) # number of steps that succeeded
398
+ for step in result.results:
399
+     print(step.step, step.action, step.error)
400
+ ```
401
+
402
+ - A stale `ref_id` (page navigated since snapshot) returns a step-level error, not a request crash. Use `stop_on_error=False` to continue remaining steps.
403
+ - Supported actions: `navigate`, `click`, `fill`, `type`, `press`, `hover`, `scroll`, `wait_for_selector`, `wait_text`, `screenshot`, `eval`. Max 100 steps per call.
404
+
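Since only the listed actions are supported and a call is capped at 100 steps, a batch can be checked locally before it is sent. The sketch below is a hypothetical client-side pre-check (`validate_steps` is not part of the SDK); the action names and the step limit are taken from the list above.

```python
SUPPORTED_ACTIONS = {
    "navigate", "click", "fill", "type", "press", "hover", "scroll",
    "wait_for_selector", "wait_text", "screenshot", "eval",
}
MAX_STEPS = 100


def validate_steps(steps):
    """Return a list of problems; an empty list means the batch passes these checks."""
    problems = []
    if len(steps) > MAX_STEPS:
        problems.append(f"{len(steps)} steps exceeds the limit of {MAX_STEPS}")
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in SUPPORTED_ACTIONS:
            problems.append(f"step {index}: unsupported action {action!r}")
        if not isinstance(step.get("params"), dict):
            problems.append(f"step {index}: params must be a dict")
    return problems


good = [
    {"action": "navigate", "params": {"url": "https://example.com"}},
    {"action": "click", "params": {"selector": "#login"}},
]
bad = [{"action": "drag", "params": {}}, {"action": "click"}]

print(validate_steps(good))
print(validate_steps(bad))
```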
405
+ ### File Transfer
406
+
407
+ | Command | Notes |
408
+ |---|---|
409
+ | `agentmb upload <sess> <selector> <file>` | Upload file from disk; MIME auto-inferred from extension (`--mime-type` to override) |
410
+ | `agentmb download <sess> <selector-or-eid> -o <file>` | Trigger download; accepts `--element-id` / `--ref-id`; requires `--accept-downloads` on session |
411
+
412
+ **download guard**: sessions created without `accept_downloads=True` return `422 download_not_enabled`:
413
+
414
+ ```python
415
+ # Correct — enable at session creation time
416
+ sess = client.sessions.create(accept_downloads=True)
417
+ sess.download(selector="#dl-link", output_path="./file.pdf")
418
+
419
+ # download also accepts element_id / ref_id
420
+ sess.download(element_id="e7", output_path="./file.pdf")
421
+ sess.download(ref_id="snap_000001:e7", output_path="./file.pdf")
422
+ ```
423
+
424
+ ```bash
425
+ agentmb session new --accept-downloads
426
+ agentmb download <sess> "#dl-link" -o file.pdf
427
+ agentmb download <sess> e7 --element-id -o file.pdf
428
+ ```
429
+
430
+ **API/SDK — upload from URL:**
431
+
432
+ ```python
433
+ # Fetches the URL server-side (Node fetch), writes to temp file, uploads to file input.
434
+ res = sess.upload_url(
435
+     url="https://example.com/assets/photo.jpg",
436
+     selector="#file-input",
437
+     filename="photo.jpg", # optional; defaults to last URL path segment
438
+     mime_type="image/jpeg", # optional; defaults to application/octet-stream
439
+ )
440
+ # res.size_bytes, res.fetched_bytes, res.filename
441
+ ```
442
+
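Because `mime_type` falls back to `application/octet-stream` when omitted, a caller may want to guess a better value from the file name first. The helper below is a hypothetical client-side sketch using Python's standard `mimetypes` module. It mirrors the documented filename default (last URL path segment) but is not how the daemon itself resolves these values.

```python
import mimetypes
import posixpath
from urllib.parse import urlparse


def upload_defaults(url, filename=None, mime_type=None):
    """Fill in filename and mime_type the caller did not provide."""
    if filename is None:
        # Last path segment of the URL; the query string is not part of the path.
        filename = posixpath.basename(urlparse(url).path)
    if mime_type is None:
        guessed, _encoding = mimetypes.guess_type(filename)
        mime_type = guessed or "application/octet-stream"
    return filename, mime_type


print(upload_defaults("https://example.com/assets/photo.jpg?size=large"))
print(upload_defaults("https://example.com/download/data"))
```

The returned pair can then be passed as `filename=` and `mime_type=` to `sess.upload_url(...)`.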
443
+ ### Session State (Cookie / Storage)
444
+
445
+ | Command | Notes |
446
+ |---|---|
447
+ | `agentmb cookie-list <sess>` | List all cookies |
448
+ | `agentmb cookie-clear <sess>` | Clear all cookies |
449
+ | `agentmb storage-export <sess> -o state.json` | Export Playwright storageState (cookies + origins) |
450
+ | `agentmb storage-import <sess> state.json` | Restore cookies from storageState; `origins_skipped` count returned |
451
+
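Before importing an exported state file, it can help to see what it contains. The sketch below summarizes a Playwright `storageState` document, which has a `cookies` list and an `origins` list of per-origin `localStorage` entries. The function name and the sample data are made up for illustration. Since `storage-import` restores cookies only, the `origins` entries are presumably what `origins_skipped` counts, but that mapping is an assumption.

```python
import json


def summarize_storage_state(state: dict) -> dict:
    """Count cookies and list origins in a Playwright storageState dict."""
    cookies = state.get("cookies", [])
    origins = state.get("origins", [])
    return {
        "cookies": len(cookies),
        "cookie_domains": sorted({c["domain"] for c in cookies}),
        "origins": [o["origin"] for o in origins],
    }


# Sample data in storageState shape (values are illustrative).
example = json.loads("""
{
  "cookies": [
    {"name": "sid", "value": "abc", "domain": ".example.com", "path": "/"},
    {"name": "theme", "value": "dark", "domain": "app.example.com", "path": "/"}
  ],
  "origins": [
    {"origin": "https://app.example.com",
     "localStorage": [{"name": "token", "value": "xyz"}]}
  ]
}
""")

summary = summarize_storage_state(example)
print(summary)
```

For a real export, load the file written by `storage-export` with `json.load` and pass the result to the same function.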
452
+ **API/SDK — delete cookie by name:**
453
+
454
+ ```python
455
+ # Removes matching cookies, preserves the rest. domain is optional filter.
456
+ res = sess.delete_cookie("session_token")
457
+ res = sess.delete_cookie("tracker", domain=".example.com")
458
+ # res.removed, res.remaining
459
+ ```
460
+
461
+ ### Observability and Debug
462
+
463
+ | Command | Notes |
464
+ |---|---|
465
+ | `agentmb screenshot <sess> -o out.png` | Screenshot; `--full-page`, `--format png\|jpeg` |
466
+ | `agentmb annotated-screenshot <sess> --highlight <sel>` | Screenshot with colored element overlays |
467
+ | `agentmb eval <sess> <expr>` | Evaluate JavaScript; returns raw result |
468
+ | `agentmb console-log <sess>` | Browser console entries; `--tail N` |
469
+ | `agentmb page-errors <sess>` | Uncaught JS errors from the page |
470
+ | `agentmb dialogs <sess>` | Auto-dismissed dialog history (alert/confirm/prompt) |
471
+ | `agentmb logs <sess>` | Session audit log tail (all actions, policy events, CDP calls) |
472
+ | `agentmb trace start <sess>` / `trace stop <sess> -o trace.zip` | Playwright trace capture |
473
+
474
+ ### Browser Environment and Controls
475
+
476
+ | Command | Notes |
477
+ |---|---|
478
+ | `agentmb set-viewport <sess> <w> <h>` | Resize viewport |
479
+ | `agentmb clipboard-write <sess> <text>` / `clipboard-read <sess>` | Clipboard access |
480
+ | `agentmb policy <sess> [profile]` | Get or set safety policy profile |
481
+ | `agentmb cdp-ws <sess>` | Print browser-level CDP WebSocket URL |
482
+
483
+ **API/SDK — browser settings:**
484
+
485
+ ```python
486
+ # Returns viewport, user_agent, url, headless, profile for a session.
487
+ settings = sess.get_settings()
488
+ print(settings.viewport, settings.user_agent, settings.headless)
489
+ ```
490
+
491
+ ---
492
+
176
493
  ## Multi-Page Management
177
494
 
178
495
  ```bash
179
- agentmb pages list <session-id> # list all open tabs
180
- agentmb pages new <session-id> # open a new tab
496
+ agentmb pages list <session-id> # list all open tabs
497
+ agentmb pages new <session-id> # open a new tab
181
498
  agentmb pages switch <session-id> <page-id> # make a tab the active target
182
499
  agentmb pages close <session-id> <page-id> # close a tab (last tab protected)
183
500
  ```
184
501
 
502
+ ---
503
+
185
504
  ## Network Route Mocks
186
505
 
187
506
  ```bash
188
- agentmb route list <session-id> # list active mocks
507
+ agentmb route list <session-id>
189
508
  agentmb route add <session-id> "**/api/**" \
190
509
  --status 200 --body '{"ok":true}' \
191
- --content-type application/json # intercept requests
192
- agentmb route rm <session-id> "**/api/**" # remove a mock
510
+ --content-type application/json
511
+ agentmb route rm <session-id> "**/api/**"
193
512
  ```
194
513
 
195
- ## Playwright Trace Recording
514
+ Route mocks are applied at context level, so they persist across page navigations within the same session.
515
+
516
+ ---
517
+
518
+ ## Three Browser Running Modes
519
+
520
+ agentmb supports three distinct browser modes, differing in **which browser binary is used and how it is connected**.
521
+
522
+ | Mode | Browser | How Connected | Profile Persistence |
523
+ |---|---|---|---|
524
+ | **1. Managed Chromium** | Playwright bundled Chromium | agentmb spawns & owns | Persistent or ephemeral |
525
+ | **2. Managed Chrome Stable** | System Chrome / Edge | agentmb spawns & owns | Persistent or ephemeral |
526
+ | **3. CDP Attach** (Bold Mode) | Any running Chrome-compatible | agentmb attaches via CDP | Owned by external process |
527
+
528
+ ```
529
+ ┌─────────────────────────────────────────────────────────┐
530
+ │                     agentmb daemon                      │
531
+ │  REST API  POST /api/v1/sessions  (+ preflight check)   │
532
+ └───────────┬──────────────────┬──────────────┬───────────┘
533
+             │                  │              │
534
+    launchPersistent()  launchPersistent()  connectOverCDP()
535
+    (bundled Chromium) (system Chrome/Edge) (external process)
536
+             │                  │              │
537
+ ┌────────────▼────┐ ┌──────────▼────┐ ┌────▼──────────────┐
538
+ │ Mode 1          │ │ Mode 2        │ │ Mode 3            │
539
+ │ Managed         │ │ Managed       │ │ CDP Attach        │
540
+ │ Chromium        │ │ Chrome Stable │ │ (Bold Mode)       │
541
+ │                 │ │ / Edge        │ │ launch_mode=      │
542
+ │ profile=name    │ │ browser_      │ │ attach            │
543
+ │ or ephemeral=T  │ │ channel=chrome│ │                   │
544
+ └─────────────────┘ └───────────────┘ └───────────────────┘
545
+ ```
546
+
547
+ ### Mode 1: Managed Chromium (default)
548
+
549
+ agentmb spawns the **Playwright-bundled Chromium** binary. No system Chrome required. Works in headless (CI) and headed modes.
550
+
551
+ Within managed modes, choose a **profile strategy**:
552
+
553
+ **Agent Workspace** — named profile; cookies, localStorage, and browser state persist across runs:
554
+
555
+ ```python
556
+ sess = client.sessions.create(profile="gmail-account")
557
+ ```
196
558
 
197
559
  ```bash
198
- agentmb trace start <session-id> # start recording
199
- # ... do actions ...
200
- agentmb trace stop <session-id> -o trace.zip # save ZIP
201
- npx playwright show-trace trace.zip # open in Playwright UI
560
+ agentmb session new --profile gmail-account
202
561
  ```
203
562
 
204
- ## CDP WebSocket URL
563
+ **Pure Sandbox** uses an ephemeral temp directory; all data is auto-deleted on `close()`:
564
+
565
+ ```python
566
+ sess = client.sessions.create(ephemeral=True)
567
+ ```
205
568
 
206
569
  ```bash
207
- agentmb cdp-ws <session-id> # print browser CDP WebSocket URL
570
+ agentmb session new --ephemeral
208
571
  ```
209
572
 
210
- ## Linux Headed Mode
573
+ ### Mode 2: Managed Chrome Stable
211
574
 
212
- Linux visual/headed mode requires Xvfb.
575
+ agentmb spawns a **system-installed Chrome or Edge** binary via Playwright. Requires Chrome Stable or Edge to be installed on the host. Both Agent Workspace and Pure Sandbox profile strategies apply.
576
+
577
+ ```python
578
+ sess = client.sessions.create(browser_channel="chrome") # system Chrome Stable
579
+ sess = client.sessions.create(browser_channel="msedge") # system Edge
580
+ sess = client.sessions.create(executable_path="/path/to/chrome") # custom binary path
581
+ ```
213
582
 
214
583
  ```bash
215
- sudo apt-get install -y xvfb
216
- bash scripts/xvfb-headed.sh
584
+ agentmb session new --browser-channel chrome
585
+ agentmb session new --browser-channel msedge
586
+ agentmb session new --executable-path /usr/bin/chromium-browser
217
587
  ```
218
588
 
219
- ## Verify
589
+ Valid `browser_channel` values: `chromium` (Playwright bundled, default), `chrome` (system Chrome Stable), `msedge`. `browser_channel` and `executable_path` are mutually exclusive.
590
+
591
+ ### Mode 3: CDP Attach (Bold Mode)
592
+
593
+ agentmb **attaches to an already-running Chrome** process via the Chrome DevTools Protocol. The remote browser is **not terminated** on `close()`; only the Playwright connection is dropped. This mode exposes a lower `navigator.webdriver` fingerprint than the managed modes and supports extensions.
594
+
595
+ Three profile variants are available, depending on which `--user-data-dir` Chrome is launched with:
596
+
597
+ | Variant | `--user-data-dir` | State | Typical Use |
598
+ |---|---|---|---|
599
+ | **A. Sandbox** | temp dir (auto) | ephemeral | clean-slate CI runs, throwaway sessions |
600
+ | **B. Dedicated Profile** | custom persistent dir | persistent, isolated | automation account, persistent login |
601
+ | **C. User Chrome** | your real Chrome profile | inherits all cookies & extensions | leverage personal login state |
602
+
603
+ #### Variant A: Sandbox (ephemeral temp dir)
604
+
605
+ `agentmb browser-launch` creates a fresh temp profile automatically. Clean slate — no cookies, no extensions.
220
606
 
221
607
  ```bash
222
- bash scripts/verify.sh
608
+ agentmb browser-launch --port 9222
609
+ # → launches Chrome with --user-data-dir=/tmp/agentmb-cdp-9222 (temp, ephemeral)
610
+ # → CDP URL: http://127.0.0.1:9222
223
611
  ```
224
612
 
225
- ## npm Release Setup
613
+ ```python
614
+ sess = client.sessions.create(launch_mode="attach", cdp_url="http://127.0.0.1:9222")
615
+ sess.navigate("https://example.com")
616
+ sess.close() # disconnects only — Chrome stays alive
617
+ ```
618
+
619
+ #### Variant B: Dedicated Profile (isolated persistent profile)
620
+
621
+ Pass a fixed `--user-data-dir` to Chrome. State (cookies, localStorage) persists across restarts. Completely isolated from your personal Chrome.
226
622
 
227
623
  ```bash
228
- # login once
229
- npm login
230
- npm whoami
624
+ # macOS (on Linux, replace the binary path, e.g. with google-chrome)
625
+ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
626
+ --remote-debugging-port=9222 \
627
+ --user-data-dir="$HOME/.agentmb-profiles/my-automation-profile" \
628
+ --no-first-run --no-default-browser-check
629
+
630
+ # Windows
631
+ "C:\Program Files\Google\Chrome\Application\chrome.exe" ^
632
+ --remote-debugging-port=9222 ^
633
+ --user-data-dir="%APPDATA%\agentmb-profiles\my-automation-profile"
634
+ ```
635
+
636
+ ```python
637
+ sess = client.sessions.create(launch_mode="attach", cdp_url="http://127.0.0.1:9222")
638
+ ```
639
+
640
+ #### Variant C: User Chrome (reuse your real Chrome profile)
231
641
 
232
- # check package payload before publish
233
- npm run pack:check
642
+ Point Chrome at your existing user profile to inherit all logged-in sessions, saved passwords, and installed extensions. **Chrome must not already be running with that profile** when you launch with remote debugging.
234
643
 
235
- # publish from repo root
236
- npm publish
644
+ ```bash
645
+ # macOS — close Chrome first, then:
646
+ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
647
+ --remote-debugging-port=9222 \
648
+ --user-data-dir="$HOME/Library/Application Support/Google/Chrome"
649
+
650
+ # Linux
651
+ google-chrome --remote-debugging-port=9222 \
652
+ --user-data-dir="$HOME/.config/google-chrome"
653
+
654
+ # Windows
655
+ "C:\Program Files\Google\Chrome\Application\chrome.exe" ^
656
+ --remote-debugging-port=9222 ^
657
+ --user-data-dir="%LOCALAPPDATA%\Google\Chrome\User Data"
237
658
  ```
238
659
 
239
- If your global npm cache has permission issues, this repo uses project-local cache (`.npm-cache`) via `.npmrc`.
660
+ ```python
661
+ sess = client.sessions.create(launch_mode="attach", cdp_url="http://127.0.0.1:9222")
662
+ # → all cookies, extensions, and login state from your personal Chrome are available
663
+ ```
240
664
 
241
- ## Environment Variables
665
+ **Warning**: actions performed via agentmb will affect your real Chrome profile (cookies written, history created, etc.). Use Variant B when in doubt.
666
+
667
+ ---
668
+
669
+ Attach a session (all variants):
242
670
 
243
- Common runtime env vars:
671
+ ```bash
672
+ agentmb session new --launch-mode attach --cdp-url http://127.0.0.1:9222
673
+ ```
244
674
 
245
- - `AGENTMB_PORT` (default `19315`)
246
- - `AGENTMB_DATA_DIR` (default `~/.agentmb`)
247
- - `AGENTMB_API_TOKEN` (optional API auth)
248
- - `AGENTMB_ENCRYPTION_KEY` (optional AES-256-GCM profile encryption key, 32 bytes as base64 or hex)
249
- - `AGENTMB_LOG_LEVEL` (default `info`)
250
- - `AGENTMB_POLICY_PROFILE` (default `safe`) — daemon-wide default safety policy profile
675
+ **Note**: `launch_mode=attach` is incompatible with `browser_channel` and `executable_path` (preflight returns `400`). CDP attach gives agentmb control over **all tabs** in the connected browser.
676
+
677
+ ### Session Seal
678
+
679
+ Mark a session as sealed to prevent accidental deletion:
680
+
681
+ ```python
682
+ sess.seal()
683
+ # Now sess.close() / DELETE returns 423 session_sealed
684
+ ```
685
+
686
+ ```bash
687
+ agentmb session seal <session-id>
688
+ agentmb session rm <session-id> # → error: session is sealed
689
+ ```
690
+
691
+ ### Preflight Validation
692
+
693
+ The `POST /api/v1/sessions` endpoint validates parameters before launching and returns `400 preflight_failed` for:
694
+ - `browser_channel` + `executable_path` used together (mutually exclusive)
695
+ - `browser_channel` not in `['chromium', 'chrome', 'msedge']`
696
+ - `launch_mode=attach` without `cdp_url`
697
+ - `cdp_url` with invalid URL format
698
+ - `launch_mode=attach` combined with `browser_channel` or `executable_path`
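
The rules listed above can be mirrored client-side before a request is sent. The following is a minimal sketch, not the daemon's implementation: `preflight_errors` is a hypothetical helper, the error strings are invented for illustration, and the set of accepted `cdp_url` schemes is an assumption.

```python
from urllib.parse import urlparse

VALID_CHANNELS = {"chromium", "chrome", "msedge"}
# Assumption: the daemon's notion of a "valid URL" is not documented here.
ASSUMED_CDP_SCHEMES = {"http", "https", "ws", "wss"}


def preflight_errors(params: dict) -> list:
    """Return a list of human-readable rule violations (empty if none)."""
    errors = []
    channel = params.get("browser_channel")
    exe = params.get("executable_path")
    cdp_url = params.get("cdp_url")
    attach = params.get("launch_mode") == "attach"

    if channel is not None and exe is not None:
        errors.append("browser_channel and executable_path are mutually exclusive")
    if channel is not None and channel not in VALID_CHANNELS:
        errors.append("browser_channel must be one of chromium, chrome, msedge")
    if attach and not cdp_url:
        errors.append("launch_mode=attach requires cdp_url")
    if cdp_url:
        parsed = urlparse(cdp_url)
        if parsed.scheme not in ASSUMED_CDP_SCHEMES or not parsed.netloc:
            errors.append("cdp_url is not a valid URL")
    if attach and (channel is not None or exe is not None):
        errors.append("launch_mode=attach cannot be combined with browser_channel or executable_path")
    return errors


if __name__ == "__main__":
    print(preflight_errors({"launch_mode": "attach"}))
```

Checking locally only saves a round trip; the daemon's own `400 preflight_failed` response remains the authority.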
699
+
700
+ ---
701
+
702
+ ## CDP Access
703
+
704
+ agentmb uses Chromium stable as the browser engine. The protocol exposed is the full **Chrome DevTools Protocol (CDP)** as implemented in Chromium/Chrome. Three distinct access modes are provided.
705
+
706
+ ### 1. CDP Command Passthrough (REST)
707
+
708
+ Send any DevTools Protocol method to the session's CDP session.
709
+
710
+ ```http
711
+ GET /api/v1/sessions/:id/cdp → session CDP info
712
+ POST /api/v1/sessions/:id/cdp
713
+ {"method": "Page.captureScreenshot", "params": {"format": "png"}}
714
+ ```
715
+
716
+ All CDP calls are written to the session audit log (`type="cdp"`, `method`, `session_id`, `purpose`, `operator`). Error responses are sanitized (stack frames and internal paths stripped before logging).
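
For callers not using the SDK, the POST above can be assembled with the Python standard library. This is a sketch under assumptions: `build_cdp_request` is a hypothetical helper, and the base URL assumes a local daemon on the default port `19315`. The request is only built here, not sent.

```python
import json
import urllib.request


def build_cdp_request(session_id, method, params=None,
                      base_url="http://127.0.0.1:19315", token=None):
    """Build (but do not send) a CDP passthrough request."""
    body = json.dumps({"method": method, "params": params or {}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Same token header as the other REST endpoints.
        headers["X-API-Token"] = token
    url = f"{base_url}/api/v1/sessions/{session_id}/cdp"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_cdp_request("s_abc123", "Page.captureScreenshot", {"format": "png"})
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running daemon.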
717
+
718
+ ### 2. CDP WebSocket Passthrough
719
+
720
+ Returns the browser-level `ws://` endpoint. Connect Puppeteer, Chrome DevTools, or any CDP client directly.
721
+
722
+ ```bash
723
+ agentmb cdp-ws <session-id>
724
+ # → ws://127.0.0.1:NNNN/devtools/browser/...
725
+ ```
726
+
727
+ ```python
728
+ ws_url = sess.cdp_ws_url()
729
+ # connect with puppeteer, pyppeteer, or raw websocket
730
+ ```
731
+
732
+ Note: The WebSocket URL addresses the whole browser process, not a single page. It is only available when the daemon uses a non-persistent browser launch. When auth is enabled, it requires the same `X-API-Token` as the REST endpoints.
733
+
734
+ ### 3. CDP Network Emulation
735
+
736
+ Apply network throttling or offline mode via an internal CDP session attached per-session. Does not require external CDP tooling.
737
+
738
+ ```bash
739
+ agentmb set-network <session-id> \
740
+ --latency-ms 200 \
741
+ --download-kbps 512 \
742
+ --upload-kbps 256
743
+
744
+ agentmb set-network <session-id> --offline # full offline mode
745
+ agentmb reset-network <session-id> # restore normal conditions
746
+ ```
747
+
748
+ ```python
749
+ sess.network_conditions(offline=False, latency_ms=200,
750
+ download_kbps=512, upload_kbps=256)
751
+ ```
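
For orientation, the CDP method `Network.emulateNetworkConditions` takes latency in milliseconds and throughput in bytes per second, with `-1` meaning no throttling. The sketch below shows one plausible translation from the CLI-style values. It assumes `kbps` means kilobits per second (1000 bits), which this README does not confirm, and `to_cdp_network_params` is a hypothetical name, not part of agentmb.

```python
def to_cdp_network_params(latency_ms=0, download_kbps=None,
                          upload_kbps=None, offline=False):
    """Map kbps-style settings to Network.emulateNetworkConditions params."""
    def to_bytes_per_sec(kbps):
        # -1 disables throttling for that direction in CDP.
        return -1 if kbps is None else kbps * 1000 // 8

    return {
        "offline": offline,
        "latency": latency_ms,
        "downloadThroughput": to_bytes_per_sec(download_kbps),
        "uploadThroughput": to_bytes_per_sec(upload_kbps),
    }


if __name__ == "__main__":
    print(to_cdp_network_params(latency_ms=200, download_kbps=512, upload_kbps=256))
```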
752
+
753
+ ---
754
+
755
+ ## Profile Management (API / SDK)
756
+
757
+ Profiles persist cookies, localStorage, and browser state between sessions.
758
+
759
+ ```python
760
+ # List all profiles on disk
761
+ result = client.list_profiles()
762
+ for p in result.profiles:
763
+ print(p.name, p.path, p.last_used)
764
+
765
+ # Reset a profile (wipes data dir and recreates empty directory)
766
+ # Returns 409 if a live session is currently using the profile.
767
+ result = client.reset_profile("demo")
768
+ # result.status == "ok"
769
+ ```
770
+
771
+ REST:
772
+ ```
773
+ GET /api/v1/profiles → ProfileListResult
774
+ POST /api/v1/profiles/:name/reset → ProfileResetResult
775
+ ```
776
+
777
+ Profile directories are stored under `AGENTMB_DATA_DIR/profiles/<name>/`.
778
+
779
+ ---
251
780
 
252
781
  ## Safety Execution Policy
253
782
 
254
- agentmb enforces a configurable **safety execution policy** that throttles actions, enforces per-domain rate limits, and blocks sensitive actions (e.g. form submissions, file uploads) unless explicitly permitted.
783
+ Rate limiting and action guardrails are enforced per session and per domain.
255
784
 
256
785
  ### Profiles
257
786
 
258
787
  | Profile | Min interval | Jitter | Max actions/min | Sensitive actions |
259
788
  |---|---|---|---|---|
260
- | `safe` | 1500 ms | 300–800 ms | 8 | blocked by default |
789
+ | `safe` | 1500 ms | 300–800 ms | 8 | blocked (HTTP 403) |
261
790
  | `permissive` | 200 ms | 0–100 ms | 60 | allowed |
262
791
  | `disabled` | 0 ms | 0 ms | unlimited | allowed |
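
As a rough illustration of how the table's numbers might interact, the sketch below computes a per-action delay as the minimum interval plus a random jitter. How agentmb actually combines these values is not stated here, so the additive formula is an assumption, and `PROFILES` and `next_delay_ms` are hypothetical names.

```python
import random

# Values copied from the profile table; the dict layout is illustrative only.
PROFILES = {
    "safe": {"min_interval_ms": 1500, "jitter_ms": (300, 800)},
    "permissive": {"min_interval_ms": 200, "jitter_ms": (0, 100)},
    "disabled": {"min_interval_ms": 0, "jitter_ms": (0, 0)},
}


def next_delay_ms(profile, rng):
    """Return min interval plus a uniformly drawn jitter (assumed formula)."""
    cfg = PROFILES[profile]
    low, high = cfg["jitter_ms"]
    return cfg["min_interval_ms"] + rng.randint(low, high)


if __name__ == "__main__":
    rng = random.Random()
    for name in PROFILES:
        print(name, next_delay_ms(name, rng), "ms")
```

Under this assumption a `safe` session waits between 1800 and 2300 ms before each action.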
263
792
 
264
- Set the daemon-wide default via env var:
793
+ Set daemon-wide default via environment variable:
794
+
265
795
  ```bash
266
796
  AGENTMB_POLICY_PROFILE=disabled node dist/daemon/index.js # CI / trusted automation
267
- AGENTMB_POLICY_PROFILE=safe node dist/daemon/index.js # social-media / sensitive workflows
797
+ AGENTMB_POLICY_PROFILE=safe node dist/daemon/index.js # untrusted / social-media flows
268
798
  ```
269
799
 
270
- ### Per-session override (CLI)
800
+ ### Per-session override
271
801
 
272
802
  ```bash
273
- agentmb policy <session-id> # get current policy
274
- agentmb policy <session-id> safe # switch to safe profile
275
- agentmb policy <session-id> permissive # switch to permissive
803
+ agentmb policy <session-id> # get current profile
804
+ agentmb policy <session-id> safe # switch to safe
805
+ agentmb policy <session-id> permissive # switch to permissive
276
806
  agentmb policy <session-id> safe --allow-sensitive # safe + allow sensitive actions
277
807
  ```
278
808
 
279
- ### Per-session override (Python SDK)
280
-
281
809
  ```python
282
- from agentmb import BrowserClient
283
-
284
- with BrowserClient() as client:
285
- sess = client.sessions.create()
286
- policy = sess.set_policy("safe", allow_sensitive_actions=False)
287
- print(policy.max_retries_per_domain) # 3
288
- current = sess.get_policy()
810
+ sess.set_policy("safe", allow_sensitive_actions=False)
811
+ info = sess.get_policy() # → PolicyInfo
289
812
  ```
290
813
 
291
- ### Audit logs
814
+ ### Audit log (policy events)
292
815
 
293
816
  All policy events (`throttle`, `jitter`, `cooldown`, `deny`, `retry`) are written to the session audit log with `type="policy"`.
294
817
 
@@ -296,14 +819,156 @@ All policy events (`throttle`, `jitter`, `cooldown`, `deny`, `retry`) are writte
296
819
  agentmb logs <session-id> # shows policy events inline
297
820
  ```
298
821
 
299
- ### Sensitive actions
822
+ ### Sensitive action guard
300
823
 
301
- Mark any action as sensitive by passing `"sensitive": true` in the request body. With `safe` profile and `allow_sensitive_actions=false`, the request returns HTTP 403:
824
+ Pass `"sensitive": true` in any request body to mark the action as sensitive. With the `safe` profile and `allow_sensitive_actions=false`, the request is rejected with:
302
825
 
303
826
  ```json
304
827
  { "error": "sensitive action blocked by policy", "policy_event": "deny" }
305
828
  ```
306
829
 
830
+ HTTP status: `403`.
831
+
832
+ ---
833
+
834
+ ## Security
835
+
836
+ ### API Token Authentication
837
+
838
+ All endpoints require `X-API-Token` or `Authorization: Bearer <token>` when `AGENTMB_API_TOKEN` is set.
839
+
840
+ ```bash
841
+ export AGENTMB_API_TOKEN="my-secret-token"
842
+ ```
843
+
844
+ Requests without a valid token return `401 Unauthorized`. CDP REST and WebSocket endpoints are subject to the same token check.
845
+
846
+ ### Profile Encryption
847
+
848
+ Browser profiles (cookies, storage) are encrypted at rest using AES-256-GCM when `AGENTMB_ENCRYPTION_KEY` is set.
849
+
850
+ ```bash
851
+ # 32-byte key, base64 or hex encoded
852
+ export AGENTMB_ENCRYPTION_KEY="$(openssl rand -base64 32)"
853
+ ```
854
+
855
+ Profiles written without a key cannot be read with one and vice versa.
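
A key of the right shape can also be generated and checked from Python. This is a sketch only: `decode_key` is a hypothetical helper, and trying hex before base64 is an assumption about how an ambiguous value could be resolved, not a description of what the daemon does.

```python
import base64
import binascii
import secrets


def decode_key(value):
    """Decode a hex- or base64-encoded key and require exactly 32 bytes."""
    try:
        raw = bytes.fromhex(value)
    except ValueError:
        try:
            raw = base64.b64decode(value, validate=True)
        except binascii.Error as exc:
            raise ValueError("key is neither valid hex nor valid base64") from exc
    if len(raw) != 32:
        raise ValueError(f"key must be 32 bytes, got {len(raw)}")
    return raw


if __name__ == "__main__":
    # Equivalent in spirit to `openssl rand -base64 32`.
    key_b64 = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    print(len(decode_key(key_b64)), "bytes")
```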
856
+
857
+ ### Input Validation (Preflight)
858
+
859
+ Every action route runs preflight checks before execution:
860
+
861
+ - `timeout_ms`: must be in range `[50, 60000]` ms. Out-of-range values return `400 preflight_failed` with `{ field, constraint, value }`.
862
+ - `fill` value: max 100,000 characters. Longer values return `400 preflight_failed`.
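
The two limits above are easy to check before sending an action. The sketch below returns a `{ field, constraint, value }` dict modelled on the payload shape mentioned in the list; the function name `check_action` and the exact constraint strings are hypothetical.

```python
TIMEOUT_RANGE_MS = (50, 60000)
MAX_FILL_CHARS = 100_000


def check_action(timeout_ms=None, fill_value=None):
    """Return a violation dict, or None if both limits are respected."""
    low, high = TIMEOUT_RANGE_MS
    if timeout_ms is not None and not (low <= timeout_ms <= high):
        return {"field": "timeout_ms", "constraint": f"range [{low}, {high}]",
                "value": timeout_ms}
    if fill_value is not None and len(fill_value) > MAX_FILL_CHARS:
        # Report the length rather than echoing a very large value back.
        return {"field": "value", "constraint": f"max {MAX_FILL_CHARS} characters",
                "value": len(fill_value)}
    return None


if __name__ == "__main__":
    print(check_action(timeout_ms=10))
```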
863
+
864
+ ### Error Diagnostics and Recovery Hints
865
+
866
+ When an action fails (element not found, timeout, detached context, overlay intercept), the route returns `422` with a structured diagnostic payload:
867
+
868
+ ```json
869
+ {
870
+ "error": "Timeout 3000ms exceeded.",
871
+ "url": "https://example.com",
872
+ "readyState": "complete",
873
+ "recovery_hint": "Increase timeout_ms or add stability.wait_before_ms; ensure element is visible before acting"
874
+ }
875
+ ```
876
+
877
+ `recovery_hint` categories:
878
+ - **Timeout / waiting for**: increase `timeout_ms` or add `stability.wait_before_ms`; verify element visibility
879
+ - **Target closed / detached**: page navigated or element removed; re-navigate or call `snapshot_map` again
880
+ - **Not found / no element**: check selector; use `snapshot_map` to verify element exists on current page
881
+ - **Intercept / overlap / obscured**: element covered by overlay; try `executor=auto_fallback` or scroll into view first
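
A client that wants to react automatically could bucket error messages by the same keywords. This is a sketch: the category labels, the match order, and the helper name `classify_error` are assumptions, and the daemon's own matching logic may differ.

```python
# Keyword groups taken from the category names in the list above.
HINT_RULES = [
    ("timeout", ("timeout", "waiting for")),
    ("detached", ("target closed", "detached")),
    ("not_found", ("not found", "no element")),
    ("obscured", ("intercept", "overlap", "obscured")),
]


def classify_error(message):
    """Return the first matching category label, or None if nothing matches."""
    text = message.lower()
    for category, keywords in HINT_RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return None


if __name__ == "__main__":
    print(classify_error("Timeout 3000ms exceeded."))
```

A retry loop could, for example, raise `timeout_ms` on `timeout` and call `snapshot_map` again on `detached` or `not_found`.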
882
+
883
+ ### Audit Logging
884
+
885
+ Every action, CDP call, and policy event is appended to a per-session JSONL audit log:
886
+
887
+ ```json
888
+ {
889
+ "ts": "2026-02-28T10:00:01.234Z",
890
+ "v": 1,
891
+ "session_id": "s_abc123",
892
+ "action_id": "act_xyz",
893
+ "type": "action",
894
+ "action": "click",
895
+ "url": "https://example.com",
896
+ "selector": "#submit",
897
+ "result": { "status": "ok", "duration_ms": 142 },
898
+ "purpose": "submit search form",
899
+ "operator": "codex-agent"
900
+ }
901
+ ```
902
+
903
+ Fields: `purpose` (why), `operator` (who/what). Set via request body or `X-Operator` header.
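
Because the log is JSONL (one JSON object per line), it can be summarised with a few lines of standard-library Python. The sketch below counts entries by their `type` field; `count_by_type` is a hypothetical helper, and where the log file lives on disk is not assumed.

```python
import json
from collections import Counter


def count_by_type(jsonl_text):
    """Count audit entries per `type`, skipping blank lines."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        counts[entry.get("type", "unknown")] += 1
    return counts


if __name__ == "__main__":
    sample = '{"type": "action", "action": "click"}\n{"type": "policy"}\n'
    print(count_by_type(sample))
```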
904
+
905
+ ```bash
906
+ agentmb logs <session-id> --tail 50
907
+ ```
908
+
909
+ ---
910
+
911
+ ## Human Login Handoff
912
+
913
+ Switch a session to headed (visible) mode, log in manually, then return to headless automation with the same cookies and storage.
914
+
915
+ ```bash
916
+ agentmb login <session-id>
917
+ # → browser window opens
918
+ # → log in manually
919
+ # → press Enter in terminal to return to headless mode
920
+ ```
921
+
922
+ ---
923
+
924
+ ## Linux Headed Mode
925
+
926
+ Linux visual/headed mode requires Xvfb:
927
+
928
+ ```bash
929
+ sudo apt-get install -y xvfb
930
+ bash scripts/xvfb-headed.sh
931
+ ```
932
+
933
+ ---
934
+
935
+ ## Playwright Trace Recording
936
+
937
+ ```bash
938
+ agentmb trace start <session-id>
939
+ # ... perform actions ...
940
+ agentmb trace stop <session-id> -o trace.zip
941
+ npx playwright show-trace trace.zip
942
+ ```
943
+
944
+ ---
945
+
946
+ ## Verify
947
+
948
+ Runs: build → daemon start → 19 pytest suites → daemon stop. Requires that no daemon is already running on the configured port.
949
+
950
+ ```bash
951
+ bash scripts/verify.sh # uses default port 19315
952
+ AGENTMB_PORT=19320 bash scripts/verify.sh
953
+ ```
954
+
955
+ Expected output: `ALL GATES PASSED (24/24)`.
956
+
957
+ ---
958
+
959
+ ## Environment Variables
960
+
961
+ | Variable | Default | Purpose |
962
+ |---|---|---|
963
+ | `AGENTMB_PORT` | `19315` | Daemon HTTP port |
964
+ | `AGENTMB_DATA_DIR` | `~/.agentmb` | Profiles and logs directory |
965
+ | `AGENTMB_API_TOKEN` | _(none)_ | Require this token on all requests |
966
+ | `AGENTMB_ENCRYPTION_KEY` | _(none)_ | AES-256-GCM key for profile encryption (32 bytes, base64 or hex) |
967
+ | `AGENTMB_LOG_LEVEL` | `info` | Daemon log verbosity |
968
+ | `AGENTMB_POLICY_PROFILE` | `safe` | Default safety policy profile (`safe\|permissive\|disabled`) |
969
+
970
+ ---
971
+
307
972
  ## License
308
973
 
309
974
  MIT