browser4-cli 0.1.8 → 0.1.9

package/README.md CHANGED
@@ -68,6 +68,16 @@ browser4-cli -s=<session> <command> [args] [options]
  | `--version` | Print version |
  | `-s=<name>` | Named session label |
  | `--server=<url>` | Override Browser4 server URL |
+ | `--json` | Emit machine-parseable JSON to stdout |
+ | `-q`, `--quiet` | Suppress normal output, only show errors |
+
+ `--json` switches every command's stdout from human-readable text to a
+ single-line JSON envelope (`{"status":"ok","command":"<name>","output":{...}}`).
+ Omit `--json` for the default human-readable output.
+
+ `-q` / `--quiet` suppresses all normal stdout output. Errors and
+ progress messages still go to stderr. Combine with `--json` for
+ silent-on-success scripting: `browser4-cli --json -q open`.
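A script that consumes the `--json` envelope could branch on its `status` field. The sketch below is a minimal illustration, assuming the envelope is a single line that begins with the `status` key, as in the example above. The helper name `envelope_status` and the sample envelope are hypothetical, and a real script would more likely use a proper JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: extract the leading "status" value from a
# single-line JSON envelope with sed. This is not a general JSON
# parser; it assumes "status" is the first key, as in the README example.
envelope_status() {
  printf '%s\n' "$1" | sed -n 's/^{"status":"\([^"]*\)".*/\1/p'
}

# Sample envelope for illustration only (not captured from a real run).
sample='{"status":"ok","command":"open","output":{}}'

if [ "$(envelope_status "$sample")" = "ok" ]; then
  echo "command succeeded"
else
  echo "command failed" >&2
fi
```

In practice the envelope would come from a real invocation, for example `out=$(browser4-cli --json open)`, rather than from a hard-coded string.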
 
  Sessions are persisted independently per name. Omitting `-s` uses the
  default session (`~/.browser4/cli-state.json`). With `-s=<name>`, a
@@ -83,9 +93,9 @@ The tables below mirror the commands surfaced by the global `browser4-cli help`
 
  | Command | Description |
  |---|---|
- | `open [url]` | Open or switch to a browser session (optionally navigate to URL) |
+ | `open [url]` | Open or switch to a browser session. Supports `--headed` (force visible window) and `--headless` (force headless). |
  | `close` | Close the active session |
- | `goto <url>` | Navigate to a URL using the current active session |
+ | `goto <url>` | Navigate to a URL, auto-opening or refreshing the session if needed |
  | `click <ref> [button]` | Click an element |
  | `dblclick <ref> [button]` | Double-click an element |
  | `type <text> [ref]` | Type text into the focused element or an optional target element |
@@ -128,11 +138,11 @@ The tables below mirror the commands surfaced by the global `browser4-cli help`
  | `mouseup [button]` | Release mouse button |
  | `mousewheel <dx> <dy>` | Scroll the mouse wheel |
 
- #### Save as
+ #### Screenshots
 
  | Command | Description |
  |---|---|
- | `screenshot [ref]` | Take a screenshot |
+ | `screenshot [ref]` | Take a screenshot (optionally of a specific element) |
 
  #### Tabs
 
@@ -181,13 +191,22 @@ Use `close-all` for session cleanup when you want to keep the current Browser4 s
 
  | Command | Description |
  |---|---|
- | `install` | Download the self-contained Browser4 runtime bundle (JAR + bundled JRE) from GitHub Releases |
- | `upgrade` | Upgrade `browser4-cli` itself to the latest release (requires `cargo`) |
+ | `install` | Download the Browser4 runtime bundle. Supports `--tag=<version>` to pin a release and `--force` to reinstall even when already present. |
+ | `upgrade` | Upgrade the Browser4 runtime bundle to the latest version or a specified `--tag` |
  | `stop` | Kill the Browser4 backend after closing all sessions |
  | `status` | Check whether the Browser4 backend is reachable and healthy |
 
+ `install` and `upgrade` both manage the Browser4 runtime bundle — a self-contained
+ distribution that includes all dependency jars, a minimal `jlink`-built JRE, and
+ platform launcher scripts. Neither requires `cargo` or a Rust toolchain; the runtime
+ is a Java application downloaded from GitHub Releases.
+
+ Use `--tag=<version>` to pin a specific release (e.g. `--tag=v4.9.3`). Use `--force`
+ to reinstall even when the same version is already present.
+
  When a local Browser4 checkout is detected with the `browser4-bundle` module present,
- `install` auto-builds the runtime bundle from source instead of downloading.
+ `install` and `upgrade` auto-build the runtime bundle from source (via Maven) instead
+ of downloading.
 
  ### Advanced commands
 
@@ -204,9 +223,10 @@ Query `browser4-cli help <command>` for the exact syntax when you need them.
  | `agent status <id>` | Check the status of a running agent task |
  | `agent result <id>` | Get the result of a completed agent task |
  | `swarm create` | Create a swarm scrape session with parallel browser contexts |
- | `swarm submit [url]` | Submit URL(s) or X-SQL payloads as scrape jobs |
- | `swarm status <id>` | Check the status of a scrape job |
- | `swarm result <id>` | Get the result of a completed scrape job |
+ | `swarm submit [url]` | Submit URL(s) or raw X-SQL payloads as scrape jobs |
+ | `swarm query <url>` | Run an X-SQL query against a loaded webpage |
+ | `swarm status <id>` | Check the status of a scrape or query job |
+ | `swarm result <id>` | Get the result of a completed job |
 
  ## Agent task workflow (`agent <subcommand>`)
 
@@ -259,7 +279,7 @@ browser4-cli agent result agent-task-1
  #### 1. Submit an autonomous agent task
 
  ```shell
- browser4-cli agent run "Open example.com and summarize the hero section"
+ browser4-cli agent run "Open browser4.io and summarize the hero section"
  ```
 
  Typical output:
@@ -299,93 +319,78 @@ If the backend returns a structured `CommandResult`, expect fields such as
  The `swarm` subcommands support a swarm scrape workflow where one CLI session
  coordinates multiple browser contexts in the Browser4 backend.
 
- Use the spaced `swarm <subcommand>` form:
-
- ```shell
- browser4-cli swarm create
- browser4-cli swarm submit https://example.com
- ```
-
- ### Command lifecycle
+ ### Command overview
 
- | Step | Command | What it does |
+ | Command | Purpose | Backend endpoint |
  |---|---|---|
- | 1 | `swarm create` | Opens a swarm scrape session and persists the returned session ID in the current CLI slot |
- | 2 | `swarm submit [url]` | Submits one direct URL plus any URLs from `--seed-file` as scrape jobs through `ScrapeController.submit(payload)` |
- | 3 | `swarm status <id>` | Calls `ScrapeController.getStatus(id)` and prints the returned scrape job status JSON |
- | 4 | `swarm result <id>` | Calls `ScrapeController.getResult(id)` and prints the returned scrape job result JSON |
+ | `swarm create` | Create a swarm scrape session | `POST /api/swarm` |
+ | `swarm submit <url>` | Scrape URLs or submit raw X-SQL | `POST /api/swarm/submit` |
+ | `swarm query <url>` | Run X-SQL queries against loaded pages | `POST /api/swarm/query` |
+ | `swarm status <id>` | Poll job status | `GET /api/swarm/{id}/status` |
+ | `swarm result <id>` | Fetch completed job result | `GET /api/swarm/{id}/result` |
 
- ### Notes
-
- - `swarm create` accepts backend capability hints such as `--profile-mode`,
- `--max-open-tabs`, `--max-browser-contexts`, and `--display-mode`.
- - `swarm submit` accepts either a direct positional URL, `--seed-file`, or both.
- Seed files are plain text files with one URL per line; blank lines and lines
- starting with `#` are ignored.
- - `swarm submit` maps CLI flags like `--deadline`, `--expires`, `--refresh`,
- `--parse`, and `--store-content` into the raw submission payload sent to the
- scrape REST API.
- - `swarm status` and `swarm result` are read-only follow-up commands; keep the job ID
- printed by `swarm submit`.
-
- ### Use cases
-
- #### 1. Create a supervised swarm scrape session for manual monitoring
+ ### URL scraping with `swarm submit`
 
  ```shell
+ # create a session
  browser4-cli swarm create \
  --profile-mode=TEMPORARY \
  --max-open-tabs=12 \
  --max-browser-contexts=3 \
  --display-mode=HEADLESS
- ```
-
- Use this when you want multiple isolated browser contexts and you still want to
- watch the run visually.
 
- #### 2. Submit a seed crawl as scrape jobs
-
- ```shell
+ # submit URLs as scrape jobs
  browser4-cli swarm submit https://example.com/direct \
  --seed-file=./swarm-seeds.txt \
  --deadline=2026-03-30T00:00:00Z \
  --expires=1d \
- --refresh \
- --parse \
- --store-content
- ```
+ --refresh --parse --store-content
 
- Example `swarm-seeds.txt`:
-
- ```text
- # campaign landing pages
- https://example.com/seed-1
- https://example.com/seed-2
+ # poll and fetch the result
+ browser4-cli swarm status scrape-task-4
+ browser4-cli swarm result scrape-task-4
  ```
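The poll-then-fetch step above can be wrapped in a small retry loop. The sketch below is one possible approach, not part of `browser4-cli`: `poll_for_pattern` is a hypothetical helper, and the text that marks a finished job is an assumption, because the status payload format is not shown in this README.

```shell
#!/bin/sh
# poll_for_pattern PATTERN MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND repeatedly until its stdout contains
# PATTERN (matched as a fixed string) or MAX_ATTEMPTS is reached.
# Prints the last output; returns 0 on a match and 1 otherwise.
poll_for_pattern() {
  pattern=$1
  max_attempts=$2
  delay=$3
  shift 3
  attempt=1
  output=
  while [ "$attempt" -le "$max_attempts" ]; do
    # A failing command is treated like "not ready yet".
    output=$("$@") || true
    if printf '%s\n' "$output" | grep -qF -- "$pattern"; then
      printf '%s\n' "$output"
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  printf '%s\n' "$output"
  return 1
}

# Example call (not executed here). "COMPLETED" is a placeholder for
# whatever text the backend reports for a finished job:
#   poll_for_pattern "COMPLETED" 30 2 browser4-cli swarm status scrape-task-4 \
#     && browser4-cli swarm result scrape-task-4
```

Passing the command as arguments keeps the loop independent of the CLI, so it can be exercised against a stub before pointing it at a real backend.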
 
- This pattern is useful for warming caches, refreshing a URL list, or launching
- parallel collection across a curated seed set.
+ ### X-SQL queries with `swarm query`
 
- #### 3. Poll and fetch the result
+ Run structured X-SQL queries against loaded webpages to extract data.
 
  ```shell
- browser4-cli swarm status scrape-task-4
- browser4-cli swarm result scrape-task-4
+ # Inline query:
+ browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql "
+ SELECT
+ dom_base_uri(dom) AS url,
+ dom_first_text(dom, '#productTitle') AS title,
+ dom_first_slim_html(dom, 'img:expr(width > 400)') AS img
+ FROM load_and_select(@url, 'body');
+ "
+
+ # From a file:
+ browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql @query.sql
+
+ # With seed file and load options:
+ browser4-cli swarm query --sql @query.sql --seed-file=./urls.txt --refresh --parse
  ```
 
- The status and result commands print the scrape job response payload as-is. In
- the current backend, `getResult(id)` returns the same response envelope type as
- `getStatus(id)`.
+ ### Notes
+
+ - `swarm create` accepts backend capability hints: `--profile-mode`, `--max-open-tabs`,
+ `--max-browser-contexts`, `--display-mode`.
+ - `swarm submit` and `swarm query` both accept a positional URL, `--seed-file`, or both.
+ Seed files use one URL per line; `#` comments and blank lines are ignored.
+ - Both commands support load-option flags: `--deadline`, `--expires`, `--refresh`,
+ `--parse`, `--store-content`.
+ - `swarm query --sql` is **required**; `swarm submit --sql` also works as a convenience.
+ Use `@url` in the X-SQL template; it is replaced with the target URL server-side.
+ - Prefix the `--sql` value with `@` to read from a file (e.g. `--sql @query.sql`).
+ - All commands return a task ID; use `swarm status` / `swarm result` to track progress.
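The seed-file rule in the notes above (one URL per line, `#` comments and blank lines ignored) can be previewed locally before submitting. The sketch below is a hypothetical helper, not part of the CLI; it assumes a comment is a line whose first character is `#`, which matches the earlier wording of this README but may differ from how the backend actually parses the file.

```shell
#!/bin/sh
# Hypothetical helper: print the URLs a seed file would contribute,
# dropping blank (or whitespace-only) lines and lines starting with '#'.
seed_urls() {
  sed -e '/^[[:space:]]*$/d' -e '/^#/d' "$1"
}

# Example (not executed here): count the URLs before submitting.
#   seed_urls ./swarm-seeds.txt | wc -l
```

Checking the filtered list first makes it easier to spot a malformed line before it becomes a failed scrape job.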
 
  ## Element References
 
  The `snapshot` command returns an accessibility tree where every interactive
  node is labeled with a short identifier such as `e15`. Pass this identifier
- directly to commands like `click`, `type`, or `press`; the CLI automatically
- converts it to the `backend:15` selector format required by the server.
-
- You can also pass plain CSS selectors (e.g. `.my-button`, `#search-input`) or
- fully-qualified `backend:<N>` refs directly.
+ directly to commands like `click`, `type`, or `press`. You can also use plain
+ CSS selectors (e.g. `.my-button`, `#search-input`).
 
  ## State Persistence
 
@@ -406,41 +411,23 @@ fields such as:
 
  ### Session state transitions
 
- The `with_session()` helper in `src/main.rs` is the central session lifecycle
- gate for commands that require an active Browser4 session.
+ | Command | Behavior |
+ |---|---|
+ | `open` | Creates a new session, or reuses an existing active one. Stale sessions are automatically refreshed. |
+ | `open -s=<name>` | Same as `open` but scoped to a named session slot. |
+ | `goto <url>` | Reuses the current session if active; otherwise opens a fresh one before navigating. |
+ | `close` | Closes the current session (no-op if none active). |
+ | `close-all` / `kill-all` / `stop` | Clears all persisted session state. |
 
- | Situation | Persisted state transition | Result |
- |---|---|---|
- | No persisted session | No state change | `require_session()` fails with `No active session. Run "browser4-cli open" first.` |
- | `open` succeeds (no existing session) | `create_session()` writes a fresh state file with new `sessionId`, current `baseUrl`, and clears `activeSelector` / `lastMousePosition` | A new active session becomes the current CLI session |
- | `open` when a saved session exists and the backend still reports it `active` | No state change — keeps the existing `sessionId` | The existing session is reused; subsequent commands target the same session |
- | `open` when a saved session exists but is missing or no longer `active` in the backend | `invalidate_session()` clears the stale saved `sessionId`, `activeSelector`, and `lastMousePosition`, then `create_session()` writes a fresh session | The stale session is refreshed automatically by opening a new one |
- | `open -s=<name>` | Reads/writes the named session state file | Opens, reuses, or refreshes the named session for that slot; subsequent `-s=<name>` commands use the same slot |
- | Command succeeds through `with_session()` | `sessionId` stays unchanged | The command uses the persisted session normally |
- | Command fails because the server reports a stale / expired session and `recover_stale = false` | `invalidate_session()` clears `sessionId`, `activeSelector`, and `lastMousePosition`, while keeping `baseUrl` | The command fails with `Saved session expired. Run "browser4-cli open" first.` |
- | `goto` is invoked but the saved session is missing or no longer `active` in the backend | `invalidate_session()` clears any stale saved `sessionId`, then `create_session()` writes a fresh session before navigation continues | `goto` automatically refreshes the session and proceeds to the requested URL |
- | `close` with an active session | `clear_state()` removes only the current session state file after best-effort remote close | The selected default or named session is fully cleared |
- | `close` with no persisted `sessionId` | `clear_state()` best-effort removes the current session slot | Prints `No active session. Run "browser4-cli open" first.` and exits successfully as a no-op |
- | `close-all` / `kill-all` | `clear_all_state()` removes the default state file and all named session files | All persisted CLI session files are cleared |
-
- Notes:
-
- - `goto` first tries to reuse the current backend-`active` session. If the saved
- session is missing, stale, or the backend had been stopped, it automatically
- opens a fresh session for the current slot before navigating.
- - `open` first checks whether the saved session for the current slot is still
- backend-`active`. It reuses active sessions and refreshes stale ones by
- creating a new session for the same slot.
- - `list` reads persisted session files and compares them with live backend
- sessions to show both the current status (`Active`, `Stale`, or `Unknown`)
- and whether the next `open` will `Reuse` or `Refresh` that slot.
+ The `list` command shows each session's status: **Active** (backend confirms),
+ **Stale** (backend has stopped it), or **Unknown** (backend unreachable).
 
  ## Runtime Temp Files
 
  `browser4-cli` keeps ephemeral runtime artifacts under the system temp directory:
 
- - Windows: `%TEMP%\.browser4\browser4-cli`
- - Linux/macOS: `${TMPDIR:-/tmp}/.browser4/browser4-cli`
+ - Windows: `%TEMP%\browser4\browser4-cli`
+ - Linux/macOS: `${TMPDIR:-/tmp}/browser4/browser4-cli`
 
  This temp subtree contains items such as:
 
@@ -448,7 +435,20 @@ This temp subtree contains items such as:
  - staged Maven wrapper launchers
  - Rust test scratch directories used by `browser4-cli` tests
 
- Persistent CLI state and the fallback `Browser4.jar` remain under `~/.browser4` by default.
+ Persistent CLI state remains under `~/.browser4` by default. The Browser4 runtime
+ bundle (JRE, JARs, launchers) is stored separately in a platform-conventional
+ data directory so that clearing CLI session state does not require re-downloading
+ the ~200 MB runtime:
+
+ - Linux: `~/.local/share/browser4/runtime/<version>/`
+ - macOS: `~/Library/Application Support/browser4/runtime/<version>/`
+ - Windows: `%APPDATA%/browser4/runtime/<version>/`
+
+ The `current.tag` file in the `runtime/` directory records the active version.
+ Override the runtime data root with the `BROWSER4_RUNTIME_DIR` environment variable.
+ Downloaded archives are cached under the platform cache directory
+ (`~/.cache/browser4/downloads/` on Linux, `~/Library/Caches/browser4/downloads/`
+ on macOS, `%LOCALAPPDATA%/browser4/downloads/` on Windows).
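The runtime locations above can be resolved from a script, for example to check which version is active. The sketch below is a rough illustration under stated assumptions: the helper `browser4_runtime_root` is hypothetical, `BROWSER4_RUNTIME_DIR` is assumed to replace the whole `.../browser4/runtime` path, and `current.tag` is assumed to be a plain-text file. The CLI's real lookup logic may differ.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime data root for the given OS name
# (as reported by `uname -s`), honoring BROWSER4_RUNTIME_DIR when set.
# Returns 1 for platforms not listed in the README.
browser4_runtime_root() {
  if [ -n "${BROWSER4_RUNTIME_DIR:-}" ]; then
    printf '%s\n' "$BROWSER4_RUNTIME_DIR"
    return 0
  fi
  case "$1" in
    Linux)  printf '%s\n' "$HOME/.local/share/browser4/runtime" ;;
    Darwin) printf '%s\n' "$HOME/Library/Application Support/browser4/runtime" ;;
    MINGW*|MSYS*|CYGWIN*) printf '%s\n' "${APPDATA:-}/browser4/runtime" ;;
    *) return 1 ;;
  esac
}

# Report the active runtime version if a current.tag file is present.
if root=$(browser4_runtime_root "$(uname -s)") && [ -f "$root/current.tag" ]; then
  echo "active runtime: $(cat "$root/current.tag")"
else
  echo "no active runtime recorded"
fi
```

Taking the OS name as a parameter keeps the path logic testable on any machine, independent of the platform the script runs on.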
 
  ## Snapshots
 
@@ -462,10 +462,14 @@ After each command that modifies browser state, the CLI automatically:
  ## Examples
 
  ```shell
- # Open a new browser window
+ # Open a new browser window (defaults to headed)
  browser4-cli open
 
- # Navigate to a page with the current active session
+ # Open in headed or headless mode
+ browser4-cli open --headed https://browser4.io
+ browser4-cli open --headless https://browser4.io
+
+ # Navigate to a page — auto-opens a session if none is active
  browser4-cli goto https://playwright.dev
 
  # Inspect the page — note the eN labels on interactive nodes
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "browser4-cli",
- "version": "0.1.8",
+ "version": "0.1.9",
  "description": "Browser automation CLI for AI agents",
  "type": "module",
  "files": [
  "files": [
@@ -20,7 +20,7 @@
  "build:linux": "npm run version:sync && docker compose -f docker/docker-compose.yml run --rm build-linux",
  "build:macos": "npm run version:sync && (cargo build --release --manifest-path browser4-cli/Cargo.toml --target aarch64-apple-darwin & cargo build --release --manifest-path browser4-cli/Cargo.toml --target x86_64-apple-darwin & wait) && cp cli/target/aarch64-apple-darwin/release/browser4 bin/browser4-darwin-arm64 && cp cli/target/x86_64-apple-darwin/release/browser4 bin/browser4-darwin-x64",
  "build:windows": "npm run version:sync && docker compose -f docker/docker-compose.yml run --rm build-windows",
- "build:all-platforms": "npm run version:sync && (npm run build:linux & npm run build:windows & wait)",
+ "build:all-platforms": "npm run version:sync && (npm run build:linux & npm run build:windows & npm run build:macos & wait)",
  "build:docker": "docker build -t browser4-builder -f docker/Dockerfile.build .",
  "publish:if-needed": "node scripts/publish-if-needed.js",
  "release": "npm run publish:if-needed",