npm - agent-browser - Versions diffs - 0.21.3 → 0.22.0 - Mend

agent-browser 0.21.3 → 0.22.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

package/README.md +5 -5
package/bin/.install-method +1 -0
package/bin/agent-browser-darwin-arm64 +0 -0
package/bin/agent-browser-darwin-x64 +0 -0
package/bin/agent-browser-linux-arm64 +0 -0
package/bin/agent-browser-linux-musl-arm64 +0 -0
package/bin/agent-browser-linux-musl-x64 +0 -0
package/bin/agent-browser-linux-x64 +0 -0
package/bin/agent-browser-win32-x64.exe +0 -0
package/package.json +1 -1
package/scripts/postinstall.js +33 -4
package/skills/agent-browser/SKILL.md +8 -1
package/skills/electron/SKILL.md +0 -1
package/skills/slack/SKILL.md +0 -9
package/skills/slack/references/slack-tasks.md +2 -8

package/README.md CHANGED

@@ -270,6 +270,10 @@ agent-browser network route <url> --body <json>  # Mock response
 agent-browser network unroute [url]            # Remove routes
 agent-browser network requests                 # View tracked requests
 agent-browser network requests --filter api    # Filter requests
+agent-browser network requests --type xhr,fetch  # Filter by resource type
+agent-browser network requests --method POST   # Filter by HTTP method
+agent-browser network requests --status 2xx    # Filter by status (200, 2xx, 400-499)
+agent-browser network request <requestId>      # View full request/response detail
 agent-browser network har start                # Start HAR recording
 agent-browser network har stop [output.har]    # Stop and save HAR (temp path if omitted)
 ```
@@ -486,7 +490,7 @@ agent-browser --session-name secure open example.com
 agent-browser includes security features for safe AI agent deployments. All features are opt-in -- existing workflows are unaffected until you explicitly enable a feature:
-- **Authentication Vault** -- Store credentials locally (always encrypted), reference by name. The LLM never sees passwords. A key is auto-generated at `~/.agent-browser/.encryption-key` if `AGENT_BROWSER_ENCRYPTION_KEY` is not set: `echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin` then `agent-browser auth login github`
+- **Authentication Vault** -- Store credentials locally (always encrypted), reference by name. The LLM never sees passwords. `auth login` navigates with `load` and then waits for login form selectors to appear (SPA-friendly, timeout follows the default action timeout). A key is auto-generated at `~/.agent-browser/.encryption-key` if `AGENT_BROWSER_ENCRYPTION_KEY` is not set: `echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin` then `agent-browser auth login github`
 - **Content Boundary Markers** -- Wrap page output in delimiters so LLMs can distinguish tool output from untrusted content: `--content-boundaries`
 - **Domain Allowlist** -- Restrict navigation to trusted domains (wildcards like `*.example.com` also match the bare domain): `--allowed-domains "example.com,*.example.com"`. Sub-resource requests (scripts, images, fetch) and WebSocket/EventSource connections to non-allowed domains are also blocked. Include any CDN domains your target pages depend on (e.g., `*.cdn.example.com`).
 - **Action Policy** -- Gate destructive actions with a static policy file: `--action-policy ./policy.json`
@@ -511,7 +515,6 @@ The `snapshot` command supports filtering to reduce output size:
 ```bash
 agent-browser snapshot                    # Full accessibility tree
 agent-browser snapshot -i                 # Interactive elements only (buttons, inputs, links)
-agent-browser snapshot -i -C              # Include cursor-interactive elements (divs with onclick, etc.)
 agent-browser snapshot -c                 # Compact (remove empty structural elements)
 agent-browser snapshot -d 3               # Limit depth to 3 levels
 agent-browser snapshot -s "#main"         # Scope to CSS selector
@@ -521,13 +524,10 @@ agent-browser snapshot -i -c -d 5         # Combine options
 | Option                 | Description                                                             |
 | ---------------------- | ----------------------------------------------------------------------- |
 | `-i, --interactive`    | Only show interactive elements (buttons, links, inputs)                 |
-| `-C, --cursor`         | Include cursor-interactive elements (cursor:pointer, onclick, tabindex) |
 | `-c, --compact`        | Remove empty structural elements                                        |
 | `-d, --depth <n>`      | Limit tree depth                                                        |
 | `-s, --selector <sel>` | Scope to CSS selector                                                   |
-The `-C` flag is useful for modern web apps that use custom clickable elements (divs, spans) instead of standard buttons/links.
 ## Annotated Screenshots
 The `--annotate` flag overlays numbered labels on interactive elements in the screenshot. Each label `[N]` corresponds to ref `@eN`, so the same refs work for both visual and text-based workflows.
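
The README hunk above documents three accepted shapes for the new `--status` filter: an exact code (`200`), a status class (`2xx`), and a range (`400-499`). The matching itself happens inside the native binary and is not part of this diff, so the following is only a minimal sketch of how such a filter could be interpreted. The function name `matchesStatus` and the assumption that ranges are inclusive are hypothetical.

```javascript
// Hypothetical helper: not taken from agent-browser's source.
// Interprets the three filter shapes listed in the README comment.
function matchesStatus(filter, status) {
  // Exact code, e.g. "200"
  const exact = /^(\d{3})$/.exec(filter);
  if (exact) return status === Number(exact[1]);

  // Status class, e.g. "2xx" matches 200 through 299
  const statusClass = /^([1-5])xx$/i.exec(filter);
  if (statusClass) return Math.floor(status / 100) === Number(statusClass[1]);

  // Range, e.g. "400-499" (assumed inclusive on both ends)
  const range = /^(\d{3})-(\d{3})$/.exec(filter);
  if (range) return status >= Number(range[1]) && status <= Number(range[2]);

  // Unrecognized filter shapes match nothing in this sketch
  return false;
}

console.log(matchesStatus('2xx', 204));     // true
console.log(matchesStatus('400-499', 404)); // true
console.log(matchesStatus('200', 201));     // false
```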

package/bin/.install-method ADDED

@@ -0,0 +1 @@
+pnpm

package/bin/agent-browser-darwin-arm64 CHANGED

Binary file

package/bin/agent-browser-darwin-x64 CHANGED

Binary file

package/bin/agent-browser-linux-arm64 CHANGED

Binary file

package/bin/agent-browser-linux-musl-arm64 CHANGED

Binary file

package/bin/agent-browser-linux-musl-x64 CHANGED

Binary file

package/bin/agent-browser-linux-x64 CHANGED

Binary file

package/bin/agent-browser-win32-x64.exe CHANGED

Binary file

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agent-browser",
-  "version": "0.21.3",
+  "version": "0.22.0",
   "description": "Headless browser automation CLI for AI agents",
   "type": "module",
   "files": [

package/scripts/postinstall.js CHANGED

@@ -80,6 +80,31 @@ async function downloadFile(url, dest) {
   });
 }
+/**
+ * Detect which package manager ran this postinstall and write a marker file
+ * next to the binary so `agent-browser upgrade` can use the correct one
+ * without fragile path heuristics or slow subprocess probing.
+ *
+ * npm_config_user_agent is set by npm/pnpm/yarn/bun during lifecycle scripts,
+ * e.g. "pnpm/8.10.0 node/v20.10.0 linux x64"
+ */
+function writeInstallMethod() {
+  const ua = process.env.npm_config_user_agent || '';
+  let method = '';
+  if (ua.startsWith('pnpm/')) method = 'pnpm';
+  else if (ua.startsWith('yarn/')) method = 'yarn';
+  else if (ua.startsWith('bun/')) method = 'bun';
+  else if (ua.startsWith('npm/')) method = 'npm';
+  if (method) {
+    try {
+      writeFileSync(join(binDir, '.install-method'), method);
+    } catch {
+      // Non-critical — upgrade will fall back to heuristics
+    }
+  }
+}
 async function main() {
   // Check if binary already exists
   if (existsSync(binaryPath)) {
@@ -88,10 +113,12 @@ async function main() {
       chmodSync(binaryPath, 0o755);
     }
     console.log(`✓ Native binary ready: ${binaryName}`);
+    writeInstallMethod();
     // On global installs, fix npm's bin entry to use native binary directly
     await fixGlobalInstallBin();
     showInstallReminder();
     return;
   }
@@ -106,12 +133,12 @@ async function main() {
   try {
     await downloadFile(DOWNLOAD_URL, binaryPath);
     // Make executable on Unix
     if (platform() !== 'win32') {
       chmodSync(binaryPath, 0o755);
     }
     console.log(`✓ Downloaded native binary: ${binaryName}`);
   } catch (err) {
     console.log(`Could not download native binary: ${err.message}`);
@@ -121,6 +148,8 @@ async function main() {
     console.log('  2. Run: npm run build:native');
   }
+  writeInstallMethod();
   // On global installs, fix npm's bin entry to use native binary directly
   // This avoids the /bin/sh error on Windows and provides zero-overhead execution
   await fixGlobalInstallBin();
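
The detection in `writeInstallMethod()` depends only on the prefix of `npm_config_user_agent`, so it can be isolated as a pure function and checked without touching the filesystem. The sketch below mirrors the branch order from the hunk above. The names `detectInstallMethod` and `parseInstallMarker` are hypothetical, and how `agent-browser upgrade` actually reads the marker is not shown in this diff, so the reader side is an assumption.

```javascript
// Hypothetical refactor for illustration; the package itself inlines this
// logic in writeInstallMethod().
const KNOWN_MANAGERS = ['pnpm', 'yarn', 'bun', 'npm'];

// Returns the package manager name for a user-agent string such as
// "pnpm/8.10.0 node/v20.10.0 linux x64", or '' when it is not recognized.
function detectInstallMethod(userAgent) {
  const ua = userAgent || '';
  for (const name of KNOWN_MANAGERS) {
    if (ua.startsWith(name + '/')) return name;
  }
  return '';
}

// Assumed reader side: validate the marker contents before trusting them,
// returning null so the caller can fall back to other heuristics.
function parseInstallMarker(contents) {
  const method = String(contents).trim();
  return KNOWN_MANAGERS.includes(method) ? method : null;
}

console.log(detectInstallMethod('pnpm/8.10.0 node/v20.10.0 linux x64')); // pnpm
console.log(parseInstallMarker('pnpm\n')); // pnpm
```

Returning an empty string for an unrecognized user agent matches the original behavior of writing no marker file at all in that case.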

package/skills/agent-browser/SKILL.md CHANGED

@@ -90,6 +90,8 @@ echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/l
 agent-browser auth login myapp
 ```
+`auth login` navigates with `load` and then waits for login form selectors to appear before filling/clicking, which is more reliable on delayed SPA login screens.
 **Option 5: State file (manual save/load)**
 ```bash
@@ -111,7 +113,6 @@ agent-browser close                   # Close browser
 # Snapshot
 agent-browser snapshot -i             # Interactive elements with refs (recommended)
-agent-browser snapshot -i -C          # Include cursor-interactive elements (divs with onclick, cursor:pointer)
 agent-browser snapshot -s "#selector" # Scope to CSS selector
 # Interaction (use @refs from snapshot)
@@ -149,6 +150,10 @@ agent-browser --download-path ./downloads open <url>  # Set default download dir
 # Network
 agent-browser network requests                 # Inspect tracked requests
+agent-browser network requests --type xhr,fetch  # Filter by resource type
+agent-browser network requests --method POST   # Filter by HTTP method
+agent-browser network requests --status 2xx    # Filter by status (200, 2xx, 400-499)
+agent-browser network request <requestId>      # View full request/response detail
 agent-browser network route "**/api/*" --abort  # Block matching requests
 agent-browser network har start                # Start HAR recording
 agent-browser network har stop ./capture.har   # Stop and save HAR file
@@ -230,6 +235,8 @@ agent-browser auth show github
 agent-browser auth delete github
 ```
+`auth login` waits for username/password/submit selectors before interacting, with a timeout tied to the default action timeout.
 ### Authentication with State Persistence
 ```bash

package/skills/electron/SKILL.md CHANGED

@@ -217,7 +217,6 @@ AGENT_BROWSER_COLOR_SCHEME=dark agent-browser connect 9222
 ### Elements not appearing in snapshot
 - The app may use multiple webviews. Use `agent-browser tab` to list targets and switch to the right one
-- Use `agent-browser snapshot -i -C` to include cursor-interactive elements (divs with onclick handlers)
 ### Cannot type in input fields

package/skills/slack/SKILL.md CHANGED

@@ -235,15 +235,6 @@ agent-browser console
 agent-browser errors
 ```
-### View raw HTML of an element
-```bash
-# Snapshot shows the accessibility tree. If an element isn't there,
-# it may not be interactive (e.g., div instead of button)
-# Use snapshot -i -C to include cursor-interactive divs
-agent-browser snapshot -i -C
-```
 ### Get current page state
 ```bash

package/skills/slack/references/slack-tasks.md CHANGED

@@ -334,19 +334,13 @@ If you can't find an element:
    agent-browser snapshot -i
    ```
-3. **Try snapshot with extended range**
-   ```bash
-   # Include cursor-interactive elements (divs with onclick handlers)
-   agent-browser snapshot -i -C
-   ```
-4. **Check current URL**
+3. **Check current URL**
    ```bash
    agent-browser get url
    # Verify you're in the right section
    ```
-5. **Wait for page to load**
+4. **Wait for page to load**
    ```bash
    agent-browser wait --load networkidle
    agent-browser wait 1000