agent-browser 0.22.3 → 0.23.1
- package/README.md +42 -11
- package/bin/agent-browser-darwin-arm64 +0 -0
- package/bin/agent-browser-darwin-x64 +0 -0
- package/bin/agent-browser-linux-arm64 +0 -0
- package/bin/agent-browser-linux-musl-arm64 +0 -0
- package/bin/agent-browser-linux-musl-x64 +0 -0
- package/bin/agent-browser-linux-x64 +0 -0
- package/bin/agent-browser-win32-x64.exe +0 -0
- package/package.json +4 -3
- package/scripts/windows-debug/provision.sh +220 -0
- package/scripts/windows-debug/run.sh +92 -0
- package/scripts/windows-debug/start.sh +43 -0
- package/scripts/windows-debug/stop.sh +28 -0
- package/scripts/windows-debug/sync.sh +27 -0
- package/skills/agent-browser/SKILL.md +30 -7
- package/skills/agent-browser/references/commands.md +4 -1
package/README.md
CHANGED
@@ -70,7 +70,7 @@ Detects your installation method (npm, Homebrew, or Cargo) and runs the appropri
 
 ### Requirements
 
-- **Chrome** - Run `agent-browser install` to download Chrome from [Chrome for Testing](https://developer.chrome.com/blog/chrome-for-testing/) (Google's official automation channel). No Playwright or Node.js required for the daemon.
+- **Chrome** - Run `agent-browser install` to download Chrome from [Chrome for Testing](https://developer.chrome.com/blog/chrome-for-testing/) (Google's official automation channel). Existing Chrome, Brave, Playwright, and Puppeteer installations are detected automatically. No Playwright or Node.js required for the daemon.
 - **Rust** - Only needed when building from source (see From Source above).
 
 ## Quick Start
@@ -129,6 +129,7 @@ agent-browser stream enable [--port <port>] # Start runtime WebSocket streaming
 agent-browser stream status # Show runtime streaming state and bound port
 agent-browser stream disable # Stop runtime WebSocket streaming
 agent-browser close # Close browser (aliases: quit, exit)
+agent-browser close --all # Close all active sessions
 ```
 
 ### Get Info
@@ -306,6 +307,8 @@ agent-browser dialog dismiss # Dismiss
 agent-browser dialog status # Check if a dialog is currently open
 ```
 
+By default, `alert` and `beforeunload` dialogs are automatically accepted so they never block the agent. `confirm` and `prompt` dialogs still require explicit handling. Use `--no-auto-dialog` (or `AGENT_BROWSER_NO_AUTO_DIALOG=1`) to disable automatic handling.
+
 When a JavaScript dialog is pending, all command responses include a `warning` field with the dialog type and message.
 
 ### Diff
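The dialog defaults documented in the README hunk above amount to a small decision rule. The sketch below models that rule in plain shell. `dialog_needs_handling` is a hypothetical helper written for illustration, not part of the agent-browser CLI, and it assumes the environment variable is set to exactly `1` when automatic handling is disabled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: models the documented dialog defaults.
# Returns 0 (true) when the agent must handle the dialog explicitly,
# 1 (false) when the README says the dialog is auto-accepted.
dialog_needs_handling() {
  dialog_type="$1"
  # AGENT_BROWSER_NO_AUTO_DIALOG=1 turns automatic handling off entirely.
  if [ "${AGENT_BROWSER_NO_AUTO_DIALOG:-}" = "1" ]; then
    return 0
  fi
  case "$dialog_type" in
    alert|beforeunload) return 1 ;;  # auto-accepted by default
    *) return 0 ;;                   # confirm, prompt: explicit handling
  esac
}
```

A wrapper script might call `dialog_needs_handling confirm` before deciding whether to run `agent-browser dialog accept` or `agent-browser dialog dismiss`.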
@@ -331,6 +334,7 @@ agent-browser trace stop [path] # Stop and save trace
 agent-browser profiler start # Start Chrome DevTools profiling
 agent-browser profiler stop [path] # Stop and save profile (.json)
 agent-browser console # View console messages (log, error, warn, info)
+agent-browser console --json # JSON output with raw CDP args for programmatic access
 agent-browser console --clear # Clear console
 agent-browser errors # View page errors (uncaught JavaScript exceptions)
 agent-browser errors --clear # Clear errors
@@ -593,9 +597,36 @@ This is useful for multimodal AI models that can reason about visual layout, unl
 | `--confirm-actions <list>` | Action categories requiring confirmation (or `AGENT_BROWSER_CONFIRM_ACTIONS` env) |
 | `--confirm-interactive` | Interactive confirmation prompts; auto-denies if stdin is not a TTY (or `AGENT_BROWSER_CONFIRM_INTERACTIVE` env) |
 | `--engine <name>` | Browser engine: `chrome` (default), `lightpanda` (or `AGENT_BROWSER_ENGINE` env) |
+| `--no-auto-dialog` | Disable automatic dismissal of `alert`/`beforeunload` dialogs (or `AGENT_BROWSER_NO_AUTO_DIALOG` env) |
 | `--config <path>` | Use a custom config file (or `AGENT_BROWSER_CONFIG` env) |
 | `--debug` | Debug output |
 
+## Observability Dashboard
+
+Monitor agent-browser sessions in real time with a local web dashboard showing a live viewport and command activity feed.
+
+```bash
+# Install the dashboard (one time)
+agent-browser dashboard install
+
+# Start the dashboard server (runs in background on port 4848)
+agent-browser dashboard start
+agent-browser dashboard start --port 8080 # Custom port
+
+# All sessions are automatically visible in the dashboard
+agent-browser open example.com
+
+# Stop the dashboard
+agent-browser dashboard stop
+```
+
+The dashboard runs as a standalone background process on port 4848, independent of browser sessions. It stays available even when no sessions are running. All sessions automatically stream to the dashboard.
+
+The dashboard displays:
+- **Live viewport** -- real-time JPEG frames from the browser
+- **Activity feed** -- chronological command/result stream with timing and expandable details
+- **Console output** -- browser console messages (log, warn, error)
+
 ## Configuration
 
 Create an `agent-browser.json` file to set persistent defaults instead of repeating flags on every command.
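For the dashboard commands added in the hunk above, a script that takes the port from configuration may want to validate it before passing it to `--port`. The following is a minimal sketch under that assumption. `dashboard_start_cmd` is a hypothetical helper that only prints the command line, and the 1 to 65535 range check is ordinary TCP port validation rather than anything the README specifies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the dashboard start command for a given port.
# With no argument it prints the plain command, which the README says uses port 4848.
dashboard_start_cmd() {
  port="${1:-}"
  if [ -z "$port" ]; then
    echo "agent-browser dashboard start"
    return 0
  fi
  # Accept only decimal integers.
  case "$port" in
    *[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
  esac
  # Reject values outside the valid TCP port range.
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "invalid port: $port" >&2
    return 1
  fi
  echo "agent-browser dashboard start --port $port"
}
```

Printing the command instead of running it keeps the sketch side-effect free; a caller could pass the output to `sh -c` once it has been reviewed.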
@@ -926,28 +957,28 @@ This is useful when:
 
 Stream the browser viewport via WebSocket for live preview or "pair browsing" where a human can watch and interact alongside an AI agent.
 
-###
+### Streaming
 
-
+Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `stream status` to see the bound port and connection state:
 
 ```bash
-agent-browser stream enable
 agent-browser stream status
-agent-browser stream disable
 ```
 
-
-Use `stream status` to inspect whether streaming is enabled, which port is active, whether a browser is attached, and whether screencasting is active.
-
-If you want streaming to be available immediately when the daemon starts, set `AGENT_BROWSER_STREAM_PORT` before the first command in that session:
+To bind to a specific port, set `AGENT_BROWSER_STREAM_PORT`:
 
 ```bash
 AGENT_BROWSER_STREAM_PORT=9223 agent-browser open example.com
 ```
 
-
+You can also manage streaming at runtime with `stream enable`, `stream disable`, and `stream status`:
+
+```bash
+agent-browser stream enable --port 9223 # Re-enable on a specific port
+agent-browser stream disable # Stop streaming for the session
+```
 
-
+The WebSocket server streams the browser viewport and accepts input events.
 
 ### WebSocket Protocol
 
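The streaming hunk above describes two modes: an OS-assigned port by default, or a fixed port through `AGENT_BROWSER_STREAM_PORT`. A wrapper along the following lines could select between them. This is a sketch only: `open_with_stream` and the `AB_BIN` override are hypothetical names introduced here so the function can be exercised with a stub instead of the real binary.

```shell
#!/usr/bin/env bash
# Name of the CLI binary; overridable so the sketch can be run against a stub.
AB_BIN="${AB_BIN:-agent-browser}"

# Hypothetical wrapper: open a URL, pinning the stream port when one is given.
# Without a port the session keeps the default OS-assigned port.
open_with_stream() {
  url="$1"
  port="${2:-}"
  if [ -n "$port" ]; then
    # The variable is set only for this one invocation.
    AGENT_BROWSER_STREAM_PORT="$port" "$AB_BIN" open "$url"
  else
    "$AB_BIN" open "$url"
  fi
}
```

For example, `open_with_stream example.com 9223` corresponds to the `AGENT_BROWSER_STREAM_PORT=9223 agent-browser open example.com` line in the README.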
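package/bin/agent-browser-darwin-arm64
Binary file
package/bin/agent-browser-darwin-x64
Binary file
package/bin/agent-browser-linux-arm64
Binary file
package/bin/agent-browser-linux-musl-arm64
Binary file
package/bin/agent-browser-linux-musl-x64
Binary file
package/bin/agent-browser-linux-x64
Binary file
package/bin/agent-browser-win32-x64.exe
Binary file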
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "agent-browser",
-  "version": "0.
+  "version": "0.23.1",
   "description": "Headless browser automation CLI for AI agents",
   "type": "module",
   "files": [
@@ -28,7 +28,7 @@
   "bugs": {
     "url": "https://github.com/vercel-labs/agent-browser/issues"
   },
-  "homepage": "https://
+  "homepage": "https://agent-browser.dev",
   "devDependencies": {
     "@changesets/cli": "^2.29.8"
   },
@@ -45,6 +45,7 @@
     "postinstall": "node scripts/postinstall.js",
     "changeset": "changeset",
     "ci:version": "changeset version && pnpm run version:sync && pnpm install --no-frozen-lockfile",
-    "ci:publish": "pnpm run version:sync && changeset publish"
+    "ci:publish": "pnpm run version:sync && changeset publish",
+    "build:dashboard": "cd packages/dashboard && pnpm build"
   }
 }
package/scripts/windows-debug/provision.sh
ADDED
@@ -0,0 +1,220 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+INSTANCE_FILE="$SCRIPT_DIR/.instance"
+NAME_PREFIX="agent-browser-debug"
+INSTANCE_TYPE="${INSTANCE_TYPE:-t3.xlarge}"
+
+if [[ -f "$INSTANCE_FILE" ]]; then
+  echo "Error: Instance already provisioned. See $INSTANCE_FILE"
+  echo "Run ./scripts/windows-debug/start.sh to start it, or delete .instance to re-provision."
+  exit 1
+fi
+
+REGION=$(aws configure get region 2>/dev/null || echo "")
+if [[ -z "$REGION" ]]; then
+  echo "Error: No AWS region configured. Run: aws configure set region us-east-1"
+  exit 1
+fi
+
+echo "Provisioning Windows debug instance in $REGION..."
+
+# --- IAM Role for SSM ---
+ROLE_NAME="${IAM_ROLE_NAME:-$NAME_PREFIX-ssm-role}"
+PROFILE_NAME="${INSTANCE_PROFILE_NAME:-$NAME_PREFIX-instance-profile}"
+
+if aws iam get-instance-profile --instance-profile-name "$PROFILE_NAME" &>/dev/null; then
+  echo "Instance profile $PROFILE_NAME already exists, reusing."
+else
+  echo "Instance profile $PROFILE_NAME not found. Creating IAM resources..."
+
+  if ! aws iam get-role --role-name "$ROLE_NAME" &>/dev/null; then
+    echo "Creating IAM role: $ROLE_NAME"
+    if ! aws iam create-role \
+      --role-name "$ROLE_NAME" \
+      --assume-role-policy-document '{
+        "Version": "2012-10-17",
+        "Statement": [{
+          "Effect": "Allow",
+          "Principal": {"Service": "ec2.amazonaws.com"},
+          "Action": "sts:AssumeRole"
+        }]
+      }' \
+      --no-cli-pager; then
+
+      echo ""
+      echo "Error: Failed to create IAM role (see error above)."
+      echo ""
+      echo "Ask an IAM admin to create the following, then re-run with:"
+      echo " INSTANCE_PROFILE_NAME=<name> ./scripts/windows-debug/provision.sh"
+      echo ""
+      echo "What the admin needs to create:"
+      echo " 1. IAM Role: $ROLE_NAME"
+      echo " - Trusted entity: EC2 (ec2.amazonaws.com)"
+      echo " - Attached policy: AmazonSSMManagedInstanceCore"
+      echo " 2. Instance Profile: $PROFILE_NAME"
+      echo " - With the above role added to it"
+      echo ""
+      echo "Or run these commands with an account that has iam:CreateRole permission:"
+      echo ""
+      echo " aws iam create-role --role-name $ROLE_NAME \\"
+      echo " --assume-role-policy-document '{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"ec2.amazonaws.com\"},\"Action\":\"sts:AssumeRole\"}]}'"
+      echo ""
+      echo " aws iam attach-role-policy --role-name $ROLE_NAME \\"
+      echo " --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
+      echo ""
+      echo " aws iam create-instance-profile --instance-profile-name $PROFILE_NAME"
+      echo ""
+      echo " aws iam add-role-to-instance-profile \\"
+      echo " --instance-profile-name $PROFILE_NAME --role-name $ROLE_NAME"
+      exit 1
+    fi
+
+    aws iam attach-role-policy \
+      --role-name "$ROLE_NAME" \
+      --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
+  else
+    echo "IAM role $ROLE_NAME already exists."
+  fi
+
+  echo "Creating instance profile: $PROFILE_NAME"
+  aws iam create-instance-profile --instance-profile-name "$PROFILE_NAME" --no-cli-pager
+  aws iam add-role-to-instance-profile \
+    --instance-profile-name "$PROFILE_NAME" \
+    --role-name "$ROLE_NAME"
+  echo "Waiting for instance profile propagation..."
+  sleep 10
+fi
+
+# --- Security Group (no inbound rules) ---
+VPC_ID=$(aws ec2 describe-vpcs --filters "Name=isDefault,Values=true" --query "Vpcs[0].VpcId" --output text)
+if [[ "$VPC_ID" == "None" || -z "$VPC_ID" ]]; then
+  echo "Error: No default VPC found. Create one with: aws ec2 create-default-vpc"
+  exit 1
+fi
+
+SG_NAME="$NAME_PREFIX-sg"
+SG_ID=$(aws ec2 describe-security-groups \
+  --filters "Name=group-name,Values=$SG_NAME" "Name=vpc-id,Values=$VPC_ID" \
+  --query "SecurityGroups[0].GroupId" --output text 2>/dev/null || echo "None")
+
+if [[ "$SG_ID" == "None" || -z "$SG_ID" ]]; then
+  echo "Creating security group: $SG_NAME"
+  SG_ID=$(aws ec2 create-security-group \
+    --group-name "$SG_NAME" \
+    --description "agent-browser Windows debug instance (SSM only, no inbound)" \
+    --vpc-id "$VPC_ID" \
+    --query "GroupId" --output text)
+
+  # Revoke default egress isn't needed; SSM requires outbound HTTPS.
+  # No inbound rules -- SSM uses outbound connections only.
+else
+  echo "Security group $SG_NAME ($SG_ID) already exists, reusing."
+fi
+
+# --- AMI (latest Windows Server 2022) ---
+AMI_ID=$(aws ssm get-parameter \
+  --name "/aws/service/ami-windows-latest/Windows_Server-2022-English-Full-Base" \
+  --query "Parameter.Value" --output text)
+echo "Using AMI: $AMI_ID (Windows Server 2022)"
+
+# --- UserData bootstrap script ---
+USERDATA_FILE=$(mktemp)
+trap "rm -f $USERDATA_FILE" EXIT
+
+cat > "$USERDATA_FILE" <<'PWSH'
+<powershell>
+$ErrorActionPreference = "Continue"
+$logFile = "C:\bootstrap.log"
+
+function Log($msg) {
+  $ts = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
+  "$ts $msg" | Tee-Object -FilePath $logFile -Append
+}
+
+Log "--- Bootstrap starting ---"
+
+# Install Git
+Log "Installing Git..."
+$gitInstaller = "$env:TEMP\git-installer.exe"
+Invoke-WebRequest -Uri "https://github.com/git-for-windows/git/releases/download/v2.47.1.windows.2/Git-2.47.1.2-64-bit.exe" -OutFile $gitInstaller
+Start-Process -FilePath $gitInstaller -ArgumentList "/VERYSILENT /NORESTART /NOCANCEL /SP- /CLOSEAPPLICATIONS /RESTARTAPPLICATIONS /COMPONENTS=`"icons,ext\reg\shellhere,assoc,assoc_sh`"" -Wait
+$env:PATH = "C:\Program Files\Git\cmd;$env:PATH"
+[Environment]::SetEnvironmentVariable("PATH", "C:\Program Files\Git\cmd;$([Environment]::GetEnvironmentVariable('PATH', 'Machine'))", "Machine")
+Log "Git installed: $(git --version)"
+
+# Install Rust
+Log "Installing Rust..."
+$rustupInit = "$env:TEMP\rustup-init.exe"
+Invoke-WebRequest -Uri "https://win.rustup.rs/x86_64" -OutFile $rustupInit
+Start-Process -FilePath $rustupInit -ArgumentList "-y --default-toolchain stable" -Wait
+$env:PATH = "$env:USERPROFILE\.cargo\bin;$env:PATH"
+[Environment]::SetEnvironmentVariable("PATH", "$env:USERPROFILE\.cargo\bin;$([Environment]::GetEnvironmentVariable('PATH', 'Machine'))", "Machine")
+Log "Rust installed: $(rustc --version)"
+
+# Install MSVC build tools (required for Rust on Windows)
+Log "Installing Visual Studio Build Tools..."
+$vsInstaller = "$env:TEMP\vs_buildtools.exe"
+Invoke-WebRequest -Uri "https://aka.ms/vs/17/release/vs_buildtools.exe" -OutFile $vsInstaller
+Start-Process -FilePath $vsInstaller -ArgumentList "--quiet --wait --norestart --nocache --add Microsoft.VisualStudio.Workload.VCTools --includeRecommended" -Wait
+Log "Build tools installed."
+
+# Clone repo
+Log "Cloning agent-browser..."
+git clone https://github.com/vercel-labs/agent-browser.git C:\agent-browser
+Set-Location C:\agent-browser
+Log "Repo cloned."
+
+# Build CLI
+Log "Building agent-browser CLI..."
+cargo build --release --manifest-path cli\Cargo.toml
+Log "Build complete."
+
+# Install Chrome
+Log "Installing Chrome via agent-browser..."
+.\cli\target\release\agent-browser.exe install
+Log "Chrome installed."
+
+Log "--- Bootstrap complete ---"
+</powershell>
+PWSH
+
+# --- Launch instance ---
+echo "Launching $INSTANCE_TYPE instance..."
+INSTANCE_ID=$(aws ec2 run-instances \
+  --image-id "$AMI_ID" \
+  --instance-type "$INSTANCE_TYPE" \
+  --iam-instance-profile "Name=$PROFILE_NAME" \
+  --security-group-ids "$SG_ID" \
+  --user-data "file://$USERDATA_FILE" \
+  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":80,"VolumeType":"gp3"}}]' \
+  --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$NAME_PREFIX}]" \
+  --metadata-options "HttpTokens=required" \
+  --query "Instances[0].InstanceId" --output text)
+
+echo "Instance launched: $INSTANCE_ID"
+
+# Save instance config
+cat > "$INSTANCE_FILE" <<EOF
+INSTANCE_ID=$INSTANCE_ID
+REGION=$REGION
+EOF
+
+echo "Waiting for instance to enter running state..."
+aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"
+echo "Instance is running."
+
+echo ""
+echo "Instance $INSTANCE_ID is booting and bootstrapping (Rust, Git, Chrome)."
+echo "Bootstrap takes ~15-20 minutes on first boot."
+echo ""
+echo "Check bootstrap progress:"
+echo " ./scripts/windows-debug/run.sh \"Get-Content C:\\bootstrap.log\""
+echo ""
+echo "Once ready, sync your branch and start debugging:"
+echo " ./scripts/windows-debug/sync.sh"
+echo " ./scripts/windows-debug/run.sh \"cd C:\\agent-browser && cargo test\""
+echo ""
+echo "Stop when done to save costs:"
+echo " ./scripts/windows-debug/stop.sh"
package/scripts/windows-debug/run.sh
ADDED
@@ -0,0 +1,92 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+INSTANCE_FILE="$SCRIPT_DIR/.instance"
+
+if [[ ! -f "$INSTANCE_FILE" ]]; then
+  echo "Error: No instance provisioned. Run ./scripts/windows-debug/provision.sh first."
+  exit 1
+fi
+
+if [[ $# -eq 0 ]]; then
+  echo "Usage: ./scripts/windows-debug/run.sh \"<powershell-command>\""
+  echo ""
+  echo "Examples:"
+  echo " ./scripts/windows-debug/run.sh \"cd C:\\agent-browser && cargo test\""
+  echo " ./scripts/windows-debug/run.sh \"Get-Content C:\\bootstrap.log\""
+  echo " ./scripts/windows-debug/run.sh \"cd C:\\agent-browser && cargo test e2e -- --ignored --test-threads=1\""
+  exit 1
+fi
+
+source "$INSTANCE_FILE"
+export AWS_DEFAULT_REGION="$REGION"
+
+COMMAND="$*"
+
+PARAMS_FILE=$(mktemp)
+trap "rm -f $PARAMS_FILE" EXIT
+
+python3 -c '
+import json, sys
+path_setup = "$env:PATH = \"$env:USERPROFILE\\.cargo\\bin;C:\\Program Files\\Git\\cmd;$env:PATH\""
+cmd = path_setup + "\n" + sys.argv[1]
+json.dump({"commands": [cmd]}, open(sys.argv[2], "w"))
+' "$COMMAND" "$PARAMS_FILE"
+
+COMMAND_ID=$(aws ssm send-command \
+  --instance-ids "$INSTANCE_ID" \
+  --document-name "AWS-RunPowerShellScript" \
+  --parameters "file://$PARAMS_FILE" \
+  --timeout-seconds 3600 \
+  --query "Command.CommandId" --output text)
+
+echo "Command sent (ID: $COMMAND_ID). Waiting..." >&2
+
+while true; do
+  RESULT=$(aws ssm get-command-invocation \
+    --command-id "$COMMAND_ID" \
+    --instance-id "$INSTANCE_ID" \
+    --output json 2>&1) || true
+
+  STATUS=$(echo "$RESULT" | python3 -c "
+import sys, json
+try:
+    print(json.loads(sys.stdin.read()).get('Status', 'Unknown'))
+except:
+    print('Pending')
+" 2>/dev/null)
+
+  case "$STATUS" in
+    Success)
+      echo "$RESULT" | python3 -c "
+import sys, json
+r = json.loads(sys.stdin.read())
+out = r.get('StandardOutputContent', '').rstrip()
+err = r.get('StandardErrorContent', '').rstrip()
+if out:
+    print(out)
+if err:
+    print(err, file=sys.stderr)
+"
+      exit 0
+      ;;
+    Failed|TimedOut|Cancelled)
+      echo "$RESULT" | python3 -c "
+import sys, json
+r = json.loads(sys.stdin.read())
+out = r.get('StandardOutputContent', '').rstrip()
+err = r.get('StandardErrorContent', '').rstrip()
+if out:
+    print(out)
+if err:
+    print(err, file=sys.stderr)
+"
+      echo "Command $STATUS." >&2
+      exit 1
+      ;;
+    *)
+      sleep 3
+      ;;
+  esac
+done
package/scripts/windows-debug/start.sh
ADDED
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+INSTANCE_FILE="$SCRIPT_DIR/.instance"
+
+if [[ ! -f "$INSTANCE_FILE" ]]; then
+  echo "Error: No instance provisioned. Run ./scripts/windows-debug/provision.sh first."
+  exit 1
+fi
+
+source "$INSTANCE_FILE"
+export AWS_DEFAULT_REGION="$REGION"
+
+STATE=$(aws ec2 describe-instances \
+  --instance-ids "$INSTANCE_ID" \
+  --query "Reservations[0].Instances[0].State.Name" --output text)
+
+if [[ "$STATE" == "running" ]]; then
+  echo "Instance $INSTANCE_ID is already running."
+else
+  echo "Starting instance $INSTANCE_ID..."
+  aws ec2 start-instances --instance-ids "$INSTANCE_ID" --no-cli-pager
+  echo "Waiting for running state..."
+  aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"
+  echo "Instance is running."
+fi
+
+echo "Waiting for SSM agent connectivity..."
+for i in $(seq 1 30); do
+  SSM_STATUS=$(aws ssm describe-instance-information \
+    --filters "Key=InstanceIds,Values=$INSTANCE_ID" \
+    --query "InstanceInformationList[0].PingStatus" --output text 2>/dev/null || echo "None")
+  if [[ "$SSM_STATUS" == "Online" ]]; then
+    echo "SSM agent is online. Ready for commands."
+    echo " ./scripts/windows-debug/run.sh \"your-command-here\""
+    exit 0
+  fi
+  sleep 10
+done
+
+echo "Warning: SSM agent not online after 5 minutes. The instance may still be booting."
+echo "Try again in a minute: ./scripts/windows-debug/run.sh \"hostname\""
package/scripts/windows-debug/stop.sh
ADDED
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+INSTANCE_FILE="$SCRIPT_DIR/.instance"
+
+if [[ ! -f "$INSTANCE_FILE" ]]; then
+  echo "Error: No instance provisioned. Nothing to stop."
+  exit 1
+fi
+
+source "$INSTANCE_FILE"
+export AWS_DEFAULT_REGION="$REGION"
+
+STATE=$(aws ec2 describe-instances \
+  --instance-ids "$INSTANCE_ID" \
+  --query "Reservations[0].Instances[0].State.Name" --output text)
+
+if [[ "$STATE" == "stopped" ]]; then
+  echo "Instance $INSTANCE_ID is already stopped."
+  exit 0
+fi
+
+echo "Stopping instance $INSTANCE_ID..."
+aws ec2 stop-instances --instance-ids "$INSTANCE_ID" --no-cli-pager
+echo "Waiting for stopped state..."
+aws ec2 wait instance-stopped --instance-ids "$INSTANCE_ID"
+echo "Instance stopped. No compute charges while stopped (storage only: ~\$0.64/mo)."
package/scripts/windows-debug/sync.sh
ADDED
@@ -0,0 +1,27 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+RUN="$SCRIPT_DIR/run.sh"
+
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "main")
+REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "https://github.com/vercel-labs/agent-browser.git")
+
+echo "Syncing branch '$BRANCH' on Windows instance..."
+
+"$RUN" "
+cd C:\agent-browser
+git remote set-url origin '$REMOTE_URL'
+git fetch origin
+git checkout -B '$BRANCH' 'origin/$BRANCH'
+git log -1 --oneline
+"
+
+echo ""
+echo "Branch synced. Rebuilding..."
+
+"$RUN" "
+cd C:\agent-browser
+cargo build --release --manifest-path cli\Cargo.toml
+Write-Host 'Build complete.'
+"
package/skills/agent-browser/SKILL.md
CHANGED
@@ -6,7 +6,7 @@ allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*)
 
 # Browser Automation with agent-browser
 
-The CLI uses Chrome/Chromium via CDP directly. Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome. Run `agent-browser upgrade` to update to the latest version.
+The CLI uses Chrome/Chromium via CDP directly. Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome. Existing Chrome, Brave, Playwright, and Puppeteer installations are detected automatically. Run `agent-browser upgrade` to update to the latest version.
 
 ## Core Workflow
 
@@ -110,6 +110,7 @@ See [references/authentication.md](references/authentication.md) for OAuth, 2FA,
 # Navigation
 agent-browser open <url> # Navigate (aliases: goto, navigate)
 agent-browser close # Close browser
+agent-browser close --all # Close all active sessions
 
 # Snapshot
 agent-browser snapshot -i # Interactive elements with refs (recommended)
@@ -183,7 +184,10 @@ agent-browser clipboard write "Hello, World!" # Write text to clipboard
 agent-browser clipboard copy # Copy current selection
 agent-browser clipboard paste # Paste from clipboard
 
-# Dialogs (alert, confirm, prompt)
+# Dialogs (alert, confirm, prompt, beforeunload)
+# By default, alert and beforeunload dialogs are auto-accepted so they never block the agent.
+# confirm and prompt dialogs still require explicit handling.
+# Use --no-auto-dialog (or AGENT_BROWSER_NO_AUTO_DIALOG=1) to disable automatic handling.
 agent-browser dialog accept # Accept dialog
 agent-browser dialog accept "my input" # Accept prompt dialog with text
 agent-browser dialog dismiss # Dismiss/cancel dialog
@@ -198,11 +202,9 @@ agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait str
 agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
 ```
 
-##
+## Streaming
 
-
-
-If streaming must be present from the first daemon command, `AGENT_BROWSER_STREAM_PORT` still works at daemon startup, but that environment variable is not retroactive for sessions that are already running.
+Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `agent-browser stream status` to see the bound port and connection state. Use `stream disable` to tear it down, and `stream enable --port <port>` to re-enable on a specific port.
 
 ## Batch Execution
 
@@ -578,9 +580,10 @@ Always close your browser session when done to avoid leaked processes:
 ```bash
 agent-browser close # Close default session
 agent-browser --session agent1 close # Close specific session
+agent-browser close --all # Close all active sessions
 ```
 
-If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up
+If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up, or `agent-browser close --all` to shut down every session at once.
 
 To auto-shutdown the daemon after a period of inactivity (useful for ephemeral/CI environments):
 
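The cleanup advice in the hunk above (close sessions when done, and use `close --all` to shut everything down) can be built into a wrapper so that cleanup runs even when the wrapped command fails. The following is a minimal sketch under that assumption. `run_then_close_all` and the `AB_BIN` override are hypothetical names introduced for illustration; only `close --all` comes from the diff.

```shell
#!/usr/bin/env bash
# Name of the CLI binary; overridable so the sketch can be run against a stub.
AB_BIN="${AB_BIN:-agent-browser}"

# Hypothetical wrapper: run a command, then close every session regardless of
# whether the command succeeded, and return the command's own exit status.
run_then_close_all() {
  rc=0
  "$@" || rc=$?
  # Best-effort cleanup; a failure here should not mask the original status.
  "$AB_BIN" close --all >/dev/null 2>&1 || true
  return "$rc"
}
```

For example, `run_then_close_all ./my-agent-task.sh` (a placeholder script name) would leave no sessions behind whether the task passes or fails.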
@@ -712,6 +715,26 @@ Supported engines:
 
 Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
 
+## Observability Dashboard
+
+The dashboard is a standalone background server that shows live browser viewports, command activity, and console output for all sessions.
+
+```bash
+# Install the dashboard once
+agent-browser dashboard install
+
+# Start the dashboard server (background, port 4848)
+agent-browser dashboard start
+
+# All sessions are automatically visible in the dashboard
+agent-browser open example.com
+
+# Stop the dashboard
+agent-browser dashboard stop
+```
+
+The dashboard runs independently of browser sessions on port 4848 (configurable with `--port`). All sessions automatically stream to the dashboard.
+
 ## Ready-to-Use Templates
 
 | Template | Description |
package/skills/agent-browser/references/commands.md
CHANGED
@@ -209,9 +209,12 @@ The `frame` command accepts:
 
 ## Dialogs
 
+By default, `alert` and `beforeunload` dialogs are automatically accepted so they never block the agent. `confirm` and `prompt` dialogs still require explicit handling. Use `--no-auto-dialog` to disable this behavior.
+
 ```bash
 agent-browser dialog accept [text] # Accept dialog
 agent-browser dialog dismiss # Dismiss dialog
+agent-browser dialog status # Check if a dialog is currently open
 ```
 
 ## JavaScript
@@ -287,6 +290,6 @@ AGENT_BROWSER_SESSION="mysession" # Default session name
 AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
 AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
 AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
-AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
+AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
 AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
 ```