@sentry/junior-agent-browser 0.4.1

@@ -0,0 +1,193 @@
1
+ # Session Management
2
+
3
+ Multiple isolated browser sessions with state persistence and concurrent browsing.
4
+
5
+ **Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start.
6
+
7
+ ## Contents
8
+
9
+ - [Named Sessions](#named-sessions)
10
+ - [Session Isolation Properties](#session-isolation-properties)
11
+ - [Session State Persistence](#session-state-persistence)
12
+ - [Common Patterns](#common-patterns)
13
+ - [Default Session](#default-session)
14
+ - [Session Cleanup](#session-cleanup)
15
+ - [Best Practices](#best-practices)
16
+
17
+ ## Named Sessions
18
+
19
+ Use the `--session` flag to isolate browser contexts:
20
+
21
+ ```bash
22
+ # Session 1: Authentication flow
23
+ agent-browser --session auth open https://app.example.com/login
24
+
25
+ # Session 2: Public browsing (separate cookies, storage)
26
+ agent-browser --session public open https://example.com
27
+
28
+ # Commands are isolated by session
29
+ agent-browser --session auth fill @e1 "user@example.com"
30
+ agent-browser --session public get text body
31
+ ```
32
+
33
+ ## Session Isolation Properties
34
+
35
+ Each session has its own:
36
+ - Cookies
37
+ - LocalStorage / SessionStorage
38
+ - IndexedDB
39
+ - Cache
40
+ - Browsing history
41
+ - Open tabs
42
+
43
+ ## Session State Persistence
44
+
45
+ ### Save Session State
46
+
47
+ ```bash
48
+ # Save cookies, storage, and auth state
49
+ agent-browser state save /path/to/auth-state.json
50
+ ```
51
+
52
+ ### Load Session State
53
+
54
+ ```bash
55
+ # Restore saved state
56
+ agent-browser state load /path/to/auth-state.json
57
+
58
+ # Continue with authenticated session
59
+ agent-browser open https://app.example.com/dashboard
60
+ ```
61
+
62
+ ### State File Contents
63
+
64
+ ```json
65
+ {
66
+     "cookies": [...],
67
+     "localStorage": {...},
68
+     "sessionStorage": {...},
69
+     "origins": [...]
70
+ }
71
+ ```
72
+
73
+ ## Common Patterns
74
+
75
+ ### Authenticated Session Reuse
76
+
77
+ ```bash
78
+ #!/bin/bash
79
+ # Save login state once, reuse many times
80
+
81
+ STATE_FILE="/tmp/auth-state.json"
82
+
83
+ # Check if we have saved state
84
+ if [[ -f "$STATE_FILE" ]]; then
85
+     agent-browser state load "$STATE_FILE"
86
+     agent-browser open https://app.example.com/dashboard
87
+ else
88
+     # Perform login
89
+     agent-browser open https://app.example.com/login
90
+     agent-browser snapshot -i
91
+     agent-browser fill @e1 "$USERNAME"
92
+     agent-browser fill @e2 "$PASSWORD"
93
+     agent-browser click @e3
94
+     agent-browser wait --load networkidle
95
+
96
+     # Save for future use
97
+     agent-browser state save "$STATE_FILE"
98
+ fi
99
+ ```
100
+
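The reuse script above trusts any state file that exists, even one whose cookies expired long ago. A minimal sketch of an age check follows. The helper name `state_is_fresh` and the 30-minute cutoff are illustrative assumptions, and `find -mmin` is available in GNU and BSD `find` but is not part of POSIX.

```shell
# Hypothetical helper: succeed only if the state file exists and was
# modified less than max_age_min minutes ago.
state_is_fresh() {
  state_file=$1
  max_age_min=${2:-60}
  [ -f "$state_file" ] || return 1
  # find prints the path only when the file is newer than the cutoff.
  [ -n "$(find "$state_file" -mmin "-$max_age_min" 2>/dev/null)" ]
}

STATE_FILE="${STATE_FILE:-/tmp/auth-state.json}"
if state_is_fresh "$STATE_FILE" 30; then
  echo "reuse saved state: $STATE_FILE"
else
  echo "login required"
fi
```

A file that is fresh by age can still hold a revoked session, so a check of the landing URL after loading is still worthwhile.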
101
+ ### Concurrent Scraping
102
+
103
+ ```bash
104
+ #!/bin/bash
105
+ # Scrape multiple sites concurrently
106
+
107
+ # Start all sessions
108
+ agent-browser --session site1 open https://site1.com &
109
+ agent-browser --session site2 open https://site2.com &
110
+ agent-browser --session site3 open https://site3.com &
111
+ wait
112
+
113
+ # Extract from each
114
+ agent-browser --session site1 get text body > site1.txt
115
+ agent-browser --session site2 get text body > site2.txt
116
+ agent-browser --session site3 get text body > site3.txt
117
+
118
+ # Cleanup
119
+ agent-browser --session site1 close
120
+ agent-browser --session site2 close
121
+ agent-browser --session site3 close
122
+ ```
123
+
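The three-site script above repeats each command per site. The same open, extract, close sequence can be sketched as a loop. The function name `scrape_all`, the `name=url` argument format, and the `AB` override variable are assumptions made for this example. Only the `agent-browser` subcommands already shown above are used.

```shell
# AB lets the binary be swapped out (for example with a stub in tests).
AB="${AB:-agent-browser}"

# Usage: scrape_all OUT_DIR name=url [name=url ...]
scrape_all() {
  out_dir=$1
  shift
  # Open every session in the background, then wait for all of them.
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    url=${pair#*=}     # text after the first "="
    "$AB" --session "$name" open "$url" &
  done
  wait
  # Extract the body text from each session, then close it.
  for pair in "$@"; do
    name=${pair%%=*}
    "$AB" --session "$name" get text body > "$out_dir/$name.txt"
    "$AB" --session "$name" close
  done
}

# Example (not executed here):
#   scrape_all ./out site1=https://site1.com site2=https://site2.com
```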
124
+ ### A/B Testing Sessions
125
+
126
+ ```bash
127
+ # Test different user experiences
128
+ agent-browser --session variant-a open "https://app.com?variant=a"
129
+ agent-browser --session variant-b open "https://app.com?variant=b"
130
+
131
+ # Compare
132
+ agent-browser --session variant-a screenshot /tmp/variant-a.png
133
+ agent-browser --session variant-b screenshot /tmp/variant-b.png
134
+ ```
135
+
136
+ ## Default Session
137
+
138
+ When `--session` is omitted, commands use the default session:
139
+
140
+ ```bash
141
+ # These use the same default session
142
+ agent-browser open https://example.com
143
+ agent-browser snapshot -i
144
+ agent-browser close # Closes default session
145
+ ```
146
+
147
+ ## Session Cleanup
148
+
149
+ ```bash
150
+ # Close specific session
151
+ agent-browser --session auth close
152
+
153
+ # List active sessions
154
+ agent-browser session list
155
+ ```
156
+
157
+ ## Best Practices
158
+
159
+ ### 1. Name Sessions Semantically
160
+
161
+ ```bash
162
+ # GOOD: Clear purpose
163
+ agent-browser --session github-auth open https://github.com
164
+ agent-browser --session docs-scrape open https://docs.example.com
165
+
166
+ # AVOID: Generic names
167
+ agent-browser --session s1 open https://github.com
168
+ ```
169
+
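When session names are generated by a script, one way to keep them meaningful is to derive them from a purpose and the target host. The sketch below is illustrative only: `session_name` is a hypothetical helper, and the restriction to lowercase letters, digits, and hyphens is a conservative assumption rather than a documented requirement of `--session`.

```shell
# Hypothetical helper: build "<purpose>-<host>" from a purpose and a URL.
session_name() {
  purpose=$1
  url=$2
  host=${url#*://}       # drop the scheme, if any
  host=${host%%/*}       # drop the path
  host=${host%%[:?#]*}   # drop port, query string, fragment
  printf '%s-%s\n' "$purpose" "$host" |
    tr '[:upper:]' '[:lower:]' |
    tr -c 'a-z0-9\n' '-'
}

session_name scrape https://docs.example.com/guide
```

The result can be passed straight to the flag, for example `agent-browser --session "$(session_name scrape "$URL")" open "$URL"`.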
170
+ ### 2. Always Clean Up
171
+
172
+ ```bash
173
+ # Close sessions when done
174
+ agent-browser --session auth close
175
+ agent-browser --session scrape close
176
+ ```
177
+
178
+ ### 3. Handle State Files Securely
179
+
180
+ ```bash
181
+ # Don't commit state files (contain auth tokens!)
182
+ echo "*auth-state.json" >> .gitignore
183
+
184
+ # Delete after use
185
+ rm /tmp/auth-state.json
186
+ ```
187
+
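A fixed path such as `/tmp/auth-state.json` is predictable and may be readable by other users on a shared machine. A minimal sketch of an alternative, assuming `mktemp -d` is available (it creates a directory with owner-only permissions), is to keep the state file in a private directory that is removed when the script exits. The variable names are illustrative.

```shell
# Private directory (mode 0700) for this run's state file.
STATE_DIR=$(mktemp -d)
STATE_FILE="$STATE_DIR/auth-state.json"

# Remove the directory and everything in it when the script exits.
trap 'rm -rf "$STATE_DIR"' EXIT

echo "state file for this run: $STATE_FILE"

# Then use it with the commands shown earlier, for example:
#   agent-browser state save "$STATE_FILE"
#   agent-browser state load "$STATE_FILE"
```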
188
+ ### 4. Timeout Long Sessions
189
+
190
+ ```bash
191
+ # Set timeout for automated scripts
192
+ timeout 60 agent-browser --session long-task get text body
193
+ ```
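
GNU `timeout` exits with status 124 when it has to kill the command, which lets a script tell a timeout apart from an ordinary failure. The wrapper below is a sketch under that assumption. `run_with_timeout` is a hypothetical name, and implementations other than GNU coreutils may use a different exit status.

```shell
# Hypothetical wrapper: run a command under timeout and report timeouts.
run_with_timeout() {
  seconds=$1
  shift
  status=0
  timeout "$seconds" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${seconds}s: $*" >&2
  fi
  return "$status"
}

# Example (not executed here): close the session if the command fails or times out.
#   run_with_timeout 60 agent-browser --session long-task get text body ||
#     agent-browser --session long-task close
```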
@@ -0,0 +1,194 @@
1
+ # Snapshot and Refs
2
+
3
+ Compact element references that reduce context usage dramatically for AI agents.
4
+
5
+ **Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
6
+
7
+ ## Contents
8
+
9
+ - [How Refs Work](#how-refs-work)
10
+ - [The Snapshot Command](#the-snapshot-command)
11
+ - [Using Refs](#using-refs)
12
+ - [Ref Lifecycle](#ref-lifecycle)
13
+ - [Best Practices](#best-practices)
14
+ - [Ref Notation Details](#ref-notation-details)
15
+ - [Troubleshooting](#troubleshooting)
16
+
17
+ ## How Refs Work
18
+
19
+ Traditional approach:
20
+ ```
21
+ Full DOM/HTML → AI parses → CSS selector → Action (~3000-5000 tokens)
22
+ ```
23
+
24
+ agent-browser approach:
25
+ ```
26
+ Compact snapshot → @refs assigned → Direct interaction (~200-400 tokens)
27
+ ```
28
+
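To see what the difference means over a whole task, multiply the per-step estimates above by the number of steps. The figures are the rough per-step ranges quoted above, not measurements, and the 20-step workflow is a hypothetical example.

```shell
# Multiply a per-step token range by the number of steps.
token_range() {
  steps=$1
  low=$2
  high=$3
  echo "$((steps * low))-$((steps * high))"
}

echo "full DOM, 20 steps:        $(token_range 20 3000 5000) tokens"
echo "snapshot + refs, 20 steps: $(token_range 20 200 400) tokens"
```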
29
+ ## The Snapshot Command
30
+
31
+ ```bash
32
+ # Basic snapshot (shows page structure)
33
+ agent-browser snapshot
34
+
35
+ # Interactive snapshot (-i flag) - RECOMMENDED
36
+ agent-browser snapshot -i
37
+ ```
38
+
39
+ ### Snapshot Output Format
40
+
41
+ ```
42
+ Page: Example Site - Home
43
+ URL: https://example.com
44
+
45
+ @e1 [header]
46
+   @e2 [nav]
47
+     @e3 [a] "Home"
48
+     @e4 [a] "Products"
49
+     @e5 [a] "About"
50
+   @e6 [button] "Sign In"
51
+
52
+ @e7 [main]
53
+   @e8 [h1] "Welcome"
54
+   @e9 [form]
55
+     @e10 [input type="email"] placeholder="Email"
56
+     @e11 [input type="password"] placeholder="Password"
57
+     @e12 [button type="submit"] "Log In"
58
+
59
+ @e13 [footer]
60
+   @e14 [a] "Privacy Policy"
61
+ ```
62
+
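Because each snapshot line starts with its ref, a script can look up a ref by the quoted text on the same line. The sketch below assumes the snapshot is plain text in the format shown above. `ref_for_label` is a hypothetical helper, and it returns the first match only.

```shell
# Hypothetical helper: read snapshot text on stdin and print the ref of the
# first line that contains the given text in double quotes.
ref_for_label() {
  awk -v want="\"$1\"" '$1 ~ /^@e[0-9]+$/ && index($0, want) { print $1; exit }'
}

# Demonstration with two lines taken from the sample output above.
ref_for_label "Sign In" <<'EOF'
@e5 [a] "About"
@e6 [button] "Sign In"
EOF
```

If the snapshot command writes this format to standard output, `agent-browser snapshot -i | ref_for_label "Sign In"` would be the equivalent live call. Quoted attribute values such as `placeholder="Email"` match as well.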
63
+ ## Using Refs
64
+
65
+ Once you have refs, interact directly:
66
+
67
+ ```bash
68
+ # Click the "Sign In" button
69
+ agent-browser click @e6
70
+
71
+ # Fill email input
72
+ agent-browser fill @e10 "user@example.com"
73
+
74
+ # Fill password
75
+ agent-browser fill @e11 "password123"
76
+
77
+ # Submit the form
78
+ agent-browser click @e12
79
+ ```
80
+
81
+ ## Ref Lifecycle
82
+
83
+ **IMPORTANT**: Refs are invalidated when the page changes!
84
+
85
+ ```bash
86
+ # Get initial snapshot
87
+ agent-browser snapshot -i
88
+ # @e1 [button] "Next"
89
+
90
+ # Click triggers page change
91
+ agent-browser click @e1
92
+
93
+ # MUST re-snapshot to get new refs!
94
+ agent-browser snapshot -i
95
+ # @e1 [h1] "Page 2" ← Different element now!
96
+ ```
97
+
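One way to avoid acting on stale refs is to bundle every page-changing action with a fresh snapshot. The wrapper below is a sketch: `act_and_resnapshot` and the `AB` override variable are illustrative names, and the intermediate `wait --load networkidle` step is borrowed from the login examples elsewhere in this package.

```shell
# AB lets the binary be swapped out (for example with a stub in tests).
AB="${AB:-agent-browser}"

# Run one action, wait for the page to settle, then print a new snapshot.
act_and_resnapshot() {
  "$AB" "$@" &&
    "$AB" wait --load networkidle &&
    "$AB" snapshot -i
}

# Example (not executed here):
#   act_and_resnapshot click @e1
```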
98
+ ## Best Practices
99
+
100
+ ### 1. Always Snapshot Before Interacting
101
+
102
+ ```bash
103
+ # CORRECT
104
+ agent-browser open https://example.com
105
+ agent-browser snapshot -i # Get refs first
106
+ agent-browser click @e1 # Use ref
107
+
108
+ # WRONG
109
+ agent-browser open https://example.com
110
+ agent-browser click @e1 # Ref doesn't exist yet!
111
+ ```
112
+
113
+ ### 2. Re-Snapshot After Navigation
114
+
115
+ ```bash
116
+ agent-browser click @e5 # Navigates to new page
117
+ agent-browser snapshot -i # Get new refs
118
+ agent-browser click @e1 # Use new refs
119
+ ```
120
+
121
+ ### 3. Re-Snapshot After Dynamic Changes
122
+
123
+ ```bash
124
+ agent-browser click @e1 # Opens dropdown
125
+ agent-browser snapshot -i # See dropdown items
126
+ agent-browser click @e7 # Select item
127
+ ```
128
+
129
+ ### 4. Snapshot Specific Regions
130
+
131
+ For complex pages, snapshot specific areas:
132
+
133
+ ```bash
134
+ # Snapshot just the form
135
+ agent-browser snapshot @e9
136
+ ```
137
+
138
+ ## Ref Notation Details
139
+
140
+ ```
141
+ @e1 [tag type="value"] "text content" placeholder="hint"
142
+ │    │   │             │              │
143
+ │    │   │             │              └─ Additional attributes
144
+ │    │   │             └─ Visible text
145
+ │    │   └─ Key attributes shown
146
+ │    └─ HTML tag name
147
+ └─ Unique ref ID
148
+ ```
149
+
150
+ ### Common Patterns
151
+
152
+ ```
153
+ @e1 [button] "Submit" # Button with text
154
+ @e2 [input type="email"] # Email input
155
+ @e3 [input type="password"] # Password input
156
+ @e4 [a href="/page"] "Link Text" # Anchor link
157
+ @e5 [select] # Dropdown
158
+ @e6 [textarea] placeholder="Message" # Text area
159
+ @e7 [div class="modal"] # Container (when relevant)
160
+ @e8 [img alt="Logo"] # Image
161
+ @e9 [checkbox] checked # Checked checkbox
162
+ @e10 [radio] selected # Selected radio
163
+ ```
164
+
165
+ ## Troubleshooting
166
+
167
+ ### "Ref not found" Error
168
+
169
+ ```bash
170
+ # Ref may have changed - re-snapshot
171
+ agent-browser snapshot -i
172
+ ```
173
+
174
+ ### Element Not Visible in Snapshot
175
+
176
+ ```bash
177
+ # Scroll down to reveal element
178
+ agent-browser scroll down 1000
179
+ agent-browser snapshot -i
180
+
181
+ # Or wait for dynamic content
182
+ agent-browser wait 1000
183
+ agent-browser snapshot -i
184
+ ```
185
+
186
+ ### Too Many Elements
187
+
188
+ ```bash
189
+ # Snapshot specific container
190
+ agent-browser snapshot @e5
191
+
192
+ # Or use get text for content-only extraction
193
+ agent-browser get text @e5
194
+ ```
@@ -0,0 +1,173 @@
1
+ # Video Recording
2
+
3
+ Capture browser automation as video for debugging, documentation, or verification.
4
+
5
+ **Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
6
+
7
+ ## Contents
8
+
9
+ - [Basic Recording](#basic-recording)
10
+ - [Recording Commands](#recording-commands)
11
+ - [Use Cases](#use-cases)
12
+ - [Best Practices](#best-practices)
13
+ - [Output Format](#output-format)
14
+ - [Limitations](#limitations)
15
+
16
+ ## Basic Recording
17
+
18
+ ```bash
19
+ # Start recording
20
+ agent-browser record start ./demo.webm
21
+
22
+ # Perform actions
23
+ agent-browser open https://example.com
24
+ agent-browser snapshot -i
25
+ agent-browser click @e1
26
+ agent-browser fill @e2 "test input"
27
+
28
+ # Stop and save
29
+ agent-browser record stop
30
+ ```
31
+
32
+ ## Recording Commands
33
+
34
+ ```bash
35
+ # Start recording to file
36
+ agent-browser record start ./output.webm
37
+
38
+ # Stop current recording
39
+ agent-browser record stop
40
+
41
+ # Restart with new file (stops current + starts new)
42
+ agent-browser record restart ./take2.webm
43
+ ```
44
+
45
+ ## Use Cases
46
+
47
+ ### Debugging Failed Automation
48
+
49
+ ```bash
50
+ #!/bin/bash
51
+ # Record automation for debugging
52
+
53
+ agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
54
+
55
+ # Run your automation
56
+ agent-browser open https://app.example.com
57
+ agent-browser snapshot -i
58
+ agent-browser click @e1 || {
59
+     echo "Click failed - check recording"
60
+     agent-browser record stop
61
+     exit 1
62
+ }
63
+
64
+ agent-browser record stop
65
+ ```
66
+
67
+ ### Documentation Generation
68
+
69
+ ```bash
70
+ #!/bin/bash
71
+ # Record workflow for documentation
72
+
73
+ agent-browser record start ./docs/how-to-login.webm
74
+
75
+ agent-browser open https://app.example.com/login
76
+ agent-browser wait 1000 # Pause for visibility
77
+
78
+ agent-browser snapshot -i
79
+ agent-browser fill @e1 "demo@example.com"
80
+ agent-browser wait 500
81
+
82
+ agent-browser fill @e2 "password"
83
+ agent-browser wait 500
84
+
85
+ agent-browser click @e3
86
+ agent-browser wait --load networkidle
87
+ agent-browser wait 1000 # Show result
88
+
89
+ agent-browser record stop
90
+ ```
91
+
92
+ ### CI/CD Test Evidence
93
+
94
+ ```bash
95
+ #!/bin/bash
96
+ # Record E2E test runs for CI artifacts
97
+
98
+ TEST_NAME="${1:-e2e-test}"
99
+ RECORDING_DIR="./test-recordings"
100
+ mkdir -p "$RECORDING_DIR"
101
+
102
+ agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm"
103
+
104
+ # Run test (run_e2e_test is a placeholder for your own test command)
105
+ if run_e2e_test; then
106
+     echo "Test passed"
107
+ else
108
+     echo "Test failed - recording saved"
109
+ fi
110
+
111
+ agent-browser record stop
112
+ ```
113
+
114
+ ## Best Practices
115
+
116
+ ### 1. Add Pauses for Clarity
117
+
118
+ ```bash
119
+ # Slow down for human viewing
120
+ agent-browser click @e1
121
+ agent-browser wait 500 # Let viewer see result
122
+ ```
123
+
124
+ ### 2. Use Descriptive Filenames
125
+
126
+ ```bash
127
+ # Include context in filename
128
+ agent-browser record start ./recordings/login-flow-2024-01-15.webm
129
+ agent-browser record start ./recordings/checkout-test-run-42.webm
130
+ ```
131
+
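Descriptive names are easier to keep consistent when a helper builds them. The sketch below assumes a hypothetical function `recording_path` that combines a directory, a sanitized label, and a timestamp. The naming scheme is an illustration, not something `agent-browser` requires.

```shell
# Hypothetical helper: print "<dir>/<label>-<timestamp>.webm" and make sure
# the directory exists.
recording_path() {
  dir=$1
  label=$2
  stamp=$(date +%Y%m%d-%H%M%S)
  # Replace anything other than letters, digits, dot, underscore, hyphen.
  safe=$(printf '%s' "$label" | tr -c 'A-Za-z0-9._-' '-')
  mkdir -p "$dir"
  printf '%s/%s-%s.webm\n' "$dir" "$safe" "$stamp"
}

# Example (not executed here):
#   agent-browser record start "$(recording_path ./recordings "login flow")"
```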
132
+ ### 3. Handle Recording in Error Cases
133
+
134
+ ```bash
135
+ #!/bin/bash
136
+ set -e
137
+
138
+ cleanup() {
139
+     agent-browser record stop 2>/dev/null || true
140
+     agent-browser close 2>/dev/null || true
141
+ }
142
+ trap cleanup EXIT
143
+
144
+ agent-browser record start ./automation.webm
145
+ # ... automation steps ...
146
+ ```
147
+
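An alternative to a script-wide trap is a wrapper that records one command and always stops the recording afterwards, passing the command's exit status through. This is a sketch: `with_recording` and the `AB` override variable are illustrative names, and only the `record start` and `record stop` subcommands shown above are used.

```shell
# AB lets the binary be swapped out (for example with a stub in tests).
AB="${AB:-agent-browser}"

# Usage: with_recording FILE COMMAND [ARGS...]
with_recording() {
  file=$1
  shift
  "$AB" record start "$file" || return
  status=0
  "$@" || status=$?
  # Stop the recording whether or not the command succeeded.
  "$AB" record stop || true
  return "$status"
}

# Example (not executed here):
#   with_recording ./automation.webm ./run-checkout-flow.sh
```

The wrapped command's exit status is returned unchanged, so the caller can still fail a CI job on errors.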
148
+ ### 4. Combine with Screenshots
149
+
150
+ ```bash
151
+ # Record video AND capture key frames
152
+ agent-browser record start ./flow.webm
153
+
154
+ agent-browser open https://example.com
155
+ agent-browser screenshot ./screenshots/step1-homepage.png
156
+
157
+ agent-browser click @e1
158
+ agent-browser screenshot ./screenshots/step2-after-click.png
159
+
160
+ agent-browser record stop
161
+ ```
162
+
163
+ ## Output Format
164
+
165
+ - Default format: WebM (VP8/VP9 codec)
166
+ - Compatible with all modern browsers and video players
167
+ - Compressed but high quality
168
+
169
+ ## Limitations
170
+
171
+ - Recording adds slight overhead to automation
172
+ - Large recordings can consume significant disk space
173
+ - Some headless environments may have codec limitations
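
Since recordings can accumulate, a periodic cleanup keeps disk usage bounded. The sketch below assumes a hypothetical helper `prune_recordings` and a 7-day default retention. It deletes `.webm` files older than the cutoff and prints the directory's remaining size in KiB.

```shell
# Hypothetical helper: delete old recordings, then report remaining size.
prune_recordings() {
  dir=$1
  max_age_days=${2:-7}
  [ -d "$dir" ] || return 0
  # -mtime +N matches files last modified more than N days ago.
  find "$dir" -type f -name '*.webm' -mtime +"$max_age_days" -exec rm -f {} +
  du -sk "$dir" | cut -f1
}

# Example (not executed here):
#   prune_recordings ./test-recordings 14
```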
@@ -0,0 +1,105 @@
1
+ #!/bin/bash
2
+ # Template: Authenticated Session Workflow
3
+ # Purpose: Login once, save state, reuse for subsequent runs
4
+ # Usage: ./authenticated-session.sh <login-url> [state-file]
5
+ #
6
+ # RECOMMENDED: Use the auth vault instead of this template:
7
+ # echo "<pass>" | agent-browser auth save myapp --url <login-url> --username <user> --password-stdin
8
+ # agent-browser auth login myapp
9
+ # The auth vault stores credentials securely and the LLM never sees passwords.
10
+ #
11
+ # Environment variables:
12
+ # APP_USERNAME - Login username/email
13
+ # APP_PASSWORD - Login password
14
+ #
15
+ # Two modes:
16
+ # 1. Discovery mode (default): Shows form structure so you can identify refs
17
+ # 2. Login mode: Performs actual login after you update the refs
18
+ #
19
+ # Setup steps:
20
+ # 1. Run once to see form structure (discovery mode)
21
+ # 2. Update refs in LOGIN FLOW section below
22
+ # 3. Set APP_USERNAME and APP_PASSWORD
23
+ # 4. Delete the DISCOVERY section
24
+
25
+ set -euo pipefail
26
+
27
+ LOGIN_URL="${1:?Usage: $0 <login-url> [state-file]}"
28
+ STATE_FILE="${2:-./auth-state.json}"
29
+
30
+ echo "Authentication workflow: $LOGIN_URL"
31
+
32
+ # ================================================================
33
+ # SAVED STATE: Skip login if valid saved state exists
34
+ # ================================================================
35
+ if [[ -f "$STATE_FILE" ]]; then
36
+     echo "Loading saved state from $STATE_FILE..."
37
+     if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then
38
+         agent-browser wait --load networkidle
39
+
40
+         CURRENT_URL=$(agent-browser get url)
41
+         if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
42
+             echo "Session restored successfully"
43
+             agent-browser snapshot -i
44
+             exit 0
45
+         fi
46
+         echo "Session expired, performing fresh login..."
47
+         agent-browser close 2>/dev/null || true
48
+     else
49
+         echo "Failed to load state, re-authenticating..."
50
+     fi
51
+     rm -f "$STATE_FILE"
52
+ fi
53
+
54
+ # ================================================================
55
+ # DISCOVERY MODE: Shows form structure (delete after setup)
56
+ # ================================================================
57
+ echo "Opening login page..."
58
+ agent-browser open "$LOGIN_URL"
59
+ agent-browser wait --load networkidle
60
+
61
+ echo ""
62
+ echo "Login form structure:"
63
+ echo "---"
64
+ agent-browser snapshot -i
65
+ echo "---"
66
+ echo ""
67
+ echo "Next steps:"
68
+ echo " 1. Note the refs: username=@e?, password=@e?, submit=@e?"
69
+ echo " 2. Update the LOGIN FLOW section below with your refs"
70
+ echo " 3. Set: export APP_USERNAME='...' APP_PASSWORD='...'"
71
+ echo " 4. Delete this DISCOVERY MODE section"
72
+ echo ""
73
+ agent-browser close
74
+ exit 0
75
+
76
+ # ================================================================
77
+ # LOGIN FLOW: Uncomment and customize after discovery
78
+ # ================================================================
79
+ # : "${APP_USERNAME:?Set APP_USERNAME environment variable}"
80
+ # : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}"
81
+ #
82
+ # agent-browser open "$LOGIN_URL"
83
+ # agent-browser wait --load networkidle
84
+ # agent-browser snapshot -i
85
+ #
86
+ # # Fill credentials (update refs to match your form)
87
+ # agent-browser fill @e1 "$APP_USERNAME"
88
+ # agent-browser fill @e2 "$APP_PASSWORD"
89
+ # agent-browser click @e3
90
+ # agent-browser wait --load networkidle
91
+ #
92
+ # # Verify login succeeded
93
+ # FINAL_URL=$(agent-browser get url)
94
+ # if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then
95
+ #     echo "Login failed - still on login page"
96
+ #     agent-browser screenshot /tmp/login-failed.png
97
+ #     agent-browser close
98
+ #     exit 1
99
+ # fi
100
+ #
101
+ # # Save state for future runs
102
+ # echo "Saving state to $STATE_FILE"
103
+ # agent-browser state save "$STATE_FILE"
104
+ # echo "Login successful"
105
+ # agent-browser snapshot -i
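
The template checks for `login` or `signin` in the URL in two places. A small helper keeps both checks in sync and makes it easy to add patterns for a particular app. This is a sketch: `is_login_url` is a hypothetical name, and the substring test is the same heuristic the template uses, so paths such as `/sign-in` or `/auth` would need extra patterns.

```shell
# Hypothetical helper: succeed if the URL looks like a login page.
is_login_url() {
  case $1 in
    *login*|*signin*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (not executed here):
#   if is_login_url "$(agent-browser get url)"; then
#     echo "Still on login page"
#   fi
```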