tab-agent 0.3.1 → 0.3.2

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "tab-agent",
- "version": "0.3.1",
+ "version": "0.3.2",
  "description": "Browser control for Claude Code and Codex via WebSocket",
  "bin": {
  "tab-agent": "./bin/tab-agent.js"
@@ -19,8 +19,6 @@ sleep 2
  ```bash
  npx tab-agent tabs # List active tabs
  npx tab-agent snapshot # Get page with refs [e1], [e2]...
- npx tab-agent screenshot # Capture viewport
- npx tab-agent screenshot --full # Capture full page
  npx tab-agent click <ref> # Click element
  npx tab-agent type <ref> <text> # Type text
  npx tab-agent fill <ref> <value> # Fill form field
@@ -28,15 +26,19 @@ npx tab-agent press <key> # Press key (Enter, Escape, Tab)
  npx tab-agent scroll <dir> [amount] # Scroll up/down
  npx tab-agent navigate <url> # Go to URL
  npx tab-agent wait <text|selector> # Wait for condition
- npx tab-agent evaluate <script> # Run JavaScript
+ npx tab-agent screenshot # Capture viewport (fallback only)
+ npx tab-agent screenshot --full # Capture full page (fallback only)
  ```

- ## Usage
+ ## Workflow

- 1. `tabs` -> find active tab
- 2. `snapshot` -> read page, get element refs [e1], [e2]...
- 3. `click`/`type`/`fill` using refs
- 4. If snapshot incomplete -> `screenshot` and analyze visually
+ 1. `snapshot` first - always start here to get element refs
+ 2. Use refs [e1], [e2]... with `click`/`type`/`fill`
+ 3. `snapshot` again after actions to see results
+ 4. **Only use `screenshot` if:**
+ - Snapshot is missing expected content
+ - Page has complex visuals (charts, images, canvas)
+ - Debugging why an action didn't work

  ## Examples

@@ -46,14 +48,15 @@ npx tab-agent navigate "https://google.com"
  npx tab-agent snapshot
  npx tab-agent type e1 "hello world"
  npx tab-agent press Enter
+ npx tab-agent snapshot # See results

- # Read page content
- npx tab-agent snapshot
+ # Only screenshot if snapshot doesn't show what you need
  npx tab-agent screenshot --full
  ```

  ## Notes

- - Screenshot saves to /tmp/ and opens automatically
  - Refs reset on each snapshot - always snapshot before interacting
  - Keys: Enter, Escape, Tab, Backspace, ArrowUp/Down/Left/Right
+ - Screenshot outputs base64 to stdout (no file saved)
+ - Prefer snapshot over screenshot - it's faster and text-based
@@ -16,19 +16,23 @@ curl -s http://localhost:9876/health || (npx tab-agent start &)
  ## Commands

  ```bash
- tabs # List active tabs
- snapshot # Page with refs [e1], [e2]...
- screenshot [--full] # Capture viewport/full page
- click <ref> # Click element
- type <ref> <text> # Type text
- fill <ref> <value> # Fill form field
- press <key> # Enter/Escape/Tab/Arrow*
- scroll <dir> [amount] # Scroll up/down
- navigate <url> # Go to URL
- wait <text|selector> # Wait for condition
- evaluate <script> # Run JavaScript
+ npx tab-agent tabs # List active tabs
+ npx tab-agent snapshot # Page with refs [e1], [e2]...
+ npx tab-agent click <ref> # Click element
+ npx tab-agent type <ref> <text> # Type text
+ npx tab-agent fill <ref> <val> # Fill form field
+ npx tab-agent press <key> # Enter/Escape/Tab/Arrow*
+ npx tab-agent scroll <dir> [n] # Scroll up/down
+ npx tab-agent navigate <url> # Go to URL
+ npx tab-agent wait <text|sel> # Wait for condition
+ npx tab-agent screenshot # Fallback only - if snapshot incomplete
  ```

- ## Flow
+ ## Workflow

- `snapshot` -> `click`/`type` -> repeat. Use `screenshot` if snapshot incomplete.
+ 1. Always `snapshot` first - get refs [e1], [e2]...
+ 2. `click`/`type`/`fill` using refs
+ 3. `snapshot` again to see results
+ 4. **Only screenshot if snapshot missing content** (charts, canvas, debugging)
+
+ Prefer snapshot over screenshot - faster and text-based.
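
The snapshot-first workflow introduced in 0.3.2 could be wrapped in a couple of small shell helpers. This is only a sketch: `act_then_snapshot`, `screenshot_to_file`, and the `TAB_AGENT` override variable are hypothetical names, not part of tab-agent. The screenshot helper also assumes that `screenshot --full` writes nothing but base64 to stdout, which the notes suggest but do not spell out.

```bash
# TAB_AGENT is a hypothetical override hook so the command can be swapped out
# (for example with a stub during a dry run); it defaults to the documented CLI.
TAB_AGENT=${TAB_AGENT:-"npx tab-agent"}

# Run one action using a ref from the most recent snapshot, then snapshot
# again to see the result. The caller is expected to have run
# `npx tab-agent snapshot` first, because refs reset on each snapshot.
act_then_snapshot() {
  $TAB_AGENT "$@" && $TAB_AGENT snapshot
}

# Fallback only: decode the base64 screenshot output into a file.
# Assumes stdout contains only the base64 payload (an assumption, see above).
screenshot_to_file() {
  $TAB_AGENT screenshot --full | base64 --decode > "$1"
}
```

A session might then run `npx tab-agent snapshot`, followed by `act_then_snapshot type e1 "hello world"` and `act_then_snapshot press Enter`, reaching for `screenshot_to_file page.img` only when the snapshot is missing expected content.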