tab-agent 0.2.3 → 0.3.1
- package/README.md +50 -32
- package/bin/tab-agent.js +52 -28
- package/cli/command.js +200 -0
- package/extension/service-worker.js +6 -15
- package/package.json +1 -1
- package/skills/claude-code/tab-agent.md +34 -28
- package/skills/codex/tab-agent.md +15 -21
package/README.md
CHANGED
|
@@ -1,32 +1,45 @@
|
|
|
1
1
|
# Tab Agent
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
[](https://www.npmjs.com/package/tab-agent)
|
|
4
|
+
|
|
5
|
+
**Give Claude, Codex, or any LLM full control of your browser tabs** — securely, with click-to-activate permission.
|
|
4
6
|
|
|
5
7
|
```
|
|
6
8
|
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
|
7
|
-
│
|
|
8
|
-
│
|
|
9
|
+
│ Claude Code │────▶│ Relay Server │────▶│ Extension │
|
|
10
|
+
│ Codex / LLM │◀────│ (background) │◀────│ (Chrome) │
|
|
9
11
|
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
12
|
+
│
|
|
13
|
+
▼
|
|
14
|
+
┌───────────────────┐
|
|
15
|
+
│ Your Active Tab │
|
|
16
|
+
│ 🟢 Click to ON │
|
|
17
|
+
└───────────────────┘
|
|
16
18
|
```
|
|
17
19
|
|
|
20
|
+
## Features
|
|
21
|
+
|
|
22
|
+
- **Full browser control** — navigate, click, type, scroll, screenshot, run JavaScript
|
|
23
|
+
- **Uses your login sessions** — access authenticated sites (GitHub, Gmail, X) without sharing credentials
|
|
24
|
+
- **Runs in background** — relay server starts automatically, works while you do other things
|
|
25
|
+
- **Click-to-activate security** — only tabs you explicitly enable, your other tabs stay private
|
|
26
|
+
- **AI-optimized snapshots** — pages converted to readable text with element refs `[e1]`, `[e2]`
|
|
27
|
+
- **Works with any LLM** — Claude Code, Codex, or any tool that can run shell commands
|
|
28
|
+
|
|
29
|
+
---
|
|
30
|
+
|
|
18
31
|
## Quick Start
|
|
19
32
|
|
|
20
33
|
```bash
|
|
21
34
|
# 1. Clone and load extension
|
|
22
35
|
git clone https://github.com/DrHB/tab-agent
|
|
23
|
-
# →
|
|
36
|
+
# → Chrome: chrome://extensions → Developer mode → Load unpacked → select extension/
|
|
24
37
|
|
|
25
38
|
# 2. Setup (auto-detects everything)
|
|
26
39
|
npx tab-agent setup
|
|
27
40
|
|
|
28
|
-
# 3.
|
|
29
|
-
# → Click Tab Agent icon on
|
|
41
|
+
# 3. Activate a tab & go!
|
|
42
|
+
# → Click Tab Agent icon on any tab (turns green = active)
|
|
30
43
|
# → Ask Claude: "Use tab-agent to search Google for 'hello world'"
|
|
31
44
|
```
|
|
32
45
|
|
|
@@ -42,9 +55,8 @@ npx tab-agent setup
|
|
|
42
55
|
| **Visibility** | Green badge = active | Hidden/background |
|
|
43
56
|
| **Sessions** | Uses your cookies | Requires re-login |
|
|
44
57
|
| **Credentials** | Never shared | Often required |
|
|
45
|
-
| **Audit** | Full action logging | Varies |
|
|
46
58
|
|
|
47
|
-
**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs
|
|
59
|
+
**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs the LLM can control.
|
|
48
60
|
|
|
49
61
|
### 🍪 Works With Your Login Sessions
|
|
50
62
|
|
|
@@ -55,9 +67,9 @@ Because Tab Agent runs as a Chrome extension:
|
|
|
55
67
|
- **Works with SSO and 2FA** — enterprise apps, protected accounts
|
|
56
68
|
- **No credential sharing** — your passwords stay in your browser
|
|
57
69
|
|
|
58
|
-
### 🤖
|
|
70
|
+
### 🤖 LLM-Optimized
|
|
59
71
|
|
|
60
|
-
- **Semantic snapshots** — pages converted to
|
|
72
|
+
- **Semantic snapshots** — pages converted to readable text with refs `[e1]`, `[e2]`
|
|
61
73
|
- **Screenshot fallback** — for complex dynamic pages
|
|
62
74
|
- **Simple targeting** — click/type using refs instead of fragile CSS selectors
|
|
63
75
|
|
|
@@ -111,7 +123,7 @@ This automatically:
|
|
|
111
123
|
|
|
112
124
|
1. Navigate to any webpage
|
|
113
125
|
2. **Click the Tab Agent icon** — it turns green (🟢 ON)
|
|
114
|
-
3. Ask your
|
|
126
|
+
3. Ask your LLM to interact with the page
|
|
115
127
|
|
|
116
128
|
---
|
|
117
129
|
|
|
@@ -122,37 +134,43 @@ This automatically:
|
|
|
122
134
|
|---------|-------------|
|
|
123
135
|
| `tabs` | List all activated tabs |
|
|
124
136
|
| `navigate` | Go to a URL |
|
|
125
|
-
| `snapshot` | Get
|
|
137
|
+
| `snapshot` | Get page with element refs |
|
|
126
138
|
| `screenshot` | Capture viewport image |
|
|
127
|
-
| `screenshot
|
|
139
|
+
| `screenshot --full` | Capture entire page |
|
|
128
140
|
|
|
129
141
|
### Interaction
|
|
130
142
|
| Command | Description |
|
|
131
143
|
|---------|-------------|
|
|
132
144
|
| `click` | Click element by ref |
|
|
133
145
|
| `type` | Type text into element |
|
|
134
|
-
| `type ... submit` | Type and press Enter |
|
|
135
146
|
| `fill` | Fill a form field |
|
|
136
|
-
| `batchfill` | Fill multiple fields at once |
|
|
137
147
|
| `press` | Press a key (Enter, Escape, Tab, Arrows) |
|
|
138
148
|
|
|
139
149
|
### Page Control
|
|
140
150
|
| Command | Description |
|
|
141
151
|
|---------|-------------|
|
|
142
152
|
| `scroll` | Scroll up/down by amount |
|
|
143
|
-
| `scrollintoview` | Scroll element into view |
|
|
144
153
|
| `wait` | Wait for text or element to appear |
|
|
145
154
|
| `evaluate` | Run JavaScript in page context |
|
|
146
|
-
| `dialog` | Handle alert/confirm/prompt |
|
|
147
155
|
|
|
148
156
|
---
|
|
149
157
|
|
|
150
|
-
## CLI
|
|
158
|
+
## CLI Usage
|
|
151
159
|
|
|
152
160
|
```bash
|
|
153
|
-
|
|
154
|
-
npx tab-agent
|
|
155
|
-
npx tab-agent
|
|
161
|
+
# Setup & Status
|
|
162
|
+
npx tab-agent setup # Initial configuration
|
|
163
|
+
npx tab-agent status # Check if everything works
|
|
164
|
+
npx tab-agent start # Start relay server manually
|
|
165
|
+
|
|
166
|
+
# Browser Commands
|
|
167
|
+
npx tab-agent tabs # List active tabs
|
|
168
|
+
npx tab-agent snapshot # Get page content with refs
|
|
169
|
+
npx tab-agent screenshot # Capture viewport
|
|
170
|
+
npx tab-agent screenshot --full # Capture full page
|
|
171
|
+
npx tab-agent click e5 # Click element
|
|
172
|
+
npx tab-agent type e3 "hello" # Type text
|
|
173
|
+
npx tab-agent navigate "https://..." # Go to URL
|
|
156
174
|
```
|
|
157
175
|
|
|
158
176
|
---
|
|
@@ -189,17 +207,17 @@ Setup automatically detects your browser.
|
|
|
189
207
|
|
|
190
208
|
1. **Chrome Extension** — Runs in your browser with access to activated tabs and your session cookies
|
|
191
209
|
|
|
192
|
-
2. **Relay Server** — Local WebSocket server
|
|
210
|
+
2. **Relay Server** — Local WebSocket server that bridges LLM ↔ Extension via Chrome's Native Messaging API (runs in background)
|
|
193
211
|
|
|
194
|
-
3. **Skill File** — Tells Claude/Codex how to send commands
|
|
212
|
+
3. **Skill File** — Tells Claude/Codex how to send commands
|
|
195
213
|
|
|
196
214
|
**Data flow:**
|
|
197
215
|
```
|
|
198
216
|
You: "Search Google for cats"
|
|
199
217
|
↓
|
|
200
|
-
|
|
218
|
+
LLM → CLI command → Relay Server → Native Messaging → Extension → Browser action
|
|
201
219
|
↑
|
|
202
|
-
Results ←
|
|
220
|
+
Results ← Response ← Relay Server ← Native Messaging ← Page snapshot
|
|
203
221
|
```
|
|
204
222
|
|
|
205
223
|
---
|
|
@@ -210,4 +228,4 @@ MIT
|
|
|
210
228
|
|
|
211
229
|
---
|
|
212
230
|
|
|
213
|
-
**
|
|
231
|
+
**Works with [Claude Code](https://claude.ai/code), [Codex](https://openai.com/codex), and any LLM that can run shell commands.**
|
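Reviewer note: the README describes snapshots as readable text with element refs such as `[e1]` and `[e2]`, which `click` and `type` then accept as targets. A minimal sketch of pulling those refs out of snapshot text, assuming the refs appear literally as `[eN]` tokens. The sample snapshot layout and the `extractRefs` helper are hypothetical and not part of the package.

```javascript
// Hypothetical helper: collect element refs like "e1", "e12" from snapshot text.
// Assumes refs are written as [e<digits>]; the real snapshot layout may differ.
function extractRefs(snapshotText) {
  const refs = [];
  for (const match of snapshotText.matchAll(/\[(e\d+)\]/g)) {
    if (!refs.includes(match[1])) refs.push(match[1]);
  }
  return refs;
}

// Made-up snapshot text, for illustration only.
const sample = [
  'Search page',
  '[e1] textbox "Search"',
  '[e2] button "Google Search"',
  '[e2] button "Google Search"',
].join('\n');

console.log(extractRefs(sample));
```

A returned ref could then be passed to the CLI, for example `npx tab-agent click e2`.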
package/bin/tab-agent.js
CHANGED
|
@@ -1,40 +1,64 @@
|
|
|
1
1
|
#!/usr/bin/env node
|
|
2
2
|
const command = process.argv[2];
|
|
3
|
-
|
|
3
|
+
|
|
4
|
+
// Commands that go to the command module
|
|
5
|
+
const BROWSER_COMMANDS = ['tabs', 'snapshot', 'screenshot', 'click', 'type', 'fill', 'press', 'scroll', 'navigate', 'wait', 'evaluate'];
|
|
6
|
+
|
|
7
|
+
if (command === '-v' || command === '--version') {
|
|
8
|
+
console.log(require('../package.json').version);
|
|
9
|
+
process.exit(0);
|
|
10
|
+
}
|
|
11
|
+
|
|
12
|
+
if (BROWSER_COMMANDS.includes(command)) {
|
|
13
|
+
const { runCommand } = require('../cli/command.js');
|
|
14
|
+
runCommand(process.argv.slice(2));
|
|
15
|
+
} else {
|
|
16
|
+
switch (command) {
|
|
17
|
+
case 'setup':
|
|
18
|
+
require('../cli/setup.js');
|
|
19
|
+
break;
|
|
20
|
+
case 'start':
|
|
21
|
+
require('../cli/start.js');
|
|
22
|
+
break;
|
|
23
|
+
case 'status':
|
|
24
|
+
require('../cli/status.js');
|
|
25
|
+
break;
|
|
26
|
+
default:
|
|
27
|
+
showHelp();
|
|
28
|
+
}
|
|
29
|
+
}
|
|
4
30
|
|
|
5
31
|
function showHelp() {
|
|
6
32
|
console.log(`
|
|
7
33
|
tab-agent - Browser control for Claude/Codex
|
|
8
34
|
|
|
9
|
-
Commands:
|
|
10
|
-
setup
|
|
11
|
-
start
|
|
12
|
-
status
|
|
35
|
+
Setup Commands:
|
|
36
|
+
setup Auto-detect extension, register native host, install skills
|
|
37
|
+
start Start the relay server
|
|
38
|
+
status Check configuration status
|
|
13
39
|
|
|
14
|
-
|
|
40
|
+
Browser Commands:
|
|
41
|
+
tabs List active tabs
|
|
42
|
+
snapshot Get AI-readable page content
|
|
43
|
+
screenshot [--full] Capture screenshot
|
|
44
|
+
click <ref> Click element (e.g., click e5)
|
|
45
|
+
type <ref> <text> Type text into element
|
|
46
|
+
fill <ref> <value> Fill form field
|
|
47
|
+
press <key> Press key (Enter, Escape, etc.)
|
|
48
|
+
scroll <dir> [amount] Scroll up/down
|
|
49
|
+
navigate <url> Go to URL
|
|
50
|
+
wait <text|selector> Wait for text or element
|
|
51
|
+
evaluate <script> Run JavaScript
|
|
52
|
+
|
|
53
|
+
Examples:
|
|
15
54
|
npx tab-agent setup
|
|
16
|
-
npx tab-agent
|
|
17
|
-
|
|
18
|
-
|
|
55
|
+
npx tab-agent snapshot
|
|
56
|
+
npx tab-agent click e5
|
|
57
|
+
npx tab-agent type e3 "hello world"
|
|
19
58
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
break;
|
|
24
|
-
case 'start':
|
|
25
|
-
require('../cli/start.js');
|
|
26
|
-
break;
|
|
27
|
-
case 'status':
|
|
28
|
-
require('../cli/status.js');
|
|
29
|
-
break;
|
|
30
|
-
case '-v':
|
|
31
|
-
case '--version':
|
|
32
|
-
console.log(pkg.version);
|
|
33
|
-
break;
|
|
34
|
-
case undefined:
|
|
35
|
-
showHelp();
|
|
36
|
-
break;
|
|
37
|
-
default:
|
|
38
|
-
showHelp();
|
|
59
|
+
Version: ${require('../package.json').version}
|
|
60
|
+
`);
|
|
61
|
+
if (command && command !== 'help' && command !== '--help' && command !== '-h') {
|
|
39
62
|
process.exit(1);
|
|
63
|
+
}
|
|
40
64
|
}
|
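Reviewer note: the new entry point checks the version flags first, then routes browser commands to `cli/command.js`, then falls back to the setup commands or help. A sketch of that decision order as a pure function, so the exit behaviour is easy to check. `classify`, `SETUP_COMMANDS`, `HELP_FLAGS`, and the returned labels are illustrative names, not part of the package.

```javascript
const BROWSER_COMMANDS = ['tabs', 'snapshot', 'screenshot', 'click', 'type', 'fill',
  'press', 'scroll', 'navigate', 'wait', 'evaluate'];
const SETUP_COMMANDS = ['setup', 'start', 'status'];
const HELP_FLAGS = ['help', '--help', '-h'];

// Hypothetical mirror of the dispatch order in bin/tab-agent.js.
// exitCode is the code passed to process.exit by the entry script itself,
// or null when the script leaves termination to the handler or to normal exit.
function classify(command) {
  if (command === '-v' || command === '--version') return { handler: 'version', exitCode: 0 };
  if (BROWSER_COMMANDS.includes(command)) return { handler: 'browser', exitCode: null };
  if (SETUP_COMMANDS.includes(command)) return { handler: command, exitCode: null };
  // Help is shown for everything else; only an unrecognised command exits with 1.
  const unknown = Boolean(command) && !HELP_FLAGS.includes(command);
  return { handler: 'help', exitCode: unknown ? 1 : null };
}

console.log(classify('snapshot'), classify('bogus'), classify(undefined));
```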
package/cli/command.js
ADDED
|
@@ -0,0 +1,200 @@
|
|
|
1
|
+
// cli/command.js
|
|
2
|
+
const WebSocket = require('ws');
|
|
3
|
+
|
|
4
|
+
const COMMANDS = ['tabs', 'snapshot', 'screenshot', 'click', 'type', 'fill', 'press', 'scroll', 'navigate', 'wait', 'evaluate'];
|
|
5
|
+
|
|
6
|
+
async function runCommand(args) {
|
|
7
|
+
const [command, ...params] = args;
|
|
8
|
+
|
|
9
|
+
if (!command || command === 'help') {
|
|
10
|
+
printHelp();
|
|
11
|
+
return;
|
|
12
|
+
}
|
|
13
|
+
|
|
14
|
+
if (!COMMANDS.includes(command)) {
|
|
15
|
+
console.error(`Unknown command: ${command}`);
|
|
16
|
+
console.error(`Available: ${COMMANDS.join(', ')}`);
|
|
17
|
+
process.exit(1);
|
|
18
|
+
}
|
|
19
|
+
|
|
20
|
+
const ws = new WebSocket('ws://localhost:9876');
|
|
21
|
+
|
|
22
|
+
const timeout = setTimeout(() => {
|
|
23
|
+
console.error('Connection timeout - is the relay running? Try: npx tab-agent start');
|
|
24
|
+
ws.close();
|
|
25
|
+
process.exit(1);
|
|
26
|
+
}, 5000);
|
|
27
|
+
|
|
28
|
+
ws.on('error', (err) => {
|
|
29
|
+
clearTimeout(timeout);
|
|
30
|
+
console.error('Connection failed:', err.message);
|
|
31
|
+
console.error('Make sure relay is running: npx tab-agent start');
|
|
32
|
+
process.exit(1);
|
|
33
|
+
});
|
|
34
|
+
|
|
35
|
+
ws.on('open', () => {
|
|
36
|
+
clearTimeout(timeout);
|
|
37
|
+
|
|
38
|
+
// First get tabs to find tabId
|
|
39
|
+
if (command === 'tabs') {
|
|
40
|
+
ws.send(JSON.stringify({ id: 1, action: 'tabs' }));
|
|
41
|
+
} else {
|
|
42
|
+
// Get active tab first, then run command
|
|
43
|
+
ws.send(JSON.stringify({ id: 0, action: 'tabs' }));
|
|
44
|
+
}
|
|
45
|
+
});
|
|
46
|
+
|
|
47
|
+
ws.on('message', (data) => {
|
|
48
|
+
const msg = JSON.parse(data);
|
|
49
|
+
|
|
50
|
+
// Handle tabs response
|
|
51
|
+
if (msg.id === 0) {
|
|
52
|
+
if (!msg.tabs || msg.tabs.length === 0) {
|
|
53
|
+
console.error('No active tabs. Click Tab Agent icon on a tab to activate it.');
|
|
54
|
+
ws.close();
|
|
55
|
+
process.exit(1);
|
|
56
|
+
}
|
|
57
|
+
|
|
58
|
+
const tabId = msg.tabs[0].tabId;
|
|
59
|
+
const payload = buildPayload(command, params, tabId);
|
|
60
|
+
ws.send(JSON.stringify({ id: 1, ...payload }));
|
|
61
|
+
return;
|
|
62
|
+
}
|
|
63
|
+
|
|
64
|
+
// Handle command response
|
|
65
|
+
if (msg.id === 1) {
|
|
66
|
+
if (command === 'tabs') {
|
|
67
|
+
printTabs(msg);
|
|
68
|
+
} else if (command === 'snapshot') {
|
|
69
|
+
printSnapshot(msg);
|
|
70
|
+
} else if (command === 'screenshot') {
|
|
71
|
+
printScreenshot(msg);
|
|
72
|
+
} else {
|
|
73
|
+
printResult(msg);
|
|
74
|
+
}
|
|
75
|
+
ws.close();
|
|
76
|
+
process.exit(msg.ok ? 0 : 1);
|
|
77
|
+
}
|
|
78
|
+
});
|
|
79
|
+
}
|
|
80
|
+
|
|
81
|
+
function buildPayload(command, params, tabId) {
|
|
82
|
+
const payload = { action: command, tabId };
|
|
83
|
+
|
|
84
|
+
switch (command) {
|
|
85
|
+
case 'click':
|
|
86
|
+
payload.ref = params[0];
|
|
87
|
+
break;
|
|
88
|
+
case 'type':
|
|
89
|
+
payload.ref = params[0];
|
|
90
|
+
payload.text = params.slice(1).join(' ');
|
|
91
|
+
if (params.includes('--submit')) {
|
|
92
|
+
payload.submit = true;
|
|
93
|
+
payload.text = payload.text.replace('--submit', '').trim();
|
|
94
|
+
}
|
|
95
|
+
break;
|
|
96
|
+
case 'fill':
|
|
97
|
+
payload.ref = params[0];
|
|
98
|
+
payload.value = params.slice(1).join(' ');
|
|
99
|
+
break;
|
|
100
|
+
case 'press':
|
|
101
|
+
payload.key = params[0];
|
|
102
|
+
break;
|
|
103
|
+
case 'scroll':
|
|
104
|
+
payload.direction = params[0] || 'down';
|
|
105
|
+
payload.amount = parseInt(params[1]) || 500;
|
|
106
|
+
break;
|
|
107
|
+
case 'navigate':
|
|
108
|
+
payload.url = params[0];
|
|
109
|
+
break;
|
|
110
|
+
case 'wait':
|
|
111
|
+
if (params[0]?.startsWith('.') || params[0]?.startsWith('#')) {
|
|
112
|
+
payload.selector = params[0];
|
|
113
|
+
} else {
|
|
114
|
+
payload.text = params.join(' ');
|
|
115
|
+
}
|
|
116
|
+
payload.timeout = parseInt(params.find(p => /^\d+$/.test(p))) || 5000;
|
|
117
|
+
break;
|
|
118
|
+
case 'evaluate':
|
|
119
|
+
payload.script = params.join(' ');
|
|
120
|
+
break;
|
|
121
|
+
case 'screenshot':
|
|
122
|
+
if (params.includes('--full') || params.includes('--fullPage')) {
|
|
123
|
+
payload.fullPage = true;
|
|
124
|
+
}
|
|
125
|
+
break;
|
|
126
|
+
}
|
|
127
|
+
|
|
128
|
+
return payload;
|
|
129
|
+
}
|
|
130
|
+
|
|
131
|
+
function printHelp() {
|
|
132
|
+
console.log(`
|
|
133
|
+
tab-agent - Browser control commands
|
|
134
|
+
|
|
135
|
+
Usage: npx tab-agent <command> [options]
|
|
136
|
+
|
|
137
|
+
Commands:
|
|
138
|
+
tabs List active tabs
|
|
139
|
+
snapshot Get AI-readable page content
|
|
140
|
+
screenshot [--full] Capture screenshot (--full for full page)
|
|
141
|
+
click <ref> Click element (e.g., click e5)
|
|
142
|
+
type <ref> <text> Type text into element
|
|
143
|
+
fill <ref> <value> Fill form field
|
|
144
|
+
press <key> Press key (Enter, Escape, Tab, etc.)
|
|
145
|
+
scroll <dir> [amount] Scroll up/down (default: 500px)
|
|
146
|
+
navigate <url> Go to URL
|
|
147
|
+
wait <text|selector> Wait for text or element
|
|
148
|
+
evaluate <script> Run JavaScript
|
|
149
|
+
|
|
150
|
+
Examples:
|
|
151
|
+
npx tab-agent tabs
|
|
152
|
+
npx tab-agent snapshot
|
|
153
|
+
npx tab-agent click e5
|
|
154
|
+
npx tab-agent type e3 hello world
|
|
155
|
+
npx tab-agent navigate https://google.com
|
|
156
|
+
npx tab-agent screenshot --full
|
|
157
|
+
`);
|
|
158
|
+
}
|
|
159
|
+
|
|
160
|
+
function printTabs(msg) {
|
|
161
|
+
if (!msg.ok) {
|
|
162
|
+
console.error('Error:', msg.error);
|
|
163
|
+
return;
|
|
164
|
+
}
|
|
165
|
+
console.log('Active tabs:\n');
|
|
166
|
+
msg.tabs.forEach((tab, i) => {
|
|
167
|
+
console.log(` ${i + 1}. [${tab.tabId}] ${tab.title}`);
|
|
168
|
+
console.log(` ${tab.url}\n`);
|
|
169
|
+
});
|
|
170
|
+
}
|
|
171
|
+
|
|
172
|
+
function printSnapshot(msg) {
|
|
173
|
+
if (!msg.ok) {
|
|
174
|
+
console.error('Error:', msg.error);
|
|
175
|
+
return;
|
|
176
|
+
}
|
|
177
|
+
console.log(msg.snapshot);
|
|
178
|
+
}
|
|
179
|
+
|
|
180
|
+
function printScreenshot(msg) {
|
|
181
|
+
if (!msg.ok) {
|
|
182
|
+
console.error('Error:', msg.error);
|
|
183
|
+
return;
|
|
184
|
+
}
|
|
185
|
+
// Output base64 directly - no file, no auto-open
|
|
186
|
+
console.log(msg.screenshot);
|
|
187
|
+
}
|
|
188
|
+
|
|
189
|
+
function printResult(msg) {
|
|
190
|
+
if (!msg.ok) {
|
|
191
|
+
console.error('Error:', msg.error);
|
|
192
|
+
return;
|
|
193
|
+
}
|
|
194
|
+
console.log('OK');
|
|
195
|
+
if (msg.result !== undefined) {
|
|
196
|
+
console.log('Result:', msg.result);
|
|
197
|
+
}
|
|
198
|
+
}
|
|
199
|
+
|
|
200
|
+
module.exports = { runCommand };
|
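Reviewer note: in `buildPayload`, the `type` branch joins every parameter into the text and, when `--submit` is present, strips the first literal `--submit` from the joined string, which may sit inside the user's own text. The `wait` branch treats any all-digit parameter as the timeout while still keeping it in the joined text. One possible alternative is sketched below, assuming flags are only recognised as whole arguments. The `--timeout <ms>` option and the helper names are hypothetical and do not exist in the released CLI.

```javascript
// Hypothetical helper: separate whole-argument flags from positional words.
// `booleans` lists flags without a value; `valued` lists flags that consume the next argument.
function splitArgs(params, { booleans = [], valued = [] } = {}) {
  const flags = {};
  const words = [];
  for (let i = 0; i < params.length; i++) {
    const arg = params[i];
    if (booleans.includes(arg)) {
      flags[arg.slice(2)] = true;
    } else if (valued.includes(arg) && i + 1 < params.length) {
      flags[arg.slice(2)] = params[++i];
    } else {
      words.push(arg);
    }
  }
  return { flags, words };
}

// type <ref> <text...> [--submit]
function buildTypePayload(params, tabId) {
  const { flags, words } = splitArgs(params, { booleans: ['--submit'] });
  const payload = { action: 'type', tabId, ref: words[0], text: words.slice(1).join(' ') };
  if (flags.submit) payload.submit = true;
  return payload;
}

// wait <text...|selector> [--timeout <ms>]   (--timeout is an assumed option)
function buildWaitPayload(params, tabId) {
  const { flags, words } = splitArgs(params, { valued: ['--timeout'] });
  const payload = { action: 'wait', tabId };
  if (words[0]?.startsWith('.') || words[0]?.startsWith('#')) {
    payload.selector = words[0];
  } else {
    payload.text = words.join(' ');
  }
  // Falls back to the same 5000 ms default the released code uses.
  payload.timeout = Number.parseInt(flags.timeout, 10) || 5000;
  return payload;
}

console.log(buildTypePayload(['e3', 'use --submit flag', '--submit'], 7));
console.log(buildWaitPayload(['Loaded', '3000'], 7));
```

With this split, quoted text that mentions `--submit` survives intact, and a number in the awaited text is no longer read as a timeout.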
|
package/extension/service-worker.js
CHANGED
|
@@ -191,32 +191,23 @@ async function routeCommand(tabId, command) {
|
|
|
191
191
|
// Full page screenshot using chrome.debugger
|
|
192
192
|
if (fullPage) {
|
|
193
193
|
try {
|
|
194
|
-
|
|
194
|
+
// Try to detach first in case previous attempt left debugger attached
|
|
195
|
+
try { await chrome.debugger.detach({ tabId }); } catch {}
|
|
195
196
|
|
|
196
|
-
|
|
197
|
-
{ tabId },
|
|
198
|
-
'Page.getLayoutMetrics'
|
|
199
|
-
);
|
|
197
|
+
await chrome.debugger.attach({ tabId }, '1.3');
|
|
200
198
|
|
|
201
|
-
const
|
|
199
|
+
const screenshot = await chrome.debugger.sendCommand(
|
|
202
200
|
{ tabId },
|
|
203
201
|
'Page.captureScreenshot',
|
|
204
202
|
{
|
|
205
203
|
format: 'png',
|
|
206
|
-
captureBeyondViewport: true
|
|
207
|
-
clip: {
|
|
208
|
-
x: 0,
|
|
209
|
-
y: 0,
|
|
210
|
-
width: layout.contentSize.width,
|
|
211
|
-
height: layout.contentSize.height,
|
|
212
|
-
scale: 1
|
|
213
|
-
}
|
|
204
|
+
captureBeyondViewport: true
|
|
214
205
|
}
|
|
215
206
|
);
|
|
216
207
|
|
|
217
208
|
await chrome.debugger.detach({ tabId });
|
|
218
209
|
audit('screenshot', { tabId, fullPage: true }, { ok: true });
|
|
219
|
-
return { ok: true, screenshot: 'data:image/png;base64,' + data, format: 'png' };
|
|
210
|
+
return { ok: true, screenshot: 'data:image/png;base64,' + screenshot.data, format: 'png' };
|
|
220
211
|
} catch (error) {
|
|
221
212
|
try { await chrome.debugger.detach({ tabId }); } catch {}
|
|
222
213
|
const result = { ok: false, error: error.message };
|
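Reviewer note: the full-page path now detaches any stale debugger session, attaches, captures with `captureBeyondViewport`, and detaches again on both success and failure. The same sequence can be sketched with `try`/`finally` so the detach lives in one place. In this sketch the debugger API is passed in as a parameter (`dbg`, standing in for `chrome.debugger`) so it can run outside an extension. `captureFullPage` and the fake object are illustrative only, and the `audit` calls from the original are omitted.

```javascript
// Sketch: attach -> capture -> always detach. `dbg` is expected to expose the
// promise-returning attach/sendCommand/detach methods of chrome.debugger.
async function captureFullPage(dbg, tabId) {
  // A previous failed attempt may have left the debugger attached; ignore errors here.
  try { await dbg.detach({ tabId }); } catch {}

  try {
    await dbg.attach({ tabId }, '1.3');
    const shot = await dbg.sendCommand({ tabId }, 'Page.captureScreenshot', {
      format: 'png',
      captureBeyondViewport: true,
    });
    return { ok: true, screenshot: 'data:image/png;base64,' + shot.data, format: 'png' };
  } catch (error) {
    return { ok: false, error: error.message };
  } finally {
    // Runs after either return above, so the debugger is never left attached.
    try { await dbg.detach({ tabId }); } catch {}
  }
}

// Fake debugger for a dry run outside the browser.
const calls = [];
const fakeDebugger = {
  attach: async () => { calls.push('attach'); },
  detach: async () => { calls.push('detach'); },
  sendCommand: async (_target, method) => { calls.push(method); return { data: 'AAAA' }; },
};

const captureDone = captureFullPage(fakeDebugger, 1).then((result) => {
  console.log(result.ok, calls.join(' > '));
  return result;
});
```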
package/package.json
CHANGED
|
package/skills/claude-code/tab-agent.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: tab-agent
|
|
3
|
-
description: Browser control via
|
|
3
|
+
description: Browser control via CLI - snapshot, click, type, fill, screenshot
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Tab Agent
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
Control browser tabs via CLI. User activates tabs via extension icon (green = active).
|
|
9
9
|
|
|
10
10
|
## Before First Command
|
|
11
11
|
|
|
@@ -16,38 +16,44 @@ sleep 2
|
|
|
16
16
|
|
|
17
17
|
## Commands
|
|
18
18
|
|
|
19
|
-
```
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
{"id": 13, "action": "wait", "tabId": ID, "text": "Loading complete"} // wait for text
|
|
33
|
-
{"id": 14, "action": "wait", "tabId": ID, "selector": ".results", "timeout": 5000} // wait for element
|
|
34
|
-
{"id": 15, "action": "evaluate", "tabId": ID, "script": "document.title"} // run JavaScript
|
|
35
|
-
{"id": 16, "action": "batchfill", "tabId": ID, "fields": [{"ref": "e1", "value": "a"}, {"ref": "e2", "value": "b"}]}
|
|
36
|
-
{"id": 17, "action": "dialog", "tabId": ID, "accept": true} // handle alert/confirm
|
|
19
|
+
```bash
|
|
20
|
+
npx tab-agent tabs # List active tabs
|
|
21
|
+
npx tab-agent snapshot # Get page with refs [e1], [e2]...
|
|
22
|
+
npx tab-agent screenshot # Capture viewport
|
|
23
|
+
npx tab-agent screenshot --full # Capture full page
|
|
24
|
+
npx tab-agent click <ref> # Click element
|
|
25
|
+
npx tab-agent type <ref> <text> # Type text
|
|
26
|
+
npx tab-agent fill <ref> <value> # Fill form field
|
|
27
|
+
npx tab-agent press <key> # Press key (Enter, Escape, Tab)
|
|
28
|
+
npx tab-agent scroll <dir> [amount] # Scroll up/down
|
|
29
|
+
npx tab-agent navigate <url> # Go to URL
|
|
30
|
+
npx tab-agent wait <text|selector> # Wait for condition
|
|
31
|
+
npx tab-agent evaluate <script> # Run JavaScript
|
|
37
32
|
```
|
|
38
33
|
|
|
39
34
|
## Usage
|
|
40
35
|
|
|
41
|
-
1. `tabs` ->
|
|
36
|
+
1. `tabs` -> find active tab
|
|
42
37
|
2. `snapshot` -> read page, get element refs [e1], [e2]...
|
|
43
|
-
3. `click`/`fill
|
|
44
|
-
4. If snapshot incomplete
|
|
38
|
+
3. `click`/`type`/`fill` using refs
|
|
39
|
+
4. If snapshot incomplete -> `screenshot` and analyze visually
|
|
40
|
+
|
|
41
|
+
## Examples
|
|
42
|
+
|
|
43
|
+
```bash
|
|
44
|
+
# Search Google
|
|
45
|
+
npx tab-agent navigate "https://google.com"
|
|
46
|
+
npx tab-agent snapshot
|
|
47
|
+
npx tab-agent type e1 "hello world"
|
|
48
|
+
npx tab-agent press Enter
|
|
49
|
+
|
|
50
|
+
# Read page content
|
|
51
|
+
npx tab-agent snapshot
|
|
52
|
+
npx tab-agent screenshot --full
|
|
53
|
+
```
|
|
45
54
|
|
|
46
55
|
## Notes
|
|
47
56
|
|
|
48
|
-
- Screenshot
|
|
49
|
-
-
|
|
57
|
+
- Screenshot saves to /tmp/ and opens automatically
|
|
58
|
+
- Refs reset on each snapshot - always snapshot before interacting
|
|
50
59
|
- Keys: Enter, Escape, Tab, Backspace, ArrowUp/Down/Left/Right
|
|
51
|
-
- `type` with `submit: true` presses Enter after typing (for search boxes)
|
|
52
|
-
- `evaluate` runs in page context - can access page variables/functions
|
|
53
|
-
- `dialog` handles alert/confirm/prompt - debugger bar appears when attached
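Reviewer note: `printScreenshot` in `cli/command.js` writes the screenshot string to stdout, and the service worker builds that string as a `data:image/png;base64,` URL. A minimal sketch of turning such output into PNG bytes in Node. `dataUrlToBuffer` is a hypothetical helper, and the sample string is a placeholder rather than a real capture.

```javascript
// Hypothetical helper: decode a base64 data URL (as printed by `screenshot`) into bytes.
function dataUrlToBuffer(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl.trim());
  if (!match) throw new Error('Not a base64 data URL');
  return { mime: match[1], bytes: Buffer.from(match[2], 'base64') };
}

// Placeholder payload: base64 for the first four bytes of the PNG signature.
const sample = 'data:image/png;base64,iVBORw==';
const { mime, bytes } = dataUrlToBuffer(sample);
console.log(mime, bytes.length);
```

Passing `bytes` to `fs.writeFileSync('page.png', bytes)` would produce a file on disk, a step the CLI code shown in this diff does not perform itself.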
|
package/skills/codex/tab-agent.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: tab-agent
|
|
3
|
-
description: Browser control via
|
|
3
|
+
description: Browser control via CLI
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Tab Agent
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
CLI browser control. User activates tabs via extension (green = active).
|
|
9
9
|
|
|
10
10
|
## Start Relay
|
|
11
11
|
|
|
@@ -15,26 +15,20 @@ curl -s http://localhost:9876/health || (npx tab-agent start &)
|
|
|
15
15
|
|
|
16
16
|
## Commands
|
|
17
17
|
|
|
18
|
-
```
|
|
19
|
-
tabs
|
|
20
|
-
snapshot
|
|
21
|
-
screenshot
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
fill
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
navigate tabId url -> go to URL
|
|
31
|
-
wait tabId text="..." -> wait for text
|
|
32
|
-
wait tabId selector="..." timeout=ms -> wait for element
|
|
33
|
-
evaluate tabId script="..." -> run JavaScript
|
|
34
|
-
batchfill tabId fields=[...] -> fill multiple fields
|
|
35
|
-
dialog tabId accept=true -> handle alert/confirm
|
|
18
|
+
```bash
|
|
19
|
+
tabs # List active tabs
|
|
20
|
+
snapshot # Page with refs [e1], [e2]...
|
|
21
|
+
screenshot [--full] # Capture viewport/full page
|
|
22
|
+
click <ref> # Click element
|
|
23
|
+
type <ref> <text> # Type text
|
|
24
|
+
fill <ref> <value> # Fill form field
|
|
25
|
+
press <key> # Enter/Escape/Tab/Arrow*
|
|
26
|
+
scroll <dir> [amount] # Scroll up/down
|
|
27
|
+
navigate <url> # Go to URL
|
|
28
|
+
wait <text|selector> # Wait for condition
|
|
29
|
+
evaluate <script> # Run JavaScript
|
|
36
30
|
```
|
|
37
31
|
|
|
38
32
|
## Flow
|
|
39
33
|
|
|
40
|
-
`
|
|
34
|
+
`snapshot` -> `click`/`type` -> repeat. Use `screenshot` if snapshot incomplete.
|
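Reviewer note: both skill files describe the same loop: take a `snapshot`, act on a ref with `click` or `type`, then repeat. The Claude Code skill adds that refs reset on each snapshot. A sketch of a small driver that re-snapshots before each interaction is shown below. The command runner is injected as `run` (hypothetical) so the loop can be exercised without a browser. In real use it might wrap `child_process.execFileSync('npx', ['tab-agent', ...args])`.

```javascript
const INTERACTIONS = ['click', 'type', 'fill'];

// Hypothetical driver: runs each step, inserting a fresh `snapshot`
// before every interaction so element refs are current.
function runFlow(run, steps) {
  const outputs = [];
  for (const step of steps) {
    if (INTERACTIONS.includes(step[0])) {
      outputs.push(run(['snapshot']));
    }
    outputs.push(run(step));
  }
  return outputs;
}

// Dry run with a fake runner that records the command line it would issue.
const issued = [];
const fakeRun = (args) => {
  issued.push(['npx', 'tab-agent', ...args].join(' '));
  return 'OK';
};

runFlow(fakeRun, [
  ['navigate', 'https://google.com'],
  ['type', 'e1', 'hello world'],
  ['press', 'Enter'],
]);
console.log(issued.join('\n'));
```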