tab-agent 0.3.0 → 0.3.1
- package/README.md +45 -46
- package/cli/command.js +2 -10
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,32 +1,45 @@
 # Tab Agent
 
-
+[](https://www.npmjs.com/package/tab-agent)
+
+**Give Claude, Codex, or any LLM full control of your browser tabs** — securely, with click-to-activate permission.
 
 ```
 ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
-│
-│
+│ Claude Code │────▶│ Relay Server │────▶│ Extension │
+│ Codex / LLM │◀────│ (background) │◀────│ (Chrome) │
 └─────────────────┘ └─────────────────┘ └─────────────────┘
-
-
-
-
-
-
+│
+▼
+┌───────────────────┐
+│ Your Active Tab │
+│ 🟢 Click to ON │
+└───────────────────┘
 ```
 
+## Features
+
+- **Full browser control** — navigate, click, type, scroll, screenshot, run JavaScript
+- **Uses your login sessions** — access authenticated sites (GitHub, Gmail, X) without sharing credentials
+- **Runs in background** — relay server starts automatically, works while you do other things
+- **Click-to-activate security** — only tabs you explicitly enable, your other tabs stay private
+- **AI-optimized snapshots** — pages converted to readable text with element refs `[e1]`, `[e2]`
+- **Works with any LLM** — Claude Code, Codex, or any tool that can run shell commands
+
+---
+
 ## Quick Start
 
 ```bash
 # 1. Clone and load extension
 git clone https://github.com/DrHB/tab-agent
-# →
+# → Chrome: chrome://extensions → Developer mode → Load unpacked → select extension/
 
 # 2. Setup (auto-detects everything)
 npx tab-agent setup
 
-# 3.
-# → Click Tab Agent icon on
+# 3. Activate a tab & go!
+# → Click Tab Agent icon on any tab (turns green = active)
 # → Ask Claude: "Use tab-agent to search Google for 'hello world'"
 ```
 
@@ -42,9 +55,8 @@ npx tab-agent setup
 | **Visibility** | Green badge = active | Hidden/background |
 | **Sessions** | Uses your cookies | Requires re-login |
 | **Credentials** | Never shared | Often required |
-| **Audit** | Full action logging | Varies |
 
-**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs
+**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs the LLM can control.
 
 ### 🍪 Works With Your Login Sessions
 
@@ -55,9 +67,9 @@ Because Tab Agent runs as a Chrome extension:
 - **Works with SSO and 2FA** — enterprise apps, protected accounts
 - **No credential sharing** — your passwords stay in your browser
 
-### 🤖
+### 🤖 LLM-Optimized
 
-- **Semantic snapshots** — pages converted to
+- **Semantic snapshots** — pages converted to readable text with refs `[e1]`, `[e2]`
 - **Screenshot fallback** — for complex dynamic pages
 - **Simple targeting** — click/type using refs instead of fragile CSS selectors
 
@@ -111,7 +123,7 @@ This automatically:
 
 1. Navigate to any webpage
 2. **Click the Tab Agent icon** — it turns green (🟢 ON)
-3. Ask your
+3. Ask your LLM to interact with the page
 
 ---
 
@@ -122,56 +134,43 @@ This automatically:
 |---------|-------------|
 | `tabs` | List all activated tabs |
 | `navigate` | Go to a URL |
-| `snapshot` | Get
+| `snapshot` | Get page with element refs |
 | `screenshot` | Capture viewport image |
-| `screenshot
+| `screenshot --full` | Capture entire page |
 
 ### Interaction
 | Command | Description |
 |---------|-------------|
 | `click` | Click element by ref |
 | `type` | Type text into element |
-| `type ... submit` | Type and press Enter |
 | `fill` | Fill a form field |
-| `batchfill` | Fill multiple fields at once |
 | `press` | Press a key (Enter, Escape, Tab, Arrows) |
 
 ### Page Control
 | Command | Description |
 |---------|-------------|
 | `scroll` | Scroll up/down by amount |
-| `scrollintoview` | Scroll element into view |
 | `wait` | Wait for text or element to appear |
 | `evaluate` | Run JavaScript in page context |
-| `dialog` | Handle alert/confirm/prompt |
 
-
+---
 
-
+## CLI Usage
 
 ```bash
+# Setup & Status
+npx tab-agent setup # Initial configuration
+npx tab-agent status # Check if everything works
+npx tab-agent start # Start relay server manually
+
+# Browser Commands
 npx tab-agent tabs # List active tabs
-npx tab-agent snapshot # Get page content
+npx tab-agent snapshot # Get page content with refs
 npx tab-agent screenshot # Capture viewport
 npx tab-agent screenshot --full # Capture full page
 npx tab-agent click e5 # Click element
 npx tab-agent type e3 "hello" # Type text
-npx tab-agent fill e3 "value" # Fill field
-npx tab-agent press Enter # Press key
-npx tab-agent scroll down 500 # Scroll
 npx tab-agent navigate "https://..." # Go to URL
-npx tab-agent wait "Loading" # Wait for text
-npx tab-agent evaluate "document.title" # Run JS
-```
-
----
-
-## CLI Reference
-
-```bash
-npx tab-agent setup # Initial configuration
-npx tab-agent status # Check if everything works
-npx tab-agent start # Start relay server manually
 ```
 
 ---
@@ -208,17 +207,17 @@ Setup automatically detects your browser.
 
 1. **Chrome Extension** — Runs in your browser with access to activated tabs and your session cookies
 
-2. **Relay Server** — Local WebSocket server
+2. **Relay Server** — Local WebSocket server that bridges LLM ↔ Extension via Chrome's Native Messaging API (runs in background)
 
-3. **Skill File** — Tells Claude/Codex how to send commands
+3. **Skill File** — Tells Claude/Codex how to send commands
 
 **Data flow:**
 ```
 You: "Search Google for cats"
 ↓
-
+LLM → CLI command → Relay Server → Native Messaging → Extension → Browser action
 ↑
-Results ←
+Results ← Response ← Relay Server ← Native Messaging ← Page snapshot
 ```
 
 ---
@@ -229,4 +228,4 @@ MIT
 
 ---
 
-**
+**Works with [Claude Code](https://claude.ai/code), [Codex](https://openai.com/codex), and any LLM that can run shell commands.**
package/cli/command.js
CHANGED
@@ -182,16 +182,8 @@ function printScreenshot(msg) {
     console.error('Error:', msg.error);
     return;
   }
-  //
-
-  const filename = `/tmp/tab-agent-screenshot-${Date.now()}.png`;
-  const base64Data = msg.screenshot.replace(/^data:image\/png;base64,/, '');
-  fs.writeFileSync(filename, base64Data, 'base64');
-  console.log(`Screenshot saved: ${filename}`);
-
-  // Try to open it
-  const { exec } = require('child_process');
-  exec(`open "${filename}"`, () => {});
+  // Output base64 directly - no file, no auto-open
+  console.log(msg.screenshot);
 }
 
 function printResult(msg) {
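
Note on the `printScreenshot` change: in 0.3.1 the function prints `msg.screenshot` to stdout instead of writing a PNG under `/tmp` and opening it. A caller that still wants a file on disk would have to decode the output itself. The following is a minimal sketch of that step, not code from the package. `saveScreenshotOutput` is a hypothetical helper name, and the sketch assumes the printed value is either a `data:image/png;base64,` URL (as the prefix stripped by the removed 0.3.0 code suggests) or bare base64.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not part of tab-agent): decode the text printed by
// the screenshot command and write it to a PNG file.
function saveScreenshotOutput(output, filePath) {
  // Assumption: the output may carry the data-URL prefix that the removed
  // 0.3.0 code stripped; bare base64 passes through unchanged.
  const base64Data = output.trim().replace(/^data:image\/png;base64,/, '');
  fs.writeFileSync(filePath, base64Data, 'base64');
  return filePath;
}

// Usage sketch with a stand-in payload: the 8-byte PNG signature,
// not a real screenshot.
const sample = 'data:image/png;base64,' +
  Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]).toString('base64');
const target = path.join(os.tmpdir(), `tab-agent-screenshot-${Date.now()}.png`);
saveScreenshotOutput(sample, target);
```

In practice the `output` argument would be the captured stdout of `npx tab-agent screenshot`. Writing to `os.tmpdir()` rather than a hard-coded `/tmp` keeps the sketch portable across platforms.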