tab-agent 0.2.1 → 0.2.2
- package/README.md +159 -104
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,135 +1,161 @@
 # Tab Agent

-Secure
+**Secure browser control for Claude Code and Codex** — only the tabs you explicitly activate, not your entire browser.
+
+```
+┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+│  Claude/Codex   │────▶│  Relay Server   │────▶│    Extension    │
+│                 │◀────│      :9876      │◀────│    (Chrome)     │
+└─────────────────┘     └─────────────────┘     └─────────────────┘
+                                                         │
+                                                         ▼
+                                               ┌───────────────────┐
+                                               │  Your Active Tab  │
+                                               │       🟢 ON        │
+                                               └───────────────────┘
+```
+
+## Quick Start
+
+```bash
+# 1. Clone and load extension
+git clone https://github.com/DrHB/tab-agent
+# → Open chrome://extensions → Enable Developer mode → Load unpacked → select extension/
+
+# 2. Setup (auto-detects everything)
+npx tab-agent setup
+
+# 3. Use it
+# → Click Tab Agent icon on a tab (turns green)
+# → Ask Claude: "Use tab-agent to search Google for 'hello world'"
+```
+
+---

 ## Why Tab Agent?

-### Security First
-
-
-
-
-
+### 🔒 Security First
+
+| | Tab Agent | Traditional Automation |
+|--|-----------|----------------------|
+| **Access** | Only tabs you activate | Entire browser |
+| **Visibility** | Green badge = active | Hidden/background |
+| **Sessions** | Uses your cookies | Requires re-login |
+| **Credentials** | Never shared | Often required |
+| **Audit** | Full action logging | Varies |
+
+**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs AI can control.
+
+### 🍪 Works With Your Login Sessions

-
-Tab Agent operates through a Chrome extension, which means:
-- **Uses your existing cookies and login sessions** — no need to re-authenticate
-- Access sites that require login (GitHub, Twitter, internal tools, etc.)
-- Works with SSO, 2FA-protected accounts, and enterprise apps
-- No credential sharing or token management needed
+Because Tab Agent runs as a Chrome extension:

-
-- **
-- **
-- **
+- **Uses your existing cookies** — no re-authentication needed
+- **Access any site you're logged into** — GitHub, Twitter, Gmail, internal tools
+- **Works with SSO and 2FA** — enterprise apps, protected accounts
+- **No credential sharing** — your passwords stay in your browser

-
+### 🤖 AI-Optimized

-
+- **Semantic snapshots** — pages converted to AI-readable text with refs `[e1]`, `[e2]`
+- **Screenshot fallback** — for complex dynamic pages
+- **Simple targeting** — click/type using refs instead of fragile CSS selectors
+
+---
+
+## Example Use Cases
+
+**Web Research**
+> "Go to Hacker News and summarize the top 5 articles"
+
+**Authenticated Actions** (uses your session!)
+> "Check my GitHub notifications and list the unread ones"
+
+**Form Automation**
+> "Fill out this contact form with my details"
+
+**Data Extraction**
+> "Get the last 20 tweets from my timeline with author names"
+
+**Multi-step Workflows**
+> "Search Amazon for 'mechanical keyboard', filter by 4+ stars, and list the top 3"
+
+---
+
+## Installation
+
+### Step 1: Load Extension

 ```bash
 git clone https://github.com/DrHB/tab-agent
 ```

-1. Open `chrome://extensions`
-2. Enable **Developer mode** (top right)
-3. Click **Load unpacked**
-4.
+1. Open `chrome://extensions` in your browser
+2. Enable **Developer mode** (toggle in top right)
+3. Click **Load unpacked**
+4. Select the `extension/` folder
+5. You'll see the Tab Agent icon in your toolbar

-### 2
+### Step 2: Run Setup

 ```bash
 npx tab-agent setup
 ```

-This
-
-
+This automatically:
+- Detects your extension ID
+- Configures native messaging
+- Installs the Claude/Codex skill

-
-2. **Ask Claude/Codex:**
-- "Use tab-agent to search Google for 'best restaurants nearby'"
-- "Go to my GitHub notifications and summarize them"
-- "Fill out this form with my details"
-
-## Example Use Cases
+### Step 3: Activate & Use

-
-
-
-```
-
-### Authenticated Actions
-```
-"Check my GitHub notifications and mark the resolved ones as read"
-```
-Works because Tab Agent uses your existing GitHub session!
+1. Navigate to any webpage
+2. **Click the Tab Agent icon** — it turns green (🟢 ON)
+3. Ask your AI to interact with the page

-
-```
-"Fill out this job application with my resume details"
-```
+---

-
-```
-"Go to my Twitter timeline and get the last 20 tweets"
-```
+## Commands Reference

-
+### Navigation & Viewing
+| Command | Description |
+|---------|-------------|
+| `tabs` | List all activated tabs |
+| `navigate` | Go to a URL |
+| `snapshot` | Get AI-readable page with element refs |
+| `screenshot` | Capture viewport image |
+| `screenshot fullPage` | Capture entire page |

+### Interaction
 | Command | Description |
 |---------|-------------|
-| `tabs` | List activated tabs |
-| `snapshot` | Get AI-readable page with refs [e1], [e2]... |
-| `screenshot` | Capture viewport (add `fullPage: true` for full page) |
 | `click` | Click element by ref |
-| `
-| `type` | Type
-| `
-| `scroll` | Scroll page up/down |
-| `scrollintoview` | Scroll element into view |
-| `navigate` | Go to URL |
-| `wait` | Wait for text or selector to appear |
-| `evaluate` | Run JavaScript in page context |
+| `type` | Type text into element |
+| `type ... submit` | Type and press Enter |
+| `fill` | Fill a form field |
 | `batchfill` | Fill multiple fields at once |
-| `
+| `press` | Press a key (Enter, Escape, Tab, Arrows) |

-
+### Page Control
+| Command | Description |
+|---------|-------------|
+| `scroll` | Scroll up/down by amount |
+| `scrollintoview` | Scroll element into view |
+| `wait` | Wait for text or element to appear |
+| `evaluate` | Run JavaScript in page context |
+| `dialog` | Handle alert/confirm/prompt |

-
-npx tab-agent setup # Configure everything (run once)
-npx tab-agent status # Check if everything is working
-npx tab-agent start # Manually start the relay server
-```
+---

-##
+## CLI Reference

+```bash
+npx tab-agent setup # Initial configuration
+npx tab-agent status # Check if everything works
+npx tab-agent start # Start relay server manually
 ```
-┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
-│  Claude/Codex   │────▶│  Relay Server   │────▶│    Extension    │
-│    (Your AI)    │◀────│   (WebSocket)   │◀────│    (Chrome)     │
-└─────────────────┘     └─────────────────┘     └─────────────────┘
-                               :9876                     │
-                                                         ▼
-                                                ┌─────────────────┐
-                                                │  Activated Tab  │
-                                                │  (Green = ON)   │
-                                                └─────────────────┘
-```
-
-1. **Extension** runs in Chrome with access to your tabs and sessions
-2. **Relay Server** bridges WebSocket (AI) ↔ Native Messaging (Extension)
-3. **AI** sends commands, receives snapshots/screenshots, takes actions
-
-## Security Model

-
-|---------|-----------|----------------------|
-| Tab Access | Only activated tabs | All tabs or new browser |
-| Sessions | Uses existing cookies | Requires re-login |
-| Visibility | Green badge shows active | Hidden/background |
-| Audit | Full action logging | Varies |
-| Credentials | Never shared | Often required |
+---

 ## Supported Browsers

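The README text added in this hunk describes semantic snapshots as pages converted to AI-readable text with refs such as `[e1]` and `[e2]`, which `click` and `type` then target. The diff does not show how the extension builds them, so the following is only a minimal sketch of the idea in Python. It assumes that interactive elements are numbered in document order; `SnapshotBuilder`, `snapshot`, and the choice of tags are hypothetical and not part of tab-agent.

```python
from html.parser import HTMLParser

# Hypothetical choice of which tags count as "interactive".
INTERACTIVE = {"a", "button", "input", "select", "textarea"}
VOID = {"input"}  # no closing tag, so no inner text to collect


class SnapshotBuilder(HTMLParser):
    """Assigns e1, e2, ... to interactive elements in document order."""

    def __init__(self):
        super().__init__()
        self.lines = []    # one rendered line per ref
        self.refs = {}     # ref -> (tag, attribute dict)
        self._open = None  # index of the line still collecting inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE:
            return
        ref = f"e{len(self.refs) + 1}"
        attrs = dict(attrs)
        self.refs[ref] = (tag, attrs)
        label = attrs.get("placeholder") or attrs.get("name") or ""
        self.lines.append(f"[{ref}] {tag} {label}".rstrip())
        self._open = None if tag in VOID else len(self.lines) - 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._open is not None:
            self.lines[self._open] += f' "{text}"'

    def handle_endtag(self, tag):
        if tag in INTERACTIVE:
            self._open = None


def snapshot(html):
    """Return (snapshot text, ref table) for an HTML string."""
    builder = SnapshotBuilder()
    builder.feed(html)
    builder.close()
    return "\n".join(builder.lines), builder.refs
```

A caller would keep the ref table next to the text, so that a later `click` on `e2` can be resolved back to the element it named without relying on a CSS selector.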
@@ -138,21 +164,50 @@ npx tab-agent start # Manually start the relay server
 - Microsoft Edge
 - Chromium

-
+Setup automatically detects your browser.
+
+---

 ## Troubleshooting

 **Extension not detected?**
--
--
+- Ensure `extension/` folder is loaded in chrome://extensions
+- Developer mode must be enabled
+- Try refreshing the extensions page
+
+**Tab not responding?**
+- Click the Tab Agent icon — must show green "ON" badge
+- Refresh the page after activating
+
+**Relay connection issues?**
+- Run `npx tab-agent status` to check config
+- Run `npx tab-agent start` to see error details
+
+---

-
-
-
+## How It Works
+
+1. **Chrome Extension** — Runs in your browser with access to activated tabs and your session cookies
+
+2. **Relay Server** — Local WebSocket server (port 9876) that bridges AI ↔ Extension via Chrome's Native Messaging API

-**
-
+3. **Skill File** — Tells Claude/Codex how to send commands to the relay
+
+**Data flow:**
+```
+You: "Search Google for cats"
+↓
+Claude/Codex → WebSocket command → Relay Server → Native Messaging → Extension → DOM action
+↑
+Results ← WebSocket response ← Relay Server ← Native Messaging ← Page snapshot
+```
+
+---

 ## License

 MIT
+
+---
+
+**Made for [Claude Code](https://claude.ai/code) and [Codex](https://openai.com/codex)**
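The How It Works text added in this version says the relay bridges the AI and the extension through Chrome's Native Messaging API. Chrome frames each native message as UTF-8 JSON preceded by a 32-bit length in native byte order. The sketch below shows that framing in Python; it is not taken from tab-agent's code, and the function names and the `{"action": ..., "ref": ...}` command shape are assumptions made for illustration.

```python
import io
import json
import struct


def encode_message(payload):
    """Frame a JSON-serializable payload the way Native Messaging expects."""
    body = json.dumps(payload).encode("utf-8")
    # "=I" packs an unsigned 32-bit length in native byte order, unpadded.
    return struct.pack("=I", len(body)) + body


def read_message(stream):
    """Read one framed message from a binary stream; None at end of stream."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("stream ended in the middle of a message")
    return json.loads(body.decode("utf-8"))


# Round trip through an in-memory stream; the command shape is made up.
stream = io.BytesIO(encode_message({"action": "click", "ref": "e2"}))
command = read_message(stream)
```

In a real native messaging host the stream would be the process's binary stdin and stdout rather than a `BytesIO`, which is why the length prefix matters: it is the only way to tell where one message ends and the next begins.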