tab-agent 0.2.0 → 0.2.2
- package/README.md +178 -27
- package/package.json +9 -2
package/README.md
CHANGED
@@ -1,62 +1,213 @@
 # Tab Agent
 
-Secure
+**Secure browser control for Claude Code and Codex** — only the tabs you explicitly activate, not your entire browser.
 
-
+```
+┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+│  Claude/Codex   │────▶│  Relay Server   │────▶│    Extension    │
+│                 │◀────│      :9876      │◀────│    (Chrome)     │
+└─────────────────┘     └─────────────────┘     └─────────────────┘
+                                                         │
+                                                         ▼
+                                               ┌───────────────────┐
+                                               │  Your Active Tab  │
+                                               │       🟢 ON       │
+                                               └───────────────────┘
+```
+
+## Quick Start
 
-### 1. Load Extension
 ```bash
+# 1. Clone and load extension
 git clone https://github.com/DrHB/tab-agent
+# → Open chrome://extensions → Enable Developer mode → Load unpacked → select extension/
+
+# 2. Setup (auto-detects everything)
+npx tab-agent setup
+
+# 3. Use it
+# → Click Tab Agent icon on a tab (turns green)
+# → Ask Claude: "Use tab-agent to search Google for 'hello world'"
 ```
 
-
-
-
+---
+
+## Why Tab Agent?
+
+### 🔒 Security First
+
+| | Tab Agent | Traditional Automation |
+|--|-----------|----------------------|
+| **Access** | Only tabs you activate | Entire browser |
+| **Visibility** | Green badge = active | Hidden/background |
+| **Sessions** | Uses your cookies | Requires re-login |
+| **Credentials** | Never shared | Often required |
+| **Audit** | Full action logging | Varies |
+
+**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs AI can control.
+
+### 🍪 Works With Your Login Sessions
+
+Because Tab Agent runs as a Chrome extension:
+
+- **Uses your existing cookies** — no re-authentication needed
+- **Access any site you're logged into** — GitHub, Twitter, Gmail, internal tools
+- **Works with SSO and 2FA** — enterprise apps, protected accounts
+- **No credential sharing** — your passwords stay in your browser
+
+### 🤖 AI-Optimized
+
+- **Semantic snapshots** — pages converted to AI-readable text with refs `[e1]`, `[e2]`
+- **Screenshot fallback** — for complex dynamic pages
+- **Simple targeting** — click/type using refs instead of fragile CSS selectors
+
+---
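The element refs described under "AI-Optimized" can be illustrated with a small sketch. The README does not show the actual snapshot format, so the sample text and the `extractRefs` helper below are assumptions for illustration only; the one detail taken from the README is the `[e1]`, `[e2]` ref notation.

```javascript
// Hypothetical helper: collects the element refs that appear in a snapshot.
// The snapshot layout is invented; only the [eN] notation is from the README.
function extractRefs(snapshotText) {
  const refs = [];
  // Match "[e" + digits + "]" and capture the ref name without the brackets.
  for (const match of snapshotText.matchAll(/\[(e\d+)\]/g)) {
    if (!refs.includes(match[1])) refs.push(match[1]);
  }
  return refs;
}

// Made-up snapshot text for demonstration.
const sampleSnapshot = [
  'textbox "Search" [e1]',
  'button "Google Search" [e2]',
  'link "About" [e3]',
].join("\n");

console.log(extractRefs(sampleSnapshot)); // [ 'e1', 'e2', 'e3' ]
```

A client working this way would pass a ref such as `e2` to `click` rather than building a CSS selector, which is the targeting model the README describes.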
+
+## Example Use Cases
+
+**Web Research**
+> "Go to Hacker News and summarize the top 5 articles"
+
+**Authenticated Actions** (uses your session!)
+> "Check my GitHub notifications and list the unread ones"
+
+**Form Automation**
+> "Fill out this contact form with my details"
+
+**Data Extraction**
+> "Get the last 20 tweets from my timeline with author names"
+
+**Multi-step Workflows**
+> "Search Amazon for 'mechanical keyboard', filter by 4+ stars, and list the top 3"
+
+---
+
+## Installation
+
+### Step 1: Load Extension
+
+```bash
+git clone https://github.com/DrHB/tab-agent
+```
+
+1. Open `chrome://extensions` in your browser
+2. Enable **Developer mode** (toggle in top right)
+3. Click **Load unpacked**
+4. Select the `extension/` folder
+5. You'll see the Tab Agent icon in your toolbar
+
+### Step 2: Run Setup
 
-### 2. Setup
 ```bash
 npx tab-agent setup
 ```
 
-
+This automatically:
+- Detects your extension ID
+- Configures native messaging
+- Installs the Claude/Codex skill
 
-
+### Step 3: Activate & Use
 
-1.
-2.
+1. Navigate to any webpage
+2. **Click the Tab Agent icon** — it turns green (🟢 ON)
+3. Ask your AI to interact with the page
 
-
+---
 
+## Commands Reference
+
+### Navigation & Viewing
+| Command | Description |
+|---------|-------------|
+| `tabs` | List all activated tabs |
+| `navigate` | Go to a URL |
+| `snapshot` | Get AI-readable page with element refs |
+| `screenshot` | Capture viewport image |
+| `screenshot fullPage` | Capture entire page |
+
+### Interaction
 | Command | Description |
 |---------|-------------|
-| `tabs` | List activated tabs |
-| `snapshot` | Get AI-readable page with refs [e1], [e2]... |
-| `screenshot` | Capture viewport (or `fullPage: true` for full page) |
 | `click` | Click element by ref |
-| `
-| `type` | Type
-| `
-| `
+| `type` | Type text into element |
+| `type ... submit` | Type and press Enter |
+| `fill` | Fill a form field |
+| `batchfill` | Fill multiple fields at once |
+| `press` | Press a key (Enter, Escape, Tab, Arrows) |
+
+### Page Control
+| Command | Description |
+|---------|-------------|
+| `scroll` | Scroll up/down by amount |
 | `scrollintoview` | Scroll element into view |
-| `
-| `wait` | Wait for text or selector |
+| `wait` | Wait for text or element to appear |
 | `evaluate` | Run JavaScript in page context |
-| `batchfill` | Fill multiple fields at once |
 | `dialog` | Handle alert/confirm/prompt |
 
-
+---
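The command names in the three tables above can be gathered into one lookup, which a client could use to catch a misspelled command before sending anything. This is only a sketch: `COMMANDS` and `findCommandGroup` are hypothetical names, and nothing in the diff indicates that the package validates commands this way.

```javascript
// Command names copied from the README tables, grouped by their headings.
const COMMANDS = {
  "Navigation & Viewing": ["tabs", "navigate", "snapshot", "screenshot"],
  "Interaction": ["click", "type", "fill", "batchfill", "press"],
  "Page Control": ["scroll", "scrollintoview", "wait", "evaluate", "dialog"],
};

// Hypothetical helper: returns the group a command belongs to, or null if unknown.
function findCommandGroup(name) {
  for (const [group, names] of Object.entries(COMMANDS)) {
    if (names.includes(name)) return group;
  }
  return null;
}

console.log(findCommandGroup("batchfill")); // Interaction
console.log(findCommandGroup("hover")); // null
```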
+
+## CLI Reference
 
 ```bash
-npx tab-agent
-npx tab-agent
+npx tab-agent setup   # Initial configuration
+npx tab-agent status  # Check if everything works
+npx tab-agent start   # Start relay server manually
 ```
 
-
+---
+
+## Supported Browsers
+
+- Google Chrome
+- Brave
+- Microsoft Edge
+- Chromium
 
+Setup automatically detects your browser.
+
+---
+
+## Troubleshooting
+
+**Extension not detected?**
+- Ensure `extension/` folder is loaded in chrome://extensions
+- Developer mode must be enabled
+- Try refreshing the extensions page
+
+**Tab not responding?**
+- Click the Tab Agent icon — must show green "ON" badge
+- Refresh the page after activating
+
+**Relay connection issues?**
+- Run `npx tab-agent status` to check config
+- Run `npx tab-agent start` to see error details
+
+---
+
+## How It Works
+
+1. **Chrome Extension** — Runs in your browser with access to activated tabs and your session cookies
+
+2. **Relay Server** — Local WebSocket server (port 9876) that bridges AI ↔ Extension via Chrome's Native Messaging API
+
+3. **Skill File** — Tells Claude/Codex how to send commands to the relay
+
+**Data flow:**
 ```
-
+You: "Search Google for cats"
+↓
+Claude/Codex → WebSocket command → Relay Server → Native Messaging → Extension → DOM action
+↑
+Results ← WebSocket response ← Relay Server ← Native Messaging ← Page snapshot
 ```
 
+---
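The data flow above starts with a WebSocket command sent to the relay on port 9876. The README does not document the message format, so the URL host, the field names (`id`, `action`, `params`), and both helper functions below are assumptions that illustrate one common request/response pattern. No connection is opened.

```javascript
// Port 9876 comes from the README; the host and JSON shape are assumptions.
const RELAY_URL = "ws://localhost:9876";

let lastId = 0;

// Hypothetical: serialize one command as a JSON frame with a unique id.
function buildCommand(action, params = {}) {
  lastId += 1;
  return JSON.stringify({ id: lastId, action, params });
}

// Hypothetical: parse a reply and return it only if it answers `expectedId`.
function matchResponse(rawFrame, expectedId) {
  const message = JSON.parse(rawFrame);
  return message.id === expectedId ? message : null;
}

const frame = buildCommand("click", { ref: "e1" });
console.log(RELAY_URL, frame);
// ws://localhost:9876 {"id":1,"action":"click","params":{"ref":"e1"}}
```

Tagging each frame with an id is one way to pair the "WebSocket response" leg of the flow with the command that triggered it when several commands are in flight.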
+
 ## License
 
 MIT
+
+---
+
+**Made for [Claude Code](https://claude.ai/code) and [Codex](https://openai.com/codex)**
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "tab-agent",
-  "version": "0.2.
+  "version": "0.2.2",
   "description": "Browser control for Claude Code and Codex via WebSocket",
   "bin": {
     "tab-agent": "./bin/tab-agent.js"
@@ -19,7 +19,14 @@
   "dependencies": {
     "ws": "^8.16.0"
   },
-  "keywords": [
+  "keywords": [
+    "chrome",
+    "extension",
+    "browser",
+    "automation",
+    "claude",
+    "codex"
+  ],
   "repository": "https://github.com/DrHB/tab-agent",
   "license": "MIT"
 }