tab-agent 0.2.0 → 0.2.1
- package/README.md +114 -18
- package/package.json +9 -2
package/README.md CHANGED
@@ -2,28 +2,78 @@
 
 Secure tab-level browser control for Claude Code and Codex — only the tabs you explicitly activate, not your entire browser.
 
+## Why Tab Agent?
+
+### Security First
+Unlike browser automation tools that control your entire browser, Tab Agent uses a **click-to-activate** model:
+- Only tabs you explicitly activate (green badge) can be controlled
+- Your banking, email, and other sensitive tabs remain completely isolated
+- No background access — you see exactly which tabs AI can interact with
+- Full audit logging of every action taken
+
+### Works With Your Session
+Tab Agent operates through a Chrome extension, which means:
+- **Uses your existing cookies and login sessions** — no need to re-authenticate
+- Access sites that require login (GitHub, Twitter, internal tools, etc.)
+- Works with SSO, 2FA-protected accounts, and enterprise apps
+- No credential sharing or token management needed
+
+### AI-Optimized
+- **Semantic snapshots** — pages converted to AI-readable text with element refs `[e1]`, `[e2]`
+- **Screenshot fallback** — for complex/dynamic pages, get visual screenshots
+- **Smart element targeting** — click, type, fill using simple refs instead of fragile selectors
+
 ## Install
 
 ### 1. Load Extension
+
 ```bash
 git clone https://github.com/DrHB/tab-agent
 ```
 
 1. Open `chrome://extensions`
-2. Enable **Developer mode**
-3. Click **Load unpacked** → select `extension/` folder
+2. Enable **Developer mode** (top right)
+3. Click **Load unpacked** → select the `extension/` folder
+4. You'll see the Tab Agent icon in your toolbar
+
+### 2. Run Setup
 
-### 2. Setup
 ```bash
 npx tab-agent setup
 ```
 
-
+This auto-detects your extension and configures everything (native messaging + skills).
+
+### 3. Use It
 
-
+1. **Click the Tab Agent icon** on any tab you want to control (turns green = active)
+2. **Ask Claude/Codex:**
+   - "Use tab-agent to search Google for 'best restaurants nearby'"
+   - "Go to my GitHub notifications and summarize them"
+   - "Fill out this form with my details"
+
+## Example Use Cases
+
+### Web Research
+```
+"Go to Hacker News and get me the top 5 articles with summaries"
+```
+
+### Authenticated Actions
+```
+"Check my GitHub notifications and mark the resolved ones as read"
+```
+Works because Tab Agent uses your existing GitHub session!
+
+### Form Automation
+```
+"Fill out this job application with my resume details"
+```
 
-
-
+### Data Extraction
+```
+"Go to my Twitter timeline and get the last 20 tweets"
+```
 
 ## Commands
 
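As a reading aid for the "Semantic snapshots" bullet added in the hunk above, the following sketch shows one way a page's elements could be flattened into text lines tagged `[e1]`, `[e2]`, and later resolved back from a ref. The element shape (`role`, `name`), the line format, and the function names `buildSnapshot` and `resolveRef` are assumptions made for illustration; the diff does not show tab-agent's actual snapshot format or code.

```javascript
// Hypothetical sketch: not tab-agent's real snapshot implementation.
// Assigns sequential refs (e1, e2, ...) to a list of elements and renders
// one text line per element, in the spirit of the README's "[e1], [e2]".
function buildSnapshot(elements) {
  const refs = new Map(); // ref -> original element
  const lines = elements.map((el, i) => {
    const ref = `e${i + 1}`;
    refs.set(ref, el);
    return `[${ref}] ${el.role} "${el.name}"`;
  });
  return { text: lines.join("\n"), refs };
}

// Looks up the element behind a ref such as "e2". Throws on unknown refs
// so a stale ref fails loudly instead of targeting the wrong element.
function resolveRef(snapshot, ref) {
  const el = snapshot.refs.get(ref);
  if (!el) throw new Error(`Unknown ref: ${ref}`);
  return el;
}

// Example input (made-up page contents).
const snapshot = buildSnapshot([
  { role: "textbox", name: "Search" },
  { role: "button", name: "Submit" },
]);
console.log(snapshot.text);
```

The point of the design is that a short ref stays valid for the lifetime of one snapshot, which is what lets commands such as `click` and `fill` avoid CSS selectors.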
@@ -31,32 +81,78 @@ That's it! The setup auto-detects your extension and configures everything.
 |---------|-------------|
 | `tabs` | List activated tabs |
 | `snapshot` | Get AI-readable page with refs [e1], [e2]... |
-| `screenshot` | Capture viewport (
+| `screenshot` | Capture viewport (add `fullPage: true` for full page) |
 | `click` | Click element by ref |
 | `fill` | Fill form field |
-| `type` | Type text (
-| `press` | Press key (Enter, Escape, Tab, Arrow
-| `scroll` | Scroll page |
+| `type` | Type text (add `submit: true` to press Enter) |
+| `press` | Press key (Enter, Escape, Tab, Arrow keys) |
+| `scroll` | Scroll page up/down |
 | `scrollintoview` | Scroll element into view |
 | `navigate` | Go to URL |
-| `wait` | Wait for text or selector |
+| `wait` | Wait for text or selector to appear |
 | `evaluate` | Run JavaScript in page context |
 | `batchfill` | Fill multiple fields at once |
-| `dialog` | Handle alert/confirm/prompt |
+| `dialog` | Handle alert/confirm/prompt dialogs |
 
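The command table names the available commands and two options (`fullPage: true` for `screenshot`, `submit: true` for `type`). As a rough illustration only, the sketch below builds a JSON message for such a command and rejects names that are not in the table. The message shape (`command`, `tabId`, `params`), the `ref` and `text` parameter names, and the helper `buildCommand` are assumptions; the diff does not document the wire format used between the AI and the relay.

```javascript
// Hypothetical message builder; the real tab-agent wire format is not shown in this diff.
// Command names are copied from the README table.
const COMMANDS = new Set([
  "tabs", "snapshot", "screenshot", "click", "fill", "type", "press",
  "scroll", "scrollintoview", "navigate", "wait", "evaluate", "batchfill", "dialog",
]);

// Returns a JSON string for one command, or throws if the name is unknown.
function buildCommand(command, tabId, params = {}) {
  if (!COMMANDS.has(command)) {
    throw new Error(`Unknown command: ${command}`);
  }
  return JSON.stringify({ command, tabId, params });
}

// Options mentioned in the table: fullPage for screenshot, submit for type.
// The `ref` and `text` keys are assumed names, used here only for illustration.
const shot = buildCommand("screenshot", 1, { fullPage: true });
const typed = buildCommand("type", 1, { ref: "e1", text: "hello", submit: true });
console.log(shot);
console.log(typed);
```

Validating the command name before serializing keeps typos from reaching the browser side, where they would be harder to diagnose.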
-##
+## CLI Commands
 
 ```bash
-npx tab-agent
-npx tab-agent
+npx tab-agent setup # Configure everything (run once)
+npx tab-agent status # Check if everything is working
+npx tab-agent start # Manually start the relay server
 ```
 
-##
+## How It Works
 
 ```
-
+┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+│  Claude/Codex   │────▶│  Relay Server   │────▶│    Extension    │
+│    (Your AI)    │◀────│   (WebSocket)   │◀────│    (Chrome)     │
+└─────────────────┘     └─────────────────┘     └─────────────────┘
+                               :9876                     │
+                                                         ▼
+                                                ┌─────────────────┐
+                                                │  Activated Tab  │
+                                                │  (Green = ON)   │
+                                                └─────────────────┘
 ```
 
+1. **Extension** runs in Chrome with access to your tabs and sessions
+2. **Relay Server** bridges WebSocket (AI) ↔ Native Messaging (Extension)
+3. **AI** sends commands, receives snapshots/screenshots, takes actions
+
+## Security Model
+
+| Feature | Tab Agent | Traditional Automation |
+|---------|-----------|----------------------|
+| Tab Access | Only activated tabs | All tabs or new browser |
+| Sessions | Uses existing cookies | Requires re-login |
+| Visibility | Green badge shows active | Hidden/background |
+| Audit | Full action logging | Varies |
+| Credentials | Never shared | Often required |
+
+## Supported Browsers
+
+- Google Chrome
+- Brave
+- Microsoft Edge
+- Chromium
+
+The setup automatically detects which browser you're using.
+
+## Troubleshooting
+
+**Extension not detected?**
+- Make sure you loaded the `extension/` folder in chrome://extensions
+- Check that Developer mode is enabled
+
+**Commands not working?**
+- Click the Tab Agent icon to activate the tab (must show green "ON")
+- Run `npx tab-agent status` to check configuration
+
+**Relay not connecting?**
+- Run `npx tab-agent start` manually to see any errors
+
 ## License
 
 MIT
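To make the "Security Model" table in the README diff more concrete, here is a minimal sketch of a click-to-activate gate: commands are dispatched only to tabs that were explicitly activated, and every attempt is appended to an audit log. The class name `TabGate`, its methods, and the log entry shape are hypothetical; the diff does not show how the extension implements activation or logging.

```javascript
// Hypothetical sketch of a click-to-activate gate; not the extension's real code.
class TabGate {
  constructor() {
    this.activeTabs = new Set(); // ids of tabs the user has activated
    this.auditLog = [];          // one entry per attempted command
  }

  // Called when the user clicks the toolbar icon on a tab.
  // Returns true if the tab is now active (green), false if it was turned off.
  toggle(tabId) {
    if (this.activeTabs.has(tabId)) {
      this.activeTabs.delete(tabId);
      return false;
    }
    this.activeTabs.add(tabId);
    return true;
  }

  // Runs `handler` only for activated tabs and records the attempt either way.
  dispatch(tabId, command, handler) {
    const allowed = this.activeTabs.has(tabId);
    this.auditLog.push({ tabId, command, allowed, at: new Date().toISOString() });
    if (!allowed) {
      return { ok: false, error: `Tab ${tabId} is not activated` };
    }
    return { ok: true, result: handler() };
  }
}

const gate = new TabGate();
gate.toggle(1); // user activates tab 1; tab 2 stays untouched
const allowedResult = gate.dispatch(1, "snapshot", () => "page text");
const blockedResult = gate.dispatch(2, "click", () => "should not run");
console.log(allowedResult.ok, blockedResult.ok, gate.auditLog.length);
```

Logging the attempt before the permission check means rejected commands also leave a trace, which is one plausible reading of "full action logging".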
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "tab-agent",
-  "version": "0.2.
+  "version": "0.2.1",
   "description": "Browser control for Claude Code and Codex via WebSocket",
   "bin": {
     "tab-agent": "./bin/tab-agent.js"
@@ -19,7 +19,14 @@
   "dependencies": {
     "ws": "^8.16.0"
   },
-  "keywords": [
+  "keywords": [
+    "chrome",
+    "extension",
+    "browser",
+    "automation",
+    "claude",
+    "codex"
+  ],
   "repository": "https://github.com/DrHB/tab-agent",
   "license": "MIT"
 }