tab-agent 0.2.0 → 0.2.2

Files changed (2)
  1. package/README.md +178 -27
  2. package/package.json +9 -2
package/README.md CHANGED
@@ -1,62 +1,213 @@
  # Tab Agent
 
- Secure tab-level browser control for Claude Code and Codex — only the tabs you explicitly activate, not your entire browser.
+ **Secure browser control for Claude Code and Codex** — only the tabs you explicitly activate, not your entire browser.
 
- ## Install
+ ```
+ ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+ │  Claude/Codex   │────▶│  Relay Server   │────▶│    Extension    │
+ │                 │◀────│      :9876      │◀────│    (Chrome)     │
+ └─────────────────┘     └─────────────────┘     └─────────────────┘
+
+
+                                                ┌───────────────────┐
+                                                │  Your Active Tab  │
+                                                │       🟢 ON       │
+                                                └───────────────────┘
+ ```
+
+ ## Quick Start
 
- ### 1. Load Extension
  ```bash
+ # 1. Clone and load extension
  git clone https://github.com/DrHB/tab-agent
+ # → Open chrome://extensions → Enable Developer mode → Load unpacked → select extension/
+
+ # 2. Setup (auto-detects everything)
+ npx tab-agent setup
+
+ # 3. Use it
+ # → Click Tab Agent icon on a tab (turns green)
+ # → Ask Claude: "Use tab-agent to search Google for 'hello world'"
  ```
 
- 1. Open `chrome://extensions`
- 2. Enable **Developer mode**
- 3. Click **Load unpacked** → select `extension/` folder
+ ---
+
+ ## Why Tab Agent?
+
+ ### 🔒 Security First
+
+ | | Tab Agent | Traditional Automation |
+ |--|-----------|----------------------|
+ | **Access** | Only tabs you activate | Entire browser |
+ | **Visibility** | Green badge = active | Hidden/background |
+ | **Sessions** | Uses your cookies | Requires re-login |
+ | **Credentials** | Never shared | Often required |
+ | **Audit** | Full action logging | Varies |
+
+ **Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs AI can control.
+
+ ### 🍪 Works With Your Login Sessions
+
+ Because Tab Agent runs as a Chrome extension:
+
+ - **Uses your existing cookies** — no re-authentication needed
+ - **Access any site you're logged into** — GitHub, Twitter, Gmail, internal tools
+ - **Works with SSO and 2FA** — enterprise apps, protected accounts
+ - **No credential sharing** — your passwords stay in your browser
+
+ ### 🤖 AI-Optimized
+
+ - **Semantic snapshots** — pages converted to AI-readable text with refs `[e1]`, `[e2]`
+ - **Screenshot fallback** — for complex dynamic pages
+ - **Simple targeting** — click/type using refs instead of fragile CSS selectors
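
To make the ref-based targeting described in the bullets above more concrete, here is a minimal sketch of how a snapshot with `[e1]`, `[e2]` refs might be built and resolved. The element shape (`role`, `name`) and the helpers `buildSnapshot` and `resolveRef` are hypothetical names chosen for illustration; the README does not document tab-agent's actual snapshot format.

```javascript
// Hypothetical sketch: this is not tab-agent's real snapshot format.
// Each element gets a short ref (e1, e2, ...) in the order it is listed.
function buildSnapshot(elements) {
  const refs = new Map();
  const lines = elements.map((el, i) => {
    const ref = `e${i + 1}`;
    refs.set(ref, el);
    return `[${ref}] ${el.role} "${el.name}"`;
  });
  return { text: lines.join("\n"), refs };
}

// Resolve a ref from the snapshot back to the element it labels.
function resolveRef(snapshot, ref) {
  const el = snapshot.refs.get(ref);
  if (!el) throw new Error(`Unknown ref: ${ref}`);
  return el;
}

const snapshot = buildSnapshot([
  { role: "textbox", name: "Search" },
  { role: "button", name: "Google Search" },
]);
console.log(snapshot.text);
// [e1] textbox "Search"
// [e2] button "Google Search"
```

A caller would then target `e2` rather than a CSS selector, which is the property the "Simple targeting" bullet is describing.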
+
+ ---
+
+ ## Example Use Cases
+
+ **Web Research**
+ > "Go to Hacker News and summarize the top 5 articles"
+
+ **Authenticated Actions** (uses your session!)
+ > "Check my GitHub notifications and list the unread ones"
+
+ **Form Automation**
+ > "Fill out this contact form with my details"
+
+ **Data Extraction**
+ > "Get the last 20 tweets from my timeline with author names"
+
+ **Multi-step Workflows**
+ > "Search Amazon for 'mechanical keyboard', filter by 4+ stars, and list the top 3"
+
+ ---
+
+ ## Installation
+
+ ### Step 1: Load Extension
+
+ ```bash
+ git clone https://github.com/DrHB/tab-agent
+ ```
+
+ 1. Open `chrome://extensions` in your browser
+ 2. Enable **Developer mode** (toggle in top right)
+ 3. Click **Load unpacked**
+ 4. Select the `extension/` folder
+ 5. You'll see the Tab Agent icon in your toolbar
+
+ ### Step 2: Run Setup
 
- ### 2. Setup
  ```bash
  npx tab-agent setup
  ```
 
- That's it! The setup auto-detects your extension and configures everything.
+ This automatically:
+ - Detects your extension ID
+ - Configures native messaging
+ - Installs the Claude/Codex skill
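
For readers unfamiliar with what "configures native messaging" involves, the sketch below builds a Chrome native messaging host manifest for a detected extension ID. The manifest keys (`name`, `description`, `path`, `type`, `allowed_origins`) are Chrome's documented format; the host name `com.example.tab_agent`, the host path, and the `buildHostManifest` helper are assumptions for illustration, not what `npx tab-agent setup` is known to write.

```javascript
// Sketch of a Chrome native messaging host manifest.
// The host name and path below are hypothetical placeholders.
function buildHostManifest(extensionId, hostPath) {
  // Chrome extension IDs are 32 characters drawn from the letters a-p.
  if (!/^[a-p]{32}$/.test(extensionId)) {
    throw new Error(`Not a valid extension ID: ${extensionId}`);
  }
  return {
    name: "com.example.tab_agent", // hypothetical host name
    description: "Relay between the extension and a local process",
    path: hostPath, // path to the host executable
    type: "stdio", // the only type Chrome supports
    allowed_origins: [`chrome-extension://${extensionId}/`],
  };
}

const manifest = buildHostManifest("a".repeat(32), "/usr/local/bin/tab-agent-host");
console.log(JSON.stringify(manifest, null, 2));
```

Only the extension listed in `allowed_origins` can open a connection to the host, which is why setup needs the extension ID before it can write this file.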
 
- ## Use
+ ### Step 3: Activate & Use
 
- 1. Click Tab Agent icon on any tab (turns green = active)
- 2. Ask Claude/Codex: "Use tab-agent to search Google for 'hello world'"
+ 1. Navigate to any webpage
+ 2. **Click the Tab Agent icon** so that it turns green (🟢 ON)
+ 3. Ask your AI to interact with the page
 
- ## Commands
+ ---
 
+ ## Commands Reference
+
+ ### Navigation & Viewing
+ | Command | Description |
+ |---------|-------------|
+ | `tabs` | List all activated tabs |
+ | `navigate` | Go to a URL |
+ | `snapshot` | Get AI-readable page with element refs |
+ | `screenshot` | Capture viewport image |
+ | `screenshot fullPage` | Capture entire page |
+
+ ### Interaction
  | Command | Description |
  |---------|-------------|
- | `tabs` | List activated tabs |
- | `snapshot` | Get AI-readable page with refs [e1], [e2]... |
- | `screenshot` | Capture viewport (or `fullPage: true` for full page) |
  | `click` | Click element by ref |
- | `fill` | Fill form field |
- | `type` | Type text (with optional `submit: true`) |
- | `press` | Press key (Enter, Escape, Tab, Arrow*) |
- | `scroll` | Scroll page |
+ | `type` | Type text into element |
+ | `type ... submit` | Type and press Enter |
+ | `fill` | Fill a form field |
+ | `batchfill` | Fill multiple fields at once |
+ | `press` | Press a key (Enter, Escape, Tab, Arrows) |
+
+ ### Page Control
+ | Command | Description |
+ |---------|-------------|
+ | `scroll` | Scroll up/down by amount |
  | `scrollintoview` | Scroll element into view |
- | `navigate` | Go to URL |
- | `wait` | Wait for text or selector |
+ | `wait` | Wait for text or element to appear |
  | `evaluate` | Run JavaScript in page context |
- | `batchfill` | Fill multiple fields at once |
  | `dialog` | Handle alert/confirm/prompt |
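
The tables above name the commands but not how they are serialized. As a rough illustration only, the sketch below validates a command name against the documented list and wraps it in a JSON message. The message shape (`id`, `action`, plus parameters) and the `buildCommand` helper are assumptions; the README shows option names such as `submit: true` and `fullPage: true` but not the wire format.

```javascript
// Command names taken from the tables above.
const COMMANDS = new Set([
  "tabs", "navigate", "snapshot", "screenshot",
  "click", "type", "fill", "batchfill", "press",
  "scroll", "scrollintoview", "wait", "evaluate", "dialog",
]);

let nextId = 0;

// Hypothetical message shape: { id, action, ...params }.
// The real relay protocol may differ.
function buildCommand(action, params = {}) {
  if (!COMMANDS.has(action)) throw new Error(`Unknown command: ${action}`);
  nextId += 1;
  return JSON.stringify({ id: nextId, action, ...params });
}

console.log(buildCommand("type", { ref: "e1", text: "hello world", submit: true }));
// {"id":1,"action":"type","ref":"e1","text":"hello world","submit":true}
```

Rejecting unknown names before anything is sent keeps typos from reaching the browser as silent no-ops.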
 
- ## Manual Commands
+ ---
+
+ ## CLI Reference
 
  ```bash
- npx tab-agent status # Check configuration
- npx tab-agent start # Start relay manually
+ npx tab-agent setup # Initial configuration
+ npx tab-agent status # Check if everything works
+ npx tab-agent start # Start relay server manually
  ```
 
- ## Architecture
+ ---
+
+ ## Supported Browsers
+
+ - Google Chrome
+ - Brave
+ - Microsoft Edge
+ - Chromium
 
+ Setup automatically detects your browser.
+
+ ---
+
+ ## Troubleshooting
+
+ **Extension not detected?**
+ - Ensure `extension/` folder is loaded in chrome://extensions
+ - Developer mode must be enabled
+ - Try refreshing the extensions page
+
+ **Tab not responding?**
+ - Click the Tab Agent icon — must show green "ON" badge
+ - Refresh the page after activating
+
+ **Relay connection issues?**
+ - Run `npx tab-agent status` to check config
+ - Run `npx tab-agent start` to see error details
+
+ ---
+
+ ## How It Works
+
+ 1. **Chrome Extension** — Runs in your browser with access to activated tabs and your session cookies
+
+ 2. **Relay Server** — Local WebSocket server (port 9876) that bridges AI ↔ Extension via Chrome's Native Messaging API
+
+ 3. **Skill File** — Tells Claude/Codex how to send commands to the relay
+
+ **Data flow:**
  ```
- Claude/Codex → WebSocket:9876 Relay Native Messaging → Extension → DOM
+ You: "Search Google for cats"
+
+ Claude/Codex → WebSocket command → Relay Server → Native Messaging → Extension → DOM action
+
+ Results ← WebSocket response ← Relay Server ← Native Messaging ← Page snapshot
  ```
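
The Native Messaging hop in the data flow above uses a simple framing: Chrome exchanges UTF-8 JSON messages over stdio, each preceded by a 32-bit length in native byte order. The sketch below shows that framing in Node. The helper names `encodeFrame` and `decodeFrames` and the `{ action: "snapshot" }` payload are illustrative assumptions, not code taken from tab-agent.

```javascript
const os = require("node:os");

// Native messaging uses the platform's native byte order for the length prefix.
const LITTLE_ENDIAN = os.endianness() === "LE";

// Prefix a JSON message with its byte length as an unsigned 32-bit integer.
function encodeFrame(message) {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  const header = Buffer.alloc(4);
  if (LITTLE_ENDIAN) header.writeUInt32LE(body.length, 0);
  else header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Pull every complete message out of a buffer; return leftover bytes as `rest`.
function decodeFrames(buffer) {
  const messages = [];
  let offset = 0;
  while (buffer.length - offset >= 4) {
    const length = LITTLE_ENDIAN
      ? buffer.readUInt32LE(offset)
      : buffer.readUInt32BE(offset);
    if (buffer.length - offset - 4 < length) break; // incomplete frame
    const body = buffer.subarray(offset + 4, offset + 4 + length);
    messages.push(JSON.parse(body.toString("utf8")));
    offset += 4 + length;
  }
  return { messages, rest: buffer.subarray(offset) };
}

const demoFrame = encodeFrame({ action: "snapshot" }); // hypothetical payload
console.log(decodeFrames(demoFrame).messages);
```

Returning the undecoded remainder lets a caller append the next stdin chunk and try again, since a frame can arrive split across reads.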
 
+ ---
+
  ## License
 
  MIT
+
+ ---
+
+ **Made for [Claude Code](https://claude.ai/code) and [Codex](https://openai.com/codex)**
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "tab-agent",
-   "version": "0.2.0",
+   "version": "0.2.2",
    "description": "Browser control for Claude Code and Codex via WebSocket",
    "bin": {
      "tab-agent": "./bin/tab-agent.js"
@@ -19,7 +19,14 @@
    "dependencies": {
      "ws": "^8.16.0"
    },
-   "keywords": ["chrome", "extension", "browser", "automation", "claude", "codex"],
+   "keywords": [
+     "chrome",
+     "extension",
+     "browser",
+     "automation",
+     "claude",
+     "codex"
+   ],
    "repository": "https://github.com/DrHB/tab-agent",
    "license": "MIT"
  }