tab-agent 0.3.2 → 0.3.3
- package/README.md +80 -145
- package/bin/tab-agent.js +17 -15
- package/cli/command.js +11 -11
- package/cli/status.js +2 -2
- package/package.json +23 -8
- package/relay/install-native-host.sh +2 -2
- package/relay/native-host-wrapper.sh +1 -1
- package/relay/package.json +2 -2
- package/relay/server.js +1 -1
- package/skills/claude-code/tab-agent.md +4 -9
- package/skills/codex/tab-agent.md +1 -1
package/README.md
CHANGED
|
@@ -1,12 +1,15 @@
|
|
|
1
1
|
# Tab Agent
|
|
2
2
|
|
|
3
3
|
[](https://www.npmjs.com/package/tab-agent)
|
|
4
|
+
[](https://opensource.org/licenses/MIT)
|
|
4
5
|
|
|
5
|
-
**Give
|
|
6
|
+
**Give LLMs full control of your browser** — securely, with click-to-activate permission.
|
|
7
|
+
|
|
8
|
+
Works with Claude, ChatGPT, Codex, and any AI that can run shell commands.
|
|
6
9
|
|
|
7
10
|
```
|
|
8
11
|
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
|
9
|
-
│
|
|
12
|
+
│ Claude / GPT │────▶│ Relay Server │────▶│ Extension │
|
|
10
13
|
│ Codex / LLM │◀────│ (background) │◀────│ (Chrome) │
|
|
11
14
|
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
|
12
15
|
│
|
|
@@ -20,160 +23,103 @@
|
|
|
20
23
|
## Features
|
|
21
24
|
|
|
22
25
|
- **Full browser control** — navigate, click, type, scroll, screenshot, run JavaScript
|
|
23
|
-
- **Uses your login sessions** — access
|
|
24
|
-
- **Runs in background** — relay
|
|
25
|
-
- **Click-to-activate security** — only tabs you explicitly enable,
|
|
26
|
-
- **AI-optimized snapshots** — pages converted to
|
|
27
|
-
- **Works with any LLM** — Claude
|
|
28
|
-
|
|
29
|
-
---
|
|
26
|
+
- **Uses your login sessions** — access GitHub, Gmail, Amazon without sharing credentials
|
|
27
|
+
- **Runs in background** — relay starts automatically, works while you do other things
|
|
28
|
+
- **Click-to-activate security** — only tabs you explicitly enable, others stay private
|
|
29
|
+
- **AI-optimized snapshots** — pages converted to text with refs `[e1]`, `[e2]` for easy targeting
|
|
30
|
+
- **Works with any LLM** — Claude, ChatGPT, Codex, or custom AI agents
|
|
30
31
|
|
|
31
32
|
## Quick Start
|
|
32
33
|
|
|
33
34
|
```bash
|
|
34
|
-
# 1.
|
|
35
|
+
# 1. Install extension
|
|
35
36
|
git clone https://github.com/DrHB/tab-agent
|
|
36
|
-
#
|
|
37
|
+
# Chrome: chrome://extensions → Developer mode → Load unpacked → select extension/
|
|
37
38
|
|
|
38
|
-
# 2. Setup
|
|
39
|
+
# 2. Setup
|
|
39
40
|
npx tab-agent setup
|
|
40
41
|
|
|
41
|
-
# 3. Activate
|
|
42
|
-
#
|
|
43
|
-
#
|
|
42
|
+
# 3. Activate & go
|
|
43
|
+
# Click extension icon on any tab (turns green)
|
|
44
|
+
# Ask your AI: "Search Amazon for mechanical keyboards and find the best rated"
|
|
44
45
|
```
|
|
45
46
|
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
## Why Tab Agent?
|
|
49
|
-
|
|
50
|
-
### 🔒 Security First
|
|
51
|
-
|
|
52
|
-
| | Tab Agent | Traditional Automation |
|
|
53
|
-
|--|-----------|----------------------|
|
|
54
|
-
| **Access** | Only tabs you activate | Entire browser |
|
|
55
|
-
| **Visibility** | Green badge = active | Hidden/background |
|
|
56
|
-
| **Sessions** | Uses your cookies | Requires re-login |
|
|
57
|
-
| **Credentials** | Never shared | Often required |
|
|
58
|
-
|
|
59
|
-
**Click-to-activate model:** Your banking, email, and sensitive tabs stay completely isolated. You always see exactly which tabs the LLM can control.
|
|
60
|
-
|
|
61
|
-
### 🍪 Works With Your Login Sessions
|
|
62
|
-
|
|
63
|
-
Because Tab Agent runs as a Chrome extension:
|
|
47
|
+
## Example Tasks
|
|
64
48
|
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
- **No credential sharing** — your passwords stay in your browser
|
|
69
|
-
|
|
70
|
-
### 🤖 LLM-Optimized
|
|
71
|
-
|
|
72
|
-
- **Semantic snapshots** — pages converted to readable text with refs `[e1]`, `[e2]`
|
|
73
|
-
- **Screenshot fallback** — for complex dynamic pages
|
|
74
|
-
- **Simple targeting** — click/type using refs instead of fragile CSS selectors
|
|
75
|
-
|
|
76
|
-
---
|
|
49
|
+
```bash
|
|
50
|
+
# Research
|
|
51
|
+
"Go to Hacker News and summarize the top 5 stories"
|
|
77
52
|
|
|
78
|
-
|
|
53
|
+
# Shopping (uses your login!)
|
|
54
|
+
"Search Amazon for protein powder, filter by 4+ stars, find the best value"
|
|
79
55
|
|
|
80
|
-
|
|
81
|
-
|
|
56
|
+
# Social Media
|
|
57
|
+
"Check my GitHub notifications and list unread ones"
|
|
82
58
|
|
|
83
|
-
|
|
84
|
-
|
|
59
|
+
# Data Extraction
|
|
60
|
+
"Get the titles and prices of the first 10 products on this page"
|
|
85
61
|
|
|
86
|
-
|
|
87
|
-
|
|
62
|
+
# Automation
|
|
63
|
+
"Fill out this form with my details"
|
|
64
|
+
```
|
|
88
65
|
|
|
89
|
-
|
|
90
|
-
> "Get the last 20 posts from my X timeline with author names"
|
|
66
|
+
## Commands
|
|
91
67
|
|
|
92
|
-
|
|
93
|
-
|
|
68
|
+
```bash
|
|
69
|
+
# Core workflow
|
|
70
|
+
npx tab-agent snapshot # Get page content with refs [e1], [e2]...
|
|
71
|
+
npx tab-agent click <ref> # Click element (e.g., click e5)
|
|
72
|
+
npx tab-agent type <ref> <text> # Type into element
|
|
73
|
+
npx tab-agent fill <ref> <value> # Fill form field
|
|
74
|
+
|
|
75
|
+
# Navigation
|
|
76
|
+
npx tab-agent navigate <url> # Go to URL
|
|
77
|
+
npx tab-agent scroll <dir> [amount] # Scroll up/down
|
|
78
|
+
npx tab-agent press <key> # Press key (Enter, Escape, Tab)
|
|
79
|
+
|
|
80
|
+
# Utilities
|
|
81
|
+
npx tab-agent tabs # List active tabs
|
|
82
|
+
npx tab-agent wait <text> # Wait for text to appear
|
|
83
|
+
npx tab-agent screenshot # Capture page (fallback for complex UIs)
|
|
84
|
+
```
|
|
94
85
|
|
|
95
|
-
|
|
86
|
+
**Workflow:** `snapshot` → use refs → `click`/`type` → `snapshot` again → repeat
|
|
96
87
|
|
|
97
88
|
## Installation
|
|
98
89
|
|
|
99
|
-
###
|
|
90
|
+
### 1. Load Extension
|
|
100
91
|
|
|
101
92
|
```bash
|
|
102
93
|
git clone https://github.com/DrHB/tab-agent
|
|
103
94
|
```
|
|
104
95
|
|
|
105
|
-
1. Open `chrome://extensions`
|
|
106
|
-
2. Enable **Developer mode** (
|
|
96
|
+
1. Open `chrome://extensions`
|
|
97
|
+
2. Enable **Developer mode** (top right)
|
|
107
98
|
3. Click **Load unpacked**
|
|
108
99
|
4. Select the `extension/` folder
|
|
109
|
-
5. You'll see the Tab Agent icon in your toolbar
|
|
110
100
|
|
|
111
|
-
###
|
|
101
|
+
### 2. Run Setup
|
|
112
102
|
|
|
113
103
|
```bash
|
|
114
104
|
npx tab-agent setup
|
|
115
105
|
```
|
|
116
106
|
|
|
117
|
-
This
|
|
118
|
-
- Detects your extension ID
|
|
119
|
-
- Configures native messaging
|
|
120
|
-
- Installs the Claude/Codex skill
|
|
107
|
+
This auto-detects your extension and configures everything.
|
|
121
108
|
|
|
122
|
-
###
|
|
109
|
+
### 3. Activate Tabs
|
|
123
110
|
|
|
124
|
-
|
|
125
|
-
2. **Click the Tab Agent icon** — it turns green (🟢 ON)
|
|
126
|
-
3. Ask your LLM to interact with the page
|
|
111
|
+
Click the Tab Agent icon on any tab you want to control. Green = active.
|
|
127
112
|
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
## Commands Reference
|
|
131
|
-
|
|
132
|
-
### Navigation & Viewing
|
|
133
|
-
| Command | Description |
|
|
134
|
-
|---------|-------------|
|
|
135
|
-
| `tabs` | List all activated tabs |
|
|
136
|
-
| `navigate` | Go to a URL |
|
|
137
|
-
| `snapshot` | Get page with element refs |
|
|
138
|
-
| `screenshot` | Capture viewport image |
|
|
139
|
-
| `screenshot --full` | Capture entire page |
|
|
140
|
-
|
|
141
|
-
### Interaction
|
|
142
|
-
| Command | Description |
|
|
143
|
-
|---------|-------------|
|
|
144
|
-
| `click` | Click element by ref |
|
|
145
|
-
| `type` | Type text into element |
|
|
146
|
-
| `fill` | Fill a form field |
|
|
147
|
-
| `press` | Press a key (Enter, Escape, Tab, Arrows) |
|
|
148
|
-
|
|
149
|
-
### Page Control
|
|
150
|
-
| Command | Description |
|
|
151
|
-
|---------|-------------|
|
|
152
|
-
| `scroll` | Scroll up/down by amount |
|
|
153
|
-
| `wait` | Wait for text or element to appear |
|
|
154
|
-
| `evaluate` | Run JavaScript in page context |
|
|
155
|
-
|
|
156
|
-
---
|
|
157
|
-
|
|
158
|
-
## CLI Usage
|
|
113
|
+
## Security Model
|
|
159
114
|
|
|
160
|
-
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
|
|
115
|
+
| Feature | Tab Agent | Traditional Automation |
|
|
116
|
+
|---------|--------------|----------------------|
|
|
117
|
+
| **Access** | Only tabs you click to activate | Entire browser |
|
|
118
|
+
| **Sessions** | Uses your cookies | Requires credentials |
|
|
119
|
+
| **Visibility** | Green badge shows active tabs | Hidden/background |
|
|
120
|
+
| **Control** | You choose what AI can access | Full access by default |
|
|
165
121
|
|
|
166
|
-
|
|
167
|
-
npx tab-agent tabs # List active tabs
|
|
168
|
-
npx tab-agent snapshot # Get page content with refs
|
|
169
|
-
npx tab-agent screenshot # Capture viewport
|
|
170
|
-
npx tab-agent screenshot --full # Capture full page
|
|
171
|
-
npx tab-agent click e5 # Click element
|
|
172
|
-
npx tab-agent type e3 "hello" # Type text
|
|
173
|
-
npx tab-agent navigate "https://..." # Go to URL
|
|
174
|
-
```
|
|
175
|
-
|
|
176
|
-
---
|
|
122
|
+
Your banking, email, and sensitive tabs stay completely isolated unless you explicitly activate them.
|
|
177
123
|
|
|
178
124
|
## Supported Browsers
|
|
179
125
|
|
|
@@ -182,50 +128,39 @@ npx tab-agent navigate "https://..." # Go to URL
|
|
|
182
128
|
- Microsoft Edge
|
|
183
129
|
- Chromium
|
|
184
130
|
|
|
185
|
-
Setup automatically detects your browser.
|
|
186
|
-
|
|
187
|
-
---
|
|
188
|
-
|
|
189
131
|
## Troubleshooting
|
|
190
132
|
|
|
191
133
|
**Extension not detected?**
|
|
192
|
-
-
|
|
193
|
-
-
|
|
194
|
-
- Try refreshing the extensions page
|
|
134
|
+
- Make sure Developer mode is enabled in chrome://extensions
|
|
135
|
+
- Reload the extension
|
|
195
136
|
|
|
196
|
-
**
|
|
197
|
-
- Click the
|
|
198
|
-
-
|
|
137
|
+
**Commands not working?**
|
|
138
|
+
- Click the extension icon — must show green "ON"
|
|
139
|
+
- Run `npx tab-agent status` to check configuration
|
|
199
140
|
|
|
200
|
-
**
|
|
201
|
-
-
|
|
202
|
-
- Run `npx tab-agent start` to see error details
|
|
203
|
-
|
|
204
|
-
---
|
|
141
|
+
**No active tabs?**
|
|
142
|
+
- Activate at least one tab by clicking the extension icon
|
|
205
143
|
|
|
206
144
|
## How It Works
|
|
207
145
|
|
|
208
|
-
1. **Chrome Extension** —
|
|
146
|
+
1. **Chrome Extension** — Injects into activated tabs, captures DOM snapshots
|
|
147
|
+
2. **Relay Server** — Bridges AI ↔ Extension via Chrome Native Messaging (runs in background)
|
|
148
|
+
3. **CLI** — Simple commands that any LLM can execute
|
|
209
149
|
|
|
210
|
-
2. **Relay Server** — Local WebSocket server that bridges LLM ↔ Extension via Chrome's Native Messaging API (runs in background)
|
|
211
|
-
|
|
212
|
-
3. **Skill File** — Tells Claude/Codex how to send commands
|
|
213
|
-
|
|
214
|
-
**Data flow:**
|
|
215
150
|
```
|
|
216
|
-
You: "
|
|
151
|
+
You: "Find cheap flights to Tokyo"
|
|
217
152
|
↓
|
|
218
|
-
LLM →
|
|
219
|
-
|
|
220
|
-
|
|
153
|
+
LLM → npx tab-agent navigate "google.com/flights"
|
|
154
|
+
→ npx tab-agent snapshot
|
|
155
|
+
→ npx tab-agent type e5 "Tokyo"
|
|
156
|
+
→ npx tab-agent click e12
|
|
157
|
+
→ ...
|
|
221
158
|
```
|
|
222
159
|
|
|
223
|
-
---
|
|
224
|
-
|
|
225
160
|
## License
|
|
226
161
|
|
|
227
162
|
MIT
|
|
228
163
|
|
|
229
164
|
---
|
|
230
165
|
|
|
231
|
-
**
|
|
166
|
+
**Keywords:** browser agent, browser automation, AI browser control, Claude browser, ChatGPT browser, LLM web automation, Codex browser, puppeteer alternative, playwright alternative
|
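The README's workflow (`snapshot` → use refs → `click`/`type` → `snapshot` again) could be driven from Node roughly as follows. This is a minimal sketch, not code from the package: only the `[e1]`, `[e2]` ref markers come from the README, the rest of the snapshot text layout is assumed, and `extractRefs` and `buildCommand` are hypothetical helper names.

```javascript
// Hypothetical helpers illustrating the snapshot -> act -> snapshot loop.
// Assumption: snapshot output marks elements as [e1], [e2], ... somewhere in its text.

// Collect the unique element refs in a snapshot, in order of first appearance.
function extractRefs(snapshotText) {
  const refs = [];
  for (const match of snapshotText.matchAll(/\[(e\d+)\]/g)) {
    if (!refs.includes(match[1])) refs.push(match[1]);
  }
  return refs;
}

// Build the argument list for one CLI call,
// e.g. for child_process.execFileSync('npx', args).
function buildCommand(command, ...params) {
  return ['tab-agent', command, ...params.map(String)];
}

// Example with made-up snapshot text (the real format may differ).
const sample = '[e1] textbox "Search"\n[e2] button "Go"\n[e1] textbox "Search"';
console.log(extractRefs(sample));
console.log(buildCommand('type', 'e1', 'Tokyo'));
```

Because the skill notes say refs reset on each snapshot, a driver built this way would re-run `snapshot` and `extractRefs` after every action instead of caching refs.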
package/bin/tab-agent.js
CHANGED
|
@@ -30,31 +30,33 @@ if (BROWSER_COMMANDS.includes(command)) {
|
|
|
30
30
|
|
|
31
31
|
function showHelp() {
|
|
32
32
|
console.log(`
|
|
33
|
-
|
|
33
|
+
tabpilot - Give LLMs full control of your browser
|
|
34
34
|
|
|
35
|
-
Setup
|
|
36
|
-
setup Auto-detect extension,
|
|
35
|
+
Setup:
|
|
36
|
+
setup Auto-detect extension, configure native messaging
|
|
37
37
|
start Start the relay server
|
|
38
|
-
status Check configuration
|
|
38
|
+
status Check configuration
|
|
39
39
|
|
|
40
|
-
Browser
|
|
41
|
-
|
|
42
|
-
snapshot Get AI-readable page content
|
|
43
|
-
screenshot [--full] Capture screenshot
|
|
40
|
+
Browser Control:
|
|
41
|
+
snapshot Get page content with refs [e1], [e2]...
|
|
44
42
|
click <ref> Click element (e.g., click e5)
|
|
45
|
-
type <ref> <text> Type
|
|
43
|
+
type <ref> <text> Type into element
|
|
46
44
|
fill <ref> <value> Fill form field
|
|
47
|
-
press <key> Press key (Enter, Escape,
|
|
45
|
+
press <key> Press key (Enter, Escape, Tab)
|
|
48
46
|
scroll <dir> [amount] Scroll up/down
|
|
49
47
|
navigate <url> Go to URL
|
|
48
|
+
tabs List active tabs
|
|
50
49
|
wait <text|selector> Wait for text or element
|
|
51
|
-
|
|
50
|
+
screenshot [--full] Capture page (fallback)
|
|
51
|
+
|
|
52
|
+
Workflow: snapshot → click/type → snapshot → repeat
|
|
52
53
|
|
|
53
54
|
Examples:
|
|
54
|
-
npx
|
|
55
|
-
npx
|
|
56
|
-
npx
|
|
57
|
-
npx
|
|
55
|
+
npx tabpilot setup
|
|
56
|
+
npx tabpilot snapshot
|
|
57
|
+
npx tabpilot click e5
|
|
58
|
+
npx tabpilot type e3 "hello world"
|
|
59
|
+
npx tabpilot navigate "https://google.com"
|
|
58
60
|
|
|
59
61
|
Version: ${require('../package.json').version}
|
|
60
62
|
`);
|
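The hunk header shows the entry point branching on `BROWSER_COMMANDS.includes(command)`. A minimal sketch of that kind of routing is below; the two arrays are assumptions copied from the help text, `routeCommand` is a hypothetical name, and the real lists in `bin/tab-agent.js` may differ.

```javascript
// Assumed command lists, taken from the help text above.
const SETUP_COMMANDS = ['setup', 'start', 'status'];
const BROWSER_COMMANDS = [
  'snapshot', 'click', 'type', 'fill', 'press',
  'scroll', 'navigate', 'tabs', 'wait', 'screenshot',
];

// Classify the first CLI argument; anything unknown falls back to help.
function routeCommand(argv) {
  const command = argv[0];
  if (BROWSER_COMMANDS.includes(command)) return 'browser';
  if (SETUP_COMMANDS.includes(command)) return 'setup';
  return 'help';
}

console.log(routeCommand(['click', 'e5']));
console.log(routeCommand([]));
```

Keeping the lists as plain arrays means the help text and the dispatcher can be checked against each other in one place.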
package/cli/command.js
CHANGED
|
@@ -130,30 +130,30 @@ function buildPayload(command, params, tabId) {
|
|
|
130
130
|
|
|
131
131
|
function printHelp() {
|
|
132
132
|
console.log(`
|
|
133
|
-
tab-agent -
|
|
133
|
+
tab-agent - Give LLMs full control of your browser
|
|
134
134
|
|
|
135
135
|
Usage: npx tab-agent <command> [options]
|
|
136
136
|
|
|
137
137
|
Commands:
|
|
138
|
-
|
|
139
|
-
snapshot Get AI-readable page content
|
|
140
|
-
screenshot [--full] Capture screenshot (--full for full page)
|
|
138
|
+
snapshot Get page content with refs [e1], [e2]...
|
|
141
139
|
click <ref> Click element (e.g., click e5)
|
|
142
|
-
type <ref> <text> Type
|
|
140
|
+
type <ref> <text> Type into element
|
|
143
141
|
fill <ref> <value> Fill form field
|
|
144
|
-
press <key> Press key (Enter, Escape, Tab
|
|
145
|
-
scroll <dir> [amount] Scroll up/down
|
|
142
|
+
press <key> Press key (Enter, Escape, Tab)
|
|
143
|
+
scroll <dir> [amount] Scroll up/down
|
|
146
144
|
navigate <url> Go to URL
|
|
145
|
+
tabs List active tabs
|
|
147
146
|
wait <text|selector> Wait for text or element
|
|
147
|
+
screenshot [--full] Capture page (fallback)
|
|
148
148
|
evaluate <script> Run JavaScript
|
|
149
149
|
|
|
150
|
+
Workflow: snapshot → click/type → snapshot → repeat
|
|
151
|
+
|
|
150
152
|
Examples:
|
|
151
|
-
npx tab-agent tabs
|
|
152
153
|
npx tab-agent snapshot
|
|
153
154
|
npx tab-agent click e5
|
|
154
|
-
npx tab-agent type e3 hello world
|
|
155
|
-
npx tab-agent navigate https://google.com
|
|
156
|
-
npx tab-agent screenshot --full
|
|
155
|
+
npx tab-agent type e3 "hello world"
|
|
156
|
+
npx tab-agent navigate "https://google.com"
|
|
157
157
|
`);
|
|
158
158
|
}
|
|
159
159
|
|
package/cli/status.js
CHANGED
|
@@ -54,8 +54,8 @@ async function status() {
|
|
|
54
54
|
const claudeSkill = path.join(home, '.claude', 'skills', 'tab-agent.md');
|
|
55
55
|
const codexSkill = path.join(home, '.codex', 'skills', 'tab-agent.md');
|
|
56
56
|
|
|
57
|
-
console.log(`\nClaude Skill: ${fs.existsSync(claudeSkill) ? 'Installed' : 'Not installed'}
|
|
58
|
-
console.log(`Codex Skill: ${fs.existsSync(codexSkill) ? 'Installed' : 'Not installed (optional)'}
|
|
57
|
+
console.log(`\nClaude Skill: ${fs.existsSync(claudeSkill) ? 'Installed' : 'Not installed'}`);
|
|
58
|
+
console.log(`Codex Skill: ${fs.existsSync(codexSkill) ? 'Installed' : 'Not installed (optional)'}`);
|
|
59
59
|
|
|
60
60
|
// Check relay server
|
|
61
61
|
console.log('\nRelay Server:');
|
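The two status lines above repeat the same existence check. One way to factor it out is sketched here; `skillStatusLine` is a hypothetical helper, and resolving `home` with `os.homedir()` is an assumption, since the diff does not show how `home` is set.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Format one status line; `optional` only changes the wording when the file is missing.
function skillStatusLine(label, skillPath, optional = false) {
  const missing = optional ? 'Not installed (optional)' : 'Not installed';
  return `${label}: ${fs.existsSync(skillPath) ? 'Installed' : missing}`;
}

const home = os.homedir(); // assumption: the original may compute this differently
console.log(skillStatusLine('Claude Skill', path.join(home, '.claude', 'skills', 'tab-agent.md')));
console.log(skillStatusLine('Codex Skill', path.join(home, '.codex', 'skills', 'tab-agent.md'), true));
```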
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "tab-agent",
|
|
3
|
-
"version": "0.3.
|
|
4
|
-
"description": "
|
|
3
|
+
"version": "0.3.3",
|
|
4
|
+
"description": "Give LLMs full control of your browser - secure, click-to-activate automation for Claude, ChatGPT, Codex, and any AI",
|
|
5
5
|
"bin": {
|
|
6
6
|
"tab-agent": "./bin/tab-agent.js"
|
|
7
7
|
},
|
|
@@ -20,13 +20,28 @@
|
|
|
20
20
|
"ws": "^8.16.0"
|
|
21
21
|
},
|
|
22
22
|
"keywords": [
|
|
23
|
-
"
|
|
24
|
-
"
|
|
25
|
-
"browser",
|
|
26
|
-
"
|
|
23
|
+
"tab-agent",
|
|
24
|
+
"browser-agent",
|
|
25
|
+
"browser-automation",
|
|
26
|
+
"browser-control",
|
|
27
|
+
"ai-browser",
|
|
28
|
+
"llm-browser",
|
|
27
29
|
"claude",
|
|
28
|
-
"
|
|
30
|
+
"chatgpt",
|
|
31
|
+
"codex",
|
|
32
|
+
"openai",
|
|
33
|
+
"anthropic",
|
|
34
|
+
"chrome-extension",
|
|
35
|
+
"web-automation",
|
|
36
|
+
"ai-agent",
|
|
37
|
+
"puppeteer-alternative",
|
|
38
|
+
"playwright-alternative",
|
|
39
|
+
"web-agent"
|
|
29
40
|
],
|
|
30
|
-
"repository":
|
|
41
|
+
"repository": {
|
|
42
|
+
"type": "git",
|
|
43
|
+
"url": "https://github.com/DrHB/tab-agent"
|
|
44
|
+
},
|
|
45
|
+
"homepage": "https://github.com/DrHB/tab-agent#readme",
|
|
31
46
|
"license": "MIT"
|
|
32
47
|
}
|
package/relay/install-native-host.sh
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
set -e
|
|
3
3
|
|
|
4
4
|
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
|
5
|
-
HOST_NAME="com.
|
|
5
|
+
HOST_NAME="com.tabpilot.relay"
|
|
6
6
|
HOST_DIR="$HOME/Library/Application Support/TabAgent"
|
|
7
7
|
WRAPPER_PATH="$HOST_DIR/native-host-wrapper.sh"
|
|
8
8
|
|
|
@@ -40,7 +40,7 @@ cp -R "$SCRIPT_DIR/node_modules" "$HOST_DIR/node_modules"
|
|
|
40
40
|
cat > "$MANIFEST_DIR/$HOST_NAME.json" << EOF
|
|
41
41
|
{
|
|
42
42
|
"name": "$HOST_NAME",
|
|
43
|
-
"description": "
|
|
43
|
+
"description": "TabPilot Native Messaging Host",
|
|
44
44
|
"path": "$WRAPPER_PATH",
|
|
45
45
|
"type": "stdio",
|
|
46
46
|
"allowed_origins": [
|
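The heredoc above writes a Chrome native messaging host manifest. As a sketch of the same structure in Node, the function below builds the object with the fields visible in the diff; `buildManifest` is a hypothetical name, and the wrapper path and extension ID are placeholders, not values from the package.

```javascript
// Build a native messaging host manifest object.
// Assumption: extensionId is a placeholder; setup is described as auto-detecting the real one.
function buildManifest(hostName, wrapperPath, extensionId) {
  return {
    name: hostName,
    description: 'TabPilot Native Messaging Host',
    path: wrapperPath,
    type: 'stdio',
    // Chrome expects origins of the form chrome-extension://<id>/ with a trailing slash.
    allowed_origins: [`chrome-extension://${extensionId}/`],
  };
}

const manifest = buildManifest(
  'com.tabpilot.relay',
  '/path/to/native-host-wrapper.sh', // placeholder path
  'EXTENSION_ID_PLACEHOLDER'         // placeholder ID
);
console.log(JSON.stringify(manifest, null, 2));
```

The manifest file name has to match the host name (`$HOST_NAME.json` in the script), so a rename of `HOST_NAME` also changes which file the browser looks up.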
|
package/relay/native-host-wrapper.sh
CHANGED
@@ -6,7 +6,7 @@ cd "$SCRIPT_DIR"
|
|
|
6
6
|
|
|
7
7
|
LOG_FILE="$SCRIPT_DIR/wrapper.log"
|
|
8
8
|
echo "$(date): Starting native host from $SCRIPT_DIR" >> "$LOG_FILE"
|
|
9
|
-
export TAB_AGENT_LOG="/tmp/
|
|
9
|
+
export TAB_AGENT_LOG="/tmp/tabpilot-native-host.log"
|
|
10
10
|
|
|
11
11
|
NODE_BIN="/opt/homebrew/bin/node"
|
|
12
12
|
if [ ! -x "$NODE_BIN" ]; then
|
package/relay/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
|
-
"name": "
|
|
2
|
+
"name": "tabpilot-relay",
|
|
3
3
|
"version": "0.1.0",
|
|
4
|
-
"description": "WebSocket relay for
|
|
4
|
+
"description": "WebSocket relay for TabPilot Chrome extension",
|
|
5
5
|
"main": "server.js",
|
|
6
6
|
"scripts": {
|
|
7
7
|
"start": "node server.js"
|
package/relay/server.js
CHANGED
|
@@ -102,7 +102,7 @@ wss.on('connection', (ws, req) => {
|
|
|
102
102
|
});
|
|
103
103
|
|
|
104
104
|
httpServer.listen(PORT, () => {
|
|
105
|
-
console.log(`
|
|
105
|
+
console.log(`TabPilot Relay running on ws://localhost:${PORT}`);
|
|
106
106
|
console.log(`Health check: http://localhost:${PORT}/health`);
|
|
107
107
|
});
|
|
108
108
|
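The relay logs a health check URL next to its WebSocket address. A minimal sketch of such an endpoint follows; it omits the WebSocket side entirely, the port `9876` is taken from the `curl` line in the Codex skill file, and the JSON response body is an assumption because the diff does not show the handler.

```javascript
const http = require('http');

// Assumed port: the Codex skill file curls http://localhost:9876/health.
const PORT = 9876;

// Answer /health with a small JSON body and everything else with 404.
// The body shape { status: 'ok' } is an assumption.
function handleRequest(req, res) {
  if (req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok' }));
  } else {
    res.writeHead(404);
    res.end();
  }
}

// Create the server without listening, so loading this file has no side effects.
function createHealthServer() {
  return http.createServer(handleRequest);
}

// To run it:
// createHealthServer().listen(PORT, () => {
//   console.log(`Health check: http://localhost:${PORT}/health`);
// });
```

Keeping the handler separate from `listen` makes it possible to exercise the route with stub request and response objects.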
|
package/skills/claude-code/tab-agent.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: tab-agent
|
|
3
|
-
description: Browser control via CLI - snapshot, click, type,
|
|
3
|
+
description: Browser control via CLI - snapshot, click, type, navigate
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Tab Agent
|
|
@@ -17,7 +17,6 @@ sleep 2
|
|
|
17
17
|
## Commands
|
|
18
18
|
|
|
19
19
|
```bash
|
|
20
|
-
npx tab-agent tabs # List active tabs
|
|
21
20
|
npx tab-agent snapshot # Get page with refs [e1], [e2]...
|
|
22
21
|
npx tab-agent click <ref> # Click element
|
|
23
22
|
npx tab-agent type <ref> <text> # Type text
|
|
@@ -25,9 +24,9 @@ npx tab-agent fill <ref> <value> # Fill form field
|
|
|
25
24
|
npx tab-agent press <key> # Press key (Enter, Escape, Tab)
|
|
26
25
|
npx tab-agent scroll <dir> [amount] # Scroll up/down
|
|
27
26
|
npx tab-agent navigate <url> # Go to URL
|
|
27
|
+
npx tab-agent tabs # List active tabs
|
|
28
28
|
npx tab-agent wait <text|selector> # Wait for condition
|
|
29
|
-
npx tab-agent screenshot # Capture
|
|
30
|
-
npx tab-agent screenshot --full # Capture full page (fallback only)
|
|
29
|
+
npx tab-agent screenshot # Capture page (fallback only)
|
|
31
30
|
```
|
|
32
31
|
|
|
33
32
|
## Workflow
|
|
@@ -49,14 +48,10 @@ npx tab-agent snapshot
|
|
|
49
48
|
npx tab-agent type e1 "hello world"
|
|
50
49
|
npx tab-agent press Enter
|
|
51
50
|
npx tab-agent snapshot # See results
|
|
52
|
-
|
|
53
|
-
# Only screenshot if snapshot doesn't show what you need
|
|
54
|
-
npx tab-agent screenshot --full
|
|
55
51
|
```
|
|
56
52
|
|
|
57
53
|
## Notes
|
|
58
54
|
|
|
59
55
|
- Refs reset on each snapshot - always snapshot before interacting
|
|
60
56
|
- Keys: Enter, Escape, Tab, Backspace, ArrowUp/Down/Left/Right
|
|
61
|
-
-
|
|
62
|
-
- Prefer snapshot over screenshot - it's faster and text-based
|
|
57
|
+
- Prefer snapshot over screenshot - faster and text-based
|
package/skills/codex/tab-agent.md
CHANGED
|
@@ -16,7 +16,6 @@ curl -s http://localhost:9876/health || (npx tab-agent start &)
|
|
|
16
16
|
## Commands
|
|
17
17
|
|
|
18
18
|
```bash
|
|
19
|
-
npx tab-agent tabs # List active tabs
|
|
20
19
|
npx tab-agent snapshot # Page with refs [e1], [e2]...
|
|
21
20
|
npx tab-agent click <ref> # Click element
|
|
22
21
|
npx tab-agent type <ref> <text> # Type text
|
|
@@ -24,6 +23,7 @@ npx tab-agent fill <ref> <val> # Fill form field
|
|
|
24
23
|
npx tab-agent press <key> # Enter/Escape/Tab/Arrow*
|
|
25
24
|
npx tab-agent scroll <dir> [n] # Scroll up/down
|
|
26
25
|
npx tab-agent navigate <url> # Go to URL
|
|
26
|
+
npx tab-agent tabs # List active tabs
|
|
27
27
|
npx tab-agent wait <text|sel> # Wait for condition
|
|
28
28
|
npx tab-agent screenshot # Fallback only - if snapshot incomplete
|
|
29
29
|
```
|