browser-ctl 0.1.0__tar.gz

@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 geb
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,286 @@
+ Metadata-Version: 2.4
+ Name: browser-ctl
+ Version: 0.1.0
+ Summary: Control your browser from the command line via a Chrome extension + WebSocket bridge
+ Author-email: geb <853934146@qq.com>
+ License-Expression: MIT
+ Project-URL: Homepage, https://github.com/mikuh/browser-ctl
+ Project-URL: Repository, https://github.com/mikuh/browser-ctl
+ Project-URL: Issues, https://github.com/mikuh/browser-ctl/issues
+ Keywords: browser,automation,chrome,cli,websocket,devtools
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Environment :: Console
+ Classifier: Intended Audience :: Developers
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: Software Development :: Testing
+ Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
+ Requires-Python: >=3.11
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: aiohttp>=3.9
+ Dynamic: license-file
+
27
+ # browser-ctl
28
+
29
+ **Control Chrome from your terminal.** A lightweight CLI tool for browser automation — navigate, click, type, scroll, screenshot, and more, all through simple commands.
30
+
31
+ ```bash
32
+ pip install browser-ctl
33
+
34
+ bctl go https://github.com
35
+ bctl click "a.search-button"
36
+ bctl type "input[name=q]" "browser-ctl"
37
+ bctl press Enter
38
+ bctl screenshot results.png
39
+ ```
40
+
41
+ ## Why browser-ctl?
42
+
43
+ - **Zero-config CLI** — single `bctl` command, JSON output, works in any shell or script
44
+ - **No browser binary management** — uses your existing Chrome with a lightweight extension
45
+ - **Stdlib-only CLI** — the CLI itself has zero external Python dependencies
46
+ - **AI-agent friendly** — ships with an AI coding skill file (`SKILL.md`) for Cursor / OpenCode integration
47
+ - **Local & private** — all communication stays on `localhost`, no data leaves your machine
48
+
49
+ ## How It Works
50
+
51
+ ```
52
+ Terminal (bctl) ──HTTP──▶ Bridge Server ◀──WebSocket── Chrome Extension
53
+ ```
54
+
55
+ 1. The **CLI** (`bctl`) sends commands via HTTP to a local bridge server
56
+ 2. The **bridge server** relays them over WebSocket to the Chrome extension
57
+ 3. The **extension** executes commands using Chrome APIs and content scripts
58
+ 4. Results flow back the same path as JSON
59
+
60
+ The bridge server auto-starts on first command — no manual setup needed.
61
+
62
+ ## Installation
63
+
64
+ ### 1. Install the Python package
65
+
66
+ ```bash
67
+ pip install browser-ctl
68
+ ```
69
+
70
+ ### 2. Load the Chrome extension
71
+
72
+ ```bash
73
+ bctl setup
74
+ ```
75
+
76
+ This copies the extension to `~/.browser-ctl/extension/` and opens Chrome's extension page. Then:
77
+
78
+ 1. Open `chrome://extensions`
79
+ 2. Enable **Developer mode** (top right)
80
+ 3. Click **Load unpacked**
81
+ 4. Select the `~/.browser-ctl/extension/` directory
82
+
83
+ ### 3. Verify
84
+
85
+ ```bash
86
+ bctl ping
87
+ ```
88
+
89
+ You should see `{"success": true, "data": {"server": true, "extension": true}}`.
90
+
91
+ ## Commands
92
+
93
+ ### Navigation
94
+
95
+ ```bash
96
+ bctl navigate <url> # Navigate to URL (aliases: nav, go)
97
+ bctl back # Go back in history
98
+ bctl forward # Go forward (alias: fwd)
99
+ bctl reload # Reload current page
100
+ ```
101
+
102
+ ### Interaction
103
+
104
+ ```bash
105
+ bctl click <sel> [-i N] # Click element (CSS selector, optional Nth match)
106
+ bctl hover <sel> [-i N] # Hover over element
107
+ bctl type <sel> <text> # Type text into input/textarea
108
+ bctl press <key> # Press key (Enter, Escape, Tab, etc.)
109
+ bctl scroll <dir|sel> [pixels] # Scroll: up/down/top/bottom or element into view
110
+ bctl select-option <sel> <val> # Select dropdown option (alias: sopt) [--text]
111
+ bctl drag <src> [target] # Drag to element or offset [--dx N --dy N]
112
+ ```
113
+
114
+ ### DOM Query
115
+
116
+ ```bash
117
+ bctl text [sel] # Get text content (default: body)
118
+ bctl html [sel] # Get innerHTML
119
+ bctl attr <sel> [name] # Get attribute(s) [-i N for Nth element]
120
+ bctl select <sel> [-l N] # List matching elements (alias: sel, limit default: 20)
121
+ bctl count <sel> # Count matching elements
122
+ bctl status # Current page URL and title
123
+ ```
124
+
125
+ ### JavaScript
126
+
127
+ ```bash
128
+ bctl eval <code> # Execute JS in page context (auto-bypasses CSP)
129
+ ```
130
+
131
+ ### Tabs
132
+
133
+ ```bash
134
+ bctl tabs # List all tabs
135
+ bctl tab <id> # Switch to tab by ID
136
+ bctl new-tab [url] # Open new tab
137
+ bctl close-tab [id] # Close tab (default: active)
138
+ ```
139
+
140
+ ### Screenshot & Files
141
+
142
+ ```bash
143
+ bctl screenshot [path] # Capture screenshot (alias: ss)
144
+ bctl download <target> # Download file/image (alias: dl) [-o file] [-i N]
145
+ bctl upload <sel> <files> # Upload file(s) to <input type="file">
146
+ ```
147
+
148
+ ### Wait & Dialog
149
+
150
+ ```bash
151
+ bctl wait <sel|seconds> # Wait for element or sleep [timeout]
152
+ bctl dialog [accept|dismiss] [--text <val>] # Handle next alert/confirm/prompt
153
+ ```
154
+
155
+ ### Server
156
+
157
+ ```bash
158
+ bctl ping # Check server & extension status
159
+ bctl serve # Start server in foreground
160
+ bctl stop # Stop server
161
+ ```
162
+
163
+ ## Examples
164
+
165
+ ### Search and extract
166
+
167
+ ```bash
168
+ bctl go "https://news.ycombinator.com"
169
+ bctl select "a.titlelink" -l 5 # Top 5 links with text, href, etc.
170
+ ```
171
+
172
+ ### Fill a form
173
+
174
+ ```bash
175
+ bctl type "input[name=email]" "user@example.com"
176
+ bctl type "input[name=password]" "hunter2"
177
+ bctl select-option "select#country" "US"
178
+ bctl upload "input[type=file]" ./resume.pdf
179
+ bctl click "button[type=submit]"
180
+ ```
181
+
182
+ ### Scroll and screenshot
183
+
184
+ ```bash
185
+ bctl go "https://en.wikipedia.org/wiki/Web_browser"
186
+ bctl scroll down 1000
187
+ bctl ss page.png
188
+ ```
189
+
190
+ ### Handle dialogs
191
+
192
+ ```bash
193
+ bctl dialog accept # Set up handler BEFORE triggering
194
+ bctl click "#delete-button" # This triggers a confirm() dialog
195
+ ```
196
+
197
+ ### Drag and drop
198
+
199
+ ```bash
200
+ bctl drag ".task-card" ".done-column"
201
+ bctl drag ".range-slider" --dx 50 --dy 0
202
+ ```
203
+
204
+ ### Use in shell scripts
205
+
206
+ ```bash
207
+ # Extract all image URLs from a page
208
+ bctl go "https://example.com"
209
+ bctl eval "JSON.stringify(Array.from(document.images).map(i=>i.src))"
210
+
211
+ # Wait for SPA content to load
212
+ bctl go "https://app.example.com/dashboard"
213
+ bctl wait ".dashboard-loaded" 15
214
+ bctl text ".metric-value"
215
+ ```
216
+
217
+ ## AI Agent Integration
218
+
219
+ browser-ctl ships with a `SKILL.md` file designed for AI coding assistants. Install it for your tool:
220
+
221
+ ```bash
222
+ bctl setup cursor # Install skill for Cursor IDE
223
+ bctl setup opencode # Install skill for OpenCode
224
+ bctl setup /path/to/dir # Install to custom directory
225
+ ```
226
+
227
+ Once installed, AI agents can use `bctl` commands to automate browser tasks on your behalf.
228
+
229
+ ## Output Format
230
+
231
+ All commands return JSON to stdout:
232
+
233
+ ```json
234
+ // Success
235
+ {"success": true, "data": {"url": "https://example.com", "title": "Example"}}
236
+
237
+ // Error
238
+ {"success": false, "error": "Element not found: .missing"}
239
+ ```
240
+
241
+ Non-zero exit code on errors — works naturally with `set -e` and `&&` chains.
242
+
243
+ ## Architecture
244
+
245
+ ```
246
+ ┌─────────────────────────────────────────────────┐
247
+ │ Terminal │
248
+ │ $ bctl click "button.submit" │
249
+ │ │ │
250
+ │ ▼ HTTP POST localhost:19876/command │
251
+ │ ┌─────────────────────┐ │
252
+ │ │ Bridge Server │ (Python, aiohttp) │
253
+ │ │ :19876 │ │
254
+ │ └────────┬────────────┘ │
255
+ │ │ WebSocket │
256
+ │ ▼ │
257
+ │ ┌─────────────────────┐ │
258
+ │ │ Chrome Extension │ (Manifest V3) │
259
+ │ │ Service Worker │ │
260
+ │ └────────┬────────────┘ │
261
+ │ │ chrome.scripting / chrome.debugger │
262
+ │ ▼ │
263
+ │ ┌─────────────────────┐ │
264
+ │ │ Web Page │ │
265
+ │ └─────────────────────┘ │
266
+ └─────────────────────────────────────────────────┘
267
+ ```
268
+
269
+ - **CLI** → stdlib only, communicates via HTTP
270
+ - **Bridge Server** → async relay (aiohttp), auto-daemonizes
271
+ - **Extension** → MV3 service worker, auto-reconnects via `chrome.alarms`
272
+ - **Eval** → dual strategy: MAIN-world injection (fast) with CDP fallback (CSP-safe)
273
+
274
+ ## Requirements
275
+
276
+ - Python >= 3.11
277
+ - Chrome / Chromium with the extension loaded
278
+ - macOS, Linux, or Windows
279
+
280
+ ## Privacy
281
+
282
+ All communication is local (`127.0.0.1`). No analytics, no telemetry, no external servers. See [PRIVACY.md](PRIVACY.md) for the full privacy policy.
283
+
284
+ ## License
285
+
286
+ [MIT](LICENSE)