headless_browser_tool 0.1.1
- checksums.yaml +7 -0
- data/.claude/settings.json +21 -0
- data/.rubocop.yml +56 -0
- data/.ruby-version +1 -0
- data/CHANGELOG.md +5 -0
- data/CLAUDE.md +298 -0
- data/LICENSE.md +7 -0
- data/README.md +522 -0
- data/Rakefile +12 -0
- data/config.ru +8 -0
- data/exe/hbt +7 -0
- data/lib/headless_browser_tool/browser.rb +374 -0
- data/lib/headless_browser_tool/browser_adapter.rb +320 -0
- data/lib/headless_browser_tool/cli.rb +34 -0
- data/lib/headless_browser_tool/directory_setup.rb +25 -0
- data/lib/headless_browser_tool/logger.rb +31 -0
- data/lib/headless_browser_tool/server.rb +150 -0
- data/lib/headless_browser_tool/session_manager.rb +199 -0
- data/lib/headless_browser_tool/session_middleware.rb +158 -0
- data/lib/headless_browser_tool/session_persistence.rb +146 -0
- data/lib/headless_browser_tool/stdio_server.rb +73 -0
- data/lib/headless_browser_tool/strict_session_middleware.rb +88 -0
- data/lib/headless_browser_tool/tools/attach_file_tool.rb +40 -0
- data/lib/headless_browser_tool/tools/auto_narrate_tool.rb +155 -0
- data/lib/headless_browser_tool/tools/base_tool.rb +39 -0
- data/lib/headless_browser_tool/tools/check_tool.rb +35 -0
- data/lib/headless_browser_tool/tools/choose_tool.rb +56 -0
- data/lib/headless_browser_tool/tools/click_button_tool.rb +49 -0
- data/lib/headless_browser_tool/tools/click_link_tool.rb +48 -0
- data/lib/headless_browser_tool/tools/click_tool.rb +45 -0
- data/lib/headless_browser_tool/tools/close_window_tool.rb +31 -0
- data/lib/headless_browser_tool/tools/double_click_tool.rb +37 -0
- data/lib/headless_browser_tool/tools/drag_tool.rb +46 -0
- data/lib/headless_browser_tool/tools/evaluate_script_tool.rb +20 -0
- data/lib/headless_browser_tool/tools/execute_script_tool.rb +29 -0
- data/lib/headless_browser_tool/tools/fill_in_tool.rb +66 -0
- data/lib/headless_browser_tool/tools/find_all_tool.rb +42 -0
- data/lib/headless_browser_tool/tools/find_element_tool.rb +21 -0
- data/lib/headless_browser_tool/tools/find_elements_containing_text_tool.rb +259 -0
- data/lib/headless_browser_tool/tools/get_attribute_tool.rb +21 -0
- data/lib/headless_browser_tool/tools/get_current_path_tool.rb +16 -0
- data/lib/headless_browser_tool/tools/get_current_url_tool.rb +16 -0
- data/lib/headless_browser_tool/tools/get_narration_history_tool.rb +35 -0
- data/lib/headless_browser_tool/tools/get_page_context_tool.rb +188 -0
- data/lib/headless_browser_tool/tools/get_page_source_tool.rb +16 -0
- data/lib/headless_browser_tool/tools/get_page_title_tool.rb +16 -0
- data/lib/headless_browser_tool/tools/get_session_info_tool.rb +37 -0
- data/lib/headless_browser_tool/tools/get_text_tool.rb +20 -0
- data/lib/headless_browser_tool/tools/get_value_tool.rb +20 -0
- data/lib/headless_browser_tool/tools/get_window_handles_tool.rb +29 -0
- data/lib/headless_browser_tool/tools/go_back_tool.rb +29 -0
- data/lib/headless_browser_tool/tools/go_forward_tool.rb +29 -0
- data/lib/headless_browser_tool/tools/has_element_tool.rb +21 -0
- data/lib/headless_browser_tool/tools/has_text_tool.rb +21 -0
- data/lib/headless_browser_tool/tools/hover_tool.rb +38 -0
- data/lib/headless_browser_tool/tools/is_visible_tool.rb +20 -0
- data/lib/headless_browser_tool/tools/maximize_window_tool.rb +34 -0
- data/lib/headless_browser_tool/tools/open_new_window_tool.rb +25 -0
- data/lib/headless_browser_tool/tools/refresh_tool.rb +32 -0
- data/lib/headless_browser_tool/tools/resize_window_tool.rb +43 -0
- data/lib/headless_browser_tool/tools/right_click_tool.rb +37 -0
- data/lib/headless_browser_tool/tools/save_page_tool.rb +32 -0
- data/lib/headless_browser_tool/tools/screenshot_tool.rb +199 -0
- data/lib/headless_browser_tool/tools/search_page_tool.rb +224 -0
- data/lib/headless_browser_tool/tools/search_source_tool.rb +148 -0
- data/lib/headless_browser_tool/tools/select_tool.rb +44 -0
- data/lib/headless_browser_tool/tools/switch_to_window_tool.rb +30 -0
- data/lib/headless_browser_tool/tools/uncheck_tool.rb +35 -0
- data/lib/headless_browser_tool/tools/visit_tool.rb +27 -0
- data/lib/headless_browser_tool/tools/visual_diff_tool.rb +177 -0
- data/lib/headless_browser_tool/tools.rb +104 -0
- data/lib/headless_browser_tool/version.rb +5 -0
- data/lib/headless_browser_tool.rb +8 -0
- metadata +256 -0
data/README.md
ADDED
@@ -0,0 +1,522 @@
# Headless Browser Tool

Headless Browser Tool provides an MCP (Model Context Protocol) server whose tools control a headless browser through Capybara and Selenium. It supports multiple sessions, session persistence, and both HTTP and stdio communication modes.

## Features

- **Headless Chrome browser automation** - Full browser control via Selenium WebDriver
- **MCP server with 40+ browser control tools** - Comprehensive API for browser interactions
- **Multi-session support** - Isolated browser sessions for each client
- **Session persistence** - Sessions survive server restarts with cookies and state preservation
- **Two server modes** - HTTP server mode and stdio mode for different integration patterns
- **Smart screenshot tools** - With annotations, highlighting, and visual diff capabilities
- **AI-assisted tools** - Auto-narration and intelligent page analysis
- **Comprehensive logging** - Separate log files for stdio mode to avoid protocol interference
- **Structured responses** - All tools return rich, structured data instead of simple strings
- **Smart element selectors** - Tools returning multiple elements include selectors for each

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'headless_browser_tool'
```

And then execute:

```bash
bundle install
```

Or install it yourself as:

```bash
gem install headless_browser_tool
```

## Prerequisites

You need Chrome or Chromium installed on your system. The gem runs Chrome in headless mode by default.

## Usage

### Command Line Interface

The `hbt` command provides three main commands:

#### `hbt start` - Start HTTP Server Mode

Starts the MCP server as an HTTP server with SSE (Server-Sent Events) support:

```bash
hbt start [OPTIONS]
```

**Options:**
- `--port PORT` - Port for the MCP server (default: 4567)
- `--headless` / `--no-headless` - Run browser in headless mode (default: true)
- `--single-session` - Use single shared browser session instead of multi-session mode
- `--session-id SESSION_ID` - Enable session persistence for single session mode (requires `--single-session`)
- `--show-headers` - Show HTTP request headers for debugging session issues

**Examples:**
```bash
# Start with default settings (multi-session, headless, port 4567)
hbt start

# Start in non-headless mode for debugging
hbt start --no-headless

# Start in single session mode (legacy compatibility)
hbt start --single-session

# Start in single session mode with persistence
hbt start --single-session --session-id my-app-session

# Start with request header logging
hbt start --show-headers
```

#### `hbt stdio` - Start Stdio Server Mode

Starts the MCP server in stdio mode for direct integration with tools that spawn subprocesses:

```bash
hbt stdio [OPTIONS]
```

**Options:**
- `--headless` / `--no-headless` - Run browser in headless mode (default: true)

**Notes:**
- Always runs in single-session mode
- Logs to `.hbt/logs/PID.log` instead of stdout to avoid interfering with MCP protocol
- Ideal for editor integrations and tools that communicate via stdin/stdout
- Supports optional session persistence via `HBT_SESSION_ID` environment variable

**Session Persistence in Stdio Mode:**

You can enable session persistence by setting the `HBT_SESSION_ID` environment variable:

```bash
# First run - creates and saves session
HBT_SESSION_ID=my-editor-session hbt stdio

# Later run - restores previous session state
HBT_SESSION_ID=my-editor-session hbt stdio
```

When `HBT_SESSION_ID` is set:
- Session state is saved to `.hbt/sessions/{session_id}.json` on exit
- On startup, if the session file exists, it restores:
  - Current URL
  - Cookies
  - localStorage
  - sessionStorage
  - Window size

This is useful for editor integrations that want to maintain browser state across multiple tool invocations.
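
The save-on-exit and restore-on-startup cycle described above can be sketched roughly as follows. This is a minimal illustration and not the gem's `SessionPersistence` module: the `save_session` and `load_session` helpers and the keys of the state hash are assumptions made for the example. Only the `.hbt/sessions/{session_id}.json` location comes from the README.

```ruby
require "json"
require "fileutils"

# Hypothetical helpers; the gem's real persistence code is not shown in this README.
SESSIONS_DIR = File.join(".hbt", "sessions")

# Writes the state hash to <dir>/<session_id>.json and returns the file path.
def save_session(session_id, state, dir: SESSIONS_DIR)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "#{session_id}.json")
  File.write(path, JSON.pretty_generate(state))
  path
end

# Returns the saved state hash, or nil when no session file exists yet.
def load_session(session_id, dir: SESSIONS_DIR)
  path = File.join(dir, "#{session_id}.json")
  return nil unless File.exist?(path)

  JSON.parse(File.read(path))
end
```

A first run would find no file and start fresh (`load_session` returns `nil`); a later run with the same ID would get back whatever hash was saved on exit.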

**Examples:**
```bash
# Start in stdio mode (headless by default, no persistence)
hbt stdio

# Start with session persistence
HBT_SESSION_ID=vscode-session hbt stdio

# Start in stdio mode with visible browser
hbt stdio --no-headless
```

#### `hbt version` - Display Version

Shows the current version of HeadlessBrowserTool:

```bash
hbt version
```

### Session Management

#### Multi-Session Mode (Default for HTTP Server)

In multi-session mode, each client connection gets its own isolated browser session with:
- **Separate cookies and localStorage** - Complete isolation between sessions
- **Independent navigation history** - Each session maintains its own browser state
- **Session persistence** - Sessions are saved to `.hbt/sessions/` and restored on restart
- **Automatic cleanup** - Idle sessions are closed after 30 minutes
- **LRU eviction** - When at capacity (10 sessions), least recently used sessions are closed
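
The idle cleanup and LRU eviction rules above could be modelled along these lines. This is a sketch under stated assumptions: the `SessionRegistry` class, its method names, and the injectable clock are hypothetical and are not taken from the gem's `SessionManager`. Only the two limits (10 sessions, 30 minutes) come from the list above.

```ruby
# Hypothetical registry illustrating idle cleanup and LRU eviction.
# The limits mirror the README; everything else is assumed for the example.
class SessionRegistry
  MAX_SESSIONS = 10
  IDLE_TIMEOUT = 30 * 60 # seconds

  def initialize(clock: -> { Time.now })
    @clock = clock
    @last_used = {} # session_id => Time of last activity
  end

  # Records activity for a session. When a new session would exceed capacity,
  # the least recently used session is evicted first.
  # Returns the evicted session id, or nil if nothing was evicted.
  def touch(session_id)
    evicted = nil
    if !@last_used.key?(session_id) && @last_used.size >= MAX_SESSIONS
      evicted = @last_used.min_by { |_id, time| time }.first
      @last_used.delete(evicted)
    end
    @last_used[session_id] = @clock.call
    evicted
  end

  # Removes and returns the ids of sessions idle for longer than IDLE_TIMEOUT.
  def sweep_idle
    now = @clock.call
    idle = @last_used.select { |_id, time| now - time > IDLE_TIMEOUT }.keys
    idle.each { |id| @last_used.delete(id) }
    idle
  end

  def session_ids
    @last_used.keys
  end
end
```

A real implementation would also close the evicted browser and persist its state; the sketch tracks only the bookkeeping.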

**Session Identification in Multi-Session Mode:**

For HTTP server mode, sessions require an `X-Session-ID` header:

```bash
# Connect with session ID "alice"
curl -H "X-Session-ID: alice" -H "Accept: text/event-stream" http://localhost:4567/

# Different session ID gets different browser
curl -H "X-Session-ID: bob" -H "Accept: text/event-stream" http://localhost:4567/

# Without X-Session-ID header, connection is rejected
curl -H "Accept: text/event-stream" http://localhost:4567/
# Returns: 400 Bad Request - X-Session-ID header is required
```

**Session ID Requirements:**
- Must be provided via `X-Session-ID` header
- Can only contain alphanumeric characters, underscores, and hyphens
- Maximum length: 64 characters
- Invalid formats are rejected with 400 error
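
A client can check an ID against these rules before sending it. The pattern below is derived from the requirements listed above and is an assumption, not the middleware's actual check; in particular, treating an empty string as invalid is a guess, since the README states only a maximum length.

```ruby
# Assumed pattern built from the documented rules; the gem's own check may differ.
SESSION_ID_PATTERN = /\A[A-Za-z0-9_-]{1,64}\z/

# Returns true when the value is a String of 1 to 64 allowed characters.
def valid_session_id?(value)
  value.is_a?(String) && value.match?(SESSION_ID_PATTERN)
end
```

Using `\A` and `\z` rather than `^` and `$` keeps a trailing newline or an extra line from slipping through.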

#### Single Session Mode

Use `--single-session` flag for legacy mode where all clients share one browser:
```bash
hbt start --single-session
```

**Session Persistence in Single Session Mode:**

You can enable session persistence with the `--session-id` flag:

```bash
# First run - creates and saves session
hbt start --single-session --session-id my-app

# Server restart - restores previous session
hbt start --single-session --session-id my-app
```

When `--session-id` is provided:
- Session state is saved to `.hbt/sessions/{session_id}.json` on shutdown
- On startup, if the session file exists, it restores browser state
- All clients share this single persistent session
- Compatible with stdio mode session files

This is useful for:
- Development servers that need to maintain login state
- Testing environments where you want consistent browser state
- Applications that don't need multi-user isolation

**Note:** The `--session-id` flag can only be used with `--single-session`. In multi-session mode, session IDs are provided by clients via headers.

#### Session Management Endpoints

**View active sessions:**
```bash
curl http://localhost:4567/sessions | jq
```

Response:
```json
{
  "active_sessions": ["alice", "bob"],
  "session_count": 2,
  "session_data": {
    "alice": {
      "created_at": "2024-01-20T10:00:00Z",
      "last_activity": "2024-01-20T10:05:00Z",
      "idle_time": 300.5
    }
  }
}
```

**Close a specific session:**
```bash
curl -X DELETE http://localhost:4567/sessions/alice
```
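
As an illustration of working with the `/sessions` response shown above, the sketch below picks out sessions that have been idle longer than a threshold. The `idle_sessions` helper is hypothetical, and `idle_time` is assumed to be in seconds, which the sample timestamps suggest but the README does not state.

```ruby
require "json"

# Returns the ids of sessions whose idle_time exceeds the threshold.
# Field names follow the sample response; the helper itself is hypothetical.
def idle_sessions(response_body, threshold_seconds)
  data = JSON.parse(response_body)
  data.fetch("session_data", {})
      .select { |_id, info| info.fetch("idle_time", 0) > threshold_seconds }
      .keys
end

# The sample response from the README, used here instead of a live server.
sample = <<~JSON
  {
    "active_sessions": ["alice", "bob"],
    "session_count": 2,
    "session_data": {
      "alice": {
        "created_at": "2024-01-20T10:00:00Z",
        "last_activity": "2024-01-20T10:05:00Z",
        "idle_time": 300.5
      }
    }
  }
JSON

puts idle_sessions(sample, 120).inspect
```

Each returned ID could then be passed to the `DELETE /sessions/{id}` endpoint to close that session.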

### Directory Structure

HeadlessBrowserTool creates a `.hbt/` directory with:
```
.hbt/
├── .gitignore           # Contains "*" to ignore all contents
├── screenshots/         # Screenshot storage
├── sessions/            # Session persistence files
└── logs/                # Log files (stdio mode only)
    └── PID.log          # Process-specific log file
```
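
For readers who want to reproduce this layout in a test fixture, a minimal sketch follows. The `setup_hbt_directory` helper is hypothetical and is not the gem's `DirectorySetup` module; it only creates the directories and the `.gitignore` shown in the tree above.

```ruby
require "fileutils"

# Hypothetical helper mirroring the documented layout.
# Creates .hbt/ with its subdirectories and returns the .hbt path.
def setup_hbt_directory(root = Dir.pwd)
  base = File.join(root, ".hbt")
  %w[screenshots sessions logs].each do |name|
    FileUtils.mkdir_p(File.join(base, name))
  end

  # A single "*" makes git ignore everything inside .hbt/.
  gitignore = File.join(base, ".gitignore")
  File.write(gitignore, "*\n") unless File.exist?(gitignore)
  base
end
```

Calling the helper twice is harmless: `mkdir_p` tolerates existing directories and an existing `.gitignore` is left untouched.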

### MCP API

The server implements the Model Context Protocol (MCP) and responds to JSON-RPC requests.

#### Using with MCP Clients

For HTTP mode with proper MCP clients:
```bash
# Start server
hbt start

# MCP client should:
# 1. Connect with X-Session-ID header
# 2. Use SSE endpoint for streaming: http://localhost:4567/mcp/sse
# 3. Send commands via JSON-RPC
```

For stdio mode:
```bash
# MCP client spawns the process directly
hbt stdio
# Communication happens via stdin/stdout
```

### Available Browser Tools

All tools are available through the MCP protocol. Here's a complete reference:

#### Navigation Tools

| Tool | Description | Parameters | Returns |
|------|-------------|------------|----------|
| `visit` | Navigate to a URL | `url` (required) | `{url, current_url, title, status}` |
| `refresh` | Reload the current page | None | `{url, title, changed, status}` |
| `go_back` | Navigate back in browser history | None | `{navigation: {from, to, title, navigated}, status}` |
| `go_forward` | Navigate forward in browser history | None | `{navigation: {from, to, title, navigated}, status}` |

#### Element Interaction Tools

| Tool | Description | Parameters | Returns |
|------|-------------|------------|----------|
| `click` | Click an element | `selector` (required) | `{selector, element, navigation, status}` |
| `right_click` | Right-click an element | `selector` (required) | `{selector, element, status}` |
| `double_click` | Double-click an element | `selector` (required) | `{selector, element, status}` |
| `hover` | Hover mouse over element | `selector` (required) | `{selector, element, status}` |
| `drag` | Drag element to target | `source_selector`, `target_selector` (required) | `{source_selector, target_selector, source, target, status}` |

#### Element Finding Tools

| Tool | Description | Parameters | Key Returns |
|------|-------------|------------|-------------|
| `find_element` | Find single element | `selector` (required) | Element details with attributes |
| `find_all` | Find all matching elements | `selector` (required) | `{elements: [{selector, tag_name, text, visible, attributes}]}` |
| `find_elements_containing_text` | Find elements with text | `text` (required), `exact_match`, `case_sensitive`, `visible_only` | `{elements: [{selector, xpath, tag, text, clickable}]}` |
| `get_text` | Get element text | `selector` (required) | Text content string |
| `get_attribute` | Get element attribute | `selector`, `attribute` (required) | Attribute value |
| `get_value` | Get input value | `selector` (required) | Input value |
| `is_visible` | Check element visibility | `selector` (required) | Boolean |
| `has_element` | Check element exists | `selector` (required), `wait` | Boolean |
| `has_text` | Check text exists | `text` (required), `wait` | Boolean |

#### Form Interaction Tools

| Tool | Description | Parameters | Key Returns |
|------|-------------|------------|-------------|
| `fill_in` | Fill input field | `field`, `value` (required) | `{field, value, field_info, status}` |
| `select` | Select dropdown option | `value`, `dropdown_selector` (required) | `{selected_value, selected_text, options: [{selector, value, text}]}` |
| `check` | Check checkbox | `checkbox_selector` (required) | `{selector, was_checked, is_checked, element, status}` |
| `uncheck` | Uncheck checkbox | `checkbox_selector` (required) | `{selector, was_checked, is_checked, element, status}` |
| `choose` | Select radio button | `radio_button_selector` (required) | `{selector, radio, group: [{selector, value, checked}], status}` |
| `attach_file` | Upload file | `file_field_selector`, `file_path` (required) | `{field_selector, file_path, file_name, file_size, field, status}` |
| `click_button` | Click button | `button_text_or_selector` (required) | `{button, element, navigation, status}` |
| `click_link` | Click link | `link_text_or_selector` (required) | `{link, element, navigation, status}` |

#### Page Information Tools

| Tool | Description | Returns |
|------|-------------|---------|
| `get_current_url` | Get current URL | Full URL string |
| `get_current_path` | Get current path | Path without domain |
| `get_page_title` | Get page title | Title string |
| `get_page_source` | Get HTML source | Full HTML |
| `get_page_context` | Get page analysis | Structured page data |

#### Search Tools

| Tool | Description | Parameters |
|------|-------------|------------|
| `search_page` | Search visible content | `query` (required), `case_sensitive`, `regex`, `context_lines`, `highlight` |
| `search_source` | Search HTML source | `query` (required), `case_sensitive`, `regex`, `context_lines`, `show_line_numbers` |

#### JavaScript Execution Tools

| Tool | Description | Parameters | Returns |
|------|-------------|------------|----------|
| `execute_script` | Run JavaScript | `javascript_code` (required) | `{javascript_code, execution_time, timestamp, status}` |
| `evaluate_script` | Run JS and return result | `javascript_code` (required) | Script return value |

#### Screenshot and Capture Tools

| Tool | Description | Parameters | Key Returns |
|------|-------------|------------|-------------|
| `screenshot` | Take screenshot | `filename`, `highlight_selectors`, `annotate`, `full_page` | `{file_path, filename, file_size, timestamp, url, title}` |
| `save_page` | Save HTML to file | `file_path` (required) | `{file_path, file_size, timestamp, url, title, status}` |

#### Window Management Tools

| Tool | Description | Parameters | Key Returns |
|------|-------------|------------|-------------|
| `switch_to_window` | Switch to window/tab | `window_handle` (required) | `{window_handle, previous_window, current_url, title, total_windows}` |
| `open_new_window` | Open new window/tab | None | `{window_handle, total_windows, previous_windows, current_window}` |
| `close_window` | Close window/tab | `window_handle` (required) | `{closed_window, was_current, remaining_windows, current_window}` |
| `get_window_handles` | Get all window handles | None | `{current_window, windows: [{handle, index, is_current}], total_windows}` |
| `maximize_window` | Maximize window | None | `{size_before: {width, height}, size_after: {width, height}, status}` |
| `resize_window` | Resize window | `width`, `height` (required) | `{requested_size, size_before, size_after, status}` |

#### AI-Assisted Tools

| Tool | Description | Parameters |
|------|-------------|------------|
| `auto_narrate` | Generate page description | `focus_on` |
| `get_narration_history` | Get narration history | None |
| `visual_diff` | Compare screenshots | `before_path`, `after_path` (required) |

#### Session Management Tools

| Tool | Description | Returns |
|------|-------------|---------|
| `get_session_info` | Get session information | Session details |

### Tool Response Structure

All tools now return structured data instead of simple strings. This makes it easier to:
- Extract specific information from responses
- Check operation success/failure
- Access element properties and metadata
- Navigate to specific elements using returned selectors

**Example responses:**

```json
// visit tool response
{
  "url": "https://example.com",
  "current_url": "https://example.com/",
  "title": "Example Domain",
  "status": "success"
}

// find_all tool response with selectors
{
  "selector": ".item",
  "count": 3,
  "elements": [
    {
      "index": 0,
      "selector": ".item:nth-of-type(1)",
      "tag_name": "div",
      "text": "Item 1",
      "visible": true,
      "attributes": {"class": "item active"}
    },
    // ... more elements
  ]
}

// select tool response with option selectors
{
  "dropdown_selector": "#country",
  "selected_value": "US",
  "selected_text": "United States",
  "options": [
    {
      "selector": "#country option:nth-of-type(1)",
      "value": "US",
      "text": "United States",
      "selected": true
    },
    // ... more options
  ],
  "status": "selected"
}
```
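
To show how a structured response like the `find_all` example can be consumed, the sketch below collects the selectors of visible elements so they can be passed to a follow-up tool such as `click`. The `visible_selectors` helper is hypothetical, and the second element in the sample data is made up for illustration; the field names follow the example response above.

```ruby
require "json"

# Picks the selectors of visible elements out of a find_all-style response hash.
def visible_selectors(response)
  response.fetch("elements", [])
          .select { |element| element["visible"] }
          .map { |element| element["selector"] }
end

# Illustrative data shaped like the find_all example (second element is invented).
response = JSON.parse(<<~JSON)
  {
    "selector": ".item",
    "count": 2,
    "elements": [
      {"index": 0, "selector": ".item:nth-of-type(1)", "tag_name": "div", "text": "Item 1", "visible": true},
      {"index": 1, "selector": ".item:nth-of-type(2)", "tag_name": "div", "text": "Item 2", "visible": false}
    ]
  }
JSON

puts visible_selectors(response).inspect
```

Note that the README's example block uses `//` comments, which strict JSON parsers such as Ruby's `JSON.parse` reject; the sample here is plain JSON.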

### Example Tool Calls

Here are examples using curl with the HTTP server:

```bash
# Navigate to a URL
curl -X POST http://localhost:4567/ \
  -H "Content-Type: application/json" \
  -H "X-Session-ID: alice" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/call",
       "params": {"name": "visit", "arguments": {"url": "https://example.com"}}}'

# Take an annotated screenshot
curl -X POST http://localhost:4567/ \
  -H "Content-Type: application/json" \
  -H "X-Session-ID: alice" \
  -d '{"jsonrpc": "2.0", "id": 2, "method": "tools/call",
       "params": {"name": "screenshot",
                  "arguments": {"filename": "example",
                                "highlight_selectors": [".error", ".warning"],
                                "annotate": true,
                                "full_page": true}}}'

# Search page content with highlighting
curl -X POST http://localhost:4567/ \
  -H "Content-Type: application/json" \
  -H "X-Session-ID: alice" \
  -d '{"jsonrpc": "2.0", "id": 3, "method": "tools/call",
       "params": {"name": "search_page",
                  "arguments": {"query": "error|warning",
                                "regex": true,
                                "highlight": true}}}'
```
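
The same requests can be issued from Ruby with the standard library. The sketch below mirrors the curl examples above: the URL, headers, and JSON-RPC payload shape are taken from them, while the `build_tool_call` and `send_tool_call` helper names are hypothetical. Building the request is separated from sending it so the payload can be inspected without a running server.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) a JSON-RPC tools/call request like the curl examples.
def build_tool_call(name, arguments, id:, session_id:, base_url: "http://localhost:4567/")
  request = Net::HTTP::Post.new(URI(base_url))
  request["Content-Type"] = "application/json"
  request["X-Session-ID"] = session_id
  payload = {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: arguments }
  }
  request.body = JSON.generate(payload)
  request
end

# Sends a built request; this needs a server started with `hbt start`.
def send_tool_call(request)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

For example, `send_tool_call(build_tool_call("visit", { url: "https://example.com" }, id: 1, session_id: "alice"))` corresponds to the first curl command.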

### Environment Variables

- `HBT_SINGLE_SESSION=true` - Force single session mode in HTTP server
- `HBT_SHOW_HEADERS=true` - Enable request header logging in HTTP server
- `HBT_SESSION_ID=<session_name>` - Enable session persistence in stdio mode
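
A wrapper script that launches `hbt` might read these variables as sketched below. The `hbt_config` helper and its return shape are hypothetical, and the comparison against the literal string `"true"` is an assumption based on the values shown above; the gem's own parsing may accept other values.

```ruby
# Hypothetical reader for the documented variables; accepts any Hash-like env.
def hbt_config(env = ENV)
  session_id = env["HBT_SESSION_ID"].to_s
  {
    single_session: env["HBT_SINGLE_SESSION"] == "true",
    show_headers: env["HBT_SHOW_HEADERS"] == "true",
    session_id: session_id.empty? ? nil : session_id
  }
end
```

Passing a plain hash instead of `ENV` makes the helper easy to exercise in tests, for example `hbt_config("HBT_SESSION_ID" => "vscode-session")`.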

### Logging

- **HTTP mode**: Logs to stdout
- **Stdio mode**: Logs to `.hbt/logs/PID.log` to avoid interfering with the MCP protocol

Tool calls are logged in this format:
```
INFO -- HBT: CALL: ToolName [] {args} -> result
ERROR -- HBT: ERROR: ToolName [] {args} -> error_message
```

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt.

To install this gem onto your local machine, run `bundle exec rake install`.

### Running Tests and Linting

```bash
# Run tests
rake test

# Run linter
rake rubocop

# Run linter with auto-fix (rake would consume -A itself, so call rubocop directly)
bundle exec rubocop -A

# Run both tests and linter (default task)
rake
```

## Recent Improvements

### Version 0.1.0

- **Structured tool responses** - All tools now return rich JSON objects instead of simple strings
- **Element selectors in arrays** - Tools returning multiple elements include unique selectors for each
- **Session persistence** - Both stdio and single-session HTTP modes support persistent sessions
- **Strict session management** - Multi-session mode requires X-Session-ID header (no auto-creation)
- **Improved logging** - Fixed stdio mode logging to properly write to `.hbt/logs/PID.log`
- **DRY refactoring** - Extracted common functionality into `SessionPersistence` and `DirectorySetup` modules
- **Better error handling** - Tools return structured error information
- **Enhanced tool responses**:
  - Navigation tools return before/after URLs and navigation status
  - Form tools return element state before/after interaction
  - Window tools return comprehensive window state information
  - Screenshot tool returns file metadata
  - All element-finding tools return complete element information

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/parruda/headless_browser_tool.
data/Rakefile
ADDED
data/config.ru
ADDED