ferrum-mcp 1.0.0

Files changed (63)
  1. checksums.yaml +7 -0
  2. data/.env.example +90 -0
  3. data/CHANGELOG.md +229 -0
  4. data/CONTRIBUTING.md +469 -0
  5. data/LICENSE +21 -0
  6. data/README.md +334 -0
  7. data/SECURITY.md +286 -0
  8. data/bin/ferrum-mcp +66 -0
  9. data/bin/lint +10 -0
  10. data/bin/serve +3 -0
  11. data/bin/test +4 -0
  12. data/docs/API_REFERENCE.md +1410 -0
  13. data/docs/CONFIGURATION.md +254 -0
  14. data/docs/DEPLOYMENT.md +846 -0
  15. data/docs/DOCKER.md +836 -0
  16. data/docs/DOCKER_BOTBROWSER.md +455 -0
  17. data/docs/GETTING_STARTED.md +249 -0
  18. data/docs/TROUBLESHOOTING.md +677 -0
  19. data/lib/ferrum_mcp/browser_manager.rb +101 -0
  20. data/lib/ferrum_mcp/cli/command_handler.rb +99 -0
  21. data/lib/ferrum_mcp/cli/server_runner.rb +166 -0
  22. data/lib/ferrum_mcp/configuration.rb +229 -0
  23. data/lib/ferrum_mcp/resource_manager.rb +223 -0
  24. data/lib/ferrum_mcp/server.rb +254 -0
  25. data/lib/ferrum_mcp/session.rb +227 -0
  26. data/lib/ferrum_mcp/session_manager.rb +183 -0
  27. data/lib/ferrum_mcp/tools/accept_cookies_tool.rb +458 -0
  28. data/lib/ferrum_mcp/tools/base_tool.rb +114 -0
  29. data/lib/ferrum_mcp/tools/clear_cookies_tool.rb +66 -0
  30. data/lib/ferrum_mcp/tools/click_tool.rb +218 -0
  31. data/lib/ferrum_mcp/tools/close_session_tool.rb +49 -0
  32. data/lib/ferrum_mcp/tools/create_session_tool.rb +146 -0
  33. data/lib/ferrum_mcp/tools/drag_and_drop_tool.rb +171 -0
  34. data/lib/ferrum_mcp/tools/evaluate_js_tool.rb +46 -0
  35. data/lib/ferrum_mcp/tools/execute_script_tool.rb +48 -0
  36. data/lib/ferrum_mcp/tools/fill_form_tool.rb +78 -0
  37. data/lib/ferrum_mcp/tools/find_by_text_tool.rb +153 -0
  38. data/lib/ferrum_mcp/tools/get_attribute_tool.rb +56 -0
  39. data/lib/ferrum_mcp/tools/get_cookies_tool.rb +70 -0
  40. data/lib/ferrum_mcp/tools/get_html_tool.rb +52 -0
  41. data/lib/ferrum_mcp/tools/get_session_info_tool.rb +40 -0
  42. data/lib/ferrum_mcp/tools/get_text_tool.rb +67 -0
  43. data/lib/ferrum_mcp/tools/get_title_tool.rb +42 -0
  44. data/lib/ferrum_mcp/tools/get_url_tool.rb +39 -0
  45. data/lib/ferrum_mcp/tools/go_back_tool.rb +49 -0
  46. data/lib/ferrum_mcp/tools/go_forward_tool.rb +49 -0
  47. data/lib/ferrum_mcp/tools/hover_tool.rb +76 -0
  48. data/lib/ferrum_mcp/tools/list_sessions_tool.rb +33 -0
  49. data/lib/ferrum_mcp/tools/navigate_tool.rb +59 -0
  50. data/lib/ferrum_mcp/tools/press_key_tool.rb +91 -0
  51. data/lib/ferrum_mcp/tools/query_shadow_dom_tool.rb +225 -0
  52. data/lib/ferrum_mcp/tools/refresh_tool.rb +49 -0
  53. data/lib/ferrum_mcp/tools/screenshot_tool.rb +121 -0
  54. data/lib/ferrum_mcp/tools/session_tool.rb +37 -0
  55. data/lib/ferrum_mcp/tools/set_cookie_tool.rb +77 -0
  56. data/lib/ferrum_mcp/tools/solve_captcha_tool.rb +528 -0
  57. data/lib/ferrum_mcp/transport/http_server.rb +93 -0
  58. data/lib/ferrum_mcp/transport/rate_limiter.rb +79 -0
  59. data/lib/ferrum_mcp/transport/stdio_server.rb +63 -0
  60. data/lib/ferrum_mcp/version.rb +5 -0
  61. data/lib/ferrum_mcp/whisper_service.rb +222 -0
  62. data/lib/ferrum_mcp.rb +35 -0
  63. metadata +248 -0
@@ -0,0 +1,1410 @@
1
+ # FerrumMCP API Reference
2
+
3
+ Comprehensive documentation for all FerrumMCP browser automation tools.
4
+
5
+ ## Table of Contents
6
+
7
+ - [Overview](#overview)
8
+ - [Important Notes](#important-notes)
9
+ - [Session Management](#session-management)
10
+ - [create_session](#create_session)
11
+ - [list_sessions](#list_sessions)
12
+ - [get_session_info](#get_session_info)
13
+ - [close_session](#close_session)
14
+ - [Navigation](#navigation)
15
+ - [navigate](#navigate)
16
+ - [go_back](#go_back)
17
+ - [go_forward](#go_forward)
18
+ - [refresh](#refresh)
19
+ - [Interaction](#interaction)
20
+ - [click](#click)
21
+ - [fill_form](#fill_form)
22
+ - [press_key](#press_key)
23
+ - [hover](#hover)
24
+ - [drag_and_drop](#drag_and_drop)
25
+ - [accept_cookies](#accept_cookies)
26
+ - [solve_captcha](#solve_captcha)
27
+ - [Extraction](#extraction)
28
+ - [get_text](#get_text)
29
+ - [get_html](#get_html)
30
+ - [screenshot](#screenshot)
31
+ - [get_title](#get_title)
32
+ - [get_url](#get_url)
33
+ - [find_by_text](#find_by_text)
34
+ - [Advanced](#advanced)
35
+ - [execute_script](#execute_script)
36
+ - [evaluate_js](#evaluate_js)
37
+ - [get_cookies](#get_cookies)
38
+ - [set_cookie](#set_cookie)
39
+ - [clear_cookies](#clear_cookies)
40
+ - [get_attribute](#get_attribute)
41
+ - [query_shadow_dom](#query_shadow_dom)
42
+ - [Waiting (Currently Disabled)](#waiting-currently-disabled)
43
+ - [wait_for_element](#wait_for_element)
44
+ - [wait_for_navigation](#wait_for_navigation)
45
+ - [wait](#wait)
46
+
47
+ ---
48
+
49
+ ## Overview
50
+
51
+ FerrumMCP provides 27+ browser automation tools through the Model Context Protocol (MCP). All tools return responses in a standardized JSON format with a `success` boolean and either `data` or `error` fields.
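A client could unwrap that envelope before touching the payload. The following is a minimal Ruby sketch, assuming the response arrives as a JSON string whose top-level keys are exactly `success`, `data`, and `error`; `ToolError` and `unwrap_response` are hypothetical names and are not part of FerrumMCP.

```ruby
require "json"

# Hypothetical error class for failed tool calls (not part of FerrumMCP).
class ToolError < StandardError; end

# Parse a tool response and return its `data` payload,
# raising ToolError with the `error` text when `success` is false.
def unwrap_response(raw_json)
  body = JSON.parse(raw_json)
  raise ToolError, body["error"].to_s unless body["success"]

  body["data"]
end

ok = unwrap_response('{"success": true, "data": {"title": "Example Domain"}}')
puts ok["title"]
```

Raising on failure keeps the calling code on the success path and makes a failed tool call hard to ignore.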
52
+
53
+ ## Important Notes
54
+
55
+ 1. **Session-Based Architecture**: All browser operation tools (except session management tools) require a valid `session_id` parameter
56
+ 2. **Session Creation**: You must create a session using `create_session` before using any browser automation tools
57
+ 3. **Session Lifecycle**: Sessions auto-close after 30 minutes of inactivity or can be manually closed with `close_session`
58
+ 4. **Multiple Sessions**: You can run multiple concurrent browser sessions with different configurations
59
+ 5. **Screenshot Format**: The `screenshot` tool returns base64-encoded image data
60
+ 6. **Selector Support**: Most tools support both CSS selectors and XPath (use `xpath:` prefix for XPath)
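Notes 1 and 6 can be enforced on the client side when building requests. The sketch below assumes the `{"name", "arguments"}` request shape used by the examples in this reference; `tool_call` and `xpath` are hypothetical helper names, not FerrumMCP functions.

```ruby
require "json"

# Tools this reference lists under Session Management.
SESSION_TOOLS = %w[create_session list_sessions get_session_info close_session].freeze

# Build a request hash. Browser operation tools must carry a session_id;
# session management tools may be built without one.
def tool_call(name, session_id: nil, **arguments)
  if session_id.nil? && !SESSION_TOOLS.include?(name)
    raise ArgumentError, "#{name} requires a session_id"
  end

  arguments = arguments.merge(session_id: session_id) if session_id
  { name: name, arguments: arguments }
end

# Add the `xpath:` prefix so the selector is treated as XPath rather than CSS.
def xpath(expression)
  "xpath:#{expression}"
end

request = tool_call("click", session_id: "uuid-1234", selector: xpath("//button[@id='submit']"))
puts JSON.generate(request)
```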
61
+
62
+ ---
63
+
64
+ ## Session Management
65
+
66
+ ### create_session
67
+
68
+ Create a new browser session with custom options. Returns a `session_id` to use with other tools.
69
+
70
+ **Parameters:**
71
+
72
+ | Name | Type | Required | Description |
73
+ |------|------|----------|-------------|
74
+ | browser_id | string | No | Browser ID from `ferrum://browsers` resource |
75
+ | user_profile_id | string | No | User profile ID from `ferrum://user-profiles` resource |
76
+ | bot_profile_id | string | No | BotBrowser profile ID from `ferrum://bot-profiles` resource |
77
+ | browser_path | string | No | Path to browser executable (legacy) |
78
+ | botbrowser_profile | string | No | Path to BotBrowser profile (legacy) |
79
+ | headless | boolean | No | Run in headless mode (default: false) |
80
+ | timeout | number | No | Browser timeout in seconds (default: 60) |
81
+ | browser_options | object | No | Additional browser options (e.g., `{"--window-size": "1920,1080"}`) |
82
+ | metadata | object | No | Custom metadata for this session |
83
+
84
+ **Example Request:**
85
+
86
+ ```json
87
+ {
88
+ "name": "create_session",
89
+ "arguments": {
90
+ "headless": true,
91
+ "timeout": 60,
92
+ "browser_options": {
93
+ "--window-size": "1920,1080"
94
+ },
95
+ "metadata": {
96
+ "user": "john",
97
+ "project": "scraping"
98
+ }
99
+ }
100
+ }
101
+ ```
102
+
103
+ **Example Response:**
104
+
105
+ ```json
106
+ {
107
+ "session_id": "uuid-1234-5678",
108
+ "message": "Session created successfully",
109
+ "options": {
110
+ "headless": true,
111
+ "timeout": 60,
112
+ "browser_options": {
113
+ "--window-size": "1920,1080"
114
+ }
115
+ }
116
+ }
117
+ ```
118
+
119
+ **Notes:**
120
+ - Use `browser_id`, `user_profile_id`, and `bot_profile_id` for resource-based configuration (recommended)
121
+ - Legacy `browser_path` and `botbrowser_profile` parameters are still supported
122
+ - Query the `ferrum://browsers`, `ferrum://user-profiles`, and `ferrum://bot-profiles` resources to discover available configurations
123
+
124
+ ---
125
+
126
+ ### list_sessions
127
+
128
+ List all active browser sessions with their information.
129
+
130
+ **Parameters:**
131
+
132
+ None.
133
+
134
+ **Example Request:**
135
+
136
+ ```json
137
+ {
138
+ "name": "list_sessions",
139
+ "arguments": {}
140
+ }
141
+ ```
142
+
143
+ **Example Response:**
144
+
145
+ ```json
146
+ {
147
+ "count": 2,
148
+ "sessions": [
149
+ {
150
+ "id": "uuid-1234",
151
+ "status": "active",
152
+ "browser_type": "chrome",
153
+ "headless": true,
154
+ "created_at": "2025-11-22T10:00:00Z",
155
+ "last_used_at": "2025-11-22T10:05:00Z",
156
+ "uptime_seconds": 300
157
+ },
158
+ {
159
+ "id": "uuid-5678",
160
+ "status": "active",
161
+ "browser_type": "botbrowser",
162
+ "headless": false,
163
+ "created_at": "2025-11-22T10:10:00Z",
164
+ "last_used_at": "2025-11-22T10:15:00Z",
165
+ "uptime_seconds": 300
166
+ }
167
+ ]
168
+ }
169
+ ```
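The `last_used_at` timestamps in this response can be compared with the 30-minute inactivity window from the Important Notes. The sketch below assumes idle time is measured from `last_used_at`, which this reference does not state explicitly; `sessions_near_timeout` and its `margin` parameter are hypothetical.

```ruby
require "json"
require "time"

IDLE_LIMIT_SECONDS = 30 * 60 # 30 minutes, per the Important Notes

# Return the ids of sessions that, at time `now`, are within `margin`
# seconds of the idle limit (or already past it).
def sessions_near_timeout(list_response, now:, margin: 300)
  near = list_response.fetch("sessions").select do |session|
    idle = now - Time.iso8601(session.fetch("last_used_at"))
    idle >= IDLE_LIMIT_SECONDS - margin
  end
  near.map { |session| session.fetch("id") }
end

response = JSON.parse(<<~JSON)
  {
    "count": 2,
    "sessions": [
      { "id": "uuid-1234", "last_used_at": "2025-11-22T10:05:00Z" },
      { "id": "uuid-5678", "last_used_at": "2025-11-22T10:15:00Z" }
    ]
  }
JSON

p sessions_near_timeout(response, now: Time.iso8601("2025-11-22T10:32:00Z"))
```

A client could use a check like this to decide which sessions to touch or recreate before starting a long task.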
170
+
171
+ ---
172
+
173
+ ### get_session_info
174
+
175
+ Get detailed information about a specific browser session.
176
+
177
+ **Parameters:**
178
+
179
+ | Name | Type | Required | Description |
180
+ |------|------|----------|-------------|
181
+ | session_id | string | No | Session ID (omit for default session) |
182
+
183
+ **Example Request:**
184
+
185
+ ```json
186
+ {
187
+ "name": "get_session_info",
188
+ "arguments": {
189
+ "session_id": "uuid-1234"
190
+ }
191
+ }
192
+ ```
193
+
194
+ **Example Response:**
195
+
196
+ ```json
197
+ {
198
+ "id": "uuid-1234",
199
+ "status": "active",
200
+ "browser_type": "chrome",
201
+ "headless": true,
202
+ "timeout": 60,
203
+ "created_at": "2025-11-22T10:00:00Z",
204
+ "last_used_at": "2025-11-22T10:05:00Z",
205
+ "uptime_seconds": 300,
206
+ "metadata": {
207
+ "user": "john",
208
+ "project": "scraping"
209
+ }
210
+ }
211
+ ```
212
+
213
+ ---
214
+
215
+ ### close_session
216
+
217
+ Close a specific browser session. The browser will be stopped and the session removed.
218
+
219
+ **Parameters:**
220
+
221
+ | Name | Type | Required | Description |
222
+ |------|------|----------|-------------|
223
+ | session_id | string | Yes | ID of the session to close |
224
+
225
+ **Example Request:**
226
+
227
+ ```json
228
+ {
229
+ "name": "close_session",
230
+ "arguments": {
231
+ "session_id": "uuid-1234"
232
+ }
233
+ }
234
+ ```
235
+
236
+ **Example Response:**
237
+
238
+ ```json
239
+ {
240
+ "session_id": "uuid-1234",
241
+ "message": "Session closed successfully"
242
+ }
243
+ ```
244
+
245
+ ---
246
+
247
+ ## Navigation
248
+
249
+ ### navigate
250
+
251
+ Navigate to a specific URL in the browser.
252
+
253
+ **Parameters:**
254
+
255
+ | Name | Type | Required | Description |
256
+ |------|------|----------|-------------|
257
+ | url | string | Yes | URL to navigate to (must include http:// or https://) |
258
+ | session_id | string | Yes | Session ID to use |
259
+
260
+ **Example Request:**
261
+
262
+ ```json
263
+ {
264
+ "name": "navigate",
265
+ "arguments": {
266
+ "url": "https://example.com",
267
+ "session_id": "uuid-1234"
268
+ }
269
+ }
270
+ ```
271
+
272
+ **Example Response:**
273
+
274
+ ```json
275
+ {
276
+ "url": "https://example.com",
277
+ "title": "Example Domain"
278
+ }
279
+ ```
280
+
281
+ **Notes:**
282
+ - URL must start with `http://` or `https://`
283
+ - Automatically waits for network to be idle after navigation
284
+ - Throws a timeout error if navigation takes longer than the browser timeout
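The scheme requirement can be checked before a request is sent. This is a client-side sketch using Ruby's standard `uri` library; `navigable_url?` is a hypothetical helper, and the server's own validation may be stricter or looser.

```ruby
require "uri"

# True when `url` parses as an absolute http:// or https:// URL with a host.
def navigable_url?(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so one check covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end

puts navigable_url?("https://example.com")
puts navigable_url?("example.com")
```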
285
+
286
+ ---
287
+
288
+ ### go_back
289
+
290
+ Go back to the previous page in browser history.
291
+
292
+ **Parameters:**
293
+
294
+ | Name | Type | Required | Description |
295
+ |------|------|----------|-------------|
296
+ | session_id | string | Yes | Session ID to use |
297
+
298
+ **Example Request:**
299
+
300
+ ```json
301
+ {
302
+ "name": "go_back",
303
+ "arguments": {
304
+ "session_id": "uuid-1234"
305
+ }
306
+ }
307
+ ```
308
+
309
+ **Example Response:**
310
+
311
+ ```json
312
+ {
313
+ "url": "https://previous-page.com",
314
+ "title": "Previous Page"
315
+ }
316
+ ```
317
+
318
+ **Notes:**
319
+ - Waits for network to be idle after navigation
320
+ - Returns current URL and title after going back
321
+
322
+ ---
323
+
324
+ ### go_forward
325
+
326
+ Go forward to the next page in browser history.
327
+
328
+ **Parameters:**
329
+
330
+ | Name | Type | Required | Description |
331
+ |------|------|----------|-------------|
332
+ | session_id | string | Yes | Session ID to use |
333
+
334
+ **Example Request:**
335
+
336
+ ```json
337
+ {
338
+ "name": "go_forward",
339
+ "arguments": {
340
+ "session_id": "uuid-1234"
341
+ }
342
+ }
343
+ ```
344
+
345
+ **Example Response:**
346
+
347
+ ```json
348
+ {
349
+ "url": "https://next-page.com",
350
+ "title": "Next Page"
351
+ }
352
+ ```
353
+
354
+ **Notes:**
355
+ - Waits for network to be idle after navigation
356
+ - Returns current URL and title after going forward
357
+
358
+ ---
359
+
360
+ ### refresh
361
+
362
+ Refresh the current page.
363
+
364
+ **Parameters:**
365
+
366
+ | Name | Type | Required | Description |
367
+ |------|------|----------|-------------|
368
+ | session_id | string | Yes | Session ID to use |
369
+
370
+ **Example Request:**
371
+
372
+ ```json
373
+ {
374
+ "name": "refresh",
375
+ "arguments": {
376
+ "session_id": "uuid-1234"
377
+ }
378
+ }
379
+ ```
380
+
381
+ **Example Response:**
382
+
383
+ ```json
384
+ {
385
+ "url": "https://example.com",
386
+ "title": "Example Domain"
387
+ }
388
+ ```
389
+
390
+ **Notes:**
391
+ - Waits for network to be idle after refresh
392
+ - Returns current URL and title
393
+
394
+ ---
395
+
396
+ ## Interaction
397
+
398
+ ### click
399
+
400
+ Click on an element using a CSS selector or XPath.
401
+
402
+ **Parameters:**
403
+
404
+ | Name | Type | Required | Description |
405
+ |------|------|----------|-------------|
406
+ | selector | string | Yes | CSS selector or XPath (use `xpath:` prefix for XPath) |
407
+ | wait | number | No | Seconds to wait for element (default: 5) |
408
+ | force | boolean | No | Force the click even if the element is hidden (default: false) |
409
+ | session_id | string | Yes | Session ID to use |
410
+
411
+ **Example Request:**
412
+
413
+ ```json
414
+ {
415
+ "name": "click",
416
+ "arguments": {
417
+ "selector": "button.submit",
418
+ "wait": 10,
419
+ "force": false,
420
+ "session_id": "uuid-1234"
421
+ }
422
+ }
423
+ ```
424
+
425
+ **Example Response:**
426
+
427
+ ```json
428
+ {
429
+ "message": "Clicked on button.submit"
430
+ }
431
+ ```
432
+
433
+ **Notes:**
434
+ - Supports both CSS selectors and XPath (e.g., `xpath://button[@id='submit']`)
435
+ - Automatically scrolls element into view before clicking
436
+ - If `force: true`, uses JavaScript click as fallback for hidden elements
437
+ - Includes retry logic for stale elements
438
+
439
+ ---
440
+
441
+ ### fill_form
442
+
443
+ Fill one or more form fields with values.
444
+
445
+ **Parameters:**
446
+
447
+ | Name | Type | Required | Description |
448
+ |------|------|----------|-------------|
449
+ | fields | array | Yes | Array of field objects with `selector` and `value` |
450
+ | session_id | string | Yes | Session ID to use |
451
+
452
+ **Field Object:**
453
+
454
+ | Name | Type | Required | Description |
455
+ |------|------|----------|-------------|
456
+ | selector | string | Yes | CSS selector of the field |
457
+ | value | string | Yes | Value to fill |
458
+
459
+ **Example Request:**
460
+
461
+ ```json
462
+ {
463
+ "name": "fill_form",
464
+ "arguments": {
465
+ "fields": [
466
+ {
467
+ "selector": "input[name='username']",
468
+ "value": "john_doe"
469
+ },
470
+ {
471
+ "selector": "input[name='password']",
472
+ "value": "secret123"
473
+ }
474
+ ],
475
+ "session_id": "uuid-1234"
476
+ }
477
+ }
478
+ ```
479
+
480
+ **Example Response:**
481
+
482
+ ```json
483
+ {
484
+ "fields": [
485
+ {
486
+ "selector": "input[name='username']",
487
+ "filled": true
488
+ },
489
+ {
490
+ "selector": "input[name='password']",
491
+ "filled": true
492
+ }
493
+ ]
494
+ }
495
+ ```
496
+
497
+ **Notes:**
498
+ - Automatically scrolls fields into view
499
+ - Focuses each field before typing
500
+ - Includes small delays between fields for validation/autocomplete handlers
501
+ - Uses retry logic for stale elements
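Callers often hold form values as a hash, which can be mapped to the `fields` array shown above. This is a small sketch; `form_fields` and `fill_form_call` are hypothetical helpers, and converting values with `to_s` is an assumption based on the field object's `value` being typed as a string.

```ruby
require "json"

# Turn a { selector => value } hash into the `fields` array fill_form expects.
def form_fields(values)
  values.map { |selector, value| { selector: selector, value: value.to_s } }
end

# Wrap the fields in a complete fill_form request.
def fill_form_call(session_id, values)
  { name: "fill_form", arguments: { fields: form_fields(values), session_id: session_id } }
end

request = fill_form_call("uuid-1234", "input[name='username']" => "john_doe", "input[name='age']" => 42)
puts JSON.pretty_generate(request)
```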
502
+
503
+ ---
504
+
505
+ ### press_key
506
+
507
+ Press keyboard keys (e.g., Enter, Tab, Escape).
508
+
509
+ **Parameters:**
510
+
511
+ | Name | Type | Required | Description |
512
+ |------|------|----------|-------------|
513
+ | key | string | Yes | Key to press (Enter, Tab, Escape, ArrowDown, etc.) |
514
+ | selector | string | No | CSS selector to focus before pressing key |
515
+ | session_id | string | Yes | Session ID to use |
516
+
517
+ **Example Request:**
518
+
519
+ ```json
520
+ {
521
+ "name": "press_key",
522
+ "arguments": {
523
+ "key": "Enter",
524
+ "selector": "input.search",
525
+ "session_id": "uuid-1234"
526
+ }
527
+ }
528
+ ```
529
+
530
+ **Example Response:**
531
+
532
+ ```json
533
+ {
534
+ "message": "Pressed key: Enter"
535
+ }
536
+ ```
537
+
538
+ **Supported Keys:**
539
+ - `Enter`, `Return`
540
+ - `Tab`
541
+ - `Escape`, `Esc`
542
+ - `Backspace`
543
+ - `Delete`, `Del`
544
+ - `ArrowDown`, `Down`
545
+ - `ArrowUp`, `Up`
546
+ - `ArrowLeft`, `Left`
547
+ - `ArrowRight`, `Right`
548
+ - `Space`
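A client can normalize key names before calling `press_key`. The sketch assumes that the second name on each line of the list above is an alias of the first, which the list suggests but does not state; `canonical_key` and both constants are hypothetical.

```ruby
# Assumed aliases, read off the paired names in the Supported Keys list.
KEY_ALIASES = {
  "Return" => "Enter", "Esc" => "Escape", "Del" => "Delete",
  "Down" => "ArrowDown", "Up" => "ArrowUp",
  "Left" => "ArrowLeft", "Right" => "ArrowRight"
}.freeze

SUPPORTED_KEYS = %w[
  Enter Tab Escape Backspace Delete
  ArrowDown ArrowUp ArrowLeft ArrowRight Space
].freeze

# Map an alias to its primary name and reject keys that are not listed.
def canonical_key(key)
  name = KEY_ALIASES.fetch(key, key)
  raise ArgumentError, "unsupported key: #{key}" unless SUPPORTED_KEYS.include?(name)

  name
end

puts canonical_key("Esc")
```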
549
+
550
+ ---
551
+
552
+ ### hover
553
+
554
+ Hover over an element using a CSS selector.
555
+
556
+ **Parameters:**
557
+
558
+ | Name | Type | Required | Description |
559
+ |------|------|----------|-------------|
560
+ | selector | string | Yes | CSS selector of the element |
561
+ | session_id | string | Yes | Session ID to use |
562
+
563
+ **Example Request:**
564
+
565
+ ```json
566
+ {
567
+ "name": "hover",
568
+ "arguments": {
569
+ "selector": ".dropdown-menu",
570
+ "session_id": "uuid-1234"
571
+ }
572
+ }
573
+ ```
574
+
575
+ **Example Response:**
576
+
577
+ ```json
578
+ {
579
+ "message": "Hovered over .dropdown-menu"
580
+ }
581
+ ```
582
+
583
+ **Notes:**
584
+ - Automatically scrolls element into view
585
+ - Falls back to JavaScript hover if native hover fails
586
+
587
+ ---
588
+
589
+ ### drag_and_drop
590
+
591
+ Drag an element and drop it onto another element or coordinates.
592
+
593
+ **Parameters:**
594
+
595
+ | Name | Type | Required | Description |
596
+ |------|------|----------|-------------|
597
+ | source_selector | string | Yes | CSS selector or XPath of element to drag |
598
+ | target_selector | string | No | CSS selector or XPath of drop target |
599
+ | target_x | number | No | X coordinate to drop at |
600
+ | target_y | number | No | Y coordinate to drop at |
601
+ | steps | number | No | Number of steps for smooth dragging (default: 10) |
602
+ | session_id | string | Yes | Session ID to use |
603
+
604
+ **Example Request:**
605
+
606
+ ```json
607
+ {
608
+ "name": "drag_and_drop",
609
+ "arguments": {
610
+ "source_selector": ".draggable-item",
611
+ "target_selector": ".drop-zone",
612
+ "steps": 15,
613
+ "session_id": "uuid-1234"
614
+ }
615
+ }
616
+ ```
617
+
618
+ **Example Response:**
619
+
620
+ ```json
621
+ {
622
+ "message": "Dragged from (100, 200) to (500, 300)"
623
+ }
624
+ ```
625
+
626
+ **Notes:**
627
+ - Either `target_selector` or both `target_x` and `target_y` must be provided
628
+ - Supports both CSS selectors and XPath
629
+ - Performs smooth drag with configurable steps
630
+ - Includes delays to ensure drag events register properly
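The target rule and the idea of stepped dragging can be illustrated as follows. This is a sketch only: it assumes straight-line interpolation with one pointer position per step, which may differ from how the tool actually moves the pointer; `valid_drop_target?` and `drag_path` are hypothetical names.

```ruby
# Check the target rule from the notes: a selector, or both coordinates.
def valid_drop_target?(target_selector: nil, target_x: nil, target_y: nil)
  !target_selector.nil? || (!target_x.nil? && !target_y.nil?)
end

# Evenly spaced points from `from` towards `to` (both [x, y] pairs),
# one per step, with the last point landing exactly on the target.
def drag_path(from, to, steps: 10)
  raise ArgumentError, "steps must be positive" unless steps.positive?

  (1..steps).map do |i|
    t = i.to_f / steps
    [from[0] + (to[0] - from[0]) * t, from[1] + (to[1] - from[1]) * t]
  end
end

p drag_path([100, 200], [500, 300], steps: 4)
```

More steps produce more intermediate pointer positions, which is presumably why a higher `steps` value gives a smoother drag.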
631
+
632
+ ---
633
+
634
+ ### accept_cookies
635
+
636
+ Automatically detect and accept cookie consent banners.
637
+
638
+ **Parameters:**
639
+
640
+ | Name | Type | Required | Description |
641
+ |------|------|----------|-------------|
642
+ | wait | number | No | Seconds to wait for banner to appear (default: 3) |
643
+ | session_id | string | Yes | Session ID to use |
644
+
645
+ **Example Request:**
646
+
647
+ ```json
648
+ {
649
+ "name": "accept_cookies",
650
+ "arguments": {
651
+ "wait": 5,
652
+ "session_id": "uuid-1234"
653
+ }
654
+ }
655
+ ```
656
+
657
+ **Example Response:**
658
+
659
+ ```json
660
+ {
661
+ "message": "Cookie consent accepted successfully",
662
+ "strategy": "common_frameworks",
663
+ "selector": "#onetrust-accept-btn-handler"
664
+ }
665
+ ```
666
+
667
+ **Detection Strategies (in order):**
668
+ 1. **Common Frameworks**: OneTrust, Cookiebot, Osano, Quantcast, TrustArc, Termly, Didomi, Sourcepoint
669
+ 2. **Iframe Detection**: Checks iframes for cookie banners
670
+ 3. **Text-Based Detection**: Searches for common accept button text in multiple languages
671
+ 4. **CSS Selectors**: Generic CSS patterns for accept buttons
672
+
673
+ **Supported Languages:**
674
+ - English, French, German, Spanish, Italian, Portuguese
675
+
676
+ **Notes:**
677
+ - Automatically tries multiple strategies
678
+ - Filters out reject/customize buttons
679
+ - Works with both main page and iframes
680
+ - Returns the strategy and selector that succeeded
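The ordered fallback described under Detection Strategies can be pictured as a chain that stops at the first hit. The sketch below is not the tool's implementation: the finders are stand-in lambdas, `first_matching_strategy` is hypothetical, and every strategy name other than `common_frameworks` (which appears in the example response) is invented for illustration.

```ruby
# Try each [name, finder] pair in order. Return the first non-nil selector,
# tagged with the strategy that produced it, or nil when nothing matches.
def first_matching_strategy(strategies)
  strategies.each do |name, finder|
    selector = finder.call
    return { strategy: name, selector: selector } if selector
  end
  nil
end

# Stand-in finders; real ones would inspect the page and its iframes.
strategies = [
  ["common_frameworks", -> { nil }],
  ["iframe_detection", -> { nil }],
  ["text_based", -> { "button.accept-all" }],
  ["css_selectors", -> { "[class*='accept']" }]
]

p first_matching_strategy(strategies)
```

Because the chain returns early, later and more generic strategies only run when the more specific ones find nothing.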
681
+
682
+ ---
683
+
684
+ ### solve_captcha
685
+
686
+ Automatically detect and solve audio CAPTCHA challenges using Whisper speech recognition.
687
+
688
+ **Parameters:**
689
+
690
+ | Name | Type | Required | Description |
691
+ |------|------|----------|-------------|
692
+ | session_id | string | Yes | Session ID to use |
693
+
694
+ **Example Request:**
695
+
696
+ ```json
697
+ {
698
+ "name": "solve_captcha",
699
+ "arguments": {
700
+ "session_id": "uuid-1234"
701
+ }
702
+ }
703
+ ```
704
+
705
+ **Example Response:**
706
+
707
+ ```json
708
+ {
709
+ "message": "CAPTCHA solved successfully",
710
+ "transcription": "the quick brown fox",
711
+ "audio_button": "#recaptcha-audio-button",
712
+ "input_field": "#audio-response",
713
+ "verify_button": "#recaptcha-verify-button"
714
+ }
715
+ ```
716
+
717
+ **Process:**
718
+ 1. Detects and clicks CAPTCHA checkbox (if present)
719
+ 2. Finds and clicks audio challenge button
720
+ 3. Downloads audio challenge
721
+ 4. Transcribes audio using Whisper
722
+ 5. Fills input field with transcription
723
+ 6. Clicks verify button
724
+
725
+ **Supported CAPTCHAs:**
726
+ - Google reCAPTCHA
727
+ - hCaptcha
728
+
729
+ **Notes:**
730
+ - Requires Whisper service to be available
731
+ - Works with both main page and iframes
732
+ - Uses human-like typing delays
733
+ - Automatically cleans up temporary audio files
734
+
735
+ ---
736
+
737
+ ## Extraction
738
+
739
+ ### get_text
740
+
741
+ Extract text content from one or more elements.
742
+
743
+ **Parameters:**
744
+
745
+ | Name | Type | Required | Description |
746
+ |------|------|----------|-------------|
747
+ | selector | string | Yes | CSS selector or XPath (use `xpath:` prefix) |
748
+ | multiple | boolean | No | Extract from all matching elements (default: false) |
749
+ | session_id | string | Yes | Session ID to use |
750
+
751
+ **Example Request (Single Element):**
752
+
753
+ ```json
754
+ {
755
+ "name": "get_text",
756
+ "arguments": {
757
+ "selector": "h1.title",
758
+ "session_id": "uuid-1234"
759
+ }
760
+ }
761
+ ```
762
+
763
+ **Example Response (Single):**
764
+
765
+ ```json
766
+ {
767
+ "text": "Welcome to Example"
768
+ }
769
+ ```
770
+
771
+ **Example Request (Multiple Elements):**
772
+
773
+ ```json
774
+ {
775
+ "name": "get_text",
776
+ "arguments": {
777
+ "selector": "li.item",
778
+ "multiple": true,
779
+ "session_id": "uuid-1234"
780
+ }
781
+ }
782
+ ```
783
+
784
+ **Example Response (Multiple):**
785
+
786
+ ```json
787
+ {
788
+ "texts": [
789
+ "First item",
790
+ "Second item",
791
+ "Third item"
792
+ ],
793
+ "count": 3
794
+ }
795
+ ```
796
+
797
+ **Notes:**
798
+ - Supports both CSS selectors and XPath
799
+ - Returns array when `multiple: true`
800
+ - Throws an error if no elements are found
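Because the response uses `text` for a single element and `texts` when `multiple` is true, a caller may prefer one shape. A minimal sketch, assuming the response has already been parsed into a Ruby hash with string keys; `texts_from` is a hypothetical helper.

```ruby
# Return the extracted text as an array, whichever response shape arrived.
def texts_from(response)
  if response.key?("texts")
    response["texts"]
  else
    [response.fetch("text")]
  end
end

p texts_from("text" => "Welcome to Example")
p texts_from("texts" => ["First item", "Second item"], "count" => 2)
```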
801
+
802
+ ---
803
+
804
+ ### get_html
805
+
806
+ Get HTML content of the page or a specific element.
807
+
808
+ **Parameters:**
809
+
810
+ | Name | Type | Required | Description |
811
+ |------|------|----------|-------------|
812
+ | selector | string | No | CSS selector to get HTML of specific element |
813
+ | session_id | string | Yes | Session ID to use |
814
+
815
+ **Example Request (Full Page):**
816
+
817
+ ```json
818
+ {
819
+ "name": "get_html",
820
+ "arguments": {
821
+ "session_id": "uuid-1234"
822
+ }
823
+ }
824
+ ```
825
+
826
+ **Example Response (Full Page):**
827
+
828
+ ```json
829
+ {
830
+ "html": "<!DOCTYPE html><html>...</html>",
831
+ "url": "https://example.com"
832
+ }
833
+ ```
834
+
835
+ **Example Request (Specific Element):**
836
+
837
+ ```json
838
+ {
839
+ "name": "get_html",
840
+ "arguments": {
841
+ "selector": "div.content",
842
+ "session_id": "uuid-1234"
843
+ }
844
+ }
845
+ ```
846
+
847
+ **Example Response (Element):**
848
+
849
+ ```json
850
+ {
851
+ "html": "<div class=\"content\">...</div>",
852
+ "selector": "div.content"
853
+ }
854
+ ```
855
+
856
+ **Notes:**
857
+ - Omit `selector` to get full page HTML
858
+ - Returns `outerHTML` for specific elements (includes the element itself)
859
+
860
+ ---
861
+
862
+ ### screenshot
863
+
864
+ Take a screenshot of the page or a specific element.
865
+
866
+ **Parameters:**
867
+
868
+ | Name | Type | Required | Description |
869
+ |------|------|----------|-------------|
870
+ | selector | string | No | CSS selector to screenshot specific element |
871
+ | full_page | boolean | No | Capture full scrollable page (default: false) |
872
+ | format | string | No | Image format: `png` or `jpeg` (default: png) |
873
+ | session_id | string | Yes | Session ID to use |
874
+
875
+ **Example Request:**
876
+
877
+ ```json
878
+ {
879
+ "name": "screenshot",
880
+ "arguments": {
881
+ "full_page": true,
882
+ "format": "png",
883
+ "session_id": "uuid-1234"
884
+ }
885
+ }
886
+ ```
887
+
888
+ **Example Response:**
889
+
890
+ ```json
891
+ {
892
+ "type": "image",
893
+ "data": "iVBORw0KGgoAAAANSUhEUgAA...(base64 data)...",
894
+ "mime_type": "image/png"
895
+ }
896
+ ```
897
+
898
+ **Notes:**
899
+ - Returns base64-encoded image data
900
+ - Automatically resizes if dimensions exceed 8000px (Claude API limit)
901
+ - Uses high-quality Lanczos3 interpolation for resizing
902
+ - Scrolls element into view before screenshot if selector provided
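Since the image arrives as base64 text, a client has to decode it before saving or displaying it. The sketch below assumes the response has been parsed into a hash with the `data` and `mime_type` keys shown above; `save_screenshot` and the extension table are hypothetical.

```ruby
# File extensions for the two formats the tool documents.
EXTENSIONS = { "image/png" => "png", "image/jpeg" => "jpg" }.freeze

# Decode a screenshot response and write it to "<basename>.<ext>".
# Returns the path that was written.
def save_screenshot(response, basename)
  ext = EXTENSIONS.fetch(response.fetch("mime_type"))
  path = "#{basename}.#{ext}"
  # "m0" is strict base64: malformed input raises ArgumentError.
  bytes = response.fetch("data").unpack1("m0")
  File.binwrite(path, bytes)
  path
end
```

Calling `save_screenshot(response, "page")` on a PNG response would write `page.png` in the current directory. Writing in binary mode avoids newline translation on platforms that apply it to text files.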
903
+
904
+ ---
905
+
906
+ ### get_title
907
+
908
+ Get the title of the current page.
909
+
910
+ **Parameters:**
911
+
912
+ | Name | Type | Required | Description |
913
+ |------|------|----------|-------------|
914
+ | session_id | string | Yes | Session ID to use |
915
+
916
+ **Example Request:**
917
+
918
+ ```json
919
+ {
920
+ "name": "get_title",
921
+ "arguments": {
922
+ "session_id": "uuid-1234"
923
+ }
924
+ }
925
+ ```
926
+
927
+ **Example Response:**
928
+
929
+ ```json
930
+ {
931
+ "title": "Example Domain",
932
+ "url": "https://example.com"
933
+ }
934
+ ```
935
+
936
+ ---
937
+
938
+ ### get_url
939
+
940
+ Get the current URL of the page.
941
+
942
+ **Parameters:**
943
+
944
+ | Name | Type | Required | Description |
945
+ |------|------|----------|-------------|
946
+ | session_id | string | Yes | Session ID to use |
947
+
948
+ **Example Request:**
949
+
950
+ ```json
951
+ {
952
+ "name": "get_url",
953
+ "arguments": {
954
+ "session_id": "uuid-1234"
955
+ }
956
+ }
957
+ ```
958
+
959
+ **Example Response:**
960
+
961
+ ```json
962
+ {
963
+ "url": "https://example.com/page"
964
+ }
965
+ ```
966
+
967
+ ---
968
+
969
+ ### find_by_text
970
+
971
+ Find elements by their text content using XPath.
972
+
973
+ **Parameters:**
974
+
975
+ | Name | Type | Required | Description |
976
+ |------|------|----------|-------------|
977
+ | text | string | Yes | Text to search for |
978
+ | tag | string | No | HTML tag to search within (default: `*` for any) |
979
+ | exact | boolean | No | Require an exact match instead of a partial match (default: false) |
980
+ | multiple | boolean | No | Return all matches instead of only the first visible one (default: false) |
981
+ | session_id | string | Yes | Session ID to use |
982
+
983
+ **Example Request:**
984
+
985
+ ```json
986
+ {
987
+ "name": "find_by_text",
988
+ "arguments": {
989
+ "text": "Sign In",
990
+ "tag": "button",
991
+ "exact": false,
992
+ "session_id": "uuid-1234"
993
+ }
994
+ }
995
+ ```
996
+
997
+ **Example Response (Single):**
998
+
999
+ ```json
1000
+ {
1001
+ "tag": "button",
1002
+ "text": "Sign In",
1003
+ "visible": true,
1004
+ "selector": "button.login-btn",
1005
+ "xpath": "//button[contains(normalize-space(.), 'sign in')]",
1006
+ "total_found": 3
1007
+ }
1008
+ ```
1009
+
1010
+ **Example Response (Multiple):**
1011
+
1012
+ ```json
1013
+ {
1014
+ "found": 3,
1015
+ "elements": [
1016
+ {
1017
+ "index": 0,
1018
+ "tag": "button",
1019
+ "text": "Sign In",
1020
+ "visible": true,
1021
+ "selector": "button#main-login"
1022
+ },
1023
+ {
1024
+ "index": 1,
1025
+ "tag": "a",
1026
+ "text": "Sign In Here",
1027
+ "visible": true,
1028
+ "selector": "a.secondary-login"
1029
+ }
1030
+ ],
1031
+ "xpath": "//button[contains(normalize-space(.), 'sign in')]"
1032
+ }
1033
+ ```
1034
+
1035
+ **Notes:**
1036
+ - Case-insensitive search
1037
+ - Handles quotes in text properly (prevents XPath injection)
1038
+ - Prefers visible elements when `multiple: false`
1039
+ - Generates CSS selector from element id/classes when possible
1040
+
1041
+ ---
1042
+
1043
+ ## Advanced
1044
+
1045
+ ### execute_script
1046
+
1047
+ Execute JavaScript code in the browser context (for side effects, no return value).
1048
+
1049
+ **Parameters:**
1050
+
1051
+ | Name | Type | Required | Description |
1052
+ |------|------|----------|-------------|
1053
+ | script | string | Yes | JavaScript code to execute |
1054
+ | session_id | string | Yes | Session ID to use |
1055
+
1056
+ **Example Request:**
1057
+
1058
+ ```json
1059
+ {
1060
+ "name": "execute_script",
1061
+ "arguments": {
1062
+ "script": "document.body.style.backgroundColor = 'red';",
1063
+ "session_id": "uuid-1234"
1064
+ }
1065
+ }
1066
+ ```
1067
+
1068
+ **Example Response:**
1069
+
1070
+ ```json
1071
+ {
1072
+ "message": "Script executed successfully"
1073
+ }
1074
+ ```
1075
+
1076
+ **Notes:**
1077
+ - Use `execute_script` for side effects (DOM manipulation, etc.)
1078
+ - Use `evaluate_js` if you need to get a return value
1079
+ - Script runs in page context with access to DOM
1080
+
1081
+ ---
1082
+
1083
+ ### evaluate_js
1084
+
1085
+ Evaluate a JavaScript expression and return the result.
1086
+
1087
+ **Parameters:**
1088
+
1089
+ | Name | Type | Required | Description |
1090
+ |------|------|----------|-------------|
1091
+ | expression | string | Yes | JavaScript expression to evaluate |
1092
+ | session_id | string | Yes | Session ID to use |
1093
+
1094
+ **Example Request:**
1095
+
1096
+ ```json
1097
+ {
1098
+ "name": "evaluate_js",
1099
+ "arguments": {
1100
+ "expression": "document.querySelectorAll('p').length",
1101
+ "session_id": "uuid-1234"
1102
+ }
1103
+ }
1104
+ ```
1105
+
1106
+ **Example Response:**
1107
+
1108
+ ```json
1109
+ {
1110
+ "result": 42
1111
+ }
1112
+ ```
1113
+
1114
+ **Notes:**
1115
+ - Returns the result of the expression
1116
+ - Can return primitives, arrays, or objects
1117
+ - Use `execute_script` for code without return values
1118
+
1119
+ ---
1120
+
1121
+ ### get_cookies
1122
+
1123
+ Get all cookies or cookies for a specific domain.
1124
+
1125
+ **Parameters:**
1126
+
1127
+ | Name | Type | Required | Description |
1128
+ |------|------|----------|-------------|
1129
+ | domain | string | No | Filter cookies by domain |
1130
+ | session_id | string | Yes | Session ID to use |
1131
+
1132
+ **Example Request:**
1133
+
1134
+ ```json
1135
+ {
1136
+ "name": "get_cookies",
1137
+ "arguments": {
1138
+ "domain": "example.com",
1139
+ "session_id": "uuid-1234"
1140
+ }
1141
+ }
1142
+ ```
1143
+
1144
+ **Example Response:**
1145
+
1146
+ ```json
1147
+ {
1148
+ "cookies": [
1149
+ {
1150
+ "name": "session_id",
1151
+ "value": "abc123",
1152
+ "domain": ".example.com",
1153
+ "path": "/",
1154
+ "secure": true,
1155
+ "httpOnly": true
1156
+ }
1157
+ ],
1158
+ "count": 1
1159
+ }
1160
+ ```
1161
+
1162
+ **Notes:**
1163
+ - Omit `domain` to get all cookies
1164
+ - Returns structured cookie data with all attributes
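
The optional `domain` filter and the response shape can be illustrated with a short Ruby sketch. Both helper names are hypothetical; the sample payload is copied from the example response above, and nothing here calls a real server.

```ruby
require "json"

# Hypothetical helper: builds get_cookies arguments and leaves out `domain`
# when the caller wants every cookie.
def get_cookies_request(session_id, domain: nil)
  arguments = { "session_id" => session_id }
  arguments["domain"] = domain if domain
  { "name" => "get_cookies", "arguments" => arguments }
end

# Hypothetical helper: names of cookies flagged httpOnly in a parsed response.
def http_only_cookie_names(response)
  response.fetch("cookies", [])
          .select { |cookie| cookie["httpOnly"] }
          .map { |cookie| cookie["name"] }
end

# Sample payload copied from the example response above.
sample = JSON.parse(<<~PAYLOAD)
  {
    "cookies": [
      { "name": "session_id", "value": "abc123", "domain": ".example.com",
        "path": "/", "secure": true, "httpOnly": true }
    ],
    "count": 1
  }
PAYLOAD

puts JSON.generate(get_cookies_request("uuid-1234"))
puts http_only_cookie_names(sample).inspect
```
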
1165
+
1166
+ ---
1167
+
1168
+ ### set_cookie
1169
+
1170
+ Set a cookie in the browser.
1171
+
1172
+ **Parameters:**
1173
+
1174
+ | Name | Type | Required | Description |
1175
+ |------|------|----------|-------------|
1176
+ | name | string | Yes | Cookie name |
1177
+ | value | string | Yes | Cookie value |
1178
+ | domain | string | Yes | Cookie domain |
1179
+ | path | string | No | Cookie path (default: /) |
1180
+ | secure | boolean | No | Secure flag (default: false) |
1181
+ | httponly | boolean | No | HttpOnly flag (default: false) |
1182
+ | session_id | string | Yes | Session ID to use |
1183
+
1184
+ **Example Request:**
1185
+
1186
+ ```json
1187
+ {
1188
+ "name": "set_cookie",
1189
+ "arguments": {
1190
+ "name": "user_pref",
1191
+ "value": "dark_mode",
1192
+ "domain": ".example.com",
1193
+ "path": "/",
1194
+ "secure": true,
1195
+ "session_id": "uuid-1234"
1196
+ }
1197
+ }
1198
+ ```
1199
+
1200
+ **Example Response:**
1201
+
1202
+ ```json
1203
+ {
1204
+ "message": "Cookie set: user_pref"
1205
+ }
1206
+ ```
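
The parameter table lists three optional fields with defaults. A client could make those defaults explicit before sending, as in the sketch below. The helper and constant names are hypothetical, and whether the server applies the same defaults when a field is omitted is taken from the table rather than verified here.

```ruby
# Defaults as listed in the set_cookie parameter table.
SET_COOKIE_DEFAULTS = { "path" => "/", "secure" => false, "httponly" => false }.freeze
SET_COOKIE_REQUIRED = %w[name value domain session_id].freeze

# Hypothetical helper: merges caller-supplied values over the documented
# defaults and rejects calls that lack a required argument.
def set_cookie_request(arguments)
  args = SET_COOKIE_DEFAULTS.merge(arguments.transform_keys(&:to_s))
  missing = SET_COOKIE_REQUIRED.select { |key| args[key].to_s.empty? }
  raise ArgumentError, "missing required arguments: #{missing.join(', ')}" unless missing.empty?

  { "name" => "set_cookie", "arguments" => args }
end

request = set_cookie_request({
  name: "user_pref", value: "dark_mode", domain: ".example.com",
  secure: true, session_id: "uuid-1234"
})
puts request["arguments"].inspect
```
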
1207
+
1208
+ ---
1209
+
1210
+ ### clear_cookies
1211
+
1212
+ Clear all cookies or cookies for a specific domain.
1213
+
1214
+ **Parameters:**
1215
+
1216
+ | Name | Type | Required | Description |
1217
+ |------|------|----------|-------------|
1218
+ | domain | string | No | Clear cookies only for this domain |
1219
+ | session_id | string | Yes | Session ID to use |
1220
+
1221
+ **Example Request:**
1222
+
1223
+ ```json
1224
+ {
1225
+ "name": "clear_cookies",
1226
+ "arguments": {
1227
+ "domain": "example.com",
1228
+ "session_id": "uuid-1234"
1229
+ }
1230
+ }
1231
+ ```
1232
+
1233
+ **Example Response:**
1234
+
1235
+ ```json
1236
+ {
1237
+ "message": "Cleared 5 cookies for example.com"
1238
+ }
1239
+ ```
1240
+
1241
+ **Notes:**
1242
+ - Omit `domain` to clear all cookies
1243
+
1244
+ ---
1245
+
1246
+ ### get_attribute
1247
+
1248
+ Get attribute value(s) from an element.
1249
+
1250
+ **Parameters:**
1251
+
1252
+ | Name | Type | Required | Description |
1253
+ |------|------|----------|-------------|
1254
+ | selector | string | Yes | CSS selector of the element |
1255
+ | attribute | string | Yes | Attribute name to get |
1256
+ | session_id | string | Yes | Session ID to use |
1257
+
1258
+ **Example Request:**
1259
+
1260
+ ```json
1261
+ {
1262
+ "name": "get_attribute",
1263
+ "arguments": {
1264
+ "selector": "a.download",
1265
+ "attribute": "href",
1266
+ "session_id": "uuid-1234"
1267
+ }
1268
+ }
1269
+ ```
1270
+
1271
+ **Example Response:**
1272
+
1273
+ ```json
1274
+ {
1275
+ "selector": "a.download",
1276
+ "attribute": "href",
1277
+ "value": "https://example.com/file.pdf"
1278
+ }
1279
+ ```
1280
+
1281
+ ---
1282
+
1283
+ ### query_shadow_dom
1284
+
1285
+ Query and interact with elements inside a Shadow DOM.
1286
+
1287
+ **Parameters:**
1288
+
1289
+ | Name | Type | Required | Description |
1290
+ |------|------|----------|-------------|
1291
+ | host_selector | string | Yes | CSS selector of Shadow DOM host element |
1292
+ | shadow_selector | string | Yes | CSS selector to find element(s) within Shadow DOM |
1293
+ | action | string | Yes | Action: `click`, `get_text`, `get_html`, or `get_attribute` |
1294
+ | attribute | string | No | Attribute name (required when action is `get_attribute`) |
1295
+ | multiple | boolean | No | Return all matching elements (default: false) |
1296
+ | session_id | string | Yes | Session ID to use |
1297
+
1298
+ **Example Request (Click):**
1299
+
1300
+ ```json
1301
+ {
1302
+ "name": "query_shadow_dom",
1303
+ "arguments": {
1304
+ "host_selector": "video-player",
1305
+ "shadow_selector": "button.play",
1306
+ "action": "click",
1307
+ "session_id": "uuid-1234"
1308
+ }
1309
+ }
1310
+ ```
1311
+
1312
+ **Example Response (Click):**
1313
+
1314
+ ```json
1315
+ {
1316
+ "message": "Clicked element in Shadow DOM: button.play"
1317
+ }
1318
+ ```
1319
+
1320
+ **Example Request (Get Text):**
1321
+
1322
+ ```json
1323
+ {
1324
+ "name": "query_shadow_dom",
1325
+ "arguments": {
1326
+ "host_selector": "custom-widget",
1327
+ "shadow_selector": ".status",
1328
+ "action": "get_text",
1329
+ "session_id": "uuid-1234"
1330
+ }
1331
+ }
1332
+ ```
1333
+
1334
+ **Example Response (Get Text):**
1335
+
1336
+ ```json
1337
+ {
1338
+ "text": "Online"
1339
+ }
1340
+ ```
1341
+
1342
+ **Notes:**
1343
+ - Essential for interacting with Web Components
1344
+ - Supports four actions: `click`, `get_text`, `get_html`, and `get_attribute`
1345
+ - Can query multiple elements with `multiple: true`
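
The conditional rule in the parameter table (`attribute` is required only when `action` is `get_attribute`) is easy to get wrong, so a client may want to check it before sending. The following Ruby sketch shows one way to do that; the helper and constant names are hypothetical.

```ruby
# The four actions listed in the parameter table.
SHADOW_DOM_ACTIONS = %w[click get_text get_html get_attribute].freeze

# Hypothetical helper: validates and builds a query_shadow_dom request.
def query_shadow_dom_request(host_selector:, shadow_selector:, action:, session_id:,
                             attribute: nil, multiple: false)
  unless SHADOW_DOM_ACTIONS.include?(action)
    raise ArgumentError, "action must be one of: #{SHADOW_DOM_ACTIONS.join(', ')}"
  end
  if action == "get_attribute" && attribute.to_s.empty?
    raise ArgumentError, "attribute is required when action is get_attribute"
  end

  arguments = {
    "host_selector" => host_selector,
    "shadow_selector" => shadow_selector,
    "action" => action,
    "multiple" => multiple,
    "session_id" => session_id
  }
  # Only send `attribute` when the caller supplied one.
  arguments["attribute"] = attribute if attribute
  { "name" => "query_shadow_dom", "arguments" => arguments }
end

request = query_shadow_dom_request(
  host_selector: "video-player", shadow_selector: "button.play",
  action: "click", session_id: "uuid-1234"
)
puts request.inspect
```
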
1346
+
1347
+ ---
1348
+
1349
+ ## Response Format
1350
+
1351
+ Every tool response uses one of the following standard formats:
1352
+
1353
+ **Success Response:**
1354
+
1355
+ ```json
1356
+ {
1357
+ "success": true,
1358
+ "data": {
1359
+ // Tool-specific data
1360
+ }
1361
+ }
1362
+ ```
1363
+
1364
+ **Error Response:**
1365
+
1366
+ ```json
1367
+ {
1368
+ "success": false,
1369
+ "error": "Error message describing what went wrong"
1370
+ }
1371
+ ```
1372
+
1373
+ **Image Response (screenshot tool):**
1374
+
1375
+ ```json
1376
+ {
1377
+ "type": "image",
1378
+ "data": "base64-encoded-image-data",
1379
+ "mime_type": "image/png"
1380
+ }
1381
+ ```
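
A client has to tell the three shapes apart before using a response. The Ruby sketch below is one possible dispatcher, assuming the response arrives as a JSON string with exactly the fields shown above. The function name `interpret_response` and the returned `kind` symbols are hypothetical.

```ruby
require "json"

# Hypothetical helper: normalizes the three documented response shapes into
# a Hash tagged with :kind.
def interpret_response(raw_json)
  response = JSON.parse(raw_json)

  if response["type"] == "image"
    # unpack1("m") decodes Base64 without needing an extra library.
    { kind: :image, mime_type: response["mime_type"], bytes: response["data"].unpack1("m") }
  elsif response["success"] == true
    { kind: :ok, data: response["data"] }
  elsif response["success"] == false
    { kind: :error, message: response["error"] }
  else
    { kind: :unknown, raw: response }
  end
end

puts interpret_response('{"success": true, "data": {"result": 42}}').inspect
puts interpret_response('{"success": false, "error": "Element not found: #missing"}').inspect
```

The image branch is checked first because that shape carries no `success` key.
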
1382
+
1383
+ ---
1384
+
1385
+ ## Common Error Scenarios
1386
+
1387
+ 1. **Missing session_id**: "session_id is required. Create a session first using create_session tool."
1388
+ 2. **Invalid session**: "Session not found: {session_id}"
1389
+ 3. **Element not found**: "Element not found: {selector}"
1390
+ 4. **Timeout errors**: "Navigation timed out" / "Timeout waiting for element"
1391
+ 5. **JavaScript errors**: "Failed to execute script: {error}"
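
Because errors arrive as plain strings, a client that wants to react differently to each scenario has to match on the message text. The sketch below maps the five messages listed above to symbols. The category names and the helper are hypothetical, and the patterns assume the wording stays as documented here.

```ruby
# Patterns derived from the error messages listed above.
ERROR_PATTERNS = {
  missing_session_id: /\Asession_id is required/,
  invalid_session: /\ASession not found: /,
  element_not_found: /\AElement not found: /,
  timeout: /timed out|\ATimeout waiting/i,
  javascript_error: /\AFailed to execute script: /
}.freeze

# Hypothetical helper: returns the first matching category, or :unknown.
def classify_error(message)
  match = ERROR_PATTERNS.find { |_kind, pattern| message.match?(pattern) }
  match ? match.first : :unknown
end

puts classify_error("Session not found: uuid-1234")
puts classify_error("Timeout waiting for element")
```

A retry policy could then, for example, retry only `:timeout` and create a fresh session on `:invalid_session`.
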
1392
+
1393
+ ---
1394
+
1395
+ ## Best Practices
1396
+
1397
+ 1. **Always create a session first**: Use `create_session` before any browser operations
1398
+ 2. **Use resource discovery**: Query `ferrum://browsers` and `ferrum://bot-profiles` to see available configurations
1399
+ 3. **Handle sessions properly**: Close sessions when done to free resources
1400
+ 4. **Use appropriate selectors**: Prefer CSS selectors for performance, XPath for complex queries
1401
+ 5. **Set timeouts wisely**: Increase the timeout for slow-loading pages
1402
+ 6. **Force clicks sparingly**: Only use `force: true` when necessary, as it bypasses visibility checks
1403
+ 7. **Screenshot optimization**: Use `jpeg` format for smaller file sizes when quality is not critical
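
Practices 1 and 3 (create a session first, always close it) can be combined into a single wrapper. The Ruby sketch below rests on several assumptions that this section does not confirm: `call_tool` is a stand-in for whatever MCP client is in use, `create_session` is called with empty arguments, the new ID is returned as `data.session_id`, and `close_session` accepts a `session_id` argument.

```ruby
# `call_tool` is any callable taking (tool_name, arguments) and returning a
# parsed response Hash. It stands in for a real MCP client.
def with_session(call_tool)
  created = call_tool.call("create_session", {})
  raise "create_session failed: #{created['error']}" unless created["success"]

  # Assumption: the new session's ID is returned as data.session_id.
  session_id = created.dig("data", "session_id")
  begin
    yield session_id
  ensure
    # Runs even if the block raises, so the browser is always released.
    call_tool.call("close_session", { "session_id" => session_id })
  end
end

# Stub client for illustration only: records calls and returns canned replies.
calls = []
stub_client = lambda do |name, arguments|
  calls << [name, arguments]
  data = name == "create_session" ? { "session_id" => "uuid-1234" } : {}
  { "success" => true, "data" => data }
end

with_session(stub_client) do |session_id|
  stub_client.call("evaluate_js", { "expression" => "1 + 1", "session_id" => session_id })
end
puts calls.map(&:first).inspect
```
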
1404
+
1405
+ ---
1406
+
1407
+ For more information, see:
1408
+ - [Getting Started Guide](GETTING_STARTED.md)
1409
+ - [Configuration Guide](CONFIGURATION.md)
1410
+ - [Project Documentation](../CLAUDE.md)