agent_ferrum 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: e0f2cc1c04153bbcfec18fe9d0e5c68299bf3bac2eed2cff4b7ab4cdc8cd03a5
-   data.tar.gz: e1150457b6023ebb1176a671dbab3e2058e5ec183adab75aee59f4c067acd422
+   metadata.gz: e12be8b95890fc0d62001605462364ac13691dad862b6a39f51d230985c89262
+   data.tar.gz: 361a0f0728979f221669776126d2e6f098149d246c481345631acc0e2a6577b7
  SHA512:
-   metadata.gz: 72d310191f52a7b7aa0a179058c771611b1387c8e87def6d6d2051b32256f8fc48c0f739e14c657f64fe1bd2fc6c8955768f7765a4a5841667e59aa9a55439de
-   data.tar.gz: 49e604f1dad52a548b1180e77186a54bf6b5a8306ca478cb2c5679157d8928df91595ff6087c1d4e8320e62b786120d4c060388ff9a9dd1f892bd9253066778a
+   metadata.gz: 4a3c6893977cd42e7df9f9a1bc7f3a21cf527d2b53ee69de49fd88ae61f964a3cb78b82d20961a690944a7188dad3b1bc0e150eaae854ad8184620fdba07a431
+   data.tar.gz: 015a330a02adbf51ea08b391fec09104a473e6e96d8e0bbb92e4d8842baa2c13867c4796d915a547abec31578f56737fc36d076161a8752b1ee49361f9ebc153
data/CHANGELOG.md CHANGED
@@ -5,6 +5,22 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [0.2.0] - 2026-02-09
+
+ ### Added
+
+ - **Standalone CLI** -- `agent_ferrum start/snapshot/click/stop` where each command is a separate invocation, ideal for AI agents chaining shell commands
+ - Background daemon holds the Chrome browser process, communicating over a Unix domain socket (`UNIXSocket` + JSON protocol)
+ - Zero external dependencies -- only Ruby stdlib
+ - Session management via `~/.agent_ferrum/` (PID file + socket)
+ - Full command set: `start`, `stop`, `status`, `navigate`, `snapshot`, `tree`, `markdown`, `click`, `fill`, `select`, `hover`, `type`, `screenshot`, `eval`, `back`, `forward`, `refresh`, `stealth`, `wait`
+ - Start options: `--headed`, `--stealth`, `--user-agent`, `--viewport`, `--timeout`, `--browser-path`
+
+ ### Improved
+
+ - **Click reliability** -- elements are scrolled into view (`scrollIntoView({block: 'center'})`) before clicking, fixing issues with elements covered by sticky headers or overlapping content on complex pages
+ - **Hover implementation** -- `Node#hover` now works via CDP mouse events (`scrollIntoView` + `find_position` + `mouse.move`), replacing Ferrum's `NotImplementedError`
+
  ## [0.1.0] - 2026-02-09
 
  ### Added
@@ -26,4 +42,5 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - **Timezone override** via CDP `Emulation.setTimezoneOverride`
  - **Locale support** via Chrome browser options
 
+ [0.2.0]: https://github.com/Alqemist-labs/agent_ferrum/releases/tag/v0.2.0
  [0.1.0]: https://github.com/Alqemist-labs/agent_ferrum/releases/tag/v0.1.0
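The PID-file session management listed under 0.2.0 comes down to probing whether the recorded process still exists. The following is a minimal Ruby sketch of that idea, assuming the session file stores the PID as JSON (as `write_session` in `cli.rb` does). The helper names `process_alive?` and `session_pid` and the temporary file are illustrative, not the gem's API.

```ruby
require "json"
require "tmpdir"

# Signal 0 runs the existence and permission checks without delivering a signal.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH, Errno::EPERM
  # ESRCH: no such process. EPERM: a process exists but belongs to another
  # user, so it cannot be this user's daemon.
  false
end

# Read the PID back from a JSON session file; nil when no session was recorded.
def session_pid(path)
  return nil unless File.exist?(path)

  JSON.parse(File.read(path))["pid"]
end

results = Dir.mktmpdir do |dir|
  session_file = File.join(dir, "session.json") # stand-in for ~/.agent_ferrum/session.json
  child = fork { exit!(0) }                      # short-lived stand-in for the daemon
  File.write(session_file, JSON.generate({ pid: child }))
  Process.wait(child)                            # reap it so the PID is really gone

  { self: process_alive?(Process.pid), stale: process_alive?(session_pid(session_file)) }
end
```

A stale session file (PID recorded, process gone) is the case `cleanup_stale_session` handles before `start`.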
data/README.md CHANGED
@@ -39,11 +39,11 @@ Your agent reads a compact snapshot instead of the full DOM. It clicks `@e5` to
 
  Real-world token reduction measured on live pages (February 2026):
 
- | Site | Raw HTML | Snapshot | Reduction |
- |------|----------|----------|-----------|
- | **Hacker News** | ~8,600 tokens | ~4,500 tokens | **47%** |
- | **Wikipedia** (Ruby article) | ~140,000 tokens | ~31,000 tokens | **78%** |
- | **GitHub** (repo page) | ~265,000 tokens | ~22,000 tokens | **92%** |
+ | Site                         | Raw HTML        | Snapshot       | Reduction |
+ | ---------------------------- | --------------- | -------------- | --------- |
+ | **Hacker News**              | ~8,600 tokens   | ~4,500 tokens  | **47%**   |
+ | **Wikipedia** (Ruby article) | ~140,000 tokens | ~31,000 tokens | **78%**   |
+ | **GitHub** (repo page)       | ~265,000 tokens | ~22,000 tokens | **92%**   |
 
  The heavier the page (scripts, styles, data attributes, hidden elements), the bigger the savings. Simple content-focused pages like HN see ~50% reduction. Rich web apps like GitHub or StackOverflow see 90%+.
 
@@ -58,6 +58,7 @@ The heavier the page (scripts, styles, data attributes, hidden elements), the bi
  - **Smart waiting** -- Poll for CSS/XPath/text/block conditions with configurable timeout and interval
  - **Auto-retry** -- Node actions retry automatically on transient errors (element moving, coordinates not found)
  - **AI-friendly errors** -- Error messages tell the agent what to do next ("Call browser.snapshot to refresh refs")
+ - **Standalone CLI** -- `agent_ferrum start/snapshot/click/stop` — each command is a separate invocation, ideal for AI agents chaining shell commands. Zero external dependencies
 
  ## Installation
 
@@ -104,7 +105,7 @@ browser.fill("@e2", "search query")
  # Or use CSS/XPath selectors
  browser.click("button.submit")
  browser.click("//a[@href='/about']")
- browser.fill(css: "input[name='email']", "user@example.com")
+ browser.fill({css: "input[name='email']"}, "user@example.com")
  ```
 
  ### 3. Wait for content
@@ -153,6 +154,7 @@ AgentFerrum.configure do |c|
    c.browser_path = nil # Custom Chrome/Chromium path
    c.user_agent = nil # Custom user agent
    c.locale = nil # Browser locale (e.g., "fr-FR")
+   c.timezone = nil # Timezone override (e.g., "Europe/Paris")
  end
  ```
 
@@ -308,6 +310,84 @@ browser.click("@e7") # Product link
  browser.quit
  ```
 
+ ## CLI
+
+ AgentFerrum ships with a standalone CLI where each command is a separate invocation. A background daemon holds the browser process, communicating over a Unix domain socket. Zero external dependencies -- only Ruby stdlib (`UNIXSocket`, `JSON`).
+
+ ```
+ CLI invocation                    Daemon (background)
+ ┌──────────────┐                  ┌───────────────────┐
+ │ agent_ferrum │   Unix socket    │ AgentFerrum::     │
+ │ snapshot     │ ──────────────>  │ Browser instance  │
+ │              │ <──────────────  │ (Chrome headless) │
+ └──────────────┘  JSON response   └───────────────────┘
+ ```
+
+ ### Quick start
+
+ ```bash
+ agent_ferrum start https://example.com   # Launch browser daemon + navigate
+ agent_ferrum snapshot                    # Get accessibility tree + markdown
+ agent_ferrum click @e1                   # Click an element by ref
+ agent_ferrum screenshot /tmp/page.png    # Take a screenshot
+ agent_ferrum stop                        # Stop daemon and browser
+ ```
+
+ ### All commands
+
+ ```
+ Session:
+   start [URL] [options]     Start the browser daemon
+     --headed                Visible browser (default: headless)
+     --stealth PROFILE       off / minimal / moderate / maximum
+     --user-agent UA         Custom user agent
+     --viewport WxH          Viewport size (default: 1920x1080)
+     --timeout N             Timeout in seconds (default: 30)
+     --browser-path PATH     Path to Chrome binary
+   stop                      Stop the browser daemon
+   status                    Show browser status
+
+ Navigation:
+   navigate URL              Navigate to URL (alias: go)
+   back / forward / refresh
+
+ Content:
+   snapshot                  Full snapshot (alias: snap)
+   tree                      Accessibility tree only
+   markdown                  Page markdown only (alias: md)
+   url                       Current URL
+   title                     Page title
+
+ Actions:
+   click TARGET              Click element (ref @e1 or CSS selector)
+   fill TARGET VALUE         Fill an input field
+   select TARGET VALUE       Select a dropdown option
+   hover TARGET              Hover over element
+   type TEXT                 Type text via keyboard
+
+ Utilities:
+   screenshot [PATH]         Take a screenshot
+   eval JS                   Evaluate JavaScript
+   stealth PROFILE           Change stealth profile
+   wait SELECTOR [TIMEOUT]   Wait for element (CSS or XPath)
+ ```
+
+ ### Example: AI agent loop
+
+ ```bash
+ agent_ferrum start https://shop.example.com
+ # AI reads the page
+ SNAP=$(agent_ferrum snapshot)
+ # AI decides to search
+ agent_ferrum fill @e3 "wireless headphones"
+ agent_ferrum click @e4
+ agent_ferrum wait ".results"
+ # AI reads results
+ agent_ferrum snapshot
+ # Done
+ agent_ferrum stop
+ ```
+
  ## Development
 
  ```bash
data/bin/agent_ferrum ADDED
@@ -0,0 +1,7 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require "agent_ferrum"
+ require_relative "../lib/agent_ferrum/cli"
+
+ AgentFerrum::CLI.run(ARGV)
data/lib/agent_ferrum/cli/client.rb ADDED
@@ -0,0 +1,35 @@
+ # frozen_string_literal: true
+
+ require "socket"
+ require "json"
+
+ module AgentFerrum
+   class CLI
+     class Client
+       class ConnectionError < AgentFerrum::Error; end
+       class RemoteError < AgentFerrum::Error; end
+
+       def initialize(socket_path)
+         @socket_path = socket_path
+       end
+
+       def call(method, *args)
+         socket = UNIXSocket.new(@socket_path)
+         request = { method: method, args: args }
+         socket.puts(JSON.generate(request))
+
+         raw = socket.gets("\n")
+         raise ConnectionError, "No response from daemon" unless raw
+
+         response = JSON.parse(raw)
+         raise RemoteError, response["error"] if response["error"]
+
+         response["result"]
+       rescue Errno::ENOENT, Errno::ECONNREFUSED
+         raise ConnectionError, "No browser running. Start one with: agent_ferrum start"
+       ensure
+         socket&.close
+       end
+     end
+   end
+ end
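The wire format used by `Client#call` is one JSON object per line in each direction: a request with `method` and `args`, answered by either `result` or `error`. Below is a small sketch of that framing, using `UNIXSocket.pair` so it runs without a daemon. The `answer_one_request` stand-in and its `echo` command are hypothetical and exist only for this illustration.

```ruby
require "socket"
require "json"

# Hypothetical stand-in for the daemon side: answers exactly one request.
def answer_one_request(io)
  request = JSON.parse(io.gets("\n"))
  reply =
    if request["method"] == "echo"
      { result: request["args"].join(" ") }
    else
      { error: "Unknown method: #{request["method"]}" }
    end
  io.puts(JSON.generate(reply))
end

# Same framing as Client#call: one JSON line out, one JSON line back.
def round_trip(method, *args)
  client_io, daemon_io = UNIXSocket.pair
  client_io.puts(JSON.generate({ method: method, args: args }))
  answer_one_request(daemon_io)
  JSON.parse(client_io.gets("\n"))
ensure
  client_io&.close
  daemon_io&.close
end

ok = round_trip(:echo, "hello", "world")
bad = round_trip(:nope)
```

Because symbols serialize as JSON strings, the daemon always sees `method` as a string, and all arguments arrive as JSON values rather than Ruby objects.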
data/lib/agent_ferrum/cli/server.rb ADDED
@@ -0,0 +1,84 @@
+ # frozen_string_literal: true
+
+ require "socket"
+ require "json"
+
+ module AgentFerrum
+   class CLI
+     class Server
+       def initialize(socket_path, options, ready_write_fd: nil)
+         @socket_path = socket_path
+         @options = options
+         @ready_write_fd = ready_write_fd
+         @running = false
+       end
+
+       def run
+         browser = AgentFerrum::Browser.new(**@options)
+         @service = Service.new(browser)
+         @running = true
+
+         File.delete(@socket_path) if File.exist?(@socket_path)
+         @server = UNIXServer.new(@socket_path)
+         signal_ready
+
+         while @running
+           client = @server.accept
+           handle_connection(client)
+         end
+       rescue StandardError => e
+         signal_error(e)
+         raise
+       ensure
+         @server&.close
+         File.delete(@socket_path) if @socket_path && File.exist?(@socket_path)
+       end
+
+       private
+
+       def handle_connection(client)
+         raw = client.gets("\n")
+         return unless raw
+
+         request = JSON.parse(raw)
+         method = request["method"]
+         args = request["args"] || []
+
+         if method == "stop"
+           client.puts(JSON.generate(result: "Stopped"))
+           client.close
+           @service.stop
+           @running = false
+           return
+         end
+
+         result = @service.public_send(method, *args)
+         client.puts(JSON.generate(result: result))
+       rescue NoMethodError
+         client.puts(JSON.generate(error: "Unknown method: #{method}"))
+       rescue StandardError => e
+         client.puts(JSON.generate(error: e.message))
+       ensure
+         client&.close unless client&.closed?
+       end
+
+       def signal_ready
+         return unless @ready_write_fd
+
+         io = IO.for_fd(@ready_write_fd)
+         io.write("ready")
+         io.close
+       end
+
+       def signal_error(error)
+         return unless @ready_write_fd
+
+         io = IO.for_fd(@ready_write_fd)
+         io.write("error:#{error.message}")
+         io.close
+       rescue StandardError
+         # fd may already be closed
+       end
+     end
+   end
+ end
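`Server#handle_connection` passes the method name read from the socket straight to `public_send`, which can reach any public method of the service object, including inherited ones such as `instance_eval`. As one possible hardening, the same accept-and-dispatch loop can look commands up in an explicit table instead. This is a sketch only: the `table` of lambdas and the helper names `handle` and `call_daemon` are hypothetical and not part of the gem.

```ruby
require "socket"
require "json"
require "tmpdir"

# Read one JSON line, look the command up in an explicit table, write one JSON line.
def handle(conn, table)
  request = JSON.parse(conn.gets("\n"))
  handler = table[request["method"]]
  reply =
    if handler
      { result: handler.call(*request.fetch("args", [])) }
    else
      { error: "Unknown method: #{request["method"]}" }
    end
  conn.puts(JSON.generate(reply))
rescue StandardError => e
  conn.puts(JSON.generate({ error: e.message }))
ensure
  conn.close
end

# One request per connection, mirroring Client#call.
def call_daemon(path, method, *args)
  UNIXSocket.open(path) do |sock|
    sock.puts(JSON.generate({ method: method, args: args }))
    JSON.parse(sock.gets("\n"))
  end
end

# Hypothetical command table standing in for CLI::Service.
table = {
  "upcase" => ->(text) { text.upcase },
  "title" => -> { "Example Domain" }
}

replies = Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.sock")
  server = UNIXServer.new(path)
  worker = Thread.new { 2.times { handle(server.accept, table) } }
  out = [call_daemon(path, "upcase", "hi"), call_daemon(path, "instance_eval", "1 + 1")]
  worker.join
  server.close
  out
end
```

With a table, an unknown name is rejected before anything is invoked, so a `NoMethodError` raised inside a command is reported with its own message instead of as "Unknown method".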
data/lib/agent_ferrum/cli/service.rb ADDED
@@ -0,0 +1,111 @@
+ # frozen_string_literal: true
+
+ module AgentFerrum
+   class CLI
+     class Service
+       def initialize(browser)
+         @browser = browser
+       end
+
+       def navigate(url)
+         url = "https://#{url}" unless url.match?(%r{\Ahttps?://})
+         @browser.navigate(url)
+         "Navigated to #{@browser.current_url}"
+       end
+
+       def snapshot
+         @browser.snapshot.to_s
+       end
+
+       def tree
+         @browser.accessibility_tree.to_s
+       end
+
+       def markdown
+         @browser.page_markdown
+       end
+
+       def click(target)
+         url_before = @browser.current_url
+         @browser.click(target)
+         @browser.wait_for_navigation(timeout: 3) rescue nil
+         url_after = @browser.current_url
+         if url_after != url_before
+           "Clicked #{target} → #{url_after}"
+         else
+           "Clicked #{target}"
+         end
+       end
+
+       def fill(target, value)
+         @browser.fill(target, value)
+         "Filled #{target}"
+       end
+
+       def select_option(target, value)
+         @browser.select(target, value)
+         "Selected #{value} in #{target}"
+       end
+
+       def hover(target)
+         @browser.hover(target)
+         "Hovering #{target}"
+       end
+
+       def type_text(text)
+         @browser.type_text(text)
+         "Typed #{text}"
+       end
+
+       def current_url
+         @browser.current_url
+       end
+
+       def title
+         @browser.title
+       end
+
+       def eval_js(expression)
+         @browser.evaluate(expression).inspect
+       end
+
+       def screenshot(path = nil)
+         path ||= "screenshot_#{Time.now.strftime('%Y%m%d_%H%M%S')}.png"
+         @browser.screenshot(path: path)
+         "Screenshot saved to #{path}"
+       end
+
+       def back
+         @browser.back
+         "Back to #{@browser.current_url}"
+       end
+
+       def forward
+         @browser.forward
+         "Forward to #{@browser.current_url}"
+       end
+
+       def refresh
+         @browser.refresh
+         "Refreshed"
+       end
+
+       def stealth(profile)
+         @browser.stealth(profile.to_sym)
+         "Stealth set to #{profile}"
+       end
+
+       def wait(selector, timeout = nil)
+         opts = selector.start_with?("/") ? { xpath: selector } : { css: selector }
+         opts[:timeout] = timeout.to_i if timeout
+         @browser.wait_for(**opts)
+         "Element #{selector} found"
+       end
+
+       def stop
+         @browser.quit
+         "Stopped"
+       end
+     end
+   end
+ end
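Two pieces of `Service` are pure string handling and can be checked without a browser: the scheme defaulting in `navigate` and the CSS/XPath choice in `wait`. The sketch below lifts that logic into standalone helpers. The names `normalize_url` and `wait_options` are illustrative; the expressions are the ones shown in the diff.

```ruby
# Prefix https:// unless the argument already carries an http(s) scheme.
def normalize_url(url)
  url.match?(%r{\Ahttps?://}) ? url : "https://#{url}"
end

# A selector starting with "/" is treated as XPath, anything else as CSS.
# The timeout arrives from the command line as a string, hence to_i.
def wait_options(selector, timeout = nil)
  opts = selector.start_with?("/") ? { xpath: selector } : { css: selector }
  opts[:timeout] = timeout.to_i if timeout
  opts
end

bare      = normalize_url("example.com")
explicit  = normalize_url("http://example.com/a")
css_opts  = wait_options(".results")
xpath_opt = wait_options("//div[@id='main']", "5")
grouped   = wait_options("(//a)[1]")
```

One consequence of the prefix test is that an XPath expression beginning with a parenthesis, such as `(//a)[1]`, is classified as CSS.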
data/lib/agent_ferrum/cli.rb ADDED
@@ -0,0 +1,292 @@
+ # frozen_string_literal: true
+
+ require "optparse"
+ require "json"
+ require "socket"
+ require "fileutils"
+
+ require_relative "cli/service"
+ require_relative "cli/server"
+ require_relative "cli/client"
+
+ module AgentFerrum
+   class CLI
+     SESSION_DIR = File.expand_path("~/.agent_ferrum")
+     SESSION_FILE = File.join(SESSION_DIR, "session.json")
+     SOCKET_PATH = File.join(SESSION_DIR, "agent_ferrum.sock")
+
+     def self.run(argv)
+       new(argv).execute
+     end
+
+     def initialize(argv)
+       @argv = argv.dup
+       @command = @argv.shift
+     end
+
+     def execute
+       case @command
+       when "start" then cmd_start
+       when "stop" then cmd_stop
+       when "status" then cmd_status
+       when "version" then puts "agent_ferrum #{AgentFerrum::VERSION}"
+       when "help", nil, "-h", "--help" then print_help
+       else cmd_remote
+       end
+     end
+
+     private
+
+     # --- Commands ---
+
+     def cmd_start
+       cleanup_stale_session
+
+       if session_alive?
+         $stderr.puts "Browser already running (pid: #{read_session["pid"]}). Stop it first with: agent_ferrum stop"
+         exit 1
+       end
+
+       options = parse_start_options
+
+       ready_read, ready_write = IO.pipe
+
+       pid = fork do
+         ready_read.close
+         $stdin.reopen("/dev/null")
+         $stdout.reopen("/dev/null")
+         $stderr.reopen("/dev/null")
+         Process.setsid
+         Server.new(SOCKET_PATH, options, ready_write_fd: ready_write.fileno).run
+       end
+
+       ready_write.close
+       Process.detach(pid)
+       write_session(pid: pid)
+       wait_for_daemon(ready_read)
+
+       url = @argv.shift
+       if url
+         client = connect
+         puts client.call(:navigate, url)
+       end
+
+       puts "Browser started (pid: #{pid})"
+     end
+
+     def cmd_stop
+       unless session_alive?
+         $stderr.puts "No browser running."
+         cleanup_session
+         exit 1
+       end
+
+       begin
+         client = connect
+         client.call(:stop)
+       rescue Client::ConnectionError
+         # Daemon already gone, kill the process
+         session = read_session
+         Process.kill("TERM", session["pid"]) if session
+       rescue StandardError
+         # ignore
+       end
+
+       cleanup_session
+       puts "Browser stopped."
+     end
+
+     def cmd_status
+       if session_alive?
+         session = read_session
+         client = connect
+         url = client.call(:current_url) rescue "unknown"
+         puts "Browser running (pid: #{session["pid"]})"
+         puts "URL: #{url}"
+       else
+         cleanup_session
+         puts "No browser running."
+       end
+     end
+
+     def cmd_remote
+       client = connect
+
+       result = case @command
+                when "navigate", "go" then client.call(:navigate, @argv[0])
+                when "snapshot", "snap" then client.call(:snapshot)
+                when "tree" then client.call(:tree)
+                when "markdown", "md" then client.call(:markdown)
+                when "click" then client.call(:click, @argv[0])
+                when "fill" then client.call(:fill, @argv[0], @argv[1..].join(" "))
+                when "select" then client.call(:select_option, @argv[0], @argv[1..].join(" "))
+                when "hover" then client.call(:hover, @argv[0])
+                when "type" then client.call(:type_text, @argv.join(" "))
+                when "url" then client.call(:current_url)
+                when "title" then client.call(:title)
+                when "eval" then client.call(:eval_js, @argv.join(" "))
+                when "screenshot" then client.call(:screenshot, @argv[0])
+                when "back" then client.call(:back)
+                when "forward" then client.call(:forward)
+                when "refresh" then client.call(:refresh)
+                when "stealth" then client.call(:stealth, @argv[0])
+                when "wait" then client.call(:wait, @argv[0], @argv[1])
+                else
+                  $stderr.puts "Unknown command: #{@command}. Run 'agent_ferrum help' for usage."
+                  exit 1
+                end
+
+       puts result
+     rescue Client::ConnectionError => e
+       $stderr.puts e.message
+       exit 1
+     rescue Client::RemoteError => e
+       $stderr.puts "Error: #{e.message}"
+       exit 1
+     end
+
+     # --- Session management ---
+
+     def write_session(pid:)
+       FileUtils.mkdir_p(SESSION_DIR)
+       File.write(SESSION_FILE, JSON.generate(pid: pid, socket: SOCKET_PATH))
+     end
+
+     def read_session
+       return nil unless File.exist?(SESSION_FILE)
+
+       JSON.parse(File.read(SESSION_FILE))
+     end
+
+     def cleanup_session
+       File.delete(SESSION_FILE) if File.exist?(SESSION_FILE)
+       File.delete(SOCKET_PATH) if File.exist?(SOCKET_PATH)
+     end
+
+     def session_alive?
+       session = read_session
+       return false unless session
+
+       Process.kill(0, session["pid"])
+       true
+     rescue Errno::ESRCH, Errno::EPERM
+       false
+     end
+
+     def cleanup_stale_session
+       return unless File.exist?(SESSION_FILE) && !session_alive?
+
+       cleanup_session
+     end
+
+     def connect
+       Client.new(SOCKET_PATH)
+     end
+
+     def wait_for_daemon(ready_read, timeout: 30)
+       result = IO.select([ready_read], nil, nil, timeout)
+
+       unless result
+         $stderr.puts "Timeout waiting for browser daemon to start."
+         exit 1
+       end
+
+       message = ready_read.read
+       ready_read.close
+
+       if message.start_with?("error:")
+         $stderr.puts "Failed to start browser: #{message.sub('error:', '')}"
+         exit 1
+       end
+     end
+
+     # --- Option parsing ---
+
+     def parse_start_options
+       options = {}
+
+       parser = OptionParser.new do |opts|
+         opts.banner = "Usage: agent_ferrum start [URL] [options]"
+
+         opts.on("--headed", "Run in headed mode (visible browser)") do
+           options[:headless] = false
+         end
+
+         opts.on("--stealth PROFILE", "Stealth profile: off, minimal, moderate, maximum") do |v|
+           options[:stealth] = v.to_sym
+         end
+
+         opts.on("--user-agent UA", "Custom user agent string") do |v|
+           options[:user_agent] = v
+         end
+
+         opts.on("--viewport WxH", "Viewport size (e.g. 1920x1080)") do |v|
+           w, h = v.split("x").map(&:to_i)
+           options[:viewport] = [w, h]
+         end
+
+         opts.on("--timeout N", Integer, "Timeout in seconds") do |v|
+           options[:timeout] = v
+         end
+
+         opts.on("--browser-path PATH", "Path to Chrome binary") do |v|
+           options[:browser_path] = v
+         end
+       end
+
+       parser.parse!(@argv)
+       options
+     end
+
+     # --- Help ---
+
+     def print_help
+       puts <<~HELP
+         agent_ferrum — Browser automation CLI for AI agents
+
+         Usage: agent_ferrum <command> [args] [options]
+
+         Session:
+           start [URL] [options]     Start the browser daemon
+             --headed                Visible browser (default: headless)
+             --stealth PROFILE       off / minimal / moderate / maximum
+             --user-agent UA         Custom user agent
+             --viewport WxH          Viewport size (default: 1920x1080)
+             --timeout N             Timeout in seconds (default: 30)
+             --browser-path PATH     Path to Chrome binary
+           stop                      Stop the browser daemon
+           status                    Show browser status
+
+         Navigation:
+           navigate URL              Navigate to URL (alias: go)
+           back                      Go back
+           forward                   Go forward
+           refresh                   Reload page
+
+         Content:
+           snapshot                  Full snapshot: accessibility tree + markdown (alias: snap)
+           tree                      Accessibility tree only
+           markdown                  Page markdown only (alias: md)
+           url                       Current URL
+           title                     Page title
+
+         Actions:
+           click TARGET              Click element (ref @e1 or CSS selector)
+           fill TARGET VALUE         Fill an input field
+           select TARGET VALUE       Select a dropdown option
+           hover TARGET              Hover over element
+           type TEXT                 Type text via keyboard
+
+         Utilities:
+           screenshot [PATH]         Take a screenshot
+           eval JS                   Evaluate JavaScript
+           stealth PROFILE           Change stealth profile
+           wait SELECTOR [TIMEOUT]   Wait for element (CSS or XPath)
+
+         Other:
+           help                      Show this help
+           version                   Show version
+       HELP
+     end
+   end
+ end
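`cmd_start` and `Server#signal_ready` together form a readiness handshake: the parent forks the daemon, keeps the read end of a pipe, and blocks until the child writes `ready` or `error:<message>` and closes its end. A reduced sketch of that pattern follows, with a block standing in for browser startup. The `start_and_wait` helper and its return values are hypothetical, chosen for this example.

```ruby
# Fork a child, run the given block in it, and report how startup went.
def start_and_wait(timeout: 5)
  ready_read, ready_write = IO.pipe

  pid = fork do
    ready_read.close
    begin
      yield                                  # stand-in for launching Chrome and binding the socket
      ready_write.write("ready")
    rescue StandardError => e
      ready_write.write("error:#{e.message}")
    ensure
      ready_write.close
    end
    exit!(0)                                 # skip at_exit hooks inherited from the parent
  end

  ready_write.close                          # the parent's copy must be closed for read to reach EOF
  Process.detach(pid)                        # reap the child in the background

  return :timeout unless IO.select([ready_read], nil, nil, timeout)

  message = ready_read.read
  message.start_with?("error:") ? [:error, message.delete_prefix("error:")] : :ready
ensure
  ready_read&.close
end

ok = start_and_wait { :started }
failed = start_and_wait { raise "Chrome not found" }
```

The pipe makes startup failures visible to the invoking shell even though the daemon's own standard streams are redirected to `/dev/null`.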
@@ -9,7 +9,10 @@ module AgentFerrum
      end
 
      def click
-       with_retry { @node.click }
+       with_retry do
+         @node.evaluate("this.scrollIntoView({block: 'center', inline: 'center'})")
+         @node.click
+       end
      end
 
      def fill(value)
@@ -24,7 +27,11 @@ module AgentFerrum
      end
 
      def hover
-       with_retry { @node.hover }
+       with_retry do
+         @node.evaluate("this.scrollIntoView({block: 'center', inline: 'center'})")
+         x, y = @node.find_position
+         @node.page.mouse.move(x: x, y: y)
+       end
      end
 
      def focus
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module AgentFerrum
-   VERSION = "0.1.0"
+   VERSION = "0.2.0"
  end
data/lib/agent_ferrum.rb CHANGED
@@ -4,6 +4,8 @@ require "zeitwerk"
 
  loader = Zeitwerk::Loader.for_gem
  loader.ignore("#{__dir__}/agent_ferrum/errors.rb")
+ loader.ignore("#{__dir__}/agent_ferrum/cli.rb")
+ loader.ignore("#{__dir__}/agent_ferrum/cli")
  loader.setup
 
  require_relative "agent_ferrum/errors"
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: agent_ferrum
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  platform: ruby
  authors:
  - Florian
@@ -69,16 +69,22 @@ description: 'Wraps Ferrum (Chrome headless via CDP) with AI-optimized content e
    accessibility tree with refs, compact markdown snapshots, and stealth mode.'
  email:
  - florian@alqemist.com
- executables: []
+ executables:
+ - agent_ferrum
  extensions: []
  extra_rdoc_files: []
  files:
  - CHANGELOG.md
  - LICENSE.txt
  - README.md
+ - bin/agent_ferrum
  - lib/agent_ferrum.rb
  - lib/agent_ferrum/browser.rb
  - lib/agent_ferrum/browser/target_resolution.rb
+ - lib/agent_ferrum/cli.rb
+ - lib/agent_ferrum/cli/client.rb
+ - lib/agent_ferrum/cli/server.rb
+ - lib/agent_ferrum/cli/service.rb
  - lib/agent_ferrum/configuration.rb
  - lib/agent_ferrum/content/accessibility_tree.rb
  - lib/agent_ferrum/content/markdown_converter.rb