lightpanda 0.0.1 → 0.1.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 585fa864b9a0946daa70334644abf23b36a940fc6a283f5a839e8a3f6febe6c0
- data.tar.gz: ec6ab4fc8101a82150d9c3861d883825966b280e56d9ab2ca076172ca6676335
+ metadata.gz: 3e916e825715e3501b088ac44898d31bab07985528782eec4da975fd91bfe009
+ data.tar.gz: e290aabcd4c89547255ca9e6090eb4eb5bc22dc4551d3811e9bf7f5f03a3bbe4
  SHA512:
- metadata.gz: 3d09767f01eaf1f49eb91a5e20fc3b2e79f7569fec2b9d1c4d95eaa3b785f6f8b12a2403e70a48d01ad24f3ac8d27df84ab86519bd7c1ea23ac141c0742c0b2c
- data.tar.gz: b8515873cfc8a1beb3f4942a4f8b774892dff3fce1fa4d4c8463be42558911480289a6d3e9329da098331e24aaa715dcaf4d990a323e371a69b3e577cb95d9c5
+ metadata.gz: e98478d21b3044f719470994b8ef2d90ca484e26173097829ebc196612e9bad1e8e0da1764abef32d605cf9d38686cc267e5b3a4571ec4cbb23f58189a523fe0
+ data.tar.gz: ecf37c64b4349e8d9a2013979c3bfac94c40952afa5e835711ddbfdf49e8035677cb7b863bd9a2922e9a8d595f0c4cfd3299dfa1996ac115d77eefc714f03fd1
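The hunk above records new SHA256 and SHA512 digests for `metadata.gz` and `data.tar.gz`. As a minimal sketch, assuming the two files have already been extracted from the `.gem` archive, digests like these can be compared locally with Ruby's standard `digest` library. `digests_match?` is a hypothetical helper name, not part of RubyGems or this gem.

```ruby
require "digest"

# Hypothetical helper: compares raw bytes against expected hex digests.
# Returns true only when both the SHA256 and the SHA512 digest match.
def digests_match?(bytes, sha256:, sha512:)
  Digest::SHA256.hexdigest(bytes) == sha256 &&
    Digest::SHA512.hexdigest(bytes) == sha512
end

# Possible usage (assumes data.tar.gz sits in the current directory):
#   digests_match?(
#     File.binread("data.tar.gz"),
#     sha256: "<SHA256 value from checksums.yaml>",
#     sha512: "<SHA512 value from checksums.yaml>"
#   )
```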
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2025 Marco Roth
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
data/README.md CHANGED
@@ -1,39 +1,213 @@
- # Lightpanda
+ # Lightpanda for Ruby
 
- TODO: Delete this and the text below, and describe your gem
+ Ruby client for the [Lightpanda](https://lightpanda.io) [open-source](https://github.com/lightpanda-io/browser) headless browser via CDP (Chrome DevTools Protocol).
 
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/lightpanda`. To experiment with that code, run `bin/console` for an interactive prompt.
+ Lightpanda is a fast, lightweight headless browser built for web automation, AI agents, and scraping. This gem provides a high-level Ruby API to control Lightpanda, similar to [Ferrum](https://github.com/rubycdp/ferrum) for Chrome.
 
- ## Installation
+ > [!NOTE]
+ > This gem is experimental. [Lightpanda itself](https://github.com/lightpanda-io/browser?tab=readme-ov-file#status) is in Beta and currently a work in progress. Stability and coverage are improving, but you may still encounter errors or crashes. This gem's API will evolve as Lightpanda matures. See [Limitations](#limitations).
 
- TODO: Replace `UPDATE_WITH_YOUR_GEM_NAME_IMMEDIATELY_AFTER_RELEASE_TO_RUBYGEMS_ORG` with your gem name right after releasing it to RubyGems.org. Please do not do it earlier due to security reasons. Alternatively, replace this section with instructions to install your gem from git if you don't plan to release to RubyGems.org.
 
- Install the gem and add to the application's Gemfile by executing:
+ ## Features
 
- ```bash
- bundle add UPDATE_WITH_YOUR_GEM_NAME_IMMEDIATELY_AFTER_RELEASE_TO_RUBYGEMS_ORG
+ - High-level browser automation API
+ - CDP (Chrome DevTools Protocol) client
+ - Capybara driver included
+ - Auto-downloads Lightpanda binary if not found
+ - Ruby 3.2+
+
+ ## Installation
+
+ Add to your Gemfile:
+
+ ```ruby
+ gem "lightpanda"
  ```
 
- If bundler is not being used to manage dependencies, install the gem by executing:
+ Or install directly:
 
  ```bash
- gem install UPDATE_WITH_YOUR_GEM_NAME_IMMEDIATELY_AFTER_RELEASE_TO_RUBYGEMS_ORG
+ gem install lightpanda
  ```
 
+ The Lightpanda binary will be automatically downloaded on first use if not found in your `PATH`.
+
  ## Usage
 
- TODO: Write usage instructions here
+ ### Basic Browser Control
 
- ## Development
+ **Create a browser instance**
+
+ ```ruby
+ require "lightpanda"
+
+ browser = Lightpanda::Browser.new
+ ```
+
+ **Navigate to a page**
+
+ ```ruby
+ browser.go_to("https://example.com")
+ ```
+
+ **Get page info**
+
+ ```ruby
+ browser.current_url # => "https://example.com/"
+ browser.title # => "Example Domain"
+ browser.body # => "<html>...</html>"
+ ```
+
+ **Evaluate JavaScript**
+
+ ```ruby
+ browser.evaluate("1 + 1") # => 2
+ browser.evaluate("document.querySelector('h1').textContent") # => "Example Domain"
+ ```
+
+ **Execute JavaScript (no return value)**
+
+ ```ruby
+ browser.execute("console.log('Hello from Lightpanda!')")
+ ```
+
+ **Send raw CDP commands**
+
+ ```ruby
+ browser.command("Browser.getVersion")
+ # => {"protocolVersion"=>"1.3", "product"=>"Chrome/124.0.6367.29", ...}
+ ```
+
+ **Clean up**
+
+ ```ruby
+ browser.quit
+ ```
+
+ ### Configuration Options
+
+ ```ruby
+ browser = Lightpanda.new(
+   host: "127.0.0.1", # CDP server host
+   port: 9222, # CDP server port
+   timeout: 5, # Command timeout in seconds
+   process_timeout: 10, # Process startup timeout
+   window_size: [1024, 768],
+   browser_path: "/path/to/lightpanda" # Custom binary path
+ )
+ ```
+
+ ### Binary Management
+
+ **Get binary path (downloads if needed)**
+
+ ```ruby
+ Lightpanda::Binary.path # => "/Users/you/.cache/lightpanda/lightpanda"
+ ```
 
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+ **Get version**
 
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+ ```ruby
+ Lightpanda::Binary.version # => "7c976209"
+ ```
+
+ **Run arbitrary commands**
+
+ ```ruby
+ result = Lightpanda::Binary.run("--help")
+ result.stdout # => ""
+ result.stderr # => "usage: lightpanda command [options] [URL]..."
+ result.success? # => false (help exits with 1)
+ result.output # => returns stderr if stdout empty
+ ```
+
+ **Fetch a URL directly (no browser instance needed)**
+
+ ```ruby
+ html = Lightpanda::Binary.fetch("https://example.com")
+ # => "<!DOCTYPE html><html>..."
+ ```
+
+ ### Global Configuration
+
+ ```ruby
+ Lightpanda.configure do |config|
+   config.binary_path = "/path/to/lightpanda"
+ end
+ ```
+
+ If not explicitly set, the binary path is auto-discovered on first use and cached for subsequent calls. The discovery order is:
+
+ 1. `LIGHTPANDA_PATH` environment variable
+ 2. `lightpanda` executable in your `PATH`
+ 3. Auto-download to `~/.cache/lightpanda/lightpanda` (or `$XDG_CACHE_HOME/lightpanda/lightpanda`)
+
+ ### Capybara Integration
+
+ **Basic usage**
+
+ ```ruby
+ require "lightpanda/capybara"
+
+ Capybara.default_driver = :lightpanda
+
+ visit "https://example.com"
+ find("h1").text # => "Example Domain"
+ all("p").count # => 2
+ ```
 
- ## Contributing
+ **Configuration**
 
- Bug reports and pull requests are welcome on GitHub at https://github.com/marcoroth/lightpanda. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/marcoroth/lightpanda/blob/main/CODE_OF_CONDUCT.md).
+ ```ruby
+ Lightpanda::Capybara.configure do |config|
+   config.host = "127.0.0.1"
+   config.port = 9222
+   config.timeout = 5
+ end
+ ```
+
+ **In tests**
+
+ ```ruby
+ class FeatureTest < Minitest::Spec
+   include Capybara::DSL
+
+   def setup
+     Capybara.default_driver = :lightpanda
+   end
+
+   def teardown
+     Capybara.reset_sessions!
+   end
+
+   it "shows the homepage" do
+     visit "https://example.com"
+     assert find("h1").text == "Example Domain"
+   end
+ end
+ ```
+
+ ## Environment Variables
+
+ - `LIGHTPANDA_PATH` - Custom path to Lightpanda binary
+ - `LIGHTPANDA_DEFAULT_TIMEOUT` - Default command timeout (default: 5)
+ - `LIGHTPANDA_PROCESS_TIMEOUT` - Process startup timeout (default: 10)
+
+ ## Limitations
+
+ Lightpanda is a lightweight browser with some limitations compared to Chrome:
+
+ - Single browser context only (no incognito/multi-context)
+ - No XPath support (`XPathResult` not implemented)
+ - Limited CDP command coverage
+
+ ## Development
+
+ ```bash
+ bundle install
+ bundle exec minitest
+ ```
 
- ## Code of Conduct
+ ## License
 
- Everyone interacting in the Lightpanda project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/marcoroth/lightpanda/blob/main/CODE_OF_CONDUCT.md).
+ MIT License. See [LICENSE.txt](LICENSE.txt).
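The new README describes a three-step discovery order for the binary: `LIGHTPANDA_PATH`, then `PATH`, then the cache directory. The sketch below only lists the candidate locations in that order, so the lookup can be inspected without triggering a download. `lightpanda_candidates` is a hypothetical helper, not part of the gem, and it assumes the environment is passed in as a Hash-like object.

```ruby
# Hypothetical helper: returns candidate binary locations in the documented
# lookup order. `env` defaults to ENV; pass a plain Hash to experiment.
def lightpanda_candidates(env = ENV)
  candidates = []

  # 1. Explicit override via environment variable.
  candidates << env["LIGHTPANDA_PATH"] if env["LIGHTPANDA_PATH"]

  # 2. A `lightpanda` entry in each PATH directory.
  env["PATH"].to_s.split(File::PATH_SEPARATOR).each do |dir|
    candidates << File.join(dir, "lightpanda")
  end

  # 3. The cache location used for the auto-downloaded binary.
  cache_dir = env["XDG_CACHE_HOME"] || File.expand_path("~/.cache")
  candidates << File.join(cache_dir, "lightpanda", "lightpanda")

  candidates
end
```

Calling `lightpanda_candidates.find { |path| File.executable?(path) }` would then show which existing file, if any, comes first in that order.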
data/exe/lightpanda ADDED
@@ -0,0 +1,8 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require_relative "../lib/lightpanda"
+
+ args = ARGV.empty? ? ["help"] : ARGV
+
+ Lightpanda::Binary.exec(*args)
@@ -0,0 +1,180 @@
+ # frozen_string_literal: true
+
+ require "fileutils"
+ require "net/http"
+ require "open3"
+ require "rbconfig"
+ require "uri"
+
+ module Lightpanda
+   class Binary
+     Result = Struct.new(:stdout, :stderr, :status, keyword_init: true) do
+       def success?
+         status.success?
+       end
+
+       def exit_code
+         status.exitstatus
+       end
+
+       def output
+         stdout.empty? ? stderr : stdout
+       end
+     end
+
+     GITHUB_RELEASE_URL = "https://github.com/lightpanda-io/browser/releases/download/nightly"
+
+     PLATFORMS = {
+       ["x86_64", "linux"] => "lightpanda-x86_64-linux",
+       ["aarch64", "darwin"] => "lightpanda-aarch64-macos",
+       ["arm64", "darwin"] => "lightpanda-aarch64-macos",
+     }.freeze
+
+     class << self
+       def path
+         Lightpanda.configuration.binary_path ||= find_or_download
+       end
+
+       def find_or_download
+         find || download
+       end
+
+       def run(*)
+         stdout, stderr, status = Open3.capture3(path, *)
+
+         Result.new(stdout: stdout, stderr: stderr, status: status)
+       rescue Errno::ENOENT
+         raise BinaryNotFoundError, "Lightpanda binary not found"
+       end
+
+       def exec(*)
+         Kernel.exec(path, *)
+       end
+
+       def fetch(url)
+         result = run("fetch", "--dump", url)
+         raise BinaryError, result.stderr unless result.success?
+
+         result.stdout
+       end
+
+       def version
+         result = run("version")
+         result.output.strip
+       end
+
+       def find
+         env_path = ENV.fetch("LIGHTPANDA_PATH", nil)
+         return env_path if env_path && File.executable?(env_path)
+
+         path_binary = find_in_path
+         return path_binary if path_binary
+
+         default_path = default_binary_path
+         return default_path if File.executable?(default_path)
+
+         nil
+       end
+
+       def download
+         binary_name = platform_binary
+         url = "#{GITHUB_RELEASE_URL}/#{binary_name}"
+         destination = default_binary_path
+
+         FileUtils.mkdir_p(File.dirname(destination))
+
+         download_file(url, destination)
+         FileUtils.chmod(0o755, destination)
+
+         destination
+       end
+
+       def platform_binary
+         arch = normalize_arch(RbConfig::CONFIG["host_cpu"])
+         os = normalize_os(RbConfig::CONFIG["host_os"])
+
+         PLATFORMS[[arch, os]] || raise(UnsupportedPlatformError, "Unsupported platform: #{arch}-#{os}")
+       end
+
+       def default_binary_path
+         cache_dir = ENV.fetch("XDG_CACHE_HOME") { File.expand_path("~/.cache") }
+
+         File.join(cache_dir, "lightpanda", "lightpanda")
+       end
+
+       private
+
+       def find_in_path
+         ENV["PATH"].to_s.split(File::PATH_SEPARATOR).each do |dir|
+           path = File.join(dir, "lightpanda")
+
+           return path if File.executable?(path) && native_binary?(path)
+         end
+
+         nil
+       end
+
+       def native_binary?(path)
+         header = File.binread(path, 4)
+
+         return true if elf_binary?(header)
+         return true if mach_o_binary?(header)
+
+         false
+       rescue StandardError
+         false
+       end
+
+       def elf_binary?(header)
+         header.start_with?("\x7FELF")
+       end
+
+       def mach_o_binary?(header)
+         header.start_with?("\xCF\xFA\xED\xFE")
+       end
+
+       def normalize_arch(arch)
+         case arch
+         when /x86_64|amd64/i then "x86_64"
+         when /aarch64|arm64/i then "aarch64"
+         else arch
+         end
+       end
+
+       def normalize_os(os)
+         case os
+         when /darwin|mac/i then "darwin"
+         when /linux/i then "linux"
+         else os
+         end
+       end
+
+       def download_file(url, destination)
+         uri = URI.parse(url)
+
+         follow_redirects(uri, destination)
+       end
+
+       def follow_redirects(uri, destination, limit = 10)
+         raise BinaryNotFoundError, "Too many redirects" if limit.zero?
+
+         Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
+           request = Net::HTTP::Get.new(uri)
+
+           http.request(request) do |response|
+             case response
+             when Net::HTTPSuccess
+               File.open(destination, "wb") do |file|
+                 response.read_body { |chunk| file.write(chunk) }
+               end
+             when Net::HTTPRedirection
+               follow_redirects(URI.parse(response["location"]), destination, limit - 1)
+             else
+               raise BinaryNotFoundError, "Failed to download binary: #{response.code} #{response.message}"
+             end
+           end
+         end
+       end
+     end
+   end
+ end
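The `native_binary?` check in the hunk above tells compiled executables apart from wrapper scripts by reading the first four bytes of the file. A minimal standalone sketch of the same idea follows. `native_header?` and the two constants are hypothetical names, not part of the gem. The sketch forces both sides of the comparison to binary encoding with `String#b`, because `String#start_with?` can raise `Encoding::CompatibilityError` when a binary string and a UTF-8 literal both contain non-ASCII bytes.

```ruby
# Magic numbers held as binary (ASCII-8BIT) strings.
ELF_MAGIC = "\x7FELF".b.freeze
MACH_O_64_LE_MAGIC = "\xCF\xFA\xED\xFE".b.freeze # 64-bit Mach-O, little-endian

# Hypothetical helper: true when the header starts with an ELF or
# 64-bit little-endian Mach-O magic number.
def native_header?(header)
  bytes = header.to_s.b
  bytes.start_with?(ELF_MAGIC) || bytes.start_with?(MACH_O_64_LE_MAGIC)
end
```

A file on disk could be checked with `native_header?(File.binread(path, 4))`. A script starting with `#!` would return `false`.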
@@ -0,0 +1,212 @@
+ # frozen_string_literal: true
+
+ require "forwardable"
+
+ module Lightpanda
+   class Browser
+     extend Forwardable
+
+     attr_reader :options, :process, :client, :target_id, :session_id
+
+     delegate [:on, :off] => :client
+
+     def initialize(options = {})
+       @options = Options.new(options)
+       @process = nil
+       @client = nil
+       @target_id = nil
+       @session_id = nil
+       @started = false
+       @page_events_enabled = false
+
+       start
+     end
+
+     def start
+       return if @started
+
+       if @options.ws_url?
+         @client = Client.new(@options.ws_url, @options)
+       else
+         @process = Process.new(@options)
+         @process.start
+         @client = Client.new(@process.ws_url, @options)
+       end
+
+       create_page
+
+       @started = true
+     end
+
+     def create_page
+       result = @client.command("Target.createTarget", { url: "about:blank" })
+       @target_id = result["targetId"]
+
+       attach_result = @client.command("Target.attachToTarget", { targetId: @target_id, flatten: true })
+       @session_id = attach_result["sessionId"]
+     end
+
+     def restart
+       quit
+       start
+     end
+
+     def quit
+       @client&.close
+       @process&.stop
+       @client = nil
+       @process = nil
+       @started = false
+     end
+
+     def command(method, **params)
+       @client.command(method, params)
+     end
+
+     def page_command(method, **params)
+       @client.command(method, params, session_id: @session_id)
+     end
+
+     def go_to(url, wait: true)
+       enable_page_events
+
+       if wait
+         loaded = Concurrent::Event.new
+
+         handler = proc { loaded.set }
+         @client.on("Page.loadEventFired", &handler)
+
+         result = page_command("Page.navigate", url: url)
+
+         loaded.wait(@options.timeout)
+
+         @client.off("Page.loadEventFired", handler)
+
+         result
+       else
+         page_command("Page.navigate", url: url)
+       end
+     end
+     alias goto go_to
+
+     def enable_page_events
+       return if @page_events_enabled
+
+       page_command("Page.enable")
+       @page_events_enabled = true
+     end
+
+     def back
+       page_command("Page.navigateToHistoryEntry", entryId: current_entry_id - 1)
+     end
+
+     def forward
+       page_command("Page.navigateToHistoryEntry", entryId: current_entry_id + 1)
+     end
+
+     def refresh
+       page_command("Page.reload")
+     end
+     alias reload refresh
+
+     def current_url
+       evaluate("window.location.href")
+     end
+
+     def title
+       evaluate("document.title")
+     end
+
+     def body
+       evaluate("document.documentElement.outerHTML")
+     end
+     alias html body
+
+     def evaluate(expression)
+       response = page_command("Runtime.evaluate", expression: expression, returnByValue: true, awaitPromise: true)
+
+       handle_evaluate_response(response)
+     end
+
+     def execute(expression)
+       page_command("Runtime.evaluate", expression: expression, returnByValue: false, awaitPromise: false)
+       nil
+     end
+
+     def css(selector)
+       node_ids = page_command("DOM.querySelectorAll", nodeId: document_node_id, selector: selector)
+       node_ids["nodeIds"] || []
+     end
+
+     def at_css(selector)
+       result = page_command("DOM.querySelector", nodeId: document_node_id, selector: selector)
+
+       result["nodeId"]
+     end
+
+     def screenshot(path: nil, format: :png, quality: nil, full_page: false, encoding: :binary)
+       params = { format: format.to_s }
+       params[:quality] = quality if quality && format == :jpeg
+
+       if full_page
+         metrics = page_command("Page.getLayoutMetrics")
+         content_size = metrics["contentSize"]
+
+         params[:clip] = {
+           x: 0,
+           y: 0,
+           width: content_size["width"],
+           height: content_size["height"],
+           scale: 1,
+         }
+       end
+
+       result = page_command("Page.captureScreenshot", **params)
+       data = result["data"]
+
+       if encoding == :base64
+         data
+       else
+         decoded = Base64.decode64(data)
+
+         if path
+           File.binwrite(path, decoded)
+           path
+         else
+           decoded
+         end
+       end
+     end
+
+     def network
+       @network ||= Network.new(self)
+     end
+
+     def cookies
+       @cookies ||= Cookies.new(self)
+     end
+
+     private
+
+     def document_node_id
+       result = page_command("DOM.getDocument")
+
+       result.dig("root", "nodeId")
+     end
+
+     def current_entry_id
+       result = page_command("Page.getNavigationHistory")
+
+       result["currentIndex"]
+     end
+
+     def handle_evaluate_response(response)
+       raise JavaScriptError, response if response["exceptionDetails"]
+
+       result = response["result"]
+       return nil if result["type"] == "undefined"
+
+       result["value"]
+     end
+   end
+ end
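`go_to` in the hunk above registers a `Page.loadEventFired` handler, issues the navigation, waits up to the configured timeout, and then removes the handler. The sketch below isolates that wait-for-event pattern using only the standard library. `TinyEmitter` and `wait_for_event` are hypothetical names, not part of the gem, and `Mutex` with `ConditionVariable` stands in for `Concurrent::Event`. Removing the handler in an `ensure` clause is a design choice of this sketch, so that the handler is cleaned up even when the block raises.

```ruby
# Hypothetical stand-in for a CDP client: supports only on/off/emit.
class TinyEmitter
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  def on(event, &block)
    @handlers[event] << block
  end

  def off(event, block)
    @handlers[event].delete(block)
  end

  def emit(event)
    @handlers[event].dup.each(&:call)
  end

  def handler_count(event)
    @handlers[event].size
  end
end

# Runs the block, then waits up to `timeout` seconds for `event`.
# Returns [block_result, fired]. The handler is always removed afterwards.
def wait_for_event(emitter, event, timeout:)
  mutex = Mutex.new
  condition = ConditionVariable.new
  fired = false

  handler = proc do
    mutex.synchronize do
      fired = true
      condition.signal
    end
  end

  emitter.on(event, &handler)
  result = yield

  mutex.synchronize do
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    until fired
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      break if remaining <= 0

      condition.wait(mutex, remaining)
    end
  end

  [result, fired]
ensure
  emitter.off(event, handler) if handler
end
```

With a real client, the block would issue the navigation command, and the second return value would tell the caller whether the load event arrived before the timeout.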