ferrum_pdf 0.3.0 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: ad27a3b666b15d934cc1bbf678ac52bfe86fc421723c89a60a123b3599aa2c39
-   data.tar.gz: 0a9bf56518d53ff03f84b9a9143280f97effe45b26edda3695a767d871bb0f73
+   metadata.gz: 9b709d6f3d6b9472b389f08891e3ec59f90a2e0a962f034acd31a2ee4efdb79c
+   data.tar.gz: c1353b79300829afd212880fbcb76633e3fe7da2c2facc3b96d8305a7c5fc6ae
  SHA512:
-   metadata.gz: 9a2fa255b8d7484952959023ac2ca320ce68259857adbba20fbb0dc9c455e40e6db6a7088daf22b8f70f85da2d31f8fc2c1550a6f6a2ad97cb9920a1841bfd2b
-   data.tar.gz: ed0d20754dee52a621880ef3ffc95894749b8ddfc5796226029d2417c2b6c18e4b29fb94bbae0dd3249d9f7e56e936c6b429f7640ebc7eb33bc5ba72556fb9dc
+   metadata.gz: a4145a37dc89ac2dbaeb0512fffafbd6056291e0521734ed57affd66a45d32489fd412d0f8fbbde7b53046ffbc8d030f602ee6914224ea76bdb1e95c78adf50f
+   data.tar.gz: 57c44da1352bac877910af6193f7104da77580bd9edaafa1d3dbbb4faa2175d8ec05215bccc40c7f85fa4e93d84089f15c1e2c606bcf886a46625f3c2b3ba61a
data/README.md CHANGED
@@ -6,7 +6,7 @@ Inspired by [Grover](https://github.com/Studiosity/grover).
 
  ## Installation
 
- First, make sure Chrome is installed
+ First, make sure Chrome is installed.
 
  Run the following or add the gem to your Gemfile:
 
@@ -16,7 +16,7 @@ bundle add "ferrum_pdf"
 
  ## Usage
 
- You can use FerrumPdf to render [PDFs](#pdfs) and [Screenshots](#screenshots)
+ You can use FerrumPdf to render [PDFs](#-pdfs) and [Screenshots](#-screenshots)
 
  ### 📄 PDFs
 
@@ -63,14 +63,17 @@ FerrumPdf.render_pdf(html: content)
  FerrumPdf.render_pdf(url: "https://google.com")
  ```
 
- You can also pass host and protocol to convert any relative paths to full URLs. This is helpful for converting relative asset paths to full URLs.
+ The full list of options:
 
  ```ruby
  FerrumPdf.render_pdf(
-   html: content, # Provide HTML
-   url: "https://example.com", # or provide a URL to the content
-   host: request.base_url + "/", # Used for setting the host for relative paths
-   protocol: request.protocol, # Used for handling relative protocol paths
+   url: "https://example.com/page", # Provide a URL to the content
+
+   html: content, # or provide HTML
+   base_url: request.base_url, # Preprocesses `html` to convert relative paths and protocols. Example: "https://example.org"
+
+   authorize: { user: "username", password: "password" }, # Used for authenticating with basic auth
+   wait_for_idle_options: { connections: 0, duration: 0.05, timeout: 5 }, # Used for setting network wait_for_idle options
 
    pdf_options: {
      landscape: false, # paper orientation
@@ -109,10 +112,10 @@ See [Chrome DevTools Protocol docs](https://chromedevtools.github.io/devtools-pr
 
  There are two ways to render Screenshots:
 
- * [FerrumPdf.render_screenshot](#render-screenshot)
+ * [FerrumPdf.render_screenshot](#render-screenshots)
  * [render_screenshot in Rails](#render-screenshots-from-rails-controllers)
 
- #### Render Screenshot from Rails controller
+ #### Render Screenshots from Rails controllers
 
  Use the `render_screenshot` helper in Rails controllers to render a screenshot from the current action.
 
@@ -155,14 +158,15 @@ FerrumPdf.render_screenshot(html: content)
  FerrumPdf.render_screenshot(url: "https://google.com")
  ```
 
- You can also pass host and protocol to convert any relative paths to full URLs. This is helpful for converting relative asset paths to full URLs.
+ The full list of options:
 
  ```ruby
  FerrumPdf.render_screenshot(
-   html: "",
-   url: "",
-   root_url: "",
-   protocol: "",
+   url: "https://example.com/page", # Provide a URL to the content
+
+   html: content, # or provide HTML
+   base_url: request.base_url, # Preprocesses `html` to convert relative paths and protocols. Example: "https://example.org"
+
    screenshot_options: {
      format: "png", # or "jpeg"
      quality: nil, # Integer 0-100 works for jpeg only
@@ -175,9 +179,89 @@ FerrumPdf.render_screenshot(
  )
  ```
 
+ ## Configuring the Browser
+
+ You can set the default browser options with the configure block.
+
+ See [Ferrum's Customization docs](https://github.com/rubycdp/ferrum?tab=readme-ov-file#customization) for a full list of options.
+
+ ```ruby
+ FerrumPdf.configure do |config|
+   config.window_size = [1920, 1080]
+
+   # config.process_timeout = 30 # defaults to 10
+   # config.browser_path = '/usr/bin/chromium'
+
+   # For use with Docker, but ensure you trust any sites visited
+   # config.browser_options = {
+   #   "no-sandbox" => true
+   # }
+ end
+ ```
+
+ For Docker, `seccomp` is recommended over `--no-sandbox` for security: https://github.com/jlandure/alpine-chrome?tab=readme-ov-file#3-ways-to-securely-use-chrome-headless-with-this-image
+
+ To add Chrome to your Docker image:
+
+ ```dockerfile
+ RUN apt-get update && apt-get install gnupg wget -y && \
+   wget --quiet --output-document=- https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/google-archive.gpg && \
+   sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' && \
+   apt-get update && \
+   apt-get install google-chrome-stable -y && \
+   rm -rf /var/lib/apt/lists/*
+ ```
+
+ ### Multiple Browser Support
+
+ ```ruby
+ # Create two browsers using the FerrumPdf config, but overriding `window_size`
+ FerrumPdf.add_browser(:small, window_size: [1024, 768])
+ FerrumPdf.add_browser(:large, window_size: [1920, 1080])
+
+ FerrumPdf.render_pdf(url: "https://example.org", browser: :small)
+ FerrumPdf.render_pdf(url: "https://example.org", browser: :large)
+ ```
+
+ You can also create a `Ferrum::Browser` instance and pass it in as `browser`:
+
+ ```ruby
+ FerrumPdf.render_pdf(url: "https://example.org", browser: Ferrum::Browser.new)
+ ```
+
+ ## Debugging
+
+ One option for debugging is to use Chrome in regular, non-headless mode:
+
+ ```ruby
+ FerrumPdf.configure do |config|
+   config.headless = false
+ end
+ ```
+
+ FerrumPdf also allows you to pass a block for debugging. This block is executed after loading the page but before rendering the PDF or screenshot.
+
+ ```ruby
+ FerrumPdf.render_pdf(url: "https://google.com") do |browser, page|
+   # Open Chrome DevTools to remotely inspect the browser
+   browser.debug
+
+   # Or pause and poke around
+   binding.irb
+ end
+ ```
+
+ The block will receive the `Ferrum::Browser` and `Ferrum::Page` objects which you can use for debugging.
+
  ## Contributing
 
  If you have an issue you'd like to submit, please do so using the issue tracker in GitHub. In order for us to help you in the best way possible, please be as detailed as you can.
 
+ To run the test suite, run:
+
+ ```bash
+ bin/test
+ ```
+
  ## License
  The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
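
The "Multiple Browser Support" section of the README diff above describes named browsers that share the global config but override individual options, and it also allows passing a browser object directly. A minimal sketch of that lookup flow in plain Ruby, under stated assumptions: `FakeBrowser` stands in for `Ferrum::Browser`, and `BrowserRegistry` with its `add` and `resolve` methods is a hypothetical name used only for illustration, not part of the gem.

```ruby
# Hypothetical stand-in for Ferrum::Browser that only records its options.
FakeBrowser = Struct.new(:options)

# Hypothetical registry mirroring the add_browser / lookup-by-name flow.
class BrowserRegistry
  def initialize(defaults)
    @defaults = defaults
    @browsers = {}
  end

  # Create a named browser from the defaults, with per-browser overrides.
  def add(name, **overrides)
    @browsers[name] = FakeBrowser.new(@defaults.merge(overrides))
  end

  # Accept either a registered name or an already-built browser object.
  # Unknown names are created lazily from the defaults.
  def resolve(browser)
    return browser unless browser.is_a?(Symbol)

    @browsers[browser] || add(browser)
  end
end

registry = BrowserRegistry.new({ window_size: [1920, 1080] })
registry.add(:small, window_size: [1024, 768])

p registry.resolve(:small).options
p registry.resolve(:default).options
```

Creating unknown names lazily means a caller that never registers anything still gets a usable `:default` browser built from the shared defaults.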
data/lib/ferrum_pdf/assets_helper.rb ADDED
@@ -0,0 +1,65 @@
+ module FerrumPdf
+   module AssetsHelper
+     class BaseAsset
+       def initialize(asset)
+         @asset = asset
+       end
+     end
+
+     class PropshaftAsset < BaseAsset
+       def content_type
+         @asset.content_type.to_s
+       end
+
+       def content
+         @asset.content
+       end
+     end
+
+     class SprocketsAsset < BaseAsset
+       def content_type
+         @asset.content_type
+       end
+
+       def content
+         @asset.source
+       end
+     end
+
+     class AssetFinder
+       class << self
+         def find(path)
+           if Rails.application.assets.respond_to?(:load_path)
+             propshaft_asset(path)
+           elsif Rails.application.assets.respond_to?(:find_asset)
+             sprockets_asset(path)
+           else
+             nil
+           end
+         end
+
+         def propshaft_asset(path)
+           (asset = Rails.application.assets.load_path.find(path)) ? PropshaftAsset.new(asset) : nil
+         end
+
+         def sprockets_asset(path)
+           (asset = Rails.application.assets.find_asset(path)) ? SprocketsAsset.new(asset) : nil
+         end
+       end
+     end
+
+     def ferrum_pdf_inline_stylesheet(path)
+       (asset = AssetFinder.find(path)) ? "<style>#{asset.content}</style>".html_safe : nil
+     end
+
+     def ferrum_pdf_inline_javascript(path)
+       (asset = AssetFinder.find(path)) ? "<script>#{asset.content}</script>".html_safe : nil
+     end
+
+     def ferrum_pdf_base64_asset(path)
+       return nil unless (asset = AssetFinder.find(path))
+
+       "data:#{asset.content_type};base64,#{Base64.encode64(asset.content).gsub(/\s+/, '')}"
+     end
+   end
+ end
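
The `ferrum_pdf_base64_asset` helper above builds a `data:` URI from an asset's content type and its Base64-encoded bytes. A minimal standalone sketch of just that encoding step, outside Rails; `data_uri` is a hypothetical name used for illustration and is not part of the gem.

```ruby
require "base64"

# Hypothetical helper (not part of ferrum_pdf): builds a data URI from a
# content type and raw bytes, the same way the helper in the diff does.
def data_uri(content_type, content)
  # Base64.encode64 inserts a newline after every 60 encoded characters,
  # so all whitespace is stripped to keep the URI on a single line.
  encoded = Base64.encode64(content).gsub(/\s+/, "")
  "data:#{content_type};base64,#{encoded}"
end

puts data_uri("text/css", "body { margin: 0; }")
```

`Base64.strict_encode64` produces the same output without the `gsub`, since it never emits line breaks.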
data/lib/ferrum_pdf/controller.rb CHANGED
@@ -2,25 +2,25 @@ module FerrumPdf
    module Controller
      extend ActiveSupport::Concern
 
-     def render_pdf(pdf_options: {}, **rendering)
+     def render_pdf(pdf_options: {}, **rendering, &block)
        content = render_to_string(**rendering.with_defaults(formats: [ :html ]))
 
        FerrumPdf.render_pdf(
          html: content,
-         host: request.base_url + "/",
-         protocol: request.protocol,
-         pdf_options: pdf_options
+         base_url: request.base_url,
+         pdf_options: pdf_options,
+         &block
        )
      end
 
-     def render_screenshot(screenshot_options: {}, **rendering)
+     def render_screenshot(screenshot_options: {}, **rendering, &block)
        content = render_to_string(**rendering.with_defaults(formats: [ :html ]))
 
        FerrumPdf.render_screenshot(
          html: content,
-         host: request.base_url + "/",
-         protocol: request.protocol,
-         screenshot_options: screenshot_options
+         base_url: request.base_url,
+         screenshot_options: screenshot_options,
+         &block
        )
      end
    end
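
The controller change above captures an optional block with `&block` and forwards it to the module-level render methods, which yield to it only when one was given. A minimal sketch of that forwarding pattern in plain Ruby; `Renderer`, `render_from_controller`, and the string standing in for a page object are all hypothetical stand-ins, not the gem's API.

```ruby
# Hypothetical stand-in for the library entry point: yields to the caller's
# block (if any) after "loading" the page and before "rendering" it.
module Renderer
  def self.render(html:)
    page = "page(#{html})"
    yield page if block_given?
    "rendered #{page}"
  end
end

# Hypothetical stand-in for the controller helper: captures the block with
# &block and forwards it unchanged. When no block was given, &block is nil
# and the callee simply sees no block.
def render_from_controller(html:, &block)
  Renderer.render(html: html, &block)
end

seen = []
result = render_from_controller(html: "<h1>Hi</h1>") { |page| seen << page }
puts result
puts seen.inspect
```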
data/lib/ferrum_pdf/html_preprocessor.rb CHANGED
@@ -5,15 +5,20 @@ module FerrumPdf
    # @see https://github.com/pdfkit/pdfkit
    module HTMLPreprocessor
      # Change relative paths to absolute, and relative protocols to absolute protocols
-     def self.process(html, root_url, protocol)
-       html = translate_relative_paths(html, root_url) if root_url
+     #
+     # process("Some HTML", "https://example.org")
+     #
+     def self.process(html, base_url)
+       base_url += "/" unless base_url.end_with? "/"
+       protocol = base_url.split("://").first
+       html = translate_relative_paths(html, base_url) if base_url
        html = translate_relative_protocols(html, protocol) if protocol
        html
      end
 
-     def self.translate_relative_paths(html, root_url)
+     def self.translate_relative_paths(html, base_url)
        # Try out this regexp using rubular http://rubular.com/r/hiAxBNX7KE
-       html.gsub(%r{(href|src)=(['"])/([^/"']([^"']*|[^"']*))?['"]}, "\\1=\\2#{root_url}\\3\\2")
+       html.gsub(%r{(href|src)=(['"])/([^/"']([^"']*|[^"']*))?['"]}, "\\1=\\2#{base_url}\\3\\2")
      end
      private_class_method :translate_relative_paths
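
The new `process(html, base_url)` signature above derives both the trailing-slash form of the base URL and the protocol from a single argument, then rewrites root-relative `href` and `src` attributes. A minimal standalone sketch of those two steps, reusing the regular expression from the diff; `normalize_base_url` and `absolutize_paths` are hypothetical names, and the protocol-relative rewrite is left out because its implementation is not shown in this diff.

```ruby
# Hypothetical standalone re-creation of two steps from the diff; these
# method names are illustrative and not part of ferrum_pdf.

# Ensure a trailing slash and derive the protocol from the base URL.
def normalize_base_url(base_url)
  base_url = "#{base_url}/" unless base_url.end_with?("/")
  protocol = base_url.split("://").first
  [base_url, protocol]
end

# Rewrite root-relative href/src values (e.g. "/logo.png") to absolute URLs.
# Values starting with "//" do not match and are left untouched.
def absolutize_paths(html, base_url)
  html.gsub(%r{(href|src)=(['"])/([^/"']([^"']*|[^"']*))?['"]}, "\\1=\\2#{base_url}\\3\\2")
end

base_url, protocol = normalize_base_url("https://example.org")
puts protocol
puts absolutize_paths(%(<img src="/logo.png"><a href="//cdn.example.org/x">x</a>), base_url)
```

Because the regular expression consumes the leading `/` of the path, the base URL has to end in a slash for the joined URL to stay well formed, which is why the trailing slash is added first.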
 
data/lib/ferrum_pdf/railtie.rb CHANGED
@@ -1,5 +1,11 @@
  module FerrumPdf
    class Railtie < ::Rails::Railtie
+     initializer "ferrum_pdf.assets_helper" do
+       ActiveSupport.on_load(:action_view) do
+         include FerrumPdf::AssetsHelper if FerrumPdf.include_assets_helper_module
+       end
+     end
+
      initializer "ferrum_pdf.controller" do
        ActiveSupport.on_load(:action_controller) do
          include FerrumPdf::Controller if FerrumPdf.include_controller_module
@@ -1,3 +1,3 @@
  module FerrumPdf
-   VERSION = "0.3.0"
+   VERSION = "0.4.0"
  end
data/lib/ferrum_pdf.rb CHANGED
@@ -9,41 +9,96 @@ module FerrumPdf
      <div class='text right'><span class='pageNumber'></span>/<span class='totalPages'></span></div>
    HTML
 
+   autoload :AssetsHelper, "ferrum_pdf/assets_helper"
    autoload :Controller, "ferrum_pdf/controller"
    autoload :HTMLPreprocessor, "ferrum_pdf/html_preprocessor"
 
-   mattr_accessor :include_controller_module
-   @@include_controller_module = true
+   mattr_accessor :include_assets_helper_module, default: true
+   mattr_accessor :include_controller_module, default: true
+   mattr_accessor :browsers, default: {}
+   mattr_accessor :config, default: ActiveSupport::OrderedOptions.new.merge(
+     window_size: [ 1920, 1080 ]
+   )
 
    class << self
-     def browser(**options)
-       @browser ||= Ferrum::Browser.new(options)
+     def configure
+       yield config
      end
 
-     def render_pdf(html: nil, url: nil, host: nil, protocol: nil, pdf_options: {})
-       render(host: host, protocol: protocol, html: html, url: url) do |page|
+     def add_browser(name, **options)
+       @@browsers[name] = Ferrum::Browser.new(@@config.merge(options))
+     end
+
+     # Renders HTML or URL to PDF
+     #
+     # render_pdf(url: "https://example.org/receipts/example.pdf")
+     # render_pdf(html: "<h1>Hello world</h1>")
+     #
+     # For rendering HTML, we also need the base_url for preprocessing URLs with relative paths & protocols
+     #
+     # render_pdf(html: "<h1>Hello world</h1>", base_url: "https://example.org/")
+     #
+     def render_pdf(pdf_options: {}, **load_page_args)
+       load_page(**load_page_args) do |browser, page|
+         yield browser, page if block_given?
          page.pdf(**pdf_options.with_defaults(encoding: :binary))
        end
      end
 
-     def render_screenshot(html: nil, url: nil, host: nil, protocol: nil, screenshot_options: {})
-       render(host: host, protocol: protocol, html: html, url: url) do |page|
+     # Renders HTML or URL to Screenshot
+     #
+     # render_screenshot(url: "https://example.org/receipts/example.pdf")
+     # render_screenshot(html: "<h1>Hello world</h1>")
+     #
+     # For rendering HTML, we also need the base_url for preprocessing URLs with relative paths & protocols
+     #
+     # render_screenshot(html: "<h1>Hello world</h1>", base_url: "https://example.org/")
+     #
+     def render_screenshot(screenshot_options: {}, **load_page_args)
+       load_page(**load_page_args) do |browser, page|
+         yield browser, page if block_given?
          page.screenshot(**screenshot_options.with_defaults(encoding: :binary, full: true))
        end
      end
 
-     def render(host:, protocol:, html: nil, url: nil)
+     # Loads page into the browser to be used for rendering PDFs or screenshots
+     #
+     # This automatically applies HTML preprocessing if `html:` is present
+     #
+     def load_page(url: nil, html: nil, base_url: nil, authorize: nil, wait_for_idle_options: nil, browser: :default, retries: 1)
+       try = 0
+
+       # Lookup browser if a name was passed
+       browser = @@browsers[browser] || add_browser(browser) if browser.is_a? Symbol
+
+       # Automatically restart the browser if it was disconnected
+       browser.restart unless browser.client.present?
+
+       # Closes page automatically after block finishes
+       # https://github.com/rubycdp/ferrum/blob/main/lib/ferrum/browser.rb#L169
        browser.create_page do |page|
+         page.network.authorize(**authorize) { |req| req.continue } if authorize
+
+         # Load content
          if html
-           page.content = FerrumPdf::HTMLPreprocessor.process(html, host, protocol)
-           page.network.wait_for_idle
+           page.content = FerrumPdf::HTMLPreprocessor.process(html, base_url)
          else
            page.go_to(url)
          end
-         yield page
+
+         # Wait for everything to load
+         page.network.wait_for_idle(**wait_for_idle_options)
+
+         yield browser, page
        end
      rescue Ferrum::DeadBrowserError
-       retry
+       try += 1
+       if try <= retries
+         browser.restart
+         retry
+       else
+         raise
+       end
      end
    end
  end
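
The rescue clause above replaces an unconditional `retry` with a restart followed by a bounded number of retries before re-raising. A minimal sketch of that pattern in plain Ruby, under stated assumptions: `FlakyWorker` is a hypothetical stand-in for `Ferrum::Browser`, `FakeDeadBrowserError` for `Ferrum::DeadBrowserError`, and `with_restarts` is an illustrative name. One deliberate difference from the diff: the attempt counter is set before an explicit `begin` block, because `retry` re-executes the protected block from its start, and a counter initialized inside that block would be reset on every attempt.

```ruby
# Hypothetical error and worker, used only to simulate a browser that dies.
class FakeDeadBrowserError < StandardError; end

class FlakyWorker
  attr_reader :restarts

  def initialize(failures)
    @failures = failures # number of calls that fail before one succeeds
    @restarts = 0
  end

  def restart
    @restarts += 1
  end

  def call
    if @failures > 0
      @failures -= 1
      raise FakeDeadBrowserError, "browser died"
    end
    :ok
  end
end

# Retry at most `retries` times, restarting the worker before each retry.
# The counter lives outside the begin block so that `retry` does not reset it.
def with_restarts(worker, retries: 1)
  attempts = 0
  begin
    worker.call
  rescue FakeDeadBrowserError
    attempts += 1
    raise if attempts > retries

    worker.restart
    retry
  end
end

puts with_restarts(FlakyWorker.new(1))
```

With `retries: 1`, a worker that fails once recovers after a single restart, while a worker that fails twice surfaces the error to the caller instead of looping.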
metadata CHANGED
@@ -1,14 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: ferrum_pdf
  version: !ruby/object:Gem::Version
-   version: 0.3.0
+   version: 0.4.0
  platform: ruby
  authors:
  - Chris Oliver
- autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-09-22 00:00:00.000000000 Z
+ date: 1980-01-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails
@@ -50,6 +49,7 @@ files:
  - README.md
  - Rakefile
  - lib/ferrum_pdf.rb
+ - lib/ferrum_pdf/assets_helper.rb
  - lib/ferrum_pdf/controller.rb
  - lib/ferrum_pdf/html_preprocessor.rb
  - lib/ferrum_pdf/railtie.rb
@@ -62,7 +62,6 @@ metadata:
    homepage_uri: https://github.com/excid3/ferrum_pdf
    source_code_uri: https://github.com/excid3/ferrum_pdf
    changelog_uri: https://github.com/excid3/ferrum_pdf/blob/main/CHANGELOG.md
- post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -77,8 +76,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.5.16
- signing_key:
+ rubygems_version: 3.6.9
  specification_version: 4
  summary: PDFs & screenshots for Rails using Ferrum & headless Chrome
  test_files: []