palapala_pdf 0.1.9 → 0.1.10

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: e9359afef10584362d61be46353495ae97d4ee17a380912fac9ede823fc9d41b
- data.tar.gz: 3c8fce8d86a1fa1e1a9394c7e46eb9fe487669f5d621eabf56cd0ae2c29dd33b
+ metadata.gz: e06d55c5dca6e14014e1154d4cd4fdcdddcd61844ccd138f38e1d9d803d1094e
+ data.tar.gz: 89ed6d300a9e4c804d3bfcb54516ec921ed741df1718d68b4d160b36e3fd2792
  SHA512:
- metadata.gz: e026b00c0e48fc24a412314a51f612813c2ac6441bee5b48139999dc58ceeec0b9c2807edf0cd0c00cdffb99fbcd0bb6848b2fc53599ed8d51fe286ee3a77155
- data.tar.gz: c759088c15ea39529b7cf9ccdb95dee9af3c7b8bb02c8bc3ab6aef73e59de4e93378cd12f85e7affd3c2c2a7c2c8a476449087d529ce98dfd1aa2806f9119adc
+ metadata.gz: f0bd26fe4c402e06f1f75ab4a5ccfad05834b78bcb125024967134149b7738b2cbd8121dd985332907644ecfd167f44fc3109b322099ebb9c0e988971f74f42e
+ data.tar.gz: c95f8931ce1538af9b0cb63d93fcc179016cfe05cd4920b7ead1bf6933f4712bfbaee3ae8d243e1112353c1c2b4af875c92b6e1e6373ad219aa0c409f7db117a
data/README.md CHANGED
@@ -31,49 +31,47 @@ $ gem install palapala_pdf
  ```
 
  Palapala PDF connects to Chrome over a web socket connection.
- An external Chrome/Chromium is expected. Start it with the following
- command (9222 is the default port):
+ An external Chrome/Chromium is preferred. Start it with the following
+ command (9222 is the default/expected port):
 
  ```sh
  /path/to/chrome --headless --disable-gpu --remote-debugging-port=9222
  ```
 
- ### Installing Chrome / Headless Chrome
+ ### Connecting to Chrome
 
- Seems the august 2024 release 128.0.6613.85 is seriously performance impacted. So to avoid regression issues, it's suggested to install a specific version of Chrome, test it and stick with it. This is easiest using npx and some tooling provided by Puppeteer. Unfortunately it depends on node/npm, but it's worth it. E.g. install a specific version like this:
+ Palapala PDF will go through this process:
 
- ```
- npx @puppeteer/browsers install chrome@127.0.6533.88
- ````
+ - check if a Chrome is running and exposing port 9222 (and if so, use it)
+ - if `Palapala.headless_chrome_path` is defined, launch Chrome as a child process using that path
+ - if **NPX** is available, install a **Chrome-Headless-Shell** variant locally and launch it as a child process. It will install the 'stable' version or the version identified by the `Palapala.chrome_headless_shell_version` setting (or from ENV `CHROME_HEADLESS_SHELL_VERSION`).
+ - as a last fallback it will guess a chrome path from the detected OS and try to launch a Chrome with that
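
The lookup order in the list above can be sketched as a small decision function. This is only an illustration of the documented order, not the gem's code: `chrome_strategy` and its keyword arguments are hypothetical names, and each check is passed in as a plain value so the precedence is visible in isolation.

```ruby
# Hypothetical sketch of the documented lookup order (not the gem's implementation).
# port_open:      true if something already listens on port 9222
# chrome_path:    value of a configured Chrome path, or nil
# npx_available:  true if npx can be found on the PATH
def chrome_strategy(port_open:, chrome_path:, npx_available:)
  return :use_running_chrome if port_open      # 1. reuse an already running Chrome
  return :launch_from_path   if chrome_path    # 2. explicitly configured path wins over npx
  return :install_with_npx   if npx_available  # 3. install and launch chrome-headless-shell
  :guess_path_from_os                          # 4. last fallback
end

puts chrome_strategy(port_open: false, chrome_path: nil, npx_available: true)
```

Keeping the checks as inputs makes the precedence easy to test without starting a browser.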
 
- This installs chrome in a `chrome` folder in the current working dir and it outputs the path where it's installed when it's finished.
+ A Chrome-Headless-Shell version gives the best performance and resource usage.
 
- If you installed it using puppeteer from above
+ ### Installing Chrome / Headless Chrome manually
+
+ This is easiest using npx and some tooling provided by Puppeteer. Unfortunately it depends on node/npm, but it's worth it. E.g. install a specific version like this:
 
- ```sh
- ./chrome/mac_arm-127.0.6533.88/chrome-mac-arm64/Google\ Chrome\ for\ Testing.app/Contents/MacOS/Google\ Chrome\ for\ Testing --headless --disable-gpu --remote-debugging-port=9222
  ```
+ npx @puppeteer/browsers install chrome@127.0.6533.88
+ ````
 
- Currently i'd advise for the `chrome-headless-shell`variant that is a light version meant just for this use case. The chrome-headless-shell is a minimal, headless version of the Chrome browser designed specifically for environments where you need to run Chrome without a graphical user interface (GUI). This is particularly useful in scenarios like server-side rendering, automated testing, web scraping, or any situation where you need the power of the Chrome browser engine without the overhead of displaying a UI. Headless by design, reduced size and overhead but still the same engine.
+ This installs Chrome in a `chrome` folder in the current working dir and outputs the path where it's installed when it's finished; Chrome can then be started from that path.
+
+ Currently we'd advise using the `chrome-headless-shell` variant, a light version meant just for this use case. The chrome-headless-shell is a minimal, headless version of the Chrome browser designed specifically for environments where you need to run Chrome without a graphical user interface (GUI). This is particularly useful in scenarios like server-side rendering, automated testing, web scraping, or any situation where you need the power of the Chrome browser engine without the overhead of displaying a UI. Headless by design, reduced size and overhead but still the same engine.
 
  ```
  npx @puppeteer/browsers install chrome-headless-shell@stable
  ```
 
- It installs to a path like this `./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell`. As it's headless by design, it only needs one parameter
+ It installs to a path like this `./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell`. As it's headless by design, it only needs one parameter:
 
  ```
  ./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell --remote-debugging-port=9222
  ```
 
- Alternatively, Palapala PDF will try to launch Chrome as a child process.
- It guesses the path to Chrome, or you configure it like this:
-
- ```ruby
- Palapala.setup do |config|
- config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
- end
- ```
+ *Note: It seems the August 2024 release 128.0.6613.85 has a serious performance regression. To avoid regression issues, it's suggested to install a specific version of Chrome, test it, and stick with it. The chrome-headless-shell does not seem to suffer from this, though.*
 
  ### Installing Node/NPX
 
@@ -92,7 +90,6 @@ nvm --version
  nvm install node
  ````
 
-
  ## Usage Instructions
 
  To create a PDF from HTML content using the `Palapala` library, follow these steps:
@@ -25,25 +25,20 @@ HEADER_HTML = <<~HTML
  HTML
 
  Palapala.setup do |config|
- config.debug = true
- config.headless_chrome_url = 'http://localhost:9222' # run against a remote Chrome instance
+ # config.debug = true
+ # config.headless_chrome_url = 'http://localhost:9222' # run against a remote Chrome instance
  # config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
  end
 
  result = Palapala::Pdf.new(
  # "<style>@page { size: A4 landscape; }</style><p>Hello world #{Time.now}</>",
  "<h1>Title</h1><p>Hello world #{Time.now}</>",
- header_html: HEADER_HTML,
- footer_html: '<div style="text-align: center;">Generated with Palapala PDF</div>',
+ header_template: HEADER_HTML,
+ footer_template: '<div style="text-align: center; font-size: 12pt; width: 100%;">Generated with Palapala PDF</div>',
  scale: 0.75,
  prefer_css_page_size: false,
- margin: { top: 3, bottom: 2 }
- ).save('tmp/headers_and_footers.pdf',
- generateDocumentOutline: false,
- # marginTop: 1,
- # paperWidth: 3,
- displayHeaderFooter: true,
- # landscape: false,
- headerTemplate: HEADER_HTML)
+ margin_top: 3,
+ margin_bottom: 2).save('tmp/headers_and_footers.pdf')
 
  puts result
+ `open tmp/headers_and_footers.pdf`
@@ -15,8 +15,9 @@ DOCUMENT = <<~HTML
  HTML
 
  Palapala.setup do |config|
- config.debug = true
+ # config.debug = true
+ # config.defaults = { header_template: '<div></div>', footer_template: '<div></div>' }
  end
 
- result = Palapala::Pdf.new(DOCUMENT).save('tmp/js_based_rendering.pdf')
- puts result
+ Palapala::Pdf.new(DOCUMENT).save('tmp/js_based_rendering.pdf')
+ `open tmp/js_based_rendering.pdf`
@@ -25,9 +25,9 @@ module Palapala
  end
  end
 
- # Check if a Chrome is running
+ # Check if a Chrome is running locally
  def self.chrome_running?
- port_in_use? || # Check if the port is in use and Chrome is running externally
+ port_in_use? || # Check if the port is in use
  chrome_process_healthy? # Check if the process is still alive
  end
 
@@ -59,9 +59,9 @@ module Palapala
  system("which npx > /dev/null 2>&1")
  end
 
- def self.spawn_chrome_headless_server
+ def self.spawn_chrome_headless_server_with_npx
  # Run the command and capture the output
- puts "Installing latest stable chrome-headless-shell..."
+ puts "Installing/launching chrome-headless-shell@#{Palapala.chrome_headless_shell_version}"
  output, status = Open3.capture2("npx --yes @puppeteer/browsers install chrome-headless-shell@#{Palapala.chrome_headless_shell_version}")
 
  if status.success?
@@ -82,28 +82,37 @@ module Palapala
  # Display the version
  system("#{chrome_path} --version") if Palapala.debug
  # Launch chrome-headless-shell with the --remote-debugging-port parameter
- if Palapala.debug
- spawn(chrome_path, "--remote-debugging-port=9222", "--disable-gpu")
+ params = [ "--disable-gpu", "--remote-debugging-port=9222" ]
+ params.merge!(Palapala.chrome_params) if Palapala.chrome_params
+ pid = if Palapala.debug
+ spawn(chrome_path, *params)
  else
- spawn(chrome_path, "--remote-debugging-port=9222", "--disable-gpu", out: "/dev/null", err: "/dev/null")
+ spawn(chrome_path, *params, out: "/dev/null", err: "/dev/null")
  end
+ Palapala.headless_chrome_url = "http://localhost:9222"
+ pid
  else
  raise "Failed to install chrome-headless-shell"
  end
  end
 
+ def self.spawn_chrome_from_path
+ params = [ "--headless", "--disable-gpu", "--remote-debugging-port=9222" ]
+ params.merge!(Palapala.chrome_params) if Palapala.chrome_params
+ # Spawn an existing chrome with the path and parameters
+ Process.spawn(chrome_path, *params)
+ end
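
One thing worth checking in the `params.merge!` calls above: `params` is an Array, and Ruby's Array does not define `merge!` (it is a Hash method), so that line would raise `NoMethodError` whenever `Palapala.chrome_params` is set. Below is a minimal sketch of what is presumably intended, assuming `chrome_params` holds extra command-line flags as an array of strings (the diff does not show its type). `chrome_args` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper (not in the gem): combine the base flags with optional extras.
def chrome_args(base, extra = nil)
  args = base.dup                      # leave the caller's array untouched
  args.concat(Array(extra)) if extra   # Array#concat appends in place; Array() also accepts a single string
  args
end

base = ["--headless", "--disable-gpu", "--remote-debugging-port=9222"]
p chrome_args(base, ["--no-sandbox"])
p [].respond_to?(:merge!)              # Array has no merge! method
```

The resulting array can be splatted into `Process.spawn(chrome_path, *args)` exactly as the code above does with `params`.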
+
  # Spawn a Chrome child process
  def self.spawn_chrome
  return if chrome_running?
 
- if self.npx_installed?
- @chrome_process_id = spawn_chrome_headless_server
- else
- params = [ "--headless", "--disable-gpu", "--remote-debugging-port=9222" ]
- params.merge!(Palapala.chrome_params) if Palapala.chrome_params
- # Spawn an existing chrome with the path and parameters
- @chrome_process_id = Process.spawn(chrome_path, *params)
- end
+ @chrome_process_id =
+ if Palapala.headless_chrome_path.nil? && self.npx_installed?
+ spawn_chrome_headless_server_with_npx
+ else
+ spawn_chrome_from_path
+ end
 
  # Wait until the port is in use
  sleep 0.1 until port_in_use?
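
The polling loop above waits indefinitely if Chrome never opens the port. A possible variant with a deadline is sketched below. The diff does not show how `port_in_use?` is implemented, so `port_open?` and `wait_for_port` are hypothetical stand-ins that probe the port with a plain TCP connect.

```ruby
require "socket"

# Hypothetical stand-in for port_in_use?: true if a TCP connect succeeds.
def port_open?(host, port)
  TCPSocket.new(host, port).close
  true
rescue SystemCallError                 # e.g. Errno::ECONNREFUSED when nothing listens
  false
end

# Poll until the port accepts connections; give up after `timeout` seconds.
def wait_for_port(host, port, timeout: 5.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until port_open?(host, port)
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
  true
end

# Demo against a throwaway local server on an OS-assigned port.
server = TCPServer.new("127.0.0.1", 0)
puts wait_for_port("127.0.0.1", server.addr[1], timeout: 1.0)
server.close
```

Returning `false` instead of looping lets the caller raise a clear error when the browser fails to start.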
data/lib/palapala/pdf.rb CHANGED
@@ -42,20 +42,22 @@ module Palapala
  scale: nil)
  @content = content || raise(ArgumentError, "Content is required and can't be nil")
  @opts = {}
- @opts[:headerTemplate] = header_template || Palapala.defaults[:header_template]
- @opts[:footerTemplate] = footer_template || Palapala.defaults[:footer_template]
- @opts[:pageRanges] = page_ranges || Palapala.defaults[:page_ranges]
- @opts[:generateTaggedPDF] = generate_tagged_pdf || Palapala.defaults[:generate_tagged_pdf]
- @opts[:paperWidth] = paper_width || Palapala.defaults[:paper_width]
- @opts[:paperHeight] = paper_height || Palapala.defaults[:paper_height]
- @opts[:landscape] = landscape || Palapala.defaults[:landscape]
- @opts[:marginTop] = margin_top || Palapala.defaults[:margin_top]
- @opts[:marginLeft] = margin_left || Palapala.defaults[:margin_left]
- @opts[:marginBottom] = margin_bottom || Palapala.defaults[:margin_bottom]
- @opts[:marginRight] = margin_right || Palapala.defaults[:margin_right]
- @opts[:preferCSSPageSize] = prefer_css_page_size || Palapala.defaults[:prefer_css_page_size]
- @opts[:printBackground] = print_background || Palapala.defaults[:print_background]
- @opts[:scale] = scale || Palapala.defaults[:scale]
+ @opts[:headerTemplate] = header_template || Palapala.defaults[:header_template]
+ @opts[:footerTemplate] = footer_template || Palapala.defaults[:footer_template]
+ @opts[:pageRanges] = page_ranges || Palapala.defaults[:page_ranges]
+ @opts[:generateTaggedPDF] = generate_tagged_pdf || Palapala.defaults[:generate_tagged_pdf]
+ @opts[:paperWidth] = paper_width || Palapala.defaults[:paper_width]
+ @opts[:paperHeight] = paper_height || Palapala.defaults[:paper_height]
+ @opts[:landscape] = landscape || Palapala.defaults[:landscape]
+ @opts[:marginTop] = margin_top || Palapala.defaults[:margin_top]
+ @opts[:marginLeft] = margin_left || Palapala.defaults[:margin_left]
+ @opts[:marginBottom] = margin_bottom || Palapala.defaults[:margin_bottom]
+ @opts[:marginRight] = margin_right || Palapala.defaults[:margin_right]
+ @opts[:preferCSSPageSize] = prefer_css_page_size || Palapala.defaults[:prefer_css_page_size]
+ @opts[:printBackground] = print_background || Palapala.defaults[:print_background]
+ @opts[:scale] = scale || Palapala.defaults[:scale]
+ @opts[:displayHeaderFooter] = true
+ @opts[:encoding] = :binary
  @opts.compact!
  end
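
The pattern in this constructor (per-call value, else configured default, then drop anything still unset) can be seen in a reduced form below. This is a sketch with only two options; `DEFAULTS` and `build_opts` are illustrative names standing in for `Palapala.defaults` and the initializer.

```ruby
# Illustrative stand-in for Palapala.defaults.
DEFAULTS = { header_template: "<div></div>" }.freeze

# Reduced version of the option mapping: snake_case arguments in, camelCase keys out.
def build_opts(header_template: nil, scale: nil)
  opts = {}
  opts[:headerTemplate] = header_template || DEFAULTS[:header_template]
  opts[:scale]          = scale || DEFAULTS[:scale]   # nil when neither is set
  opts[:displayHeaderFooter] = true                   # always sent as of this version
  opts.compact                                        # drop keys whose value is nil
end

p build_opts(scale: 0.75)
p build_opts
```

Because `||` treats `false` like `nil`, an explicit `false` argument falls through to the configured default, which only matters if a default of `true` is configured for a boolean option.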
 
@@ -22,6 +22,13 @@ module Palapala
  send_command_and_wait_for_result("Page.enable")
  end
 
+ def websocket_url
+ self.class.websocket_url
+ rescue Errno::ECONNREFUSED
+ ChromeProcess.spawn_chrome # Spawn a new Chrome process
+ self.class.websocket_url # Retry (once)
+ end
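
The new instance method follows a retry-once shape: try to connect, and only if the connection is refused start Chrome and try a second time. A standalone sketch of that shape follows, with lambdas standing in for the HTTP call and for `ChromeProcess.spawn_chrome`. `fetch_with_one_retry` and the URL in the demo are made up for illustration.

```ruby
# Hypothetical generic form of the retry-once pattern used above.
def fetch_with_one_retry(fetch:, recover:)
  fetch.call
rescue Errno::ECONNREFUSED
  recover.call   # e.g. spawn a Chrome process
  fetch.call     # a second failure is not rescued and reaches the caller
end

attempts = 0
fetch = lambda do
  attempts += 1
  raise Errno::ECONNREFUSED if attempts == 1    # simulate "Chrome not running yet"
  "ws://localhost:9222/devtools/page/example"   # made-up URL for the demo
end

puts fetch_with_one_retry(fetch: fetch, recover: -> { puts "starting Chrome" })
```

Compared with the previous version, which called `spawn_chrome` before every request, this only pays the startup check when a connection actually fails.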
+
  # Create a thread-local instance of the renderer
  def self.thread_local_instance
  Thread.current[:renderer] ||= Renderer.new
@@ -102,16 +109,8 @@ module Palapala
  @client.close
  end
 
- private
-
- # Convert the HTML content to a data URL
- def data_url_for_html(html)
- "data:text/html;base64,#{Base64.strict_encode64(html)}"
- end
-
  # Open a new tab in the remote chrome and return the WebSocket URL
- def websocket_url
- ChromeProcess.spawn_chrome
+ def self.websocket_url
  uri = URI("#{Palapala.headless_chrome_url}/json/new")
  http = Net::HTTP.new(uri.host, uri.port)
  request = Net::HTTP::Put.new(uri)
@@ -122,5 +121,12 @@ module Palapala
  puts "WebSocket URL: #{websocket_url}" if Palapala.debug
  websocket_url
  end
+
+ private
+
+ # Convert the HTML content to a data URL
+ def data_url_for_html(html)
+ "data:text/html;base64,#{Base64.strict_encode64(html)}"
+ end
  end
  end
@@ -1,3 +1,3 @@
  module Palapala
- VERSION = "0.1.9"
+ VERSION = "0.1.10"
  end
data/lib/palapala.rb CHANGED
@@ -19,18 +19,20 @@ module Palapala
  # path to the headless Chrome executable when using the child process renderer
  attr_accessor :headless_chrome_path
 
- # URL to the headless Chrome instance when using the remote renderer
+ # URL to the headless Chrome instance when using the remote renderer (priority)
  attr_accessor :headless_chrome_url
 
- # Chrome headless shell version to use
+ # Chrome headless shell version to use (stable, beta, dev, canary, etc.)
+ # when launching a new Chrome instance using npx
  attr_accessor :chrome_headless_shell_version
  end
- puts "setting defaults on palapala"
  self.debug = false
- self.defaults = { displayHeaderFooter: true, encoding: :binary }
+ self.defaults = {
+ header_template: "<div></div>",
+ footer_template: "<div></div>"
+ # footer_template: '<div style="text-align: center; font-size: 12pt; width: 100%;">Generated with Palapala PDF</div>'
+ }
  self.headless_chrome_path = nil
- self.headless_chrome_url = "http://localhost:9222"
+ self.headless_chrome_url = ENV.fetch("HEADLESS_CHROME_URL", "http://localhost:9222")
  self.chrome_headless_shell_version = ENV.fetch("CHROME_HEADLESS_SHELL_VERSION", "stable")
  end
-
- puts "hoo"
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: palapala_pdf
  version: !ruby/object:Gem::Version
- version: 0.1.9
+ version: 0.1.10
  platform: ruby
  authors:
  - Koen Handekyn