palapala_pdf 0.1.9 → 0.1.11

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: e9359afef10584362d61be46353495ae97d4ee17a380912fac9ede823fc9d41b
- data.tar.gz: 3c8fce8d86a1fa1e1a9394c7e46eb9fe487669f5d621eabf56cd0ae2c29dd33b
+ metadata.gz: 230cc525cd5e4bbc4d2ce9ddb0a418486bc27ad6163cef9834a67d7e23442b41
+ data.tar.gz: b283d90551ef07efe3384148b061ec0a6a18cb3bdbc9485c1e8717e6bc4979f3
  SHA512:
- metadata.gz: e026b00c0e48fc24a412314a51f612813c2ac6441bee5b48139999dc58ceeec0b9c2807edf0cd0c00cdffb99fbcd0bb6848b2fc53599ed8d51fe286ee3a77155
- data.tar.gz: c759088c15ea39529b7cf9ccdb95dee9af3c7b8bb02c8bc3ab6aef73e59de4e93378cd12f85e7affd3c2c2a7c2c8a476449087d529ce98dfd1aa2806f9119adc
+ metadata.gz: cfcf738f7171f679d419349cce4ce0441bb3be54fc355fa9592222b04c63654d6646c6103310f8ddf56e7112a3ccc95b30ddb92db86e0ff48a12b221f8be039c
+ data.tar.gz: efa8277743d960b0d3e869ab970c8248a2f901b572d4451bf0c479c311eabbd2a810e678efad489913ced8ab2ada14e8ce7d4b4fbb7d3ccad61ce98f419ee1b2
data/README.md CHANGED
@@ -4,7 +4,9 @@
4
4
 
5
5
  This project is a Ruby gem that provides functionality for generating PDF files from HTML using the Chrome browser. It allows you to easily convert HTML content into PDF documents, making it convenient for tasks such as generating reports, invoices, or any other printable documents. The gem provides a simple and intuitive API for converting HTML to PDF, and it leverages the power and flexibility of the Chrome browser's rendering engine to ensure accurate and high-quality PDF output. With this gem, you can easily integrate PDF generation capabilities into your Ruby applications.
6
6
 
7
- At the core, this project leverages the same rendering engine as [Grover](https://github.com/Studiosity/grover), but with significantly reduced overhead and dependencies. Instead of relying on the full Grover/Puppeteer/NodeJS stack, this project uses a raw web socket to enable direct communication from Ruby to a headless Chrome or Chromium browser. This approach ensures efficieny while providing a streamlined alternative for rendering tasks without sacrificing performance or flexibility.
7
+ At the core, this project leverages the Chrome rendering engine, but with significantly reduced overhead and dependencies. Instead of relying on the full Grover/Puppeteer/NodeJS stack, this project uses a raw web socket to enable direct communication from Ruby to a headless Chrome or Chromium browser. This approach ensures efficiency while providing a streamlined alternative for rendering tasks without sacrificing performance or flexibility.
8
+
9
+ It leverages work from [Puppeteer](https://pptr.dev/browsers-api/) (@puppeteer/browsers) to install a local Chrome-Headless-Shell if no Chrome is running, but that requires node (npx) to be available.
8
10
 
9
11
  This is how easy PDF generation can be in Ruby:
10
12
 
@@ -16,88 +18,28 @@ And this while having the most modern HTML/CSS/JS available to you: flex, grid,
16
18
 
17
19
  A core goal of this project is performance, and it is designed to be exceptionally fast. By leveraging **direct communication** with a headless Chrome or Chromium browser via a **raw web socket**, the gem minimizes overhead and dependencies, enabling PDF generation at speeds that significantly outperform other solutions. Whether generating simple or complex documents, this gem ensures that your Ruby applications can handle PDF tasks efficiently and at scale.
18
20
 
19
- ## Installation
20
-
21
- To install the gem and add it to your application's Gemfile, execute the following command:
22
-
23
- ```
24
- $ bundle add palapala_pdf
25
- ```
26
-
27
- If you are not using bundler to manage dependencies, you can install the gem by running:
28
-
29
- ```
30
- $ gem install palapala_pdf
31
- ```
32
-
33
- Palapala PDF connects to Chrome over a web socket connection.
34
- An external Chrome/Chromium is expected. Start it with the following
35
- command (9222 is the default port):
36
-
37
- ```sh
38
- /path/to/chrome --headless --disable-gpu --remote-debugging-port=9222
39
- ```
40
-
41
- ### Installing Chrome / Headless Chrome
42
-
43
- Seems the august 2024 release 128.0.6613.85 is seriously performance impacted. So to avoid regression issues, it's suggested to install a specific version of Chrome, test it and stick with it. This is easiest using npx and some tooling provided by Puppeteer. Unfortunately it depends on node/npm, but it's worth it. E.g. install a specific version like this:
44
-
45
- ```
46
- npx @puppeteer/browsers install chrome@127.0.6533.88
47
- ````
48
-
49
- This installs chrome in a `chrome` folder in the current working dir and it outputs the path where it's installed when it's finished.
50
-
51
- If you installed it using puppeteer from above
52
-
53
- ```sh
54
- ./chrome/mac_arm-127.0.6533.88/chrome-mac-arm64/Google\ Chrome\ for\ Testing.app/Contents/MacOS/Google\ Chrome\ for\ Testing --headless --disable-gpu --remote-debugging-port=9222
55
- ```
21
+ ## Sponsor This Project
56
22
 
57
- Currently i'd advise for the `chrome-headless-shell`variant that is a light version meant just for this use case. The chrome-headless-shell is a minimal, headless version of the Chrome browser designed specifically for environments where you need to run Chrome without a graphical user interface (GUI). This is particularly useful in scenarios like server-side rendering, automated testing, web scraping, or any situation where you need the power of the Chrome browser engine without the overhead of displaying a UI. Headless by design, reduced size and overhead but still the same engine.
23
+ If you find this project useful and would like to support its development, consider sponsoring or buying a coffee to help keep it going:
58
24
 
59
- ```
60
- npx @puppeteer/browsers install chrome-headless-shell@stable
61
- ```
25
+ - **GitHub Sponsors:** [Sponsor on GitHub](https://github.com/sponsors/koenhandekyn)
26
+ - **Buy Me a Coffee:** [Buy a Coffee](https://buymeacoffee.com/koenhandekyn)
62
27
 
63
- It installs to a path like this `./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell`. As it's headless by design, it only needs one parameter
28
+ Your support is greatly appreciated and helps maintain the project!
64
29
 
65
- ```
66
- ./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell --remote-debugging-port=9222
67
- ```
30
+ ## Installation
68
31
 
69
- Alternatively, Palapala PDF will try to launch Chrome as a child process.
70
- It guesses the path to Chrome, or you configure it like this:
32
+ To install the gem and add it to your application's Gemfile, execute the following command:
71
33
 
72
- ```ruby
73
- Palapala.setup do |config|
74
- config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
75
- end
76
34
  ```
77
-
78
- ### Installing Node/NPX
79
-
80
- Using Brew
81
-
82
- ````
83
- brew install node
35
+ $ bundle add palapala_pdf
84
36
  ```
85
37
 
86
- Using NVM (Node Version Manager)
87
-
88
- ````
89
- curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
90
- source ~/.nvm/nvm.sh
91
- nvm --version
92
- nvm install node
93
- ````
94
-
95
-
96
38
  ## Usage Instructions
97
39
 
98
40
  To create a PDF from HTML content using the `Palapala` library, follow these steps:
99
41
 
100
- 1. **Configuration**:
42
+ **Configuration from inside Ruby**
101
43
 
102
44
  Configure the `Palapala` library with the necessary options, such as the URL for the browser and default settings like scale and format.
103
45
 
@@ -105,76 +47,82 @@ In a Rails context, this could be inside an initializer.
105
47
 
106
48
  ```ruby
107
49
  Palapala.setup do |config|
108
- # run against an external chrome/chromium or leave this out to run against a chrome that is started as a child process
50
+ # debug mode
109
51
  config.debug = true
110
- config.headless_chrome_url = 'http://localhost:9222' # run against a remote Chrome instance
111
- # config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
112
- config.defaults = { scale: 1, format: :A4 }
52
+ # Chrome headless shell version to use (stable, beta, dev, canary, etc.) when launching a new Chrome instance
53
+ config.chrome_headless_shell_version = :stable
54
+ # run against an external chrome/chromium or leave this out to run against a chrome that is started as a child process
55
+ config.headless_chrome_url = 'http://localhost:9222'
56
+ # path to Chrome executable
57
+ config.headless_chrome_path = '/usr/bin/google-chrome-stable'
58
+ # default options for PDF generation
59
+ config.defaults = { scale: 1 }
60
+ # extra params to pass to Chrome when launched as a child process
61
+ config.chrome_params = []
113
62
  end
114
63
  ```
115
- 1. **Create a PDF from HTML**:
116
64
 
117
- Create a PDF file from HTML in `irb`
65
+ **Using environment variables**
66
+
67
+ ```sh
68
+ CHROME_HEADLESS_SHELL_VERSION=canary ruby examples/performance_benchmark.rb
69
+ ```
70
+
71
+ ```sh
72
+ HEADLESS_CHROME_URL=http://192.168.1.1:9222 ruby examples/performance_benchmark.rb
73
+ ```
118
74
 
119
75
  ```sh
120
- gem install palapala_pdf
76
+ CHROME_HEADLESS_PATH=/var/to/chrome ruby examples/performance_benchmark.rb
121
77
  ```
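
A minimal sketch of how settings like these could be resolved from the environment with fallbacks. The variable names are the ones used in the commands above; the `resolve_setting` and `chrome_settings` helpers, and the idea that an empty value falls back to a default, are assumptions for illustration and not necessarily how the gem reads them.

```ruby
# Hypothetical helper: prefer an environment variable, fall back to a default
# when the variable is missing or empty.
def resolve_setting(env, name, default = nil)
  value = env[name]
  (value.nil? || value.empty?) ? default : value
end

# Variable names as used in the commands above; the defaults are illustrative
# ("stable" and port 9222 are the defaults mentioned in the README).
def chrome_settings(env = ENV)
  {
    chrome_headless_shell_version: resolve_setting(env, "CHROME_HEADLESS_SHELL_VERSION", "stable"),
    headless_chrome_url: resolve_setting(env, "HEADLESS_CHROME_URL", "http://localhost:9222"),
    headless_chrome_path: resolve_setting(env, "CHROME_HEADLESS_PATH")
  }
end
```

Passing a plain hash instead of `ENV` (for example `chrome_settings("HEADLESS_CHROME_URL" => "http://192.168.1.1:9222")`) makes the lookup easy to try out without touching the real environment.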
122
78
 
123
- in IRB, load palapala and create a PDF from an HTML snippet:
79
+ **Create a PDF from HTML**
80
+
81
+ Load palapala and create a PDF file from an HTML snippet:
124
82
 
125
83
  ```ruby
126
84
  require "palapala"
127
85
  Palapala::Pdf.new("<h1>Hello, world! #{Time.now}</h1>").save('hello.pdf')
128
86
  ```
129
87
 
130
- Instantiate a new Palapala::Pdf object with your HTML content and generate the PDF binary data.
88
+ Instantiate a new Palapala::Pdf object with your HTML content and generate the PDF binary data:
131
89
 
132
90
  ```ruby
133
91
  require "palapala"
134
92
  binary_data = Palapala::Pdf.new("<h1>Hello, world! #{Time.now}</h1>").binary_data
135
93
  ```
136
94
 
137
- ## Paged CSS
138
-
139
- Paged CSS is a subset of CSS designed for styling printed documents. It extends standard CSS to handle pagination, page sizes, headers, footers, and other aspects of printed content. Paged CSS is commonly used in scenarios where web content needs to be converted to PDFs or other paginated formats.
140
-
141
- ### Headers and Footers
95
+ ## Advanced Examples
142
96
 
143
- When using Chromium-based rendering engines, headers and footers are not controlled by the Paged CSS standard but are instead managed through specific settings in the rendering engine.
97
+ - headers and footers
98
+ - paged css for paper sizes, paper margins, pages breaks, etc
99
+ - js based rendering
144
100
 
145
- With palapala PDF headers and footers are defined using `header_html` and `footer_html` options. These allow you to insert HTML content directly into the header or footer areas.
101
+ ## Connecting to Chrome
146
102
 
147
- ```ruby
148
- Palapala::Pdf.new(
149
- "<p>Hello world</>",
150
- header_html: '<div style="text-align: center;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
151
- footer_html: '<div style="text-align: center;">Generated with Palapala PDF</div>',
152
- margin: { top: "2cm", bottom: "2cm"}
153
- ).save("test.pdf")
154
- ```
103
+ Palapala PDF will go through this process:
155
104
 
156
- ### Page size, orientation and margins
105
+ - check if a Chrome is running and exposing port 9222 (and if so, use it)
106
+ - if `Palapala.headless_chrome_path` is defined, launch Chrome as a child process using that path
107
+ - if **NPX** is available, install a **Chrome-Headless-Shell** variant locally and launch it as a child process. It will install the 'stable' version or the version identified by the `Palapala.chrome_headless_shell_version` setting (or from ENV `CHROME_HEADLESS_SHELL_VERSION`).
108
+ - as a last fallback it will guess a chrome path from the detected OS and try to launch a Chrome with that
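
The lookup order above can be sketched in plain Ruby. This is only an illustration of the decision order: `port_open?`, `executable_on_path?`, `chrome_strategy` and the returned symbols are hypothetical names, not the gem's internals.

```ruby
require "socket"

# True if something accepts TCP connections on host:port (e.g. a running Chrome on 9222).
def port_open?(host, port, timeout: 0.5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  false
end

# True if an executable with this name is found in one of the PATH directories.
def executable_on_path?(name)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
    File.executable?(File.join(dir, name))
  end
end

# Mirrors the four steps listed above, in order of precedence.
def chrome_strategy(port_open:, configured_path:, npx_available:)
  return :use_running_chrome if port_open
  return :launch_configured_path if configured_path
  return :install_headless_shell_via_npx if npx_available

  :guess_path_from_os
end
```

A call such as `chrome_strategy(port_open: port_open?("localhost", 9222), configured_path: nil, npx_available: executable_on_path?("npx"))` then returns which of the four steps would apply on the current machine.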
157
109
 
158
- #### With CSS
110
+ In our experience a Chrome-Headless-Shell version gives the best performance and resource usage.
159
111
 
160
- todo example
112
+ ### Installing Chrome / Headless Chrome manually
161
113
 
162
- #### As params
114
+ This is easiest using npx and tooling provided by Puppeteer (it depends on node/npm, but it's worth it). The install command places the browser in a folder in the current working directory and prints the install path when it finishes. We currently recommend the `chrome-headless-shell` variant, a light version meant just for this use case. The chrome-headless-shell is a minimal, headless version of the Chrome browser designed specifically for environments where you need to run Chrome without a graphical user interface (GUI). This is particularly useful in scenarios like server-side rendering, automated testing, web scraping, or any situation where you need the power of the Chrome browser engine without the overhead of displaying a UI. It is headless by design, with reduced size and overhead but still the same engine.
163
115
 
164
- todo example
116
+ ```sh
117
+ npx @puppeteer/browsers install chrome-headless-shell@stable
118
+ ```
165
119
 
166
- ## JS based rendering
120
+ It installs to a path like this `./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell`. As it's headless by design, it only needs one parameter:
167
121
 
168
- ```html
169
- <html>
170
- <script type="text/javascript">
171
- document.addEventListener("DOMContentLoaded", () => {
172
- document.body.innerHTML += "<p>Current time from JS: " + new Date().toLocaleString() + "</p>";
173
- });
174
- </script>
175
- <body><p>Default body text.</p></body>
176
- </html>
122
+ ```sh
123
+ ./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell --remote-debugging-port=9222
177
124
  ```
125
+ *Note: Chrome releases from August 2024 onward (128.0.6613.85 and later) seem to be seriously performance impacted for PDF generation. Chrome Headless Shell releases don't seem to suffer from this issue.*
178
126
 
179
127
  ## Raw parameters (Page.printToPDF)
180
128
 
@@ -196,15 +144,6 @@ Bug reports and pull requests are welcome on GitHub at https://github.com/palapa
196
144
  - [Eugen Neagoe](https://github.com/eneagoe) - Thank you for your valuable input, feedback and opinions.
197
145
  - [Radu Bogoevici](https://github.com/codenighter) - Thanks for test driving, and all help big and small.
198
146
 
199
- ## Sponsor This Project
200
-
201
- If you find this project useful and would like to support its development, consider sponsoring or buying a coffee to help keep it going:
202
-
203
- - **GitHub Sponsors:** [Sponsor on GitHub](https://github.com/sponsors/koenhandekyn)
204
- - **Buy Me a Coffee:** [Buy a Coffee](https://buymeacoffee.com/koenhandekyn)
205
-
206
- Your support is greatly appreciated and helps maintain the project!
207
-
208
147
  ## Findings
209
148
 
210
149
  - For Chrome, mode headless=new seems to be slower for pdf rendering cases.
@@ -212,24 +151,14 @@ Your support is greatly appreciated and helps maintain the project!
212
151
 
213
152
  ## Primitive benchmark
214
153
 
215
- On a macbook m3, the throughput for 'hello world' PDF generation can reach around 300 docs/second when allowing for some concurrency. As Chrome is actually also very efficient, it scales really well for complex documents also. If you run this in Rails, the concurrency is being taken care of either by the front end thread pool or by the workers and you shouldn't have to think about this. (Using an external Chrome)
154
+ On a macbook m3, the throughput for 'hello world' PDF generation can reach around 500 to 800 docs/second when allowing for some concurrency (4 threads). As Chrome is actually also very efficient, it scales really well for complex documents also. If you run this in Rails, the concurrency is being taken care of either by the front end thread pool or by the workers and you shouldn't have to think about this. (Using an external Chrome)
216
155
 
217
156
  Note: it renders `"Hello #{i}, world #{j}! #{Time.now}."` where i is the thread and j is the iteration counter within the thread and persists it to an SSD (which is very fast these days).
218
157
 
219
- ### benchmarking 20 docs: 1x20, 2x10, 4x5
220
-
221
- ```sh
222
- c:1, n:20 : Throughput = 159.41 docs/sec, Total time = 0.1255 seconds
223
- c:2, n:10 : Throughput = 124.91 docs/sec, Total time = 0.1601 seconds
224
- c:4, n:5 : Throughput = 196.40 docs/sec, Total time = 0.1018 seconds
225
- ```
226
-
227
- ### benchmarking 320 docs: 1x320, 4x80, 8x40
228
-
229
158
  ```sh
230
- c:1, n:320 : Throughput = 184.99 docs/sec, Total time = 1.7299 seconds
231
- c:4, n:80 : Throughput = 302.50 docs/sec, Total time = 1.0578 seconds
232
- c:8, n:40 : Throughput = 254.29 docs/sec, Total time = 1.2584 seconds
159
+ c:1, n:10 : Throughput = 16.76 docs/sec, Total time = 0.5968 seconds
160
+ c:2, n:10 : Throughput = 170.41 docs/sec, Total time = 0.1174 seconds
161
+ c:4, n:80 : Throughput = 579.03 docs/sec, Total time = 0.5526 seconds
233
162
  ```
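
The figures above are consistent with throughput = c × n / total time, where `c` is the number of threads and `n` the documents per thread (for example 4 × 80 / 0.5526 s ≈ 579 docs/sec). A minimal sketch of such a measurement, assuming a block stands in for the actual PDF rendering; `measure_throughput` is a hypothetical helper and not the benchmark script shipped with the gem.

```ruby
# Runs `concurrency` threads that each call the block `per_thread` times
# and reports the overall throughput in docs/second.
def measure_throughput(concurrency:, per_thread:)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(concurrency) do |i|
    Thread.new { per_thread.times { |j| yield i, j } }
  end
  threads.each(&:join)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  total = concurrency * per_thread
  { docs: total, seconds: elapsed, docs_per_second: total / elapsed }
end

# Stand-in workload: only builds the string. A real run would render and save a PDF here.
result = measure_throughput(concurrency: 4, per_thread: 80) do |i, j|
  "Hello #{i}, world #{j}! #{Time.now}."
end

puts format("c:4, n:80 : Throughput = %.2f docs/sec, Total time = %.4f seconds",
            result[:docs_per_second], result[:seconds])
```

Using the monotonic clock avoids skew from wall-clock adjustments during short runs.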
234
163
 
235
164
  This is about a factor 100x faster than what you typically get with Grover and still 10x faster than with many alternatives. It's fast enough that you can run this for a lot of use cases straight from e.g. your Ruby on Rails web worker in the controller on a single machine and still scale to lots of users.
@@ -256,25 +185,22 @@ In this example, `pdf_data` is the binary data of the PDF file. The `filename` o
256
185
 
257
186
  ## Docker
258
187
 
259
- In docker as root you must pass the no-sandbox browser option:
188
+ TODO
260
189
 
261
- ```ruby
262
- Palapala.setup do |config|
263
- config.opts = { 'no-sandbox': nil }
264
- end
265
- ```
266
- It has also been reported that the Chrome process repeatedly crashes when running inside a Docker container on an M1 Mac. Chrome should work as expected when deployed to a Docker container on a non-M1 Mac.
190
+ *It has also been reported that the Chrome process repeatedly crashes when running inside a Docker container on an M1 Mac. Chrome should work as expected when deployed to a Docker container on a non-M1 Mac.*
267
191
 
268
192
  ## Thread-safety
269
193
 
270
- Behind the scenes, a websocket is openend and stored on Thread.current for subsequent requests. Hence, the code is
271
- thread safe in the sense that every web socket get's a new tab in the underlying chromium and get an isolated context.
272
-
273
194
  For performance reasons, the code uses a low level websocket connection that does all its work on the current thread
274
195
  so we can avoid synchronisation penalties.
275
196
 
197
+ Behind the scenes, a websocket is opened and stored on Thread.current for subsequent requests. Hence, the code is
198
+ thread safe in the sense that every web socket gets a new tab in the underlying Chromium and gets an isolated context.
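
A minimal sketch of the per-thread storage pattern described here, using a stand-in object instead of a real websocket. `FakeConnection`, `thread_connection` and the `:sketch_connection` key are hypothetical names for illustration only.

```ruby
# Stand-in for a websocket connection; the real gem opens a socket to Chrome.
FakeConnection = Struct.new(:owner_thread_id)

# Hypothetical accessor: one connection per thread, created lazily on first use
# and reused for subsequent requests on that thread.
# Note: Thread#[] storage is fiber-local, so a new Fiber would get its own slot.
def thread_connection
  Thread.current[:sketch_connection] ||= FakeConnection.new(Thread.current.object_id)
end

main_connection = thread_connection
other_connection = Thread.new { thread_connection }.value

puts main_connection.equal?(thread_connection)  # same thread reuses its connection
puts main_connection.equal?(other_connection)   # another thread gets its own
```

Because no connection is shared between threads, no locking is needed around it, which matches the goal of avoiding synchronisation penalties.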
199
+
276
200
  ## Heroku
277
201
 
202
+ TODO
203
+
278
204
  possible buildpacks
279
205
 
280
206
  https://github.com/heroku/heroku-buildpack-chrome-for-testing
@@ -0,0 +1,16 @@
+ ### Installing Node (npx)
+
+ Using Brew
+
+ ```sh
+ brew install node
+ ```
+
+ Using NVM (Node Version Manager)
+
+ ```sh
+ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
+ source ~/.nvm/nvm.sh
+ nvm --version
+ nvm install node
+ ```
data/doc/paged_css.md ADDED
@@ -0,0 +1,167 @@
1
+ ## Paged CSS
2
+
3
+ Paged CSS is the part of CSS designed for styling printed documents. It adds rules for pagination, page sizes, margins, and other aspects of printed content. Paged CSS is commonly used in scenarios where web content needs to be converted to PDFs or other paginated formats.
4
+
5
+ Setting page size
6
+
7
+ ```css
8
+ @page {
9
+ /* set a standard page size */
10
+ size: A4 landscape;
11
+ /* Custom */
12
+ size: 8.5in 11in; /* Width x Height */
13
+ }
14
+ ```
15
+
16
+ Setting page margins
17
+
18
+ ```css
19
+ @page {
20
+ margin: 1in; /* 1 inch on all sides */
21
+ margin: 1in 0.5in 1in 0.5in; /* Top, Right, Bottom, Left */
22
+ }
23
+ ```
24
+
25
+ Forcing a Page Break before or after an Element
26
+
27
+ ```css
28
+ /* This ensures that every `h1` starts on a new page. */
29
+ h1 {
30
+ page-break-before: always;
31
+ }
32
+ /* This ensures that every `p` element ends with a page break, starting the next content on a new page. */
33
+ p {
34
+ page-break-after: always;
35
+ }
36
+ /* This prevents a table from being split across two pages. */
37
+ table {
38
+ page-break-inside: avoid;
39
+ }
40
+ ```
41
+
42
+ ### Headers and Footers
43
+
44
+ When using Chromium-based rendering engines, headers and footers are not controlled by the Paged CSS standard but are instead managed through specific settings in the rendering engine.
45
+
46
+ With Palapala PDF, headers and footers are defined using the `header_template` and `footer_template` options. These allow you to insert HTML content directly into the header or footer areas.
47
+
48
+ It is critical to specify a font size, because by default Chrome uses a very small font.
49
+
50
+ ```ruby
51
+ Palapala::Pdf.new(
52
+ "<p>Hello world</p>",
53
+ header_template: '<div style="text-align: center; font-size: 12pt;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
54
+ footer_template: '<div style="text-align: center; font-size: 12pt;">Generated with Palapala PDF</div>',
55
+ ).save("test.pdf")
56
+ ```
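
A small sketch of a helper that always applies an explicit font size to such templates. The `header_template` and `footer_template` options and the `pageNumber`, `totalPages` and `title` classes appear elsewhere in this diff; the `print_template` helper and the constant names are hypothetical.

```ruby
# Wraps content in a div with an explicit font size, since Chrome's default
# header/footer font is very small.
def print_template(inner_html, font_size: "12pt")
  %(<div style="text-align: center; font-size: #{font_size}; width: 100%;">#{inner_html}</div>)
end

# Chrome fills elements with these classes with the current values.
PAGE_COUNTER = 'Page <span class="pageNumber"></span> of <span class="totalPages"></span>'

HEADER_TEMPLATE = print_template(PAGE_COUNTER)
FOOTER_TEMPLATE = print_template('<span class="title"></span>', font_size: "10pt")

# Assumed usage, following the example above:
# Palapala::Pdf.new("<p>Hello world</p>",
#                   header_template: HEADER_TEMPLATE,
#                   footer_template: FOOTER_TEMPLATE).save("test.pdf")
```

Keeping the font size as a keyword argument makes it hard to forget, while still allowing a different size per template.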
57
+
58
+ ### Examples
59
+
60
+ #### Headers and Footers
61
+
62
+ TODO explain about headers and footers, font sizes, styles being independent, and how to insert current page, total pages, etc.
63
+
64
+ #### Page sizes and margins
65
+
66
+ Paged CSS, also known as @page CSS, is used to control the layout and appearance of printed documents. It allows you to define page-specific styles, such as sizes and margins, which are crucial for generating well-formatted PDFs.
67
+
68
+ You can specify the size of the page using predefined sizes or custom dimensions. Common predefined sizes include A4, A3, letter, etc. Margins can be set for the top, right, bottom, and left sides of the page. You can specify all four margins at once or individually. You can also define named pages for different sections of your document.
69
+
70
+ ##### Example: Different First Page
71
+
72
+ TODO Validate
73
+
74
+ ```css
75
+ @page first {
76
+ size: A4;
77
+ margin: 2in; /* Larger margin for the first page */
78
+ }
79
+
80
+ @page {
81
+ size: A4;
82
+ margin: 1in;
83
+ }
84
+
85
+ body {
86
+ counter-reset: page;
87
+ }
88
+
89
+ body:first {
90
+ page: first;
91
+ }
92
+ ```
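
The CSS above is marked as not yet validated. In CSS Paged Media, the first page is normally targeted with the `:first` page pseudo-class on `@page` rather than with a selector on `body`. A sketch that builds such a document in Ruby; `FIRST_PAGE_CSS` and `document_with_first_page_margin` are hypothetical names, and how a given Chrome version honours per-page margins is an assumption that still needs checking.

```ruby
FIRST_PAGE_CSS = <<~CSS
  @page {
    size: A4;
    margin: 1in;
  }

  /* :first is the page pseudo-class for the first printed page */
  @page :first {
    margin: 2in;
  }
CSS

# Embeds the page rules in a complete HTML document.
def document_with_first_page_margin(body_html)
  "<html><head><style>#{FIRST_PAGE_CSS}</style></head><body>#{body_html}</body></html>"
end

html = document_with_first_page_margin("<h1>Title</h1><p>Hello world</p>")
# Assumed usage: Palapala::Pdf.new(html).save("first_page.pdf")
```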
93
+
94
+ #### Page breaks
95
+
96
+ Paged CSS allows you to control how content is divided across pages when printing or generating PDFs. Page breaks are an essential part of this, as they determine where a new page starts. You can control page breaks using the `page-break-before`, `page-break-after`, and `page-break-inside` properties.
97
+
98
+ ##### Page Break Properties
99
+
100
+ 1. **`page-break-before`**: Forces a page break before the element.
101
+ 2. **`page-break-after`**: Forces a page break after the element.
102
+ 3. **`page-break-inside`**: Prevents or allows a page break inside the element.
103
+
104
+ ##### Values
105
+
106
+ - `auto`: Default. Neither forces nor prevents a page break.
107
+ - `always`: Always forces a page break.
108
+ - `avoid`: Avoids a page break inside the element.
109
+ - `left`: Forces a page break so that the next page is a left page.
110
+ - `right`: Forces a page break so that the next page is a right page.
111
+
112
+ ##### Examples
113
+
114
+ ```css
115
+ /* This ensures that every `h1` starts on a new page. */
116
+ h1 {
117
+ page-break-before: always;
118
+ }
119
+ /* This ensures that every `p` element ends with a page break, starting the next content on a new page. */
120
+ p {
121
+ page-break-after: always;
122
+ }
123
+ /* This prevents a table from being split across two pages. */
124
+ table {
125
+ page-break-inside: avoid;
126
+ }
127
+ ```
128
+
129
+ ##### Practical Use Cases
130
+
131
+ - **Chapter Titles**: Use `page-break-before: always;` for chapter titles to ensure each chapter starts on a new page.
132
+ - **Sections**: Use `page-break-after: always;` for sections that should end with a page break.
133
+ - **Tables and Figures**: Use `page-break-inside: avoid;` to keep tables and figures from being split across pages.
134
+
135
+ #### Tables across Pages
136
+
137
+ TODO explain `display` property with the values `table-header-group` and `table-footer-group`
138
+
139
+ ##### Example
140
+
141
+ ```html
142
+ <table>
143
+ <thead>
144
+ <tr>
145
+ <th>Header 1</th>
146
+ <th>Header 2</th>
147
+ </tr>
148
+ </thead>
149
+ <tbody>
150
+ <tr>
151
+ <td>Data 1</td>
152
+ <td>Data 2</td>
153
+ </tr>
154
+ <!-- More rows -->
155
+ </tbody>
156
+ <tfoot>
157
+ <tr>
158
+ <td>Footer 1</td>
159
+ <td>Footer 2</td>
160
+ </tr>
161
+ </tfoot>
162
+ </table>
163
+ ```
164
+
165
+ In this example:
166
+ - The `<thead>` section will be repeated at the top of each page.
167
+ - The `<tfoot>` section will be repeated at the bottom of each page.
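
A minimal sketch of generating such a table from Ruby data, keeping the header and footer rows in `<thead>` and `<tfoot>` (which default to `display: table-header-group` and `table-footer-group`). The `table_html` helper is hypothetical and not part of the gem.

```ruby
require "erb"

# Builds a table whose header and footer rows live in <thead> and <tfoot>,
# escaping every cell value.
def table_html(headers, rows, footers)
  escape = ->(value) { ERB::Util.html_escape(value.to_s) }
  head = headers.map { |h| "<th>#{escape.call(h)}</th>" }.join
  foot = footers.map { |f| "<td>#{escape.call(f)}</td>" }.join
  body = rows.map do |row|
    "<tr>#{row.map { |cell| "<td>#{escape.call(cell)}</td>" }.join}</tr>"
  end.join("\n")

  <<~HTML
    <table>
    <thead><tr>#{head}</tr></thead>
    <tbody>
    #{body}
    </tbody>
    <tfoot><tr>#{foot}</tr></tfoot>
    </table>
  HTML
end

rows = (1..100).map { |i| ["Data #{i}", i * 2] }
html = table_html(["Header 1", "Header 2"], rows, ["Footer 1", "Footer 2"])
```

With enough rows to span several pages, the resulting HTML can be passed to `Palapala::Pdf.new` to check how the header and footer repeat.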
data/examples/all.rb ADDED
@@ -0,0 +1,9 @@
+ $LOAD_PATH.unshift File.expand_path('../lib', __dir__)
+ require 'palapala'
+
+ $debug = ARGV[0] == 'debug'
+ Palapala.debug = $debug
+
+ require_relative "headers_and_footers"
+ require_relative "paged_css"
+ require_relative "js_based_rendering"
@@ -0,0 +1,169 @@
1
+ <!--
2
+ OPTIONS AS PASSED IN THE C++ code
3
+ =================================
4
+ options.Set(kSettingHeaderFooterDate,
5
+ base::Time::Now().InMillisecondsFSinceUnixEpoch());
6
+ options.Set("width", static_cast<double>(page_size.width()));
7
+ options.Set("height", static_cast<double>(page_size.height()));
8
+ options.Set("topMargin", page_layout.margin_top);
9
+ options.Set("bottomMargin", page_layout.margin_bottom);
10
+ options.Set("leftMargin", page_layout.margin_left);
11
+ options.Set("rightMargin", page_layout.margin_right);
12
+ // `page_index` is 0-based, so 1 is added to get the page number.
13
+ options.Set("pageNumber", base::checked_cast<int>(page_index + 1));
14
+ options.Set("totalPages", base::checked_cast<int>(total_pages));
15
+ options.Set("url", params.url);
16
+ std::u16string title = source_frame.GetDocument().Title().Utf16();
17
+ options.Set("title", title.empty() ? params.title : title);
18
+ options.Set("headerTemplate", params.header_template);
19
+ options.Set("footerTemplate", params.footer_template);
20
+ options.Set("isRtl", base::i18n::IsRTL());
21
+ -->
22
+
23
+ <!doctype html>
24
+ <html>
25
+
26
+ <head>
27
+ <link rel="stylesheet" href="chrome://resources/css/text_defaults.css">
28
+ <style>
29
+ body {
30
+ display: flex;
31
+ flex-direction: column;
32
+ margin: 0;
33
+ }
34
+
35
+ #header,
36
+ #footer {
37
+ display: flex;
38
+ flex: none;
39
+ }
40
+
41
+ #header {
42
+ align-items: flex-start;
43
+ padding-top: 15pt;
44
+ }
45
+
46
+ #footer {
47
+ align-items: flex-end;
48
+ padding-bottom: 15pt;
49
+ }
50
+
51
+ #content {
52
+ flex: auto;
53
+ }
54
+
55
+ .left {
56
+ flex: none;
57
+ padding-left: 24pt;
58
+ /* csschecker-disable-line left-right */
59
+ padding-right: 6pt;
60
+ /* csschecker-disable-line left-right */
61
+ }
62
+
63
+ .center {
64
+ flex: auto;
65
+ padding-left: 24pt;
66
+ /* csschecker-disable-line left-right */
67
+ padding-right: 24pt;
68
+ /* csschecker-disable-line left-right */
69
+ text-align: center;
70
+ }
71
+
72
+ .right {
73
+ flex: none;
74
+ /* historically does not account for RTL */
75
+ padding-left: 6pt;
76
+ /* csschecker-disable-line left-right */
77
+ padding-right: 24pt;
78
+ /* csschecker-disable-line left-right */
79
+ }
80
+
81
+ .grow {
82
+ flex: auto;
83
+ }
84
+
85
+ .text {
86
+ font-size: 8pt;
87
+ overflow: hidden;
88
+ text-overflow: ellipsis;
89
+ white-space: nowrap;
90
+ }
91
+ </style>
92
+ <script>
93
+
94
+ function getComputedStyleAsFloat(style, value) {
95
+ return parseFloat(style.getPropertyValue(value).slice(0, -2));
96
+ }
97
+
98
+ function elementIntersects(element, topPos, bottomPos, leftPos, rightPos) {
99
+ const rect = element.getBoundingClientRect();
100
+ const style = window.getComputedStyle(element);
101
+
102
+ // Only consider the size of |element|, so remove the padding from |rect|.
103
+ // The padding is used for positioning.
104
+ rect.top += getComputedStyleAsFloat(style, 'padding-top');
105
+ rect.bottom -= getComputedStyleAsFloat(style, 'padding-bottom');
106
+ rect.left += getComputedStyleAsFloat(style, 'padding-left');
107
+ rect.right -= getComputedStyleAsFloat(style, 'padding-right');
108
+ return leftPos < rect.right && rightPos > rect.left && topPos < rect.bottom &&
109
+ bottomPos > rect.top;
110
+ }
111
+
112
+ function setupHeaderFooterTemplate(options) {
113
+ const body = document.querySelector('body');
114
+ const header = document.querySelector('#header');
115
+ const footer = document.querySelector('#footer');
116
+
117
+ body.style.width = `${options.width}px`;
118
+ body.style.height = `${options.height}px`;
119
+ header.style.height = `${options.topMargin}px`;
120
+ footer.style.height = `${options.bottomMargin}px`;
121
+
122
+ const topMargin = options.topMargin;
123
+ const bottomMargin = options.height - options.bottomMargin;
124
+ const leftMargin = options.leftMargin;
125
+ const rightMargin = options.width - options.rightMargin;
126
+
127
+ header.innerHTML = options['headerTemplate'] || `
128
+ <div class='date text left'></div>
129
+ <div class='title text center'></div>`;
130
+ footer.innerHTML = options['footerTemplate'] || `
131
+ <div class='url text left grow'></div>
132
+ <div class='text right'>
133
+ <span class='pageNumber'></span>/<span class='totalPages'></span>
134
+ </div>`;
135
+
136
+ const date = new Date(options.date);
137
+ const formatter =
138
+ new Intl.DateTimeFormat(
139
+ navigator.languages[0].split('@')[0],
140
+ { dateStyle: 'short', timeStyle: 'short' });
141
+ options.date = formatter.format(date);
142
+ for (const cssClass of ['date', 'title', 'url', 'pageNumber', 'totalPages']) {
143
+ for (const element of document.querySelectorAll(`.${cssClass}`)) {
144
+ element.textContent = options[cssClass];
145
+ }
146
+ }
147
+ for (const element of document.querySelectorAll(`.text`)) {
148
+ if (options.isRtl &&
149
+ !element.classList.contains('url') &&
150
+ !element.classList.contains('title')) {
151
+ element.dir = 'rtl';
152
+ }
153
+ if (elementIntersects(element, topMargin, bottomMargin, leftMargin,
154
+ rightMargin)) {
155
+ element.style.visibility = 'hidden';
156
+ }
157
+ }
158
+ }
159
+
160
+ </script>
161
+ </head>
162
+
163
+ <body>
164
+ <div id="header"></div>
165
+ <div id="content"></div>
166
+ <div id="footer"></div>
167
+ </body>
168
+
169
+ </html>
Binary file
@@ -3,47 +3,18 @@
3
3
  $LOAD_PATH.unshift File.expand_path('../lib', __dir__)
4
4
  require 'palapala'
5
5
 
6
- HEADER_HTML = <<~HTML
7
- <style type="text/css">
8
- .header {
9
- -webkit-print-color-adjust: exact;
10
- border-bottom: 1px solid lightgray;
11
- color: black;
12
- font-family: Arial, Helvetica, sans-serif;
13
- font-size: 12pt;
14
- margin: 0 auto;
15
- padding: 5px;
16
- text-align: center;
17
- vertical-align: middle;
18
- width: 100%;
19
- border: 1px solid black;
20
- }
21
- </style>
22
- <div class="header" style="text-align: center">
23
- Page <span class="pageNumber"></span> of <span class="totalPages"></span>
24
- </div>
25
- HTML
6
+ header_template =
7
+ '<div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>'
26
8
 
27
- Palapala.setup do |config|
28
- config.debug = true
29
- config.headless_chrome_url = 'http://localhost:9222' # run against a remote Chrome instance
30
- # config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
31
- end
9
+ footer_template =
10
+ '<div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">Generated with Palapala PDF</div>'
32
11
 
33
- result = Palapala::Pdf.new(
34
- # "<style>@page { size: A4 landscape; }</style><p>Hello world #{Time.now}</>",
12
+ Palapala::Pdf.new(
35
13
  "<h1>Title</h1><p>Hello world #{Time.now}</>",
36
- header_html: HEADER_HTML,
37
- footer_html: '<div style="text-align: center;">Generated with Palapala PDF</div>',
38
- scale: 0.75,
39
- prefer_css_page_size: false,
40
- margin: { top: 3, bottom: 2 }
41
- ).save('tmp/headers_and_footers.pdf',
42
- generateDocumentOutline: false,
43
- # marginTop: 1,
44
- # paperWidth: 3,
45
- displayHeaderFooter: true,
46
- # landscape: false,
47
- headerTemplate: HEADER_HTML)
14
+ header_template:,
15
+ footer_template:,
16
+ margin_top: 3,
17
+ margin_bottom: 3).save('headers_and_footers.pdf')
48
18
 
49
- puts result
19
+ puts "Generated headers_and_footers.pdf"
20
+ # `open headers_and_footers.pdf`
Binary file
@@ -14,9 +14,8 @@ DOCUMENT = <<~HTML
14
14
  </html>
15
15
  HTML
16
16
 
17
- Palapala.setup do |config|
18
- config.debug = true
19
- end
17
+ Palapala::Pdf.new(DOCUMENT).save('js_based_rendering.pdf')
20
18
 
21
- result = Palapala::Pdf.new(DOCUMENT).save('tmp/js_based_rendering.pdf')
22
- puts result
19
+ puts "Generated js_based_rendering.pdf"
20
+
21
+ # `open js_based_rendering.pdf`
Binary file
@@ -0,0 +1,186 @@
1
+ # frozen_string_literal: true
2
+
3
+ $LOAD_PATH.unshift File.expand_path("../lib", __dir__)
4
+ require "palapala"
5
+
6
+ long_text = (1..30).map { "Demonstrate a paragraph that is not split across pages." }.join(" ")
7
+
8
+ def table(rows)
9
+ <<~HTML
10
+ <table>
11
+ <thead>
12
+ <tr>
13
+ <th>Header 1</th>
14
+ <th>Header 2</th>
15
+ </tr>
16
+ </thead>
17
+ <tbody>
18
+ #{ (1..rows).map { |i| "<tr><td>Row #{i}, Cell 1</td><td>Row #{i}, Cell 2</td></tr>" }.join }
19
+ </tbody>
20
+ <tfoot>
21
+ <tr>
22
+ <td>Footer 1</td>
23
+ <td>Footer 2</td>
24
+ </tr>
25
+ </tfoot>
26
+ </table>
27
+ HTML
28
+ end
29
+
30
+ big_table = table(35)
31
+ small_table = table(5)
32
+
33
+ document = <<~HTML
34
+ <html>
35
+ <style>
36
+ @page {
37
+ size: A4;
38
+ margin: 2cm;
39
+ margin-top: 3cm;
40
+ margin-bottom: 3cm;
41
+ }
42
+ body, html {
43
+ margin: 0;
44
+ padding: 0;
45
+ font-family: Arial, sans-serif;
46
+ }
47
+ h1 {
48
+ page-break-before: always;
49
+ border-bottom: 1px solid black;
50
+ }
51
+ h2 {
52
+ /* keep with next */
53
+ page-break-after: avoid;
54
+ }
55
+ @page:first {
56
+ size: A4 landscape;
57
+ margin: 0; /* no margin for the first page */
58
+ padding: 0;
59
+ }
60
+ div.titlepage {
61
+ background-color: black;
62
+ color: white;
63
+ font-size: 72pt;
64
+ text-align: center;
65
+ display: flex;
66
+ justify-content: center;
67
+ align-items: center;
68
+ height: 100%;
69
+ width: 100vw;
70
+ }
71
+ table {
72
+ font-size: 10pt;
73
+ width: 100%;
74
+ border-collapse: collapse;
75
+ td, th {
76
+ border: 1px solid black;
77
+ padding: 0.5rem;
78
+ }
79
+ & thead, & tfoot {
80
+ tr {
81
+ background-color: lightgray;
82
+ & th, & td {
83
+ padding-top: 0.5rem;
84
+ padding-bottom: 0.5rem;
85
+ }
86
+ }
87
+ }
88
+ }
89
+ /* Initialize counters */
90
+ body {
91
+ counter-reset: h1Counter h2Counter;
92
+ }
93
+ /* Numbering for H1 elements */
94
+ h1 {
95
+ counter-increment: h1Counter;
96
+ counter-reset: h2Counter; /* Reset h2 counter when a new h1 appears */
97
+ }
98
+ h1::before {
99
+ content: counter(h1Counter) ". ";
100
+ /* font-weight: bold; */
101
+ }
102
+ /* Numbering for H2 elements */
103
+ h2 {
104
+ counter-increment: h2Counter;
105
+ }
106
+ h2::before {
107
+ content: counter(h1Counter) "." counter(h2Counter) " ";
108
+ /* font-weight: bold; */
109
+ }
110
+ /* named pages */
111
+ @page addendum {
112
+ size: A5;
113
+ margin: 1cm;
114
+ margin-top: 3cm;
115
+ }
116
+ .addendum {
117
+ page: addendum;
118
+ counter-reset: h1Counter h2Counter;
119
+ }
120
+ </style>
121
+ <body>
122
+ <div class="titlepage">
123
+ <c-title>Title Page</c-title>
124
+ </div>
125
+ <h1>New Section</h1>
126
+ <h2>Subsection tables</h2>
127
+ <p>This demonstrates a table with a header and footer that spans multiple pages.</p>
128
+ #{big_table}
129
+ <h2>Subsection page break inside</h2>
130
+ <p style="page-break-inside: avoid; text-align: justify">
131
+ #{long_text}
132
+ </p>
133
+ <p>Note that the section title has moved to the second page because the paragraph above was moved to the second page.</p>
134
+ <h1>New Section</h1>
135
+ <p>Page 3 content</p>
136
+ <p>A small table</p>
137
+ #{small_table}
138
+ <h2>Subsection</h2>
139
+ <p>Some content</p>
140
+ <h2>Subsection</h2>
141
+ <p>Some content</p>
142
+ <div class="addendum">
143
+ This is an addendum and the page size is A5.
144
+ Headers are starting again from 1.
145
+ <h1>Some addendum header</h1>
146
+ <h2>Subsection</h2>
147
+ <h2>Subsection</h2>
148
+ <h1>Some addendum header</h1>
149
+ </div>
150
+ </body>
151
+ </html>
152
+ HTML
153
+
154
+ def debug(color: "red")
155
+ <<~HTML
156
+ <style>
157
+ /* these are the ids Chrome assigns to the header, footer and content in the main template */
158
+ #header, #content, #footer {
159
+ border: 1px dotted #{color}; /* outlines the areas for debugging */
160
+ }
161
+ </style>
162
+ HTML
163
+ end
164
+
165
+ def header_footer_template(debug_color: nil)
166
+ <<~HTML
167
+ #{ debug(color: debug_color) if debug_color }
168
+ <div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">#{yield}</div>
169
+ HTML
170
+ end
171
+
172
+ footer_template = header_footer_template do
173
+ "Page <span class='pageNumber'></span> of <span class='totalPages'></span>"
174
+ end
175
+
176
+ header_template = header_footer_template do
177
+ "Generated with Palapala PDF"
178
+ end
179
+
180
+ Palapala::Pdf.new(document,
181
+ header_template:,
182
+ footer_template:).save("paged_css.pdf")
183
+
184
+ puts "Generated paged_css.pdf"
185
+
186
+ # `open paged_css.pdf`
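The block-taking helper in this example can be exercised without Chrome. The following is a minimal sketch of the same pattern in isolation; the name `debug_style` is a stand-in chosen here for the example's `debug` method, and the markup is trimmed for brevity.

```ruby
# Sketch only: mirrors the example's helper, renamed to debug_style for clarity.
def debug_style(color)
  <<~HTML
    <style>
      #header, #content, #footer { border: 1px dotted #{color}; }
    </style>
  HTML
end

# Wraps whatever the block returns in a centered div.
# The debug style is emitted only when a color is given.
def header_footer_template(debug_color: nil)
  <<~HTML
    #{debug_style(debug_color) if debug_color}
    <div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">#{yield}</div>
  HTML
end

footer = header_footer_template do
  "Page <span class='pageNumber'></span> of <span class='totalPages'></span>"
end
puts footer
```

Passing `debug_color: "red"` adds the dotted outline around the `#header`, `#content` and `#footer` areas, which can help when tuning margins.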
@@ -5,14 +5,10 @@ $LOAD_PATH.unshift File.expand_path('../lib', __dir__)
5
5
  require 'benchmark'
6
6
  require 'palapala'
7
7
 
8
- debug = ARGV[0] == 'debug'
8
+ $debug = ARGV[0] == 'debug'
9
+ $save = ARGV[0] == 'save'
9
10
 
10
- Palapala.setup do |config|
11
- # config.headless_chrome_url = 'http://localhost:9222'
12
- config.debug = debug
13
- config.defaults.merge! scale: 0.75, format: :A4
14
- config.chrome_headless_shell_version = 'canary'
15
- end
11
+ Palapala.debug = $debug
16
12
 
17
13
  # @param concurrency Number of concurrent threads
18
14
  # @param iterations Number of iterations per thread
@@ -22,7 +18,8 @@ def benchmark(concurrency, iterations)
22
18
  Thread.new do
23
19
  iterations.times do |j|
24
20
  doc = "Hello #{i}, world #{j}! #{Time.now}."
25
- Palapala::Pdf.new(doc).save("tmp/benchmark_#{i}_#{j}.pdf")
21
+ pdf = Palapala::Pdf.new(doc)
22
+ $save ? pdf.save("tmp/benchmark_#{i}_#{j}.pdf") : pdf.binary_data
26
23
  end
27
24
  end
28
25
  end
@@ -32,18 +29,9 @@ def benchmark(concurrency, iterations)
32
29
  time
33
30
  end
34
31
 
35
- puts 'warmup'
32
+ puts "Warmup..."
33
+ benchmark(1, 5)
34
+ puts "Starting benchmark..."
36
35
  benchmark(1, 10)
37
-
38
- # benchmark(1, 20)
39
- benchmark(2, 10)
40
- # benchmark(4, 5)
41
- # benchmark(5, 4)
42
- # benchmark(20, 1)
43
-
44
- # benchmark(1, 320)
45
- # benchmark(2, 320 / 2)
36
+ benchmark(2, 20 / 2)
46
37
  benchmark(4, 320 / 4)
47
- # benchmark(8, 320 / 8)
48
- # benchmark(20, 2)
49
- # benchmark(40, 1)
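The body of `benchmark` is only partly visible in these hunks. A minimal sketch of a thread-based timing harness of this shape, assuming the work is passed in as a block (the block parameter is an assumption made here so the sketch runs without Chrome):

```ruby
require "benchmark"

# Sketch only: runs `iterations` units of work on each of `concurrency` threads
# and returns the elapsed wall-clock time in seconds.
def benchmark(concurrency, iterations, &work)
  Benchmark.realtime do
    threads = (1..concurrency).map do |i|
      Thread.new do
        iterations.times { |j| work.call(i, j) }
      end
    end
    threads.each(&:join) # wait for every worker before stopping the clock
  end
end

elapsed = benchmark(2, 3) { |i, j| "Hello #{i}, world #{j}!" }
puts format("%.4fs", elapsed)
```

In the gem's script the unit of work would be the PDF render, either saved to disk or kept in memory depending on the `save` argument.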
@@ -25,9 +25,9 @@ module Palapala
25
25
  end
26
26
  end
27
27
 
28
- # Check if a Chrome is running
28
+ # Check if a Chrome is running locally
29
29
  def self.chrome_running?
30
- port_in_use? || # Check if the port is in use and Chrome is running externally
30
+ port_in_use? || # Check if the port is in use
31
31
  chrome_process_healthy? # Check if the process is still alive
32
32
  end
33
33
 
@@ -59,9 +59,9 @@ module Palapala
59
59
  system("which npx > /dev/null 2>&1")
60
60
  end
61
61
 
62
- def self.spawn_chrome_headless_server
62
+ def self.spawn_chrome_headless_server_with_npx
63
63
  # Run the command and capture the output
64
- puts "Installing latest stable chrome-headless-shell..."
64
+ puts "Installing/launching chrome-headless-shell@#{Palapala.chrome_headless_shell_version}"
65
65
  output, status = Open3.capture2("npx --yes @puppeteer/browsers install chrome-headless-shell@#{Palapala.chrome_headless_shell_version}")
66
66
 
67
67
  if status.success?
@@ -82,28 +82,37 @@ module Palapala
82
82
  # Display the version
83
83
  system("#{chrome_path} --version") if Palapala.debug
84
84
  # Launch chrome-headless-shell with the --remote-debugging-port parameter
85
- if Palapala.debug
86
- spawn(chrome_path, "--remote-debugging-port=9222", "--disable-gpu")
85
+ params = [ "--disable-gpu", "--remote-debugging-port=9222" ]
86
+ params.merge!(Palapala.chrome_params) if Palapala.chrome_params
87
+ pid = if Palapala.debug
88
+ spawn(chrome_path, *params)
87
89
  else
88
- spawn(chrome_path, "--remote-debugging-port=9222", "--disable-gpu", out: "/dev/null", err: "/dev/null")
90
+ spawn(chrome_path, *params, out: "/dev/null", err: "/dev/null")
89
91
  end
92
+ Palapala.headless_chrome_url = "http://localhost:9222"
93
+ pid
90
94
  else
91
95
  raise "Failed to install chrome-headless-shell"
92
96
  end
93
97
  end
94
98
 
99
+ def self.spawn_chrome_from_path
100
+ params = [ "--headless", "--disable-gpu", "--remote-debugging-port=9222" ]
101
+ params.merge!(Palapala.chrome_params) if Palapala.chrome_params
102
+ # Spawn an existing chrome with the path and parameters
103
+ Process.spawn(chrome_path, *params)
104
+ end
105
+
95
106
  # Spawn a Chrome child process
96
107
  def self.spawn_chrome
97
108
  return if chrome_running?
98
109
 
99
- if self.npx_installed?
100
- @chrome_process_id = spawn_chrome_headless_server
101
- else
102
- params = [ "--headless", "--disable-gpu", "--remote-debugging-port=9222" ]
103
- params.merge!(Palapala.chrome_params) if Palapala.chrome_params
104
- # Spawn an existing chrome with the path and parameters
105
- @chrome_process_id = Process.spawn(chrome_path, *params)
106
- end
110
+ @chrome_process_id =
111
+ if Palapala.headless_chrome_path.nil? && self.npx_installed?
112
+ spawn_chrome_headless_server_with_npx
113
+ else
114
+ spawn_chrome_from_path
115
+ end
107
116
 
108
117
  # Wait until the port is in use
109
118
  sleep 0.1 until port_in_use?
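A note on the flag list built in this hunk: Ruby's `Array` does not define `merge!` (that is a `Hash` method), so appending extra flags to an argument array is usually done with `concat`. The sketch below shows that approach under two assumptions that are not confirmed by the diff: that `chrome_params` is `nil` or an array of flag strings, and that a helper named `chrome_args` is acceptable for illustration.

```ruby
# Sketch only: chrome_args is a hypothetical helper, not part of the gem.
# extra_params is assumed to be nil or an array of flag strings.
def chrome_args(extra_params = nil, headless: true, port: 9222)
  args = []
  args << "--headless" if headless
  args << "--disable-gpu"
  args << "--remote-debugging-port=#{port}"
  args.concat(Array(extra_params)) # Array(nil) is [], so nil is safe here
  args
end

puts chrome_args(["--no-sandbox"]).join(" ")
```

The resulting array can be splatted into `Process.spawn(path, *args)` in the same way the hunk does.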
data/lib/palapala/pdf.rb CHANGED
@@ -42,20 +42,22 @@ module Palapala
42
42
  scale: nil)
43
43
  @content = content || raise(ArgumentError, "Content is required and can't be nil")
44
44
  @opts = {}
45
- @opts[:headerTemplate] = header_template || Palapala.defaults[:header_template]
46
- @opts[:footerTemplate] = footer_template || Palapala.defaults[:footer_template]
47
- @opts[:pageRanges] = page_ranges || Palapala.defaults[:page_ranges]
48
- @opts[:generateTaggedPDF] = generate_tagged_pdf || Palapala.defaults[:generate_tagged_pdf]
49
- @opts[:paperWidth] = paper_width || Palapala.defaults[:paper_width]
50
- @opts[:paperHeight] = paper_height || Palapala.defaults[:paper_height]
51
- @opts[:landscape] = landscape || Palapala.defaults[:landscape]
52
- @opts[:marginTop] = margin_top || Palapala.defaults[:margin_top]
53
- @opts[:marginLeft] = margin_left || Palapala.defaults[:margin_left]
54
- @opts[:marginBottom] = margin_bottom || Palapala.defaults[:margin_bottom]
55
- @opts[:marginRight] = margin_right || Palapala.defaults[:margin_right]
56
- @opts[:preferCSSPageSize] = prefer_css_page_size || Palapala.defaults[:prefer_css_page_size]
57
- @opts[:printBackground] = print_background || Palapala.defaults[:print_background]
58
- @opts[:scale] = scale || Palapala.defaults[:scale]
45
+ @opts[:headerTemplate] = header_template || Palapala.defaults[:header_template]
46
+ @opts[:footerTemplate] = footer_template || Palapala.defaults[:footer_template]
47
+ @opts[:pageRanges] = page_ranges || Palapala.defaults[:page_ranges]
48
+ @opts[:generateTaggedPDF] = generate_tagged_pdf || Palapala.defaults[:generate_tagged_pdf]
49
+ @opts[:paperWidth] = paper_width || Palapala.defaults[:paper_width]
50
+ @opts[:paperHeight] = paper_height || Palapala.defaults[:paper_height]
51
+ @opts[:landscape] = landscape || Palapala.defaults[:landscape]
52
+ @opts[:marginTop] = margin_top || Palapala.defaults[:margin_top]
53
+ @opts[:marginLeft] = margin_left || Palapala.defaults[:margin_left]
54
+ @opts[:marginBottom] = margin_bottom || Palapala.defaults[:margin_bottom]
55
+ @opts[:marginRight] = margin_right || Palapala.defaults[:margin_right]
56
+ @opts[:preferCSSPageSize] = prefer_css_page_size || Palapala.defaults[:prefer_css_page_size]
57
+ @opts[:printBackground] = print_background || Palapala.defaults[:print_background]
58
+ @opts[:scale] = scale || Palapala.defaults[:scale]
59
+ @opts[:displayHeaderFooter] = (@opts[:headerTemplate] || @opts[:footerTemplate]) ? true : false
60
+ @opts[:encoding] = :binary
59
61
  @opts.compact!
60
62
  end
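One property of the `value || default` pattern above is that an explicit `false` is treated like a missing value, so with a default of `true` the caller cannot switch an option off. The sketch below contrasts that with a nil-only fallback; `DEFAULTS`, `resolve_with_or`, `resolve_nil_only` and `build_opts` are illustrative names, not the gem's API.

```ruby
# Sketch only: a reduced version of the option handling, with hypothetical names.
DEFAULTS = { print_background: true, margin_top: 0 }.freeze

# Mirrors `value || default`: false falls through to the default.
def resolve_with_or(value, key)
  value || DEFAULTS[key]
end

# Falls back only when the caller passed nothing (nil).
def resolve_nil_only(value, key)
  value.nil? ? DEFAULTS[key] : value
end

def build_opts(header_template: nil, footer_template: nil, print_background: nil, margin_top: nil)
  opts = {
    headerTemplate: header_template,
    footerTemplate: footer_template,
    printBackground: resolve_nil_only(print_background, :print_background),
    marginTop: resolve_nil_only(margin_top, :margin_top)
  }
  # Headers and footers are displayed only when at least one template is given.
  opts[:displayHeaderFooter] = (opts[:headerTemplate] || opts[:footerTemplate]) ? true : false
  opts.compact # drop keys whose value is nil
end

p build_opts(footer_template: "<div>footer</div>", print_background: false)
```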
61
63
 
@@ -22,6 +22,13 @@ module Palapala
22
22
  send_command_and_wait_for_result("Page.enable")
23
23
  end
24
24
 
25
+ def websocket_url
26
+ self.class.websocket_url
27
+ rescue Errno::ECONNREFUSED
28
+ ChromeProcess.spawn_chrome # Spawn a new Chrome process
29
+ self.class.websocket_url # Retry (once)
30
+ end
31
+
25
32
  # Create a thread-local instance of the renderer
26
33
  def self.thread_local_instance
27
34
  Thread.current[:renderer] ||= Renderer.new
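The new instance-level `websocket_url` retries exactly once after spawning Chrome. A minimal sketch of that rescue-and-retry shape in isolation, where `with_single_retry` and its `recover:` argument are hypothetical names and the connection is simulated:

```ruby
# Sketch only: runs the block, and on a refused connection calls `recover`
# and runs the block one more time. A second failure propagates to the caller.
def with_single_retry(recover:)
  yield
rescue Errno::ECONNREFUSED
  recover.call # e.g. spawn the browser process
  yield        # retry (once)
end

# Simulated usage: the first attempt fails until "Chrome" has been started.
started = false
url = with_single_retry(recover: -> { started = true }) do
  raise Errno::ECONNREFUSED unless started
  "ws://localhost:9222/devtools/page/example"
end
puts url
```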
@@ -37,6 +44,9 @@ module Palapala
37
44
  def on_message(e)
38
45
  puts "Received: #{e.data[0..64]}" if Palapala.debug
39
46
  @response = JSON.parse(e.data) # Parse the JSON response
47
+ if @response["error"] # Raise an error if the response contains an error
48
+ raise "#{@response["error"]["message"]}: #{@response["error"]["data"]} (#{@response["error"]["code"]})"
49
+ end
40
50
  end
41
51
 
42
52
  # Update the current ID to the next ID (increment by 1)
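The error check added to `on_message` formats the `message`, `data` and `code` fields of the response's `error` object. A small sketch of that formatting as a standalone function; `cdp_error_message` is a hypothetical name and the payload is made up for illustration, not captured from Chrome:

```ruby
require "json"

# Sketch only: returns a formatted error string, or nil when the
# response carries no "error" key.
def cdp_error_message(raw)
  response = JSON.parse(raw)
  error = response["error"]
  return nil unless error

  "#{error["message"]}: #{error["data"]} (#{error["code"]})"
end

# Illustrative payload only.
sample = '{"id":1,"error":{"code":-32602,"message":"Invalid parameters","data":"example detail"}}'
puts cdp_error_message(sample)
```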
@@ -102,16 +112,8 @@ module Palapala
102
112
  @client.close
103
113
  end
104
114
 
105
- private
106
-
107
- # Convert the HTML content to a data URL
108
- def data_url_for_html(html)
109
- "data:text/html;base64,#{Base64.strict_encode64(html)}"
110
- end
111
-
112
115
  # Open a new tab in the remote chrome and return the WebSocket URL
113
- def websocket_url
114
- ChromeProcess.spawn_chrome
116
+ def self.websocket_url
115
117
  uri = URI("#{Palapala.headless_chrome_url}/json/new")
116
118
  http = Net::HTTP.new(uri.host, uri.port)
117
119
  request = Net::HTTP::Put.new(uri)
@@ -122,5 +124,12 @@ module Palapala
122
124
  puts "WebSocket URL: #{websocket_url}" if Palapala.debug
123
125
  websocket_url
124
126
  end
127
+
128
+ private
129
+
130
+ # Convert the HTML content to a data URL
131
+ def data_url_for_html(html)
132
+ "data:text/html;base64,#{Base64.strict_encode64(html)}"
133
+ end
125
134
  end
126
135
  end
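The `data_url_for_html` helper moved in this hunk is self-contained and can be tried outside the renderer. A minimal sketch of the same one-liner with a round trip to show that the payload decodes back to the input:

```ruby
require "base64"

# Same expression as in the diff: strict encoding emits no line breaks,
# which keeps the data URL on a single line.
def data_url_for_html(html)
  "data:text/html;base64,#{Base64.strict_encode64(html)}"
end

url = data_url_for_html("<h1>Title</h1>")
payload = url.split(",", 2).last
puts url
puts Base64.strict_decode64(payload)
```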
@@ -1,3 +1,3 @@
1
1
  module Palapala
2
- VERSION = "0.1.9"
2
+ VERSION = "0.1.11"
3
3
  end
data/lib/palapala.rb CHANGED
@@ -19,18 +19,16 @@ module Palapala
19
19
  # path to the headless Chrome executable when using the child process renderer
20
20
  attr_accessor :headless_chrome_path
21
21
 
22
- # URL to the headless Chrome instance when using the remote renderer
22
+ # URL to the headless Chrome instance when using the remote renderer (priority)
23
23
  attr_accessor :headless_chrome_url
24
24
 
25
- # Chrome headless shell version to use
25
+ # Chrome headless shell version to use (stable, beta, dev, canary, etc.)
26
+ # when launching a new Chrome instance using npx
26
27
  attr_accessor :chrome_headless_shell_version
27
28
  end
28
- puts "setting defaults on palapala"
29
29
  self.debug = false
30
- self.defaults = { displayHeaderFooter: true, encoding: :binary }
31
- self.headless_chrome_path = nil
32
- self.headless_chrome_url = "http://localhost:9222"
30
+ self.defaults = { print_background: true, prefer_css_page_size: true, margin_left: 0, margin_right: 0, margin_top: 0, margin_bottom: 0 }
31
+ self.headless_chrome_path = ENV.fetch("HEADLESS_CHROME_PATH", nil)
32
+ self.headless_chrome_url = ENV.fetch("HEADLESS_CHROME_URL", "http://localhost:9222")
33
33
  self.chrome_headless_shell_version = ENV.fetch("CHROME_HEADLESS_SHELL_VERSION", "stable")
34
34
  end
35
-
36
- puts "hoo"
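The configuration now reads its values from the environment with `ENV.fetch` and a fallback. A minimal sketch of that lookup, written against an injectable hash so it can be tried without touching the real environment; `chrome_settings` is a hypothetical helper name:

```ruby
# Sketch only: `env` defaults to ENV but any object responding to
# fetch(key, default) works, such as a plain Hash.
def chrome_settings(env = ENV)
  {
    path: env.fetch("HEADLESS_CHROME_PATH", nil),
    url: env.fetch("HEADLESS_CHROME_URL", "http://localhost:9222"),
    shell_version: env.fetch("CHROME_HEADLESS_SHELL_VERSION", "stable")
  }
end

p chrome_settings({ "HEADLESS_CHROME_URL" => "http://chrome:9222" })
```

With `fetch`, a variable that is set to an empty string is returned as-is; the default applies only when the key is absent.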
data/paged_css.pdf ADDED
Binary file
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: palapala_pdf
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.9
4
+ version: 0.1.11
5
5
  platform: ruby
6
6
  authors:
7
7
  - Koen Handekyn
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-08-29 00:00:00.000000000 Z
11
+ date: 2024-08-30 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: base64
@@ -56,8 +56,16 @@ files:
56
56
  - assets/images/logo-variant2.webp
57
57
  - assets/images/logo.webp
58
58
  - bin/chrome-headless-server
59
+ - doc/installing_node.md
60
+ - doc/paged_css.md
61
+ - examples/all.rb
62
+ - examples/chrome_base_header_footer_template.html
63
+ - examples/headers_and_footers.pdf
59
64
  - examples/headers_and_footers.rb
65
+ - examples/js_based_rendering.pdf
60
66
  - examples/js_based_rendering.rb
67
+ - examples/paged_css.pdf
68
+ - examples/paged_css.rb
61
69
  - examples/performance_benchmark.rb
62
70
  - lib/palapala.rb
63
71
  - lib/palapala/chrome_process.rb
@@ -66,6 +74,7 @@ files:
66
74
  - lib/palapala/version.rb
67
75
  - lib/palapala/web_socket_client.rb
68
76
  - lib/palapala_pdf.rb
77
+ - paged_css.pdf
69
78
  - palapala_pdf.gemspec
70
79
  homepage: https://github.com/palapala-app/palapala_pdf
71
80
  licenses: