palapala_pdf 0.1.9 → 0.1.11
- checksums.yaml +4 -4
- data/README.md +68 -142
- data/doc/installing_node.md +16 -0
- data/doc/paged_css.md +167 -0
- data/examples/all.rb +9 -0
- data/examples/chrome_base_header_footer_template.html +169 -0
- data/examples/headers_and_footers.pdf +0 -0
- data/examples/headers_and_footers.rb +11 -40
- data/examples/js_based_rendering.pdf +0 -0
- data/examples/js_based_rendering.rb +4 -5
- data/examples/paged_css.pdf +0 -0
- data/examples/paged_css.rb +186 -0
- data/examples/performance_benchmark.rb +9 -21
- data/lib/palapala/chrome_process.rb +24 -15
- data/lib/palapala/pdf.rb +16 -14
- data/lib/palapala/renderer.rb +18 -9
- data/lib/palapala/version.rb +1 -1
- data/lib/palapala.rb +6 -8
- data/paged_css.pdf +0 -0
- metadata +11 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 230cc525cd5e4bbc4d2ce9ddb0a418486bc27ad6163cef9834a67d7e23442b41
|
4
|
+
data.tar.gz: b283d90551ef07efe3384148b061ec0a6a18cb3bdbc9485c1e8717e6bc4979f3
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: cfcf738f7171f679d419349cce4ce0441bb3be54fc355fa9592222b04c63654d6646c6103310f8ddf56e7112a3ccc95b30ddb92db86e0ff48a12b221f8be039c
|
7
|
+
data.tar.gz: efa8277743d960b0d3e869ab970c8248a2f901b572d4451bf0c479c311eabbd2a810e678efad489913ced8ab2ada14e8ce7d4b4fbb7d3ccad61ce98f419ee1b2
|
data/README.md
CHANGED
@@ -4,7 +4,9 @@
|
|
4
4
|
|
5
5
|
This project is a Ruby gem that provides functionality for generating PDF files from HTML using the Chrome browser. It allows you to easily convert HTML content into PDF documents, making it convenient for tasks such as generating reports, invoices, or any other printable documents. The gem provides a simple and intuitive API for converting HTML to PDF, and it leverages the power and flexibility of the Chrome browser's rendering engine to ensure accurate and high-quality PDF output. With this gem, you can easily integrate PDF generation capabilities into your Ruby applications.
|
6
6
|
|
7
|
-
At the core, this project leverages the
|
7
|
+
At the core, this project leverages the Chrome rendering engine, but with significantly reduced overhead and dependencies. Instead of relying on the full Grover/Puppeteer/NodeJS stack, this project uses a raw web socket to enable direct communication from Ruby to a headless Chrome or Chromium browser. This approach ensures efficiency while providing a streamlined alternative for rendering tasks without sacrificing performance or flexibility.
|
8
|
+
|
9
|
+
It leverages work from [Puppeteer](https://pptr.dev/browsers-api/) (@puppeteer/browsers) to install a local Chrome-Headless-Shell if no Chrome is running, but that requires node (npx) to be available.
|
8
10
|
|
9
11
|
This is how easy PDF generation can be in Ruby:
|
10
12
|
|
@@ -16,88 +18,28 @@ And this while having the most modern HTML/CSS/JS available to you: flex, grid,
|
|
16
18
|
|
17
19
|
A core goal of this project is performance, and it is designed to be exceptionally fast. By leveraging **direct communication** with a headless Chrome or Chromium browser via a **raw web socket**, the gem minimizes overhead and dependencies, enabling PDF generation at speeds that significantly outperform other solutions. Whether generating simple or complex documents, this gem ensures that your Ruby applications can handle PDF tasks efficiently and at scale.
|
18
20
|
|
19
|
-
##
|
20
|
-
|
21
|
-
To install the gem and add it to your application's Gemfile, execute the following command:
|
22
|
-
|
23
|
-
```
|
24
|
-
$ bundle add palapala_pdf
|
25
|
-
```
|
26
|
-
|
27
|
-
If you are not using bundler to manage dependencies, you can install the gem by running:
|
28
|
-
|
29
|
-
```
|
30
|
-
$ gem install palapala_pdf
|
31
|
-
```
|
32
|
-
|
33
|
-
Palapala PDF connects to Chrome over a web socket connection.
|
34
|
-
An external Chrome/Chromium is expected. Start it with the following
|
35
|
-
command (9222 is the default port):
|
36
|
-
|
37
|
-
```sh
|
38
|
-
/path/to/chrome --headless --disable-gpu --remote-debugging-port=9222
|
39
|
-
```
|
40
|
-
|
41
|
-
### Installing Chrome / Headless Chrome
|
42
|
-
|
43
|
-
Seems the august 2024 release 128.0.6613.85 is seriously performance impacted. So to avoid regression issues, it's suggested to install a specific version of Chrome, test it and stick with it. This is easiest using npx and some tooling provided by Puppeteer. Unfortunately it depends on node/npm, but it's worth it. E.g. install a specific version like this:
|
44
|
-
|
45
|
-
```
|
46
|
-
npx @puppeteer/browsers install chrome@127.0.6533.88
|
47
|
-
````
|
48
|
-
|
49
|
-
This installs chrome in a `chrome` folder in the current working dir and it outputs the path where it's installed when it's finished.
|
50
|
-
|
51
|
-
If you installed it using puppeteer from above
|
52
|
-
|
53
|
-
```sh
|
54
|
-
./chrome/mac_arm-127.0.6533.88/chrome-mac-arm64/Google\ Chrome\ for\ Testing.app/Contents/MacOS/Google\ Chrome\ for\ Testing --headless --disable-gpu --remote-debugging-port=9222
|
55
|
-
```
|
21
|
+
## Sponsor This Project
|
56
22
|
|
57
|
-
|
23
|
+
If you find this project useful and would like to support its development, consider sponsoring or buying a coffee to help keep it going:
|
58
24
|
|
59
|
-
|
60
|
-
|
61
|
-
```
|
25
|
+
- **GitHub Sponsors:** [Sponsor on GitHub](https://github.com/sponsors/koenhandekyn)
|
26
|
+
- **Buy Me a Coffee:** [Buy a Coffee](https://buymeacoffee.com/koenhandekyn)
|
62
27
|
|
63
|
-
|
28
|
+
Your support is greatly appreciated and helps maintain the project!
|
64
29
|
|
65
|
-
|
66
|
-
./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell --remote-debugging-port=9222
|
67
|
-
```
|
30
|
+
## Installation
|
68
31
|
|
69
|
-
|
70
|
-
It guesses the path to Chrome, or you configure it like this:
|
32
|
+
To install the gem and add it to your application's Gemfile, execute the following command:
|
71
33
|
|
72
|
-
```ruby
|
73
|
-
Palapala.setup do |config|
|
74
|
-
config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
|
75
|
-
end
|
76
34
|
```
|
77
|
-
|
78
|
-
### Installing Node/NPX
|
79
|
-
|
80
|
-
Using Brew
|
81
|
-
|
82
|
-
````
|
83
|
-
brew install node
|
35
|
+
$ bundle add palapala_pdf
|
84
36
|
```
|
85
37
|
|
86
|
-
Using NVM (Node Version Manager)
|
87
|
-
|
88
|
-
````
|
89
|
-
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
|
90
|
-
source ~/.nvm/nvm.sh
|
91
|
-
nvm --version
|
92
|
-
nvm install node
|
93
|
-
````
|
94
|
-
|
95
|
-
|
96
38
|
## Usage Instructions
|
97
39
|
|
98
40
|
To create a PDF from HTML content using the `Palapala` library, follow these steps:
|
99
41
|
|
100
|
-
|
42
|
+
**Configuration from inside Ruby**
|
101
43
|
|
102
44
|
Configure the `Palapala` library with the necessary options, such as the URL for the browser and default settings like scale and format.
|
103
45
|
|
@@ -105,76 +47,82 @@ In a Rails context, this could be inside an initializer.
|
|
105
47
|
|
106
48
|
```ruby
|
107
49
|
Palapala.setup do |config|
|
108
|
-
#
|
50
|
+
# debug mode
|
109
51
|
config.debug = true
|
110
|
-
|
111
|
-
|
112
|
-
|
52
|
+
# Chrome headless shell version to use (stable, beta, dev, canary, etc.) when launching a new Chrome instance
|
53
|
+
config.chrome_headless_shell_version = :stable
|
54
|
+
# run against an external chrome/chromium or leave this out to run against a chrome that is started as a child process
|
55
|
+
config.headless_chrome_url = 'http://localhost:9222'
|
56
|
+
# path to Chrome executable
|
57
|
+
config.headless_chrome_path = '/usr/bin/google-chrome-stable'
|
58
|
+
# default options for PDF generation
|
59
|
+
config.defaults = { scale: 1 }
|
60
|
+
# extra params to pass to Chrome when launched as a child process
|
61
|
+
config.chrome_params = []
|
113
62
|
end
|
114
63
|
```
|
115
|
-
1. **Create a PDF from HTML**:
|
116
64
|
|
117
|
-
|
65
|
+
**Using environment variables**
|
66
|
+
|
67
|
+
```sh
|
68
|
+
CHROME_HEADLESS_SHELL_VERSION=canary ruby examples/performance_benchmark.rb
|
69
|
+
```
|
70
|
+
|
71
|
+
```sh
|
72
|
+
HEADLESS_CHROME_URL=http://192.168.1.1:9222 ruby examples/performance_benchmark.rb
|
73
|
+
```
|
118
74
|
|
119
75
|
```sh
|
120
|
-
|
76
|
+
CHROME_HEADLESS_PATH=/var/to/chrome ruby examples/performance_benchmark.rb
|
121
77
|
```
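As a rough illustration of how these three variables could be picked up, here is a minimal sketch. The helper `settings_from_env` and the mapping from variable names to setting names are assumptions made for this example; they are not taken from the gem's source, which may read and prioritise these values differently.

```ruby
# Hypothetical helper, not part of the gem: collects the three environment
# variables shown above into a plain hash, skipping any that are unset.
# The setting names on the right are assumed to mirror the Palapala.setup options.
ENV_KEYS = {
  "CHROME_HEADLESS_SHELL_VERSION" => :chrome_headless_shell_version,
  "HEADLESS_CHROME_URL" => :headless_chrome_url,
  "CHROME_HEADLESS_PATH" => :headless_chrome_path
}.freeze

# Accepts any hash-like object so it can be exercised without touching ENV.
def settings_from_env(env = ENV)
  ENV_KEYS.each_with_object({}) do |(name, key), settings|
    value = env[name]
    settings[key] = value unless value.nil? || value.empty?
  end
end

p settings_from_env({ "HEADLESS_CHROME_URL" => "http://192.168.1.1:9222" })
```

Passing the environment in as an argument keeps the sketch easy to test; in an application you would simply call `settings_from_env` with no arguments.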
|
122
78
|
|
123
|
-
|
79
|
+
**Create a PDF from HTML**
|
80
|
+
|
81
|
+
Load palapala and create a PDF file from an HTML snippet:
|
124
82
|
|
125
83
|
```ruby
|
126
84
|
require "palapala"
|
127
85
|
Palapala::Pdf.new("<h1>Hello, world! #{Time.now}</h1>").save('hello.pdf')
|
128
86
|
```
|
129
87
|
|
130
|
-
Instantiate a new Palapala::Pdf object with your HTML content and generate the PDF binary data
|
88
|
+
Instantiate a new Palapala::Pdf object with your HTML content and generate the PDF binary data:
|
131
89
|
|
132
90
|
```ruby
|
133
91
|
require "palapala"
|
134
92
|
binary_data = Palapala::Pdf.new("<h1>Hello, world! #{Time.now}</h1>").binary_data
|
135
93
|
```
|
136
94
|
|
137
|
-
##
|
138
|
-
|
139
|
-
Paged CSS is a subset of CSS designed for styling printed documents. It extends standard CSS to handle pagination, page sizes, headers, footers, and other aspects of printed content. Paged CSS is commonly used in scenarios where web content needs to be converted to PDFs or other paginated formats.
|
140
|
-
|
141
|
-
### Headers and Footers
|
95
|
+
## Advanced Examples
|
142
96
|
|
143
|
-
|
97
|
+
- headers and footers
|
98
|
+
- paged css for paper sizes, paper margins, page breaks, etc
|
99
|
+
- js based rendering
|
144
100
|
|
145
|
-
|
101
|
+
## Connecting to Chrome
|
146
102
|
|
147
|
-
|
148
|
-
Palapala::Pdf.new(
|
149
|
-
"<p>Hello world</>",
|
150
|
-
header_html: '<div style="text-align: center;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
|
151
|
-
footer_html: '<div style="text-align: center;">Generated with Palapala PDF</div>',
|
152
|
-
margin: { top: "2cm", bottom: "2cm"}
|
153
|
-
).save("test.pdf")
|
154
|
-
```
|
103
|
+
Palapala PDF will go through this process:
|
155
104
|
|
156
|
-
|
105
|
+
- check if a Chrome is running and exposing port 9222 (and if so, use it)
|
106
|
+
- if `Palapala.headless_chrome_path` is defined, launch Chrome as a child process using that path
|
107
|
+
- if **NPX** is available, install a **Chrome-Headless-Shell** variant locally and launch it as a child process. It will install the 'stable' version or the version identified by the `Palapala.chrome_headless_shell_version` setting (or from ENV `CHROME_HEADLESS_SHELL_VERSION`).
|
108
|
+
- as a last fallback it will guess a chrome path from the detected OS and try to launch a Chrome with that
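
The lookup order above can be sketched as a small decision function. This is an illustration only: `port_open?`, `executable_on_path?` and `chrome_strategy` are hypothetical names introduced for this example, and the gem's own implementation may perform these checks differently.

```ruby
require "socket"

# Hypothetical helper: true when something accepts TCP connections on the port.
def port_open?(host, port, timeout: 0.5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Hypothetical helper: true when an executable with this name is on the PATH.
def executable_on_path?(name)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
    File.executable?(File.join(dir, name))
  end
end

# Mirrors the four steps listed above, in the same order.
def chrome_strategy(running:, configured_path:, npx_available:)
  return :use_running_chrome if running
  return :launch_configured_path if configured_path
  return :install_headless_shell_via_npx if npx_available

  :guess_path_from_os
end

puts chrome_strategy(
  running: port_open?("localhost", 9222),
  configured_path: nil,
  npx_available: executable_on_path?("npx")
)
```

Keeping the decision separate from the probing makes the order easy to read and to check in isolation.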
|
157
109
|
|
158
|
-
|
110
|
+
In our experience, a Chrome-Headless-Shell version gives the best performance and resource usage.
|
159
111
|
|
160
|
-
|
112
|
+
### Installing Chrome / Headless Chrome manually
|
161
113
|
|
162
|
-
|
114
|
+
This is easiest using npx and tooling provided by Puppeteer (it depends on node/npm, but it's worth it). The command below installs the browser into a folder in the current working directory and prints the installation path when it finishes. We currently advise the `chrome-headless-shell` variant, a light version meant for exactly this use case: a minimal, headless build of Chrome designed for environments without a graphical user interface (GUI), such as server-side rendering, automated testing, or web scraping. It is headless by design, with reduced size and overhead, but still uses the same engine.
|
163
115
|
|
164
|
-
|
116
|
+
```sh
|
117
|
+
npx @puppeteer/browsers install chrome-headless-shell@stable
|
118
|
+
```
|
165
119
|
|
166
|
-
|
120
|
+
It installs to a path like this `./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell`. As it's headless by design, it only needs one parameter:
|
167
121
|
|
168
|
-
```
|
169
|
-
|
170
|
-
<script type="text/javascript">
|
171
|
-
document.addEventListener("DOMContentLoaded", () => {
|
172
|
-
document.body.innerHTML += "<p>Current time from JS: " + new Date().toLocaleString() + "</p>";
|
173
|
-
});
|
174
|
-
</script>
|
175
|
-
<body><p>Default body text.</p></body>
|
176
|
-
</html>
|
122
|
+
```sh
|
123
|
+
./chrome-headless-shell/mac_arm-128.0.6613.84/chrome-headless-shell-mac-arm64/chrome-headless-shell --remote-debugging-port=9222
|
177
124
|
```
|
125
|
+
*Note: Chrome releases from 128.0.6613.85 (August 2024) onward seem to be seriously performance impacted for PDF generation. Chrome Headless Shell releases don't seem to suffer from this issue.*
|
178
126
|
|
179
127
|
## Raw parameters (Page.printToPDF)
|
180
128
|
|
@@ -196,15 +144,6 @@ Bug reports and pull requests are welcome on GitHub at https://github.com/palapa
|
|
196
144
|
- [Eugen Neagoe](https://github.com/eneagoe) - Thank you for your valuable input, feedback and opinions.
|
197
145
|
- [Radu Bogoevici](https://github.com/codenighter) - Thanks for test driving, and all help big and small.
|
198
146
|
|
199
|
-
## Sponsor This Project
|
200
|
-
|
201
|
-
If you find this project useful and would like to support its development, consider sponsoring or buying a coffee to help keep it going:
|
202
|
-
|
203
|
-
- **GitHub Sponsors:** [Sponsor on GitHub](https://github.com/sponsors/koenhandekyn)
|
204
|
-
- **Buy Me a Coffee:** [Buy a Coffee](https://buymeacoffee.com/koenhandekyn)
|
205
|
-
|
206
|
-
Your support is greatly appreciated and helps maintain the project!
|
207
|
-
|
208
147
|
## Findings
|
209
148
|
|
210
149
|
- For Chrome, mode headless=new seems to be slower for pdf rendering cases.
|
@@ -212,24 +151,14 @@ Your support is greatly appreciated and helps maintain the project!
|
|
212
151
|
|
213
152
|
## Primitive benchmark
|
214
153
|
|
215
|
-
On a macbook m3, the throughput for 'hello world' PDF generation can reach around
|
154
|
+
On a MacBook M3, the throughput for 'hello world' PDF generation can reach around 500 to 800 docs/second when allowing for some concurrency (4 threads), measured against an external Chrome. As Chrome itself is also very efficient, it scales well for complex documents too. If you run this in Rails, concurrency is taken care of either by the front-end thread pool or by the workers, so you shouldn't have to think about it.
|
216
155
|
|
217
156
|
Note: it renders `"Hello #{i}, world #{j}! #{Time.now}."` where i is the thread and j is the iteration counter within the thread and persists it to an SSD (which is very fast these days).
|
218
157
|
|
219
|
-
### benchmarking 20 docs: 1x20, 2x10, 4x5
|
220
|
-
|
221
|
-
```sh
|
222
|
-
c:1, n:20 : Throughput = 159.41 docs/sec, Total time = 0.1255 seconds
|
223
|
-
c:2, n:10 : Throughput = 124.91 docs/sec, Total time = 0.1601 seconds
|
224
|
-
c:4, n:5 : Throughput = 196.40 docs/sec, Total time = 0.1018 seconds
|
225
|
-
```
|
226
|
-
|
227
|
-
### benchmarking 320 docs: 1x320, 4x80, 8x40
|
228
|
-
|
229
158
|
```sh
|
230
|
-
c:1, n:
|
231
|
-
c:
|
232
|
-
c:
|
159
|
+
c:1, n:10 : Throughput = 16.76 docs/sec, Total time = 0.5968 seconds
|
160
|
+
c:2, n:10 : Throughput = 170.41 docs/sec, Total time = 0.1174 seconds
|
161
|
+
c:4, n:80 : Throughput = 579.03 docs/sec, Total time = 0.5526 seconds
|
233
162
|
```
|
234
163
|
|
235
164
|
This is about a factor of 100x faster than what you typically get with Grover and still 10x faster than many alternatives. It is fast enough that, for a lot of use cases, you can run it straight from e.g. your Ruby on Rails web worker in the controller on a single machine and still scale to lots of users.
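
To make the `c:` (threads) and `n:` (iterations per thread) figures concrete, here is a minimal sketch of a throughput measurement. The `throughput` helper is a hypothetical name for this example and is not the gem's `examples/performance_benchmark.rb`; the block below only builds a string, where a real run would call something like `Palapala::Pdf.new(html).save(path)`.

```ruby
require "benchmark"

# Runs `concurrency` threads that each call the block `iterations` times and
# returns the observed throughput in documents per second.
def throughput(concurrency:, iterations:, &render)
  elapsed = Benchmark.realtime do
    threads = Array.new(concurrency) do |i|
      Thread.new { iterations.times { |j| render.call(i, j) } }
    end
    threads.each(&:join)
  end
  (concurrency * iterations) / elapsed
end

# Stand-in workload: collect the strings instead of rendering PDFs.
rendered = Queue.new
docs_per_sec = throughput(concurrency: 4, iterations: 5) do |i, j|
  rendered << "Hello #{i}, world #{j}! #{Time.now}."
end

puts format("c:4, n:5 : Throughput = %.2f docs/sec", docs_per_sec)
```

Numbers from the stand-in workload say nothing about PDF rendering speed; the sketch only shows how the thread and iteration counts combine into a docs/sec figure.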
|
@@ -256,25 +185,22 @@ In this example, `pdf_data` is the binary data of the PDF file. The `filename` o
|
|
256
185
|
|
257
186
|
## Docker
|
258
187
|
|
259
|
-
|
188
|
+
TODO
|
260
189
|
|
261
|
-
|
262
|
-
Palapala.setup do |config|
|
263
|
-
config.opts = { 'no-sandbox': nil }
|
264
|
-
end
|
265
|
-
```
|
266
|
-
It has also been reported that the Chrome process repeatedly crashes when running inside a Docker container on an M1 Mac. Chrome should work as expected when deployed to a Docker container on a non-M1 Mac.
|
190
|
+
*It has also been reported that the Chrome process repeatedly crashes when running inside a Docker container on an M1 Mac. Chrome should work as expected when deployed to a Docker container on a non-M1 Mac.*
|
267
191
|
|
268
192
|
## Thread-safety
|
269
193
|
|
270
|
-
Behind the scenes, a websocket is openend and stored on Thread.current for subsequent requests. Hence, the code is
|
271
|
-
thread safe in the sense that every web socket get's a new tab in the underlying chromium and get an isolated context.
|
272
|
-
|
273
194
|
For performance reasons, the code uses a low-level websocket connection that does all its work on the current thread
|
274
195
|
so we can avoid synchronisation penalties.
|
275
196
|
|
197
|
+
Behind the scenes, a websocket is opened and stored on Thread.current for subsequent requests. Hence, the code is
|
198
|
+
thread safe in the sense that every web socket gets a new tab in the underlying Chromium and gets an isolated context.
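
The pattern of keeping one connection per thread can be sketched as follows. `SketchConnection` and `current_connection` are stand-ins invented for this example; the gem's real connection class and storage key are not shown in this diff.

```ruby
# Stand-in for a real web socket; it only records which thread created it.
class SketchConnection
  attr_reader :owner

  def initialize
    @owner = Thread.current
  end
end

# Lazily creates one connection per thread and reuses it on later calls.
# Note: Thread#[] storage is fiber-local in Ruby, which is sufficient here
# because each thread in this sketch runs a single fiber.
def current_connection
  Thread.current[:sketch_connection] ||= SketchConnection.new
end

main_connection = current_connection
other_connection = Thread.new { current_connection }.value

puts main_connection.equal?(current_connection) # same thread, same object
puts main_connection.equal?(other_connection)   # different thread, different object
```

Because no connection is ever shared between threads, no locking is needed around it.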
|
199
|
+
|
276
200
|
## Heroku
|
277
201
|
|
202
|
+
TODO
|
203
|
+
|
278
204
|
possible buildpacks
|
279
205
|
|
280
206
|
https://github.com/heroku/heroku-buildpack-chrome-for-testing
|
@@ -0,0 +1,16 @@
|
|
1
|
+
### Installing Node (npx)
|
2
|
+
|
3
|
+
Using Brew
|
4
|
+
|
5
|
+
```sh
|
6
|
+
brew install node
|
7
|
+
```
|
8
|
+
|
9
|
+
Using NVM (Node Version Manager)
|
10
|
+
|
11
|
+
```sh
|
12
|
+
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
|
13
|
+
source ~/.nvm/nvm.sh
|
14
|
+
nvm --version
|
15
|
+
nvm install node
|
16
|
+
```
|
data/doc/paged_css.md
ADDED
@@ -0,0 +1,167 @@
|
|
1
|
+
## Paged CSS
|
2
|
+
|
3
|
+
Paged CSS is a subset of CSS designed for styling printed documents. It extends standard CSS to handle pagination, page sizes, headers, footers, and other aspects of printed content. Paged CSS is commonly used in scenarios where web content needs to be converted to PDFs or other paginated formats.
|
4
|
+
|
5
|
+
Setting page size
|
6
|
+
|
7
|
+
```css
|
8
|
+
@page {
|
9
|
+
/* set a standard page size */
|
10
|
+
size: A4 landscape;
|
11
|
+
/* Custom */
|
12
|
+
size: 8.5in 11in; /* Width x Height */
|
13
|
+
}
|
14
|
+
```
|
15
|
+
|
16
|
+
Setting page margins
|
17
|
+
|
18
|
+
```css
|
19
|
+
@page {
|
20
|
+
margin: 1in; /* 1 inch on all sides */
|
21
|
+
margin: 1in 0.5in 1in 0.5in; /* Top, Right, Bottom, Left */
|
22
|
+
}
|
23
|
+
```
|
24
|
+
|
25
|
+
Forcing a Page Break before or after an Element
|
26
|
+
|
27
|
+
```css
|
28
|
+
/* This ensures that every `h1` starts on a new page. */
|
29
|
+
h1 {
|
30
|
+
page-break-before: always;
|
31
|
+
}
|
32
|
+
/* This ensures that every `p` element ends with a page break, starting the next content on a new page. */
|
33
|
+
p {
|
34
|
+
page-break-after: always;
|
35
|
+
}
|
36
|
+
/* This prevents a table from being split across two pages. */
|
37
|
+
table {
|
38
|
+
page-break-inside: avoid;
|
39
|
+
}
|
40
|
+
```
|
41
|
+
|
42
|
+
### Headers and Footers
|
43
|
+
|
44
|
+
When using Chromium-based rendering engines, headers and footers are not controlled by the Paged CSS standard but are instead managed through specific settings in the rendering engine.
|
45
|
+
|
46
|
+
With Palapala PDF, headers and footers are defined using the `header_template` and `footer_template` options. These allow you to insert HTML content directly into the header or footer areas.
|
47
|
+
|
48
|
+
It is critical to specify a font size, because by default Chrome uses a very tiny font.
|
49
|
+
|
50
|
+
```ruby
|
51
|
+
Palapala::Pdf.new(
|
52
|
+
"<p>Hello world</p>",
|
53
|
+
header_template: '<div style="text-align: center; font-size: 12pt;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
|
54
|
+
footer_template: '<div style="text-align: center; font-size: 12pt;">Generated with Palapala PDF</div>',
|
55
|
+
).save("test.pdf")
|
56
|
+
```
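
Since forgetting the font size is an easy mistake, one option is a small helper that always sets it. The `template` helper and the `PAGE_COUNTER` constant are hypothetical names for this sketch; only the `pageNumber` and `totalPages` class names and the `header_template` / `footer_template` options come from the text above.

```ruby
# Hypothetical helper: wraps header or footer content in a div that always
# carries an explicit font size, so Chrome's tiny default is never used.
def template(content, font_size: "12pt")
  %(<div style="text-align: center; font-size: #{font_size}; width: 100%;">#{content}</div>)
end

# Chrome fills in elements with these class names when it prints each page.
PAGE_COUNTER = 'Page <span class="pageNumber"></span> of <span class="totalPages"></span>'

header_template = template(PAGE_COUNTER)
footer_template = template("Generated with Palapala PDF", font_size: "10pt")

puts header_template
puts footer_template
```

The two strings can then be passed as the `header_template:` and `footer_template:` options in the same way as the literal strings in the example above.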
|
57
|
+
|
58
|
+
### Examples
|
59
|
+
|
60
|
+
#### Headers and Footers
|
61
|
+
|
62
|
+
TODO explain about headers and footers, font sizes, styles being independent, and how to insert current page, total pages, etc.
|
63
|
+
|
64
|
+
#### Page sizes and margins
|
65
|
+
|
66
|
+
Paged CSS, also known as @page CSS, is used to control the layout and appearance of printed documents. It allows you to define page-specific styles, such as sizes and margins, which are crucial for generating well-formatted PDFs.
|
67
|
+
|
68
|
+
You can specify the size of the page using predefined sizes or custom dimensions. Common predefined sizes include A4, A3, letter, etc. Margins can be set for the top, right, bottom, and left sides of the page. You can specify all four margins at once or individually. You can also define named pages for different sections of your document.
|
69
|
+
|
70
|
+
##### Example: Different First Page
|
71
|
+
|
72
|
+
TODO Validate
|
73
|
+
|
74
|
+
```css
|
75
|
+
@page first {
|
76
|
+
size: A4;
|
77
|
+
margin: 2in; /* Larger margin for the first page */
|
78
|
+
}
|
79
|
+
|
80
|
+
@page {
|
81
|
+
size: A4;
|
82
|
+
margin: 1in;
|
83
|
+
}
|
84
|
+
|
85
|
+
body {
|
86
|
+
counter-reset: page;
|
87
|
+
}
|
88
|
+
|
89
|
+
body:first {
|
90
|
+
page: first;
|
91
|
+
}
|
92
|
+
```
|
93
|
+
|
94
|
+
#### Page breaks
|
95
|
+
|
96
|
+
Paged CSS allows you to control how content is divided across pages when printing or generating PDFs. Page breaks are an essential part of this, as they determine where a new page starts. You can control page breaks using the `page-break-before`, `page-break-after`, and `page-break-inside` properties.
|
97
|
+
|
98
|
+
##### Page Break Properties
|
99
|
+
|
100
|
+
1. **`page-break-before`**: Forces a page break before the element.
|
101
|
+
2. **`page-break-after`**: Forces a page break after the element.
|
102
|
+
3. **`page-break-inside`**: Prevents or allows a page break inside the element.
|
103
|
+
|
104
|
+
##### Values
|
105
|
+
|
106
|
+
- `auto`: Default. Neither forces nor prevents a page break.
|
107
|
+
- `always`: Always forces a page break.
|
108
|
+
- `avoid`: Avoids a page break inside the element.
|
109
|
+
- `left`: Forces a page break so that the next page is a left page.
|
110
|
+
- `right`: Forces a page break so that the next page is a right page.
|
111
|
+
|
112
|
+
##### Examples
|
113
|
+
|
114
|
+
```css
|
115
|
+
/* This ensures that every `h1` starts on a new page. */
|
116
|
+
h1 {
|
117
|
+
page-break-before: always;
|
118
|
+
}
|
119
|
+
/* This ensures that every `p` element ends with a page break, starting the next content on a new page. */
|
120
|
+
p {
|
121
|
+
page-break-after: always;
|
122
|
+
}
|
123
|
+
/* This prevents a table from being split across two pages. */
|
124
|
+
table {
|
125
|
+
page-break-inside: avoid;
|
126
|
+
}
|
127
|
+
```
|
128
|
+
|
129
|
+
##### Practical Use Cases
|
130
|
+
|
131
|
+
- **Chapter Titles**: Use `page-break-before: always;` for chapter titles to ensure each chapter starts on a new page.
|
132
|
+
- **Sections**: Use `page-break-after: always;` for sections that should end with a page break.
|
133
|
+
- **Tables and Figures**: Use `page-break-inside: avoid;` to keep tables and figures from being split across pages.
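
A short sketch of how these rules might be combined into one HTML document from Ruby. `PRINT_CSS`, `chapter` and `document` are illustrative names, not part of the gem; the sketch only builds the HTML string.

```ruby
# Print rules for the use cases above: each chapter title starts a new page,
# and tables and figures are kept on a single page where possible.
PRINT_CSS = <<~CSS
  @page { size: A4; margin: 1in; }
  h1 { page-break-before: always; }
  table, figure { page-break-inside: avoid; }
CSS

# Hypothetical helper: one chapter is a title followed by a paragraph.
def chapter(title, body)
  "<h1>#{title}</h1><p>#{body}</p>"
end

# Hypothetical helper: wraps the chapters in a document carrying the print CSS.
def document(chapters)
  "<html><head><style>#{PRINT_CSS}</style></head><body>#{chapters.join}</body></html>"
end

html = document([
  chapter("One", "First chapter."),
  chapter("Two", "Second chapter.")
])

puts html
```

With the gem loaded, the resulting string could be rendered with `Palapala::Pdf.new(html).save("chapters.pdf")`, as in the usage examples in the README.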
|
134
|
+
|
135
|
+
#### Tables across Pages
|
136
|
+
|
137
|
+
TODO explain `display` property with the values `table-header-group` and `table-footer-group`
|
138
|
+
|
139
|
+
##### Example
|
140
|
+
|
141
|
+
```html
|
142
|
+
<table>
|
143
|
+
<thead>
|
144
|
+
<tr>
|
145
|
+
<th>Header 1</th>
|
146
|
+
<th>Header 2</th>
|
147
|
+
</tr>
|
148
|
+
</thead>
|
149
|
+
<tbody>
|
150
|
+
<tr>
|
151
|
+
<td>Data 1</td>
|
152
|
+
<td>Data 2</td>
|
153
|
+
</tr>
|
154
|
+
<!-- More rows -->
|
155
|
+
</tbody>
|
156
|
+
<tfoot>
|
157
|
+
<tr>
|
158
|
+
<td>Footer 1</td>
|
159
|
+
<td>Footer 2</td>
|
160
|
+
</tr>
|
161
|
+
</tfoot>
|
162
|
+
</table>
|
163
|
+
```
|
164
|
+
|
165
|
+
In this example:
|
166
|
+
- The `<thead>` section will be repeated at the top of each page.
|
167
|
+
- The `<tfoot>` section will be repeated at the bottom of each page.
|
data/examples/all.rb
ADDED
@@ -0,0 +1,169 @@
|
|
1
|
+
<!--
|
2
|
+
OPTIONS AS PASSED IN THE C++ code
|
3
|
+
=================================
|
4
|
+
options.Set(kSettingHeaderFooterDate,
|
5
|
+
base::Time::Now().InMillisecondsFSinceUnixEpoch());
|
6
|
+
options.Set("width", static_cast<double>(page_size.width()));
|
7
|
+
options.Set("height", static_cast<double>(page_size.height()));
|
8
|
+
options.Set("topMargin", page_layout.margin_top);
|
9
|
+
options.Set("bottomMargin", page_layout.margin_bottom);
|
10
|
+
options.Set("leftMargin", page_layout.margin_left);
|
11
|
+
options.Set("rightMargin", page_layout.margin_right);
|
12
|
+
// `page_index` is 0-based, so 1 is added to get the page number.
|
13
|
+
options.Set("pageNumber", base::checked_cast<int>(page_index + 1));
|
14
|
+
options.Set("totalPages", base::checked_cast<int>(total_pages));
|
15
|
+
options.Set("url", params.url);
|
16
|
+
std::u16string title = source_frame.GetDocument().Title().Utf16();
|
17
|
+
options.Set("title", title.empty() ? params.title : title);
|
18
|
+
options.Set("headerTemplate", params.header_template);
|
19
|
+
options.Set("footerTemplate", params.footer_template);
|
20
|
+
options.Set("isRtl", base::i18n::IsRTL());
|
21
|
+
-->
|
22
|
+
|
23
|
+
<!doctype html>
|
24
|
+
<html>
|
25
|
+
|
26
|
+
<head>
|
27
|
+
<link rel="stylesheet" href="chrome://resources/css/text_defaults.css">
|
28
|
+
<style>
|
29
|
+
body {
|
30
|
+
display: flex;
|
31
|
+
flex-direction: column;
|
32
|
+
margin: 0;
|
33
|
+
}
|
34
|
+
|
35
|
+
#header,
|
36
|
+
#footer {
|
37
|
+
display: flex;
|
38
|
+
flex: none;
|
39
|
+
}
|
40
|
+
|
41
|
+
#header {
|
42
|
+
align-items: flex-start;
|
43
|
+
padding-top: 15pt;
|
44
|
+
}
|
45
|
+
|
46
|
+
#footer {
|
47
|
+
align-items: flex-end;
|
48
|
+
padding-bottom: 15pt;
|
49
|
+
}
|
50
|
+
|
51
|
+
#content {
|
52
|
+
flex: auto;
|
53
|
+
}
|
54
|
+
|
55
|
+
.left {
|
56
|
+
flex: none;
|
57
|
+
padding-left: 24pt;
|
58
|
+
/* csschecker-disable-line left-right */
|
59
|
+
padding-right: 6pt;
|
60
|
+
/* csschecker-disable-line left-right */
|
61
|
+
}
|
62
|
+
|
63
|
+
.center {
|
64
|
+
flex: auto;
|
65
|
+
padding-left: 24pt;
|
66
|
+
/* csschecker-disable-line left-right */
|
67
|
+
padding-right: 24pt;
|
68
|
+
/* csschecker-disable-line left-right */
|
69
|
+
text-align: center;
|
70
|
+
}
|
71
|
+
|
72
|
+
.right {
|
73
|
+
flex: none;
|
74
|
+
/* historically does not account for RTL */
|
75
|
+
padding-left: 6pt;
|
76
|
+
/* csschecker-disable-line left-right */
|
77
|
+
padding-right: 24pt;
|
78
|
+
/* csschecker-disable-line left-right */
|
79
|
+
}
|
80
|
+
|
81
|
+
.grow {
|
82
|
+
flex: auto;
|
83
|
+
}
|
84
|
+
|
85
|
+
.text {
|
86
|
+
font-size: 8pt;
|
87
|
+
overflow: hidden;
|
88
|
+
text-overflow: ellipsis;
|
89
|
+
white-space: nowrap;
|
90
|
+
}
|
91
|
+
</style>
|
92
|
+
<script>
|
93
|
+
|
94
|
+
function getComputedStyleAsFloat(style, value) {
|
95
|
+
return parseFloat(style.getPropertyValue(value).slice(0, -2));
|
96
|
+
}
|
97
|
+
|
98
|
+
function elementIntersects(element, topPos, bottomPos, leftPos, rightPos) {
|
99
|
+
const rect = element.getBoundingClientRect();
|
100
|
+
const style = window.getComputedStyle(element);
|
101
|
+
|
102
|
+
// Only consider the size of |element|, so remove the padding from |rect|.
|
103
|
+
// The padding is used for positioning.
|
104
|
+
rect.top += getComputedStyleAsFloat(style, 'padding-top');
|
105
|
+
rect.bottom -= getComputedStyleAsFloat(style, 'padding-bottom');
|
106
|
+
rect.left += getComputedStyleAsFloat(style, 'padding-left');
|
107
|
+
rect.right -= getComputedStyleAsFloat(style, 'padding-right');
|
108
|
+
return leftPos < rect.right && rightPos > rect.left && topPos < rect.bottom &&
|
109
|
+
bottomPos > rect.top;
|
110
|
+
}
|
111
|
+
|
112
|
+
function setupHeaderFooterTemplate(options) {
|
113
|
+
const body = document.querySelector('body');
|
114
|
+
const header = document.querySelector('#header');
|
115
|
+
const footer = document.querySelector('#footer');
|
116
|
+
|
117
|
+
body.style.width = `${options.width}px`;
|
118
|
+
body.style.height = `${options.height}px`;
|
119
|
+
header.style.height = `${options.topMargin}px`;
|
120
|
+
footer.style.height = `${options.bottomMargin}px`;
|
121
|
+
|
122
|
+
const topMargin = options.topMargin;
|
123
|
+
const bottomMargin = options.height - options.bottomMargin;
|
124
|
+
const leftMargin = options.leftMargin;
|
125
|
+
const rightMargin = options.width - options.rightMargin;
|
126
|
+
|
127
|
+
header.innerHTML = options['headerTemplate'] || `
|
128
|
+
<div class='date text left'></div>
|
129
|
+
<div class='title text center'></div>`;
|
130
|
+
footer.innerHTML = options['footerTemplate'] || `
|
131
|
+
<div class='url text left grow'></div>
|
132
|
+
<div class='text right'>
|
133
|
+
<span class='pageNumber'></span>/<span class='totalPages'></span>
|
134
|
+
</div>`;
|
135
|
+
|
136
|
+
const date = new Date(options.date);
|
137
|
+
const formatter =
|
138
|
+
new Intl.DateTimeFormat(
|
139
|
+
navigator.languages[0].split('@')[0],
|
140
|
+
{ dateStyle: 'short', timeStyle: 'short' });
|
141
|
+
options.date = formatter.format(date);
|
142
|
+
for (const cssClass of ['date', 'title', 'url', 'pageNumber', 'totalPages']) {
|
143
|
+
for (const element of document.querySelectorAll(`.${cssClass}`)) {
|
144
|
+
element.textContent = options[cssClass];
|
145
|
+
}
|
146
|
+
}
|
147
|
+
for (const element of document.querySelectorAll(`.text`)) {
|
148
|
+
if (options.isRtl &&
|
149
|
+
!element.classList.contains('url') &&
|
150
|
+
!element.classList.contains('title')) {
|
151
|
+
element.dir = 'rtl';
|
152
|
+
}
|
153
|
+
if (elementIntersects(element, topMargin, bottomMargin, leftMargin,
|
154
|
+
rightMargin)) {
|
155
|
+
element.style.visibility = 'hidden';
|
156
|
+
}
|
157
|
+
}
|
158
|
+
}
|
159
|
+
|
160
|
+
</script>
|
161
|
+
</head>
|
162
|
+
|
163
|
+
<body>
|
164
|
+
<div id="header"></div>
|
165
|
+
<div id="content"></div>
|
166
|
+
<div id="footer"></div>
|
167
|
+
</body>
|
168
|
+
|
169
|
+
</html>
|
Binary file
|
@@ -3,47 +3,18 @@
|
|
3
3
|
$LOAD_PATH.unshift File.expand_path('../lib', __dir__)
|
4
4
|
require 'palapala'
|
5
5
|
|
6
|
-
|
7
|
-
<style
|
8
|
-
.header {
|
9
|
-
-webkit-print-color-adjust: exact;
|
10
|
-
border-bottom: 1px solid lightgray;
|
11
|
-
color: black;
|
12
|
-
font-family: Arial, Helvetica, sans-serif;
|
13
|
-
font-size: 12pt;
|
14
|
-
margin: 0 auto;
|
15
|
-
padding: 5px;
|
16
|
-
text-align: center;
|
17
|
-
vertical-align: middle;
|
18
|
-
width: 100%;
|
19
|
-
border: 1px solid black;
|
20
|
-
}
|
21
|
-
</style>
|
22
|
-
<div class="header" style="text-align: center">
|
23
|
-
Page <span class="pageNumber"></span> of <span class="totalPages"></span>
|
24
|
-
</div>
|
25
|
-
HTML
|
6
|
+
header_template =
|
7
|
+
'<div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>'
|
26
8
|
|
27
|
-
|
28
|
-
|
29
|
-
config.headless_chrome_url = 'http://localhost:9222' # run against a remote Chrome instance
|
30
|
-
# config.headless_chrome_path = '/usr/bin/google-chrome-stable' # path to Chrome executable
|
31
|
-
end
|
9
|
+
footer_template =
|
10
|
+
'<div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">Generated with Palapala PDF</div>'
|
32
11
|
|
33
|
-
|
34
|
-
# "<style>@page { size: A4 landscape; }</style><p>Hello world #{Time.now}</>",
|
12
|
+
Palapala::Pdf.new(
|
35
13
|
"<h1>Title</h1><p>Hello world #{Time.now}</>",
|
36
|
-
|
37
|
-
|
38
|
-
|
39
|
-
|
40
|
-
margin: { top: 3, bottom: 2 }
|
41
|
-
).save('tmp/headers_and_footers.pdf',
|
42
|
-
generateDocumentOutline: false,
|
43
|
-
# marginTop: 1,
|
44
|
-
# paperWidth: 3,
|
45
|
-
displayHeaderFooter: true,
|
46
|
-
# landscape: false,
|
47
|
-
headerTemplate: HEADER_HTML)
|
14
|
+
header_template:,
|
15
|
+
footer_template:,
|
16
|
+
margin_top: 3,
|
17
|
+
margin_bottom: 3).save('headers_and_footers.pdf')
|
48
18
|
|
49
|
-
puts
|
19
|
+
puts "Generated headers_and_footers.pdf"
|
20
|
+
# `open headers_and_footers.pdf`
|
Binary file
|
@@ -14,9 +14,8 @@ DOCUMENT = <<~HTML
|
|
14
14
|
</html>
|
15
15
|
HTML
|
16
16
|
|
17
|
-
Palapala.
|
18
|
-
config.debug = true
|
19
|
-
end
|
17
|
+
Palapala::Pdf.new(DOCUMENT).save('js_based_rendering.pdf')
|
20
18
|
|
21
|
-
|
22
|
-
|
19
|
+
puts "Generated js_based_rendering.pdf"
|
20
|
+
|
21
|
+
# `open js_based_rendering.pdf`
|
Binary file
|
@@ -0,0 +1,186 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
$LOAD_PATH.unshift File.expand_path("../lib", __dir__)
|
4
|
+
require "palapala"
|
5
|
+
|
6
|
+
long_text = (1..30).map { "Demonstrate a paragraph that is not split across pages." }.join(" ")
|
7
|
+
|
8
|
+
def table(rows)
|
9
|
+
<<~HTML
|
10
|
+
<table>
|
11
|
+
<thead>
|
12
|
+
<tr>
|
13
|
+
<th>Header 1</th>
|
14
|
+
<th>Header 2</th>
|
15
|
+
</tr>
|
16
|
+
</thead>
|
17
|
+
<tbody>
|
18
|
+
#{ (1..rows).map { |i| "<tr><td>Row #{i}, Cell 1</td><td>Row #{i}, Cell 2</td></tr>" }.join }
|
19
|
+
</tbody>
|
20
|
+
<tfoot>
|
21
|
+
<tr>
|
22
|
+
<td>Footer 1</td>
|
23
|
+
<td>Footer 2</td>
|
24
|
+
</tr>
|
25
|
+
</tfoot>
|
26
|
+
</table>
|
27
|
+
HTML
|
28
|
+
end
|
29
|
+
|
30
|
+
big_table = table(35)
|
31
|
+
small_table = table(5)
|
32
|
+
|
33
|
+
document = <<~HTML
|
34
|
+
<html>
|
35
|
+
<style>
|
36
|
+
@page {
|
37
|
+
size: A4;
|
38
|
+
margin: 2cm;
|
39
|
+
margin-top: 3cm;
|
40
|
+
margin-bottom: 3cm;
|
41
|
+
}
|
42
|
+
body, html {
|
43
|
+
margin: 0;
|
44
|
+
padding: 0;
|
45
|
+
font-family: Arial, sans-serif;
|
46
|
+
}
|
47
|
+
h1 {
|
48
|
+
page-break-before: always;
|
49
|
+
border-bottom: 1px solid black;
|
50
|
+
}
|
51
|
+
h2 {
|
52
|
+
/* keep with next */
|
53
|
+
page-break-after: avoid;
|
54
|
+
}
|
55
|
+
@page:first {
|
56
|
+
size: A4 landscape;
|
57
|
+
margin: 0; /* no margin for the first page */
|
58
|
+
padding: 0;
|
59
|
+
}
|
60
|
+
div.titlepage {
|
61
|
+
background-color: black;
|
62
|
+
color: white;
|
63
|
+
font-size: 72pt;
|
64
|
+
text-align: center;
|
65
|
+
display: flex;
|
66
|
+
justify-content: center;
|
67
|
+
align-items: center;
|
68
|
+
height: 100%;
|
69
|
+
width: 100vw;
|
70
|
+
}
|
71
|
+
table {
|
72
|
+
font-size: 10pt;
|
73
|
+
width: 100%;
|
74
|
+
border-collapse: collapse;
|
75
|
+
td, th {
|
76
|
+
border: 1px solid black;
|
77
|
+
padding: 0.5rem;
|
78
|
+
}
|
79
|
+
& thead, & tfoot {
|
80
|
+
tr {
|
81
|
+
background-color: lightgray;
|
82
|
+
& th, & td {
|
83
|
+
padding-top: 0.5rem;
|
84
|
+
padding-bottom: 0.5rem;
|
85
|
+
}
|
86
|
+
}
|
87
|
+
}
|
88
|
+
}
|
89
|
+
/* Initialize counters */
|
90
|
+
body {
|
91
|
+
counter-reset: h1Counter h2Counter;
|
92
|
+
}
|
93
|
+
/* Numbering for H1 elements */
|
94
|
+
h1 {
|
95
|
+
counter-increment: h1Counter;
|
96
|
+
counter-reset: h2Counter; /* Reset h2 counter when a new h1 appears */
|
97
|
+
}
|
98
|
+
h1::before {
|
99
|
+
content: counter(h1Counter) ". ";
|
100
|
+
/* font-weight: bold; */
|
101
|
+
}
|
102
|
+
/* Numbering for H2 elements */
|
103
|
+
h2 {
|
104
|
+
counter-increment: h2Counter;
|
105
|
+
}
|
106
|
+
h2::before {
|
107
|
+
content: counter(h1Counter) "." counter(h2Counter) " ";
|
108
|
+
/* font-weight: bold; */
|
109
|
+
}
|
110
|
+
/* named pages */
|
111
|
+
@page addendum {
|
112
|
+
size: A5;
|
113
|
+
margin: 1cm;
|
114
|
+
margin-top: 3cm;
|
115
|
+
}
|
116
|
+
.addendum {
|
117
|
+
page: addendum;
|
118
|
+
counter-reset: h1Counter h2Counter;
|
119
|
+
}
|
120
|
+
</style>
|
121
|
+
<body>
|
122
|
+
<div class="titlepage">
|
123
|
+
<c-title>Title Page</c-title>
|
124
|
+
</div>
|
125
|
+
<h1>New Section</h1>
|
126
|
+
<h2>Subsection tables</h2>
|
127
|
+
<p>This demonstrates a table with a header and footer that spans multiple pages.</p>
|
128
|
+
#{big_table}
|
129
|
+
<h2>Subsection page break inside</h2>
|
130
|
+
<p style="page-break-inside: avoid; text-align: justify">
|
131
|
+
#{long_text}
|
132
|
+
</p>
|
133
|
+
<p>Note that the section title has moved to the second page because the paragraph above was moved to the second page.</p>
|
134
|
+
<h1>New Section</h1>
|
135
|
+
<p>Page 3 content</p>
|
136
|
+
<p>A small table</p>
|
137
|
+
#{small_table}
|
138
|
+
<h2>Subsection</h2>
|
139
|
+
<p>Some content</p>
|
140
|
+
<h2>Subsection</h2>
|
141
|
+
<p>Some content</p>
|
142
|
+
<div class="addendum">
|
143
|
+
This is an addendum and the page size is A5.
|
144
|
+
Headers are starting again from 1.
|
145
|
+
<h1>Some addendum header</h1>
|
146
|
+
<h2>Subsection</h2>
|
147
|
+
<h2>Subsection</h2>
|
148
|
+
<h1>Some addendum header</h1>
|
149
|
+
</div>
|
150
|
+
</body>
|
151
|
+
</html>
|
152
|
+
HTML
|
153
|
+
|
154
|
+
def debug(color: "red")
|
155
|
+
<<~HTML
|
156
|
+
<style>
|
157
|
+
/* these are the ids Chrome assigns to the header, footer and content in the main template */
|
158
|
+
#header, #content, #footer {
|
159
|
+
border: 1px dotted #{color}; /* outlines the areas for debugging */
|
160
|
+
}
|
161
|
+
</style>
|
162
|
+
HTML
|
163
|
+
end
|
164
|
+
|
165
|
+
def header_footer_template(debug_color: nil)
|
166
|
+
<<~HTML
|
167
|
+
#{ debug(color: debug_color) if debug_color }
|
168
|
+
<div style="text-align: center; font-size: 12pt; padding: 1rem; width: 100%;">#{yield}</div>
|
169
|
+
HTML
|
170
|
+
end
|
171
|
+
|
172
|
+
footer_template = header_footer_template do
|
173
|
+
"Page <span class='pageNumber'></span> of <span class='totalPages'></span>"
|
174
|
+
end
|
175
|
+
|
176
|
+
header_template = header_footer_template do
|
177
|
+
"Generated with Palapala PDF"
|
178
|
+
end
|
179
|
+
|
180
|
+
Palapala::Pdf.new(document,
|
181
|
+
header_template:,
|
182
|
+
footer_template:).save("paged_css.pdf")
|
183
|
+
|
184
|
+
puts "Generated paged_css.pdf"
|
185
|
+
|
186
|
+
# `open paged_css.pdf`
|
@@ -5,14 +5,10 @@ $LOAD_PATH.unshift File.expand_path('../lib', __dir__)
|
|
5
5
|
require 'benchmark'
|
6
6
|
require 'palapala'
|
7
7
|
|
8
|
-
debug = ARGV[0] == 'debug'
|
8
|
+
$debug = ARGV[0] == 'debug'
|
9
|
+
$save = ARGV[0] == 'save'
|
9
10
|
|
10
|
-
Palapala.
|
11
|
-
# config.headless_chrome_url = 'http://localhost:9222'
|
12
|
-
config.debug = debug
|
13
|
-
config.defaults.merge! scale: 0.75, format: :A4
|
14
|
-
config.chrome_headless_shell_version = 'canary'
|
15
|
-
end
|
11
|
+
Palapala.debug = $debug
|
16
12
|
|
17
13
|
# @param concurrency Number of concurrent threads
|
18
14
|
# @param iterations Number of iterations per thread
|
@@ -22,7 +18,8 @@ def benchmark(concurrency, iterations)
|
|
22
18
|
Thread.new do
|
23
19
|
iterations.times do |j|
|
24
20
|
doc = "Hello #{i}, world #{j}! #{Time.now}."
|
25
|
-
Palapala::Pdf.new(doc)
|
21
|
+
pdf = Palapala::Pdf.new(doc)
|
22
|
+
$save ? pdf.save("tmp/benchmark_#{i}_#{j}.pdf") : pdf.binary_data
|
26
23
|
end
|
27
24
|
end
|
28
25
|
end
|
@@ -32,18 +29,9 @@ def benchmark(concurrency, iterations)
|
|
32
29
|
time
|
33
30
|
end
|
34
31
|
|
35
|
-
puts
|
32
|
+
puts "Warmup..."
|
33
|
+
benchmark(1, 5)
|
34
|
+
puts "Starting benchmark..."
|
36
35
|
benchmark(1, 10)
|
37
|
-
|
38
|
-
# benchmark(1, 20)
|
39
|
-
benchmark(2, 10)
|
40
|
-
# benchmark(4, 5)
|
41
|
-
# benchmark(5, 4)
|
42
|
-
# benchmark(20, 1)
|
43
|
-
|
44
|
-
# benchmark(1, 320)
|
45
|
-
# benchmark(2, 320 / 2)
|
36
|
+
benchmark(2, 20 / 2)
|
46
37
|
benchmark(4, 320 / 4)
|
47
|
-
# benchmark(8, 320 / 8)
|
48
|
-
# benchmark(20, 2)
|
49
|
-
# benchmark(40, 1)
|
@@ -25,9 +25,9 @@ module Palapala
|
|
25
25
|
end
|
26
26
|
end
|
27
27
|
|
28
|
-
# Check if a Chrome is running
|
28
|
+
# Check if Chrome is running locally
|
29
29
|
def self.chrome_running?
|
30
|
-
port_in_use? || # Check if the port is in use
|
30
|
+
port_in_use? || # Check if the port is in use
|
31
31
|
chrome_process_healthy? # Check if the process is still alive
|
32
32
|
end
|
33
33
|
|
@@ -59,9 +59,9 @@ module Palapala
|
|
59
59
|
system("which npx > /dev/null 2>&1")
|
60
60
|
end
|
61
61
|
|
62
|
-
def self.
|
62
|
+
def self.spawn_chrome_headless_server_with_npx
|
63
63
|
# Run the command and capture the output
|
64
|
-
puts "Installing
|
64
|
+
puts "Installing/launching chrome-headless-shell@#{Palapala.chrome_headless_shell_version}"
|
65
65
|
output, status = Open3.capture2("npx --yes @puppeteer/browsers install chrome-headless-shell@#{Palapala.chrome_headless_shell_version}")
|
66
66
|
|
67
67
|
if status.success?
|
@@ -82,28 +82,37 @@ module Palapala
|
|
82
82
|
# Display the version
|
83
83
|
system("#{chrome_path} --version") if Palapala.debug
|
84
84
|
# Launch chrome-headless-shell with the --remote-debugging-port parameter
|
85
|
-
|
86
|
-
|
85
|
+
params = [ "--disable-gpu", "--remote-debugging-port=9222" ]
|
86
|
+
params.concat(Palapala.chrome_params) if Palapala.chrome_params # Array has no #merge!; append extra flags
|
87
|
+
pid = if Palapala.debug
|
88
|
+
spawn(chrome_path, *params)
|
87
89
|
else
|
88
|
-
spawn(chrome_path,
|
90
|
+
spawn(chrome_path, *params, out: "/dev/null", err: "/dev/null")
|
89
91
|
end
|
92
|
+
Palapala.headless_chrome_url = "http://localhost:9222"
|
93
|
+
pid
|
90
94
|
else
|
91
95
|
raise "Failed to install chrome-headless-shell"
|
92
96
|
end
|
93
97
|
end
|
94
98
|
|
99
|
+
def self.spawn_chrome_from_path
|
100
|
+
params = [ "--headless", "--disable-gpu", "--remote-debugging-port=9222" ]
|
101
|
+
params.concat(Palapala.chrome_params) if Palapala.chrome_params # Array has no #merge!; append extra flags
|
102
|
+
# Spawn an existing chrome with the path and parameters
|
103
|
+
Process.spawn(chrome_path, *params)
|
104
|
+
end
|
105
|
+
|
95
106
|
# Spawn a Chrome child process
|
96
107
|
def self.spawn_chrome
|
97
108
|
return if chrome_running?
|
98
109
|
|
99
|
-
|
100
|
-
|
101
|
-
|
102
|
-
|
103
|
-
|
104
|
-
|
105
|
-
@chrome_process_id = Process.spawn(chrome_path, *params)
|
106
|
-
end
|
110
|
+
@chrome_process_id =
|
111
|
+
if Palapala.headless_chrome_path.nil? && self.npx_installed?
|
112
|
+
spawn_chrome_headless_server_with_npx
|
113
|
+
else
|
114
|
+
spawn_chrome_from_path
|
115
|
+
end
|
107
116
|
|
108
117
|
# Wait until the port is in use
|
109
118
|
sleep 0.1 until port_in_use?
|
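Both spawn helpers in the hunk above build a list of Chrome flags and then append optional extra parameters before splatting the list into `spawn`. The list is a Ruby Array, so appending is done with `concat` (or `+`); `merge!` exists on Hash, not on Array. The following is a minimal sketch of that step, assuming the extra parameters are `nil` or an array of flag strings. `chrome_launch_params` is a hypothetical helper for illustration and is not part of the gem.

```ruby
# Hypothetical helper (not part of palapala_pdf): builds the Chrome flag list.
# Assumes `extra` is nil or an Array of flag strings.
def chrome_launch_params(extra = nil, headless: false)
  params = ["--disable-gpu", "--remote-debugging-port=9222"]
  params.unshift("--headless") if headless # only needed for a full Chrome binary
  params.concat(Array(extra))              # Array(nil) is [], so nil is safe
end
```

Calling `chrome_launch_params(["--no-sandbox"], headless: true)` returns the four flags in launch order, ready to be passed as `spawn(chrome_path, *params)`.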
data/lib/palapala/pdf.rb
CHANGED
@@ -42,20 +42,22 @@ module Palapala
|
|
42
42
|
scale: nil)
|
43
43
|
@content = content || raise(ArgumentError, "Content is required and can't be nil")
|
44
44
|
@opts = {}
|
45
|
-
@opts[:headerTemplate]
|
46
|
-
@opts[:footerTemplate]
|
47
|
-
@opts[:pageRanges]
|
48
|
-
@opts[:generateTaggedPDF]
|
49
|
-
@opts[:paperWidth]
|
50
|
-
@opts[:paperHeight]
|
51
|
-
@opts[:landscape]
|
52
|
-
@opts[:marginTop]
|
53
|
-
@opts[:marginLeft]
|
54
|
-
@opts[:marginBottom]
|
55
|
-
@opts[:marginRight]
|
56
|
-
@opts[:preferCSSPageSize]
|
57
|
-
@opts[:printBackground]
|
58
|
-
@opts[:scale]
|
45
|
+
@opts[:headerTemplate] = header_template || Palapala.defaults[:header_template]
|
46
|
+
@opts[:footerTemplate] = footer_template || Palapala.defaults[:footer_template]
|
47
|
+
@opts[:pageRanges] = page_ranges || Palapala.defaults[:page_ranges]
|
48
|
+
@opts[:generateTaggedPDF] = generate_tagged_pdf || Palapala.defaults[:generate_tagged_pdf]
|
49
|
+
@opts[:paperWidth] = paper_width || Palapala.defaults[:paper_width]
|
50
|
+
@opts[:paperHeight] = paper_height || Palapala.defaults[:paper_height]
|
51
|
+
@opts[:landscape] = landscape || Palapala.defaults[:landscape]
|
52
|
+
@opts[:marginTop] = margin_top || Palapala.defaults[:margin_top]
|
53
|
+
@opts[:marginLeft] = margin_left || Palapala.defaults[:margin_left]
|
54
|
+
@opts[:marginBottom] = margin_bottom || Palapala.defaults[:margin_bottom]
|
55
|
+
@opts[:marginRight] = margin_right || Palapala.defaults[:margin_right]
|
56
|
+
@opts[:preferCSSPageSize] = prefer_css_page_size || Palapala.defaults[:prefer_css_page_size]
|
57
|
+
@opts[:printBackground] = print_background || Palapala.defaults[:print_background]
|
58
|
+
@opts[:scale] = scale || Palapala.defaults[:scale]
|
59
|
+
@opts[:displayHeaderFooter] = (@opts[:headerTemplate] || @opts[:footerTemplate]) ? true : false
|
60
|
+
@opts[:encoding] = :binary
|
59
61
|
@opts.compact!
|
60
62
|
end
|
61
63
|
|
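The constructor in the hunk above resolves each option as `argument || Palapala.defaults[...]`. One consequence of `||` is that an explicit `false` is treated like a missing value: with a default of `print_background: true`, passing `print_background: false` would still resolve to `true`. The sketch below illustrates the difference. `DEFAULTS` and `resolve_option` are hypothetical names used only for this example, and the fallback-on-`nil` behaviour is an alternative, not what the gem does.

```ruby
# Illustrative defaults, mirroring two of the keys set in lib/palapala.rb.
DEFAULTS = { print_background: true, margin_top: 0 }.freeze

# Hypothetical helper: fall back to the default only when the caller passed nil,
# so an explicit `false` survives.
def resolve_option(value, key, defaults = DEFAULTS)
  value.nil? ? defaults[key] : value
end

with_or  = false || DEFAULTS[:print_background]     # `||` discards the explicit false
with_nil = resolve_option(false, :print_background) # explicit false is kept
```

Here `with_or` ends up `true` while `with_nil` stays `false`. Options that are never meaningfully `false` (templates, margins, scale) behave the same under both approaches.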
data/lib/palapala/renderer.rb
CHANGED
@@ -22,6 +22,13 @@ module Palapala
|
|
22
22
|
send_command_and_wait_for_result("Page.enable")
|
23
23
|
end
|
24
24
|
|
25
|
+
def websocket_url
|
26
|
+
self.class.websocket_url
|
27
|
+
rescue Errno::ECONNREFUSED
|
28
|
+
ChromeProcess.spawn_chrome # Spawn a new Chrome process
|
29
|
+
self.class.websocket_url # Retry (once)
|
30
|
+
end
|
31
|
+
|
25
32
|
# Create a thread-local instance of the renderer
|
26
33
|
def self.thread_local_instance
|
27
34
|
Thread.current[:renderer] ||= Renderer.new
|
@@ -37,6 +44,9 @@ module Palapala
|
|
37
44
|
def on_message(e)
|
38
45
|
puts "Received: #{e.data[0..64]}" if Palapala.debug
|
39
46
|
@response = JSON.parse(e.data) # Parse the JSON response
|
47
|
+
if @response["error"] # Raise an error if the response contains an error
|
48
|
+
raise "#{@response["error"]["message"]}: #{@response["error"]["data"]} (#{@response["error"]["code"]})"
|
49
|
+
end
|
40
50
|
end
|
41
51
|
|
42
52
|
# Update the current ID to the next ID (increment by 1)
|
@@ -102,16 +112,8 @@ module Palapala
|
|
102
112
|
@client.close
|
103
113
|
end
|
104
114
|
|
105
|
-
private
|
106
|
-
|
107
|
-
# Convert the HTML content to a data URL
|
108
|
-
def data_url_for_html(html)
|
109
|
-
"data:text/html;base64,#{Base64.strict_encode64(html)}"
|
110
|
-
end
|
111
|
-
|
112
115
|
# Open a new tab in the remote chrome and return the WebSocket URL
|
113
|
-
def websocket_url
|
114
|
-
ChromeProcess.spawn_chrome
|
116
|
+
def self.websocket_url
|
115
117
|
uri = URI("#{Palapala.headless_chrome_url}/json/new")
|
116
118
|
http = Net::HTTP.new(uri.host, uri.port)
|
117
119
|
request = Net::HTTP::Put.new(uri)
|
@@ -122,5 +124,12 @@ module Palapala
|
|
122
124
|
puts "WebSocket URL: #{websocket_url}" if Palapala.debug
|
123
125
|
websocket_url
|
124
126
|
end
|
127
|
+
|
128
|
+
private
|
129
|
+
|
130
|
+
# Convert the HTML content to a data URL
|
131
|
+
def data_url_for_html(html)
|
132
|
+
"data:text/html;base64,#{Base64.strict_encode64(html)}"
|
133
|
+
end
|
125
134
|
end
|
126
135
|
end
|
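The new instance-level `websocket_url` in this hunk follows a retry-once pattern: try to connect, and if the connection is refused, spawn Chrome and try exactly one more time. A minimal, self-contained sketch of that pattern follows. `with_one_retry` and the URL string are hypothetical and stand in for the gem's own methods.

```ruby
# Hypothetical helper: run the block; if the connection is refused,
# call `recover` and run the block exactly one more time.
def with_one_retry(recover)
  yield
rescue Errno::ECONNREFUSED
  recover.call
  yield # a second failure propagates to the caller
end

attempts = 0
started  = false

result = with_one_retry(-> { started = true }) do
  attempts += 1
  raise Errno::ECONNREFUSED unless started # simulate "Chrome not running yet"
  "ws://example.invalid/devtools"          # placeholder value for illustration
end
```

After this runs, `attempts` is 2 and `result` holds the placeholder URL. Because the retry happens inside the `rescue` clause, a second `Errno::ECONNREFUSED` is not caught again, which keeps the retry bounded.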
data/lib/palapala/version.rb
CHANGED
data/lib/palapala.rb
CHANGED
@@ -19,18 +19,16 @@ module Palapala
|
|
19
19
|
# path to the headless Chrome executable when using the child process renderer
|
20
20
|
attr_accessor :headless_chrome_path
|
21
21
|
|
22
|
-
# URL to the headless Chrome instance when using the remote renderer
|
22
|
+
# URL to the headless Chrome instance when using the remote renderer (priority)
|
23
23
|
attr_accessor :headless_chrome_url
|
24
24
|
|
25
|
-
# Chrome headless shell version to use
|
25
|
+
# Chrome headless shell version to use (stable, beta, dev, canary, etc.)
|
26
|
+
# when launching a new Chrome instance using npx
|
26
27
|
attr_accessor :chrome_headless_shell_version
|
27
28
|
end
|
28
|
-
puts "setting defaults on palapala"
|
29
29
|
self.debug = false
|
30
|
-
self.defaults = {
|
31
|
-
self.headless_chrome_path = nil
|
32
|
-
self.headless_chrome_url = "http://localhost:9222"
|
30
|
+
self.defaults = { print_background: true, prefer_css_page_size: true, margin_left: 0, margin_right: 0, margin_top: 0, margin_bottom: 0 }
|
31
|
+
self.headless_chrome_path = ENV.fetch("HEADLESS_CHROME_PATH", nil)
|
32
|
+
self.headless_chrome_url = ENV.fetch("HEADLESS_CHROME_URL", "http://localhost:9222")
|
33
33
|
self.chrome_headless_shell_version = ENV.fetch("CHROME_HEADLESS_SHELL_VERSION", "stable")
|
34
34
|
end
|
35
|
-
|
36
|
-
puts "hoo"
|
data/paged_css.pdf
ADDED
Binary file
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: palapala_pdf
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.1.
|
4
|
+
version: 0.1.11
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Koen Handekyn
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2024-08-
|
11
|
+
date: 2024-08-30 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: base64
|
@@ -56,8 +56,16 @@ files:
|
|
56
56
|
- assets/images/logo-variant2.webp
|
57
57
|
- assets/images/logo.webp
|
58
58
|
- bin/chrome-headless-server
|
59
|
+
- doc/installing_node.md
|
60
|
+
- doc/paged_css.md
|
61
|
+
- examples/all.rb
|
62
|
+
- examples/chrome_base_header_footer_template.html
|
63
|
+
- examples/headers_and_footers.pdf
|
59
64
|
- examples/headers_and_footers.rb
|
65
|
+
- examples/js_based_rendering.pdf
|
60
66
|
- examples/js_based_rendering.rb
|
67
|
+
- examples/paged_css.pdf
|
68
|
+
- examples/paged_css.rb
|
61
69
|
- examples/performance_benchmark.rb
|
62
70
|
- lib/palapala.rb
|
63
71
|
- lib/palapala/chrome_process.rb
|
@@ -66,6 +74,7 @@ files:
|
|
66
74
|
- lib/palapala/version.rb
|
67
75
|
- lib/palapala/web_socket_client.rb
|
68
76
|
- lib/palapala_pdf.rb
|
77
|
+
- paged_css.pdf
|
69
78
|
- palapala_pdf.gemspec
|
70
79
|
homepage: https://github.com/palapala-app/palapala_pdf
|
71
80
|
licenses:
|