gkhtmltopdf 1.0.0 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 450c50b4a300eb003c28a7d3df213ed05ec5ec0626172ef178966f693af4cf5d
- data.tar.gz: 0a80041e7a625a55ba11f7636b25a2ebee2a2500e3a6df9d0d8e60c49300a4c9
+ metadata.gz: 693828b2a9f004f9038b744c5d4a527a46f27ca55ad7c3d4638478aa2a747ade
+ data.tar.gz: 418727b131d4305641bcb66df639c961aabc73de334fd599b06d772f2b0557ba
  SHA512:
- metadata.gz: ca97c9e7eeb874af9258869c4ead4d3dfbdd68f641afed5baaa66f5bdbf20dc1fd9b851039ba7fdc231df8f7ecd7b01b2b5768dddeace29c0c42608c469725ec
- data.tar.gz: 2feea957d0d915498bb92378a6ff734ffce7922da83f1cadb4765090eb09d5d3030b38525c3295ec31fe1826453f34a4bfde5f0343aad8fef9d6fc943ac6234c
+ metadata.gz: 8776f073004b0e5dba496316e4c95e18121784be2430b12f679dc900cfee41e7002befda8fe022f30739e8bdf33c76ce7c51cd625ed5e5095cc95f4eb60b4d3f
+ data.tar.gz: 3df53840690e009d5c39a18710867505eff22e0bbb0d6711dfc6f8cc44a73442c02f1d6f5b5c1c3ac9f5c5c327f61d377bbab9a73dafea0124ff4a20e378426b
data/CHANGELOG.md CHANGED
@@ -2,6 +2,12 @@
 
  All noteworthy changes to this project will be documented in this file.
 
+ ## 1.1.0 / 2026-06-29
+
+ - Added a User-Agent (UA) setting feature
+ - Temporary directories created with mktmpdir are now removed explicitly
+ - Fixed the Ubuntu setup instructions, among other things
+
  ## 1.0.0 / 2026-03-19
 
  - Added serial processing for multiple files and URLs.
data/README.md CHANGED
@@ -5,6 +5,11 @@ Gkhtmltopdf is mean Gecko HTML to PDF converter.
  Developed as an alternative to wkhtmltopdf.
  This gem converts HTML to PDF using Firefox's Geckodriver.
 
+ [![Gem Version](https://badge.fury.io/rb/gkhtmltopdf.svg)](https://badge.fury.io/rb/gkhtmltopdf)
+ ![Gem Total Downloads](https://img.shields.io/gem/dt/gkhtmltopdf)
+ ![GitHub License](https://img.shields.io/github/license/fantasia-tech/gkhtmltopdf-rb)
+ ![Rspec](https://github.com/fantasia-tech/gkhtmltopdf-rb/actions/workflows/main.yml/badge.svg)
+
  ---
 
  ## How to
@@ -12,36 +17,42 @@ This gem converts HTML to PDF using Firefox's Geckodriver.
  ### 1. Install
 
  1. [Firefox](https://www.firefox.com)
- - Ubuntu
- ```bash
- $ apt install -y firefox
- $ apt install -y fonts-noto # recommended
- ```
- - Debian
- ```bash
- $ apt install -y firefox-esr
- $ apt install -y fonts-noto # recommended
- ```
+ - Ubuntu
+ > The snap does not work correctly, so please install it from the [official source](https://support.mozilla.org/en-US/kb/install-firefox-linux).
+ ```bash
+ $ wget -q https://packages.mozilla.org/apt/repo-signing-key.gpg -O- | sudo tee /etc/apt/keyrings/packages.mozilla.org.asc > /dev/null
+ $ echo "deb [signed-by=/etc/apt/keyrings/packages.mozilla.org.asc] https://packages.mozilla.org/apt mozilla main" | tee -a /etc/apt/sources.list.d/mozilla.list > /dev/null
+ $ tee /etc/apt/preferences.d/mozilla > /dev/null << EOF
+ Package: *
+ Pin: origin packages.mozilla.org
+ Pin-Priority: 1000
+ EOF
+ $ apt install -y firefox
+ $ apt install -y fonts-noto # recommended
+ ```
+ - Debian
+ ```bash
+ $ apt install -y firefox-esr
+ $ apt install -y fonts-noto # recommended
+ ```
 
  2. [geckodriver](https://github.com/mozilla/geckodriver)
- - Linux (Ubuntu / Debian)
+ - Linux (Ubuntu / Debian)
  ```bash
  $ wget "https://github.com/mozilla/geckodriver/releases/download/v0.36.0/geckodriver-v0.36.0-linux64.tar.gz" -O /tmp/geckodriver.tar.gz
  $ tar -xzf /tmp/geckodriver.tar.gz -C /usr/local/bin
  ```
 
  3. gem install
- - bundler
+ - bundler
  ```bash
  $ bundle add gkhtmltopdf
  ```
- - other
+ - other
  ```bash
  $ gem install gkhtmltopdf
  ```
 
- ---
-
  ### 2. Using
 
  #### Ruby
@@ -57,6 +68,8 @@ Gkhtmltopdf.convert('https://example.com', 'example_com.pdf')
  Gkhtmltopdf.convert('file:///foo/bar/test.html', 'local.pdf')
  # with option (print background)
  Gkhtmltopdf.convert('https://f6a.net/oss/', 'with_bg.pdf', print_options: {background: true})
+ # with option (set custom user-agent)
+ Gkhtmltopdf.convert('https://f6a.net/oss/', 'ua.pdf', user_agent: 'YOUR USER AGENT')
  ```
 
  Additionally, in version 1.0.0 we added the following syntax.
@@ -69,6 +82,10 @@ Gkhtmltopdf.open do |gkh2p|
  gkh2p.save_pdf('file:///foo/bar/test.html', 'local.pdf')
  gkh2p.save_pdf('https://f6a.net/oss/', 'with_bg.pdf', print_options: {background: true})
  end
+ # set custom user-agent
+ Gkhtmltopdf.open(user_agent: 'YOUR USER AGENT') do |gkh2p|
+ gkh2p.save_pdf('https://f6a.net/oss/', 'ua.pdf')
+ end
  ```
 
  #### Shell
@@ -80,6 +97,8 @@ $ gkhtmltopdf https://example.com/ example_com.pdf
  $ gkhtmltopdf /foo/bar/test.html local.pdf
  # with option (print background)
  $ gkhtmltopdf https://f6a.net/oss/ with_bg.pdf --background
+ # with option (set custom user-agent)
+ $ gkhtmltopdf https://f6a.net/oss/ ua.pdf --user-agent "YOUR USER AGENT"
  # other option
  $ gkhtmltopdf --help
  ```
@@ -105,7 +124,7 @@ Attackers could potentially generate PDFs of internal network resources (e.g., `
 
  ---
 
- ## Errors
+ ## Expected Errors
 
  The following errors inherit `Gkhtmltopdf::Error`, so you can handle them as follows:
 
@@ -132,6 +151,12 @@ Response from Firefox/Geckodriver is not as expected.
 
  ---
 
+ ## Documents
+
+ - [ForDeveloper](/docs/ForDeveloper.md)
+
+ ---
+
  ## Acknowledgments & Third-Party Licenses
 
  This gem acts as a wrapper and communicates with the following external open-source tools.
data/TODO.md CHANGED
@@ -17,9 +17,20 @@
  - [x] Error handling
  - [x] Launch wait-time option
 
- ## For future consideration
+ ## Done v1.1.0
+
+ - [x] Add a User-Agent (UA) setting feature
+ - [x] Explicitly remove directories created with mktmpdir
+
+ ## Next v1.2.0
+
+ - [ ] Turn the settings into a Struct? (extract the defaults)
+ - [ ] Add a method that returns the PDF binary directly without saving it
+ - [ ] Split up the bloated convert.rb (maybe create something like `Gkhtmltopdf::PDF`)
+
+ ## Under consideration
 
- - [ ] User-Agent (UA) setting feature
  - [ ] Port range setting?
- - [ ] Set options from a config file?
  - [ ] Add YARD
+ - [ ] FireFox timeout
+ - [ ] Set options from a config file?
@@ -1,3 +1,5 @@
+ # escape=\
+ # syntax=docker/dockerfile:1.7
  FROM ruby:3.2-slim
 
  RUN apt-get update
@@ -0,0 +1,28 @@
+ FROM ubuntu:noble
+
+ RUN apt-get update
+ RUN apt-get install -y ruby3.2 bundler
+ RUN apt-get install -y git wget xz-utils build-essential libyaml-dev
+
+ # Install Noto Fonts
+ RUN apt-get install -y fonts-noto
+
+ # Install Firefox
+ RUN wget -q https://packages.mozilla.org/apt/repo-signing-key.gpg -O- | tee /etc/apt/keyrings/packages.mozilla.org.asc > /dev/null
+ RUN echo "deb [signed-by=/etc/apt/keyrings/packages.mozilla.org.asc] https://packages.mozilla.org/apt mozilla main" | tee -a /etc/apt/sources.list.d/mozilla.list > /dev/null
+ RUN tee /etc/apt/preferences.d/mozilla > /dev/null << EOF
+ Package: *
+ Pin: origin packages.mozilla.org
+ Pin-Priority: 1000
+ EOF
+ RUN apt-get update
+ RUN apt-get install -y firefox
+
+ # Install Geckodriver
+ RUN wget "https://github.com/mozilla/geckodriver/releases/download/v0.36.0/geckodriver-v0.36.0-linux64.tar.gz" -O geckodriver.tar.gz
+ RUN tar -xzf geckodriver.tar.gz -C /usr/local/bin
+
+ COPY . /app
+ WORKDIR /app
+ RUN bundle install
+ CMD ["bundle", "exec", "rspec"]
@@ -0,0 +1,17 @@
+ # for Developer
+
+ ## Test
+
+ ```bash
+ $ docker build -f ./dockerfiles/Dockerfile.debian13-ruby32 . -t gkhtmltopdf-d13r32
+ $ docker build -f ./dockerfiles/Dockerfile.ubuntu24-ruby32 . -t gkhtmltopdf-u24r32
+ $ docker run --rm gkhtmltopdf-d13r32
+ $ docker run --rm gkhtmltopdf-u24r32
+ $ docker rmi gkhtmltopdf-d13r32
+ $ docker rmi gkhtmltopdf-u24r32
+ ```
+
+ ## Build
+
+ ```
+ ```
data/exe/gkhtmltopdf CHANGED
@@ -12,33 +12,37 @@ options = {
  parser = OptionParser.new do |opts|
  opts.banner = "Usage: gkhtmltopdf [options] <URL_OR_FILE> <OUTPUT_PDF>"
 
- opts.on("-O", "--orientation [PORTRAIT|LANDSCAPE]", "default: portrait") do |v|
+ opts.on("-O", "--orientation ORIENTATION", ["portrait", "landscape"], "ORIENTATION portrait or landscape (default: portrait)") do |v|
  options[:print_options][:orientation] = v.downcase
  end
 
+ opts.on("--user-agent USERAGENT", String, "Browser custom USERAGENT") do |v|
+ options[:user_agent] = v
+ end
+
  opts.on("--background", "Print background") do
  options[:print_options][:background] = true
  end
 
- opts.on("--margin-top [CM]", Float, "margin top (cm)") do |v|
+ opts.on("--margin-top CM", Float, "margin top CM") do |v|
  options[:print_options][:margin] ||= {}
  options[:print_options][:margin][:top] = v
  end
 
- opts.on("--margin-bottom [CM]", Float, "margin bottom (cm)") do |v|
+ opts.on("--margin-bottom CM", Float, "margin bottom CM") do |v|
  options[:print_options][:margin] ||= {}
  options[:print_options][:margin][:bottom] = v
  end
 
- opts.on("--firefox-path [PATH]", "Firefox custom PATH") do |v|
+ opts.on("--firefox-path PATH", String, "Firefox custom PATH") do |v|
  options[:firefox_path] = v
  end
 
- opts.on("--geckodriver-path [PATH]", "Geckodriver custom PATH") do |v|
+ opts.on("--geckodriver-path PATH", String, "Geckodriver custom PATH") do |v|
  options[:geckodriver_path] = v
  end
 
- opts.on("--launch-max-wait-time [NUM]", Integer, "Launch max wait time (approx: NUM * 0.1sec)") do |v|
+ opts.on("--launch-max-wait-time NUM", Integer, "Launch max wait time (approx: NUM * 0.1sec)") do |v|
  options[:wait_time] = v
  end
 
@@ -74,6 +78,7 @@ begin
  init_options[:firefox_path] = options.delete(:firefox_path) if options[:firefox_path]
  init_options[:geckodriver_path] = options.delete(:geckodriver_path) if options[:geckodriver_path]
  init_options[:wait_time] = options.delete(:wait_time) if options[:wait_time]
+ init_options[:user_agent] = options.delete(:user_agent) if options[:user_agent]
 
  Gkhtmltopdf.convert(input_url, output_path, print_options: options[:print_options], **init_options)
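Note on the `exe/gkhtmltopdf` hunks: changing `[CM]` to `CM` (and likewise for the other options) turns an optional argument into a required one in Ruby's `OptionParser`, and the array passed to `--orientation` restricts the accepted values. The following is a minimal sketch of that behavior, not the gem's executable; `parse_cli` is a hypothetical helper introduced only for illustration.

```ruby
require 'optparse'

# Hypothetical helper (not part of the gem): a cut-down parser that mirrors
# the option declarations from the diff to show how OptionParser treats them.
def parse_cli(argv)
  options = { print_options: {} }
  parser = OptionParser.new do |opts|
    # An Array restricts the argument to the listed candidates.
    opts.on('-O', '--orientation ORIENTATION', %w[portrait landscape]) do |v|
      options[:print_options][:orientation] = v
    end
    # "CM" without brackets is a required argument, coerced to Float.
    opts.on('--margin-top CM', Float) do |v|
      options[:print_options][:margin] ||= {}
      options[:print_options][:margin][:top] = v
    end
    opts.on('--user-agent USERAGENT', String) do |v|
      options[:user_agent] = v
    end
  end
  # parse (non-destructive) returns the remaining positional arguments.
  positional = parser.parse(argv)
  [options, positional]
end
```

With this form, a value outside the candidate list raises `OptionParser::InvalidArgument`, and a trailing `--margin-top` with no value raises `OptionParser::MissingArgument`, rather than silently yielding `nil` as the bracketed optional form would.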
 
data/lib/errors.rb CHANGED
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module Gkhtmltopdf
  class Error < StandardError; end
 
@@ -1,24 +1,32 @@
+ # frozen_string_literal: true
+
  require 'net/http'
  require 'json'
  require 'base64'
  require 'uri'
  require 'socket'
+ require 'tmpdir'
+ require 'fileutils'
 
  module Gkhtmltopdf
  class Converter
- def open(geckodriver_path: nil, firefox_path: nil, wait_time: nil, port: nil)
+ DEFAULT_FX_USER_AGENT = "gkhtmltopdf-rb(v#{VERSION}) by firefox and gecko".freeze
+
+ def open(geckodriver_path: nil, firefox_path: nil, wait_time: nil, port: nil, user_agent: nil)
  @geckodriver_path = resolve_geckodriver_path!(geckodriver_path)
  @firefox_path = resolve_firefox_path!(firefox_path)
  @port = port || get_free_port
  @base_url = "http://127.0.0.1:#{@port}"
  @pid = spawn("#{@geckodriver_path} --port #{@port}", out: File::NULL, err: File::NULL)
  wait_time ||= 20
+ @profile_path = gen_tmp_profile(user_agent)
  wait_for_gk(wait_time)
  create_session!
  end
 
  def close
  delete_session! if @session_id
+ delete_tmp_profile! if @profile_path
  begin
  unless @pid.nil?
  Process.kill('TERM', @pid)
@@ -101,12 +109,24 @@ module Gkhtmltopdf
  raise BrowserError, "Failed to launch geckodriver (port #{@port})"
  end
 
+ def gen_tmp_profile(ua = nil)
+ tmp_profile_path = Dir.mktmpdir
+ ua ||= DEFAULT_FX_USER_AGENT
+ escaped_user_agent = JSON.generate(ua)
+ profile = []
+ profile << '# set gkhtmltopdf default profile'
+ profile << "user_pref(\"general.useragent.override\", #{escaped_user_agent});\n"
+ File.open(File.join(tmp_profile_path, 'user.js'), 'w') do |f|
+ profile.each { |line| f.puts(line) }
+ end
+ tmp_profile_path
+ end
+
  def post(path, payload)
  uri = URI("#{@base_url}#{path}")
  req = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
  req.body = payload.to_json
  res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
-
  begin
  JSON.parse(res.body)
  rescue JSON::ParserError
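Note on `gen_tmp_profile`: the hunk above writes a `user.js` into a fresh temporary directory and passes the User-Agent through `JSON.generate`, so quotes or backslashes in the value cannot break out of the `user_pref(...)` string literal. A minimal standalone sketch of that idea follows; the helper names `user_agent_pref` and `write_tmp_profile` are hypothetical and not part of the gem.

```ruby
require 'json'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: builds the user.js line for a User-Agent override.
# JSON.generate wraps the value in double quotes and escapes embedded ones.
def user_agent_pref(user_agent)
  "user_pref(\"general.useragent.override\", #{JSON.generate(user_agent)});"
end

# Hypothetical helper: writes user.js into a new temporary profile directory
# and returns the directory path. The caller is responsible for removing it.
def write_tmp_profile(user_agent)
  dir = Dir.mktmpdir
  File.write(File.join(dir, 'user.js'), user_agent_pref(user_agent) + "\n")
  dir
end

profile_dir = write_tmp_profile('My "quoted" UA')
begin
  puts File.read(File.join(profile_dir, 'user.js'))
ensure
  # Mirrors the explicit cleanup this release adds in delete_tmp_profile!.
  FileUtils.remove_entry_secure(profile_dir)
end
```

The `ensure` block reflects the same design choice as the diff: a directory created by the non-block form of `Dir.mktmpdir` stays on disk until it is removed explicitly.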
@@ -115,7 +135,7 @@ module Gkhtmltopdf
  end
 
  def create_session!
- firefox_options = { args: ["-headless"] }
+ firefox_options = { args: ["-headless", '--profile', @profile_path] }
  firefox_options[:binary] = @firefox_path if @firefox_path != 'firefox'
 
  payload = {
@@ -163,6 +183,10 @@ module Gkhtmltopdf
  @session_id = nil
  end
 
+ def delete_tmp_profile!
+ FileUtils.remove_entry_secure(@profile_path)
+ end
+
  def validate_url_scheme!(url_string)
  parsed_url = URI.parse(url_string)
  allowed_schemes = ['http', 'https', 'file']
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module Gkhtmltopdf
  class DSL
  def initialize
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Gkhtmltopdf
- VERSION = "1.0.0"
+ VERSION = '1.1.0'
  end
data/lib/gkhtmltopdf.rb CHANGED
@@ -6,17 +6,17 @@ require_relative 'gkhtmltopdf/dsl'
  require_relative 'errors'
 
  module Gkhtmltopdf
- def self.convert(url, output_path, geckodriver_path: nil, firefox_path: nil, wait_time: nil, port: nil, print_options: {})
+ def self.convert(url, output_path, geckodriver_path: nil, firefox_path: nil, wait_time: nil, port: nil, user_agent: nil, print_options: {})
  converter = DSL.new
- converter.open(geckodriver_path: geckodriver_path, firefox_path: firefox_path, wait_time: wait_time, port: port)
+ converter.open(geckodriver_path: geckodriver_path, firefox_path: firefox_path, wait_time: wait_time, port: port, user_agent: user_agent)
  converter.save_pdf(url, output_path, print_options: print_options)
  ensure
  converter.close
  end
 
- def self.open(geckodriver_path: nil, firefox_path: nil, wait_time: nil, port: nil, &block)
+ def self.open(geckodriver_path: nil, firefox_path: nil, wait_time: nil, port: nil, user_agent: nil, &block)
  converter = DSL.new
- converter.open(geckodriver_path: geckodriver_path, firefox_path: firefox_path, wait_time: wait_time, port: port)
+ converter.open(geckodriver_path: geckodriver_path, firefox_path: firefox_path, wait_time: wait_time, port: port, user_agent: user_agent)
  yield converter
  ensure
  converter.close
@@ -17,12 +17,16 @@ RSpec.describe Gkhtmltopdf::Converter do
  end
  describe '#save_pdf' do
  before { converter.open }
- subject { converter.save_pdf(url, output_path) }
  let(:url) { "file://#{file_fixture('test.html')}" }
- let(:output_path) { File.join(Dir.mktmpdir, 'output.pdf') }
+ let(:output_path) { Dir.mktmpdir }
+ let(:output) { File.join(output_path, 'output.pdf') }
+ after { FileUtils.remove_entry_secure(output_path) }
+
+ subject { converter.save_pdf(url, output) }
+
  it {
- expect { subject }.to change { Dir.glob(output_path).count }.from(0).to(1)
- expect(File.binread(output_path)).to include('/FontName')
+ expect { subject }.to change { Dir.glob(output).count }.from(0).to(1)
+ expect(File.binread(output)).to include('/FontName')
  }
  end
  describe '#resolve_geckodriver_path!' do
@@ -3,7 +3,9 @@ require 'spec_helper'
  RSpec.describe Gkhtmltopdf do
  describe '.convert' do
  let(:url) { 'https://f6a.net/oss/' }
- let(:output) { File.join(Dir.mktmpdir, 'output.pdf') }
+ let(:output_path) { Dir.mktmpdir }
+ let(:output) { File.join(output_path, 'output.pdf') }
+ after { FileUtils.remove_entry_secure(output_path) }
 
  subject { Gkhtmltopdf.convert(url, output) }
 
@@ -21,6 +23,7 @@ RSpec.describe Gkhtmltopdf do
  describe '.open' do
  let(:url) { 'https://f6a.net/oss/' }
  let(:output_path) { Dir.mktmpdir }
+ after { FileUtils.remove_entry_secure(output_path) }
 
  subject do
  Gkhtmltopdf.open do |gk|
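Note on the spec changes: the added `after` hooks remove directories that the non-block form of `Dir.mktmpdir` would otherwise leave behind. As a point of comparison, the block form of `Dir.mktmpdir` removes the directory automatically when the block returns. A minimal sketch follows; `with_output_dir` and the `gkhtmltopdf-sketch` prefix are hypothetical and not used by the gem's specs.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: yields a temporary directory that Dir.mktmpdir
# deletes as soon as the block finishes (even if the block raises).
def with_output_dir
  Dir.mktmpdir('gkhtmltopdf-sketch') do |dir|
    yield dir
  end
end

# Non-block form: the directory persists until removed explicitly,
# which is what the new `after` hooks in the specs take care of.
leftover = Dir.mktmpdir
FileUtils.remove_entry_secure(leftover)

# Block form: no explicit cleanup is needed.
seen = nil
with_output_dir do |dir|
  seen = dir
  File.write(File.join(dir, 'output.pdf'), '%PDF-')
end
```

The specs keep the non-block form because `let` and `after` live in separate RSpec hooks; the block form fits code where creation and use share one scope.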
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: gkhtmltopdf
  version: !ruby/object:Gem::Version
- version: 1.0.0
+ version: 1.1.0
  platform: ruby
  authors:
  - Kazuki Sakane
@@ -78,11 +78,13 @@ files:
  - ".rspec"
  - ".ruby-version"
  - CHANGELOG.md
- - Dockerfile
  - LICENSE
  - README.md
  - Rakefile
  - TODO.md
+ - dockerfiles/Dockerfile.debian13-ruby32
+ - dockerfiles/Dockerfile.ubuntu24-ruby32
+ - docs/ForDeveloper.md
  - exe/gkhtmltopdf
  - lib/errors.rb
  - lib/gkhtmltopdf.rb