jekyll-pandoc-exports 0.1.14 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: ca2a0537fbb9140f794e1a067ea8c5d11158d67a6c113f9b05c0b49977686f91
-   data.tar.gz: 242fa714998209a32555dfaeea645bdf9cc70f64105d088940215346c576859c
+   metadata.gz: 70da42a9e14e74c755a0e9cda4cc17a5fbe348075214d1d1820f3a06d40da60b
+   data.tar.gz: a862b3fab61784e2d37501f1e73982ab6e68495a6c90f7fadebd352abcaa7a5c
  SHA512:
-   metadata.gz: ecd9468193ec01c865b2738524ec3a3831e26cf23175fe4ee5ff4112366a8042717065c8e8e8e41f95ae8275ffcf0db710b2be2b9994a571be8078d4deb88182
-   data.tar.gz: e5e1af055ee84786049f38bb4d7d78666ff3c94c6a54f067318ced7ae25e3e3a2bf24ad640856e15d92757780cdb495279b42494d7e38839cf36dd640e6bf9b1
+   metadata.gz: 402bf541eaa772936ef7b2cc8691f557020c1d830a0bfeedf7d5b798ad5ffc6a7c2679fe4643115195651743d80ef0cb42b5064ee1f3594cec4ca172ce970baa
+   data.tar.gz: 07e74c42a95154d183431db6164546aa589e09fc71d28f77f37315dbf8e32175c84b84a324f77607ef83e623999a9c8168785858adc11a587478596e6622eade
data/CHANGELOG.md CHANGED
@@ -7,6 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

  ## [Unreleased]

+ ## [0.2.0] - 2026-05-06
+
+ ### Added
+ - `jekyll export` command — generate PDF/DOCX without a full site build
+ - `--format` flag: select `pdf`, `docx`, or `both` output
+ - `--target` flag: export a specific page by filename
+ - `--dry-run` flag: print the Pandoc command without executing
+ - `--validate` flag: check `_data/data.yml` schema before export
+ - `--output` flag: override the output directory
+ - Pre-export YAML schema validation (checks required sections, fields, structure)
+ - `export_filename` front matter support: custom output filenames (e.g., `McGarrah-Resume` instead of `print`)
+ - `build_download_url` helper for correct download link generation with output_dir
+ - Test suite for the export command (10 tests, 15 assertions)
+
+ ### Changed
+ - Version bump to 0.2.0 (new feature: Jekyll Command integration)
+ - Minimum Ruby version raised from 3.0 to 3.2 (3.0 and 3.1 are EOL)
+ - CI matrix updated to Ruby 3.2, 3.3, 3.4 (dropped EOL 3.0, 3.1; added current 3.4)
+
  ## [0.1.14] - 2025-06-14

  ### Fixed
data/README.md CHANGED
@@ -17,6 +17,13 @@ A Jekyll plugin that automatically generates DOCX and PDF exports of your pages

  ## Installation

+ ### Requirements
+
+ - **Ruby** >= 3.2.0
+ - **Jekyll** >= 4.3.2 (earlier versions crash on Ruby 3.2+ because Liquid calls `tainted?`, a method that Ruby 3.2 removed)
+ - **Pandoc** (for document conversion)
+ - **LaTeX** (for PDF generation)
+
  ### 1. Install Dependencies

  First, install Pandoc and LaTeX (for PDF generation):
@@ -193,7 +200,48 @@ When `inject_downloads` is enabled, the plugin automatically adds download links

  ## CLI Usage

- The plugin includes a command-line tool for standalone conversions:
+ The plugin provides two command-line interfaces:
+
+ ### Jekyll Export Command (Recommended)
+
+ *Added in v0.2.0* — Runs within your Jekyll site context, reads `_config.yml`:
+
+ ```bash
+ # Build site first, then export without rebuilding
+ bundle exec jekyll build
+ bundle exec jekyll export
+
+ # Export PDF only
+ bundle exec jekyll export --format pdf
+
+ # Export a specific page
+ bundle exec jekyll export --target print
+
+ # See what Pandoc would run (debug LaTeX issues)
+ bundle exec jekyll export --dry-run
+
+ # Validate _data/data.yml schema before export
+ bundle exec jekyll export --validate
+
+ # Custom output directory
+ bundle exec jekyll export --output ~/Downloads
+ ```
+
+ #### Export Command Options
+
+ | Option | Description | Default |
+ |--------|-------------|---------|
+ | `--format FORMAT` | Output: `pdf`, `docx`, `both` | `both` |
+ | `--target TARGET` | Export specific page by filename | All configured pages |
+ | `--dry-run` | Print Pandoc command without executing | `false` |
+ | `--validate` | Validate `_data/data.yml` schema first | `false` |
+ | `--output DIR` | Override output directory | From `_config.yml` |
+ | `--source DIR` | Source directory | `.` |
+ | `--config FILE` | Configuration file | `_config.yml` |
+
+ ### Standalone CLI Tool
+
+ For converting individual HTML files outside of Jekyll:

  ```bash
  # Convert single HTML file to both formats
@@ -0,0 +1,316 @@
+ require 'jekyll'
+
+ module Jekyll
+   module PandocExports
+     class Command < Jekyll::Command
+       class << self
+         def init_with_program(prog)
+           prog.command(:export) do |c|
+             c.syntax "export [options]"
+             c.description "Generate PDF/DOCX exports without a full site build"
+
+             c.option 'format', '--format FORMAT', 'Output format: pdf, docx, both (default: both)'
+             c.option 'target', '--target TARGET', 'Target page to export by filename (default: all configured pages)'
+             c.option 'dry_run', '--dry-run', 'Print the Pandoc command without executing'
+             c.option 'validate', '--validate', 'Validate _data/data.yml schema before export'
+             c.option 'output', '-o', '--output DIR', 'Override output directory'
+             c.option 'source', '-s', '--source DIR', 'Source directory (default: .)'
+             c.option 'config', '--config FILE', 'Configuration file (default: _config.yml)'
+
+             c.action do |args, options|
+               ExportRunner.new(args, options).run
+             end
+           end
+         end
+       end
+     end
+
+     class ExportRunner
+       def initialize(args, options)
+         @args = args
+         @options = options
+         @format = (options['format'] || 'both').downcase
+         @target = options['target']
+         @dry_run = options['dry_run'] || false
+         @validate = options['validate'] || false
+         @output_dir = options['output']
+         @source = options['source'] || '.'
+         @config_file = options['config'] || '_config.yml'
+       end
+
+       def run
+         # Ensure logger shows info-level output for CLI feedback
+         Jekyll.logger.adjust_verbosity(verbose: true)
+
+         validate_format!
+         validate_schema! if @validate
+
+         config = load_site_config
+         export_config = PandocExports.setup_configuration(mock_site(config))
+
+         unless PandocExports.validate_dependencies
+           Jekyll.logger.error "Export:", "Missing required dependencies."
+           return
+         end
+
+         html_files = find_export_targets(config, export_config)
+
+         if html_files.empty?
+           Jekyll.logger.warn "Export:", "No export targets found. Run 'jekyll build' first, or check that pages have pdf/docx front matter."
+           return
+         end
+
+         html_files.each do |target|
+           export_file(target, config, export_config)
+         end
+
+         Jekyll.logger.info "Export:", "Complete."
+       end
+
+       private
+
+       def validate_format!
+         unless %w[pdf docx both].include?(@format)
+           Jekyll.logger.error "Export:", "Invalid format '#{@format}'. Use: pdf, docx, both"
+           exit 1
+         end
+       end
+
+       def validate_schema!
+         Jekyll.logger.info "Export:", "Validating _data/data.yml..."
+
+         data_file = File.join(@source, '_data', 'data.yml')
+         unless File.exist?(data_file)
+           Jekyll.logger.warn "Export:", "No _data/data.yml found — skipping validation."
+           return
+         end
+
+         begin
+           require 'yaml'
+           require 'date'
+           data = YAML.load_file(data_file, permitted_classes: [Date])
+
+           errors = []
+
+           # Check required top-level keys
+           %w[sidebar career-profile education experiences].each do |key|
+             errors << "Missing required section: '#{key}'" unless data.key?(key)
+           end
+
+           # Check sidebar required fields
+           if data['sidebar']
+             %w[name tagline email].each do |field|
+               errors << "Missing sidebar.#{field}" unless data['sidebar'][field]
+             end
+           end
+
+           # Check experiences structure
+           if data['experiences'] && data['experiences']['info']
+             data['experiences']['info'].each_with_index do |exp, i|
+               errors << "Experience ##{i + 1} missing 'role'" unless exp['role']
+               errors << "Experience ##{i + 1} missing 'company'" unless exp['company']
+               errors << "Experience ##{i + 1} missing 'time'" unless exp['time']
+             end
+           end
+
+           # Check education structure
+           if data['education'] && data['education']['info']
+             data['education']['info'].each_with_index do |edu, i|
+               errors << "Education ##{i + 1} missing 'degree'" unless edu['degree']
+               errors << "Education ##{i + 1} missing 'university'" unless edu['university']
+             end
+           end
+
+           if errors.any?
+             Jekyll.logger.error "Export:", "Schema validation failed:"
+             errors.each { |e| Jekyll.logger.error " ", e }
+             exit 1
+           else
+             Jekyll.logger.info "Export:", "Schema validation passed ✓"
+           end
+         rescue Psych::SyntaxError => e
+           Jekyll.logger.error "Export:", "YAML syntax error: #{e.message}"
+           exit 1
+         end
+       end
+
+       def load_site_config
+         config_path = File.join(@source, @config_file)
+         unless File.exist?(config_path)
+           Jekyll.logger.error "Export:", "Config file not found: #{config_path}"
+           exit 1
+         end
+
+         Jekyll.configuration({
+           'source' => @source,
+           'quiet' => false
+         })
+       end
+
+       def mock_site(config)
+         site = Object.new
+         site.define_singleton_method(:config) { config }
+         site.define_singleton_method(:dest) { config['destination'] }
+         site.define_singleton_method(:baseurl) { config['baseurl'] || '' }
+         site
+       end
+
+       def find_export_targets(config, export_config)
+         dest = config['destination']
+         unless Dir.exist?(dest)
+           Jekyll.logger.error "Export:", "Site destination '#{dest}' not found. Run 'jekyll build' first."
+           exit 1
+         end
+
+         targets = []
+
+         # Scan for HTML files with pdf/docx front matter markers
+         # The simplest approach: look for files that the generator would process
+         # by checking the source pages for front matter
+         source = config['source']
+
+         Dir.glob(File.join(source, '**', '*.html')).each do |source_file|
+           content = File.read(source_file)
+           next unless content.start_with?('---')
+
+           # Parse front matter
+           if content =~ /\A---\s*\n(.*?)\n---/m
+             begin
+               front_matter = YAML.safe_load($1) || {}
+             rescue
+               next
+             end
+
+             next unless front_matter['pdf'] || front_matter['docx']
+
+             # Determine the output HTML path
+             filename = File.basename(source_file, '.html')
+
+             # Filter by target if specified
+             if @target
+               next unless filename == @target || source_file.include?(@target)
+             end
+
+             # Find the built HTML file
+             permalink = front_matter['permalink']
+             if permalink
+               html_path = File.join(dest, permalink, 'index.html')
+               html_path = File.join(dest, "#{permalink.sub(/^\//, '')}.html") unless File.exist?(html_path)
+               # Try without trailing slash
+               html_path = File.join(dest, permalink, 'index.html') unless File.exist?(html_path)
+               # Direct path
+               html_path = File.join(dest, "#{permalink.sub(/^\//, '')}index.html") unless File.exist?(html_path)
+             else
+               html_path = File.join(dest, "#{filename}.html")
+               html_path = File.join(dest, filename, 'index.html') unless File.exist?(html_path)
+             end
+
+             if File.exist?(html_path)
+               targets << {
+                 filename: filename,
+                 html_path: html_path,
+                 front_matter: front_matter
+               }
+             else
+               Jekyll.logger.warn "Export:", "Built HTML not found for #{source_file} (expected: #{html_path})"
+             end
+           end
+         end
+
+         targets
+       end
+
+       def export_file(target, config, export_config)
+         filename = target[:front_matter]['export_filename'] || target[:filename]
+         html_path = target[:html_path]
+         front_matter = target[:front_matter]
+
+         html_content = File.read(html_path)
+         site = mock_site(config)
+         processed_html = PandocExports.process_html_content(html_content, site, export_config)
+
+         output_dir = determine_output_dir(config, export_config)
+         FileUtils.mkdir_p(output_dir) unless Dir.exist?(output_dir)
+
+         generated_files = []
+
+         if should_generate_docx?(front_matter)
+           if @dry_run
+             print_dry_run(:docx, processed_html, filename, output_dir, export_config)
+           else
+             PandocExports.generate_docx(processed_html, filename, output_dir, site, generated_files, export_config)
+           end
+         end
+
+         if should_generate_pdf?(front_matter)
+           if @dry_run
+             print_dry_run(:pdf, processed_html, filename, output_dir, export_config)
+           else
+             mock_page = Object.new
+             mock_page.define_singleton_method(:data) { front_matter }
+             PandocExports.generate_pdf(processed_html, filename, output_dir, site, generated_files, mock_page, export_config)
+           end
+         end
+       end
+
+       def determine_output_dir(config, export_config)
+         if @output_dir
+           File.expand_path(@output_dir)
+         elsif export_config['output_dir'] && !export_config['output_dir'].empty?
+           File.join(config['destination'], export_config['output_dir'])
+         else
+           config['destination']
+         end
+       end
+
+       def should_generate_docx?(front_matter)
+         return false unless front_matter['docx']
+         %w[docx both].include?(@format)
+       end
+
+       def should_generate_pdf?(front_matter)
+         return false unless front_matter['pdf']
+         %w[pdf both].include?(@format)
+       end
+
+       def print_dry_run(format, html_content, filename, output_dir, config)
+         output_file = File.join(output_dir, "#{filename}.#{format}")
+
+         # Build the equivalent pandoc command
+         pandoc_args = ["pandoc"]
+         pandoc_args << "--from=html"
+         pandoc_args << "--to=#{format == :pdf ? 'pdf' : 'docx'}"
+         pandoc_args << "--output=#{output_file}"
+
+         if format == :pdf
+           pdf_options = config['pdf_options'] || {}
+           pdf_options.each do |key, value|
+             pandoc_args << "--#{key}=#{value}"
+           end
+
+           pandoc_options = config['pandoc_options'] || {}
+           pandoc_options.each do |key, value|
+             pandoc_args << "--#{key}=#{value}"
+           end
+         end
+
+         pandoc_args << "< [stdin: #{html_content.bytesize} bytes HTML]"
+
+         Jekyll.logger.info "Export [DRY RUN]:", pandoc_args.join(" ")
+         Jekyll.logger.info " Input:", "#{html_content.bytesize} bytes of processed HTML"
+         Jekyll.logger.info " Output:", output_file
+
+         if config['unicode_cleanup'] && format == :pdf
+           Jekyll.logger.info " Unicode:", "cleanup enabled (emoji/symbols stripped)"
+         end
+
+         cleanup_count = (config['title_cleanup'] || []).length
+         if cleanup_count > 0
+           Jekyll.logger.info " Cleanup:", "#{cleanup_count} title_cleanup patterns applied"
+         else
+           Jekyll.logger.info " Cleanup:", "none (clean HTML input)"
+         end
+       end
+     end
+   end
+ end
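
The checks in `validate_schema!` above can be exercised without Jekyll by pulling them into a pure function that returns a list of error strings. The following is a minimal sketch under that assumption, not the gem's code: `schema_errors`, `REQUIRED_SECTIONS`, `SIDEBAR_FIELDS`, and `ENTRY_FIELDS` are hypothetical names, and the sample YAML is invented for illustration. The section and field names mirror the ones checked in the diff.

```ruby
require 'yaml'
require 'date'

# Hypothetical constants mirroring the keys checked in validate_schema!.
REQUIRED_SECTIONS = %w[sidebar career-profile education experiences].freeze
SIDEBAR_FIELDS = %w[name tagline email].freeze
ENTRY_FIELDS = {
  'experiences' => ['Experience', %w[role company time]],
  'education' => ['Education', %w[degree university]]
}.freeze

# Hypothetical helper (not part of the gem): collects schema problems
# instead of logging them and exiting, so it can be unit-tested.
def schema_errors(data)
  errors = []

  REQUIRED_SECTIONS.each do |key|
    errors << "Missing required section: '#{key}'" unless data.key?(key)
  end

  sidebar = data['sidebar']
  if sidebar.is_a?(Hash)
    SIDEBAR_FIELDS.each do |field|
      errors << "Missing sidebar.#{field}" unless sidebar[field]
    end
  end

  ENTRY_FIELDS.each do |section, (label, fields)|
    section_data = data[section]
    entries = section_data.is_a?(Hash) ? section_data['info'] : nil
    next unless entries.is_a?(Array)

    entries.each_with_index do |entry, i|
      fields.each do |field|
        present = entry.is_a?(Hash) && entry[field]
        errors << "#{label} ##{i + 1} missing '#{field}'" unless present
      end
    end
  end

  errors
end

# Invented sample data: no career-profile, no tagline, no university.
sample = YAML.safe_load(<<~DATA, permitted_classes: [Date])
  sidebar:
    name: Jane Doe
    email: jane@example.com
  education:
    info:
      - degree: BSc
  experiences:
    info:
      - role: Engineer
        company: Acme
        time: 2020 - 2024
DATA

errors = schema_errors(sample)
puts errors
```

Returning the errors rather than calling `exit 1` keeps the validation testable; a caller such as the export command could still log the list and exit when it is non-empty.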
@@ -119,6 +119,11 @@ module Jekyll
    end

    def self.get_output_filename(item)
+     # Support custom export filenames via front matter
+     if item.respond_to?(:data) && item.data['export_filename']
+       return item.data['export_filename']
+     end
+
      if item.respond_to?(:basename)
        File.basename(item.basename, '.md')
  else
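
The precedence added in this hunk (front matter `export_filename` first, source basename otherwise) can be shown in isolation. This is a sketch under stated assumptions: `export_basename` is a hypothetical helper that takes a plain hash and a path instead of a Jekyll page, and it strips whatever extension the path has, whereas the hunk above only shows `.md` being stripped.

```ruby
# Hypothetical helper (not the gem's API): pick the export file name.
# A non-blank export_filename in the front matter wins; otherwise the
# source file's basename is used with its extension removed.
def export_basename(front_matter, source_path)
  custom = front_matter['export_filename'].to_s.strip
  return custom unless custom.empty?

  File.basename(source_path, File.extname(source_path))
end

puts export_basename({ 'export_filename' => 'McGarrah-Resume' }, 'print.html')
puts export_basename({}, 'pages/print.html')
```

Treating a blank `export_filename` as absent is a design choice of this sketch; the diff itself only tests the value for truthiness.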
@@ -230,7 +235,7 @@ module Jekyll

      generated_files << {
        type: 'Word Document (.docx)',
-       url: "#{site.baseurl}/#{filename}.docx"
+       url: build_download_url(site, config, filename, 'docx')
      }
      @stats&.record_conversion_success(:docx)
      log_message(config, "Generated #{filename}.docx")
@@ -272,7 +277,7 @@ module Jekyll

      generated_files << {
        type: 'PDF Document (.pdf)',
-       url: "#{site.baseurl}/#{filename}.pdf"
+       url: build_download_url(site, config, filename, 'pdf')
      }
      @stats&.record_conversion_success(:pdf)
      log_message(config, "Generated #{filename}.pdf")
@@ -320,6 +325,15 @@ module Jekyll
        Jekyll.logger.info "Pandoc Exports:", message
      end
    end
+
+   def self.build_download_url(site, config, filename, extension)
+     output_dir = config['output_dir'] || ''
+     if output_dir.empty?
+       "#{site.baseurl}/#{filename}.#{extension}"
+     else
+       "#{site.baseurl}/#{output_dir}/#{filename}.#{extension}"
+     end
+   end

    def self.log_error(config, message)
  Jekyll.logger.error "Pandoc Exports:", message
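
The `build_download_url` helper added in this hunk joins `baseurl`, `output_dir`, and the file name with literal slashes. A variation that also tolerates leading or trailing slashes in either setting might look like the sketch below. It is an illustration only: `download_url` is a hypothetical name, it takes plain strings rather than the `site` and `config` objects, and the slash normalization is an addition that the gem's helper does not perform.

```ruby
# Hypothetical variant of the URL logic (not the gem's helper).
# Empty or nil segments are dropped and stray slashes are trimmed,
# so "/blog/" + "/downloads/" does not produce "//" in the result.
def download_url(baseurl, output_dir, filename, extension)
  segments = [baseurl, output_dir]
             .map { |s| s.to_s.gsub(%r{\A/+|/+\z}, '') }
             .reject(&:empty?)
  '/' + (segments + ["#{filename}.#{extension}"]).join('/')
end

puts download_url('/blog', 'downloads', 'McGarrah-Resume', 'pdf')
puts download_url('', '', 'print', 'docx')
puts download_url('/blog/', '/downloads/', 'print', 'pdf')
```

With an empty `output_dir` this yields the same root-relative form as the pre-0.2.0 URLs, which is the case the changelog describes as previously working.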
@@ -1,5 +1,5 @@
  module Jekyll
    module PandocExports
-     VERSION = '0.1.14'
+     VERSION = '0.2.0'
    end
  end
@@ -1,2 +1,3 @@
  require "jekyll-pandoc-exports/version"
- require "jekyll-pandoc-exports/generator"
+ require "jekyll-pandoc-exports/generator"
+ require "jekyll-pandoc-exports/command"
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: jekyll-pandoc-exports
  version: !ruby/object:Gem::Version
-   version: 0.1.14
+   version: 0.2.0
  platform: ruby
  authors:
  - Michael McGarrah
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2026-04-09 00:00:00.000000000 Z
+ date: 2026-05-06 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: jekyll
@@ -16,14 +16,14 @@ dependencies:
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '3.0'
+         version: 4.3.2
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '3.0'
+         version: 4.3.2
  - !ruby/object:Gem::Dependency
    name: pandoc-ruby
    requirement: !ruby/object:Gem::Requirement
@@ -96,6 +96,7 @@ files:
  - bin/release
  - bin/reset-dev
  - lib/jekyll-pandoc-exports.rb
+ - lib/jekyll-pandoc-exports/command.rb
  - lib/jekyll-pandoc-exports/generator.rb
  - lib/jekyll-pandoc-exports/hooks.rb
  - lib/jekyll-pandoc-exports/statistics.rb
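
The version constraints this gemspec declares (`jekyll >= 4.3.2` and Ruby `>= 3.2.0`) can be evaluated with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The sketch below shows that comparison. The `CONSTRAINTS` table and the `satisfies?` helper are hypothetical names introduced for illustration, and the version strings passed in are examples rather than anything read from an installed environment.

```ruby
require 'rubygems'

# Constraints copied from the gemspec metadata in this diff.
CONSTRAINTS = {
  'ruby' => Gem::Requirement.new('>= 3.2.0'),
  'jekyll' => Gem::Requirement.new('>= 4.3.2')
}.freeze

# Hypothetical helper: does `version` meet the constraint for `name`?
def satisfies?(name, version)
  CONSTRAINTS.fetch(name).satisfied_by?(Gem::Version.new(version))
end

puts "running Ruby #{RUBY_VERSION} ok? #{satisfies?('ruby', RUBY_VERSION)}"
puts "jekyll 4.3.1 ok? #{satisfies?('jekyll', '4.3.1')}"
puts "jekyll 4.3.2 ok? #{satisfies?('jekyll', '4.3.2')}"
```

Bundler and `gem install` perform this check themselves, so a script like this is only useful for diagnosing why an install was refused.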
@@ -118,7 +119,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: 3.0.0
+       version: 3.2.0
  required_rubygems_version: !ruby/object:Gem::Requirement
    requirements:
    - - ">="