RubyGems - pdftoimage - Versions diffs - 0.2.1 → 0.3.1 - Mend

pdftoimage 0.2.1 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 7385d5aaa8f461f7214d25b9972e5b3cdfd528da1da7c7dfd18b61309e9dd010
-  data.tar.gz: ed6c231fa756f90c9330d9f68fcec469aa447b5446ded5074d711ddee9579d04
+  metadata.gz: 382ac5bc0e37e99acc44c9b7b40883ca8ca0a7b72595229e460fd4c0d4f6ee27
+  data.tar.gz: 6163c9afa9cc35a3bf6231af43eaf7ee1a5ae490c946b886f2904f63af2af9b6
 SHA512:
-  metadata.gz: 182bac990daff942767ca44b8cd56e3979fa39f334f565a2b7efabb8497b8be042b04e0abfafc735ba0f2023440732093c6bc77edb7ac12ed1ebbb8fc7287634
-  data.tar.gz: 37df49043986c6c02720dec32d076144c199cac1b6b8fcf87c895f4fb1131e297eb85b91f0df1915a7571561076641c74527d18d9d2b74267b813eec4de63d51
+  metadata.gz: 640b7bbd60b1a15db121c27c1d56b360dbb509c6cdce62d525ed3ea8aceeed3e67ee359eae9db350e0d031c06b178ad5723cc29990075b96f762ad075627c614
+  data.tar.gz: 76a9559c9e50853022d5ef8748f1a0733bd065e5a832e47d3cb5106f8825b9860b786d5c1d87edb495fa511f62efc55b41b089e947bae3fc39c7e3dcd6ef4599
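
The digests above can be checked against the archives inside a downloaded `.gem` file. The following is a minimal sketch of that comparison, not the registry's own procedure: `checksum_matches?` is a hypothetical helper, and the digest in the sample YAML is the well-known SHA256 of the string `abc`, used as a stand-in rather than one of the real values listed in this diff.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper: compare raw bytes against an expected hex digest.
def checksum_matches?(bytes, expected_hex, algorithm: Digest::SHA256)
  algorithm.hexdigest(bytes) == expected_hex.downcase
end

# Stand-in for a parsed checksums.yaml. The digest is SHA256("abc"),
# not a real gem checksum.
checksums = YAML.safe_load(<<~YAML)
  SHA256:
    data.tar.gz: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
YAML

expected = checksums.fetch('SHA256').fetch('data.tar.gz')

puts checksum_matches?('abc', expected)      # payload matches the digest
puts checksum_matches?('tampered', expected) # altered payload does not
```

For a real check, the bytes would come from the `data.tar.gz` and `metadata.gz` entries of the unpacked `.gem` archive, and the same helper could be called with `algorithm: Digest::SHA512` for the second block of digests.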

data/CHANGELOG.md ADDED

@@ -0,0 +1,75 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+The format is based on [Keep a Changelog](https://keepachangelog.com).
+## [0.3.1] - 2026-03-21
+### Added
+- `Image#crop(x, y, w, h)` for extracting a rectangular region from a PDF page
+## [0.3.0] - 2026-03-20
+### Added
+- `PDFToImage.open` now accepts IO objects in addition to file paths (#11)
+- `PDFToImage.from_blob` for opening PDFs from raw binary data (#11)
+- `Image#save` now accepts IO objects for output, enabling fully in-memory workflows (#11)
+- Support for opening PDFs from remote URLs (#16)
+### Changed
+- Replaced direct ImageMagick CLI calls with MiniMagick (#15)
+### Removed
+- Removed iconv dependency (#17)
+## [0.2.1] - 2025-04-08
+### Fixed
+- "Error determining page count" (#13)
+- Updated shellwords dependency to 0.2.0+ (#7)
+## [0.2.0] - 2023-04-20
+### Fixed
+- Use of deprecated `File.exists?` method (#4)
+- File paths are now escaped to properly handle spaces and special characters (#3)
+### Added
+- Specifying dpi resolution is now supported (#5)
+## [0.1.7] - 2018-05-01
+### Fixed
+- Updated yard to resolve a vulnerability
+## [0.1.6] - 2011-07-13
+### Fixed
+- Buggy PDF generators encoding CreationDate and ModDate as UTF-16 instead of ASCII, causing parsing errors
+## [0.1.5] - 2011-03-08
+### Fixed
+- poppler_utils no longer leaves off the extra padded zero
+## [0.1.4] - 2010-11-15
+### Fixed
+- Documents with page counts that are exact powers of 10 not parsing properly due to poppler_utils zero-padding behavior
+## [0.1.3] - 2010-11-12
+### Fixed
+- PDF documents with more than 9 pages not parsing properly (zero-padding issue)
+## [0.1.2] - 2010-11-11
+### Added
+- Support for blocks upon opening a PDF
+- `quality` method for JPEG/MIFF/PNG compression levels
+- Lazy conversion: PDF conversion is now deferred until saving, improving performance for partial conversions
+## [0.1.1] - 2010-11-10
+- Initial release

data/README.md ADDED

@@ -0,0 +1,91 @@
+# pdftoimage
+A Ruby gem for converting PDF documents into images using [poppler_utils](https://poppler.freedesktop.org/) and [MiniMagick](https://github.com/minimagick/minimagick).
+## Installation
+```sh
+gem install pdftoimage
+```
+### Requirements
+- [poppler_utils](https://poppler.freedesktop.org/)
+- [ImageMagick](https://imagemagick.org/)
+## Usage
+### From a file
+```ruby
+require 'pdftoimage'
+images = PDFToImage.open('somefile.pdf')
+images.each do |page|
+  page.resize('50%').save("output/page-#{page.page}.jpg")
+end
+```
+### With a block
+```ruby
+PDFToImage.open('report.pdf') do |page|
+  page.resize('150').quality('80%').save("out/thumbnail-#{page.page}.jpg")
+end
+```
+### From a URL
+```ruby
+pages = PDFToImage.open('https://example.com/report.pdf')
+pages[0].save('first_page.png')
+```
+### From an IO object
+```ruby
+File.open('report.pdf', 'rb') do |io|
+  pages = PDFToImage.open(io)
+  pages[0].save('first_page.png')
+end
+```
+### From binary data
+```ruby
+pdf_data = download_pdf_from_s3(key)
+pages = PDFToImage.from_blob(pdf_data)
+pages[0].save('first_page.png')
+```
+### Saving to an IO object
+```ruby
+pages = PDFToImage.open('report.pdf')
+io = StringIO.new(''.b)
+pages[0].save(io)
+io.rewind
+```
+### Cropping a region
+```ruby
+PDFToImage.open('report.pdf') do |page|
+  page.crop(0, 300, 100, 300).save("out/cropped-#{page.page}.jpg")
+end
+```
+### Setting resolution
+```ruby
+PDFToImage.open('report.pdf') do |page|
+  page.r(350).save("out/hires-#{page.page}.jpg")
+end
+```
+## License
+Copyright (c) 2026 Rob Flynn
+See [LICENSE](LICENSE) for details.
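
The README's `crop` and `r` calls end up as `pdftoppm` options: the `image.rb` change in this diff pushes `-x`, `-y`, `-W` and `-H` into the `pdftoppm` argument list. The sketch below shows roughly how such a command line could be assembled for one page. `pdftoppm_command` is a hypothetical helper, not part of the gem, and mapping `r(350)` to `-r 350` is an assumption based on pdftoppm's resolution flag.

```ruby
require 'shellwords'

# Hypothetical helper: build the pdftoppm argument list for a single page.
# crop is [x, y, width, height]; dpi maps to pdftoppm's -r flag (assumed).
def pdftoppm_command(pdf, out_prefix, page:, crop: nil, dpi: nil)
  cmd = ['pdftoppm', '-png', '-f', page.to_s, '-l', page.to_s]
  cmd += ['-r', dpi.to_s] if dpi
  if crop
    x, y, w, h = crop
    cmd += ['-x', x.to_s, '-y', y.to_s, '-W', w.to_s, '-H', h.to_s]
  end
  cmd + [pdf, out_prefix]
end

cmd = pdftoppm_command('my report.pdf', '/tmp/out', page: 2, crop: [0, 300, 100, 300])

# Shellwords.join escapes each argument, so the space in the file name survives.
puts Shellwords.join(cmd)
```

Keeping the command as an array until the last moment means each value is escaped individually, which is the same concern the gem addresses with `Shellwords.escape` on the file names.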

data/lib/pdftoimage/image.rb CHANGED

@@ -1,3 +1,5 @@
+require 'tempfile'
 module PDFToImage
     # A class which is instantiated by PDFToImage when a PDF document
     # is opened.
@@ -26,7 +28,7 @@ module PDFToImage
         CUSTOM_IMAGE_METHODS.each do |method|
             define_method(method.to_sym) do |*args|
-                @args << "-#{method} #{args.join(' ')}"
+                @args << [method, args]
                 self
             end
@@ -67,19 +69,19 @@ module PDFToImage
         # @param outname [String] The output filename of the image
         #
         def save(outname)
-            generate_temp_file
-            cmd = "convert "
-            if not @args.empty?
-                cmd += "#{@args.join(' ')} "
+            if outname.respond_to?(:write)
+                save_to_io(outname)
+            else
+                save_to_file(outname)
             end
-            cmd += "#{Shellwords.escape(@filename)} #{Shellwords.escape(outname)}"
+            return true
+        end
-            PDFToImage.exec(cmd)
+        def crop(x, y, w, h)
+            @pdf_args.push("-x #{x}", "-y #{y}", "-W #{w}", "-H #{h}")
-            return true
+            self
         end
         def <=>(img)
@@ -94,6 +96,29 @@ module PDFToImage
       private
+        def save_to_file(outname)
+            generate_temp_file
+            image = MiniMagick::Image.open(@filename)
+            @args.each do |method, args|
+                image.send(method, *args)
+            end
+            image.write(outname)
+        end
+        def save_to_io(io)
+            tempfile = Tempfile.new(['pdftoimage', '.png'])
+            tempfile.binmode
+            begin
+                save_to_file(tempfile.path)
+                tempfile.rewind
+                IO.copy_stream(tempfile, io)
+            ensure
+                tempfile.close
+                tempfile.unlink
+            end
+        end
         def generate_temp_file
             if @opened == false
                 cmd = "pdftoppm -png -f #{@page} #{@pdf_args.join(" ")} -l #{@page} #{Shellwords.escape(@pdf_name)} #{Shellwords.escape(@filename)}"
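
Two patterns in this file's changes are easier to see in isolation: operations are now recorded as `[method, args]` pairs and replayed on save, and `save` chooses its branch by checking whether the target responds to `write`. The sketch below illustrates both under simplifying assumptions. `FakeImage`, `DeferredOps` and `write_output` are hypothetical names, and `FakeImage` only logs calls in place of a real `MiniMagick::Image`.

```ruby
require 'stringio'

# Hypothetical stand-in for MiniMagick::Image: it only logs the calls it receives.
class FakeImage
  attr_reader :calls

  def initialize
    @calls = []
  end

  def resize(geometry)
    @calls << "resize #{geometry}"
  end

  def quality(value)
    @calls << "quality #{value}"
  end
end

# Records chained operations as [method, args] pairs and replays them later.
class DeferredOps
  def initialize
    @args = []
  end

  %i[resize quality].each do |method|
    define_method(method) do |*args|
      @args << [method, args]
      self # returning self keeps the calls chainable
    end
  end

  # Replay every recorded operation against the target object, in order.
  def apply_to(image)
    @args.each { |method, args| image.public_send(method, *args) }
    image
  end
end

# Same dispatch idea as save: anything that responds to #write is treated as an IO.
def write_output(target, bytes)
  if target.respond_to?(:write)
    target.write(bytes)
  else
    File.binwrite(target, bytes)
  end
  true
end

image = DeferredOps.new.resize('50%').quality('80').apply_to(FakeImage.new)
p image.calls

io = StringIO.new(''.b)
write_output(io, 'png-bytes')
p io.string
```

One consequence of duck-typing on `write` is that a `Pathname` also responds to it, so a caller passing a `Pathname` would take the IO branch. Converting to a string with `to_s` first avoids that ambiguity.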

data/lib/pdftoimage/version.rb CHANGED

@@ -1,4 +1,4 @@
 module PDFToImage
     # pdftoimage version
-    VERSION = "0.2.1"
+    VERSION = "0.3.1"
 end

data/lib/pdftoimage.rb CHANGED

@@ -2,8 +2,11 @@ require 'pdftoimage/version'
 require 'pdftoimage/image'
 require 'tmpdir'
-require 'iconv'
 require 'shellwords'
+require 'mini_magick'
+require 'open-uri'
+require 'uri'
+require 'stringio'
 module PDFToImage
     class PDFError < RuntimeError; end
@@ -20,20 +23,15 @@ module PDFToImage
         raise PDFToImage::PDFError, "poppler_utils not installed"
     end
-    begin
-        tmp = `identify -version 2>&1`
-        raise(PDFToImage::PDFError, "ImageMagick not installed") unless tmp.index('ImageMagick')
-    rescue Errno::ENOENT
-        raise PDFToImage::PDFError, "ImageMagick not installed"
-    end
     class << self
         # Opens a PDF document and prepares it for splitting into images.
         #
-        # @param filename [String] The filename of the PDF to open
+        # @param source [String, IO] A filename, URL, or IO object containing PDF data
         #
         # @return [Array] An array of images
-        def open(filename, &block)
+        def open(source, &block)
+            filename = resolve_source(source)
             if not File.exist?(filename)
                 raise PDFError, "File '#{filename}' not found."
             end
@@ -54,6 +52,16 @@ module PDFToImage
             return images
         end
+        # Opens a PDF from raw binary data.
+        #
+        # @param data [String] Binary string of PDF content
+        #
+        # @return [Array] An array of images
+        def from_blob(data, &block)
+            filename = write_to_tempfile(data)
+            open(filename, &block)
+        end
         # Executes the specified command, returning the output.
         #
         # @param cmd [String] The command to run
@@ -102,6 +110,41 @@ module PDFToImage
             return matches[1].to_i
         end
+        def resolve_source(source)
+            if source.respond_to?(:read)
+                write_to_tempfile(source.read)
+            elsif url?(source)
+                download_file(source)
+            else
+                source
+            end
+        end
+        def write_to_tempfile(data)
+            tempfile = File.join(@@pdf_temp_dir, "#{random_name}.pdf")
+            File.open(tempfile, 'wb') { |f| f.write(data) }
+            tempfile
+        end
+        def url?(filename)
+            uri = URI.parse(filename)
+            uri.is_a?(URI::HTTP) || uri.is_a?(URI::HTTPS)
+        rescue URI::InvalidURIError
+            false
+        end
+        def download_file(url)
+            tempfile = File.join(@@pdf_temp_dir, "#{random_name}.pdf")
+            remote = URI.open(url)
+            File.open(tempfile, 'wb') do |file|
+                file.write(remote.read)
+            end
+            remote.close
+            tempfile
+        rescue OpenURI::HTTPError, SocketError, Errno::ECONNREFUSED => e
+            raise PDFError, "Failed to download '#{url}': #{e.message}"
+        end
         # Generate a random file name in the system's tmp folder
         def random_filename
             File.join(@@pdf_temp_dir, random_name)
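
The new `resolve_source` decides between three kinds of input: an IO-like object, an HTTP(S) URL, or a local path. The sketch below reproduces only that decision, with no downloads or temp files. `source_kind` is a hypothetical helper. Since `URI::HTTPS` is a subclass of `URI::HTTP` in Ruby's standard library, a single `is_a?(URI::HTTP)` check covers both schemes here.

```ruby
require 'uri'
require 'stringio'

# Same shape as the url? helper in the diff. URI::HTTPS subclasses URI::HTTP,
# so one is_a? check is enough for both schemes.
def url?(source)
  URI.parse(source).is_a?(URI::HTTP)
rescue URI::InvalidURIError
  false
end

# Hypothetical classifier: reports which branch resolve_source would take.
def source_kind(source)
  if source.respond_to?(:read)
    :io
  elsif url?(source)
    :url
  else
    :path
  end
end

p source_kind(StringIO.new('%PDF-1.4'))           # IO-like object
p source_kind('https://example.com/report.pdf')  # remote URL
p source_kind('report.pdf')                      # plain relative path
p source_kind('my report.pdf')                   # not a valid URI, treated as a path
```

File names containing spaces are not valid URIs, so they reach the path branch only because `URI::InvalidURIError` is rescued.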

metadata CHANGED

@@ -1,43 +1,42 @@
 --- !ruby/object:Gem::Specification
 name: pdftoimage
 version: !ruby/object:Gem::Version
-  version: 0.2.1
+  version: 0.3.1
 platform: ruby
 authors:
 - Rob Flynn
-autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-04-08 00:00:00.000000000 Z
+date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name: iconv
+  name: shellwords
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: 0.2.2
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: 0.2.2
 - !ruby/object:Gem::Dependency
-  name: shellwords
+  name: mini_magick
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.2.2
+        version: '4.0'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.2.2
+        version: '4.0'
 description: A ruby gem for converting PDF documents into a series of images. This
   module is based off poppler_utils and ImageMagick.
 email: rob@thingerly.com
@@ -45,9 +44,9 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
-- ChangeLog.rdoc
+- CHANGELOG.md
 - LICENSE
-- README.rdoc
+- README.md
 - lib/pdftoimage.rb
 - lib/pdftoimage/image.rb
 - lib/pdftoimage/version.rb
@@ -55,9 +54,8 @@ homepage: https://github.com/robflynn/pdftoimage
 licenses:
 - MIT
 metadata:
-  changelog_uri: https://github.com/robflynn/pdftoimage/blob/master/ChangeLog.rdoc
+  changelog_uri: https://github.com/robflynn/pdftoimage/blob/main/CHANGELOG.md
   source_code_uri: https://github.com/robflynn/pdftoimage/
-post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -65,15 +63,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: '0'
+      version: '2.7'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.0.3.1
-signing_key:
+rubygems_version: 3.6.7
 specification_version: 4
 summary: A ruby gem for converting PDF documents into a series of images.
 test_files: []
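
The dependency entries above use the pessimistic operator `~>`, and `required_ruby_version` moves from `>= 0` to `>= 2.7`. As a rough illustration of which versions those constraints admit, the sketch below evaluates them with RubyGems' own `Gem::Requirement`. The version numbers tested are arbitrary examples, and `allowed?` is a hypothetical helper.

```ruby
require 'rubygems'

# Hypothetical helper: does the given version satisfy the constraint string?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?('~> 4.0', '4.13.2')   # mini_magick: any 4.x release is accepted
puts allowed?('~> 4.0', '5.0.0')    # the next major version is excluded
puts allowed?('~> 0.2.2', '0.2.9')  # shellwords: patch releases are accepted
puts allowed?('~> 0.2.2', '0.3.0')  # the next minor version is excluded
puts allowed?('>= 2.7', '2.6.10')   # Ruby older than 2.7 no longer qualifies
```

With two segments (`~> 4.0`) the last segment may grow, so the bound is below 5.0. With three segments (`~> 0.2.2`) only the patch level may grow, so the bound is below 0.3.0.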

data/ChangeLog.rdoc DELETED

@@ -1,28 +0,0 @@
-=== 0.2.0 / 2023-04-20
-* Fixed use of deprecated File.exists? method (pr#4 from Thornolf)
-* File paths are now escaped to properly handle spaces and special characters (pr#3 from drnic)
-* Specifying dpi resolution is now supported (pr#5 from lehf)
-=== 0.1.7 / 2018-05-01
-* Updated yard to resolve a vulnerability.
-=== 0.1.6 / 2011-07-13
-* Buggy PDF generators try to encode CreationDate and ModDate as UTF-16 as opposed to ASCII.  This leads to parsing errors where the code was assuming UTF-8 encoding was in use.
-=== 0.1.5 / 2011-03-08
-* Fixed a bug due to the fact that poppler_utils no longer leaves off the extra padded zero.
-=== 0.1.4 / 2010-11-15
-* Fixed a bug concerning documents with page counts that are exact powers of 10. poppler_utils prepends one less zero to the page counts when a document count is a power of 10. This is now fixed in PDFToImage.
-=== 0.1.3 / 2010-11-12
-* Fixed a problem where PDF documents with more than 9 pages were not parsing properly. (embarrassing 0 padding problem.)
-=== 0.1.2 / 2010-11-11
-* Added support for blocks upon opening a PDF
-* Image objects now support the "quality" method for JPEG/MIFF/PNG compression levels.
-* PDF conversion is now deferred until saving. This greatly speeds up the conversion process in cases where you only want a few pages out of a large document converted.
-=== 0.1.1 / 2010-11-10
-* Initial release:

data/README.rdoc DELETED

@@ -1,51 +0,0 @@
-= pdftoimage
-* {Homepage}[http://rubygems.org/gems/pdftoimage]
-== Description
-PDFToImage is a ruby gem which allows for conversion of a PDF document into
-images. It uses poppler_utils to first convert the document to PNG and then
-allows usage of ImageMagick to convert the image into other formats.
-The reasoning behind using poppler_utils is due to the fact that ghostscript
-occasionally has trouble with certain PDF documents which poppler_utils seems
-to be able to parse without error.
-== Examples
-  require 'pdftoimage'
-  images = PDFToImage.open('somefile.pdf')
-  images.each do |img|
-    img.resize('50%').save("output/page-#{img.page}.jpg")
-  end
-  require 'pdftoimage'
-  PDFToImage.open('anotherpdf.pdf') do |page|
-    page.resize('150').quality('80%').save('out/thumbnail-#{page.page}.jpg")
-  end
-  require 'pdftoimage'
-  PDFToImage.open('anotherpdf.pdf') do |page|
-    # Set the resolution to 350dpi
-    page.r(350).save('out/thumbnail-#{page.page}.jpg")
-  end
-== Requirements
-poppler_utils
-ImageMagick
-== Install
-  $ gem install pdftoimage
-== Copyright
-Copyright (c) 2023 Rob Flynn
-See LICENSE for details.