pdf_table_extractor 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: b35bb6b824eb69e6824b289b55794f355c5ae2e2ce342e8bb0130bfd269c1280
4
+ data.tar.gz: a42dba7aecb0f034a20738865c3ccf669bcb7c71c58ff787de543bf7bef7650e
5
+ SHA512:
6
+ metadata.gz: 15c8eb4460907aab704bb81cfcc869f4847be223f6709c0203f575a51b2dad4a22a74266eccc7b00d7f29747a88a3405eab13bf275f2313d62c1b71faeddea30
7
+ data.tar.gz: 06611570f328ff938b0183f23bc93546294421a2a7f85f83e3878b21aa1412fbbb931ffe4183570f4c9ffd6a3bafce14f711ec2ce6ed52a8868c55619a5aa141
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Marko Boskovic
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
22
+
data/README.md ADDED
@@ -0,0 +1,137 @@
1
+ # PDF Table Extractor
2
+
3
+ A Ruby gem for extracting tables from PDF files by analyzing text spacing and positions. It parses PDF pages, removes headers/footers and pagination if configured, splits lines into cells based on multiple-space runs, and merges rows into table-like structures.
4
+
5
+ - Source: https://github.com/jomb-ch/pdf_table_extractor
6
+ - License: MIT
7
+
8
+ ## Installation
9
+
10
+ Add this line to your application's Gemfile:
11
+
12
+ ```ruby
13
+ gem 'pdf_table_extractor'
14
+ ```
15
+
16
+ Then install:
17
+
18
+ ```bash
19
+ bundle install
20
+ ```
21
+
22
+ Or install it yourself:
23
+
24
+ ```bash
25
+ gem install pdf_table_extractor
26
+ ```
27
+
28
+ ## Usage
29
+
30
+ ### Basic usage
31
+
32
+ ```ruby
33
+ require 'pdf_table_extractor'
34
+
35
+ # Initialize with a PDF file path
36
+ extractor = PdfTableExtractor.new('path/to/file.pdf')
37
+
38
+ # Extract tables
39
+ extractor.extract_tables
40
+
41
+ # Get results: array of rows, each row is an array of cells
42
+ rows = extractor.result
43
+ # rows => [[{ text: 'Cell 1', position: 0 }, { text: 'Cell 2', position: 20 }], ...]
44
+ ```
45
+
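The result shape described above can be reduced to plain text rows with a short transformation. This is a minimal sketch: the sample data is made up and only mimics the documented `{ text:, position: }` cell shape, so it runs without a PDF.

```ruby
# Sample data in the shape documented for `extractor.result`; the values are invented.
rows = [
  [{ text: 'Name', position: 0 }, { text: 'Qty', position: 20 }],
  [{ text: 'Widget', position: 0 }, { text: '3', position: 20 }]
]

# Keep only the text of each cell, dropping the position metadata.
table = rows.map { |row| row.map { |cell| cell[:text] } }

# Print one tab-separated line per row.
table.each { |row| puts row.join("\t") }
```

The positions are still available on the original `rows` if column alignment matters for later processing.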
46
+ ### Advanced usage with options
47
+
48
+ ```ruby
49
+ extractor = PdfTableExtractor.new('path/to/file.pdf', options: {
50
+ remove_page_headers: true, # Remove common leading lines across pages (default: true)
51
+ remove_page_footers: true, # Remove common trailing lines across pages (default: true)
52
+ remove_pagination_from_header: false, # true or Integer (line number from top) to remove pagination; if true, tests first 5 lines (default: false)
53
+ remove_pagination_from_footer: false, # true or Integer (line number from bottom) to remove pagination; if true, tests last 5 lines (default: false)
54
+ remove_empty_lines: true, # Filter out empty lines (default: true)
55
+ position_tolerance: 2 # Tolerance for matching column positions, allowing indentation within columns (default: 2)
56
+ })
57
+
58
+ extractor.extract_tables
59
+ rows = extractor.result
60
+ ```
61
+
62
+ ### Using with PDF::Reader
63
+
64
+ ```ruby
65
+ require 'pdf-reader'
66
+ require 'pdf_table_extractor'
67
+
68
+ reader = PDF::Reader.new('path/to/file.pdf')
69
+ extractor = PdfTableExtractor.new(reader: reader)
70
+ extractor.extract_tables
71
+ rows = extractor.result
72
+ ```
73
+
74
+ ## How it works
75
+
76
+ The gem uses several heuristics to identify table-like structures:
77
+
78
+ - Text Positioning: splits lines into cells using multiple spaces as separators and tracks each cell's starting position
79
+ - Row Congruence: considers rows to belong to the same table if their cell positions match (or are a subset) within a given tolerance
80
+ - Header/Footer Removal: optionally removes common leading/trailing lines across pages
81
+ - Pagination Handling: optionally removes page numbers from headers/footers
82
+
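The text positioning heuristic can be sketched in isolation. The helper name `split_into_cells` is hypothetical and not part of the gem's public API; it assumes, as the description above states, that runs of two or more whitespace characters separate cells.

```ruby
# Hypothetical helper illustrating the splitting heuristic; not the gem's public API.
def split_into_cells(line)
  cells = []
  position = 0
  # The capture group keeps the whitespace runs in the split result,
  # so the running offset can account for them.
  line.split(/(\s{2,})/).each do |chunk|
    if chunk.match?(/\A\s{2,}\z/)
      position += chunk.length
    elsif !chunk.empty?
      cells << { text: chunk, position: position }
      position += chunk.length
    end
  end
  cells
end

# Build the line explicitly so the spacing is unambiguous.
line = 'Item' + ' ' * 8 + 'Qty' + ' ' * 3 + 'Price'
cells = split_into_cells(line)
cells.each { |cell| puts "#{cell[:position]}: #{cell[:text]}" }
```

Single spaces inside a cell (for example `Unit price`) do not split it, because only runs of two or more whitespace characters count as separators.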
83
+ ### Constraints
84
+
85
+ - Single-row tables are joined into a single cell before further processing
86
+ - The first row of a table cannot have empty cells
87
+ - Multi-cell rows followed by rows with fewer cells (but matching positions) are considered part of the same table
88
+ - A trailing row of a multi-cell table with content only in its first cell is treated as a new single-cell table if that content is longer than the position of the second column minus 2
89
+
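The last constraint is easier to see with numbers. The sketch below is an assumption-labeled illustration: `starts_new_single_cell_table?` and `second_column_position` are hypothetical names, and the predicate only restates the length rule from the list above.

```ruby
# Hypothetical predicate restating the rule: a first-cell-only row starts a new
# single-cell table when its text is longer than the second column's start minus 2.
def starts_new_single_cell_table?(text, second_column_position)
  text.length > second_column_position - 2
end

second_column_position = 20

# Short text still fits inside the first column, so it stays with the table.
puts starts_new_single_cell_table?('Widget', second_column_position)

# Long text would overflow into the second column, so it is treated as a new table.
puts starts_new_single_cell_table?('Total amount due on receipt', second_column_position)
```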
90
+ ## Development
91
+
92
+ Set up the project:
93
+
94
+ ```bash
95
+ bundle install
96
+ ```
97
+
98
+ Run tests:
99
+
100
+ ```bash
101
+ bundle exec rspec
102
+ ```
103
+
104
+ Run linters:
105
+
106
+ ```bash
107
+ bundle exec standardrb
108
+ bundle exec rubocop --parallel
109
+ ```
110
+
111
+ Generate API docs (YARD):
112
+
113
+ ```bash
114
+ bundle exec yard doc
115
+ open doc/index.html
116
+ ```
117
+
118
+ Release a new version:
119
+
120
+ 1. Update the version number in `lib/pdf_table_extractor/version.rb`
121
+ 2. Build and release:
122
+
123
+ ```bash
124
+ bundle exec rake release
125
+ ```
126
+
127
+ ## CI
128
+
129
+ A GitHub Actions workflow runs StandardRB, RuboCop, and RSpec and generates YARD docs on pushes and pull requests. The docs are uploaded as an artifact.
130
+
131
+ ## Contributing
132
+
133
+ Bug reports and pull requests are welcome on GitHub at https://github.com/jomb-ch/pdf_table_extractor
134
+
135
+ ## License
136
+
137
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/lib/pdf_table_extractor/pdf_table_extractor.rb ADDED
@@ -0,0 +1,201 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'pdf-reader'
4
+
5
+ # PdfTableExtractor extracts tables from PDF text using spacing and position heuristics.
6
+ #
7
+ # @!attribute [r] rows
8
+ # @return [Array<PdfTableExtractorRow>] raw rows parsed from text
9
+ # @!attribute [r] merged_rows
10
+ # @return [Array<PdfTableExtractorRow>] rows merged into tables
11
+ # @!attribute [r] options
12
+ # @return [Hash] configuration options
13
+ class PdfTableExtractor
14
+ attr_reader :rows, :merged_rows, :options
15
+
16
+ # @param pdf_path [String, nil] Path to the PDF file (optional when reader is provided)
17
+ # @param reader [PDF::Reader, nil] Pre-initialized PDF::Reader instance
18
+ # @param options [Hash] Configuration options
19
+ # @option options [Boolean] :remove_page_headers (true) Remove common leading lines across pages
20
+ # @option options [Boolean] :remove_page_footers (true) Remove common trailing lines across pages
21
+ # @option options [Boolean, Integer] :remove_pagination_from_header (false) true or line number from top
22
+ # @option options [Boolean, Integer] :remove_pagination_from_footer (false) true or line number from bottom
23
+ # @option options [Boolean] :remove_empty_lines (true) Remove empty lines from the extracted text
24
+ # @option options [Integer] :position_tolerance (2) Tolerance for matching column positions
25
+ def initialize(pdf_path = nil, reader: nil, options: {})
26
+ @reader = reader || PDF::Reader.new(pdf_path)
27
+ @options = options.dup
28
+ @options[:remove_page_headers] = true unless @options.key?(:remove_page_headers)
29
+ @options[:remove_page_footers] = true unless @options.key?(:remove_page_footers)
30
+ @options[:remove_pagination_from_header] = false unless @options.key?(:remove_pagination_from_header)
31
+ @options[:remove_pagination_from_footer] = false unless @options.key?(:remove_pagination_from_footer)
32
+ @options[:remove_empty_lines] = true unless @options.key?(:remove_empty_lines)
33
+ @options[:position_tolerance] = 2 unless @options.key?(:position_tolerance)
34
+ @merged_rows = []
35
+ end
36
+
37
+ # Extracts tables from the PDF and stores them in @merged_rows
38
+ # @return [void]
39
+ def extract_tables
40
+ pages = all_pages
41
+
42
+ pages = remove_pagination(pages) if @options[:remove_pagination_from_header] || @options[:remove_pagination_from_footer]
43
+
44
+ pages = remove_common_leading_lines(pages) if @options[:remove_page_headers]
45
+ pages = remove_common_trailing_lines(pages) if @options[:remove_page_footers]
46
+
47
+ lines = pages&.flatten
48
+ lines = lines&.reject { |l| l.strip.empty? } if @options[:remove_empty_lines]
49
+
50
+ @rows = lines&.map&.with_index do |line, index|
51
+ cells, positions = parse_line_to_cells(line)
52
+ PdfTableExtractorRow.new(self, cells, positions, index)
53
+ end
54
+ process_rows
55
+ end
56
+
57
+ # Returns the extracted cells as arrays per merged row.
58
+ # @return [Array<Array<Hash>>] Array of rows, each row is an array of cells with :text and :position
59
+ def result
60
+ @merged_rows.map(&:cells)
61
+ end
62
+
63
+ private
64
+
65
+ # @return [Array<Array<String>>] Array of pages, each page is an array of lines
66
+ def all_pages
67
+ all_pages_texts.map { |page_text| page_text.lines.map(&:chomp) }
68
+ end
69
+
70
+ # @return [Array<String>] Array of page texts
71
+ def all_pages_texts
72
+ @reader.pages.map { |page| page.text }
73
+ end
74
+
75
+ # Remove pagination lines from headers and footers.
76
+ # @param pages [Array<Array<String>>]
77
+ # @return [Array<Array<String>>]
78
+ def remove_pagination(pages)
79
+ return pages if pages.empty? || pages.length == 1
80
+
81
+ if @options[:remove_pagination_from_header]
82
+ if @options[:remove_pagination_from_header].is_a?(Integer)
83
+ index = @options[:remove_pagination_from_header] - 1
84
+ pages.each do |lines|
85
+ if lines.length > index && is_pagination?(lines[index])
86
+ lines.delete_at(index)
87
+ end
88
+ end
89
+ else
90
+ (1..5).each do |index|
91
+ if pages.all? { |lines| lines.length > index && is_pagination?(lines[index - 1]) }
92
+ pages.each { |lines| lines.delete_at(index - 1) }
93
+ break
94
+ end
95
+ end
96
+ end
97
+ end
98
+
99
+ if @options[:remove_pagination_from_footer]
100
+ if @options[:remove_pagination_from_footer].is_a?(Integer)
101
+ index = @options[:remove_pagination_from_footer]
102
+ pages.each do |lines|
103
+ if lines.length > index && is_pagination?(lines[-index])
104
+ lines.delete_at(-index)
105
+ end
106
+ end
107
+ else
108
+ (1..5).each do |index|
109
+ if pages.all? { |lines| lines.length > index && is_pagination?(lines[-index]) }
110
+ pages.each { |lines| lines.delete_at(-index) }
111
+ break
112
+ end
113
+ end
114
+ end
115
+ end
116
+ pages
117
+ end
118
+
119
+ # Remove common leading lines across pages.
120
+ # @param pages [Array<Array<String>>]
121
+ # @return [Array<Array<String>>]
122
+ def remove_common_leading_lines(pages)
123
+ return pages if pages.empty? || pages.length == 1
124
+ pages.each(&:shift) while same_leading_line?(pages)
125
+ pages
126
+ end
127
+
128
+ # Remove common trailing lines across pages.
129
+ # @param pages [Array<Array<String>>]
130
+ # @return [Array<Array<String>>]
131
+ def remove_common_trailing_lines(pages)
132
+ return pages if pages.empty? || pages.length == 1
133
+ pages.each(&:pop) while same_trailing_line?(pages)
134
+ pages
135
+ end
136
+
137
+ # Parse a line into cells using runs of multiple spaces as separators.
138
+ # @param line [String]
139
+ # @return [Array<(Array<Hash>, Array<Integer>)>] cells and positions
140
+ def parse_line_to_cells(line)
141
+ if has_consecutive_spaces?(line)
142
+ cells = []
143
+ position = 0
144
+ positions = []
145
+
146
+ line.split(/(\s{2,})/).each do |text|
147
+ if has_consecutive_spaces?(text)
148
+ position += text.length
149
+ elsif !text.empty?
150
+ cells << {text:, position:}
151
+ positions << position
152
+ position += text.length
153
+ end
154
+ end
155
+
156
+ [cells, positions]
157
+ else
158
+ [[{text: line.strip, position: 0}], [0]]
159
+ end
160
+ end
161
+
162
+ # Process parsed rows into merged table rows.
163
+ # @return [void]
164
+ def process_rows
165
+ @merged_rows = []
166
+
167
+ @rows.each do |row|
168
+ row.transform_to_single_cell! if row.incongruent_with_neighbours?
169
+
170
+ if @merged_rows.empty? || !row.congruent_with_last_merged?
171
+ @merged_rows << PdfTableExtractorRow.new(self, row.cells, row.positions, nil, true)
172
+ else
173
+ @merged_rows.last.merge!(row)
174
+ end
175
+ end
176
+ end
177
+
178
+ # @param pages [Array<Array<String>>]
179
+ # @return [Boolean]
180
+ def same_leading_line?(pages)
181
+ pages.all? { |lines| lines.any? } && pages.all? { |lines| lines[0] == pages[0][0] }
182
+ end
183
+
184
+ # @param pages [Array<Array<String>>]
185
+ # @return [Boolean]
186
+ def same_trailing_line?(pages)
187
+ pages.all? { |lines| lines.any? } && pages.all? { |lines| lines[-1] == pages[0][-1] }
188
+ end
189
+
190
+ # @param text [String]
191
+ # @return [Boolean]
192
+ def has_consecutive_spaces?(text)
193
+ text.match?(/\s{2,}/)
194
+ end
195
+
196
+ # @param line [String, nil]
197
+ # @return [Boolean]
198
+ def is_pagination?(line)
199
+ line&.strip&.match?(/^.*\d+$/)
200
+ end
201
+ end
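The header stripping performed by `remove_common_leading_lines` can be tried without a PDF. The following is a minimal sketch under stated assumptions: `strip_common_leading_lines` is a hypothetical standalone helper, the page data is invented, and unlike the method above it works on copies rather than mutating its input.

```ruby
# Hypothetical standalone helper; operates on copies so the caller's arrays are untouched.
def strip_common_leading_lines(pages)
  pages = pages.map(&:dup)
  return pages if pages.length < 2

  # Drop the first line of every page while all pages are non-empty and agree on it.
  while pages.all?(&:any?) && pages.all? { |lines| lines.first == pages.first.first }
    pages.each(&:shift)
  end
  pages
end

pages = [
  ['ACME Corp', 'Report 2025', 'Name   Qty'],
  ['ACME Corp', 'Report 2025', 'Bolt   7']
]

stripped = strip_common_leading_lines(pages)
stripped.each { |lines| puts lines.inspect }
```

Footer stripping follows the same pattern with `last` and `pop` in place of `first` and `shift`.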
data/lib/pdf_table_extractor/pdf_table_extractor_row.rb ADDED
@@ -0,0 +1,106 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Represents a parsed row of cells with positions and supports merging/grouping.
4
+ #
5
+ # @!attribute [r] positions
6
+ # @return [Array<Integer>] positions of the cells in this row
7
+ # @!attribute [r] index
8
+ # @return [Integer, nil] original index when parsed (nil for merged rows)
9
+ # @!attribute [r] cells
10
+ # @return [Array<Hash>] cells with :text and :position
11
+ # @!attribute [r] merged
12
+ # @return [Boolean] whether this row is a merged row (not original)
13
+ class PdfTableExtractorRow
14
+ attr_reader :positions, :index, :cells, :merged
15
+
16
+ def initialize(extractor, cells, positions, index, merged = false)
17
+ @extractor = extractor
18
+ @cells = cells
19
+ @index = index
20
+ @merged = merged
21
+ @positions = positions
22
+ end
23
+
24
+ # Whether this row is a single-cell row at position 0.
25
+ # @param against [Symbol] :previous or :last_merged
26
+ # @return [Boolean]
27
+ def single_cell?(against = :previous)
28
+ other_row = (against == :previous) ? prev : last_merged unless @merged
29
+ second_pos = other_row&.positions&.[](1).to_i
30
+ @positions == [0] && (
31
+ @merged || @index == 0 || other_row.single_cell? || @cells.first[:text].length > second_pos - 2
32
+ )
33
+ end
34
+
35
+ # Previous row in extractor.
36
+ # @return [PdfTableExtractorRow, nil]
37
+ def prev
38
+ return nil if @index.zero?
39
+
40
+ @extractor.rows[@index - 1]
41
+ end
42
+
43
+ # Next row in extractor.
44
+ # @return [PdfTableExtractorRow, nil]
45
+ def nxt
46
+ return nil if @index == @extractor.rows.length - 1
47
+
48
+ @extractor.rows[@index + 1]
49
+ end
50
+
51
+ # @return [Boolean] whether positions are congruent with previous row
52
+ def congruent_with_previous?
53
+ single_cell? == prev&.single_cell? && positions_match_with?(prev)
54
+ end
55
+
56
+ # @return [Boolean] whether positions are congruent with last merged row
57
+ def congruent_with_last_merged?
58
+ single_cell?(:last_merged) == last_merged&.single_cell? && positions_match_with?(last_merged)
59
+ end
60
+
61
+ # @return [Boolean] whether this row is incongruent relative to neighbours
62
+ def incongruent_with_neighbours?
63
+ !single_cell? && !(prev && congruent_with_previous?) && !nxt&.congruent_with_previous?
64
+ end
65
+
66
+ # Check if positions match with another row (within tolerance).
67
+ # @param other_row [PdfTableExtractorRow, nil]
68
+ # @return [Boolean]
69
+ def positions_match_with?(other_row)
70
+ @positions.each do |pos|
71
+ if @extractor.options[:position_tolerance].zero?
72
+ return false unless other_row&.positions.to_a.include?(pos)
73
+ elsif ((pos..pos + @extractor.options[:position_tolerance]).to_a - other_row&.positions.to_a).length == @extractor.options[:position_tolerance] + 1
74
+ return false
75
+ end
76
+ end
77
+ true
78
+ end
79
+
80
+ # Transform this row into a single-cell row by merging text.
81
+ # @return [void]
82
+ def transform_to_single_cell!
83
+ @cells = [{
84
+ text: @cells.sort_by { |c| c[:position] }.map { |c| c[:text] }.join(" ").gsub(/\s+/, " ").strip,
85
+ position: 0
86
+ }]
87
+ @positions = [0]
88
+ end
89
+
90
+ # Merge text into matching cell positions from another row.
91
+ # @param row [PdfTableExtractorRow]
92
+ # @return [void]
93
+ def merge!(row)
94
+ @cells.each do |cell|
95
+ r_cell = row.cells.find { |c| c[:position] == cell[:position] }
96
+ cell[:text] += " #{r_cell[:text]}" if r_cell
97
+ end
98
+ end
99
+
100
+ private
101
+
102
+ # @return [PdfTableExtractorRow, nil]
103
+ def last_merged
104
+ @extractor.merged_rows.last
105
+ end
106
+ end
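The tolerance rule in `positions_match_with?` can be illustrated with plain arrays. This is a sketch, not the gem's code: `positions_match?` is a hypothetical helper that assumes the intended behaviour is a one-sided window, where each position must find a partner between `pos` and `pos + tolerance` in the other row.

```ruby
# Hypothetical standalone version of the matching rule; names are illustrative.
def positions_match?(positions, other_positions, tolerance: 2)
  positions.all? do |pos|
    # (a..b).to_a expands to the concrete offsets; [a..b] would be a
    # one-element array holding a Range, which never equals an Integer.
    candidates = (pos..pos + tolerance).to_a
    candidates.any? { |candidate| other_positions.include?(candidate) }
  end
end

puts positions_match?([0, 20], [0, 22, 40]) # 22 lies within 20..22
puts positions_match?([0, 20], [0, 23])     # 23 lies outside 20..22
puts positions_match?([0], [0, 20])         # a subset of the other row's columns
```

Because the window only extends to the right of `pos`, the check is not symmetric: swapping the two rows can change the result.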
data/lib/pdf_table_extractor/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ class PdfTableExtractor
4
+ VERSION = '0.1.0'.freeze
5
+ end
data/lib/pdf_table_extractor.rb ADDED
@@ -0,0 +1,6 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative 'pdf_table_extractor/version'
4
+ require_relative 'pdf_table_extractor/pdf_table_extractor'
5
+ require_relative 'pdf_table_extractor/pdf_table_extractor_row'
6
+
metadata ADDED
@@ -0,0 +1,133 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: pdf_table_extractor
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Marko Boskovic
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: pdf-reader
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - "~>"
17
+ - !ruby/object:Gem::Version
18
+ version: '2.8'
19
+ type: :runtime
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - "~>"
24
+ - !ruby/object:Gem::Version
25
+ version: '2.8'
26
+ - !ruby/object:Gem::Dependency
27
+ name: rake
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '13.0'
33
+ type: :development
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - "~>"
38
+ - !ruby/object:Gem::Version
39
+ version: '13.0'
40
+ - !ruby/object:Gem::Dependency
41
+ name: rspec
42
+ requirement: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - "~>"
45
+ - !ruby/object:Gem::Version
46
+ version: '3.0'
47
+ type: :development
48
+ prerelease: false
49
+ version_requirements: !ruby/object:Gem::Requirement
50
+ requirements:
51
+ - - "~>"
52
+ - !ruby/object:Gem::Version
53
+ version: '3.0'
54
+ - !ruby/object:Gem::Dependency
55
+ name: rubocop
56
+ requirement: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - "~>"
59
+ - !ruby/object:Gem::Version
60
+ version: '1.64'
61
+ type: :development
62
+ prerelease: false
63
+ version_requirements: !ruby/object:Gem::Requirement
64
+ requirements:
65
+ - - "~>"
66
+ - !ruby/object:Gem::Version
67
+ version: '1.64'
68
+ - !ruby/object:Gem::Dependency
69
+ name: standard
70
+ requirement: !ruby/object:Gem::Requirement
71
+ requirements:
72
+ - - "~>"
73
+ - !ruby/object:Gem::Version
74
+ version: '1.36'
75
+ type: :development
76
+ prerelease: false
77
+ version_requirements: !ruby/object:Gem::Requirement
78
+ requirements:
79
+ - - "~>"
80
+ - !ruby/object:Gem::Version
81
+ version: '1.36'
82
+ - !ruby/object:Gem::Dependency
83
+ name: yard
84
+ requirement: !ruby/object:Gem::Requirement
85
+ requirements:
86
+ - - "~>"
87
+ - !ruby/object:Gem::Version
88
+ version: '0.9'
89
+ type: :development
90
+ prerelease: false
91
+ version_requirements: !ruby/object:Gem::Requirement
92
+ requirements:
93
+ - - "~>"
94
+ - !ruby/object:Gem::Version
95
+ version: '0.9'
96
+ description: Extracts tables from PDF text using spacing and position heuristics.
97
+ email:
98
+ - marko@jomb.ch
99
+ executables: []
100
+ extensions: []
101
+ extra_rdoc_files: []
102
+ files:
103
+ - LICENSE
104
+ - README.md
105
+ - lib/pdf_table_extractor.rb
106
+ - lib/pdf_table_extractor/pdf_table_extractor.rb
107
+ - lib/pdf_table_extractor/pdf_table_extractor_row.rb
108
+ - lib/pdf_table_extractor/version.rb
109
+ homepage: https://github.com/jomb-ch/pdf_table_extractor
110
+ licenses:
111
+ - MIT
112
+ metadata:
113
+ rubygems_mfa_required: 'true'
114
+ source_code_uri: https://github.com/jomb-ch/pdf_table_extractor
115
+ changelog_uri: https://github.com/jomb-ch/pdf_table_extractor/releases
116
+ rdoc_options: []
117
+ require_paths:
118
+ - lib
119
+ required_ruby_version: !ruby/object:Gem::Requirement
120
+ requirements:
121
+ - - ">="
122
+ - !ruby/object:Gem::Version
123
+ version: '3.1'
124
+ required_rubygems_version: !ruby/object:Gem::Requirement
125
+ requirements:
126
+ - - ">="
127
+ - !ruby/object:Gem::Version
128
+ version: '0'
129
+ requirements: []
130
+ rubygems_version: 3.7.1
131
+ specification_version: 4
132
+ summary: PDF table extractor
133
+ test_files: []