xsv 0.3.18 → 1.0.2
- checksums.yaml +4 -4
- data/.travis.yml +3 -6
- data/CHANGELOG.md +27 -0
- data/README.md +16 -9
- data/lib/xsv.rb +13 -12
- data/lib/xsv/helpers.rb +51 -67
- data/lib/xsv/relationships_handler.rb +7 -24
- data/lib/xsv/sax_parser.rb +86 -0
- data/lib/xsv/shared_strings_parser.rb +20 -12
- data/lib/xsv/sheet.rb +10 -16
- data/lib/xsv/sheet_bounds_handler.rb +18 -28
- data/lib/xsv/sheet_rows_handler.rb +56 -72
- data/lib/xsv/sheets_ids_handler.rb +7 -40
- data/lib/xsv/styles_handler.rb +18 -33
- data/lib/xsv/version.rb +2 -1
- data/lib/xsv/workbook.rb +48 -36
- data/xsv.gemspec +2 -3
- metadata +10 -23
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6433d19ba65e00b98adecde398a3bfbfe9583ac811290d8f8ffa5ada16ea9b70
+  data.tar.gz: 903c61f9485f1380a064cc8ccfdfa378a46df1383259e6c82ca21501dcc485a5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e79c5866cd7dd20f0c36ea46a341234bcbc8cdbed40492c8bab31543fe96fcd4bd242a6c63368841e7ad481e4f72187eca5682effdb445388256e83969e1a969
+  data.tar.gz: 474da420765253453a3a1e9a25e8a2582052600bfdaee3e0f246db9d6b415a2a2ae46bd9c2edbb5364efb99708061125f189fd608cab5122bfc2090dd183004d
data/.travis.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,32 @@
 # Xsv Changelog
 
+## 1.0.2 2021-05-01
+
+- Ignore phonetic shared string data (thanks @sinoue-1003)
+- Throw ArgumentError when `Workbook.new` is called unintentionally
+
+## 1.0.1 2021-03-18
+
+- Allow passing a block to Workbook.open
+- `parse_headers!` returns self to allow chaining (thanks @senhalil)
+
+## 1.0.0 2021-01-26
+
+- Xsv no longer depends on native extensions, thanks to a pure-Ruby XML parser
+
+## 1.0.0.pre.2 2021-01-22
+
+- Reduce allocations in XML parser
+- Return strings with the correct encoding
+- Handle XML entities
+
+## 1.0.0.pre 2021-01-18
+
+- Switch to a minimalistic XML parser in native Ruby (#21)
+- Ruby 3.0 compatibility
+- Various internal cleanup and optimization
+- API is backwards compatible with 0.3.x
+
 ## 0.3.18 2020-09-30
 
 - Improve inline string support (#18)
data/README.md
CHANGED
@@ -2,8 +2,9 @@
|
|
2
2
|
|
3
3
|
[![Travis CI](https://img.shields.io/travis/martijn/xsv/master)](https://travis-ci.org/martijn/xsv)
|
4
4
|
[![Yard Docs](http://img.shields.io/badge/yard-docs-blue.svg)](https://rubydoc.info/github/martijn/xsv)
|
5
|
+
[![Gem Version](https://badge.fury.io/rb/xsv.svg)](https://badge.fury.io/rb/xsv)
|
5
6
|
|
6
|
-
Xsv is a fast, lightweight parser for Office Open XML spreadsheet files
|
7
|
+
Xsv is a fast, lightweight, pure Ruby parser for Office Open XML spreadsheet files
|
7
8
|
(commonly known as Excel or .xlsx files). It strives to be minimal in the
|
8
9
|
sense that it provides nothing a CSV reader wouldn't, meaning it only
|
9
10
|
deals with minimal formatting and cannot create or modify documents.
|
@@ -33,7 +34,10 @@ Or install it yourself as:
|
|
33
34
|
|
34
35
|
$ gem install xsv
|
35
36
|
|
36
|
-
Xsv targets ruby
|
37
|
+
Xsv targets ruby >= 2.5 and has a just single dependency, `rubyzip`. It has been
|
38
|
+
tested successfully with MRI, JRuby, and TruffleRuby. Due to the lack of
|
39
|
+
native extensions should work well in multi-threaded environments or in Ractor
|
40
|
+
when that becomes stable.
|
37
41
|
|
38
42
|
## Usage
|
39
43
|
|
@@ -76,15 +80,16 @@ end
|
|
76
80
|
sheet[1] # => {"header1" => "value1", "header2" => "value2"}
|
77
81
|
```
|
78
82
|
|
79
|
-
Be aware that hash mode will lead to unpredictable results if
|
80
|
-
columns with the same
|
83
|
+
Be aware that hash mode will lead to unpredictable results if the worksheet
|
84
|
+
has multiple columns with the same header.
|
81
85
|
|
82
|
-
`Xsv::Workbook.open` accepts a filename, or
|
86
|
+
`Xsv::Workbook.open` accepts a filename, or an IO or String containing a workbook. Optionally, you can pass a block
|
87
|
+
which will be called with the workbook as parameter, like `File#open`.
|
83
88
|
|
84
89
|
`Xsv::Sheet` implements `Enumerable` so you can call methods like `#first`,
|
85
|
-
`#filter`/`#select
|
90
|
+
`#filter`/`#select`, and `#map` on it.
|
86
91
|
|
87
|
-
The sheets
|
92
|
+
The sheets can be accessed by index or by name:
|
88
93
|
|
89
94
|
```ruby
|
90
95
|
x = Xsv::Workbook.open("sheet.xlsx")
|
@@ -94,7 +99,7 @@ sheet = x.sheets[0] # gets sheet by index
|
|
94
99
|
sheet = x.sheets_by_name('Name').first # gets sheet by name
|
95
100
|
```
|
96
101
|
|
97
|
-
To get all the
|
102
|
+
To get all the sheets names:
|
98
103
|
|
99
104
|
```ruby
|
100
105
|
sheet_names = x.sheets.map(&:name)
|
@@ -129,9 +134,11 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
|
|
129
134
|
Xsv is faster and more memory efficient than other gems because of two things: it only _reads values_ from Excel files and it's based on a SAX-based parser instead of a DOM-based parser. If you want to read some background on this, check out my blog post on
|
130
135
|
[Efficient XML parsing in Ruby](https://storck.io/posts/efficient-xml-parsing-in-ruby/).
|
131
136
|
|
132
|
-
Jamie Schembri did a shootout of Xsv against various other Excel reading gems comparing parsing speed, memory usage and allocations.
|
137
|
+
Jamie Schembri did a shootout of Xsv against various other Excel reading gems comparing parsing speed, memory usage, and allocations.
|
133
138
|
Check our his blog post: [Faster Excel parsing in Ruby](https://blog.schembri.me/post/faster-excel-parsing-in-ruby/).
|
134
139
|
|
140
|
+
Pre-1.0, Xsv used a native extension for XML parsing, which was faster than the native Ruby one (on MRI). But even with the native Ruby version generally Xsv still outperforms other Ruby parsing gems.
|
141
|
+
|
135
142
|
## Contributing
|
136
143
|
|
137
144
|
Bug reports and pull requests are welcome on GitHub at https://github.com/martijn/xsv.
|
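The README's warning about duplicate headers in hash mode follows from how a Ruby `Hash` behaves: a later column with the same header replaces the earlier value. The sketch below uses plain Ruby only; `row_to_hash` is a hypothetical helper written for illustration and is not part of Xsv.

```ruby
# Hypothetical helper: pair header names with cell values, roughly the way
# a hash-mode row is conceptually assembled. Not part of Xsv's API.
def row_to_hash(headers, cells)
  headers.zip(cells).each_with_object({}) do |(header, cell), row|
    row[header] = cell # a repeated header silently replaces the earlier value
  end
end

headers = %w[name amount amount]
cells   = ['Widget', 10, 25]

row = row_to_hash(headers, cells)
puts row.keys.inspect # only one "amount" key survives
```

If a worksheet is known to repeat header names, staying in array mode and indexing by position avoids the collision.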
data/lib/xsv.rb
CHANGED
@@ -1,17 +1,18 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
-
require "date"
|
3
|
-
require "ox"
|
4
2
|
|
5
|
-
require
|
6
|
-
|
7
|
-
require
|
8
|
-
require
|
9
|
-
require
|
10
|
-
require
|
11
|
-
require
|
12
|
-
require
|
13
|
-
require
|
14
|
-
require
|
3
|
+
require 'date'
|
4
|
+
|
5
|
+
require 'xsv/helpers'
|
6
|
+
require 'xsv/sax_parser'
|
7
|
+
require 'xsv/relationships_handler'
|
8
|
+
require 'xsv/shared_strings_parser'
|
9
|
+
require 'xsv/sheet'
|
10
|
+
require 'xsv/sheet_bounds_handler'
|
11
|
+
require 'xsv/sheet_rows_handler'
|
12
|
+
require 'xsv/sheets_ids_handler'
|
13
|
+
require 'xsv/styles_handler'
|
14
|
+
require 'xsv/version'
|
15
|
+
require 'xsv/workbook'
|
15
16
|
|
16
17
|
# XSV is a fast, lightweight parser for Office Open XML spreadsheet files
|
17
18
|
# (commonly known as Excel or .xlsx files). It strives to be minimal in the
|
data/lib/xsv/helpers.rb
CHANGED
@@ -1,52 +1,54 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
module Helpers
|
4
5
|
# The default OOXML Spreadheet number formats according to the ECMA standard
|
5
6
|
# User formats are appended from index 174 onward
|
6
7
|
BUILT_IN_NUMBER_FORMATS = {
|
7
|
-
1 =>
|
8
|
-
2 =>
|
9
|
-
3 =>
|
10
|
-
4 =>
|
11
|
-
5 =>
|
12
|
-
6 =>
|
13
|
-
7 =>
|
14
|
-
8 =>
|
15
|
-
9 =>
|
16
|
-
10 =>
|
17
|
-
11 =>
|
18
|
-
12 =>
|
19
|
-
13 =>
|
20
|
-
14 =>
|
21
|
-
15 =>
|
22
|
-
16 =>
|
23
|
-
17 =>
|
24
|
-
18 =>
|
25
|
-
19 =>
|
26
|
-
20 =>
|
27
|
-
21 =>
|
28
|
-
22 =>
|
29
|
-
37 =>
|
30
|
-
38 =>
|
31
|
-
39 =>
|
32
|
-
40 =>
|
33
|
-
45 =>
|
34
|
-
46 =>
|
35
|
-
47 =>
|
36
|
-
48 =>
|
37
|
-
49 =>
|
8
|
+
1 => '0',
|
9
|
+
2 => '0.00',
|
10
|
+
3 => '#, ##0',
|
11
|
+
4 => '#, ##0.00',
|
12
|
+
5 => '$#, ##0_);($#, ##0)',
|
13
|
+
6 => '$#, ##0_);[Red]($#, ##0)',
|
14
|
+
7 => '$#, ##0.00_);($#, ##0.00)',
|
15
|
+
8 => '$#, ##0.00_);[Red]($#, ##0.00)',
|
16
|
+
9 => '0%',
|
17
|
+
10 => '0.00%',
|
18
|
+
11 => '0.00E+00',
|
19
|
+
12 => '# ?/?',
|
20
|
+
13 => '# ??/??',
|
21
|
+
14 => 'm/d/yyyy',
|
22
|
+
15 => 'd-mmm-yy',
|
23
|
+
16 => 'd-mmm',
|
24
|
+
17 => 'mmm-yy',
|
25
|
+
18 => 'h:mm AM/PM',
|
26
|
+
19 => 'h:mm:ss AM/PM',
|
27
|
+
20 => 'h:mm',
|
28
|
+
21 => 'h:mm:ss',
|
29
|
+
22 => 'm/d/yyyy h:mm',
|
30
|
+
37 => '#, ##0_);(#, ##0)',
|
31
|
+
38 => '#, ##0_);[Red](#, ##0)',
|
32
|
+
39 => '#, ##0.00_);(#, ##0.00)',
|
33
|
+
40 => '#, ##0.00_);[Red](#, ##0.00)',
|
34
|
+
45 => 'mm:ss',
|
35
|
+
46 => '[h]:mm:ss',
|
36
|
+
47 => 'mm:ss.0',
|
37
|
+
48 => '##0.0E+0',
|
38
|
+
49 => '@'
|
38
39
|
}.freeze
|
39
40
|
|
40
|
-
MINUTE = 60
|
41
|
-
HOUR = 3600
|
42
|
-
A_CODEPOINT =
|
41
|
+
MINUTE = 60
|
42
|
+
HOUR = 3600
|
43
|
+
A_CODEPOINT = 'A'.ord.freeze
|
43
44
|
# The epoch for all dates in OOXML Spreadsheet documents
|
44
45
|
EPOCH = Date.new(1899, 12, 30).freeze
|
45
46
|
|
46
47
|
# Return the index number for the given Excel column name (i.e. "A1" => 0)
|
47
48
|
def column_index(col)
|
48
49
|
col.each_codepoint.reduce(0) do |sum, n|
|
49
|
-
break sum - 1 if n < A_CODEPOINT
|
50
|
+
break sum - 1 if n < A_CODEPOINT # reached a number
|
51
|
+
|
50
52
|
sum * 26 + (n - A_CODEPOINT + 1)
|
51
53
|
end
|
52
54
|
end
|
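To make the `column_index` arithmetic concrete, here is the method from the hunk above lifted into a standalone script. It treats the column letters as a base-26 number and stops at the first character below `'A'`, which is where the row digits begin. The cell references used are illustrative inputs only.

```ruby
A_CODEPOINT = 'A'.ord

# Same logic as the diff: accumulate letters as base-26 digits (A = 1),
# then subtract 1 once the row number starts to get a zero-based index.
def column_index(col)
  col.each_codepoint.reduce(0) do |sum, n|
    break sum - 1 if n < A_CODEPOINT # reached a number

    sum * 26 + (n - A_CODEPOINT + 1)
  end
end

%w[A1 Z1 AA1 AB12].each do |ref|
  puts "#{ref} -> #{column_index(ref)}"
end
```

Note that the subtraction only happens when a digit is encountered, so the method expects a full cell reference such as `"A1"` rather than bare column letters.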
@@ -59,9 +61,7 @@ module Xsv
|
|
59
61
|
# Return a time as a string for the given Excel time value
|
60
62
|
def parse_time(number)
|
61
63
|
# Disregard date part
|
62
|
-
|
63
|
-
number = number - number.truncate
|
64
|
-
end
|
64
|
+
number -= number.truncate if number.positive?
|
65
65
|
|
66
66
|
base = number * 24
|
67
67
|
|
@@ -70,11 +70,11 @@ module Xsv
|
|
70
70
|
|
71
71
|
# Compensate for rounding errors
|
72
72
|
if minutes >= 60
|
73
|
-
hours
|
73
|
+
hours += (minutes / 60)
|
74
74
|
minutes = minutes % 60
|
75
75
|
end
|
76
76
|
|
77
|
-
|
77
|
+
format('%02d:%02d', hours, minutes)
|
78
78
|
end
|
79
79
|
|
80
80
|
# Returns a time including a date as a {Time} object
|
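The two `parse_time` hunks above can be read together as the following standalone sketch. The lines that compute `hours` and `minutes` are not shown in the diff, so the two assignments marked below are an assumption about the elided code, not a quotation of it.

```ruby
# Sketch of parse_time: an Excel time is a fraction of a day.
def parse_time(number)
  # Disregard date part
  number -= number.truncate if number.positive?

  base = number * 24

  # Assumption: these two lines fall between the hunks and are not in the diff.
  hours = base.truncate
  minutes = ((base - hours) * 60).round

  # Compensate for rounding errors
  if minutes >= 60
    hours += (minutes / 60)
    minutes = minutes % 60
  end

  format('%02d:%02d', hours, minutes)
end

puts parse_time(0.5)      # half a day
puts parse_time(44000.75) # date part is dropped, three quarters of a day remain
```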
@@ -92,9 +92,9 @@ module Xsv
|
|
92
92
|
|
93
93
|
# Returns a number as either Integer or Float
|
94
94
|
def parse_number(string)
|
95
|
-
if string.include?
|
95
|
+
if string.include? '.'
|
96
96
|
string.to_f
|
97
|
-
elsif string.include?
|
97
|
+
elsif string.include? 'E'
|
98
98
|
Complex(string).to_f
|
99
99
|
else
|
100
100
|
string.to_i
|
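As a quick illustration of the three branches in `parse_number`, the method can be run on its own. The body below is assembled from the hunk above (with the closing `end` lines supplied); the sample strings are illustrative.

```ruby
# Returns a number as either Integer or Float, depending on the string's shape.
def parse_number(string)
  if string.include? '.'
    string.to_f
  elsif string.include? 'E'
    Complex(string).to_f # handles exponent notation without a decimal point
  else
    string.to_i
  end
end

['42', '3.14', '1E3'].each do |s|
  value = parse_number(s)
  puts "#{s} -> #{value.inspect} (#{value.class})"
end
```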
@@ -103,36 +103,20 @@ module Xsv
|
|
103
103
|
|
104
104
|
# Apply date or time number formats, if applicable
|
105
105
|
def parse_number_format(number, format)
|
106
|
-
number = parse_number(number)
|
106
|
+
number = parse_number(number) # number is always a string since it comes out of the Sax Parser
|
107
|
+
|
108
|
+
is_date_format = format.scan(/[dmy]+/).length > 1
|
109
|
+
is_time_format = format.scan(/[hms]+/).length > 1
|
107
110
|
|
108
|
-
if
|
111
|
+
if !is_date_format && !is_time_format
|
112
|
+
number
|
113
|
+
elsif is_date_format && is_time_format
|
109
114
|
parse_datetime(number)
|
110
|
-
elsif is_date_format
|
115
|
+
elsif is_date_format
|
111
116
|
parse_date(number)
|
112
|
-
elsif is_time_format
|
117
|
+
elsif is_time_format
|
113
118
|
parse_time(number)
|
114
|
-
else
|
115
|
-
number
|
116
119
|
end
|
117
120
|
end
|
118
|
-
|
119
|
-
# Tests if the given format string includes both date and time
|
120
|
-
def is_datetime_format?(format)
|
121
|
-
is_date_format?(format) && is_time_format?(format)
|
122
|
-
end
|
123
|
-
|
124
|
-
# Tests if the given format string is a date
|
125
|
-
def is_date_format?(format)
|
126
|
-
return false if format.nil?
|
127
|
-
# If it contains at least 2 sequences of d's, m's or y's it's a date!
|
128
|
-
format.scan(/[dmy]+/).length > 1
|
129
|
-
end
|
130
|
-
|
131
|
-
# Tests if the given format string is a time
|
132
|
-
def is_time_format?(format)
|
133
|
-
return false if format.nil?
|
134
|
-
# If it contains at least 2 sequences of h's, m's or s's it's a time!
|
135
|
-
format.scan(/[hms]+/).length > 1
|
136
|
-
end
|
137
121
|
end
|
138
122
|
end
|
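The inlined date/time detection in `parse_number_format` relies on counting runs of format letters. A standalone sketch of just that classification step follows; `classify_format` is a hypothetical name used here for illustration, while the two regular expressions are taken from the diff.

```ruby
# Hypothetical helper: classify a number format string the same way the
# branches in parse_number_format decide which parser to call.
def classify_format(format)
  is_date_format = format.scan(/[dmy]+/).length > 1
  is_time_format = format.scan(/[hms]+/).length > 1

  if !is_date_format && !is_time_format
    :number
  elsif is_date_format && is_time_format
    :datetime
  elsif is_date_format
    :date
  else
    :time
  end
end

['0.00', 'm/d/yyyy', 'h:mm:ss', 'm/d/yyyy h:mm'].each do |fmt|
  puts "#{fmt.inspect} -> #{classify_format(fmt)}"
end
```

Because `m` appears in both character classes, a format needs at least two runs from a class to count, which is why `'mm:ss'` is a time but not a date.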
data/lib/xsv/relationships_handler.rb
CHANGED
@@ -1,40 +1,23 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# RelationshipsHandler parses the "xl/_rels/workbook.xml.rels" file to get the existing relationships.
|
4
5
|
# This is used internally when opening a workbook.
|
5
|
-
class RelationshipsHandler <
|
6
|
+
class RelationshipsHandler < SaxParser
|
6
7
|
def self.get_relations(io)
|
7
8
|
relations = []
|
8
|
-
handler = new do |relation|
|
9
|
-
relations << relation
|
10
|
-
end
|
11
9
|
|
12
|
-
|
13
|
-
return relations
|
14
|
-
end
|
10
|
+
new { |relation| relations << relation }.parse(io)
|
15
11
|
|
16
|
-
|
12
|
+
relations
|
13
|
+
end
|
17
14
|
|
18
15
|
def initialize(&block)
|
19
16
|
@block = block
|
20
|
-
@relationship = {}
|
21
|
-
end
|
22
|
-
|
23
|
-
def start_element(name)
|
24
|
-
@relationship = {} if name == :Relationship
|
25
|
-
end
|
26
|
-
|
27
|
-
def attr(name, value)
|
28
|
-
case name
|
29
|
-
when :Id, :Type, :Target
|
30
|
-
@relationship[name] = value
|
31
|
-
end
|
32
17
|
end
|
33
18
|
|
34
|
-
def
|
35
|
-
|
36
|
-
|
37
|
-
@block.call(@relationship)
|
19
|
+
def start_element(name, attrs)
|
20
|
+
@block.call(attrs.slice(:Id, :Type, :Target)) if name == 'Relationship'
|
38
21
|
end
|
39
22
|
end
|
40
23
|
end
|
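The new `start_element(name, attrs)` receives all attributes as one hash, so the handler can pick the keys it needs with `Hash#slice` (available from Ruby 2.5, the minimum version the README names). A small sketch with made-up attribute values:

```ruby
# Illustrative attribute hash; the values are placeholders, not real OOXML data.
attrs = {
  Id: 'rId1',
  Type: 'example-relationship-type',
  Target: 'worksheets/sheet1.xml',
  TargetMode: 'Internal'
}

relation = attrs.slice(:Id, :Type, :Target)
puts relation.keys.inspect

# Keys that are absent from the source hash are simply left out.
partial = { Id: 'rId2' }.slice(:Id, :Type, :Target)
puts partial.keys.inspect
```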
data/lib/xsv/sax_parser.rb
ADDED
@@ -0,0 +1,86 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Xsv
|
4
|
+
class SaxParser
|
5
|
+
ATTR_REGEX = /((\S+)="(.*?)")/m
|
6
|
+
|
7
|
+
def parse(io)
|
8
|
+
state = :look_start
|
9
|
+
if io.is_a?(String)
|
10
|
+
pbuf = io.dup
|
11
|
+
eof_reached = true
|
12
|
+
must_read = false
|
13
|
+
else
|
14
|
+
pbuf = String.new(capacity: 8192)
|
15
|
+
eof_reached = false
|
16
|
+
must_read = true
|
17
|
+
end
|
18
|
+
|
19
|
+
loop do
|
20
|
+
if must_read
|
21
|
+
begin
|
22
|
+
pbuf << io.sysread(2048)
|
23
|
+
rescue EOFError, TypeError
|
24
|
+
# EOFError is thrown by IO, rubyzip returns nil from sysread on EOF
|
25
|
+
eof_reached = true
|
26
|
+
end
|
27
|
+
|
28
|
+
must_read = false
|
29
|
+
end
|
30
|
+
|
31
|
+
if state == :look_start
|
32
|
+
if (o = pbuf.index('<'))
|
33
|
+
chars = pbuf.slice!(0, o + 1).chop!.force_encoding('utf-8')
|
34
|
+
|
35
|
+
if respond_to?(:characters) && !chars.empty?
|
36
|
+
if chars.index('&')
|
37
|
+
chars.gsub!('&amp;', '&')
|
38
|
+
chars.gsub!('&apos;', "'")
|
39
|
+
chars.gsub!('&gt;', '>')
|
40
|
+
chars.gsub!('&lt;', '<')
|
41
|
+
chars.gsub!('&quot;', '"')
|
42
|
+
end
|
43
|
+
characters(chars)
|
44
|
+
end
|
45
|
+
|
46
|
+
state = :look_end
|
47
|
+
elsif eof_reached
|
48
|
+
# Discard anything after the last tag in the document
|
49
|
+
break
|
50
|
+
else
|
51
|
+
# Continue loop to read more data into the buffer
|
52
|
+
must_read = true
|
53
|
+
next
|
54
|
+
end
|
55
|
+
end
|
56
|
+
|
57
|
+
if state == :look_end
|
58
|
+
if (o = pbuf.index('>'))
|
59
|
+
if (s = pbuf.index(' ')) && s < o
|
60
|
+
tag_name = pbuf.slice!(0, s + 1).chop!
|
61
|
+
args = pbuf.slice!(0, o - s)
|
62
|
+
else
|
63
|
+
tag_name = pbuf.slice!(0, o + 1).chop!
|
64
|
+
args = nil
|
65
|
+
end
|
66
|
+
|
67
|
+
if tag_name.start_with?('/')
|
68
|
+
end_element(tag_name[1..-1]) if respond_to?(:end_element)
|
69
|
+
elsif args.nil?
|
70
|
+
start_element(tag_name, nil)
|
71
|
+
else
|
72
|
+
start_element(tag_name, args.scan(ATTR_REGEX).each_with_object({}) { |m, h| h[m[1].to_sym] = m[2] })
|
73
|
+
end_element(tag_name) if args.end_with?('/') && respond_to?(:end_element)
|
74
|
+
end
|
75
|
+
|
76
|
+
state = :look_start
|
77
|
+
elsif eof_reached
|
78
|
+
raise 'Malformed XML document, looking for end of tag beyond EOF'
|
79
|
+
else
|
80
|
+
must_read = true
|
81
|
+
end
|
82
|
+
end
|
83
|
+
end
|
84
|
+
end
|
85
|
+
end
|
86
|
+
end
|
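The parser above alternates between two states: look for `<` (everything before it is character data), then look for `>` (everything before it is a tag). The following is a simplified, string-only sketch of that idea. `TinySax` and its event-array output are hypothetical; it skips the buffered IO reads of the real class and, unlike the diff, passes an empty hash rather than `nil` when a tag has no attributes.

```ruby
# String-only sketch of the two-state scan. TinySax is a hypothetical name.
class TinySax
  ATTR_REGEX = /(\S+)="(.*?)"/m
  ENTITIES = {
    '&amp;' => '&', '&apos;' => "'", '&gt;' => '>', '&lt;' => '<', '&quot;' => '"'
  }.freeze

  attr_reader :events

  def initialize
    @events = []
  end

  def parse(xml)
    buf = xml.dup

    while (open_at = buf.index('<'))
      # State 1: everything before '<' is character data.
      text = buf.slice!(0, open_at + 1).chop
      @events << [:characters, decode(text)] unless text.empty?

      # State 2: everything up to '>' is the tag itself.
      close_at = buf.index('>')
      raise 'Malformed XML document, tag is never closed' if close_at.nil?

      handle_tag(buf.slice!(0, close_at + 1).chop)
    end

    self # anything after the last tag is discarded
  end

  private

  def handle_tag(tag)
    name, args = tag.split(' ', 2)

    if name.start_with?('/')
      @events << [:end_element, name[1..-1]]
    else
      name = name.chomp('/')
      attrs = args.to_s.scan(ATTR_REGEX).each_with_object({}) { |(k, v), h| h[k.to_sym] = v }
      @events << [:start_element, name, attrs]
      @events << [:end_element, name] if tag.end_with?('/') # self-closing tag
    end
  end

  def decode(text)
    return text unless text.include?('&')

    ENTITIES.reduce(text) { |acc, (entity, char)| acc.gsub(entity, char) }
  end
end

xml = '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1"/><t>a &amp; b</t></row>'
TinySax.new.parse(xml).events.each { |event| puts event.inspect }
```

Like the class in the diff, this approach assumes attribute values never contain a literal `>`, which keeps the scan to plain `index` calls instead of a full XML grammar.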
data/lib/xsv/shared_strings_parser.rb
CHANGED
@@ -1,38 +1,46 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# Interpret the sharedStrings.xml file from the workbook
|
4
5
|
# This is used internally when opening a sheet.
|
5
|
-
class SharedStringsParser <
|
6
|
+
class SharedStringsParser < SaxParser
|
6
7
|
def self.parse(io)
|
7
8
|
strings = []
|
8
|
-
|
9
|
-
|
10
|
-
return strings
|
9
|
+
new { |s| strings << s }.parse(io)
|
10
|
+
strings
|
11
11
|
end
|
12
12
|
|
13
13
|
def initialize(&block)
|
14
14
|
@block = block
|
15
15
|
@state = nil
|
16
|
+
@skip = false
|
16
17
|
end
|
17
18
|
|
18
|
-
def start_element(name)
|
19
|
+
def start_element(name, _attrs)
|
19
20
|
case name
|
20
|
-
when
|
21
|
-
@current_string =
|
22
|
-
|
21
|
+
when 'si'
|
22
|
+
@current_string = ''
|
23
|
+
@skip = false
|
24
|
+
when 'rPh'
|
25
|
+
@skip = true
|
26
|
+
when 't'
|
23
27
|
@state = name
|
24
28
|
end
|
25
29
|
end
|
26
30
|
|
27
|
-
def
|
28
|
-
|
31
|
+
def characters(value)
|
32
|
+
if @state == 't' && !@skip
|
33
|
+
@current_string += value
|
34
|
+
end
|
29
35
|
end
|
30
36
|
|
31
37
|
def end_element(name)
|
32
38
|
case name
|
33
|
-
when
|
39
|
+
when 'si'
|
34
40
|
@block.call(@current_string)
|
35
|
-
when
|
41
|
+
when 'rPh'
|
42
|
+
@skip = false
|
43
|
+
when 't'
|
36
44
|
@state = nil
|
37
45
|
end
|
38
46
|
end
|
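The `@skip` flag added above is what drops phonetic runs (`rPh`) from a shared string. A minimal sketch of the same state machine, driven by hand-written events instead of a parser; `PhoneticSkippingCollector` is a hypothetical name and the sample strings are placeholders.

```ruby
# Hypothetical collector mirroring the si / rPh / t handling in the diff.
class PhoneticSkippingCollector
  attr_reader :strings

  def initialize
    @strings = []
    @state = nil
    @skip = false
  end

  def start_element(name)
    case name
    when 'si'
      @current_string = +''
      @skip = false
    when 'rPh'
      @skip = true # text inside a phonetic run must not reach the string
    when 't'
      @state = name
    end
  end

  def characters(value)
    @current_string << value if @state == 't' && !@skip
  end

  def end_element(name)
    case name
    when 'si' then @strings << @current_string
    when 'rPh' then @skip = false
    when 't' then @state = nil
    end
  end
end

# Events for: <si><t>Tokyo</t><rPh><t>toukyou</t></rPh></si>
collector = PhoneticSkippingCollector.new
collector.start_element('si')
collector.start_element('t')
collector.characters('Tokyo')
collector.end_element('t')
collector.start_element('rPh')
collector.start_element('t')
collector.characters('toukyou')
collector.end_element('t')
collector.end_element('rPh')
collector.end_element('si')

puts collector.strings.inspect
```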
data/lib/xsv/sheet.rb
CHANGED
@@ -1,4 +1,5 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# Sheet represents a single worksheet from a workbook and is normally accessed through {Workbook#sheets}
|
4
5
|
#
|
@@ -39,14 +40,14 @@ module Xsv
|
|
39
40
|
@headers = []
|
40
41
|
@mode = :array
|
41
42
|
@row_skip = 0
|
42
|
-
@hidden = ids[:state] ==
|
43
|
+
@hidden = ids[:state] == 'hidden'
|
43
44
|
|
44
45
|
@last_row, @column_count = SheetBoundsHandler.get_bounds(@io, @workbook)
|
45
46
|
end
|
46
47
|
|
47
48
|
# @return [String]
|
48
49
|
def inspect
|
49
|
-
"#<#{self.class.name}:#{
|
50
|
+
"#<#{self.class.name}:#{object_id}>"
|
50
51
|
end
|
51
52
|
|
52
53
|
# Returns true if the worksheet is hidden
|
@@ -60,15 +61,7 @@ module Xsv
|
|
60
61
|
|
61
62
|
handler = SheetRowsHandler.new(@mode, empty_row, @workbook, @row_skip, @last_row, &block)
|
62
63
|
|
63
|
-
|
64
|
-
# handed a string. For larger sheets this leads to awful performance.
|
65
|
-
# This is probably caused by either something in SheetRowsHandler or
|
66
|
-
# the interaction between Zip::InputStream and Ox
|
67
|
-
if @size > 100_000_000
|
68
|
-
Ox.sax_parse(handler, @io)
|
69
|
-
else
|
70
|
-
Ox.sax_parse(handler, @io.read)
|
71
|
-
end
|
64
|
+
handler.parse(@io)
|
72
65
|
|
73
66
|
true
|
74
67
|
end
|
@@ -82,17 +75,17 @@ module Xsv
|
|
82
75
|
return row if i == number
|
83
76
|
end
|
84
77
|
|
85
|
-
|
78
|
+
empty_row
|
86
79
|
end
|
87
80
|
|
88
81
|
# Load headers in the top row of the worksheet. After parsing of headers
|
89
82
|
# all methods return hashes instead of arrays
|
90
|
-
# @return [
|
83
|
+
# @return [self]
|
91
84
|
def parse_headers!
|
92
85
|
@headers = parse_headers
|
93
86
|
@mode = :hash
|
94
87
|
|
95
|
-
|
88
|
+
self
|
96
89
|
end
|
97
90
|
|
98
91
|
# Return the headers of the sheet as an array
|
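Returning `self` from `parse_headers!` is what allows the call to be chained. The sketch below shows the pattern on a stand-in class; `TinySheet` is hypothetical and only imitates the array/hash mode switch, it is not Xsv's `Sheet`.

```ruby
# Hypothetical stand-in for a sheet with array and hash modes.
class TinySheet
  include Enumerable

  def initialize(rows)
    @rows = rows
    @mode = :array
    @headers = []
  end

  def parse_headers!
    @headers = @rows.first || []
    @mode = :hash

    self # returning self is what makes chaining possible
  end

  def each
    return enum_for(:each) unless block_given?

    if @mode == :hash
      @rows.drop(1).each { |row| yield @headers.zip(row).to_h }
    else
      @rows.each { |row| yield row }
    end
  end
end

sheet = TinySheet.new([%w[id name], [1, 'a'], [2, 'b']])
first = sheet.parse_headers!.first
puts first.inspect
```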
@@ -107,9 +100,10 @@ module Xsv
|
|
107
100
|
private
|
108
101
|
|
109
102
|
def parse_headers
|
110
|
-
|
103
|
+
case @mode
|
104
|
+
when :array
|
111
105
|
first
|
112
|
-
|
106
|
+
when :hash
|
113
107
|
@mode = :array
|
114
108
|
headers.tap { @mode = :hash }
|
115
109
|
end || []
|
data/lib/xsv/sheet_bounds_handler.rb
CHANGED
@@ -1,9 +1,10 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# SheetBoundsHandler scans a sheet looking for the outer bounds of the content within.
|
4
5
|
# This is used internally when opening a sheet to deal with worksheets that do not
|
5
6
|
# have a correct dimension tag.
|
6
|
-
class SheetBoundsHandler <
|
7
|
+
class SheetBoundsHandler < SaxParser
|
7
8
|
include Xsv::Helpers
|
8
9
|
|
9
10
|
def self.get_bounds(sheet, workbook)
|
@@ -12,18 +13,17 @@ module Xsv
|
|
12
13
|
|
13
14
|
handler = new(workbook.trim_empty_rows) do |row, col|
|
14
15
|
rows = row
|
15
|
-
cols = col
|
16
|
+
cols = col.zero? ? 0 : col + 1
|
16
17
|
|
17
18
|
return rows, cols
|
18
19
|
end
|
19
20
|
|
20
21
|
sheet.rewind
|
21
|
-
Ox.sax_parse(handler, sheet.read)
|
22
22
|
|
23
|
-
|
24
|
-
end
|
23
|
+
handler.parse(sheet)
|
25
24
|
|
26
|
-
|
25
|
+
[rows, cols]
|
26
|
+
end
|
27
27
|
|
28
28
|
def initialize(trim_empty_rows, &block)
|
29
29
|
@block = block
|
@@ -35,36 +35,22 @@ module Xsv
|
|
35
35
|
@trim_empty_rows = trim_empty_rows
|
36
36
|
end
|
37
37
|
|
38
|
-
def start_element(name)
|
38
|
+
def start_element(name, attrs)
|
39
39
|
case name
|
40
|
-
when
|
40
|
+
when 'c'
|
41
41
|
@state = name
|
42
|
-
@cell =
|
43
|
-
when
|
42
|
+
@cell = attrs[:r]
|
43
|
+
when 'v'
|
44
44
|
col = column_index(@cell)
|
45
45
|
@maxColumn = col if col > @maxColumn
|
46
46
|
@maxRow = @row if @row > @maxRow
|
47
|
-
when
|
47
|
+
when 'row'
|
48
48
|
@state = name
|
49
|
-
@row =
|
50
|
-
when
|
49
|
+
@row = attrs[:r].to_i
|
50
|
+
when 'dimension'
|
51
51
|
@state = name
|
52
|
-
end
|
53
|
-
end
|
54
|
-
|
55
|
-
def end_element(name)
|
56
|
-
if name == :sheetData
|
57
|
-
@block.call(@maxRow, @maxColumn)
|
58
|
-
end
|
59
|
-
end
|
60
52
|
|
61
|
-
|
62
|
-
if @state == :c && name == :r
|
63
|
-
@cell = value
|
64
|
-
elsif @state == :row && name == :r
|
65
|
-
@row = value.to_i
|
66
|
-
elsif @state == :dimension && name == :ref
|
67
|
-
_firstCell, lastCell = value.split(":")
|
53
|
+
_firstCell, lastCell = attrs[:ref].split(':')
|
68
54
|
|
69
55
|
if lastCell
|
70
56
|
@maxColumn = column_index(lastCell)
|
@@ -75,5 +61,9 @@ module Xsv
|
|
75
61
|
end
|
76
62
|
end
|
77
63
|
end
|
64
|
+
|
65
|
+
def end_element(name)
|
66
|
+
@block.call(@maxRow, @maxColumn) if name == 'sheetData'
|
67
|
+
end
|
78
68
|
end
|
79
69
|
end
|
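To illustrate how a `dimension` reference turns into sheet bounds, here is a standalone sketch. `bounds_from_dimension` is a hypothetical helper; the row extraction via `last_cell[/\d+$/]` is an assumption about code not shown in the hunk, while the `split(':')` step and the `col.zero? ? 0 : col + 1` adjustment come from the diff.

```ruby
# Zero-based column index for a cell reference such as "C10".
def column_index(col)
  base = 'A'.ord
  col.each_codepoint.reduce(0) do |sum, n|
    break sum - 1 if n < base # reached the row digits

    sum * 26 + (n - base + 1)
  end
end

# Hypothetical helper: derive [rows, cols] from a dimension ref like "A1:C10".
def bounds_from_dimension(ref)
  _first_cell, last_cell = ref.split(':')
  return nil unless last_cell # single-cell refs carry no usable range

  col = column_index(last_cell)
  rows = last_cell[/\d+$/].to_i # assumption: not shown in the diff
  cols = col.zero? ? 0 : col + 1 # same adjustment as get_bounds in the diff

  [rows, cols]
end

puts bounds_from_dimension('A1:C10').inspect
puts bounds_from_dimension('A1').inspect
```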
data/lib/xsv/sheet_rows_handler.rb
CHANGED
@@ -1,100 +1,58 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# This is the core worksheet parser, implemented as an Ox::Sax handler. This is
|
4
5
|
# used internally to enumerate rows.
|
5
|
-
class SheetRowsHandler <
|
6
|
+
class SheetRowsHandler < SaxParser
|
6
7
|
include Xsv::Helpers
|
7
8
|
|
8
|
-
def format_cell
|
9
|
-
return nil if @current_value.empty?
|
10
|
-
|
11
|
-
case @current_cell[:t]
|
12
|
-
when "s"
|
13
|
-
@workbook.shared_strings[@current_value.to_i]
|
14
|
-
when "str", "inlineStr"
|
15
|
-
@current_value.dup
|
16
|
-
when "e" # N/A
|
17
|
-
nil
|
18
|
-
when nil, "n"
|
19
|
-
if @current_cell[:s]
|
20
|
-
style = @workbook.xfs[@current_cell[:s].to_i]
|
21
|
-
numFmt = @workbook.numFmts[style[:numFmtId].to_i]
|
22
|
-
|
23
|
-
parse_number_format(@current_value, numFmt)
|
24
|
-
else
|
25
|
-
parse_number(@current_value)
|
26
|
-
end
|
27
|
-
when "b"
|
28
|
-
@current_value == "1"
|
29
|
-
else
|
30
|
-
raise Xsv::Error, "Encountered unknown column type #{@current_cell[:t]}"
|
31
|
-
end
|
32
|
-
end
|
33
|
-
|
34
|
-
# Ox::Sax implementation below
|
35
|
-
|
36
9
|
def initialize(mode, empty_row, workbook, row_skip, last_row, &block)
|
37
|
-
@block = block
|
38
|
-
|
39
|
-
# :sheetData
|
40
|
-
# :row
|
41
|
-
# :c
|
42
|
-
# :v
|
43
|
-
@state = nil
|
44
|
-
|
45
10
|
@mode = mode
|
46
11
|
@empty_row = empty_row
|
47
12
|
@workbook = workbook
|
48
13
|
@row_skip = row_skip
|
14
|
+
@last_row = last_row - @row_skip
|
15
|
+
@block = block
|
16
|
+
|
17
|
+
@state = nil
|
18
|
+
|
49
19
|
@row_index = 0
|
50
20
|
@current_row = {}
|
51
21
|
@current_row_attrs = {}
|
52
22
|
@current_cell = {}
|
53
23
|
@current_value = String.new
|
54
|
-
@last_row = last_row
|
55
24
|
|
56
|
-
if @mode == :hash
|
57
|
-
@headers = @empty_row.keys
|
58
|
-
end
|
25
|
+
@headers = @empty_row.keys if @mode == :hash
|
59
26
|
end
|
60
27
|
|
61
|
-
def start_element(name)
|
28
|
+
def start_element(name, attrs)
|
62
29
|
case name
|
63
|
-
when
|
30
|
+
when 'c'
|
64
31
|
@state = name
|
65
|
-
@current_cell
|
32
|
+
@current_cell = attrs
|
66
33
|
@current_value.clear
|
67
|
-
when
|
34
|
+
when 'v', 'is'
|
68
35
|
@state = name
|
69
|
-
when
|
36
|
+
when 'row'
|
70
37
|
@state = name
|
71
38
|
@current_row = @empty_row.dup
|
72
|
-
@current_row_attrs
|
73
|
-
when
|
74
|
-
@state = nil unless @state ==
|
39
|
+
@current_row_attrs = attrs
|
40
|
+
when 't'
|
41
|
+
@state = nil unless @state == 'is'
|
75
42
|
else
|
76
43
|
@state = nil
|
77
44
|
end
|
78
45
|
end
|
79
46
|
|
80
|
-
def
|
81
|
-
if @state ==
|
82
|
-
@current_value << value
|
83
|
-
end
|
84
|
-
end
|
85
|
-
|
86
|
-
def attr(name, value)
|
87
|
-
case @state
|
88
|
-
when :c
|
89
|
-
@current_cell[name] = value
|
90
|
-
when :row
|
91
|
-
@current_row_attrs[name] = value
|
92
|
-
end
|
47
|
+
def characters(value)
|
48
|
+
@current_value << value if @state == 'v' || @state == 'is'
|
93
49
|
end
|
94
50
|
|
95
51
|
def end_element(name)
|
96
52
|
case name
|
97
|
-
when
|
53
|
+
when 'v'
|
54
|
+
@state = nil
|
55
|
+
when 'c'
|
98
56
|
col_index = column_index(@current_cell[:r])
|
99
57
|
|
100
58
|
case @mode
|
@@ -103,28 +61,54 @@ module Xsv
|
|
103
61
|
when :hash
|
104
62
|
@current_row[@headers[col_index]] = format_cell
|
105
63
|
end
|
106
|
-
when
|
107
|
-
|
108
|
-
|
64
|
+
when 'row'
|
65
|
+
real_row_number = @current_row_attrs[:r].to_i
|
66
|
+
adjusted_row_number = real_row_number - @row_skip
|
109
67
|
|
110
|
-
if
|
111
|
-
return
|
112
|
-
end
|
68
|
+
return if real_row_number <= @row_skip
|
113
69
|
|
114
70
|
@row_index += 1
|
115
71
|
|
116
72
|
# Skip first row if we're in hash mode
|
117
|
-
return if
|
73
|
+
return if adjusted_row_number == 1 && @mode == :hash
|
118
74
|
|
119
75
|
# Pad empty rows
|
120
|
-
while @row_index <
|
76
|
+
while @row_index < adjusted_row_number
|
121
77
|
@block.call(@empty_row)
|
122
78
|
@row_index += 1
|
123
79
|
next
|
124
80
|
end
|
125
81
|
|
126
82
|
# Do not return empty trailing rows
|
127
|
-
@block.call(@current_row) unless @row_index > @last_row
|
83
|
+
@block.call(@current_row) unless @row_index > @last_row
|
84
|
+
end
|
85
|
+
end
|
86
|
+
|
87
|
+
private
|
88
|
+
|
89
|
+
def format_cell
|
90
|
+
return nil if @current_value.empty?
|
91
|
+
|
92
|
+
case @current_cell[:t]
|
93
|
+
when 's'
|
94
|
+
@workbook.shared_strings[@current_value.to_i]
|
95
|
+
when 'str', 'inlineStr'
|
96
|
+
@current_value.strip
|
97
|
+
when 'e' # N/A
|
98
|
+
nil
|
99
|
+
when nil, 'n'
|
100
|
+
if @current_cell[:s]
|
101
|
+
style = @workbook.xfs[@current_cell[:s].to_i]
|
102
|
+
numFmt = @workbook.numFmts[style[:numFmtId].to_i]
|
103
|
+
|
104
|
+
parse_number_format(@current_value, numFmt)
|
105
|
+
else
|
106
|
+
parse_number(@current_value)
|
107
|
+
end
|
108
|
+
when 'b'
|
109
|
+
@current_value == '1'
|
110
|
+
else
|
111
|
+
raise Xsv::Error, "Encountered unknown column type #{@current_cell[:t]}"
|
128
112
|
end
|
129
113
|
end
|
130
114
|
end
|
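The "Pad empty rows" loop above exists because a worksheet may omit rows entirely, so the handler compares a running index with each row's `r` attribute and emits blanks for the gap. A simplified array-mode sketch; `pad_rows` is a hypothetical helper that ignores `row_skip`, hash mode, and the trailing-row check.

```ruby
# Hypothetical helper: rows is a list of [row_number, cells] pairs in
# ascending order; missing row numbers are filled with empty rows.
def pad_rows(rows, width)
  row_index = 0
  out = []

  rows.each do |number, cells|
    row_index += 1

    # Pad empty rows until the running index catches up with the real row number
    while row_index < number
      out << Array.new(width)
      row_index += 1
    end

    out << cells
  end

  out
end

padded = pad_rows([[1, ['a']], [4, ['d']]], 1)
puts padded.inspect
```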
data/lib/xsv/sheets_ids_handler.rb
CHANGED
@@ -1,56 +1,23 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# SheetsIdsHandler interprets the relevant parts of workbook.xml
|
4
5
|
# This is used internally to get the sheets ids, relationship_ids, and names when opening a workbook.
|
5
|
-
class SheetsIdsHandler <
|
6
|
+
class SheetsIdsHandler < SaxParser
|
6
7
|
def self.get_sheets_ids(io)
|
7
8
|
sheets_ids = []
|
8
|
-
handler = new do |sheet_ids|
|
9
|
-
sheets_ids << sheet_ids
|
10
|
-
end
|
11
9
|
|
12
|
-
|
13
|
-
return sheets_ids
|
14
|
-
end
|
10
|
+
new { |sheet_ids| sheets_ids << sheet_ids }.parse(io)
|
15
11
|
|
16
|
-
|
12
|
+
sheets_ids
|
13
|
+
end
|
17
14
|
|
18
15
|
def initialize(&block)
|
19
16
|
@block = block
|
20
|
-
@parsing = false
|
21
|
-
end
|
22
|
-
|
23
|
-
def start_element(name)
|
24
|
-
if name == :sheets
|
25
|
-
@parsing = true
|
26
|
-
return
|
27
|
-
end
|
28
|
-
|
29
|
-
return unless name == :sheet
|
30
|
-
|
31
|
-
@sheet_ids = {}
|
32
|
-
end
|
33
|
-
|
34
|
-
def attr(name, value)
|
35
|
-
return unless @parsing
|
36
|
-
|
37
|
-
case name
|
38
|
-
when :name, :sheetId, :state
|
39
|
-
@sheet_ids[name] = value
|
40
|
-
when :'r:id'
|
41
|
-
@sheet_ids[:r_id] = value
|
42
|
-
end
|
43
17
|
end
|
44
18
|
|
45
|
-
def
|
46
|
-
if name ==
|
47
|
-
@parsing = false
|
48
|
-
return
|
49
|
-
end
|
50
|
-
|
51
|
-
return unless name == :sheet
|
52
|
-
|
53
|
-
@block.call(@sheet_ids)
|
19
|
+
def start_element(name, attrs)
|
20
|
+
@block.call(attrs.slice(:name, :sheetId, :state, :'r:id')) if name == 'sheet'
|
54
21
|
end
|
55
22
|
end
|
56
23
|
end
|
data/lib/xsv/styles_handler.rb
CHANGED
@@ -1,59 +1,44 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
+
|
2
3
|
module Xsv
|
3
4
|
# StylesHandler interprets the relevant parts of styles.xml
|
4
5
|
# This is used internally when opening a sheet.
|
5
|
-
class StylesHandler <
|
6
|
-
def self.get_styles(io
|
7
|
-
|
8
|
-
@numFmts = nil
|
9
|
-
handler = new(numFmts) do |xfs, numFmts|
|
6
|
+
class StylesHandler < SaxParser
|
7
|
+
def self.get_styles(io)
|
8
|
+
handler = new(Xsv::Helpers::BUILT_IN_NUMBER_FORMATS.dup) do |xfs, numFmts|
|
10
9
|
@xfs = xfs
|
11
10
|
@numFmts = numFmts
|
12
11
|
end
|
13
12
|
|
14
|
-
|
15
|
-
return @xfs, @numFmts
|
16
|
-
end
|
13
|
+
handler.parse(io)
|
17
14
|
|
18
|
-
|
15
|
+
[@xfs, @numFmts]
|
16
|
+
end
|
19
17
|
|
20
18
|
def initialize(numFmts, &block)
|
21
19
|
@block = block
|
22
20
|
@state = nil
|
23
21
|
@xfs = []
|
24
22
|
@numFmts = numFmts
|
25
|
-
|
26
|
-
@xf = {}
|
27
|
-
@numFmt = {}
|
28
23
|
end
|
29
24
|
|
30
|
-
def start_element(name)
|
25
|
+
def start_element(name, attrs)
|
31
26
|
case name
|
32
|
-
when
|
33
|
-
@state =
|
34
|
-
when
|
35
|
-
@
|
36
|
-
when
|
37
|
-
@
|
38
|
-
end
|
39
|
-
end
|
40
|
-
|
41
|
-
def attr(name, value)
|
42
|
-
case @state
|
43
|
-
when :cellXfs
|
44
|
-
@xf[name] = value
|
45
|
-
when :numFmts
|
46
|
-
@numFmt[name] = value
|
27
|
+
when 'cellXfs'
|
28
|
+
@state = 'cellXfs'
|
29
|
+
when 'xf'
|
30
|
+
@xfs << attrs if @state == 'cellXfs'
|
31
|
+
when 'numFmt'
|
32
|
+
@numFmts[attrs[:numFmtId].to_i] = attrs[:formatCode]
|
47
33
|
end
|
48
34
|
end
|
49
35
|
|
50
36
|
def end_element(name)
|
51
|
-
|
52
|
-
|
53
|
-
elsif @state == :numFmts && name == :numFmt
|
54
|
-
@numFmts[@numFmt[:numFmtId].to_i] = @numFmt[:formatCode]
|
55
|
-
elsif name == :styleSheet
|
37
|
+
case name
|
38
|
+
when 'styleSheet'
|
56
39
|
@block.call(@xfs, @numFmts)
|
40
|
+
when 'cellXfs'
|
41
|
+
@state = nil
|
57
42
|
end
|
58
43
|
end
|
59
44
|
end
|
data/lib/xsv/version.rb
CHANGED
data/lib/xsv/workbook.rb
CHANGED
@@ -1,11 +1,11 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
|
-
|
2
|
+
|
3
|
+
require 'zip'
|
3
4
|
|
4
5
|
module Xsv
|
5
6
|
# An OOXML Spreadsheet document is called a Workbook. A Workbook consists of
|
6
7
|
# multiple Sheets that are available in the array that's accessible through {#sheets}
|
7
8
|
class Workbook
|
8
|
-
|
9
9
|
# Access the Sheet objects contained in the workbook
|
10
10
|
# @return [Array<Sheet>]
|
11
11
|
attr_reader :sheets
|
@@ -15,12 +15,22 @@ module Xsv
|
|
15
15
|
# Open the workbook of the given filename, string or buffer. For additional
|
16
16
|
# options see {.initialize}
|
17
17
|
def self.open(data, **kws)
|
18
|
-
if data.is_a?(IO) || data.respond_to?(:read) # is it a buffer?
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
18
|
+
@workbook = if data.is_a?(IO) || data.respond_to?(:read) # is it a buffer?
|
19
|
+
new(Zip::File.open_buffer(data), **kws)
|
20
|
+
elsif data.start_with?("PK\x03\x04") # is it a string containing a file?
|
21
|
+
new(Zip::File.open_buffer(data), **kws)
|
22
|
+
else # must be a filename
|
23
|
+
new(Zip::File.open(data), **kws)
|
24
|
+
end
|
25
|
+
|
26
|
+
if block_given?
|
27
|
+
begin
|
28
|
+
yield(@workbook)
|
29
|
+
ensure
|
30
|
+
@workbook.close
|
31
|
+
end
|
32
|
+
else
|
33
|
+
@workbook
|
24
34
|
end
|
25
35
|
end
|
26
36
|
|
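The block form added to `Workbook.open` follows the same yield-then-close pattern as `File.open`. A minimal sketch of that pattern on a stand-in class follows; `TinyResource` is hypothetical. One deliberate difference: the diff assigns to `@workbook`, which inside `self.open` is an instance variable on the class object, whereas the sketch uses a local variable.

```ruby
# Hypothetical stand-in demonstrating open-with-block semantics.
class TinyResource
  attr_reader :closed

  def self.open(data)
    resource = new(data)
    return resource unless block_given?

    begin
      yield resource # the block's value becomes the return value of open
    ensure
      resource.close # runs even if the block raises
    end
  end

  def initialize(data)
    @data = data
    @closed = false
  end

  def close
    @closed = true
  end
end

seen = nil
result = TinyResource.open('payload') do |resource|
  seen = resource
  :done
end

puts result.inspect
puts seen.closed
```

Without a block the caller receives the open resource and is responsible for calling `close`.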
@@ -32,35 +42,35 @@ module Xsv
     # trim_empty_rows (false) Scan sheet for end of content and don't return trailing rows
     #
     def initialize(zip, trim_empty_rows: false)
+      raise ArgumentError, "Passed argument is not an instance of Zip::File. Did you mean to use Workbook.open?" unless zip.is_a?(Zip::File)
+
       @zip = zip
       @trim_empty_rows = trim_empty_rows
 
       @sheets = []
-      @xfs =
-      @
-
-      fetch_shared_strings
-
-      fetch_sheets_ids
-      fetch_relationships
-      fetch_sheets
+      @xfs, @numFmts = fetch_styles
+      @sheet_ids = fetch_sheet_ids
+      @relationships = fetch_relationships
+      @shared_strings = fetch_shared_strings
+      @sheets = fetch_sheets
     end
 
     # @return [String]
     def inspect
-      "#<#{self.class.name}:#{
+      "#<#{self.class.name}:#{object_id}>"
     end
 
     # Close the handle to the workbook file and leave all resources for the GC to collect
     # @return [true]
     def close
       @zip.close
+      @zip = nil
       @sheets = nil
       @xfs = nil
       @numFmts = nil
       @relationships = nil
       @shared_strings = nil
-      @
+      @sheet_ids = nil
 
       true
     end
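The guard added to `initialize` in this hunk turns an accidental `Workbook.new("file.xlsx")` into an immediate `ArgumentError`. The sketch below reproduces that pattern with hypothetical stand-ins (`FakeZip` for `Zip::File`, `GuardedWorkbook` for `Xsv::Workbook`) so it runs without the gem or rubyzip installed.

```ruby
# Hypothetical stand-ins: FakeZip plays the role of Zip::File and
# GuardedWorkbook the role of Xsv::Workbook. Neither is part of the gem.
class FakeZip; end

class GuardedWorkbook
  # Factory that wraps the source in the expected archive type.
  def self.open(_path)
    new(FakeZip.new)
  end

  def initialize(zip)
    # Fail fast with a hint when the constructor is called directly
    # with something other than the archive object.
    unless zip.is_a?(FakeZip)
      raise ArgumentError,
            "Passed argument is not an instance of FakeZip. Did you mean to use GuardedWorkbook.open?"
    end

    @zip = zip
  end
end

# Calling the constructor with a filename raises; capture the message.
message = begin
  GuardedWorkbook.new("example.xlsx")
  nil
rescue ArgumentError => e
  e.message
end

# Going through the factory succeeds.
workbook = GuardedWorkbook.open("example.xlsx")
```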
@@ -75,42 +85,44 @@ module Xsv
     private
 
     def fetch_shared_strings
-      handle = @zip.glob(
+      handle = @zip.glob('xl/sharedStrings.xml').first
       return if handle.nil?
 
       stream = handle.get_input_stream
-
-
-      stream
+      SharedStringsParser.parse(stream)
+    ensure
+      stream&.close
     end
 
     def fetch_styles
-      stream = @zip.glob(
+      stream = @zip.glob('xl/styles.xml').first.get_input_stream
 
-
+      StylesHandler.get_styles(stream)
+    ensure
+      stream.close
     end
 
     def fetch_sheets
-      @zip.glob(
+      @zip.glob('xl/worksheets/sheet*.xml').sort do |a, b|
         a.name[/\d+/].to_i <=> b.name[/\d+/].to_i
-      end.
-        rel = @relationships.detect { |r| entry.name.end_with?(r[:Target]) && r[:Type].end_with?(
-        sheet_ids = @
-
+      end.map do |entry|
+        rel = @relationships.detect { |r| entry.name.end_with?(r[:Target]) && r[:Type].end_with?('worksheet') }
+        sheet_ids = @sheet_ids.detect { |i| i[:"r:id"] == rel[:Id] }
+        Xsv::Sheet.new(self, entry.get_input_stream, entry.size, sheet_ids)
       end
     end
 
-    def
-      stream = @zip.glob(
-
-
+    def fetch_sheet_ids
+      stream = @zip.glob('xl/workbook.xml').first.get_input_stream
+      SheetsIdsHandler.get_sheets_ids(stream)
+    ensure
       stream.close
     end
 
     def fetch_relationships
-      stream = @zip.glob(
-
-
+      stream = @zip.glob('xl/_rels/workbook.xml.rels').first.get_input_stream
+      RelationshipsHandler.get_relations(stream)
+    ensure
       stream.close
     end
   end
data/xsv.gemspec
CHANGED
@@ -36,12 +36,11 @@ Gem::Specification.new do |spec|
   spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
   spec.require_paths = ["lib"]
 
-  spec.required_ruby_version = "
+  spec.required_ruby_version = ">= 2.5"
 
   spec.add_dependency "rubyzip", ">= 1.3", "< 3"
-  spec.add_dependency "ox", ">= 2.9"
 
   spec.add_development_dependency "bundler", "< 3"
   spec.add_development_dependency "rake", "~> 13.0"
-  spec.add_development_dependency "minitest", "~> 5.
+  spec.add_development_dependency "minitest", "~> 5.14.2"
 end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: xsv
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 1.0.2
 platform: ruby
 authors:
 - Martijn Storck
-autorequire:
+autorequire:
 bindir: exe
 cert_chain: []
-date:
+date: 2021-05-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rubyzip
@@ -30,20 +30,6 @@ dependencies:
     - - "<"
       - !ruby/object:Gem::Version
         version: '3'
-- !ruby/object:Gem::Dependency
-  name: ox
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '2.9'
-  type: :runtime
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '2.9'
 - !ruby/object:Gem::Dependency
   name: bundler
   requirement: !ruby/object:Gem::Requirement
@@ -78,14 +64,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 5.14.2
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 5.14.2
 description: |2
     Xsv is a fast, lightweight parser for Office Open XML spreadsheet files
     (commonly known as Excel or .xlsx files). It strives to be minimal in the
@@ -109,6 +95,7 @@ files:
 - lib/xsv.rb
 - lib/xsv/helpers.rb
 - lib/xsv/relationships_handler.rb
+- lib/xsv/sax_parser.rb
 - lib/xsv/shared_strings_parser.rb
 - lib/xsv/sheet.rb
 - lib/xsv/sheet_bounds_handler.rb
@@ -125,13 +112,13 @@ metadata:
   homepage_uri: https://github.com/martijn/xsv
   source_code_uri: https://github.com/martijn/xsv
   changelog_uri: https://github.com/martijn/xsv/CHANGELOG.md
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
-  - - "
+  - - ">="
     - !ruby/object:Gem::Version
       version: '2.5'
 required_rubygems_version: !ruby/object:Gem::Requirement
@@ -140,8 +127,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.
-signing_key:
+rubygems_version: 3.2.3
+signing_key:
 specification_version: 4
 summary: A fast and lightweiggt xlsx parser that provides nothing a CSV parser wouldn't
 test_files: []