xsv 1.0.6 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: '0755959285e8f4e588fc8f72f45c48904bc0b840c1abc7b250faf6bad978e7f0'
- data.tar.gz: 482143461be2e72994e8d9758d1a971e87355acdd16cb027a5631956b7898927
+ metadata.gz: f1ebfa4e4778af72a8b295d258d899a5b5d01fd029d1294d54af5e4f1e0de05a
+ data.tar.gz: aa74ffe0d57eebc12e312bdb42107bc39203cd3a0237f2e2481205c8b3b933c9
  SHA512:
- metadata.gz: a9a48303c59d254233e12994562a341854caffde500f78e5357edebfd16dca12cf7b9b39af6c3c9e1536491f1467456c0b8295bfebf4fddda0f315ab4fbe0875
- data.tar.gz: db9fe14a1c829ca66d2d1daa59da9bab181c5b4ba17c89ebc0703369165310ad4390e8713effc329d0ad6ce9f932a822e368b7eec6ae1e3876b4ed27d4bc0969
+ metadata.gz: 6ebbb32e48860043bdb0a5d17f6fef525252e8bb01ac63180cd0e749dfbd1b3bb08b8a82e43e9e13da2fa187c02f77e37f525e9c7453334bc3cb8fdf05400187
+ data.tar.gz: 4e1450daebcc3ddfbc0585de52f4d1f362ef2d79281e59cc641119a47924f8450b76c3643a6403a440e0a581ebfcc108431a311f902e2b48be61a1d2afd7b19e
@@ -19,7 +19,7 @@ jobs:
  runs-on: ubuntu-latest
  strategy:
  matrix:
- ruby-version: ['2.5', '2.6', '2.7', '3.0', '3.1', 'jruby', 'truffleruby']
+ ruby-version: ['2.6', '2.7', '3.0', '3.1', 'jruby', 'truffleruby']

  steps:
  - uses: actions/checkout@v2
data/.standard.yml CHANGED
@@ -1 +1 @@
- ruby_version: 2.5.0
+ ruby_version: 2.6.9
data/CHANGELOG.md CHANGED
@@ -1,5 +1,12 @@
  # Xsv Changelog

+ ## 1.1.0 2022-02-13
+
+ - New, shorter `Xsv.open` syntax as a drop-in replacement for `Xsv::Workbook.open`, which is still supported
+ - Enable parsing of headers for all sheets by passing `parse_headers: true` to `Xsv.open`
+ - Improvements in performance and test coverage
+ - Dropped support for Ruby 2.5, which is EOL. Xsv 1.1.0 supports Ruby 2.6+, latest JRuby, latest TruffleRuby
+
  ## 1.0.6 2022-01-07

  - Code cleanup, small performance improvements
data/README.md CHANGED
@@ -1,6 +1,7 @@
  # Xsv .xlsx reader

  [![Travis CI](https://img.shields.io/travis/martijn/xsv/master)](https://travis-ci.org/martijn/xsv)
+ [![Codecov](https://img.shields.io/codecov/c/github/martijn/xsv/main)](https://app.codecov.io/gh/martijn/xsv)
  [![Yard Docs](http://img.shields.io/badge/yard-docs-blue.svg)](https://rubydoc.info/github/martijn/xsv)
  [![Gem Version](https://badge.fury.io/rb/xsv.svg)](https://badge.fury.io/rb/xsv)
@@ -41,17 +42,18 @@ when that becomes stable.

  ## Usage

+ ### Array and hash mode
  Xsv has two modes of operation. By default, it returns an array for
  each row in the sheet:

  ```ruby
- x = Xsv::Workbook.open("sheet.xlsx")
+ x = Xsv.open("sheet.xlsx") # => #<Xsv::Workbook sheets=1>

  sheet = x.sheets[0]

  # Iterate over rows
- sheet.each_row do |row|
- row # => ["header1", "header2"], etc.
+ sheet.each do |row|
+ row # => ["header1", "header2"]
  end

  # Access row by index (zero-based)
@@ -59,40 +61,63 @@ sheet[1] # => ["value1", "value2"]
  ```

  Alternatively, it can load the headers from the first row and return a hash
- for every row:
+ for every row by calling `parse_headers!` on the sheet or setting the `parse_headers`
+ option on open:

  ```ruby
- x = Xsv::Workbook.open("sheet.xlsx")
+ # Parse headers for all sheets on open
+
+ x = Xsv.open("sheet.xlsx", parse_headers: true)
+
+ x.sheets[0][1] # => {"header1" => "value1", "header2" => "value2"}
+
+ # Manually parse headers for a single sheet
+
+ x = Xsv.open("sheet.xlsx")

  sheet = x.sheets[0]

- sheet.mode # => :array
+ sheet[0] # => ["header1", "header2"]

- # Parse headers and switch to hash mode
  sheet.parse_headers!

- sheet.mode # => :hash
+ sheet[0] # => {"header1" => "value1", "header2" => "value2"}
+ ```
+
+ Be aware that hash mode will lead to unpredictable results if the worksheet
+ has multiple columns with the same header. `Xsv::Sheet` implements `Enumerable` so along with `#each`
+ you can call methods like `#first`, `#filter`/`#select`, and `#map` on it.
+
+ ### Opening a string or buffer instead of filename

- sheet.each_row do |row|
- row # => {"header1" => "value1", "header2" => "value2"}, etc.
+ `Xsv.open` accepts a filename, or an IO or String containing a workbook. Optionally, you can pass a block
+ which will be called with the workbook as parameter, like `File#open`. Example of this together:
+
+ ```ruby
+ # Use an existing IO-like object as source
+
+ file = File.open("sheet.xlsx")
+
+ Xsv.open(file) do |workbook|
+ puts workbook.inspect
  end

- sheet[1] # => {"header1" => "value1", "header2" => "value2"}
- ```
+ # or even:

- Be aware that hash mode will lead to unpredictable results if the worksheet
- has multiple columns with the same header.
+ Xsv.open(file.read) do |workbook|
+ puts workbook.inspect
+ end
+ ```

- `Xsv::Workbook.open` accepts a filename, or an IO or String containing a workbook. Optionally, you can pass a block
- which will be called with the workbook as parameter, like `File#open`.
+ Prior to Xsv 1.1.0, `Xsv::Workbook.open` was used instead of `Xsv.open`. The parameters are identical and
+ the former is maintained for backwards compatibility.

- `Xsv::Sheet` implements `Enumerable` so you can call methods like `#first`,
- `#filter`/`#select`, and `#map` on it.
+ ### Accessing sheets by name

  The sheets can be accessed by index or by name:

  ```ruby
- x = Xsv::Workbook.open("sheet.xlsx")
+ x = Xsv.open("sheet.xlsx")

  sheet = x.sheets[0] # gets sheet by index
data/benchmark.rb CHANGED
@@ -1,6 +1,6 @@
  #!/usr/bin/env ruby

- require 'bundler/inline'
+ require "bundler/inline"

  gemfile do
  source "https://rubygems.org"
@@ -36,7 +36,7 @@ end

  file = File.read("test/files/10k-sheet.xlsx")

- workbook = Xsv::Workbook.open(file)
+ workbook = Xsv.open(file)

  puts "--- ARRAY MODE ---"
@@ -68,7 +68,7 @@ module Xsv
  end

  if tag_name.start_with?("/")
- end_element(tag_name[1..-1]) if responds_to_end_element
+ end_element(tag_name[1..]) if responds_to_end_element
  elsif args.nil?
  start_element(tag_name, nil)
  else
@@ -78,7 +78,7 @@ module Xsv

  state = :look_start
  elsif eof_reached
- raise "Malformed XML document, looking for end of tag beyond EOF"
+ raise Xsv::Error, "Malformed XML document, looking for end of tag beyond EOF"
  else
  must_read = true
  end
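
The `tag_name[1..]` slice introduced above uses an endless range, a syntax available from Ruby 2.6 onward, which fits with this release dropping Ruby 2.5. A minimal sketch, using a made-up closing-tag name rather than real parser input, suggesting that the old and new slices give the same result:

```ruby
# Hypothetical closing-tag name, chosen only for illustration
tag_name = "/row"

old_style = tag_name[1..-1] # explicit end index, also valid on Ruby 2.5
new_style = tag_name[1..]   # endless range, needs Ruby 2.6 or newer

puts old_style == new_style
```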
data/lib/xsv/sheet.rb CHANGED
@@ -47,7 +47,7 @@ module Xsv

  # @return [String]
  def inspect
- "#<#{self.class.name}:#{object_id}>"
+ "#<#{self.class.name}:#{object_id} mode=#{@mode}>"
  end

  # Returns true if the worksheet is hidden
data/lib/xsv/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Xsv
- VERSION = "1.0.6"
+ VERSION = "1.1.0"
  end
data/lib/xsv/workbook.rb CHANGED
@@ -12,36 +12,17 @@ module Xsv

  attr_reader :shared_strings, :xfs, :num_fmts, :trim_empty_rows

- # Open the workbook of the given filename, string or buffer. For additional
- # options see {.initialize}
- def self.open(data, **kws)
- @workbook = if data.is_a?(IO) || data.respond_to?(:read) # is it a buffer?
- new(Zip::File.open_buffer(data), **kws)
- elsif data.start_with?("PK\x03\x04") # is it a string containing a file?
- new(Zip::File.open_buffer(data), **kws)
- else # must be a filename
- new(Zip::File.open(data), **kws)
- end
-
- if block_given?
- begin
- yield(@workbook)
- ensure
- @workbook.close
- end
- else
- @workbook
- end
+ # @deprecated Use {Xsv.open} instead
+ def self.open(data, **kws, &block)
+ Xsv.open(data, **kws, &block)
  end

  # Open a workbook from an instance of {Zip::File}. Generally it's recommended
  # to use the {.open} method instead of the constructor.
  #
- # Options:
- #
- # trim_empty_rows (false) Scan sheet for end of content and don't return trailing rows
- #
- def initialize(zip, trim_empty_rows: false)
+ # @param trim_empty_rows [Boolean] Scan sheet for end of content and don't return trailing rows
+ # @param parse_headers [Boolean] Call `parse_headers!` on all sheets on load
+ def initialize(zip, trim_empty_rows: false, parse_headers: false)
  raise ArgumentError, "Passed argument is not an instance of Zip::File. Did you mean to use Workbook.open?" unless zip.is_a?(Zip::File)
  raise Xsv::Error, "Zip::File is empty" if zip.size.zero?
@@ -53,12 +34,12 @@ module Xsv
  @sheet_ids = fetch_sheet_ids
  @relationships = fetch_relationships
  @shared_strings = fetch_shared_strings
- @sheets = fetch_sheets
+ @sheets = fetch_sheets(parse_headers ? :hash : :array)
  end

  # @return [String]
  def inspect
- "#<#{self.class.name}:#{object_id}>"
+ "#<#{self.class.name}:#{object_id} sheets=#{sheets.count} trim_empty_rows=#{@trim_empty_rows}>"
  end

  # Close the handle to the workbook file and leave all resources for the GC to collect
@@ -108,13 +89,15 @@ module Xsv
  stream.close
  end

- def fetch_sheets
+ def fetch_sheets(mode)
  @zip.glob("xl/worksheets/sheet*.xml").sort do |a, b|
  a.name[/\d+/].to_i <=> b.name[/\d+/].to_i
  end.map do |entry|
  rel = @relationships.detect { |r| entry.name.end_with?(r[:Target]) && r[:Type].end_with?("worksheet") }
  sheet_ids = @sheet_ids.detect { |i| i[:"r:id"] == rel[:Id] }
- Xsv::Sheet.new(self, entry.get_input_stream, entry.size, sheet_ids)
+ Xsv::Sheet.new(self, entry.get_input_stream, entry.size, sheet_ids).tap do |sheet|
+ sheet.parse_headers! if mode == :hash
+ end
  end
  end
data/lib/xsv.rb CHANGED
@@ -24,4 +24,31 @@ module Xsv
  # An AssertionFailed error indicates an unexpected condition, meaning a bug
  # or misinterpreted .xlsx document
  class AssertionFailed < StandardError; end
+
+ # Open the workbook of the given filename, string or buffer.
+ # @param filename_or_string [String, IO] the contents or filename of a workbook
+ # @param trim_empty_rows [Boolean] Scan sheet for end of content and don't return trailing rows
+ # @param parse_headers [Boolean] Call `parse_headers!` on all sheets on load
+ # @return [Xsv::Workbook] The workbook instance
+ def self.open(filename_or_string, trim_empty_rows: false, parse_headers: false)
+ zip = if filename_or_string.is_a?(IO) || filename_or_string.respond_to?(:read) # is it a buffer?
+ Zip::File.open_buffer(filename_or_string)
+ elsif filename_or_string.start_with?("PK\x03\x04") # is it a string containing a file?
+ Zip::File.open_buffer(filename_or_string)
+ else # must be a filename
+ Zip::File.open(filename_or_string)
+ end
+
+ workbook = Xsv::Workbook.new(zip, trim_empty_rows: trim_empty_rows, parse_headers: parse_headers)
+
+ if block_given?
+ begin
+ yield(workbook)
+ ensure
+ workbook.close
+ end
+ else
+ workbook
+ end
+ end
  end
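
The block handling in the new `Xsv.open` follows the same shape as `File.open`: yield the resource, then close it in an `ensure` clause. A minimal sketch of that pattern in isolation, where `FakeResource` and `open_resource` are hypothetical names invented for this illustration and are not part of Xsv:

```ruby
# Hypothetical stand-in for a workbook; not part of Xsv
class FakeResource
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Yields the resource and closes it afterwards when a block is given,
# otherwise returns the still-open resource to the caller
def open_resource
  resource = FakeResource.new

  if block_given?
    begin
      yield(resource)
    ensure
      resource.close # runs even if the block raises
    end
  else
    resource
  end
end

leaked = nil
result = open_resource do |r|
  leaked = r
  :done
end
```

With a block, the call evaluates to the block's value and the resource is closed on the way out; without one, the caller is responsible for calling `close`.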
data/xsv.gemspec CHANGED
@@ -36,7 +36,7 @@ Gem::Specification.new do |spec|
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

- spec.required_ruby_version = ">= 2.5"
+ spec.required_ruby_version = ">= 2.6"

  spec.add_dependency "rubyzip", ">= 1.3", "< 3"
@@ -44,4 +44,5 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "rake", "~> 13.0"
  spec.add_development_dependency "minitest", "~> 5.14.2"
  spec.add_development_dependency "standard", "~> 1.6.0"
+ spec.add_development_dependency "codecov", ">= 0.6.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: xsv
  version: !ruby/object:Gem::Version
- version: 1.0.6
+ version: 1.1.0
  platform: ruby
  authors:
  - Martijn Storck
- autorequire:
+ autorequire:
  bindir: exe
  cert_chain: []
- date: 2022-01-07 00:00:00.000000000 Z
+ date: 2022-02-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rubyzip
@@ -86,6 +86,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 1.6.0
+ - !ruby/object:Gem::Dependency
+ name: codecov
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.6.0
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.6.0
  description: |2
  Xsv is a fast, lightweight parser for Office Open XML spreadsheet files
  (commonly known as Excel or .xlsx files). It strives to be minimal in the
@@ -128,7 +142,7 @@ metadata:
  homepage_uri: https://github.com/martijn/xsv
  source_code_uri: https://github.com/martijn/xsv
  changelog_uri: https://github.com/martijn/xsv/CHANGELOG.md
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -136,15 +150,15 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '2.5'
+ version: '2.6'
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.3.3
- signing_key:
+ rubygems_version: 3.2.3
+ signing_key:
  specification_version: 4
  summary: A fast and lightweight xlsx parser that provides nothing a CSV parser wouldn't
  test_files: []