xsv 1.0.6 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: '0755959285e8f4e588fc8f72f45c48904bc0b840c1abc7b250faf6bad978e7f0'
- data.tar.gz: 482143461be2e72994e8d9758d1a971e87355acdd16cb027a5631956b7898927
+ metadata.gz: f1ebfa4e4778af72a8b295d258d899a5b5d01fd029d1294d54af5e4f1e0de05a
+ data.tar.gz: aa74ffe0d57eebc12e312bdb42107bc39203cd3a0237f2e2481205c8b3b933c9
  SHA512:
- metadata.gz: a9a48303c59d254233e12994562a341854caffde500f78e5357edebfd16dca12cf7b9b39af6c3c9e1536491f1467456c0b8295bfebf4fddda0f315ab4fbe0875
- data.tar.gz: db9fe14a1c829ca66d2d1daa59da9bab181c5b4ba17c89ebc0703369165310ad4390e8713effc329d0ad6ce9f932a822e368b7eec6ae1e3876b4ed27d4bc0969
+ metadata.gz: 6ebbb32e48860043bdb0a5d17f6fef525252e8bb01ac63180cd0e749dfbd1b3bb08b8a82e43e9e13da2fa187c02f77e37f525e9c7453334bc3cb8fdf05400187
+ data.tar.gz: 4e1450daebcc3ddfbc0585de52f4d1f362ef2d79281e59cc641119a47924f8450b76c3643a6403a440e0a581ebfcc108431a311f902e2b48be61a1d2afd7b19e
@@ -19,7 +19,7 @@ jobs:
  runs-on: ubuntu-latest
  strategy:
  matrix:
- ruby-version: ['2.5', '2.6', '2.7', '3.0', '3.1', 'jruby', 'truffleruby']
+ ruby-version: ['2.6', '2.7', '3.0', '3.1', 'jruby', 'truffleruby']

  steps:
  - uses: actions/checkout@v2
data/.standard.yml CHANGED
@@ -1 +1 @@
- ruby_version: 2.5.0
+ ruby_version: 2.6.9
data/CHANGELOG.md CHANGED
@@ -1,5 +1,12 @@
  # Xsv Changelog

+ ## 1.1.0 2022-02-13
+
+ - New, shorter `Xsv.open` syntax as a drop-in replacement for `Xsv::Workbook.open`, which is still supported
+ - Enable parsing of headers for all sheets by passing `parse_headers: true` to `Xsv.open`
+ - Improvements in performance and test coverage
+ - Dropped support for Ruby 2.5, which is EOL. Xsv 1.1.0 supports Ruby 2.6+, latest JRuby, latest TruffleRuby
+
  ## 1.0.6 2022-01-07

  - Code cleanup, small performance improvements
data/README.md CHANGED
@@ -1,6 +1,7 @@
  # Xsv .xlsx reader

  [![Travis CI](https://img.shields.io/travis/martijn/xsv/master)](https://travis-ci.org/martijn/xsv)
+ [![Codecov](https://img.shields.io/codecov/c/github/martijn/xsv/main)](https://app.codecov.io/gh/martijn/xsv)
  [![Yard Docs](http://img.shields.io/badge/yard-docs-blue.svg)](https://rubydoc.info/github/martijn/xsv)
  [![Gem Version](https://badge.fury.io/rb/xsv.svg)](https://badge.fury.io/rb/xsv)
 
@@ -41,17 +42,18 @@ when that becomes stable.

  ## Usage

+ ### Array and hash mode
  Xsv has two modes of operation. By default, it returns an array for
  each row in the sheet:

  ```ruby
- x = Xsv::Workbook.open("sheet.xlsx")
+ x = Xsv.open("sheet.xlsx") # => #<Xsv::Workbook sheets=1>

  sheet = x.sheets[0]

  # Iterate over rows
- sheet.each_row do |row|
- row # => ["header1", "header2"], etc.
+ sheet.each do |row|
+ row # => ["header1", "header2"]
  end

  # Access row by index (zero-based)
@@ -59,40 +61,63 @@ sheet[1] # => ["value1", "value2"]
  ```

  Alternatively, it can load the headers from the first row and return a hash
- for every row:
+ for every row by calling `parse_headers!` on the sheet or setting the `parse_headers`
+ option on open:

  ```ruby
- x = Xsv::Workbook.open("sheet.xlsx")
+ # Parse headers for all sheets on open
+
+ x = Xsv.open("sheet.xlsx", parse_headers: true)
+
+ x.sheets[0][1] # => {"header1" => "value1", "header2" => "value2"}
+
+ # Manually parse headers for a single sheet
+
+ x = Xsv.open("sheet.xlsx")

  sheet = x.sheets[0]

- sheet.mode # => :array
+ sheet[0] # => ["header1", "header2"]

- # Parse headers and switch to hash mode
  sheet.parse_headers!

- sheet.mode # => :hash
+ sheet[0] # => {"header1" => "value1", "header2" => "value2"}
+ ```
+
+ Be aware that hash mode will lead to unpredictable results if the worksheet
+ has multiple columns with the same header. `Xsv::Sheet` implements `Enumerable` so along with `#each`
+ you can call methods like `#first`, `#filter`/`#select`, and `#map` on it.
+
+ ### Opening a string or buffer instead of filename

- sheet.each_row do |row|
- row # => {"header1" => "value1", "header2" => "value2"}, etc.
+ `Xsv.open` accepts a filename, or an IO or String containing a workbook. Optionally, you can pass a block
+ which will be called with the workbook as parameter, like `File#open`. Example of this together:
+
+ ```ruby
+ # Use an existing IO-like object as source
+
+ file = File.open("sheet.xlsx")
+
+ Xsv.open(file) do |workbook|
+ puts workbook.inspect
  end

- sheet[1] # => {"header1" => "value1", "header2" => "value2"}
- ```
+ # or even:

- Be aware that hash mode will lead to unpredictable results if the worksheet
- has multiple columns with the same header.
+ Xsv.open(file.read) do |workbook|
+ puts workbook.inspect
+ end
+ ```

- `Xsv::Workbook.open` accepts a filename, or an IO or String containing a workbook. Optionally, you can pass a block
- which will be called with the workbook as parameter, like `File#open`.
+ Prior to Xsv 1.1.0, `Xsv::Workbook.open` was used instead of `Xsv.open`. The parameters are identical and
+ the former is maintained for backwards compatibility.

- `Xsv::Sheet` implements `Enumerable` so you can call methods like `#first`,
- `#filter`/`#select`, and `#map` on it.
+ ### Accessing sheets by name

  The sheets can be accessed by index or by name:

  ```ruby
- x = Xsv::Workbook.open("sheet.xlsx")
+ x = Xsv.open("sheet.xlsx")

  sheet = x.sheets[0] # gets sheet by index
 
data/benchmark.rb CHANGED
@@ -1,6 +1,6 @@
  #!/usr/bin/env ruby

- require 'bundler/inline'
+ require "bundler/inline"

  gemfile do
  source "https://rubygems.org"
@@ -36,7 +36,7 @@ end

  file = File.read("test/files/10k-sheet.xlsx")

- workbook = Xsv::Workbook.open(file)
+ workbook = Xsv.open(file)

  puts "--- ARRAY MODE ---"
 
@@ -68,7 +68,7 @@ module Xsv
  end

  if tag_name.start_with?("/")
- end_element(tag_name[1..-1]) if responds_to_end_element
+ end_element(tag_name[1..]) if responds_to_end_element
  elsif args.nil?
  start_element(tag_name, nil)
  else
@@ -78,7 +78,7 @@ module Xsv

  state = :look_start
  elsif eof_reached
- raise "Malformed XML document, looking for end of tag beyond EOF"
+ raise Xsv::Error, "Malformed XML document, looking for end of tag beyond EOF"
  else
  must_read = true
  end
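
Note on the hunk above: `tag_name[1..]` uses an endless range, which Ruby accepts from 2.6 onward, so the change fits this release dropping Ruby 2.5. A minimal sketch, using a made-up tag string rather than anything taken from the gem, suggesting that the old and new slices are equivalent:

```ruby
# Hypothetical closing-tag name; the real parser input is not shown in this diff.
tag_name = "/row"

# Old form: explicit range that runs to the last character.
old_slice = tag_name[1..-1]

# New form: endless range (Ruby 2.6+), selecting the same substring.
new_slice = tag_name[1..]

puts old_slice == new_slice
```

Both slices drop the leading `/` and keep the rest of the string, so the change should be behavior-preserving on supported Rubies.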
data/lib/xsv/sheet.rb CHANGED
@@ -47,7 +47,7 @@ module Xsv

  # @return [String]
  def inspect
- "#<#{self.class.name}:#{object_id}>"
+ "#<#{self.class.name}:#{object_id} mode=#{@mode}>"
  end

  # Returns true if the worksheet is hidden
data/lib/xsv/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Xsv
- VERSION = "1.0.6"
+ VERSION = "1.1.0"
  end
data/lib/xsv/workbook.rb CHANGED
@@ -12,36 +12,17 @@ module Xsv

  attr_reader :shared_strings, :xfs, :num_fmts, :trim_empty_rows

- # Open the workbook of the given filename, string or buffer. For additional
- # options see {.initialize}
- def self.open(data, **kws)
- @workbook = if data.is_a?(IO) || data.respond_to?(:read) # is it a buffer?
- new(Zip::File.open_buffer(data), **kws)
- elsif data.start_with?("PK\x03\x04") # is it a string containing a file?
- new(Zip::File.open_buffer(data), **kws)
- else # must be a filename
- new(Zip::File.open(data), **kws)
- end
-
- if block_given?
- begin
- yield(@workbook)
- ensure
- @workbook.close
- end
- else
- @workbook
- end
+ # @deprecated Use {Xsv.open} instead
+ def self.open(data, **kws, &block)
+ Xsv.open(data, **kws, &block)
  end

  # Open a workbook from an instance of {Zip::File}. Generally it's recommended
  # to use the {.open} method instead of the constructor.
  #
- # Options:
- #
- # trim_empty_rows (false) Scan sheet for end of content and don't return trailing rows
- #
- def initialize(zip, trim_empty_rows: false)
+ # @param trim_empty_rows [Boolean] Scan sheet for end of content and don't return trailing rows
+ # @param parse_headers [Boolean] Call `parse_headers!` on all sheets on load
+ def initialize(zip, trim_empty_rows: false, parse_headers: false)
  raise ArgumentError, "Passed argument is not an instance of Zip::File. Did you mean to use Workbook.open?" unless zip.is_a?(Zip::File)
  raise Xsv::Error, "Zip::File is empty" if zip.size.zero?
 
@@ -53,12 +34,12 @@ module Xsv
  @sheet_ids = fetch_sheet_ids
  @relationships = fetch_relationships
  @shared_strings = fetch_shared_strings
- @sheets = fetch_sheets
+ @sheets = fetch_sheets(parse_headers ? :hash : :array)
  end

  # @return [String]
  def inspect
- "#<#{self.class.name}:#{object_id}>"
+ "#<#{self.class.name}:#{object_id} sheets=#{sheets.count} trim_empty_rows=#{@trim_empty_rows}>"
  end

  # Close the handle to the workbook file and leave all resources for the GC to collect
@@ -108,13 +89,15 @@ module Xsv
  stream.close
  end

- def fetch_sheets
+ def fetch_sheets(mode)
  @zip.glob("xl/worksheets/sheet*.xml").sort do |a, b|
  a.name[/\d+/].to_i <=> b.name[/\d+/].to_i
  end.map do |entry|
  rel = @relationships.detect { |r| entry.name.end_with?(r[:Target]) && r[:Type].end_with?("worksheet") }
  sheet_ids = @sheet_ids.detect { |i| i[:"r:id"] == rel[:Id] }
- Xsv::Sheet.new(self, entry.get_input_stream, entry.size, sheet_ids)
+ Xsv::Sheet.new(self, entry.get_input_stream, entry.size, sheet_ids).tap do |sheet|
+ sheet.parse_headers! if mode == :hash
+ end
  end
  end
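
Note on `data/lib/xsv/workbook.rb`: the deprecated `Xsv::Workbook.open` now only forwards its positional argument, keyword options, and block to `Xsv.open`. A minimal sketch of that forwarding pattern, using hypothetical `Demo` names in place of the gem's classes (nothing here calls the real library):

```ruby
module Demo
  # Stand-in for the new top-level entry point (hypothetical, not Xsv itself).
  def self.open(data, **kws)
    result = [data, kws]
    # Yield to the caller's block if one was given, otherwise return the value.
    block_given? ? yield(result) : result
  end

  class Workbook
    # Old entry point kept for compatibility: pass everything straight through.
    def self.open(data, **kws, &block)
      Demo.open(data, **kws, &block)
    end
  end
end

# Keyword options survive the hop through the old method.
direct = Demo::Workbook.open("sheet.xlsx", parse_headers: true)

# So does the block; its return value becomes the result.
yielded = Demo::Workbook.open("sheet.xlsx") { |pair| pair.first.upcase }
```

Capturing the block as `&block` and re-passing it is what lets the callee's `block_given?` still see it, so callers of the old name should get the same block semantics as callers of the new one.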
 
data/lib/xsv.rb CHANGED
@@ -24,4 +24,31 @@ module Xsv
  # An AssertionFailed error indicates an unexpected condition, meaning a bug
  # or misinterpreted .xlsx document
  class AssertionFailed < StandardError; end
+
+ # Open the workbook of the given filename, string or buffer.
+ # @param filename_or_string [String, IO] the contents or filename of a workbook
+ # @param trim_empty_rows [Boolean] Scan sheet for end of content and don't return trailing rows
+ # @param parse_headers [Boolean] Call `parse_headers!` on all sheets on load
+ # @return [Xsv::Workbook] The workbook instance
+ def self.open(filename_or_string, trim_empty_rows: false, parse_headers: false)
+ zip = if filename_or_string.is_a?(IO) || filename_or_string.respond_to?(:read) # is it a buffer?
+ Zip::File.open_buffer(filename_or_string)
+ elsif filename_or_string.start_with?("PK\x03\x04") # is it a string containing a file?
+ Zip::File.open_buffer(filename_or_string)
+ else # must be a filename
+ Zip::File.open(filename_or_string)
+ end
+
+ workbook = Xsv::Workbook.new(zip, trim_empty_rows: trim_empty_rows, parse_headers: parse_headers)
+
+ if block_given?
+ begin
+ yield(workbook)
+ ensure
+ workbook.close
+ end
+ else
+ workbook
+ end
+ end
  end
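
Note on `data/lib/xsv.rb`: the new `Xsv.open` picks how to open its argument in three steps: anything IO-like is treated as a buffer, a String starting with the ZIP local file header signature `PK\x03\x04` is treated as file contents, and any other String is treated as a filename. A minimal sketch of just that decision, with a hypothetical `classify_source` helper that is not part of the gem and does not touch rubyzip:

```ruby
require "stringio"

# Hypothetical helper mirroring the three branches of the new Xsv.open.
def classify_source(source)
  if source.is_a?(IO) || source.respond_to?(:read)
    :buffer      # IO or IO-like object such as StringIO
  elsif source.start_with?("PK\x03\x04")
    :zip_string  # String that begins with the ZIP local file header signature
  else
    :filename    # anything else is assumed to be a path
  end
end

buffer_kind   = classify_source(StringIO.new("anything"))
contents_kind = classify_source("PK\x03\x04rest-of-archive")
path_kind     = classify_source("sheet.xlsx")

puts [buffer_kind, contents_kind, path_kind].inspect
```

The order of the checks matters: the `respond_to?(:read)` test runs first, so an IO-like object never reaches `start_with?`, which it would not implement.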
data/xsv.gemspec CHANGED
@@ -36,7 +36,7 @@ Gem::Specification.new do |spec|
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

- spec.required_ruby_version = ">= 2.5"
+ spec.required_ruby_version = ">= 2.6"

  spec.add_dependency "rubyzip", ">= 1.3", "< 3"

@@ -44,4 +44,5 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "rake", "~> 13.0"
  spec.add_development_dependency "minitest", "~> 5.14.2"
  spec.add_development_dependency "standard", "~> 1.6.0"
+ spec.add_development_dependency "codecov", ">= 0.6.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: xsv
  version: !ruby/object:Gem::Version
- version: 1.0.6
+ version: 1.1.0
  platform: ruby
  authors:
  - Martijn Storck
- autorequire:
+ autorequire:
  bindir: exe
  cert_chain: []
- date: 2022-01-07 00:00:00.000000000 Z
+ date: 2022-02-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rubyzip
@@ -86,6 +86,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 1.6.0
+ - !ruby/object:Gem::Dependency
+ name: codecov
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.6.0
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.6.0
  description: |2
  Xsv is a fast, lightweight parser for Office Open XML spreadsheet files
  (commonly known as Excel or .xlsx files). It strives to be minimal in the
@@ -128,7 +142,7 @@ metadata:
  homepage_uri: https://github.com/martijn/xsv
  source_code_uri: https://github.com/martijn/xsv
  changelog_uri: https://github.com/martijn/xsv/CHANGELOG.md
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -136,15 +150,15 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '2.5'
+ version: '2.6'
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.3.3
- signing_key:
+ rubygems_version: 3.2.3
+ signing_key:
  specification_version: 4
  summary: A fast and lightweight xlsx parser that provides nothing a CSV parser wouldn't
  test_files: []