rbxl 1.1.0 → 1.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +77 -14
- data/README.md +130 -5
- data/lib/rbxl/errors.rb +13 -0
- data/lib/rbxl/read_only_workbook.rb +36 -3
- data/lib/rbxl/read_only_worksheet.rb +61 -9
- data/lib/rbxl/version.rb +1 -1
- data/lib/rbxl/write_only_workbook.rb +1 -1
- metadata +4 -5
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b7d99201ddbfd10ac1f5173052e0ef0d0bfea0e7e0143bc5e214d28d5cbea335
+  data.tar.gz: 513ec07aea3c8888bafd1b60c20f6e508e6ce87a2380c5dbcb536523b09ceab3
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 1dd2f6856dd7c9452d63f132e52f4336958a8bda63e304b353766ba573ed429b196c76dfa067468dcf0d85f5926de5b002f8f498b4907486fe717096ab20dbb2
+  data.tar.gz: 298fc80d0760d5468a7b2c95ae32751eb5c0e6070cd546d77d806ef7aabb057674f98a371d0c6c0320fd9784d89633e9fb278d03bdec4219426d89997a5540cb
data/CHANGELOG.md
CHANGED
@@ -1,25 +1,88 @@
 # Changelog
 
-
+All notable changes to this project are documented here. The format is based
+on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
+follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-
-- Add `date_conversion: true` to `Rbxl.open`: numeric cells whose style points at a date/time `numFmt` (built-in ids 14–22, 27–36, 45–47, 50–58, or a custom format code containing date tokens) are returned as `Date` or `Time`. Off by default — no change in output shape or throughput when the flag is absent.
-- Fix Ruby reader path so self-closing `<row/>` and `<c/>` elements are iterated instead of silently dropped, and never yield `nil` for a row.
+## [Unreleased]
 
-## 1.0
+## [1.2.0] - 2026-04-23
+
+### Changed
+
+- `WorkbookAlreadySavedError` message now points at the save-once design and
+  the next action (open a fresh `Rbxl.new` for another file) so callers who
+  trip on the constraint don't have to read the source to understand why.
+- Workbook- and worksheet-level parse failures raise `WorkbookFormatError` /
+  `WorksheetFormatError` with the workbook path and the XML entry or sheet
+  name in the message, replacing generic parser exceptions.
+
+### Added
+
+- Location-aware coverage around malformed workbook and worksheet XML so bad
+  inputs surface the specific entry that failed rather than bubbling up an
+  unlabelled `Nokogiri::XML::SyntaxError`.
+- README sections covering the write-only model (append-only, save-once,
+  no in-place edit), a "Reading recipes" walkthrough, and an explicit Out
+  of scope entry for read-modify-save workflows.
+
+### Fixed
+
+- Honor Excel's `date1904` workbook setting when `date_conversion: true` is
+  enabled, so Mac-originated workbooks map serial dates to the correct Ruby
+  `Date` and `Time` values.
+
+## [1.1.0] - 2026-04-21
+
+### Added
+
+- `date_conversion: true` option for `Rbxl.open`: numeric cells whose style
+  points at a date/time `numFmt` (built-in ids 14–22, 27–36, 45–47, 50–58,
+  or a custom format code containing date tokens) are returned as `Date`
+  or `Time`. Off by default — no change in output shape or throughput when
+  the flag is absent.
+
+### Changed
+
+- `Rbxl.open` and `Rbxl.new` now default `read_only: true` and
+  `write_only: true` respectively, so the call site no longer needs the
+  boilerplate. Explicitly passing `false` raises `NotImplementedError`.
+
+### Fixed
+
+- Ruby reader path now iterates self-closing `<row/>` and `<c/>` elements
+  instead of silently dropping them, and never yields `nil` for a row.
+
+## [1.0.2] - 2026-04-17
+
+### Added
+
+- `streaming: true` option for `Rbxl.open` feeds worksheet XML to the
+  native reader in 64 KiB chunks instead of buffering the full worksheet
+  first.
+- `Rbxl.max_worksheet_bytes` configuration and `Rbxl::WorksheetTooLargeError`
+  so streaming reads can stop oversized worksheet XML entries mid-inflate.
+
+### Changed
 
-- Add `streaming: true` to `Rbxl.open` to feed worksheet XML to the native reader in 64 KiB chunks instead of buffering the full worksheet first.
-- Add `Rbxl.max_worksheet_bytes` and `Rbxl::WorksheetTooLargeError` so streaming reads can stop oversized worksheet XML entries mid-inflate.
 - Expand RDoc coverage across the public API.
 - Tighten RBS signatures to match the actual runtime types.
-- Reword public docs and gem metadata to describe reads as row-by-row and
+- Reword public docs and gem metadata to describe reads as row-by-row and
+  writes as append-only, reserving "streaming" for the new opt-in native
+  read path.
+
+## [1.0.1] - 2026-04-16
+
+### Added
+
+- Go and Rust benchmark comparisons.
 
-
+### Fixed
 
--
--
-
+- ZIP64 handling.
+- Align `rbxl/native` with Nokogiri's libxml2 to avoid mixed-library
+  warnings at runtime.
 
-## 1.0.0
+## [1.0.0] - 2026-04-16
 
-- Initial
+- Initial public release.
data/README.md
CHANGED
@@ -21,12 +21,22 @@ Supported:
 
 Out of scope:
 
+- in-place editing of an existing `.xlsx` file — rbxl opens workbooks
+  read-only and generates new workbooks write-only, with no read-modify-save
+  path. If you need to open a file, tweak a handful of cells, and write it
+  back preserving everything else, use a full-object-model library instead.
 - preserving arbitrary workbook structure on save
 - rich style round-tripping
 - formulas, images, charts, comments
 
 ## Usage
 
+`Rbxl.open` defaults to read-only and `Rbxl.new` defaults to write-only;
+the `read_only:` / `write_only:` keywords remain for call-site clarity and
+to leave room for a future read/write mode.
+
+### Writing a new workbook
+
 ```ruby
 require "rbxl"
 
@@ -38,6 +48,23 @@ sheet.append([2, "bob", 95.5])
 book.save("report.xlsx")
 ```
 
+Write-only workbooks follow three rules:
+
+- **Append-only within a sheet.** `sheet.append(row)` is the only way to
+  add data. There is no random-access cell write, no mid-stream edit of a
+  previously appended row.
+- **Save-once per workbook.** `save` flushes the full `.xlsx` package in a
+  single pass and then closes the workbook. Calling `save` or `add_sheet`
+  again raises `Rbxl::WorkbookAlreadySavedError`. To produce another file,
+  start a new `Rbxl.new`.
+- **No read-modify-save.** rbxl cannot open an existing `.xlsx` and write
+  back to it (see Out of scope above).
+
+This is the tradeoff that keeps memory flat: rbxl buffers rows per sheet
+and never materializes a full workbook object graph.
+
+### Reading a workbook
+
 ```ruby
 require "rbxl"
 
@@ -53,11 +80,109 @@ p sheet.calculate_dimension
 book.close
 ```
 
-
-
-
-
-
+### Reading recipes
+
+**Plain value arrays (fastest path).** Use `values_only: true` when you
+only care about the cell values, not their coordinates. Rows come back as
+frozen `Array<Object>`:
+
+```ruby
+book.sheet("Data").each_row(values_only: true) do |values|
+  id, name, score = values
+  # ...
+end
+```
+
+**Cell objects with coordinates.** Default `each_row` yields a
+`Rbxl::Row` wrapping `Rbxl::ReadOnlyCell`s. Use this when you need the
+Excel coordinate alongside the value:
+
+```ruby
+book.sheet("Data").each_row do |row|
+  row.index          # => 2 (1-based worksheet row number)
+  row[0].coordinate  # => "A2"
+  row[0].value       # => "alice"
+  row.values         # => ["alice", 100, true]
+end
+```
+
+**Skip the header row.** `each_row` without a block returns an
+`Enumerator`, so chain `drop`:
+
+```ruby
+book.sheet("Data").each_row(values_only: true).drop(1).each do |row|
+  # ...
+end
+```
+
+**Peek at the first N rows.** `rows(...)` is an enumerator-returning
+alias that composes well with `take`, `first`, `lazy`, etc.:
+
+```ruby
+book.sheet("Data").rows(values_only: true).first(5)
+```
+
+**Know the data range up-front.** When the workbook has a stored
+dimension, these are O(1) lookups; otherwise pass `force: true` to scan:
+
+```ruby
+sheet = book.sheet("Data")
+sheet.max_row             # => 500
+sheet.max_column          # => 12
+sheet.calculate_dimension # => "A1:L500"
+```
+
+**Pad sparse rows to the sheet width.** Without `pad_cells`, a row
+containing only `A1` and `C1` yields two cells. With `pad_cells: true`,
+missing cells are filled with `Rbxl::EmptyCell` (or `nil` in values-only
+mode), aligned to `max_column`:
+
+```ruby
+book.sheet("Sparse").each_row(pad_cells: true, values_only: true).first
+# => ["left", nil, "right"]
+```
+
+**Expand merged cells.** Excel leaves the anchor cell populated and the
+rest of the merge range empty. Pass `expand_merged: true` to propagate
+the anchor value across the full range; combine with `pad_cells: true`
+when you want the result aligned to the sheet's width:
+
+```ruby
+sheet = book.sheet("Merged")
+
+sheet.rows(values_only: true).to_a
+# => [["group", "solo"], ["tail"]]
+
+sheet.rows(values_only: true, pad_cells: true, expand_merged: true).to_a
+# => [["group", "group", "solo", nil],
+#     ["group", "group", "solo", "tail"]]
+```
+
+**List sheets before opening any.** Sheet XML is only read on first
+iteration; enumerating names is cheap:
+
+```ruby
+book.sheet_names # => ["Summary", "Detail", "Raw"]
+book.sheet("Detail").each_row(values_only: true) { |row| ... }
+```
+
+**Locate a bad input.** All rbxl exceptions inherit from `Rbxl::Error`
+and the messages carry the workbook path and (where relevant) the sheet
+name, XML entry, or cell coordinate. Rescue at the sheet level:
+
+```ruby
+begin
+  book.sheet("Raw").each_row(values_only: true) { |row| ... }
+rescue Rbxl::WorksheetFormatError, Rbxl::WorkbookFormatError => e
+  warn e.message # includes workbook path and sheet/entry
+rescue Rbxl::CellValueError => e
+  warn e.message # includes workbook path, sheet, and coordinate
+end
+```
+
+`Rbxl::CellValueError` is raised by the cell decoder when
+`date_conversion: true` is active. The reader is forward-only, so rescue
+terminates iteration rather than skipping to the next row.
 
 ### Date / time conversion
 
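The `pad_cells` recipe in the README hunk above can be pictured as placing each cell at the slot its column letter names and leaving the rest `nil`. The sketch below is only an illustration of that idea in plain Ruby. `column_index` and `pad_row` are hypothetical helpers, not rbxl methods, and the gem's own implementation may differ.

```ruby
# Hypothetical helpers illustrating what padding a sparse row means;
# they are not part of rbxl.

# Converts a column label such as "A" or "AB" to a zero-based index.
def column_index(letters)
  letters.each_char.reduce(0) { |acc, ch| acc * 26 + (ch.ord - "A".ord + 1) } - 1
end

# Builds a values array of `width` slots from { coordinate => value } pairs.
def pad_row(cells, width)
  row = Array.new(width)
  cells.each do |coordinate, value|
    letters = coordinate[/\A[A-Z]+/]
    row[column_index(letters)] = value
  end
  row
end

p column_index("A")   # => 0
p column_index("AB")  # => 27
p pad_row({ "A1" => "left", "C1" => "right" }, 3)  # => ["left", nil, "right"]
```

Keying on the coordinate rather than on arrival order is what keeps the padded row aligned when the sheet omits empty cells.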
data/lib/rbxl/errors.rb
CHANGED
@@ -33,4 +33,17 @@ module Rbxl
   # bytes consumed from the ZIP entry, so high-compression zip-bomb style
   # worksheets are stopped mid-inflate rather than after the fact.
   class WorksheetTooLargeError < Error; end
+
+  # Raised when workbook-level XML is malformed or internally inconsistent,
+  # for example when +xl/workbook.xml+ cannot be parsed or references a
+  # missing relationship target.
+  class WorkbookFormatError < Error; end
+
+  # Raised when a worksheet XML entry cannot be parsed into rows.
+  class WorksheetFormatError < Error; end
+
+  # Raised when a specific cell cannot be decoded. The message includes the
+  # workbook path, sheet name, and cell coordinate to make bad inputs easy
+  # to locate.
+  class CellValueError < WorksheetFormatError; end
 end
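Because `CellValueError` subclasses `WorksheetFormatError` in the hunk above, a `rescue` clause for the parent also catches the child. If the two need different handling, the more specific clause has to be listed first. A minimal sketch with stand-in classes under a hypothetical `Demo` module (not the gem's own constants) shows the ordering:

```ruby
# Stand-in hierarchy mirroring the classes added in errors.rb.
# `Demo` and `classify` are hypothetical names used only for illustration.
module Demo
  class Error < StandardError; end
  class WorkbookFormatError < Error; end
  class WorksheetFormatError < Error; end
  class CellValueError < WorksheetFormatError; end

  # Returns a symbol naming which rescue clause handled the error.
  def self.classify(error_class)
    raise error_class, "boom"
  rescue CellValueError
    :cell            # most specific clause first
  rescue WorksheetFormatError, WorkbookFormatError
    :format
  rescue Error
    :other
  end
end

p Demo.classify(Demo::CellValueError)        # => :cell
p Demo.classify(Demo::WorksheetFormatError)  # => :format
p Demo.classify(Demo::Error)                 # => :other
```

With the clauses in the opposite order, the `WorksheetFormatError` clause would handle cell errors too and the cell-specific branch would never run.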
data/lib/rbxl/read_only_workbook.rb
CHANGED
@@ -65,6 +65,7 @@ module Rbxl
       @sheet_entries = load_sheet_entries
       @sheet_names = @sheet_entries.keys.freeze
       @date_styles = nil
+      @date_1904 = nil
       @closed = false
     end
 
@@ -87,10 +88,12 @@ module Rbxl
       ReadOnlyWorksheet.new(
         zip: @zip,
         entry_path: entry_path,
+        workbook_path: @path,
         shared_strings: @shared_strings,
         name: name,
         streaming: @streaming,
-        date_styles: date_styles
+        date_styles: date_styles,
+        date_1904: date_1904?
       )
     end
 
@@ -131,6 +134,13 @@ module Rbxl
       @date_styles ||= load_date_styles
     end
 
+    def date_1904?
+      return false unless @date_conversion
+
+      @date_1904 = load_date_1904 if @date_1904.nil?
+      @date_1904
+    end
+
     def load_date_styles
       entry = @zip.find_entry("xl/styles.xml")
       return [].freeze unless entry
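The `date_1904?` method above caches its result with an explicit `nil?` check rather than `||=`. One plausible reason is that `false` is a legitimate cached answer, and `||=` would rerun the lookup every time the cached value is falsy. The sketch below demonstrates that difference with a hypothetical `FlagCache` class that counts how often the lookup runs. None of these names come from rbxl.

```ruby
# Hypothetical class showing why `if @cached.nil?` differs from `||=`
# when the cached value may legitimately be false.
class FlagCache
  attr_reader :lookups

  def initialize(value)
    @value = value    # what the (pretend) expensive lookup returns
    @lookups = 0      # how many times the lookup actually ran
    @cached = nil
  end

  # Memoizes with a nil check, so a cached false is not recomputed.
  def nil_checked
    @cached = lookup if @cached.nil?
    @cached
  end

  # Memoizes with ||=, which recomputes whenever the cached value is falsy.
  def or_assigned
    @cached ||= lookup
  end

  private

  def lookup
    @lookups += 1
    @value
  end
end

a = FlagCache.new(false)
3.times { a.nil_checked }
p a.lookups  # => 1

b = FlagCache.new(false)
3.times { b.or_assigned }
p b.lookups  # => 3
```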
@@ -268,13 +278,27 @@ module Rbxl
         rid = node.attribute("r:id")
         next unless name && rid
 
-        target = relationships.fetch(rid)
+        target = relationships.fetch(rid) do
+          raise WorkbookFormatError,
+                "workbook #{@path} references missing relationship #{rid.inspect} for sheet #{name.inspect}"
+        end
         sheets[name] = "xl/#{target}".gsub(%r{/+}, "/")
       end
 
       sheets
     end
 
+    def load_date_1904
+      each_xml_node("xl/workbook.xml") do |node|
+        next unless node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
+        next unless node.local_name == "workbookPr"
+
+        return xml_truthy?(node.attribute("date1904"))
+      end
+
+      false
+    end
+
     def load_relationship_targets(entry_path)
       relationships = {}
 
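The change to `relationships.fetch(rid)` relies on `Hash#fetch` accepting a block that runs only when the key is missing, which is a convenient place to raise a domain-specific error in place of the default `KeyError`. A small standalone sketch of the pattern follows. `MissingRelationship` and `resolve_target` are hypothetical names chosen for this example.

```ruby
# Hypothetical error class and helper; the point is Hash#fetch with a block.
class MissingRelationship < StandardError; end

def resolve_target(relationships, rid, sheet_name)
  relationships.fetch(rid) do
    # Runs only when `rid` is absent from the hash.
    raise MissingRelationship,
          "missing relationship #{rid.inspect} for sheet #{sheet_name.inspect}"
  end
end

rels = { "rId1" => "worksheets/sheet1.xml" }
p resolve_target(rels, "rId1", "Data")  # => "worksheets/sheet1.xml"

begin
  resolve_target(rels, "rId9", "Raw")
rescue MissingRelationship => e
  puts e.message  # missing relationship "rId9" for sheet "Raw"
end
```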
@@ -293,11 +317,20 @@ module Rbxl
     end
 
     def each_xml_node(entry_path)
-
+      entry = @zip.get_entry(entry_path)
+      raise WorkbookFormatError, "workbook #{@path} is missing required entry #{entry_path.inspect}" unless entry
+
+      io = entry.get_input_stream
       reader = Nokogiri::XML::Reader(io)
       reader.each { |node| yield node }
+    rescue Nokogiri::XML::SyntaxError => e
+      raise WorkbookFormatError, "invalid workbook XML in #{@path} at #{entry_path}: #{e.message}"
     ensure
       io&.close
     end
+
+    def xml_truthy?(value)
+      value == "1" || value == "true"
+    end
   end
 end
data/lib/rbxl/read_only_worksheet.rb
CHANGED
@@ -50,6 +50,7 @@ module Rbxl
 
     # @param zip [Zip::File] open archive shared with the workbook
     # @param entry_path [String] ZIP entry path for this sheet's XML
+    # @param workbook_path [String] filesystem path the workbook was opened from
     # @param shared_strings [Array<String>] pre-decoded shared strings table
     # @param name [String] visible sheet name
     # @param streaming [Boolean] when the native extension is loaded, feed
@@ -59,13 +60,17 @@ module Rbxl
     # id's numFmt is a date/time format. When provided, numeric cells with
     # a matching style are returned as +Date+ or +Time+ instead of +Float+,
     # and the native fast path is bypassed.
-
+    # @param date_1904 [Boolean] whether the workbook uses Excel's 1904 date
+    # system instead of the default 1900 date system
+    def initialize(zip:, entry_path:, workbook_path:, shared_strings:, name:, streaming: false, date_styles: nil, date_1904: false)
       @zip = zip
       @entry_path = entry_path
+      @workbook_path = workbook_path
       @shared_strings = shared_strings
       @name = name
       @streaming = streaming
       @date_styles = date_styles
+      @date_1904 = date_1904
       @disable_native = !date_styles.nil?
       @dimensions = extract_dimensions
       @merge_ranges_by_row = nil
@@ -171,6 +176,7 @@ module Rbxl
 
       cell_type = nil
       cell_style = nil
+      cell_ref = nil
       collecting_value = false
       in_v = false
       raw_value = nil
@@ -178,6 +184,7 @@ module Rbxl
       current_values = nil
       row_depth = nil
       track_style = !@date_styles.nil?
+      wrap_cell_errors = track_style
 
       with_sheet_reader do |reader|
         reader.each do |node|
@@ -192,13 +199,20 @@ module Rbxl
                 current_values = nil
               end
             when "c"
+              cell_ref = node.attribute("r")
               cell_type = node.attribute("t")
               cell_style = track_style ? node.attribute("s")&.to_i : nil
               raw_value = nil
               if current_values && node.self_closing?
-
+                value = if wrap_cell_errors
+                  coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
+                else
+                  coerce_value(raw_value, cell_type, cell_style)
+                end
+                current_values << value
                 cell_type = nil
                 cell_style = nil
+                cell_ref = nil
               end
             when "v"
               collecting_value = true
@@ -224,9 +238,15 @@ module Rbxl
               yield current_values.freeze
               current_values = nil
             elsif current_values && node.depth == row_depth + 1
-
+              value = if wrap_cell_errors
+                coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
+              else
+                coerce_value(raw_value, cell_type, cell_style)
+              end
+              current_values << value
               cell_type = nil
               cell_style = nil
+              cell_ref = nil
               raw_value = nil
             end
           end
@@ -258,6 +278,7 @@ module Rbxl
       value_buffer = +""
       row_depth = nil
       track_style = !@date_styles.nil?
+      wrap_cell_errors = track_style
 
       with_sheet_reader do |reader|
         reader.each do |node|
@@ -289,7 +310,12 @@ module Rbxl
               cell_style = track_style ? node.attribute("s")&.to_i : nil
               raw_value = nil
               if current_cells && node.self_closing?
-
+                value = if wrap_cell_errors
+                  coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
+                else
+                  coerce_value(raw_value, cell_type, cell_style)
+                end
+                current_cells << build_row_entry(cell_ref, value, values_only)
                 cell_ref = nil
                 cell_type = nil
                 cell_style = nil
@@ -322,7 +348,12 @@ module Rbxl
               current_row_index = nil
               current_cells = nil
             elsif current_cells && node.depth == row_depth + 1
-
+              value = if wrap_cell_errors
+                coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
+              else
+                coerce_value(raw_value, cell_type, cell_style)
+              end
+              current_cells << build_row_entry(cell_ref, value, values_only)
               cell_ref = nil
               cell_type = nil
               cell_style = nil
@@ -340,9 +371,14 @@ module Rbxl
     end
 
     def with_sheet_reader
-
+      entry = @zip.get_entry(@entry_path)
+      raise WorksheetFormatError, "worksheet #{@name.inspect} is missing XML entry #{@entry_path.inspect} in #{@workbook_path}" unless entry
+
+      io = entry.get_input_stream
       reader = Nokogiri::XML::Reader(io)
       yield reader
+    rescue Nokogiri::XML::SyntaxError => e
+      raise WorksheetFormatError, "invalid worksheet XML for sheet #{@name.inspect} in #{@workbook_path}: #{e.message}"
     ensure
       io&.close
     end
@@ -352,7 +388,10 @@ module Rbxl
       max_bytes = Rbxl.max_worksheet_bytes
       Rbxl::Native.public_send(method_name, io, @shared_strings, max_bytes, &block)
     rescue RuntimeError => e
-
+      if e.message&.include?("worksheet bytes exceed limit")
+        raise WorksheetTooLargeError,
+              "worksheet #{@name.inspect} in #{@workbook_path}: #{e.message}"
+      end
 
       raise
     ensure
@@ -586,6 +625,13 @@ module Rbxl
       end
     end
 
+    def coerce_cell_value(raw_value, type, style_id, coordinate)
+      coerce_value(raw_value, type, style_id)
+    rescue StandardError => e
+      raise CellValueError,
+            "failed to decode cell #{coordinate || '(unknown coordinate)'} on sheet #{@name.inspect} in #{@workbook_path}: #{e.message}"
+    end
+
     # Excel's serial date counts days from 1899-12-31 as serial 1, with a
     # documented leap-year bug for the non-existent 1900-02-29 (serial 60)
     # — for serials >= 60 the day-count is shifted back by one so that
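`coerce_cell_value` above follows a common Ruby pattern: rescue a low-level failure and re-raise a domain error whose message carries the location. When an exception is raised inside a `rescue` clause, Ruby records the original one in `cause`, so the underlying error stays available for debugging. The following is a rough sketch of that pattern. `CellDecodeError` and `decode_number` are hypothetical names, and `Float()` stands in for whatever decoding the gem performs.

```ruby
# Hypothetical names throughout; the technique is re-raising with context
# while Ruby keeps the original exception reachable through `cause`.
class CellDecodeError < StandardError; end

def decode_number(raw, coordinate:, sheet:)
  Float(raw)
rescue ArgumentError, TypeError => e
  raise CellDecodeError,
        "failed to decode cell #{coordinate || '(unknown coordinate)'} on sheet #{sheet.inspect}: #{e.message}"
end

p decode_number("1.5", coordinate: "A1", sheet: "Data")  # => 1.5

begin
  decode_number("abc", coordinate: "B7", sheet: "Data")
rescue CellDecodeError => e
  puts e.message
  p e.cause.class  # => ArgumentError
end
```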
@@ -594,9 +640,15 @@ module Rbxl
     # +Time+ so that both date and time-of-day survive the conversion.
     def excel_serial_to_ruby(serial)
       whole = serial.to_i
-      whole -= 1 if whole >= 60
       frac = serial - serial.to_i
-
+
+      base =
+        if @date_1904
+          Date.new(1904, 1, 1) + whole
+        else
+          whole -= 1 if whole >= 60
+          Date.new(1899, 12, 31) + whole
+        end
 
       return base if frac.zero?
 
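The whole-day part of the conversion in the hunk above can be tried outside the gem. The sketch below reproduces only that branch, with the epoch choice passed as a keyword instead of read from an instance variable. `serial_to_date` is a hypothetical helper, not part of rbxl's public API, and the fractional time-of-day handling is left out.

```ruby
require "date"

# A sketch of the whole-day part of the conversion shown in the hunk above.
# `serial_to_date` is a hypothetical helper used only for illustration.
def serial_to_date(serial, date_1904: false)
  whole = serial.to_i
  if date_1904
    # 1904 system: serial 0 is 1904-01-01.
    Date.new(1904, 1, 1) + whole
  else
    # 1900 system: serial 60 is the non-existent 1900-02-29, so serials
    # from 60 onward are shifted back by one day.
    whole -= 1 if whole >= 60
    Date.new(1899, 12, 31) + whole
  end
end

puts serial_to_date(1)                    # 1900-01-01
puts serial_to_date(61)                   # 1900-03-01
puts serial_to_date(0, date_1904: true)   # 1904-01-01
# The same calendar day is 1462 serials apart in the two systems.
puts serial_to_date(45_000) == serial_to_date(45_000 - 1462, date_1904: true)  # true
```

The 1462-day offset is why reading a 1904-system workbook with the 1900 base lands about four years early.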
data/lib/rbxl/version.rb
CHANGED
data/lib/rbxl/write_only_workbook.rb
CHANGED
@@ -96,7 +96,7 @@ module Rbxl
     private
 
     def ensure_writable!
-      raise WorkbookAlreadySavedError, "write-only workbook can only be saved once" if @saved
+      raise WorkbookAlreadySavedError, "write-only workbook can only be saved once by design; call Rbxl.new to build another workbook" if @saved
       raise ClosedWorkbookError, "workbook has been closed" if closed?
     end
 
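The guard in `ensure_writable!` enforces the save-once rule described in the README changes. As a rough picture of how such a contract behaves from the caller's side, here is a self-contained stand-in. `OneShotBook` and `AlreadySavedError` are hypothetical, and the "save" simply returns comma-joined text where the real gem writes an `.xlsx` package.

```ruby
# Hypothetical stand-in for a save-once, append-only writer;
# this is not rbxl's implementation.
class AlreadySavedError < StandardError; end

class OneShotBook
  def initialize
    @rows = []
    @saved = false
  end

  # Appending is the only way to add data; returns self for chaining.
  def append(row)
    ensure_writable!
    @rows << row.dup.freeze
    self
  end

  # Serializes the rows once (here: comma-joined text) and locks the book.
  def save
    ensure_writable!
    @saved = true
    @rows.map { |r| r.join(",") }.join("\n")
  end

  private

  def ensure_writable!
    raise AlreadySavedError, "already saved; build a new OneShotBook" if @saved
  end
end

book = OneShotBook.new
book.append([1, "alice"]).append([2, "bob"])
puts book.save

begin
  book.save
rescue AlreadySavedError => e
  puts e.message
end
```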
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: rbxl
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.2.0
 platform: ruby
 authors:
 - Taro KOBAYASHI
@@ -43,8 +43,8 @@ dependencies:
     - - "<"
       - !ruby/object:Gem::Version
         version: '2.0'
-description: rbxl is a Ruby gem for
-  XLSX
+description: rbxl is a fast, low-memory Ruby gem for row-by-row XLSX reads and append-only
+  XLSX writes, with an optional native extension for higher-throughput XML parsing.
 email:
 - taro@matzlika.co.jp
 executables: []
@@ -96,6 +96,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubygems_version: 4.0.3
 specification_version: 4
-summary:
-  writes.
+summary: Fast, low-memory XLSX processing for Ruby.
 test_files: []