rbxl 1.0.2 → 1.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +79 -10
- data/README.md +166 -10
- data/lib/rbxl/errors.rb +13 -0
- data/lib/rbxl/read_only_workbook.rb +113 -7
- data/lib/rbxl/read_only_worksheet.rb +128 -12
- data/lib/rbxl/version.rb +1 -1
- data/lib/rbxl/write_only_workbook.rb +1 -1
- data/lib/rbxl.rb +32 -16
- data/sig/rbxl.rbs +5 -5
- metadata +4 -5
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: b7d99201ddbfd10ac1f5173052e0ef0d0bfea0e7e0143bc5e214d28d5cbea335
|
|
4
|
+
data.tar.gz: 513ec07aea3c8888bafd1b60c20f6e508e6ce87a2380c5dbcb536523b09ceab3
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 1dd2f6856dd7c9452d63f132e52f4336958a8bda63e304b353766ba573ed429b196c76dfa067468dcf0d85f5926de5b002f8f498b4907486fe717096ab20dbb2
|
|
7
|
+
data.tar.gz: 298fc80d0760d5468a7b2c95ae32751eb5c0e6070cd546d77d806ef7aabb057674f98a371d0c6c0320fd9784d89633e9fb278d03bdec4219426d89997a5540cb
|
data/CHANGELOG.md
CHANGED
|
@@ -1,19 +1,88 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
All notable changes to this project are documented here. The format is based
|
|
4
|
+
on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
|
|
5
|
+
follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
6
|
+
|
|
7
|
+
## [Unreleased]
|
|
8
|
+
|
|
9
|
+
## [1.2.0] - 2026-04-23
|
|
10
|
+
|
|
11
|
+
### Changed
|
|
12
|
+
|
|
13
|
+
- `WorkbookAlreadySavedError` message now points at the save-once design and
|
|
14
|
+
the next action (open a fresh `Rbxl.new` for another file) so callers who
|
|
15
|
+
trip on the constraint don't have to read the source to understand why.
|
|
16
|
+
- Workbook- and worksheet-level parse failures raise `WorkbookFormatError` /
|
|
17
|
+
`WorksheetFormatError` with the workbook path and the XML entry or sheet
|
|
18
|
+
name in the message, replacing generic parser exceptions.
|
|
19
|
+
|
|
20
|
+
### Added
|
|
21
|
+
|
|
22
|
+
- Location-aware coverage around malformed workbook and worksheet XML so bad
|
|
23
|
+
inputs surface the specific entry that failed rather than bubbling up an
|
|
24
|
+
unlabelled `Nokogiri::XML::SyntaxError`.
|
|
25
|
+
- README sections covering the write-only model (append-only, save-once,
|
|
26
|
+
no in-place edit), a "Reading recipes" walkthrough, and an explicit Out
|
|
27
|
+
of scope entry for read-modify-save workflows.
|
|
28
|
+
|
|
29
|
+
### Fixed
|
|
30
|
+
|
|
31
|
+
- Honor Excel's `date1904` workbook setting when `date_conversion: true` is
|
|
32
|
+
enabled, so Mac-originated workbooks map serial dates to the correct Ruby
|
|
33
|
+
`Date` and `Time` values.
|
|
34
|
+
|
|
35
|
+
## [1.1.0] - 2026-04-21
|
|
36
|
+
|
|
37
|
+
### Added
|
|
38
|
+
|
|
39
|
+
- `date_conversion: true` option for `Rbxl.open`: numeric cells whose style
|
|
40
|
+
points at a date/time `numFmt` (built-in ids 14–22, 27–36, 45–47, 50–58,
|
|
41
|
+
or a custom format code containing date tokens) are returned as `Date`
|
|
42
|
+
or `Time`. Off by default — no change in output shape or throughput when
|
|
43
|
+
the flag is absent.
|
|
44
|
+
|
|
45
|
+
### Changed
|
|
46
|
+
|
|
47
|
+
- `Rbxl.open` and `Rbxl.new` now default `read_only: true` and
|
|
48
|
+
`write_only: true` respectively, so the call site no longer needs the
|
|
49
|
+
boilerplate. Explicitly passing `false` raises `NotImplementedError`.
|
|
50
|
+
|
|
51
|
+
### Fixed
|
|
52
|
+
|
|
53
|
+
- Ruby reader path now iterates self-closing `<row/>` and `<c/>` elements
|
|
54
|
+
instead of silently dropping them, and never yields `nil` for a row.
|
|
55
|
+
|
|
56
|
+
## [1.0.2] - 2026-04-17
|
|
57
|
+
|
|
58
|
+
### Added
|
|
59
|
+
|
|
60
|
+
- `streaming: true` option for `Rbxl.open` feeds worksheet XML to the
|
|
61
|
+
native reader in 64 KiB chunks instead of buffering the full worksheet
|
|
62
|
+
first.
|
|
63
|
+
- `Rbxl.max_worksheet_bytes` configuration and `Rbxl::WorksheetTooLargeError`
|
|
64
|
+
so streaming reads can stop oversized worksheet XML entries mid-inflate.
|
|
65
|
+
|
|
66
|
+
### Changed
|
|
4
67
|
|
|
5
|
-
- Add `streaming: true` to `Rbxl.open` to feed worksheet XML to the native reader in 64 KiB chunks instead of buffering the full worksheet first.
|
|
6
|
-
- Add `Rbxl.max_worksheet_bytes` and `Rbxl::WorksheetTooLargeError` so streaming reads can stop oversized worksheet XML entries mid-inflate.
|
|
7
68
|
- Expand RDoc coverage across the public API.
|
|
8
69
|
- Tighten RBS signatures to match the actual runtime types.
|
|
9
|
-
- Reword public docs and gem metadata to describe reads as row-by-row and
|
|
70
|
+
- Reword public docs and gem metadata to describe reads as row-by-row and
|
|
71
|
+
writes as append-only, reserving "streaming" for the new opt-in native
|
|
72
|
+
read path.
|
|
73
|
+
|
|
74
|
+
## [1.0.1] - 2026-04-16
|
|
75
|
+
|
|
76
|
+
### Added
|
|
77
|
+
|
|
78
|
+
- Go and Rust benchmark comparisons.
|
|
10
79
|
|
|
11
|
-
|
|
80
|
+
### Fixed
|
|
12
81
|
|
|
13
|
-
-
|
|
14
|
-
-
|
|
15
|
-
|
|
82
|
+
- ZIP64 handling.
|
|
83
|
+
- Align `rbxl/native` with Nokogiri's libxml2 to avoid mixed-library
|
|
84
|
+
warnings at runtime.
|
|
16
85
|
|
|
17
|
-
## 1.0.0
|
|
86
|
+
## [1.0.0] - 2026-04-16
|
|
18
87
|
|
|
19
|
-
- Initial
|
|
88
|
+
- Initial public release.
|
data/README.md
CHANGED
|
@@ -1,5 +1,7 @@
|
|
|
1
1
|
# rbxl
|
|
2
2
|
|
|
3
|
+
[](https://badge.fury.io/rb/rbxl)
|
|
4
|
+
|
|
3
5
|
Fast, memory-friendly Ruby gem for row-by-row `.xlsx` reads and append-only writes.
|
|
4
6
|
|
|
5
7
|
`rbxl` is built for the two workbook workflows that scale cleanly:
|
|
@@ -10,26 +12,35 @@ Fast, memory-friendly Ruby gem for row-by-row `.xlsx` reads and append-only writ
|
|
|
10
12
|
The API is intentionally small and `openpyxl`-inspired, with an optional
|
|
11
13
|
native extension for faster XML parsing when you need more throughput.
|
|
12
14
|
|
|
13
|
-
|
|
15
|
+
Supported:
|
|
14
16
|
|
|
15
|
-
-
|
|
16
|
-
-
|
|
17
|
-
- `
|
|
18
|
-
- minimal `openpyxl`-like API
|
|
17
|
+
- write-only workbook generation
|
|
18
|
+
- read-only row-by-row iteration
|
|
19
|
+
- opt-in date/time conversion driven by the workbook's `numFmt` styles
|
|
19
20
|
- optional C extension (`rbxl/native`) for maximum performance
|
|
20
21
|
|
|
21
|
-
Out of scope
|
|
22
|
+
Out of scope:
|
|
22
23
|
|
|
24
|
+
- in-place editing of an existing `.xlsx` file — rbxl opens workbooks
|
|
25
|
+
read-only and generates new workbooks write-only, with no read-modify-save
|
|
26
|
+
path. If you need to open a file, tweak a handful of cells, and write it
|
|
27
|
+
back preserving everything else, use a full-object-model library instead.
|
|
23
28
|
- preserving arbitrary workbook structure on save
|
|
24
29
|
- rich style round-tripping
|
|
25
30
|
- formulas, images, charts, comments
|
|
26
31
|
|
|
27
32
|
## Usage
|
|
28
33
|
|
|
34
|
+
`Rbxl.open` defaults to read-only and `Rbxl.new` defaults to write-only;
|
|
35
|
+
the `read_only:` / `write_only:` keywords remain for call-site clarity and
|
|
36
|
+
to leave room for a future read/write mode.
|
|
37
|
+
|
|
38
|
+
### Writing a new workbook
|
|
39
|
+
|
|
29
40
|
```ruby
|
|
30
41
|
require "rbxl"
|
|
31
42
|
|
|
32
|
-
book = Rbxl.new
|
|
43
|
+
book = Rbxl.new
|
|
33
44
|
sheet = book.add_sheet("Report")
|
|
34
45
|
sheet.append(["id", "name", "score"])
|
|
35
46
|
sheet.append([1, "alice", 100])
|
|
@@ -37,10 +48,27 @@ sheet.append([2, "bob", 95.5])
|
|
|
37
48
|
book.save("report.xlsx")
|
|
38
49
|
```
|
|
39
50
|
|
|
51
|
+
Write-only workbooks follow three rules:
|
|
52
|
+
|
|
53
|
+
- **Append-only within a sheet.** `sheet.append(row)` is the only way to
|
|
54
|
+
add data. There is no random-access cell write, no mid-stream edit of a
|
|
55
|
+
previously appended row.
|
|
56
|
+
- **Save-once per workbook.** `save` flushes the full `.xlsx` package in a
|
|
57
|
+
single pass and then closes the workbook. Calling `save` or `add_sheet`
|
|
58
|
+
again raises `Rbxl::WorkbookAlreadySavedError`. To produce another file,
|
|
59
|
+
start a new `Rbxl.new`.
|
|
60
|
+
- **No read-modify-save.** rbxl cannot open an existing `.xlsx` and write
|
|
61
|
+
back to it (see Out of scope above).
|
|
62
|
+
|
|
63
|
+
This is the tradeoff that keeps memory flat: rbxl buffers rows per sheet
|
|
64
|
+
and never materializes a full workbook object graph.
|
|
65
|
+
|
|
66
|
+
### Reading a workbook
|
|
67
|
+
|
|
40
68
|
```ruby
|
|
41
69
|
require "rbxl"
|
|
42
70
|
|
|
43
|
-
book = Rbxl.open("report.xlsx"
|
|
71
|
+
book = Rbxl.open("report.xlsx")
|
|
44
72
|
sheet = book.sheet("Report")
|
|
45
73
|
|
|
46
74
|
sheet.each_row do |row|
|
|
@@ -52,8 +80,136 @@ p sheet.calculate_dimension
|
|
|
52
80
|
book.close
|
|
53
81
|
```
|
|
54
82
|
|
|
55
|
-
|
|
56
|
-
|
|
83
|
+
### Reading recipes
|
|
84
|
+
|
|
85
|
+
**Plain value arrays (fastest path).** Use `values_only: true` when you
|
|
86
|
+
only care about the cell values, not their coordinates. Rows come back as
|
|
87
|
+
frozen `Array<Object>`:
|
|
88
|
+
|
|
89
|
+
```ruby
|
|
90
|
+
book.sheet("Data").each_row(values_only: true) do |values|
|
|
91
|
+
id, name, score = values
|
|
92
|
+
# ...
|
|
93
|
+
end
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
**Cell objects with coordinates.** Default `each_row` yields a
|
|
97
|
+
`Rbxl::Row` wrapping `Rbxl::ReadOnlyCell`s. Use this when you need the
|
|
98
|
+
Excel coordinate alongside the value:
|
|
99
|
+
|
|
100
|
+
```ruby
|
|
101
|
+
book.sheet("Data").each_row do |row|
|
|
102
|
+
row.index # => 2 (1-based worksheet row number)
|
|
103
|
+
row[0].coordinate # => "A2"
|
|
104
|
+
row[0].value # => "alice"
|
|
105
|
+
row.values # => ["alice", 100, true]
|
|
106
|
+
end
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
**Skip the header row.** `each_row` without a block returns an
|
|
110
|
+
`Enumerator`, so chain `drop`:
|
|
111
|
+
|
|
112
|
+
```ruby
|
|
113
|
+
book.sheet("Data").each_row(values_only: true).drop(1).each do |row|
|
|
114
|
+
# ...
|
|
115
|
+
end
|
|
116
|
+
```
|
|
117
|
+
|
|
118
|
+
**Peek at the first N rows.** `rows(...)` is an enumerator-returning
|
|
119
|
+
alias that composes well with `take`, `first`, `lazy`, etc.:
|
|
120
|
+
|
|
121
|
+
```ruby
|
|
122
|
+
book.sheet("Data").rows(values_only: true).first(5)
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
**Know the data range up-front.** When the workbook has a stored
|
|
126
|
+
dimension, these are O(1) lookups; otherwise pass `force: true` to scan:
|
|
127
|
+
|
|
128
|
+
```ruby
|
|
129
|
+
sheet = book.sheet("Data")
|
|
130
|
+
sheet.max_row # => 500
|
|
131
|
+
sheet.max_column # => 12
|
|
132
|
+
sheet.calculate_dimension # => "A1:L500"
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
**Pad sparse rows to the sheet width.** Without `pad_cells`, a row
|
|
136
|
+
containing only `A1` and `C1` yields two cells. With `pad_cells: true`,
|
|
137
|
+
missing cells are filled with `Rbxl::EmptyCell` (or `nil` in values-only
|
|
138
|
+
mode), aligned to `max_column`:
|
|
139
|
+
|
|
140
|
+
```ruby
|
|
141
|
+
book.sheet("Sparse").each_row(pad_cells: true, values_only: true).first
|
|
142
|
+
# => ["left", nil, "right"]
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
**Expand merged cells.** Excel leaves the anchor cell populated and the
|
|
146
|
+
rest of the merge range empty. Pass `expand_merged: true` to propagate
|
|
147
|
+
the anchor value across the full range; combine with `pad_cells: true`
|
|
148
|
+
when you want the result aligned to the sheet's width:
|
|
149
|
+
|
|
150
|
+
```ruby
|
|
151
|
+
sheet = book.sheet("Merged")
|
|
152
|
+
|
|
153
|
+
sheet.rows(values_only: true).to_a
|
|
154
|
+
# => [["group", "solo"], ["tail"]]
|
|
155
|
+
|
|
156
|
+
sheet.rows(values_only: true, pad_cells: true, expand_merged: true).to_a
|
|
157
|
+
# => [["group", "group", "solo", nil],
|
|
158
|
+
# ["group", "group", "solo", "tail"]]
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
**List sheets before opening any.** Sheet XML is only read on first
|
|
162
|
+
iteration; enumerating names is cheap:
|
|
163
|
+
|
|
164
|
+
```ruby
|
|
165
|
+
book.sheet_names # => ["Summary", "Detail", "Raw"]
|
|
166
|
+
book.sheet("Detail").each_row(values_only: true) { |row| ... }
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
**Locate a bad input.** All rbxl exceptions inherit from `Rbxl::Error`
|
|
170
|
+
and the messages carry the workbook path and (where relevant) the sheet
|
|
171
|
+
name, XML entry, or cell coordinate. Rescue at the sheet level:
|
|
172
|
+
|
|
173
|
+
```ruby
|
|
174
|
+
begin
|
|
175
|
+
book.sheet("Raw").each_row(values_only: true) { |row| ... }
|
|
176
|
+
rescue Rbxl::WorksheetFormatError, Rbxl::WorkbookFormatError => e
|
|
177
|
+
warn e.message # includes workbook path and sheet/entry
|
|
178
|
+
rescue Rbxl::CellValueError => e
|
|
179
|
+
warn e.message # includes workbook path, sheet, and coordinate
|
|
180
|
+
end
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
`Rbxl::CellValueError` is raised by the cell decoder when
|
|
184
|
+
`date_conversion: true` is active. The reader is forward-only, so rescue
|
|
185
|
+
terminates iteration rather than skipping to the next row.
|
|
186
|
+
|
|
187
|
+
### Date / time conversion
|
|
188
|
+
|
|
189
|
+
Numeric cells in `.xlsx` files are serial days since 1899-12-31; whether
|
|
190
|
+
they display as `44562`, `2022-01-01`, or `12:00` depends on the cell's
|
|
191
|
+
`numFmt` style. `rbxl` leaves cells as raw `Float` by default so the read
|
|
192
|
+
path stays allocation-light. Pass `date_conversion: true` to opt into
|
|
193
|
+
interpreting the style:
|
|
194
|
+
|
|
195
|
+
```ruby
|
|
196
|
+
require "rbxl"
|
|
197
|
+
|
|
198
|
+
book = Rbxl.open("schedule.xlsx", date_conversion: true)
|
|
199
|
+
book.sheet("Timeline").each_row(values_only: true) do |row|
|
|
200
|
+
row.each { |v| p v } # => Date / Time / Float / String / ...
|
|
201
|
+
end
|
|
202
|
+
book.close
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
With the flag on, `rbxl` parses `xl/styles.xml` once at first use and
|
|
206
|
+
converts numeric cells whose style maps to a built-in date `numFmtId`
|
|
207
|
+
(14–22, 27–36, 45–47, 50–58) or to a custom `formatCode` containing date
|
|
208
|
+
tokens. Whole-number serials return `Date`; fractional serials return
|
|
209
|
+
`Time` so the time-of-day portion is preserved. The flag is off by
|
|
210
|
+
default; leaving it off skips the styles parse entirely and keeps the
|
|
211
|
+
native fast path in use. Turning it on routes reads through the pure-Ruby
|
|
212
|
+
worksheet parser.
|
|
57
213
|
|
|
58
214
|
## Native C Extension
|
|
59
215
|
|
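A note on the "Skip the header row" recipe in the README above: in plain Ruby, `Enumerator#drop` returns an `Array`, so it consumes the whole enumerator before the first row reaches the block, while `lazy.drop` skips rows as they arrive. The sketch below illustrates the difference with a stand-in enumerator; `fake_rows` is hypothetical and only mimics what `each_row` returns without a block, so it says nothing about rbxl's internals.

```ruby
# Stand-in for a forward-only row source. `fake_rows` is a hypothetical
# helper for this sketch, not part of rbxl.
def fake_rows(count, log)
  Enumerator.new do |yielder|
    count.times do |i|
      log << i                 # record every row the source produced
      yielder << ["row #{i}"]
    end
  end
end

eager_log = []
fake_rows(1_000, eager_log).drop(1)            # Array: every row is pulled now
puts "eager drop pulled #{eager_log.size} rows"

lazy_log = []
first_two = fake_rows(1_000, lazy_log).lazy.drop(1).first(2)
puts "lazy drop pulled #{lazy_log.size} rows for #{first_two.inspect}"
```

For small sheets the eager form is fine; the lazy form is worth considering when only a prefix of a large sheet is needed.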
data/lib/rbxl/errors.rb
CHANGED
|
@@ -33,4 +33,17 @@ module Rbxl
|
|
|
33
33
|
# bytes consumed from the ZIP entry, so high-compression zip-bomb style
|
|
34
34
|
# worksheets are stopped mid-inflate rather than after the fact.
|
|
35
35
|
class WorksheetTooLargeError < Error; end
|
|
36
|
+
|
|
37
|
+
# Raised when workbook-level XML is malformed or internally inconsistent,
|
|
38
|
+
# for example when +xl/workbook.xml+ cannot be parsed or references a
|
|
39
|
+
# missing relationship target.
|
|
40
|
+
class WorkbookFormatError < Error; end
|
|
41
|
+
|
|
42
|
+
# Raised when a worksheet XML entry cannot be parsed into rows.
|
|
43
|
+
class WorksheetFormatError < Error; end
|
|
44
|
+
|
|
45
|
+
# Raised when a specific cell cannot be decoded. The message includes the
|
|
46
|
+
# workbook path, sheet name, and cell coordinate to make bad inputs easy
|
|
47
|
+
# to locate.
|
|
48
|
+
class CellValueError < WorksheetFormatError; end
|
|
36
49
|
end
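Because `CellValueError` subclasses `WorksheetFormatError`, the order of `rescue` clauses decides which handler runs: Ruby tries clauses top to bottom, and a clause naming the parent class also matches the child. The sketch below uses stand-in classes under a hypothetical `Demo` module (the `StandardError` base is an assumption) to show the most-specific-first ordering.

```ruby
# Stand-in classes mirroring the hierarchy in errors.rb. The Demo module and
# the StandardError base are assumptions made so the sketch runs without rbxl.
module Demo
  class Error < StandardError; end
  class WorkbookFormatError < Error; end
  class WorksheetFormatError < Error; end
  class CellValueError < WorksheetFormatError; end
end

# Returns a label for whichever rescue clause handled the error.
def classify(error)
  raise error
rescue Demo::CellValueError
  :cell                  # most specific class first
rescue Demo::WorksheetFormatError, Demo::WorkbookFormatError
  :sheet_or_workbook
rescue Demo::Error
  :other
end

p classify(Demo::CellValueError.new("bad cell"))         # => :cell
p classify(Demo::WorksheetFormatError.new("bad sheet"))  # => :sheet_or_workbook
p classify(Demo::Error.new("something else"))            # => :other
```

If the parent clause were listed first, the `:cell` branch would never be reached.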
|
data/lib/rbxl/read_only_workbook.rb
CHANGED
|
@@ -36,14 +36,17 @@ module Rbxl
|
|
|
36
36
|
# @return [Array<String>] visible sheet names in workbook order
|
|
37
37
|
attr_reader :sheet_names
|
|
38
38
|
|
|
39
|
-
# Convenience constructor equivalent to
|
|
39
|
+
# Convenience constructor equivalent to
|
|
40
|
+
# <tt>new(path, streaming:, date_conversion:)</tt>.
|
|
40
41
|
#
|
|
41
42
|
# @param path [String, #to_path] path to the <tt>.xlsx</tt> file
|
|
42
43
|
# @param streaming [Boolean] feed worksheet XML to the native parser in
|
|
43
44
|
# chunks (see {Rbxl.open})
|
|
45
|
+
# @param date_conversion [Boolean] convert numeric cells backed by a
|
|
46
|
+
# date/time +numFmt+ to Ruby date/time objects (see {Rbxl.open})
|
|
44
47
|
# @return [Rbxl::ReadOnlyWorkbook]
|
|
45
|
-
def self.open(path, streaming: false)
|
|
46
|
-
new(path, streaming: streaming)
|
|
48
|
+
def self.open(path, streaming: false, date_conversion: false)
|
|
49
|
+
new(path, streaming: streaming, date_conversion: date_conversion)
|
|
47
50
|
end
|
|
48
51
|
|
|
49
52
|
# Opens the ZIP archive, pre-loads shared strings, and indexes the
|
|
@@ -51,13 +54,18 @@ module Rbxl
|
|
|
51
54
|
#
|
|
52
55
|
# @param path [String, #to_path] path to the <tt>.xlsx</tt> file
|
|
53
56
|
# @param streaming [Boolean] forwarded to produced worksheets
|
|
54
|
-
|
|
57
|
+
# @param date_conversion [Boolean] lazily load styles.xml and forward the
|
|
58
|
+
# date-style lookup table to produced worksheets
|
|
59
|
+
def initialize(path, streaming: false, date_conversion: false)
|
|
55
60
|
@path = path
|
|
56
61
|
@zip = Zip::File.open(path)
|
|
57
62
|
@streaming = streaming
|
|
63
|
+
@date_conversion = date_conversion
|
|
58
64
|
@shared_strings = load_shared_strings
|
|
59
65
|
@sheet_entries = load_sheet_entries
|
|
60
66
|
@sheet_names = @sheet_entries.keys.freeze
|
|
67
|
+
@date_styles = nil
|
|
68
|
+
@date_1904 = nil
|
|
61
69
|
@closed = false
|
|
62
70
|
end
|
|
63
71
|
|
|
@@ -77,7 +85,16 @@ module Rbxl
|
|
|
77
85
|
raise SheetNotFoundError, "sheet not found: #{name}"
|
|
78
86
|
end
|
|
79
87
|
|
|
80
|
-
ReadOnlyWorksheet.new(
|
|
88
|
+
ReadOnlyWorksheet.new(
|
|
89
|
+
zip: @zip,
|
|
90
|
+
entry_path: entry_path,
|
|
91
|
+
workbook_path: @path,
|
|
92
|
+
shared_strings: @shared_strings,
|
|
93
|
+
name: name,
|
|
94
|
+
streaming: @streaming,
|
|
95
|
+
date_styles: date_styles,
|
|
96
|
+
date_1904: date_1904?
|
|
97
|
+
)
|
|
81
98
|
end
|
|
82
99
|
|
|
83
100
|
# Releases the underlying ZIP file handle. Idempotent; subsequent calls
|
|
@@ -102,6 +119,72 @@ module Rbxl
|
|
|
102
119
|
raise ClosedWorkbookError, "workbook has been closed" if closed?
|
|
103
120
|
end
|
|
104
121
|
|
|
122
|
+
# Built-in numFmtId values that Excel resolves to date/time formats.
|
|
123
|
+
# Ids outside this set are dates only when the workbook provides a
|
|
124
|
+
# matching custom +<numFmt>+ entry whose format code contains date
|
|
125
|
+
# tokens. See ECMA-376 part 1 §18.8.30.
|
|
126
|
+
BUILTIN_DATE_FMT_IDS = Set.new([14, 15, 16, 17, 18, 19, 20, 21, 22,
|
|
127
|
+
27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
|
|
128
|
+
45, 46, 47, 50, 51, 52, 53, 54, 55, 56,
|
|
129
|
+
57, 58]).freeze
|
|
130
|
+
|
|
131
|
+
def date_styles
|
|
132
|
+
return nil unless @date_conversion
|
|
133
|
+
|
|
134
|
+
@date_styles ||= load_date_styles
|
|
135
|
+
end
|
|
136
|
+
|
|
137
|
+
def date_1904?
|
|
138
|
+
return false unless @date_conversion
|
|
139
|
+
|
|
140
|
+
@date_1904 = load_date_1904 if @date_1904.nil?
|
|
141
|
+
@date_1904
|
|
142
|
+
end
|
|
143
|
+
|
|
144
|
+
def load_date_styles
|
|
145
|
+
entry = @zip.find_entry("xl/styles.xml")
|
|
146
|
+
return [].freeze unless entry
|
|
147
|
+
|
|
148
|
+
custom_date_ids = Set.new
|
|
149
|
+
date_styles = []
|
|
150
|
+
in_cell_xfs = false
|
|
151
|
+
|
|
152
|
+
each_xml_node("xl/styles.xml") do |node|
|
|
153
|
+
case node.node_type
|
|
154
|
+
when Nokogiri::XML::Reader::TYPE_ELEMENT
|
|
155
|
+
case node.local_name
|
|
156
|
+
when "cellXfs"
|
|
157
|
+
in_cell_xfs = true
|
|
158
|
+
when "numFmt"
|
|
159
|
+
id = node.attribute("numFmtId")
|
|
160
|
+
code = node.attribute("formatCode")
|
|
161
|
+
custom_date_ids << id.to_i if id && code && date_format_code?(code)
|
|
162
|
+
when "xf"
|
|
163
|
+
next unless in_cell_xfs
|
|
164
|
+
|
|
165
|
+
fmt_id_int = node.attribute("numFmtId")&.to_i
|
|
166
|
+
date_styles << (!fmt_id_int.nil? &&
|
|
167
|
+
(BUILTIN_DATE_FMT_IDS.include?(fmt_id_int) || custom_date_ids.include?(fmt_id_int)))
|
|
168
|
+
end
|
|
169
|
+
when Nokogiri::XML::Reader::TYPE_END_ELEMENT
|
|
170
|
+
in_cell_xfs = false if node.local_name == "cellXfs"
|
|
171
|
+
end
|
|
172
|
+
end
|
|
173
|
+
|
|
174
|
+
date_styles.freeze
|
|
175
|
+
end
|
|
176
|
+
|
|
177
|
+
# Quoted literals, bracketed directives (e.g. [Red], [$-409]), and
|
|
178
|
+
# backslash-escaped characters never introduce date tokens, so strip
|
|
179
|
+
# them before looking for +y/m/d/h/s+.
|
|
180
|
+
def date_format_code?(code)
|
|
181
|
+
stripped = code.dup
|
|
182
|
+
stripped.gsub!(/\[[^\]]*\]/, "")
|
|
183
|
+
stripped.gsub!(/"[^"]*"/, "")
|
|
184
|
+
stripped.gsub!(/\\./, "")
|
|
185
|
+
stripped.match?(/[ymdhs]/i)
|
|
186
|
+
end
|
|
187
|
+
|
|
105
188
|
def load_shared_strings
|
|
106
189
|
entry = @zip.find_entry("xl/sharedStrings.xml")
|
|
107
190
|
return [] unless entry
|
|
@@ -195,13 +278,27 @@ module Rbxl
|
|
|
195
278
|
rid = node.attribute("r:id")
|
|
196
279
|
next unless name && rid
|
|
197
280
|
|
|
198
|
-
target = relationships.fetch(rid)
|
|
281
|
+
target = relationships.fetch(rid) do
|
|
282
|
+
raise WorkbookFormatError,
|
|
283
|
+
"workbook #{@path} references missing relationship #{rid.inspect} for sheet #{name.inspect}"
|
|
284
|
+
end
|
|
199
285
|
sheets[name] = "xl/#{target}".gsub(%r{/+}, "/")
|
|
200
286
|
end
|
|
201
287
|
|
|
202
288
|
sheets
|
|
203
289
|
end
|
|
204
290
|
|
|
291
|
+
def load_date_1904
|
|
292
|
+
each_xml_node("xl/workbook.xml") do |node|
|
|
293
|
+
next unless node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
|
|
294
|
+
next unless node.local_name == "workbookPr"
|
|
295
|
+
|
|
296
|
+
return xml_truthy?(node.attribute("date1904"))
|
|
297
|
+
end
|
|
298
|
+
|
|
299
|
+
false
|
|
300
|
+
end
|
|
301
|
+
|
|
205
302
|
def load_relationship_targets(entry_path)
|
|
206
303
|
relationships = {}
|
|
207
304
|
|
|
@@ -220,11 +317,20 @@ module Rbxl
|
|
|
220
317
|
end
|
|
221
318
|
|
|
222
319
|
def each_xml_node(entry_path)
|
|
223
|
-
|
|
320
|
+
entry = @zip.get_entry(entry_path)
|
|
321
|
+
raise WorkbookFormatError, "workbook #{@path} is missing required entry #{entry_path.inspect}" unless entry
|
|
322
|
+
|
|
323
|
+
io = entry.get_input_stream
|
|
224
324
|
reader = Nokogiri::XML::Reader(io)
|
|
225
325
|
reader.each { |node| yield node }
|
|
326
|
+
rescue Nokogiri::XML::SyntaxError => e
|
|
327
|
+
raise WorkbookFormatError, "invalid workbook XML in #{@path} at #{entry_path}: #{e.message}"
|
|
226
328
|
ensure
|
|
227
329
|
io&.close
|
|
228
330
|
end
|
|
331
|
+
|
|
332
|
+
def xml_truthy?(value)
|
|
333
|
+
value == "1" || value == "true"
|
|
334
|
+
end
|
|
229
335
|
end
|
|
230
336
|
end
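To see what the `date_format_code?` check in this file accepts, here is a standalone copy of the method applied to a few sample format codes. The method body follows the diff; the sample codes are illustrative choices, not taken from the gem's tests.

```ruby
# Standalone copy of the token check from the diff, runnable without rbxl.
def date_format_code?(code)
  stripped = code.dup
  stripped.gsub!(/\[[^\]]*\]/, "")  # bracketed directives such as [Red] or [$-409]
  stripped.gsub!(/"[^"]*"/, "")     # quoted literals
  stripped.gsub!(/\\./, "")         # backslash-escaped characters
  stripped.match?(/[ymdhs]/i)       # any remaining date/time token letter
end

# Illustrative format codes (assumed samples).
samples = ["yyyy-mm-dd", "[h]:mm:ss", "[$-409]d-mmm-yy",
           "0.00", "[Red]0.00", '"days" 0', "0.00E+00"]

samples.each do |code|
  puts format("%-16s %s", code, date_format_code?(code))
end
```

The first three samples are classified as date formats; the rest are not, because stripping leaves no `y`, `m`, `d`, `h`, or `s`. Note that `[h]:mm:ss` still qualifies after its bracketed part is removed, since `mm:ss` remains.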
|
data/lib/rbxl/read_only_worksheet.rb
CHANGED
|
@@ -50,17 +50,28 @@ module Rbxl
|
|
|
50
50
|
|
|
51
51
|
# @param zip [Zip::File] open archive shared with the workbook
|
|
52
52
|
# @param entry_path [String] ZIP entry path for this sheet's XML
|
|
53
|
+
# @param workbook_path [String] filesystem path the workbook was opened from
|
|
53
54
|
# @param shared_strings [Array<String>] pre-decoded shared strings table
|
|
54
55
|
# @param name [String] visible sheet name
|
|
55
56
|
# @param streaming [Boolean] when the native extension is loaded, feed
|
|
56
57
|
# worksheet XML to the parser in chunks instead of reading the entry
|
|
57
58
|
# into memory first
|
|
58
|
-
|
|
59
|
+
# @param date_styles [Array<Boolean>, nil] +true+ at a style id when the
|
|
60
|
+
# id's numFmt is a date/time format. When provided, numeric cells with
|
|
61
|
+
# a matching style are returned as +Date+ or +Time+ instead of +Float+,
|
|
62
|
+
# and the native fast path is bypassed.
|
|
63
|
+
# @param date_1904 [Boolean] whether the workbook uses Excel's 1904 date
|
|
64
|
+
# system instead of the default 1900 date system
|
|
65
|
+
def initialize(zip:, entry_path:, workbook_path:, shared_strings:, name:, streaming: false, date_styles: nil, date_1904: false)
|
|
59
66
|
@zip = zip
|
|
60
67
|
@entry_path = entry_path
|
|
68
|
+
@workbook_path = workbook_path
|
|
61
69
|
@shared_strings = shared_strings
|
|
62
70
|
@name = name
|
|
63
71
|
@streaming = streaming
|
|
72
|
+
@date_styles = date_styles
|
|
73
|
+
@date_1904 = date_1904
|
|
74
|
+
@disable_native = !date_styles.nil?
|
|
64
75
|
@dimensions = extract_dimensions
|
|
65
76
|
@merge_ranges_by_row = nil
|
|
66
77
|
@merge_anchor_values = {}
|
|
@@ -164,12 +175,16 @@ module Rbxl
|
|
|
164
175
|
end
|
|
165
176
|
|
|
166
177
|
cell_type = nil
|
|
178
|
+
cell_style = nil
|
|
179
|
+
cell_ref = nil
|
|
167
180
|
collecting_value = false
|
|
168
181
|
in_v = false
|
|
169
182
|
raw_value = nil
|
|
170
183
|
value_buffer = +""
|
|
171
184
|
current_values = nil
|
|
172
185
|
row_depth = nil
|
|
186
|
+
track_style = !@date_styles.nil?
|
|
187
|
+
wrap_cell_errors = track_style
|
|
173
188
|
|
|
174
189
|
with_sheet_reader do |reader|
|
|
175
190
|
reader.each do |node|
|
|
@@ -179,9 +194,26 @@ module Rbxl
|
|
|
179
194
|
when "row"
|
|
180
195
|
current_values = []
|
|
181
196
|
row_depth = node.depth
|
|
197
|
+
if node.self_closing?
|
|
198
|
+
yield current_values.freeze
|
|
199
|
+
current_values = nil
|
|
200
|
+
end
|
|
182
201
|
when "c"
|
|
202
|
+
cell_ref = node.attribute("r")
|
|
183
203
|
cell_type = node.attribute("t")
|
|
204
|
+
cell_style = track_style ? node.attribute("s")&.to_i : nil
|
|
184
205
|
raw_value = nil
|
|
206
|
+
if current_values && node.self_closing?
|
|
207
|
+
value = if wrap_cell_errors
|
|
208
|
+
coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
|
|
209
|
+
else
|
|
210
|
+
coerce_value(raw_value, cell_type, cell_style)
|
|
211
|
+
end
|
|
212
|
+
current_values << value
|
|
213
|
+
cell_type = nil
|
|
214
|
+
cell_style = nil
|
|
215
|
+
cell_ref = nil
|
|
216
|
+
end
|
|
185
217
|
when "v"
|
|
186
218
|
collecting_value = true
|
|
187
219
|
in_v = true
|
|
@@ -202,12 +234,19 @@ module Rbxl
|
|
|
202
234
|
raw_value = raw_value ? raw_value << value_buffer : value_buffer.dup
|
|
203
235
|
collecting_value = false
|
|
204
236
|
end
|
|
205
|
-
elsif node.depth == row_depth
|
|
237
|
+
elsif current_values && node.depth == row_depth
|
|
206
238
|
yield current_values.freeze
|
|
207
239
|
current_values = nil
|
|
208
240
|
elsif current_values && node.depth == row_depth + 1
|
|
209
|
-
|
|
241
|
+
value = if wrap_cell_errors
|
|
242
|
+
coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
|
|
243
|
+
else
|
|
244
|
+
coerce_value(raw_value, cell_type, cell_style)
|
|
245
|
+
end
|
|
246
|
+
current_values << value
|
|
210
247
|
cell_type = nil
|
|
248
|
+
cell_style = nil
|
|
249
|
+
cell_ref = nil
|
|
211
250
|
raw_value = nil
|
|
212
251
|
end
|
|
213
252
|
end
|
|
@@ -231,12 +270,15 @@ module Rbxl
|
|
|
231
270
|
current_cells = nil
|
|
232
271
|
cell_ref = nil
|
|
233
272
|
cell_type = nil
|
|
273
|
+
cell_style = nil
|
|
234
274
|
current_col_index = 0
|
|
235
275
|
collecting_value = false
|
|
236
276
|
in_v = false
|
|
237
277
|
raw_value = nil
|
|
238
278
|
value_buffer = +""
|
|
239
279
|
row_depth = nil
|
|
280
|
+
track_style = !@date_styles.nil?
|
|
281
|
+
wrap_cell_errors = track_style
|
|
240
282
|
|
|
241
283
|
with_sheet_reader do |reader|
|
|
242
284
|
reader.each do |node|
|
|
@@ -248,6 +290,14 @@ module Rbxl
|
|
|
248
290
|
current_col_index = 0
|
|
249
291
|
current_cells = []
|
|
250
292
|
row_depth = node.depth
|
|
293
|
+
if node.self_closing?
|
|
294
|
+
emit_row(current_cells, current_row_index,
|
|
295
|
+
pad_cells: pad_cells, expand_merged: expand_merged,
|
|
296
|
+
values_only: values_only, &block)
|
|
297
|
+
last_row_index = current_row_index
|
|
298
|
+
current_row_index = nil
|
|
299
|
+
current_cells = nil
|
|
300
|
+
end
|
|
251
301
|
when "c"
|
|
252
302
|
cell_ref = node.attribute("r")
|
|
253
303
|
if cell_ref
|
|
@@ -257,7 +307,19 @@ module Rbxl
|
|
|
257
307
|
cell_ref = "#{column_name(current_col_index)}#{current_row_index}"
|
|
258
308
|
end
|
|
259
309
|
cell_type = node.attribute("t")
|
|
310
|
+
cell_style = track_style ? node.attribute("s")&.to_i : nil
|
|
260
311
|
raw_value = nil
|
|
312
|
+
if current_cells && node.self_closing?
|
|
313
|
+
value = if wrap_cell_errors
|
|
314
|
+
coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
|
|
315
|
+
else
|
|
316
|
+
coerce_value(raw_value, cell_type, cell_style)
|
|
317
|
+
end
|
|
318
|
+
current_cells << build_row_entry(cell_ref, value, values_only)
|
|
319
|
+
cell_ref = nil
|
|
320
|
+
cell_type = nil
|
|
321
|
+
cell_style = nil
|
|
322
|
+
end
|
|
261
323
|
when "v"
|
|
262
324
|
collecting_value = true
|
|
263
325
|
in_v = true
|
|
@@ -278,17 +340,23 @@ module Rbxl
|
|
|
278
340
|
raw_value = raw_value ? raw_value << value_buffer : value_buffer.dup
|
|
279
341
|
collecting_value = false
|
|
280
342
|
end
|
|
281
|
-
elsif node.depth == row_depth
|
|
282
|
-
|
|
283
|
-
|
|
284
|
-
|
|
343
|
+
elsif current_cells && node.depth == row_depth
|
|
344
|
+
emit_row(current_cells, current_row_index,
|
|
345
|
+
pad_cells: pad_cells, expand_merged: expand_merged,
|
|
346
|
+
values_only: values_only, &block)
|
|
285
347
|
last_row_index = current_row_index
|
|
286
348
|
current_row_index = nil
|
|
287
349
|
current_cells = nil
|
|
288
350
|
elsif current_cells && node.depth == row_depth + 1
|
|
289
|
-
|
|
351
|
+
value = if wrap_cell_errors
|
|
352
|
+
coerce_cell_value(raw_value, cell_type, cell_style, cell_ref)
|
|
353
|
+
else
|
|
354
|
+
coerce_value(raw_value, cell_type, cell_style)
|
|
355
|
+
end
|
|
356
|
+
current_cells << build_row_entry(cell_ref, value, values_only)
|
|
290
357
|
cell_ref = nil
|
|
291
358
|
cell_type = nil
|
|
359
|
+
cell_style = nil
|
|
292
360
|
raw_value = nil
|
|
293
361
|
end
|
|
294
362
|
end
|
|
@@ -296,10 +364,21 @@ module Rbxl
|
|
|
296
364
|
end
|
|
297
365
|
end
|
|
298
366
|
|
|
367
|
+
def emit_row(cells, row_index, pad_cells:, expand_merged:, values_only:)
|
|
368
|
+
cells = pad_row(cells, row_index, values_only: values_only) if pad_cells
|
|
369
|
+
cells = expand_merged_cells(cells, row_index, values_only: values_only) if expand_merged
|
|
370
|
+
yield values_only ? extract_values(cells).freeze : Row.new(index: row_index, cells: cells)
|
|
371
|
+
end
|
|
372
|
+
|
|
299
373
|
def with_sheet_reader
|
|
300
|
-
|
|
374
|
+
entry = @zip.get_entry(@entry_path)
|
|
375
|
+
raise WorksheetFormatError, "worksheet #{@name.inspect} is missing XML entry #{@entry_path.inspect} in #{@workbook_path}" unless entry
|
|
376
|
+
|
|
377
|
+
io = entry.get_input_stream
|
|
301
378
|
reader = Nokogiri::XML::Reader(io)
|
|
302
379
|
yield reader
|
|
380
|
+
rescue Nokogiri::XML::SyntaxError => e
|
|
381
|
+
raise WorksheetFormatError, "invalid worksheet XML for sheet #{@name.inspect} in #{@workbook_path}: #{e.message}"
|
|
303
382
|
ensure
|
|
304
383
|
io&.close
|
|
305
384
|
end
|
|
@@ -309,7 +388,10 @@ module Rbxl
|
|
|
309
388
|
max_bytes = Rbxl.max_worksheet_bytes
|
|
310
389
|
Rbxl::Native.public_send(method_name, io, @shared_strings, max_bytes, &block)
|
|
311
390
|
rescue RuntimeError => e
|
|
312
|
-
|
|
391
|
+
if e.message&.include?("worksheet bytes exceed limit")
|
|
392
|
+
raise WorksheetTooLargeError,
|
|
393
|
+
"worksheet #{@name.inspect} in #{@workbook_path}: #{e.message}"
|
|
394
|
+
end
|
|
313
395
|
|
|
314
396
|
raise
|
|
315
397
|
ensure
|
|
@@ -527,7 +609,7 @@ module Rbxl
|
|
|
527
609
|
@merge_ranges_by_row ||= extract_merge_ranges_by_row
|
|
528
610
|
end
|
|
529
611
|
|
|
530
|
-
def coerce_value(raw_value, type)
|
|
612
|
+
def coerce_value(raw_value, type, style_id = nil)
|
|
531
613
|
case type
|
|
532
614
|
when "s"
|
|
533
615
|
@shared_strings[raw_value.to_i]
|
|
@@ -536,10 +618,44 @@ module Rbxl
|
|
|
536
618
|
when "b"
|
|
537
619
|
raw_value == "1"
|
|
538
620
|
else
|
|
539
|
-
infer_scalar(raw_value)
|
|
621
|
+
value = infer_scalar(raw_value)
|
|
622
|
+
return value unless @date_styles && style_id && value.is_a?(Numeric) && @date_styles[style_id]
|
|
623
|
+
|
|
624
|
+
excel_serial_to_ruby(value)
|
|
540
625
|
end
|
|
541
626
|
end
|
|
542
627
|
|
|
628
|
+
def coerce_cell_value(raw_value, type, style_id, coordinate)
|
|
629
|
+
coerce_value(raw_value, type, style_id)
|
|
630
|
+
rescue StandardError => e
|
|
631
|
+
raise CellValueError,
|
|
632
|
+
"failed to decode cell #{coordinate || '(unknown coordinate)'} on sheet #{@name.inspect} in #{@workbook_path}: #{e.message}"
|
|
633
|
+
end
|
|
634
|
+
|
|
635
|
+
# Excel's serial date counts days from 1899-12-31 as serial 1, with a
|
|
636
|
+
# documented leap-year bug for the non-existent 1900-02-29 (serial 60)
|
|
637
|
+
# — for serials >= 60 the day-count is shifted back by one so that
|
|
638
|
+
# post-1900 dates line up with the proleptic Gregorian calendar.
|
|
639
|
+
# Whole-number serials are returned as +Date+; fractional serials as
|
|
640
|
+
# +Time+ so that both date and time-of-day survive the conversion.
|
|
641
|
+
def excel_serial_to_ruby(serial)
|
|
642
|
+
whole = serial.to_i
|
|
643
|
+
frac = serial - serial.to_i
|
|
644
|
+
|
|
645
|
+
base =
|
|
646
|
+
if @date_1904
|
|
647
|
+
Date.new(1904, 1, 1) + whole
|
|
648
|
+
else
|
|
649
|
+
whole -= 1 if whole >= 60
|
|
650
|
+
Date.new(1899, 12, 31) + whole
|
|
651
|
+
end
|
|
652
|
+
|
|
653
|
+
return base if frac.zero?
|
|
654
|
+
|
|
655
|
+
seconds = (frac * 86_400).round
|
|
656
|
+
Time.new(base.year, base.month, base.day) + seconds
|
|
657
|
+
end
|
|
658
|
+
|
|
543
659
|
def infer_scalar(raw_value)
|
|
544
660
|
return nil if raw_value.nil? || raw_value.empty?
|
|
545
661
|
|
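The serial-date conversion added in this file can be tried in isolation. The sketch below mirrors the logic shown in the diff as a free-standing method. The name `serial_to_ruby` and the `date_1904:` keyword are assumptions made for this example and are not rbxl's API.

```ruby
require "date"

# Hypothetical standalone helper mirroring the conversion in the diff.
# Whole serials become Date; fractional serials become a local Time.
def serial_to_ruby(serial, date_1904: false)
  whole = serial.to_i
  frac = serial - whole

  base =
    if date_1904
      Date.new(1904, 1, 1) + whole
    else
      # Serial 60 is the non-existent 1900-02-29, so serials from 60
      # upward are shifted back by one day.
      whole -= 1 if whole >= 60
      Date.new(1899, 12, 31) + whole
    end

  return base if frac.zero?

  Time.new(base.year, base.month, base.day) + (frac * 86_400).round
end

serial_to_ruby(61)     # => Date for 1900-03-01
serial_to_ruby(45_000) # => Date for 2023-03-15
```

With this arithmetic, serial 1 maps to 1900-01-01 in the 1900 system, and serial 60 collapses onto 1900-02-28.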
data/lib/rbxl/version.rb
CHANGED
data/lib/rbxl/write_only_workbook.rb
CHANGED
|
@@ -96,7 +96,7 @@ module Rbxl
|
|
|
96
96
|
private
|
|
97
97
|
|
|
98
98
|
def ensure_writable!
|
|
99
|
-
raise WorkbookAlreadySavedError, "write-only workbook can only be saved once" if @saved
|
|
99
|
+
raise WorkbookAlreadySavedError, "write-only workbook can only be saved once by design; call Rbxl.new to build another workbook" if @saved
|
|
100
100
|
raise ClosedWorkbookError, "workbook has been closed" if closed?
|
|
101
101
|
end
|
|
102
102
|
|
data/lib/rbxl.rb
CHANGED
|
@@ -1,6 +1,7 @@
|
|
|
1
1
|
require "cgi"
|
|
2
2
|
require "date"
|
|
3
3
|
require "nokogiri"
|
|
4
|
+
require "set"
|
|
4
5
|
require "stringio"
|
|
5
6
|
require "zip"
|
|
6
7
|
|
|
@@ -32,16 +33,19 @@ require_relative "rbxl/write_only_worksheet"
|
|
|
32
33
|
#
|
|
33
34
|
# require "rbxl"
|
|
34
35
|
#
|
|
35
|
-
# book = Rbxl.open("report.xlsx"
|
|
36
|
+
# book = Rbxl.open("report.xlsx")
|
|
36
37
|
# sheet = book.sheet("Report")
|
|
37
38
|
# sheet.each_row(values_only: true) { |values| p values }
|
|
38
39
|
# book.close
|
|
39
40
|
#
|
|
41
|
+
# Pass <tt>date_conversion: true</tt> to return Date/Time objects for
|
|
42
|
+
# numeric cells that carry a date +numFmt+ style.
|
|
43
|
+
#
|
|
40
44
|
# == Writing
|
|
41
45
|
#
|
|
42
46
|
# require "rbxl"
|
|
43
47
|
#
|
|
44
|
-
# book = Rbxl.new
|
|
48
|
+
# book = Rbxl.new
|
|
45
49
|
# sheet = book.add_sheet("Report")
|
|
46
50
|
# sheet << ["id", "name", "score"]
|
|
47
51
|
# sheet << [1, "alice", 100]
|
|
@@ -84,9 +88,9 @@ module Rbxl
|
|
|
84
88
|
|
|
85
89
|
# Opens an existing workbook in read-only row-by-row mode.
|
|
86
90
|
#
|
|
87
|
-
# The +read_only+ keyword
|
|
88
|
-
#
|
|
89
|
-
#
|
|
91
|
+
# The +read_only+ keyword defaults to +true+ and exists to mark the
|
|
92
|
+
# intent explicitly at the call site. Passing +read_only: false+ raises
|
|
93
|
+
# {NotImplementedError}; a read/write mode is not available.
|
|
90
94
|
#
|
|
91
95
|
# With <tt>streaming: true</tt>, the native backend (when loaded) feeds
|
|
92
96
|
# worksheet XML to the parser in chunks pulled from the ZIP input stream
|
|
@@ -97,29 +101,41 @@ module Rbxl
|
|
|
97
101
|
# differs — and typically pays back a few percent of throughput on small
|
|
98
102
|
# sheets in exchange for the flat memory profile.
|
|
99
103
|
#
|
|
104
|
+
# With <tt>date_conversion: true</tt>, numeric cells whose style points at
|
|
105
|
+
# a date/time +numFmt+ (built-in ids 14–22, 27–36, 45–47, 50–58, or any
|
|
106
|
+
# custom format code containing a date/time token) are returned as
|
|
107
|
+
# +Date+, +Time+, or +DateTime+ instead of a raw serial +Float+. The flag
|
|
108
|
+
# is off by default to preserve byte-for-byte behavior and skip the
|
|
109
|
+
# styles.xml parse for workbooks that don't need it; enabling it
|
|
110
|
+
# disables the native fast path and routes reads through the Ruby
|
|
111
|
+
# worksheet parser.
|
|
112
|
+
#
|
|
100
113
|
# @param path [String, #to_path] filesystem path to an <tt>.xlsx</tt> file
|
|
101
|
-
# @param read_only [Boolean] must be +true+
|
|
114
|
+
# @param read_only [Boolean] retained for call-site clarity; must be +true+
|
|
102
115
|
# @param streaming [Boolean] feed worksheet XML to the native parser in
|
|
103
116
|
# chunks instead of fully inflating the entry in advance. Ignored when
|
|
104
117
|
# the native extension is not loaded.
|
|
118
|
+
# @param date_conversion [Boolean] convert numeric cells backed by a
|
|
119
|
+
# date/time +numFmt+ to +Date+ / +Time+ / +DateTime+
|
|
105
120
|
# @return [Rbxl::ReadOnlyWorkbook]
|
|
106
|
-
# @raise [
|
|
107
|
-
def open(path, read_only:
|
|
108
|
-
raise
|
|
121
|
+
# @raise [NotImplementedError] if +read_only+ is not +true+
|
|
122
|
+
def open(path, read_only: true, streaming: false, date_conversion: false)
|
|
123
|
+
raise NotImplementedError, "read/write mode is not supported; pass read_only: true" unless read_only
|
|
109
124
|
|
|
110
|
-
ReadOnlyWorkbook.open(path, streaming: streaming)
|
|
125
|
+
ReadOnlyWorkbook.open(path, streaming: streaming, date_conversion: date_conversion)
|
|
111
126
|
end
|
|
112
127
|
|
|
113
128
|
# Creates a new workbook in write-only mode.
|
|
114
129
|
#
|
|
115
|
-
# The +write_only+ keyword
|
|
116
|
-
# save-once, append-only contract
|
|
130
|
+
# The +write_only+ keyword defaults to +true+ and exists to mark the
|
|
131
|
+
# save-once, append-only contract explicitly. Passing
|
|
132
|
+
# +write_only: false+ raises {NotImplementedError}.
|
|
117
133
|
#
|
|
118
|
-
# @param write_only [Boolean] must be +true+
|
|
134
|
+
# @param write_only [Boolean] retained for call-site clarity; must be +true+
|
|
119
135
|
# @return [Rbxl::WriteOnlyWorkbook]
|
|
120
|
-
# @raise [
|
|
121
|
-
def new(write_only:
|
|
122
|
-
raise
|
|
136
|
+
# @raise [NotImplementedError] if +write_only+ is not +true+
|
|
137
|
+
def new(write_only: true)
|
|
138
|
+
raise NotImplementedError, "read/write mode is not supported; pass write_only: true" unless write_only
|
|
123
139
|
|
|
124
140
|
WriteOnlyWorkbook.new
|
|
125
141
|
end
|
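The `date_conversion` documentation above lists the built-in `numFmt` ids treated as date/time formats and mentions custom format codes containing a date/time token. A rough sketch of such a check is below. The id ranges come from the doc comment in the diff; the method name `date_format?` and the token heuristic for custom codes are assumptions for illustration and may differ from what rbxl actually does.

```ruby
require "set"

# Built-in ids named in the doc comment: 14..22, 27..36, 45..47, 50..58.
BUILTIN_DATE_FMT_IDS = Set.new([*14..22, *27..36, *45..47, *50..58]).freeze

# Hypothetical helper: true when a style's number format looks like a
# date/time format.
def date_format?(num_fmt_id, format_code = nil)
  return true if BUILTIN_DATE_FMT_IDS.include?(num_fmt_id)
  return false if format_code.nil?

  # Assumed heuristic: ignore quoted literals, bracketed sections such as
  # [Red], and backslash-escaped characters, then look for y/m/d/h/s.
  stripped = format_code
             .gsub(/"[^"]*"/, "")
             .gsub(/\[[^\]]*\]/, "")
             .gsub(/\\./, "")
  stripped.match?(/[ymdhs]/i)
end
```

One known gap in this heuristic: an elapsed-time code made up only of a bracketed token, such as `[h]`, would be missed because bracketed sections are stripped first.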
data/sig/rbxl.rbs
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
module Rbxl
|
|
2
2
|
VERSION: String
|
|
3
3
|
|
|
4
|
-
type cell_value = String | Integer | Float | bool | nil
|
|
4
|
+
type cell_value = String | Integer | Float | bool | Date | Time | nil
|
|
5
5
|
type pathish = String | Pathname
|
|
6
6
|
type row_input = Array[untyped] | Enumerator[untyped, untyped]
|
|
7
7
|
type row_values = Array[cell_value]
|
|
@@ -9,7 +9,7 @@ module Rbxl
|
|
|
9
9
|
type row_cells = Array[row_cell]
|
|
10
10
|
type dimensions = { ref: String, max_col: Integer, max_row: Integer }
|
|
11
11
|
|
|
12
|
-
def self.open: (pathish path, ?read_only: bool, ?streaming: bool) -> ReadOnlyWorkbook
|
|
12
|
+
def self.open: (pathish path, ?read_only: bool, ?streaming: bool, ?date_conversion: bool) -> ReadOnlyWorkbook
|
|
13
13
|
def self.new: (?write_only: bool) -> WriteOnlyWorkbook
|
|
14
14
|
|
|
15
15
|
attr_accessor self.max_shared_strings: Integer?
|
|
@@ -83,8 +83,8 @@ module Rbxl
|
|
|
83
83
|
attr_reader path: String
|
|
84
84
|
attr_reader sheet_names: Array[String]
|
|
85
85
|
|
|
86
|
-
def self.open: (pathish path, ?streaming: bool) -> ReadOnlyWorkbook
|
|
87
|
-
def initialize: (pathish path, ?streaming: bool) -> void
|
|
86
|
+
def self.open: (pathish path, ?streaming: bool, ?date_conversion: bool) -> ReadOnlyWorkbook
|
|
87
|
+
def initialize: (pathish path, ?streaming: bool, ?date_conversion: bool) -> void
|
|
88
88
|
def sheet: (String name) -> ReadOnlyWorksheet
|
|
89
89
|
def close: () -> void
|
|
90
90
|
def closed?: () -> bool
|
|
@@ -94,7 +94,7 @@ module Rbxl
|
|
|
94
94
|
attr_reader name: String
|
|
95
95
|
attr_reader dimensions: dimensions?
|
|
96
96
|
|
|
97
|
-
def initialize: (zip: untyped, entry_path: String, shared_strings: Array[String], name: String, ?streaming: bool) -> void
|
|
97
|
+
def initialize: (zip: untyped, entry_path: String, shared_strings: Array[String], name: String, ?streaming: bool, ?date_styles: Array[bool]?) -> void
|
|
98
98
|
|
|
99
99
|
def each_row: (?pad_cells: bool, ?values_only: bool, ?expand_merged: bool) { (Row | row_values) -> void } -> void
|
|
100
100
|
| (?pad_cells: bool, ?values_only: bool, ?expand_merged: bool) -> Enumerator[Row | row_values, void]
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: rbxl
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 1.0
|
|
4
|
+
version: 1.2.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Taro KOBAYASHI
|
|
@@ -43,8 +43,8 @@ dependencies:
|
|
|
43
43
|
- - "<"
|
|
44
44
|
- !ruby/object:Gem::Version
|
|
45
45
|
version: '2.0'
|
|
46
|
-
description: rbxl is a Ruby gem for
|
|
47
|
-
XLSX
|
|
46
|
+
description: rbxl is a fast, low-memory Ruby gem for row-by-row XLSX reads and append-only
|
|
47
|
+
XLSX writes, with an optional native extension for higher-throughput XML parsing.
|
|
48
48
|
email:
|
|
49
49
|
- taro@matzlika.co.jp
|
|
50
50
|
executables: []
|
|
@@ -96,6 +96,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
96
96
|
requirements: []
|
|
97
97
|
rubygems_version: 4.0.3
|
|
98
98
|
specification_version: 4
|
|
99
|
-
summary:
|
|
100
|
-
writes.
|
|
99
|
+
summary: Fast, low-memory XLSX processing for Ruby.
|
|
101
100
|
test_files: []
|