csv 3.2.1 → 3.2.4
- checksums.yaml +4 -4
- data/NEWS.md +113 -0
- data/README.md +1 -1
- data/doc/csv/options/generating/write_headers.rdoc +1 -1
- data/doc/csv/recipes/generating.rdoc +1 -1
- data/doc/csv/recipes/parsing.rdoc +1 -1
- data/lib/csv/fields_converter.rb +3 -2
- data/lib/csv/input_record_separator.rb +1 -14
- data/lib/csv/parser.rb +237 -92
- data/lib/csv/row.rb +1 -1
- data/lib/csv/table.rb +14 -4
- data/lib/csv/version.rb +1 -1
- data/lib/csv/writer.rb +5 -5
- data/lib/csv.rb +48 -17
- metadata +3 -5
- data/lib/csv/delete_suffix.rb +0 -18
- data/lib/csv/match_p.rb +0 -20
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3ef88fc9b205f8f64c5e817b22f121d5d03534c155cb8d27dc9d87aa62e6b7e0
+  data.tar.gz: e12b86cee946837a96ae609314a50246e2135fab855dca58e800de1ddc50e524
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d663d0917d63315e4fc5802f9538727731f39ea06997adea485f5f137f37639f0791db1f9697476e7d0c4a8048a0339b9e4fb7aecd79f3ec8ee8eb24a4cb676e
+  data.tar.gz: f9d22fe50d227b8f8ca0d075ed0fbf1f39a10bf59a1adb68835fede727f3d1dfd1d2f88754caa1098fe9c24df738c71b29ae5fc3080a25d80df484f76530344e
data/NEWS.md
CHANGED
@@ -1,5 +1,118 @@
 # News
 
+## 3.2.4 - 2022-08-22
+
+### Improvements
+
+  * Cleaned up internal implementations.
+    [[GitHub#249](https://github.com/ruby/csv/pull/249)]
+    [[GitHub#250](https://github.com/ruby/csv/pull/250)]
+    [[GitHub#251](https://github.com/ruby/csv/pull/251)]
+    [Patch by Mau Magnaguagno]
+
+  * Added support for RFC 3339 style time.
+    [[GitHub#248](https://github.com/ruby/csv/pull/248)]
+    [Patch by Thierry Lambert]
+
+  * Added support for transcoding String CSV. Syntax is
+    `from-encoding:to-encoding`.
+    [[GitHub#254](https://github.com/ruby/csv/issues/254)]
+    [Reported by Richard Stueven]
+
+  * Added quoted information to `CSV::FieldInfo`.
+    [[GitHub#254](https://github.com/ruby/csv/pull/253)]
+    [Reported by Hirokazu SUZUKI]
+
+### Fixes
+
+  * Fixed a link in documents.
+    [[GitHub#244](https://github.com/ruby/csv/pull/244)]
+    [Patch by Peter Zhu]
+
+### Thanks
+
+  * Peter Zhu
+
+  * Mau Magnaguagno
+
+  * Thierry Lambert
+
+  * Richard Stueven
+
+  * Hirokazu SUZUKI
+
+## 3.2.3 - 2022-04-09
+
+### Improvements
+
+  * Added contents summary to `CSV::Table#inspect`.
+    [GitHub#229][Patch by Eriko Sugiyama]
+    [GitHub#235][Patch by Sampat Badhe]
+
+  * Suppressed `$INPUT_RECORD_SEPARATOR` deprecation warning by
+    `Warning.warn`.
+    [GitHub#233][Reported by Jean byroot Boussier]
+
+  * Improved error message for liberal parsing with quoted values.
+    [GitHub#231][Patch by Nikolay Rys]
+
+  * Fixed typos in documentation.
+    [GitHub#236][Patch by Sampat Badhe]
+
+  * Added `:max_field_size` option and deprecated `:field_size_limit` option.
+    [GitHub#238][Reported by Dan Buettner]
+
+  * Added `:symbol_raw` to built-in header converters.
+    [GitHub#237][Reported by taki]
+    [GitHub#239][Patch by Eriko Sugiyama]
+
+### Fixes
+
+  * Fixed a bug that some texts may be dropped unexpectedly.
+    [Bug #18245][ruby-core:105587][Reported by Hassan Abdul Rehman]
+
+  * Fixed a bug that `:field_size_limit` doesn't work with not complex row.
+    [GitHub#238][Reported by Dan Buettner]
+
+### Thanks
+
+  * Hassan Abdul Rehman
+
+  * Eriko Sugiyama
+
+  * Jean byroot Boussier
+
+  * Nikolay Rys
+
+  * Sampat Badhe
+
+  * Dan Buettner
+
+  * taki
+
+## 3.2.2 - 2021-12-24
+
+### Improvements
+
+  * Added a validation for invalid option combination.
+    [GitHub#225][Patch by adamroyjones]
+
+  * Improved documentation for developers.
+    [GitHub#227][Patch by Eriko Sugiyama]
+
+### Fixes
+
+  * Fixed a bug that all of `ARGF` contents may not be consumed.
+    [GitHub#228][Reported by Rafael Navaza]
+
+### Thanks
+
+  * adamroyjones
+
+  * Eriko Sugiyama
+
+  * Rafael Navaza
+
 ## 3.2.1 - 2021-10-23
 
 ### Improvements
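The 3.2.3 entry above adds `:symbol_raw` to the built-in header converters. The following is a minimal sketch of how it seems to differ from the older `:symbol` converter, assuming csv 3.2.3 or later is installed; the sample data and variable names are illustrative only.

```ruby
require "csv"

source = "First Name,Value\nfoo,0\n"

# :symbol_raw turns each header into a Symbol without normalizing it.
raw_headers = CSV.parse(source, headers: true,
                        header_converters: :symbol_raw).headers

# :symbol also downcases the header and replaces whitespace with "_".
normalized_headers = CSV.parse(source, headers: true,
                               header_converters: :symbol).headers

p raw_headers
p normalized_headers
```

`:symbol_raw` is useful when header text must round-trip unchanged, for example when writing the same headers back out.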
data/README.md
CHANGED
@@ -35,7 +35,7 @@ end
 
 ## Development
 
-After checking out the repo, run `
+After checking out the repo, run `ruby run-test.rb` to check if your changes can pass the test.
 
 To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
 
data/doc/csv/recipes/generating.rdoc
CHANGED
@@ -148,7 +148,7 @@ This example defines and uses a custom write converter to strip whitespace from
 
 ==== Recipe: Specify Multiple Write Converters
 
-Use option <tt>:write_converters</tt> and multiple custom
+Use option <tt>:write_converters</tt> and multiple custom converters
 to convert field values when generating \CSV.
 
 This example defines and uses two custom write converters to strip and upcase generated fields:
data/doc/csv/recipes/parsing.rdoc
CHANGED
@@ -83,7 +83,7 @@ Use instance method CSV#each with option +headers+ to read a source \String one
   CSV.new(string, headers: true).each do |row|
     p row
   end
-
+Output:
   #<CSV::Row "Name":"foo" "Value":"0">
   #<CSV::Row "Name":"bar" "Value":"1">
   #<CSV::Row "Name":"baz" "Value":"2">
data/lib/csv/fields_converter.rb
CHANGED
@@ -44,7 +44,7 @@ class CSV
       @converters.empty?
     end
 
-    def convert(fields, headers, lineno)
+    def convert(fields, headers, lineno, quoted_fields)
       return fields unless need_convert?
 
       fields.collect.with_index do |field, index|
@@ -63,7 +63,8 @@ class CSV
             else
               header = nil
             end
-
+            quoted = quoted_fields[index]
+            field = converter[field, FieldInfo.new(index, lineno, header, quoted)]
           end
           break unless field.is_a?(String) # short-circuit pipeline for speed
         end
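With this change, two-argument converters receive a `CSV::FieldInfo` that also carries the quoted flag. The following is a minimal sketch of a converter that uses it, assuming csv 3.2.4 or later; the converter name `quoted_aware` and the sample row are hypothetical.

```ruby
require "csv"

# Hypothetical converter: keep quoted fields as strings and turn
# unquoted integer-looking fields into Integer objects.
quoted_aware = lambda do |field, info|
  next field if info.quoted?
  Integer(field, exception: false) || field
end

rows = CSV.parse(%Q{1,"2",three\n}, converters: [quoted_aware])
p rows
```

The quoted `"2"` stays a String while the bare `1` is converted, which lets a converter tell text that merely looks numeric from an intended number.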
data/lib/csv/input_record_separator.rb
CHANGED
@@ -4,20 +4,7 @@ require "stringio"
 class CSV
   module InputRecordSeparator
     class << self
-
-      verbose, $VERBOSE = $VERBOSE, true
-      stderr, $stderr = $stderr, StringIO.new
-      input_record_separator = $INPUT_RECORD_SEPARATOR
-      begin
-        $INPUT_RECORD_SEPARATOR = "\r\n"
-        is_input_record_separator_deprecated = (not $stderr.string.empty?)
-      ensure
-        $INPUT_RECORD_SEPARATOR = input_record_separator
-        $stderr = stderr
-        $VERBOSE = verbose
-      end
-
-      if is_input_record_separator_deprecated
+      if RUBY_VERSION >= "3.0.0"
         def value
           "\n"
         end
data/lib/csv/parser.rb
CHANGED
@@ -2,15 +2,10 @@
 
 require "strscan"
 
-require_relative "delete_suffix"
 require_relative "input_record_separator"
-require_relative "match_p"
 require_relative "row"
 require_relative "table"
 
-using CSV::DeleteSuffix if CSV.const_defined?(:DeleteSuffix)
-using CSV::MatchP if CSV.const_defined?(:MatchP)
-
 class CSV
   # Note: Don't use this class directly. This is an internal class.
   class Parser
@@ -27,6 +22,10 @@ class CSV
     class InvalidEncoding < StandardError
     end
 
+    # Raised when unexpected case is happen.
+    class UnexpectedError < StandardError
+    end
+
     #
     # CSV::Scanner receives a CSV output, scans it and return the content.
     # It also controls the life cycle of the object with its methods +keep_start+,
@@ -78,16 +77,17 @@ class CSV
     # +keep_end+, +keep_back+, +keep_drop+.
     #
     # CSV::InputsScanner.scan() tries to match with pattern at the current position.
-    # If there's a match, the scanner advances the
+    # If there's a match, the scanner advances the "scan pointer" and returns the matched string.
     # Otherwise, the scanner returns nil.
     #
-    # CSV::InputsScanner.rest() returns the
+    # CSV::InputsScanner.rest() returns the "rest" of the string (i.e. everything after the scan pointer).
     # If there is no more data (eos? = true), it returns "".
     #
     class InputsScanner
-      def initialize(inputs, encoding, chunk_size: 8192)
+      def initialize(inputs, encoding, row_separator, chunk_size: 8192)
         @inputs = inputs.dup
         @encoding = encoding
+        @row_separator = row_separator
         @chunk_size = chunk_size
         @last_scanner = @inputs.empty?
         @keeps = []
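The comment above describes `scan` and `rest` in terms of a "scan pointer". `InputsScanner` is internal, so the following sketch uses the standard library's `StringScanner`, which it wraps, to show the same behavior. The sample input is illustrative.

```ruby
require "strscan"

scanner = StringScanner.new("foo,bar\n")

matched = scanner.scan(/[^,]+/) # match at the pointer; the pointer advances
remaining = scanner.rest        # everything after the scan pointer
missed = scanner.scan(/x/)      # no match at the pointer returns nil

scanner.terminate               # move the pointer to the end of the string
rest_at_end = scanner.rest      # nothing left once eos? is true

p matched, remaining, missed, rest_at_end
```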
@@ -95,11 +95,13 @@ class CSV
       end
 
       def each_line(row_separator)
+        return enum_for(__method__, row_separator) unless block_given?
         buffer = nil
         input = @scanner.rest
         position = @scanner.pos
         offset = 0
         n_row_separator_chars = row_separator.size
+        # trace(__method__, :start, line, input)
         while true
           input.each_line(row_separator) do |line|
             @scanner.pos += line.bytesize
@@ -139,25 +141,28 @@ class CSV
       end
 
       def scan(pattern)
+        # trace(__method__, pattern, :start)
         value = @scanner.scan(pattern)
+        # trace(__method__, pattern, :done, :last, value) if @last_scanner
         return value if @last_scanner
 
-        if value
-
-
-        else
-          nil
-        end
+        read_chunk if value and @scanner.eos?
+        # trace(__method__, pattern, :done, value)
+        value
       end
 
       def scan_all(pattern)
+        # trace(__method__, pattern, :start)
         value = @scanner.scan(pattern)
+        # trace(__method__, pattern, :done, :last, value) if @last_scanner
         return value if @last_scanner
 
         return nil if value.nil?
         while @scanner.eos? and read_chunk and (sub_value = @scanner.scan(pattern))
+          # trace(__method__, pattern, :sub, sub_value)
           value << sub_value
         end
+        # trace(__method__, pattern, :done, value)
         value
       end
 
@@ -166,76 +171,135 @@ class CSV
       end
 
       def keep_start
-
+        # trace(__method__, :start)
+        adjust_last_keep
+        @keeps.push([@scanner, @scanner.pos, nil])
+        # trace(__method__, :done)
       end
 
       def keep_end
-
-
+        # trace(__method__, :start)
+        scanner, start, buffer = @keeps.pop
+        if scanner == @scanner
+          keep = @scanner.string.byteslice(start, @scanner.pos - start)
+        else
+          keep = @scanner.string.byteslice(0, @scanner.pos)
+        end
         if buffer
           buffer << keep
           keep = buffer
         end
+        # trace(__method__, :done, keep)
         keep
       end
 
       def keep_back
-
+        # trace(__method__, :start)
+        scanner, start, buffer = @keeps.pop
         if buffer
+          # trace(__method__, :rescan, start, buffer)
           string = @scanner.string
-
+          if scanner == @scanner
+            keep = string.byteslice(start, string.bytesize - start)
+          else
+            keep = string
+          end
           if keep and not keep.empty?
             @inputs.unshift(StringIO.new(keep))
             @last_scanner = false
           end
           @scanner = StringScanner.new(buffer)
         else
+          if @scanner != scanner
+            message = "scanners are different but no buffer: "
+            message += "#{@scanner.inspect}(#{@scanner.object_id}): "
+            message += "#{scanner.inspect}(#{scanner.object_id})"
+            raise UnexpectedError, message
+          end
+          # trace(__method__, :repos, start, buffer)
           @scanner.pos = start
         end
         read_chunk if @scanner.eos?
       end
 
       def keep_drop
-        @keeps.pop
+        _, _, buffer = @keeps.pop
+        # trace(__method__, :done, :empty) unless buffer
+        return unless buffer
+
+        last_keep = @keeps.last
+        # trace(__method__, :done, :no_last_keep) unless last_keep
+        return unless last_keep
+
+        if last_keep[2]
+          last_keep[2] << buffer
+        else
+          last_keep[2] = buffer
+        end
+        # trace(__method__, :done)
       end
 
       def rest
         @scanner.rest
       end
 
+      def check(pattern)
+        @scanner.check(pattern)
+      end
+
       private
-      def
-
+      def trace(*args)
+        pp([*args, @scanner, @scanner&.string, @scanner&.pos, @keeps])
+      end
 
-
-
-
-
-
-
-
-
-
-
-
-
+      def adjust_last_keep
+        # trace(__method__, :start)
+
+        keep = @keeps.last
+        # trace(__method__, :done, :empty) if keep.nil?
+        return if keep.nil?
+
+        scanner, start, buffer = keep
+        string = @scanner.string
+        if @scanner != scanner
+          start = 0
+        end
+        if start == 0 and @scanner.eos?
+          keep_data = string
+        else
+          keep_data = string.byteslice(start, @scanner.pos - start)
+        end
+        if keep_data
+          if buffer
+            buffer << keep_data
+          else
+            keep[2] = keep_data.dup
           end
-          keep[0] = 0
         end
 
+        # trace(__method__, :done)
+      end
+
+      def read_chunk
+        return false if @last_scanner
+
+        adjust_last_keep
+
         input = @inputs.first
         case input
         when StringIO
           string = input.read
           raise InvalidEncoding unless string.valid_encoding?
+          # trace(__method__, :stringio, string)
           @scanner = StringScanner.new(string)
           @inputs.shift
           @last_scanner = @inputs.empty?
           true
         else
-          chunk = input.gets(
+          chunk = input.gets(@row_separator, @chunk_size)
           if chunk
             raise InvalidEncoding unless chunk.valid_encoding?
+            # trace(__method__, :chunk, chunk)
             @scanner = StringScanner.new(chunk)
             if input.respond_to?(:eof?) and input.eof?
               @inputs.shift
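In the hunk above, `read_chunk` now reads with `input.gets(@row_separator, @chunk_size)`. The following sketch shows how `gets(separator, limit)` behaves on a `StringIO`: a read stops at the separator or at the byte limit, whichever comes first. The input string and the limits are illustrative, not values the parser uses.

```ruby
require "stringio"

io = StringIO.new("a,b\r\nc,d\r\n")

# The separator is found well before the limit: one whole row is returned.
first_chunk = io.gets("\r\n", 8192)

# The limit is hit before the separator: only two bytes are returned.
second_chunk = io.gets("\r\n", 2)

# The next read picks up where the previous one stopped.
third_chunk = io.gets("\r\n", 8192)

p first_chunk, second_chunk, third_chunk
```

A chunk can therefore end in the middle of a row, so the scanner still has to join data across chunks.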
@@ -243,6 +307,7 @@ class CSV
             end
             true
           else
+            # trace(__method__, :no_chunk)
             @scanner = StringScanner.new("".encode(@encoding))
             @inputs.shift
             @last_scanner = @inputs.empty?
@@ -277,7 +342,11 @@ class CSV
     end
 
     def field_size_limit
-      @
+      @max_field_size&.succ
+    end
+
+    def max_field_size
+      @max_field_size
     end
 
     def skip_lines
@@ -345,6 +414,16 @@ class CSV
         end
         message = "Invalid byte sequence in #{@encoding}"
         raise MalformedCSVError.new(message, lineno)
+      rescue UnexpectedError => error
+        if @scanner
+          ignore_broken_line
+          lineno = @lineno
+        else
+          lineno = @lineno + 1
+        end
+        message = "This should not be happen: #{error.message}: "
+        message += "Please report this to https://github.com/ruby/csv/issues"
+        raise MalformedCSVError.new(message, lineno)
       end
     end
 
@@ -361,6 +440,7 @@ class CSV
       prepare_skip_lines
       prepare_strip
       prepare_separators
+      validate_strip_and_col_sep_options
       prepare_quoted
       prepare_unquoted
       prepare_line
@@ -388,7 +468,7 @@ class CSV
         @backslash_quote = false
       end
       @unconverted_fields = @options[:unconverted_fields]
-      @
+      @max_field_size = @options[:max_field_size]
       @skip_blanks = @options[:skip_blanks]
       @fields_converter = @options[:fields_converter]
       @header_fields_converter = @options[:header_fields_converter]
@@ -531,6 +611,28 @@ class CSV
       @not_line_end = Regexp.new("[^\r\n]+".encode(@encoding))
     end
 
+    # This method verifies that there are no (obvious) ambiguities with the
+    # provided +col_sep+ and +strip+ parsing options. For example, if +col_sep+
+    # and +strip+ were both equal to +\t+, then there would be no clear way to
+    # parse the input.
+    def validate_strip_and_col_sep_options
+      return unless @strip
+
+      if @strip.is_a?(String)
+        if @column_separator.start_with?(@strip) || @column_separator.end_with?(@strip)
+          raise ArgumentError,
+                "The provided strip (#{@escaped_strip}) and " \
+                "col_sep (#{@escaped_column_separator}) options are incompatible."
+        end
+      else
+        if Regexp.new("\\A[#{@escaped_strip}]|[#{@escaped_strip}]\\z").match?(@column_separator)
+          raise ArgumentError,
+                "The provided strip (true) and " \
+                "col_sep (#{@escaped_column_separator}) options are incompatible."
+        end
+      end
+    end
+
     def prepare_quoted
       if @quote_character
         @quotes = Regexp.new(@escaped_quote_character +
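The new validation rejects option combinations where stripping would consume the column separator. The following is a minimal sketch of what a caller would likely see, assuming csv 3.2.2 or later; the sample input and the `outcome` variable are illustrative.

```ruby
require "csv"

outcome = nil
begin
  # strip: true strips whitespace, including tabs, so a tab column
  # separator is ambiguous and is expected to be rejected.
  CSV.parse("a\tb\n", col_sep: "\t", strip: true)
  outcome = :parsed
rescue ArgumentError => error
  outcome = :rejected
  puts error.message
end

p outcome
```

The error is raised when the parser prepares its options, so it appears on the first parse rather than partway through the data.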
@@ -656,9 +758,10 @@ class CSV
       case headers
       when Array
         @raw_headers = headers
+        quoted_fields = [false] * @raw_headers.size
         @use_headers = true
       when String
-        @raw_headers = parse_headers(headers)
+        @raw_headers, quoted_fields = parse_headers(headers)
         @use_headers = true
       when nil, false
         @raw_headers = nil
@@ -668,21 +771,28 @@ class CSV
         @use_headers = true
       end
       if @raw_headers
-        @headers = adjust_headers(@raw_headers)
+        @headers = adjust_headers(@raw_headers, quoted_fields)
       else
         @headers = nil
       end
     end
 
     def parse_headers(row)
-
-
-
-
+      quoted_fields = []
+      converter = lambda do |field, info|
+        quoted_fields << info.quoted?
+        field
+      end
+      headers = CSV.parse_line(row,
+                               col_sep: @column_separator,
+                               row_sep: @row_separator,
+                               quote_char: @quote_character,
+                               converters: [converter])
+      [headers, quoted_fields]
     end
 
-    def adjust_headers(headers)
-      adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno)
+    def adjust_headers(headers, quoted_fields)
+      adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno, quoted_fields)
       adjusted_headers.each {|h| h.freeze if h.is_a? String}
       adjusted_headers
     end
@@ -705,28 +815,28 @@ class CSV
       sample[0, 128].index(@quote_character)
     end
 
-
-
-
-
-          @io = StringIO.new(string, "rb:#{string.encoding}")
-        end
+    class UnoptimizedStringIO # :nodoc:
+      def initialize(string)
+        @io = StringIO.new(string, "rb:#{string.encoding}")
+      end
 
-
-
-
+      def gets(*args)
+        @io.gets(*args)
+      end
 
-
-
-
+      def each_line(*args, &block)
+        @io.each_line(*args, &block)
+      end
 
-
-
-        end
+      def eof?
+        @io.eof?
       end
+    end
 
-
-
+    SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
+    if SCANNER_TEST
+      SCANNER_TEST_CHUNK_SIZE_NAME = "CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"
+      SCANNER_TEST_CHUNK_SIZE_VALUE = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
       def build_scanner
         inputs = @samples.collect do |sample|
           UnoptimizedStringIO.new(sample)
@@ -736,16 +846,27 @@ class CSV
         else
           inputs << @input
         end
+        begin
+          chunk_size_value = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
+        rescue # Ractor::IsolationError
+          # Ractor on Ruby 3.0 can't read ENV value.
+          chunk_size_value = SCANNER_TEST_CHUNK_SIZE_VALUE
+        end
+        chunk_size = Integer((chunk_size_value || "1"), 10)
         InputsScanner.new(inputs,
                           @encoding,
-
+                          @row_separator,
+                          chunk_size: chunk_size)
       end
     else
       def build_scanner
         string = nil
         if @samples.empty? and @input.is_a?(StringIO)
           string = @input.read
-        elsif @samples.size == 1 and
+        elsif @samples.size == 1 and
+              @input != ARGF and
+              @input.respond_to?(:eof?) and
+              @input.eof?
           string = @samples[0]
         end
         if string
@@ -764,7 +885,7 @@ class CSV
             StringIO.new(sample)
           end
           inputs << @input
-          InputsScanner.new(inputs, @encoding)
+          InputsScanner.new(inputs, @encoding, @row_separator)
         end
       end
     end
@@ -798,6 +919,14 @@ class CSV
       end
     end
 
+    def validate_field_size(field)
+      return unless @max_field_size
+      return if field.size <= @max_field_size
+      ignore_broken_line
+      message = "Field size exceeded: #{field.size} > #{@max_field_size}"
+      raise MalformedCSVError.new(message, @lineno)
+    end
+
     def parse_no_quote(&block)
       @scanner.each_line(@row_separator) do |line|
         next if @skip_lines and skip_line?(line)
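`validate_field_size` backs the new `:max_field_size` option listed in NEWS.md. The following is a minimal sketch of how a caller might trip it, assuming csv 3.2.3 or later; the sample data and the limit of 5 are illustrative.

```ruby
require "csv"

message = nil
begin
  # "abcdef" has 6 characters, one more than the allowed maximum of 5.
  CSV.parse("abcdef,1\n", max_field_size: 5)
rescue CSV::MalformedCSVError => error
  message = error.message
end

puts message
```

Per the `field_size_limit` reader in this diff (`@max_field_size&.succ`), the deprecated `:field_size_limit` is reported as one more than `:max_field_size`.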
@@ -807,9 +936,16 @@ class CSV
         if line.empty?
           next if @skip_blanks
           row = []
+          quoted_fields = []
         else
           line = strip_value(line)
           row = line.split(@split_column_separator, -1)
+          quoted_fields = [false] * row.size
+          if @max_field_size
+            row.each do |column|
+              validate_field_size(column)
+            end
+          end
           n_columns = row.size
           i = 0
           while i < n_columns
@@ -818,7 +954,7 @@ class CSV
           end
         end
         @last_line = original_line
-        emit_row(row, &block)
+        emit_row(row, quoted_fields, &block)
       end
     end
 
@@ -840,31 +976,37 @@ class CSV
             next
           end
           row = []
+          quoted_fields = []
         elsif line.include?(@cr) or line.include?(@lf)
           @scanner.keep_back
           @need_robust_parsing = true
           return parse_quotable_robust(&block)
         else
           row = line.split(@split_column_separator, -1)
+          quoted_fields = []
           n_columns = row.size
           i = 0
           while i < n_columns
             column = row[i]
             if column.empty?
+              quoted_fields << false
               row[i] = nil
             else
               n_quotes = column.count(@quote_character)
               if n_quotes.zero?
+                quoted_fields << false
                 # no quote
               elsif n_quotes == 2 and
                    column.start_with?(@quote_character) and
                    column.end_with?(@quote_character)
+                quoted_fields << true
                 row[i] = column[1..-2]
               else
                 @scanner.keep_back
                 @need_robust_parsing = true
                 return parse_quotable_robust(&block)
               end
+              validate_field_size(row[i])
             end
             i += 1
           end
@@ -872,13 +1014,14 @@ class CSV
         @scanner.keep_drop
         @scanner.keep_start
         @last_line = original_line
-        emit_row(row, &block)
+        emit_row(row, quoted_fields, &block)
       end
       @scanner.keep_drop
     end
 
     def parse_quotable_robust(&block)
       row = []
+      quoted_fields = []
       skip_needless_lines
       start_row
       while true
@@ -888,32 +1031,39 @@ class CSV
         value = parse_column_value
         if value
           @scanner.scan_all(@strip_value) if @strip_value
-
-            ignore_broken_line
-            raise MalformedCSVError.new("Field size exceeded", @lineno)
-          end
+          validate_field_size(value)
         end
         if parse_column_end
           row << value
+          quoted_fields << @quoted_column_value
         elsif parse_row_end
           if row.empty? and value.nil?
-            emit_row([], &block) unless @skip_blanks
+            emit_row([], [], &block) unless @skip_blanks
           else
             row << value
-
+            quoted_fields << @quoted_column_value
+            emit_row(row, quoted_fields, &block)
             row = []
+            quoted_fields = []
           end
           skip_needless_lines
           start_row
         elsif @scanner.eos?
           break if row.empty? and value.nil?
           row << value
-
+          quoted_fields << @quoted_column_value
+          emit_row(row, quoted_fields, &block)
           break
         else
           if @quoted_column_value
+            if liberal_parsing? and (new_line = @scanner.check(@line_end))
+              message =
+                "Illegal end-of-line sequence outside of a quoted field " +
+                "<#{new_line.inspect}>"
+            else
+              message = "Any value after quoted field isn't allowed"
+            end
             ignore_broken_line
-            message = "Any value after quoted field isn't allowed"
             raise MalformedCSVError.new(message, @lineno)
           elsif @unquoted_column_value and
                 (new_line = @scanner.scan(@line_end))
@@ -1006,7 +1156,7 @@ class CSV
       if (n_quotes % 2).zero?
         quotes[0, (n_quotes - 2) / 2]
       else
-        value = quotes[0,
+        value = quotes[0, n_quotes / 2]
         while true
           quoted_value = @scanner.scan_all(@quoted_value)
           value << quoted_value if quoted_value
@@ -1030,11 +1180,9 @@ class CSV
           n_quotes = quotes.size
           if n_quotes == 1
             break
-          elsif (n_quotes % 2) == 1
-            value << quotes[0, (n_quotes - 1) / 2]
-            break
           else
             value << quotes[0, n_quotes / 2]
+            break if (n_quotes % 2) == 1
           end
         end
         value
@@ -1070,18 +1218,15 @@ class CSV
 
     def strip_value(value)
       return value unless @strip
-      return
+      return value if value.nil?
 
       case @strip
       when String
-
-
-          size -= 1
-          value = value[1, size]
+        while value.delete_prefix!(@strip)
+          # do nothing
         end
-        while value.
-
-          value = value[0, size]
+        while value.delete_suffix!(@strip)
+          # do nothing
         end
       else
         value.strip!
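The rewritten `strip_value` relies on `String#delete_prefix!` and `String#delete_suffix!` returning `nil` once there is nothing left to remove. The following standalone sketch shows the same loop; the helper name `strip_both_ends` is hypothetical, and unlike the parser it works on a copy instead of mutating its argument.

```ruby
# Hypothetical helper mirroring the loop in strip_value.
def strip_both_ends(value, strip)
  value = value.dup
  while value.delete_prefix!(strip)
    # delete_prefix! returns nil when no prefix was removed, ending the loop
  end
  while value.delete_suffix!(strip)
    # delete_suffix! returns nil when no suffix was removed, ending the loop
  end
  value
end

stripped = strip_both_ends("--a-b--", "-")
p stripped
```

Only leading and trailing occurrences are removed; the `-` between `a` and `b` is kept.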
@@ -1104,22 +1249,22 @@ class CSV
       @scanner.keep_start
     end
 
-    def emit_row(row, &block)
+    def emit_row(row, quoted_fields, &block)
       @lineno += 1
 
       raw_row = row
       if @use_headers
         if @headers.nil?
-          @headers = adjust_headers(row)
+          @headers = adjust_headers(row, quoted_fields)
           return unless @return_headers
           row = Row.new(@headers, row, true)
         else
           row = Row.new(@headers,
-                        @fields_converter.convert(raw_row, @headers, @lineno))
+                        @fields_converter.convert(raw_row, @headers, @lineno, quoted_fields))
         end
       else
         # convert fields, if needed...
-        row = @fields_converter.convert(raw_row, nil, @lineno)
+        row = @fields_converter.convert(raw_row, nil, @lineno, quoted_fields)
       end
 
       # inject unconverted fields and accessor, if requested...
data/lib/csv/row.rb
CHANGED
@@ -703,7 +703,7 @@ class CSV
     # by +index_or_header+ and +specifiers+.
     #
     # The nested objects may be instances of various classes.
-    # See {Dig Methods}[https://docs.ruby-lang.org/en/master/
+    # See {Dig Methods}[https://docs.ruby-lang.org/en/master/dig_methods_rdoc.html].
     #
     # Examples:
     # source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
data/lib/csv/table.rb
CHANGED
@@ -999,9 +999,15 @@ class CSV
     # Omits the headers if option +write_headers+ is given as +false+
     # (see {Option +write_headers+}[../CSV.html#class-CSV-label-Option+write_headers]):
     # table.to_csv(write_headers: false) # => "foo,0\nbar,1\nbaz,2\n"
-
+    #
+    # Limit rows if option +limit+ is given like +2+:
+    # table.to_csv(limit: 2) # => "Name,Value\nfoo,0\nbar,1\n"
+    def to_csv(write_headers: true, limit: nil, **options)
       array = write_headers ? [headers.to_csv(**options)] : []
-      @table.
+      limit ||= @table.size
+      limit = @table.size + 1 + limit if limit < 0
+      limit = 0 if limit < 0
+      @table.first(limit).each do |row|
         array.push(row.fields.to_csv(**options)) unless row.header_row?
       end
 
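The new `limit:` option also accepts negative values: the code above maps a negative limit to `@table.size + 1 + limit`, so `-1` keeps every row and `-2` drops the last one. The following is a small sketch using the documentation's own sample data, assuming csv 3.2.3 or later.

```ruby
require "csv"

source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
table = CSV.parse(source, headers: true)

first_two = table.to_csv(limit: 2)      # headers plus the first two rows
all_rows = table.to_csv(limit: -1)      # 3 + 1 - 1 = 3 rows
all_but_last = table.to_csv(limit: -2)  # 3 + 1 - 2 = 2 rows

puts first_two
puts all_but_last
```

`CSV::Table#inspect` in the next hunk builds its summary with `to_csv(limit: 5)`.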
@@ -1038,9 +1044,13 @@ class CSV
     # Example:
     # source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
     # table = CSV.parse(source, headers: true)
-    # table.inspect # => "#<CSV::Table mode:col_or_row row_count:4
+    # table.inspect # => "#<CSV::Table mode:col_or_row row_count:4>\nName,Value\nfoo,0\nbar,1\nbaz,2\n"
+    #
     def inspect
-      "#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
+      inspected = +"#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
+      summary = to_csv(limit: 5)
+      inspected << "\n" << summary if summary.encoding.ascii_compatible?
+      inspected
     end
   end
 end
data/lib/csv/version.rb
CHANGED
data/lib/csv/writer.rb
CHANGED
@@ -1,11 +1,8 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
2
|
|
3
3
|
require_relative "input_record_separator"
|
4
|
-
require_relative "match_p"
|
5
4
|
require_relative "row"
|
6
5
|
|
7
|
-
using CSV::MatchP if CSV.const_defined?(:MatchP)
|
8
|
-
|
9
6
|
class CSV
|
10
7
|
# Note: Don't use this class directly. This is an internal class.
|
11
8
|
class Writer
|
@@ -42,7 +39,10 @@ class CSV
|
|
42
39
|
@headers ||= row if @use_headers
|
43
40
|
@lineno += 1
|
44
41
|
|
45
|
-
|
42
|
+
if @fields_converter
|
43
|
+
quoted_fields = [false] * row.size
|
44
|
+
row = @fields_converter.convert(row, nil, lineno, quoted_fields)
|
45
|
+
end
|
46
46
|
|
47
47
|
i = -1
|
48
48
|
converted_row = row.collect do |field|
|
@@ -97,7 +97,7 @@ class CSV
|
|
97
97
|
return unless @headers
|
98
98
|
|
99
99
|
converter = @options[:header_fields_converter]
|
100
|
-
@headers = converter.convert(@headers, nil, 0)
|
100
|
+
@headers = converter.convert(@headers, nil, 0, [])
|
101
101
|
@headers.each do |header|
|
102
102
|
header.freeze if header.is_a?(String)
|
103
103
|
end
|
data/lib/csv.rb
CHANGED
@@ -95,14 +95,11 @@ require "stringio"
|
|
95
95
|
|
96
96
|
require_relative "csv/fields_converter"
|
97
97
|
require_relative "csv/input_record_separator"
|
98
|
-
require_relative "csv/match_p"
|
99
98
|
require_relative "csv/parser"
|
100
99
|
require_relative "csv/row"
|
101
100
|
require_relative "csv/table"
|
102
101
|
require_relative "csv/writer"
|
103
102
|
|
104
|
-
using CSV::MatchP if CSV.const_defined?(:MatchP)
|
105
|
-
|
106
103
|
# == \CSV
|
107
104
|
#
|
108
105
|
# === In a Hurry?
|
@@ -341,6 +338,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
341
338
|
# liberal_parsing: false,
|
342
339
|
# nil_value: nil,
|
343
340
|
# empty_value: "",
|
341
|
+
# strip: false,
|
344
342
|
# # For generating.
|
345
343
|
# write_headers: nil,
|
346
344
|
# quote_empty: true,
|
@@ -348,7 +346,6 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
348
346
|
# write_converters: nil,
|
349
347
|
# write_nil_value: nil,
|
350
348
|
# write_empty_value: "",
|
351
|
-
# strip: false,
|
352
349
|
# }
|
353
350
|
#
|
354
351
|
# ==== Options for Parsing
|
@@ -357,7 +354,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
357
354
|
# - +row_sep+: Specifies the row separator; used to delimit rows.
|
358
355
|
# - +col_sep+: Specifies the column separator; used to delimit fields.
|
359
356
|
# - +quote_char+: Specifies the quote character; used to quote fields.
|
360
|
-
# - +field_size_limit+: Specifies the maximum field size allowed.
|
357
|
+
# - +field_size_limit+: Specifies the maximum field size + 1 allowed.
|
358
|
+
# Deprecated since 3.2.3. Use +max_field_size+ instead.
|
359
|
+
# - +max_field_size+: Specifies the maximum field size allowed.
|
361
360
|
# - +converters+: Specifies the field converters to be used.
|
362
361
|
# - +unconverted_fields+: Specifies whether unconverted fields are to be available.
|
363
362
|
# - +headers+: Specifies whether data contains headers,
|
@@ -366,8 +365,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
366
365
|
# - +header_converters+: Specifies the header converters to be used.
|
367
366
|
# - +skip_blanks+: Specifies whether blanks lines are to be ignored.
|
368
367
|
# - +skip_lines+: Specifies how comments lines are to be recognized.
|
369
|
-
# - +strip+: Specifies whether leading and trailing whitespace are
|
370
|
-
#
|
368
|
+
# - +strip+: Specifies whether leading and trailing whitespace are to be
|
369
|
+
# stripped from fields. This must be compatible with +col_sep+; if it is not,
|
370
|
+
# then an +ArgumentError+ exception will be raised.
|
371
371
|
# - +liberal_parsing+: Specifies whether \CSV should attempt to parse
|
372
372
|
# non-compliant data.
|
373
373
|
# - +nil_value+: Specifies the object that is to be substituted for each null (no-text) field.
|
@@ -863,8 +863,9 @@ class CSV
|
|
863
863
|
# <b><tt>index</tt></b>:: The zero-based index of the field in its row.
|
864
864
|
# <b><tt>line</tt></b>:: The line of the data source this row is from.
|
865
865
|
# <b><tt>header</tt></b>:: The header for the column, when available.
|
866
|
+
# <b><tt>quoted?</tt></b>:: True or false, whether the original value is quoted or not.
|
866
867
|
#
|
867
|
-
FieldInfo = Struct.new(:index, :line, :header)
|
868
|
+
FieldInfo = Struct.new(:index, :line, :header, :quoted?)
|
868
869
|
|
869
870
|
# A Regexp used to find and convert some common Date formats.
|
870
871
|
DateMatcher = / \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
|
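Note on the hunk above: `FieldInfo` gains a fourth member, `quoted?`. A minimal sketch of how a two-argument converter might use it, assuming a stand-in struct `FieldInfoSketch` and a hypothetical converter `integer_unless_quoted` (in the gem, the converter would receive a real `CSV::FieldInfo`):

```ruby
# Stand-in with the same member list as the updated CSV::FieldInfo.
FieldInfoSketch = Struct.new(:index, :line, :header, :quoted?)

# Hypothetical converter: keep quoted fields as strings, and convert
# unquoted fields to Integer when they look like integers.
integer_unless_quoted = lambda do |field, field_info|
  next field if field_info.quoted?
  Integer(field, exception: false) || field
end

unquoted_info = FieldInfoSketch.new(0, 1, "Value", false)
quoted_info   = FieldInfoSketch.new(0, 1, "Value", true)

converted_unquoted = integer_unless_quoted.call("42", unquoted_info)
converted_quoted   = integer_unless_quoted.call("42", quoted_info)
converted_text     = integer_unless_quoted.call("foo", unquoted_info)
```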
@@ -872,10 +873,9 @@ class CSV
|
|
872
873
|
# A Regexp used to find and convert some common DateTime formats.
|
873
874
|
DateTimeMatcher =
|
874
875
|
/ \A(?: (\w+,?\s+)?\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2},?\s+\d{2,4} |
|
875
|
-
|
876
|
-
# ISO-8601
|
876
|
+
# ISO-8601 and RFC-3339 (space instead of T) recognized by DateTime.parse
|
877
877
|
\d{4}-\d{2}-\d{2}
|
878
|
-
(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
|
878
|
+
(?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
|
879
879
|
)\z /x
|
880
880
|
|
881
881
|
# The encoding used by all converters.
|
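Note on the hunk above: the only change to the date-time branch is `T` becoming `[T\s]`, so a whitespace separator between date and time is now accepted. A minimal sketch comparing the two, using only the `\d{4}-\d{2}-\d{2}` alternative of `DateTimeMatcher`, rewritten without the `/x` flag; the constant names are hypothetical:

```ruby
# Only the ISO-8601 / RFC 3339 alternative, before and after the change.
OLD_ISO_BRANCH =
  /\A\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?\z/
NEW_ISO_BRANCH =
  /\A\d{4}-\d{2}-\d{2}(?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?\z/

with_space = "2022-08-22 10:20:30+09:00"
with_t     = "2022-08-22T10:20:30.123Z"

old_accepts_space = OLD_ISO_BRANCH.match?(with_space)
new_accepts_space = NEW_ISO_BRANCH.match?(with_space)
new_accepts_t     = NEW_ISO_BRANCH.match?(with_t)
```

Here the old pattern rejects the space-separated value and the new one accepts it, while `T`-separated values match either way.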
@@ -925,7 +925,8 @@ class CSV
|
|
925
925
|
symbol: lambda { |h|
|
926
926
|
h.encode(ConverterEncoding).downcase.gsub(/[^\s\w]+/, "").strip.
|
927
927
|
gsub(/\s+/, "_").to_sym
|
928
|
-
}
|
928
|
+
},
|
929
|
+
symbol_raw: lambda { |h| h.encode(ConverterEncoding).to_sym }
|
929
930
|
}
|
930
931
|
|
931
932
|
# Default values for method options.
|
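Note on the hunk above: the new `symbol_raw` header converter skips the downcasing and character stripping that `symbol` performs. A minimal sketch of the difference, with the two lambdas copied from the hunk and a local constant standing in for `ConverterEncoding` (assumed here to be UTF-8):

```ruby
# Assumption: stand-in for CSV::ConverterEncoding.
CONVERTER_ENCODING_SKETCH = Encoding::UTF_8

symbol = lambda { |h|
  h.encode(CONVERTER_ENCODING_SKETCH).downcase.gsub(/[^\s\w]+/, "").strip.
    gsub(/\s+/, "_").to_sym
}
symbol_raw = lambda { |h| h.encode(CONVERTER_ENCODING_SKETCH).to_sym }

header = "First Name!"
normalized = symbol.call(header)      # lowercased, punctuation removed, spaces to "_"
raw        = symbol_raw.call(header)  # header text kept exactly as written
```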
@@ -936,6 +937,7 @@ class CSV
|
|
936
937
|
quote_char: '"',
|
937
938
|
# For parsing.
|
938
939
|
field_size_limit: nil,
|
940
|
+
max_field_size: nil,
|
939
941
|
converters: nil,
|
940
942
|
unconverted_fields: nil,
|
941
943
|
headers: false,
|
@@ -946,6 +948,7 @@ class CSV
|
|
946
948
|
liberal_parsing: false,
|
947
949
|
nil_value: nil,
|
948
950
|
empty_value: "",
|
951
|
+
strip: false,
|
949
952
|
# For generating.
|
950
953
|
write_headers: nil,
|
951
954
|
quote_empty: true,
|
@@ -953,7 +956,6 @@ class CSV
|
|
953
956
|
write_converters: nil,
|
954
957
|
write_nil_value: nil,
|
955
958
|
write_empty_value: "",
|
956
|
-
strip: false,
|
957
959
|
}.freeze
|
958
960
|
|
959
961
|
class << self
|
@@ -1864,6 +1866,7 @@ class CSV
|
|
1864
1866
|
row_sep: :auto,
|
1865
1867
|
quote_char: '"',
|
1866
1868
|
field_size_limit: nil,
|
1869
|
+
max_field_size: nil,
|
1867
1870
|
converters: nil,
|
1868
1871
|
unconverted_fields: nil,
|
1869
1872
|
headers: false,
|
@@ -1879,16 +1882,27 @@ class CSV
|
|
1879
1882
|
encoding: nil,
|
1880
1883
|
nil_value: nil,
|
1881
1884
|
empty_value: "",
|
1885
|
+
strip: false,
|
1882
1886
|
quote_empty: true,
|
1883
1887
|
write_converters: nil,
|
1884
1888
|
write_nil_value: nil,
|
1885
|
-
write_empty_value: ""
|
1886
|
-
strip: false)
|
1889
|
+
write_empty_value: "")
|
1887
1890
|
raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
|
1888
1891
|
|
1889
1892
|
if data.is_a?(String)
|
1893
|
+
if encoding
|
1894
|
+
if encoding.is_a?(String)
|
1895
|
+
data_external_encoding, data_internal_encoding = encoding.split(":", 2)
|
1896
|
+
if data_internal_encoding
|
1897
|
+
data = data.encode(data_internal_encoding, data_external_encoding)
|
1898
|
+
else
|
1899
|
+
data = data.dup.force_encoding(data_external_encoding)
|
1900
|
+
end
|
1901
|
+
else
|
1902
|
+
data = data.dup.force_encoding(encoding)
|
1903
|
+
end
|
1904
|
+
end
|
1890
1905
|
@io = StringIO.new(data)
|
1891
|
-
@io.set_encoding(encoding || data.encoding)
|
1892
1906
|
else
|
1893
1907
|
@io = data
|
1894
1908
|
end
|
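Note on the hunk above: for String input, the `encoding:` option is now applied to the data itself instead of to the `StringIO`. A minimal sketch of that branching, assuming a hypothetical helper name `apply_encoding_sketch`; the body follows the added lines:

```ruby
# Hypothetical helper mirroring the branching added to CSV#initialize.
def apply_encoding_sketch(data, encoding)
  return data unless encoding

  if encoding.is_a?(String)
    # "external:internal" means: the bytes are in external, transcode to internal.
    external, internal = encoding.split(":", 2)
    if internal
      data.encode(internal, external)
    else
      data.dup.force_encoding(external)
    end
  else
    # An Encoding object: relabel a copy, leaving the caller's string untouched.
    data.dup.force_encoding(encoding)
  end
end

utf16_data = "a,b\n".encode("UTF-16LE")
transcoded = apply_encoding_sketch(utf16_data, "UTF-16LE:UTF-8")

original  = "abc"
relabeled = apply_encoding_sketch(original, Encoding::US_ASCII)
```

Only the `external:internal` form transcodes; the other two forms relabel a copy of the bytes.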
@@ -1906,11 +1920,14 @@ class CSV
|
|
1906
1920
|
@initial_header_converters = header_converters
|
1907
1921
|
@initial_write_converters = write_converters
|
1908
1922
|
|
1923
|
+
if max_field_size.nil? and field_size_limit
|
1924
|
+
max_field_size = field_size_limit - 1
|
1925
|
+
end
|
1909
1926
|
@parser_options = {
|
1910
1927
|
column_separator: col_sep,
|
1911
1928
|
row_separator: row_sep,
|
1912
1929
|
quote_character: quote_char,
|
1913
|
-
|
1930
|
+
max_field_size: max_field_size,
|
1914
1931
|
unconverted_fields: unconverted_fields,
|
1915
1932
|
headers: headers,
|
1916
1933
|
return_headers: return_headers,
|
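Note on the hunk above: the deprecated `field_size_limit` is translated into `max_field_size` by subtracting one, and an explicit `max_field_size` takes precedence. A minimal sketch of that fallback, assuming a hypothetical helper name `effective_max_field_size`:

```ruby
# Hypothetical helper mirroring the fallback added before @parser_options.
def effective_max_field_size(max_field_size: nil, field_size_limit: nil)
  if max_field_size.nil? && field_size_limit
    # field_size_limit is documented above as "maximum field size + 1".
    max_field_size = field_size_limit - 1
  end
  max_field_size
end

from_deprecated = effective_max_field_size(field_size_limit: 10)
explicit_wins   = effective_max_field_size(max_field_size: 5, field_size_limit: 10)
unlimited       = effective_max_field_size
```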
@@ -1978,10 +1995,24 @@ class CSV
|
|
1978
1995
|
# Returns the limit for field size; used for parsing;
|
1979
1996
|
# see {Option +field_size_limit+}[#class-CSV-label-Option+field_size_limit]:
|
1980
1997
|
# CSV.new('').field_size_limit # => nil
|
1998
|
+
#
|
1999
|
+
# Deprecated since 3.2.3. Use +max_field_size+ instead.
|
1981
2000
|
def field_size_limit
|
1982
2001
|
parser.field_size_limit
|
1983
2002
|
end
|
1984
2003
|
|
2004
|
+
# :call-seq:
|
2005
|
+
# csv.max_field_size -> integer or nil
|
2006
|
+
#
|
2007
|
+
# Returns the limit for field size; used for parsing;
|
2008
|
+
# see {Option +max_field_size+}[#class-CSV-label-Option+max_field_size]:
|
2009
|
+
# CSV.new('').max_field_size # => nil
|
2010
|
+
#
|
2011
|
+
# Since 3.2.3.
|
2012
|
+
def max_field_size
|
2013
|
+
parser.max_field_size
|
2014
|
+
end
|
2015
|
+
|
1985
2016
|
# :call-seq:
|
1986
2017
|
# csv.skip_lines -> regexp or nil
|
1987
2018
|
#
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: csv
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 3.2.
|
4
|
+
version: 3.2.4
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- James Edward Gray II
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date:
|
12
|
+
date: 2022-08-22 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bundler
|
@@ -116,10 +116,8 @@ files:
|
|
116
116
|
- lib/csv.rb
|
117
117
|
- lib/csv/core_ext/array.rb
|
118
118
|
- lib/csv/core_ext/string.rb
|
119
|
-
- lib/csv/delete_suffix.rb
|
120
119
|
- lib/csv/fields_converter.rb
|
121
120
|
- lib/csv/input_record_separator.rb
|
122
|
-
- lib/csv/match_p.rb
|
123
121
|
- lib/csv/parser.rb
|
124
122
|
- lib/csv/row.rb
|
125
123
|
- lib/csv/table.rb
|
@@ -147,7 +145,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
147
145
|
- !ruby/object:Gem::Version
|
148
146
|
version: '0'
|
149
147
|
requirements: []
|
150
|
-
rubygems_version: 3.
|
148
|
+
rubygems_version: 3.4.0.dev
|
151
149
|
signing_key:
|
152
150
|
specification_version: 4
|
153
151
|
summary: CSV Reading and Writing
|
data/lib/csv/delete_suffix.rb
DELETED
@@ -1,18 +0,0 @@
|
|
1
|
-
# frozen_string_literal: true
|
2
|
-
|
3
|
-
# This provides String#delete_suffix? for Ruby 2.4.
|
4
|
-
unless String.method_defined?(:delete_suffix)
|
5
|
-
class CSV
|
6
|
-
module DeleteSuffix
|
7
|
-
refine String do
|
8
|
-
def delete_suffix(suffix)
|
9
|
-
if end_with?(suffix)
|
10
|
-
self[0...-suffix.size]
|
11
|
-
else
|
12
|
-
self
|
13
|
-
end
|
14
|
-
end
|
15
|
-
end
|
16
|
-
end
|
17
|
-
end
|
18
|
-
end
|
data/lib/csv/match_p.rb
DELETED
@@ -1,20 +0,0 @@
|
|
1
|
-
# frozen_string_literal: true
|
2
|
-
|
3
|
-
# This provides String#match? and Regexp#match? for Ruby 2.3.
|
4
|
-
unless String.method_defined?(:match?)
|
5
|
-
class CSV
|
6
|
-
module MatchP
|
7
|
-
refine String do
|
8
|
-
def match?(pattern)
|
9
|
-
self =~ pattern
|
10
|
-
end
|
11
|
-
end
|
12
|
-
|
13
|
-
refine Regexp do
|
14
|
-
def match?(string)
|
15
|
-
self =~ string
|
16
|
-
end
|
17
|
-
end
|
18
|
-
end
|
19
|
-
end
|
20
|
-
end
|