csv 3.2.1 → 3.2.4
- checksums.yaml +4 -4
- data/NEWS.md +113 -0
- data/README.md +1 -1
- data/doc/csv/options/generating/write_headers.rdoc +1 -1
- data/doc/csv/recipes/generating.rdoc +1 -1
- data/doc/csv/recipes/parsing.rdoc +1 -1
- data/lib/csv/fields_converter.rb +3 -2
- data/lib/csv/input_record_separator.rb +1 -14
- data/lib/csv/parser.rb +237 -92
- data/lib/csv/row.rb +1 -1
- data/lib/csv/table.rb +14 -4
- data/lib/csv/version.rb +1 -1
- data/lib/csv/writer.rb +5 -5
- data/lib/csv.rb +48 -17
- metadata +3 -5
- data/lib/csv/delete_suffix.rb +0 -18
- data/lib/csv/match_p.rb +0 -20
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3ef88fc9b205f8f64c5e817b22f121d5d03534c155cb8d27dc9d87aa62e6b7e0
+  data.tar.gz: e12b86cee946837a96ae609314a50246e2135fab855dca58e800de1ddc50e524
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d663d0917d63315e4fc5802f9538727731f39ea06997adea485f5f137f37639f0791db1f9697476e7d0c4a8048a0339b9e4fb7aecd79f3ec8ee8eb24a4cb676e
+  data.tar.gz: f9d22fe50d227b8f8ca0d075ed0fbf1f39a10bf59a1adb68835fede727f3d1dfd1d2f88754caa1098fe9c24df738c71b29ae5fc3080a25d80df484f76530344e
data/NEWS.md
CHANGED
@@ -1,5 +1,118 @@
 # News
 
+## 3.2.4 - 2022-08-22
+
+### Improvements
+
+  * Cleaned up internal implementations.
+    [[GitHub#249](https://github.com/ruby/csv/pull/249)]
+    [[GitHub#250](https://github.com/ruby/csv/pull/250)]
+    [[GitHub#251](https://github.com/ruby/csv/pull/251)]
+    [Patch by Mau Magnaguagno]
+
+  * Added support for RFC 3339 style time.
+    [[GitHub#248](https://github.com/ruby/csv/pull/248)]
+    [Patch by Thierry Lambert]
+
+  * Added support for transcoding String CSV. Syntax is
+    `from-encoding:to-encoding`.
+    [[GitHub#254](https://github.com/ruby/csv/issues/254)]
+    [Reported by Richard Stueven]
+
+  * Added quoted information to `CSV::FieldInfo`.
+    [[GitHub#254](https://github.com/ruby/csv/pull/253)]
+    [Reported by Hirokazu SUZUKI]
+
+### Fixes
+
+  * Fixed a link in documents.
+    [[GitHub#244](https://github.com/ruby/csv/pull/244)]
+    [Patch by Peter Zhu]
+
+### Thanks
+
+  * Peter Zhu
+
+  * Mau Magnaguagno
+
+  * Thierry Lambert
+
+  * Richard Stueven
+
+  * Hirokazu SUZUKI
+
+## 3.2.3 - 2022-04-09
+
+### Improvements
+
+  * Added contents summary to `CSV::Table#inspect`.
+    [GitHub#229][Patch by Eriko Sugiyama]
+    [GitHub#235][Patch by Sampat Badhe]
+
+  * Suppressed `$INPUT_RECORD_SEPARATOR` deprecation warning by
+    `Warning.warn`.
+    [GitHub#233][Reported by Jean byroot Boussier]
+
+  * Improved error message for liberal parsing with quoted values.
+    [GitHub#231][Patch by Nikolay Rys]
+
+  * Fixed typos in documentation.
+    [GitHub#236][Patch by Sampat Badhe]
+
+  * Added `:max_field_size` option and deprecated `:field_size_limit` option.
+    [GitHub#238][Reported by Dan Buettner]
+
+  * Added `:symbol_raw` to built-in header converters.
+    [GitHub#237][Reported by taki]
+    [GitHub#239][Patch by Eriko Sugiyama]
+
+### Fixes
+
+  * Fixed a bug that some texts may be dropped unexpectedly.
+    [Bug #18245][ruby-core:105587][Reported by Hassan Abdul Rehman]
+
+  * Fixed a bug that `:field_size_limit` doesn't work with not complex row.
+    [GitHub#238][Reported by Dan Buettner]
+
+### Thanks
+
+  * Hassan Abdul Rehman
+
+  * Eriko Sugiyama
+
+  * Jean byroot Boussier
+
+  * Nikolay Rys
+
+  * Sampat Badhe
+
+  * Dan Buettner
+
+  * taki
+
+## 3.2.2 - 2021-12-24
+
+### Improvements
+
+  * Added a validation for invalid option combination.
+    [GitHub#225][Patch by adamroyjones]
+
+  * Improved documentation for developers.
+    [GitHub#227][Patch by Eriko Sugiyama]
+
+### Fixes
+
+  * Fixed a bug that all of `ARGF` contents may not be consumed.
+    [GitHub#228][Reported by Rafael Navaza]
+
+### Thanks
+
+  * adamroyjones
+
+  * Eriko Sugiyama
+
+  * Rafael Navaza
+
 ## 3.2.1 - 2021-10-23
 
 ### Improvements
data/README.md
CHANGED
@@ -35,7 +35,7 @@ end
 
 ## Development
 
-After checking out the repo, run `
+After checking out the repo, run `ruby run-test.rb` to check if your changes can pass the test.
 
 To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
 
data/doc/csv/recipes/generating.rdoc
CHANGED
@@ -148,7 +148,7 @@ This example defines and uses a custom write converter to strip whitespace from
 
 ==== Recipe: Specify Multiple Write Converters
 
-Use option <tt>:write_converters</tt> and multiple custom
+Use option <tt>:write_converters</tt> and multiple custom converters
 to convert field values when generating \CSV.
 
 This example defines and uses two custom write converters to strip and upcase generated fields:
data/doc/csv/recipes/parsing.rdoc
CHANGED
@@ -83,7 +83,7 @@ Use instance method CSV#each with option +headers+ to read a source \String one
   CSV.new(string, headers: true).each do |row|
     p row
   end
-
+Output:
   #<CSV::Row "Name":"foo" "Value":"0">
   #<CSV::Row "Name":"bar" "Value":"1">
   #<CSV::Row "Name":"baz" "Value":"2">
data/lib/csv/fields_converter.rb
CHANGED
@@ -44,7 +44,7 @@ class CSV
       @converters.empty?
     end
 
-    def convert(fields, headers, lineno)
+    def convert(fields, headers, lineno, quoted_fields)
       return fields unless need_convert?
 
       fields.collect.with_index do |field, index|
@@ -63,7 +63,8 @@ class CSV
             else
               header = nil
             end
-
+            quoted = quoted_fields[index]
+            field = converter[field, FieldInfo.new(index, lineno, header, quoted)]
           end
           break unless field.is_a?(String)  # short-circuit pipeline for speed
         end
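The hunks above pass a per-field `quoted` flag through to `FieldInfo`. The following is a minimal standalone sketch of how a two-argument converter might use such a flag. `DemoFieldInfo`, `INTEGER_UNLESS_QUOTED`, and `convert_fields` are hypothetical names invented for illustration; they are not part of the csv gem, and the loop is simplified to a single converter with no headers.

```ruby
# Hypothetical stand-in for CSV::FieldInfo (illustration only).
DemoFieldInfo = Struct.new(:index, :line, :header, :quoted) do
  def quoted?
    quoted
  end
end

# Converts unquoted integer-looking fields to Integer; quoted fields stay strings.
INTEGER_UNLESS_QUOTED = lambda do |field, info|
  next field if info.quoted?
  Integer(field, exception: false) || field
end

# Simplified conversion loop: one converter, no headers.
def convert_fields(fields, quoted_fields, lineno, converter)
  fields.each_with_index.map do |field, index|
    info = DemoFieldInfo.new(index, lineno, nil, quoted_fields[index])
    converter.call(field, info)
  end
end

p convert_fields(["1", "2", "x"], [false, true, false], 1, INTEGER_UNLESS_QUOTED)
```

With a flag like this, a converter can tell `"2"` (written with quotes in the input) apart from a bare `2`, which is presumably the motivation for the change.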
data/lib/csv/input_record_separator.rb
CHANGED
@@ -4,20 +4,7 @@ require "stringio"
 class CSV
   module InputRecordSeparator
     class << self
-
-      verbose, $VERBOSE = $VERBOSE, true
-      stderr, $stderr = $stderr, StringIO.new
-      input_record_separator = $INPUT_RECORD_SEPARATOR
-      begin
-        $INPUT_RECORD_SEPARATOR = "\r\n"
-        is_input_record_separator_deprecated = (not $stderr.string.empty?)
-      ensure
-        $INPUT_RECORD_SEPARATOR = input_record_separator
-        $stderr = stderr
-        $VERBOSE = verbose
-      end
-
-      if is_input_record_separator_deprecated
+      if RUBY_VERSION >= "3.0.0"
         def value
           "\n"
         end
data/lib/csv/parser.rb
CHANGED
@@ -2,15 +2,10 @@
 
 require "strscan"
 
-require_relative "delete_suffix"
 require_relative "input_record_separator"
-require_relative "match_p"
 require_relative "row"
 require_relative "table"
 
-using CSV::DeleteSuffix if CSV.const_defined?(:DeleteSuffix)
-using CSV::MatchP if CSV.const_defined?(:MatchP)
-
 class CSV
   # Note: Don't use this class directly. This is an internal class.
   class Parser
@@ -27,6 +22,10 @@ class CSV
     class InvalidEncoding < StandardError
     end
 
+    # Raised when unexpected case is happen.
+    class UnexpectedError < StandardError
+    end
+
     #
     # CSV::Scanner receives a CSV output, scans it and return the content.
     # It also controls the life cycle of the object with its methods +keep_start+,
@@ -78,16 +77,17 @@ class CSV
|
|
78
77
|
# +keep_end+, +keep_back+, +keep_drop+.
|
79
78
|
#
|
80
79
|
# CSV::InputsScanner.scan() tries to match with pattern at the current position.
|
81
|
-
# If there's a match, the scanner advances the
|
80
|
+
# If there's a match, the scanner advances the "scan pointer" and returns the matched string.
|
82
81
|
# Otherwise, the scanner returns nil.
|
83
82
|
#
|
84
|
-
# CSV::InputsScanner.rest() returns the
|
83
|
+
# CSV::InputsScanner.rest() returns the "rest" of the string (i.e. everything after the scan pointer).
|
85
84
|
# If there is no more data (eos? = true), it returns "".
|
86
85
|
#
|
87
86
|
class InputsScanner
|
88
|
-
def initialize(inputs, encoding, chunk_size: 8192)
|
87
|
+
def initialize(inputs, encoding, row_separator, chunk_size: 8192)
|
89
88
|
@inputs = inputs.dup
|
90
89
|
@encoding = encoding
|
90
|
+
@row_separator = row_separator
|
91
91
|
@chunk_size = chunk_size
|
92
92
|
@last_scanner = @inputs.empty?
|
93
93
|
@keeps = []
|
@@ -95,11 +95,13 @@ class CSV
|
|
95
95
|
end
|
96
96
|
|
97
97
|
def each_line(row_separator)
|
98
|
+
return enum_for(__method__, row_separator) unless block_given?
|
98
99
|
buffer = nil
|
99
100
|
input = @scanner.rest
|
100
101
|
position = @scanner.pos
|
101
102
|
offset = 0
|
102
103
|
n_row_separator_chars = row_separator.size
|
104
|
+
# trace(__method__, :start, line, input)
|
103
105
|
while true
|
104
106
|
input.each_line(row_separator) do |line|
|
105
107
|
@scanner.pos += line.bytesize
|
@@ -139,25 +141,28 @@ class CSV
|
|
139
141
|
end
|
140
142
|
|
141
143
|
def scan(pattern)
|
144
|
+
# trace(__method__, pattern, :start)
|
142
145
|
value = @scanner.scan(pattern)
|
146
|
+
# trace(__method__, pattern, :done, :last, value) if @last_scanner
|
143
147
|
return value if @last_scanner
|
144
148
|
|
145
|
-
if value
|
146
|
-
|
147
|
-
|
148
|
-
else
|
149
|
-
nil
|
150
|
-
end
|
149
|
+
read_chunk if value and @scanner.eos?
|
150
|
+
# trace(__method__, pattern, :done, value)
|
151
|
+
value
|
151
152
|
end
|
152
153
|
|
153
154
|
def scan_all(pattern)
|
155
|
+
# trace(__method__, pattern, :start)
|
154
156
|
value = @scanner.scan(pattern)
|
157
|
+
# trace(__method__, pattern, :done, :last, value) if @last_scanner
|
155
158
|
return value if @last_scanner
|
156
159
|
|
157
160
|
return nil if value.nil?
|
158
161
|
while @scanner.eos? and read_chunk and (sub_value = @scanner.scan(pattern))
|
162
|
+
# trace(__method__, pattern, :sub, sub_value)
|
159
163
|
value << sub_value
|
160
164
|
end
|
165
|
+
# trace(__method__, pattern, :done, value)
|
161
166
|
value
|
162
167
|
end
|
163
168
|
|
@@ -166,76 +171,135 @@ class CSV
|
|
166
171
|
end
|
167
172
|
|
168
173
|
def keep_start
|
169
|
-
|
174
|
+
# trace(__method__, :start)
|
175
|
+
adjust_last_keep
|
176
|
+
@keeps.push([@scanner, @scanner.pos, nil])
|
177
|
+
# trace(__method__, :done)
|
170
178
|
end
|
171
179
|
|
172
180
|
def keep_end
|
173
|
-
|
174
|
-
|
181
|
+
# trace(__method__, :start)
|
182
|
+
scanner, start, buffer = @keeps.pop
|
183
|
+
if scanner == @scanner
|
184
|
+
keep = @scanner.string.byteslice(start, @scanner.pos - start)
|
185
|
+
else
|
186
|
+
keep = @scanner.string.byteslice(0, @scanner.pos)
|
187
|
+
end
|
175
188
|
if buffer
|
176
189
|
buffer << keep
|
177
190
|
keep = buffer
|
178
191
|
end
|
192
|
+
# trace(__method__, :done, keep)
|
179
193
|
keep
|
180
194
|
end
|
181
195
|
|
182
196
|
def keep_back
|
183
|
-
|
197
|
+
# trace(__method__, :start)
|
198
|
+
scanner, start, buffer = @keeps.pop
|
184
199
|
if buffer
|
200
|
+
# trace(__method__, :rescan, start, buffer)
|
185
201
|
string = @scanner.string
|
186
|
-
|
202
|
+
if scanner == @scanner
|
203
|
+
keep = string.byteslice(start, string.bytesize - start)
|
204
|
+
else
|
205
|
+
keep = string
|
206
|
+
end
|
187
207
|
if keep and not keep.empty?
|
188
208
|
@inputs.unshift(StringIO.new(keep))
|
189
209
|
@last_scanner = false
|
190
210
|
end
|
191
211
|
@scanner = StringScanner.new(buffer)
|
192
212
|
else
|
213
|
+
if @scanner != scanner
|
214
|
+
message = "scanners are different but no buffer: "
|
215
|
+
message += "#{@scanner.inspect}(#{@scanner.object_id}): "
|
216
|
+
message += "#{scanner.inspect}(#{scanner.object_id})"
|
217
|
+
raise UnexpectedError, message
|
218
|
+
end
|
219
|
+
# trace(__method__, :repos, start, buffer)
|
193
220
|
@scanner.pos = start
|
194
221
|
end
|
195
222
|
read_chunk if @scanner.eos?
|
196
223
|
end
|
197
224
|
|
198
225
|
def keep_drop
|
199
|
-
@keeps.pop
|
226
|
+
_, _, buffer = @keeps.pop
|
227
|
+
# trace(__method__, :done, :empty) unless buffer
|
228
|
+
return unless buffer
|
229
|
+
|
230
|
+
last_keep = @keeps.last
|
231
|
+
# trace(__method__, :done, :no_last_keep) unless last_keep
|
232
|
+
return unless last_keep
|
233
|
+
|
234
|
+
if last_keep[2]
|
235
|
+
last_keep[2] << buffer
|
236
|
+
else
|
237
|
+
last_keep[2] = buffer
|
238
|
+
end
|
239
|
+
# trace(__method__, :done)
|
200
240
|
end
|
201
241
|
|
202
242
|
def rest
|
203
243
|
@scanner.rest
|
204
244
|
end
|
205
245
|
|
246
|
+
def check(pattern)
|
247
|
+
@scanner.check(pattern)
|
248
|
+
end
|
249
|
+
|
206
250
|
private
|
207
|
-
def
|
208
|
-
|
251
|
+
def trace(*args)
|
252
|
+
pp([*args, @scanner, @scanner&.string, @scanner&.pos, @keeps])
|
253
|
+
end
|
209
254
|
|
210
|
-
|
211
|
-
|
212
|
-
|
213
|
-
|
214
|
-
|
215
|
-
|
216
|
-
|
217
|
-
|
218
|
-
|
219
|
-
|
220
|
-
|
221
|
-
|
255
|
+
def adjust_last_keep
|
256
|
+
# trace(__method__, :start)
|
257
|
+
|
258
|
+
keep = @keeps.last
|
259
|
+
# trace(__method__, :done, :empty) if keep.nil?
|
260
|
+
return if keep.nil?
|
261
|
+
|
262
|
+
scanner, start, buffer = keep
|
263
|
+
string = @scanner.string
|
264
|
+
if @scanner != scanner
|
265
|
+
start = 0
|
266
|
+
end
|
267
|
+
if start == 0 and @scanner.eos?
|
268
|
+
keep_data = string
|
269
|
+
else
|
270
|
+
keep_data = string.byteslice(start, @scanner.pos - start)
|
271
|
+
end
|
272
|
+
if keep_data
|
273
|
+
if buffer
|
274
|
+
buffer << keep_data
|
275
|
+
else
|
276
|
+
keep[2] = keep_data.dup
|
222
277
|
end
|
223
|
-
keep[0] = 0
|
224
278
|
end
|
225
279
|
|
280
|
+
# trace(__method__, :done)
|
281
|
+
end
|
282
|
+
|
283
|
+
def read_chunk
|
284
|
+
return false if @last_scanner
|
285
|
+
|
286
|
+
adjust_last_keep
|
287
|
+
|
226
288
|
input = @inputs.first
|
227
289
|
case input
|
228
290
|
when StringIO
|
229
291
|
string = input.read
|
230
292
|
raise InvalidEncoding unless string.valid_encoding?
|
293
|
+
# trace(__method__, :stringio, string)
|
231
294
|
@scanner = StringScanner.new(string)
|
232
295
|
@inputs.shift
|
233
296
|
@last_scanner = @inputs.empty?
|
234
297
|
true
|
235
298
|
else
|
236
|
-
chunk = input.gets(
|
299
|
+
chunk = input.gets(@row_separator, @chunk_size)
|
237
300
|
if chunk
|
238
301
|
raise InvalidEncoding unless chunk.valid_encoding?
|
302
|
+
# trace(__method__, :chunk, chunk)
|
239
303
|
@scanner = StringScanner.new(chunk)
|
240
304
|
if input.respond_to?(:eof?) and input.eof?
|
241
305
|
@inputs.shift
|
@@ -243,6 +307,7 @@ class CSV
|
|
243
307
|
end
|
244
308
|
true
|
245
309
|
else
|
310
|
+
# trace(__method__, :no_chunk)
|
246
311
|
@scanner = StringScanner.new("".encode(@encoding))
|
247
312
|
@inputs.shift
|
248
313
|
@last_scanner = @inputs.empty?
|
@@ -277,7 +342,11 @@ class CSV
     end
 
     def field_size_limit
-      @
+      @max_field_size&.succ
+    end
+
+    def max_field_size
+      @max_field_size
     end
 
     def skip_lines
@@ -345,6 +414,16 @@ class CSV
|
|
345
414
|
end
|
346
415
|
message = "Invalid byte sequence in #{@encoding}"
|
347
416
|
raise MalformedCSVError.new(message, lineno)
|
417
|
+
rescue UnexpectedError => error
|
418
|
+
if @scanner
|
419
|
+
ignore_broken_line
|
420
|
+
lineno = @lineno
|
421
|
+
else
|
422
|
+
lineno = @lineno + 1
|
423
|
+
end
|
424
|
+
message = "This should not be happen: #{error.message}: "
|
425
|
+
message += "Please report this to https://github.com/ruby/csv/issues"
|
426
|
+
raise MalformedCSVError.new(message, lineno)
|
348
427
|
end
|
349
428
|
end
|
350
429
|
|
@@ -361,6 +440,7 @@ class CSV
|
|
361
440
|
prepare_skip_lines
|
362
441
|
prepare_strip
|
363
442
|
prepare_separators
|
443
|
+
validate_strip_and_col_sep_options
|
364
444
|
prepare_quoted
|
365
445
|
prepare_unquoted
|
366
446
|
prepare_line
|
@@ -388,7 +468,7 @@ class CSV
|
|
388
468
|
@backslash_quote = false
|
389
469
|
end
|
390
470
|
@unconverted_fields = @options[:unconverted_fields]
|
391
|
-
@
|
471
|
+
@max_field_size = @options[:max_field_size]
|
392
472
|
@skip_blanks = @options[:skip_blanks]
|
393
473
|
@fields_converter = @options[:fields_converter]
|
394
474
|
@header_fields_converter = @options[:header_fields_converter]
|
@@ -531,6 +611,28 @@ class CSV
       @not_line_end = Regexp.new("[^\r\n]+".encode(@encoding))
     end
 
+    # This method verifies that there are no (obvious) ambiguities with the
+    # provided +col_sep+ and +strip+ parsing options. For example, if +col_sep+
+    # and +strip+ were both equal to +\t+, then there would be no clear way to
+    # parse the input.
+    def validate_strip_and_col_sep_options
+      return unless @strip
+
+      if @strip.is_a?(String)
+        if @column_separator.start_with?(@strip) || @column_separator.end_with?(@strip)
+          raise ArgumentError,
+            "The provided strip (#{@escaped_strip}) and " \
+            "col_sep (#{@escaped_column_separator}) options are incompatible."
+        end
+      else
+        if Regexp.new("\\A[#{@escaped_strip}]|[#{@escaped_strip}]\\z").match?(@column_separator)
+          raise ArgumentError,
+            "The provided strip (true) and " \
+            "col_sep (#{@escaped_column_separator}) options are incompatible."
+        end
+      end
+    end
+
     def prepare_quoted
       if @quote_character
         @quotes = Regexp.new(@escaped_quote_character +
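The validation added in this hunk can be sketched outside the parser as a plain predicate. This is only an illustration: `strip_conflicts_with_col_sep?` is a hypothetical helper, and treating `strip: true` as "strip whitespace" (`\s`) is an assumption made here, since the diff does not show how `@escaped_strip` is built in that case.

```ruby
# Returns true when the strip setting would make the column separator ambiguous.
# strip may be nil/false (no stripping), a String, or true.
def strip_conflicts_with_col_sep?(strip, col_sep)
  return false unless strip

  if strip.is_a?(String)
    # A separator that begins or ends with the strip string cannot be told
    # apart from field padding.
    col_sep.start_with?(strip) || col_sep.end_with?(strip)
  else
    # Assumption for this sketch: strip: true removes whitespace.
    /\A\s|\s\z/.match?(col_sep)
  end
end

p strip_conflicts_with_col_sep?("\t", "\t")
p strip_conflicts_with_col_sep?(true, ",")
```

The first call reports a conflict (the example from the comment in the hunk), while the second does not, because a comma neither starts nor ends with whitespace.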
@@ -656,9 +758,10 @@ class CSV
|
|
656
758
|
case headers
|
657
759
|
when Array
|
658
760
|
@raw_headers = headers
|
761
|
+
quoted_fields = [false] * @raw_headers.size
|
659
762
|
@use_headers = true
|
660
763
|
when String
|
661
|
-
@raw_headers = parse_headers(headers)
|
764
|
+
@raw_headers, quoted_fields = parse_headers(headers)
|
662
765
|
@use_headers = true
|
663
766
|
when nil, false
|
664
767
|
@raw_headers = nil
|
@@ -668,21 +771,28 @@ class CSV
|
|
668
771
|
@use_headers = true
|
669
772
|
end
|
670
773
|
if @raw_headers
|
671
|
-
@headers = adjust_headers(@raw_headers)
|
774
|
+
@headers = adjust_headers(@raw_headers, quoted_fields)
|
672
775
|
else
|
673
776
|
@headers = nil
|
674
777
|
end
|
675
778
|
end
|
676
779
|
|
677
780
|
def parse_headers(row)
|
678
|
-
|
679
|
-
|
680
|
-
|
681
|
-
|
781
|
+
quoted_fields = []
|
782
|
+
converter = lambda do |field, info|
|
783
|
+
quoted_fields << info.quoted?
|
784
|
+
field
|
785
|
+
end
|
786
|
+
headers = CSV.parse_line(row,
|
787
|
+
col_sep: @column_separator,
|
788
|
+
row_sep: @row_separator,
|
789
|
+
quote_char: @quote_character,
|
790
|
+
converters: [converter])
|
791
|
+
[headers, quoted_fields]
|
682
792
|
end
|
683
793
|
|
684
|
-
def adjust_headers(headers)
|
685
|
-
adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno)
|
794
|
+
def adjust_headers(headers, quoted_fields)
|
795
|
+
adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno, quoted_fields)
|
686
796
|
adjusted_headers.each {|h| h.freeze if h.is_a? String}
|
687
797
|
adjusted_headers
|
688
798
|
end
|
@@ -705,28 +815,28 @@ class CSV
|
|
705
815
|
sample[0, 128].index(@quote_character)
|
706
816
|
end
|
707
817
|
|
708
|
-
|
709
|
-
|
710
|
-
|
711
|
-
|
712
|
-
@io = StringIO.new(string, "rb:#{string.encoding}")
|
713
|
-
end
|
818
|
+
class UnoptimizedStringIO # :nodoc:
|
819
|
+
def initialize(string)
|
820
|
+
@io = StringIO.new(string, "rb:#{string.encoding}")
|
821
|
+
end
|
714
822
|
|
715
|
-
|
716
|
-
|
717
|
-
|
823
|
+
def gets(*args)
|
824
|
+
@io.gets(*args)
|
825
|
+
end
|
718
826
|
|
719
|
-
|
720
|
-
|
721
|
-
|
827
|
+
def each_line(*args, &block)
|
828
|
+
@io.each_line(*args, &block)
|
829
|
+
end
|
722
830
|
|
723
|
-
|
724
|
-
|
725
|
-
end
|
831
|
+
def eof?
|
832
|
+
@io.eof?
|
726
833
|
end
|
834
|
+
end
|
727
835
|
|
728
|
-
|
729
|
-
|
836
|
+
SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
|
837
|
+
if SCANNER_TEST
|
838
|
+
SCANNER_TEST_CHUNK_SIZE_NAME = "CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"
|
839
|
+
SCANNER_TEST_CHUNK_SIZE_VALUE = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
|
730
840
|
def build_scanner
|
731
841
|
inputs = @samples.collect do |sample|
|
732
842
|
UnoptimizedStringIO.new(sample)
|
@@ -736,16 +846,27 @@ class CSV
|
|
736
846
|
else
|
737
847
|
inputs << @input
|
738
848
|
end
|
849
|
+
begin
|
850
|
+
chunk_size_value = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
|
851
|
+
rescue # Ractor::IsolationError
|
852
|
+
# Ractor on Ruby 3.0 can't read ENV value.
|
853
|
+
chunk_size_value = SCANNER_TEST_CHUNK_SIZE_VALUE
|
854
|
+
end
|
855
|
+
chunk_size = Integer((chunk_size_value || "1"), 10)
|
739
856
|
InputsScanner.new(inputs,
|
740
857
|
@encoding,
|
741
|
-
|
858
|
+
@row_separator,
|
859
|
+
chunk_size: chunk_size)
|
742
860
|
end
|
743
861
|
else
|
744
862
|
def build_scanner
|
745
863
|
string = nil
|
746
864
|
if @samples.empty? and @input.is_a?(StringIO)
|
747
865
|
string = @input.read
|
748
|
-
elsif @samples.size == 1 and
|
866
|
+
elsif @samples.size == 1 and
|
867
|
+
@input != ARGF and
|
868
|
+
@input.respond_to?(:eof?) and
|
869
|
+
@input.eof?
|
749
870
|
string = @samples[0]
|
750
871
|
end
|
751
872
|
if string
|
@@ -764,7 +885,7 @@ class CSV
|
|
764
885
|
StringIO.new(sample)
|
765
886
|
end
|
766
887
|
inputs << @input
|
767
|
-
InputsScanner.new(inputs, @encoding)
|
888
|
+
InputsScanner.new(inputs, @encoding, @row_separator)
|
768
889
|
end
|
769
890
|
end
|
770
891
|
end
|
@@ -798,6 +919,14 @@ class CSV
       end
     end
 
+    def validate_field_size(field)
+      return unless @max_field_size
+      return if field.size <= @max_field_size
+      ignore_broken_line
+      message = "Field size exceeded: #{field.size} > #{@max_field_size}"
+      raise MalformedCSVError.new(message, @lineno)
+    end
+
     def parse_no_quote(&block)
       @scanner.each_line(@row_separator) do |line|
         next if @skip_lines and skip_line?(line)
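As a rough illustration of the size check introduced here, and of how the deprecated `field_size_limit` is derived from `max_field_size` elsewhere in this diff (`@max_field_size&.succ`), the logic can be written as two free functions. `FieldTooLargeError` is a hypothetical error class for this sketch; the gem itself raises `CSV::MalformedCSVError`.

```ruby
# Hypothetical error class for this sketch only.
class FieldTooLargeError < StandardError; end

# nil means "no limit"; otherwise a field may hold at most max_field_size characters.
def validate_field_size(field, max_field_size)
  return if max_field_size.nil?
  return if field.size <= max_field_size
  raise FieldTooLargeError, "Field size exceeded: #{field.size} > #{max_field_size}"
end

# The older option is reported as max + 1; nil stays nil.
def field_size_limit(max_field_size)
  max_field_size&.succ
end

validate_field_size("abc", 3) # within the limit, so nothing is raised
p field_size_limit(3)
p field_size_limit(nil)
```

Note that `String#size` counts characters rather than bytes, so under this sketch a multibyte field is measured by its character count.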
@@ -807,9 +936,16 @@ class CSV
|
|
807
936
|
if line.empty?
|
808
937
|
next if @skip_blanks
|
809
938
|
row = []
|
939
|
+
quoted_fields = []
|
810
940
|
else
|
811
941
|
line = strip_value(line)
|
812
942
|
row = line.split(@split_column_separator, -1)
|
943
|
+
quoted_fields = [false] * row.size
|
944
|
+
if @max_field_size
|
945
|
+
row.each do |column|
|
946
|
+
validate_field_size(column)
|
947
|
+
end
|
948
|
+
end
|
813
949
|
n_columns = row.size
|
814
950
|
i = 0
|
815
951
|
while i < n_columns
|
@@ -818,7 +954,7 @@ class CSV
|
|
818
954
|
end
|
819
955
|
end
|
820
956
|
@last_line = original_line
|
821
|
-
emit_row(row, &block)
|
957
|
+
emit_row(row, quoted_fields, &block)
|
822
958
|
end
|
823
959
|
end
|
824
960
|
|
@@ -840,31 +976,37 @@ class CSV
             next
           end
           row = []
+          quoted_fields = []
         elsif line.include?(@cr) or line.include?(@lf)
           @scanner.keep_back
           @need_robust_parsing = true
           return parse_quotable_robust(&block)
         else
           row = line.split(@split_column_separator, -1)
+          quoted_fields = []
           n_columns = row.size
           i = 0
           while i < n_columns
             column = row[i]
             if column.empty?
+              quoted_fields << false
               row[i] = nil
             else
               n_quotes = column.count(@quote_character)
               if n_quotes.zero?
+                quoted_fields << false
                 # no quote
               elsif n_quotes == 2 and
                    column.start_with?(@quote_character) and
                    column.end_with?(@quote_character)
+                quoted_fields << true
                 row[i] = column[1..-2]
               else
                 @scanner.keep_back
                 @need_robust_parsing = true
                 return parse_quotable_robust(&block)
               end
+              validate_field_size(row[i])
             end
             i += 1
           end
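The per-column decision in this fast path (empty, unquoted, simply quoted, or "give up and use the robust parser") can be sketched as a small pure function. `classify_column` and the `:needs_robust_parsing` marker are hypothetical names for illustration; the real parser mutates `row` and `quoted_fields` in place and rewinds the scanner instead of returning a marker.

```ruby
# Classifies one already-split column.
# Returns [value, quoted] or :needs_robust_parsing when the quotes in the
# column are too complex for this simple check (e.g. escaped quotes).
def classify_column(column, quote_character = '"')
  return [nil, false] if column.empty?

  n_quotes = column.count(quote_character)
  if n_quotes.zero?
    [column, false]
  elsif n_quotes == 2 &&
        column.start_with?(quote_character) &&
        column.end_with?(quote_character)
    # Exactly one opening and one closing quote: drop them.
    [column[1..-2], true]
  else
    :needs_robust_parsing
  end
end

p classify_column("abc")
p classify_column('"abc"')
p classify_column('"a""b"')
```

The third call has four quote characters, so the sketch defers to the slower path, mirroring the `keep_back` / `parse_quotable_robust` fallback in the hunk.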
@@ -872,13 +1014,14 @@ class CSV
|
|
872
1014
|
@scanner.keep_drop
|
873
1015
|
@scanner.keep_start
|
874
1016
|
@last_line = original_line
|
875
|
-
emit_row(row, &block)
|
1017
|
+
emit_row(row, quoted_fields, &block)
|
876
1018
|
end
|
877
1019
|
@scanner.keep_drop
|
878
1020
|
end
|
879
1021
|
|
880
1022
|
def parse_quotable_robust(&block)
|
881
1023
|
row = []
|
1024
|
+
quoted_fields = []
|
882
1025
|
skip_needless_lines
|
883
1026
|
start_row
|
884
1027
|
while true
|
@@ -888,32 +1031,39 @@ class CSV
|
|
888
1031
|
value = parse_column_value
|
889
1032
|
if value
|
890
1033
|
@scanner.scan_all(@strip_value) if @strip_value
|
891
|
-
|
892
|
-
ignore_broken_line
|
893
|
-
raise MalformedCSVError.new("Field size exceeded", @lineno)
|
894
|
-
end
|
1034
|
+
validate_field_size(value)
|
895
1035
|
end
|
896
1036
|
if parse_column_end
|
897
1037
|
row << value
|
1038
|
+
quoted_fields << @quoted_column_value
|
898
1039
|
elsif parse_row_end
|
899
1040
|
if row.empty? and value.nil?
|
900
|
-
emit_row([], &block) unless @skip_blanks
|
1041
|
+
emit_row([], [], &block) unless @skip_blanks
|
901
1042
|
else
|
902
1043
|
row << value
|
903
|
-
|
1044
|
+
quoted_fields << @quoted_column_value
|
1045
|
+
emit_row(row, quoted_fields, &block)
|
904
1046
|
row = []
|
1047
|
+
quoted_fields = []
|
905
1048
|
end
|
906
1049
|
skip_needless_lines
|
907
1050
|
start_row
|
908
1051
|
elsif @scanner.eos?
|
909
1052
|
break if row.empty? and value.nil?
|
910
1053
|
row << value
|
911
|
-
|
1054
|
+
quoted_fields << @quoted_column_value
|
1055
|
+
emit_row(row, quoted_fields, &block)
|
912
1056
|
break
|
913
1057
|
else
|
914
1058
|
if @quoted_column_value
|
1059
|
+
if liberal_parsing? and (new_line = @scanner.check(@line_end))
|
1060
|
+
message =
|
1061
|
+
"Illegal end-of-line sequence outside of a quoted field " +
|
1062
|
+
"<#{new_line.inspect}>"
|
1063
|
+
else
|
1064
|
+
message = "Any value after quoted field isn't allowed"
|
1065
|
+
end
|
915
1066
|
ignore_broken_line
|
916
|
-
message = "Any value after quoted field isn't allowed"
|
917
1067
|
raise MalformedCSVError.new(message, @lineno)
|
918
1068
|
elsif @unquoted_column_value and
|
919
1069
|
(new_line = @scanner.scan(@line_end))
|
@@ -1006,7 +1156,7 @@ class CSV
|
|
1006
1156
|
if (n_quotes % 2).zero?
|
1007
1157
|
quotes[0, (n_quotes - 2) / 2]
|
1008
1158
|
else
|
1009
|
-
value = quotes[0,
|
1159
|
+
value = quotes[0, n_quotes / 2]
|
1010
1160
|
while true
|
1011
1161
|
quoted_value = @scanner.scan_all(@quoted_value)
|
1012
1162
|
value << quoted_value if quoted_value
|
@@ -1030,11 +1180,9 @@ class CSV
|
|
1030
1180
|
n_quotes = quotes.size
|
1031
1181
|
if n_quotes == 1
|
1032
1182
|
break
|
1033
|
-
elsif (n_quotes % 2) == 1
|
1034
|
-
value << quotes[0, (n_quotes - 1) / 2]
|
1035
|
-
break
|
1036
1183
|
else
|
1037
1184
|
value << quotes[0, n_quotes / 2]
|
1185
|
+
break if (n_quotes % 2) == 1
|
1038
1186
|
end
|
1039
1187
|
end
|
1040
1188
|
value
|
@@ -1070,18 +1218,15 @@ class CSV
 
     def strip_value(value)
       return value unless @strip
-      return
+      return value if value.nil?
 
       case @strip
       when String
-
-
-          size -= 1
-          value = value[1, size]
+        while value.delete_prefix!(@strip)
+          # do nothing
         end
-        while value.
-
-          value = value[0, size]
+        while value.delete_suffix!(@strip)
+          # do nothing
         end
       else
         value.strip!
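The rewritten loops rely on `String#delete_prefix!` and `String#delete_suffix!` returning `nil` once nothing was removed, which ends each `while`. A minimal sketch of the same idea follows; `strip_repeated` is a hypothetical helper, and unlike the parser it works on a copy so the caller's string is left untouched.

```ruby
# Removes every leading and trailing occurrence of strip from value.
def strip_repeated(value, strip)
  value = value.dup # work on a copy
  while value.delete_prefix!(strip)
    # delete_prefix! returns nil when there is nothing left to remove
  end
  while value.delete_suffix!(strip)
    # same contract as delete_prefix!
  end
  value
end

p strip_repeated("xxaxbxx", "x")
```

Occurrences in the middle of the string (the `x` between `a` and `b`) are preserved; only the ends are trimmed.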
@@ -1104,22 +1249,22 @@ class CSV
|
|
1104
1249
|
@scanner.keep_start
|
1105
1250
|
end
|
1106
1251
|
|
1107
|
-
def emit_row(row, &block)
|
1252
|
+
def emit_row(row, quoted_fields, &block)
|
1108
1253
|
@lineno += 1
|
1109
1254
|
|
1110
1255
|
raw_row = row
|
1111
1256
|
if @use_headers
|
1112
1257
|
if @headers.nil?
|
1113
|
-
@headers = adjust_headers(row)
|
1258
|
+
@headers = adjust_headers(row, quoted_fields)
|
1114
1259
|
return unless @return_headers
|
1115
1260
|
row = Row.new(@headers, row, true)
|
1116
1261
|
else
|
1117
1262
|
row = Row.new(@headers,
|
1118
|
-
@fields_converter.convert(raw_row, @headers, @lineno))
|
1263
|
+
@fields_converter.convert(raw_row, @headers, @lineno, quoted_fields))
|
1119
1264
|
end
|
1120
1265
|
else
|
1121
1266
|
# convert fields, if needed...
|
1122
|
-
row = @fields_converter.convert(raw_row, nil, @lineno)
|
1267
|
+
row = @fields_converter.convert(raw_row, nil, @lineno, quoted_fields)
|
1123
1268
|
end
|
1124
1269
|
|
1125
1270
|
# inject unconverted fields and accessor, if requested...
|
data/lib/csv/row.rb
CHANGED
@@ -703,7 +703,7 @@ class CSV
     # by +index_or_header+ and +specifiers+.
     #
     # The nested objects may be instances of various classes.
-    # See {Dig Methods}[https://docs.ruby-lang.org/en/master/
+    # See {Dig Methods}[https://docs.ruby-lang.org/en/master/dig_methods_rdoc.html].
     #
     # Examples:
     #   source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
data/lib/csv/table.rb
CHANGED
@@ -999,9 +999,15 @@ class CSV
     # Omits the headers if option +write_headers+ is given as +false+
     # (see {Option +write_headers+}[../CSV.html#class-CSV-label-Option+write_headers]):
     #   table.to_csv(write_headers: false) # => "foo,0\nbar,1\nbaz,2\n"
-
+    #
+    # Limit rows if option +limit+ is given like +2+:
+    #   table.to_csv(limit: 2) # => "Name,Value\nfoo,0\nbar,1\n"
+    def to_csv(write_headers: true, limit: nil, **options)
       array = write_headers ? [headers.to_csv(**options)] : []
-      @table.
+      limit ||= @table.size
+      limit = @table.size + 1 + limit if limit < 0
+      limit = 0 if limit < 0
+      @table.first(limit).each do |row|
         array.push(row.fields.to_csv(**options)) unless row.header_row?
       end
 
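The `limit` handling added to `to_csv` normalizes `nil` and negative values before calling `first`. The arithmetic can be checked in isolation with a small sketch; `effective_limit` is a hypothetical helper that repeats the three lines from the hunk using a plain row count.

```ruby
# Normalizes a limit the same way the hunk does:
#   nil      -> all rows
#   negative -> counted from the end (-1 keeps every row, -2 drops the last one)
#   anything still negative after that -> 0
def effective_limit(row_count, limit)
  limit ||= row_count
  limit = row_count + 1 + limit if limit < 0
  limit = 0 if limit < 0
  limit
end

rows = %w[foo bar baz]
p rows.first(effective_limit(rows.size, 2))   # first two rows
p rows.first(effective_limit(rows.size, -2))  # all but the last row
```

A limit larger than the row count is passed through unchanged, which is harmless because `Array#first` simply returns every element in that case.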
@@ -1038,9 +1044,13 @@ class CSV
     # Example:
     #   source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
     #   table = CSV.parse(source, headers: true)
-    #   table.inspect # => "#<CSV::Table mode:col_or_row row_count:4
+    #   table.inspect # => "#<CSV::Table mode:col_or_row row_count:4>\nName,Value\nfoo,0\nbar,1\nbaz,2\n"
+    #
     def inspect
-      "#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
+      inspected = +"#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
+      summary = to_csv(limit: 5)
+      inspected << "\n" << summary if summary.encoding.ascii_compatible?
+      inspected
     end
   end
 end
data/lib/csv/version.rb
CHANGED
data/lib/csv/writer.rb
CHANGED
@@ -1,11 +1,8 @@
 # frozen_string_literal: true
 
 require_relative "input_record_separator"
-require_relative "match_p"
 require_relative "row"
 
-using CSV::MatchP if CSV.const_defined?(:MatchP)
-
 class CSV
   # Note: Don't use this class directly. This is an internal class.
   class Writer
@@ -42,7 +39,10 @@ class CSV
       @headers ||= row if @use_headers
       @lineno += 1
 
-
+      if @fields_converter
+        quoted_fields = [false] * row.size
+        row = @fields_converter.convert(row, nil, lineno, quoted_fields)
+      end
 
       i = -1
       converted_row = row.collect do |field|
@@ -97,7 +97,7 @@ class CSV
       return unless @headers
 
       converter = @options[:header_fields_converter]
-      @headers = converter.convert(@headers, nil, 0)
+      @headers = converter.convert(@headers, nil, 0, [])
       @headers.each do |header|
         header.freeze if header.is_a?(String)
       end
data/lib/csv.rb
CHANGED
@@ -95,14 +95,11 @@ require "stringio"
|
|
95
95
|
|
96
96
|
require_relative "csv/fields_converter"
|
97
97
|
require_relative "csv/input_record_separator"
|
98
|
-
require_relative "csv/match_p"
|
99
98
|
require_relative "csv/parser"
|
100
99
|
require_relative "csv/row"
|
101
100
|
require_relative "csv/table"
|
102
101
|
require_relative "csv/writer"
|
103
102
|
|
104
|
-
using CSV::MatchP if CSV.const_defined?(:MatchP)
|
105
|
-
|
106
103
|
# == \CSV
|
107
104
|
#
|
108
105
|
# === In a Hurry?
|
@@ -341,6 +338,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
341
338
|
# liberal_parsing: false,
|
342
339
|
# nil_value: nil,
|
343
340
|
# empty_value: "",
|
341
|
+
# strip: false,
|
344
342
|
# # For generating.
|
345
343
|
# write_headers: nil,
|
346
344
|
# quote_empty: true,
|
@@ -348,7 +346,6 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
348
346
|
# write_converters: nil,
|
349
347
|
# write_nil_value: nil,
|
350
348
|
# write_empty_value: "",
|
351
|
-
# strip: false,
|
352
349
|
# }
|
353
350
|
#
|
354
351
|
# ==== Options for Parsing
|
@@ -357,7 +354,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
357
354
|
# - +row_sep+: Specifies the row separator; used to delimit rows.
|
358
355
|
# - +col_sep+: Specifies the column separator; used to delimit fields.
|
359
356
|
# - +quote_char+: Specifies the quote character; used to quote fields.
|
360
|
-
# - +field_size_limit+: Specifies the maximum field size allowed.
|
357
|
+
# - +field_size_limit+: Specifies the maximum field size + 1 allowed.
|
358
|
+
# Deprecated since 3.2.3. Use +max_field_size+ instead.
|
359
|
+
# - +max_field_size+: Specifies the maximum field size allowed.
|
361
360
|
# - +converters+: Specifies the field converters to be used.
|
362
361
|
# - +unconverted_fields+: Specifies whether unconverted fields are to be available.
|
363
362
|
# - +headers+: Specifies whether data contains headers,
|
@@ -366,8 +365,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
366
365
|
# - +header_converters+: Specifies the header converters to be used.
|
367
366
|
# - +skip_blanks+: Specifies whether blanks lines are to be ignored.
|
368
367
|
# - +skip_lines+: Specifies how comments lines are to be recognized.
|
369
|
-
# - +strip+: Specifies whether leading and trailing whitespace are
|
370
|
-
#
|
368
|
+
# - +strip+: Specifies whether leading and trailing whitespace are to be
|
369
|
+
# stripped from fields. This must be compatible with +col_sep+; if it is not,
|
370
|
+
# then an +ArgumentError+ exception will be raised.
|
371
371
|
# - +liberal_parsing+: Specifies whether \CSV should attempt to parse
|
372
372
|
# non-compliant data.
|
373
373
|
# - +nil_value+: Specifies the object that is to be substituted for each null (no-text) field.
|
@@ -863,8 +863,9 @@ class CSV
|
|
863
863
|
# <b><tt>index</tt></b>:: The zero-based index of the field in its row.
|
864
864
|
# <b><tt>line</tt></b>:: The line of the data source this row is from.
|
865
865
|
# <b><tt>header</tt></b>:: The header for the column, when available.
|
866
|
+
# <b><tt>quoted?</tt></b>:: True or false, whether the original value is quoted or not.
|
866
867
|
#
|
867
|
-
FieldInfo = Struct.new(:index, :line, :header)
|
868
|
+
FieldInfo = Struct.new(:index, :line, :header, :quoted?)
|
868
869
|
|
869
870
|
# A Regexp used to find and convert some common Date formats.
|
870
871
|
DateMatcher = / \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
|
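To illustrate what the new `quoted?` member makes possible, here is a small sketch. It defines a local struct with the same members as the `FieldInfo` line above instead of loading the gem, and `INTEGER_UNLESS_QUOTED` is a hypothetical converter name, not something the library ships.

```ruby
# Local stand-in with the same members as CSV::FieldInfo in the hunk above.
FieldInfo = Struct.new(:index, :line, :header, :quoted?)

# Hypothetical two-argument converter: keep quoted fields as strings,
# convert unquoted integer-looking fields to Integer.
INTEGER_UNLESS_QUOTED = lambda do |field, info|
  next field if info.quoted?
  field.match?(/\A-?\d+\z/) ? field.to_i : field
end

bare   = FieldInfo.new(0, 2, "Value", false)
quoted = FieldInfo.new(0, 3, "Value", true)
p INTEGER_UNLESS_QUOTED.call("42", bare)
p INTEGER_UNLESS_QUOTED.call("42", quoted)
```

A converter of this shape would presumably be passed through the `converters:` option on csv 3.2.3 or later; that wiring is an assumption and is not exercised here.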
@@ -872,10 +873,9 @@ class CSV
|
|
872
873
|
# A Regexp used to find and convert some common DateTime formats.
|
873
874
|
DateTimeMatcher =
|
874
875
|
/ \A(?: (\w+,?\s+)?\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2},?\s+\d{2,4} |
|
875
|
-
|
876
|
-
# ISO-8601
|
876
|
+
# ISO-8601 and RFC-3339 (space instead of T) recognized by DateTime.parse
|
877
877
|
\d{4}-\d{2}-\d{2}
|
878
|
-
(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
|
878
|
+
(?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
|
879
879
|
)\z /x
|
880
880
|
|
881
881
|
# The encoding used by all converters.
|
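The effect of replacing `T` with `[T\s]` can be checked with just the ISO-8601 branch of the pattern. This sketch copies that branch from the `+` line into a standalone constant; `ISO_OR_RFC3339` is a name chosen here for illustration and is not defined by the library.

```ruby
require "date"

# Only the ISO-8601 / RFC 3339 alternative of DateTimeMatcher, anchored on its own.
ISO_OR_RFC3339 =
  /\A\d{4}-\d{2}-\d{2}(?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?\z/

samples = [
  "2022-08-22T10:30:00Z",       # ISO-8601 with "T"
  "2022-08-22 10:30:00",        # RFC 3339 style with a space
  "2022-08-22 10:30:00+09:00",  # space separator plus offset
]

samples.each do |text|
  next unless ISO_OR_RFC3339.match?(text)
  # DateTime.parse accepts both separators, which is what the converter relies on.
  puts "#{text} => #{DateTime.parse(text).iso8601}"
end
```

With the previous pattern, which allowed only `T`, the second and third samples would not have matched this branch.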
@@ -925,7 +925,8 @@ class CSV
|
|
925
925
|
symbol: lambda { |h|
|
926
926
|
h.encode(ConverterEncoding).downcase.gsub(/[^\s\w]+/, "").strip.
|
927
927
|
gsub(/\s+/, "_").to_sym
|
928
|
-
}
|
928
|
+
},
|
929
|
+
symbol_raw: lambda { |h| h.encode(ConverterEncoding).to_sym }
|
929
930
|
}
|
930
931
|
|
931
932
|
# Default values for method options.
|
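The difference between the existing `symbol` header converter and the new `symbol_raw` is easiest to see side by side. The sketch below copies both lambdas from the hunk above into standalone constants; `CONVERTER_ENCODING` stands in for `CSV::ConverterEncoding` and is assumed to be UTF-8.

```ruby
# Assumption: CSV::ConverterEncoding is UTF-8.
CONVERTER_ENCODING = Encoding.find("UTF-8")

# Same body as the :symbol header converter.
SYMBOL = lambda { |h|
  h.encode(CONVERTER_ENCODING).downcase.gsub(/[^\s\w]+/, "").strip.
    gsub(/\s+/, "_").to_sym
}

# Same body as the new :symbol_raw header converter.
SYMBOL_RAW = lambda { |h| h.encode(CONVERTER_ENCODING).to_sym }

header = "First Name!"
p SYMBOL.call(header)      # normalized: downcased, punctuation removed, spaces to "_"
p SYMBOL_RAW.call(header)  # header text kept verbatim
```

`symbol_raw` is the one to reach for when the header text must survive unchanged as a hash key.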
@@ -936,6 +937,7 @@ class CSV
|
|
936
937
|
quote_char: '"',
|
937
938
|
# For parsing.
|
938
939
|
field_size_limit: nil,
|
940
|
+
max_field_size: nil,
|
939
941
|
converters: nil,
|
940
942
|
unconverted_fields: nil,
|
941
943
|
headers: false,
|
@@ -946,6 +948,7 @@ class CSV
|
|
946
948
|
liberal_parsing: false,
|
947
949
|
nil_value: nil,
|
948
950
|
empty_value: "",
|
951
|
+
strip: false,
|
949
952
|
# For generating.
|
950
953
|
write_headers: nil,
|
951
954
|
quote_empty: true,
|
@@ -953,7 +956,6 @@ class CSV
|
|
953
956
|
write_converters: nil,
|
954
957
|
write_nil_value: nil,
|
955
958
|
write_empty_value: "",
|
956
|
-
strip: false,
|
957
959
|
}.freeze
|
958
960
|
|
959
961
|
class << self
|
@@ -1864,6 +1866,7 @@ class CSV
|
|
1864
1866
|
row_sep: :auto,
|
1865
1867
|
quote_char: '"',
|
1866
1868
|
field_size_limit: nil,
|
1869
|
+
max_field_size: nil,
|
1867
1870
|
converters: nil,
|
1868
1871
|
unconverted_fields: nil,
|
1869
1872
|
headers: false,
|
@@ -1879,16 +1882,27 @@ class CSV
|
|
1879
1882
|
encoding: nil,
|
1880
1883
|
nil_value: nil,
|
1881
1884
|
empty_value: "",
|
1885
|
+
strip: false,
|
1882
1886
|
quote_empty: true,
|
1883
1887
|
write_converters: nil,
|
1884
1888
|
write_nil_value: nil,
|
1885
|
-
write_empty_value: ""
|
1886
|
-
strip: false)
|
1889
|
+
write_empty_value: "")
|
1887
1890
|
raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
|
1888
1891
|
|
1889
1892
|
if data.is_a?(String)
|
1893
|
+
if encoding
|
1894
|
+
if encoding.is_a?(String)
|
1895
|
+
data_external_encoding, data_internal_encoding = encoding.split(":", 2)
|
1896
|
+
if data_internal_encoding
|
1897
|
+
data = data.encode(data_internal_encoding, data_external_encoding)
|
1898
|
+
else
|
1899
|
+
data = data.dup.force_encoding(data_external_encoding)
|
1900
|
+
end
|
1901
|
+
else
|
1902
|
+
data = data.dup.force_encoding(encoding)
|
1903
|
+
end
|
1904
|
+
end
|
1890
1905
|
@io = StringIO.new(data)
|
1891
|
-
@io.set_encoding(encoding || data.encoding)
|
1892
1906
|
else
|
1893
1907
|
@io = data
|
1894
1908
|
end
|
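The `from-encoding:to-encoding` handling added for String data can be followed without the rest of `initialize`. This is a sketch under that framing: `apply_encoding` is a hypothetical helper whose branches mirror the added lines, and it is not part of the library's API.

```ruby
# Hypothetical helper mirroring the encoding branch added for String input.
def apply_encoding(data, encoding)
  return data unless encoding
  if encoding.is_a?(String)
    external, internal = encoding.split(":", 2)
    if internal
      # "from:to" -> transcode the bytes from `external` to `internal`
      data.encode(internal, external)
    else
      # single name -> relabel a copy of the bytes, no transcoding
      data.dup.force_encoding(external)
    end
  else
    # Encoding object -> relabel a copy of the bytes
    data.dup.force_encoding(encoding)
  end
end

latin1_bytes = "caf\xE9,1\n".b   # "café,1\n" as ISO-8859-1 bytes
utf8 = apply_encoding(latin1_bytes, "ISO-8859-1:UTF-8")
p utf8.encoding
p utf8.valid_encoding?
```

Because the non-transcoding branches call `dup` first, the caller's string keeps its original encoding label.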
@@ -1906,11 +1920,14 @@ class CSV
|
|
1906
1920
|
@initial_header_converters = header_converters
|
1907
1921
|
@initial_write_converters = write_converters
|
1908
1922
|
|
1923
|
+
if max_field_size.nil? and field_size_limit
|
1924
|
+
max_field_size = field_size_limit - 1
|
1925
|
+
end
|
1909
1926
|
@parser_options = {
|
1910
1927
|
column_separator: col_sep,
|
1911
1928
|
row_separator: row_sep,
|
1912
1929
|
quote_character: quote_char,
|
1913
|
-
|
1930
|
+
max_field_size: max_field_size,
|
1914
1931
|
unconverted_fields: unconverted_fields,
|
1915
1932
|
headers: headers,
|
1916
1933
|
return_headers: return_headers,
|
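The relationship between the deprecated `field_size_limit` and the new `max_field_size` comes down to an off-by-one fallback. The sketch below isolates it; `effective_max_field_size` is a hypothetical name and the body mirrors the three added lines.

```ruby
# Hypothetical helper mirroring the fallback added before @parser_options.
def effective_max_field_size(max_field_size: nil, field_size_limit: nil)
  if max_field_size.nil? and field_size_limit
    # field_size_limit means "maximum size + 1", so subtract one.
    max_field_size = field_size_limit - 1
  end
  max_field_size
end

p effective_max_field_size(field_size_limit: 10)
p effective_max_field_size(max_field_size: 5, field_size_limit: 10)
p effective_max_field_size
```

An explicit `max_field_size` always wins; the deprecated option is consulted only when the new one is absent.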
@@ -1978,10 +1995,24 @@ class CSV
|
|
1978
1995
|
# Returns the limit for field size; used for parsing;
|
1979
1996
|
# see {Option +field_size_limit+}[#class-CSV-label-Option+field_size_limit]:
|
1980
1997
|
# CSV.new('').field_size_limit # => nil
|
1998
|
+
#
|
1999
|
+
# Deprecated since 3.2.3. Use +max_field_size+ instead.
|
1981
2000
|
def field_size_limit
|
1982
2001
|
parser.field_size_limit
|
1983
2002
|
end
|
1984
2003
|
|
2004
|
+
# :call-seq:
|
2005
|
+
# csv.max_field_size -> integer or nil
|
2006
|
+
#
|
2007
|
+
# Returns the limit for field size; used for parsing;
|
2008
|
+
# see {Option +max_field_size+}[#class-CSV-label-Option+max_field_size]:
|
2009
|
+
# CSV.new('').max_field_size # => nil
|
2010
|
+
#
|
2011
|
+
# Since 3.2.3.
|
2012
|
+
def max_field_size
|
2013
|
+
parser.max_field_size
|
2014
|
+
end
|
2015
|
+
|
1985
2016
|
# :call-seq:
|
1986
2017
|
# csv.skip_lines -> regexp or nil
|
1987
2018
|
#
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: csv
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 3.2.
|
4
|
+
version: 3.2.4
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- James Edward Gray II
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date:
|
12
|
+
date: 2022-08-22 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bundler
|
@@ -116,10 +116,8 @@ files:
|
|
116
116
|
- lib/csv.rb
|
117
117
|
- lib/csv/core_ext/array.rb
|
118
118
|
- lib/csv/core_ext/string.rb
|
119
|
-
- lib/csv/delete_suffix.rb
|
120
119
|
- lib/csv/fields_converter.rb
|
121
120
|
- lib/csv/input_record_separator.rb
|
122
|
-
- lib/csv/match_p.rb
|
123
121
|
- lib/csv/parser.rb
|
124
122
|
- lib/csv/row.rb
|
125
123
|
- lib/csv/table.rb
|
@@ -147,7 +145,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
147
145
|
- !ruby/object:Gem::Version
|
148
146
|
version: '0'
|
149
147
|
requirements: []
|
150
|
-
rubygems_version: 3.
|
148
|
+
rubygems_version: 3.4.0.dev
|
151
149
|
signing_key:
|
152
150
|
specification_version: 4
|
153
151
|
summary: CSV Reading and Writing
|
data/lib/csv/delete_suffix.rb
DELETED
@@ -1,18 +0,0 @@
|
|
1
|
-
# frozen_string_literal: true
|
2
|
-
|
3
|
-
# This provides String#delete_suffix? for Ruby 2.4.
|
4
|
-
unless String.method_defined?(:delete_suffix)
|
5
|
-
class CSV
|
6
|
-
module DeleteSuffix
|
7
|
-
refine String do
|
8
|
-
def delete_suffix(suffix)
|
9
|
-
if end_with?(suffix)
|
10
|
-
self[0...-suffix.size]
|
11
|
-
else
|
12
|
-
self
|
13
|
-
end
|
14
|
-
end
|
15
|
-
end
|
16
|
-
end
|
17
|
-
end
|
18
|
-
end
|
data/lib/csv/match_p.rb
DELETED
@@ -1,20 +0,0 @@
|
|
1
|
-
# frozen_string_literal: true
|
2
|
-
|
3
|
-
# This provides String#match? and Regexp#match? for Ruby 2.3.
|
4
|
-
unless String.method_defined?(:match?)
|
5
|
-
class CSV
|
6
|
-
module MatchP
|
7
|
-
refine String do
|
8
|
-
def match?(pattern)
|
9
|
-
self =~ pattern
|
10
|
-
end
|
11
|
-
end
|
12
|
-
|
13
|
-
refine Regexp do
|
14
|
-
def match?(string)
|
15
|
-
self =~ string
|
16
|
-
end
|
17
|
-
end
|
18
|
-
end
|
19
|
-
end
|
20
|
-
end
|