csv 3.2.8 → 3.3.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: c64817c16c8991fc2596875101449b5452326fe91bd05e4bb6a66213113525d6
4
- data.tar.gz: 19d6d80d6959f6cde0ac651774ea795dbd0f949135cae021fef3983d94248f9c
3
+ metadata.gz: f77fe04292e17f436f9c5838bb3cbd6aa4297b807478c127321323fc4ad27c28
4
+ data.tar.gz: 63ecf6ce80dcfa3b7eb24d0f3947455eb223d1c44a32b58b4d084b29efad8961
5
5
  SHA512:
6
- metadata.gz: 556f6582468d4a3c2994c12c25dba73b8db65e1a10f7306b9b5bc1fa345f47bf7872db1c603ddcd1a0eb359e7857c51a9874be2231dc821730ae62d15604c3b7
7
- data.tar.gz: 348a25f4c1bb8e4fe0d71dc944e0a26165627803cb2528fc067642827fd3c253bda48aba179d3575950a7244bd4e8edf2eed9a99101952a07256a3f4f9d1e7fe
6
+ metadata.gz: f9ffbf24700b4ab0eb35a2cb8086e82e36825558c14b646ae00374cb3e80f7195ebb3c9ac08d11b011ffa0cc10ae607a79e24d68b793b54c50d2dd1e8bd5ec16
7
+ data.tar.gz: 628a0b0c0963d69686845172d84fbbdf1cebb9fc6839e052054f9e7c112bbcb96743a63e8cf5d1a6df8a9299f03ab3b8d81a9b814fb6d0f222e444b9f0282a5d
data/NEWS.md CHANGED
@@ -1,5 +1,132 @@
1
1
  # News
2
2
 
3
+ ## 3.3.5 - 2025-06-01
4
+
5
+ ### Improvements
6
+
7
+ * docs: Fixed `StringScanner` document URL.
8
+ * GH-343
9
+ * Patch by Petrik de Heus
10
+
11
+ ### Thanks
12
+
13
+ * Petrik de Heus
14
+
15
+ ## 3.3.4 - 2025-04-13
16
+
17
+ ### Improvements
18
+
19
+ * `csv-filter`: Removed an experimental command line tool.
20
+ * GH-341
21
+
22
+ ## 3.3.3 - 2025-03-20
23
+
24
+ ### Improvements
25
+
26
+ * `csv-filter`: Added an experimental command line tool to filter a CSV.
27
+ * Patch by Burdette Lamar
28
+
29
+ ### Fixes
30
+
31
+ * Fixed wrong EOF detection for `ARGF`
32
+ * GH-328
33
+ * Reported by Takeshi Nishimatsu
34
+
35
+ * Fixed a regression bug that `CSV.open` rejects integer mode.
36
+ * GH-336
37
+ * Reported by Dave Burgess
38
+
39
+ ### Thanks
40
+
41
+ * Takeshi Nishimatsu
42
+
43
+ * Burdette Lamar
44
+
45
+ * Dave Burgess
46
+
47
+ ## 3.3.2 - 2024-12-21
48
+
49
+ ### Fixes
50
+
51
+ * Fixed a parse bug with a quoted line with `col_sep` and an empty
52
+ line. This was introduced in 3.3.1.
53
+ * GH-324
54
+ * Reported by stoodfarback
55
+
56
+ ### Thanks
57
+
58
+ * stoodfarback
59
+
60
+ ## 3.3.1 - 2024-12-15
61
+
62
+ ### Improvements
63
+
64
+ * `CSV.open`: Changed to detect BOM by default. Note that this isn't
65
+ enabled on Windows because Ruby may have a bug. See also:
66
+ https://bugs.ruby-lang.org/issues/20526
67
+ * GH-301
68
+ * Reported by Junichi Ito
69
+
70
+ * Improved performance.
71
+ * GH-311
72
+ * GH-312
73
+ * Patch by Vladimir Kochnev
74
+
75
+ * `CSV.open`: Added support for `StringIO` as an input.
76
+ * GH-300
77
+ * GH-302
78
+ * Patch by Marcelo
79
+
80
+ * Added a built-in time converter. You can use it by `converters:
81
+ :time`.
82
+ * GH-313
83
+ * Patch by Bart de Water
84
+
85
+ * Added `CSV::TSV` for tab-separated values.
86
+ * GH-272
87
+ * GH-319
88
+ * Reported by kojix2
89
+ * Patch by Jas
90
+
91
+ ### Thanks
92
+
93
+ * Junichi Ito
94
+
95
+ * Vladimir Kochnev
96
+
97
+ * Marcelo
98
+
99
+ * Bart de Water
100
+
101
+ * kojix2
102
+
103
+ * Jas
104
+
105
+ ## 3.3.0 - 2024-03-22
106
+
107
+ ### Fixes
108
+
109
+ * Fixed a regression parse bug in 3.2.9 that parsing with
110
+ `:skip_lines` may cause wrong result.
111
+
112
+ ## 3.2.9 - 2024-03-22
113
+
114
+ ### Fixes
115
+
116
+ * Fixed a parse bug where a wrong result may happen when:
117
+
118
+ * `:skip_lines` is used
119
+ * `:row_separator` is `"\r\n"`
120
+ * There is a line that includes `\n` as a column value
121
+
122
+ Reported by Ryo Tsukamoto.
123
+
124
+ GH-296
125
+
126
+ ### Thanks
127
+
128
+ * Ryo Tsukamoto
129
+
3
130
  ## 3.2.8 - 2023-11-08
4
131
 
5
132
  ### Improvements
data/README.md CHANGED
@@ -30,8 +30,8 @@ end
30
30
 
31
31
  ## Documentation
32
32
 
33
- - [API](https://ruby-doc.org/stdlib/libdoc/csv/rdoc/CSV.html): all classes, methods, and constants.
34
- - [Recipes](https://ruby-doc.org/core/doc/csv/recipes/recipes_rdoc.html): specific code for specific tasks.
33
+ - [API](https://ruby.github.io/csv/): all classes, methods, and constants.
34
+ - [Recipes](https://ruby.github.io/csv/doc/csv/recipes/recipes_rdoc.html): specific code for specific tasks.
35
35
 
36
36
  ## Development
37
37
 
@@ -11,16 +11,20 @@ All code snippets on this page assume that the following has been executed:
11
11
 
12
12
  - {Source and Output Formats}[#label-Source+and+Output+Formats]
13
13
  - {Filtering String to String}[#label-Filtering+String+to+String]
14
- - {Recipe: Filter String to String with Headers}[#label-Recipe-3A+Filter+String+to+String+with+Headers]
14
+ - {Recipe: Filter String to String parsing Headers}[#label-Recipe-3A+Filter+String+to+String+parsing+Headers]
15
+ - {Recipe: Filter String to String parsing and writing Headers}[#label-Recipe-3A+Filter+String+to+String+parsing+and+writing+Headers]
15
16
  - {Recipe: Filter String to String Without Headers}[#label-Recipe-3A+Filter+String+to+String+Without+Headers]
16
17
  - {Filtering String to IO Stream}[#label-Filtering+String+to+IO+Stream]
17
- - {Recipe: Filter String to IO Stream with Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+with+Headers]
18
+ - {Recipe: Filter String to IO Stream parsing Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+parsing+Headers]
19
+ - {Recipe: Filter String to IO Stream parsing and writing Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+parsing+and+writing+Headers]
18
20
  - {Recipe: Filter String to IO Stream Without Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+Without+Headers]
19
21
  - {Filtering IO Stream to String}[#label-Filtering+IO+Stream+to+String]
20
- - {Recipe: Filter IO Stream to String with Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+with+Headers]
22
+ - {Recipe: Filter IO Stream to String parsing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+parsing+Headers]
23
+ - {Recipe: Filter IO Stream to String parsing and writing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+parsing+and+writing+Headers]
21
24
  - {Recipe: Filter IO Stream to String Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+Without+Headers]
22
25
  - {Filtering IO Stream to IO Stream}[#label-Filtering+IO+Stream+to+IO+Stream]
23
- - {Recipe: Filter IO Stream to IO Stream with Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+with+Headers]
26
+ - {Recipe: Filter IO Stream to IO Stream parsing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+parsing+Headers]
27
+ - {Recipe: Filter IO Stream to IO Stream parsing and writing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+parsing+and+writing+Headers]
24
28
  - {Recipe: Filter IO Stream to IO Stream Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+Without+Headers]
25
29
 
26
30
  === Source and Output Formats
@@ -33,14 +37,27 @@ The input and output \CSV data may be any mixture of \Strings and \IO streams.
33
37
 
34
38
  You can filter one \String to another, with or without headers.
35
39
 
36
- ===== Recipe: Filter \String to \String with Headers
40
+ ===== Recipe: Filter \String to \String parsing Headers
37
41
 
38
42
  Use class method CSV.filter with option +headers+ to filter a \String to another \String:
39
43
  in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
40
44
  out_string = ''
41
45
  CSV.filter(in_string, out_string, headers: true) do |row|
42
- row[0] = row[0].upcase
43
- row[1] *= 4
46
+ row['Name'] = row['Name'].upcase
47
+ row['Value'] *= 4
48
+ end
49
+ out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
50
+
51
+ ===== Recipe: Filter \String to \String parsing and writing Headers
52
+
53
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter a \String to another \String including header row:
54
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
55
+ out_string = ''
56
+ CSV.filter(in_string, out_string, headers: true, out_write_headers: true) do |row|
57
+ unless row.is_a?(Array)
58
+ row['Name'] = row['Name'].upcase
59
+ row['Value'] *= 4
60
+ end
44
61
  end
45
62
  out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
46
63
 
@@ -59,15 +76,30 @@ Use class method CSV.filter without option +headers+ to filter a \String to anot
59
76
 
60
77
  You can filter a \String to an \IO stream, with or without headers.
61
78
 
62
- ===== Recipe: Filter \String to \IO Stream with Headers
79
+ ===== Recipe: Filter \String to \IO Stream parsing Headers
63
80
 
64
81
  Use class method CSV.filter with option +headers+ to filter a \String to an \IO stream:
65
82
  in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
66
83
  path = 't.csv'
67
84
  File.open(path, 'w') do |out_io|
68
85
  CSV.filter(in_string, out_io, headers: true) do |row|
69
- row[0] = row[0].upcase
70
- row[1] *= 4
86
+ row['Name'] = row['Name'].upcase
87
+ row['Value'] *= 4
88
+ end
89
+ end
90
+ p File.read(path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
91
+
92
+ ===== Recipe: Filter \String to \IO Stream parsing and writing Headers
93
+
94
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter a \String to an \IO stream including header row:
95
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
96
+ path = 't.csv'
97
+ File.open(path, 'w') do |out_io|
98
+ CSV.filter(in_string, out_io, headers: true, out_write_headers: true) do |row|
99
+ unless row.is_a?(Array)
100
+ row['Name'] = row['Name'].upcase
101
+ row['Value'] *= 4
102
+ end
71
103
  end
72
104
  end
73
105
  p File.read(path) # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
@@ -89,17 +121,34 @@ Use class method CSV.filter without option +headers+ to filter a \String to an \
89
121
 
90
122
  You can filter an \IO stream to a \String, with or without headers.
91
123
 
92
- ===== Recipe: Filter \IO Stream to \String with Headers
124
+ ===== Recipe: Filter \IO Stream to \String parsing Headers
93
125
 
94
126
  Use class method CSV.filter with option +headers+ to filter an \IO stream to a \String:
95
127
  in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
96
128
  path = 't.csv'
97
129
  File.write(path, in_string)
98
130
  out_string = ''
99
- File.open(path, headers: true) do |in_io|
131
+ File.open(path) do |in_io|
100
132
  CSV.filter(in_io, out_string, headers: true) do |row|
101
- row[0] = row[0].upcase
102
- row[1] *= 4
133
+ row['Name'] = row['Name'].upcase
134
+ row['Value'] *= 4
135
+ end
136
+ end
137
+ out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
138
+
139
+ ===== Recipe: Filter \IO Stream to \String parsing and writing Headers
140
+
141
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter an \IO stream to a \String including header row:
142
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
143
+ path = 't.csv'
144
+ File.write(path, in_string)
145
+ out_string = ''
146
+ File.open(path) do |in_io|
147
+ CSV.filter(in_io, out_string, headers: true, out_write_headers: true) do |row|
148
+ unless row.is_a?(Array)
149
+ row['Name'] = row['Name'].upcase
150
+ row['Value'] *= 4
151
+ end
103
152
  end
104
153
  end
105
154
  out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
@@ -123,7 +172,7 @@ Use class method CSV.filter without option +headers+ to filter an \IO stream to
123
172
 
124
173
  You can filter an \IO stream to another \IO stream, with or without headers.
125
174
 
126
- ===== Recipe: Filter \IO Stream to \IO Stream with Headers
175
+ ===== Recipe: Filter \IO Stream to \IO Stream parsing Headers
127
176
 
128
177
  Use class method CSV.filter with option +headers+ to filter an \IO stream to another \IO stream:
129
178
  in_path = 't.csv'
@@ -133,8 +182,27 @@ Use class method CSV.filter with option +headers+ to filter an \IO stream to ano
133
182
  File.open(in_path) do |in_io|
134
183
  File.open(out_path, 'w') do |out_io|
135
184
  CSV.filter(in_io, out_io, headers: true) do |row|
136
- row[0] = row[0].upcase
137
- row[1] *= 4
185
+ row['Name'] = row['Name'].upcase
186
+ row['Value'] *= 4
187
+ end
188
+ end
189
+ end
190
+ p File.read(out_path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
191
+
192
+ ===== Recipe: Filter \IO Stream to \IO Stream parsing and writing Headers
193
+
194
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter an \IO stream to another \IO stream including header row:
195
+ in_path = 't.csv'
196
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
197
+ File.write(in_path, in_string)
198
+ out_path = 'u.csv'
199
+ File.open(in_path) do |in_io|
200
+ File.open(out_path, 'w') do |out_io|
201
+ CSV.filter(in_io, out_io, headers: true, out_write_headers: true) do |row|
202
+ unless row.is_a?(Array)
203
+ row['Name'] = row['Name'].upcase
204
+ row['Value'] *= 4
205
+ end
138
206
  end
139
207
  end
140
208
  end
@@ -165,7 +165,7 @@ This example defines and uses two custom write converters to strip and upcase ge
165
165
  === RFC 4180 Compliance
166
166
 
167
167
  By default, \CSV generates data that is compliant with
168
- {RFC 4180}[https://tools.ietf.org/html/rfc4180]
168
+ {RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180]
169
169
  with respect to:
170
170
  - Column separator.
171
171
  - Quote character.
@@ -45,6 +45,7 @@ All code snippets on this page assume that the following has been executed:
45
45
  - {Recipe: Convert Fields to Numerics}[#label-Recipe-3A+Convert+Fields+to+Numerics]
46
46
  - {Recipe: Convert Fields to Dates}[#label-Recipe-3A+Convert+Fields+to+Dates]
47
47
  - {Recipe: Convert Fields to DateTimes}[#label-Recipe-3A+Convert+Fields+to+DateTimes]
48
+ - {Recipe: Convert Fields to Times}[#label-Recipe-3A+Convert+Fields+to+Times]
48
49
  - {Recipe: Convert Assorted Fields to Objects}[#label-Recipe-3A+Convert+Assorted+Fields+to+Objects]
49
50
  - {Recipe: Convert Fields to Other Objects}[#label-Recipe-3A+Convert+Fields+to+Other+Objects]
50
51
  - {Recipe: Filter Field Strings}[#label-Recipe-3A+Filter+Field+Strings]
@@ -110,7 +111,7 @@ You can parse \CSV data from a \File, with or without headers.
110
111
 
111
112
  ===== Recipe: Parse from \File with Headers
112
113
 
113
- Use instance method CSV#read with option +headers+ to read a file all at once:
114
+ Use class method CSV.read with option +headers+ to read a file all at once:
114
115
  string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
115
116
  path = 't.csv'
116
117
  File.write(path, string)
@@ -191,7 +192,7 @@ Output:
191
192
  === RFC 4180 Compliance
192
193
 
193
194
  By default, \CSV parses data that is compliant with
194
- {RFC 4180}[https://tools.ietf.org/html/rfc4180]
195
+ {RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180]
195
196
  with respect to:
196
197
  - Row separator.
197
198
  - Column separator.
@@ -339,6 +340,7 @@ There are built-in field converters for converting to objects of certain classes
339
340
  - \Integer
340
341
  - \Date
341
342
  - \DateTime
343
+ - \Time
342
344
 
343
345
  Other built-in field converters include:
344
346
  - +:numeric+: converts to \Integer and \Float.
@@ -381,6 +383,13 @@ Convert fields to \DateTime objects using built-in converter +:date_time+:
381
383
  parsed = CSV.parse(source, headers: true, converters: :date_time)
382
384
  parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime]
383
385
 
386
+ ===== Recipe: Convert Fields to Times
387
+
388
+ Convert fields to \Time objects using built-in converter +:time+:
389
+ source = "Name,Time\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
390
+ parsed = CSV.parse(source, headers: true, converters: :time)
391
+ parsed.map {|row| row['Time'].class} # => [Time, Time, Time]
392
+
384
393
  ===== Recipe: Convert Assorted Fields to Objects
385
394
 
386
395
  Convert assorted fields to objects using built-in converter +:all+:
@@ -542,4 +551,4 @@ Output:
542
551
  #<struct CSV::FieldInfo index=0, line=2, header=nil>
543
552
  #<struct CSV::FieldInfo index=1, line=2, header=nil>
544
553
  #<struct CSV::FieldInfo index=0, line=3, header=nil>
545
- #<struct CSV::FieldInfo index=1, line=3, header=nil>
554
+ #<struct CSV::FieldInfo index=1, line=3, header=nil>
@@ -1,4 +1,4 @@
1
- class Array # :nodoc:
1
+ class Array
2
2
  # Equivalent to CSV::generate_line(self, options)
3
3
  #
4
4
  # ["CSV", "data"].to_csv
@@ -1,4 +1,4 @@
1
- class String # :nodoc:
1
+ class String
2
2
  # Equivalent to CSV::parse_line(self, options)
3
3
  #
4
4
  # "CSV,data".parse_csv
@@ -4,6 +4,13 @@ class CSV
4
4
  # Note: Don't use this class directly. This is an internal class.
5
5
  class FieldsConverter
6
6
  include Enumerable
7
+
8
+ NO_QUOTED_FIELDS = [] # :nodoc:
9
+ def NO_QUOTED_FIELDS.[](_index)
10
+ false
11
+ end
12
+ NO_QUOTED_FIELDS.freeze
13
+
7
14
  #
8
15
  # A CSV::FieldsConverter is a data structure for storing the
9
16
  # fields converter properties to be passed as a parameter
@@ -44,7 +51,7 @@ class CSV
44
51
  @converters.empty?
45
52
  end
46
53
 
47
- def convert(fields, headers, lineno, quoted_fields)
54
+ def convert(fields, headers, lineno, quoted_fields=NO_QUOTED_FIELDS)
48
55
  return fields unless need_convert?
49
56
 
50
57
  fields.collect.with_index do |field, index|
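The `NO_QUOTED_FIELDS` change above replaces per-row `[false] * row.size` allocations with one shared, frozen object whose `[]` always answers `false`. A minimal standalone sketch of that pattern follows. The constant `NEVER_QUOTED` and the helper `quoted?` are hypothetical names used only for illustration and are not part of the csv library.

```ruby
# Hypothetical stand-in for FieldsConverter::NO_QUOTED_FIELDS.
NEVER_QUOTED = []

# Singleton method: every index lookup reports "not quoted",
# so the array never needs one element per field.
def NEVER_QUOTED.[](_index)
  false
end
NEVER_QUOTED.freeze

# Hypothetical helper showing the default-argument usage that
# convert and emit_row adopt in this diff.
def quoted?(index, quoted_fields = NEVER_QUOTED)
  quoted_fields[index]
end

quoted?(0)                # => false
quoted?(500)              # => false
quoted?(1, [false, true]) # => true
```

Because the shared object is frozen and empty, callers that have no quoting information can pass nothing at all, which is presumably where the allocation savings come from.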
data/lib/csv/parser.rb CHANGED
@@ -18,6 +18,19 @@ class CSV
18
18
  # into your Encoding.
19
19
  #
20
20
 
21
+ class << self
22
+ ARGF_OBJECT_ID = ARGF.object_id
23
+ # Convenience method to check whether the given input reached EOF
24
+ # or not.
25
+ def eof?(input)
26
+ # We can't use input != ARGF in Ractor because ARGF isn't a
27
+ # shareable object.
28
+ input.object_id != ARGF_OBJECT_ID and
29
+ input.respond_to?(:eof) and
30
+ input.eof?
31
+ end
32
+ end
33
+
21
34
  # Raised when encoding is invalid.
22
35
  class InvalidEncoding < StandardError
23
36
  end
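The `Parser.eof?` helper added in this hunk can be approximated outside the library as shown below. This is only a sketch: `ARGF_ID` and `input_eof?` are hypothetical names, and it checks `respond_to?(:eof?)` where the diff checks `respond_to?(:eof)`.

```ruby
require "stringio"

# Remember ARGF by id. An Integer can be read from any Ractor,
# whereas (per the comment in the diff) ARGF itself is not shareable.
ARGF_ID = ARGF.object_id

# Hypothetical free-standing version of Parser.eof?.
def input_eof?(input)
  input.object_id != ARGF_ID &&
    input.respond_to?(:eof?) &&
    input.eof?
end

io = StringIO.new("a,b\n")
before_read = input_eof?(io) # => false, data is still unread
io.read
after_read = input_eof?(io)  # => true, everything was consumed
argf_result = input_eof?(ARGF)       # => false, ARGF is never asked
string_result = input_eof?("a,b\n")  # => false, String has no eof?
```

Short-circuiting on the id means `eof?` is never called on `ARGF`, which matches the "wrong EOF detection for `ARGF`" fix listed in the NEWS entry for 3.3.3.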
@@ -34,7 +47,7 @@ class CSV
34
47
  # Uses StringScanner (the official strscan gem). Strscan provides lexical
35
48
  # scanning operations on a String. We inherit its object and take advantage
36
49
  # on the methods. For more information, please visit:
37
- # https://ruby-doc.org/stdlib-2.6.1/libdoc/strscan/rdoc/StringScanner.html
50
+ # https://docs.ruby-lang.org/en/master/StringScanner.html
38
51
  #
39
52
  class Scanner < StringScanner
40
53
  alias_method :scan_all, :scan
@@ -220,6 +233,15 @@ class CSV
220
233
  end
221
234
  # trace(__method__, :repos, start, buffer)
222
235
  @scanner.pos = start
236
+ last_scanner, last_start, last_buffer = @keeps.last
237
+ # Drop the last buffer when the last buffer is the same data
238
+ # in the last keep. If we keep it, we have duplicated data
239
+ # by the next keep_back.
240
+ if last_scanner == @scanner and
241
+ last_buffer and
242
+ last_buffer == last_scanner.string.byteslice(last_start, start)
243
+ @keeps.last[2] = nil
244
+ end
223
245
  end
224
246
  read_chunk if @scanner.eos?
225
247
  end
@@ -303,7 +325,7 @@ class CSV
303
325
  raise InvalidEncoding unless chunk.valid_encoding?
304
326
  # trace(__method__, :chunk, chunk)
305
327
  @scanner = StringScanner.new(chunk)
306
- if input.respond_to?(:eof?) and input.eof?
328
+ if Parser.eof?(input)
307
329
  @inputs.shift
308
330
  @last_scanner = @inputs.empty?
309
331
  end
@@ -400,13 +422,7 @@ class CSV
400
422
 
401
423
  begin
402
424
  @scanner ||= build_scanner
403
- if quote_character.nil?
404
- parse_no_quote(&block)
405
- elsif @need_robust_parsing
406
- parse_quotable_robust(&block)
407
- else
408
- parse_quotable_loose(&block)
409
- end
425
+ __send__(@parse_method, &block)
410
426
  rescue InvalidEncoding
411
427
  if @scanner
412
428
  ignore_broken_line
@@ -450,7 +466,6 @@ class CSV
450
466
  end
451
467
 
452
468
  def prepare_variable
453
- @need_robust_parsing = false
454
469
  @encoding = @options[:encoding]
455
470
  liberal_parsing = @options[:liberal_parsing]
456
471
  if liberal_parsing
@@ -463,7 +478,6 @@ class CSV
463
478
  @double_quote_outside_quote = false
464
479
  @backslash_quote = false
465
480
  end
466
- @need_robust_parsing = true
467
481
  else
468
482
  @liberal_parsing = false
469
483
  @backslash_quote = false
@@ -545,7 +559,6 @@ class CSV
545
559
  @rstrip_value = Regexp.new(@escaped_strip +
546
560
  "+\\z".encode(@encoding))
547
561
  end
548
- @need_robust_parsing = true
549
562
  elsif @strip
550
563
  strip_values = " \t\f\v"
551
564
  @escaped_strip = strip_values.encode(@encoding)
@@ -553,7 +566,6 @@ class CSV
553
566
  @strip_value = Regexp.new("[#{strip_values}]+".encode(@encoding))
554
567
  @rstrip_value = Regexp.new("[#{strip_values}]+\\z".encode(@encoding))
555
568
  end
556
- @need_robust_parsing = true
557
569
  end
558
570
  end
559
571
 
@@ -758,7 +770,7 @@ class CSV
758
770
  case headers
759
771
  when Array
760
772
  @raw_headers = headers
761
- quoted_fields = [false] * @raw_headers.size
773
+ quoted_fields = FieldsConverter::NO_QUOTED_FIELDS
762
774
  @use_headers = true
763
775
  when String
764
776
  @raw_headers, quoted_fields = parse_headers(headers)
@@ -799,6 +811,13 @@ class CSV
799
811
 
800
812
  def prepare_parser
801
813
  @may_quoted = may_quoted?
814
+ if @quote_character.nil?
815
+ @parse_method = :parse_no_quote
816
+ elsif @liberal_parsing or @strip
817
+ @parse_method = :parse_quotable_robust
818
+ else
819
+ @parse_method = :parse_quotable_loose
820
+ end
802
821
  end
803
822
 
804
823
  def may_quoted?
@@ -863,10 +882,7 @@ class CSV
863
882
  string = nil
864
883
  if @samples.empty? and @input.is_a?(StringIO)
865
884
  string = @input.read
866
- elsif @samples.size == 1 and
867
- @input != ARGF and
868
- @input.respond_to?(:eof?) and
869
- @input.eof?
885
+ elsif @samples.size == 1 and Parser.eof?(@input)
870
886
  string = @samples[0]
871
887
  end
872
888
  if string
@@ -935,11 +951,9 @@ class CSV
935
951
  if line.empty?
936
952
  next if @skip_blanks
937
953
  row = []
938
- quoted_fields = []
939
954
  else
940
955
  line = strip_value(line)
941
956
  row = line.split(@split_column_separator, -1)
942
- quoted_fields = [false] * row.size
943
957
  if @max_field_size
944
958
  row.each do |column|
945
959
  validate_field_size(column)
@@ -953,7 +967,7 @@ class CSV
953
967
  end
954
968
  end
955
969
  @last_line = original_line
956
- emit_row(row, quoted_fields, &block)
970
+ emit_row(row, &block)
957
971
  end
958
972
  end
959
973
 
@@ -975,10 +989,10 @@ class CSV
975
989
  next
976
990
  end
977
991
  row = []
978
- quoted_fields = []
992
+ quoted_fields = FieldsConverter::NO_QUOTED_FIELDS
979
993
  elsif line.include?(@cr) or line.include?(@lf)
980
994
  @scanner.keep_back
981
- @need_robust_parsing = true
995
+ @parse_method = :parse_quotable_robust
982
996
  return parse_quotable_robust(&block)
983
997
  else
984
998
  row = line.split(@split_column_separator, -1)
@@ -1002,7 +1016,7 @@ class CSV
1002
1016
  row[i] = column[1..-2]
1003
1017
  else
1004
1018
  @scanner.keep_back
1005
- @need_robust_parsing = true
1019
+ @parse_method = :parse_quotable_robust
1006
1020
  return parse_quotable_robust(&block)
1007
1021
  end
1008
1022
  validate_field_size(row[i])
@@ -1037,13 +1051,13 @@ class CSV
1037
1051
  quoted_fields << @quoted_column_value
1038
1052
  elsif parse_row_end
1039
1053
  if row.empty? and value.nil?
1040
- emit_row([], [], &block) unless @skip_blanks
1054
+ emit_row([], &block) unless @skip_blanks
1041
1055
  else
1042
1056
  row << value
1043
1057
  quoted_fields << @quoted_column_value
1044
1058
  emit_row(row, quoted_fields, &block)
1045
1059
  row = []
1046
- quoted_fields = []
1060
+ quoted_fields.clear
1047
1061
  end
1048
1062
  skip_needless_lines
1049
1063
  start_row
@@ -1248,7 +1262,7 @@ class CSV
1248
1262
  @scanner.keep_start
1249
1263
  end
1250
1264
 
1251
- def emit_row(row, quoted_fields, &block)
1265
+ def emit_row(row, quoted_fields=FieldsConverter::NO_QUOTED_FIELDS, &block)
1252
1266
  @lineno += 1
1253
1267
 
1254
1268
  raw_row = row
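The parser changes above replace the `@need_robust_parsing` flag with a method name that is chosen once in `prepare_parser` and invoked through `__send__`. A toy sketch of that dispatch style follows. `TinyParser` and the return values of its three strategies are hypothetical; only the method names and the selection rule mirror the diff.

```ruby
# Toy class; it does no real parsing and only reports which
# strategy handled a line.
class TinyParser
  attr_reader :parse_method

  def initialize(quote_character: '"', liberal_parsing: false, strip: false)
    # Decide once, up front, which strategy to use.
    @parse_method =
      if quote_character.nil?
        :parse_no_quote
      elsif liberal_parsing || strip
        :parse_quotable_robust
      else
        :parse_quotable_loose
      end
  end

  def parse(line)
    # __send__ can call the private strategy methods below.
    __send__(@parse_method, line)
  end

  private

  def parse_no_quote(line)
    [:no_quote, line]
  end

  def parse_quotable_loose(line)
    if line.include?("\r") || line.include?("\n")
      # Switch permanently once the fast path cannot cope.
      @parse_method = :parse_quotable_robust
      return parse_quotable_robust(line)
    end
    [:loose, line]
  end

  def parse_quotable_robust(line)
    [:robust, line]
  end
end

parser = TinyParser.new
first = parser.parse("a,b")          # handled by the loose strategy
second = parser.parse("\"a\nb\",c")  # falls back to robust
third = parser.parse("a,b")          # stays on robust afterwards
```

Storing the symbol avoids re-evaluating the same conditions on every call, and reassigning it gives the same one-way fallback that the old boolean flag provided.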
data/lib/csv/version.rb CHANGED
@@ -2,5 +2,5 @@
2
2
 
3
3
  class CSV
4
4
  # The version of the installed library.
5
- VERSION = "3.2.8"
5
+ VERSION = "3.3.5"
6
6
  end
data/lib/csv/writer.rb CHANGED
@@ -40,8 +40,7 @@ class CSV
40
40
  @lineno += 1
41
41
 
42
42
  if @fields_converter
43
- quoted_fields = [false] * row.size
44
- row = @fields_converter.convert(row, nil, lineno, quoted_fields)
43
+ row = @fields_converter.convert(row, nil, lineno)
45
44
  end
46
45
 
47
46
  i = -1
data/lib/csv.rb CHANGED
@@ -91,6 +91,7 @@
91
91
 
92
92
  require "forwardable"
93
93
  require "date"
94
+ require "time"
94
95
  require "stringio"
95
96
 
96
97
  require_relative "csv/fields_converter"
@@ -521,6 +522,7 @@ require_relative "csv/writer"
521
522
  # - <tt>:float</tt>: converts each \String-embedded float into a true \Float.
522
523
  # - <tt>:date</tt>: converts each \String-embedded date into a true \Date.
523
524
  # - <tt>:date_time</tt>: converts each \String-embedded date-time into a true \DateTime
525
+ # - <tt>:time</tt>: converts each \String-embedded time into a true \Time
524
526
  # .
525
527
  # This example creates a converter proc, then stores it:
526
528
  # strip_converter = proc {|field| field.strip }
@@ -631,6 +633,7 @@ require_relative "csv/writer"
631
633
  # [:numeric, [:integer, :float]]
632
634
  # [:date, Proc]
633
635
  # [:date_time, Proc]
636
+ # [:time, Proc]
634
637
  # [:all, [:date_time, :numeric]]
635
638
  #
636
639
  # Each of these converters transcodes values to UTF-8 before attempting conversion.
@@ -675,6 +678,15 @@ require_relative "csv/writer"
675
678
  # csv = CSV.parse_line(data, converters: :date_time)
676
679
  # csv # => [#<DateTime: 2020-05-07T14:59:00-05:00 ((2458977j,71940s,0n),-18000s,2299161j)>, "x"]
677
680
  #
681
+ # Converter +:time+ converts each field that Time::parse accepts:
682
+ # data = '2020-05-07T14:59:00-05:00,x'
683
+ # # Without the converter
684
+ # csv = CSV.parse_line(data)
685
+ # csv # => ["2020-05-07T14:59:00-05:00", "x"]
686
+ # # With the converter
687
+ # csv = CSV.parse_line(data, converters: :time)
688
+ # csv # => [2020-05-07 14:59:00 -0500, "x"]
689
+ #
678
690
  # Converter +:all+ converts with both +:date_time+ and +:numeric+.
679
691
  #
680
692
  # As seen above, method #convert adds \converters to a \CSV instance,
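On csv versions that predate the built-in `:time` converter (added in 3.3.1 according to the NEWS entry), a similar effect can probably be had by passing a lambda as the converter. The sketch below is adapted from the `time:` lambda in this diff. The local name `time_converter` is an assumption, and it relies on `CSV::ConverterEncoding` and `CSV::DateTimeMatcher` being available in the installed version.

```ruby
require "csv"
require "time"

# Hypothetical hand-rolled equivalent of the built-in :time converter.
time_converter = lambda do |field|
  begin
    encoded = field.encode(CSV::ConverterEncoding)
    # Only fields that look like a date-time are handed to Time.parse.
    encoded.match?(CSV::DateTimeMatcher) ? Time.parse(encoded) : field
  rescue # encoding conversion or parse errors
    field
  end
end

data = "2020-05-07T14:59:00-05:00,x"
row = CSV.parse_line(data, converters: time_converter)
row[0].class      # => Time
row[0].utc_offset # => -18000
row[1]            # => "x"
```

On 3.3.1 and later, `converters: :time` should make the lambda unnecessary.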
@@ -871,10 +883,10 @@ class CSV
871
883
  # A Regexp used to find and convert some common Date formats.
872
884
  DateMatcher = / \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
873
885
  \d{4}-\d{2}-\d{2} )\z /x
874
- # A Regexp used to find and convert some common DateTime formats.
886
+ # A Regexp used to find and convert some common (Date)Time formats.
875
887
  DateTimeMatcher =
876
888
  / \A(?: (\w+,?\s+)?\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2},?\s+\d{2,4} |
877
- # ISO-8601 and RFC-3339 (space instead of T) recognized by DateTime.parse
889
+ # ISO-8601 and RFC-3339 (space instead of T) recognized by (Date)Time.parse
878
890
  \d{4}-\d{2}-\d{2}
879
891
  (?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
880
892
  )\z /x
@@ -912,6 +924,14 @@ class CSV
912
924
  f
913
925
  end
914
926
  },
927
+ time: lambda { |f|
928
+ begin
929
+ e = f.encode(ConverterEncoding)
930
+ e.match?(DateTimeMatcher) ? Time.parse(e) : f
931
+ rescue # encoding conversion or parse errors
932
+ f
933
+ end
934
+ },
915
935
  all: [:date_time, :numeric],
916
936
  }
917
937
 
@@ -1198,7 +1218,44 @@ class CSV
1198
1218
  # * Argument +in_string_or_io+ must be a \String or an \IO stream.
1199
1219
  # * Argument +out_string_or_io+ must be a \String or an \IO stream.
1200
1220
  # * Arguments <tt>**options</tt> must be keyword options.
1201
- # See {Options for Parsing}[#class-CSV-label-Options+for+Parsing].
1221
+ #
1222
+ # - Each option defined as an {option for parsing}[#class-CSV-label-Options+for+Parsing]
1223
+ # is used for parsing the filter input.
1224
+ # - Each option defined as an {option for generating}[#class-CSV-label-Options+for+Generating]
1225
+ # is used for generating the filter output.
1226
+ #
1227
+ # However, there are three options that may be used for both parsing and generating:
1228
+ # +col_sep+, +quote_char+, and +row_sep+.
1229
+ #
1230
+ # Therefore for method +filter+ (and method +filter+ only),
1231
+ # there are special options that allow these parsing and generating options
1232
+ # to be specified separately:
1233
+ #
1234
+ # - Options +input_col_sep+ and +output_col_sep+
1235
+ # (and their aliases +in_col_sep+ and +out_col_sep+)
1236
+ # specify the column separators for parsing and generating.
1237
+ # - Options +input_quote_char+ and +output_quote_char+
1238
+ # (and their aliases +in_quote_char+ and +out_quote_char+)
1239
+ # specify the quote characters for parsing and generating.
1240
+ # - Options +input_row_sep+ and +output_row_sep+
1241
+ # (and their aliases +in_row_sep+ and +out_row_sep+)
1242
+ # specify the row separators for parsing and generating.
1243
+ #
1244
+ # Example options (for column separators):
1245
+ #
1246
+ # CSV.filter # Default for both parsing and generating.
1247
+ # CSV.filter(in_col_sep: ';') # ';' for parsing, default for generating.
1248
+ # CSV.filter(out_col_sep: '|') # Default for parsing, '|' for generating.
1249
+ # CSV.filter(in_col_sep: ';', out_col_sep: '|') # ';' for parsing, '|' for generating.
1250
+ #
1251
+ # Note that for a special option (e.g., +input_col_sep+)
1252
+ # and its corresponding "regular" option (e.g., +col_sep+),
1253
+ # the two are mutually overriding.
1254
+ #
1255
+ # Another example (possibly surprising):
1256
+ #
1257
+ # CSV.filter(in_col_sep: ';', col_sep: '|') # '|' for both parsing(!) and generating.
1258
+ #
1202
1259
  def filter(input=nil, output=nil, **options)
1203
1260
  # parse options for input, output, or both
1204
1261
  in_options, out_options = Hash.new, {row_sep: InputRecordSeparator.value}
@@ -1508,10 +1565,8 @@ class CSV
1508
1565
 
1509
1566
  #
1510
1567
  # :call-seq:
1511
- # open(file_path, mode = "rb", **options ) -> new_csv
1512
- # open(io, mode = "rb", **options ) -> new_csv
1513
- # open(file_path, mode = "rb", **options ) { |csv| ... } -> object
1514
- # open(io, mode = "rb", **options ) { |csv| ... } -> object
1568
+ # open(path_or_io, mode = "rb", **options ) -> new_csv
1569
+ # open(path_or_io, mode = "rb", **options ) { |csv| ... } -> object
1515
1570
  #
1516
1571
  # possible options elements:
1517
1572
  # keyword form:
@@ -1520,7 +1575,7 @@ class CSV
1520
1575
  # :undef => :replace # replace undefined conversion
1521
1576
  # :replace => string # replacement string ("?" or "\uFFFD" if not specified)
1522
1577
  #
1523
- # * Argument +path+, if given, must be the path to a file.
1578
+ # * Argument +path_or_io+ must be a file path or an \IO stream.
1524
1579
  # :include: ../doc/csv/arguments/io.rdoc
1525
1580
  # * Argument +mode+, if given, must be a \File mode.
1526
1581
  # See {Access Modes}[https://docs.ruby-lang.org/en/master/File.html#class-File-label-Access+Modes].
@@ -1544,6 +1599,9 @@ class CSV
1544
1599
  # path = 't.csv'
1545
1600
  # File.write(path, string)
1546
1601
  #
1602
+ # string_io = StringIO.new
1603
+ # string_io << "foo,0\nbar,1\nbaz,2\n"
1604
+ #
1547
1605
  # ---
1548
1606
  #
1549
1607
  # With no block given, returns a new \CSV object.
@@ -1556,6 +1614,9 @@ class CSV
1556
1614
  # csv = CSV.open(File.open(path))
1557
1615
  # csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1558
1616
  #
1617
+ # Create a \CSV object using a \StringIO:
1618
+ # csv = CSV.open(string_io)
1619
+ # csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1559
1620
  # ---
1560
1621
  #
1561
1622
  # With a block given, calls the block with the created \CSV object;
@@ -1573,15 +1634,25 @@ class CSV
1573
1634
  # Output:
1574
1635
  # #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1575
1636
  #
1637
+ # Using a \StringIO:
1638
+ # csv = CSV.open(string_io) {|csv| p csv}
1639
+ # csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1640
+ # Output:
1641
+ # #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1576
1642
  # ---
1577
1643
  #
1578
1644
  # Raises an exception if the argument is not a \String object or \IO object:
1579
1645
  # # Raises TypeError (no implicit conversion of Symbol into String)
1580
1646
  # CSV.open(:foo)
1581
- def open(filename, mode="r", **options)
1647
+ def open(filename_or_io, mode="r", **options)
1582
1648
  # wrap a File opened with the remaining +args+ with no newline
1583
1649
  # decorator
1584
- file_opts = options.dup
1650
+ file_opts = {}
1651
+ may_enable_bom_detection_automatically(filename_or_io,
1652
+ mode,
1653
+ options,
1654
+ file_opts)
1655
+ file_opts.merge!(options)
1585
1656
  unless file_opts.key?(:newline)
1586
1657
  file_opts[:universal_newline] ||= false
1587
1658
  end
@@ -1590,14 +1661,19 @@ class CSV
1590
1661
  options.delete(:replace)
1591
1662
  options.delete_if {|k, _| /newline\z/.match?(k)}
1592
1663
 
1593
- begin
1594
- f = File.open(filename, mode, **file_opts)
1595
- rescue ArgumentError => e
1596
- raise unless /needs binmode/.match?(e.message) and mode == "r"
1597
- mode = "rb"
1598
- file_opts = {encoding: Encoding.default_external}.merge(file_opts)
1599
- retry
1664
+ if filename_or_io.is_a?(StringIO)
1665
+ f = create_stringio(filename_or_io.string, mode, **file_opts)
1666
+ else
1667
+ begin
1668
+ f = File.open(filename_or_io, mode, **file_opts)
1669
+ rescue ArgumentError => e
1670
+ raise unless /needs binmode/.match?(e.message) and mode == "r"
1671
+ mode = "rb"
1672
+ file_opts = {encoding: Encoding.default_external}.merge(file_opts)
1673
+ retry
1674
+ end
1600
1675
  end
1676
+
1601
1677
  begin
1602
1678
  csv = new(f, **options)
1603
1679
  rescue Exception
@@ -1729,6 +1805,23 @@ class CSV
1729
1805
  # Raises an exception if the argument is not a \String object or \IO object:
1730
1806
  # # Raises NoMethodError (undefined method `close' for :foo:Symbol)
1731
1807
  # CSV.parse(:foo)
1808
+ #
1809
+ # ---
1810
+ #
1811
+ # Please make sure if your text contains \BOM or not. CSV.parse will not remove
1812
+ # \BOM automatically. You might want to remove \BOM before calling CSV.parse :
1813
+ # # remove BOM on calling File.open
1814
+ # File.open(path, encoding: 'bom|utf-8') do |file|
1815
+ # CSV.parse(file, headers: true) do |row|
1816
+ # # you can get value by column name because BOM is removed
1817
+ # p row['Name']
1818
+ # end
1819
+ # end
1820
+ #
1821
+ # Output:
1822
+ # # "foo"
1823
+ # # "bar"
1824
+ # # "baz"
1732
1825
  def parse(str, **options, &block)
1733
1826
  csv = new(str, **options)
1734
1827
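Reviewer note on the hunk above: the added documentation says that `CSV.parse` does not strip a BOM itself and suggests opening the file with `encoding: 'bom|utf-8'`. The following is a minimal, self-contained sketch of that difference, not part of the gem. The sample data, the temporary file name `with_bom.csv`, and the local variable names are assumptions made for illustration. Another hunk in this diff notes that `"bom|utf-8"` may be buggy on Windows, so treat this as a sketch for non-Windows platforms.

```ruby
require "csv"
require "tmpdir"

# Hypothetical sample data that starts with a UTF-8 BOM (U+FEFF).
data = "\uFEFFName,Value\nfoo,0\nbar,1\nbaz,2\n"

names = nil
raw_first_header = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "with_bom.csv") # made-up file name
  File.binwrite(path, data)

  # "bom|utf-8" lets File.open consume the BOM before CSV sees the data,
  # so the first header is plain "Name".
  File.open(path, encoding: "bom|utf-8") do |file|
    names = CSV.parse(file, headers: true).map { |row| row["Name"] }
  end

  # With plain UTF-8 the BOM stays attached to the first header name,
  # so a lookup by "Name" would not match it.
  File.open(path, encoding: "utf-8") do |file|
    raw_first_header = CSV.parse(file, headers: true).headers.first
  end
end

p names                                 # => ["foo", "bar", "baz"]
p raw_first_header.start_with?("\uFEFF") # => true
```

The second `File.open` call is there only to show the failure mode the documentation warns about; in real code only the first form would be needed.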
 
@@ -1862,6 +1955,42 @@ class CSV
1862
1955
  options = default_options.merge(options)
1863
1956
  read(path, **options)
1864
1957
  end
1958
+
1959
+ ON_WINDOWS = /mingw|mswin/.match?(RUBY_PLATFORM)
1960
+ private_constant :ON_WINDOWS
1961
+
1962
+ private
1963
+ def may_enable_bom_detection_automatically(filename_or_io,
1964
+ mode,
1965
+ options,
1966
+ file_opts)
1967
+ if filename_or_io.is_a?(StringIO)
1968
+ # Support to StringIO was dropped for Ruby 2.6 and earlier without BOM support:
1969
+ # https://github.com/ruby/stringio/pull/47
1970
+ return if RUBY_VERSION < "2.7"
1971
+ else
1972
+ # "bom|utf-8" may be buggy on Windows:
1973
+ # https://bugs.ruby-lang.org/issues/20526
1974
+ return if ON_WINDOWS
1975
+ end
1976
+ return unless Encoding.default_external == Encoding::UTF_8
1977
+ return if options.key?(:encoding)
1978
+ return if options.key?(:external_encoding)
1979
+ return if mode.is_a?(String) and mode.include?(":")
1980
+ file_opts[:encoding] = "bom|utf-8"
1981
+ end
1982
+
1983
+ if RUBY_VERSION < "2.7"
1984
+ def create_stringio(str, mode, opts)
1985
+ opts.delete_if {|k, _| k == :universal_newline or DEFAULT_OPTIONS.key?(k)}
1986
+ raise ArgumentError, "Unsupported options parsing StringIO: #{opts.keys}" unless opts.empty?
1987
+ StringIO.new(str, mode)
1988
+ end
1989
+ else
1990
+ def create_stringio(str, mode, opts)
1991
+ StringIO.new(str, mode, **opts)
1992
+ end
1993
+ end
1865
1994
  end
1866
1995
 
1867
1996
  # :call-seq:
@@ -2000,6 +2129,12 @@ class CSV
2000
2129
  writer if @writer_options[:write_headers]
2001
2130
  end
2002
2131
 
2132
+ class TSV < CSV
2133
+ def initialize(data, **options)
2134
+ super(data, **({col_sep: "\t"}.merge(options)))
2135
+ end
2136
+ end
2137
+
2003
2138
  # :call-seq:
2004
2139
  # csv.col_sep -> string
2005
2140
  #
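Reviewer note on the `TSV` hunk above: the new subclass merges caller options over a `col_sep: "\t"` default, so an explicit `col_sep` should still take precedence. The sketch below mirrors that pattern in a stand-alone class so it does not depend on which csv version is installed. The class name `TabSeparatedSketch` and the sample strings are hypothetical and are not part of the gem.

```ruby
require "csv"

# Hypothetical stand-in that copies the merge order used by the TSV class
# in the diff: the tab default comes first, caller options override it.
class TabSeparatedSketch < CSV
  def initialize(data, **options)
    super(data, **({col_sep: "\t"}.merge(options)))
  end
end

# Default behaviour: fields are split on tabs.
rows = TabSeparatedSketch.new("a\tb\n1\t2\n").read
p rows # => [["a", "b"], ["1", "2"]]

# An explicit col_sep wins over the default because of the merge order.
overridden = TabSeparatedSketch.new("a;b\n", col_sep: ";").read
p overridden # => [["a", "b"]]

# Class-level helpers such as parse build the object through new,
# so they pick up the tab default as well.
parsed = TabSeparatedSketch.parse("x\ty\n")
p parsed # => [["x", "y"]]
```

Putting the default on the left side of `merge` is what keeps the subclass a convenience rather than a restriction: callers can still pass any option `CSV.new` accepts.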
metadata CHANGED
@@ -1,72 +1,15 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csv
3
3
  version: !ruby/object:Gem::Version
4
- version: 3.2.8
4
+ version: 3.3.5
5
5
  platform: ruby
6
6
  authors:
7
7
  - James Edward Gray II
8
8
  - Kouhei Sutou
9
- autorequire:
10
9
  bindir: bin
11
10
  cert_chain: []
12
- date: 2023-11-08 00:00:00.000000000 Z
13
- dependencies:
14
- - !ruby/object:Gem::Dependency
15
- name: bundler
16
- requirement: !ruby/object:Gem::Requirement
17
- requirements:
18
- - - ">="
19
- - !ruby/object:Gem::Version
20
- version: '0'
21
- type: :development
22
- prerelease: false
23
- version_requirements: !ruby/object:Gem::Requirement
24
- requirements:
25
- - - ">="
26
- - !ruby/object:Gem::Version
27
- version: '0'
28
- - !ruby/object:Gem::Dependency
29
- name: rake
30
- requirement: !ruby/object:Gem::Requirement
31
- requirements:
32
- - - ">="
33
- - !ruby/object:Gem::Version
34
- version: '0'
35
- type: :development
36
- prerelease: false
37
- version_requirements: !ruby/object:Gem::Requirement
38
- requirements:
39
- - - ">="
40
- - !ruby/object:Gem::Version
41
- version: '0'
42
- - !ruby/object:Gem::Dependency
43
- name: benchmark_driver
44
- requirement: !ruby/object:Gem::Requirement
45
- requirements:
46
- - - ">="
47
- - !ruby/object:Gem::Version
48
- version: '0'
49
- type: :development
50
- prerelease: false
51
- version_requirements: !ruby/object:Gem::Requirement
52
- requirements:
53
- - - ">="
54
- - !ruby/object:Gem::Version
55
- version: '0'
56
- - !ruby/object:Gem::Dependency
57
- name: test-unit
58
- requirement: !ruby/object:Gem::Requirement
59
- requirements:
60
- - - ">="
61
- - !ruby/object:Gem::Version
62
- version: 3.4.8
63
- type: :development
64
- prerelease: false
65
- version_requirements: !ruby/object:Gem::Requirement
66
- requirements:
67
- - - ">="
68
- - !ruby/object:Gem::Version
69
- version: 3.4.8
11
+ date: 1980-01-02 00:00:00.000000000 Z
12
+ dependencies: []
70
13
  description: The CSV library provides a complete interface to CSV files and data.
71
14
  It offers tools to enable you to read and write to and from Strings or IO objects,
72
15
  as needed.
@@ -127,8 +70,8 @@ homepage: https://github.com/ruby/csv
127
70
  licenses:
128
71
  - Ruby
129
72
  - BSD-2-Clause
130
- metadata: {}
131
- post_install_message:
73
+ metadata:
74
+ changelog_uri: https://github.com/ruby/csv/releases/tag/v3.3.5
132
75
  rdoc_options:
133
76
  - "--main"
134
77
  - README.md
@@ -145,8 +88,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
145
88
  - !ruby/object:Gem::Version
146
89
  version: '0'
147
90
  requirements: []
148
- rubygems_version: 3.5.0.dev
149
- signing_key:
91
+ rubygems_version: 3.6.7
150
92
  specification_version: 4
151
93
  summary: CSV Reading and Writing
152
94
  test_files: []