csv 3.3.0 → 3.3.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 4be2aab05afe081e964c806337cab1e8763532ecf66ef29bb8f6854cd65b29bf
4
- data.tar.gz: 716f28f2377749b95d8f5f9f19a7cba50499e4ff46c4978efdc002ba25162475
3
+ metadata.gz: 2b09d2eaa0c8e6c8c385fa6f958d93d893ee7c3441cc83248d07562cd719d960
4
+ data.tar.gz: 7e2aff2545b8bbeaefd4adb73627e7345bec284b18d431f3ab41f5a136f479a1
5
5
  SHA512:
6
- metadata.gz: bc67687257ea7790091d9755199624dced05c3ea07a9afc47b86cc49dac1dfce77b884e1c2397de71f1f2de22720778135373652d7af05acb990814e6fa2d940
7
- data.tar.gz: fce6230a6fdcff2ae8044a0f374ea71403cdf1f59d2024bb64a87ade52a897a907dfcb271b39d1ad81299cbd9eebf2ca43008dec99d86648550f73584d8c468c
6
+ metadata.gz: 88525c73bd61ae40b2fe4314a402210acc2ad4993def484e7531d6356d1652f80b2b5a1f72be1f4b4a38350d4c25d00ac43b3a8b1b41817b0c95bc2df8f5c082
7
+ data.tar.gz: 28322a68f915cee37aebbcd2fbaee39293d289760c878f3b6bcfe237a9247fd319368d1c54d9170f25929625b71c89475b016ccbdb80afc88a3d0926784b6a7e
data/NEWS.md CHANGED
@@ -1,5 +1,88 @@
1
1
  # News
2
2
 
3
+ ## 3.3.3 - 2025-03-20
4
+
5
+ ### Improvements
6
+
7
+ * `csv-filter`: Add an experimental command line tool to filter a CSV.
8
+ * Patch by Burdette Lamar
9
+
10
+ ### Fixes
11
+
12
+ * Fixed wrong EOF detection for `ARGF`
13
+ * GH-328
14
+ * Reported by Takeshi Nishimatsu
15
+
16
+ * Fixed a regression where `CSV.open` rejected an integer mode.
17
+ * GH-336
18
+ * Reported by Dave Burgess
19
+
20
+ ### Thanks
21
+
22
+ * Takeshi Nishimatsu
23
+
24
+ * Burdette Lamar
25
+
26
+ * Dave Burgess
27
+
28
+ ## 3.3.2 - 2024-12-21
29
+
30
+ ### Fixes
31
+
32
+ * Fixed a parse bug with a quoted line with `col_sep` and an empty
33
+ line. This was introduced in 3.3.1.
34
+ * GH-324
35
+ * Reported by stoodfarback
36
+
37
+ ### Thanks
38
+
39
+ * stoodfarback
40
+
41
+ ## 3.3.1 - 2024-12-15
42
+
43
+ ### Improvements
44
+
45
+ * `CSV.open`: Changed to detect BOM by default. Note that this isn't
46
+ enabled on Windows because Ruby may have a bug. See also:
47
+ https://bugs.ruby-lang.org/issues/20526
48
+ * GH-301
49
+ * Reported by Junichi Ito
50
+
51
+ * Improved performance.
52
+ * GH-311
53
+ * GH-312
54
+ * Patch by Vladimir Kochnev
55
+
56
+ * `CSV.open`: Added support for `StringIO` as an input.
57
+ * GH-300
58
+ * GH-302
59
+ * Patch by Marcelo
60
+
61
+ * Added a built-in time converter. You can use it by `converters:
62
+ :time`.
63
+ * GH-313
64
+ * Patch by Bart de Water
65
+
66
+ * Added `CSV::TSV` for tab-separated values.
67
+ * GH-272
68
+ * GH-319
69
+ * Reported by kojix2
70
+ * Patch by Jas
71
+
72
+ ### Thanks
73
+
74
+ * Junichi Ito
75
+
76
+ * Vladimir Kochnev
77
+
78
+ * Marcelo
79
+
80
+ * Bart de Water
81
+
82
+ * kojix2
83
+
84
+ * Jas
85
+
3
86
  ## 3.3.0 - 2024-03-22
4
87
 
5
88
  ### Fixes
data/bin/csv-filter ADDED
@@ -0,0 +1,59 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'optparse'
4
+ require 'csv'
5
+
6
+ options = {}
7
+
8
+ parser = OptionParser.new
9
+
10
+ parser.version = CSV::VERSION
11
+ parser.banner = <<-BANNER
12
+ Usage: #{parser.program_name} [options]
13
+
14
+ Reads and parses the CSV text content of the standard input per the given input options.
15
+ From that content, generates CSV text per the given output options
16
+ and writes that text to the standard output.
17
+ BANNER
18
+
19
+
20
+ parser.on('--input-col-sep=SEPARATOR',
21
+ 'Input column separator string.') do |value|
22
+ options[:input_col_sep] = value
23
+ end
24
+
25
+ parser.on('--input-quote-char=SEPARATOR',
26
+ 'Input quote character.') do |value|
27
+ options[:input_quote_char] = value
28
+ end
29
+
30
+ parser.on('--input-row-sep=SEPARATOR',
31
+ 'Input row separator string.') do |value|
32
+ options[:input_row_sep] = value
33
+ end
34
+
35
+ parser.on('--output-col-sep=SEPARATOR',
36
+ 'Output column separator string.') do |value|
37
+ options[:output_col_sep] = value
38
+ end
39
+
40
+ parser.on('--output-row-sep=SEPARATOR',
41
+ 'Output row separator string.') do |value|
42
+ options[:output_row_sep] = value
43
+ end
44
+
45
+ parser.on('-r', '--row-sep=SEPARATOR',
46
+ 'Row separator string.') do |value|
47
+ options[:row_sep] = value
48
+ end
49
+
50
+ begin
51
+ parser.parse!
52
+ rescue OptionParser::InvalidOption
53
+ $stderr.puts($!.message)
54
+ $stderr.puts(parser)
55
+ exit(false)
56
+ end
57
+
58
+ CSV.filter(**options) do |row|
59
+ end
@@ -11,16 +11,20 @@ All code snippets on this page assume that the following has been executed:
11
11
 
12
12
  - {Source and Output Formats}[#label-Source+and+Output+Formats]
13
13
  - {Filtering String to String}[#label-Filtering+String+to+String]
14
- - {Recipe: Filter String to String with Headers}[#label-Recipe-3A+Filter+String+to+String+with+Headers]
14
+ - {Recipe: Filter String to String parsing Headers}[#label-Recipe-3A+Filter+String+to+String+parsing+Headers]
15
+ - {Recipe: Filter String to String parsing and writing Headers}[#label-Recipe-3A+Filter+String+to+String+parsing+and+writing+Headers]
15
16
  - {Recipe: Filter String to String Without Headers}[#label-Recipe-3A+Filter+String+to+String+Without+Headers]
16
17
  - {Filtering String to IO Stream}[#label-Filtering+String+to+IO+Stream]
17
- - {Recipe: Filter String to IO Stream with Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+with+Headers]
18
+ - {Recipe: Filter String to IO Stream parsing Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+parsing+Headers]
19
+ - {Recipe: Filter String to IO Stream parsing and writing Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+parsing+and+writing+Headers]
18
20
  - {Recipe: Filter String to IO Stream Without Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+Without+Headers]
19
21
  - {Filtering IO Stream to String}[#label-Filtering+IO+Stream+to+String]
20
- - {Recipe: Filter IO Stream to String with Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+with+Headers]
22
+ - {Recipe: Filter IO Stream to String parsing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+parsing+Headers]
23
+ - {Recipe: Filter IO Stream to String parsing and writing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+parsing+and+writing+Headers]
21
24
  - {Recipe: Filter IO Stream to String Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+Without+Headers]
22
25
  - {Filtering IO Stream to IO Stream}[#label-Filtering+IO+Stream+to+IO+Stream]
23
- - {Recipe: Filter IO Stream to IO Stream with Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+with+Headers]
26
+ - {Recipe: Filter IO Stream to IO Stream parsing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+parsing+Headers]
27
+ - {Recipe: Filter IO Stream to IO Stream parsing and writing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+parsing+and+writing+Headers]
24
28
  - {Recipe: Filter IO Stream to IO Stream Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+Without+Headers]
25
29
 
26
30
  === Source and Output Formats
@@ -33,14 +37,27 @@ The input and output \CSV data may be any mixture of \Strings and \IO streams.
33
37
 
34
38
  You can filter one \String to another, with or without headers.
35
39
 
36
- ===== Recipe: Filter \String to \String with Headers
40
+ ===== Recipe: Filter \String to \String parsing Headers
37
41
 
38
42
  Use class method CSV.filter with option +headers+ to filter a \String to another \String:
39
43
  in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
40
44
  out_string = ''
41
45
  CSV.filter(in_string, out_string, headers: true) do |row|
42
- row[0] = row[0].upcase
43
- row[1] *= 4
46
+ row['Name'] = row['Name'].upcase
47
+ row['Value'] *= 4
48
+ end
49
+ out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
50
+
51
+ ===== Recipe: Filter \String to \String parsing and writing Headers
52
+
53
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter a \String to another \String including header row:
54
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
55
+ out_string = ''
56
+ CSV.filter(in_string, out_string, headers: true, out_write_headers: true) do |row|
57
+ unless row.is_a?(Array)
58
+ row['Name'] = row['Name'].upcase
59
+ row['Value'] *= 4
60
+ end
44
61
  end
45
62
  out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
46
63
 
@@ -59,15 +76,30 @@ Use class method CSV.filter without option +headers+ to filter a \String to anot
59
76
 
60
77
  You can filter a \String to an \IO stream, with or without headers.
61
78
 
62
- ===== Recipe: Filter \String to \IO Stream with Headers
79
+ ===== Recipe: Filter \String to \IO Stream parsing Headers
63
80
 
64
81
  Use class method CSV.filter with option +headers+ to filter a \String to an \IO stream:
65
82
  in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
66
83
  path = 't.csv'
67
84
  File.open(path, 'w') do |out_io|
68
85
  CSV.filter(in_string, out_io, headers: true) do |row|
69
- row[0] = row[0].upcase
70
- row[1] *= 4
86
+ row['Name'] = row['Name'].upcase
87
+ row['Value'] *= 4
88
+ end
89
+ end
90
+ p File.read(path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
91
+
92
+ ===== Recipe: Filter \String to \IO Stream parsing and writing Headers
93
+
94
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter a \String to an \IO stream including header row:
95
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
96
+ path = 't.csv'
97
+ File.open(path, 'w') do |out_io|
98
+ CSV.filter(in_string, out_io, headers: true, out_write_headers: true) do |row|
99
+ unless row.is_a?(Array)
100
+ row['Name'] = row['Name'].upcase
101
+ row['Value'] *= 4
102
+ end
71
103
  end
72
104
  end
73
105
  p File.read(path) # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
@@ -89,17 +121,34 @@ Use class method CSV.filter without option +headers+ to filter a \String to an \
89
121
 
90
122
  You can filter an \IO stream to a \String, with or without headers.
91
123
 
92
- ===== Recipe: Filter \IO Stream to \String with Headers
124
+ ===== Recipe: Filter \IO Stream to \String parsing Headers
93
125
 
94
126
  Use class method CSV.filter with option +headers+ to filter an \IO stream to a \String:
95
127
  in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
96
128
  path = 't.csv'
97
129
  File.write(path, in_string)
98
130
  out_string = ''
99
- File.open(path, headers: true) do |in_io|
131
+ File.open(path) do |in_io|
100
132
  CSV.filter(in_io, out_string, headers: true) do |row|
101
- row[0] = row[0].upcase
102
- row[1] *= 4
133
+ row['Name'] = row['Name'].upcase
134
+ row['Value'] *= 4
135
+ end
136
+ end
137
+ out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
138
+
139
+ ===== Recipe: Filter \IO Stream to \String parsing and writing Headers
140
+
141
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter an \IO stream to a \String including header row:
142
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
143
+ path = 't.csv'
144
+ File.write(path, in_string)
145
+ out_string = ''
146
+ File.open(path) do |in_io|
147
+ CSV.filter(in_io, out_string, headers: true, out_write_headers: true) do |row|
148
+ unless row.is_a?(Array)
149
+ row['Name'] = row['Name'].upcase
150
+ row['Value'] *= 4
151
+ end
103
152
  end
104
153
  end
105
154
  out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
@@ -123,7 +172,7 @@ Use class method CSV.filter without option +headers+ to filter an \IO stream to
123
172
 
124
173
  You can filter an \IO stream to another \IO stream, with or without headers.
125
174
 
126
- ===== Recipe: Filter \IO Stream to \IO Stream with Headers
175
+ ===== Recipe: Filter \IO Stream to \IO Stream parsing Headers
127
176
 
128
177
  Use class method CSV.filter with option +headers+ to filter an \IO stream to another \IO stream:
129
178
  in_path = 't.csv'
@@ -133,8 +182,27 @@ Use class method CSV.filter with option +headers+ to filter an \IO stream to ano
133
182
  File.open(in_path) do |in_io|
134
183
  File.open(out_path, 'w') do |out_io|
135
184
  CSV.filter(in_io, out_io, headers: true) do |row|
136
- row[0] = row[0].upcase
137
- row[1] *= 4
185
+ row['Name'] = row['Name'].upcase
186
+ row['Value'] *= 4
187
+ end
188
+ end
189
+ end
190
+ p File.read(out_path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
191
+
192
+ ===== Recipe: Filter \IO Stream to \IO Stream parsing and writing Headers
193
+
194
+ Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter an \IO stream to another \IO stream including header row:
195
+ in_path = 't.csv'
196
+ in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
197
+ File.write(in_path, in_string)
198
+ out_path = 'u.csv'
199
+ File.open(in_path) do |in_io|
200
+ File.open(out_path, 'w') do |out_io|
201
+ CSV.filter(in_io, out_io, headers: true, out_write_headers: true) do |row|
202
+ unless row.is_a?(Array)
203
+ row['Name'] = row['Name'].upcase
204
+ row['Value'] *= 4
205
+ end
138
206
  end
139
207
  end
140
208
  end
@@ -165,7 +165,7 @@ This example defines and uses two custom write converters to strip and upcase ge
165
165
  === RFC 4180 Compliance
166
166
 
167
167
  By default, \CSV generates data that is compliant with
168
- {RFC 4180}[https://tools.ietf.org/html/rfc4180]
168
+ {RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180]
169
169
  with respect to:
170
170
  - Column separator.
171
171
  - Quote character.
@@ -45,6 +45,7 @@ All code snippets on this page assume that the following has been executed:
45
45
  - {Recipe: Convert Fields to Numerics}[#label-Recipe-3A+Convert+Fields+to+Numerics]
46
46
  - {Recipe: Convert Fields to Dates}[#label-Recipe-3A+Convert+Fields+to+Dates]
47
47
  - {Recipe: Convert Fields to DateTimes}[#label-Recipe-3A+Convert+Fields+to+DateTimes]
48
+ - {Recipe: Convert Fields to Times}[#label-Recipe-3A+Convert+Fields+to+Times]
48
49
  - {Recipe: Convert Assorted Fields to Objects}[#label-Recipe-3A+Convert+Assorted+Fields+to+Objects]
49
50
  - {Recipe: Convert Fields to Other Objects}[#label-Recipe-3A+Convert+Fields+to+Other+Objects]
50
51
  - {Recipe: Filter Field Strings}[#label-Recipe-3A+Filter+Field+Strings]
@@ -110,7 +111,7 @@ You can parse \CSV data from a \File, with or without headers.
110
111
 
111
112
  ===== Recipe: Parse from \File with Headers
112
113
 
113
- Use instance method CSV#read with option +headers+ to read a file all at once:
114
+ Use class method CSV.read with option +headers+ to read a file all at once:
114
115
  string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
115
116
  path = 't.csv'
116
117
  File.write(path, string)
@@ -191,7 +192,7 @@ Output:
191
192
  === RFC 4180 Compliance
192
193
 
193
194
  By default, \CSV parses data that is compliant with
194
- {RFC 4180}[https://tools.ietf.org/html/rfc4180]
195
+ {RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180]
195
196
  with respect to:
196
197
  - Row separator.
197
198
  - Column separator.
@@ -339,6 +340,7 @@ There are built-in field converters for converting to objects of certain classes
339
340
  - \Integer
340
341
  - \Date
341
342
  - \DateTime
343
+ - \Time
342
344
 
343
345
  Other built-in field converters include:
344
346
  - +:numeric+: converts to \Integer and \Float.
@@ -381,6 +383,13 @@ Convert fields to \DateTime objects using built-in converter +:date_time+:
381
383
  parsed = CSV.parse(source, headers: true, converters: :date_time)
382
384
  parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime]
383
385
 
386
+ ===== Recipe: Convert Fields to Times
387
+
388
+ Convert fields to \Time objects using built-in converter +:time+:
389
+ source = "Name,Time\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
390
+ parsed = CSV.parse(source, headers: true, converters: :time)
391
+ parsed.map {|row| row['Time'].class} # => [Time, Time, Time]
392
+
384
393
  ===== Recipe: Convert Assorted Fields to Objects
385
394
 
386
395
  Convert assorted fields to objects using built-in converter +:all+:
@@ -542,4 +551,4 @@ Output:
542
551
  #<struct CSV::FieldInfo index=0, line=2, header=nil>
543
552
  #<struct CSV::FieldInfo index=1, line=2, header=nil>
544
553
  #<struct CSV::FieldInfo index=0, line=3, header=nil>
545
- #<struct CSV::FieldInfo index=1, line=3, header=nil>
554
+ #<struct CSV::FieldInfo index=1, line=3, header=nil>
@@ -1,4 +1,4 @@
1
- class Array # :nodoc:
1
+ class Array
2
2
  # Equivalent to CSV::generate_line(self, options)
3
3
  #
4
4
  # ["CSV", "data"].to_csv
@@ -1,4 +1,4 @@
1
- class String # :nodoc:
1
+ class String
2
2
  # Equivalent to CSV::parse_line(self, options)
3
3
  #
4
4
  # "CSV,data".parse_csv
@@ -4,6 +4,13 @@ class CSV
4
4
  # Note: Don't use this class directly. This is an internal class.
5
5
  class FieldsConverter
6
6
  include Enumerable
7
+
8
+ NO_QUOTED_FIELDS = [] # :nodoc:
9
+ def NO_QUOTED_FIELDS.[](_index)
10
+ false
11
+ end
12
+ NO_QUOTED_FIELDS.freeze
13
+
7
14
  #
8
15
  # A CSV::FieldsConverter is a data structure for storing the
9
16
  # fields converter properties to be passed as a parameter
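The NO_QUOTED_FIELDS constant added above replaces the per-row `[false] * row.size` allocation with one shared frozen object. A minimal standalone sketch of the same idea follows; the name NEVER_QUOTED is hypothetical and used only for illustration, and nothing here touches the library itself.

```ruby
# Hypothetical name; the library's constant is FieldsConverter::NO_QUOTED_FIELDS.
NEVER_QUOTED = []

# Singleton method: every index lookup answers false, so the empty array
# behaves like an arbitrarily long array of false values.
def NEVER_QUOTED.[](_index)
  false
end
NEVER_QUOTED.freeze

# What the previous code allocated for each three-field row.
per_row = [false] * 3

# The shared object gives the same answer at every index the row would use.
same_answers = (0...3).all? { |i| NEVER_QUOTED[i] == per_row[i] }
puts same_answers
```

Because the object is frozen and answers every lookup the same way, it can safely serve as a default argument, which is how `convert` and `emit_row` use the library constant in this diff.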
@@ -44,7 +51,7 @@ class CSV
44
51
  @converters.empty?
45
52
  end
46
53
 
47
- def convert(fields, headers, lineno, quoted_fields)
54
+ def convert(fields, headers, lineno, quoted_fields=NO_QUOTED_FIELDS)
48
55
  return fields unless need_convert?
49
56
 
50
57
  fields.collect.with_index do |field, index|
data/lib/csv/parser.rb CHANGED
@@ -18,6 +18,19 @@ class CSV
18
18
  # into your Encoding.
19
19
  #
20
20
 
21
+ class << self
22
+ ARGF_OBJECT_ID = ARGF.object_id
23
+ # Convenience method to check whether the given input reached EOF
24
+ # or not.
25
+ def eof?(input)
26
+ # We can't use input != ARGF in Ractor because ARGF isn't a
27
+ # shareable object.
28
+ input.object_id != ARGF_OBJECT_ID and
29
+ input.respond_to?(:eof) and
30
+ input.eof?
31
+ end
32
+ end
33
+
21
34
  # Raised when encoding is invalid.
22
35
  class InvalidEncoding < StandardError
23
36
  end
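The EOF helper introduced in this hunk can be approximated outside the library. The sketch below uses a hypothetical method name, `reached_eof?`, as a stand-in for `CSV::Parser.eof?`. It assumes the same three conditions shown in the diff: the input is not ARGF, it exposes an EOF query, and that query returns true.

```ruby
require "stringio"

# Captured once so the check below compares ids and never reads from ARGF.
ARGF_ID = ARGF.object_id

# Hypothetical stand-in for CSV::Parser.eof?: reports EOF only for inputs
# that are not ARGF and that actually expose an eof? method.
def reached_eof?(input)
  input.object_id != ARGF_ID &&
    input.respond_to?(:eof?) &&
    input.eof?
end

io = StringIO.new("a,b\n")
before = reached_eof?(io)          # nothing has been read yet
io.read
after = reached_eof?(io)           # the whole string was consumed
no_eof = reached_eof?(Object.new)  # plain objects have no eof? method
argf = reached_eof?(ARGF)          # ARGF is always treated as "not at EOF"
```

Comparing object ids rather than the objects themselves follows the comment in the diff: ARGF is not shareable across Ractors, while an Integer id is.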
@@ -312,7 +325,7 @@ class CSV
312
325
  raise InvalidEncoding unless chunk.valid_encoding?
313
326
  # trace(__method__, :chunk, chunk)
314
327
  @scanner = StringScanner.new(chunk)
315
- if input.respond_to?(:eof?) and input.eof?
328
+ if Parser.eof?(input)
316
329
  @inputs.shift
317
330
  @last_scanner = @inputs.empty?
318
331
  end
@@ -409,13 +422,7 @@ class CSV
409
422
 
410
423
  begin
411
424
  @scanner ||= build_scanner
412
- if quote_character.nil?
413
- parse_no_quote(&block)
414
- elsif @need_robust_parsing
415
- parse_quotable_robust(&block)
416
- else
417
- parse_quotable_loose(&block)
418
- end
425
+ __send__(@parse_method, &block)
419
426
  rescue InvalidEncoding
420
427
  if @scanner
421
428
  ignore_broken_line
@@ -459,7 +466,6 @@ class CSV
459
466
  end
460
467
 
461
468
  def prepare_variable
462
- @need_robust_parsing = false
463
469
  @encoding = @options[:encoding]
464
470
  liberal_parsing = @options[:liberal_parsing]
465
471
  if liberal_parsing
@@ -472,7 +478,6 @@ class CSV
472
478
  @double_quote_outside_quote = false
473
479
  @backslash_quote = false
474
480
  end
475
- @need_robust_parsing = true
476
481
  else
477
482
  @liberal_parsing = false
478
483
  @backslash_quote = false
@@ -554,7 +559,6 @@ class CSV
554
559
  @rstrip_value = Regexp.new(@escaped_strip +
555
560
  "+\\z".encode(@encoding))
556
561
  end
557
- @need_robust_parsing = true
558
562
  elsif @strip
559
563
  strip_values = " \t\f\v"
560
564
  @escaped_strip = strip_values.encode(@encoding)
@@ -562,7 +566,6 @@ class CSV
562
566
  @strip_value = Regexp.new("[#{strip_values}]+".encode(@encoding))
563
567
  @rstrip_value = Regexp.new("[#{strip_values}]+\\z".encode(@encoding))
564
568
  end
565
- @need_robust_parsing = true
566
569
  end
567
570
  end
568
571
 
@@ -767,7 +770,7 @@ class CSV
767
770
  case headers
768
771
  when Array
769
772
  @raw_headers = headers
770
- quoted_fields = [false] * @raw_headers.size
773
+ quoted_fields = FieldsConverter::NO_QUOTED_FIELDS
771
774
  @use_headers = true
772
775
  when String
773
776
  @raw_headers, quoted_fields = parse_headers(headers)
@@ -808,6 +811,13 @@ class CSV
808
811
 
809
812
  def prepare_parser
810
813
  @may_quoted = may_quoted?
814
+ if @quote_character.nil?
815
+ @parse_method = :parse_no_quote
816
+ elsif @liberal_parsing or @strip
817
+ @parse_method = :parse_quotable_robust
818
+ else
819
+ @parse_method = :parse_quotable_loose
820
+ end
811
821
  end
812
822
 
813
823
  def may_quoted?
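The change above picks the parse method once in `prepare_parser` and later calls it through `__send__`, switching permanently to the robust parser when the fast path cannot handle a line. The following is a hypothetical miniature of that dispatch pattern, not the library's parser: the class name, the `calls` log, and the simplistic method bodies are all assumptions made for illustration.

```ruby
# Hypothetical miniature of the dispatch pattern; not the library's parser.
class TinyParser
  attr_reader :calls

  def initialize(quote_character: '"', liberal_parsing: false, strip: false)
    @calls = []
    # Decide once, up front, instead of re-checking the flags on every parse.
    @parse_method =
      if quote_character.nil?
        :parse_no_quote
      elsif liberal_parsing || strip
        :parse_quotable_robust
      else
        :parse_quotable_loose
      end
  end

  def parse(line)
    # __send__ can invoke the private parse_* methods by name.
    __send__(@parse_method, line)
  end

  private

  def parse_no_quote(line)
    @calls << :no_quote
    line.split(",", -1)
  end

  def parse_quotable_loose(line)
    if line.include?("\n")
      # The fast path cannot handle this line: switch for good.
      @parse_method = :parse_quotable_robust
      return parse_quotable_robust(line)
    end
    @calls << :loose
    line.split(",", -1).map { |f| f.delete_prefix('"').delete_suffix('"') }
  end

  def parse_quotable_robust(line)
    @calls << :robust
    line.split(",", -1) # placeholder body; a real robust parser scans quotes
  end
end

parser = TinyParser.new
parser.parse("a,b")
parser.parse("a,\"b\nc\"")
parser.parse("a,b")
p parser.calls
```

Once the fallback has happened, later calls go straight to the robust method, which mirrors how the diff assigns `@parse_method = :parse_quotable_robust` before returning.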
@@ -872,10 +882,7 @@ class CSV
872
882
  string = nil
873
883
  if @samples.empty? and @input.is_a?(StringIO)
874
884
  string = @input.read
875
- elsif @samples.size == 1 and
876
- @input != ARGF and
877
- @input.respond_to?(:eof?) and
878
- @input.eof?
885
+ elsif @samples.size == 1 and Parser.eof?(@input)
879
886
  string = @samples[0]
880
887
  end
881
888
  if string
@@ -944,11 +951,9 @@ class CSV
944
951
  if line.empty?
945
952
  next if @skip_blanks
946
953
  row = []
947
- quoted_fields = []
948
954
  else
949
955
  line = strip_value(line)
950
956
  row = line.split(@split_column_separator, -1)
951
- quoted_fields = [false] * row.size
952
957
  if @max_field_size
953
958
  row.each do |column|
954
959
  validate_field_size(column)
@@ -962,7 +967,7 @@ class CSV
962
967
  end
963
968
  end
964
969
  @last_line = original_line
965
- emit_row(row, quoted_fields, &block)
970
+ emit_row(row, &block)
966
971
  end
967
972
  end
968
973
 
@@ -984,10 +989,10 @@ class CSV
984
989
  next
985
990
  end
986
991
  row = []
987
- quoted_fields = []
992
+ quoted_fields = FieldsConverter::NO_QUOTED_FIELDS
988
993
  elsif line.include?(@cr) or line.include?(@lf)
989
994
  @scanner.keep_back
990
- @need_robust_parsing = true
995
+ @parse_method = :parse_quotable_robust
991
996
  return parse_quotable_robust(&block)
992
997
  else
993
998
  row = line.split(@split_column_separator, -1)
@@ -1011,7 +1016,7 @@ class CSV
1011
1016
  row[i] = column[1..-2]
1012
1017
  else
1013
1018
  @scanner.keep_back
1014
- @need_robust_parsing = true
1019
+ @parse_method = :parse_quotable_robust
1015
1020
  return parse_quotable_robust(&block)
1016
1021
  end
1017
1022
  validate_field_size(row[i])
@@ -1046,13 +1051,13 @@ class CSV
1046
1051
  quoted_fields << @quoted_column_value
1047
1052
  elsif parse_row_end
1048
1053
  if row.empty? and value.nil?
1049
- emit_row([], [], &block) unless @skip_blanks
1054
+ emit_row([], &block) unless @skip_blanks
1050
1055
  else
1051
1056
  row << value
1052
1057
  quoted_fields << @quoted_column_value
1053
1058
  emit_row(row, quoted_fields, &block)
1054
1059
  row = []
1055
- quoted_fields = []
1060
+ quoted_fields.clear
1056
1061
  end
1057
1062
  skip_needless_lines
1058
1063
  start_row
@@ -1257,7 +1262,7 @@ class CSV
1257
1262
  @scanner.keep_start
1258
1263
  end
1259
1264
 
1260
- def emit_row(row, quoted_fields, &block)
1265
+ def emit_row(row, quoted_fields=FieldsConverter::NO_QUOTED_FIELDS, &block)
1261
1266
  @lineno += 1
1262
1267
 
1263
1268
  raw_row = row
data/lib/csv/version.rb CHANGED
@@ -2,5 +2,5 @@
2
2
 
3
3
  class CSV
4
4
  # The version of the installed library.
5
- VERSION = "3.3.0"
5
+ VERSION = "3.3.3"
6
6
  end
data/lib/csv/writer.rb CHANGED
@@ -40,8 +40,7 @@ class CSV
40
40
  @lineno += 1
41
41
 
42
42
  if @fields_converter
43
- quoted_fields = [false] * row.size
44
- row = @fields_converter.convert(row, nil, lineno, quoted_fields)
43
+ row = @fields_converter.convert(row, nil, lineno)
45
44
  end
46
45
 
47
46
  i = -1
data/lib/csv.rb CHANGED
@@ -91,6 +91,7 @@
91
91
 
92
92
  require "forwardable"
93
93
  require "date"
94
+ require "time"
94
95
  require "stringio"
95
96
 
96
97
  require_relative "csv/fields_converter"
@@ -521,6 +522,7 @@ require_relative "csv/writer"
521
522
  # - <tt>:float</tt>: converts each \String-embedded float into a true \Float.
522
523
  # - <tt>:date</tt>: converts each \String-embedded date into a true \Date.
523
524
  # - <tt>:date_time</tt>: converts each \String-embedded date-time into a true \DateTime
525
+ # - <tt>:time</tt>: converts each \String-embedded time into a true \Time
524
526
  # .
525
527
  # This example creates a converter proc, then stores it:
526
528
  # strip_converter = proc {|field| field.strip }
@@ -631,6 +633,7 @@ require_relative "csv/writer"
631
633
  # [:numeric, [:integer, :float]]
632
634
  # [:date, Proc]
633
635
  # [:date_time, Proc]
636
+ # [:time, Proc]
634
637
  # [:all, [:date_time, :numeric]]
635
638
  #
636
639
  # Each of these converters transcodes values to UTF-8 before attempting conversion.
@@ -675,6 +678,15 @@ require_relative "csv/writer"
675
678
  # csv = CSV.parse_line(data, converters: :date_time)
676
679
  # csv # => [#<DateTime: 2020-05-07T14:59:00-05:00 ((2458977j,71940s,0n),-18000s,2299161j)>, "x"]
677
680
  #
681
+ # Converter +time+ converts each field that Time::parse accepts:
682
+ # data = '2020-05-07T14:59:00-05:00,x'
683
+ # # Without the converter
684
+ # csv = CSV.parse_line(data)
685
+ # csv # => ["2020-05-07T14:59:00-05:00", "x"]
686
+ # # With the converter
687
+ # csv = CSV.parse_line(data, converters: :time)
688
+ # csv # => [2020-05-07 14:59:00 -0500, "x"]
689
+ #
678
690
  # Converter +:all+ converts with both +:date_time+ and +:numeric+.
679
691
  #
680
692
  # As seen above, method #convert adds \converters to a \CSV instance,
@@ -871,10 +883,10 @@ class CSV
871
883
  # A Regexp used to find and convert some common Date formats.
872
884
  DateMatcher = / \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
873
885
  \d{4}-\d{2}-\d{2} )\z /x
874
- # A Regexp used to find and convert some common DateTime formats.
886
+ # A Regexp used to find and convert some common (Date)Time formats.
875
887
  DateTimeMatcher =
876
888
  / \A(?: (\w+,?\s+)?\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2},?\s+\d{2,4} |
877
- # ISO-8601 and RFC-3339 (space instead of T) recognized by DateTime.parse
889
+ # ISO-8601 and RFC-3339 (space instead of T) recognized by (Date)Time.parse
878
890
  \d{4}-\d{2}-\d{2}
879
891
  (?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
880
892
  )\z /x
@@ -912,6 +924,14 @@ class CSV
912
924
  f
913
925
  end
914
926
  },
927
+ time: lambda { |f|
928
+ begin
929
+ e = f.encode(ConverterEncoding)
930
+ e.match?(DateTimeMatcher) ? Time.parse(e) : f
931
+ rescue # encoding conversion or parse errors
932
+ f
933
+ end
934
+ },
915
935
  all: [:date_time, :numeric],
916
936
  }
917
937
 
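The `time:` converter added above can be imitated with a custom converter, which is convenient when the installed csv version may predate 3.3.1. The sketch below is such a stand-in, assuming only that `CSV::DateTimeMatcher` and `Time.parse` are available. The lambda name `time_like` is hypothetical, and the encoding step of the built-in converter is left out for brevity.

```ruby
require "csv"
require "time"

# Hypothetical stand-in for the built-in :time converter, so the sketch
# does not depend on the installed csv version.
time_like = lambda do |field|
  begin
    field.match?(CSV::DateTimeMatcher) ? Time.parse(field) : field
  rescue ArgumentError
    field # leave unparseable fields untouched
  end
end

row = CSV.parse_line("2020-05-07T14:59:00-05:00,x", converters: [time_like])
p row.map(&:class)
```

With csv 3.3.1 or later, `converters: :time` should give the same classes without the custom lambda.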
@@ -1198,7 +1218,44 @@ class CSV
1198
1218
  # * Argument +in_string_or_io+ must be a \String or an \IO stream.
1199
1219
  # * Argument +out_string_or_io+ must be a \String or an \IO stream.
1200
1220
  # * Arguments <tt>**options</tt> must be keyword options.
1201
- # See {Options for Parsing}[#class-CSV-label-Options+for+Parsing].
1221
+ #
1222
+ # - Each option defined as an {option for parsing}[#class-CSV-label-Options+for+Parsing]
1223
+ # is used for parsing the filter input.
1224
+ # - Each option defined as an {option for generating}[#class-CSV-label-Options+for+Generating]
1225
+ # is used for generating the filter output.
1226
+ #
1227
+ # However, there are three options that may be used for both parsing and generating:
1228
+ # +col_sep+, +quote_char+, and +row_sep+.
1229
+ #
1230
+ # Therefore for method +filter+ (and method +filter+ only),
1231
+ # there are special options that allow these parsing and generating options
1232
+ # to be specified separately:
1233
+ #
1234
+ # - Options +input_col_sep+ and +output_col_sep+
1235
+ # (and their aliases +in_col_sep+ and +out_col_sep+)
1236
+ # specify the column separators for parsing and generating.
1237
+ # - Options +input_quote_char+ and +output_quote_char+
1238
+ # (and their aliases +in_quote_char+ and +out_quote_char+)
1239
+ # specify the quote characters for parsing and generating.
1240
+ # - Options +input_row_sep+ and +output_row_sep+
1241
+ # (and their aliases +in_row_sep+ and +out_row_sep+)
1242
+ # specify the row separators for parsing and generating.
1243
+ #
1244
+ # Example options (for column separators):
1245
+ #
1246
+ # CSV.filter # Default for both parsing and generating.
1247
+ # CSV.filter(in_col_sep: ';') # ';' for parsing, default for generating.
1248
+ # CSV.filter(out_col_sep: '|') # Default for parsing, '|' for generating.
1249
+ # CSV.filter(in_col_sep: ';', out_col_sep: '|') # ';' for parsing, '|' for generating.
1250
+ #
1251
+ # Note that for a special option (e.g., +input_col_sep+)
1252
+ # and its corresponding "regular" option (e.g., +col_sep+),
1253
+ # the two are mutually overriding.
1254
+ #
1255
+ # Another example (possibly surprising):
1256
+ #
1257
+ # CSV.filter(in_col_sep: ';', col_sep: '|') # '|' for both parsing(!) and generating.
1258
+ #
1202
1259
  def filter(input=nil, output=nil, **options)
1203
1260
  # parse options for input, output, or both
1204
1261
  in_options, out_options = Hash.new, {row_sep: InputRecordSeparator.value}
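The separate input and output separators described in the comment above can be tried on in-memory strings. This is a minimal sketch with made-up sample data; it uses the short `in_col_sep` and `out_col_sep` aliases named in the documentation.

```ruby
require "csv"

# Hypothetical sample data: semicolon-separated input.
in_string = "foo;0\nbar;1\n"
out_string = +""

# ';' is used only for parsing, '|' only for generating.
CSV.filter(in_string, out_string, in_col_sep: ";", out_col_sep: "|") do |row|
  row[0] = row[0].upcase
end

puts out_string
```

Passing plain `col_sep:` instead would apply one separator to both sides, which is the "mutually overriding" behaviour the comment warns about.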
@@ -1508,10 +1565,8 @@ class CSV
1508
1565
 
1509
1566
  #
1510
1567
  # :call-seq:
1511
- # open(file_path, mode = "rb", **options ) -> new_csv
1512
- # open(io, mode = "rb", **options ) -> new_csv
1513
- # open(file_path, mode = "rb", **options ) { |csv| ... } -> object
1514
- # open(io, mode = "rb", **options ) { |csv| ... } -> object
1568
+ # open(path_or_io, mode = "rb", **options ) -> new_csv
1569
+ # open(path_or_io, mode = "rb", **options ) { |csv| ... } -> object
1515
1570
  #
1516
1571
  # possible options elements:
1517
1572
  # keyword form:
@@ -1520,7 +1575,7 @@ class CSV
1520
1575
  # :undef => :replace # replace undefined conversion
1521
1576
  # :replace => string # replacement string ("?" or "\uFFFD" if not specified)
1522
1577
  #
1523
- # * Argument +path+, if given, must be the path to a file.
1578
+ # * Argument +path_or_io+ must be a file path or an \IO stream.
1524
1579
  # :include: ../doc/csv/arguments/io.rdoc
1525
1580
  # * Argument +mode+, if given, must be a \File mode.
1526
1581
  # See {Access Modes}[https://docs.ruby-lang.org/en/master/File.html#class-File-label-Access+Modes].
@@ -1544,6 +1599,9 @@ class CSV
1544
1599
  # path = 't.csv'
1545
1600
  # File.write(path, string)
1546
1601
  #
1602
+ # string_io = StringIO.new
1603
+ # string_io << "foo,0\nbar,1\nbaz,2\n"
1604
+ #
1547
1605
  # ---
1548
1606
  #
1549
1607
  # With no block given, returns a new \CSV object.
@@ -1556,6 +1614,9 @@ class CSV
1556
1614
  # csv = CSV.open(File.open(path))
1557
1615
  # csv # => #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1558
1616
  #
1617
+ # Create a \CSV object using a \StringIO:
1618
+ # csv = CSV.open(string_io)
1619
+ # csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1559
1620
  # ---
1560
1621
  #
1561
1622
  # With a block given, calls the block with the created \CSV object;
@@ -1573,15 +1634,25 @@ class CSV
1573
1634
  # Output:
1574
1635
  # #<CSV io_type:File io_path:"t.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1575
1636
  #
1637
+ # Using a \StringIO:
1638
+ # csv = CSV.open(string_io) {|csv| p csv}
1639
+ # csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1640
+ # Output:
1641
+ # #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
1576
1642
  # ---
1577
1643
  #
1578
1644
  # Raises an exception if the argument is not a \String object or \IO object:
1579
1645
  # # Raises TypeError (no implicit conversion of Symbol into String)
1580
1646
  # CSV.open(:foo)
1581
- def open(filename, mode="r", **options)
1647
+ def open(filename_or_io, mode="r", **options)
1582
1648
  # wrap a File opened with the remaining +args+ with no newline
1583
1649
  # decorator
1584
- file_opts = options.dup
1650
+ file_opts = {}
1651
+ may_enable_bom_detection_automatically(filename_or_io,
1652
+ mode,
1653
+ options,
1654
+ file_opts)
1655
+ file_opts.merge!(options)
1585
1656
  unless file_opts.key?(:newline)
1586
1657
  file_opts[:universal_newline] ||= false
1587
1658
  end
@@ -1590,14 +1661,19 @@ class CSV
1590
1661
  options.delete(:replace)
1591
1662
  options.delete_if {|k, _| /newline\z/.match?(k)}
1592
1663
 
1593
- begin
1594
- f = File.open(filename, mode, **file_opts)
1595
- rescue ArgumentError => e
1596
- raise unless /needs binmode/.match?(e.message) and mode == "r"
1597
- mode = "rb"
1598
- file_opts = {encoding: Encoding.default_external}.merge(file_opts)
1599
- retry
1664
+ if filename_or_io.is_a?(StringIO)
1665
+ f = create_stringio(filename_or_io.string, mode, **file_opts)
1666
+ else
1667
+ begin
1668
+ f = File.open(filename_or_io, mode, **file_opts)
1669
+ rescue ArgumentError => e
1670
+ raise unless /needs binmode/.match?(e.message) and mode == "r"
1671
+ mode = "rb"
1672
+ file_opts = {encoding: Encoding.default_external}.merge(file_opts)
1673
+ retry
1674
+ end
1600
1675
  end
1676
+
1601
1677
  begin
1602
1678
  csv = new(f, **options)
1603
1679
  rescue Exception
@@ -1729,6 +1805,23 @@ class CSV
1729
1805
  # Raises an exception if the argument is not a \String object or \IO object:
1730
1806
  # # Raises NoMethodError (undefined method `close' for :foo:Symbol)
1731
1807
  # CSV.parse(:foo)
1808
+ #
1809
+ # ---
1810
+ #
1811
+ # Please make sure if your text contains \BOM or not. CSV.parse will not remove
1812
+ # \BOM automatically. You might want to remove \BOM before calling CSV.parse :
1813
+ # # remove BOM on calling File.open
1814
+ # File.open(path, encoding: 'bom|utf-8') do |file|
1815
+ # CSV.parse(file, headers: true) do |row|
1816
+ # # you can get value by column name because BOM is removed
1817
+ # p row['Name']
1818
+ # end
1819
+ # end
1820
+ #
1821
+ # Output:
1822
+ # # "foo"
1823
+ # # "bar"
1824
+ # # "baz"
1732
1825
  def parse(str, **options, &block)
1733
1826
  csv = new(str, **options)
1734
1827
 
@@ -1862,6 +1955,42 @@ class CSV
1862
1955
  options = default_options.merge(options)
1863
1956
  read(path, **options)
1864
1957
  end
1958
+
1959
+ ON_WINDOWS = /mingw|mswin/.match?(RUBY_PLATFORM)
1960
+ private_constant :ON_WINDOWS
1961
+
1962
+ private
1963
+ def may_enable_bom_detection_automatically(filename_or_io,
1964
+ mode,
1965
+ options,
1966
+ file_opts)
1967
+ if filename_or_io.is_a?(StringIO)
1968
+ # Support to StringIO was dropped for Ruby 2.6 and earlier without BOM support:
1969
+ # https://github.com/ruby/stringio/pull/47
1970
+ return if RUBY_VERSION < "2.7"
1971
+ else
1972
+ # "bom|utf-8" may be buggy on Windows:
1973
+ # https://bugs.ruby-lang.org/issues/20526
1974
+ return if ON_WINDOWS
1975
+ end
1976
+ return unless Encoding.default_external == Encoding::UTF_8
1977
+ return if options.key?(:encoding)
1978
+ return if options.key?(:external_encoding)
1979
+ return if mode.is_a?(String) and mode.include?(":")
1980
+ file_opts[:encoding] = "bom|utf-8"
1981
+ end
1982
+
1983
+ if RUBY_VERSION < "2.7"
1984
+ def create_stringio(str, mode, opts)
1985
+ opts.delete_if {|k, _| k == :universal_newline or DEFAULT_OPTIONS.key?(k)}
1986
+ raise ArgumentError, "Unsupported options parsing StringIO: #{opts.keys}" unless opts.empty?
1987
+ StringIO.new(str, mode)
1988
+ end
1989
+ else
1990
+ def create_stringio(str, mode, opts)
1991
+ StringIO.new(str, mode, **opts)
1992
+ end
1993
+ end
1865
1994
  end
1866
1995
 
1867
1996
  # :call-seq:
@@ -2000,6 +2129,12 @@ class CSV
2000
2129
  writer if @writer_options[:write_headers]
2001
2130
  end
2002
2131
 
2132
+ class TSV < CSV
2133
+ def initialize(data, **options)
2134
+ super(data, **({col_sep: "\t"}.merge(options)))
2135
+ end
2136
+ end
2137
+
2003
2138
  # :call-seq:
2004
2139
  # csv.col_sep -> string
2005
2140
  #
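
Note on the `CSV.parse` documentation added in the hunk at new lines 1808-1824: the following is a minimal sketch of why that note matters, not code from the gem. It assumes a hypothetical temporary file whose content starts with a UTF-8 BOM; the variable names and sample data are illustrative only. Reading the text as plain UTF-8 leaves the BOM attached to the first header, so a lookup by `'Name'` finds nothing, while opening the file with `encoding: 'bom|utf-8'` (as the added documentation suggests) strips it first.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data: a UTF-8 CSV that begins with a byte order mark.
content = "\uFEFFName,Value\nfoo,0\nbar,1\nbaz,2\n"

first_header = nil
names_with_bom = nil
names_without_bom = []

Tempfile.create(["bom-sample", ".csv"]) do |tmp|
  tmp.write(content)
  tmp.close
  path = tmp.path

  # Plain UTF-8 read: CSV.parse does not remove the BOM, so it becomes
  # part of the first header and row["Name"] does not match.
  text = File.read(path, encoding: "UTF-8")
  table = CSV.parse(text, headers: true)
  first_header = table.headers.first
  names_with_bom = table.map { |row| row["Name"] }

  # Let File.open consume the BOM before CSV.parse sees the data.
  File.open(path, encoding: "bom|utf-8") do |file|
    CSV.parse(file, headers: true) do |row|
      names_without_bom << row["Name"]
    end
  end
end

p first_header       # => "\uFEFFName"
p names_with_bom     # => [nil, nil, nil]
p names_without_bom  # => ["foo", "bar", "baz"]
```

According to the `may_enable_bom_detection_automatically` hunk above, `CSV.open` in this version selects `"bom|utf-8"` on its own only when the default external encoding is UTF-8 and no encoding is given via options or the mode string (and, for file paths, not on Windows), so passing the encoding explicitly as in this sketch remains the dependable route for `CSV.parse`.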
metadata CHANGED
@@ -1,78 +1,23 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csv
3
3
  version: !ruby/object:Gem::Version
4
- version: 3.3.0
4
+ version: 3.3.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - James Edward Gray II
8
8
  - Kouhei Sutou
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-03-22 00:00:00.000000000 Z
12
- dependencies:
13
- - !ruby/object:Gem::Dependency
14
- name: bundler
15
- requirement: !ruby/object:Gem::Requirement
16
- requirements:
17
- - - ">="
18
- - !ruby/object:Gem::Version
19
- version: '0'
20
- type: :development
21
- prerelease: false
22
- version_requirements: !ruby/object:Gem::Requirement
23
- requirements:
24
- - - ">="
25
- - !ruby/object:Gem::Version
26
- version: '0'
27
- - !ruby/object:Gem::Dependency
28
- name: rake
29
- requirement: !ruby/object:Gem::Requirement
30
- requirements:
31
- - - ">="
32
- - !ruby/object:Gem::Version
33
- version: '0'
34
- type: :development
35
- prerelease: false
36
- version_requirements: !ruby/object:Gem::Requirement
37
- requirements:
38
- - - ">="
39
- - !ruby/object:Gem::Version
40
- version: '0'
41
- - !ruby/object:Gem::Dependency
42
- name: benchmark_driver
43
- requirement: !ruby/object:Gem::Requirement
44
- requirements:
45
- - - ">="
46
- - !ruby/object:Gem::Version
47
- version: '0'
48
- type: :development
49
- prerelease: false
50
- version_requirements: !ruby/object:Gem::Requirement
51
- requirements:
52
- - - ">="
53
- - !ruby/object:Gem::Version
54
- version: '0'
55
- - !ruby/object:Gem::Dependency
56
- name: test-unit
57
- requirement: !ruby/object:Gem::Requirement
58
- requirements:
59
- - - ">="
60
- - !ruby/object:Gem::Version
61
- version: 3.4.8
62
- type: :development
63
- prerelease: false
64
- version_requirements: !ruby/object:Gem::Requirement
65
- requirements:
66
- - - ">="
67
- - !ruby/object:Gem::Version
68
- version: 3.4.8
11
+ date: 2025-03-20 00:00:00.000000000 Z
12
+ dependencies: []
69
13
  description: The CSV library provides a complete interface to CSV files and data.
70
14
  It offers tools to enable you to read and write to and from Strings or IO objects,
71
15
  as needed.
72
16
  email:
73
17
  -
74
18
  - kou@cozmixng.org
75
- executables: []
19
+ executables:
20
+ - csv-filter
76
21
  extensions: []
77
22
  extra_rdoc_files:
78
23
  - LICENSE.txt
@@ -86,6 +31,7 @@ files:
86
31
  - LICENSE.txt
87
32
  - NEWS.md
88
33
  - README.md
34
+ - bin/csv-filter
89
35
  - doc/csv/arguments/io.rdoc
90
36
  - doc/csv/options/common/col_sep.rdoc
91
37
  - doc/csv/options/common/quote_char.rdoc
@@ -126,7 +72,8 @@ homepage: https://github.com/ruby/csv
126
72
  licenses:
127
73
  - Ruby
128
74
  - BSD-2-Clause
129
- metadata: {}
75
+ metadata:
76
+ changelog_uri: https://github.com/ruby/csv/releases/tag/v3.3.3
130
77
  rdoc_options:
131
78
  - "--main"
132
79
  - README.md
@@ -143,7 +90,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
143
90
  - !ruby/object:Gem::Version
144
91
  version: '0'
145
92
  requirements: []
146
- rubygems_version: 3.6.0.dev
93
+ rubygems_version: 3.6.2
147
94
  specification_version: 4
148
95
  summary: CSV Reading and Writing
149
96
  test_files: []