csv 3.1.1 → 3.1.2
- checksums.yaml +4 -4
- data/NEWS.md +47 -0
- data/lib/csv.rb +399 -379
- data/lib/csv/core_ext/array.rb +1 -1
- data/lib/csv/core_ext/string.rb +1 -1
- data/lib/csv/fields_converter.rb +6 -0
- data/lib/csv/parser.rb +52 -8
- data/lib/csv/row.rb +17 -15
- data/lib/csv/table.rb +29 -29
- data/lib/csv/version.rb +1 -1
- data/lib/csv/writer.rb +12 -0
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 30015ee78d9fd5fa7b6bc0bbca7b785adef465e39dd654d4c4a020420f2f47ec
|
4
|
+
data.tar.gz: 72ece87ac30a9748dc64b37c8ca77de23b21939c19568c870b9ccb020f8cabf4
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: '09cdbbeb0d72c765d3cd15e690dfb6a25a7e731f5d79f15815cdc17cbb074c5d592d4805f9e59778f911478ea25dc635ee91a43210e8720e8c996166bbe4467c'
|
7
|
+
data.tar.gz: 8144fa3744620a731ff8b601316e476c6263a9928db41fa95ef50902d20928e1be174fbe115d42585a46dc455971e65fe1abc82032a9f6d82c959a675b1e252f
|
data/NEWS.md
CHANGED
@@ -1,5 +1,52 @@
|
|
1
1
|
# News
|
2
2
|
|
3
|
+
## 3.1.2 - 2019-10-12
|
4
|
+
|
5
|
+
### Improvements
|
6
|
+
|
7
|
+
* Added `:col_sep` check.
|
8
|
+
[GitHub#94][Reported by Florent Beaurain]
|
9
|
+
|
10
|
+
* Suppressed warnings.
|
11
|
+
[GitHub#96][Patch by Nobuyoshi Nakada]
|
12
|
+
|
13
|
+
* Improved documentation.
|
14
|
+
[GitHub#101][GitHub#102][Patch by Vitor Oliveira]
|
15
|
+
|
16
|
+
### Fixes
|
17
|
+
|
18
|
+
* Fixed a typo in documentation.
|
19
|
+
[GitHub#95][Patch by Yuji Yaginuma]
|
20
|
+
|
21
|
+
* Fixed a multibyte character handling bug.
|
22
|
+
[GitHub#97][Patch by koshigoe]
|
23
|
+
|
24
|
+
* Fixed typos in documentation.
|
25
|
+
[GitHub#100][Patch by Vitor Oliveira]
|
26
|
+
|
27
|
+
* Fixed a bug that seeked `StringIO` isn't accepted.
|
28
|
+
[GitHub#98][Patch by MATSUMOTO Katsuyoshi]
|
29
|
+
|
30
|
+
* Fixed a bug that `CSV.generate_line` doesn't work with
|
31
|
+
`Encoding.default_internal`.
|
32
|
+
[GitHub#105][Reported by David Rodríguez]
|
33
|
+
|
34
|
+
### Thanks
|
35
|
+
|
36
|
+
* Florent Beaurain
|
37
|
+
|
38
|
+
* Yuji Yaginuma
|
39
|
+
|
40
|
+
* Nobuyoshi Nakada
|
41
|
+
|
42
|
+
* koshigoe
|
43
|
+
|
44
|
+
* Vitor Oliveira
|
45
|
+
|
46
|
+
* MATSUMOTO Katsuyoshi
|
47
|
+
|
48
|
+
* David Rodríguez
|
49
|
+
|
3
50
|
## 3.1.1 - 2019-04-26
|
4
51
|
|
5
52
|
### Improvements
|
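As a rough illustration of the seeked `StringIO` fix listed under 3.1.2 (GitHub#98), the sketch below moves a `StringIO` past a preamble line before handing it to `CSV.new`. The sample data and variable names are made up for illustration; the assumption is that a csv version with this fix parses from the current position rather than from the start of the underlying string.

```ruby
require "csv"
require "stringio"

# Hypothetical input: one preamble line followed by the real CSV data.
io = StringIO.new("generated by export tool\nname,dept\nBob,Engineering\n")
io.gets # consume the preamble; the position now sits at the CSV header

# Parsing should begin at the current position of the StringIO.
rows = CSV.new(io).read
p rows
```

On versions without the fix, the same call may not honor the position, so pinning `csv >= 3.1.2` is the safer choice when relying on this.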
data/lib/csv.rb
CHANGED
@@ -10,18 +10,18 @@
|
|
10
10
|
#
|
11
11
|
# Welcome to the new and improved CSV.
|
12
12
|
#
|
13
|
-
# This version of the CSV library began its life as FasterCSV.
|
14
|
-
# intended as a replacement to Ruby's then standard CSV library.
|
13
|
+
# This version of the CSV library began its life as FasterCSV. FasterCSV was
|
14
|
+
# intended as a replacement to Ruby's then standard CSV library. It was
|
15
15
|
# designed to address concerns users of that library had and it had three
|
16
16
|
# primary goals:
|
17
17
|
#
|
18
18
|
# 1. Be significantly faster than CSV while remaining a pure Ruby library.
|
19
|
-
# 2. Use a smaller and easier to maintain code base.
|
20
|
-
# grew larger, was also but considerably richer in features.
|
19
|
+
# 2. Use a smaller and easier to maintain code base. (FasterCSV eventually
|
20
|
+
# grew larger, was also but considerably richer in features. The parsing
|
21
21
|
# core remains quite small.)
|
22
22
|
# 3. Improve on the CSV interface.
|
23
23
|
#
|
24
|
-
# Obviously, the last one is subjective.
|
24
|
+
# Obviously, the last one is subjective. I did try to defer to the original
|
25
25
|
# interface whenever I didn't have a compelling reason to change it though, so
|
26
26
|
# hopefully this won't be too radically different.
|
27
27
|
#
|
@@ -29,20 +29,20 @@
|
|
29
29
|
# the original library as of Ruby 1.9. If you are migrating code from 1.8 or
|
30
30
|
# earlier, you may have to change your code to comply with the new interface.
|
31
31
|
#
|
32
|
-
# == What's Different From the Old CSV?
|
32
|
+
# == What's the Different From the Old CSV?
|
33
33
|
#
|
34
34
|
# I'm sure I'll miss something, but I'll try to mention most of the major
|
35
35
|
# differences I am aware of, to help others quickly get up to speed:
|
36
36
|
#
|
37
37
|
# === CSV Parsing
|
38
38
|
#
|
39
|
-
# * This parser is m17n aware.
|
39
|
+
# * This parser is m17n aware. See CSV for full details.
|
40
40
|
# * This library has a stricter parser and will throw MalformedCSVErrors on
|
41
41
|
# problematic data.
|
42
|
-
# * This library has a less liberal idea of a line ending than CSV.
|
43
|
-
# set as the <tt>:row_sep</tt> is law.
|
42
|
+
# * This library has a less liberal idea of a line ending than CSV. What you
|
43
|
+
# set as the <tt>:row_sep</tt> is law. It can auto-detect your line endings
|
44
44
|
# though.
|
45
|
-
# * The old library returned empty lines as <tt>[nil]</tt>.
|
45
|
+
# * The old library returned empty lines as <tt>[nil]</tt>. This library calls
|
46
46
|
# them <tt>[]</tt>.
|
47
47
|
# * This library has a much faster parser.
|
48
48
|
#
|
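The parsing notes in the hunk above (blank lines returned as `[]`, and `MalformedCSVError` on problematic data) can be sketched as follows. The input strings are invented for illustration, and the exact error message may differ between csv versions.

```ruby
require "csv"

# A blank line comes back as an empty Array rather than [nil].
rows = CSV.parse("a,b\n\nc\n")
p rows

# Structurally broken input (here, an unterminated quote) is rejected.
begin
  CSV.parse("a,\"b\n")
  malformed = false
rescue CSV::MalformedCSVError => e
  malformed = true
  puts "rejected: #{e.message}"
end
```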
@@ -56,9 +56,9 @@
|
|
56
56
|
# * CSV now has a new() method used to wrap objects like String and IO for
|
57
57
|
# reading and writing.
|
58
58
|
# * CSV::generate() is different from the old method.
|
59
|
-
# * CSV no longer supports partial reads.
|
59
|
+
# * CSV no longer supports partial reads. It works line-by-line.
|
60
60
|
# * CSV no longer allows the instance methods to override the separators for
|
61
|
-
# performance reasons.
|
61
|
+
# performance reasons. They must be set in the constructor.
|
62
62
|
#
|
63
63
|
# If you use this library and find yourself missing any functionality I have
|
64
64
|
# trimmed, please {let me know}[mailto:james@grayproductions.net].
|
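A minimal sketch of the two interface points in the hunk above: separators are fixed when the object is constructed, and reading proceeds one row at a time. The data is illustrative only.

```ruby
require "csv"

# The column separator is set once, in the constructor.
csv = CSV.new("a;b\nc;d\n", col_sep: ";")

# Each call to shift reads exactly one row; nil signals the end of input.
first  = csv.shift
second = csv.shift
done   = csv.shift

p first, second, done
```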
@@ -70,16 +70,16 @@
|
|
70
70
|
# == What is CSV, really?
|
71
71
|
#
|
72
72
|
# CSV maintains a pretty strict definition of CSV taken directly from
|
73
|
-
# {the RFC}[http://www.ietf.org/rfc/rfc4180.txt].
|
74
|
-
# place and that is to make using this library easier.
|
73
|
+
# {the RFC}[http://www.ietf.org/rfc/rfc4180.txt]. I relax the rules in only one
|
74
|
+
# place and that is to make using this library easier. CSV will parse all valid
|
75
75
|
# CSV.
|
76
76
|
#
|
77
|
-
# What you don't want to do is feed CSV invalid data.
|
77
|
+
# What you don't want to do is to feed CSV invalid data. Because of the way the
|
78
78
|
# CSV format works, it's common for a parser to need to read until the end of
|
79
|
-
# the file to be sure a field is invalid.
|
79
|
+
# the file to be sure a field is invalid. This consumes a lot of time and memory.
|
80
80
|
#
|
81
81
|
# Luckily, when working with invalid CSV, Ruby's built-in methods will almost
|
82
|
-
# always be superior in every way.
|
82
|
+
# always be superior in every way. For example, parsing non-quoted fields is as
|
83
83
|
# easy as:
|
84
84
|
#
|
85
85
|
# data.split(",")
|
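To show where `data.split(",")` is enough and where it is not, the sketch below compares it with `CSV.parse_line` on a line containing a quoted comma. The sample line is made up for illustration.

```ruby
require "csv"

line = 'Bob,"Smith, Jr.",1000'

# Naive splitting cuts the quoted field in two and keeps the quote characters.
naive = line.split(",")

# The CSV parser honors the quotes and returns three clean fields.
parsed = CSV.parse_line(line)

p naive.size
p parsed
```

For data known to contain no quoted fields, the `split` approach stays the cheaper option, as the comment in the hunk suggests.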
@@ -104,7 +104,7 @@ require_relative "csv/writer"
|
|
104
104
|
using CSV::MatchP if CSV.const_defined?(:MatchP)
|
105
105
|
|
106
106
|
#
|
107
|
-
# This class provides a complete interface to CSV files and data.
|
107
|
+
# This class provides a complete interface to CSV files and data. It offers
|
108
108
|
# tools to enable you to read and write to and from Strings or IO objects, as
|
109
109
|
# needed.
|
110
110
|
#
|
@@ -184,7 +184,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
184
184
|
# === CSV with headers
|
185
185
|
#
|
186
186
|
# CSV allows to specify column names of CSV file, whether they are in data, or
|
187
|
-
# provided separately. If headers specified, reading methods return an instance
|
187
|
+
# provided separately. If headers are specified, reading methods return an instance
|
188
188
|
# of CSV::Table, consisting of CSV::Row.
|
189
189
|
#
|
190
190
|
# # Headers are part of data
|
@@ -200,7 +200,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
200
200
|
# data.first.to_h #=> {"Name"=>"Bob", "Department"=>"Engineering", "Salary"=>"1000"}
|
201
201
|
#
|
202
202
|
# # Headers provided by developer
|
203
|
-
# data = CSV.parse('Bob,
|
203
|
+
# data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary])
|
204
204
|
# data.first #=> #<CSV::Row name:"Bob" department:"Engineering" salary:"1000">
|
205
205
|
#
|
206
206
|
# === Typed data reading
|
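The header handling documented in the two hunks above can be exercised roughly like this. The first call assumes input text of the same shape as the documentation's "Headers are part of data" example; the second mirrors the corrected `headers: %i[...]` line.

```ruby
require "csv"

# Headers taken from the first row of the data.
text = "Name,Department,Salary\nBob,Engineering,1000\n"
table = CSV.parse(text, headers: true)
p table.class
p table.first.to_h

# Headers supplied by the caller; the data itself has no header row.
data = CSV.parse("Bob,Engineering,1000", headers: %i[name department salary])
p data.first.to_h
```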
@@ -223,42 +223,42 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
|
|
223
223
|
# == CSV and Character Encodings (M17n or Multilingualization)
|
224
224
|
#
|
225
225
|
# This new CSV parser is m17n savvy. The parser works in the Encoding of the IO
|
226
|
-
# or String object being read from or written to.
|
226
|
+
# or String object being read from or written to. Your data is never transcoded
|
227
227
|
# (unless you ask Ruby to transcode it for you) and will literally be parsed in
|
228
|
-
# the Encoding it is in.
|
229
|
-
# Encoding of your data.
|
228
|
+
# the Encoding it is in. Thus CSV will return Arrays or Rows of Strings in the
|
229
|
+
# Encoding of your data. This is accomplished by transcoding the parser itself
|
230
230
|
# into your Encoding.
|
231
231
|
#
|
232
232
|
# Some transcoding must take place, of course, to accomplish this multiencoding
|
233
|
-
# support.
|
233
|
+
# support. For example, <tt>:col_sep</tt>, <tt>:row_sep</tt>, and
|
234
234
|
# <tt>:quote_char</tt> must be transcoded to match your data. Hopefully this
|
235
235
|
# makes the entire process feel transparent, since CSV's defaults should just
|
236
|
-
# magically work for your data.
|
236
|
+
# magically work for your data. However, you can set these values manually in
|
237
237
|
# the target Encoding to avoid the translation.
|
238
238
|
#
|
239
239
|
# It's also important to note that while all of CSV's core parser is now
|
240
|
-
# Encoding agnostic, some features are not.
|
240
|
+
# Encoding agnostic, some features are not. For example, the built-in
|
241
241
|
# converters will try to transcode data to UTF-8 before making conversions.
|
242
242
|
# Again, you can provide custom converters that are aware of your Encodings to
|
243
|
-
# avoid this translation.
|
243
|
+
# avoid this translation. It's just too hard for me to support native
|
244
244
|
# conversions in all of Ruby's Encodings.
|
245
245
|
#
|
246
|
-
# Anyway, the practical side of this is simple:
|
246
|
+
# Anyway, the practical side of this is simple: make sure IO and String objects
|
247
247
|
# passed into CSV have the proper Encoding set and everything should just work.
|
248
248
|
# CSV methods that allow you to open IO objects (CSV::foreach(), CSV::open(),
|
249
249
|
# CSV::read(), and CSV::readlines()) do allow you to specify the Encoding.
|
250
250
|
#
|
251
251
|
# One minor exception comes when generating CSV into a String with an Encoding
|
252
|
-
# that is not ASCII compatible.
|
252
|
+
# that is not ASCII compatible. There's no existing data for CSV to use to
|
253
253
|
# prepare itself and thus you will probably need to manually specify the desired
|
254
|
-
# Encoding for most of those cases.
|
254
|
+
# Encoding for most of those cases. It will try to guess using the fields in a
|
255
255
|
# row of output though, when using CSV::generate_line() or Array#to_csv().
|
256
256
|
#
|
257
257
|
# I try to point out any other Encoding issues in the documentation of methods
|
258
258
|
# as they come up.
|
259
259
|
#
|
260
260
|
# This has been tested to the best of my ability with all non-"dummy" Encodings
|
261
|
-
# Ruby ships with.
|
261
|
+
# Ruby ships with. However, it is brave new code and may have some bugs.
|
262
262
|
# Please feel free to {report}[mailto:james@grayproductions.net] any issues you
|
263
263
|
# find with it.
|
264
264
|
#
|
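A small sketch of the encoding behavior described above, assuming ISO-8859-1 input built by hand for illustration: fields come back in the Encoding of the data, while the built-in numeric converter still manages to convert the numeric field.

```ruby
require "csv"

# Build a Latin-1 encoded line ("café,1") from a UTF-8 literal.
latin1 = "caf\u00E9,1\n".encode("ISO-8859-1")

# Parsed fields keep the Encoding of the input; nothing is transcoded.
fields = CSV.parse_line(latin1)
p fields.map(&:encoding)

# Built-in converters transcode internally before attempting a conversion.
converted = CSV.parse_line(latin1, converters: :numeric)
p converted.last
```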
@@ -354,7 +354,7 @@ class CSV
|
|
354
354
|
|
355
355
|
#
|
356
356
|
# This Hash holds the built-in header converters of CSV that can be accessed
|
357
|
-
# by name.
|
357
|
+
# by name. You can select HeaderConverters with CSV.header_convert() or
|
358
358
|
# through the +options+ Hash passed to CSV::new().
|
359
359
|
#
|
360
360
|
# <b><tt>:downcase</tt></b>:: Calls downcase() on the header String.
|
@@ -364,13 +364,13 @@ class CSV
|
|
364
364
|
# and finally to_sym() is called.
|
365
365
|
#
|
366
366
|
# All built-in header converters transcode header data to UTF-8 before
|
367
|
-
# attempting a conversion.
|
367
|
+
# attempting a conversion. If your data cannot be transcoded to UTF-8 the
|
368
368
|
# conversion will fail and the header will remain unchanged.
|
369
369
|
#
|
370
370
|
# This Hash is intentionally left unfrozen and users should feel free to add
|
371
371
|
# values to it that can be accessed by all CSV objects.
|
372
372
|
#
|
373
|
-
# To add a combo field, the value should be an Array of names.
|
373
|
+
# To add a combo field, the value should be an Array of names. Combo fields
|
374
374
|
# can be nested with other combo fields.
|
375
375
|
#
|
376
376
|
HeaderConverters = {
|
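The built-in header converters named in this comment can be selected through the options passed to the parser. A brief sketch with invented header text:

```ruby
require "csv"

text = "First Name,AGE\nBob,30\n"

# :symbol downcases, replaces whitespace with underscores, then calls to_sym.
by_symbol = CSV.parse(text, headers: true, header_converters: :symbol)
p by_symbol.headers

# :downcase only lowercases the header String.
by_downcase = CSV.parse(text, headers: true, header_converters: :downcase)
p by_downcase.headers
```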
@@ -382,7 +382,7 @@ class CSV
|
|
382
382
|
}
|
383
383
|
|
384
384
|
#
|
385
|
-
# The options used when no overrides are given by calling code.
|
385
|
+
# The options used when no overrides are given by calling code. They are:
|
386
386
|
#
|
387
387
|
# <b><tt>:col_sep</tt></b>:: <tt>","</tt>
|
388
388
|
# <b><tt>:row_sep</tt></b>:: <tt>:auto</tt>
|
@@ -416,331 +416,337 @@ class CSV
|
|
416
416
|
quote_empty: true,
|
417
417
|
}.freeze
|
418
418
|
|
419
|
-
|
420
|
-
|
421
|
-
|
422
|
-
|
423
|
-
|
424
|
-
|
425
|
-
|
426
|
-
|
427
|
-
|
428
|
-
|
429
|
-
|
430
|
-
|
431
|
-
|
432
|
-
|
433
|
-
|
434
|
-
|
435
|
-
|
436
|
-
|
437
|
-
|
438
|
-
|
439
|
-
|
440
|
-
|
419
|
+
class << self
|
420
|
+
#
|
421
|
+
# This method will return a CSV instance, just like CSV::new(), but the
|
422
|
+
# instance will be cached and returned for all future calls to this method for
|
423
|
+
# the same +data+ object (tested by Object#object_id()) with the same
|
424
|
+
# +options+.
|
425
|
+
#
|
426
|
+
# If a block is given, the instance is passed to the block and the return
|
427
|
+
# value becomes the return value of the block.
|
428
|
+
#
|
429
|
+
def instance(data = $stdout, **options)
|
430
|
+
# create a _signature_ for this method call, data object and options
|
431
|
+
sig = [data.object_id] +
|
432
|
+
options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })
|
433
|
+
|
434
|
+
# fetch or create the instance for this signature
|
435
|
+
@@instances ||= Hash.new
|
436
|
+
instance = (@@instances[sig] ||= new(data, **options))
|
437
|
+
|
438
|
+
if block_given?
|
439
|
+
yield instance # run block, if given, returning result
|
440
|
+
else
|
441
|
+
instance # or return the instance
|
442
|
+
end
|
441
443
|
end
|
442
|
-
end
|
443
444
|
|
444
|
-
|
445
|
-
|
446
|
-
|
447
|
-
|
448
|
-
|
449
|
-
|
450
|
-
|
451
|
-
|
452
|
-
|
453
|
-
|
454
|
-
|
455
|
-
|
456
|
-
|
457
|
-
|
458
|
-
|
459
|
-
|
460
|
-
|
461
|
-
|
462
|
-
|
463
|
-
|
464
|
-
|
465
|
-
|
466
|
-
|
467
|
-
|
468
|
-
|
469
|
-
|
470
|
-
|
471
|
-
|
472
|
-
|
473
|
-
|
474
|
-
|
475
|
-
|
476
|
-
|
477
|
-
|
478
|
-
|
479
|
-
|
445
|
+
#
|
446
|
+
# :call-seq:
|
447
|
+
# filter( **options ) { |row| ... }
|
448
|
+
# filter( input, **options ) { |row| ... }
|
449
|
+
# filter( input, output, **options ) { |row| ... }
|
450
|
+
#
|
451
|
+
# This method is a convenience for building Unix-like filters for CSV data.
|
452
|
+
# Each row is yielded to the provided block which can alter it as needed.
|
453
|
+
# After the block returns, the row is appended to +output+ altered or not.
|
454
|
+
#
|
455
|
+
# The +input+ and +output+ arguments can be anything CSV::new() accepts
|
456
|
+
# (generally String or IO objects). If not given, they default to
|
457
|
+
# <tt>ARGF</tt> and <tt>$stdout</tt>.
|
458
|
+
#
|
459
|
+
# The +options+ parameter is also filtered down to CSV::new() after some
|
460
|
+
# clever key parsing. Any key beginning with <tt>:in_</tt> or
|
461
|
+
# <tt>:input_</tt> will have that leading identifier stripped and will only
|
462
|
+
# be used in the +options+ Hash for the +input+ object. Keys starting with
|
463
|
+
# <tt>:out_</tt> or <tt>:output_</tt> affect only +output+. All other keys
|
464
|
+
# are assigned to both objects.
|
465
|
+
#
|
466
|
+
# The <tt>:output_row_sep</tt> +option+ defaults to
|
467
|
+
# <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
|
468
|
+
#
|
469
|
+
def filter(input=nil, output=nil, **options)
|
470
|
+
# parse options for input, output, or both
|
471
|
+
in_options, out_options = Hash.new, {row_sep: $INPUT_RECORD_SEPARATOR}
|
472
|
+
options.each do |key, value|
|
473
|
+
case key.to_s
|
474
|
+
when /\Ain(?:put)?_(.+)\Z/
|
475
|
+
in_options[$1.to_sym] = value
|
476
|
+
when /\Aout(?:put)?_(.+)\Z/
|
477
|
+
out_options[$1.to_sym] = value
|
478
|
+
else
|
479
|
+
in_options[key] = value
|
480
|
+
out_options[key] = value
|
481
|
+
end
|
482
|
+
end
|
483
|
+
# build input and output wrappers
|
484
|
+
input = new(input || ARGF, **in_options)
|
485
|
+
output = new(output || $stdout, **out_options)
|
486
|
+
|
487
|
+
# read, yield, write
|
488
|
+
input.each do |row|
|
489
|
+
yield row
|
490
|
+
output << row
|
480
491
|
end
|
481
492
|
end
|
482
|
-
|
483
|
-
|
484
|
-
|
485
|
-
|
486
|
-
#
|
487
|
-
|
488
|
-
|
489
|
-
|
493
|
+
|
494
|
+
#
|
495
|
+
# This method is intended as the primary interface for reading CSV files. You
|
496
|
+
# pass a +path+ and any +options+ you wish to set for the read. Each row of
|
497
|
+
# file will be passed to the provided +block+ in turn.
|
498
|
+
#
|
499
|
+
# The +options+ parameter can be anything CSV::new() understands. This method
|
500
|
+
# also understands an additional <tt>:encoding</tt> parameter that you can use
|
501
|
+
# to specify the Encoding of the data in the file to be read. You must provide
|
502
|
+
# this unless your data is in Encoding::default_external(). CSV will use this
|
503
|
+
# to determine how to parse the data. You may provide a second Encoding to
|
504
|
+
# have the data transcoded as it is read. For example,
|
505
|
+
# <tt>encoding: "UTF-32BE:UTF-8"</tt> would read UTF-32BE data from the file
|
506
|
+
# but transcode it to UTF-8 before CSV parses it.
|
507
|
+
#
|
508
|
+
def foreach(path, mode="r", **options, &block)
|
509
|
+
return to_enum(__method__, path, mode, **options) unless block_given?
|
510
|
+
open(path, mode, **options) do |csv|
|
511
|
+
csv.each(&block)
|
512
|
+
end
|
490
513
|
end
|
491
|
-
end
|
492
514
|
|
493
|
-
|
494
|
-
|
495
|
-
|
496
|
-
|
497
|
-
|
498
|
-
|
499
|
-
|
500
|
-
|
501
|
-
|
502
|
-
|
503
|
-
|
504
|
-
|
505
|
-
|
506
|
-
|
507
|
-
|
508
|
-
|
509
|
-
|
510
|
-
|
515
|
+
#
|
516
|
+
# :call-seq:
|
517
|
+
# generate( str, **options ) { |csv| ... }
|
518
|
+
# generate( **options ) { |csv| ... }
|
519
|
+
#
|
520
|
+
# This method wraps a String you provide, or an empty default String, in a
|
521
|
+
# CSV object which is passed to the provided block. You can use the block to
|
522
|
+
# append CSV rows to the String and when the block exits, the final String
|
523
|
+
# will be returned.
|
524
|
+
#
|
525
|
+
# Note that a passed String *is* modified by this method. Call dup() before
|
526
|
+
# passing if you need a new String.
|
527
|
+
#
|
528
|
+
# The +options+ parameter can be anything CSV::new() understands. This method
|
529
|
+
# understands an additional <tt>:encoding</tt> parameter when not passed a
|
530
|
+
# String to set the base Encoding for the output. CSV needs this hint if you
|
531
|
+
# plan to output non-ASCII compatible data.
|
532
|
+
#
|
533
|
+
def generate(str=nil, **options)
|
534
|
+
# add a default empty String, if none was given
|
535
|
+
if str
|
536
|
+
str = StringIO.new(str)
|
537
|
+
str.seek(0, IO::SEEK_END)
|
538
|
+
else
|
539
|
+
encoding = options[:encoding]
|
540
|
+
str = +""
|
541
|
+
str.force_encoding(encoding) if encoding
|
542
|
+
end
|
543
|
+
csv = new(str, **options) # wrap
|
544
|
+
yield csv # yield for appending
|
545
|
+
csv.string # return final String
|
511
546
|
end
|
512
|
-
end
|
513
547
|
|
514
|
-
|
515
|
-
|
516
|
-
|
517
|
-
|
518
|
-
|
519
|
-
|
520
|
-
|
521
|
-
|
522
|
-
|
523
|
-
|
524
|
-
|
525
|
-
|
526
|
-
|
527
|
-
|
528
|
-
|
529
|
-
# String to set the base Encoding for the output. CSV needs this hint if you
|
530
|
-
# plan to output non-ASCII compatible data.
|
531
|
-
#
|
532
|
-
def self.generate(str=nil, **options)
|
533
|
-
# add a default empty String, if none was given
|
534
|
-
if str
|
535
|
-
str = StringIO.new(str)
|
536
|
-
str.seek(0, IO::SEEK_END)
|
537
|
-
else
|
538
|
-
encoding = options[:encoding]
|
548
|
+
#
|
549
|
+
# This method is a shortcut for converting a single row (Array) into a CSV
|
550
|
+
# String.
|
551
|
+
#
|
552
|
+
# The +options+ parameter can be anything CSV::new() understands. This method
|
553
|
+
# understands an additional <tt>:encoding</tt> parameter to set the base
|
554
|
+
# Encoding for the output. This method will try to guess your Encoding from
|
555
|
+
# the first non-+nil+ field in +row+, if possible, but you may need to use
|
556
|
+
# this parameter as a backup plan.
|
557
|
+
#
|
558
|
+
# The <tt>:row_sep</tt> +option+ defaults to <tt>$INPUT_RECORD_SEPARATOR</tt>
|
559
|
+
# (<tt>$/</tt>) when calling this method.
|
560
|
+
#
|
561
|
+
def generate_line(row, **options)
|
562
|
+
options = {row_sep: $INPUT_RECORD_SEPARATOR}.merge(options)
|
539
563
|
str = +""
|
540
|
-
|
564
|
+
if options[:encoding]
|
565
|
+
str.force_encoding(options[:encoding])
|
566
|
+
elsif field = row.find {|f| f.is_a?(String)}
|
567
|
+
str.force_encoding(field.encoding)
|
568
|
+
end
|
569
|
+
(new(str, **options) << row).string
|
541
570
|
end
|
542
|
-
csv = new(str, options) # wrap
|
543
|
-
yield csv # yield for appending
|
544
|
-
csv.string # return final String
|
545
|
-
end
|
546
571
|
|
547
|
-
|
548
|
-
|
549
|
-
|
550
|
-
|
551
|
-
|
552
|
-
|
553
|
-
|
554
|
-
|
555
|
-
|
556
|
-
|
557
|
-
|
558
|
-
|
559
|
-
|
560
|
-
|
561
|
-
|
562
|
-
|
563
|
-
|
564
|
-
|
565
|
-
|
566
|
-
|
567
|
-
|
568
|
-
|
569
|
-
|
572
|
+
#
|
573
|
+
# :call-seq:
|
574
|
+
# open( filename, mode = "rb", **options ) { |faster_csv| ... }
|
575
|
+
# open( filename, **options ) { |faster_csv| ... }
|
576
|
+
# open( filename, mode = "rb", **options )
|
577
|
+
# open( filename, **options )
|
578
|
+
#
|
579
|
+
# This method opens an IO object, and wraps that with CSV. This is intended
|
580
|
+
# as the primary interface for writing a CSV file.
|
581
|
+
#
|
582
|
+
# You must pass a +filename+ and may optionally add a +mode+ for Ruby's
|
583
|
+
# open(). You may also pass an optional Hash containing any +options+
|
584
|
+
# CSV::new() understands as the final argument.
|
585
|
+
#
|
586
|
+
# This method works like Ruby's open() call, in that it will pass a CSV object
|
587
|
+
# to a provided block and close it when the block terminates, or it will
|
588
|
+
# return the CSV object when no block is provided. (*Note*: This is different
|
589
|
+
# from the Ruby 1.8 CSV library which passed rows to the block. Use
|
590
|
+
# CSV::foreach() for that behavior.)
|
591
|
+
#
|
592
|
+
# You must provide a +mode+ with an embedded Encoding designator unless your
|
593
|
+
# data is in Encoding::default_external(). CSV will check the Encoding of the
|
594
|
+
# underlying IO object (set by the +mode+ you pass) to determine how to parse
|
595
|
+
# the data. You may provide a second Encoding to have the data transcoded as
|
596
|
+
# it is read just as you can with a normal call to IO::open(). For example,
|
597
|
+
# <tt>"rb:UTF-32BE:UTF-8"</tt> would read UTF-32BE data from the file but
|
598
|
+
# transcode it to UTF-8 before CSV parses it.
|
599
|
+
#
|
600
|
+
# An opened CSV object will delegate to many IO methods for convenience. You
|
601
|
+
# may call:
|
602
|
+
#
|
603
|
+
# * binmode()
|
604
|
+
# * binmode?()
|
605
|
+
# * close()
|
606
|
+
# * close_read()
|
607
|
+
# * close_write()
|
608
|
+
# * closed?()
|
609
|
+
# * eof()
|
610
|
+
# * eof?()
|
611
|
+
# * external_encoding()
|
612
|
+
# * fcntl()
|
613
|
+
# * fileno()
|
614
|
+
# * flock()
|
615
|
+
# * flush()
|
616
|
+
# * fsync()
|
617
|
+
# * internal_encoding()
|
618
|
+
# * ioctl()
|
619
|
+
# * isatty()
|
620
|
+
# * path()
|
621
|
+
# * pid()
|
622
|
+
# * pos()
|
623
|
+
# * pos=()
|
624
|
+
# * reopen()
|
625
|
+
# * seek()
|
626
|
+
# * stat()
|
627
|
+
# * sync()
|
628
|
+
# * sync=()
|
629
|
+
# * tell()
|
630
|
+
# * to_i()
|
631
|
+
# * to_io()
|
632
|
+
# * truncate()
|
633
|
+
# * tty?()
|
634
|
+
#
|
635
|
+
def open(filename, mode="r", **options)
|
636
|
+
# wrap a File opened with the remaining +args+ with no newline
|
637
|
+
# decorator
|
638
|
+
file_opts = {universal_newline: false}.merge(options)
|
570
639
|
|
571
|
-
|
572
|
-
|
573
|
-
|
574
|
-
|
575
|
-
|
576
|
-
|
577
|
-
|
578
|
-
|
579
|
-
|
580
|
-
|
581
|
-
|
582
|
-
|
583
|
-
|
584
|
-
|
585
|
-
# This method works like Ruby's open() call, in that it will pass a CSV object
|
586
|
-
# to a provided block and close it when the block terminates, or it will
|
587
|
-
# return the CSV object when no block is provided. (*Note*: This is different
|
588
|
-
# from the Ruby 1.8 CSV library which passed rows to the block. Use
|
589
|
-
# CSV::foreach() for that behavior.)
|
590
|
-
#
|
591
|
-
# You must provide a +mode+ with an embedded Encoding designator unless your
|
592
|
-
# data is in Encoding::default_external(). CSV will check the Encoding of the
|
593
|
-
# underlying IO object (set by the +mode+ you pass) to determine how to parse
|
594
|
-
# the data. You may provide a second Encoding to have the data transcoded as
|
595
|
-
# it is read just as you can with a normal call to IO::open(). For example,
|
596
|
-
# <tt>"rb:UTF-32BE:UTF-8"</tt> would read UTF-32BE data from the file but
|
597
|
-
# transcode it to UTF-8 before CSV parses it.
|
598
|
-
#
|
599
|
-
# An opened CSV object will delegate to many IO methods for convenience. You
|
600
|
-
# may call:
|
601
|
-
#
|
602
|
-
# * binmode()
|
603
|
-
# * binmode?()
|
604
|
-
# * close()
|
605
|
-
# * close_read()
|
606
|
-
# * close_write()
|
607
|
-
# * closed?()
|
608
|
-
# * eof()
|
609
|
-
# * eof?()
|
610
|
-
# * external_encoding()
|
611
|
-
# * fcntl()
|
612
|
-
# * fileno()
|
613
|
-
# * flock()
|
614
|
-
# * flush()
|
615
|
-
# * fsync()
|
616
|
-
# * internal_encoding()
|
617
|
-
# * ioctl()
|
618
|
-
# * isatty()
|
619
|
-
# * path()
|
620
|
-
# * pid()
|
621
|
-
# * pos()
|
622
|
-
# * pos=()
|
623
|
-
# * reopen()
|
624
|
-
# * seek()
|
625
|
-
# * stat()
|
626
|
-
# * sync()
|
627
|
-
# * sync=()
|
628
|
-
# * tell()
|
629
|
-
# * to_i()
|
630
|
-
# * to_io()
|
631
|
-
# * truncate()
|
632
|
-
# * tty?()
|
633
|
-
#
|
634
|
-
def self.open(filename, mode="r", **options)
|
635
|
-
# wrap a File opened with the remaining +args+ with no newline
|
636
|
-
# decorator
|
637
|
-
file_opts = {universal_newline: false}.merge(options)
|
640
|
+
begin
|
641
|
+
f = File.open(filename, mode, **file_opts)
|
642
|
+
rescue ArgumentError => e
|
643
|
+
raise unless /needs binmode/.match?(e.message) and mode == "r"
|
644
|
+
mode = "rb"
|
645
|
+
file_opts = {encoding: Encoding.default_external}.merge(file_opts)
|
646
|
+
retry
|
647
|
+
end
|
648
|
+
begin
|
649
|
+
csv = new(f, **options)
|
650
|
+
rescue Exception
|
651
|
+
f.close
|
652
|
+
raise
|
653
|
+
end
|
638
654
|
|
639
|
-
|
640
|
-
|
641
|
-
|
642
|
-
|
643
|
-
|
644
|
-
|
645
|
-
|
646
|
-
|
647
|
-
|
648
|
-
|
649
|
-
rescue Exception
|
650
|
-
f.close
|
651
|
-
raise
|
655
|
+
# handle blocks like Ruby's open(), not like the CSV library
|
656
|
+
if block_given?
|
657
|
+
begin
|
658
|
+
yield csv
|
659
|
+
ensure
|
660
|
+
csv.close
|
661
|
+
end
|
662
|
+
else
|
663
|
+
csv
|
664
|
+
end
|
652
665
|
end
|
653
666
|
|
654
|
-
#
|
655
|
-
|
667
|
+
#
|
668
|
+
# :call-seq:
|
669
|
+
# parse( str, **options ) { |row| ... }
|
670
|
+
# parse( str, **options )
|
671
|
+
#
|
672
|
+
# This method can be used to easily parse CSV out of a String. You may either
|
673
|
+
# provide a +block+ which will be called with each row of the String in turn,
|
674
|
+
# or just use the returned Array of Arrays (when no +block+ is given).
|
675
|
+
#
|
676
|
+
# You pass your +str+ to read from, and an optional +options+ containing
|
677
|
+
# anything CSV::new() understands.
|
678
|
+
#
|
679
|
+
def parse(str, **options, &block)
|
680
|
+
csv = new(str, **options)
|
681
|
+
|
682
|
+
return csv.each(&block) if block_given?
|
683
|
+
|
684
|
+
# slurp contents, if no block is given
|
656
685
|
begin
|
657
|
-
|
686
|
+
csv.read
|
658
687
|
ensure
|
659
688
|
csv.close
|
660
689
|
end
|
661
|
-
else
|
662
|
-
csv
|
663
690
|
end
|
664
|
-
end
|
665
|
-
|
666
|
-
#
|
667
|
-
# :call-seq:
|
668
|
-
# parse( str, **options ) { |row| ... }
|
669
|
-
# parse( str, **options )
|
670
|
-
#
|
671
|
-
# This method can be used to easily parse CSV out of a String. You may either
|
672
|
-
# provide a +block+ which will be called with each row of the String in turn,
|
673
|
-
# or just use the returned Array of Arrays (when no +block+ is given).
|
674
|
-
#
|
675
|
-
# You pass your +str+ to read from, and an optional +options+ containing
|
676
|
-
# anything CSV::new() understands.
|
677
|
-
#
|
678
|
-
def self.parse(*args, &block)
|
679
|
-
csv = new(*args)
|
680
|
-
|
681
|
-
return csv.each(&block) if block_given?
|
682
691
|
|
683
|
-
#
|
684
|
-
|
685
|
-
|
686
|
-
|
687
|
-
|
692
|
+
#
|
693
|
+
# This method is a shortcut for converting a single line of a CSV String into
|
694
|
+
# an Array. Note that if +line+ contains multiple rows, anything beyond the
|
695
|
+
# first row is ignored.
|
696
|
+
#
|
697
|
+
# The +options+ parameter can be anything CSV::new() understands.
|
698
|
+
#
|
699
|
+
def parse_line(line, **options)
|
700
|
+
new(line, **options).shift
|
688
701
|
end
|
689
|
-
end
|
690
702
|
|
691
|
-
|
692
|
-
|
693
|
-
|
694
|
-
|
695
|
-
|
696
|
-
|
697
|
-
|
698
|
-
|
699
|
-
|
700
|
-
|
701
|
-
|
702
|
-
|
703
|
-
|
704
|
-
|
705
|
-
# an additional <tt>:encoding</tt> parameter that you can use to specify the
|
706
|
-
# Encoding of the data in the file to be read. You must provide this unless
|
707
|
-
# your data is in Encoding::default_external(). CSV will use this to determine
|
708
|
-
# how to parse the data. You may provide a second Encoding to have the data
|
709
|
-
# transcoded as it is read. For example,
|
710
|
-
# <tt>encoding: "UTF-32BE:UTF-8"</tt> would read UTF-32BE data from the file
|
711
|
-
# but transcode it to UTF-8 before CSV parses it.
|
712
|
-
#
|
713
|
-
def self.read(path, *options)
|
714
|
-
open(path, *options) { |csv| csv.read }
|
715
|
-
end
|
703
|
+
#
|
704
|
+
# Use to slurp a CSV file into an Array of Arrays. Pass the +path+ to the
|
705
|
+
# file and any +options+ CSV::new() understands. This method also understands
|
706
|
+
# an additional <tt>:encoding</tt> parameter that you can use to specify the
|
707
|
+
# Encoding of the data in the file to be read. You must provide this unless
|
708
|
+
# your data is in Encoding::default_external(). CSV will use this to determine
|
709
|
+
# how to parse the data. You may provide a second Encoding to have the data
|
710
|
+
# transcoded as it is read. For example,
|
711
|
+
# <tt>encoding: "UTF-32BE:UTF-8"</tt> would read UTF-32BE data from the file
|
712
|
+
# but transcode it to UTF-8 before CSV parses it.
|
713
|
+
#
|
714
|
+
def read(path, **options)
|
715
|
+
open(path, **options) { |csv| csv.read }
|
716
|
+
end
|
716
717
|
|
717
|
-
|
718
|
-
|
719
|
-
|
720
|
-
|
718
|
+
# Alias for CSV::read().
|
719
|
+
def readlines(path, **options)
|
720
|
+
read(path, **options)
|
721
|
+
end
|
721
722
|
|
722
|
-
|
723
|
-
|
724
|
-
|
725
|
-
|
726
|
-
|
727
|
-
|
728
|
-
|
729
|
-
|
730
|
-
|
731
|
-
|
732
|
-
|
723
|
+
#
|
724
|
+
# A shortcut for:
|
725
|
+
#
|
726
|
+
# CSV.read( path, { headers: true,
|
727
|
+
# converters: :numeric,
|
728
|
+
# header_converters: :symbol }.merge(options) )
|
729
|
+
#
|
730
|
+
def table(path, **options)
|
731
|
+
default_options = {
|
732
|
+
headers: true,
|
733
|
+
converters: :numeric,
|
734
|
+
header_converters: :symbol,
|
735
|
+
}
|
736
|
+
options = default_options.merge(options)
|
737
|
+
read(path, **options)
|
738
|
+
end
|
733
739
|
end
|
734
740
|
|
735
741
|
#
|
736
742
|
# This constructor will wrap either a String or IO object passed in +data+ for
|
737
|
-
# reading and/or writing.
|
738
|
-
# methods are delegated.
|
743
|
+
# reading and/or writing. In addition to the CSV instance methods, several IO
|
744
|
+
# methods are delegated. (See CSV::open() for a complete list.) If you pass
|
739
745
|
# a String for +data+, you can later retrieve it (after writing to it, for
|
740
746
|
# example) with CSV.string().
|
741
747
|
#
|
742
748
|
# Note that a wrapped String will be positioned at the beginning (for
|
743
|
-
# reading).
|
749
|
+
# reading). If you want it at the end (for writing), use CSV::generate().
|
744
750
|
# If you want any other positioning, pass a preset StringIO object instead.
|
745
751
|
#
|
746
752
|
# You may set any reading and/or writing preferences in the +options+ Hash.
|
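The class-level helpers touched in this hunk (`parse_line`, `read`, `readlines`, `table`) can be exercised with a short sketch. The file name and sample data below are made up for illustration; only the method names and default options come from the diff above.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data, written to a temporary file for the demo.
file = Tempfile.new(["inventory", ".csv"])
file.write("name,qty\napple,3\npear,5\n")
file.close

# parse_line returns only the first row; anything after it is ignored.
first_row = CSV.parse_line("a,b,c\nd,e,f\n")

# read (and its alias readlines) slurp the file into an Array of Arrays.
rows = CSV.read(file.path)
same_rows = CSV.readlines(file.path)

# table applies headers: true, converters: :numeric and
# header_converters: :symbol unless the caller overrides them.
table = CSV.table(file.path)
quantities = table[:qty]

file.unlink
```

Because `table` merges the caller's options over its defaults, passing something like `header_converters: nil` should override the symbol conversion rather than raise.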
@@ -750,25 +756,25 @@ class CSV
|
|
750
756
|
# This String will be transcoded into
|
751
757
|
# the data's Encoding before parsing.
|
752
758
|
# <b><tt>:row_sep</tt></b>:: The String appended to the end of each
|
753
|
-
# row.
|
759
|
+
# row. This can be set to the special
|
754
760
|
# <tt>:auto</tt> setting, which requests
|
755
761
|
# that CSV automatically discover this
|
756
|
-
# from the data.
|
762
|
+
# from the data. Auto-discovery reads
|
757
763
|
# ahead in the data looking for the next
|
758
764
|
# <tt>"\r\n"</tt>, <tt>"\n"</tt>, or
|
759
|
-
# <tt>"\r"</tt> sequence.
|
765
|
+
# <tt>"\r"</tt> sequence. A sequence
|
760
766
|
# will be selected even if it occurs in
|
761
767
|
# a quoted field, assuming that you
|
762
768
|
# would have the same line endings
|
763
|
-
# there.
|
769
|
+
# there. If none of those sequences is
|
764
770
|
# found, +data+ is <tt>ARGF</tt>,
|
765
771
|
# <tt>STDIN</tt>, <tt>STDOUT</tt>, or
|
766
772
|
# <tt>STDERR</tt>, or the stream is only
|
767
773
|
# available for output, the default
|
768
774
|
# <tt>$INPUT_RECORD_SEPARATOR</tt>
|
769
|
-
# (<tt>$/</tt>) is used.
|
770
|
-
# discovery takes a little time.
|
771
|
-
# manually if speed is important.
|
775
|
+
# (<tt>$/</tt>) is used. Obviously,
|
776
|
+
# discovery takes a little time. Set
|
777
|
+
# manually if speed is important. Also
|
772
778
|
# note that IO objects should be opened
|
773
779
|
# in binary mode on Windows if this
|
774
780
|
# feature will be used as the
|
@@ -780,7 +786,7 @@ class CSV
|
|
780
786
|
# before parsing.
|
781
787
|
# <b><tt>:quote_char</tt></b>:: The character used to quote fields.
|
782
788
|
# This has to be a single character
|
783
|
-
# String.
|
789
|
+
# String. This is useful for
|
784
790
|
# applications that incorrectly use
|
785
791
|
# <tt>'</tt> as the quote character
|
786
792
|
# instead of the correct <tt>"</tt>.
|
@@ -791,21 +797,21 @@ class CSV
|
|
791
797
|
# before parsing.
|
792
798
|
# <b><tt>:field_size_limit</tt></b>:: This is a maximum size CSV will read
|
793
799
|
# ahead looking for the closing quote
|
794
|
-
# for a field.
|
800
|
+
# for a field. (In truth, it reads to
|
795
801
|
# the first line ending beyond this
|
796
|
-
# size.)
|
802
|
+
# size.) If a quote cannot be found
|
797
803
|
# within the limit CSV will raise a
|
798
804
|
# MalformedCSVError, assuming the data
|
799
|
-
# is faulty.
|
805
|
+
# is faulty. You can use this limit to
|
800
806
|
# prevent what are effectively DoS
|
801
|
-
# attacks on the parser.
|
807
|
+
# attacks on the parser. However, this
|
802
808
|
# limit can cause a legitimate parse to
|
803
809
|
# fail and thus is set to +nil+, or off,
|
804
810
|
# by default.
|
805
811
|
# <b><tt>:converters</tt></b>:: An Array of names from the Converters
|
806
812
|
# Hash and/or lambdas that handle custom
|
807
|
-
# conversion.
|
808
|
-
# doesn't have to be in an Array.
|
813
|
+
# conversion. A single converter
|
814
|
+
# doesn't have to be in an Array. All
|
809
815
|
# built-in converters try to transcode
|
810
816
|
# fields to UTF-8 before converting.
|
811
817
|
# The conversion will fail if the data
|
@@ -815,7 +821,7 @@ class CSV
|
|
815
821
|
# unconverted_fields() method will be
|
816
822
|
# added to all returned rows (Array or
|
817
823
|
# CSV::Row) that will return the fields
|
818
|
-
# as they were before conversion.
|
824
|
+
# as they were before conversion. Note
|
819
825
|
# that <tt>:headers</tt> supplied by
|
820
826
|
# Array or String were not fields of the
|
821
827
|
# document and thus will have an empty
|
@@ -823,21 +829,21 @@ class CSV
|
|
823
829
|
# <b><tt>:headers</tt></b>:: If set to <tt>:first_row</tt> or
|
824
830
|
# +true+, the initial row of the CSV
|
825
831
|
# file will be treated as a row of
|
826
|
-
# headers.
|
832
|
+
# headers. If set to an Array, the
|
827
833
|
# contents will be used as the headers.
|
828
834
|
# If set to a String, the String is run
|
829
835
|
# through a call of CSV::parse_line()
|
830
836
|
# with the same <tt>:col_sep</tt>,
|
831
837
|
# <tt>:row_sep</tt>, and
|
832
838
|
# <tt>:quote_char</tt> as this instance
|
833
|
-
# to produce an Array of headers.
|
839
|
+
# to produce an Array of headers. This
|
834
840
|
# setting causes CSV#shift() to return
|
835
841
|
# rows as CSV::Row objects instead of
|
836
842
|
# Arrays and CSV#read() to return
|
837
843
|
# CSV::Table objects instead of an Array
|
838
844
|
# of Arrays.
|
839
845
|
# <b><tt>:return_headers</tt></b>:: When +false+, header rows are silently
|
840
|
-
# swallowed.
|
846
|
+
# swallowed. If set to +true+, header
|
841
847
|
# rows are returned in a CSV::Row object
|
842
848
|
# with identical headers and
|
843
849
|
# fields (save that the fields do not go
|
@@ -848,12 +854,12 @@ class CSV
|
|
848
854
|
# <b><tt>:header_converters</tt></b>:: Identical in functionality to
|
849
855
|
# <tt>:converters</tt> save that the
|
850
856
|
# conversions are only made to header
|
851
|
-
# rows.
|
857
|
+
# rows. All built-in converters try to
|
852
858
|
# transcode headers to UTF-8 before
|
853
|
-
# converting.
|
859
|
+
# converting. The conversion will fail
|
854
860
|
# if the data cannot be transcoded,
|
855
861
|
# leaving the header unchanged.
|
856
|
-
# <b><tt>:skip_blanks</tt></b>:: When
|
862
|
+
# <b><tt>:skip_blanks</tt></b>:: When setting a +true+ value, CSV will
|
857
863
|
# skip over any empty rows. Note that
|
858
864
|
# this setting will not skip rows that
|
859
865
|
# contain column separators, even if
|
@@ -863,9 +869,9 @@ class CSV
|
|
863
869
|
# using <tt>:skip_lines</tt>, or
|
864
870
|
# inspecting fields.compact.empty? on
|
865
871
|
# each row.
|
866
|
-
# <b><tt>:force_quotes</tt></b>:: When
|
872
|
+
# <b><tt>:force_quotes</tt></b>:: When setting a +true+ value, CSV will
|
867
873
|
# quote all CSV fields it creates.
|
868
|
-
# <b><tt>:skip_lines</tt></b>:: When
|
874
|
+
# <b><tt>:skip_lines</tt></b>:: When setting an object responding to
|
869
875
|
# <tt>match</tt>, every line matching
|
870
876
|
# it is considered a comment and ignored
|
871
877
|
# during parsing. When set to a String,
|
@@ -874,17 +880,17 @@ class CSV
|
|
874
880
|
# a comment. If the passed object does
|
875
881
|
# not respond to <tt>match</tt>,
|
876
882
|
# <tt>ArgumentError</tt> is thrown.
|
877
|
-
# <b><tt>:liberal_parsing</tt></b>:: When
|
883
|
+
# <b><tt>:liberal_parsing</tt></b>:: When setting a +true+ value, CSV will
|
878
884
|
# attempt to parse input not conformant
|
879
885
|
# with RFC 4180, such as double quotes
|
880
886
|
# in unquoted fields.
|
881
887
|
# <b><tt>:nil_value</tt></b>:: When set to an object, any value of an
|
882
|
-
# empty field
|
888
|
+
# empty field is replaced by the set
|
883
889
|
# object, not nil.
|
884
|
-
# <b><tt>:empty_value</tt></b>:: When
|
890
|
+
# <b><tt>:empty_value</tt></b>:: When setting an object, any values of a
|
885
891
|
# blank string field is replaced by
|
886
892
|
# the set object.
|
887
|
-
# <b><tt>:quote_empty</tt></b>:: When
|
893
|
+
# <b><tt>:quote_empty</tt></b>:: When setting a +true+ value, CSV will
|
888
894
|
# quote empty values with double quotes.
|
889
895
|
# When +false+, CSV will emit an
|
890
896
|
# empty string for an empty field value.
|
@@ -901,11 +907,11 @@ class CSV
|
|
901
907
|
# <b><tt>:write_empty_value</tt></b>:: When a <tt>String</tt> or +nil+ value,
|
902
908
|
# empty value(s) on each line will be
|
903
909
|
# replaced with the specified value.
|
904
|
-
# <b><tt>:strip</tt></b>:: When
|
910
|
+
# <b><tt>:strip</tt></b>:: When setting a +true+ value, CSV will
|
905
911
|
# strip "\t\r\n\f\v" around the values.
|
906
912
|
# If you specify a string instead of
|
907
913
|
# +true+, CSV will strip string. The
|
908
|
-
# length of string must be 1.
|
914
|
+
# length of the string must be 1.
|
909
915
|
#
|
910
916
|
# See CSV::DEFAULT_OPTIONS for the default settings.
|
911
917
|
#
|
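A few of the options documented above can be tried in isolation. This is a minimal sketch with invented input strings, not an exhaustive tour of the option list.

```ruby
require "csv"

# quote_char: accept data that quotes fields with ' instead of ".
single_quoted = CSV.parse_line("'a,b',c", quote_char: "'")

# skip_blanks: drop completely empty rows.
without_blanks = CSV.parse("a,b\n\nc,d\n", skip_blanks: true)

# skip_lines: treat every line matching the pattern as a comment.
without_comments = CSV.parse("# note\na,b\n", skip_lines: /\A#/)

# strip: remove surrounding whitespace from each value.
stripped = CSV.parse_line(" a , b ", strip: true)

# headers + converters: rows become CSV::Row objects inside a CSV::Table,
# and fields that look like integers are converted.
table = CSV.parse("name,qty\napple,3\n", headers: true, converters: :integer)
first_qty = table[0]["qty"]
```

Each option is passed on its own here so that its effect is visible; in practice they can be combined in a single call.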
@@ -939,8 +945,12 @@ class CSV
|
|
939
945
|
strip: false)
|
940
946
|
raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
|
941
947
|
|
942
|
-
|
943
|
-
|
948
|
+
if data.is_a?(String)
|
949
|
+
@io = StringIO.new(data)
|
950
|
+
@io.set_encoding(encoding || data.encoding)
|
951
|
+
else
|
952
|
+
@io = data
|
953
|
+
end
|
944
954
|
@encoding = determine_encoding(encoding, internal_encoding)
|
945
955
|
|
946
956
|
@base_fields_converter_options = {
|
@@ -992,41 +1002,47 @@ class CSV
|
|
992
1002
|
end
|
993
1003
|
|
994
1004
|
#
|
995
|
-
# The encoded <tt>:col_sep</tt> used in parsing and writing.
|
996
|
-
# for details.
|
1005
|
+
# The encoded <tt>:col_sep</tt> used in parsing and writing.
|
1006
|
+
# See CSV::new for details.
|
997
1007
|
#
|
998
1008
|
def col_sep
|
999
1009
|
parser.column_separator
|
1000
1010
|
end
|
1001
1011
|
|
1002
1012
|
#
|
1003
|
-
# The encoded <tt>:row_sep</tt> used in parsing and writing.
|
1004
|
-
# for details.
|
1013
|
+
# The encoded <tt>:row_sep</tt> used in parsing and writing.
|
1014
|
+
# See CSV::new for details.
|
1005
1015
|
#
|
1006
1016
|
def row_sep
|
1007
1017
|
parser.row_separator
|
1008
1018
|
end
|
1009
1019
|
|
1010
1020
|
#
|
1011
|
-
# The encoded <tt>:quote_char</tt> used in parsing and writing.
|
1012
|
-
# for details.
|
1021
|
+
# The encoded <tt>:quote_char</tt> used in parsing and writing.
|
1022
|
+
# See CSV::new for details.
|
1013
1023
|
#
|
1014
1024
|
def quote_char
|
1015
1025
|
parser.quote_character
|
1016
1026
|
end
|
1017
1027
|
|
1018
|
-
#
|
1028
|
+
#
|
1029
|
+
# The limit for field size, if any.
|
1030
|
+
# See CSV::new for details.
|
1031
|
+
#
|
1019
1032
|
def field_size_limit
|
1020
1033
|
parser.field_size_limit
|
1021
1034
|
end
|
1022
1035
|
|
1023
|
-
#
|
1036
|
+
#
|
1037
|
+
# The regex marking a line as a comment.
|
1038
|
+
# See CSV::new for details.
|
1039
|
+
#
|
1024
1040
|
def skip_lines
|
1025
1041
|
parser.skip_lines
|
1026
1042
|
end
|
1027
1043
|
|
1028
1044
|
#
|
1029
|
-
# Returns the current list of converters in effect.
|
1045
|
+
# Returns the current list of converters in effect. See CSV::new for details.
|
1030
1046
|
# Built-in converters will be returned by name, while others will be returned
|
1031
1047
|
# as is.
|
1032
1048
|
#
|
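The reader methods documented in this hunk simply report the settings the instance was built with. A small sketch, assuming explicit options are passed so that nothing depends on auto-discovery:

```ruby
require "csv"

# Sample data and option values are chosen for illustration only.
csv = CSV.new("a;b\n1;2\n",
              col_sep: ";",
              row_sep: "\n",
              skip_lines: /\A#/,
              converters: :integer)

separator       = csv.col_sep           # the column separator in use
row_ending      = csv.row_sep           # the row separator in use
quote           = csv.quote_char        # defaults to a double quote
size_limit      = csv.field_size_limit  # nil unless a limit was set
comment_pattern = csv.skip_lines        # the Regexp marking comment lines
converter_names = csv.converters        # built-in converters come back by name
```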
@@ -1036,9 +1052,10 @@ class CSV
|
|
1036
1052
|
name ? name.first : converter
|
1037
1053
|
end
|
1038
1054
|
end
|
1055
|
+
|
1039
1056
|
#
|
1040
|
-
# Returns +true+ if unconverted_fields() to parsed results.
|
1041
|
-
# for details.
|
1057
|
+
# Returns +true+ if unconverted_fields() to parsed results.
|
1058
|
+
# See CSV::new for details.
|
1042
1059
|
#
|
1043
1060
|
def unconverted_fields?
|
1044
1061
|
parser.unconverted_fields?
|
@@ -1046,8 +1063,8 @@ class CSV
|
|
1046
1063
|
|
1047
1064
|
#
|
1048
1065
|
# Returns +nil+ if headers will not be used, +true+ if they will but have not
|
1049
|
-
# yet been read, or the actual headers after they have been read.
|
1050
|
-
# CSV::new for details.
|
1066
|
+
# yet been read, or the actual headers after they have been read.
|
1067
|
+
# See CSV::new for details.
|
1051
1068
|
#
|
1052
1069
|
def headers
|
1053
1070
|
if @writer
|
@@ -1068,14 +1085,17 @@ class CSV
|
|
1068
1085
|
parser.return_headers?
|
1069
1086
|
end
|
1070
1087
|
|
1071
|
-
#
|
1088
|
+
#
|
1089
|
+
# Returns +true+ if headers are written in output.
|
1090
|
+
# See CSV::new for details.
|
1091
|
+
#
|
1072
1092
|
def write_headers?
|
1073
1093
|
@writer_options[:write_headers]
|
1074
1094
|
end
|
1075
1095
|
|
1076
1096
|
#
|
1077
|
-
# Returns the current list of converters in effect for headers.
|
1078
|
-
# for details.
|
1097
|
+
# Returns the current list of converters in effect for headers. See CSV::new
|
1098
|
+
# for details. Built-in converters will be returned by name, while others
|
1079
1099
|
# will be returned as is.
|
1080
1100
|
#
|
1081
1101
|
def header_converters
|
@@ -1201,7 +1221,7 @@ class CSV
|
|
1201
1221
|
|
1202
1222
|
#
|
1203
1223
|
# The primary write method for wrapped Strings and IOs, +row+ (an Array or
|
1204
|
-
# CSV::Row) is converted to CSV and appended to the data source.
|
1224
|
+
# CSV::Row) is converted to CSV and appended to the data source. When a
|
1205
1225
|
# CSV::Row is passed, only the row's fields() are appended to the output.
|
1206
1226
|
#
|
1207
1227
|
# The data source must be open for writing.
|
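The write behavior described above, where a CSV::Row contributes only its fields, can be sketched as follows. `CSV.generate` is used so the output String is positioned for writing; the row contents are invented.

```ruby
require "csv"

output = CSV.generate do |csv|
  # A plain Array is converted to one CSV line.
  csv << ["apple", 3]

  # For a CSV::Row, only the fields are appended; the headers
  # ("name" and "qty") are not written to the output.
  csv << CSV::Row.new(%w[name qty], ["pear", 5])
end
```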
@@ -1223,9 +1243,9 @@ class CSV
|
|
1223
1243
|
# block that handles a custom conversion.
|
1224
1244
|
#
|
1225
1245
|
# If you provide a block that takes one argument, it will be passed the field
|
1226
|
-
# and is expected to return the converted value or the field itself.
|
1246
|
+
# and is expected to return the converted value or the field itself. If your
|
1227
1247
|
# block takes two arguments, it will also be passed a CSV::FieldInfo Struct,
|
1228
|
-
# containing details about the field.
|
1248
|
+
# containing details about the field. Again, the block should return a
|
1229
1249
|
# converted field or the field itself.
|
1230
1250
|
#
|
1231
1251
|
def convert(name = nil, &converter)
|
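The two block forms accepted by `convert` can be illustrated with a short sketch. The data and the conversion rules are assumptions made for the example; the one-argument and two-argument behavior is what the comment above describes.

```ruby
require "csv"

csv = CSV.new("name,qty\napple,3\n", headers: true)

# One-argument block: receives only the field. Returning a non-String
# (here an Integer) stops further conversion of that field.
csv.convert { |field| field.match?(/\A\d+\z/) ? field.to_i : field }

# Two-argument block: also receives a CSV::FieldInfo Struct, whose
# header member is used here to limit the conversion to one column.
csv.convert { |field, info| info.header == "name" ? field.upcase : field }

row = csv.shift
```

The order of the `convert` calls matters: converters run in the order they were added, and a field that has already become an Integer is not passed to the later block.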
@@ -1377,9 +1397,9 @@ class CSV
|
|
1377
1397
|
|
1378
1398
|
#
|
1379
1399
|
# Processes +fields+ with <tt>@converters</tt>, or <tt>@header_converters</tt>
|
1380
|
-
# if +headers+ is passed as +true+, returning the converted field set.
|
1400
|
+
# if +headers+ is passed as +true+, returning the converted field set. Any
|
1381
1401
|
# converter that changes the field into something other than a String halts
|
1382
|
-
# the pipeline of conversion for that field.
|
1402
|
+
# the pipeline of conversion for that field. This is primarily an efficiency
|
1383
1403
|
# shortcut.
|
1384
1404
|
#
|
1385
1405
|
def convert_fields(fields, headers = false)
|