fastercsv 0.2.1 → 1.0.0
- data/CHANGELOG +15 -3
- data/README +2 -2
- data/TODO +1 -26
- data/examples/csv_filter.rb +8 -0
- data/examples/csv_reading.rb +57 -0
- data/examples/csv_table.rb +56 -0
- data/examples/csv_writing.rb +67 -0
- data/examples/shortcut_interface.rb +4 -0
- data/lib/faster_csv.rb +432 -74
- data/test/tc_features.rb +41 -13
- data/test/tc_headers.rb +35 -26
- data/test/tc_row.rb +3 -0
- data/test/tc_table.rb +385 -0
- data/test/ts_all.rb +1 -0
- metadata +6 -2
data/CHANGELOG
CHANGED
@@ -2,6 +2,18 @@
|
|
2
2
|
|
3
3
|
Below is a complete listing of changes for each revision of FasterCSV.
|
4
4
|
|
5
|
+
== 1.0.0
|
6
|
+
|
7
|
+
* Fixed FasterCSV.rewind() to reset the FasterCSV.lineno() counter.
|
8
|
+
* Fixed FasterCSV.rewind() to reset the header processing.
|
9
|
+
* Fixed documentation typos.
|
10
|
+
* Switched STDOUT and STDERR usage to $stdout and $stderr where appropriate.
|
11
|
+
* Added FasterCSV::Row.==().
|
12
|
+
* Enhanced FasterCSV::Row.fields() to support Ranges, even for headers.
|
13
|
+
* The slurping methods now return the new FasterCSV::Table objects.
|
14
|
+
* Fixed parser so multibyte <tt>:col_sep</tt> works now.
|
15
|
+
* Added a few examples for usage.
|
16
|
+
|
5
17
|
== 0.2.1
|
6
18
|
|
7
19
|
* Removed autorequire from GemSpec.
|
@@ -12,9 +24,9 @@ Below is a complete listing of changes for each revision of FasterCSV.
|
|
12
24
|
|
13
25
|
* Added VERSION constant.
|
14
26
|
* Significantly improved test speed.
|
15
|
-
* Worked around Date::parse bug so tests will pass on Windows.
|
27
|
+
* Worked around Date::parse() bug so tests will pass on Windows.
|
16
28
|
* Documented test procedure.
|
17
|
-
* Made FasterCSV
|
29
|
+
* Made FasterCSV.lineno() CSV aware.
|
18
30
|
* Added line numbers to MalformedCSVError messages.
|
19
31
|
* <tt>:headers</tt> can now be set to an Array of headers to use.
|
20
32
|
* <tt>:headers</tt> can now be set to an external CSV String of headers to use.
|
@@ -25,7 +37,7 @@ Below is a complete listing of changes for each revision of FasterCSV.
|
|
25
37
|
* Added header information to FieldInfo Struct for conversions by header.
|
26
38
|
* Added an alias to support <tt>require "fastercsv"</tt>.
|
27
39
|
* Added FCSV alias for FasterCSV.
|
28
|
-
* Added FasterCSV::instance and FasterCSV()/FCSV() shortcuts for easy output.
|
40
|
+
* Added FasterCSV::instance() and FasterCSV()/FCSV() shortcuts for easy output.
|
29
41
|
|
30
42
|
== 0.1.9
|
31
43
|
|
data/README
CHANGED
@@ -9,8 +9,8 @@ Welcome to FasterCSV.
|
|
9
9
|
FasterCSV is intended as a replacement to Ruby's standard CSV library. It was designed to address concerns users of that library had and it has three primary goals:
|
10
10
|
|
11
11
|
1. Be significantly faster than CSV while remaining a pure Ruby library.
|
12
|
-
2. Use a smaller and easier to maintain code base. (
|
13
|
-
but
|
12
|
+
2. Use a smaller and easier to maintain code base. (FasterCSV is larger now,
|
13
|
+
but considerably richer in features. The parsing core remains quite small.)
|
14
14
|
3. Improve on the CSV interface.
|
15
15
|
|
16
16
|
Obviously, the last one is subjective. If you love CSV's interface, odds are
|
data/TODO
CHANGED
@@ -3,29 +3,4 @@
|
|
3
3
|
The following is a list of planned expansions for FasterCSV, in no particular
|
4
4
|
order.
|
5
5
|
|
6
|
-
*
|
7
|
-
"Experiment ID: 1",,,,,,,,,,,,
|
8
|
-
"Subject ID: 1013938829432171e868c340.
|
9
|
-
Trial,stimulus,time,type,field1,field2,text_response,Abs. time of
|
10
|
-
response,,,,,
|
11
|
-
26,undefined,14828,KEY,RETURN,UNUSED,DCS,Sat Oct 15 17:48:04 GMT-0400
|
12
|
-
2005,,,,,
|
13
|
-
23,undefined,15078,KEY,RETURN,UNUSED,244,Sat Oct 15 17:48:19 GMT-0400
|
14
|
-
2005,,,,,
|
15
|
-
7,nixontrialleft copy.pct [TAG: 1],5953,KEY,1,UNUSED,,Sat Oct 15
|
16
|
-
17:49:24 GMT-0400 2005,,,,,
|
17
|
-
8,nixontrialfront copy.pct [TAG: 3],6250,KEY,3,UNUSED,,Sat Oct 15
|
18
|
-
17:49:31 GMT-0400 2005,,,,,
|
19
|
-
9,nixontrialright copy.pct [TAG: 2],2469,KEY,2,UNUSED,,Sat Oct 15
|
20
|
-
17:49:34 GMT-0400 2005,,,,,
|
21
|
-
#####
|
22
|
-
more data
|
23
|
-
######
|
24
|
-
,,,,,,,,,,4374.347222,,
|
25
|
-
,,,,,,,,,,,,1.00
|
26
|
-
,,,,,,,,,,,,0.93
|
27
|
-
### and a new block starts
|
28
|
-
"Experiment ID: 3",,,,,,,,,,,,0.92
|
29
|
-
....
|
30
|
-
* Add calculated fields.
|
31
|
-
* Examples, examples, examples...
|
6
|
+
* Rent this space!
|
data/examples/csv_filter.rb
CHANGED
@@ -1,5 +1,10 @@
|
|
1
1
|
#!/usr/local/bin/ruby -w
|
2
2
|
|
3
|
+
# = csv_filter.rb -- Faster CSV Reading and Writing
|
4
|
+
#
|
5
|
+
# Created by James Edward Gray II on 2006-04-01.
|
6
|
+
# Copyright 2006 Gray Productions. All rights reserved.
|
7
|
+
|
3
8
|
require "faster_csv"
|
4
9
|
|
5
10
|
running_total = 0
|
@@ -13,3 +18,6 @@ FasterCSV.filter( :headers => true,
|
|
13
18
|
row << (running_total += row[:quantity] * row[:price])
|
14
19
|
end
|
15
20
|
end
|
21
|
+
# >> Quantity,Product Description,Price,Running Total
|
22
|
+
# >> 1,Text Editor,25.0,25.0
|
23
|
+
# >> 2,MacBook Pros,2499.0,5023.0
|
data/examples/csv_reading.rb
ADDED
@@ -0,0 +1,57 @@
|
|
1
|
+
#!/usr/local/bin/ruby -w
|
2
|
+
|
3
|
+
# csv_reading.rb
|
4
|
+
#
|
5
|
+
# Created by James Edward Gray II on 2006-11-05.
|
6
|
+
# Copyright 2006 Gray Productions. All rights reserved.
|
7
|
+
|
8
|
+
require "faster_csv"
|
9
|
+
|
10
|
+
CSV_FILE_PATH = File.join(File.dirname(__FILE__), "purchase.csv")
|
11
|
+
CSV_STR = <<END_CSV
|
12
|
+
first,last
|
13
|
+
James,Gray
|
14
|
+
Dana,Gray
|
15
|
+
END_CSV
|
16
|
+
|
17
|
+
# read a file line by line
|
18
|
+
FasterCSV.foreach(CSV_FILE_PATH) do |line|
|
19
|
+
puts line[1]
|
20
|
+
end
|
21
|
+
# >> Product Description
|
22
|
+
# >> Text Editor
|
23
|
+
# >> MacBook Pros
|
24
|
+
|
25
|
+
# slurp file data
|
26
|
+
data = FasterCSV.read(CSV_FILE_PATH)
|
27
|
+
puts data.flatten.grep(/\A\d+\.\d+\Z/)
|
28
|
+
# >> 25.00
|
29
|
+
# >> 2499.00
|
30
|
+
|
31
|
+
# read a string line by line
|
32
|
+
FasterCSV.parse(CSV_STR) do |line|
|
33
|
+
puts line[0]
|
34
|
+
end
|
35
|
+
# >> first
|
36
|
+
# >> James
|
37
|
+
# >> Dana
|
38
|
+
|
39
|
+
# slurp string data
|
40
|
+
data = FasterCSV.parse(CSV_STR)
|
41
|
+
puts data[1..-1].map { |line| "#{line[0][0, 1].downcase}.#{line[1].downcase}" }
|
42
|
+
# >> j.gray
|
43
|
+
# >> d.gray
|
44
|
+
|
45
|
+
# adding options to make data manipulation easy
|
46
|
+
total = 0
|
47
|
+
FasterCSV.foreach( CSV_FILE_PATH, :headers => true,
|
48
|
+
:header_converters => :symbol,
|
49
|
+
:converters => :numeric ) do |line|
|
50
|
+
line_total = line[:quantity] * line[:price]
|
51
|
+
total += line_total
|
52
|
+
puts "%s: %.2f" % [line[:product_description], line_total]
|
53
|
+
end
|
54
|
+
puts "Total: %.2f" % total
|
55
|
+
# >> Text Editor: 25.00
|
56
|
+
# >> MacBook Pros: 4998.00
|
57
|
+
# >> Total: 5023.00
|
data/examples/csv_table.rb
ADDED
@@ -0,0 +1,56 @@
|
|
1
|
+
#!/usr/local/bin/ruby -w
|
2
|
+
|
3
|
+
# csv_table.rb
|
4
|
+
#
|
5
|
+
# Created by James Edward Gray II on 2006-11-04.
|
6
|
+
# Copyright 2006 Gray Productions. All rights reserved.
|
7
|
+
#
|
8
|
+
# Feature implementation and example code by Ara.T.Howard.
|
9
|
+
|
10
|
+
require "faster_csv"
|
11
|
+
|
12
|
+
table = FCSV.parse(DATA, :headers => true, :header_converters => :symbol)
|
13
|
+
|
14
|
+
# row access
|
15
|
+
table[0].class # => FasterCSV::Row
|
16
|
+
table[0].fields # => ["zaphod", "beeblebrox", "42"]
|
17
|
+
|
18
|
+
# column access
|
19
|
+
table[:first_name] # => ["zaphod", "ara"]
|
20
|
+
|
21
|
+
# cell access
|
22
|
+
table[1][0] # => "ara"
|
23
|
+
table[1][:first_name] # => "ara"
|
24
|
+
table[:first_name][1] # => "ara"
|
25
|
+
|
26
|
+
# manipulation
|
27
|
+
table << %w[james gray 30]
|
28
|
+
table[-1].fields # => ["james", "gray", "30"]
|
29
|
+
|
30
|
+
table[:type] = "name"
|
31
|
+
table[:type] # => ["name", "name", "name"]
|
32
|
+
|
33
|
+
table[:ssn] = %w[123-456-7890 098-765-4321]
|
34
|
+
table[:ssn] # => ["123-456-7890", "098-765-4321", nil]
|
35
|
+
|
36
|
+
# iteration
|
37
|
+
table.each do |row|
|
38
|
+
# ...
|
39
|
+
end
|
40
|
+
|
41
|
+
table.by_col!
|
42
|
+
table.each do |col_name, col_values|
|
43
|
+
# ...
|
44
|
+
end
|
45
|
+
|
46
|
+
# output
|
47
|
+
puts table
|
48
|
+
# >> first_name,last_name,age,type,ssn
|
49
|
+
# >> zaphod,beeblebrox,42,name,123-456-7890
|
50
|
+
# >> ara,howard,34,name,098-765-4321
|
51
|
+
# >> james,gray,30,name,
|
52
|
+
|
53
|
+
__END__
|
54
|
+
first_name,last_name,age
|
55
|
+
zaphod,beeblebrox,42
|
56
|
+
ara,howard,34
|
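The column assignments in csv_table.rb (table[:type] = "name" and table[:ssn] = %w[...]) follow the rule documented for FasterCSV::Table#[]=(): a single value is copied to every row, while an Array is assigned top to bottom, with +nil+ for any rows it does not cover. Below is a minimal sketch of that rule only. It assumes plain Hashes stand in for FasterCSV::Row objects, and assign_column is a hypothetical helper, not part of the library.

```ruby
# Hypothetical helper: plain Hashes stand in for FasterCSV::Row objects.
def assign_column(rows, header, value)
  rows.each_with_index do |row, i|
    # An Array supplies one value per row (nil once it runs out);
    # any other value is repeated for every row.
    row[header] = value.is_a?(Array) ? value[i] : value
  end
  rows
end

rows = [ {:first_name => "zaphod"},
         {:first_name => "ara"},
         {:first_name => "james"} ]

assign_column(rows, :type, "name")
assign_column(rows, :ssn, %w[123-456-7890 098-765-4321])

p rows.map { |row| row[:type] }  # => ["name", "name", "name"]
p rows.map { |row| row[:ssn] }   # => ["123-456-7890", "098-765-4321", nil]
```

Excess Array values are simply never read, which matches the "excess values are ignored" wording in the library documentation.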
data/examples/csv_writing.rb
ADDED
@@ -0,0 +1,67 @@
|
|
1
|
+
#!/usr/local/bin/ruby -w
|
2
|
+
|
3
|
+
# csv_rails_import.rb
|
4
|
+
#
|
5
|
+
# Created by James Edward Gray II on 2006-11-05.
|
6
|
+
# Copyright 2006 Gray Productions. All rights reserved.
|
7
|
+
|
8
|
+
require "faster_csv"
|
9
|
+
|
10
|
+
CSV_FILE_PATH = File.join(File.dirname(__FILE__), "output.csv")
|
11
|
+
|
12
|
+
# writing to a file
|
13
|
+
FasterCSV.open(CSV_FILE_PATH, "w") do |csv|
|
14
|
+
csv << %w[first last]
|
15
|
+
csv << %w[James Gray]
|
16
|
+
csv << %w[Dana Gray]
|
17
|
+
end
|
18
|
+
puts File.read(CSV_FILE_PATH)
|
19
|
+
# >> first,last
|
20
|
+
# >> James,Gray
|
21
|
+
# >> Dana,Gray
|
22
|
+
|
23
|
+
# appending to an existing file
|
24
|
+
FasterCSV.open(CSV_FILE_PATH, "a") do |csv|
|
25
|
+
csv << %w[Gypsy]
|
26
|
+
csv << %w[Storm]
|
27
|
+
end
|
28
|
+
puts File.read(CSV_FILE_PATH)
|
29
|
+
# >> first,last
|
30
|
+
# >> James,Gray
|
31
|
+
# >> Dana,Gray
|
32
|
+
# >> Gypsy
|
33
|
+
# >> Storm
|
34
|
+
|
35
|
+
# writing to a string
|
36
|
+
csv_str = FasterCSV.generate do |csv|
|
37
|
+
csv << %w[first last]
|
38
|
+
csv << %w[James Gray]
|
39
|
+
csv << %w[Dana Gray]
|
40
|
+
end
|
41
|
+
puts csv_str
|
42
|
+
# >> first,last
|
43
|
+
# >> James,Gray
|
44
|
+
# >> Dana,Gray
|
45
|
+
|
46
|
+
# appending to an existing string
|
47
|
+
FasterCSV.generate(csv_str) do |csv|
|
48
|
+
csv << %w[Gypsy]
|
49
|
+
csv << %w[Storm]
|
50
|
+
end
|
51
|
+
puts csv_str
|
52
|
+
# >> first,last
|
53
|
+
# >> James,Gray
|
54
|
+
# >> Dana,Gray
|
55
|
+
# >> Gypsy
|
56
|
+
# >> Storm
|
57
|
+
|
58
|
+
# changing the output format
|
59
|
+
csv_str = FasterCSV.generate(:col_sep => "\t") do |csv|
|
60
|
+
csv << %w[first last]
|
61
|
+
csv << %w[James Gray]
|
62
|
+
csv << %w[Dana Gray]
|
63
|
+
end
|
64
|
+
puts csv_str
|
65
|
+
# >> first last
|
66
|
+
# >> James Gray
|
67
|
+
# >> Dana Gray
|
data/lib/faster_csv.rb
CHANGED
@@ -69,13 +69,13 @@ require "stringio"
|
|
69
69
|
#
|
70
70
|
# == Shortcut Interface
|
71
71
|
#
|
72
|
-
# FCSV
|
73
|
-
# FCSV(csv = "")
|
74
|
-
# FCSV(
|
72
|
+
# FCSV { |csv_out| csv_out << %w{my data here} } # to $stdout
|
73
|
+
# FCSV(csv = "") { |csv_str| csv_str << %w{my data here} } # to a String
|
74
|
+
# FCSV($stderr) { |csv_err| csv_err << %w{my data here} } # to $stderr
|
75
75
|
#
|
76
76
|
class FasterCSV
|
77
77
|
# The version of the installed library.
|
78
|
-
VERSION = "0.
|
78
|
+
VERSION = "1.0.0".freeze
|
79
79
|
|
80
80
|
#
|
81
81
|
# A FasterCSV::Row is part Array and part Hash. It retains an order for the
|
@@ -95,7 +95,7 @@ class FasterCSV
|
|
95
95
|
# FasterCSV::Row.header_row?() and FasterCSV::Row.field_row?(), that this is
|
96
96
|
# a header row. Otherwise, the row is assumed to be a field row.
|
97
97
|
#
|
98
|
-
def initialize(
|
98
|
+
def initialize(headers, fields, header_row = false)
|
99
99
|
@header_row = header_row
|
100
100
|
|
101
101
|
# handle extra headers or fields
|
@@ -106,6 +106,10 @@ class FasterCSV
|
|
106
106
|
end
|
107
107
|
end
|
108
108
|
|
109
|
+
# Internal data format used to compare equality.
|
110
|
+
attr_reader :row
|
111
|
+
protected :row
|
112
|
+
|
109
113
|
# Returns +true+ if this is a header row.
|
110
114
|
def header_row?
|
111
115
|
@header_row
|
@@ -134,7 +138,7 @@ class FasterCSV
|
|
134
138
|
# than the +offset+ index. You can use this to find duplicate headers,
|
135
139
|
# without resorting to hard-coding exact indices.
|
136
140
|
#
|
137
|
-
def field(
|
141
|
+
def field(header_or_index, minimum_index = 0)
|
138
142
|
# locate the pair
|
139
143
|
finder = header_or_index.is_a?(Integer) ? :[] : :assoc
|
140
144
|
pair = @row[minimum_index..-1].send(finder, header_or_index)
|
@@ -157,7 +161,7 @@ class FasterCSV
|
|
157
161
|
# to <tt>[nil, nil]</tt>. Assigning to an unused header appends the new
|
158
162
|
# pair.
|
159
163
|
#
|
160
|
-
def []=(
|
164
|
+
def []=(*args)
|
161
165
|
value = args.pop
|
162
166
|
|
163
167
|
if args.first.is_a? Integer
|
@@ -190,7 +194,7 @@ class FasterCSV
|
|
190
194
|
#
|
191
195
|
# This method returns the row for chaining.
|
192
196
|
#
|
193
|
-
def <<(
|
197
|
+
def <<(arg)
|
194
198
|
if arg.is_a?(Array) and arg.size == 2 # appending a header and name
|
195
199
|
@row << arg
|
196
200
|
elsif arg.is_a?(Hash) # append header and name pairs
|
@@ -209,7 +213,7 @@ class FasterCSV
|
|
209
213
|
#
|
210
214
|
# This method returns the row for chaining.
|
211
215
|
#
|
212
|
-
def push(
|
216
|
+
def push(*args)
|
213
217
|
args.each { |arg| self << arg }
|
214
218
|
|
215
219
|
self # for chaining
|
@@ -225,7 +229,7 @@ class FasterCSV
|
|
225
229
|
# located as described in FasterCSV::Row.field(). The deleted pair is
|
226
230
|
# returned, or +nil+ if a pair could not be found.
|
227
231
|
#
|
228
|
-
def delete(
|
232
|
+
def delete(header_or_index, minimum_index = 0)
|
229
233
|
if header_or_index.is_a? Integer # by index
|
230
234
|
@row.delete_at(header_or_index)
|
231
235
|
else # by header
|
@@ -240,7 +244,7 @@ class FasterCSV
|
|
240
244
|
#
|
241
245
|
# This method returns the row for chaining.
|
242
246
|
#
|
243
|
-
def delete_if(
|
247
|
+
def delete_if(&block)
|
244
248
|
@row.delete_if(&block)
|
245
249
|
|
246
250
|
self # for chaining
|
@@ -248,16 +252,29 @@ class FasterCSV
|
|
248
252
|
|
249
253
|
#
|
250
254
|
# This method accepts any number of arguments which can be headers, indices,
|
251
|
-
# or two-element Arrays containing a header and offset.
|
252
|
-
# be replaced with a field lookup as described in
|
255
|
+
# Ranges of either, or two-element Arrays containing a header and offset.
|
256
|
+
# Each argument will be replaced with a field lookup as described in
|
257
|
+
# FasterCSV::Row.field().
|
253
258
|
#
|
254
259
|
# If called with no arguments, all fields are returned.
|
255
260
|
#
|
256
|
-
def fields(
|
261
|
+
def fields(*headers_and_or_indices)
|
257
262
|
if headers_and_or_indices.empty? # return all fields--no arguments
|
258
263
|
@row.map { |pair| pair.last }
|
259
264
|
else # or work like values_at()
|
260
|
-
headers_and_or_indices.
|
265
|
+
headers_and_or_indices.inject(Array.new) do |all, h_or_i|
|
266
|
+
all + if h_or_i.is_a? Range
|
267
|
+
index_begin = h_or_i.begin.is_a?(Integer) ? h_or_i.begin :
|
268
|
+
index(h_or_i.begin)
|
269
|
+
index_end = h_or_i.end.is_a?(Integer) ? h_or_i.end :
|
270
|
+
index(h_or_i.end)
|
271
|
+
new_range = h_or_i.exclude_end? ? (index_begin...index_end) :
|
272
|
+
(index_begin..index_end)
|
273
|
+
fields.values_at(new_range)
|
274
|
+
else
|
275
|
+
[field(*Array(h_or_i))]
|
276
|
+
end
|
277
|
+
end
|
261
278
|
end
|
262
279
|
end
|
263
280
|
alias_method :values_at, :fields
|
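The Range support added to FasterCSV::Row.fields() in the hunk above resolves each Range edge to a position (Integers are used as given, headers are looked up) and then slices the field list. A self-contained sketch of that idea follows. It assumes an Array of [header, field] pairs stands in for the row's internal data, and resolve_range and fields_for are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a row: ordered [header, field] pairs.
PAIRS = [ ["first_name", "zaphod"],
          ["last_name",  "beeblebrox"],
          ["age",        "42"] ]

# Turn a Range of Integers or headers into a Range of Integer positions,
# preserving whether the end is excluded.
def resolve_range(pairs, range)
  headers  = pairs.map { |pair| pair.first }
  to_index = lambda { |edge| edge.is_a?(Integer) ? edge : headers.index(edge) }
  first, last = to_index[range.begin], to_index[range.end]
  range.exclude_end? ? (first...last) : (first..last)
end

# Slice the field values with the resolved Range.
def fields_for(pairs, range)
  pairs.map { |pair| pair.last }.values_at(resolve_range(pairs, range))
end

p fields_for(PAIRS, "first_name".."last_name")  # => ["zaphod", "beeblebrox"]
p fields_for(PAIRS, "first_name"..."age")       # => ["zaphod", "beeblebrox"]
p fields_for(PAIRS, 1..2)                       # => ["beeblebrox", "42"]
```

The sketch does not guard against a header that is missing from the row; in that case the lookup yields +nil+ and building the Range fails.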
@@ -271,7 +288,7 @@ class FasterCSV
|
|
271
288
|
# The +offset+ can be used to locate duplicate header names, as described in
|
272
289
|
# FasterCSV::Row.field().
|
273
290
|
#
|
274
|
-
def index(
|
291
|
+
def index(header, minimum_index = 0)
|
275
292
|
# find the pair
|
276
293
|
index = headers[minimum_index..-1].index(header)
|
277
294
|
# return the index at the right offset, if we found one
|
@@ -279,7 +296,7 @@ class FasterCSV
|
|
279
296
|
end
|
280
297
|
|
281
298
|
# Returns +true+ if +name+ is a header for this row, and +false+ otherwise.
|
282
|
-
def header?(
|
299
|
+
def header?(name)
|
283
300
|
headers.include? name
|
284
301
|
end
|
285
302
|
alias_method :include?, :header?
|
@@ -288,7 +305,7 @@ class FasterCSV
|
|
288
305
|
# Returns +true+ if +data+ matches a field in this row, and +false+
|
289
306
|
# otherwise.
|
290
307
|
#
|
291
|
-
def field?(
|
308
|
+
def field?(data)
|
292
309
|
fields.include? data
|
293
310
|
end
|
294
311
|
|
@@ -302,12 +319,20 @@ class FasterCSV
|
|
302
319
|
#
|
303
320
|
# This method returns the row for chaining.
|
304
321
|
#
|
305
|
-
def each(
|
322
|
+
def each(&block)
|
306
323
|
@row.each(&block)
|
307
324
|
|
308
325
|
self # for chaining
|
309
326
|
end
|
310
327
|
|
328
|
+
#
|
329
|
+
# Returns +true+ if this row contains the same headers and fields in the
|
330
|
+
# same order as +other+.
|
331
|
+
#
|
332
|
+
def ==(other)
|
333
|
+
@row == other.row
|
334
|
+
end
|
335
|
+
|
311
336
|
#
|
312
337
|
# Collapses the row into a simple Hash. Be warned that this discards field
|
313
338
|
# order and clobbers duplicate fields.
|
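The new FasterCSV::Row.==() relies on a protected reader: the internal pairs stay hidden from outside callers, yet another instance of the same class may read them for the comparison. The pattern can be sketched in isolation as below. PairRow is a hypothetical class written for this illustration, not the library's Row.

```ruby
# Hypothetical class showing the protected-reader equality pattern.
class PairRow
  def initialize(headers, fields)
    @row = headers.zip(fields)  # ordered [header, field] pairs
  end

  # Readable by other PairRow instances, but not by outside callers.
  attr_reader :row
  protected :row

  # Equal when headers and fields match in the same order.
  def ==(other)
    @row == other.row
  end
end

a = PairRow.new(%w[first last], %w[James Gray])
b = PairRow.new(%w[first last], %w[James Gray])
c = PairRow.new(%w[last first], %w[Gray James])

p a == b  # => true
p a == c  # => false
```

One consequence of this design, at least in the sketch, is that comparing against an object without a row() method raises NoMethodError instead of returning +false+.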
@@ -322,11 +347,331 @@ class FasterCSV
|
|
322
347
|
#
|
323
348
|
# faster_csv_row.fields.to_csv( options )
|
324
349
|
#
|
325
|
-
def to_csv(
|
350
|
+
def to_csv(options = Hash.new)
|
326
351
|
fields.to_csv(options)
|
327
352
|
end
|
328
353
|
alias_method :to_s, :to_csv
|
329
354
|
end
|
355
|
+
|
356
|
+
#
|
357
|
+
# A FasterCSV::Table is a two-dimensional data structure for representing CSV
|
358
|
+
# documents. Tables allow you to work with the data by row or column,
|
359
|
+
# manipulate the data, and even convert the results back to CSV, if needed.
|
360
|
+
#
|
361
|
+
# All tables returned by FasterCSV will be constructed from this class, if
|
362
|
+
# header row processing is activated.
|
363
|
+
#
|
364
|
+
class Table
|
365
|
+
#
|
366
|
+
# Construct a new FasterCSV::Table from +array_of_rows+, which are expected
|
367
|
+
# to be FasterCSV::Row objects. All rows are assumed to have the same
|
368
|
+
# headers.
|
369
|
+
#
|
370
|
+
def initialize(array_of_rows)
|
371
|
+
@table = array_of_rows
|
372
|
+
@mode = :col_or_row
|
373
|
+
end
|
374
|
+
|
375
|
+
# The current access mode for indexing and iteration.
|
376
|
+
attr_reader :mode
|
377
|
+
|
378
|
+
# Internal data format used to compare equality.
|
379
|
+
attr_reader :table
|
380
|
+
protected :table
|
381
|
+
|
382
|
+
#
|
383
|
+
# Returns a duplicate table object, in column mode. This is handy for
|
384
|
+
# chaining in a single call without changing the table mode, but be aware
|
385
|
+
# that this method can consume a fair amount of memory for bigger data sets.
|
386
|
+
#
|
387
|
+
# This method returns the duplicate table for chaining. Don't chain
|
388
|
+
# destructive methods (like []=()) this way though, since you are working
|
389
|
+
# with a duplicate.
|
390
|
+
#
|
391
|
+
def by_col
|
392
|
+
self.class.new(@table.dup).by_col!
|
393
|
+
end
|
394
|
+
|
395
|
+
#
|
396
|
+
# Switches the mode of this table to column mode. All calls to indexing and
|
397
|
+
# iteration methods will work with columns until the mode is changed again.
|
398
|
+
#
|
399
|
+
# This method returns the table and is safe to chain.
|
400
|
+
#
|
401
|
+
def by_col!
|
402
|
+
@mode = :col
|
403
|
+
|
404
|
+
self
|
405
|
+
end
|
406
|
+
|
407
|
+
#
|
408
|
+
# Returns a duplicate table object, in mixed mode. This is handy for
|
409
|
+
# chaining in a single call without changing the table mode, but be aware
|
410
|
+
# that this method can consume a fair amount of memory for bigger data sets.
|
411
|
+
#
|
412
|
+
# This method returns the duplicate table for chaining. Don't chain
|
413
|
+
# destructive methods (like []=()) this way though, since you are working
|
414
|
+
# with a duplicate.
|
415
|
+
#
|
416
|
+
def by_col_or_row
|
417
|
+
self.class.new(@table.dup).by_col_or_row!
|
418
|
+
end
|
419
|
+
|
420
|
+
#
|
421
|
+
# Switches the mode of this table to mixed mode. All calls to indexing and
|
422
|
+
# iteration methods will use the default intelligent indexing system until
|
423
|
+
# the mode is changed again. In mixed mode an index is assumed to be a row
|
424
|
+
# reference while anything else is assumed to be column access by headers.
|
425
|
+
#
|
426
|
+
# This method returns the table and is safe to chain.
|
427
|
+
#
|
428
|
+
def by_col_or_row!
|
429
|
+
@mode = :col_or_row
|
430
|
+
|
431
|
+
self
|
432
|
+
end
|
433
|
+
|
434
|
+
#
|
435
|
+
# Returns a duplicate table object, in row mode. This is handy for chaining
|
436
|
+
# in a single call without changing the table mode, but be aware that this
|
437
|
+
# method can consume a fair amount of memory for bigger data sets.
|
438
|
+
#
|
439
|
+
# This method returns the duplicate table for chaining. Don't chain
|
440
|
+
# destructive methods (like []=()) this way though, since you are working
|
441
|
+
# with a duplicate.
|
442
|
+
#
|
443
|
+
def by_row
|
444
|
+
self.class.new(@table.dup).by_row!
|
445
|
+
end
|
446
|
+
|
447
|
+
#
|
448
|
+
# Switches the mode of this table to row mode. All calls to indexing and
|
449
|
+
# iteration methods will work with rows until the mode is changed again.
|
450
|
+
#
|
451
|
+
# This method returns the table and is safe to chain.
|
452
|
+
#
|
453
|
+
def by_row!
|
454
|
+
@mode = :row
|
455
|
+
|
456
|
+
self
|
457
|
+
end
|
458
|
+
|
459
|
+
#
|
460
|
+
# Returns the headers for the first row of this table (assumed to match all
|
461
|
+
# other rows). An empty Array is returned for empty tables.
|
462
|
+
#
|
463
|
+
def headers
|
464
|
+
if @table.empty?
|
465
|
+
Array.new
|
466
|
+
else
|
467
|
+
@table.first.headers
|
468
|
+
end
|
469
|
+
end
|
470
|
+
|
471
|
+
#
|
472
|
+
# In the default mixed mode, this method returns rows for index access and
|
473
|
+
# columns for header access. You can force the index association by first
|
474
|
+
# calling by_col!() or by_row!().
|
475
|
+
#
|
476
|
+
# Columns are returned as an Array of values. Altering that Array has no
|
477
|
+
# effect on the table.
|
478
|
+
#
|
479
|
+
def [](index_or_header)
|
480
|
+
if @mode == :row or # by index
|
481
|
+
(@mode == :col_or_row and index_or_header.is_a? Integer)
|
482
|
+
@table[index_or_header]
|
483
|
+
else # by header
|
484
|
+
@table.map { |row| row[index_or_header] }
|
485
|
+
end
|
486
|
+
end
|
487
|
+
|
488
|
+
#
|
489
|
+
# In the default mixed mode, this method assigns rows for index access and
|
490
|
+
# columns for header access. You can force the index association by first
|
491
|
+
# calling by_col!() or by_row!().
|
492
|
+
#
|
493
|
+
# Rows may be set to an Array of values (which will inherit the table's
|
494
|
+
# headers()) or a FasterCSV::Row.
|
495
|
+
#
|
496
|
+
# Columns may be set to a single value, which is copied to each row of the
|
497
|
+
# column, or an Array of values. Arrays of values are assigned to rows top
|
498
|
+
# to bottom in row major order. Excess values are ignored and if the Array
|
499
|
+
# does not have a value for each row the extra rows will receive a +nil+.
|
500
|
+
#
|
501
|
+
# Assigning to an existing column or row clobbers the data. Assigning to
|
502
|
+
# new columns creates them at the right end of the table.
|
503
|
+
#
|
504
|
+
def []=(index_or_header, value)
|
505
|
+
if @mode == :row or # by index
|
506
|
+
(@mode == :col_or_row and index_or_header.is_a? Integer)
|
507
|
+
if value.is_a? Array
|
508
|
+
@table[index_or_header] = Row.new(headers, value)
|
509
|
+
else
|
510
|
+
@table[index_or_header] = value
|
511
|
+
end
|
512
|
+
else # set column
|
513
|
+
if value.is_a? Array # multiple values
|
514
|
+
@table.each_with_index do |row, i|
|
515
|
+
if row.header_row?
|
516
|
+
row[index_or_header] = index_or_header
|
517
|
+
else
|
518
|
+
row[index_or_header] = value[i]
|
519
|
+
end
|
520
|
+
end
|
521
|
+
else # repeated value
|
522
|
+
@table.each do |row|
|
523
|
+
if row.header_row?
|
524
|
+
row[index_or_header] = index_or_header
|
525
|
+
else
|
526
|
+
row[index_or_header] = value
|
527
|
+
end
|
528
|
+
end
|
529
|
+
end
|
530
|
+
end
|
531
|
+
end
|
532
|
+
|
533
|
+
#
|
534
|
+
# The mixed mode default is to treat a list of indices as row access,
|
535
|
+
# returning the rows indicated. Anything else is considered columnar
|
536
|
+
# access. For columnar access, the return set has an Array for each row
|
537
|
+
# with the values indicated by the headers in each Array. You can force
|
538
|
+
# column or row mode using by_col!() or by_row!().
|
539
|
+
#
|
540
|
+
# You cannot mix column and row access.
|
541
|
+
#
|
542
|
+
def values_at(*indices_or_headers)
|
543
|
+
if @mode == :row or # by indices
|
544
|
+
( @mode == :col_or_row and indices_or_headers.all? do |index|
|
545
|
+
index.is_a?(Integer) or
|
546
|
+
( index.is_a?(Range) and
|
547
|
+
index.first.is_a?(Integer) and
|
548
|
+
index.last.is_a?(Integer) )
|
549
|
+
end )
|
550
|
+
@table.values_at(*indices_or_headers)
|
551
|
+
else # by headers
|
552
|
+
@table.map { |row| row.values_at(*indices_or_headers) }
|
553
|
+
end
|
554
|
+
end
|
555
|
+
|
556
|
+
#
|
557
|
+
# Adds a new row to the bottom end of this table. You can provide an Array,
|
558
|
+
# which will be converted to a FasterCSV::Row (inheriting the table's
|
559
|
+
# headers()), or a FasterCSV::Row.
|
560
|
+
#
|
561
|
+
# This method returns the table for chaining.
|
562
|
+
#
|
563
|
+
def <<(row_or_array)
|
564
|
+
if row_or_array.is_a? Array # append Array
|
565
|
+
@table << Row.new(headers, row_or_array)
|
566
|
+
else # append Row
|
567
|
+
@table << row_or_array
|
568
|
+
end
|
569
|
+
|
570
|
+
self # for chaining
|
571
|
+
end
|
572
|
+
|
573
|
+
#
|
574
|
+
# A shortcut for appending multiple rows. Equivalent to:
|
575
|
+
#
|
576
|
+
# rows.each { |row| self << row }
|
577
|
+
#
|
578
|
+
# This method returns the table for chaining.
|
579
|
+
#
|
580
|
+
def push(*rows)
|
581
|
+
rows.each { |row| self << row }
|
582
|
+
|
583
|
+
self # for chaining
|
584
|
+
end
|
585
|
+
|
586
|
+
#
|
587
|
+
# Removes and returns the indicated column or row. In the default mixed
|
588
|
+
# mode indices refer to rows and everything else is assumed to be a column
|
589
|
+
# header. Use by_col!() or by_row!() to force the lookup.
|
590
|
+
#
|
591
|
+
def delete(index_or_header)
|
592
|
+
if @mode == :row or # by index
|
593
|
+
(@mode == :col_or_row and index_or_header.is_a? Integer)
|
594
|
+
@table.delete_at(index_or_header)
|
595
|
+
else # by header
|
596
|
+
@table.map { |row| row.delete(index_or_header).last }
|
597
|
+
end
|
598
|
+
end
|
599
|
+
|
600
|
+
#
|
601
|
+
# Removes any column or row for which the block returns +true+. In the
|
602
|
+
# default mixed mode or row mode, iteration is the standard row major
|
603
|
+
# walking of rows. In column mode, iteration will +yield+ two element
|
604
|
+
# tuples containing the column name and an Array of values for that column.
|
605
|
+
#
|
606
|
+
# This method returns the table for chaining.
|
607
|
+
#
|
608
|
+
def delete_if(&block)
|
609
|
+
if @mode == :row or @mode == :col_or_row # by index
|
610
|
+
@table.delete_if(&block)
|
611
|
+
else # by header
|
612
|
+
to_delete = Array.new
|
613
|
+
headers.each_with_index do |header, i|
|
614
|
+
to_delete << header if block[[header, self[header]]]
|
615
|
+
end
|
616
|
+
to_delete.map { |header| delete(header) }
|
617
|
+
end
|
618
|
+
|
619
|
+
self # for chaining
|
620
|
+
end
|
621
|
+
|
622
|
+
include Enumerable
|
623
|
+
|
624
|
+
#
|
625
|
+
# In the default mixed mode or row mode, iteration is the standard row major
|
626
|
+
# walking of rows. In column mode, iteration will +yield+ two element
|
627
|
+
# tuples containing the column name and an Array of values for that column.
|
628
|
+
#
|
629
|
+
# This method returns the table for chaining.
|
630
|
+
#
|
631
|
+
def each(&block)
|
632
|
+
if @mode == :col
|
633
|
+
headers.each { |header| block[[header, self[header]]] }
|
634
|
+
else
|
635
|
+
@table.each(&block)
|
636
|
+
end
|
637
|
+
|
638
|
+
self # for chaining
|
639
|
+
end
|
640
|
+
|
641
|
+
# Returns +true+ if all rows of this table ==() +other+'s rows.
|
642
|
+
def ==(other)
|
643
|
+
@table == other.table
|
644
|
+
end
|
645
|
+
|
646
|
+
#
|
647
|
+
# Returns the table as an Array of Arrays. Headers will be the first row,
|
648
|
+
# then all of the field rows will follow.
|
649
|
+
#
|
650
|
+
def to_a
|
651
|
+
@table.inject([headers]) do |array, row|
|
652
|
+
if row.header_row?
|
653
|
+
array
|
654
|
+
else
|
655
|
+
array + [row.fields]
|
656
|
+
end
|
657
|
+
end
|
658
|
+
end
|
659
|
+
|
660
|
+
#
|
661
|
+
# Returns the table as a complete CSV String. Headers will be listed first,
|
662
|
+
# then all of the field rows.
|
663
|
+
#
|
664
|
+
def to_csv(options = Hash.new)
|
665
|
+
@table.inject([headers.to_csv(options)]) do |rows, row|
|
666
|
+
if row.header_row?
|
667
|
+
rows
|
668
|
+
else
|
669
|
+
rows + [row.fields.to_csv(options)]
|
670
|
+
end
|
671
|
+
end.join
|
672
|
+
end
|
673
|
+
alias_method :to_s, :to_csv
|
674
|
+
end
|
330
675
|
|
331
676
|
# The error thrown when the parser encounters illegal CSV formatting.
|
332
677
|
class MalformedCSVError < RuntimeError; end
|
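The three access modes of the new FasterCSV::Table (:col_or_row, :col and :row) all funnel through the same dispatch test seen in []() above. The following reduced model shows how the mode changes what an index means. It is only a sketch: MiniTable is a hypothetical class, and it assumes Hashes in place of FasterCSV::Row objects.

```ruby
# Hypothetical reduced model of the table's mode dispatch.
class MiniTable
  attr_reader :mode

  def initialize(rows)
    @rows = rows          # Array of Hashes (header => value)
    @mode = :col_or_row   # mixed mode is the default
  end

  def by_col!;        @mode = :col;        self; end
  def by_row!;        @mode = :row;        self; end
  def by_col_or_row!; @mode = :col_or_row; self; end

  def [](index_or_header)
    if @mode == :row || (@mode == :col_or_row && index_or_header.is_a?(Integer))
      @rows[index_or_header]                       # row access
    else
      @rows.map do |row|                           # column access
        if index_or_header.is_a?(Integer)
          row.values[index_or_header]
        else
          row[index_or_header]
        end
      end
    end
  end
end

table = MiniTable.new([ {:first_name => "zaphod", :age => "42"},
                        {:first_name => "ara",    :age => "34"} ])

p table[0][:first_name]  # => "zaphod"          (mixed mode: Integer means row)
p table[:age]            # => ["42", "34"]      (mixed mode: header means column)
p table.by_col![0]       # => ["zaphod", "ara"] (column mode: Integer means column)
p table.by_row![1][:age] # => "34"              (row mode: Integer means row)
```

Because the bang methods return the table itself, a mode switch and a lookup can be chained in one expression, as the last two lines do.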
@@ -442,15 +787,15 @@ class FasterCSV
|
|
442
787
|
#
|
443
788
|
def self.build_csv_interface
|
444
789
|
Object.const_set(:CSV, Class.new).class_eval do
|
445
|
-
def self.foreach(
|
790
|
+
def self.foreach(path, rs = :auto, &block) # :nodoc:
|
446
791
|
FasterCSV.foreach(path, :row_sep => rs, &block)
|
447
792
|
end
|
448
793
|
|
449
|
-
def self.generate_line(
|
794
|
+
def self.generate_line(row, fs = ",", rs = "") # :nodoc:
|
450
795
|
FasterCSV.generate_line(row, :col_sep => fs, :row_sep => rs)
|
451
796
|
end
|
452
797
|
|
453
|
-
def self.open(
|
798
|
+
def self.open(path, mode, fs = ",", rs = :auto, &block) # :nodoc:
|
454
799
|
if block and mode.include? "r"
|
455
800
|
FasterCSV.open(path, mode, :col_sep => fs, :row_sep => rs) do |csv|
|
456
801
|
csv.each(&block)
|
@@ -460,15 +805,15 @@ class FasterCSV
|
|
460
805
|
end
|
461
806
|
end
|
462
807
|
|
463
|
-
def self.parse(
|
808
|
+
def self.parse(str_or_readable, fs = ",", rs = :auto, &block) # :nodoc:
|
464
809
|
FasterCSV.parse(str_or_readable, :col_sep => fs, :row_sep => rs, &block)
|
465
810
|
end
|
466
811
|
|
467
|
-
def self.parse_line(
|
812
|
+
def self.parse_line(src, fs = ",", rs = :auto) # :nodoc:
|
468
813
|
FasterCSV.parse_line(src, :col_sep => fs, :row_sep => rs)
|
469
814
|
end
|
470
815
|
|
471
|
-
def self.readlines(
|
816
|
+
def self.readlines(path, rs = :auto) # :nodoc:
|
472
817
|
FasterCSV.readlines(path, :row_sep => rs)
|
473
818
|
end
|
474
819
|
end
|
@@ -509,7 +854,7 @@ class FasterCSV
|
|
509
854
|
# The +io+ parameter can be used to serialize to a File, and +options+ can be
|
510
855
|
# anything FasterCSV::new() accepts.
|
511
856
|
#
|
512
|
-
def self.dump(
|
857
|
+
def self.dump(ary_of_objs, io = "", options = Hash.new)
|
513
858
|
obj_template = ary_of_objs.first
|
514
859
|
|
515
860
|
csv = FasterCSV.new(io, options)
|
@@ -566,7 +911,7 @@ class FasterCSV
|
|
566
911
|
#
|
567
912
|
# The +input+ and +output+ arguments can be anything FasterCSV::new() accepts
|
568
913
|
# (generally String or IO objects). If not given, they default to
|
569
|
-
# <tt>ARGF</tt> and <tt
|
914
|
+
# <tt>ARGF</tt> and <tt>$stdout</tt>.
|
570
915
|
#
|
571
916
|
# The +options+ parameter is also filtered down to FasterCSV::new() after some
|
572
917
|
# clever key parsing. Any key beginning with <tt>:in_</tt> or
|
@@ -578,7 +923,7 @@ class FasterCSV
|
|
578
923
|
# The <tt>:output_row_sep</tt> +option+ defaults to
|
579
924
|
# <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
|
580
925
|
#
|
581
|
-
def self.filter(
|
926
|
+
def self.filter(*args)
|
582
927
|
# parse options for input, output, or both
|
583
928
|
in_options, out_options = Hash.new, {:row_sep => $INPUT_RECORD_SEPARATOR}
|
584
929
|
if args.last.is_a? Hash
|
@@ -595,8 +940,8 @@ class FasterCSV
|
|
595
940
|
end
|
596
941
|
end
|
597
942
|
# build input and output wrappers
|
598
|
-
input = FasterCSV.new(args.shift || ARGF,
|
599
|
-
output = FasterCSV.new(args.shift ||
|
943
|
+
input = FasterCSV.new(args.shift || ARGF, in_options)
|
944
|
+
output = FasterCSV.new(args.shift || $stdout, out_options)
|
600
945
|
|
601
946
|
# read, yield, write
|
602
947
|
input.each do |row|
|
@@ -610,9 +955,9 @@ class FasterCSV
|
|
610
955
|
# pass a +path+ and any +options+ you wish to set for the read. Each row of
|
611
956
|
# file will be passed to the provided +block+ in turn.
|
612
957
|
#
|
613
|
-
# The +options+ parameter can be
|
958
|
+
# The +options+ parameter can be anything FasterCSV::new() understands.
|
614
959
|
#
|
615
|
-
def self.foreach(
|
960
|
+
def self.foreach(path, options = Hash.new, &block)
|
616
961
|
open(path, options) do |csv|
|
617
962
|
csv.each(&block)
|
618
963
|
end
|
@@ -633,7 +978,7 @@ class FasterCSV
|
|
633
978
|
#
|
634
979
|
# The +options+ parameter can be anthing FasterCSV::new() understands.
|
635
980
|
#
|
636
|
-
def self.generate(
|
981
|
+
def self.generate(*args)
|
637
982
|
# add a default empty String, if none was given
|
638
983
|
if args.first.is_a? String
|
639
984
|
io = StringIO.new(args.shift)
|
@@ -656,7 +1001,7 @@ class FasterCSV
|
|
656
1001
|
# The <tt>:row_sep</tt> +option+ defaults to <tt>$INPUT_RECORD_SEPARATOR</tt>
|
657
1002
|
# (<tt>$/</tt>) when calling this method.
|
658
1003
|
#
|
659
|
-
def self.generate_line(
|
1004
|
+
def self.generate_line(row, options = Hash.new)
|
660
1005
|
options = {:row_sep => $INPUT_RECORD_SEPARATOR}.merge(options)
|
661
1006
|
(new("", options) << row).string
|
662
1007
|
end
|
@@ -665,12 +1010,12 @@ class FasterCSV
|
|
665
1010
|
# This method will return a FasterCSV instance, just like FasterCSV::new(),
|
666
1011
|
# but the instance will be cached and returned for all future calls to this
|
667
1012
|
# method for the same +data+ object (tested by Object#object_id()) with the
|
668
|
-
# same +options
|
1013
|
+
# same +options+.
|
669
1014
|
#
|
670
1015
|
# If a block is given, the instance is passed to the block and the return
|
671
1016
|
# value becomes the return value of the block.
|
672
1017
|
#
|
673
|
-
def self.instance(
|
1018
|
+
def self.instance(data = $stdout, options = Hash.new)
|
674
1019
|
# create a _signature_ for this method call, data object and options
|
675
1020
|
sig = [data.object_id] +
|
676
1021
|
options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })
|
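The signature built in the hunk above combines the data object's `object_id` with the option values, read in a stable key order. A minimal sketch of how such a signature could key a cache follows. `DEFAULTS`, `CACHE`, `Handle` and `cached_instance` are hypothetical names for illustration only, and the rest of the real `FasterCSV.instance` body is not shown in this diff.

```ruby
# Hypothetical stand-ins; FasterCSV's real DEFAULT_OPTIONS has more keys.
DEFAULTS = { :col_sep => ",", :row_sep => :auto }
CACHE    = {}
Handle   = Struct.new(:data, :options)

# Returns the same Handle for the same data object and option values.
def cached_instance(data, options = Hash.new)
  # same shape as the signature in the diff: object identity plus the
  # option values, read in a stable (sorted) key order
  sig = [data.object_id] +
        options.values_at(*DEFAULTS.keys.sort_by { |sym| sym.to_s })
  CACHE[sig] ||= Handle.new(data, options)
end

source      = "a,b,c"
first       = cached_instance(source)
second      = cached_instance(source)                    # cache hit
other_opts  = cached_instance(source, :col_sep => ";")   # new signature
```

Because the signature uses `object_id`, two equal but distinct Strings would get separate cache entries in this sketch.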
@@ -698,7 +1043,7 @@ class FasterCSV
|
|
698
1043
|
# something else, use +options+ to setup converters or provide a custom
|
699
1044
|
# csv_load() implementation.
|
700
1045
|
#
|
701
|
-
def self.load(
|
1046
|
+
def self.load(io_or_str, options = Hash.new)
|
702
1047
|
csv = FasterCSV.new(io_or_str, options)
|
703
1048
|
|
704
1049
|
# load meta information
|
@@ -768,7 +1113,6 @@ class FasterCSV
|
|
768
1113
|
# * pid()
|
769
1114
|
# * pos()
|
770
1115
|
# * reopen()
|
771
|
-
# * rewind()
|
772
1116
|
# * seek()
|
773
1117
|
# * stat()
|
774
1118
|
# * sync()
|
@@ -778,7 +1122,7 @@ class FasterCSV
|
|
778
1122
|
# * to_io()
|
779
1123
|
# * tty?()
|
780
1124
|
#
|
781
|
-
def self.open(
|
1125
|
+
def self.open(*args)
|
782
1126
|
# find the +options+ Hash
|
783
1127
|
options = if args.last.is_a? Hash then args.pop else Hash.new end
|
784
1128
|
# wrap a File opened with the remaining +args+
|
@@ -808,7 +1152,7 @@ class FasterCSV
|
|
808
1152
|
# You pass your +str+ to read from, and an optional +options+ Hash containing
|
809
1153
|
# anything FasterCSV::new() understands.
|
810
1154
|
#
|
811
|
-
def self.parse(
|
1155
|
+
def self.parse(*args, &block)
|
812
1156
|
csv = new(*args)
|
813
1157
|
if block.nil? # slurp contents, if no block is given
|
814
1158
|
begin
|
@@ -828,7 +1172,7 @@ class FasterCSV
|
|
828
1172
|
#
|
829
1173
|
# The +options+ parameter can be anthing FasterCSV::new() understands.
|
830
1174
|
#
|
831
|
-
def self.parse_line(
|
1175
|
+
def self.parse_line(line, options = Hash.new)
|
832
1176
|
new(line, options).shift
|
833
1177
|
end
|
834
1178
|
|
@@ -836,12 +1180,12 @@ class FasterCSV
|
|
836
1180
|
# Use to slurp a CSV file into an Array of Arrays. Pass the +path+ to the
|
837
1181
|
# file and any +options+ FasterCSV::new() understands.
|
838
1182
|
#
|
839
|
-
def self.read(
|
1183
|
+
def self.read(path, options = Hash.new)
|
840
1184
|
open(path, options) { |csv| csv.read }
|
841
1185
|
end
|
842
1186
|
|
843
1187
|
# Alias for FasterCSV::read().
|
844
|
-
def self.readlines(
|
1188
|
+
def self.readlines(*args)
|
845
1189
|
read(*args)
|
846
1190
|
end
|
847
1191
|
|
@@ -906,7 +1250,9 @@ class FasterCSV
|
|
906
1250
|
# Array of headers. This setting causes
|
907
1251
|
# FasterCSV.shift() to return rows as
|
908
1252
|
# FasterCSV::Row objects instead of
|
909
|
-
# Arrays.
|
1253
|
+
# Arrays and FasterCSV.read() to return
|
1254
|
+
# FasterCSV::Table objects instead of
|
1255
|
+
# an Array of Arrays.
|
910
1256
|
# <b><tt>:return_headers</tt></b>:: When +false+, header rows are silently
|
911
1257
|
# swallowed. If set to +true+, header
|
912
1258
|
# rows are returned in a FasterCSV::Row
|
@@ -923,7 +1269,7 @@ class FasterCSV
|
|
923
1269
|
# Options cannot be overriden in the instance methods for performance reasons,
|
924
1270
|
# so be sure to set what you want here.
|
925
1271
|
#
|
926
|
-
def initialize(
|
1272
|
+
def initialize(data, options = Hash.new)
|
927
1273
|
# build the options for this read/write
|
928
1274
|
options = DEFAULT_OPTIONS.merge(options)
|
929
1275
|
|
@@ -943,19 +1289,27 @@ class FasterCSV
|
|
943
1289
|
@lineno = 0
|
944
1290
|
end
|
945
1291
|
|
1292
|
+
### IO and StringIO Delegation ###
|
1293
|
+
|
946
1294
|
#
|
947
1295
|
# The line number of the last row read from this file. Fields with nested
|
948
1296
|
# line-end characters will not affect this count.
|
949
1297
|
#
|
950
1298
|
attr_reader :lineno
|
951
1299
|
|
952
|
-
### IO and StringIO Delegation ###
|
953
|
-
|
954
1300
|
extend Forwardable
|
955
1301
|
def_delegators :@io, :binmode, :close, :close_read, :close_write, :closed?,
|
956
1302
|
:eof, :eof?, :fcntl, :fileno, :flush, :fsync, :ioctl,
|
957
|
-
:isatty, :pid, :pos, :reopen, :
|
958
|
-
:
|
1303
|
+
:isatty, :pid, :pos, :reopen, :seek, :stat, :string,
|
1304
|
+
:sync, :sync=, :tell, :to_i, :to_io, :tty?
|
1305
|
+
|
1306
|
+
# Rewinds the underlying IO object and resets FasterCSV's lineno() counter.
|
1307
|
+
def rewind
|
1308
|
+
@headers = nil
|
1309
|
+
@lineno = 0
|
1310
|
+
|
1311
|
+
@io.rewind
|
1312
|
+
end
|
959
1313
|
|
960
1314
|
### End Delegation ###
|
961
1315
|
|
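The hunk above removes `rewind` from the delegated methods and defines it explicitly so that reader state is reset along with the IO position. The sketch below shows the same pattern on a tiny stand-in class. `TinyReader` is hypothetical and its comma `split` is a placeholder for real CSV parsing.

```ruby
require "forwardable"
require "stringio"

# TinyReader is a hypothetical stand-in, not FasterCSV itself.
class TinyReader
  extend Forwardable
  # most IO methods can simply be forwarded to the wrapped object...
  def_delegators :@io, :eof?, :pos

  attr_reader :lineno

  def initialize(io)
    @io     = io
    @lineno = 0
  end

  # Reads one line and naively splits it on commas.
  def shift
    line = @io.gets
    return nil if line.nil?
    @lineno += 1
    line.chomp.split(",")
  end

  # ...but rewind must also reset the reader's own counters, so it is
  # written by hand instead of being delegated.
  def rewind
    @lineno = 0
    @io.rewind
  end
end

reader = TinyReader.new(StringIO.new("a,b\n1,2\n"))
reader.shift
reader.shift
lineno_before = reader.lineno
reader.rewind
lineno_after  = reader.lineno
first_again   = reader.shift
```

Had `rewind` stayed in the delegator list, `lineno` in this sketch would keep counting from 2 after the rewind.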
@@ -967,16 +1321,16 @@ class FasterCSV
|
|
967
1321
|
#
|
968
1322
|
# The data source must be open for writing.
|
969
1323
|
#
|
970
|
-
def <<(
|
1324
|
+
def <<(row)
|
971
1325
|
# handle FasterCSV::Row objects
|
972
1326
|
row = row.fields if row.is_a? self.class::Row
|
973
1327
|
|
974
1328
|
@io << row.map do |field|
|
975
|
-
if field.nil? #
|
1329
|
+
if field.nil? # represent +nil+ fields as empty unquoted fields
|
976
1330
|
""
|
977
1331
|
else
|
978
1332
|
field = String(field) # Stringify fields
|
979
|
-
#
|
1333
|
+
# represent empty fields as empty quoted fields
|
980
1334
|
if field.empty? or field.count(%Q{\r\n#{@col_sep}"}).nonzero?
|
981
1335
|
%Q{"#{field.gsub('"', '""')}"} # escape quoted fields
|
982
1336
|
else
|
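The comments added in this hunk distinguish `nil` fields (written as empty, unquoted) from empty Strings (written as empty, quoted). A standalone sketch of that per-field rule follows. `quote_field` is a hypothetical helper, and the final unquoted branch is an assumption because the hunk is cut off at `else`.

```ruby
# Hypothetical helper mirroring the per-field logic shown in the diff.
def quote_field(field, col_sep = ",")
  return "" if field.nil?              # nil => empty unquoted field

  field = String(field)                # stringify non-String values
  # empty Strings, and fields containing CR, LF, the separator, or a
  # double quote, are wrapped in quotes with inner quotes doubled
  if field.empty? or field.count(%Q{\r\n#{col_sep}"}).nonzero?
    %Q{"#{field.gsub('"', '""')}"}
  else
    field                              # assumption: plain fields pass through
  end
end

quote_field(nil)         # => ""            (zero characters)
quote_field("")          # => "\"\""        (two quote characters)
quote_field("a,b")       # => "\"a,b\""
quote_field('say "hi"')  # => "\"say \"\"hi\"\"\""
quote_field(42)          # => "42"
```

Keeping the two cases distinct means a reader can tell a missing value from an empty String.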
@@ -1005,7 +1359,7 @@ class FasterCSV
|
|
1005
1359
|
# containing details about the field. Again, the block should return a
|
1006
1360
|
# converted field or the field itself.
|
1007
1361
|
#
|
1008
|
-
def convert(
|
1362
|
+
def convert(name = nil, &converter)
|
1009
1363
|
add_converter(:converters, self.class::Converters, name, &converter)
|
1010
1364
|
end
|
1011
1365
|
|
@@ -1020,7 +1374,7 @@ class FasterCSV
|
|
1020
1374
|
# Note that this method must be called before header rows are read to have any
|
1021
1375
|
# effect.
|
1022
1376
|
#
|
1023
|
-
def header_convert(
|
1377
|
+
def header_convert(name = nil, &converter)
|
1024
1378
|
add_converter( :header_converters,
|
1025
1379
|
self.class::HeaderConverters,
|
1026
1380
|
name,
|
@@ -1048,7 +1402,12 @@ class FasterCSV
|
|
1048
1402
|
# The data source must be open for reading.
|
1049
1403
|
#
|
1050
1404
|
def read
|
1051
|
-
to_a
|
1405
|
+
rows = to_a
|
1406
|
+
if @use_headers
|
1407
|
+
Table.new(rows)
|
1408
|
+
else
|
1409
|
+
rows
|
1410
|
+
end
|
1052
1411
|
end
|
1053
1412
|
alias_method :readlines, :read
|
1054
1413
|
|
@@ -1112,7 +1471,7 @@ class FasterCSV
|
|
1112
1471
|
# on these
|
1113
1472
|
#
|
1114
1473
|
csv = if parse.sub!(@parsers[:leading_fields], "")
|
1115
|
-
[nil] * $&.length
|
1474
|
+
[nil] * ($&.length / @col_sep.length)
|
1116
1475
|
else
|
1117
1476
|
Array.new
|
1118
1477
|
end
|
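The change in this hunk divides the matched length by the separator length, which matters once `:col_sep` is longer than one character. The sketch below reproduces the calculation in isolation. `leading_nils` is a hypothetical helper, and its pattern follows the `:leading_fields` Regexp shown later in this diff.

```ruby
# Hypothetical helper: returns one nil per empty leading field.
def leading_nils(line, col_sep)
  # one or more separators anchored at the start of the line
  pattern = /\A(?:#{Regexp.escape(col_sep)})+/
  matched = line[pattern]
  return Array.new if matched.nil?

  # the match length counts characters, so divide by the separator
  # length to get the number of separators (and thus empty fields)
  [nil] * (matched.length / col_sep.length)
end

single = leading_nils(",,a,b", ",")       # two empty leading fields
multi  = leading_nils("||||a||b", "||")   # also two, not four
none   = leading_nils("a,b", ",")         # no leading separators
```

With the old `[nil] * $&.length` form, the `"||"` example would have produced four `nil` entries instead of two.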
@@ -1176,12 +1535,11 @@ class FasterCSV
|
|
1176
1535
|
# Stores the indicated separators for later use.
|
1177
1536
|
#
|
1178
1537
|
# If auto-discovery was requested for <tt>@row_sep</tt>, this method will read
|
1179
|
-
# ahead in the <tt>@io</tt> and try to find one.
|
1180
|
-
#
|
1181
|
-
#
|
1182
|
-
# <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
|
1538
|
+
# ahead in the <tt>@io</tt> and try to find one. +ARGF+, +STDIN+, +STDOUT+,
|
1539
|
+
# +STDERR+ and any stream open for output only with a default
|
1540
|
+
# <tt>@row_sep</tt> of <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
|
1183
1541
|
#
|
1184
|
-
def init_separators(
|
1542
|
+
def init_separators(options)
|
1185
1543
|
# store the selected separators
|
1186
1544
|
@col_sep = options.delete(:col_sep)
|
1187
1545
|
@row_sep = options.delete(:row_sep)
|
@@ -1222,11 +1580,11 @@ class FasterCSV
|
|
1222
1580
|
end
|
1223
1581
|
|
1224
1582
|
# Pre-compiles parsers and stores them by name for access during reads.
|
1225
|
-
def init_parsers(
|
1583
|
+
def init_parsers(options)
|
1226
1584
|
# prebuild Regexps for faster parsing
|
1227
1585
|
@parsers = {
|
1228
1586
|
:leading_fields =>
|
1229
|
-
/\A
|
1587
|
+
/\A(?:#{Regexp.escape(@col_sep)})+/, # for empty leading fields
|
1230
1588
|
:csv_row =>
|
1231
1589
|
### The Primary Parser ###
|
1232
1590
|
/ \G(?:^|#{Regexp.escape(@col_sep)}) # anchor the match
|
@@ -1250,7 +1608,7 @@ class FasterCSV
|
|
1250
1608
|
# The <tt>:unconverted_fields</tt> option is also actived for
|
1251
1609
|
# <tt>:converters</tt> calls, if requested.
|
1252
1610
|
#
|
1253
|
-
def init_converters(
|
1611
|
+
def init_converters(options, field_name = :converters)
|
1254
1612
|
if field_name == :converters
|
1255
1613
|
@unconverted_fields = options.delete(:unconverted_fields)
|
1256
1614
|
end
|
@@ -1280,7 +1638,7 @@ class FasterCSV
|
|
1280
1638
|
end
|
1281
1639
|
|
1282
1640
|
# Stores header row settings and loads header converters, if needed.
|
1283
|
-
def init_headers(
|
1641
|
+
def init_headers(options)
|
1284
1642
|
@use_headers = options.delete(:headers)
|
1285
1643
|
@return_headers = options.delete(:return_headers)
|
1286
1644
|
|
@@ -1299,7 +1657,7 @@ class FasterCSV
|
|
1299
1657
|
# normal parameters of the FasterCSV.convert() and FasterCSV.header_convert()
|
1300
1658
|
# methods.
|
1301
1659
|
#
|
1302
|
-
def add_converter(
|
1660
|
+
def add_converter(var_name, const, name = nil, &converter)
|
1303
1661
|
if name.nil? # custom converter
|
1304
1662
|
instance_variable_get("@#{var_name}") << converter
|
1305
1663
|
else # named converter
|
@@ -1322,7 +1680,7 @@ class FasterCSV
|
|
1322
1680
|
# the pipeline of conversion for that field. This is primarily an efficiency
|
1323
1681
|
# shortcut.
|
1324
1682
|
#
|
1325
|
-
def convert_fields(
|
1683
|
+
def convert_fields(fields, headers = false)
|
1326
1684
|
# see if we are converting headers or fields
|
1327
1685
|
converters = headers ? @header_converters : @converters
|
1328
1686
|
|
@@ -1350,7 +1708,7 @@ class FasterCSV
|
|
1350
1708
|
# When +nil+, +row+ is assumed to be a header row not based on an actual row
|
1351
1709
|
# of the stream.
|
1352
1710
|
#
|
1353
|
-
def parse_headers(
|
1711
|
+
def parse_headers(row = nil)
|
1354
1712
|
if @headers.nil? # header row
|
1355
1713
|
@headers = case @use_headers # save headers
|
1356
1714
|
when Array then @use_headers # Array of headers
|
@@ -1377,7 +1735,7 @@ class FasterCSV
|
|
1377
1735
|
# +row+ and an accessor method for it called unconverted_fields(). The
|
1378
1736
|
# variable is set to the contents of +fields+.
|
1379
1737
|
#
|
1380
|
-
def add_unconverted_fields(
|
1738
|
+
def add_unconverted_fields(row, fields)
|
1381
1739
|
class << row
|
1382
1740
|
attr_reader :unconverted_fields
|
1383
1741
|
end
|
@@ -1390,25 +1748,25 @@ end
|
|
1390
1748
|
FCSV = FasterCSV
|
1391
1749
|
|
1392
1750
|
# Another name for FasterCSV::instance().
|
1393
|
-
def FasterCSV(
|
1751
|
+
def FasterCSV(*args, &block)
|
1394
1752
|
FasterCSV.instance(*args, &block)
|
1395
1753
|
end
|
1396
1754
|
|
1397
1755
|
# Another name for FCSV::instance().
|
1398
|
-
def FCSV(
|
1756
|
+
def FCSV(*args, &block)
|
1399
1757
|
FCSV.instance(*args, &block)
|
1400
1758
|
end
|
1401
1759
|
|
1402
1760
|
class Array
|
1403
1761
|
# Equivalent to <tt>FasterCSV::generate_line(self, options)</tt>.
|
1404
|
-
def to_csv(
|
1762
|
+
def to_csv(options = Hash.new)
|
1405
1763
|
FasterCSV.generate_line(self, options)
|
1406
1764
|
end
|
1407
1765
|
end
|
1408
1766
|
|
1409
1767
|
class String
|
1410
1768
|
# Equivalent to <tt>FasterCSV::parse_line(self, options)</tt>.
|
1411
|
-
def parse_csv(
|
1769
|
+
def parse_csv(options = Hash.new)
|
1412
1770
|
FasterCSV.parse_line(self, options)
|
1413
1771
|
end
|
1414
1772
|
end
|