fastercsv 0.2.1 → 1.0.0
- data/CHANGELOG +15 -3
- data/README +2 -2
- data/TODO +1 -26
- data/examples/csv_filter.rb +8 -0
- data/examples/csv_reading.rb +57 -0
- data/examples/csv_table.rb +56 -0
- data/examples/csv_writing.rb +67 -0
- data/examples/shortcut_interface.rb +4 -0
- data/lib/faster_csv.rb +432 -74
- data/test/tc_features.rb +41 -13
- data/test/tc_headers.rb +35 -26
- data/test/tc_row.rb +3 -0
- data/test/tc_table.rb +385 -0
- data/test/ts_all.rb +1 -0
- metadata +6 -2
data/CHANGELOG
CHANGED
@@ -2,6 +2,18 @@
 
 Below is a complete listing of changes for each revision of FasterCSV.
 
+== 1.0.0
+
+* Fixed FasterCSV.rewind() to reset the FasterCSV.lineno() counter.
+* Fixed FasterCSV.rewind() to reset the header processing.
+* Fixed documentation typos.
+* Switched STDOUT and STDERR usage to $stdout and $stderr where appropriate.
+* Added FasterCSV::Row.==().
+* Enhanced FasterCSV::Row.fields() to support Ranges, even for headers.
+* The slurping methods now return the new FasterCSV::Table objects.
+* Fixed parser so multibyte <tt>:col_sep</tt> works now.
+* Added a few examples for usage.
+
 == 0.2.1
 
 * Removed autorequire from GemSpec.
@@ -12,9 +24,9 @@ Below is a complete listing of changes for each revision of FasterCSV.
 
 * Added VERSION constant.
 * Significantly improved test speed.
-* Worked around Date::parse bug so tests will pass on Windows.
+* Worked around Date::parse() bug so tests will pass on Windows.
 * Documented test procedure.
-* Made FasterCSV
+* Made FasterCSV.lineno() CSV aware.
 * Added line numbers to MalformedCSVError messages.
 * <tt>:headers</tt> can now be set to an Array of headers to use.
 * <tt>:headers</tt> can now be set to an external CSV String of headers to use.
@@ -25,7 +37,7 @@ Below is a complete listing of changes for each revision of FasterCSV.
 * Added header information to FieldInfo Struct for conversions by header.
 * Added an alias to support <tt>require "fastercsv"</tt>.
 * Added FCSV alias for FasterCSV.
-* Added FasterCSV::instance and FasterCSV()/FCSV() shortcuts for easy output.
+* Added FasterCSV::instance() and FasterCSV()/FCSV() shortcuts for easy output.
 
 == 0.1.9
 
data/README
CHANGED
@@ -9,8 +9,8 @@ Welcome to FasterCSV.
 FasterCSV is intended as a replacement to Ruby's standard CSV library. It was designed to address concerns users of that library had and it has three primary goals:
 
 1. Be significantly faster than CSV while remaining a pure Ruby library.
-2. Use a smaller and easier to maintain code base. (
-but
+2. Use a smaller and easier to maintain code base. (FasterCSV is larger now,
+but considerably richer in features. The parsing core remains quite small.)
 3. Improve on the CSV interface.
 
 Obviously, the last one is subjective. If you love CSV's interface, odds are
data/TODO
CHANGED
@@ -3,29 +3,4 @@
 The following is a list of planned expansions for FasterCSV, in no particular
 order.
 
-*
-"Experiment ID: 1",,,,,,,,,,,,
-"Subject ID: 1013938829432171e868c340.
-Trial,stimulus,time,type,field1,field2,text_response,Abs. time of
-response,,,,,
-26,undefined,14828,KEY,RETURN,UNUSED,DCS,Sat Oct 15 17:48:04 GMT-0400
-2005,,,,,
-23,undefined,15078,KEY,RETURN,UNUSED,244,Sat Oct 15 17:48:19 GMT-0400
-2005,,,,,
-7,nixontrialleft copy.pct [TAG: 1],5953,KEY,1,UNUSED,,Sat Oct 15
-17:49:24 GMT-0400 2005,,,,,
-8,nixontrialfront copy.pct [TAG: 3],6250,KEY,3,UNUSED,,Sat Oct 15
-17:49:31 GMT-0400 2005,,,,,
-9,nixontrialright copy.pct [TAG: 2],2469,KEY,2,UNUSED,,Sat Oct 15
-17:49:34 GMT-0400 2005,,,,,
-#####
-more data
-######
-,,,,,,,,,,4374.347222,,
-,,,,,,,,,,,,1.00
-,,,,,,,,,,,,0.93
-### and a new block starts
-"Experiment ID: 3",,,,,,,,,,,,0.92
-....
-* Add calculated fields.
-* Examples, examples, examples...
+* Rent this space!
data/examples/csv_filter.rb
CHANGED
@@ -1,5 +1,10 @@
 #!/usr/local/bin/ruby -w
 
+# = csv_filter.rb -- Faster CSV Reading and Writing
+#
+# Created by James Edward Gray II on 2006-04-01.
+# Copyright 2006 Gray Productions. All rights reserved.
+
 require "faster_csv"
 
 running_total = 0
@@ -13,3 +18,6 @@ FasterCSV.filter( :headers => true,
     row << (running_total += row[:quantity] * row[:price])
   end
 end
+# >> Quantity,Product Description,Price,Running Total
+# >> 1,Text Editor,25.0,25.0
+# >> 2,MacBook Pros,2499.0,5023.0
data/examples/csv_reading.rb
ADDED
@@ -0,0 +1,57 @@
+#!/usr/local/bin/ruby -w
+
+# csv_reading.rb
+#
+# Created by James Edward Gray II on 2006-11-05.
+# Copyright 2006 Gray Productions. All rights reserved.
+
+require "faster_csv"
+
+CSV_FILE_PATH = File.join(File.dirname(__FILE__), "purchase.csv")
+CSV_STR = <<END_CSV
+first,last
+James,Gray
+Dana,Gray
+END_CSV
+
+# read a file line by line
+FasterCSV.foreach(CSV_FILE_PATH) do |line|
+  puts line[1]
+end
+# >> Product Description
+# >> Text Editor
+# >> MacBook Pros
+
+# slurp file data
+data = FasterCSV.read(CSV_FILE_PATH)
+puts data.flatten.grep(/\A\d+\.\d+\Z/)
+# >> 25.00
+# >> 2499.00
+
+# read a string line by line
+FasterCSV.parse(CSV_STR) do |line|
+  puts line[0]
+end
+# >> first
+# >> James
+# >> Dana
+
+# slurp string data
+data = FasterCSV.parse(CSV_STR)
+puts data[1..-1].map { |line| "#{line[0][0, 1].downcase}.#{line[1].downcase}" }
+# >> j.gray
+# >> d.gray
+
+# adding options to make data manipulation easy
+total = 0
+FasterCSV.foreach( CSV_FILE_PATH, :headers => true,
+                                  :header_converters => :symbol,
+                                  :converters => :numeric ) do |line|
+  line_total = line[:quantity] * line[:price]
+  total += line_total
+  puts "%s: %.2f" % [line[:product_description], line_total]
+end
+puts "Total: %.2f" % total
+# >> Text Editor: 25.00
+# >> MacBook Pros: 4998.00
+# >> Total: 5023.00
data/examples/csv_table.rb
ADDED
@@ -0,0 +1,56 @@
+#!/usr/local/bin/ruby -w
+
+# csv_table.rb
+#
+# Created by James Edward Gray II on 2006-11-04.
+# Copyright 2006 Gray Productions. All rights reserved.
+#
+# Feature implementation and example code by Ara.T.Howard.
+
+require "faster_csv"
+
+table = FCSV.parse(DATA, :headers => true, :header_converters => :symbol)
+
+# row access
+table[0].class # => FasterCSV::Row
+table[0].fields # => ["zaphod", "beeblebrox", "42"]
+
+# column access
+table[:first_name] # => ["zaphod", "ara"]
+
+# cell access
+table[1][0] # => "ara"
+table[1][:first_name] # => "ara"
+table[:first_name][1] # => "ara"
+
+# manipulation
+table << %w[james gray 30]
+table[-1].fields # => ["james", "gray", "30"]
+
+table[:type] = "name"
+table[:type] # => ["name", "name", "name"]
+
+table[:ssn] = %w[123-456-7890 098-765-4321]
+table[:ssn] # => ["123-456-7890", "098-765-4321", nil]
+
+# iteration
+table.each do |row|
+  # ...
+end
+
+table.by_col!
+table.each do |col_name, col_values|
+  # ...
+end
+
+# output
+puts table
+# >> first_name,last_name,age,type,ssn
+# >> zaphod,beeblebrox,42,name,123-456-7890
+# >> ara,howard,34,name,098-765-4321
+# >> james,gray,30,name,
+
+__END__
+first_name,last_name,age
+zaphod,beeblebrox,42
+ara,howard,34
data/examples/csv_writing.rb
ADDED
@@ -0,0 +1,67 @@
+#!/usr/local/bin/ruby -w
+
+# csv_rails_import.rb
+#
+# Created by James Edward Gray II on 2006-11-05.
+# Copyright 2006 Gray Productions. All rights reserved.
+
+require "faster_csv"
+
+CSV_FILE_PATH = File.join(File.dirname(__FILE__), "output.csv")
+
+# writing to a file
+FasterCSV.open(CSV_FILE_PATH, "w") do |csv|
+  csv << %w[first last]
+  csv << %w[James Gray]
+  csv << %w[Dana Gray]
+end
+puts File.read(CSV_FILE_PATH)
+# >> first,last
+# >> James,Gray
+# >> Dana,Gray
+
+# appending to an existing file
+FasterCSV.open(CSV_FILE_PATH, "a") do |csv|
+  csv << %w[Gypsy]
+  csv << %w[Storm]
+end
+puts File.read(CSV_FILE_PATH)
+# >> first,last
+# >> James,Gray
+# >> Dana,Gray
+# >> Gypsy
+# >> Storm
+
+# writing to a string
+csv_str = FasterCSV.generate do |csv|
+  csv << %w[first last]
+  csv << %w[James Gray]
+  csv << %w[Dana Gray]
+end
+puts csv_str
+# >> first,last
+# >> James,Gray
+# >> Dana,Gray
+
+# appending to an existing string
+FasterCSV.generate(csv_str) do |csv|
+  csv << %w[Gypsy]
+  csv << %w[Storm]
+end
+puts csv_str
+# >> first,last
+# >> James,Gray
+# >> Dana,Gray
+# >> Gypsy
+# >> Storm
+
+# changing the output format
+csv_str = FasterCSV.generate(:col_sep => "\t") do |csv|
+  csv << %w[first last]
+  csv << %w[James Gray]
+  csv << %w[Dana Gray]
+end
+puts csv_str
+# >> first last
+# >> James Gray
+# >> Dana Gray
data/lib/faster_csv.rb
CHANGED
@@ -69,13 +69,13 @@ require "stringio"
 #
 # == Shortcut Interface
 #
-# FCSV
-# FCSV(csv = "")
-# FCSV(
+# FCSV { |csv_out| csv_out << %w{my data here} } # to $stdout
+# FCSV(csv = "") { |csv_str| csv_str << %w{my data here} } # to a String
+# FCSV($stderr) { |csv_err| csv_err << %w{my data here} } # to $stderr
 #
 class FasterCSV
   # The version of the installed library.
-  VERSION = "0.
+  VERSION = "1.0.0".freeze
 
   #
   # A FasterCSV::Row is part Array and part Hash. It retains an order for the
@@ -95,7 +95,7 @@ class FasterCSV
     # FasterCSV::Row.header_row?() and FasterCSV::Row.field_row?(), that this is
     # a header row. Otherwise, the row is assumes to be a field row.
     #
-    def initialize(
+    def initialize(headers, fields, header_row = false)
       @header_row = header_row
 
       # handle extra headers or fields
@@ -106,6 +106,10 @@ class FasterCSV
       end
     end
 
+    # Internal data format used to compare equality.
+    attr_reader :row
+    protected :row
+
     # Returns +true+ if this is a header row.
     def header_row?
       @header_row
@@ -134,7 +138,7 @@ class FasterCSV
     # than the +offset+ index. You can use this to find duplicate headers,
     # without resorting to hard-coding exact indices.
     #
-    def field(
+    def field(header_or_index, minimum_index = 0)
       # locate the pair
       finder = header_or_index.is_a?(Integer) ? :[] : :assoc
       pair = @row[minimum_index..-1].send(finder, header_or_index)
@@ -157,7 +161,7 @@ class FasterCSV
     # to <tt>[nil, nil]</tt>. Assigning to an unused header appends the new
     # pair.
     #
-    def []=(
+    def []=(*args)
       value = args.pop
 
       if args.first.is_a? Integer
@@ -190,7 +194,7 @@ class FasterCSV
     #
     # This method returns the row for chaining.
     #
-    def <<(
+    def <<(arg)
       if arg.is_a?(Array) and arg.size == 2 # appending a header and name
         @row << arg
       elsif arg.is_a?(Hash) # append header and name pairs
@@ -209,7 +213,7 @@ class FasterCSV
     #
     # This method returns the row for chaining.
     #
-    def push(
+    def push(*args)
       args.each { |arg| self << arg }
 
       self # for chaining
@@ -225,7 +229,7 @@ class FasterCSV
     # located as described in FasterCSV::Row.field(). The deleted pair is
     # returned, or +nil+ if a pair could not be found.
     #
-    def delete(
+    def delete(header_or_index, minimum_index = 0)
       if header_or_index.is_a? Integer # by index
         @row.delete_at(header_or_index)
       else # by header
@@ -240,7 +244,7 @@ class FasterCSV
     #
     # This method returns the row for chaining.
     #
-    def delete_if(
+    def delete_if(&block)
       @row.delete_if(&block)
 
       self # for chaining
@@ -248,16 +252,29 @@ class FasterCSV
 
     #
     # This method accepts any number of arguments which can be headers, indices,
-    # or two-element Arrays containing a header and offset.
-    # be replaced with a field lookup as described in
+    # Ranges of either, or two-element Arrays containing a header and offset.
+    # Each argument will be replaced with a field lookup as described in
+    # FasterCSV::Row.field().
     #
     # If called with no arguments, all fields are returned.
     #
-    def fields(
+    def fields(*headers_and_or_indices)
       if headers_and_or_indices.empty? # return all fields--no arguments
         @row.map { |pair| pair.last }
       else # or work like values_at()
-        headers_and_or_indices.
+        headers_and_or_indices.inject(Array.new) do |all, h_or_i|
+          all + if h_or_i.is_a? Range
+            index_begin = h_or_i.begin.is_a?(Integer) ? h_or_i.begin :
+                          index(h_or_i.begin)
+            index_end = h_or_i.end.is_a?(Integer) ? h_or_i.end :
+                        index(h_or_i.end)
+            new_range = h_or_i.exclude_end? ? (index_begin...index_end) :
+                        (index_begin..index_end)
+            fields.values_at(new_range)
+          else
+            [field(*Array(h_or_i))]
+          end
+        end
       end
     end
     alias_method :values_at, :fields
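The Range support added to fields() in this hunk maps each end of a Range onto an index position before slicing the field list. The following is a minimal sketch of that resolution step, not the library's code: RangeRowSketch is a hypothetical stand-in for FasterCSV::Row so the example runs without the gem, and it leaves out the [header, offset] pair form that the real method also accepts.

```ruby
# Hypothetical stand-in for FasterCSV::Row, reduced to what fields() needs.
class RangeRowSketch
  def initialize(headers, fields)
    @row = headers.zip(fields) # ordered [header, field] pairs
  end

  def headers
    @row.map { |pair| pair.first }
  end

  # Position of the first pair carrying this header, or nil.
  def index(header)
    headers.index(header)
  end

  # Look a single field up by header or by integer index.
  def field(header_or_index)
    finder = header_or_index.is_a?(Integer) ? :[] : :assoc
    pair   = @row.send(finder, header_or_index)
    pair.nil? ? nil : pair.last
  end

  def fields(*headers_and_or_indices)
    return @row.map { |pair| pair.last } if headers_and_or_indices.empty?

    headers_and_or_indices.inject([]) do |all, h_or_i|
      picked =
        if h_or_i.is_a?(Range)
          # Translate header endpoints into indices, keep integer endpoints.
          first = h_or_i.begin.is_a?(Integer) ? h_or_i.begin : index(h_or_i.begin)
          last  = h_or_i.end.is_a?(Integer)   ? h_or_i.end   : index(h_or_i.end)
          range = h_or_i.exclude_end? ? (first...last) : (first..last)
          fields.values_at(range)
        else
          [field(h_or_i)]
        end
      all + picked
    end
  end
end

row = RangeRowSketch.new(%w[first_name last_name age], %w[zaphod beeblebrox 42])
row.fields("first_name".."last_name") # header Range, inclusive
row.fields(1..2)                      # integer Range
row.fields("age", 0..0)               # single header mixed with a Range
```

Resolving headers to indices first means inclusive and exclusive Ranges behave the same way for headers as they do for plain integer Ranges.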
@@ -271,7 +288,7 @@ class FasterCSV
     # The +offset+ can be used to locate duplicate header names, as described in
     # FasterCSV::Row.field().
     #
-    def index(
+    def index(header, minimum_index = 0)
       # find the pair
       index = headers[minimum_index..-1].index(header)
       # return the index at the right offset, if we found one
@@ -279,7 +296,7 @@ class FasterCSV
     end
 
     # Returns +true+ if +name+ is a header for this row, and +false+ otherwise.
-    def header?(
+    def header?(name)
       headers.include? name
     end
     alias_method :include?, :header?
@@ -288,7 +305,7 @@ class FasterCSV
     # Returns +true+ if +data+ matches a field in this row, and +false+
     # otherwise.
     #
-    def field?(
+    def field?(data)
       fields.include? data
     end
 
@@ -302,12 +319,20 @@ class FasterCSV
     #
     # This method returns the row for chaining.
     #
-    def each(
+    def each(&block)
       @row.each(&block)
 
       self # for chaining
     end
 
+    #
+    # Returns +true+ if this row contains the same headers and fields in the
+    # same order as +other+.
+    #
+    def ==(other)
+      @row == other.row
+    end
+
     #
     # Collapses the row into a simple Hash. Be warning that this discards field
     # order and clobbers duplicate fields.
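The new ==() relies on the protected row reader added earlier in this file: another instance of the same class may read the internal pairs, while outside callers may not. A minimal sketch of that pattern, using a hypothetical EqualRowSketch class rather than FasterCSV::Row itself:

```ruby
# Hypothetical stand-in showing the protected-reader equality pattern.
class EqualRowSketch
  def initialize(headers, fields)
    @row = headers.zip(fields) # ordered [header, field] pairs
  end

  # Internal data, readable only by other EqualRowSketch instances.
  attr_reader :row
  protected   :row

  # Equal when headers and fields match, in the same order.
  def ==(other)
    @row == other.row
  end
end

a = EqualRowSketch.new(%w[first last], %w[James Gray])
b = EqualRowSketch.new(%w[first last], %w[James Gray])
c = EqualRowSketch.new(%w[last first], %w[Gray James])
a == b # same pairs in the same order
a == c # same data, different order
```

In this sketch, comparing against an object that has no row method raises NoMethodError rather than returning false.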
@@ -322,11 +347,331 @@ class FasterCSV
     #
     # faster_csv_row.fields.to_csv( options )
     #
-    def to_csv(
+    def to_csv(options = Hash.new)
       fields.to_csv(options)
     end
     alias_method :to_s, :to_csv
   end
+
+  #
+  # A FasterCSV::Table is a two-dimensional data structure for representing CSV
+  # documents. Tables allow you to work with the data by row or column,
+  # manipulate the data, and even convert the results back to CSV, if needed.
+  #
+  # All tables returned by FasterCSV will be constructed from this class, if
+  # header row processing is activated.
+  #
+  class Table
+    #
+    # Construct a new FasterCSV::Table from +array_of_rows+, which are expected
+    # to be FasterCSV::Row objects. All rows are assumed to have the same
+    # headers.
+    #
+    def initialize(array_of_rows)
+      @table = array_of_rows
+      @mode = :col_or_row
+    end
+
+    # The current access mode for indexing and iteration.
+    attr_reader :mode
+
+    # Internal data format used to compare equality.
+    attr_reader :table
+    protected :table
+
+    #
+    # Returns a duplicate table object, in column mode. This is handy for
+    # chaining in a single call without changing the table mode, but be aware
+    # that this method can consume a fair amount of memory for bigger data sets.
+    #
+    # This method returns the duplicate table for chaining. Don't chain
+    # destructive methods (like []=()) this way though, since you are working
+    # with a duplicate.
+    #
+    def by_col
+      self.class.new(@table.dup).by_col!
+    end
+
+    #
+    # Switches the mode of this table to column mode. All calls to indexing and
+    # iteration methods will work with columns until the mode is changed again.
+    #
+    # This method returns the table and is safe to chain.
+    #
+    def by_col!
+      @mode = :col
+
+      self
+    end
+
+    #
+    # Returns a duplicate table object, in mixed mode. This is handy for
+    # chaining in a single call without changing the table mode, but be aware
+    # that this method can consume a fair amount of memory for bigger data sets.
+    #
+    # This method returns the duplicate table for chaining. Don't chain
+    # destructive methods (like []=()) this way though, since you are working
+    # with a duplicate.
+    #
+    def by_col_or_row
+      self.class.new(@table.dup).by_col_or_row!
+    end
+
+    #
+    # Switches the mode of this table to mixed mode. All calls to indexing and
+    # iteration methods will use the default intelligent indexing system until
+    # the mode is changed again. In mixed mode an index is assumed to be a row
+    # reference while anything else is assumed to be column access by headers.
+    #
+    # This method returns the table and is safe to chain.
+    #
+    def by_col_or_row!
+      @mode = :col_or_row
+
+      self
+    end
+
+    #
+    # Returns a duplicate table object, in row mode. This is handy for chaining
+    # in a single call without changing the table mode, but be aware that this
+    # method can consume a fair amount of memory for bigger data sets.
+    #
+    # This method returns the duplicate table for chaining. Don't chain
+    # destructive methods (like []=()) this way though, since you are working
+    # with a duplicate.
+    #
+    def by_row
+      self.class.new(@table.dup).by_row!
+    end
+
+    #
+    # Switches the mode of this table to row mode. All calls to indexing and
+    # iteration methods will work with rows until the mode is changed again.
+    #
+    # This method returns the table and is safe to chain.
+    #
+    def by_row!
+      @mode = :row
+
+      self
+    end
+
+    #
+    # Returns the headers for the first row of this table (assumed to match all
+    # other rows). An empty Array is returned for empty tables.
+    #
+    def headers
+      if @table.empty?
+        Array.new
+      else
+        @table.first.headers
+      end
+    end
+
+    #
+    # In the default mixed mode, this method returns rows for index access and
+    # columns for header access. You can force the index association by first
+    # calling by_col!() or by_row!().
+    #
+    # Columns are returned as an Array of values. Altering that Array has no
+    # effect on the table.
+    #
+    def [](index_or_header)
+      if @mode == :row or # by index
+         (@mode == :col_or_row and index_or_header.is_a? Integer)
+        @table[index_or_header]
+      else # by header
+        @table.map { |row| row[index_or_header] }
+      end
+    end
+
+    #
+    # In the default mixed mode, this method assigns rows for index access and
+    # columns for header access. You can force the index association by first
+    # calling by_col!() or by_row!().
+    #
+    # Rows may be set to an Array of values (which will inherit the table's
+    # headers()) or a FasterCSV::Row.
+    #
+    # Columns may be set to a single value, which is copied to each row of the
+    # column, or an Array of values. Arrays of values are assigned to rows top
+    # to bottom in row major order. Excess values are ignored and if the Array
+    # does not have a value for each row the extra rows will receive a +nil+.
+    #
+    # Assigning to an existing column or row clobbers the data. Assigning to
+    # new columns creates them at the right end of the table.
+    #
+    def []=(index_or_header, value)
+      if @mode == :row or # by index
+         (@mode == :col_or_row and index_or_header.is_a? Integer)
+        if value.is_a? Array
+          @table[index_or_header] = Row.new(headers, value)
+        else
+          @table[index_or_header] = value
+        end
+      else # set column
+        if value.is_a? Array # multiple values
+          @table.each_with_index do |row, i|
+            if row.header_row?
+              row[index_or_header] = index_or_header
+            else
+              row[index_or_header] = value[i]
+            end
+          end
+        else # repeated value
+          @table.each do |row|
+            if row.header_row?
+              row[index_or_header] = index_or_header
+            else
+              row[index_or_header] = value
+            end
+          end
+        end
+      end
+    end
+
+    #
+    # The mixed mode default is to treat a list of indices as row access,
+    # returning the rows indicated. Anything else is considered columnar
+    # access. For columnar access, the return set has an Array for each row
+    # with the values indicated by the headers in each Array. You can force
+    # column or row mode using by_col!() or by_row!().
+    #
+    # You cannot mix column and row access.
+    #
+    def values_at(*indices_or_headers)
+      if @mode == :row or # by indices
+         ( @mode == :col_or_row and indices_or_headers.all? do |index|
+           index.is_a?(Integer) or
+           ( index.is_a?(Range) and
+             index.first.is_a?(Integer) and
+             index.last.is_a?(Integer) )
+         end )
+        @table.values_at(*indices_or_headers)
+      else # by headers
+        @table.map { |row| row.values_at(*indices_or_headers) }
+      end
+    end
+
+    #
+    # Adds a new row to the bottom end of this table. You can provide an Array,
+    # which will be converted to a FasterCSV::Row (inheriting the table's
+    # headers()), or a FasterCSV::Row.
+    #
+    # This method returns the table for chaining.
+    #
+    def <<(row_or_array)
+      if row_or_array.is_a? Array # append Array
+        @table << Row.new(headers, row_or_array)
+      else # append Row
+        @table << row_or_array
+      end
+
+      self # for chaining
+    end
+
+    #
+    # A shortcut for appending multiple rows. Equivalent to:
+    #
+    # rows.each { |row| self << row }
+    #
+    # This method returns the table for chaining.
+    #
+    def push(*rows)
+      rows.each { |row| self << row }
+
+      self # for chaining
+    end
+
+    #
+    # Removes and returns the indicated column or row. In the default mixed
+    # mode indices refer to rows and everything else is assumed to be a column
+    # header. Use by_col!() or by_row!() to force the lookup.
+    #
+    def delete(index_or_header)
+      if @mode == :row or # by index
+         (@mode == :col_or_row and index_or_header.is_a? Integer)
+        @table.delete_at(index_or_header)
+      else # by header
+        @table.map { |row| row.delete(index_or_header).last }
+      end
+    end
+
+    #
+    # Removes any column or row for which the block returns +true+. In the
+    # default mixed mode or row mode, iteration is the standard row major
+    # walking of rows. In column mode, interation will +yield+ two element
+    # tuples containing the column name and an Array of values for that column.
+    #
+    # This method returns the table for chaining.
+    #
+    def delete_if(&block)
+      if @mode == :row or @mode == :col_or_row # by index
+        @table.delete_if(&block)
+      else # by header
+        to_delete = Array.new
+        headers.each_with_index do |header, i|
+          to_delete << header if block[[header, self[header]]]
+        end
+        to_delete.map { |header| delete(header) }
+      end
+
+      self # for chaining
+    end
+
+    include Enumerable
+
+    #
+    # In the default mixed mode or row mode, iteration is the standard row major
+    # walking of rows. In column mode, interation will +yield+ two element
+    # tuples containing the column name and an Array of values for that column.
+    #
+    # This method returns the table for chaining.
+    #
+    def each(&block)
+      if @mode == :col
+        headers.each { |header| block[[header, self[header]]] }
+      else
+        @table.each(&block)
+      end
+
+      self # for chaining
+    end
+
+    # Returns +true+ if all rows of this table ==() +other+'s rows.
+    def ==(other)
+      @table == other.table
+    end
+
+    #
+    # Returns the table as an Array of Arrays. Headers will be the first row,
+    # then all of the field rows will follow.
+    #
+    def to_a
+      @table.inject([headers]) do |array, row|
+        if row.header_row?
+          array
+        else
+          array + [row.fields]
+        end
+      end
+    end
+
+    #
+    # Returns the table as a complete CSV String. Headers will be listed first,
+    # then all of the field rows.
+    #
+    def to_csv(options = Hash.new)
+      @table.inject([headers.to_csv(options)]) do |rows, row|
+        if row.header_row?
+          rows
+        else
+          rows + [row.fields.to_csv(options)]
+        end
+      end.join
+    end
+    alias_method :to_s, :to_csv
+  end
 
   # The error thrown when the parser encounters illegal CSV formatting.
   class MalformedCSVError < RuntimeError; end
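The Table comments in this hunk describe three access modes and the rules for column assignment. The sketch below illustrates those rules under simplifying assumptions: TableSketch is a hypothetical stand-in that stores rows as plain Hashes instead of FasterCSV::Row objects, supports column assignment by header only, and ignores header rows.

```ruby
# Hypothetical stand-in for FasterCSV::Table; rows are plain Hashes here.
class TableSketch
  attr_reader :mode

  def initialize(headers, rows)
    @table = rows.map { |values| Hash[headers.zip(values)] }
    @mode  = :col_or_row # mixed mode is the default
  end

  def by_col!;        @mode = :col;        self; end
  def by_row!;        @mode = :row;        self; end
  def by_col_or_row!; @mode = :col_or_row; self; end

  # Mixed mode: Integer means row, anything else means column.
  def [](index_or_header)
    if row_access?(index_or_header)
      @table[index_or_header]
    else
      @table.map { |row| cell(row, index_or_header) }
    end
  end

  # Column assignment: a single value is repeated, an Array is assigned top
  # to bottom, and rows without a matching Array element receive nil.
  def []=(header, value)
    raise ArgumentError, "sketch assigns columns by header only" if header.is_a?(Integer)
    @table.each_with_index do |row, i|
      row[header] = value.is_a?(Array) ? value[i] : value
    end
  end

  private

  def row_access?(index_or_header)
    @mode == :row || (@mode == :col_or_row && index_or_header.is_a?(Integer))
  end

  def cell(row, index_or_header)
    index_or_header.is_a?(Integer) ? row.values[index_or_header] : row[index_or_header]
  end
end

table = TableSketch.new(%w[first_name last_name age],
                        [%w[zaphod beeblebrox 42], %w[ara howard 34]])
table[0]                        # mixed mode, Integer: the first row
table["first_name"]             # mixed mode, header: that column's values
table["type"] = "name"          # new column, same value in every row
table["ssn"]  = %w[123-456-7890] # short Array: the second row receives nil
table.by_col![1]                # column mode: Integer now selects a column
```

New columns land at the right end of each row because Ruby Hashes keep insertion order, which mirrors the append behaviour the comments describe.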
@@ -442,15 +787,15 @@ class FasterCSV
   #
   def self.build_csv_interface
     Object.const_set(:CSV, Class.new).class_eval do
-      def self.foreach(
+      def self.foreach(path, rs = :auto, &block) # :nodoc:
         FasterCSV.foreach(path, :row_sep => rs, &block)
       end
 
-      def self.generate_line(
+      def self.generate_line(row, fs = ",", rs = "") # :nodoc:
         FasterCSV.generate_line(row, :col_sep => fs, :row_sep => rs)
       end
 
-      def self.open(
+      def self.open(path, mode, fs = ",", rs = :auto, &block) # :nodoc:
         if block and mode.include? "r"
           FasterCSV.open(path, mode, :col_sep => fs, :row_sep => rs) do |csv|
             csv.each(&block)
@@ -460,15 +805,15 @@ class FasterCSV
         end
       end
 
-      def self.parse(
+      def self.parse(str_or_readable, fs = ",", rs = :auto, &block) # :nodoc:
         FasterCSV.parse(str_or_readable, :col_sep => fs, :row_sep => rs, &block)
       end
 
-      def self.parse_line(
+      def self.parse_line(src, fs = ",", rs = :auto) # :nodoc:
         FasterCSV.parse_line(src, :col_sep => fs, :row_sep => rs)
       end
 
-      def self.readlines(
+      def self.readlines(path, rs = :auto) # :nodoc:
         FasterCSV.readlines(path, :row_sep => rs)
       end
     end
@@ -509,7 +854,7 @@ class FasterCSV
   # The +io+ parameter can be used to serialize to a File, and +options+ can be
   # anything FasterCSV::new() accepts.
   #
-  def self.dump(
+  def self.dump(ary_of_objs, io = "", options = Hash.new)
     obj_template = ary_of_objs.first
 
     csv = FasterCSV.new(io, options)
@@ -566,7 +911,7 @@ class FasterCSV
   #
   # The +input+ and +output+ arguments can be anything FasterCSV::new() accepts
   # (generally String or IO objects). If not given, they default to
-  # <tt>ARGF</tt> and <tt
+  # <tt>ARGF</tt> and <tt>$stdout</tt>.
   #
   # The +options+ parameter is also filtered down to FasterCSV::new() after some
   # clever key parsing. Any key beginning with <tt>:in_</tt> or
@@ -578,7 +923,7 @@ class FasterCSV
   # The <tt>:output_row_sep</tt> +option+ defaults to
   # <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
   #
-  def self.filter(
+  def self.filter(*args)
     # parse options for input, output, or both
     in_options, out_options = Hash.new, {:row_sep => $INPUT_RECORD_SEPARATOR}
     if args.last.is_a? Hash
@@ -595,8 +940,8 @@ class FasterCSV
       end
     end
     # build input and output wrappers
-    input = FasterCSV.new(args.shift || ARGF,
-    output = FasterCSV.new(args.shift ||
+    input = FasterCSV.new(args.shift || ARGF, in_options)
+    output = FasterCSV.new(args.shift || $stdout, out_options)
 
     # read, yield, write
     input.each do |row|
@@ -610,9 +955,9 @@ class FasterCSV
|
|
610
955
|
# pass a +path+ and any +options+ you wish to set for the read. Each row of
|
611
956
|
# file will be passed to the provided +block+ in turn.
|
612
957
|
#
|
613
|
-
# The +options+ parameter can be
|
958
|
+
# The +options+ parameter can be anything FasterCSV::new() understands.
|
614
959
|
#
|
615
|
-
def self.foreach(
|
960
|
+
def self.foreach(path, options = Hash.new, &block)
|
616
961
|
open(path, options) do |csv|
|
617
962
|
csv.each(&block)
|
618
963
|
end
|
@@ -633,7 +978,7 @@ class FasterCSV
|
|
633
978
|
#
|
634
979
|
# The +options+ parameter can be anything FasterCSV::new() understands.
|
635
980
|
#
|
636
|
-
def self.generate(
|
981
|
+
def self.generate(*args)
|
637
982
|
# add a default empty String, if none was given
|
638
983
|
if args.first.is_a? String
|
639
984
|
io = StringIO.new(args.shift)
|
@@ -656,7 +1001,7 @@ class FasterCSV
|
|
656
1001
|
# The <tt>:row_sep</tt> +option+ defaults to <tt>$INPUT_RECORD_SEPARATOR</tt>
|
657
1002
|
# (<tt>$/</tt>) when calling this method.
|
658
1003
|
#
|
659
|
-
def self.generate_line(
|
1004
|
+
def self.generate_line(row, options = Hash.new)
|
660
1005
|
options = {:row_sep => $INPUT_RECORD_SEPARATOR}.merge(options)
|
661
1006
|
(new("", options) << row).string
|
662
1007
|
end
|
@@ -665,12 +1010,12 @@ class FasterCSV
|
|
665
1010
|
# This method will return a FasterCSV instance, just like FasterCSV::new(),
|
666
1011
|
# but the instance will be cached and returned for all future calls to this
|
667
1012
|
# method for the same +data+ object (tested by Object#object_id()) with the
|
668
|
-
# same +options
|
1013
|
+
# same +options+.
|
669
1014
|
#
|
670
1015
|
# If a block is given, the instance is passed to the block and the return
|
671
1016
|
# value becomes the return value of the block.
|
672
1017
|
#
|
673
|
-
def self.instance(
|
1018
|
+
def self.instance(data = $stdout, options = Hash.new)
|
674
1019
|
# create a _signature_ for this method call, data object and options
|
675
1020
|
sig = [data.object_id] +
|
676
1021
|
options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })
|
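Review note: the signature built in this hunk is what makes the cache in FasterCSV::instance() work. The following is only a minimal sketch of that idea, not the library's code. `SKETCH_DEFAULTS`, `SKETCH_CACHE` and `sketch_instance` are hypothetical names, and `Object.new` stands in for the real `FasterCSV.new(data, options)` call, because the rest of the method is not visible in the diff.

```ruby
# Hypothetical stand-ins: the real DEFAULT_OPTIONS and the real cache are
# not shown in this hunk.
SKETCH_DEFAULTS = { :col_sep => ",", :row_sep => :auto }
SKETCH_CACHE    = Hash.new

def sketch_instance(data, options = Hash.new)
  # signature: identity of the data object plus the option values, read in
  # a stable (sorted) key order so equal options give equal signatures
  sig = [data.object_id] +
        options.values_at(*SKETCH_DEFAULTS.keys.sort_by { |sym| sym.to_s })

  # Object.new is a placeholder for FasterCSV.new(data, options)
  SKETCH_CACHE[sig] ||= Object.new
end

buffer = String.new("shared buffer")

same_a      = sketch_instance(buffer, :col_sep => ";")
same_b      = sketch_instance(buffer, :col_sep => ";")
other_opts  = sketch_instance(buffer, :col_sep => "\t")
other_data  = sketch_instance(buffer.dup, :col_sep => ";")
```

Here `same_a` and `same_b` are the same object, while a different option value or a different data object (even one with equal contents) produces a separate cache entry.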
@@ -698,7 +1043,7 @@ class FasterCSV
|
|
698
1043
|
# something else, use +options+ to setup converters or provide a custom
|
699
1044
|
# csv_load() implementation.
|
700
1045
|
#
|
701
|
-
def self.load(
|
1046
|
+
def self.load(io_or_str, options = Hash.new)
|
702
1047
|
csv = FasterCSV.new(io_or_str, options)
|
703
1048
|
|
704
1049
|
# load meta information
|
@@ -768,7 +1113,6 @@ class FasterCSV
|
|
768
1113
|
# * pid()
|
769
1114
|
# * pos()
|
770
1115
|
# * reopen()
|
771
|
-
# * rewind()
|
772
1116
|
# * seek()
|
773
1117
|
# * stat()
|
774
1118
|
# * sync()
|
@@ -778,7 +1122,7 @@ class FasterCSV
|
|
778
1122
|
# * to_io()
|
779
1123
|
# * tty?()
|
780
1124
|
#
|
781
|
-
def self.open(
|
1125
|
+
def self.open(*args)
|
782
1126
|
# find the +options+ Hash
|
783
1127
|
options = if args.last.is_a? Hash then args.pop else Hash.new end
|
784
1128
|
# wrap a File opened with the remaining +args+
|
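Review note: FasterCSV::open() now takes `*args` and peels a trailing options Hash off the end before handing the rest to File. A small sketch of just that step, assuming a hypothetical helper named `split_options` (it does not exist in the library):

```ruby
# Hypothetical helper: separates a trailing options Hash from the
# positional arguments, using the same expression as the hunk above.
def split_options(*args)
  options = if args.last.is_a? Hash then args.pop else Hash.new end
  [args, options]
end

file_args, options     = split_options("data.csv", "w", :col_sep => "\t")
plain_args, no_options = split_options("data.csv")
```

The first call leaves `["data.csv", "w"]` for File and `{:col_sep => "\t"}` for the CSV wrapper. The second call yields an empty options Hash.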
@@ -808,7 +1152,7 @@ class FasterCSV
|
|
808
1152
|
# You pass your +str+ to read from, and an optional +options+ Hash containing
|
809
1153
|
# anything FasterCSV::new() understands.
|
810
1154
|
#
|
811
|
-
def self.parse(
|
1155
|
+
def self.parse(*args, &block)
|
812
1156
|
csv = new(*args)
|
813
1157
|
if block.nil? # slurp contents, if no block is given
|
814
1158
|
begin
|
@@ -828,7 +1172,7 @@ class FasterCSV
|
|
828
1172
|
#
|
829
1173
|
# The +options+ parameter can be anything FasterCSV::new() understands.
|
830
1174
|
#
|
831
|
-
def self.parse_line(
|
1175
|
+
def self.parse_line(line, options = Hash.new)
|
832
1176
|
new(line, options).shift
|
833
1177
|
end
|
834
1178
|
|
@@ -836,12 +1180,12 @@ class FasterCSV
|
|
836
1180
|
# Use to slurp a CSV file into an Array of Arrays. Pass the +path+ to the
|
837
1181
|
# file and any +options+ FasterCSV::new() understands.
|
838
1182
|
#
|
839
|
-
def self.read(
|
1183
|
+
def self.read(path, options = Hash.new)
|
840
1184
|
open(path, options) { |csv| csv.read }
|
841
1185
|
end
|
842
1186
|
|
843
1187
|
# Alias for FasterCSV::read().
|
844
|
-
def self.readlines(
|
1188
|
+
def self.readlines(*args)
|
845
1189
|
read(*args)
|
846
1190
|
end
|
847
1191
|
|
@@ -906,7 +1250,9 @@ class FasterCSV
|
|
906
1250
|
# Array of headers. This setting causes
|
907
1251
|
# FasterCSV.shift() to return rows as
|
908
1252
|
# FasterCSV::Row objects instead of
|
909
|
-
# Arrays.
|
1253
|
+
# Arrays and FasterCSV.read() to return
|
1254
|
+
# FasterCSV::Table objects instead of
|
1255
|
+
# an Array of Arrays.
|
910
1256
|
# <b><tt>:return_headers</tt></b>:: When +false+, header rows are silently
|
911
1257
|
# swallowed. If set to +true+, header
|
912
1258
|
# rows are returned in a FasterCSV::Row
|
@@ -923,7 +1269,7 @@ class FasterCSV
|
|
923
1269
|
# Options cannot be overridden in the instance methods for performance reasons,
|
924
1270
|
# so be sure to set what you want here.
|
925
1271
|
#
|
926
|
-
def initialize(
|
1272
|
+
def initialize(data, options = Hash.new)
|
927
1273
|
# build the options for this read/write
|
928
1274
|
options = DEFAULT_OPTIONS.merge(options)
|
929
1275
|
|
@@ -943,19 +1289,27 @@ class FasterCSV
|
|
943
1289
|
@lineno = 0
|
944
1290
|
end
|
945
1291
|
|
1292
|
+
### IO and StringIO Delegation ###
|
1293
|
+
|
946
1294
|
#
|
947
1295
|
# The line number of the last row read from this file. Fields with nested
|
948
1296
|
# line-end characters will not affect this count.
|
949
1297
|
#
|
950
1298
|
attr_reader :lineno
|
951
1299
|
|
952
|
-
### IO and StringIO Delegation ###
|
953
|
-
|
954
1300
|
extend Forwardable
|
955
1301
|
def_delegators :@io, :binmode, :close, :close_read, :close_write, :closed?,
|
956
1302
|
:eof, :eof?, :fcntl, :fileno, :flush, :fsync, :ioctl,
|
957
|
-
:isatty, :pid, :pos, :reopen, :
|
958
|
-
:
|
1303
|
+
:isatty, :pid, :pos, :reopen, :seek, :stat, :string,
|
1304
|
+
:sync, :sync=, :tell, :to_i, :to_io, :tty?
|
1305
|
+
|
1306
|
+
# Rewinds the underlying IO object and resets FasterCSV's lineno() counter.
|
1307
|
+
def rewind
|
1308
|
+
@headers = nil
|
1309
|
+
@lineno = 0
|
1310
|
+
|
1311
|
+
@io.rewind
|
1312
|
+
end
|
959
1313
|
|
960
1314
|
### End Delegation ###
|
961
1315
|
|
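Review note: rewind() is no longer delegated straight to the IO object, because the reader keeps its own state (`@headers`, `@lineno`) that has to be reset as well. The sketch below shows why, using a deliberately tiny reader. `TinyReader` is a hypothetical class written for this note: it splits on commas and ignores quoting, and only its `rewind` method mirrors the diff.

```ruby
require "stringio"

# Hypothetical miniature reader; only #rewind mirrors the code in the diff.
class TinyReader
  attr_reader :lineno

  def initialize(io)
    @io      = io
    @headers = nil
    @lineno  = 0
  end

  # Returns the next row as a Hash keyed by header, or nil at end of input.
  def shift
    line = @io.gets or return nil
    @lineno += 1
    fields = line.chomp.split(",")   # naive split, no quoting support
    if @headers.nil?                 # first line read becomes the header row
      @headers = fields
      return shift
    end
    Hash[@headers.zip(fields)]
  end

  # Reset header processing and the line counter, then rewind the IO.
  def rewind
    @headers = nil
    @lineno  = 0

    @io.rewind
  end
end

reader = TinyReader.new(StringIO.new("name,qty\nbolt,3\nnut,5\n"))
first  = reader.shift
reader.rewind
lineno_after_rewind = reader.lineno
again  = reader.shift
```

Without the `@headers = nil` line, the second pass would treat the header line as data. Without `@lineno = 0`, the counter would keep growing across passes.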
@@ -967,16 +1321,16 @@ class FasterCSV
|
|
967
1321
|
#
|
968
1322
|
# The data source must be open for writing.
|
969
1323
|
#
|
970
|
-
def <<(
|
1324
|
+
def <<(row)
|
971
1325
|
# handle FasterCSV::Row objects
|
972
1326
|
row = row.fields if row.is_a? self.class::Row
|
973
1327
|
|
974
1328
|
@io << row.map do |field|
|
975
|
-
if field.nil? #
|
1329
|
+
if field.nil? # represent +nil+ fields as empty unquoted fields
|
976
1330
|
""
|
977
1331
|
else
|
978
1332
|
field = String(field) # Stringify fields
|
979
|
-
#
|
1333
|
+
# represent empty fields as empty quoted fields
|
980
1334
|
if field.empty? or field.count(%Q{\r\n#{@col_sep}"}).nonzero?
|
981
1335
|
%Q{"#{field.gsub('"', '""')}"} # escape quoted fields
|
982
1336
|
else
|
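Review note: the two new comments in this hunk describe a distinction that is easy to miss: `nil` is written as an empty unquoted field, while an empty String is written as `""`. The sketch below lifts that branch into a standalone helper. `quote_field` is a hypothetical name, and joining with a plain `","` is an assumption, since the rest of `<<` is cut off in the diff.

```ruby
# Hypothetical standalone helper mirroring the branch shown in the hunk.
def quote_field(field, col_sep = ",")
  if field.nil?   # represent nil fields as empty unquoted fields
    ""
  else
    field = String(field)   # Stringify fields
    # empty fields, and fields containing CR, LF, the separator or a quote,
    # are written quoted, with embedded quotes doubled
    if field.empty? or field.count(%Q{\r\n#{col_sep}"}).nonzero?
      %Q{"#{field.gsub('"', '""')}"}
    else
      field
    end
  end
end

row  = [nil, "", "plain", "a,b", %q{say "hi"}, 42]
line = row.map { |field| quote_field(field) }.join(",")
```

For this row, `line` comes out as `,"",plain,"a,b","say ""hi""",42`, so a reader can tell the `nil` in the first position from the empty String in the second.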
@@ -1005,7 +1359,7 @@ class FasterCSV
|
|
1005
1359
|
# containing details about the field. Again, the block should return a
|
1006
1360
|
# converted field or the field itself.
|
1007
1361
|
#
|
1008
|
-
def convert(
|
1362
|
+
def convert(name = nil, &converter)
|
1009
1363
|
add_converter(:converters, self.class::Converters, name, &converter)
|
1010
1364
|
end
|
1011
1365
|
|
@@ -1020,7 +1374,7 @@ class FasterCSV
|
|
1020
1374
|
# Note that this method must be called before header rows are read to have any
|
1021
1375
|
# effect.
|
1022
1376
|
#
|
1023
|
-
def header_convert(
|
1377
|
+
def header_convert(name = nil, &converter)
|
1024
1378
|
add_converter( :header_converters,
|
1025
1379
|
self.class::HeaderConverters,
|
1026
1380
|
name,
|
@@ -1048,7 +1402,12 @@ class FasterCSV
|
|
1048
1402
|
# The data source must be open for reading.
|
1049
1403
|
#
|
1050
1404
|
def read
|
1051
|
-
to_a
|
1405
|
+
rows = to_a
|
1406
|
+
if @use_headers
|
1407
|
+
Table.new(rows)
|
1408
|
+
else
|
1409
|
+
rows
|
1410
|
+
end
|
1052
1411
|
end
|
1053
1412
|
alias_method :readlines, :read
|
1054
1413
|
|
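Review note: with this change the slurping methods return a Table when headers are in use and a plain Array of Arrays otherwise. The example below uses Ruby's standard `csv` library rather than the fastercsv gem, on the assumption that it behaves like the code in this hunk (the standard library's CSV grew out of FasterCSV, and its class is named `CSV::Table` rather than `FasterCSV::Table`).

```ruby
require "csv"

data = "name,qty\nbolt,3\nnut,5\n"

# No :headers option: an Array of Arrays, header line included as data.
plain = CSV.parse(data)

# :headers => true: a table object that supports column access by header.
table = CSV.parse(data, :headers => true)
qty   = table["qty"]
```

`plain` holds three rows of Strings, while `table["qty"]` returns the column `["3", "5"]`.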
@@ -1112,7 +1471,7 @@ class FasterCSV
|
|
1112
1471
|
# on these
|
1113
1472
|
#
|
1114
1473
|
csv = if parse.sub!(@parsers[:leading_fields], "")
|
1115
|
-
[nil] * $&.length
|
1474
|
+
[nil] * ($&.length / @col_sep.length)
|
1116
1475
|
else
|
1117
1476
|
Array.new
|
1118
1477
|
end
|
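Review note: this one-line change is the multibyte `:col_sep` fix from the CHANGELOG. The matched run of leading separators has to be counted in separators, not in characters. A minimal sketch with a two-character separator follows. The pattern is the new `:leading_fields` Regexp from init_parsers() later in this diff, and the variable names are only for illustration.

```ruby
col_sep        = "::"
leading_fields = /\A(?:#{Regexp.escape(col_sep)})+/

parse = "::::a::b".dup   # two empty leading fields, then "a" and "b"

if parse.sub!(leading_fields, "")
  matched       = $&                                        # "::::"
  per_character = [nil] * matched.length                    # 0.2.1 style count
  per_separator = [nil] * (matched.length / col_sep.length) # 1.0.0 style count
else
  per_character = per_separator = Array.new
end
```

Counting characters yields four `nil` fields for this line, which is two too many. Dividing by the separator length yields the expected two, and `parse` is left holding `a::b` for the main parser.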
@@ -1176,12 +1535,11 @@ class FasterCSV
|
|
1176
1535
|
# Stores the indicated separators for later use.
|
1177
1536
|
#
|
1178
1537
|
# If auto-discovery was requested for <tt>@row_sep</tt>, this method will read
|
1179
|
-
# ahead in the <tt>@io</tt> and try to find one.
|
1180
|
-
#
|
1181
|
-
#
|
1182
|
-
# <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
|
1538
|
+
# ahead in the <tt>@io</tt> and try to find one. +ARGF+, +STDIN+, +STDOUT+,
|
1539
|
+
# +STDERR+ and any stream open for output only with a default
|
1540
|
+
# <tt>@row_sep</tt> of <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>).
|
1183
1541
|
#
|
1184
|
-
def init_separators(
|
1542
|
+
def init_separators(options)
|
1185
1543
|
# store the selected separators
|
1186
1544
|
@col_sep = options.delete(:col_sep)
|
1187
1545
|
@row_sep = options.delete(:row_sep)
|
@@ -1222,11 +1580,11 @@ class FasterCSV
|
|
1222
1580
|
end
|
1223
1581
|
|
1224
1582
|
# Pre-compiles parsers and stores them by name for access during reads.
|
1225
|
-
def init_parsers(
|
1583
|
+
def init_parsers(options)
|
1226
1584
|
# prebuild Regexps for faster parsing
|
1227
1585
|
@parsers = {
|
1228
1586
|
:leading_fields =>
|
1229
|
-
/\A
|
1587
|
+
/\A(?:#{Regexp.escape(@col_sep)})+/, # for empty leading fields
|
1230
1588
|
:csv_row =>
|
1231
1589
|
### The Primary Parser ###
|
1232
1590
|
/ \G(?:^|#{Regexp.escape(@col_sep)}) # anchor the match
|
@@ -1250,7 +1608,7 @@ class FasterCSV
|
|
1250
1608
|
# The <tt>:unconverted_fields</tt> option is also activated for
|
1251
1609
|
# <tt>:converters</tt> calls, if requested.
|
1252
1610
|
#
|
1253
|
-
def init_converters(
|
1611
|
+
def init_converters(options, field_name = :converters)
|
1254
1612
|
if field_name == :converters
|
1255
1613
|
@unconverted_fields = options.delete(:unconverted_fields)
|
1256
1614
|
end
|
@@ -1280,7 +1638,7 @@ class FasterCSV
|
|
1280
1638
|
end
|
1281
1639
|
|
1282
1640
|
# Stores header row settings and loads header converters, if needed.
|
1283
|
-
def init_headers(
|
1641
|
+
def init_headers(options)
|
1284
1642
|
@use_headers = options.delete(:headers)
|
1285
1643
|
@return_headers = options.delete(:return_headers)
|
1286
1644
|
|
@@ -1299,7 +1657,7 @@ class FasterCSV
|
|
1299
1657
|
# normal parameters of the FasterCSV.convert() and FasterCSV.header_convert()
|
1300
1658
|
# methods.
|
1301
1659
|
#
|
1302
|
-
def add_converter(
|
1660
|
+
def add_converter(var_name, const, name = nil, &converter)
|
1303
1661
|
if name.nil? # custom converter
|
1304
1662
|
instance_variable_get("@#{var_name}") << converter
|
1305
1663
|
else # named converter
|
@@ -1322,7 +1680,7 @@ class FasterCSV
|
|
1322
1680
|
# the pipeline of conversion for that field. This is primarily an efficiency
|
1323
1681
|
# shortcut.
|
1324
1682
|
#
|
1325
|
-
def convert_fields(
|
1683
|
+
def convert_fields(fields, headers = false)
|
1326
1684
|
# see if we are converting headers or fields
|
1327
1685
|
converters = headers ? @header_converters : @converters
|
1328
1686
|
|
@@ -1350,7 +1708,7 @@ class FasterCSV
|
|
1350
1708
|
# When +nil+, +row+ is assumed to be a header row not based on an actual row
|
1351
1709
|
# of the stream.
|
1352
1710
|
#
|
1353
|
-
def parse_headers(
|
1711
|
+
def parse_headers(row = nil)
|
1354
1712
|
if @headers.nil? # header row
|
1355
1713
|
@headers = case @use_headers # save headers
|
1356
1714
|
when Array then @use_headers # Array of headers
|
@@ -1377,7 +1735,7 @@ class FasterCSV
|
|
1377
1735
|
# +row+ and an accessor method for it called unconverted_fields(). The
|
1378
1736
|
# variable is set to the contents of +fields+.
|
1379
1737
|
#
|
1380
|
-
def add_unconverted_fields(
|
1738
|
+
def add_unconverted_fields(row, fields)
|
1381
1739
|
class << row
|
1382
1740
|
attr_reader :unconverted_fields
|
1383
1741
|
end
|
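Review note: add_unconverted_fields() attaches the reader method to one row object only, through its singleton class, so ordinary Arrays are unaffected. A sketch of the technique follows, under a stated assumption: the line that stores the fields is not visible in this hunk, so `instance_variable_set` below is a guess at an equivalent, and `attach_unconverted_fields` is a hypothetical name.

```ruby
# Hypothetical re-creation of the technique; not the library's method body.
def attach_unconverted_fields(row, fields)
  class << row                      # open the singleton class of this row only
    attr_reader :unconverted_fields
  end
  # assumption: the real method stores +fields+ in some equivalent way
  row.instance_variable_set(:@unconverted_fields, fields)
  row
end

converted = [1, 2.5]
raw       = ["1", "2.5"]

row   = attach_unconverted_fields(converted, raw)
other = [1, 2.5]
```

`row.unconverted_fields` returns the original Strings, and `other`, an equal but separate Array, does not respond to `unconverted_fields` at all.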
@@ -1390,25 +1748,25 @@ end
|
|
1390
1748
|
FCSV = FasterCSV
|
1391
1749
|
|
1392
1750
|
# Another name for FasterCSV::instance().
|
1393
|
-
def FasterCSV(
|
1751
|
+
def FasterCSV(*args, &block)
|
1394
1752
|
FasterCSV.instance(*args, &block)
|
1395
1753
|
end
|
1396
1754
|
|
1397
1755
|
# Another name for FCSV::instance().
|
1398
|
-
def FCSV(
|
1756
|
+
def FCSV(*args, &block)
|
1399
1757
|
FCSV.instance(*args, &block)
|
1400
1758
|
end
|
1401
1759
|
|
1402
1760
|
class Array
|
1403
1761
|
# Equivalent to <tt>FasterCSV::generate_line(self, options)</tt>.
|
1404
|
-
def to_csv(
|
1762
|
+
def to_csv(options = Hash.new)
|
1405
1763
|
FasterCSV.generate_line(self, options)
|
1406
1764
|
end
|
1407
1765
|
end
|
1408
1766
|
|
1409
1767
|
class String
|
1410
1768
|
# Equivalent to <tt>FasterCSV::parse_line(self, options)</tt>.
|
1411
|
-
def parse_csv(
|
1769
|
+
def parse_csv(options = Hash.new)
|
1412
1770
|
FasterCSV.parse_line(self, options)
|
1413
1771
|
end
|
1414
1772
|
end
|