glennfu-faster_csv 1.5.5.1
- data/AUTHORS +1 -0
- data/CHANGELOG +187 -0
- data/COPYING +340 -0
- data/LICENSE +7 -0
- data/README +71 -0
- data/Rakefile +94 -0
- data/TODO +6 -0
- data/examples/csv_converters.rb +28 -0
- data/examples/csv_filter.rb +23 -0
- data/examples/csv_rails_import.task +21 -0
- data/examples/csv_reading.rb +57 -0
- data/examples/csv_table.rb +56 -0
- data/examples/csv_writing.rb +67 -0
- data/examples/purchase.csv +3 -0
- data/examples/shortcut_interface.rb +36 -0
- data/lib/faster_csv.rb +2021 -0
- data/lib/fastercsv.rb +10 -0
- metadata +85 -0
data/LICENSE
ADDED
@@ -0,0 +1,7 @@
= License Terms

Distributed under the user's choice of the {GPL Version 2}[http://www.gnu.org/licenses/old-licenses/gpl-2.0.html] (see COPYING for details) or the
{Ruby software license}[http://www.ruby-lang.org/en/LICENSE.txt] by
James Edward Gray II.

Please email James[mailto:james@grayproductions.net] with any questions.
data/README
ADDED
@@ -0,0 +1,71 @@
= Read Me

by James Edward Gray II

== Description

Welcome to FasterCSV.

FasterCSV is intended as a replacement to Ruby's standard CSV library. It was designed to address concerns users of that library had and it has three primary goals:

1. Be significantly faster than CSV while remaining a pure Ruby library.
2. Use a smaller and easier to maintain code base. (FasterCSV is larger now,
   but considerably richer in features. The parsing core remains quite small.)
3. Improve on the CSV interface.

Obviously, the last one is subjective. If you love CSV's interface, odds are
good this one won't suit you. I did try to defer to that interface whenever I
didn't have a compelling reason to change it though, so hopefully this won't be
too radically different.

== What's Different From CSV?

I'm sure I'll miss something, but I'll try to mention most of the major differences I am aware of, to help others quickly get up to speed:

=== CSV Parsing

* FasterCSV has a stricter parser and will throw MalformedCSVErrors on
  problematic data.
* FasterCSV has a less liberal idea of a line ending than CSV. What you set as
  the <tt>:row_sep</tt> is law.
* CSV returns empty lines as <tt>[nil]</tt>. FasterCSV calls them <tt>[]</tt>.
* FasterCSV has a much faster parser.
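
The parsing differences listed above can be tried out with a short sketch. It assumes a modern Ruby, where the standard csv library (the descendant of FasterCSV since Ruby 1.9) stands in for this gem. The helper name parse_or_error is hypothetical and not part of either library.

```ruby
require "csv"

# Hypothetical helper: parse +text+ and return the rows, or the name of the
# error class if the parser rejects the input.
def parse_or_error(text, options = {})
  CSV.parse(text, **options)
rescue CSV::MalformedCSVError => error
  error.class.name
end

# A stray quote in the middle of an unquoted field is rejected outright.
p parse_or_error(%Q{a,b"c,d\n})

# An empty line comes back as an empty Array, not [nil].
p parse_or_error("a,b\n\nc,d\n")

# With an explicit :row_sep, only that separator ends a row.
p parse_or_error("a,b|c,d|", :row_sep => "|")
```

The first call reports CSV::MalformedCSVError, the second returns an empty Array for the blank line, and the third breaks rows only at the pipe character.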

=== Interface

* FasterCSV uses Hash-style parameters to set options.
* FasterCSV does not have generate_row() or parse_row() from CSV.
* FasterCSV does not have CSV's Reader and Writer classes.
* FasterCSV::open() is more like Ruby's open() than CSV::open().
* FasterCSV objects support most standard IO methods.
* FasterCSV has a new() method used to wrap objects like String and IO for
  reading and writing.
* FasterCSV::generate() is different from CSV::generate().
* FasterCSV does not support partial reads. It works line-by-line.
* FasterCSV does not allow the instance methods to override the separators for
  performance reasons. They must be set in the constructor.
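
A few of the interface points above (Hash-style options, wrapping a String with new(), block-based generate(), and separators fixed at construction) might look like this in practice. This is a sketch under the assumption that Ruby's standard csv library, which inherited this interface, stands in for the gem on a modern Ruby. The method names read_pipe_separated and build_tab_separated are hypothetical.

```ruby
require "csv"

# Separators are given once, Hash-style, when the object is constructed.
# CSV.new wraps a String here; an IO object would work the same way.
def read_pipe_separated(text)
  csv = CSV.new(text, :col_sep => "|")
  csv.read
end

# generate() yields a CSV object that appends each row to a String and
# returns that String when the block finishes.
def build_tab_separated(rows)
  CSV.generate(:col_sep => "\t") do |csv|
    rows.each { |row| csv << row }
  end
end

p read_pipe_separated("a|b\nc|d\n")
puts build_tab_separated([%w[first last], %w[James Gray]])
```

Because the separator belongs to the object, reading the same data with a different delimiter means constructing a new object rather than passing an override to read.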

If you use this library and find yourself missing any functionality I have trimmed, please {let me know}[mailto:james@grayproductions.net].

== Documentation

See FasterCSV for documentation.

== Installing

See the INSTALL file for instructions.

== What is CSV, really?

FasterCSV maintains a pretty strict definition of CSV taken directly from {the RFC}[http://www.ietf.org/rfc/rfc4180.txt]. I relax the rules in only one place and that is to make using this library easier. FasterCSV will parse all valid CSV.

What you don't want to do is feed FasterCSV invalid CSV. Because of the way the CSV format works, it's common for a parser to need to read until the end of the file to be sure a field is invalid. This eats a lot of time and memory.

Luckily, when working with invalid CSV, Ruby's built-in methods will almost always be superior in every way. For example, parsing non-quoted fields is as easy as:

  data.split(",")
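
To make the trade-off concrete, here is a small comparison of plain splitting against a real parse. It is only a sketch: it uses Ruby's standard csv library as a stand-in for the gem on a modern Ruby, the helper names split_fields and parse_fields are hypothetical, and the quoted sample line is made up for illustration.

```ruby
require "csv"

# Naive splitting, suitable only when no field is quoted.
def split_fields(line)
  line.chomp.split(",")
end

# Full parsing, which honours quotes and separators embedded in fields.
def parse_fields(line)
  CSV.parse_line(line)
end

plain  = "1,Text Editor,25.00\n"
quoted = %Q{2,"MacBook Pros, 15 inch",2499.00\n}

p split_fields(plain) == parse_fields(plain)  # both give the same three fields
p split_fields(quoted).size                   # the quoted comma splits the field
p parse_fields(quoted)                        # the parser keeps it as one field
```

Splitting agrees with the parser on the unquoted line, but it cuts the quoted description in two, which is why it only suits data known to contain no quoted fields.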

== Questions and/or Comments

Feel free to email {James Edward Gray II}[mailto:james@grayproductions.net] with
any questions.
data/Rakefile
ADDED
@@ -0,0 +1,94 @@
require "rake/rdoctask"
require "rake/testtask"
require "rake/gempackagetask"

require "rubygems"

dir = File.dirname(__FILE__)
lib = File.join(dir, "lib", "faster_csv.rb")
version = File.read(lib)[/^\s*VERSION\s*=\s*(['"])(\d\.\d\.\d)\1/, 2]

task :default => [:test]

Rake::TestTask.new do |test|
  test.libs << "test"
  test.test_files = %w[test/ts_all.rb]
  test.verbose = true
end

Rake::RDocTask.new do |rdoc|
  rdoc.main = "README"
  rdoc.rdoc_dir = "doc/html"
  rdoc.title = "FasterCSV Documentation"
  rdoc.options = %w[--charset utf-8]
  rdoc.rdoc_files.include( "README", "INSTALL",
                           "TODO", "CHANGELOG",
                           "AUTHORS", "COPYING",
                           "LICENSE", "lib/" )
end

desc "Upload current documentation to Rubyforge"
task :upload_docs => [:rdoc] do
  sh "scp -r doc/html/* " +
     "bbazzarrakk@rubyforge.org:/var/www/gforge-projects/fastercsv/"
end

desc "Show library's code statistics"
task :stats do
  require 'code_statistics'
  CodeStatistics.new( ["FasterCSV", "lib"],
                      ["Units", "test"] ).to_s
end

desc "Time FasterCSV and CSV"
task :benchmark do
  TESTS = 6
  path = "test/test_data.csv"
  sh %Q{time ruby -r csv -e } +
     %Q{'#{TESTS}.times { CSV.foreach("#{path}") { |row| } }'}
  sh %Q{time ruby -r lib/faster_csv -e } +
     %Q{'#{TESTS}.times { FasterCSV.foreach("#{path}") { |row| } }'}
end

spec = Gem::Specification.new do |spec|
  spec.name = "fastercsv"
  spec.version = version

  spec.platform = Gem::Platform::RUBY
  spec.summary = "FasterCSV is CSV, but faster, smaller, and cleaner."

  spec.test_files = %w[test/ts_all.rb]
  spec.files = Dir.glob("{lib,test,examples}/**/*.rb").
                   reject { |item| item.include?(".svn") } +
               Dir.glob("{test,examples}/**/*.csv").
                   reject { |item| item.include?(".svn") } +
               %w[Rakefile test/line_endings.gz]

  spec.has_rdoc = true
  spec.extra_rdoc_files = %w[ AUTHORS COPYING README INSTALL TODO CHANGELOG
                              LICENSE ]
  spec.rdoc_options << "--title" << "FasterCSV Documentation" <<
                       "--main" << "README"

  spec.require_path = "lib"

  spec.author = "James Edward Gray II"
  spec.email = "james@grayproductions.net"
  spec.rubyforge_project = "fastercsv"
  spec.homepage = "http://fastercsv.rubyforge.org"
  spec.description = <<END_DESC
FasterCSV is intended as a complete replacement to the CSV standard library. It
is significantly faster and smaller while still being pure Ruby code. It also
strives for a better interface.
END_DESC
end

Rake::GemPackageTask.new(spec) do |pkg|
  pkg.need_zip = true
  pkg.need_tar = true
end

desc "Add new files to Subversion"
task :add_to_svn do
  sh %Q{svn status | ruby -nae 'system "svn add \#{$F[1]}" if $F[0] == "?"' }
end
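
The Rakefile reads the gem version out of lib/faster_csv.rb with a regular expression rather than loading the library. The sketch below isolates that pattern to show what it accepts. The constant VERSION_PATTERN, the method extract_version, and the sample source strings are hypothetical and exist only for illustration.

```ruby
# The pattern from the Rakefile: a VERSION assignment whose value is a quoted
# string of exactly three single-digit components.
VERSION_PATTERN = /^\s*VERSION\s*=\s*(['"])(\d\.\d\.\d)\1/

# Return the second capture group (the version digits) or nil if no match.
def extract_version(source)
  source[VERSION_PATTERN, 2]
end

p extract_version(%Q{  VERSION = "1.5.0".freeze\n})  # three single digits match
p extract_version(%Q{VERSION = "1.5.5.1"\n})         # a four-part version does not
p extract_version(%Q{VERSION = "1.10.0"\n})          # nor does a two-digit component
```

The backreference requires the closing quote to match the opening one immediately after the third digit, so any version string outside the single-digit x.y.z shape yields nil and would leave the gem specification without a version.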
data/TODO
ADDED
data/examples/csv_converters.rb
ADDED
@@ -0,0 +1,28 @@
#!/usr/local/bin/ruby -w

# csv_converters.rb
#
# Created by James Edward Gray II on 2006-11-05.
# Copyright 2006 Gray Productions. All rights reserved.

require "faster_csv"

# convert a specific column
options = {
  :headers => true,
  :header_converters => :symbol,
  :converters => [
    lambda { |f, info| info.index.zero? ? f.to_i : f },
    lambda { |f, info| info.header == :floats ? f.to_f : f }
  ]
}
table = FCSV(DATA, options) { |csv| csv.read }

table[:ints] # => [1, 2, 3]
table[:floats] # => [1.0, 2.0, 3.0]

__END__
ints,floats
1,1.000
2,2
3,3.0
data/examples/csv_filter.rb
ADDED
@@ -0,0 +1,23 @@
#!/usr/local/bin/ruby -w

# = csv_filter.rb -- Faster CSV Reading and Writing
#
# Created by James Edward Gray II on 2006-04-01.
# Copyright 2006 Gray Productions. All rights reserved.

require "faster_csv"

running_total = 0
FasterCSV.filter( :headers => true,
                  :return_headers => true,
                  :header_converters => :symbol,
                  :converters => :numeric ) do |row|
  if row.header_row?
    row << "Running Total"
  else
    row << (running_total += row[:quantity] * row[:price])
  end
end
# >> Quantity,Product Description,Price,Running Total
# >> 1,Text Editor,25.0,25.0
# >> 2,MacBook Pros,2499.0,5023.0
data/examples/csv_rails_import.task
ADDED
@@ -0,0 +1,21 @@
#!/usr/local/bin/ruby -w

# csv_rails_import.task
#
# Created by James Edward Gray II on 2006-11-05.
# Copyright 2006 Gray Productions. All rights reserved.

namespace :my_app_name do
  desc "Injects purchase.csv into the database."
  task :load_purchase => [:environment] do
    require "#{RAILS_ROOT}/vendor/faster_csv/lib/faster_csv"

    purchase = Purchase.create!

    FCSV.foreach( "#{RAILS_ROOT}/db/questions.csv",
                  :headers => true,
                  :header_converters => :symbol ) do |line|
      purchase.line_items.create!(line.to_hash)
    end
  end
end
data/examples/csv_reading.rb
ADDED
@@ -0,0 +1,57 @@
#!/usr/local/bin/ruby -w

# csv_reading.rb
#
# Created by James Edward Gray II on 2006-11-05.
# Copyright 2006 Gray Productions. All rights reserved.

require "faster_csv"

CSV_FILE_PATH = File.join(File.dirname(__FILE__), "purchase.csv")
CSV_STR = <<END_CSV
first,last
James,Gray
Dana,Gray
END_CSV

# read a file line by line
FasterCSV.foreach(CSV_FILE_PATH) do |line|
  puts line[1]
end
# >> Product Description
# >> Text Editor
# >> MacBook Pros

# slurp file data
data = FasterCSV.read(CSV_FILE_PATH)
puts data.flatten.grep(/\A\d+\.\d+\Z/)
# >> 25.00
# >> 2499.00

# read a string line by line
FasterCSV.parse(CSV_STR) do |line|
  puts line[0]
end
# >> first
# >> James
# >> Dana

# slurp string data
data = FasterCSV.parse(CSV_STR)
puts data[1..-1].map { |line| "#{line[0][0, 1].downcase}.#{line[1].downcase}" }
# >> j.gray
# >> d.gray

# adding options to make data manipulation easy
total = 0
FasterCSV.foreach( CSV_FILE_PATH, :headers => true,
                                  :header_converters => :symbol,
                                  :converters => :numeric ) do |line|
  line_total = line[:quantity] * line[:price]
  total += line_total
  puts "%s: %.2f" % [line[:product_description], line_total]
end
puts "Total: %.2f" % total
# >> Text Editor: 25.00
# >> MacBook Pros: 4998.00
# >> Total: 5023.00
data/examples/csv_table.rb
ADDED
@@ -0,0 +1,56 @@
#!/usr/local/bin/ruby -w

# csv_table.rb
#
# Created by James Edward Gray II on 2006-11-04.
# Copyright 2006 Gray Productions. All rights reserved.
#
# Feature implementation and example code by Ara.T.Howard.

require "faster_csv"

table = FCSV.parse(DATA, :headers => true, :header_converters => :symbol)

# row access
table[0].class # => FasterCSV::Row
table[0].fields # => ["zaphod", "beeblebrox", "42"]

# column access
table[:first_name] # => ["zaphod", "ara"]

# cell access
table[1][0] # => "ara"
table[1][:first_name] # => "ara"
table[:first_name][1] # => "ara"

# manipulation
table << %w[james gray 30]
table[-1].fields # => ["james", "gray", "30"]

table[:type] = "name"
table[:type] # => ["name", "name", "name"]

table[:ssn] = %w[123-456-7890 098-765-4321]
table[:ssn] # => ["123-456-7890", "098-765-4321", nil]

# iteration
table.each do |row|
  # ...
end

table.by_col!
table.each do |col_name, col_values|
  # ...
end

# output
puts table
# >> first_name,last_name,age,type,ssn
# >> zaphod,beeblebrox,42,name,123-456-7890
# >> ara,howard,34,name,098-765-4321
# >> james,gray,30,name,

__END__
first_name,last_name,age
zaphod,beeblebrox,42
ara,howard,34
data/examples/csv_writing.rb
ADDED
@@ -0,0 +1,67 @@
#!/usr/local/bin/ruby -w

# csv_writing.rb
#
# Created by James Edward Gray II on 2006-11-05.
# Copyright 2006 Gray Productions. All rights reserved.

require "faster_csv"

CSV_FILE_PATH = File.join(File.dirname(__FILE__), "output.csv")

# writing to a file
FasterCSV.open(CSV_FILE_PATH, "w") do |csv|
  csv << %w[first last]
  csv << %w[James Gray]
  csv << %w[Dana Gray]
end
puts File.read(CSV_FILE_PATH)
# >> first,last
# >> James,Gray
# >> Dana,Gray

# appending to an existing file
FasterCSV.open(CSV_FILE_PATH, "a") do |csv|
  csv << %w[Gypsy]
  csv << %w[Storm]
end
puts File.read(CSV_FILE_PATH)
# >> first,last
# >> James,Gray
# >> Dana,Gray
# >> Gypsy
# >> Storm

# writing to a string
csv_str = FasterCSV.generate do |csv|
  csv << %w[first last]
  csv << %w[James Gray]
  csv << %w[Dana Gray]
end
puts csv_str
# >> first,last
# >> James,Gray
# >> Dana,Gray

# appending to an existing string
FasterCSV.generate(csv_str) do |csv|
  csv << %w[Gypsy]
  csv << %w[Storm]
end
puts csv_str
# >> first,last
# >> James,Gray
# >> Dana,Gray
# >> Gypsy
# >> Storm

# changing the output format
csv_str = FasterCSV.generate(:col_sep => "\t") do |csv|
  csv << %w[first last]
  csv << %w[James Gray]
  csv << %w[Dana Gray]
end
puts csv_str
# >> first	last
# >> James	Gray
# >> Dana	Gray