smarter_csv 1.0.19 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 9022f349dd8ee2590c73b198fb83114a8c96932d
4
- data.tar.gz: 9c1a769c72e08e2e78d15ad444bf3f9642cd33e5
3
+ metadata.gz: ec1442dad9c0f71dc3264f6df10f5fe1116f9d23
4
+ data.tar.gz: 2025c6c7c81fc6c94fed0ed7d391eb122a464bd1
5
5
  SHA512:
6
- metadata.gz: e3ccf944663244bc4b336d9980c26f1fda874d48586a131f3c761b6885a2753ac443c80a559046e2c6670f90ba192155e10aceb0e84798add22c9a20d78653a1
7
- data.tar.gz: 69b3abf03488df9b79b796dd7efbc3612bd273fb1e4f6f156b238213ef377e22ff1852ae17fb7384722dfbba456d9ab36313e5d2c43d5a599a696d008194cd29
6
+ metadata.gz: 1dd00a098dba973b2f6e0303317bd11cdc527d05ca959dd21ae78f241a640e799b623f0b719da5d146c89856d0e50a557ffa8180725d29c6582e0e10231b091a
7
+ data.tar.gz: 7ab34fac0386ccef0ad13f1116b5ed0669d4ffe6270f8a4b9e39da8421132c4fac4f1c1764b78ca8ce5e702df29c9819ee1359bf0b18c98b735ea53a3935e4d8
@@ -6,6 +6,7 @@ rvm:
6
6
  - 1.9.3
7
7
  - 2.0.0
8
8
  - 2.1.3
9
+ - 2.2.2
9
10
  - jruby
10
11
  - ruby-head
11
12
  - jruby-head
data/README.md CHANGED
@@ -1,4 +1,6 @@
1
- # SmarterCSV [![Build Status](https://secure.travis-ci.org/tilo/smarter_csv.png?branch=master)](http://travis-ci.org/tilo/smarter_csv)
1
+ # SmarterCSV
2
+
3
+ [![Build Status](https://secure.travis-ci.org/tilo/smarter_csv.png?branch=master)](http://travis-ci.org/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
2
4
 
3
5
  `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with Mongoid or ActiveRecord,
4
6
  and parallel processing with Resque or Sidekiq.
@@ -32,7 +34,8 @@ The two main choices you have in terms of how to call `SmarterCSV.process` are:
32
34
  * calling `process` with or without a block
33
35
  * passing a `:chunk_size` to the `process` method, and processing the CSV-file in chunks, rather than in one piece.
34
36
 
35
- Tip: If you are uncertain about what line endings a CSV-file uses, try specifying `:row_sep => :auto` as part of the options. Checkout Example 5 for unusual `:row_sep` and `:col_sep`.
37
+ Tip: If you are uncertain about what line endings a CSV-file uses, try specifying `:row_sep => :auto` as part of the options.
38
+ But this could be slow, because it will try to analyze each CSV file first. If you want to speed things up, set the `:row_sep` manually! Checkout Example 5 for unusual `:row_sep` and `:col_sep`.
36
39
 
37
40
  #### Example 1a: How SmarterCSV processes CSV-files as array of hashes:
38
41
  Please note how each hash contains only the keys for columns with non-null values.
@@ -125,6 +128,40 @@ and how the `process` method returns the number of chunks when called with a blo
125
128
  end
126
129
  => returns number of chunks
127
130
 
131
+ #### Example 6: Using Value Converters
132
+
133
+ $ cat spec/fixtures/with_dates.csv
134
+ first,last,date,price
135
+ Ben,Miller,10/30/1998,$44.50
136
+ Tom,Turner,2/1/2011,$15.99
137
+ Ken,Smith,01/09/2013,$199.99
138
+ $ irb
139
+ > require 'smarter_csv'
140
+ > require 'date'
141
+
142
+ # define a custom converter class, which implements self.convert(value)
143
+ class DateConverter
144
+ def self.convert(value)
145
+ Date.strptime( value, '%m/%d/%Y') # parses custom date format into Date instance
146
+ end
147
+ end
148
+
149
+ class DollarConverter
150
+ def self.convert(value)
151
+ value.sub('$','').to_f
152
+ end
153
+ end
154
+
155
+ options = {:value_converters => {:date => DateConverter, :price => DollarConverter}}
156
+ data = SmarterCSV.process("spec/fixtures/with_dates.csv", options)
157
+ data[0][:date]
158
+ => #<Date: 1998-10-30 ((2451117j,0s,0n),+0s,2299161j)>
159
+ data[0][:date].class
160
+ => Date
161
+ data[0][:price]
162
+ => 44.50
163
+ data[0][:price].class
164
+ => Float
128
165
 
129
166
  ## Documentation
130
167
 
@@ -141,7 +178,7 @@ The options and the block are optional.
141
178
  ---------------------------------------------------------------------------------------------------------------------------------
142
179
  | :col_sep | ',' | column separator |
143
180
  | :row_sep | $/ ,"\n" | row separator or record separator , defaults to system's $/ , which defaults to "\n" |
144
- | | | this can also be set to :auto , but will process the whole cvs file first |
181
+ | | | This can also be set to :auto, but will process the whole cvs file first (slow!) |
145
182
  | :quote_char | '"' | quotation character |
146
183
  | :comment_regexp | /^#/ | regular expression which matches comment lines (see NOTE about the CSV header) |
147
184
  | :chunk_size | nil | if set, determines the desired chunk-size (defaults to nil, no chunk processing) |
@@ -162,6 +199,7 @@ The options and the block are optional.
162
199
  | | | Important if the file does not contain headers, |
163
200
  | | | otherwise you would lose the first line of data. |
164
201
  ---------------------------------------------------------------------------------------------------------------------------------
202
+ | :value_converters | nil | supply a hash of :header => KlassName; the class needs to implement self.convert(val)|
165
203
  | :remove_empty_values | true | remove values which have nil or empty strings as values |
166
204
  | :remove_zero_values | true | remove values which have a numeric value equal to zero / 0 |
167
205
  | :remove_values_matching | nil | removes key/value pairs if value matches given regular expressions. e.g.: |
@@ -235,6 +273,12 @@ Or install it yourself as:
235
273
 
236
274
  ## Changes
237
275
 
276
+ #### 1.1.0 (2015-07-26)
277
+ * added feature :value_converters, which allows parsing of dates, money, and other things (thanks to Raphaël Bleuse, Lucas Camargo de Almeida, Alejandro)
278
+ * added error if :headers_in_file is set to false, and no :user_provided_headers are given (thanks to innhyu)
279
+ * added support to convert dashes to underscore characters in headers (thanks to César Camacho)
280
+ * fixing automatic detection of \r\n line-endings (thanks to feens)
281
+
238
282
  #### 1.0.19 (2014-10-29)
239
283
  * added option :keep_original_headers to keep CSV-headers as-is (thanks to Benjamin Thouret)
240
284
 
@@ -339,6 +383,12 @@ Please [open an Issue on GitHub](https://github.com/tilo/smarter_csv/issues) if
339
383
  Many thanks to people who have filed issues and sent comments.
340
384
  And a special thanks to those who contributed pull requests:
341
385
 
386
+ * [Alejandro](https://github.com/agaviria)
387
+ * [Lucas Camargo de Almeida](https://github.com/lcalmeida)
388
+ * [Raphaël Bleuse](https://github.com/bleuse)
389
+ * [feens](https://github.com/feens)
390
+ * [César Camacho](https://github.com/chanko)
391
+ * [innhyu](https://github.com/innhyu)
342
392
  * [Benjamin Thouret](https://github.com/benichu)
343
393
  * [Chris Hilton](https://github.com/chrismhilton)
344
394
  * [Sean Duckett](http://github.com/sduckett)
@@ -9,7 +9,7 @@ module SmarterCSV
9
9
  :remove_empty_values => true, :remove_zero_values => false , :remove_values_matching => nil , :remove_empty_hashes => true , :strip_whitespace => true,
10
10
  :convert_values_to_numeric => true, :strip_chars_from_headers => nil , :user_provided_headers => nil , :headers_in_file => true,
11
11
  :comment_regexp => /^#/, :chunk_size => nil , :key_mapping_hash => nil , :downcase_header => true, :strings_as_keys => false, :file_encoding => 'utf-8',
12
- :remove_unmapped_keys => false, :keep_original_headers => false,
12
+ :remove_unmapped_keys => false, :keep_original_headers => false, :value_converters => nil,
13
13
  }
14
14
  options = default_options.merge(options)
15
15
  csv_options = options.select{|k,v| [:col_sep, :row_sep, :quote_char].include?(k)} # options.slice(:col_sep, :row_sep, :quote_char)
@@ -40,13 +40,15 @@ module SmarterCSV
40
40
  file_headerA.map!{|x| x.gsub(%r/options[:quote_char]/,'') }
41
41
  file_headerA.map!{|x| x.strip} if options[:strip_whitespace]
42
42
  unless options[:keep_original_headers]
43
- file_headerA.map!{|x| x.gsub(/\s+/,'_')}
43
+ file_headerA.map!{|x| x.gsub(/\s+|-+/,'_')}
44
44
  file_headerA.map!{|x| x.downcase } if options[:downcase_header]
45
45
  end
46
46
 
47
47
  # puts "HeaderA: #{file_headerA.join(' , ')}" if options[:verbose]
48
48
 
49
49
  file_header_size = file_headerA.size
50
+ else
51
+ raise SmarterCSV::IncorrectOption , "ERROR [smarter_csv]: If :headers_in_file is set to false, you have to provide :user_provided_headers" if ! options.keys.include?(:user_provided_headers)
50
52
  end
51
53
  if options[:user_provided_headers] && options[:user_provided_headers].class == Array && ! options[:user_provided_headers].empty?
52
54
  # use user-provided headers
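
The header change in this hunk can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper `normalize_header` that is not part of the gem. It only mirrors the strip, `gsub(/\s+|-+/,'_')`, and downcase steps shown above:

```ruby
# Hypothetical helper for illustration only; SmarterCSV does not expose this method.
# It mirrors the header steps shown in the hunk above: strip, replace runs of
# whitespace or runs of dashes with one underscore each, then optionally downcase.
def normalize_header(header, downcase = true)
  normalized = header.strip.gsub(/\s+|-+/, '_')
  downcase ? normalized.downcase : normalized
end

puts normalize_header('First-Name')  # dash becomes an underscore
puts normalize_header('Last  Name')  # a run of whitespace becomes one underscore
puts normalize_header('a--b')        # a run of dashes becomes one underscore
```

Because the pattern is an alternation, a mixed run such as `" - "` is matched as three separate pieces and yields three underscores rather than one.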
@@ -135,6 +137,15 @@ module SmarterCSV
135
137
  end
136
138
  end
137
139
  end
140
+
141
+ if options[:value_converters]
142
+ hash.each do |k,v|
143
+ converter = options[:value_converters][k]
144
+ next unless converter
145
+ hash[k] = converter.convert(v)
146
+ end
147
+ end
148
+
138
149
  next if hash.empty? if options[:remove_empty_hashes]
139
150
 
140
151
  if use_chunks
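
The converter loop added in this hunk can be sketched outside the gem. The example below is an illustration under stated assumptions: `apply_converters` and `SketchDateConverter` are hypothetical names, and the only contract taken from the diff is that a converter responds to `convert(value)` and is looked up by the row's key:

```ruby
require 'date'

# Hypothetical converter for illustration; any object responding to
# convert(value) would work the same way.
class SketchDateConverter
  def self.convert(value)
    Date.strptime(value, '%m/%d/%Y')
  end
end

# Hypothetical stand-in for the loop in the hunk above: for each key in the
# row hash, look up a converter and replace the value with its result.
# Keys without a converter are left untouched.
def apply_converters(row, converters)
  row.each do |key, value|
    converter = converters[key]
    next unless converter
    row[key] = converter.convert(value)
  end
  row
end

row = { :first => 'Ben', :date => '10/30/1998' }
apply_converters(row, { :date => SketchDateConverter })
puts row[:date].class
puts row[:first]
```

Reassigning the value of an existing key while iterating is safe here because no keys are added or removed during the loop.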
@@ -212,11 +223,23 @@ module SmarterCSV
212
223
 
213
224
  # count how many of the pre-defined line-endings we find
214
225
  # ignoring those contained within quote characters
226
+ last_char = nil
215
227
  filehandle.each_char do |c|
216
228
  quoted_char = !quoted_char if c == options[:quote_char]
217
- next if quoted_char || c !~ /\r|\n|\r\n/
218
- counts[c] += 1
229
+ next if quoted_char
230
+
231
+ if last_char == "\r"
232
+ if c == "\n"
233
+ counts["\r\n"] += 1
234
+ else
235
+ counts["\r"] += 1 # \r are counted after they appeared, we might
236
+ end
237
+ elsif c == "\n"
238
+ counts["\n"] += 1
239
+ end
240
+ last_char = c
219
241
  end
242
+ counts["\r"] += 1 if last_char == "\r"
220
243
  # find the key/value pair with the largest counter:
221
244
  k,v = counts.max_by{|k,v| v}
222
245
  return k # the most frequent one is it
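
The reworked counting logic in this hunk can be exercised on plain strings. This is a minimal sketch, assuming a hypothetical method `guess_row_sep` that takes a string instead of a file handle. It follows the same idea as the diff: remember the previous character so that `\r\n` is counted as one line ending rather than as a separate `\r` and `\n`:

```ruby
# Hypothetical standalone version of the counting loop shown above.
# It operates on a String; the gem's own method reads from a file handle.
def guess_row_sep(text, quote_char = '"')
  counts = { "\n" => 0, "\r" => 0, "\r\n" => 0 }
  quoted = false
  last_char = nil

  text.each_char do |c|
    quoted = !quoted if c == quote_char
    next if quoted # ignore line endings inside quoted fields

    if last_char == "\r"
      if c == "\n"
        counts["\r\n"] += 1
      else
        counts["\r"] += 1 # a lone \r is only counted once the next char is seen
      end
    elsif c == "\n"
      counts["\n"] += 1
    end
    last_char = c
  end
  counts["\r"] += 1 if last_char == "\r" # a trailing \r has no next char

  # the most frequent separator wins
  counts.max_by { |_sep, count| count }.first
end

puts guess_row_sep("a,b\r\nc,d\r\n").inspect
puts guess_row_sep("a,b\nc,d\n").inspect
puts guess_row_sep("a,b\rc,d\r").inspect
```

The old single-character test `c !~ /\r|\n|\r\n/` could never match `\r\n`, since `c` holds one character at a time. Tracking `last_char` is what makes the two-character case detectable.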
@@ -1,3 +1,3 @@
1
1
  module SmarterCSV
2
- VERSION = "1.0.19"
2
+ VERSION = "1.1.0"
3
3
  end
@@ -0,0 +1,3 @@
1
+ item,price
2
+ Book,$9.99
3
+ Mug,$14.99
@@ -0,0 +1,8 @@
1
+ First-Name,Last-Name,Dogs,Cats,Birds,Fish
2
+ Dan,McAllister,2,0,,
3
+ Lucy,Laweless,,5,0,
4
+ ,,,,,
5
+ Miles,O'Brian,0,0,0,21
6
+ Nancy,Homes,2,0,1,
7
+ Hernán,Curaçon,3,0,0,
8
+ ,,,,,
@@ -0,0 +1,4 @@
1
+ first,last,date,price
2
+ Ben,Miller,10/30/1998,$44.50
3
+ Tom,Turner,2/1/2011,$15
4
+ Ken,Smith,01/09/2013,$0.11
@@ -0,0 +1,21 @@
1
+ require 'spec_helper'
2
+
3
+ fixture_path = 'spec/fixtures'
4
+
5
+ describe 'be_able_to' do
6
+ it 'loads_file_with_dashes_in_header_fields as strings' do
7
+ options = {:strings_as_keys => true}
8
+ data = SmarterCSV.process("#{fixture_path}/with_dashes.csv", options)
9
+ data.flatten.size.should == 5
10
+ data[0]['first_name'].should eq 'Dan'
11
+ data[0]['last_name'].should eq 'McAllister'
12
+ end
13
+
14
+ it 'loads_file_with_dashes_in_header_fields as symbols' do
15
+ options = {:strings_as_keys => false}
16
+ data = SmarterCSV.process("#{fixture_path}/with_dashes.csv", options)
17
+ data.flatten.size.should == 5
18
+ data[0][:first_name].should eq 'Dan'
19
+ data[0][:last_name].should eq 'McAllister'
20
+ end
21
+ end
@@ -0,0 +1,52 @@
1
+ require 'spec_helper'
2
+
3
+ fixture_path = 'spec/fixtures'
4
+
5
+ require 'date'
6
+ class DateConverter
7
+ def self.convert(value)
8
+ Date.strptime( value, '%m/%d/%Y')
9
+ end
10
+ end
11
+
12
+ class CurrencyConverter
13
+ def self.convert(value)
14
+ value.sub(/[$]/,'').to_f # would be nice to add a computed column :currency => '€'
15
+ end
16
+ end
17
+
18
+ describe 'be_able_to' do
19
+ it 'convert date values into Date instances' do
20
+ options = {:value_converters => {:date => DateConverter}}
21
+ data = SmarterCSV.process("#{fixture_path}/with_dates.csv", options)
22
+ data.flatten.size.should == 3
23
+ data[0][:date].class.should eq Date
24
+ data[0][:date].to_s.should eq "1998-10-30"
25
+ data[1][:date].to_s.should eq "2011-02-01"
26
+ data[2][:date].to_s.should eq "2013-01-09"
27
+ end
28
+
29
+ it 'converts dollar prices into float values' do
30
+ options = {:value_converters => {:price => CurrencyConverter}}
31
+ data = SmarterCSV.process("#{fixture_path}/money.csv", options)
32
+ data.flatten.size.should == 2
33
+ data[0][:price].class.should eq Float
34
+ data[0][:price].should eq 9.99
35
+ data[1][:price].should eq 14.99
36
+ end
37
+
38
+ it 'convert can use multiple value converters' do
39
+ options = {:value_converters => {:date => DateConverter, :price => CurrencyConverter}}
40
+ data = SmarterCSV.process("#{fixture_path}/with_dates.csv", options)
41
+ data.flatten.size.should == 3
42
+ data[0][:date].class.should eq Date
43
+ data[0][:date].to_s.should eq "1998-10-30"
44
+ data[1][:date].to_s.should eq "2011-02-01"
45
+ data[2][:date].to_s.should eq "2013-01-09"
46
+
47
+ data[0][:price].class.should eq Float
48
+ data[0][:price].should eq 44.50
49
+ data[1][:price].should eq 15.0
50
+ data[2][:price].should eq 0.11
51
+ end
52
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: smarter_csv
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.19
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - |
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2014-10-29 00:00:00.000000000 Z
12
+ date: 2015-07-27 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: rspec
@@ -59,17 +59,21 @@ files:
59
59
  - spec/fixtures/line_endings_r.csv
60
60
  - spec/fixtures/line_endings_rn.csv
61
61
  - spec/fixtures/lots_of_columns.csv
62
+ - spec/fixtures/money.csv
62
63
  - spec/fixtures/no_header.csv
63
64
  - spec/fixtures/numeric.csv
64
65
  - spec/fixtures/pets.csv
65
66
  - spec/fixtures/quoted.csv
66
67
  - spec/fixtures/separator.csv
68
+ - spec/fixtures/with_dashes.csv
69
+ - spec/fixtures/with_dates.csv
67
70
  - spec/smarter_csv/binary_file2_spec.rb
68
71
  - spec/smarter_csv/binary_file_spec.rb
69
72
  - spec/smarter_csv/carriage_return_spec.rb
70
73
  - spec/smarter_csv/chunked_reading_spec.rb
71
74
  - spec/smarter_csv/column_separator_spec.rb
72
75
  - spec/smarter_csv/convert_values_to_numeric_spec.rb
76
+ - spec/smarter_csv/header_transformation_spec.rb
73
77
  - spec/smarter_csv/keep_headers_spec.rb
74
78
  - spec/smarter_csv/key_mapping_spec.rb
75
79
  - spec/smarter_csv/line_ending_spec.rb
@@ -84,6 +88,7 @@ files:
84
88
  - spec/smarter_csv/remove_zero_values_spec.rb
85
89
  - spec/smarter_csv/strings_as_keys_spec.rb
86
90
  - spec/smarter_csv/strip_chars_from_headers_spec.rb
91
+ - spec/smarter_csv/value_converters_spec.rb
87
92
  - spec/spec.opts
88
93
  - spec/spec/spec_helper.rb
89
94
  - spec/spec_helper.rb
@@ -109,7 +114,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
109
114
  requirements:
110
115
  - csv
111
116
  rubyforge_project:
112
- rubygems_version: 2.2.2
117
+ rubygems_version: 2.4.5
113
118
  signing_key:
114
119
  specification_version: 4
115
120
  summary: Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots
@@ -127,17 +132,21 @@ test_files:
127
132
  - spec/fixtures/line_endings_r.csv
128
133
  - spec/fixtures/line_endings_rn.csv
129
134
  - spec/fixtures/lots_of_columns.csv
135
+ - spec/fixtures/money.csv
130
136
  - spec/fixtures/no_header.csv
131
137
  - spec/fixtures/numeric.csv
132
138
  - spec/fixtures/pets.csv
133
139
  - spec/fixtures/quoted.csv
134
140
  - spec/fixtures/separator.csv
141
+ - spec/fixtures/with_dashes.csv
142
+ - spec/fixtures/with_dates.csv
135
143
  - spec/smarter_csv/binary_file2_spec.rb
136
144
  - spec/smarter_csv/binary_file_spec.rb
137
145
  - spec/smarter_csv/carriage_return_spec.rb
138
146
  - spec/smarter_csv/chunked_reading_spec.rb
139
147
  - spec/smarter_csv/column_separator_spec.rb
140
148
  - spec/smarter_csv/convert_values_to_numeric_spec.rb
149
+ - spec/smarter_csv/header_transformation_spec.rb
141
150
  - spec/smarter_csv/keep_headers_spec.rb
142
151
  - spec/smarter_csv/key_mapping_spec.rb
143
152
  - spec/smarter_csv/line_ending_spec.rb
@@ -152,6 +161,7 @@ test_files:
152
161
  - spec/smarter_csv/remove_zero_values_spec.rb
153
162
  - spec/smarter_csv/strings_as_keys_spec.rb
154
163
  - spec/smarter_csv/strip_chars_from_headers_spec.rb
164
+ - spec/smarter_csv/value_converters_spec.rb
155
165
  - spec/spec.opts
156
166
  - spec/spec/spec_helper.rb
157
167
  - spec/spec_helper.rb