smarter_csv 1.4.2 → 1.5.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +22 -1
- data/CONTRIBUTORS.md +2 -0
- data/README.md +12 -2
- data/lib/smarter_csv/smarter_csv.rb +137 -100
- data/lib/smarter_csv/version.rb +1 -1
- data/spec/fixtures/additional_separator.csv +6 -0
- data/spec/fixtures/duplicate_headers.csv +1 -1
- data/spec/fixtures/hard_sample.csv +2 -0
- data/spec/smarter_csv/additional_separator_spec.rb +45 -0
- data/spec/smarter_csv/binary_file2_spec.rb +1 -1
- data/spec/smarter_csv/duplicate_headers_spec.rb +76 -0
- data/spec/smarter_csv/hard_sample_spec.rb +24 -0
- data/spec/smarter_csv/ignore_comments_spec.rb +45 -30
- data/spec/smarter_csv/invalid_headers_spec.rb +8 -22
- data/spec/smarter_csv/no_header_spec.rb +16 -11
- metadata +12 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 88b9932c898320fb05d5697e155dc0bd3ade887d2fcfab7b660933e230007364
|
4
|
+
data.tar.gz: f0525d9c917aff44f910d4547b8e918faa3beb50d47adc29182df1fc1ec2be19
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 330ad44b9808150f6fdf96dec65d259c2d9cf5eb25e22dc80f63095f4014b065b8aa97a2ba9b814c6cea6f4c0361e04567be403ab78b54d0518b49dc072f36ac
|
7
|
+
data.tar.gz: 27531bd508b5b455a32947badfb85d7e95489ad282a837ef046864806ba7fa12539148ab2fe4c84174fba3ef085dd3adda5d7c070615d684cd99ed0f90b903a3
|
data/CHANGELOG.md
CHANGED
@@ -1,7 +1,28 @@
|
|
1
1
|
|
2
2
|
# SmarterCSV 1.x Change Log
|
3
3
|
|
4
|
-
## 1.
|
4
|
+
## 1.5.2 (2022-04-29)
|
5
|
+
* added missing keys to the SmarterCSV::KeyMappingError exception message #189 (thanks to John Dell)
|
6
|
+
|
7
|
+
## 1.5.1 (2022-04-27)
|
8
|
+
* added raising of `KeyMappingError` if `key_mapping` refers to a non-existent key
|
9
|
+
* added option `duplicate_header_suffix` (thanks to Skye Shaw)
|
10
|
+
When given a non-nil string, it uses the suffix to append numbering 2..n to duplicate headers.
|
11
|
+
If your code will need to process arbitrary CSV files, please set `duplicate_header_suffix`.
|
12
|
+
|
13
|
+
## 1.5.0 (2022-04-25)
|
14
|
+
* fixed bug with trailing col_sep characters, introduced in 1.4.0
|
15
|
+
* Fix deprecation warning in Ruby 3.0.3 / $INPUT_RECORD_SEPARATOR (thanks to Joel Fouse)
|
16
|
+
|
17
|
+
* changed default for `comment_regexp` to be `nil` for a safer default behavior (thanks to David Lazar)
|
18
|
+
**Note**
|
19
|
+
This no longer assumes that lines starting with `#` are comments.
|
20
|
+
If you want to treat lines starting with '#' as comments, use `comment_regexp: /\A#/`
|
21
|
+
|
22
|
+
## 1.4.2 (2022-02-12)
|
23
|
+
* fixed issue with simplecov
|
24
|
+
|
25
|
+
## 1.4.1 (2022-02-12) (PULLED)
|
5
26
|
* minor fix: also support `col_sep: :auto`
|
6
27
|
* added simplecov
|
7
28
|
|
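The 1.5.0 entry about `$INPUT_RECORD_SEPARATOR` corresponds to the library passing the row separator to `readline` explicitly instead of assigning the global. A minimal standalone sketch of that idea, using only the standard library; `read_records` is a hypothetical helper name, not part of smarter_csv:

```ruby
require 'stringio'

# Hypothetical helper: read records with an explicit separator rather than
# assigning to the global $/ ($INPUT_RECORD_SEPARATOR).
def read_records(io, row_sep)
  records = []
  # IO#readline accepts the separator as an argument; chomp removes it again.
  records << io.readline(row_sep).chomp(row_sep) until io.eof?
  records
end

# Two records separated by Ctrl-B ("\x02"), similar to the binary fixture in the specs.
io = StringIO.new("a,b\x02" + "1,2\x02")
p read_records(io, "\x02")
```

Because the separator travels with each call, nothing global is modified and other code that reads lines keeps its default behavior.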
data/CONTRIBUTORS.md
CHANGED
@@ -43,3 +43,5 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
|
|
43
43
|
* [Olle Jonsson](https://github.com/olleolleolle)
|
44
44
|
* [Nicolas Guillemain](https://github.com/Viiruus)
|
45
45
|
* [Sp6](https://github.com/sp6)
|
46
|
+
* [Joel Fouse](https://github.com/jfouse)
|
47
|
+
* [John Dell](https://github.com/spovich)
|
data/README.md
CHANGED
@@ -215,7 +215,7 @@ The options and the block are optional.
|
|
215
215
|
| :invalid_byte_sequence | '' | what to replace invalid byte sequences with |
|
216
216
|
| :force_utf8 | false | force UTF-8 encoding of all lines (including headers) in the CSV file |
|
217
217
|
| :skip_lines | nil | how many lines to skip before the first line or header line is processed |
|
218
|
-
| :comment_regexp |
|
218
|
+
| :comment_regexp | nil | regular expression to ignore comment lines (see NOTE on CSV header), e.g. /\A#/ |
|
219
219
|
---------------------------------------------------------------------------------------------------------------------------------
|
220
220
|
| :col_sep | ',' | column separator, can be set to :auto |
|
221
221
|
| :force_simple_split | false | force simple splitting on :col_sep character for non-standard CSV-files. |
|
@@ -228,6 +228,7 @@ The options and the block are optional.
|
|
228
228
|
| :headers_in_file | true | Whether or not the file contains headers as the first line. |
|
229
229
|
| | | Important if the file does not contain headers, |
|
230
230
|
| | | otherwise you would lose the first line of data. |
|
231
|
+
| :duplicate_header_suffix | nil | If set, adds numbers to duplicated headers and separates them by the given suffix |
|
231
232
|
| :user_provided_headers | nil | *careful with that axe!* |
|
232
233
|
| | | user provided Array of header strings or symbols, to define |
|
233
234
|
| | | what headers should be used, overriding any in-file headers. |
|
@@ -282,14 +283,23 @@ And header and data validations will also be supported in 2.x
|
|
282
283
|
data = SmarterCSV.process(f)
|
283
284
|
end
|
284
285
|
```
|
286
|
+
|
285
287
|
#### NOTES about CSV Headers:
|
286
288
|
* as this method parses CSV files, it is assumed that the first line of any file will contain a valid header
|
287
|
-
* the first line with the
|
289
|
+
* the first line with the header might be commented out, in which case you will need to set `comment_regexp: /\A#/`
|
290
|
+
This is no longer handled automatically since 1.5.0.
|
288
291
|
* any occurrences of :comment_regexp or :row_sep will be stripped from the first line with the CSV header
|
289
292
|
* any of the keys in the header line will be downcased, spaces replaced by underscore, and converted to Ruby symbols before being used as keys in the returned Hashes
|
290
293
|
* you can not combine the :user_provided_headers and :key_mapping options
|
291
294
|
* if the incorrect number of headers are provided via :user_provided_headers, exception SmarterCSV::HeaderSizeMismatch is raised
|
292
295
|
|
296
|
+
#### NOTES on Duplicate Headers:
|
297
|
+
As a corner case, it is possible that a CSV file contains multiple headers with the same name.
|
298
|
+
* If that happens, by default `smarter_csv` will raise a `DuplicateHeaders` error.
|
299
|
+
* If you set `duplicate_header_suffix` to a non-nil string, it will use it to append numbers 2..n to the duplicate headers. To disambiguate the headers further, you can use `key_mapping` to assign meaningful names.
|
300
|
+
* If your code will need to process arbitrary CSV files, please set `duplicate_header_suffix`.
|
301
|
+
* Another way to deal with duplicate headers is to use `user_provided_headers` to ignore any headers in the file.
|
302
|
+
|
293
303
|
#### NOTES on Key Mapping:
|
294
304
|
* keys in the header line of the file can be re-mapped to a chosen set of symbols, so the resulting Hashes can be better used internally in your application (e.g. when directly creating MongoDB entries with them)
|
295
305
|
* if you want to completely delete a key, then map it to nil or to '', they will be automatically deleted from any result Hash
|
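The numbering rule from the notes on duplicate headers can be illustrated without the gem. This is a sketch under the assumption that the first occurrence of a header stays unchanged and later ones get the suffix followed by 2..n; `disambiguate_headers` is a hypothetical name used only for this illustration:

```ruby
# Hypothetical helper (not smarter_csv's public API): appends "<suffix><n>"
# to the 2nd..nth occurrence of a repeated header name.
def disambiguate_headers(headers, suffix)
  counts = Hash.new(0)
  headers.map do |key|
    counts[key] += 1
    counts[key] == 1 ? key : "#{key}#{suffix}#{counts[key]}"
  end
end

headers = %w[email firstname lastname email age]
p disambiguate_headers(headers, '_') # second "email" becomes "email_2"
p disambiguate_headers(headers, '')  # second "email" becomes "email2"
```

An empty string is still a non-nil suffix, so it enables the numbering; only `nil` leaves duplicate detection to raise `DuplicateHeaders`.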
@@ -5,108 +5,41 @@ module SmarterCSV
|
|
5
5
|
class DuplicateHeaders < SmarterCSVException; end
|
6
6
|
class MissingHeaders < SmarterCSVException; end
|
7
7
|
class NoColSepDetected < SmarterCSVException; end
|
8
|
+
class KeyMappingError < SmarterCSVException; end
|
8
9
|
|
9
|
-
|
10
|
+
# first parameter: filename or input object which responds to readline method
|
11
|
+
def SmarterCSV.process(input, options={}, &block)
|
10
12
|
options = default_options.merge(options)
|
11
13
|
options[:invalid_byte_sequence] = '' if options[:invalid_byte_sequence].nil?
|
12
14
|
|
13
15
|
headerA = []
|
14
16
|
result = []
|
15
|
-
|
16
|
-
|
17
|
-
csv_line_count = 0
|
17
|
+
@file_line_count = 0
|
18
|
+
@csv_line_count = 0
|
18
19
|
has_rails = !! defined?(Rails)
|
19
20
|
begin
|
20
|
-
|
21
|
+
fh = input.respond_to?(:readline) ? input : File.open(input, "r:#{options[:file_encoding]}")
|
21
22
|
|
22
23
|
# auto-detect the row separator
|
23
|
-
options[:row_sep] = SmarterCSV.guess_line_ending(
|
24
|
-
$INPUT_RECORD_SEPARATOR = options[:row_sep]
|
24
|
+
options[:row_sep] = SmarterCSV.guess_line_ending(fh, options) if options[:row_sep].to_sym == :auto
|
25
25
|
# attempt to auto-detect column separator
|
26
|
-
options[:col_sep] = guess_column_separator(
|
26
|
+
options[:col_sep] = guess_column_separator(fh, options) if options[:col_sep].to_sym == :auto
|
27
27
|
# preserve options, in case we need to call the CSV class
|
28
28
|
csv_options = options.select{|k,v| [:col_sep, :row_sep, :quote_char].include?(k)} # options.slice(:col_sep, :row_sep, :quote_char)
|
29
29
|
csv_options.delete(:row_sep) if [nil, :auto].include?( options[:row_sep].to_sym )
|
30
30
|
csv_options.delete(:col_sep) if [nil, :auto].include?( options[:col_sep].to_sym )
|
31
31
|
|
32
|
-
if (options[:force_utf8] || options[:file_encoding] =~ /utf-8/i) && (
|
32
|
+
if (options[:force_utf8] || options[:file_encoding] =~ /utf-8/i) && ( fh.respond_to?(:external_encoding) && fh.external_encoding != Encoding.find('UTF-8') || fh.respond_to?(:encoding) && fh.encoding != Encoding.find('UTF-8') )
|
33
33
|
puts 'WARNING: you are trying to process UTF-8 input, but did not open the input with "b:utf-8" option. See README file "NOTES about File Encodings".'
|
34
34
|
end
|
35
35
|
|
36
|
-
|
37
|
-
|
38
|
-
|
39
|
-
# process the header line in the CSV file..
|
40
|
-
# the first line of a CSV file contains the header .. it might be commented out, so we need to read it anyhow
|
41
|
-
header = f.readline
|
42
|
-
header = header.force_encoding('utf-8').encode('utf-8', invalid: :replace, undef: :replace, replace: options[:invalid_byte_sequence]) if options[:force_utf8] || options[:file_encoding] !~ /utf-8/i
|
43
|
-
header = header.sub(options[:comment_regexp],'').chomp(options[:row_sep])
|
44
|
-
|
45
|
-
file_line_count += 1
|
46
|
-
csv_line_count += 1
|
47
|
-
header = header.gsub(options[:strip_chars_from_headers], '') if options[:strip_chars_from_headers]
|
48
|
-
|
49
|
-
if (header =~ %r{#{options[:quote_char]}}) and (! options[:force_simple_split])
|
50
|
-
file_headerA = begin
|
51
|
-
CSV.parse( header, **csv_options ).flatten.collect!{|x| x.nil? ? '' : x} # to deal with nil values from CSV.parse
|
52
|
-
rescue CSV::MalformedCSVError => e
|
53
|
-
raise $!, "#{$!} [SmarterCSV: csv line #{csv_line_count}]", $!.backtrace
|
54
|
-
end
|
55
|
-
else
|
56
|
-
file_headerA = header.split(options[:col_sep])
|
57
|
-
end
|
58
|
-
file_header_size = file_headerA.size # before mapping, which could delete keys
|
59
|
-
|
60
|
-
file_headerA.map!{|x| x.gsub(%r/#{options[:quote_char]}/,'') }
|
61
|
-
file_headerA.map!{|x| x.strip} if options[:strip_whitespace]
|
62
|
-
unless options[:keep_original_headers]
|
63
|
-
file_headerA.map!{|x| x.gsub(/\s+|-+/,'_')}
|
64
|
-
file_headerA.map!{|x| x.downcase } if options[:downcase_header]
|
36
|
+
if options[:skip_lines].to_i > 0
|
37
|
+
options[:skip_lines].to_i.times do
|
38
|
+
readline_with_counts(fh, options)
|
65
39
|
end
|
66
|
-
else
|
67
|
-
raise SmarterCSV::IncorrectOption , "ERROR: If :headers_in_file is set to false, you have to provide :user_provided_headers" if options[:user_provided_headers].nil?
|
68
|
-
end
|
69
|
-
if options[:user_provided_headers] && options[:user_provided_headers].class == Array && ! options[:user_provided_headers].empty?
|
70
|
-
# use user-provided headers
|
71
|
-
headerA = options[:user_provided_headers]
|
72
|
-
if defined?(file_header_size) && ! file_header_size.nil?
|
73
|
-
if headerA.size != file_header_size
|
74
|
-
raise SmarterCSV::HeaderSizeMismatch , "ERROR: :user_provided_headers defines #{headerA.size} headers != CSV-file #{input} has #{file_header_size} headers"
|
75
|
-
else
|
76
|
-
# we could print out the mapping of file_headerA to headerA here
|
77
|
-
end
|
78
|
-
end
|
79
|
-
else
|
80
|
-
headerA = file_headerA
|
81
40
|
end
|
82
|
-
header_size = headerA.size # used for splitting lines
|
83
41
|
|
84
|
-
headerA
|
85
|
-
|
86
|
-
unless options[:user_provided_headers] # wouldn't make sense to re-map user provided headers
|
87
|
-
key_mappingH = options[:key_mapping]
|
88
|
-
|
89
|
-
# do some key mapping on the keys in the file header
|
90
|
-
# if you want to completely delete a key, then map it to nil or to ''
|
91
|
-
if ! key_mappingH.nil? && key_mappingH.class == Hash && key_mappingH.keys.size > 0
|
92
|
-
headerA.map!{|x| key_mappingH.has_key?(x) ? (key_mappingH[x].nil? ? nil : key_mappingH[x]) : (options[:remove_unmapped_keys] ? nil : x)}
|
93
|
-
end
|
94
|
-
end
|
95
|
-
|
96
|
-
# header_validations
|
97
|
-
duplicate_headers = []
|
98
|
-
headerA.compact.each do |k|
|
99
|
-
duplicate_headers << k if headerA.select{|x| x == k}.size > 1
|
100
|
-
end
|
101
|
-
raise SmarterCSV::DuplicateHeaders , "ERROR: duplicate headers: #{duplicate_headers.join(',')}" unless duplicate_headers.empty?
|
102
|
-
|
103
|
-
if options[:required_headers] && options[:required_headers].is_a?(Array)
|
104
|
-
missing_headers = []
|
105
|
-
options[:required_headers].each do |k|
|
106
|
-
missing_headers << k unless headerA.include?(k)
|
107
|
-
end
|
108
|
-
raise SmarterCSV::MissingHeaders , "ERROR: missing headers: #{missing_headers.join(',')}" unless missing_headers.empty?
|
109
|
-
end
|
42
|
+
headerA, header_size = process_headers(fh, options, csv_options)
|
110
43
|
|
111
44
|
# in case we use chunking.. we'll need to set it up..
|
112
45
|
if ! options[:chunk_size].nil? && options[:chunk_size].to_i > 0
|
@@ -119,42 +52,41 @@ module SmarterCSV
|
|
119
52
|
end
|
120
53
|
|
121
54
|
# now on to processing all the rest of the lines in the CSV file:
|
122
|
-
while !
|
123
|
-
line =
|
55
|
+
while ! fh.eof? # we can't use fh.readlines() here, because this would read the whole file into memory at once, and eof => true
|
56
|
+
line = readline_with_counts(fh, options)
|
124
57
|
|
125
58
|
# replace invalid byte sequence in UTF-8 with question mark to avoid errors
|
126
59
|
line = line.force_encoding('utf-8').encode('utf-8', invalid: :replace, undef: :replace, replace: options[:invalid_byte_sequence]) if options[:force_utf8] || options[:file_encoding] !~ /utf-8/i
|
127
60
|
|
128
|
-
file_line_count
|
129
|
-
|
130
|
-
|
131
|
-
next if line =~ options[:comment_regexp] # ignore all comment lines if there are any
|
61
|
+
print "processing file line %10d, csv line %10d\r" % [@file_line_count, @csv_line_count] if options[:verbose]
|
62
|
+
|
63
|
+
next if options[:comment_regexp] && line =~ options[:comment_regexp] # ignore all comment lines if there are any
|
132
64
|
|
133
65
|
# cater for the quoted csv data containing the row separator carriage return character
|
134
66
|
# in which case the row data will be split across multiple lines (see the sample content in spec/fixtures/carriage_returns_rn.csv)
|
135
67
|
# by detecting the existence of an uneven number of quote characters
|
136
68
|
multiline = line.count(options[:quote_char])%2 == 1 # should handle quote_char nil
|
137
69
|
while line.count(options[:quote_char])%2 == 1 # should handle quote_char nil
|
138
|
-
next_line =
|
70
|
+
next_line = fh.readline(options[:row_sep])
|
139
71
|
next_line = next_line.force_encoding('utf-8').encode('utf-8', invalid: :replace, undef: :replace, replace: options[:invalid_byte_sequence]) if options[:force_utf8] || options[:file_encoding] !~ /utf-8/i
|
140
72
|
line += next_line
|
141
|
-
file_line_count += 1
|
73
|
+
@file_line_count += 1
|
142
74
|
end
|
143
|
-
print "\nline contains uneven number of quote chars so including content through file line %d\n" % file_line_count if options[:verbose] && multiline
|
75
|
+
print "\nline contains uneven number of quote chars so including content through file line %d\n" % @file_line_count if options[:verbose] && multiline
|
144
76
|
|
145
|
-
line.chomp!
|
77
|
+
line.chomp!(options[:row_sep])
|
146
78
|
|
147
79
|
if (line =~ %r{#{options[:quote_char]}}) and (! options[:force_simple_split])
|
148
80
|
dataA = begin
|
149
81
|
CSV.parse( line, **csv_options ).flatten.collect!{|x| x.nil? ? '' : x} # to deal with nil values from CSV.parse
|
150
82
|
rescue CSV::MalformedCSVError => e
|
151
|
-
raise $!, "#{$!} [SmarterCSV: csv line #{csv_line_count}]", $!.backtrace
|
83
|
+
raise $!, "#{$!} [SmarterCSV: csv line #{@csv_line_count}]", $!.backtrace
|
152
84
|
end
|
153
85
|
else
|
154
|
-
dataA =
|
86
|
+
dataA = line.split(options[:col_sep], header_size)
|
155
87
|
end
|
156
|
-
|
157
|
-
dataA.map!{|x| x.strip}
|
88
|
+
dataA.map!{|x| x.sub(/(#{options[:col_sep]})+\z/, '')} # remove any unwanted trailing col_sep characters at the end
|
89
|
+
dataA.map!{|x| x.strip} if options[:strip_whitespace]
|
158
90
|
|
159
91
|
# if all values are blank, then ignore this line
|
160
92
|
# SEE: https://github.com/rails/rails/blob/32015b6f369adc839c4f0955f2d9dce50c0b6123/activesupport/lib/active_support/core_ext/object/blank.rb#L121
|
@@ -208,7 +140,7 @@ module SmarterCSV
|
|
208
140
|
if use_chunks
|
209
141
|
chunk << hash # append temp result to chunk
|
210
142
|
|
211
|
-
if chunk.size >= chunk_size ||
|
143
|
+
if chunk.size >= chunk_size || fh.eof? # if chunk is full, or EOF reached
|
212
144
|
# do something with the chunk
|
213
145
|
if block_given?
|
214
146
|
yield chunk # do something with the hashes in the chunk in the block
|
@@ -249,8 +181,7 @@ module SmarterCSV
|
|
249
181
|
chunk = [] # initialize for next chunk of data
|
250
182
|
end
|
251
183
|
ensure
|
252
|
-
|
253
|
-
f.close if f.respond_to?(:close)
|
184
|
+
fh.close if fh.respond_to?(:close)
|
254
185
|
end
|
255
186
|
if block_given?
|
256
187
|
return chunk_count # when we do processing through a block we only care how many chunks we processed
|
@@ -261,14 +192,22 @@ module SmarterCSV
|
|
261
192
|
|
262
193
|
private
|
263
194
|
|
195
|
+
def self.readline_with_counts(filehandle, options)
|
196
|
+
line = filehandle.readline(options[:row_sep])
|
197
|
+
@file_line_count += 1
|
198
|
+
@csv_line_count += 1
|
199
|
+
line
|
200
|
+
end
|
201
|
+
|
264
202
|
def self.default_options
|
265
203
|
{
|
266
204
|
auto_row_sep_chars: 500,
|
267
205
|
chunk_size: nil ,
|
268
206
|
col_sep: ',',
|
269
|
-
comment_regexp: /\A#/,
|
207
|
+
comment_regexp: nil, # was: /\A#/,
|
270
208
|
convert_values_to_numeric: true,
|
271
209
|
downcase_header: true,
|
210
|
+
duplicate_header_suffix: nil,
|
272
211
|
file_encoding: 'utf-8',
|
273
212
|
force_simple_split: false ,
|
274
213
|
force_utf8: false,
|
@@ -329,11 +268,11 @@ module SmarterCSV
|
|
329
268
|
end
|
330
269
|
|
331
270
|
# raise exception if none is found
|
332
|
-
def self.guess_column_separator(filehandle)
|
271
|
+
def self.guess_column_separator(filehandle, options)
|
333
272
|
del = [',', "\t", ';', ':', '|']
|
334
273
|
n = Hash.new(0)
|
335
274
|
5.times do
|
336
|
-
line = filehandle.readline
|
275
|
+
line = filehandle.readline(options[:row_sep])
|
337
276
|
del.each do |d|
|
338
277
|
n[d] += line.scan(d).count
|
339
278
|
end
|
@@ -379,4 +318,102 @@ module SmarterCSV
|
|
379
318
|
k,_ = counts.max_by{|_,v| v}
|
380
319
|
return k # the most frequent one is it
|
381
320
|
end
|
321
|
+
|
322
|
+
def self.process_headers(filehandle, options, csv_options)
|
323
|
+
if options[:headers_in_file] # extract the header line
|
324
|
+
# process the header line in the CSV file..
|
325
|
+
# the first line of a CSV file contains the header .. it might be commented out, so we need to read it anyhow
|
326
|
+
header = readline_with_counts(filehandle, options)
|
327
|
+
|
328
|
+
header = header.force_encoding('utf-8').encode('utf-8', invalid: :replace, undef: :replace, replace: options[:invalid_byte_sequence]) if options[:force_utf8] || options[:file_encoding] !~ /utf-8/i
|
329
|
+
header = header.sub(options[:comment_regexp],'') if options[:comment_regexp]
|
330
|
+
header = header.chomp(options[:row_sep])
|
331
|
+
|
332
|
+
header = header.gsub(options[:strip_chars_from_headers], '') if options[:strip_chars_from_headers]
|
333
|
+
|
334
|
+
if (header =~ %r{#{options[:quote_char]}}) and (! options[:force_simple_split])
|
335
|
+
file_headerA = begin
|
336
|
+
CSV.parse( header, **csv_options ).flatten.collect!{|x| x.nil? ? '' : x} # to deal with nil values from CSV.parse
|
337
|
+
rescue CSV::MalformedCSVError => e
|
338
|
+
raise $!, "#{$!} [SmarterCSV: csv line #{@csv_line_count}]", $!.backtrace
|
339
|
+
end
|
340
|
+
else
|
341
|
+
file_headerA = header.split(options[:col_sep])
|
342
|
+
end
|
343
|
+
file_header_size = file_headerA.size # before mapping, which could delete keys
|
344
|
+
|
345
|
+
file_headerA.map!{|x| x.gsub(%r/#{options[:quote_char]}/,'') }
|
346
|
+
file_headerA.map!{|x| x.strip} if options[:strip_whitespace]
|
347
|
+
unless options[:keep_original_headers]
|
348
|
+
file_headerA.map!{|x| x.gsub(/\s+|-+/,'_')}
|
349
|
+
file_headerA.map!{|x| x.downcase } if options[:downcase_header]
|
350
|
+
end
|
351
|
+
else
|
352
|
+
raise SmarterCSV::IncorrectOption , "ERROR: If :headers_in_file is set to false, you have to provide :user_provided_headers" unless options[:user_provided_headers]
|
353
|
+
end
|
354
|
+
if options[:user_provided_headers] && options[:user_provided_headers].class == Array && ! options[:user_provided_headers].empty?
|
355
|
+
# use user-provided headers
|
356
|
+
headerA = options[:user_provided_headers]
|
357
|
+
if defined?(file_header_size) && ! file_header_size.nil?
|
358
|
+
if headerA.size != file_header_size
|
359
|
+
raise SmarterCSV::HeaderSizeMismatch , "ERROR: :user_provided_headers defines #{headerA.size} headers != CSV-file #{input} has #{file_header_size} headers"
|
360
|
+
else
|
361
|
+
# we could print out the mapping of file_headerA to headerA here
|
362
|
+
end
|
363
|
+
end
|
364
|
+
else
|
365
|
+
headerA = file_headerA
|
366
|
+
end
|
367
|
+
|
368
|
+
# detect duplicate headers and disambiguate
|
369
|
+
headerA = process_duplicate_headers(headerA, options) if options[:duplicate_header_suffix]
|
370
|
+
header_size = headerA.size # used for splitting lines
|
371
|
+
|
372
|
+
headerA.map!{|x| x.to_sym } unless options[:strings_as_keys] || options[:keep_original_headers]
|
373
|
+
|
374
|
+
unless options[:user_provided_headers] # wouldn't make sense to re-map user provided headers
|
375
|
+
key_mappingH = options[:key_mapping]
|
376
|
+
|
377
|
+
# do some key mapping on the keys in the file header
|
378
|
+
# if you want to completely delete a key, then map it to nil or to ''
|
379
|
+
if ! key_mappingH.nil? && key_mappingH.class == Hash && key_mappingH.keys.size > 0
|
380
|
+
# we can't map keys that are not there
|
381
|
+
missing_keys = key_mappingH.keys - headerA
|
382
|
+
raise(SmarterCSV::KeyMappingError, "missing header(s): #{missing_keys.join(",")}") unless missing_keys.empty?
|
383
|
+
|
384
|
+
headerA.map!{|x| key_mappingH.has_key?(x) ? (key_mappingH[x].nil? ? nil : key_mappingH[x]) : (options[:remove_unmapped_keys] ? nil : x)}
|
385
|
+
end
|
386
|
+
end
|
387
|
+
|
388
|
+
# header_validations
|
389
|
+
duplicate_headers = []
|
390
|
+
headerA.compact.each do |k|
|
391
|
+
duplicate_headers << k if headerA.select{|x| x == k}.size > 1
|
392
|
+
end
|
393
|
+
raise SmarterCSV::DuplicateHeaders , "ERROR: duplicate headers: #{duplicate_headers.join(',')}" unless duplicate_headers.empty?
|
394
|
+
|
395
|
+
if options[:required_headers] && options[:required_headers].is_a?(Array)
|
396
|
+
missing_headers = []
|
397
|
+
options[:required_headers].each do |k|
|
398
|
+
missing_headers << k unless headerA.include?(k)
|
399
|
+
end
|
400
|
+
raise SmarterCSV::MissingHeaders , "ERROR: missing headers: #{missing_headers.join(',')}" unless missing_headers.empty?
|
401
|
+
end
|
402
|
+
|
403
|
+
[headerA, header_size]
|
404
|
+
end
|
405
|
+
|
406
|
+
def self.process_duplicate_headers(headers, options)
|
407
|
+
counts = Hash.new(0)
|
408
|
+
result = []
|
409
|
+
headers.each do |key|
|
410
|
+
counts[key] += 1
|
411
|
+
if counts[key] == 1
|
412
|
+
result << key
|
413
|
+
else
|
414
|
+
result << [key, options[:duplicate_header_suffix], counts[key]].join
|
415
|
+
end
|
416
|
+
end
|
417
|
+
result
|
418
|
+
end
|
382
419
|
end
|
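The key-mapping check added in `process_headers` can be tried in isolation. The following is a minimal sketch, not the library's code: `KeyMappingError` here is a local stand-in for `SmarterCSV::KeyMappingError`, and `remap_headers` is a hypothetical helper that follows the same steps (reject mappings for absent headers, then remap).

```ruby
# Local stand-in for SmarterCSV::KeyMappingError so the sketch runs without the gem.
class KeyMappingError < StandardError; end

# Hypothetical helper: remaps header symbols and raises if the mapping
# refers to a header that is not present.
def remap_headers(headers, key_mapping, remove_unmapped_keys = false)
  missing_keys = key_mapping.keys - headers
  unless missing_keys.empty?
    raise KeyMappingError, "missing header(s): #{missing_keys.join(',')}"
  end

  headers.map do |h|
    if key_mapping.key?(h)
      key_mapping[h] # a nil value marks the column for removal
    else
      remove_unmapped_keys ? nil : h
    end
  end
end

headers = [:email, :firstname, :lastname, :email_2, :age]
p remap_headers(headers, { email: :a, email_2: :d })

begin
  remap_headers(headers, { manager_email: :d })
rescue KeyMappingError => e
  puts e.message # prints "missing header(s): manager_email"
end
```

Failing early on a mapping for a non-existent header surfaces typos in the mapping, or unexpected input files, instead of silently returning hashes without the expected keys.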
data/lib/smarter_csv/version.rb
CHANGED
@@ -0,0 +1,2 @@
|
|
1
|
+
Name,Email,Financial Status,Paid at,Fulfillment Status,Fulfilled at,Accepts Marketing,Currency,Subtotal,Shipping,Taxes,Total,Discount Code,Discount Amount,Shipping Method,Created at,Lineitem quantity,Lineitem name,Lineitem price,Lineitem compare at price,Lineitem sku,Lineitem requires shipping,Lineitem taxable,Lineitem fulfillment status,Billing Name,Billing Street,Billing Address1,Billing Address2,Billing Company,Billing City,Billing Zip,Billing Province,Billing Country,Billing Phone,Shipping Name,Shipping Street,Shipping Address1,Shipping Address2,Shipping Company,Shipping City,Shipping Zip,Shipping Province,Shipping Country,Shipping Phone,Notes,Note Attributes,Cancelled at,Payment Method,Payment Reference,Refunded Amount,Vendor, rece,Tags,Risk Level,Source,Lineitem discount,Tax 1 Name,Tax 1 Value,Tax 2 Name,Tax 2 Value,Tax 3 Name,Tax 3 Value,Tax 4 Name,Tax 4 Value,Tax 5 Name,Tax 5 Value,Phone,Receipt Number,Duties,Billing Province Name,Shipping Province Name,Payment ID,Payment Terms Name,Next Payment Due At
|
2
|
+
#MR1220817,foo@bar.com,paid,2022-02-08 22:31:28 +0100,unfulfilled,,yes,EUR,144,0,24,144,VIP,119.6,"Livraison Standard GRATUITE, 2-5 jours avec suivi",2022-02-08 22:31:26 +0100,2,Cire Épilation Nacrée,37,,WAX-200-NAC,true,true,pending,French Fry,64 Boulevard Budgié,64 Boulevard Budgié,,,dootdoot’,'49100,,FR,06 12 34 56 78,French Fry,64 Boulevard Budgi,64 Boulevard Budgié,,,dootdoot,'49100,,FR,06 12 34 56 78,,,,Stripe,c23800013619353.2,0,Goober Rég,4331065802905,902,Low,web,0,FR TVA 20%,24,,,,,,,,,3366012111111,,,,,,,
|
@@ -0,0 +1,45 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
fixture_path = 'spec/fixtures'
|
4
|
+
|
5
|
+
describe 'handling of additional trailing column separators' do
|
6
|
+
let(:file) { "#{fixture_path}/additional_separator.csv" }
|
7
|
+
|
8
|
+
describe '' do
|
9
|
+
let(:data) { SmarterCSV.process(file) }
|
10
|
+
|
11
|
+
it 'reads all lines' do
|
12
|
+
data.size.should eq 5
|
13
|
+
end
|
14
|
+
|
15
|
+
it 'reads regular lines' do
|
16
|
+
item = data[0]
|
17
|
+
item[:col1].should == 'eins'
|
18
|
+
item[:col2].should == 'zwei'
|
19
|
+
end
|
20
|
+
|
21
|
+
it 'strips single trailing col_sep character' do
|
22
|
+
item = data[1]
|
23
|
+
item[:col1].should == 'uno'
|
24
|
+
item[:col2].should == 'dos'
|
25
|
+
end
|
26
|
+
|
27
|
+
it 'strips multiple trailing col_sep characters' do
|
28
|
+
item = data[2]
|
29
|
+
item[:col1].should == 'one'
|
30
|
+
item[:col2].should == 'two'
|
31
|
+
end
|
32
|
+
|
33
|
+
it 'strips multiple trailing col_sep chars' do
|
34
|
+
item = data[3]
|
35
|
+
item[:col1].should == 'ichi'
|
36
|
+
item[:col2].should == nil
|
37
|
+
end
|
38
|
+
|
39
|
+
it 'strips multiple trailing col_sep chars' do
|
40
|
+
item = data[4]
|
41
|
+
item[:col1].should == 'un'
|
42
|
+
item[:col2].should == nil
|
43
|
+
end
|
44
|
+
end
|
45
|
+
end
|
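The behavior exercised by this spec comes from splitting each line into at most `header_size` fields and then stripping leftover separators from the end of each field. A minimal sketch of that two-step approach follows; `split_row` is a hypothetical name, and the `Regexp.escape` call is an addition made here (an assumption, not taken from the diff) so that separators such as `|` are matched literally:

```ruby
# Hypothetical helper: split into at most header_size fields, then remove
# any run of separator characters left at the end of a field.
def split_row(line, col_sep, header_size)
  fields = line.split(col_sep, header_size)
  trailing = /(#{Regexp.escape(col_sep)})+\z/
  fields.map { |field| field.sub(trailing, '') }
end

p split_row('eins,zwei', ',', 2)  # regular line
p split_row('uno,dos,', ',', 2)   # one trailing separator
p split_row('one,two,,,', ',', 2) # several trailing separators
p split_row('ichi,,,,', ',', 2)   # second field is left empty
```

The limit passed to `split` keeps the surplus separators inside the last field, which is why a single `sub` anchored at `\z` is enough to remove them.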
@@ -12,7 +12,7 @@ describe 'be_able_to' do
|
|
12
12
|
it 'loads_binary_file_with_strings_as_keys' do
|
13
13
|
options = {:col_sep => "\cA", :row_sep => "\cB", :comment_regexp => /^#/, :strings_as_keys => true}
|
14
14
|
data = SmarterCSV.process("#{fixture_path}/binary.csv", options)
|
15
|
-
data.
|
15
|
+
data.size.should == 8
|
16
16
|
data.each do |item|
|
17
17
|
# all keys should be strings
|
18
18
|
item.keys.each{|x| x.class.should be == String}
|
@@ -0,0 +1,76 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
fixture_path = 'spec/fixtures'
|
4
|
+
|
5
|
+
describe 'duplicate headers' do
|
6
|
+
describe 'without special handling / default behavior' do
|
7
|
+
it 'raises error on duplicate headers' do
|
8
|
+
expect {
|
9
|
+
SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", {})
|
10
|
+
}.to raise_exception(SmarterCSV::DuplicateHeaders)
|
11
|
+
end
|
12
|
+
|
13
|
+
it 'raises error on duplicate given headers' do
|
14
|
+
expect {
|
15
|
+
options = {:user_provided_headers => [:a,:b,:c,:d,:a]}
|
16
|
+
SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
17
|
+
}.to raise_exception(SmarterCSV::DuplicateHeaders)
|
18
|
+
end
|
19
|
+
|
20
|
+
it 'raises error on missing mapped headers and includes missing headers in message' do
|
21
|
+
expect {
|
22
|
+
# the mapping is right, but the underlying csv file is bad
|
23
|
+
options = {:key_mapping => {:email => :a, :firstname => :b, :lastname => :c, :manager_email => :d, :age => :e} }
|
24
|
+
SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
25
|
+
}.to raise_exception(SmarterCSV::KeyMappingError, "missing header(s): manager_email")
|
26
|
+
end
|
27
|
+
end
|
28
|
+
|
29
|
+
describe 'with special handling' do
|
30
|
+
context 'with given suffix' do
|
31
|
+
let(:options) { {duplicate_header_suffix: '_'} }
|
32
|
+
|
33
|
+
it 'reads whole file' do
|
34
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
35
|
+
expect(data.size).to eq 2
|
36
|
+
end
|
37
|
+
|
38
|
+
it 'generates the correct keys' do
|
39
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
40
|
+
expect(data.first.keys).to eq [:email, :firstname, :lastname, :email_2, :age]
|
41
|
+
end
|
42
|
+
|
43
|
+
it 'enumerates when duplicate headers are given' do
|
44
|
+
options.merge!({:user_provided_headers => [:a,:b,:c,:a,:a]})
|
45
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
46
|
+
expect(data.first.keys).to eq [:a, :b, :c, :a_2, :a_3]
|
47
|
+
end
|
48
|
+
|
49
|
+
it 'can remap duplicated headers' do
|
50
|
+
options.merge!({:key_mapping => {:email => :a, :firstname => :b, :lastname => :c, :email_2 => :d, :age => :e}})
|
51
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
52
|
+
expect(data.first).to eq({a: 'tom@bla.com', b: 'Tom', c: 'Sawyer', d: 'mike@bla.com', e: 34})
|
53
|
+
end
|
54
|
+
end
|
55
|
+
|
56
|
+
context 'with empty suffix' do
|
57
|
+
let(:options) { {duplicate_header_suffix: ''} }
|
58
|
+
|
59
|
+
it 'reads whole file' do
|
60
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
61
|
+
expect(data.size).to eq 2
|
62
|
+
end
|
63
|
+
|
64
|
+
it 'generates the correct keys' do
|
65
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
66
|
+
expect(data.first.keys).to eq [:email, :firstname, :lastname, :email2, :age]
|
67
|
+
end
|
68
|
+
|
69
|
+
it 'enumerates when duplicate headers are given' do
|
70
|
+
options.merge!({:user_provided_headers => [:a,:b,:c,:a,:a]})
|
71
|
+
data = SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
|
72
|
+
expect(data.first.keys).to eq [:a, :b, :c, :a2, :a3]
|
73
|
+
end
|
74
|
+
end
|
75
|
+
end
|
76
|
+
end
|
@@ -0,0 +1,24 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
fixture_path = 'spec/fixtures'
|
4
|
+
|
5
|
+
describe 'can handle the difficult CSV file' do
|
6
|
+
|
7
|
+
it 'loads the data with default values' do
|
8
|
+
data = SmarterCSV.process("#{fixture_path}/hard_sample.csv")
|
9
|
+
data.size.should eq 1
|
10
|
+
item = data.first
|
11
|
+
item.keys.count.should == 48
|
12
|
+
item[:name].should == '#MR1220817'
|
13
|
+
item[:shipping_method].should == 'Livraison Standard GRATUITE, 2-5 jours avec suivi'
|
14
|
+
item[:lineitem_name].should == 'Cire Épilation Nacrée'
|
15
|
+
item[:phone].should == 3366012111111
|
16
|
+
end
|
17
|
+
|
18
|
+
# the main problem is the data line starting with a # character, but not being a comment
|
19
|
+
it 'fails to load the CSV file with incorrectly set comment_regexp' do
|
20
|
+
options = {comment_regexp: /\A#/ }
|
21
|
+
data = SmarterCSV.process("#{fixture_path}/hard_sample.csv", options)
|
22
|
+
data.size.should eq 0
|
23
|
+
end
|
24
|
+
end
|
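The two examples in this spec show why the default for `comment_regexp` became `nil`: a data line may legitimately start with `#`. A standalone sketch of the nil-safe check, assuming only that a line is skipped when a regexp is given and matches; `data_lines` is a hypothetical helper, not part of the gem:

```ruby
# Hypothetical helper: drop comment lines only when a regexp is supplied.
# With comment_regexp = nil nothing is treated as a comment.
def data_lines(lines, comment_regexp = nil)
  lines.reject { |line| comment_regexp && line =~ comment_regexp }
end

lines = ["name,total\n", "#MR1220817,144\n", "# a real comment\n"]
p data_lines(lines).size         # nil regexp: all 3 lines are kept
p data_lines(lines, /\A#/).size  # only "name,total" survives
```

With `/\A#/` the order line `#MR1220817,...` is discarded together with the real comment, which is the data loss the new default avoids.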
@@ -1,30 +1,45 @@
-require 'spec_helper'
-
-fixture_path = 'spec/fixtures'
-
-describe 'be_able_to' do
-  it 'ignore comments in CSV files' do
-    options = {}
-    data = SmarterCSV.process("#{fixture_path}/ignore_comments.csv", options)
-
-    data.size.should eq
-
-    # all the keys should be symbols
-    data.each{|item| item.keys.each{|x| x.is_a?(Symbol).should be_truthy}}
-    data.each do |h|
-      h.keys.each do |key|
-        [:"not_a_comment#first_name", :last_name, :dogs, :cats, :birds, :fish].should include( key )
-      end
-    end
-  end
-
-  it 'ignore comments in CSV files
-    options = {
-    data = SmarterCSV.process("#{fixture_path}/
-
-
-
-
-    data.
-
-
+require 'spec_helper'
+
+fixture_path = 'spec/fixtures'
+
+describe 'be_able_to' do
+  it 'by default does not ignore comments in CSV files' do
+    options = {}
+    data = SmarterCSV.process("#{fixture_path}/ignore_comments.csv", options)
+
+    data.size.should eq 8
+
+    # all the keys should be symbols
+    data.each{|item| item.keys.each{|x| x.is_a?(Symbol).should be_truthy}}
+    data.each do |h|
+      h.keys.each do |key|
+        [:"not_a_comment#first_name", :last_name, :dogs, :cats, :birds, :fish].should include( key )
+      end
+    end
+  end
+
+  it 'ignore comments in CSV files using comment_regexp' do
+    options = {comment_regexp: /\A#/}
+    data = SmarterCSV.process("#{fixture_path}/ignore_comments.csv", options)
+
+    data.size.should eq 5
+
+    # all the keys should be symbols
+    data.each{|item| item.keys.each{|x| x.is_a?(Symbol).should be_truthy}}
+    data.each do |h|
+      h.keys.each do |key|
+        [:"not_a_comment#first_name", :last_name, :dogs, :cats, :birds, :fish].should include( key )
+      end
+    end
+  end
+
+  it 'ignore comments in CSV files with CRLF' do
+    options = {row_sep: "\r\n"}
+    data = SmarterCSV.process("#{fixture_path}/ignore_comments2.csv", options)
+
+    # all the keys should be symbols
+    data.size.should eq 1
+    data.first[:h1].should eq 'a'
+    data.first[:h2].should eq "b\r\n#c"
+  end
+end
@@ -3,28 +3,6 @@ require 'spec_helper'
 fixture_path = 'spec/fixtures'
 
 describe 'test exceptions for invalid headers' do
-  it 'raises error on duplicate headers' do
-    expect {
-      SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", {})
-    }.to raise_exception(SmarterCSV::DuplicateHeaders)
-  end
-
-  it 'raises error on duplicate given headers' do
-    expect {
-      options = {:user_provided_headers => [:a,:b,:c,:d,:a]}
-      SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
-    }.to raise_exception(SmarterCSV::DuplicateHeaders)
-  end
-
-  it 'raises error on duplicate mapped headers' do
-    expect {
-      # the mapping is right, but the underlying csv file is bad
-      options = {:key_mapping => {:email => :a, :firstname => :b, :lastname => :c, :manager_email => :d, :age => :e} }
-      SmarterCSV.process("#{fixture_path}/duplicate_headers.csv", options)
-    }.to raise_exception(SmarterCSV::DuplicateHeaders)
-  end
-
-
   it 'does not raise an error if no required headers are given' do
     options = {:required_headers => nil} # order does not matter
     data = SmarterCSV.process("#{fixture_path}/user_import.csv", options)
@@ -49,4 +27,12 @@ describe 'test exceptions for invalid headers' do
       SmarterCSV.process("#{fixture_path}/user_import.csv", options)
     }.to raise_exception(SmarterCSV::MissingHeaders)
   end
+
+  it 'raises error on missing mapped headers and includes missing headers in message' do
+    expect {
+      # :age does not exist in the CSV header
+      options = {:key_mapping => {:email => :a, :firstname => :b, :lastname => :c, :manager_email => :d, :age => :e} }
+      SmarterCSV.process("#{fixture_path}/user_import.csv", options)
+    }.to raise_exception(SmarterCSV::KeyMappingError, "missing header(s): age")
+  end
 end
@@ -2,23 +2,28 @@ require 'spec_helper'
 
 fixture_path = 'spec/fixtures'
 
-describe '
-
-
-
+describe 'no header in file' do
+  let(:headers) { [:a,:b,:c,:d,:e,:f] }
+  let(:options) { {:headers_in_file => false, :user_provided_headers => headers} }
+  subject(:data) { SmarterCSV.process("#{fixture_path}/no_header.csv", options) }
+
+  it 'load the correct number of records' do
     data.size.should == 5
-
-    data.each{|item| item.keys.each{|x| x.class.should be == Symbol}}
+  end
 
-
+  it 'uses given symbols for all records' do
+    data.each do |item|
       item.keys.each do |key|
         [:a,:b,:c,:d,:e,:f].should include( key )
       end
     end
-
-    data.each do |h|
-      h.size.should <= 6
-    end
   end
 
+  it 'loads the correct data' do
+    data[0].should == {a: "Dan", b: "McAllister", c: 2, d: 0}
+    data[1].should == {a: "Lucy", b: "Laweless", d: 5, e: 0}
+    data[2].should == {a: "Miles", b: "O'Brian", c: 0, d: 0, e: 0, f: 21}
+    data[3].should == {a: "Nancy", b: "Homes", c: 2, d: 0, e: 1}
+    data[4].should == {a: "Hernán", b: "Curaçon", c: 3, d: 0, e: 0}
+  end
 end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: smarter_csv
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.5.2
 platform: ruby
 authors:
 - Tilo Sloboda
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-04-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
@@ -62,6 +62,7 @@ files:
 - lib/smarter_csv/smarter_csv.rb
 - lib/smarter_csv/version.rb
 - smarter_csv.gemspec
+- spec/fixtures/additional_separator.csv
 - spec/fixtures/basic.csv
 - spec/fixtures/binary.csv
 - spec/fixtures/carriage_returns_n.csv
@@ -73,6 +74,7 @@ files:
 - spec/fixtures/empty.csv
 - spec/fixtures/empty_columns_1.csv
 - spec/fixtures/empty_columns_2.csv
+- spec/fixtures/hard_sample.csv
 - spec/fixtures/ignore_comments.csv
 - spec/fixtures/ignore_comments2.csv
 - spec/fixtures/key_mapping.csv
@@ -101,6 +103,7 @@ files:
 - spec/fixtures/valid_unicode.csv
 - spec/fixtures/with_dashes.csv
 - spec/fixtures/with_dates.csv
+- spec/smarter_csv/additional_separator_spec.rb
 - spec/smarter_csv/binary_file2_spec.rb
 - spec/smarter_csv/binary_file_spec.rb
 - spec/smarter_csv/blank_spec.rb
@@ -109,8 +112,10 @@ files:
 - spec/smarter_csv/close_file_spec.rb
 - spec/smarter_csv/column_separator_spec.rb
 - spec/smarter_csv/convert_values_to_numeric_spec.rb
+- spec/smarter_csv/duplicate_headers_spec.rb
 - spec/smarter_csv/empty_columns_spec.rb
 - spec/smarter_csv/extenstions_spec.rb
+- spec/smarter_csv/hard_sample_spec.rb
 - spec/smarter_csv/header_transformation_spec.rb
 - spec/smarter_csv/ignore_comments_spec.rb
 - spec/smarter_csv/invalid_headers_spec.rb
@@ -164,6 +169,7 @@ specification_version: 4
 summary: Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots
   of optional features, e.g. chunked processing for huge CSV files
 test_files:
+- spec/fixtures/additional_separator.csv
 - spec/fixtures/basic.csv
 - spec/fixtures/binary.csv
 - spec/fixtures/carriage_returns_n.csv
@@ -175,6 +181,7 @@ test_files:
 - spec/fixtures/empty.csv
 - spec/fixtures/empty_columns_1.csv
 - spec/fixtures/empty_columns_2.csv
+- spec/fixtures/hard_sample.csv
 - spec/fixtures/ignore_comments.csv
 - spec/fixtures/ignore_comments2.csv
 - spec/fixtures/key_mapping.csv
@@ -203,6 +210,7 @@ test_files:
 - spec/fixtures/valid_unicode.csv
 - spec/fixtures/with_dashes.csv
 - spec/fixtures/with_dates.csv
+- spec/smarter_csv/additional_separator_spec.rb
 - spec/smarter_csv/binary_file2_spec.rb
 - spec/smarter_csv/binary_file_spec.rb
 - spec/smarter_csv/blank_spec.rb
@@ -211,8 +219,10 @@ test_files:
 - spec/smarter_csv/close_file_spec.rb
 - spec/smarter_csv/column_separator_spec.rb
 - spec/smarter_csv/convert_values_to_numeric_spec.rb
+- spec/smarter_csv/duplicate_headers_spec.rb
 - spec/smarter_csv/empty_columns_spec.rb
 - spec/smarter_csv/extenstions_spec.rb
+- spec/smarter_csv/hard_sample_spec.rb
 - spec/smarter_csv/header_transformation_spec.rb
 - spec/smarter_csv/ignore_comments_spec.rb
 - spec/smarter_csv/invalid_headers_spec.rb