smarter_csv 1.7.1 → 1.7.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +9 -2
- data/CONTRIBUTORS.md +1 -0
- data/README.md +26 -23
- data/lib/smarter_csv/version.rb +1 -1
- data/lib/smarter_csv.rb +13 -5
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: d4046758f38c21262fdec6bc7e13e3a7811c7aee3944d92e0cc36a2a1cfb032a
+data.tar.gz: 9d111e2f36171ca488034f3af73fc71c7c9f6fde73986d277aeaf1560a066fa2
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: c46c5c45dd3fafe66735b2b17b0679c5aaff27b3670140d97bc19e1c825ad91310fa2cf55a12a5c7b0c31ef82fe9cc12a2c4bda0a78b218d80ad5816c01c0d9f
+data.tar.gz: ba03acd95955f8afeb8e96f16c7cfa2e1605dbaf6fddb7008930294aab83196aed21f57605efb3553799381c1c4811528eee2db221efa50dc82f58bcf9135842
data/CHANGELOG.md
CHANGED
@@ -1,9 +1,16 @@
 
 # SmarterCSV 1.x Change Log
 
+## 1.7.3 (2022-12-05)
+* new option :silence_missing_keys; if set to true, it ignores missing keys in `key_mapping`
+
+## 1.7.2 (2022-08-29)
+* new option :with_line_numbers; if set to true, it adds :csv_line_number to each data hash (issue #130)
+
 ## 1.7.1 (2022-07-31)
-* bugfix (thanks to Viacheslav Markin, Nicolas Rodriguez)
-
+* bugfix for issue #195 #197 #200 which only appeared when called from Rails (thanks to Viacheslav Markin, Nicolas Rodriguez)
+
+## 1.7.0 (2022-06-26) (replaced by 1.7.1)
 * added native code to accellerate line parsing by >10x over 1.6.0
 * added option `acceleration`, defaulting to `true`, to enable native code.
 Disable this option to use the ruby code for line parsing.
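As an illustration of the 1.7.2 changelog entry, the following is a minimal plain-Ruby sketch of what adding `:csv_line_number` to each data hash looks like. It does not use the gem: the helper `rows_with_line_numbers`, the sample data, and the choice to count the header as line 1 are assumptions made for this example, not SmarterCSV's implementation.

```ruby
require "stringio"

# Sketch only: a hypothetical stand-in that imitates the effect of
# :with_line_numbers. It is not the gem's parser and ignores quoting.
def rows_with_line_numbers(io, col_sep: ",")
  keys = nil
  line_count = 0
  result = []
  io.each_line do |line|
    line_count += 1                      # every physical line is counted
    fields = line.chomp.split(col_sep)
    if keys.nil?
      # first line is treated as the header (assumption of this sketch)
      keys = fields.map { |h| h.strip.downcase.to_sym }
      next
    end
    next if fields.empty?                # blank line: counted, but yields no hash
    hash = keys.zip(fields).to_h
    hash[:csv_line_number] = line_count  # same key name as in the changelog entry
    result << hash
  end
  result
end

csv = StringIO.new("name,price\nrock,1\n\njazz,2\n")
rows_with_line_numbers(csv).each { |row| p row }
```

Because blank lines are still counted, the line number stored in a hash points at the physical line in the input, which is what makes it useful for tracking down a bad record.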
data/CONTRIBUTORS.md
CHANGED
@@ -48,3 +48,4 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
 * [Viacheslav Markin](https://github.com/KXEinc)
 * [Nicolas Rodriguez](https://github.com/n-rodriguez)
 * [Hirotaka Mizutani ](https://github.com/hirotaka)
+* [Rahul Chaudhary](https://github.com/rahulch95)
data/README.md
CHANGED
@@ -1,9 +1,12 @@
-
+
+# SmarterCSV
+
+[![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
 
-####
+#### Work towards Future Version 2.0
 
-* Work towards SmarterCSV 2.0 is still
-Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/
+* Work towards SmarterCSV 2.0 is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
+Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.
 
 * New versions of SmarterCSV 1.x will soon print a deprecation warning if you set :verbose to true
 See below for list of deprecated options.
@@ -15,11 +18,7 @@
 
 ---------------
 
-
-
-[![Build Status](https://secure.travis-ci.org/tilo/smarter_csv.svg?branch=master)](http://travis-ci.com/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
-
-#### SmarterCSV 1.x
+#### SmarterCSV 1.x [Current Version]
 
 `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, parallel processing, or kicking-off batch jobs with Sidekiq.
 
@@ -132,7 +131,21 @@ and how the `process` method returns the number of chunks when called with a blo
 
 => returns number of chunks / rows we processed
 ```
-
+
+#### Example 4: Reading a CSV-like File, and Processing it with Sidekiq:
+```ruby
+filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
+options = {
+:col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
+:chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
+}
+n = SmarterCSV.process(filename, options) do |chunk|
+SidekiqWorkerClass.process_async(chunk ) # pass an array of hashes to Sidekiq workers for parallel processing
+end
+=> returns number of chunks
+```
+
+#### Example 5: Populate a MongoDB Database in Chunks of 100 records with SmarterCSV:
 ```ruby
 # using chunks:
 filename = '/tmp/some.csv'
@@ -146,18 +159,6 @@ and how the `process` method returns the number of chunks when called with a blo
 => returns number of chunks we processed
 ```
 
-#### Example 5: Reading a CSV-like File, and Processing it with Resque:
-```ruby
-filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
-options = {
-:col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
-:chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
-}
-n = SmarterCSV.process(filename, options) do |chunk|
-Resque.enque( ResqueWorkerClass, chunk ) # pass chunks of CSV-data to Resque workers for parallel processing
-end
-=> returns number of chunks
-```
 #### Example 6: Using Value Converters
 
 NOTE: If you use `key_mappings` and `value_converters`, make sure that the value converters has references the keys based on the final mapped name, not the original name in the CSV file.
@@ -239,6 +240,7 @@ The options and the block are optional.
 | | | You can not combine the :user_provided_headers and :key_mapping options |
 | :remove_empty_hashes | true | remove / ignore any hashes which don't have any key/value pairs or all empty values |
 | :verbose | false | print out line number while processing (to track down problems in input files) |
+| :with_line_numbers | false | add :csv_line_number to each data hash |
 ---------------------------------------------------------------------------------------------------------------------------------
 
 #### Deprecated 1.x Options: to be replaced in 2.0
@@ -251,7 +253,8 @@ And header and data validations will also be supported in 2.x
 | Option | Default | Explanation |
 ---------------------------------------------------------------------------------------------------------------------------------
 | :key_mapping | nil | a hash which maps headers from the CSV file to keys in the result hash |
-| :
+| :silence_missing_key | false | ignore missing keys in `key_mapping` if true |
+| :required_headers | nil | An array. Each of the given headers must be present after header manipulation, |
 | | | or an exception is raised No validation if nil is given. |
 | :remove_unmapped_keys | false | when using :key_mapping option, should non-mapped keys / columns be removed? |
 | :downcase_header | true | downcase all column headers |
data/lib/smarter_csv/version.rb
CHANGED
data/lib/smarter_csv.rb
CHANGED
@@ -2,7 +2,9 @@
 
 require_relative "extensions/hash"
 require_relative "smarter_csv/version"
+
 require_relative "smarter_csv/smarter_csv" unless ENV['CI'] # does not compile/link in CI?
+# require 'smarter_csv.bundle' unless ENV['CI'] # does not compile/link in CI?
 
 module SmarterCSV
 class SmarterCSVException < StandardError; end
@@ -129,6 +131,8 @@ module SmarterCSV
 
 next if options[:remove_empty_hashes] && hash.empty?
 
+hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
+
 if use_chunks
 chunk << hash # append temp result to chunk
 
@@ -223,6 +227,7 @@ module SmarterCSV
 remove_zero_values: false,
 required_headers: nil,
 row_sep: $/,
+silence_missing_keys: false,
 skip_lines: nil,
 strings_as_keys: false,
 strip_chars_from_headers: nil,
@@ -230,6 +235,7 @@ module SmarterCSV
 user_provided_headers: nil,
 value_converters: nil,
 verbose: false,
+with_line_numbers: false,
 }
 end
 
@@ -247,12 +253,12 @@ module SmarterCSV
 # puts "SmarterCSV.parse OPTIONS: #{options[:acceleration]}" if options[:verbose]
 
 if options[:acceleration] && has_acceleration?
-#
+# :nocov:
 has_quotes = line =~ /#{options[:quote_char]}/
 elements = parse_csv_line_c(line, options[:col_sep], options[:quote_char], header_size)
 elements.map!{|x| cleanup_quotes(x, options[:quote_char])} if has_quotes
 return [elements, elements.size]
-
+# :nocov:
 else
 # puts "WARNING: SmarterCSV is using un-accelerated parsing of lines. Check options[:acceleration]"
 return parse_csv_line_ruby(line, options, header_size)
@@ -474,9 +480,11 @@ module SmarterCSV
 # do some key mapping on the keys in the file header
 # if you want to completely delete a key, then map it to nil or to ''
 if !key_mappingH.nil? && key_mappingH.class == Hash && key_mappingH.keys.size > 0
-
-
-
+unless options[:silence_missing_keys]
+# if silence_missing_keys are not set, raise error if missing header
+missing_keys = key_mappingH.keys - headerA
+puts "WARNING: missing header(s): #{missing_keys.join(",")}" unless missing_keys.empty?
+end
 
 headerA.map!{|x| key_mappingH.has_key?(x) ? (key_mappingH[x].nil? ? nil : key_mappingH[x]) : (options[:remove_unmapped_keys] ? nil : x)}
 end
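The last hunk of `smarter_csv.rb` adds a check for `key_mapping` keys that do not occur in the header. Two details are worth noting when reading it: the option key in the code is `:silence_missing_keys` (plural), and the added lines print a warning with `puts` even though the code comment speaks of raising an error. The following standalone sketch mirrors that logic under stated assumptions: `map_headers` is a hypothetical helper, not part of the gem, and it collects warnings in an array instead of printing them so the behavior can be checked.

```ruby
# Sketch only: a hypothetical helper that mirrors the header-mapping step
# from the diff. It is not the gem's method and takes plain arguments.
def map_headers(headers, key_mapping, silence_missing_keys: false, remove_unmapped_keys: false)
  warnings = []
  unless silence_missing_keys
    # keys named in the mapping that never occur in the header
    missing_keys = key_mapping.keys - headers
    warnings << "WARNING: missing header(s): #{missing_keys.join(',')}" unless missing_keys.empty?
  end

  mapped = headers.map do |h|
    if key_mapping.key?(h)
      key_mapping[h]                 # nil means "drop this column"
    else
      remove_unmapped_keys ? nil : h # unmapped columns are kept unless asked otherwise
    end
  end
  [mapped, warnings]
end

headers = [:name, :export_date, :price]
mapping = { export_date: nil, name: :genre, artist: :performer }

mapped, warnings = map_headers(headers, mapping)
puts warnings
p mapped

# With the option set, the missing :artist key is not reported.
_, quiet = map_headers(headers, mapping, silence_missing_keys: true)
p quiet
```

The set difference `key_mapping.keys - headers` is the whole check; silencing it is useful when one shared mapping is applied to files whose columns vary.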
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: smarter_csv
 version: !ruby/object:Gem::Version
-version: 1.7.
+version: 1.7.3
 platform: ruby
 authors:
 - Tilo Sloboda
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-12-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: awesome_print