smarter_csv 1.7.1 → 1.7.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +9 -2
- data/CONTRIBUTORS.md +1 -0
- data/README.md +26 -23
- data/lib/smarter_csv/version.rb +1 -1
- data/lib/smarter_csv.rb +13 -5
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d4046758f38c21262fdec6bc7e13e3a7811c7aee3944d92e0cc36a2a1cfb032a
+  data.tar.gz: 9d111e2f36171ca488034f3af73fc71c7c9f6fde73986d277aeaf1560a066fa2
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c46c5c45dd3fafe66735b2b17b0679c5aaff27b3670140d97bc19e1c825ad91310fa2cf55a12a5c7b0c31ef82fe9cc12a2c4bda0a78b218d80ad5816c01c0d9f
+  data.tar.gz: ba03acd95955f8afeb8e96f16c7cfa2e1605dbaf6fddb7008930294aab83196aed21f57605efb3553799381c1c4811528eee2db221efa50dc82f58bcf9135842
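The SHA256 and SHA512 entries in checksums.yaml are hex digests of the two archives packed inside the .gem file. A minimal sketch of how such a digest could be compared, using Ruby's standard Digest library. The helper name `digest_matches?` and the sample bytes are illustrative assumptions, not part of RubyGems or smarter_csv:

```ruby
require "digest"

# Hypothetical helper (not part of RubyGems or smarter_csv): returns true
# when the SHA256 hex digest of `bytes` equals `expected_hex`.
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.strip.downcase
end

# For a real check, `bytes` would be File.binread("data.tar.gz") and
# `expected_hex` the value listed under SHA256 in checksums.yaml.
# Here a well-known test vector stands in for the archive.
SAMPLE_BYTES = "abc"
SAMPLE_SHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
```

Calling `digest_matches?(SAMPLE_BYTES, SAMPLE_SHA256)` compares the computed digest with the expected one; any change to the bytes makes the comparison fail.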
data/CHANGELOG.md
CHANGED
@@ -1,9 +1,16 @@

 # SmarterCSV 1.x Change Log

+## 1.7.3 (2022-12-05)
+* new option :silence_missing_keys; if set to true, it ignores missing keys in `key_mapping`
+
+## 1.7.2 (2022-08-29)
+* new option :with_line_numbers; if set to true, it adds :csv_line_number to each data hash (issue #130)
+
 ## 1.7.1 (2022-07-31)
-* bugfix (thanks to Viacheslav Markin, Nicolas Rodriguez)
-
+* bugfix for issue #195 #197 #200 which only appeared when called from Rails (thanks to Viacheslav Markin, Nicolas Rodriguez)
+
+## 1.7.0 (2022-06-26) (replaced by 1.7.1)
 * added native code to accellerate line parsing by >10x over 1.6.0
 * added option `acceleration`, defaulting to `true`, to enable native code.
 Disable this option to use the ruby code for line parsing.
data/CONTRIBUTORS.md
CHANGED
@@ -48,3 +48,4 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
 * [Viacheslav Markin](https://github.com/KXEinc)
 * [Nicolas Rodriguez](https://github.com/n-rodriguez)
 * [Hirotaka Mizutani ](https://github.com/hirotaka)
+* [Rahul Chaudhary](https://github.com/rahulch95)
data/README.md
CHANGED
@@ -1,9 +1,12 @@
-
+
+# SmarterCSV
+
+[](https://codecov.io/gh/tilo/smarter_csv) [](http://badge.fury.io/rb/smarter_csv)

-####
+#### Work towards Future Version 2.0

-* Work towards SmarterCSV 2.0 is still
-Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/
+* Work towards SmarterCSV 2.0 is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
+Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.

 * New versions of SmarterCSV 1.x will soon print a deprecation warning if you set :verbose to true
 See below for list of deprecated options.
@@ -15,11 +18,7 @@

 ---------------

-
-
-[](http://travis-ci.com/tilo/smarter_csv) [](http://badge.fury.io/rb/smarter_csv)
-
-#### SmarterCSV 1.x
+#### SmarterCSV 1.x [Current Version]

 `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, parallel processing, or kicking-off batch jobs with Sidekiq.

@@ -132,7 +131,21 @@ and how the `process` method returns the number of chunks when called with a blo

 => returns number of chunks / rows we processed
 ```
-
+
+#### Example 4: Reading a CSV-like File, and Processing it with Sidekiq:
+```ruby
+filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
+options = {
+  :col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
+  :chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
+}
+n = SmarterCSV.process(filename, options) do |chunk|
+  SidekiqWorkerClass.process_async(chunk ) # pass an array of hashes to Sidekiq workers for parallel processing
+end
+=> returns number of chunks
+```
+
+#### Example 5: Populate a MongoDB Database in Chunks of 100 records with SmarterCSV:
 ```ruby
 # using chunks:
 filename = '/tmp/some.csv'
@@ -146,18 +159,6 @@ and how the `process` method returns the number of chunks when called with a blo
 => returns number of chunks we processed
 ```

-#### Example 5: Reading a CSV-like File, and Processing it with Resque:
-```ruby
-filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
-options = {
-  :col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
-  :chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
-}
-n = SmarterCSV.process(filename, options) do |chunk|
-  Resque.enque( ResqueWorkerClass, chunk ) # pass chunks of CSV-data to Resque workers for parallel processing
-end
-=> returns number of chunks
-```
 #### Example 6: Using Value Converters

 NOTE: If you use `key_mappings` and `value_converters`, make sure that the value converters has references the keys based on the final mapped name, not the original name in the CSV file.
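The README examples in the hunks above pass a block to `SmarterCSV.process` together with `:chunk_size`, and state that the call then returns the number of chunks. A minimal plain-Ruby sketch of that calling pattern, assuming the rows are already parsed into hashes. The method `process_in_chunks` is a hypothetical stand-in and does not read files or call SmarterCSV:

```ruby
# Hypothetical stand-in for chunked processing: yields arrays of at most
# `chunk_size` row hashes to the block and returns how many chunks were yielded.
def process_in_chunks(rows, chunk_size)
  chunk_count = 0
  rows.each_slice(chunk_size) do |chunk|
    yield chunk # e.g. hand the chunk to a background worker
    chunk_count += 1
  end
  chunk_count
end

# Usage: 250 rows with a chunk size of 100 give chunks of 100, 100 and 50 rows.
def demo_chunk_sizes
  sizes = []
  rows = (1..250).map { |i| { id: i } }
  count = process_in_chunks(rows, 100) { |chunk| sizes << chunk.size }
  [count, sizes]
end
```

Returning the chunk count rather than the rows keeps memory use flat when each chunk is handed off to a worker, which matches the intent of the Sidekiq and MongoDB examples.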
@@ -239,6 +240,7 @@ The options and the block are optional.
 | | | You can not combine the :user_provided_headers and :key_mapping options |
 | :remove_empty_hashes | true | remove / ignore any hashes which don't have any key/value pairs or all empty values |
 | :verbose | false | print out line number while processing (to track down problems in input files) |
+| :with_line_numbers | false | add :csv_line_number to each data hash |
 ---------------------------------------------------------------------------------------------------------------------------------

 #### Deprecated 1.x Options: to be replaced in 2.0
@@ -251,7 +253,8 @@ And header and data validations will also be supported in 2.x
 | Option | Default | Explanation |
 ---------------------------------------------------------------------------------------------------------------------------------
 | :key_mapping | nil | a hash which maps headers from the CSV file to keys in the result hash |
-| :
+| :silence_missing_key | false | ignore missing keys in `key_mapping` if true |
+| :required_headers | nil | An array. Each of the given headers must be present after header manipulation, |
 | | | or an exception is raised No validation if nil is given. |
 | :remove_unmapped_keys | false | when using :key_mapping option, should non-mapped keys / columns be removed? |
 | :downcase_header | true | downcase all column headers |
data/lib/smarter_csv/version.rb
CHANGED
data/lib/smarter_csv.rb
CHANGED
@@ -2,7 +2,9 @@

 require_relative "extensions/hash"
 require_relative "smarter_csv/version"
+
 require_relative "smarter_csv/smarter_csv" unless ENV['CI'] # does not compile/link in CI?
+# require 'smarter_csv.bundle' unless ENV['CI'] # does not compile/link in CI?

 module SmarterCSV
   class SmarterCSVException < StandardError; end
@@ -129,6 +131,8 @@ module SmarterCSV

 next if options[:remove_empty_hashes] && hash.empty?

+hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
+
 if use_chunks
   chunk << hash # append temp result to chunk

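The hunk above stores the running line counter in each row hash when `:with_line_numbers` is set. A simplified sketch of the effect, assuming the header counts as line 1 and that skipped blank lines still advance the counter. The method `rows_with_line_numbers` is hypothetical and uses naive comma splitting instead of SmarterCSV's parser:

```ruby
# Hypothetical helper, not part of SmarterCSV: mimics the effect of
# :with_line_numbers on a small in-memory CSV string.
def rows_with_line_numbers(text, with_line_numbers: false)
  lines = text.lines.map(&:chomp)
  headers = lines.shift.split(",").map { |h| h.strip.downcase.to_sym }
  csv_line_count = 1 # assumption: the header line is line 1
  lines.each_with_object([]) do |line, result|
    csv_line_count += 1
    next if line.strip.empty? # blank lines are skipped but still counted
    hash = headers.zip(line.split(",")).to_h
    hash[:csv_line_number] = csv_line_count if with_line_numbers
    result << hash
  end
end

SAMPLE_CSV = "name,age\nAnn,31\n\nBob,27\n"
```

With `with_line_numbers: true`, each row carries the number of the physical line it came from, so a bad record can be traced back to the input file even when earlier lines were skipped.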
@@ -223,6 +227,7 @@ module SmarterCSV
 remove_zero_values: false,
 required_headers: nil,
 row_sep: $/,
+silence_missing_keys: false,
 skip_lines: nil,
 strings_as_keys: false,
 strip_chars_from_headers: nil,
@@ -230,6 +235,7 @@ module SmarterCSV
     user_provided_headers: nil,
     value_converters: nil,
     verbose: false,
+    with_line_numbers: false,
   }
 end

@@ -247,12 +253,12 @@ module SmarterCSV
 # puts "SmarterCSV.parse OPTIONS: #{options[:acceleration]}" if options[:verbose]

 if options[:acceleration] && has_acceleration?
-  #
+  # :nocov:
   has_quotes = line =~ /#{options[:quote_char]}/
   elements = parse_csv_line_c(line, options[:col_sep], options[:quote_char], header_size)
   elements.map!{|x| cleanup_quotes(x, options[:quote_char])} if has_quotes
   return [elements, elements.size]
-
+  # :nocov:
 else
   # puts "WARNING: SmarterCSV is using un-accelerated parsing of lines. Check options[:acceleration]"
   return parse_csv_line_ruby(line, options, header_size)
@@ -474,9 +480,11 @@ module SmarterCSV
 # do some key mapping on the keys in the file header
 # if you want to completely delete a key, then map it to nil or to ''
 if !key_mappingH.nil? && key_mappingH.class == Hash && key_mappingH.keys.size > 0
-
-
-
+  unless options[:silence_missing_keys]
+    # if silence_missing_keys are not set, raise error if missing header
+    missing_keys = key_mappingH.keys - headerA
+    puts "WARNING: missing header(s): #{missing_keys.join(",")}" unless missing_keys.empty?
+  end

   headerA.map!{|x| key_mappingH.has_key?(x) ? (key_mappingH[x].nil? ? nil : key_mappingH[x]) : (options[:remove_unmapped_keys] ? nil : x)}
 end
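Two details in this hunk are worth noting: the added comment says "raise error", but the added code only prints a warning, and the README hunk spells the option `:silence_missing_key` while the defaults and this code use `:silence_missing_keys`. A minimal standalone sketch of the mapping logic as it appears in the diff. The method names `missing_mapping_keys` and `map_headers` are hypothetical, and `warn` (stderr) is swapped in for the `puts` used in the gem:

```ruby
# Hypothetical helper: keys named in the mapping that do not occur in the header.
def missing_mapping_keys(headers, key_mapping)
  key_mapping.keys - headers
end

# Hypothetical standalone version of the header mapping shown in the hunk.
# A key mapped to nil is dropped; unmapped keys are kept unless
# remove_unmapped_keys is true.
def map_headers(headers, key_mapping, silence_missing_keys: false, remove_unmapped_keys: false)
  unless silence_missing_keys
    missing = missing_mapping_keys(headers, key_mapping)
    # The gem prints with puts; warn is used here so the message goes to stderr.
    warn "WARNING: missing header(s): #{missing.join(',')}" unless missing.empty?
  end

  headers.map do |header|
    if key_mapping.key?(header)
      key_mapping[header]
    else
      remove_unmapped_keys ? nil : header
    end
  end
end

SAMPLE_HEADERS = [:export_date, :name, :id]
SAMPLE_MAPPING = { export_date: nil, name: :genre, price: :cost }
```

Here `:price` is in the mapping but not in the header, so it would trigger the warning unless `silence_missing_keys: true` is passed; the mapped result is the same either way.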
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: smarter_csv
 version: !ruby/object:Gem::Version
-  version: 1.7.
+  version: 1.7.3
 platform: ruby
 authors:
 - Tilo Sloboda
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-12-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: awesome_print