smarter_csv 1.7.1 → 1.7.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: ab42c787dba96369ba0f499294d5421cc2fc9a514b1cff039e5017d19fd2ff4c
- data.tar.gz: f3605e56395d498169c449657945c00c677a75d17e2a7cbcca1a4f4e65aa45f2
+ metadata.gz: d4046758f38c21262fdec6bc7e13e3a7811c7aee3944d92e0cc36a2a1cfb032a
+ data.tar.gz: 9d111e2f36171ca488034f3af73fc71c7c9f6fde73986d277aeaf1560a066fa2
  SHA512:
- metadata.gz: a264574b464219fd53a862be345c008fc0a93c076345d62be4c75935a148f9a469ccbea14fb2f821cedaada1f2b4ec875468ef50b3ae173bcbb76a5079d43406
- data.tar.gz: 978236fa33bf4656eba89d929545c81ec2dc28b179890dd4254fb04cfa9b8646f4040aa2a62d44361dea221714f1cba0d7e4c858d338d30080c4a0e43c003552
+ metadata.gz: c46c5c45dd3fafe66735b2b17b0679c5aaff27b3670140d97bc19e1c825ad91310fa2cf55a12a5c7b0c31ef82fe9cc12a2c4bda0a78b218d80ad5816c01c0d9f
+ data.tar.gz: ba03acd95955f8afeb8e96f16c7cfa2e1605dbaf6fddb7008930294aab83196aed21f57605efb3553799381c1c4811528eee2db221efa50dc82f58bcf9135842
data/CHANGELOG.md CHANGED
@@ -1,9 +1,16 @@
 
  # SmarterCSV 1.x Change Log
 
+ ## 1.7.3 (2022-12-05)
+ * new option :silence_missing_keys; if set to true, it ignores missing keys in `key_mapping`
+
+ ## 1.7.2 (2022-08-29)
+ * new option :with_line_numbers; if set to true, it adds :csv_line_number to each data hash (issue #130)
+
  ## 1.7.1 (2022-07-31)
- * bugfix (thanks to Viacheslav Markin, Nicolas Rodriguez)
- ## 1.7.0 (2022-06-26)
+ * bugfix for issue #195 #197 #200 which only appeared when called from Rails (thanks to Viacheslav Markin, Nicolas Rodriguez)
+
+ ## 1.7.0 (2022-06-26) (replaced by 1.7.1)
  * added native code to accellerate line parsing by >10x over 1.6.0
  * added option `acceleration`, defaulting to `true`, to enable native code.
  Disable this option to use the ruby code for line parsing.
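Note on the 1.7.2 entry: `:with_line_numbers` adds a `:csv_line_number` key to each row hash. The standalone Ruby sketch below mimics that behaviour without loading the gem. `rows_with_line_numbers` is a hypothetical helper, the naive comma split is only a stand-in for the gem's real parser, and counting the header as line 1 is an assumption.

```ruby
# Standalone sketch; it does NOT load or call the smarter_csv gem.
# `rows_with_line_numbers` is a hypothetical helper used for illustration only.
def rows_with_line_numbers(text, with_line_numbers: false)
  lines = text.lines(chomp: true)
  return [] if lines.empty?

  # Naive header handling: split on commas and symbolize (no quoting support).
  headers = lines.first.split(",").map(&:to_sym)

  lines.drop(1).each_with_index.map do |line, index|
    hash = headers.zip(line.split(",")).to_h
    # Assumption: the header is line 1, so the first data row is line 2.
    hash[:csv_line_number] = index + 2 if with_line_numbers
    hash
  end
end

# Usage: each row hash carries the line it came from.
rows_with_line_numbers("name,age\nAnn,31\nBob,45\n", with_line_numbers: true).each do |row|
  puts row.inspect
end
```

With the gem itself, the equivalent would presumably be passing `with_line_numbers: true` in the options hash given to `SmarterCSV.process`, as the new default in `lib/smarter_csv.rb` suggests.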
data/CONTRIBUTORS.md CHANGED
@@ -48,3 +48,4 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
  * [Viacheslav Markin](https://github.com/KXEinc)
  * [Nicolas Rodriguez](https://github.com/n-rodriguez)
  * [Hirotaka Mizutani ](https://github.com/hirotaka)
+ * [Rahul Chaudhary](https://github.com/rahulch95)
data/README.md CHANGED
@@ -1,9 +1,12 @@
- [![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv)
+
+ # SmarterCSV
+
+ [![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
 
- #### Service Announcement
+ #### Work towards Future Version 2.0
 
- * Work towards SmarterCSV 2.0 is still on it's way, with much improved features, and more streamlined options.
- Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/blob/master/README.md), open any issues and pull requests with mention of v2.0.
+ * Work towards SmarterCSV 2.0 is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
+ Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.
 
  * New versions of SmarterCSV 1.x will soon print a deprecation warning if you set :verbose to true
  See below for list of deprecated options.
@@ -15,11 +18,7 @@
 
  ---------------
 
- # SmarterCSV
-
- [![Build Status](https://secure.travis-ci.org/tilo/smarter_csv.svg?branch=master)](http://travis-ci.com/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
-
- #### SmarterCSV 1.x
+ #### SmarterCSV 1.x [Current Version]
 
  `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, parallel processing, or kicking-off batch jobs with Sidekiq.
 
@@ -132,7 +131,21 @@ and how the `process` method returns the number of chunks when called with a blo
 
  => returns number of chunks / rows we processed
  ```
- #### Example 4: Populate a MongoDB Database in Chunks of 100 records with SmarterCSV:
+
+ #### Example 4: Reading a CSV-like File, and Processing it with Sidekiq:
+ ```ruby
+ filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
+ options = {
+ :col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
+ :chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
+ }
+ n = SmarterCSV.process(filename, options) do |chunk|
+ SidekiqWorkerClass.process_async(chunk ) # pass an array of hashes to Sidekiq workers for parallel processing
+ end
+ => returns number of chunks
+ ```
+
+ #### Example 5: Populate a MongoDB Database in Chunks of 100 records with SmarterCSV:
  ```ruby
  # using chunks:
  filename = '/tmp/some.csv'
@@ -146,18 +159,6 @@ and how the `process` method returns the number of chunks when called with a blo
  => returns number of chunks we processed
  ```
 
- #### Example 5: Reading a CSV-like File, and Processing it with Resque:
- ```ruby
- filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
- options = {
- :col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
- :chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
- }
- n = SmarterCSV.process(filename, options) do |chunk|
- Resque.enque( ResqueWorkerClass, chunk ) # pass chunks of CSV-data to Resque workers for parallel processing
- end
- => returns number of chunks
- ```
  #### Example 6: Using Value Converters
 
  NOTE: If you use `key_mappings` and `value_converters`, make sure that the value converters has references the keys based on the final mapped name, not the original name in the CSV file.
@@ -239,6 +240,7 @@ The options and the block are optional.
  | | | You can not combine the :user_provided_headers and :key_mapping options |
  | :remove_empty_hashes | true | remove / ignore any hashes which don't have any key/value pairs or all empty values |
  | :verbose | false | print out line number while processing (to track down problems in input files) |
+ | :with_line_numbers | false | add :csv_line_number to each data hash |
  ---------------------------------------------------------------------------------------------------------------------------------
 
  #### Deprecated 1.x Options: to be replaced in 2.0
@@ -251,7 +253,8 @@ And header and data validations will also be supported in 2.x
  | Option | Default | Explanation |
  ---------------------------------------------------------------------------------------------------------------------------------
  | :key_mapping | nil | a hash which maps headers from the CSV file to keys in the result hash |
- | :required_headers | nil | An array. Eacn of the given headers must be present after header manipulation, |
+ | :silence_missing_key | false | ignore missing keys in `key_mapping` if true |
+ | :required_headers | nil | An array. Each of the given headers must be present after header manipulation, |
  | | | or an exception is raised No validation if nil is given. |
  | :remove_unmapped_keys | false | when using :key_mapping option, should non-mapped keys / columns be removed? |
  | :downcase_header | true | downcase all column headers |
data/lib/smarter_csv/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module SmarterCSV
- VERSION = "1.7.1"
+ VERSION = "1.7.3"
  end
data/lib/smarter_csv.rb CHANGED
@@ -2,7 +2,9 @@
 
  require_relative "extensions/hash"
  require_relative "smarter_csv/version"
+
  require_relative "smarter_csv/smarter_csv" unless ENV['CI'] # does not compile/link in CI?
+ # require 'smarter_csv.bundle' unless ENV['CI'] # does not compile/link in CI?
 
  module SmarterCSV
  class SmarterCSVException < StandardError; end
@@ -129,6 +131,8 @@ module SmarterCSV
 
  next if options[:remove_empty_hashes] && hash.empty?
 
+ hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
+
  if use_chunks
  chunk << hash # append temp result to chunk
 
@@ -223,6 +227,7 @@ module SmarterCSV
  remove_zero_values: false,
  required_headers: nil,
  row_sep: $/,
+ silence_missing_keys: false,
  skip_lines: nil,
  strings_as_keys: false,
  strip_chars_from_headers: nil,
@@ -230,6 +235,7 @@ module SmarterCSV
  user_provided_headers: nil,
  value_converters: nil,
  verbose: false,
+ with_line_numbers: false,
  }
  end
 
@@ -247,12 +253,12 @@ module SmarterCSV
  # puts "SmarterCSV.parse OPTIONS: #{options[:acceleration]}" if options[:verbose]
 
  if options[:acceleration] && has_acceleration?
- # puts "NOTICE: Accelerated SmarterCSV / #{options[:acceleration]}" if options[:verbose]
+ # :nocov:
  has_quotes = line =~ /#{options[:quote_char]}/
  elements = parse_csv_line_c(line, options[:col_sep], options[:quote_char], header_size)
  elements.map!{|x| cleanup_quotes(x, options[:quote_char])} if has_quotes
  return [elements, elements.size]
-
+ # :nocov:
  else
  # puts "WARNING: SmarterCSV is using un-accelerated parsing of lines. Check options[:acceleration]"
  return parse_csv_line_ruby(line, options, header_size)
@@ -474,9 +480,11 @@ module SmarterCSV
  # do some key mapping on the keys in the file header
  # if you want to completely delete a key, then map it to nil or to ''
  if !key_mappingH.nil? && key_mappingH.class == Hash && key_mappingH.keys.size > 0
- # we can't map keys that are not there
- missing_keys = key_mappingH.keys - headerA
- puts "WARNING: missing header(s): #{missing_keys.join(",")}" unless missing_keys.empty?
+ unless options[:silence_missing_keys]
+ # if silence_missing_keys are not set, raise error if missing header
+ missing_keys = key_mappingH.keys - headerA
+ puts "WARNING: missing header(s): #{missing_keys.join(",")}" unless missing_keys.empty?
+ end
 
  headerA.map!{|x| key_mappingH.has_key?(x) ? (key_mappingH[x].nil? ? nil : key_mappingH[x]) : (options[:remove_unmapped_keys] ? nil : x)}
  end
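Note on the hunk above: as the diff shows, a missing header only triggers a `puts` warning (nothing is raised, despite the new in-code comment), and `silence_missing_keys` suppresses that warning. The README row added in this release spells the option `:silence_missing_key`, while the defaults hash in `lib/smarter_csv.rb` uses `silence_missing_keys`. The standalone Ruby sketch below restates the mapping logic in isolation; `map_headers` is a hypothetical helper, and it collects warnings in an array instead of printing them so the effect is easy to inspect.

```ruby
# Standalone sketch of the key-mapping step shown in the hunk above.
# `map_headers` is a hypothetical helper, not part of the SmarterCSV API.
def map_headers(headers, key_mapping, options = {})
  warnings = []

  unless options[:silence_missing_keys]
    # Keys named in the mapping that do not occur in the file header.
    missing_keys = key_mapping.keys - headers
    warnings << "WARNING: missing header(s): #{missing_keys.join(',')}" unless missing_keys.empty?
  end

  mapped = headers.map do |header|
    if key_mapping.key?(header)
      key_mapping[header] # a nil target drops the column
    else
      options[:remove_unmapped_keys] ? nil : header
    end
  end

  [mapped, warnings]
end

# Usage: :export_date is mapped but absent from the header row.
mapping = { name: :genre, export_date: nil }
puts map_headers([:name, :age], mapping).inspect
puts map_headers([:name, :age], mapping, silence_missing_keys: true).inspect
```

Returning the warnings rather than printing them is a choice made for this sketch only; the gem code in the diff writes directly to standard output.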
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: smarter_csv
  version: !ruby/object:Gem::Version
- version: 1.7.1
+ version: 1.7.3
  platform: ruby
  authors:
  - Tilo Sloboda
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2022-07-31 00:00:00.000000000 Z
+ date: 2022-12-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: awesome_print