smarter_csv 1.7.1 → 1.7.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -2
- data/README.md +24 -22
- data/lib/smarter_csv/version.rb +1 -1
- data/lib/smarter_csv.rb +7 -2
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 34a12dae406ef192b3fbac9dd8a4236e18a7a936d4289cc296e49bf3b88fd386
+data.tar.gz: f317413b7467386b1337938b2288763d1a6da279c6823ad3f4653ff82ea90d39
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: ebbd40e8c6ea684200c8efedc12174da1a0a99ab9fae8bcb00f3bfdb8dcac479285644de09003b04b073b46f8ea64cbb29686628e9b7986d3baa07b041ee7dbd
+data.tar.gz: 9c9ba18bd64474811bbb3be2b350ab3b25a33dbd3e2cc802d697d04dcefeff2cc24150e87be6f6789eedc045a717ced2590efe6e6b6056a5c0b18095edbd0b38
data/CHANGELOG.md
CHANGED
@@ -1,9 +1,13 @@
 
 # SmarterCSV 1.x Change Log
 
+## 1.7.2 (2022-08-29)
+* new option :with_line_numbers; if set to true, it adds :csv_line_number to each data hash (issue #130)
+
 ## 1.7.1 (2022-07-31)
-* bugfix (thanks to Viacheslav Markin, Nicolas Rodriguez)
-
+* bugfix for issue #195 #197 #200 which only appeared when called from Rails (thanks to Viacheslav Markin, Nicolas Rodriguez)
+
+## 1.7.0 (2022-06-26) (replaced by 1.7.1)
 * added native code to accellerate line parsing by >10x over 1.6.0
 * added option `acceleration`, defaulting to `true`, to enable native code.
 Disable this option to use the ruby code for line parsing.
data/README.md
CHANGED
@@ -1,9 +1,12 @@
-
+
+# SmarterCSV
+
+[](https://codecov.io/gh/tilo/smarter_csv) [](http://badge.fury.io/rb/smarter_csv)
 
-####
+#### Work towards Future Version 2.0
 
-* Work towards SmarterCSV 2.0 is still
-Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/
+* Work towards SmarterCSV 2.0 is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
+Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.
 
 * New versions of SmarterCSV 1.x will soon print a deprecation warning if you set :verbose to true
 See below for list of deprecated options.
@@ -15,11 +18,7 @@
 
 ---------------
 
-
-
-[](http://travis-ci.com/tilo/smarter_csv) [](http://badge.fury.io/rb/smarter_csv)
-
-#### SmarterCSV 1.x
+#### SmarterCSV 1.x [Current Version]
 
 `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, parallel processing, or kicking-off batch jobs with Sidekiq.
 
@@ -132,7 +131,21 @@ and how the `process` method returns the number of chunks when called with a blo
 
 => returns number of chunks / rows we processed
 ```
-
+
+#### Example 4: Reading a CSV-like File, and Processing it with Sidekiq:
+```ruby
+filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
+options = {
+:col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
+:chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
+}
+n = SmarterCSV.process(filename, options) do |chunk|
+SidekiqWorkerClass.process_async(chunk ) # pass an array of hashes to Sidekiq workers for parallel processing
+end
+=> returns number of chunks
+```
+
+#### Example 5: Populate a MongoDB Database in Chunks of 100 records with SmarterCSV:
 ```ruby
 # using chunks:
 filename = '/tmp/some.csv'
@@ -146,18 +159,6 @@ and how the `process` method returns the number of chunks when called with a blo
 => returns number of chunks we processed
 ```
 
-#### Example 5: Reading a CSV-like File, and Processing it with Resque:
-```ruby
-filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
-options = {
-:col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
-:chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
-}
-n = SmarterCSV.process(filename, options) do |chunk|
-Resque.enque( ResqueWorkerClass, chunk ) # pass chunks of CSV-data to Resque workers for parallel processing
-end
-=> returns number of chunks
-```
 #### Example 6: Using Value Converters
 
 NOTE: If you use `key_mappings` and `value_converters`, make sure that the value converters has references the keys based on the final mapped name, not the original name in the CSV file.
@@ -239,6 +240,7 @@ The options and the block are optional.
 | | | You can not combine the :user_provided_headers and :key_mapping options |
 | :remove_empty_hashes | true | remove / ignore any hashes which don't have any key/value pairs or all empty values |
 | :verbose | false | print out line number while processing (to track down problems in input files) |
+| :with_line_numbers | false | add :csv_line_number to heach data hash |
 ---------------------------------------------------------------------------------------------------------------------------------
 
 #### Deprecated 1.x Options: to be replaced in 2.0
|
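The chunked examples touched by this README diff hand the block an array of up to `:chunk_size` hashes and return the number of chunks. A rough sketch of that contract in plain Ruby follows. It does not call the gem and is not its implementation; `process_in_chunks`, `RECORDS`, `SEEN_SIZES`, and `CHUNK_COUNT` are hypothetical names used only for illustration.

```ruby
# Hypothetical helper (not part of smarter_csv): yields arrays of at most
# chunk_size hashes to the block and returns how many chunks were yielded.
def process_in_chunks(rows, chunk_size:)
  chunk_count = 0
  rows.each_slice(chunk_size) do |chunk|
    yield chunk        # e.g. hand the chunk to a background worker
    chunk_count += 1
  end
  chunk_count
end

# Illustrative data: 250 already-parsed rows.
RECORDS = (1..250).map { |i| { id: i } }
SEEN_SIZES = []
CHUNK_COUNT = process_in_chunks(RECORDS, chunk_size: 100) do |chunk|
  SEEN_SIZES << chunk.size
end
```

With 250 rows and a chunk size of 100, the block runs three times and the last chunk is shorter than the others.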
data/lib/smarter_csv/version.rb
CHANGED
data/lib/smarter_csv.rb
CHANGED
@@ -2,7 +2,9 @@
 
 require_relative "extensions/hash"
 require_relative "smarter_csv/version"
+
 require_relative "smarter_csv/smarter_csv" unless ENV['CI'] # does not compile/link in CI?
+# require 'smarter_csv.bundle' unless ENV['CI'] # does not compile/link in CI?
 
 module SmarterCSV
 class SmarterCSVException < StandardError; end
@@ -129,6 +131,8 @@ module SmarterCSV
 
 next if options[:remove_empty_hashes] && hash.empty?
 
+hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
+
 if use_chunks
 chunk << hash # append temp result to chunk
 
@@ -230,6 +234,7 @@ module SmarterCSV
 user_provided_headers: nil,
 value_converters: nil,
 verbose: false,
+with_line_numbers: false,
 }
 end
 
@@ -247,12 +252,12 @@ module SmarterCSV
 # puts "SmarterCSV.parse OPTIONS: #{options[:acceleration]}" if options[:verbose]
 
 if options[:acceleration] && has_acceleration?
-#
+# :nocov:
 has_quotes = line =~ /#{options[:quote_char]}/
 elements = parse_csv_line_c(line, options[:col_sep], options[:quote_char], header_size)
 elements.map!{|x| cleanup_quotes(x, options[:quote_char])} if has_quotes
 return [elements, elements.size]
-
+# :nocov:
 else
 # puts "WARNING: SmarterCSV is using un-accelerated parsing of lines. Check options[:acceleration]"
 return parse_csv_line_ruby(line, options, header_size)
|
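The `:with_line_numbers` hunk above tags each hash with `:csv_line_number` after the empty-hash check, so skipped rows never receive a number. A minimal plain-Ruby sketch of that ordering follows. It is not the gem's code: `rows_with_line_numbers`, `SAMPLE`, and `ROWS` are hypothetical, the parsing assumes comma-separated values without quoting, and it assumes the header row counts as line 1 (the diff does not show how `@csv_line_count` is maintained).

```ruby
# Hypothetical helper (not part of smarter_csv): turns CSV text into an array
# of hashes and optionally records the source line of each row.
def rows_with_line_numbers(text, with_line_numbers: false)
  lines = text.lines.map(&:chomp)
  headers = lines.first.split(",").map { |h| h.strip.to_sym }
  result = []
  lines.each_with_index do |line, index|
    next if index.zero?                      # assumption: header is line 1
    values = line.split(",")
    hash = headers.zip(values).reject { |_, v| v.nil? || v.empty? }.to_h
    next if hash.empty?                      # mirrors :remove_empty_hashes (default true)
    # Same order as the hunk: add the line number only after the empty check.
    hash[:csv_line_number] = index + 1 if with_line_numbers
    result << hash
  end
  result
end

SAMPLE = "name,age\nAnn,31\n\nBob,45\n"
ROWS = rows_with_line_numbers(SAMPLE, with_line_numbers: true)
```

Because the blank third line is dropped before numbering, the two surviving rows keep the line numbers of their position in the file rather than being renumbered consecutively.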
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: smarter_csv
 version: !ruby/object:Gem::Version
-version: 1.7.
+version: 1.7.2
 platform: ruby
 authors:
 - Tilo Sloboda
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-08-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: awesome_print
@@ -140,7 +140,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - !ruby/object:Gem::Version
 version: '0'
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.3.3
 signing_key:
 specification_version: 4
 summary: Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots