smarter_csv 1.7.1 → 1.7.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -2
- data/README.md +24 -22
- data/lib/smarter_csv/version.rb +1 -1
- data/lib/smarter_csv.rb +7 -2
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 34a12dae406ef192b3fbac9dd8a4236e18a7a936d4289cc296e49bf3b88fd386
+  data.tar.gz: f317413b7467386b1337938b2288763d1a6da279c6823ad3f4653ff82ea90d39
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ebbd40e8c6ea684200c8efedc12174da1a0a99ab9fae8bcb00f3bfdb8dcac479285644de09003b04b073b46f8ea64cbb29686628e9b7986d3baa07b041ee7dbd
+  data.tar.gz: 9c9ba18bd64474811bbb3be2b350ab3b25a33dbd3e2cc802d697d04dcefeff2cc24150e87be6f6789eedc045a717ced2590efe6e6b6056a5c0b18095edbd0b38
data/CHANGELOG.md
CHANGED
@@ -1,9 +1,13 @@
 
 # SmarterCSV 1.x Change Log
 
+## 1.7.2 (2022-08-29)
+* new option :with_line_numbers; if set to true, it adds :csv_line_number to each data hash (issue #130)
+
 ## 1.7.1 (2022-07-31)
-* bugfix (thanks to Viacheslav Markin, Nicolas Rodriguez)
-
+* bugfix for issue #195 #197 #200 which only appeared when called from Rails (thanks to Viacheslav Markin, Nicolas Rodriguez)
+
+## 1.7.0 (2022-06-26) (replaced by 1.7.1)
 * added native code to accellerate line parsing by >10x over 1.6.0
 * added option `acceleration`, defaulting to `true`, to enable native code.
 Disable this option to use the ruby code for line parsing.
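The 1.7.2 changelog entry above introduces `:with_line_numbers`. What the option does to each row can be illustrated with a small stand-alone sketch. It does not call SmarterCSV: `tag_rows_with_line_numbers` is a hypothetical helper that mimics the described behaviour on an in-memory string, assuming comma separators, no quoting, and a line counter that treats the header as line 1 (so the first data row would be line 2).

```ruby
# Stand-alone illustration of what :with_line_numbers adds to each row hash.
# NOTE: hypothetical helper for illustration only, not part of the smarter_csv API.
def tag_rows_with_line_numbers(text, with_line_numbers: false, remove_empty_hashes: true)
  lines = text.lines.map(&:chomp)
  headers = lines.first.split(",").map { |h| h.strip.downcase.to_sym }
  csv_line_count = 1 # assumption: the header occupies line 1

  lines.drop(1).each_with_object([]) do |line, rows|
    csv_line_count += 1
    values = line.split(",", -1).map(&:strip)
    # keep only non-empty values, so a blank row becomes an empty hash
    hash = headers.zip(values).reject { |_, v| v.nil? || v.empty? }.to_h
    next if remove_empty_hashes && hash.empty?

    # tag the row only after the empty-hash check, so skipped rows are never tagged
    hash[:csv_line_number] = csv_line_count if with_line_numbers
    rows << hash
  end
end

sample = "name,genre\nAlice,rock\n,\nBob,jazz\n"
rows = tag_rows_with_line_numbers(sample, with_line_numbers: true)
rows.each { |row| puts row.inspect }
```

In this sketch the counter advances for every physical line, including rows that are later dropped as empty, so the reported numbers point back at positions in the input file rather than at positions in the result array.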
data/README.md
CHANGED
@@ -1,9 +1,12 @@
-
+
+# SmarterCSV
+
+[![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
 
-####
+#### Work towards Future Version 2.0
 
-* Work towards SmarterCSV 2.0 is still
-Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/
+* Work towards SmarterCSV 2.0 is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
+Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.
 
 * New versions of SmarterCSV 1.x will soon print a deprecation warning if you set :verbose to true
 See below for list of deprecated options.
@@ -15,11 +18,7 @@
 
 ---------------
 
-
-
-[![Build Status](https://secure.travis-ci.org/tilo/smarter_csv.svg?branch=master)](http://travis-ci.com/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
-
-#### SmarterCSV 1.x
+#### SmarterCSV 1.x [Current Version]
 
 `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, parallel processing, or kicking-off batch jobs with Sidekiq.
 
@@ -132,7 +131,21 @@ and how the `process` method returns the number of chunks when called with a blo
 
 => returns number of chunks / rows we processed
 ```
-
+
+#### Example 4: Reading a CSV-like File, and Processing it with Sidekiq:
+```ruby
+filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
+options = {
+  :col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
+  :chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
+}
+n = SmarterCSV.process(filename, options) do |chunk|
+  SidekiqWorkerClass.process_async(chunk ) # pass an array of hashes to Sidekiq workers for parallel processing
+end
+=> returns number of chunks
+```
+
+#### Example 5: Populate a MongoDB Database in Chunks of 100 records with SmarterCSV:
 ```ruby
 # using chunks:
 filename = '/tmp/some.csv'
@@ -146,18 +159,6 @@ and how the `process` method returns the number of chunks when called with a blo
 => returns number of chunks we processed
 ```
 
-#### Example 5: Reading a CSV-like File, and Processing it with Resque:
-```ruby
-filename = '/tmp/strange_db_dump' # a file with CRTL-A as col_separator, and with CTRL-B\n as record_separator (hello iTunes!)
-options = {
-  :col_sep => "\cA", :row_sep => "\cB\n", :comment_regexp => /^#/,
-  :chunk_size => 100 , :key_mapping => {:export_date => nil, :name => :genre}
-}
-n = SmarterCSV.process(filename, options) do |chunk|
-  Resque.enque( ResqueWorkerClass, chunk ) # pass chunks of CSV-data to Resque workers for parallel processing
-end
-=> returns number of chunks
-```
 #### Example 6: Using Value Converters
 
 NOTE: If you use `key_mappings` and `value_converters`, make sure that the value converters has references the keys based on the final mapped name, not the original name in the CSV file.
@@ -239,6 +240,7 @@ The options and the block are optional.
 | | | You can not combine the :user_provided_headers and :key_mapping options |
 | :remove_empty_hashes | true | remove / ignore any hashes which don't have any key/value pairs or all empty values |
 | :verbose | false | print out line number while processing (to track down problems in input files) |
+| :with_line_numbers | false | add :csv_line_number to heach data hash |
 ---------------------------------------------------------------------------------------------------------------------------------
 
 #### Deprecated 1.x Options: to be replaced in 2.0
data/lib/smarter_csv/version.rb
CHANGED
data/lib/smarter_csv.rb
CHANGED
@@ -2,7 +2,9 @@
 
 require_relative "extensions/hash"
 require_relative "smarter_csv/version"
+
 require_relative "smarter_csv/smarter_csv" unless ENV['CI'] # does not compile/link in CI?
+# require 'smarter_csv.bundle' unless ENV['CI'] # does not compile/link in CI?
 
 module SmarterCSV
   class SmarterCSVException < StandardError; end
@@ -129,6 +131,8 @@ module SmarterCSV
 
 next if options[:remove_empty_hashes] && hash.empty?
 
+hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
+
 if use_chunks
   chunk << hash # append temp result to chunk
 
@@ -230,6 +234,7 @@ module SmarterCSV
       user_provided_headers: nil,
       value_converters: nil,
       verbose: false,
+      with_line_numbers: false,
     }
   end
 
@@ -247,12 +252,12 @@ module SmarterCSV
     # puts "SmarterCSV.parse OPTIONS: #{options[:acceleration]}" if options[:verbose]
 
     if options[:acceleration] && has_acceleration?
-      #
+      # :nocov:
       has_quotes = line =~ /#{options[:quote_char]}/
       elements = parse_csv_line_c(line, options[:col_sep], options[:quote_char], header_size)
       elements.map!{|x| cleanup_quotes(x, options[:quote_char])} if has_quotes
       return [elements, elements.size]
-
+      # :nocov:
     else
       # puts "WARNING: SmarterCSV is using un-accelerated parsing of lines. Check options[:acceleration]"
       return parse_csv_line_ruby(line, options, header_size)
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: smarter_csv
 version: !ruby/object:Gem::Version
-  version: 1.7.
+  version: 1.7.2
 platform: ruby
 authors:
 - Tilo Sloboda
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-
+date: 2022-08-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: awesome_print
@@ -140,7 +140,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.3.3
 signing_key:
 specification_version: 4
 summary: Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots