wtf_csv 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/.travis.yml +15 -0
- data/Gemfile +6 -1
- data/Gemfile.lock +6 -3
- data/README.md +62 -6
- data/lib/wtf_csv.rb +1 -2
- data/lib/wtf_csv/version.rb +1 -1
- data/lib/wtf_csv/wtf_csv.rb +4 -9
- data/wtf_csv.gemspec +0 -3
- metadata +3 -17
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: a45b42eec0958f8aad1ee1bf538d3962e9fea3e1
+data.tar.gz: aaf58a2d7f69f7439940685068e5ac121ad51a59
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 87e0bc2762ccf9f5161f82bd3db7339fe851afc78e64ad07a8020c70068b42f1c8ac31bdec80d0083daaf3498131862ac679f7731a503b576109e95b42dd3f80
+data.tar.gz: c37b98ab992b82c716936d8ce14e98d2cae900f23888523bb073472e1bf1fae20c8e0acf5090ff91d1edff59c0be2850669d33c9d1833b839cabcefc87ead635
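The digests recorded above can be recomputed locally. What follows is a minimal sketch, assuming the `.gem` archive (a plain tar file) has already been unpacked so that `metadata.gz` and `data.tar.gz` are available on disk. The helper name `gem_checksums` is hypothetical; `Digest::SHA1` and `Digest::SHA512` are part of Ruby's standard library.

```ruby
require "digest/sha1"
require "digest/sha2"

# Hypothetical helper (not part of wtf_csv or RubyGems): compute the two
# digests that checksums.yaml records for a single file.
def gem_checksums(path)
  {
    "SHA1"   => Digest::SHA1.file(path).hexdigest,
    "SHA512" => Digest::SHA512.file(path).hexdigest
  }
end

# Usage, assuming the archive was unpacked into the current directory:
#   sums = gem_checksums("metadata.gz")
#   sums["SHA1"]   # compare with the SHA1 metadata.gz entry in checksums.yaml
```

A mismatch between the computed value and the published entry would suggest the archive differs from the one the registry serves.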
data/.travis.yml
ADDED
data/Gemfile
CHANGED
data/Gemfile.lock
CHANGED
@@ -1,13 +1,13 @@
 PATH
 remote: .
 specs:
-wtf_csv (
-smarter_csv
+wtf_csv (1.1.0)
 
 GEM
 remote: https://rubygems.org/
 specs:
 diff-lcs (1.2.5)
+rake (10.4.2)
 rspec (3.3.0)
 rspec-core (~> 3.3.0)
 rspec-expectations (~> 3.3.0)
@@ -21,11 +21,14 @@ GEM
 diff-lcs (>= 1.2.0, < 2.0)
 rspec-support (~> 3.3.0)
 rspec-support (3.3.0)
-smarter_csv (1.1.0)
 
 PLATFORMS
 ruby
 
 DEPENDENCIES
+rake
 rspec
 wtf_csv!
+
+BUNDLED WITH
+1.10.6
data/README.md
CHANGED
@@ -1,13 +1,46 @@
-#
-
+# WtfCSV
+
+[](http://travis-ci.org/gremerritt/wtf_csv)
+
+`wtf_csv` is a Ruby Gem to detect formatting issues in a CSV
+
+### Motivation
 
 The CSV file format is meant to be an easy way to transport data. Anyone who has had to maintain an import process, however, knows that it's easy to mess up. Usually the entire landscape looks like this:
 1. An importer expects CSV files to be provided in some specific format
 2. The files are given in a different format
 3. The import fails; or even worse, the import succeeds but the data is mangled
-4. Some poor
+4. Some poor soul must dig through the CSV file to figure out what happened. Usually issues are related to bad cell quoting, inconsistent column counts, etc.
 
-This gem seeks to make this process less terrible by providing a way to easily surface common formatting issues
+This gem seeks to make this process less terrible by providing a way to easily surface common formatting issues in a CSV file.
+
+## Documentation
+
+`WtfCSV.scan` will return a hash with four keys: `:quote_errors`, `:encoding_errors`, `:column_errors`, and `:length_errors`. Each key's value will be an array of the issues that were found including information about the issue, in the format described below.
+
+### :quote_errors
+`[<line number>, <column_number>, <text of the improperly quoted field>]`
+
+### :encoding_errors
+`[<line number>, <column number>]`
+
+### :column_errors
+This array will always be empty if the `:check_col_count` is set to `false`
+
+If `WtfCSV.scan` was able to determine how many columns should be in each row, either by using the `:col_threshold` option or because the `:num_cols` option was set, the format will be:
+
+`[<line number>, <number of columns in the line>, <number of columns that should be in the line>]`
+
+If `WtfCSV.scan` wasn't able to determine how many columns should be in each row (because an adequate number of columns weren't above the `:col_threshold` percentage) the format will be:
+
+`[<number of columns>, <number of rows that have this number of columns>]`
+
+### :length_errors
+This array will always be empty unless the `:max_chars_in_field` option is being used
+
+`[<line number>, <column number>, <field length>]`
+
+## Configuration
 
 `WtfCSV.scan` has the following options:
 ```
@@ -16,7 +49,6 @@ This gem seeks to make this process less terrible by providing a way to easily s
 |---------------------------------|----------|--------------------------------------------------------------------------------------|
 | :col_sep | ',' | Column separator |
 | :row_sep | $/ ,"\n" | Row separator - defaults to system's $/ , which defaults to "\n" |
-| | | This can also be set to :auto, but will process the whole cvs file first (slow!) |
 | :quote_char | '"' | Quotation character |
 | :escape_char | '\' | Character to escape quotes |
 |---------------------------------|----------|--------------------------------------------------------------------------------------|
@@ -41,4 +73,28 @@ This gem seeks to make this process less terrible by providing a way to easily s
 |---------------------------------|----------|--------------------------------------------------------------------------------------|
 ```
 
-
+## Installation
+
+Add this line to your application's Gemfile:
+
+gem 'wtf_csv'
+
+And then execute:
+
+$ bundle
+
+Or install it yourself as:
+
+$ gem install wtf_csv
+
+## Bugs and Feature Requests
+
+Please [open an Issue on GitHub](https://github.com/gremerritt/wtf_csv/issues) with any bugs or feature requests. Thanks!
+
+## Contributing
+
+1. Fork it
+2. Create your feature branch (`git checkout -b new-feature`)
+3. Commit your changes (`git commit -am 'adds a new feature'`)
+4. Push to the branch (`git push origin new-feature`)
+5. Create new Pull Request
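The new README documents the shape of the hash returned by `WtfCSV.scan`. The sketch below shows one way a caller might turn such a hash into readable messages. It does not call the gem; the helper name `describe_scan_result` and the sample data are invented for illustration, and telling the two `:column_errors` formats apart by entry length is an assumption based only on the formats listed in the README.

```ruby
# Hypothetical helper: format a hash shaped like the documented
# WtfCSV.scan result. It only reads the four documented keys.
def describe_scan_result(result)
  messages = []
  result.fetch(:quote_errors, []).each do |line, col, text|
    messages << "line #{line}, column #{col}: bad quoting in #{text.inspect}"
  end
  result.fetch(:encoding_errors, []).each do |line, col|
    messages << "line #{line}, column #{col}: encoding error"
  end
  result.fetch(:length_errors, []).each do |line, col, length|
    messages << "line #{line}, column #{col}: field is #{length} characters long"
  end
  result.fetch(:column_errors, []).each do |entry|
    if entry.length == 3
      # [line number, columns found, columns expected]
      line, found, expected = entry
      messages << "line #{line}: #{found} columns, expected #{expected}"
    else
      # [number of columns, number of rows with that many columns]
      cols, rows = entry
      messages << "#{rows} rows have #{cols} columns"
    end
  end
  messages
end

# Sample data invented for illustration; with the gem installed this hash
# would come from WtfCSV.scan instead.
sample = {
  :quote_errors    => [[3, 2, 'say "hi"']],
  :encoding_errors => [],
  :column_errors   => [[5, 4, 6]],
  :length_errors   => []
}
puts describe_scan_result(sample)
```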
data/lib/wtf_csv.rb
CHANGED
data/lib/wtf_csv/version.rb
CHANGED
data/lib/wtf_csv/wtf_csv.rb
CHANGED
@@ -18,11 +18,6 @@ module WtfCSV
 f = File.open(file, "r:#{options[:file_encoding]}")
 trgt_line_count = `wc -l "#{file}"`.strip.split(' ')[0].to_i if block_given?
 
-if options[:row_sep] == :auto
-options[:row_sep] = SmarterCSV.guess_line_ending(f, options)
-f.rewind
-end
-
 # credit to tilo, author of smarter_csv, on how to loop over lines without reading whole file into memory
 old_row_sep = $/
 $/ = options[:row_sep]
@@ -197,10 +192,10 @@ module WtfCSV
 end
 end
 
-return {quote_errors
-encoding_errors
-column_errors
-length_errors
+return {:quote_errors => quote_errors,
+:encoding_errors => encoding_errors,
+:column_errors => column_errors,
+:length_errors => length_errors}
 
 end
 end
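With the `:auto` branch removed, callers who relied on automatic row-separator detection would need to determine the separator themselves before scanning. The sketch below is a hypothetical stand-in written in plain Ruby. It is not SmarterCSV's algorithm and not part of wtf_csv; the name `guess_row_sep` and the 64 KiB sample size are assumptions made for illustration.

```ruby
# Hypothetical stand-in for the removed :auto behaviour: count candidate
# line endings in the first chunk of the file and return the most frequent.
# A "\r\n" pair split across the sample boundary is counted as a bare "\r".
def guess_row_sep(path, sample_bytes = 64 * 1024)
  sample = File.open(path, "rb") { |f| f.read(sample_bytes) } || ""
  crlf = sample.scan("\r\n").length
  lf   = sample.count("\n") - crlf   # bare "\n" only
  cr   = sample.count("\r") - crlf   # bare "\r" only
  counts = { "\r\n" => crlf, "\n" => lf, "\r" => cr }
  best, hits = counts.max_by { |_sep, n| n }
  hits.zero? ? $/ : best             # no line ending seen: use the system default
end
```

The returned string could then be supplied as the `:row_sep` option described in the README. Reading only a leading sample keeps the cost bounded, unlike the removed option, which the old README noted would process the whole file first.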
data/wtf_csv.gemspec
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: wtf_csv
 version: !ruby/object:Gem::Version
-version: 1.
+version: 1.1.0
 platform: ruby
 authors:
 - Greg Merritt
@@ -10,20 +10,6 @@ bindir: bin
 cert_chain: []
 date: 2015-09-11 00:00:00.000000000 Z
 dependencies:
-- !ruby/object:Gem::Dependency
-name: smarter_csv
-requirement: !ruby/object:Gem::Requirement
-requirements:
-- - ">="
-- !ruby/object:Gem::Version
-version: '0'
-type: :runtime
-prerelease: false
-version_requirements: !ruby/object:Gem::Requirement
-requirements:
-- - ">="
-- !ruby/object:Gem::Version
-version: '0'
 - !ruby/object:Gem::Dependency
 name: rspec
 requirement: !ruby/object:Gem::Requirement
@@ -48,6 +34,7 @@ extra_rdoc_files: []
 files:
 - ".gitignore"
 - ".rspec"
+- ".travis.yml"
 - Gemfile
 - Gemfile.lock
 - README.md
@@ -90,8 +77,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - - ">="
 - !ruby/object:Gem::Version
 version: '0'
-requirements:
-- smarter_csv
+requirements: []
 rubyforge_project:
 rubygems_version: 2.0.3
 signing_key: