csvlint 0.1.2 → 0.1.3

checksums.yaml CHANGED
@@ -1,15 +1,15 @@
 ---
 !binary "U0hBMQ==":
   metadata.gz: !binary |-
-    NjUyMGZlNGIyNjM4ZmU1OGVjMDc3Y2U2YmQ0NWEzMzM3ZGFkYzA0Yw==
+    YmFkYTMwMzU2Njg0ZDk5NmJkOTU2ZjczNWVjZTRjNTVmZWEyOTY2NA==
   data.tar.gz: !binary |-
-    NzYzMjUxY2NlZmE4ODY3NjYwNTU4NzUyMmE2NTU0NjM1MmQ0YzI4NQ==
+    ODIzOTA1MmQ0MzI3ZTZlOTc2NTU5NmFkNzgxODkxODU2MjFhY2VmYw==
 SHA512:
   metadata.gz: !binary |-
-    ODQwMGQzMjY4NGQzNGY5ZGY1Nzg3ODc5MDI3M2E3NmNmNzUwNzU1MzBjNjNk
-    YzRiMjE3NTJlNTc5YWI2NjhkNGJjNDcxNzVkMDgyZmRkYjZlYWFmZjUyOWQx
-    ZmVkOWRjZDJhMGU4M2QyYmI0YzZmMDBlYWRhMjFmNGFlZmJmZTg=
+    Y2IwYTYzYjg1N2JmMmFmN2Q1NjIwYTQxMTU0Y2Y5YTM3YzFkYjkyZWI1MWEy
+    ODJiZTZjNDhlMDc0MjliZGRkMDQyZDRkMDdlNzRjMmFhN2IyYTJkZDU4MWZl
+    NzdmYWQwZDI5MTI1Nzk5ZDQxZjRiZGVlMTI3NzMwNmI2NGZiZTM=
   data.tar.gz: !binary |-
-    ZjhkODBhMWU1ODFjN2JiNjQ3ZjBlZmFkNzAwMmVhOTNiYzgwNDNjM2RjODg5
-    YWE2NGUwNjk1M2JlYWFjZmVlY2U0YjNmMTZmM2U4YTFlMDcyMGViNWU3NWY2
-    NTY5NmIyNmE5ZjYwNDkwOTE1NjQ2MWRiZGU0ZGRhM2MxMTEyYjU=
+    NDBlNGVhMTVkM2IzNDIyNWEzMjU5NDI4YjZjNjlmODg5YWNiOTMzNDk4ZGQ5
+    ODVmMTM2MWJjNTIwNzdhNmNkNTNhZGUyZTUyZTgyYzM4Njc4YWUwYzA1YWJl
+    MmYwNmFkMTVlMGIyNDE5ODVkN2E3NWE5ZWFlYzQ1NGVjNzY1Y2Q=
@@ -0,0 +1,137 @@
1
+ # Change Log
2
+
3
+ ## [Unreleased](https://github.com/theodi/csvlint.rb/tree/HEAD)
4
+
5
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.2...HEAD)
6
+
7
+ **Merged pull requests:**
8
+
9
+ - Error reporting schema expanded test suite [\#138](https://github.com/theodi/csvlint.rb/pull/138) ([quadrophobiac](https://github.com/quadrophobiac))
10
+
11
+ - Validate header size improvement [\#137](https://github.com/theodi/csvlint.rb/pull/137) ([adamc00](https://github.com/adamc00))
12
+
13
+ - Invalid schema [\#132](https://github.com/theodi/csvlint.rb/pull/132) ([bcouston](https://github.com/bcouston))
14
+
15
+ ## [0.1.2](https://github.com/theodi/csvlint.rb/tree/0.1.2) (2015-07-15)
16
+
17
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.1...0.1.2)
18
+
19
+ **Closed issues:**
20
+
21
+ - When an encoding error is thrown the line content is put into the column field in the error object [\#131](https://github.com/theodi/csvlint.rb/issues/131)
22
+
23
+ **Merged pull requests:**
24
+
25
+ - Catch invalid URIs [\#133](https://github.com/theodi/csvlint.rb/pull/133) ([pezholio](https://github.com/pezholio))
26
+
27
+ - Emit a warning when the CSV header does not match the supplied schema [\#127](https://github.com/theodi/csvlint.rb/pull/127) ([adamc00](https://github.com/adamc00))
28
+
29
+ ## [0.1.1](https://github.com/theodi/csvlint.rb/tree/0.1.1) (2015-07-13)
30
+
31
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.0...0.1.1)
32
+
33
+ **Closed issues:**
34
+
35
+ - Add Command Line Support [\#128](https://github.com/theodi/csvlint.rb/issues/128)
36
+
37
+ - BUG: Incorrect inconsistent\_values error on numeric columns [\#106](https://github.com/theodi/csvlint.rb/issues/106)
38
+
39
+ **Merged pull requests:**
40
+
41
+ - Fixes line content incorrectly being put into the row column field when there is an encoding error. [\#130](https://github.com/theodi/csvlint.rb/pull/130) ([glacier](https://github.com/glacier))
42
+
43
+ - Add command line help [\#129](https://github.com/theodi/csvlint.rb/pull/129) ([pezholio](https://github.com/pezholio))
44
+
45
+ - Remove stray q character. [\#125](https://github.com/theodi/csvlint.rb/pull/125) ([adamc00](https://github.com/adamc00))
46
+
47
+ - csvlint utility can take arguments to specify a schema and pp errors [\#124](https://github.com/theodi/csvlint.rb/pull/124) ([adamc00](https://github.com/adamc00))
48
+
49
+ - Fixed warning - use expect\( \) rather than .should [\#123](https://github.com/theodi/csvlint.rb/pull/123) ([jezhiggins](https://github.com/jezhiggins))
50
+
51
+ - Fixed spelling mistake [\#121](https://github.com/theodi/csvlint.rb/pull/121) ([jezhiggins](https://github.com/jezhiggins))
52
+
53
+ - Avoid using \#blank? if unnecessary [\#120](https://github.com/theodi/csvlint.rb/pull/120) ([jpmckinney](https://github.com/jpmckinney))
54
+
55
+ - eliminate some date and time formats, related \#105 [\#119](https://github.com/theodi/csvlint.rb/pull/119) ([jpmckinney](https://github.com/jpmckinney))
56
+
57
+ - Match another CSV error about line endings [\#118](https://github.com/theodi/csvlint.rb/pull/118) ([jpmckinney](https://github.com/jpmckinney))
58
+
59
+ - fixed typo mistake in README [\#117](https://github.com/theodi/csvlint.rb/pull/117) ([railsfactory-kumaresan](https://github.com/railsfactory-kumaresan))
60
+
61
+ - Integrate @jpmickinney's build\_formats improvements [\#112](https://github.com/theodi/csvlint.rb/pull/112) ([Floppy](https://github.com/Floppy))
62
+
63
+ - make limit\_lines into a non-dialect option [\#110](https://github.com/theodi/csvlint.rb/pull/110) ([Floppy](https://github.com/Floppy))
64
+
65
+ - fix coveralls stats [\#109](https://github.com/theodi/csvlint.rb/pull/109) ([Floppy](https://github.com/Floppy))
66
+
67
+ - Speed up \#build\_formats \(changes its API\) [\#103](https://github.com/theodi/csvlint.rb/pull/103) ([jpmckinney](https://github.com/jpmckinney))
68
+
69
+ - Limit lines [\#101](https://github.com/theodi/csvlint.rb/pull/101) ([Hoedic](https://github.com/Hoedic))
70
+
71
+ ## [0.1.0](https://github.com/theodi/csvlint.rb/tree/0.1.0) (2014-11-27)
72
+
73
+ **Implemented enhancements:**
74
+
75
+ - Blank values shouldn't count as inconsistencies [\#90](https://github.com/theodi/csvlint.rb/issues/90)
76
+
77
+ - Make sure we don't check schema column count and ragged row count together [\#66](https://github.com/theodi/csvlint.rb/issues/66)
78
+
79
+ - Include the failed constraints in error message when doing field validation [\#64](https://github.com/theodi/csvlint.rb/issues/64)
80
+
81
+ - Include the column value in error message when field validation fails [\#63](https://github.com/theodi/csvlint.rb/issues/63)
82
+
83
+ - Expose optional JSON table schema fields [\#55](https://github.com/theodi/csvlint.rb/issues/55)
84
+
85
+ - Ensure header rows are properly handled and validated [\#48](https://github.com/theodi/csvlint.rb/issues/48)
86
+
87
+ - Support zipped CSV? [\#30](https://github.com/theodi/csvlint.rb/issues/30)
88
+
89
+ - Improve feedback on inconsistent values [\#29](https://github.com/theodi/csvlint.rb/issues/29)
90
+
91
+ - Reported error positions are not massively useful [\#15](https://github.com/theodi/csvlint.rb/issues/15)
92
+
93
+ **Fixed bugs:**
94
+
95
+ - undefined method `\[\]' for nil:NilClass from fetch\_error [\#71](https://github.com/theodi/csvlint.rb/issues/71)
96
+
97
+ - Inconsistent column bases [\#69](https://github.com/theodi/csvlint.rb/issues/69)
98
+
99
+ - Improve error handling in Schema loading [\#42](https://github.com/theodi/csvlint.rb/issues/42)
100
+
101
+ - Recover from some line ending problems [\#41](https://github.com/theodi/csvlint.rb/issues/41)
102
+
103
+ - Inconsistent values due to number format differences [\#32](https://github.com/theodi/csvlint.rb/issues/32)
104
+
105
+ - New lines in quoted fields are valid [\#31](https://github.com/theodi/csvlint.rb/issues/31)
106
+
107
+ - Wrongly reporting incorrect file extension [\#23](https://github.com/theodi/csvlint.rb/issues/23)
108
+
109
+ - Incorrect extension reported when URL has query options at the end [\#14](https://github.com/theodi/csvlint.rb/issues/14)
110
+
111
+ **Closed issues:**
112
+
113
+ - Get gem continuously deploying [\#93](https://github.com/theodi/csvlint.rb/issues/93)
114
+
115
+ - Publish on rubygems.org [\#92](https://github.com/theodi/csvlint.rb/issues/92)
116
+
117
+ - Duplicate column names [\#87](https://github.com/theodi/csvlint.rb/issues/87)
118
+
119
+ - Return code is always 0 \(except when it isn't\) [\#85](https://github.com/theodi/csvlint.rb/issues/85)
120
+
121
+ - Can't pipe data to csvlint [\#84](https://github.com/theodi/csvlint.rb/issues/84)
122
+
123
+ - They have some validator running if someone wants to inspect it for "inspiration" [\#27](https://github.com/theodi/csvlint.rb/issues/27)
124
+
125
+ - Allow CSV parsing options to be configured as a parameter [\#6](https://github.com/theodi/csvlint.rb/issues/6)
126
+
127
+ - Use explicit CSV parsing options [\#5](https://github.com/theodi/csvlint.rb/issues/5)
128
+
129
+ - Improving encoding detection [\#2](https://github.com/theodi/csvlint.rb/issues/2)
130
+
131
+ **Merged pull requests:**
132
+
133
+ - Continuously deploy gem [\#102](https://github.com/theodi/csvlint.rb/pull/102) ([pezholio](https://github.com/pezholio))
134
+
135
+
136
+
137
+ \* *This Change Log was automatically generated by [github_changelog_generator](https://github.com/skywinder/Github-Changelog-Generator)*
@@ -36,4 +36,5 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "rspec-expectations"
   spec.add_development_dependency "coveralls"
   spec.add_development_dependency "pry"
+  spec.add_development_dependency "github_changelog_generator"
 end
@@ -0,0 +1,153 @@
1
+ Feature: Collect all the tests that should trigger dialect check related errors
2
+
3
+ Scenario: Title rows, I wish to trigger a :title_row type message
4
+ Given I have a CSV file called "title-row.csv"
5
+ And it is stored at the url "http://example.com/example1.csv"
6
+ And I ask if there are warnings
7
+ Then there should be 1 warnings
8
+ And that warning should have the type "title_row"
9
+
10
+ # :nonrfc_line_breaks
11
+
12
+ Scenario: LF line endings in file give an info message of type :nonrfc_line_breaks
13
+ Given I have a CSV file called "lf-line-endings.csv"
14
+ And it is stored at the url "http://example.com/example1.csv"
15
+ And I set header to "true"
16
+ And I ask if there are info messages
17
+ Then there should be 2 info messages
18
+ And one of the messages should have the type "nonrfc_line_breaks"
19
+
20
+ Scenario: CR line endings in file give an info message of type :nonrfc_line_breaks
21
+ Given I have a CSV file called "cr-line-endings.csv"
22
+ And it is stored at the url "http://example.com/example1.csv"
23
+ And I set header to "true"
24
+ And I ask if there are info messages
25
+ Then there should be 2 info messages
26
+ And one of the messages should have the type "nonrfc_line_breaks"
27
+
28
+ Scenario: CRLF line endings in file produces no info messages of type :nonrfc_line_breaks
29
+ Given I have a CSV file called "crlf-line-endings.csv"
30
+ And it is stored at the url "http://example.com/example1.csv"
31
+ And I set header to "true"
32
+ And I ask if there are info messages
33
+ Then there should be 1 info message
34
+
35
+ # :line_breaks
36
+
37
+ Scenario: Incorrect line endings specified in settings
38
+ Given I have a CSV file called "cr-line-endings.csv"
39
+ And I set the line endings to linefeed
40
+ And it is stored at the url "http://example.com/example1.csv"
41
+ And I ask if there are errors
42
+ Then there should be 1 error
43
+ And that error should have the type "line_breaks"
44
+
45
+ Scenario: inconsistent line endings in file cause an error
46
+ Given I have a CSV file called "inconsistent-line-endings.csv"
47
+ And it is stored at the url "http://example.com/example1.csv"
48
+ And I ask if there are errors
49
+ Then there should be 1 error
50
+ And that error should have the type "line_breaks"
51
+
52
+
53
+ Scenario: inconsistent line endings with unquoted fields in file cause an error
54
+ Given I have a CSV file called "inconsistent-line-endings-unquoted.csv"
55
+ And it is stored at the url "http://example.com/example1.csv"
56
+ And I ask if there are errors
57
+ Then there should be 1 error
58
+ And that error should have the type "line_breaks"
59
+
60
+ #:unclosed_quote
61
+
62
+ Scenario: CSV with incorrect quoting
63
+ Given I have a CSV with the following content:
64
+ """
65
+ "col1","col2","col3"
66
+ "Foo","Bar","Baz
67
+ """
68
+ And it is stored at the url "http://example.com/example1.csv"
69
+ When I ask if there are errors
70
+ Then there should be 1 error
71
+ And that error should have the type "unclosed_quote"
72
+ And that error should have the row "2"
73
+ And that error should have the content ""Foo","Bar","Baz"
74
+
75
+ # :invalid_encoding
76
+
77
+ Scenario: Report invalid Encoding
78
+ Given I have a CSV file called "invalid-byte-sequence.csv"
79
+ And I set an encoding header of "UTF-8"
80
+ And it is stored at the url "http://example.com/example1.csv"
81
+ When I ask if there are errors
82
+ Then there should be 1 error
83
+ And that error should have the type "invalid_encoding"
84
+
85
+ Scenario: Report invalid file
86
+ #should this throw an excel error?
87
+ Given I have a CSV file called "spreadsheet.xls"
88
+ And it is stored at the url "http://example.com/example1.csv"
89
+ When I ask if there are errors
90
+ Then there should be 1 error
91
+ And that error should have the type "invalid_encoding"
92
+
93
+ # :blank_rows
94
+
95
+ Scenario: Successfully report a CSV with blank rows
96
+ Given I have a CSV with the following content:
97
+ """
98
+ "col1","col2","col3"
99
+ "Foo","Bar","Baz"
100
+ "","",
101
+ "Baz","Bar","Foo"
102
+ """
103
+ And it is stored at the url "http://example.com/example1.csv"
104
+ When I ask if there are errors
105
+ Then there should be 1 error
106
+ And that error should have the type "blank_rows"
107
+ And that error should have the row "3"
108
+ And that error should have the content ""","","
109
+
110
+ Scenario: Successfully report a CSV with multiple trailing empty rows
111
+ Given I have a CSV with the following content:
112
+ """
113
+ "col1","col2","col3"
114
+ "Foo","Bar","Baz"
115
+ "Foo","Bar","Baz"
116
+
117
+
118
+ """
119
+ And it is stored at the url "http://example.com/example1.csv"
120
+ When I ask if there are errors
121
+ Then there should be 1 error
122
+ And that error should have the type "blank_rows"
123
+ And that error should have the row "4"
124
+
125
+ Scenario: Successfully report a CSV with an empty row
126
+ Given I have a CSV with the following content:
127
+ """
128
+ "col1","col2","col3"
129
+ "Foo","Bar","Baz"
130
+
131
+ "Foo","Bar","Baz"
132
+ """
133
+ And it is stored at the url "http://example.com/example1.csv"
134
+ When I ask if there are errors
135
+ Then there should be 1 error
136
+ And that error should have the type "blank_rows"
137
+ And that error should have the row "3"
138
+
139
+ #:check_options
140
+
141
+ Scenario: Warn if options seem to return invalid data
142
+ Given I have a CSV with the following content:
143
+ """
144
+ 'Foo';'Bar';'Baz'
145
+ '1';'2';'3'
146
+ '3';'2';'1'
147
+ """
148
+ And I set the delimiter to ","
149
+ And I set quotechar to """
150
+ And it is stored at the url "http://example.com/example1.csv"
151
+ And I ask if there are warnings
152
+ Then there should be 1 warnings
153
+ And that warning should have the type "check_options"
@@ -60,4 +60,46 @@ Feature: Schema Validation
60
60
  """
61
61
  When I ask if there are warnings
62
62
  Then there should be 1 warnings
63
-
63
+
64
+ Scenario: Schema with valid regex
65
+ Given I have a CSV with the following content:
66
+ """
67
+ "firstname","id","email"
68
+ "Bob","1234","bob@example.org"
69
+ "Alice","5","alice@example.com"
70
+ """
71
+ And it is stored at the url "http://example.com/example1.csv"
72
+ And I have a schema with the following content:
73
+ """
74
+ {
75
+ "fields": [
76
+ { "name": "Name", "constraints": { "required": true, "pattern": "^[A-Za-z0-9_]*$" } },
77
+ { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
78
+ { "name": "Email", "constraints": { "required": true } }
79
+ ]
80
+ }
81
+ """
82
+ When I ask if there are errors
83
+ Then there should be 0 error
84
+
85
+ Scenario: Schema with invalid regex
86
+ Given I have a CSV with the following content:
87
+ """
88
+ "firstname","id","email"
89
+ "Bob","1234","bob@example.org"
90
+ "Alice","5","alice@example.com"
91
+ """
92
+ And it is stored at the url "http://example.com/example1.csv"
93
+ And I have a schema with the following content:
94
+ """
95
+ {
96
+ "fields": [
97
+ { "name": "Name", "constraints": { "required": true, "pattern": "((" } },
98
+ { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
99
+ { "name": "Email", "constraints": { "required": true } }
100
+ ]
101
+ }
102
+ """
103
+ When I ask if there are errors
104
+ Then there should be 1 error
105
+ And that error should have the type "invalid_regex"
@@ -1,13 +1,15 @@
 module Csvlint
   module ErrorCollector
     attr_reader :errors, :warnings, :info_messages
-
+    # Creates a validation error
     def build_errors(type, category = nil, row = nil, column = nil, content = nil, constraints = {})
       @errors << Csvlint::ErrorMessage.new(type, category, row, column, content, constraints)
     end
+    # Creates a validation warning
     def build_warnings(type, category = nil, row = nil, column = nil, content = nil, constraints = {})
       @warnings << Csvlint::ErrorMessage.new(type, category, row, column, content, constraints)
     end
+    # Creates a validation information message
     def build_info_messages(type, category = nil, row = nil, column = nil, content = nil, constraints = {})
       @info_messages << Csvlint::ErrorMessage.new(type, category, row, column, content, constraints)
     end
@@ -9,6 +9,7 @@ module Csvlint
       @column = column
       @content = content
       @constraints = constraints
+
     end
   end
 end
@@ -1,10 +1,10 @@
 module Csvlint
-
+
   class Field
     include Csvlint::ErrorCollector
 
     attr_reader :name, :constraints, :title, :description
-
+
     def initialize(name, constraints={}, title=nil, description=nil)
       @name = name
       @constraints = constraints || {}
@@ -13,9 +13,12 @@ module Csvlint
       @description = description
       reset
     end
-
-    def validate_column(value, row=nil, column=nil)
+
+    def validate_column(value, row=nil, column=nil, all_errors=[])
       reset
+      unless all_errors.any?{|error| ((error.type == :invalid_regex) && (error.column == column))}
+        validate_regex(value, row, column)
+      end
       validate_length(value, row, column)
       validate_values(value, row, column)
       parsed = validate_type(value, row, column)
@@ -26,11 +29,11 @@ module Csvlint
     private
     def validate_length(value, row, column)
       if constraints["required"] == true
-        build_errors(:missing_value, :schema, row, column, value,
+        build_errors(:missing_value, :schema, row, column, value,
           { "required" => true }) if value.nil? || value.length == 0
       end
       if constraints["minLength"]
-        build_errors(:min_length, :schema, row, column, value,
+        build_errors(:min_length, :schema, row, column, value,
           { "minLength" => constraints["minLength"] }) if value.nil? || value.length < constraints["minLength"]
       end
       if constraints["maxLength"]
@@ -38,12 +41,26 @@ module Csvlint
           { "maxLength" => constraints["maxLength"] } ) if !value.nil? && value.length > constraints["maxLength"]
       end
     end
-
-    def validate_values(value, row, column)
-      if constraints["pattern"]
-        build_errors(:pattern, :schema, row, column, value,
-          { "pattern" => constraints["pattern"] } ) if !value.nil? && !value.match( constraints["pattern"] )
+
+    def validate_regex(value, row, column)
+      pattern = constraints["pattern"]
+      if pattern
+        begin
+          Regexp.new(pattern)
+          build_errors(:pattern, :schema, row, column, value,
+            { "pattern" => constraints["pattern"] } ) if !value.nil? && !value.match( constraints["pattern"] )
+        rescue RegexpError
+          build_errors(:invalid_regex, :schema, nil, column, ("#{name}: Constraints: Pattern: #{pattern}"),
+            { "pattern" => constraints["pattern"] })
+        end
       end
+    end
+
+    def validate_values(value, row, column)
+      # If a pattern exists, raise an invalid regex error if it is not in
+      # valid regex form, else, if the value of the relevant field in the csv
+      # does not match the given regex pattern in the schema, raise a
+      # pattern error.
       if constraints["unique"] == true
         if @uniques.include? value
           build_errors(:unique, :schema, row, column, value, { "unique" => true })
@@ -52,7 +69,7 @@ module Csvlint
         end
       end
     end
-
+
     def validate_type(value, row, column)
       if constraints["type"] && value != ""
         parsed = convert_to_type(value)
@@ -66,21 +83,21 @@ module Csvlint
       end
       return nil
     end
-
+
     def validate_range(value, row, column)
       #TODO: we're ignoring issues with converting ranges to actual types, maybe we
       #should generate a warning? The schema is invalid
       if constraints["minimum"]
         minimumValue = convert_to_type( constraints["minimum"] )
         if minimumValue
-          build_errors(:below_minimum, :schema, row, column, value,
+          build_errors(:below_minimum, :schema, row, column, value,
             { "minimum" => constraints["minimum"] }) unless value >= minimumValue
         end
       end
       if constraints["maximum"]
         maximumValue = convert_to_type( constraints["maximum"] )
         if maximumValue
-          build_errors(:above_maximum, :schema, row, column, value,
+          build_errors(:above_maximum, :schema, row, column, value,
             { "maximum" => constraints["maximum"] }) unless value <= maximumValue
         end
       end
@@ -96,7 +113,7 @@ module Csvlint
         end
       end
       return parsed
-    end
+    end
 
     TYPE_VALIDATIONS = {
       'http://www.w3.org/2001/XMLSchema#string' => lambda { |value, constraints| value },
@@ -170,4 +187,4 @@ module Csvlint
       end,
     }
   end
-end
+end
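The `validate_regex` change above compiles the schema's `pattern` constraint before matching against it, so that a broken pattern is reported as a schema problem rather than a data problem. A minimal sketch of that idea outside the gem follows; `check_pattern` is a hypothetical helper (it is not part of csvlint), and the symbols it returns only mirror the error types named in the diff.

```ruby
# Hypothetical helper, not part of csvlint: classify a value against a
# schema "pattern" constraint, mirroring the error types used in the diff.
def check_pattern(pattern, value)
  return nil if pattern.nil?          # no constraint, nothing to check

  begin
    regex = Regexp.new(pattern)       # raises RegexpError for e.g. "(("
  rescue RegexpError
    return :invalid_regex             # the schema is at fault, not the data
  end

  return nil if value.nil?
  value.match(regex) ? nil : :pattern # :pattern means the value did not match
end

check_pattern("^[A-Za-z0-9_]*$", "Bob")    # valid pattern, matching value
check_pattern("^[A-Za-z0-9_]*$", "B o b")  # valid pattern, non-matching value
check_pattern("((", "Bob")                 # pattern that does not compile
```

Because an uncompilable pattern fails identically for every row, the diff also passes `all_errors` into `validate_column` so the `:invalid_regex` error is skipped once it has been recorded for that column.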
@@ -1,11 +1,11 @@
 module Csvlint
-
+
   class Schema
-
+
     include Csvlint::ErrorCollector
-
+
     attr_reader :uri, :fields, :title, :description
-
+
     def initialize(uri, fields=[], title=nil, description=nil)
       @uri = uri
       @fields = fields
@@ -17,16 +17,16 @@ module Csvlint
     def validate_header(header)
       reset
 
-      found_header = header.join(',')
-      expected_header = @fields.map{ |f| f.name }.join(',')
+      found_header = header.to_csv(:row_sep => '')
+      expected_header = @fields.map{ |f| f.name }.to_csv(:row_sep => '')
       if found_header != expected_header
         build_warnings(:malformed_header, :schema, 1, nil, found_header, expected_header)
       end
 
       return valid?
     end
-
-    def validate_row(values, row=nil)
+
+    def validate_row(values, row=nil, all_errors=[])
       reset
       if values.length < fields.length
         fields[values.size..-1].each_with_index do |field, i|
@@ -38,34 +38,37 @@ module Csvlint
           build_warnings(:extra_column, :schema, row, fields.size+i+1)
         end
       end
-
+
       fields.each_with_index do |field,i|
         value = values[i] || ""
-        result = field.validate_column(value, row, i+1)
+        result = field.validate_column(value, row, i+1, all_errors)
         @errors += fields[i].errors
-        @warnings += fields[i].warnings
+        @warnings += fields[i].warnings
       end
-
+
       return valid?
     end
-
+
     def Schema.from_json_table(uri, json)
       fields = []
       json["fields"].each do |field_desc|
-        fields << Csvlint::Field.new( field_desc["name"] , field_desc["constraints"],
+        fields << Csvlint::Field.new( field_desc["name"] , field_desc["constraints"],
           field_desc["title"], field_desc["description"] )
       end if json["fields"]
       return Schema.new( uri , fields, json["title"], json["description"] )
     end
-
+
+    # Difference in functionality between from_json_table and load_from_json_table
+    # needs to be specified
+
     def Schema.load_from_json_table(uri)
       begin
         json = JSON.parse( open(uri).read )
         return Schema.from_json_table(uri,json)
       rescue
-        return nil
+        return Schema.new(nil, [], "malformed", "malformed")
       end
     end
-
+
   end
-end
+end
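The `validate_header` hunk above replaces `join(',')` with `to_csv(:row_sep => '')` when serialising the found and expected headers. A likely motivation (an assumption, as the diff does not state it) is that a plain join is ambiguous when a header name itself contains a comma, whereas the standard library's CSV writer quotes such fields. A small sketch with made-up header names:

```ruby
require "csv"

# Illustrative header; the second name contains the delimiter.
header = ["name", "address, city"]

joined = header.join(",")            # comma inside the name is indistinguishable from a delimiter
as_csv = header.to_csv(row_sep: "")  # CSV writer quotes the field that contains a comma

puts joined
puts as_csv
```

With `row_sep` set to an empty string, no line terminator is appended, which keeps the value suitable for a direct string comparison and for the warning's content field.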
@@ -1,25 +1,26 @@
1
1
  module Csvlint
2
-
2
+
3
3
  class Validator
4
-
4
+
5
5
  include Csvlint::ErrorCollector
6
-
6
+
7
7
  attr_reader :encoding, :content_type, :extension, :headers, :line_breaks, :dialect, :csv_header, :schema, :data
8
-
8
+
9
9
  ERROR_MATCHERS = {
10
10
  "Missing or stray quote" => :stray_quote,
11
11
  "Illegal quoting" => :whitespace,
12
12
  "Unclosed quoted field" => :unclosed_quote,
13
13
  "Unquoted fields do not allow \\r or \\n" => :line_breaks,
14
14
  }
15
-
16
- def initialize(source, dialect = nil, schema = nil, options = {})
15
+
16
+ def initialize(source, dialect = nil, schema = nil, options = {})
17
+
17
18
  @source = source
18
19
  @formats = []
19
20
  @schema = schema
20
-
21
+
21
22
  @supplied_dialect = dialect != nil
22
-
23
+
23
24
  @dialect = {
24
25
  "header" => true,
25
26
  "delimiter" => ",",
@@ -27,18 +28,19 @@ module Csvlint
27
28
  "lineTerminator" => :auto,
28
29
  "quoteChar" => '"'
29
30
  }.merge(dialect || {})
30
-
31
+
31
32
  @csv_header = @dialect["header"]
32
33
  @limit_lines = options[:limit_lines]
33
34
  @csv_options = dialect_to_csv_options(@dialect)
34
- @extension = parse_extension(source)
35
+ @extension = parse_extension(source) unless @source.nil?
35
36
  reset
36
37
  validate
38
+
37
39
  end
38
-
40
+
39
41
  def validate
40
- single_col = false
41
- io = nil
42
+ single_col = false
43
+ io = nil
42
44
  begin
43
45
  io = @source.respond_to?(:gets) ? @source : open(@source, :allow_redirections=>:all)
44
46
  validate_metadata(io)
@@ -47,19 +49,19 @@ module Csvlint
47
49
  unless sum.nil?
48
50
  build_warnings(:title_row, :structure) if @col_counts.first < (sum / @col_counts.size.to_f)
49
51
  end
50
- build_warnings(:check_options, :structure) if @expected_columns == 1
51
- check_consistency
52
+ build_warnings(:check_options, :structure) if @expected_columns == 1
53
+ check_consistency
52
54
  rescue OpenURI::HTTPError, Errno::ENOENT
53
55
  build_errors(:not_found)
54
56
  ensure
55
57
  io.close if io && io.respond_to?(:close)
56
58
  end
57
59
  end
58
-
60
+
59
61
  def validate_metadata(io)
60
62
  @encoding = io.charset rescue nil
61
63
  @content_type = io.content_type rescue nil
62
- @headers = io.meta rescue nil
64
+ @headers = io.meta rescue nil
63
65
  assumed_header = undeclared_header = !@supplied_dialect
64
66
  if @headers
65
67
  if @headers["content-type"] =~ /text\/csv/
@@ -74,31 +76,33 @@ module Csvlint
74
76
  assumed_header = false
75
77
  end
76
78
  if @headers["content-type"] !~ /charset=/
77
- build_warnings(:no_encoding, :context)
79
+ build_warnings(:no_encoding, :context)
78
80
  else
79
81
  build_warnings(:encoding, :context) if @encoding != "utf-8"
80
82
  end
81
83
  build_warnings(:no_content_type, :context) if @content_type == nil
82
84
  build_warnings(:excel, :context) if @content_type == nil && @extension =~ /.xls(x)?/
83
85
  build_errors(:wrong_content_type, :context) unless (@content_type && @content_type =~ /text\/csv/)
84
-
86
+
85
87
  if undeclared_header
86
88
  build_errors(:undeclared_header, :structure)
87
89
  assumed_header = false
88
90
  end
89
-
91
+
90
92
  end
91
93
  build_info_messages(:assumed_header, :structure) if assumed_header
92
94
  end
93
-
95
+
96
+ # analyses the provided csv and builds errors, warnings and info messages
94
97
  def parse_csv(io)
95
98
  @expected_columns = 0
96
99
  current_line = 0
97
100
  reported_invalid_encoding = false
101
+ all_errors = []
98
102
  @col_counts = []
99
-
100
- @csv_options[:encoding] = @encoding
101
-
103
+
104
+ @csv_options[:encoding] = @encoding
105
+
102
106
  begin
103
107
  wrapper = WrappedIO.new( io )
104
108
  csv = CSV.new( wrapper, @csv_options )
@@ -110,37 +114,38 @@ module Csvlint
110
114
  row = nil
111
115
  loop do
112
116
  current_line += 1
113
- if @limit_lines && current_line > @limit_lines
117
+ if @limit_lines && current_line > @limit_lines
114
118
  break
115
119
  end
116
120
  begin
117
121
  wrapper.reset_line
118
122
  row = csv.shift
119
123
  @data << row
120
- if row
124
+ if row
121
125
  if current_line == 1 && header?
122
126
  row = row.reject{|col| col.nil? || col.empty?}
123
127
  validate_header(row)
124
128
  @col_counts << row.size
125
- else
129
+ else
126
130
  build_formats(row)
127
131
  @col_counts << row.reject{|col| col.nil? || col.empty?}.size
128
132
  @expected_columns = row.size unless @expected_columns != 0
129
-
133
+
130
134
  build_errors(:blank_rows, :structure, current_line, nil, wrapper.line) if row.reject{ |c| c.nil? || c.empty? }.size == 0
131
-
135
+ # Builds errors and warnings related to the provided schema file
132
136
  if @schema
133
- @schema.validate_row(row, current_line)
137
+ @schema.validate_row(row, current_line, all_errors)
134
138
  @errors += @schema.errors
139
+ all_errors += @schema.errors
135
140
  @warnings += @schema.warnings
136
141
  else
137
142
  build_errors(:ragged_rows, :structure, current_line, nil, wrapper.line) if !row.empty? && row.size != @expected_columns
138
143
  end
139
-
144
+
140
145
  end
141
- else
146
+ else
142
147
  break
143
- end
148
+ end
144
149
  rescue CSV::MalformedCSVError => e
145
150
  type = fetch_error(e)
146
151
  if type == :stray_quote && !wrapper.line.match(csv.row_sep)
@@ -154,8 +159,8 @@ module Csvlint
154
159
  build_errors(:invalid_encoding, :structure, current_line, nil, wrapper.line) unless reported_invalid_encoding
155
160
  reported_invalid_encoding = true
156
161
  end
157
- end
158
-
162
+ end
163
+
159
164
  def validate_header(header)
160
165
  names = Set.new
161
166
  header.each_with_index do |name,i|
@@ -173,18 +178,18 @@ module Csvlint
173
178
  end
174
179
  return valid?
175
180
  end
176
-
181
+
177
182
  def header?
178
183
  @csv_header
179
184
  end
180
-
185
+
181
186
  def fetch_error(error)
182
187
  e = error.message.match(/^(.+?)(?: [io]n)? \(?line \d+\)?\.?$/i)
183
188
  message = e[1] rescue nil
184
189
  ERROR_MATCHERS.fetch(message, :unknown_error)
185
190
  end
186
-
187
- def dialect_to_csv_options(dialect)
191
+
192
+ def dialect_to_csv_options(dialect)
188
193
  skipinitialspace = dialect["skipInitialSpace"] || true
189
194
  delimiter = dialect["delimiter"]
190
195
  delimiter = delimiter + " " if !skipinitialspace
@@ -195,8 +200,8 @@ module Csvlint
195
200
  :skip_blanks => false
196
201
  }
197
202
  end
198
-
199
- def build_formats(row)
203
+
204
+ def build_formats(row)
200
205
  row.each_with_index do |col, i|
201
206
  next if col.nil? || col.empty?
202
207
  @formats[i] ||= Hash.new(0)
@@ -228,11 +233,11 @@ module Csvlint
228
233
  else
229
234
  :string
230
235
  end
231
-
236
+
232
237
  @formats[i][format] += 1
233
238
  end
234
239
  end
235
-
240
+
236
241
  def check_consistency
237
242
  @formats.each_with_index do |format,i|
238
243
  if format
@@ -243,10 +248,11 @@ module Csvlint
243
248
  end
244
249
  end
245
250
  end
246
-
251
+
247
252
  private
248
-
253
+
249
254
  def parse_extension(source)
255
+ # byebug
250
256
  case source
251
257
  when File
252
258
  return File.extname( source.path )
@@ -254,11 +260,16 @@ module Csvlint
254
260
  return ""
255
261
  when StringIO
256
262
  return ""
257
- when Tempfile
263
+ when Tempfile
264
+ # this is triggered when the revalidate dialect use case happens
258
265
  return ""
259
266
  else
260
- parsed = URI.parse(source)
261
- File.extname(parsed.path)
267
+ begin
268
+ parsed = URI.parse(source)
269
+ File.extname(parsed.path)
270
+ rescue URI::InvalidURIError
271
+ return ""
272
+ end
262
273
  end
263
274
  end
264
275
 
@@ -1,3 +1,3 @@
 module Csvlint
-  VERSION = "0.1.2"
+  VERSION = "0.1.3"
 end
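One behavioural change in the validator diff above is that `parse_extension` now rescues `URI::InvalidURIError` and returns an empty string instead of letting the exception escape. A stand-alone sketch of that fallback, using a hypothetical `extension_for` helper (not part of csvlint) and example URLs:

```ruby
require "uri"

# Hypothetical stand-alone version of the URL branch of parse_extension.
def extension_for(source)
  File.extname(URI.parse(source).path)
rescue URI::InvalidURIError
  ""  # treat an unparseable source as having no extension
end

extension_for("http://example.com/data.csv?download=1")  # query string is not part of the path
extension_for("http://example.com/bad path.csv")         # the space makes the URI invalid
```

The diff additionally skips the call altogether when the source is `nil` (`unless @source.nil?`), which this sketch does not cover.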
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csvlint
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.2
4
+ version: 0.1.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - pezholio
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-07-15 00:00:00.000000000 Z
11
+ date: 2015-07-24 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: mime-types
@@ -248,6 +248,20 @@ dependencies:
248
248
  - - ! '>='
249
249
  - !ruby/object:Gem::Version
250
250
  version: '0'
251
+ - !ruby/object:Gem::Dependency
252
+ name: github_changelog_generator
253
+ requirement: !ruby/object:Gem::Requirement
254
+ requirements:
255
+ - - ! '>='
256
+ - !ruby/object:Gem::Version
257
+ version: '0'
258
+ type: :development
259
+ prerelease: false
260
+ version_requirements: !ruby/object:Gem::Requirement
261
+ requirements:
262
+ - - ! '>='
263
+ - !ruby/object:Gem::Version
264
+ version: '0'
251
265
  description: CSV Validator
252
266
  email:
253
267
  - pezholio@gmail.com
@@ -261,6 +275,7 @@ files:
261
275
  - .gitignore
262
276
  - .ruby-version
263
277
  - .travis.yml
278
+ - CHANGELOG.md
264
279
  - Gemfile
265
280
  - LICENSE.md
266
281
  - README.md
@@ -270,6 +285,7 @@ files:
270
285
  - csvlint.gemspec
271
286
  - features/check_format.feature
272
287
  - features/csv_options.feature
288
+ - features/csvupload.feature
273
289
  - features/fixtures/cr-line-endings.csv
274
290
  - features/fixtures/crlf-line-endings.csv
275
291
  - features/fixtures/inconsistent-line-endings-unquoted.csv
@@ -336,6 +352,7 @@ summary: CSV Validator
336
352
  test_files:
337
353
  - features/check_format.feature
338
354
  - features/csv_options.feature
355
+ - features/csvupload.feature
339
356
  - features/fixtures/cr-line-endings.csv
340
357
  - features/fixtures/crlf-line-endings.csv
341
358
  - features/fixtures/inconsistent-line-endings-unquoted.csv