csvlint 0.1.2 → 0.1.3

checksums.yaml CHANGED
@@ -1,15 +1,15 @@
1
1
  ---
2
2
  !binary "U0hBMQ==":
3
3
  metadata.gz: !binary |-
4
- NjUyMGZlNGIyNjM4ZmU1OGVjMDc3Y2U2YmQ0NWEzMzM3ZGFkYzA0Yw==
4
+ YmFkYTMwMzU2Njg0ZDk5NmJkOTU2ZjczNWVjZTRjNTVmZWEyOTY2NA==
5
5
  data.tar.gz: !binary |-
6
- NzYzMjUxY2NlZmE4ODY3NjYwNTU4NzUyMmE2NTU0NjM1MmQ0YzI4NQ==
6
+ ODIzOTA1MmQ0MzI3ZTZlOTc2NTU5NmFkNzgxODkxODU2MjFhY2VmYw==
7
7
  SHA512:
8
8
  metadata.gz: !binary |-
9
- ODQwMGQzMjY4NGQzNGY5ZGY1Nzg3ODc5MDI3M2E3NmNmNzUwNzU1MzBjNjNk
10
- YzRiMjE3NTJlNTc5YWI2NjhkNGJjNDcxNzVkMDgyZmRkYjZlYWFmZjUyOWQx
11
- ZmVkOWRjZDJhMGU4M2QyYmI0YzZmMDBlYWRhMjFmNGFlZmJmZTg=
9
+ Y2IwYTYzYjg1N2JmMmFmN2Q1NjIwYTQxMTU0Y2Y5YTM3YzFkYjkyZWI1MWEy
10
+ ODJiZTZjNDhlMDc0MjliZGRkMDQyZDRkMDdlNzRjMmFhN2IyYTJkZDU4MWZl
11
+ NzdmYWQwZDI5MTI1Nzk5ZDQxZjRiZGVlMTI3NzMwNmI2NGZiZTM=
12
12
  data.tar.gz: !binary |-
13
- ZjhkODBhMWU1ODFjN2JiNjQ3ZjBlZmFkNzAwMmVhOTNiYzgwNDNjM2RjODg5
14
- YWE2NGUwNjk1M2JlYWFjZmVlY2U0YjNmMTZmM2U4YTFlMDcyMGViNWU3NWY2
15
- NTY5NmIyNmE5ZjYwNDkwOTE1NjQ2MWRiZGU0ZGRhM2MxMTEyYjU=
13
+ NDBlNGVhMTVkM2IzNDIyNWEzMjU5NDI4YjZjNjlmODg5YWNiOTMzNDk4ZGQ5
14
+ ODVmMTM2MWJjNTIwNzdhNmNkNTNhZGUyZTUyZTgyYzM4Njc4YWUwYzA1YWJl
15
+ MmYwNmFkMTVlMGIyNDE5ODVkN2E3NWE5ZWFlYzQ1NGVjNzY1Y2Q=
@@ -0,0 +1,137 @@
1
+ # Change Log
2
+
3
+ ## [Unreleased](https://github.com/theodi/csvlint.rb/tree/HEAD)
4
+
5
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.2...HEAD)
6
+
7
+ **Merged pull requests:**
8
+
9
+ - Error reporting schema expanded test suite [\#138](https://github.com/theodi/csvlint.rb/pull/138) ([quadrophobiac](https://github.com/quadrophobiac))
10
+
11
+ - Validate header size improvement [\#137](https://github.com/theodi/csvlint.rb/pull/137) ([adamc00](https://github.com/adamc00))
12
+
13
+ - Invalid schema [\#132](https://github.com/theodi/csvlint.rb/pull/132) ([bcouston](https://github.com/bcouston))
14
+
15
+ ## [0.1.2](https://github.com/theodi/csvlint.rb/tree/0.1.2) (2015-07-15)
16
+
17
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.1...0.1.2)
18
+
19
+ **Closed issues:**
20
+
21
+ - When an encoding error is thrown the line content is put into the column field in the error object [\#131](https://github.com/theodi/csvlint.rb/issues/131)
22
+
23
+ **Merged pull requests:**
24
+
25
+ - Catch invalid URIs [\#133](https://github.com/theodi/csvlint.rb/pull/133) ([pezholio](https://github.com/pezholio))
26
+
27
+ - Emit a warning when the CSV header does not match the supplied schema [\#127](https://github.com/theodi/csvlint.rb/pull/127) ([adamc00](https://github.com/adamc00))
28
+
29
+ ## [0.1.1](https://github.com/theodi/csvlint.rb/tree/0.1.1) (2015-07-13)
30
+
31
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.0...0.1.1)
32
+
33
+ **Closed issues:**
34
+
35
+ - Add Command Line Support [\#128](https://github.com/theodi/csvlint.rb/issues/128)
36
+
37
+ - BUG: Incorrect inconsistent\_values error on numeric columns [\#106](https://github.com/theodi/csvlint.rb/issues/106)
38
+
39
+ **Merged pull requests:**
40
+
41
+ - Fixes line content incorrectly being put into the row column field when there is an encoding error. [\#130](https://github.com/theodi/csvlint.rb/pull/130) ([glacier](https://github.com/glacier))
42
+
43
+ - Add command line help [\#129](https://github.com/theodi/csvlint.rb/pull/129) ([pezholio](https://github.com/pezholio))
44
+
45
+ - Remove stray q character. [\#125](https://github.com/theodi/csvlint.rb/pull/125) ([adamc00](https://github.com/adamc00))
46
+
47
+ - csvlint utility can take arguments to specify a schema and pp errors [\#124](https://github.com/theodi/csvlint.rb/pull/124) ([adamc00](https://github.com/adamc00))
48
+
49
+ - Fixed warning - use expect\( \) rather than .should [\#123](https://github.com/theodi/csvlint.rb/pull/123) ([jezhiggins](https://github.com/jezhiggins))
50
+
51
+ - Fixed spelling mistake [\#121](https://github.com/theodi/csvlint.rb/pull/121) ([jezhiggins](https://github.com/jezhiggins))
52
+
53
+ - Avoid using \#blank? if unnecessary [\#120](https://github.com/theodi/csvlint.rb/pull/120) ([jpmckinney](https://github.com/jpmckinney))
54
+
55
+ - eliminate some date and time formats, related \#105 [\#119](https://github.com/theodi/csvlint.rb/pull/119) ([jpmckinney](https://github.com/jpmckinney))
56
+
57
+ - Match another CSV error about line endings [\#118](https://github.com/theodi/csvlint.rb/pull/118) ([jpmckinney](https://github.com/jpmckinney))
58
+
59
+ - fixed typo mistake in README [\#117](https://github.com/theodi/csvlint.rb/pull/117) ([railsfactory-kumaresan](https://github.com/railsfactory-kumaresan))
60
+
61
+ - Integrate @jpmickinney's build\_formats improvements [\#112](https://github.com/theodi/csvlint.rb/pull/112) ([Floppy](https://github.com/Floppy))
62
+
63
+ - make limit\_lines into a non-dialect option [\#110](https://github.com/theodi/csvlint.rb/pull/110) ([Floppy](https://github.com/Floppy))
64
+
65
+ - fix coveralls stats [\#109](https://github.com/theodi/csvlint.rb/pull/109) ([Floppy](https://github.com/Floppy))
66
+
67
+ - Speed up \#build\_formats \(changes its API\) [\#103](https://github.com/theodi/csvlint.rb/pull/103) ([jpmckinney](https://github.com/jpmckinney))
68
+
69
+ - Limit lines [\#101](https://github.com/theodi/csvlint.rb/pull/101) ([Hoedic](https://github.com/Hoedic))
70
+
71
+ ## [0.1.0](https://github.com/theodi/csvlint.rb/tree/0.1.0) (2014-11-27)
72
+
73
+ **Implemented enhancements:**
74
+
75
+ - Blank values shouldn't count as inconsistencies [\#90](https://github.com/theodi/csvlint.rb/issues/90)
76
+
77
+ - Make sure we don't check schema column count and ragged row count together [\#66](https://github.com/theodi/csvlint.rb/issues/66)
78
+
79
+ - Include the failed constraints in error message when doing field validation [\#64](https://github.com/theodi/csvlint.rb/issues/64)
80
+
81
+ - Include the column value in error message when field validation fails [\#63](https://github.com/theodi/csvlint.rb/issues/63)
82
+
83
+ - Expose optional JSON table schema fields [\#55](https://github.com/theodi/csvlint.rb/issues/55)
84
+
85
+ - Ensure header rows are properly handled and validated [\#48](https://github.com/theodi/csvlint.rb/issues/48)
86
+
87
+ - Support zipped CSV? [\#30](https://github.com/theodi/csvlint.rb/issues/30)
88
+
89
+ - Improve feedback on inconsistent values [\#29](https://github.com/theodi/csvlint.rb/issues/29)
90
+
91
+ - Reported error positions are not massively useful [\#15](https://github.com/theodi/csvlint.rb/issues/15)
92
+
93
+ **Fixed bugs:**
94
+
95
+ - undefined method `\[\]' for nil:NilClass from fetch\_error [\#71](https://github.com/theodi/csvlint.rb/issues/71)
96
+
97
+ - Inconsistent column bases [\#69](https://github.com/theodi/csvlint.rb/issues/69)
98
+
99
+ - Improve error handling in Schema loading [\#42](https://github.com/theodi/csvlint.rb/issues/42)
100
+
101
+ - Recover from some line ending problems [\#41](https://github.com/theodi/csvlint.rb/issues/41)
102
+
103
+ - Inconsistent values due to number format differences [\#32](https://github.com/theodi/csvlint.rb/issues/32)
104
+
105
+ - New lines in quoted fields are valid [\#31](https://github.com/theodi/csvlint.rb/issues/31)
106
+
107
+ - Wrongly reporting incorrect file extension [\#23](https://github.com/theodi/csvlint.rb/issues/23)
108
+
109
+ - Incorrect extension reported when URL has query options at the end [\#14](https://github.com/theodi/csvlint.rb/issues/14)
110
+
111
+ **Closed issues:**
112
+
113
+ - Get gem continuously deploying [\#93](https://github.com/theodi/csvlint.rb/issues/93)
114
+
115
+ - Publish on rubygems.org [\#92](https://github.com/theodi/csvlint.rb/issues/92)
116
+
117
+ - Duplicate column names [\#87](https://github.com/theodi/csvlint.rb/issues/87)
118
+
119
+ - Return code is always 0 \(except when it isn't\) [\#85](https://github.com/theodi/csvlint.rb/issues/85)
120
+
121
+ - Can't pipe data to csvlint [\#84](https://github.com/theodi/csvlint.rb/issues/84)
122
+
123
+ - They have some validator running if someone wants to inspect it for "inspiration" [\#27](https://github.com/theodi/csvlint.rb/issues/27)
124
+
125
+ - Allow CSV parsing options to be configured as a parameter [\#6](https://github.com/theodi/csvlint.rb/issues/6)
126
+
127
+ - Use explicit CSV parsing options [\#5](https://github.com/theodi/csvlint.rb/issues/5)
128
+
129
+ - Improving encoding detection [\#2](https://github.com/theodi/csvlint.rb/issues/2)
130
+
131
+ **Merged pull requests:**
132
+
133
+ - Continuously deploy gem [\#102](https://github.com/theodi/csvlint.rb/pull/102) ([pezholio](https://github.com/pezholio))
134
+
135
+
136
+
137
+ \* *This Change Log was automatically generated by [github_changelog_generator](https://github.com/skywinder/Github-Changelog-Generator)*
@@ -36,4 +36,5 @@ Gem::Specification.new do |spec|
36
36
  spec.add_development_dependency "rspec-expectations"
37
37
  spec.add_development_dependency "coveralls"
38
38
  spec.add_development_dependency "pry"
39
+ spec.add_development_dependency "github_changelog_generator"
39
40
  end
@@ -0,0 +1,153 @@
1
+ Feature: Collect all the tests that should trigger dialect check related errors
2
+
3
+ Scenario: Title rows, I wish to trigger a :title_row type message
4
+ Given I have a CSV file called "title-row.csv"
5
+ And it is stored at the url "http://example.com/example1.csv"
6
+ And I ask if there are warnings
7
+ Then there should be 1 warnings
8
+ And that warning should have the type "title_row"
9
+
10
+ # :nonrfc_line_breaks
11
+
12
+ Scenario: LF line endings in file give an info message of type :nonrfc_line_breaks
13
+ Given I have a CSV file called "lf-line-endings.csv"
14
+ And it is stored at the url "http://example.com/example1.csv"
15
+ And I set header to "true"
16
+ And I ask if there are info messages
17
+ Then there should be 2 info messages
18
+ And one of the messages should have the type "nonrfc_line_breaks"
19
+
20
+ Scenario: CR line endings in file give an info message of type :nonrfc_line_breaks
21
+ Given I have a CSV file called "cr-line-endings.csv"
22
+ And it is stored at the url "http://example.com/example1.csv"
23
+ And I set header to "true"
24
+ And I ask if there are info messages
25
+ Then there should be 2 info messages
26
+ And one of the messages should have the type "nonrfc_line_breaks"
27
+
28
+ Scenario: CRLF line endings in file produces no info messages of type :nonrfc_line_breaks
29
+ Given I have a CSV file called "crlf-line-endings.csv"
30
+ And it is stored at the url "http://example.com/example1.csv"
31
+ And I set header to "true"
32
+ And I ask if there are info messages
33
+ Then there should be 1 info message
34
+
35
+ # :line_breaks
36
+
37
+ Scenario: Incorrect line endings specified in settings
38
+ Given I have a CSV file called "cr-line-endings.csv"
39
+ And I set the line endings to linefeed
40
+ And it is stored at the url "http://example.com/example1.csv"
41
+ And I ask if there are errors
42
+ Then there should be 1 error
43
+ And that error should have the type "line_breaks"
44
+
45
+ Scenario: inconsistent line endings in file cause an error
46
+ Given I have a CSV file called "inconsistent-line-endings.csv"
47
+ And it is stored at the url "http://example.com/example1.csv"
48
+ And I ask if there are errors
49
+ Then there should be 1 error
50
+ And that error should have the type "line_breaks"
51
+
52
+
53
+ Scenario: inconsistent line endings with unquoted fields in file cause an error
54
+ Given I have a CSV file called "inconsistent-line-endings-unquoted.csv"
55
+ And it is stored at the url "http://example.com/example1.csv"
56
+ And I ask if there are errors
57
+ Then there should be 1 error
58
+ And that error should have the type "line_breaks"
59
+
60
+ #:unclosed_quote
61
+
62
+ Scenario: CSV with incorrect quoting
63
+ Given I have a CSV with the following content:
64
+ """
65
+ "col1","col2","col3"
66
+ "Foo","Bar","Baz
67
+ """
68
+ And it is stored at the url "http://example.com/example1.csv"
69
+ When I ask if there are errors
70
+ Then there should be 1 error
71
+ And that error should have the type "unclosed_quote"
72
+ And that error should have the row "2"
73
+ And that error should have the content ""Foo","Bar","Baz"
74
+
75
+ # :invalid_encoding
76
+
77
+ Scenario: Report invalid Encoding
78
+ Given I have a CSV file called "invalid-byte-sequence.csv"
79
+ And I set an encoding header of "UTF-8"
80
+ And it is stored at the url "http://example.com/example1.csv"
81
+ When I ask if there are errors
82
+ Then there should be 1 error
83
+ And that error should have the type "invalid_encoding"
84
+
85
+ Scenario: Report invalid file
86
+ #should this throw an excel error?
87
+ Given I have a CSV file called "spreadsheet.xls"
88
+ And it is stored at the url "http://example.com/example1.csv"
89
+ When I ask if there are errors
90
+ Then there should be 1 error
91
+ And that error should have the type "invalid_encoding"
92
+
93
+ # :blank_rows
94
+
95
+ Scenario: Successfully report a CSV with blank rows
96
+ Given I have a CSV with the following content:
97
+ """
98
+ "col1","col2","col3"
99
+ "Foo","Bar","Baz"
100
+ "","",
101
+ "Baz","Bar","Foo"
102
+ """
103
+ And it is stored at the url "http://example.com/example1.csv"
104
+ When I ask if there are errors
105
+ Then there should be 1 error
106
+ And that error should have the type "blank_rows"
107
+ And that error should have the row "3"
108
+ And that error should have the content ""","","
109
+
110
+ Scenario: Successfully report a CSV with multiple trailing empty rows
111
+ Given I have a CSV with the following content:
112
+ """
113
+ "col1","col2","col3"
114
+ "Foo","Bar","Baz"
115
+ "Foo","Bar","Baz"
116
+
117
+
118
+ """
119
+ And it is stored at the url "http://example.com/example1.csv"
120
+ When I ask if there are errors
121
+ Then there should be 1 error
122
+ And that error should have the type "blank_rows"
123
+ And that error should have the row "4"
124
+
125
+ Scenario: Successfully report a CSV with an empty row
126
+ Given I have a CSV with the following content:
127
+ """
128
+ "col1","col2","col3"
129
+ "Foo","Bar","Baz"
130
+
131
+ "Foo","Bar","Baz"
132
+ """
133
+ And it is stored at the url "http://example.com/example1.csv"
134
+ When I ask if there are errors
135
+ Then there should be 1 error
136
+ And that error should have the type "blank_rows"
137
+ And that error should have the row "3"
138
+
139
+ #:check_options
140
+
141
+ Scenario: Warn if options seem to return invalid data
142
+ Given I have a CSV with the following content:
143
+ """
144
+ 'Foo';'Bar';'Baz'
145
+ '1';'2';'3'
146
+ '3';'2';'1'
147
+ """
148
+ And I set the delimiter to ","
149
+ And I set quotechar to """
150
+ And it is stored at the url "http://example.com/example1.csv"
151
+ And I ask if there are warnings
152
+ Then there should be 1 warnings
153
+ And that warning should have the type "check_options"
@@ -60,4 +60,46 @@ Feature: Schema Validation
60
60
  """
61
61
  When I ask if there are warnings
62
62
  Then there should be 1 warnings
63
-
63
+
64
+ Scenario: Schema with valid regex
65
+ Given I have a CSV with the following content:
66
+ """
67
+ "firstname","id","email"
68
+ "Bob","1234","bob@example.org"
69
+ "Alice","5","alice@example.com"
70
+ """
71
+ And it is stored at the url "http://example.com/example1.csv"
72
+ And I have a schema with the following content:
73
+ """
74
+ {
75
+ "fields": [
76
+ { "name": "Name", "constraints": { "required": true, "pattern": "^[A-Za-z0-9_]*$" } },
77
+ { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
78
+ { "name": "Email", "constraints": { "required": true } }
79
+ ]
80
+ }
81
+ """
82
+ When I ask if there are errors
83
+ Then there should be 0 error
84
+
85
+ Scenario: Schema with invalid regex
86
+ Given I have a CSV with the following content:
87
+ """
88
+ "firstname","id","email"
89
+ "Bob","1234","bob@example.org"
90
+ "Alice","5","alice@example.com"
91
+ """
92
+ And it is stored at the url "http://example.com/example1.csv"
93
+ And I have a schema with the following content:
94
+ """
95
+ {
96
+ "fields": [
97
+ { "name": "Name", "constraints": { "required": true, "pattern": "((" } },
98
+ { "name": "Id", "constraints": { "required": true, "minLength": 1 } },
99
+ { "name": "Email", "constraints": { "required": true } }
100
+ ]
101
+ }
102
+ """
103
+ When I ask if there are errors
104
+ Then there should be 1 error
105
+ And that error should have the type "invalid_regex"
@@ -1,13 +1,15 @@
1
1
  module Csvlint
2
2
  module ErrorCollector
3
3
  attr_reader :errors, :warnings, :info_messages
4
-
4
+ # Creates a validation error
5
5
  def build_errors(type, category = nil, row = nil, column = nil, content = nil, constraints = {})
6
6
  @errors << Csvlint::ErrorMessage.new(type, category, row, column, content, constraints)
7
7
  end
8
+ # Creates a validation warning
8
9
  def build_warnings(type, category = nil, row = nil, column = nil, content = nil, constraints = {})
9
10
  @warnings << Csvlint::ErrorMessage.new(type, category, row, column, content, constraints)
10
11
  end
12
+ # Creates a validation information message
11
13
  def build_info_messages(type, category = nil, row = nil, column = nil, content = nil, constraints = {})
12
14
  @info_messages << Csvlint::ErrorMessage.new(type, category, row, column, content, constraints)
13
15
  end
@@ -9,6 +9,7 @@ module Csvlint
9
9
  @column = column
10
10
  @content = content
11
11
  @constraints = constraints
12
+
12
13
  end
13
14
  end
14
15
  end
@@ -1,10 +1,10 @@
1
1
  module Csvlint
2
-
2
+
3
3
  class Field
4
4
  include Csvlint::ErrorCollector
5
5
 
6
6
  attr_reader :name, :constraints, :title, :description
7
-
7
+
8
8
  def initialize(name, constraints={}, title=nil, description=nil)
9
9
  @name = name
10
10
  @constraints = constraints || {}
@@ -13,9 +13,12 @@ module Csvlint
13
13
  @description = description
14
14
  reset
15
15
  end
16
-
17
- def validate_column(value, row=nil, column=nil)
16
+
17
+ def validate_column(value, row=nil, column=nil, all_errors=[])
18
18
  reset
19
+ unless all_errors.any?{|error| ((error.type == :invalid_regex) && (error.column == column))}
20
+ validate_regex(value, row, column)
21
+ end
19
22
  validate_length(value, row, column)
20
23
  validate_values(value, row, column)
21
24
  parsed = validate_type(value, row, column)
@@ -26,11 +29,11 @@ module Csvlint
26
29
  private
27
30
  def validate_length(value, row, column)
28
31
  if constraints["required"] == true
29
- build_errors(:missing_value, :schema, row, column, value,
32
+ build_errors(:missing_value, :schema, row, column, value,
30
33
  { "required" => true }) if value.nil? || value.length == 0
31
34
  end
32
35
  if constraints["minLength"]
33
- build_errors(:min_length, :schema, row, column, value,
36
+ build_errors(:min_length, :schema, row, column, value,
34
37
  { "minLength" => constraints["minLength"] }) if value.nil? || value.length < constraints["minLength"]
35
38
  end
36
39
  if constraints["maxLength"]
@@ -38,12 +41,26 @@ module Csvlint
38
41
  { "maxLength" => constraints["maxLength"] } ) if !value.nil? && value.length > constraints["maxLength"]
39
42
  end
40
43
  end
41
-
42
- def validate_values(value, row, column)
43
- if constraints["pattern"]
44
- build_errors(:pattern, :schema, row, column, value,
45
- { "pattern" => constraints["pattern"] } ) if !value.nil? && !value.match( constraints["pattern"] )
44
+
45
+ def validate_regex(value, row, column)
46
+ pattern = constraints["pattern"]
47
+ if pattern
48
+ begin
49
+ Regexp.new(pattern)
50
+ build_errors(:pattern, :schema, row, column, value,
51
+ { "pattern" => constraints["pattern"] } ) if !value.nil? && !value.match( constraints["pattern"] )
52
+ rescue RegexpError
53
+ build_errors(:invalid_regex, :schema, nil, column, ("#{name}: Constraints: Pattern: #{pattern}"),
54
+ { "pattern" => constraints["pattern"] })
55
+ end
46
56
  end
57
+ end
58
+
59
+ def validate_values(value, row, column)
60
+ # If a pattern exists, raise an invalid regex error if it is not in
61
+ # valid regex form, else, if the value of the relevant field in the csv
62
+ # does not match the given regex pattern in the schema, raise a
63
+ # pattern error.
47
64
  if constraints["unique"] == true
48
65
  if @uniques.include? value
49
66
  build_errors(:unique, :schema, row, column, value, { "unique" => true })
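Note on the `validate_regex` hunk above: the new method compiles the schema's `pattern` constraint with `Regexp.new` before matching, so a malformed pattern produces an `:invalid_regex` error rather than an unhandled `RegexpError`. The following is a minimal standalone sketch of that control flow, not the gem's code; `check_pattern` and its symbol return values are hypothetical names chosen for illustration.

```ruby
# Hypothetical helper, not part of csvlint: classifies a value against a
# schema "pattern" constraint using the same begin/rescue shape as the hunk.
def check_pattern(pattern, value)
  return :ok if pattern.nil?            # no pattern constraint, nothing to check
  begin
    regex = Regexp.new(pattern)         # raises RegexpError for patterns such as "(("
  rescue RegexpError
    return :invalid_regex               # the schema itself is at fault
  end
  return :ok if value.nil?              # nil values are not matched, as in the hunk
  value.match(regex) ? :ok : :pattern   # :pattern means the value failed the constraint
end

puts check_pattern("^[A-Za-z0-9_]*$", "Bob")
puts check_pattern("^[A-Za-z0-9_]*$", "bob@example.org")
puts check_pattern("((", "Bob")
```

The two patterns are the ones used in the "Schema with valid regex" and "Schema with invalid regex" scenarios earlier in this diff.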
@@ -52,7 +69,7 @@ module Csvlint
52
69
  end
53
70
  end
54
71
  end
55
-
72
+
56
73
  def validate_type(value, row, column)
57
74
  if constraints["type"] && value != ""
58
75
  parsed = convert_to_type(value)
@@ -66,21 +83,21 @@ module Csvlint
66
83
  end
67
84
  return nil
68
85
  end
69
-
86
+
70
87
  def validate_range(value, row, column)
71
88
  #TODO: we're ignoring issues with converting ranges to actual types, maybe we
72
89
  #should generate a warning? The schema is invalid
73
90
  if constraints["minimum"]
74
91
  minimumValue = convert_to_type( constraints["minimum"] )
75
92
  if minimumValue
76
- build_errors(:below_minimum, :schema, row, column, value,
93
+ build_errors(:below_minimum, :schema, row, column, value,
77
94
  { "minimum" => constraints["minimum"] }) unless value >= minimumValue
78
95
  end
79
96
  end
80
97
  if constraints["maximum"]
81
98
  maximumValue = convert_to_type( constraints["maximum"] )
82
99
  if maximumValue
83
- build_errors(:above_maximum, :schema, row, column, value,
100
+ build_errors(:above_maximum, :schema, row, column, value,
84
101
  { "maximum" => constraints["maximum"] }) unless value <= maximumValue
85
102
  end
86
103
  end
@@ -96,7 +113,7 @@ module Csvlint
96
113
  end
97
114
  end
98
115
  return parsed
99
- end
116
+ end
100
117
 
101
118
  TYPE_VALIDATIONS = {
102
119
  'http://www.w3.org/2001/XMLSchema#string' => lambda { |value, constraints| value },
@@ -170,4 +187,4 @@ module Csvlint
170
187
  end,
171
188
  }
172
189
  end
173
- end
190
+ end
@@ -1,11 +1,11 @@
1
1
  module Csvlint
2
-
2
+
3
3
  class Schema
4
-
4
+
5
5
  include Csvlint::ErrorCollector
6
-
6
+
7
7
  attr_reader :uri, :fields, :title, :description
8
-
8
+
9
9
  def initialize(uri, fields=[], title=nil, description=nil)
10
10
  @uri = uri
11
11
  @fields = fields
@@ -17,16 +17,16 @@ module Csvlint
17
17
  def validate_header(header)
18
18
  reset
19
19
 
20
- found_header = header.join(',')
21
- expected_header = @fields.map{ |f| f.name }.join(',')
20
+ found_header = header.to_csv(:row_sep => '')
21
+ expected_header = @fields.map{ |f| f.name }.to_csv(:row_sep => '')
22
22
  if found_header != expected_header
23
23
  build_warnings(:malformed_header, :schema, 1, nil, found_header, expected_header)
24
24
  end
25
25
 
26
26
  return valid?
27
27
  end
28
-
29
- def validate_row(values, row=nil)
28
+
29
+ def validate_row(values, row=nil, all_errors=[])
30
30
  reset
31
31
  if values.length < fields.length
32
32
  fields[values.size..-1].each_with_index do |field, i|
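Note on the `validate_header` hunk above: replacing `join(',')` with `to_csv(:row_sep => '')` matters when a header name itself contains the delimiter, because `to_csv` quotes such a field while `join` does not. A small sketch of the difference, using a made-up header row (the column names are assumptions for illustration only):

```ruby
require "csv"

# Made-up header row; the second name contains a comma.
header = ["name", "address, city"]

joined = header.join(",")            # plain concatenation, no quoting
as_csv = header.to_csv(row_sep: "")  # CSV-aware serialisation without a trailing newline

puts joined
puts as_csv

# Parsing each string back shows how many columns a CSV reader would see.
puts CSV.parse_line(joined).length
puts CSV.parse_line(as_csv).length
```

With `join`, the serialised header no longer round-trips to the original two columns, which would make the found and expected header strings in a `:malformed_header` warning ambiguous.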
@@ -38,34 +38,37 @@ module Csvlint
38
38
  build_warnings(:extra_column, :schema, row, fields.size+i+1)
39
39
  end
40
40
  end
41
-
41
+
42
42
  fields.each_with_index do |field,i|
43
43
  value = values[i] || ""
44
- result = field.validate_column(value, row, i+1)
44
+ result = field.validate_column(value, row, i+1, all_errors)
45
45
  @errors += fields[i].errors
46
- @warnings += fields[i].warnings
46
+ @warnings += fields[i].warnings
47
47
  end
48
-
48
+
49
49
  return valid?
50
50
  end
51
-
51
+
52
52
  def Schema.from_json_table(uri, json)
53
53
  fields = []
54
54
  json["fields"].each do |field_desc|
55
- fields << Csvlint::Field.new( field_desc["name"] , field_desc["constraints"],
55
+ fields << Csvlint::Field.new( field_desc["name"] , field_desc["constraints"],
56
56
  field_desc["title"], field_desc["description"] )
57
57
  end if json["fields"]
58
58
  return Schema.new( uri , fields, json["title"], json["description"] )
59
59
  end
60
-
60
+
61
+ # Difference in functionality between from_json_table and load_from_json_table
62
+ # needs to be specified
63
+
61
64
  def Schema.load_from_json_table(uri)
62
65
  begin
63
66
  json = JSON.parse( open(uri).read )
64
67
  return Schema.from_json_table(uri,json)
65
68
  rescue
66
- return nil
69
+ return Schema.new(nil, [], "malformed", "malformed")
67
70
  end
68
71
  end
69
-
72
+
70
73
  end
71
- end
74
+ end
@@ -1,25 +1,26 @@
1
1
  module Csvlint
2
-
2
+
3
3
  class Validator
4
-
4
+
5
5
  include Csvlint::ErrorCollector
6
-
6
+
7
7
  attr_reader :encoding, :content_type, :extension, :headers, :line_breaks, :dialect, :csv_header, :schema, :data
8
-
8
+
9
9
  ERROR_MATCHERS = {
10
10
  "Missing or stray quote" => :stray_quote,
11
11
  "Illegal quoting" => :whitespace,
12
12
  "Unclosed quoted field" => :unclosed_quote,
13
13
  "Unquoted fields do not allow \\r or \\n" => :line_breaks,
14
14
  }
15
-
16
- def initialize(source, dialect = nil, schema = nil, options = {})
15
+
16
+ def initialize(source, dialect = nil, schema = nil, options = {})
17
+
17
18
  @source = source
18
19
  @formats = []
19
20
  @schema = schema
20
-
21
+
21
22
  @supplied_dialect = dialect != nil
22
-
23
+
23
24
  @dialect = {
24
25
  "header" => true,
25
26
  "delimiter" => ",",
@@ -27,18 +28,19 @@ module Csvlint
27
28
  "lineTerminator" => :auto,
28
29
  "quoteChar" => '"'
29
30
  }.merge(dialect || {})
30
-
31
+
31
32
  @csv_header = @dialect["header"]
32
33
  @limit_lines = options[:limit_lines]
33
34
  @csv_options = dialect_to_csv_options(@dialect)
34
- @extension = parse_extension(source)
35
+ @extension = parse_extension(source) unless @source.nil?
35
36
  reset
36
37
  validate
38
+
37
39
  end
38
-
40
+
39
41
  def validate
40
- single_col = false
41
- io = nil
42
+ single_col = false
43
+ io = nil
42
44
  begin
43
45
  io = @source.respond_to?(:gets) ? @source : open(@source, :allow_redirections=>:all)
44
46
  validate_metadata(io)
@@ -47,19 +49,19 @@ module Csvlint
47
49
  unless sum.nil?
48
50
  build_warnings(:title_row, :structure) if @col_counts.first < (sum / @col_counts.size.to_f)
49
51
  end
50
- build_warnings(:check_options, :structure) if @expected_columns == 1
51
- check_consistency
52
+ build_warnings(:check_options, :structure) if @expected_columns == 1
53
+ check_consistency
52
54
  rescue OpenURI::HTTPError, Errno::ENOENT
53
55
  build_errors(:not_found)
54
56
  ensure
55
57
  io.close if io && io.respond_to?(:close)
56
58
  end
57
59
  end
58
-
60
+
59
61
  def validate_metadata(io)
60
62
  @encoding = io.charset rescue nil
61
63
  @content_type = io.content_type rescue nil
62
- @headers = io.meta rescue nil
64
+ @headers = io.meta rescue nil
63
65
  assumed_header = undeclared_header = !@supplied_dialect
64
66
  if @headers
65
67
  if @headers["content-type"] =~ /text\/csv/
@@ -74,31 +76,33 @@ module Csvlint
74
76
  assumed_header = false
75
77
  end
76
78
  if @headers["content-type"] !~ /charset=/
77
- build_warnings(:no_encoding, :context)
79
+ build_warnings(:no_encoding, :context)
78
80
  else
79
81
  build_warnings(:encoding, :context) if @encoding != "utf-8"
80
82
  end
81
83
  build_warnings(:no_content_type, :context) if @content_type == nil
82
84
  build_warnings(:excel, :context) if @content_type == nil && @extension =~ /.xls(x)?/
83
85
  build_errors(:wrong_content_type, :context) unless (@content_type && @content_type =~ /text\/csv/)
84
-
86
+
85
87
  if undeclared_header
86
88
  build_errors(:undeclared_header, :structure)
87
89
  assumed_header = false
88
90
  end
89
-
91
+
90
92
  end
91
93
  build_info_messages(:assumed_header, :structure) if assumed_header
92
94
  end
93
-
95
+
96
+ # analyses the provided csv and builds errors, warnings and info messages
94
97
  def parse_csv(io)
95
98
  @expected_columns = 0
96
99
  current_line = 0
97
100
  reported_invalid_encoding = false
101
+ all_errors = []
98
102
  @col_counts = []
99
-
100
- @csv_options[:encoding] = @encoding
101
-
103
+
104
+ @csv_options[:encoding] = @encoding
105
+
102
106
  begin
103
107
  wrapper = WrappedIO.new( io )
104
108
  csv = CSV.new( wrapper, @csv_options )
@@ -110,37 +114,38 @@ module Csvlint
110
114
  row = nil
111
115
  loop do
112
116
  current_line += 1
113
- if @limit_lines && current_line > @limit_lines
117
+ if @limit_lines && current_line > @limit_lines
114
118
  break
115
119
  end
116
120
  begin
117
121
  wrapper.reset_line
118
122
  row = csv.shift
119
123
  @data << row
120
- if row
124
+ if row
121
125
  if current_line == 1 && header?
122
126
  row = row.reject{|col| col.nil? || col.empty?}
123
127
  validate_header(row)
124
128
  @col_counts << row.size
125
- else
129
+ else
126
130
  build_formats(row)
127
131
  @col_counts << row.reject{|col| col.nil? || col.empty?}.size
128
132
  @expected_columns = row.size unless @expected_columns != 0
129
-
133
+
130
134
  build_errors(:blank_rows, :structure, current_line, nil, wrapper.line) if row.reject{ |c| c.nil? || c.empty? }.size == 0
131
-
135
+ # Builds errors and warnings related to the provided schema file
132
136
  if @schema
133
- @schema.validate_row(row, current_line)
137
+ @schema.validate_row(row, current_line, all_errors)
134
138
  @errors += @schema.errors
139
+ all_errors += @schema.errors
135
140
  @warnings += @schema.warnings
136
141
  else
137
142
  build_errors(:ragged_rows, :structure, current_line, nil, wrapper.line) if !row.empty? && row.size != @expected_columns
138
143
  end
139
-
144
+
140
145
  end
141
- else
146
+ else
142
147
  break
143
- end
148
+ end
144
149
  rescue CSV::MalformedCSVError => e
145
150
  type = fetch_error(e)
146
151
  if type == :stray_quote && !wrapper.line.match(csv.row_sep)
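Note on the `all_errors` accumulator in the hunk above: it appears intended to let `validate_column` skip the regex check for a column once an `:invalid_regex` error has been recorded for it, so a broken schema pattern is reported once rather than on every row. The sketch below reproduces that guard in isolation under that reading; `SketchError` and `validate_cell` are hypothetical stand-ins, not csvlint classes or methods.

```ruby
# Hypothetical stand-in for Csvlint::ErrorMessage, keeping only the two
# attributes that the guard in validate_column inspects.
SketchError = Struct.new(:type, :column)

# Returns the errors produced for one cell. The regex check is skipped when
# an :invalid_regex error for this column is already present in all_errors.
def validate_cell(pattern, value, column, all_errors)
  already_reported = all_errors.any? { |e| e.type == :invalid_regex && e.column == column }
  return [] if already_reported
  begin
    value.match(Regexp.new(pattern)) ? [] : [SketchError.new(:pattern, column)]
  rescue RegexpError
    [SketchError.new(:invalid_regex, column)]
  end
end

# Three data rows validated against the same broken pattern in column 1.
all_errors = []
["Bob", "Alice", "Carol"].each do |value|
  all_errors += validate_cell("((", value, 1, all_errors)
end

puts all_errors.length
```

One side effect of this shape, visible in the sketch as well: once the guard fires, ordinary `:pattern` checks for that column are skipped too, which is harmless only because an invalid pattern can never match anything.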
@@ -154,8 +159,8 @@ module Csvlint
154
159
  build_errors(:invalid_encoding, :structure, current_line, nil, wrapper.line) unless reported_invalid_encoding
155
160
  reported_invalid_encoding = true
156
161
  end
157
- end
158
-
162
+ end
163
+
159
164
  def validate_header(header)
160
165
  names = Set.new
161
166
  header.each_with_index do |name,i|
@@ -173,18 +178,18 @@ module Csvlint
173
178
  end
174
179
  return valid?
175
180
  end
176
-
181
+
177
182
  def header?
178
183
  @csv_header
179
184
  end
180
-
185
+
181
186
  def fetch_error(error)
182
187
  e = error.message.match(/^(.+?)(?: [io]n)? \(?line \d+\)?\.?$/i)
183
188
  message = e[1] rescue nil
184
189
  ERROR_MATCHERS.fetch(message, :unknown_error)
185
190
  end
186
-
187
- def dialect_to_csv_options(dialect)
191
+
192
+ def dialect_to_csv_options(dialect)
188
193
  skipinitialspace = dialect["skipInitialSpace"] || true
189
194
  delimiter = dialect["delimiter"]
190
195
  delimiter = delimiter + " " if !skipinitialspace
@@ -195,8 +200,8 @@ module Csvlint
195
200
  :skip_blanks => false
196
201
  }
197
202
  end
198
-
199
- def build_formats(row)
203
+
204
+ def build_formats(row)
200
205
  row.each_with_index do |col, i|
201
206
  next if col.nil? || col.empty?
202
207
  @formats[i] ||= Hash.new(0)
@@ -228,11 +233,11 @@ module Csvlint
228
233
  else
229
234
  :string
230
235
  end
231
-
236
+
232
237
  @formats[i][format] += 1
233
238
  end
234
239
  end
235
-
240
+
236
241
  def check_consistency
237
242
  @formats.each_with_index do |format,i|
238
243
  if format
@@ -243,10 +248,11 @@ module Csvlint
243
248
  end
244
249
  end
245
250
  end
246
-
251
+
247
252
  private
248
-
253
+
249
254
  def parse_extension(source)
255
+ # byebug
250
256
  case source
251
257
  when File
252
258
  return File.extname( source.path )
@@ -254,11 +260,16 @@ module Csvlint
254
260
  return ""
255
261
  when StringIO
256
262
  return ""
257
- when Tempfile
263
+ when Tempfile
264
+ # this is triggered when the revalidate dialect use case happens
258
265
  return ""
259
266
  else
260
- parsed = URI.parse(source)
261
- File.extname(parsed.path)
267
+ begin
268
+ parsed = URI.parse(source)
269
+ File.extname(parsed.path)
270
+ rescue URI::InvalidURIError
271
+ return ""
272
+ end
262
273
  end
263
274
  end
264
275
 
@@ -1,3 +1,3 @@
1
1
  module Csvlint
2
- VERSION = "0.1.2"
2
+ VERSION = "0.1.3"
3
3
  end
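Note on the `parse_extension` hunk in the validator diff above: wrapping `URI.parse` in `begin`/`rescue URI::InvalidURIError` means a source string that is not a valid URI yields an empty extension instead of aborting validation. A minimal sketch of the same fallback, assuming a hypothetical helper named `extension_of` and example URLs that are not taken from the gem's fixtures:

```ruby
require "uri"

# Hypothetical helper: returns the file extension of a URL's path, or an
# empty string when the input cannot be parsed as a URI.
def extension_of(source)
  parsed = URI.parse(source)
  File.extname(parsed.path)
rescue URI::InvalidURIError
  ""
end

puts extension_of("http://example.com/example1.csv").inspect
puts extension_of("http://example.com/data.csv?download=1").inspect
puts extension_of("http://example.com/my file.csv").inspect
```

Because the extension is taken from the parsed path rather than the raw string, a trailing query string does not affect it; the unescaped space in the last URL is what makes `URI.parse` raise.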
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csvlint
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.2
4
+ version: 0.1.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - pezholio
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-07-15 00:00:00.000000000 Z
11
+ date: 2015-07-24 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: mime-types
@@ -248,6 +248,20 @@ dependencies:
248
248
  - - ! '>='
249
249
  - !ruby/object:Gem::Version
250
250
  version: '0'
251
+ - !ruby/object:Gem::Dependency
252
+ name: github_changelog_generator
253
+ requirement: !ruby/object:Gem::Requirement
254
+ requirements:
255
+ - - ! '>='
256
+ - !ruby/object:Gem::Version
257
+ version: '0'
258
+ type: :development
259
+ prerelease: false
260
+ version_requirements: !ruby/object:Gem::Requirement
261
+ requirements:
262
+ - - ! '>='
263
+ - !ruby/object:Gem::Version
264
+ version: '0'
251
265
  description: CSV Validator
252
266
  email:
253
267
  - pezholio@gmail.com
@@ -261,6 +275,7 @@ files:
261
275
  - .gitignore
262
276
  - .ruby-version
263
277
  - .travis.yml
278
+ - CHANGELOG.md
264
279
  - Gemfile
265
280
  - LICENSE.md
266
281
  - README.md
@@ -270,6 +285,7 @@ files:
270
285
  - csvlint.gemspec
271
286
  - features/check_format.feature
272
287
  - features/csv_options.feature
288
+ - features/csvupload.feature
273
289
  - features/fixtures/cr-line-endings.csv
274
290
  - features/fixtures/crlf-line-endings.csv
275
291
  - features/fixtures/inconsistent-line-endings-unquoted.csv
@@ -336,6 +352,7 @@ summary: CSV Validator
336
352
  test_files:
337
353
  - features/check_format.feature
338
354
  - features/csv_options.feature
355
+ - features/csvupload.feature
339
356
  - features/fixtures/cr-line-endings.csv
340
357
  - features/fixtures/crlf-line-endings.csv
341
358
  - features/fixtures/inconsistent-line-endings-unquoted.csv