csvlint 0.1.4 → 0.2.0

Files changed (38)
  1. checksums.yaml +8 -8
  2. data/.gitignore +7 -1
  3. data/CHANGELOG.md +19 -1
  4. data/README.md +93 -36
  5. data/bin/csvlint +68 -27
  6. data/csvlint.gemspec +2 -0
  7. data/features/csvw_schema_validation.feature +127 -0
  8. data/features/fixtures/spreadsheet.xlsx +0 -0
  9. data/features/sources.feature +3 -4
  10. data/features/step_definitions/parse_csv_steps.rb +13 -1
  11. data/features/step_definitions/schema_validation_steps.rb +27 -1
  12. data/features/step_definitions/sources_steps.rb +1 -1
  13. data/features/step_definitions/validation_errors_steps.rb +48 -1
  14. data/features/step_definitions/validation_info_steps.rb +5 -1
  15. data/features/step_definitions/validation_warnings_steps.rb +15 -1
  16. data/features/support/load_tests.rb +114 -0
  17. data/features/validation_errors.feature +12 -24
  18. data/features/validation_warnings.feature +18 -6
  19. data/lib/csvlint.rb +10 -0
  20. data/lib/csvlint/csvw/column.rb +359 -0
  21. data/lib/csvlint/csvw/date_format.rb +182 -0
  22. data/lib/csvlint/csvw/metadata_error.rb +13 -0
  23. data/lib/csvlint/csvw/number_format.rb +211 -0
  24. data/lib/csvlint/csvw/property_checker.rb +761 -0
  25. data/lib/csvlint/csvw/table.rb +204 -0
  26. data/lib/csvlint/csvw/table_group.rb +165 -0
  27. data/lib/csvlint/schema.rb +40 -23
  28. data/lib/csvlint/validate.rb +142 -19
  29. data/lib/csvlint/version.rb +1 -1
  30. data/spec/csvw/column_spec.rb +112 -0
  31. data/spec/csvw/date_format_spec.rb +49 -0
  32. data/spec/csvw/number_format_spec.rb +403 -0
  33. data/spec/csvw/table_group_spec.rb +143 -0
  34. data/spec/csvw/table_spec.rb +90 -0
  35. data/spec/schema_spec.rb +27 -1
  36. data/spec/spec_helper.rb +0 -1
  37. data/spec/validator_spec.rb +16 -10
  38. metadata +53 -2
checksums.yaml CHANGED
@@ -1,15 +1,15 @@
1
1
  ---
2
2
  !binary "U0hBMQ==":
3
3
  metadata.gz: !binary |-
4
- NTcwMjkzODNkNWM4ZDg1ZjU2MjQzZjdhYzJhNDE0OGFlZWZmMjQ0ZA==
4
+ YjlmZmFlNGZjOWQ5MmNlNDZiOTUxMWY0NGExYTRkYjhhNzdlNjAyNA==
5
5
  data.tar.gz: !binary |-
6
- N2I0YWJjYzgzM2UxZDk1ODdkY2M0ZTRmNGU2ZDI3ZGNhMTgxNGFiMA==
6
+ ODFjZmJkZmI0Nzg2NmMzN2ViOGNiNDlmODA0NDcxMzM0Zjk4NTgwOQ==
7
7
  SHA512:
8
8
  metadata.gz: !binary |-
9
- NDM5MmU5MDEwNjQyOWFlMGUwNWFmNGNjNTc0YTU4NmE0ZWY2YjMyMDRkODY2
10
- NmIyODJhMzEyOGIyYjlkMTQwM2FhYjJmNzU3NGVjY2E3OTgyMjg0NWRkZGEx
11
- YzU1MjAyYjJhYzUxNGFmZTc2YmVkYjQ0MjI3ZjdkZmNlYzJkZjM=
9
+ ZTIyMGVkYjIyMjc2ZWViNTBhYmZkMWIxN2E1OTU0OTFhNGMxNzBlYzg0OTI4
10
+ NDRkMzY2YzgxNmQwZGZiZDE5M2M2NzYwMzk3ZWZjMDc3YWM0YzQ0NTczY2U3
11
+ MGZjNTUwMGI2MzgzZDQxYzkzMzBiNzI3NmJkZTIxYjZiYjc5MDA=
12
12
  data.tar.gz: !binary |-
13
- MzVjNWU4MDFjYWEyNWNhNWY0OGM3NTUxNmU4MzAxZTYwMGYyZTBiOTdmYzhl
14
- MmJiNWM0OTQ0YWRlZWY3MDAyN2RjN2RmMWIyNTk5NWJlODgwZDVkODUwN2E0
15
- NTJlYjMxNTg5Y2I4ZDcyNzUyN2EzNjNkNmRiOTRiMzlkOTdhZjg=
13
+ NTI1M2I5Yzc3NGNhOTg3Y2VkMmM3ZGM1ZTdiZWNmMzM0ZTY5ODljODNmNWYy
14
+ MDA0NGVlMGFhNDQ2ZjZjYjI0Nzc2OTdhMWRmODI5YTEzMGRmNTQxZjAyOTA5
15
+ YjVmMjk4NDIyOWEzMzIxMTBlYjQ4YTgwZmE4MWZlYTQ4MjMzZmE=
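A note on reading the values above: each entry in checksums.yaml is a Base64-wrapped string, and the `!binary` key `U0hBMQ==` is the text `SHA1`. The following is a minimal Ruby sketch of unwrapping one entry using core Ruby only. The variable names are illustrative and are not part of RubyGems.

```ruby
# Values copied from the new (0.2.0) side of the checksums.yaml hunk above.
algorithm_key = "U0hBMQ=="
metadata_sum  = "YjlmZmFlNGZjOWQ5MmNlNDZiOTUxMWY0NGExYTRkYjhhNzdlNjAyNA=="

# String#unpack1("m") decodes Base64 without requiring any library.
algorithm = algorithm_key.unpack1("m")
digest    = metadata_sum.unpack1("m")

puts algorithm      # prints SHA1
puts digest.length  # prints 40 (a SHA1 digest written as hex characters)
```

The SHA512 entries are wrapped the same way, except that the Base64 text is split across three lines.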
data/.gitignore CHANGED
@@ -19,4 +19,10 @@ coverage/
19
19
  /.rspec
20
20
 
21
21
  .idea
22
- .DS_Store
22
+ .DS_Store
23
+ features/csvw_validation_tests.feature
24
+ features/fixtures/csvw
25
+
26
+ bin/run-csvw-tests
27
+
28
+ features/csvw_json_transformation_tests.feature
data/CHANGELOG.md CHANGED
@@ -2,7 +2,25 @@
2
2
 
3
3
  ## [Unreleased](https://github.com/theodi/csvlint.rb/tree/HEAD)
4
4
 
5
- [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.3...HEAD)
5
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.4...HEAD)
6
+
7
+ **Closed issues:**
8
+
9
+ - CSV on the web support [\#141](https://github.com/theodi/csvlint.rb/issues/141)
10
+
11
+ **Merged pull requests:**
12
+
13
+ - Recover from `ArgumentError`s when attempting to locate a schema and detect bad schema when JSON is malformed [\#152](https://github.com/theodi/csvlint.rb/pull/152) ([pezholio](https://github.com/pezholio))
14
+
15
+ - Catch errors if link headers are don't have particular values [\#151](https://github.com/theodi/csvlint.rb/pull/151) ([pezholio](https://github.com/pezholio))
16
+
17
+ - Rescue excel warning [\#149](https://github.com/theodi/csvlint.rb/pull/149) ([quadrophobiac](https://github.com/quadrophobiac))
18
+
19
+ - CSVW-based validation! [\#142](https://github.com/theodi/csvlint.rb/pull/142) ([JeniT](https://github.com/JeniT))
20
+
21
+ ## [0.1.4](https://github.com/theodi/csvlint.rb/tree/0.1.4) (2015-08-06)
22
+
23
+ [Full Changelog](https://github.com/theodi/csvlint.rb/compare/0.1.3...0.1.4)
6
24
 
7
25
  **Merged pull requests:**
8
26
 
data/README.md CHANGED
@@ -31,13 +31,13 @@ You can either use this gem within your own Ruby code, or as a standolone comman
31
31
  After installing the gem, you can validate a CSV on the command line like so:
32
32
 
33
33
  csvlint myfile.csv
34
-
34
+
35
35
  You will then see the validation result, together with any warnings or errors e.g.
36
36
 
37
37
  ```
38
38
  myfile.csv is INVALID
39
39
  1. blank_rows. Row: 3
40
- 1. title_row.
40
+ 1. title_row.
41
41
  2. inconsistent_values. Column: 14
42
42
  ```
43
43
 
@@ -50,40 +50,40 @@ You can also optionally pass a schema file like so:
50
50
  Currently the gem supports retrieving a CSV accessible from a URL, File, or an IO-style object (e.g. StringIO)
51
51
 
52
52
  require 'csvlint'
53
-
53
+
54
54
  validator = Csvlint::Validator.new( "http://example.org/data.csv" )
55
55
  validator = Csvlint::Validator.new( File.new("/path/to/my/data.csv" ))
56
56
  validator = Csvlint::Validator.new( StringIO.new( my_data_in_a_string ) )
57
57
 
58
- When validating from a URL the range of errors and warnings is wider as the library will also check HTTP headers for
58
+ When validating from a URL the range of errors and warnings is wider as the library will also check HTTP headers for
59
59
  best practices
60
-
61
- #invoke the validation
60
+
61
+ #invoke the validation
62
62
  validator.validate
63
-
63
+
64
64
  #check validation status
65
65
  validator.valid?
66
-
66
+
67
67
  #access array of errors, each is an Csvlint::ErrorMessage object
68
68
  validator.errors
69
-
69
+
70
70
  #access array of warnings
71
71
  validator.warnings
72
-
72
+
73
73
  #access array of information messages
74
74
  validator.info_messages
75
-
75
+
76
76
  #get some information about the CSV file that was validated
77
77
  validator.encoding
78
78
  validator.content_type
79
79
  validator.extension
80
-
80
+
81
81
  #retrieve HTTP headers from request
82
82
  validator.headers
83
83
 
84
84
  ## Controlling CSV Parsing
85
85
 
86
- The validator supports configuration of the [CSV Dialect](http://dataprotocols.org/csv-dialect/) used in a data file. This is specified by
86
+ The validator supports configuration of the [CSV Dialect](http://dataprotocols.org/csv-dialect/) used in a data file. This is specified by
87
87
  passing a dialect hash to the constructor:
88
88
 
89
89
  dialect = {
@@ -94,17 +94,17 @@ passing a dialect hash to the constructor:
94
94
 
95
95
  The options should be a Hash that conforms to the [CSV Dialect](http://dataprotocols.org/csv-dialect/) JSON structure.
96
96
 
97
- While these options configure the parser to correctly process the file, the validator will still raise errors or warnings for CSV
97
+ While these options configure the parser to correctly process the file, the validator will still raise errors or warnings for CSV
98
98
  structure that it considers to be invalid, e.g. a missing header or different delimiters.
99
99
 
100
- Note that the parser will also check for a `header` parameter on the `Content-Type` header returned when fetching a remote CSV file. As
100
+ Note that the parser will also check for a `header` parameter on the `Content-Type` header returned when fetching a remote CSV file. As
101
101
  specified in [RFC 4180](http://www.ietf.org/rfc/rfc4180.txt) the values for this can be `present` and `absent`, e.g:
102
102
 
103
103
  Content-Type: text/csv; header=present
104
104
 
105
105
  ## Error Reporting
106
106
 
107
- The validator provides feedback on a validation result using instances of `Csvlint::ErrorMessage`. Errors are divided into errors, warnings and information
107
+ The validator provides feedback on a validation result using instances of `Csvlint::ErrorMessage`. Errors are divided into errors, warnings and information
108
108
  messages. A validation attempt is successful if there are no errors.
109
109
 
110
110
  Messages provide context including:
@@ -122,7 +122,7 @@ The following types of error can be reported:
122
122
  * `:wrong_content_type` -- content type is not `text/csv`
123
123
  * `:ragged_rows` -- row has a different number of columns (than the first row in the file)
124
124
  * `:blank_rows` -- completely empty row, e.g. blank line or a line where all column values are empty
125
- * `:invalid_encoding` -- encoding error when parsing row, e.g. because of invalid characters
125
+ * `:invalid_encoding` -- encoding error when parsing row, e.g. because of invalid characters
126
126
  * `:not_found` -- HTTP 404 error when retrieving the data
127
127
  * `:stray_quote` -- missing or stray quote
128
128
  * `:unclosed_quote` -- unclosed quoted field
@@ -153,36 +153,66 @@ There are also information messages available:
153
153
 
154
154
  ## Schema Validation
155
155
 
156
- The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. The structure currently
157
- follows JSON Table Schema with some extensions.
156
+ The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. The structure currently
157
+ follows JSON Table Schema with some extensions and rudinmentary [CSV on the Web Metadata](http://www.w3.org/TR/tabular-metadata/).
158
158
 
159
- An example schema file is:
159
+ An example JSON Table Schema schema file is:
160
160
 
161
161
  {
162
162
  "fields": [
163
- {
164
- "name": "id",
165
- "constraints": { "required": true }
163
+ {
164
+ "name": "id",
165
+ "constraints": { "required": true }
166
166
  },
167
- {
168
- "name": "price",
169
- "constraints": { "required": true, "minLength": 1 }
167
+ {
168
+ "name": "price",
169
+ "constraints": { "required": true, "minLength": 1 }
170
170
  },
171
- {
172
- "name": "postcode",
173
- "constraints": {
174
- "required": true,
175
- "pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"
176
- }
171
+ {
172
+ "name": "postcode",
173
+ "constraints": {
174
+ "required": true,
175
+ "pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"
176
+ }
177
177
  }
178
178
  ]
179
179
  }
180
180
 
181
- Parsing and validating with a schema:
181
+ An equivalent CSV on the Web Metadata file is:
182
+
183
+ {
184
+ "@context": "http://www.w3.org/ns/csvw",
185
+ "url": "http://example.com/example1.csv",
186
+ "tableSchema": {
187
+ "columns": [
188
+ {
189
+ "name": "id",
190
+ "required": true
191
+ },
192
+ {
193
+ "name": "price",
194
+ "required": true,
195
+ "datatype": { "base": "string", "minLength": 1 }
196
+ },
197
+ {
198
+ "name": "postcode",
199
+ "required": true
200
+ }
201
+ ]
202
+ }
203
+ }
182
204
 
183
- schema = Csvlint::Schema.load_from_json_table(uri)
205
+ Parsing and validating with a schema (of either kind):
206
+
207
+ schema = Csvlint::Schema.load_from_json(uri)
184
208
  validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, schema )
185
209
 
210
+ ### CSV on the Web Validation Support
211
+
212
+ This gem passes all the validation tests in the [official CSV on the Web test suite](http://w3c.github.io/csvw/tests/) (though there might still be errors or parts of the [CSV on the Web standard](http://www.w3.org/TR/tabular-metadata/) that aren't tested by that test suite).
213
+
214
+ ### JSON Table Schema Support
215
+
186
216
  Supported constraints:
187
217
 
188
218
  * `required` -- there must be a value for this field in every row
@@ -192,7 +222,7 @@ Supported constraints:
192
222
  * `pattern` -- values must match the provided regular expression
193
223
  * `type` -- specifies an XML Schema data type. Values of the column must be a valid value for that type
194
224
  * `minimum` -- specify a minimum range for values, the value will be parsed as specified by `type`
195
- * `maximum` -- specify a maximum range for values, the value will be parsed as specified by `type`
225
+ * `maximum` -- specify a maximum range for values, the value will be parsed as specified by `type`
196
226
  * `datePattern` -- specify a `strftime` compatible date pattern to be used when parsing date values and min/max constraints
197
227
 
198
228
  Supported data types (this is still a work in progress):
@@ -214,7 +244,7 @@ Supported data types (this is still a work in progress):
214
244
  * Time -- `http://www.w3.org/2001/XMLSchema#time`
215
245
 
216
246
  Use of an unknown data type will result in the column failing to validate.
217
-
247
+
218
248
  Schema validation provides some additional types of error and warning messages:
219
249
 
220
250
  * `:missing_value` (error) -- a column marked as `required` in the schema has no value
@@ -248,3 +278,30 @@ validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, nil, opt
248
278
  3. Commit your changes (`git commit -am 'Add some feature'`)
249
279
  4. Push to the branch (`git push origin my-new-feature`)
250
280
  5. Create new Pull Request
281
+
282
+ ### Testing
283
+
284
+ The codebase includes both rspec and cucumber tests, which can be run together using:
285
+
286
+ $ rake
287
+
288
+ or separately:
289
+
290
+ $ rake spec
291
+ $ rake features
292
+
293
+ When the cucumber tests are first run, a script will create tests based on the latest version of the [CSV on the Web test suite](http://w3c.github.io/csvw/tests/), including creating a local cache of the test files. This requires an internet connection and some patience. Following that download, the tests will run locally; there's also a batch script:
294
+
295
+ $ bin/run-csvw-tests
296
+
297
+ which will run the tests from the command line.
298
+
299
+ If you need to refresh the CSV on the Web tests:
300
+
301
+ $ rm bin/run-csvw-tests
302
+ $ rm features/csvw_validation_tests.feature
303
+ $ rm -r features/fixtures/csvw
304
+
305
+ and then run the cucumber tests again or:
306
+
307
+ $ ruby features/support/load_tests.rb
data/bin/csvlint CHANGED
@@ -16,8 +16,8 @@ opts.on("-d", "--dump-errors", "Pretty print error and warning objects.") do |d|
16
16
  options[:dump] = d
17
17
  end
18
18
 
19
- opts.on("-s", "--schema-file FILENAME", "Schema file") do |s|
20
- options[:schema_file] = s
19
+ opts.on("-s", "--schema FILENAME", "Schema file") do |s|
20
+ options[:schema] = s
21
21
  end
22
22
 
23
23
  opts.on_tail("-h", "--help",
@@ -35,14 +35,15 @@ rescue OptionParser::InvalidOption => e
35
35
  end
36
36
 
37
37
  def print_error(index, error, dump, color)
38
-
39
38
  location = ""
40
39
  location += error.row.to_s if error.row
41
40
  location += "#{error.row ? "," : ""}#{error.column.to_s}" if error.column
42
41
  if error.row || error.column
43
42
  location = "#{error.row ? "Row" : "Column"}: #{location}"
44
43
  end
45
- output_string = "#{index+1}. #{error.type}. #{location}"
44
+ output_string = "#{index+1}. #{error.type}"
45
+ output_string += ". #{location}" unless location.empty?
46
+ output_string += ". #{error.content}" if error.content
46
47
 
47
48
  if $stdout.tty?
48
49
  puts output_string.colorize(color)
@@ -56,6 +57,30 @@ def print_error(index, error, dump, color)
56
57
 
57
58
  end
58
59
 
60
+ def validate_csv(source, schema, dump)
61
+ validator = Csvlint::Validator.new( source, nil, schema )
62
+
63
+ if $stdout.tty?
64
+ puts "#{source.path || source || "CSV"} is #{validator.valid? ? "VALID".green : "INVALID".red}"
65
+ else
66
+ puts "#{source.path || source || "CSV"} is #{validator.valid? ? "VALID" : "INVALID"}"
67
+ end
68
+
69
+ if validator.errors.size > 0
70
+ validator.errors.each_with_index do |error, i|
71
+ print_error(i, error, dump, :red)
72
+ end
73
+ end
74
+
75
+ if validator.warnings.size > 0
76
+ validator.warnings.each_with_index do |error, i|
77
+ print_error(i, error, dump, :yellow)
78
+ end
79
+ end
80
+
81
+ return validator.valid?
82
+ end
83
+
59
84
  if ARGV.length == 0 && !$stdin.tty?
60
85
  source = StringIO.new(ARGF.read)
61
86
  else
@@ -63,13 +88,13 @@ else
63
88
  source = ARGV[0]
64
89
  unless source =~ /^http(s)?/
65
90
  begin
66
- source = File.new( source ) unless source =~ /^http(s)?/
91
+ source = File.new( source ) unless source =~ /^http(s)?/
67
92
  rescue Errno::ENOENT
68
93
  puts "#{source} not found"
69
94
  exit 1
70
95
  end
71
96
  end
72
- else
97
+ elsif !options[:schema]
73
98
  puts "No CSV data to validate."
74
99
  puts opts
75
100
  exit 1
@@ -77,34 +102,50 @@ else
77
102
  end
78
103
 
79
104
  schema = nil
80
- if options[:schema_file]
105
+ if options[:schema]
81
106
  begin
82
- schemafile = File.read( options[:schema_file] )
107
+ schema = Csvlint::Schema.load_from_json(options[:schema])
108
+ rescue JSON::ParserError => e
109
+ output_string = "invalid metadata: malformed JSON"
110
+ if $stdout.tty?
111
+ puts output_string.colorize(:red)
112
+ else
113
+ puts output_string
114
+ end
115
+ exit 1
116
+ rescue Csvlint::Csvw::MetadataError => e
117
+ output_string = "invalid metadata: #{e.message}#{" at " + e.path if e.path}"
118
+ if $stdout.tty?
119
+ puts output_string.colorize(:red)
120
+ else
121
+ puts output_string
122
+ end
123
+ exit 1
83
124
  rescue Errno::ENOENT
84
- puts "#{options[:schema_file]} not found"
125
+ puts "#{options[:schema]} not found"
85
126
  exit 1
86
127
  end
87
- schema = Csvlint::Schema.from_json_table(nil, JSON.parse(schemafile))
88
- end
89
-
90
- validator = Csvlint::Validator.new( source, nil, schema )
91
-
92
- if $stdout.tty?
93
- puts "#{ARGV[0] || "CSV"} is #{validator.valid? ? "VALID".green : "INVALID".red}"
94
- else
95
- puts "#{ARGV[0] || "CSV"} is #{validator.valid? ? "VALID" : "INVALID"}"
96
128
  end
97
129
 
98
- if validator.errors.size > 0
99
- validator.errors.each_with_index do |error, i|
100
- print_error(i, error, options[:dump], :red)
130
+ valid = true
131
+ if source.nil?
132
+ unless schema.instance_of? Csvlint::Csvw::TableGroup
133
+ puts "No CSV data to validate."
134
+ puts opts
135
+ exit 1
101
136
  end
102
- end
103
-
104
- if validator.warnings.size > 0
105
- validator.warnings.each_with_index do |error, i|
106
- print_error(i, error, options[:dump], :yellow)
137
+ schema.tables.keys.each do |source|
138
+ begin
139
+ source = source.sub("file:","")
140
+ source = File.new( source )
141
+ rescue Errno::ENOENT
142
+ puts "#{source} not found"
143
+ exit 1
144
+ end unless source =~ /^http(s)?/
145
+ valid &= validate_csv(source, schema, options[:dump])
107
146
  end
147
+ else
148
+ valid = validate_csv(source, schema, options[:dump])
108
149
  end
109
150
 
110
- exit 1 unless validator.valid?
151
+ exit 1 unless valid
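The change to `print_error` in this file alters the printed line in two ways: the location is appended only when there is one, and `error.content` is appended when present. The following is a minimal sketch of just that string-building step, assuming a stand-in `Message` struct in place of `Csvlint::ErrorMessage`. Both `Message` and `format_message` are hypothetical names used for illustration, and the colour and dump handling is left out.

```ruby
# Stand-in for Csvlint::ErrorMessage with only the fields used here (hypothetical).
Message = Struct.new(:type, :row, :column, :content)

# Mirrors the string-building steps of the new print_error.
def format_message(index, error)
  location = ""
  location += error.row.to_s if error.row
  location += "#{error.row ? "," : ""}#{error.column}" if error.column
  if error.row || error.column
    location = "#{error.row ? "Row" : "Column"}: #{location}"
  end

  output = "#{index + 1}. #{error.type}"
  output += ". #{location}" unless location.empty?  # no trailing ". " when there is no location
  output += ". #{error.content}" if error.content
  output
end

puts format_message(0, Message.new(:blank_rows, 3, nil, nil))            # 1. blank_rows. Row: 3
puts format_message(1, Message.new(:title_row, nil, nil, nil))           # 2. title_row
puts format_message(2, Message.new(:inconsistent_values, nil, 14, nil))  # 3. inconsistent_values. Column: 14
```

Under the old single-line format, a message with no row or column ended in a dangling ". "; the `unless location.empty?` guard removes it.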
data/csvlint.gemspec CHANGED
@@ -23,6 +23,8 @@ Gem::Specification.new do |spec|
23
23
  spec.add_dependency "open_uri_redirections"
24
24
  spec.add_dependency "activesupport"
25
25
  spec.add_dependency "addressable"
26
+ spec.add_dependency "escape_utils"
27
+ spec.add_dependency "uri_template"
26
28
 
27
29
  spec.add_development_dependency "bundler", "~> 1.3"
28
30
  spec.add_development_dependency "rake"
data/features/csvw_schema_validation.feature ADDED
@@ -0,0 +1,127 @@
1
+ Feature: CSVW Schema Validation
2
+
3
+ Scenario: Valid CSV
4
+ Given I have a CSV with the following content:
5
+ """
6
+ "Bob","1234","bob@example.org"
7
+ "Alice","5","alice@example.com"
8
+ """
9
+ And it is stored at the url "http://example.com/example1.csv"
10
+ And I have metadata with the following content:
11
+ """
12
+ {
13
+ "@context": "http://www.w3.org/ns/csvw",
14
+ "url": "http://example.com/example1.csv",
15
+ "dialect": { "header": false },
16
+ "tableSchema": {
17
+ "columns": [
18
+ { "name": "Name", "required": true },
19
+ { "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
20
+ { "name": "Email", "required": true }
21
+ ]
22
+ }
23
+ }
24
+ """
25
+ When I ask if there are errors
26
+ Then there should be 0 error
27
+
28
+ Scenario: Schema invalid CSV
29
+ Given I have a CSV with the following content:
30
+ """
31
+ "Bob","1234","bob@example.org"
32
+ "Alice","5","alice@example.com"
33
+ """
34
+ And it is stored at the url "http://example.com/example1.csv"
35
+ And I have metadata with the following content:
36
+ """
37
+ {
38
+ "@context": "http://www.w3.org/ns/csvw",
39
+ "url": "http://example.com/example1.csv",
40
+ "dialect": { "header": false },
41
+ "tableSchema": {
42
+ "columns": [
43
+ { "name": "Name", "required": true },
44
+ { "name": "Id", "required": true, "datatype": { "base": "string", "minLength": 3 } },
45
+ { "name": "Email", "required": true }
46
+ ]
47
+ }
48
+ }
49
+ """
50
+ When I ask if there are errors
51
+ Then there should be 1 error
52
+
53
+ Scenario: CSV with incorrect header
54
+ Given I have a CSV with the following content:
55
+ """
56
+ "name","id","contact"
57
+ "Bob","1234","bob@example.org"
58
+ "Alice","5","alice@example.com"
59
+ """
60
+ And it is stored at the url "http://example.com/example1.csv"
61
+ And I have metadata with the following content:
62
+ """
63
+ {
64
+ "@context": "http://www.w3.org/ns/csvw",
65
+ "url": "http://example.com/example1.csv",
66
+ "tableSchema": {
67
+ "columns": [
68
+ { "titles": "name", "required": true },
69
+ { "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
70
+ { "titles": "email", "required": true }
71
+ ]
72
+ }
73
+ }
74
+ """
75
+ When I ask if there are errors
76
+ Then there should be 1 error
77
+
78
+ Scenario: Schema with valid regex
79
+ Given I have a CSV with the following content:
80
+ """
81
+ "firstname","id","email"
82
+ "Bob","1234","bob@example.org"
83
+ "Alice","5","alice@example.com"
84
+ """
85
+ And it is stored at the url "http://example.com/example1.csv"
86
+ And I have metadata with the following content:
87
+ """
88
+ {
89
+ "@context": "http://www.w3.org/ns/csvw",
90
+ "url": "http://example.com/example1.csv",
91
+ "tableSchema": {
92
+ "columns": [
93
+ { "titles": "firstname", "required": true, "datatype": { "base": "string", "format": "^[A-Za-z0-9_]*$" } },
94
+ { "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
95
+ { "titles": "email", "required": true }
96
+ ]
97
+ }
98
+ }
99
+ """
100
+ When I ask if there are warnings
101
+ Then there should be 0 warnings
102
+
103
+ Scenario: Schema with invalid regex
104
+ Given I have a CSV with the following content:
105
+ """
106
+ "firstname","id","email"
107
+ "Bob","1234","bob@example.org"
108
+ "Alice","5","alice@example.com"
109
+ """
110
+ And it is stored at the url "http://example.com/example1.csv"
111
+ And I have metadata with the following content:
112
+ """
113
+ {
114
+ "@context": "http://www.w3.org/ns/csvw",
115
+ "url": "http://example.com/example1.csv",
116
+ "tableSchema": {
117
+ "columns": [
118
+ { "titles": "firstname", "required": true, "datatype": { "base": "string", "format": "((" } },
119
+ { "titles": "id", "required": true, "datatype": { "base": "string", "minLength": 1 } },
120
+ { "titles": "email", "required": true }
121
+ ]
122
+ }
123
+ }
124
+ """
125
+ When I ask if there are warnings
126
+ Then there should be 1 warnings
127
+ And that warning should have the type "invalid_regex"
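The last two scenarios differ only in whether the `format` string can be compiled as a regular expression. The following is a minimal Ruby sketch of that kind of check. It is not csvlint's actual implementation: `check_format` is a hypothetical helper, and the sketch assumes Ruby's own `Regexp` syntax is an acceptable stand-in for the regular expression dialect that CSV on the Web specifies.

```ruby
# Hypothetical helper: returns :invalid_regex when the pattern cannot be
# compiled, and nil when it can. Not taken from the csvlint source.
def check_format(pattern)
  Regexp.new(pattern)
  nil
rescue RegexpError
  :invalid_regex
end

p check_format("^[A-Za-z0-9_]*$")  # nil
p check_format("((")               # :invalid_regex
```

Returning a symbol rather than raising matches what the scenarios expect: an invalid pattern is reported as a warning rather than stopping validation.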